CN113345440A - Signal processing method, device and equipment and Augmented Reality (AR) system - Google Patents

Signal processing method, device and equipment and Augmented Reality (AR) system

Info

Publication number
CN113345440A
CN113345440A, CN202110639381.8A, CN202110639381A, CN 113345440 A
Authority
CN
China
Prior art keywords
voice
target text
processing
text
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110639381.8A
Other languages
Chinese (zh)
Inventor
刘坚
李秋平
李磊
王长虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd filed Critical Beijing Youzhuju Network Technology Co Ltd
Priority to CN202110639381.8A priority Critical patent/CN113345440A/en
Publication of CN113345440A publication Critical patent/CN113345440A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/26 - Speech to text systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/40 - Processing or translation of natural language
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/28 - Constructional details of speech recognition systems
    • G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/06 - Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L 21/10 - Transforming into visible information
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/30 - Image reproducers
    • H04N 13/332 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD]

Abstract

The embodiment of the application discloses a signal processing method, apparatus, device, storage medium and augmented reality (AR) system. The method is applied to a voice processing device and includes: collecting a voice signal; processing the voice signal to obtain a target text; and sending the target text to a first AR device connected with the voice processing device, so that the target text is displayed on the first AR device. Because the result of voice processing is displayed on the first AR device as target text rather than played back as speech, the first AR device can display the target text while the voice processing device is still collecting and processing voice, and the collected voice is not mixed with speech played by a sound playing device as in the prior art. In addition, the device that collects the voice and the device that displays the text can be far apart from each other and are not limited by the position of the user, which improves user experience.

Description

Signal processing method, device and equipment and Augmented Reality (AR) system
Technical Field
The present application relates to the field of computers, and in particular to a signal processing method, apparatus, device and storage medium, and an augmented reality (AR) system.
Background
With the development of artificial intelligence technology, the use of artificial intelligence technology for voice processing has also developed rapidly. A current device that uses artificial intelligence technology to process voice mainly consists of the following parts: a sound receiving device, a processing device and a sound playing device, and these three devices are integrated together to form a voice processing device. The sound receiving device collects sound in a voice processing scene, such as a voice translation scene, and transmits the collected sound to the processing device; the processing device translates the voice, synthesizes the translated text into computer speech and sends it to the sound playing device; after receiving the translated text sent by the processing device, the sound playing device plays the computer-synthesized speech of the translated text.
However, if the voice processing device plays the translated speech while the sound receiving device is picking up sound, the played translated speech is also collected by the sound receiving device. As a result, the quality of the voice collected by the sound receiving device is poor, which affects subsequent voice processing.
Therefore, the current voice processing device provides a poor user experience and cannot meet users' requirements.
Disclosure of Invention
In order to solve the problems in the prior art that the user experience of a voice translation device is poor and users' requirements cannot be met, the embodiment of the application provides a signal processing method, apparatus, device, storage medium and augmented reality (AR) system.
The embodiment of the application provides a signal processing method, which is applied to voice processing equipment and comprises the following steps:
collecting voice signals;
processing the voice signal to obtain a target text;
and sending the target text to a first augmented reality (AR) device connected with the voice processing device, so as to display the target text on the first AR device.
Optionally, the processing the voice signal to obtain a target text includes:
recognizing the voice signal to obtain a voice text;
and translating the voice text to obtain a target text.
Optionally, the processing the voice signal to obtain a target text includes:
recognizing the voice signal to obtain a target text.
Optionally, the sending the target text to the first AR device includes:
and sending the target text to the first AR device through a wireless port.
Optionally, the sending the target text to the first AR device through a wireless port includes:
and sending the target text to a network device through a wireless port so that the network device sends the target text to the first AR device through the wireless port.
Optionally, the method further comprises:
receiving a correction text obtained by processing the voice signal by a cloud server;
replacing the target text stored at the speech processing device with the corrected text.
Optionally, the method further comprises:
sending the corrected text to the first AR device.
Optionally, the speech processing device comprises a second AR device.
The embodiment of the application provides a signal processing device, which is applied to a voice processing device and comprises:
the acquisition unit is used for acquiring voice signals;
the processing unit is used for processing the voice signal to obtain a target text;
a sending unit, configured to send the target text to a first augmented reality (AR) device connected to the speech processing device, so as to display the target text on the first AR device.
Optionally, the processing unit is configured to identify the voice signal to obtain a voice text; and translating the voice text to obtain a target text.
Optionally, the processing unit is configured to recognize the speech signal to obtain a target text.
Optionally, the sending unit is configured to send the target text to the first AR device through a wireless port.
Optionally, the sending unit is configured to send the target text to a network device through a wireless port, so that the network device sends the target text to the first AR device through the wireless port.
Optionally, the apparatus further comprises a receiving unit, configured to receive a corrected text obtained by processing the voice signal by the cloud server;
the processing unit is further configured to replace the target text stored in the speech processing device with the corrected text.
Optionally, the sending unit is further configured to send the corrected text to the first AR device.
Optionally, the speech processing device comprises a second AR device.
An embodiment of the present application provides a signal processing apparatus, including: a processor and a memory;
the memory to store instructions;
the processor is configured to execute the instructions in the memory and execute the method according to the above embodiment.
The embodiment of the application provides an Augmented Reality (AR) system, which comprises a voice processing device and a first AR device, wherein the voice processing device is in wireless connection with the first AR device;
the voice processing device is used for executing the method of the embodiment;
the first AR device is used for displaying the target text.
The embodiment of the present application provides a computer-readable storage medium, which includes instructions, when executed on a computer, cause the computer to execute the method of the above-mentioned embodiment.
The embodiment of the application provides a signal processing method applied to a voice processing device, including: collecting a voice signal; processing the voice signal to obtain a target text; and sending the target text to a first augmented reality (AR) device connected to the voice processing device, so that the target text is displayed on the first AR device. In the embodiment of the application, the target text obtained from voice processing is therefore displayed on the first AR device instead of being played back as speech, so the first AR device can display the target text while the voice processing device is still collecting and processing voice, and the collected voice is not mixed with speech played by a sound playing device as in the prior art. In addition, the voice processing device that collects and processes the voice and the first AR device that displays the target text are not integrated together; that is, the device that collects the voice and the device that displays the text can be far apart from each other and are not limited by the position of the user, which improves user experience.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art from them without creative effort.
Fig. 1 is a block diagram of an augmented reality AR system provided in the present application;
fig. 2 is a block diagram of another augmented reality AR system provided in the present application;
fig. 3 is a flowchart of a signal processing method provided in the present application;
fig. 4 is a block diagram of a signal processing apparatus provided in the present application;
fig. 5 is a block diagram of a signal processing apparatus according to the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As described in the background, with the development of artificial intelligence technology, devices that use artificial intelligence technology for voice processing, such as voice translation, have developed rapidly. A current device that uses artificial intelligence technology to process voice mainly consists of the following parts: a sound receiving device, a processing device and a sound playing device, and these three devices are integrated together to form a voice processing device. The sound receiving device collects sound in a voice processing scene, such as a voice translation scene, and transmits the collected sound to the processing device over a network; the processing device translates the voice, synthesizes the translated text into computer speech and sends it to the sound playing device; after receiving the translated text sent by the processing device over the network, the sound playing device plays the computer-synthesized speech of the translated text.
However, such a voice processing device cannot play the translated speech while the sound receiving device is picking up sound: if the translated speech were played during pickup, it would also be collected by the sound receiving device, the quality of the collected voice would be poor, and subsequent voice processing would be affected. The device can therefore only operate in an alternating manner, first collecting the original voice and then playing the translated speech, which adds roughly the duration of the original voice to the total time consumed. Moreover, because the sound receiving device, the processing device and the sound playing device are integrated into one voice processing device, when the sound receiving device is far from the sound source the collected voice is unclear, which further affects subsequent voice processing.
Therefore, the current voice processing device provides a poor user experience and cannot meet users' requirements.
Based on this, the embodiment of the present application provides a signal processing method, applied to a speech processing device, including: collecting voice signals; processing the voice signal to obtain a target text; sending the target text to a first Augmented Reality (AR) device connected with the voice processing device to display the target text on the first AR device.
Therefore, in the embodiment of the application, the target text obtained from voice processing is displayed on the first AR device instead of being played back as speech, so the first AR device can display the target text while the voice processing device is still collecting and processing voice, and the collected voice is not mixed with speech played by a sound playing device as in the prior art. In addition, the voice processing device that collects and processes the voice and the first AR device that displays the target text are not integrated together; that is, sound pickup and display can take place far apart and are not limited by the position of the user, which improves user experience.
The embodiment of the application provides an augmented reality (AR) system. The AR system comprises a voice processing device and a first AR device, and the voice processing device is wirelessly connected with the first AR device. The voice processing device is used for collecting a voice signal, processing the voice signal to obtain a target text, and sending the target text to the first AR device. The first AR device is used for displaying the target text.
The voice processing device is a device that processes voice, for example by performing voice recognition or translation. It includes a sound receiving unit and a processing unit, which may or may not be integrated in one device and may be connected by wire or wirelessly.
Fig. 1 and 2 show block diagrams of two different AR systems:
referring to fig. 1, an AR system 100 includes a speech processing device, which may be a second AR device 120, and a first AR device 110, when a sound receiving unit and a processing unit are integrated on the second AR device. And the second AR equipment collects and processes the voice signals, and sends the processed target text to the first AR equipment for display.
Referring to fig. 2, an AR system 200 includes a voice processing device and the first AR device 110, and the voice processing device includes a sound receiving unit 121 and a processing unit 122. In this case, the sound receiving unit and the processing unit are not integrated on the same device: the sound receiving unit may be a microphone, and the processing unit may be a processing chip, for example a chip that performs voice processing based on an artificial intelligence algorithm. The microphone collects the voice signal, the processing unit processes the voice signal to obtain a target text, and the target text is sent to the first AR device for display.
If the AR system shown in fig. 2 is used and a plurality of users communicate with each other through a plurality of voice processing devices, a microphone may be placed in front of each user. In that case, the processing unit may be integrated separately on a processing device, and each microphone is connected to that processing device, which processes the voice signal collected by each microphone. Providing an independent processing device in this way saves the cost of deploying a processing unit for each microphone and improves voice processing efficiency.
Referring to fig. 3, the flowchart of a signal processing method according to an embodiment of the present application is shown.
The signal processing method provided by the embodiment is applied to a voice processing device.
The signal processing method provided by the embodiment comprises the following steps:
s101, voice processing equipment collects voice signals.
In the embodiment of the application, the voice signal is collected by the voice processing equipment, and the voice processing equipment can be close to a sound source so as to collect clear voice signals.
There are two implementation scenarios for collecting speech signals:
referring to fig. 1, the voice processing device is a second AR device, the second AR device may be an AR device worn on the head of a speaker so as to clearly collect a voice signal of the speaker, and the processing unit integrated in the second AR device processes the voice signal collected by the sound receiving unit to obtain a target text corresponding to the voice signal.
Referring to fig. 2, the speech processing device includes a microphone and a processing unit, the microphone may be disposed in front of the speaker so as to clearly collect a speech signal of the speaker, and the processing unit processes the speech signal collected by the microphone to obtain a target text corresponding to the speech signal.
In the embodiment of the application, the voice signal is not collected by the first AR device but by the voice processing device, which is closer to the sound source and can therefore collect clear voice. That is, the first AR device that displays the target text does not collect the voice signal; the voice signal is collected by a device located near the sound source or speaker, for example the second AR device worn by the speaker or a microphone placed in front of the speaker. The embodiment of the application thus avoids the problem in the prior art that the collected voice signal may be unclear because the device collecting it is far from the sound source.
And S102, processing the voice signal by the voice processing equipment to obtain a target text.
In an embodiment of the present application, after the speech processing device acquires the speech signal, the speech signal may be processed to obtain a target text corresponding to the speech signal. The target text may be text that conforms to the language habits of the user.
The voice processing device is provided with a processing chip for processing voice signals, and the processing chip performs voice processing based on an artificial intelligence algorithm. A voice processing device with such a processing chip can recognize or translate voice using the processing chip even when there is no wireless network; that is, voice processing is completed locally without relying on artificial intelligence computation on a cloud server. Therefore, voice processing is less affected by the strength of the wireless network signal and is not limited by it.
There are two possible implementations of processing speech signals:
in a first possible implementation manner, when the language corresponding to the speech signal is not the language conforming to the language habit of the user, the collected speech signal may be recognized to obtain a speech text, and then the obtained speech text is translated to obtain a target text conforming to the language habit of the user.
In a second possible implementation manner, when the language corresponding to the voice signal is a language conforming to the language habit of the user, the collected voice signal only needs to be recognized, and the target text conforming to the language habit of the user is obtained directly. That is, if the language habit of the user is the same as the language of the speaker, the voice signal does not need to be translated.
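As a minimal illustration of the two processing paths above, the following Python sketch assumes hypothetical recognize() and translate() helpers standing in for the on-chip speech recognition and machine translation routines; the embodiment does not name any specific model or library.

```python
# Minimal sketch of the two processing paths described above.
# recognize() and translate() are hypothetical placeholders for the
# on-chip speech recognition and machine translation routines.

def recognize(voice_signal: bytes, language: str) -> str:
    """Hypothetical on-device recognition returning the voice text."""
    raise NotImplementedError

def translate(text: str, source_lang: str, target_lang: str) -> str:
    """Hypothetical on-device translation of the voice text."""
    raise NotImplementedError

def process_voice(voice_signal: bytes, speaker_lang: str, user_lang: str) -> str:
    """Return the target text to be displayed on the first AR device."""
    voice_text = recognize(voice_signal, speaker_lang)
    if speaker_lang == user_lang:
        # Second implementation: recognition only, no translation needed.
        return voice_text
    # First implementation: recognize, then translate into the user's language.
    return translate(voice_text, speaker_lang, user_lang)
```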
S103, the voice processing device sends the target text to the first AR device.
In the embodiment of the application, the voice processing device is wirelessly connected with the first AR device, and after the voice processing device obtains the target text corresponding to the voice signal, the obtained target text is sent to the first AR device.
In the first possible implementation manner, after the voice text corresponding to the voice signal is translated, a target text conforming to the language habit of the user is obtained, and the first AR device displays this translated target text. This helps the user better understand what the speaker is saying during a conversation or activity. Because the target text obtained from voice processing is displayed on the first AR device instead of being played back, the first AR device can display the target text while the voice processing device is still collecting and processing voice, the collected voice is not mixed with speech played by a sound playing device as in the prior art, and the user experience of the voice processing device is improved.
In the second possible implementation manner, when the language corresponding to the voice signal is the language used by the user, the first AR device displays the target text in the language used by the user. Although the user may already be able to understand what the speaker says directly, the displayed text reinforces that understanding and improves user experience. In addition, the method can help deaf-mute users better understand what a speaker is saying during a conversation or activity without depending on a sign language interpreter, which improves the user experience of the voice processing device for deaf-mute users.
In practical applications, the speech processing device may send the processed target text to the first AR device through the wireless port. For example, when the speech processing device is a second AR device, the second AR device may send the target text to the first AR device via bluetooth or a wireless local area network.
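A minimal sketch of this sending step follows. The embodiment only states that the target text is sent through a wireless port (for example Bluetooth or a wireless local area network); the TCP socket and the newline-delimited JSON message format used here are assumptions made purely for illustration.

```python
import json
import socket

def send_target_text(text: str, host: str, port: int) -> None:
    """Send the processed target text to the first AR device for display."""
    # The message format is an assumption; the embodiment does not define one.
    message = json.dumps({"type": "target_text", "text": text}).encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(message + b"\n")  # newline-delimited framing

# Example (hypothetical address of the first AR device):
# send_target_text("Nice to meet you.", "192.168.1.20", 9000)
```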
S104, displaying the target text on the first AR device.
In an embodiment of the application, after the first AR device receives the target text, the target text is displayed on the first AR device.
In practical application, if a plurality of users use a plurality of AR devices to communicate with each other, each AR device would need to be wirelessly connected to every other AR device one by one, which may cause network congestion. In this case, an independent network device may be provided and the AR devices connect only to this network device: the second AR device that collects the voice sends the target text to the network device through a wireless port, and the network device then sends the target text to the first AR device through the wireless port. This reduces the network congestion that may occur when multiple AR devices interact together. Because the AR devices communicate through the independent network device, voice collection and processing do not require the support of an external wireless network, which safeguards the privacy and timeliness of the voice data and meets the need for voice processing in environments without an external network.
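The relay role of the independent network device could look roughly like the sketch below. The device-ID registration and the addressing scheme are assumptions for illustration only; the embodiment only requires that each AR device connect to the network device rather than to every other AR device.

```python
import json
import socketserver

# device_id -> active connection handler (shared registry; a production
# relay would need locking or an asynchronous framework)
connections = {}

class RelayHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # The first line sent by a device is assumed to be its identifier.
        device_id = self.rfile.readline().strip().decode("utf-8")
        connections[device_id] = self
        try:
            for line in self.rfile:
                msg = json.loads(line)
                receiver = connections.get(msg.get("to"))
                if receiver is not None:
                    # Forward the target text to the receiving AR device.
                    receiver.wfile.write(line)
        finally:
            connections.pop(device_id, None)

# Example:
# socketserver.ThreadingTCPServer(("0.0.0.0", 9000), RelayHandler).serve_forever()
```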
In the embodiment of the application, when performing voice processing, the voice processing device may store the collected voice signal and the corresponding target text, so that the user can subsequently review the voice signal and the target text.
In practical application, in order to keep the voice processing device portable and able to communicate in real time, its volume cannot be too large, and thus the volume of the processing chip cannot be too large either, so the voice processing capability of the processing chip is limited compared with that of a cloud server. Therefore, a cloud server may also process the voice signal to obtain a corrected text corresponding to the voice signal and send the corrected text to the voice processing device, and the voice processing device replaces the target text with the corrected text. Generally, since the cloud server has stronger computing power than the processing chip in the voice processing device, the corrected text obtained by the cloud server is more accurate than the target text, and this more accurate corrected text replaces the target text previously stored in the voice processing device.
In practical application, after receiving the correction text sent by the cloud server, the voice processing device may also send the correction text to the first AR device, so that a user using the first AR device can obtain the correction text in time.
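A sketch of this correction flow is given below, assuming the device keeps an in-memory store of target texts keyed by a hypothetical utterance identifier; the embodiment does not specify how stored texts are indexed or how the correction is transmitted.

```python
import json
import socket
from typing import Optional, Tuple

# utterance_id -> text currently stored on the voice processing device
stored_texts = {}

def apply_correction(utterance_id: str, corrected_text: str,
                     ar_device_addr: Optional[Tuple[str, int]] = None) -> None:
    """Replace the stored target text with the cloud server's corrected text."""
    stored_texts[utterance_id] = corrected_text
    if ar_device_addr is not None:
        # Optionally forward the corrected text to the first AR device so that
        # its user obtains the higher-accuracy version in time.
        payload = json.dumps({"type": "corrected_text",
                              "text": corrected_text}).encode("utf-8")
        with socket.create_connection(ar_device_addr, timeout=5) as conn:
            conn.sendall(payload + b"\n")
```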
In the embodiment of the application, the cloud server can compare the degree of difference between the target text and the corrected text. When the difference is greater than or equal to a threshold, it indicates that the current voice processing result of the voice processing device deviates from that of the cloud server, and a prompt message can be sent to a worker to remind the worker to check or further process the target text and the corrected text. The prompt message may be an e-mail or a short message, which is not specifically limited in this embodiment.
When specifically comparing the difference between the target text and the corrected text, the words of the target text and the corrected text may first be compared one by one to determine the number of differing words, and the proportion of differing words relative to the total number of words in the target text is then calculated; if the proportion is greater than or equal to the threshold, the voice processing result of the current voice processing device deviates from that of the cloud server.
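The comparison can be sketched as follows. Whitespace tokenization and the example threshold value are assumptions made for illustration; Chinese text would need a word segmenter, and the embodiment does not fix a particular threshold.

```python
def difference_ratio(target_text: str, corrected_text: str) -> float:
    """Proportion of words in the target text that differ from the corrected text."""
    target_words = target_text.split()
    corrected_words = corrected_text.split()
    # Compare word by word; any length mismatch also counts as differing words.
    differing = sum(1 for t, c in zip(target_words, corrected_words) if t != c)
    differing += abs(len(target_words) - len(corrected_words))
    return differing / max(len(target_words), 1)

def processing_deviates(target_text: str, corrected_text: str,
                        threshold: float = 0.2) -> bool:
    """True when the device's result deviates from the cloud server's result."""
    return difference_ratio(target_text, corrected_text) >= threshold
```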
In practical application, when the voice processing result of the voice processing device deviates from that of the cloud server, there may be the following two situations:
In the first situation, the target text has not been modified by the user, so the target text was produced entirely by the voice processing device. If the difference between the target text and the corrected text is large, it indicates that the voice processing effect of the current voice processing device is poor, and the artificial intelligence model of the processing chip in the voice processing device can be retrained to improve the voice processing effect.
In the second situation, the target text has been modified by the user, so the target text is the user-modified text. In this case, if the difference between the target text and the corrected text is large, the user-modified text is taken as the reference, which indicates that the voice processing result of the current cloud server deviates; with the user's authorization, the user-modified text can then be used to retrain the artificial intelligence model and improve the voice processing effect.
The augmented reality (AR) system provided by the embodiment of the present application is shown in fig. 1 and fig. 2, and its working principle is described in detail below with reference to the drawings.
The augmented reality AR system provided by the embodiment of the application comprises a voice processing device and a first AR device, wherein the voice processing device is in wireless connection with the first AR device;
the voice processing device is configured to execute the signal processing method provided in the foregoing embodiment, acquire a voice signal, process the voice signal to obtain a target text, and send the target text to the first AR device.
The first AR device is used for displaying the target text.
Optionally, the processing, by the speech processing device, the speech signal to obtain a target text includes:
the voice processing equipment identifies the voice signal to obtain a voice text;
and the voice processing equipment translates the voice text to obtain a target text.
Optionally, the processing, by the speech processing device, the speech signal to obtain a target text includes:
and the voice processing equipment identifies the voice signal to obtain a target text.
Optionally, the AR system further includes a network device, and the voice processing device and the first AR device are wirelessly connected to the network device respectively;
the sending, by the speech processing device, the target text to the first AR device includes:
the voice processing equipment sends the target text to network equipment;
the network device is configured to send the target text to the first AR device.
Optionally, the system further includes: a cloud server;
the cloud server is used for processing the voice signal to obtain a correction text and sending the correction text to the voice processing equipment;
the voice processing device is further configured to receive the corrected text and replace the target text stored in the voice processing device with the corrected text.
Optionally, the speech processing device is further configured to send the corrected text to the first AR device.
The cloud server is further used for comparing the difference degree between the target text and the corrected text and sending prompt information when the difference degree is larger than or equal to a threshold value.
Optionally, the speech processing device includes a second AR device.
Based on the signal processing method provided by the above embodiment, the embodiment of the present application further provides a signal processing apparatus, and the working principle of the signal processing apparatus is described in detail below with reference to the accompanying drawings.
Referring to fig. 4, the figure is a block diagram of a signal processing apparatus according to an embodiment of the present application.
The signal processing apparatus 400 provided in this embodiment includes:
a collecting unit 410 for collecting voice signals;
the processing unit 420 is configured to process the voice signal to obtain a target text;
a sending unit 430, configured to send the target text to a first augmented reality AR device connected to the speech processing device, so as to display the target text on the first AR device.
Optionally, the processing unit 420 is configured to recognize the voice signal to obtain a voice text; and translating the voice text to obtain a target text.
Optionally, the processing unit 420 is configured to recognize the speech signal to obtain a target text.
Optionally, the sending unit 430 is configured to send the target text to the first AR device through a wireless port.
Optionally, the sending unit 430 is configured to send the target text to a network device through a wireless port, so that the network device sends the target text to the first AR device through the wireless port.
Optionally, the apparatus further comprises a receiving unit, configured to receive a corrected text obtained by processing the voice signal by the cloud server;
the processing unit 420 is further configured to replace the target text stored in the speech processing device with the corrected text.
Optionally, the sending unit is further configured to send the corrected text to the first AR device.
Optionally, the speech processing device comprises a second AR device.
Based on the signal processing method provided by the above embodiment, an embodiment of the present application further provides a signal processing apparatus, shown in fig. 5, where the signal processing apparatus 500 includes:
a processor 510 and a memory 520, the number of which may be one or more. In some embodiments of the present application, the processor and memory may be connected by a bus or other means.
The memory may include both read-only memory and random access memory, and provides instructions and data to the processor. The portion of memory may also include NVRAM. The memory stores an operating system and operating instructions, executable modules or data structures, or subsets thereof, or expanded sets thereof, wherein the operating instructions may include various operating instructions for performing various operations. The operating system may include various system programs for implementing various basic services and for handling hardware-based tasks.
The processor controls the operation of the terminal device and may also be referred to as a CPU.
The method disclosed in the embodiments of the present application may be applied to a processor, or may be implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The processor described above may be a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM or EPROM, or a register. The storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor.
The embodiment of the present application further provides a computer-readable storage medium for storing a program code, where the program code is used to execute any one implementation of the methods of the foregoing embodiments.
When introducing elements of various embodiments of the present application, the articles "a," "an," "the," and "said" are intended to mean that there are one or more of the elements. The terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements other than the listed elements.
It should be noted that, as one of ordinary skill in the art would understand, all or part of the processes of the above method embodiments may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when executed, the computer program may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, and the units and modules described as separate components may or may not be physically separate. In addition, some or all of the units and modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is directed to embodiments of the present application and it is noted that numerous modifications and adaptations may be made by those skilled in the art without departing from the principles of the present application and are intended to be within the scope of the present application.

Claims (19)

1. A signal processing method applied to a voice processing device, comprising:
collecting voice signals;
processing the voice signal to obtain a target text;
and sending the target text to a first augmented reality (AR) device connected with the voice processing device, so as to display the target text on the first AR device.
2. The method of claim 1, wherein the processing the speech signal to obtain the target text comprises:
recognizing the voice signal to obtain a voice text;
and translating the voice text to obtain a target text.
3. The method of claim 1, wherein the processing the speech signal to obtain the target text comprises:
recognizing the voice signal to obtain a target text.
4. The method of claim 1, wherein the sending the target text to the first AR device comprises:
and sending the target text to the first AR device through a wireless port.
5. The method of claim 4, wherein the sending the target text to the first AR device over a wireless port comprises:
and sending the target text to a network device through a wireless port so that the network device sends the target text to the first AR device through the wireless port.
6. The method of claim 1, further comprising:
receiving a correction text obtained by processing the voice signal by a cloud server;
replacing the target text stored at the speech processing device with the corrected text.
7. The method of claim 6, further comprising:
sending the corrected text to the first AR device.
8. The method of any of claims 1-7, wherein the speech processing device comprises a second AR device.
9. A signal processing apparatus, applied to a speech processing device, comprising:
the acquisition unit is used for acquiring voice signals;
the processing unit is used for processing the voice signal to obtain a target text;
a sending unit, configured to send the target text to a first augmented reality (AR) device connected to the speech processing device, so as to display the target text on the first AR device.
10. The apparatus of claim 9,
the processing unit is used for identifying the voice signal to obtain a voice text; and translating the voice text to obtain a target text.
11. The apparatus of claim 9,
and the processing unit is used for identifying the voice signal to obtain a target text.
12. The apparatus of claim 9,
the sending unit is configured to send the target text to the first AR device through a wireless port.
13. The apparatus of claim 12,
the sending unit is configured to send the target text to a network device through a wireless port, so that the network device sends the target text to the first AR device through the wireless port.
14. The apparatus of claim 9, further comprising:
the receiving unit is used for receiving a correction text obtained by processing the voice signal by the cloud server;
the processing unit is further configured to replace the target text stored in the speech processing device with the corrected text.
15. The apparatus of claim 14,
the sending unit is further configured to send the corrected text to the first AR device.
16. The apparatus of any of claims 9-15, wherein the speech processing device comprises a second AR device.
17. A signal processing apparatus, characterized in that the apparatus comprises: a processor and a memory;
the memory to store instructions;
the processor, configured to execute the instructions in the memory, to perform the method of any of claims 1 to 8.
18. An Augmented Reality (AR) system is characterized by comprising a voice processing device and a first AR device, wherein the voice processing device is in wireless connection with the first AR device;
the speech processing device for performing the method of any one of claims 1-8;
the first AR device is used for displaying the target text.
19. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of any one of claims 1-8.
CN202110639381.8A 2021-06-08 2021-06-08 Signal processing method, device and equipment and Augmented Reality (AR) system Pending CN113345440A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110639381.8A CN113345440A (en) 2021-06-08 2021-06-08 Signal processing method, device and equipment and Augmented Reality (AR) system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110639381.8A CN113345440A (en) 2021-06-08 2021-06-08 Signal processing method, device and equipment and Augmented Reality (AR) system

Publications (1)

Publication Number Publication Date
CN113345440A true CN113345440A (en) 2021-09-03

Family

ID=77475395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110639381.8A Pending CN113345440A (en) 2021-06-08 2021-06-08 Signal processing method, device and equipment and Augmented Reality (AR) system

Country Status (1)

Country Link
CN (1) CN113345440A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140081634A1 (en) * 2012-09-18 2014-03-20 Qualcomm Incorporated Leveraging head mounted displays to enable person-to-person interactions
CN111741394A (en) * 2020-06-05 2020-10-02 北京搜狗科技发展有限公司 Data processing method and device and readable medium
CN111709414A (en) * 2020-06-29 2020-09-25 济南浪潮高新科技投资发展有限公司 AR device, character recognition method and device thereof, and computer-readable storage medium
CN111862940A (en) * 2020-07-15 2020-10-30 百度在线网络技术(北京)有限公司 Earphone-based translation method, device, system, equipment and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114550430A (en) * 2022-04-27 2022-05-27 北京亮亮视野科技有限公司 Character reminding method and device based on AR technology

Similar Documents

Publication Publication Date Title
US20220044463A1 (en) Speech-driven animation method and apparatus based on artificial intelligence
US11138903B2 (en) Method, apparatus, device and system for sign language translation
CN109147784B (en) Voice interaction method, device and storage medium
CN109993150B (en) Method and device for identifying age
US9742710B2 (en) Mood information processing method and apparatus
US20140036022A1 (en) Providing a conversational video experience
US20210398527A1 (en) Terminal screen projection control method and terminal
US11631408B2 (en) Method for controlling data, device, electronic equipment and computer storage medium
CN110174942B (en) Eye movement synthesis method and device
CN108903521B (en) Man-machine interaction method applied to intelligent picture frame and intelligent picture frame
CN111107278B (en) Image processing method and device, electronic equipment and readable storage medium
CN106686226A (en) Method and system for playing audio of terminal
CN113345440A (en) Signal processing method, device and equipment and Augmented Reality (AR) system
CN111522524A (en) Presentation control method and device based on conference robot, storage medium and terminal
EP4207195A1 (en) Speech separation method, electronic device, chip and computer-readable storage medium
WO2016206642A1 (en) Method and apparatus for generating control data of robot
CN112634932B (en) Audio signal processing method and device, server and related equipment
CN114333774A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
CN114025235A (en) Video generation method and device, electronic equipment and storage medium
CN109194998A (en) Data transmission method, device, electronic equipment and computer-readable medium
CN112447179A (en) Voice interaction method, device, equipment and computer readable storage medium
CN112820273B (en) Wake-up judging method and device, storage medium and electronic equipment
CN114639392A (en) Audio processing method and device, electronic equipment and storage medium
CN111580766B (en) Information display method and device and information display system
CN114005436A (en) Method, device and storage medium for determining voice endpoint

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20210903)