CN110392273A - Audio-video processing method, apparatus, electronic device, and storage medium
- Publication number
- CN110392273A (application number CN201910641537.9A)
- Authority
- CN
- China
- Prior art keywords
- video
- audio
- electronic equipment
- voice
- amplitude spectrum
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/2187—Live feed
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/475—End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/478—Supplemental services, e.g. displaying phone caller identification, shopping application
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Human Computer Interaction (AREA)
- Television Signal Processing For Recording (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The embodiments of the present disclosure provide an audio-video processing method, apparatus, electronic device, and storage medium. The method is applied to a server and includes: obtaining a dubbing instruction issued by a first electronic device in a virtual space, where the first electronic device is an electronic device with live-streaming permission in the virtual space; determining a preset dubbing type corresponding to the dubbing instruction; determining a to-be-dubbed video; when a dubbing start instruction issued by the first electronic device is obtained, playing, according to the preset dubbing type, a no-voice video corresponding to the to-be-dubbed video; and, while the no-voice video is playing, obtaining dubbed audio corresponding to the no-voice video and sending the dubbed audio to a second electronic device, where the second electronic device is an electronic device with permission to watch the live stream in the virtual space. With this solution, users can interact in the virtual space by dubbing, which diversifies the modes of interaction and improves the user experience.
Description
Technical field
This disclosure relates to the field of computer technology, and in particular to an audio-video processing method, apparatus, electronic device, and storage medium.
Background
Network live streaming has developed rapidly in recent years and has become popular. In the field of network live streaming, a terminal on which a live-streaming application is installed may be called a user terminal, and a user terminal used to watch an anchor's live stream is a viewer terminal.
During a network live stream, the anchor can broadcast in several ways and can also interact with viewers or with other anchors. For example, viewers can chat with the anchor and give gifts, and anchors can co-stream ("connect mic") with one another or hold co-stream battles. At present, however, whether between anchor and viewers or between anchors, the modes of interaction in network live streaming are limited.
Summary of the invention
To overcome the problems in the related art, the embodiments of the present disclosure provide an audio-video processing method, apparatus, electronic device, and storage medium. The specific technical solutions are as follows:
According to a first aspect of the embodiments of the present disclosure, an audio-video processing method is provided, applied to a server. The method includes:
obtaining a dubbing instruction issued by a first electronic device in a virtual space, where the first electronic device is an electronic device with live-streaming permission in the virtual space;
determining a preset dubbing type corresponding to the dubbing instruction;
determining a to-be-dubbed video;
when a dubbing start instruction issued by the first electronic device is obtained, playing, according to the preset dubbing type, a no-voice video corresponding to the to-be-dubbed video; and
while the no-voice video is playing, obtaining dubbed audio corresponding to the no-voice video and sending the dubbed audio to a second electronic device, where the second electronic device is an electronic device with permission to watch the live stream in the virtual space.
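The server-side sequence above can be sketched as a small session object. All class, method, and message names here are hypothetical illustrations — the text only fixes the order of the steps, not any API.

```python
from dataclasses import dataclass, field

@dataclass
class DubSession:
    """Minimal sketch of the server-side dubbing flow (names are illustrative)."""
    viewers: list                 # second electronic devices (viewer terminals)
    dub_type: str = ""            # preset dubbing type resolved from the instruction
    video_id: str = ""            # the to-be-dubbed video
    sent: list = field(default_factory=list)

    def on_dub_instruction(self, instruction: dict) -> None:
        # Steps 1-3: receive the dubbing instruction, resolve its preset
        # type, and determine the to-be-dubbed video.
        self.dub_type = instruction["type"]
        self.video_id = instruction["video_id"]

    def on_start(self) -> str:
        # Step 4: on the start instruction, play the matching no-voice video
        # according to the preset dubbing type.
        return f"play:{self.video_id}:no-voice:{self.dub_type}"

    def on_dub_audio(self, chunk: bytes) -> None:
        # Step 5: forward the dubbed audio to every viewer device.
        for viewer in self.viewers:
            self.sent.append((viewer, chunk))

s = DubSession(viewers=["viewer-1", "viewer-2"])
s.on_dub_instruction({"type": "anchor-show", "video_id": "clip42"})
print(s.on_start())   # play:clip42:no-voice:anchor-show
s.on_dub_audio(b"pcm-chunk")
print(len(s.sent))    # 2 (one forwarded copy per viewer)
```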
In one implementation, the preset dubbing type is an anchor show type; and the step of playing, according to the preset dubbing type, the no-voice video corresponding to the to-be-dubbed video includes:
controlling the first electronic device and the second electronic device to play the no-voice video corresponding to the to-be-dubbed video simultaneously.
In one implementation, the preset dubbing type is a multi-anchor battle type; and the step of playing, according to the preset dubbing type, the no-voice video corresponding to the to-be-dubbed video includes:
determining a battle order for the first electronic device corresponding to each anchor; and
according to the battle order, controlling each first electronic device and its corresponding second electronic devices to play the no-voice video corresponding to the to-be-dubbed video in turn.
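The turn-taking in battle mode amounts to ordering the anchors' devices by their battle positions. A minimal sketch, with hypothetical names (the text does not specify how the order is represented):

```python
def playback_order(battle_positions: dict) -> list:
    """Map each anchor's device id to its battle position and return the
    device ids in the order they take their dubbing turn."""
    ranked = sorted(battle_positions.items(), key=lambda kv: kv[1])
    return [device for device, _position in ranked]

order = playback_order({"anchor-B": 2, "anchor-A": 1, "anchor-C": 3})
print(order)  # ['anchor-A', 'anchor-B', 'anchor-C']
```

Each device in the returned order would then play the no-voice video together with its own viewers before the next anchor's turn begins.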
In one implementation, the preset dubbing type is a multi-person dubbing type; and the step of playing, according to the preset dubbing type, the no-voice video corresponding to the to-be-dubbed video includes:
controlling each second electronic device corresponding to a user in an instant-messaging region of the virtual space to play the no-voice video corresponding to the to-be-dubbed video simultaneously.
In one implementation, the step of controlling each second electronic device corresponding to a user in the instant-messaging region of the virtual space to play the no-voice video simultaneously includes:
when a broadcast message sent by the first electronic device is obtained, sending the to-be-dubbed video and a start instruction to each second electronic device corresponding to a user in the instant-messaging region of the virtual space, so that each second electronic device, upon receiving the start instruction, plays the no-voice video corresponding to the to-be-dubbed video at the same time.
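The broadcast step above can be sketched as fanning out one (video + start-instruction) message per viewer device in the instant-messaging region, so that playback begins on all of them together. Field names are hypothetical:

```python
def broadcast_start(region_devices: list, video_id: str) -> list:
    """Build one start message per viewer device in the instant-messaging
    region; each device begins playing the no-voice video on receipt."""
    return [{"to": device, "video": video_id, "cmd": "start"}
            for device in region_devices]

messages = broadcast_start(["viewer-1", "viewer-2"], "clip42")
print(len(messages))       # 2
print(messages[0]["cmd"])  # start
```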
In one implementation, the step of determining the to-be-dubbed video includes:
obtaining a video uploaded by the first electronic device; and
determining the uploaded video as the to-be-dubbed video.
In one implementation, the no-voice video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
inputting the amplitude spectrum into a network model trained in advance to obtain a voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding voice mask matrices and encodes the correspondence between amplitude spectra and voice mask matrices;
computing a no-voice amplitude spectrum from the voice mask matrix and the amplitude spectrum; and
determining, based on the no-voice amplitude spectrum, the no-voice video corresponding to the to-be-dubbed video.
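The mask step can be sketched on a toy magnitude spectrogram. The model's output is stubbed with a hand-written mask, and combining mask and spectrum as `(1 - mask) * magnitude` is an assumption — the text only states that the no-voice amplitude spectrum is computed from the voice mask matrix and the amplitude spectrum.

```python
import numpy as np

def no_voice_magnitude(mixture_mag: np.ndarray, voice_mask: np.ndarray) -> np.ndarray:
    """Keep, in every time-frequency bin, only the energy that the
    (model-predicted) voice mask does not attribute to speech."""
    return (1.0 - voice_mask) * mixture_mag

mag = np.array([[1.0, 2.0],    # frequency bin 0 over two frames
                [3.0, 4.0]])   # frequency bin 1
mask = np.array([[0.0, 0.0],   # bin 0: no voice detected
                 [1.0, 1.0]])  # bin 1: attributed entirely to voice
acc = no_voice_magnitude(mag, mask)
print(acc)  # bin 0 untouched, bin 1 zeroed out
```

In a full pipeline the resulting no-voice amplitude spectrum would be inverted back to audio and remuxed with the video frames.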
In one implementation, the no-voice video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
inputting the amplitude spectrum into a network model trained in advance to obtain no-voice audio corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding no-voice audio and encodes the correspondence between amplitude spectra and no-voice audio; and
determining, based on the no-voice audio, the no-voice video corresponding to the to-be-dubbed video.
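When a model predicts audio from an amplitude spectrum, some phase information must be supplied to get back to the time domain. A common choice, assumed here (the text does not specify one), is to reuse the mixture's phase; this single-frame sketch illustrates that step.

```python
import numpy as np

def frame_from_magnitude(mixture_frame: np.ndarray, predicted_mag: np.ndarray) -> np.ndarray:
    """Combine a predicted magnitude spectrum with the mixture's phase and
    invert back to a time-domain frame."""
    phase = np.angle(np.fft.rfft(mixture_frame))
    return np.fft.irfft(predicted_mag * np.exp(1j * phase), n=len(mixture_frame))

t = np.arange(64)
frame = np.sin(2 * np.pi * t / 16.0)
# Sanity check: predicting the mixture's own magnitude reproduces the frame.
recon = frame_from_magnitude(frame, np.abs(np.fft.rfft(frame)))
print(np.allclose(recon, frame))  # True
```

A real system would apply this per windowed frame with overlap-add; the sketch shows only the magnitude-plus-phase inversion.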
According to a second aspect of the embodiments of the present disclosure, an audio-video processing method is provided, applied to a first electronic device, where the first electronic device is an electronic device with live-streaming permission in a virtual space. The method includes:
obtaining a dubbing instruction in the virtual space;
determining a preset dubbing type corresponding to the dubbing instruction;
determining a to-be-dubbed video;
when a dubbing start instruction is obtained, playing, according to the preset dubbing type, a no-voice video corresponding to the to-be-dubbed video; and
while the no-voice video is playing, obtaining dubbed audio corresponding to the no-voice video and sending the dubbed audio to a server.
In one implementation, the preset dubbing type is an anchor show type; and the step of playing, according to the preset dubbing type, the no-voice video corresponding to the to-be-dubbed video includes:
playing the no-voice video corresponding to the to-be-dubbed video, and controlling a second electronic device to play the same no-voice video simultaneously, where the second electronic device is an electronic device with permission to watch the live stream in the virtual space.
In one implementation, the preset dubbing type is a multi-anchor battle type; and the step of playing, according to the preset dubbing type, the no-voice video corresponding to the to-be-dubbed video includes:
determining a battle order for the first electronic device corresponding to each anchor; and
according to the battle order, controlling each first electronic device and its corresponding second electronic devices to play the no-voice video corresponding to the to-be-dubbed video in turn.
In one implementation, the preset dubbing type is a multi-person dubbing type; and the step of playing, according to the preset dubbing type, the no-voice video corresponding to the to-be-dubbed video includes:
controlling each second electronic device corresponding to a user in an instant-messaging region of the virtual space to play the no-voice video corresponding to the to-be-dubbed video simultaneously.
In one implementation, the step of controlling each second electronic device corresponding to a user in the instant-messaging region of the virtual space to play the no-voice video simultaneously includes:
sending a broadcast message to the server, so that the server sends the to-be-dubbed video and a start instruction to each second electronic device corresponding to a user in the instant-messaging region of the virtual space, and each second electronic device, upon receiving the start instruction, plays the no-voice video corresponding to the to-be-dubbed video at the same time.
In one implementation, the step of determining the to-be-dubbed video includes:
obtaining a video uploaded by a user; and
determining the uploaded video as the to-be-dubbed video.
In one implementation, the no-voice video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
inputting the amplitude spectrum into a network model trained in advance to obtain a voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding voice mask matrices and encodes the correspondence between amplitude spectra and voice mask matrices;
computing a no-voice amplitude spectrum from the voice mask matrix and the amplitude spectrum; and
determining, based on the no-voice amplitude spectrum, the no-voice video corresponding to the to-be-dubbed video.
In one implementation, the no-voice video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
inputting the amplitude spectrum into a network model trained in advance to obtain no-voice audio corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding no-voice audio and encodes the correspondence between amplitude spectra and no-voice audio; and
determining, based on the no-voice audio, the no-voice video corresponding to the to-be-dubbed video.
According to a third aspect of the embodiments of the present disclosure, an audio-video processing method is provided, applied to a second electronic device, where the second electronic device is an electronic device with permission to watch the live stream in the virtual space. The method includes:
when a dubbing start instruction is obtained in the virtual space, playing a pre-obtained no-voice video corresponding to a to-be-dubbed video; and
while the no-voice video is playing, when dubbed audio corresponding to the no-voice video is obtained, playing the dubbed audio.
In one implementation, the step of playing, when the dubbing start instruction is obtained in the virtual space, the pre-obtained no-voice video corresponding to the to-be-dubbed video includes:
upon receiving the to-be-dubbed video and a start instruction sent by the server in the virtual space, playing the no-voice video corresponding to the received to-be-dubbed video.
In one implementation, the no-voice video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
inputting the amplitude spectrum into a network model trained in advance to obtain a voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding voice mask matrices and encodes the correspondence between amplitude spectra and voice mask matrices;
computing a no-voice amplitude spectrum from the voice mask matrix and the amplitude spectrum; and
determining, based on the no-voice amplitude spectrum, the no-voice video corresponding to the to-be-dubbed video.
In one implementation, the no-voice video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
inputting the amplitude spectrum into a network model trained in advance to obtain no-voice audio corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding no-voice audio and encodes the correspondence between amplitude spectra and no-voice audio; and
determining, based on the no-voice audio, the no-voice video corresponding to the to-be-dubbed video.
According to a fourth aspect of the embodiments of the present disclosure, an audio-video processing apparatus is provided, applied to a server. The apparatus includes:
a first dubbing-instruction obtaining module, configured to obtain a dubbing instruction issued by a first electronic device in a virtual space, where the first electronic device is an electronic device with live-streaming permission in the virtual space;
a first preset-dubbing-type determining module, configured to determine a preset dubbing type corresponding to the dubbing instruction;
a first to-be-dubbed-video determining module, configured to determine a to-be-dubbed video;
a first no-voice-video playing module, configured to play, when a dubbing start instruction issued by the first electronic device is obtained, a no-voice video corresponding to the to-be-dubbed video according to the preset dubbing type; and
a first dubbed-audio sending module, configured to obtain, while the no-voice video is playing, dubbed audio corresponding to the no-voice video and send the dubbed audio to a second electronic device, where the second electronic device is an electronic device with permission to watch the live stream in the virtual space.
In one implementation, the preset dubbing type is an anchor show type; and the first no-voice-video playing module includes:
a first no-voice-video playing submodule, configured to control the first electronic device and the second electronic device to play the no-voice video corresponding to the to-be-dubbed video simultaneously.
In one implementation, the preset dubbing type is a multi-anchor battle type; and the first no-voice-video playing module includes:
a battle-order determining submodule, configured to determine a battle order for the first electronic device corresponding to each anchor; and
a second no-voice-video playing submodule, configured to control, according to the battle order, each first electronic device and its corresponding second electronic devices to play the no-voice video corresponding to the to-be-dubbed video in turn.
In one implementation, the preset dubbing type is a multi-person dubbing type; and the first no-voice-video playing module includes:
a third no-voice-video playing submodule, configured to control each second electronic device corresponding to a user in an instant-messaging region of the virtual space to play the no-voice video corresponding to the to-be-dubbed video simultaneously.
In one implementation, the third no-voice-video playing submodule includes:
a first no-voice-video playing unit, configured to send, when a broadcast message sent by the first electronic device is obtained, the to-be-dubbed video and a start instruction to each second electronic device corresponding to a user in the instant-messaging region of the virtual space, so that each second electronic device, upon receiving the start instruction, plays the no-voice video corresponding to the to-be-dubbed video at the same time.
In one implementation, the first to-be-dubbed-video determining module includes:
a first video obtaining submodule, configured to obtain a video uploaded by the first electronic device; and
a first to-be-dubbed-video determining submodule, configured to determine the uploaded video as the to-be-dubbed video.
In one implementation, the audio-video processing apparatus further includes a first no-voice-video determining module, which includes:
a first amplitude-spectrum determining submodule, configured to determine an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
a first voice-mask-matrix determining submodule, configured to input the amplitude spectrum into a network model trained in advance to obtain a voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding voice mask matrices and encodes the correspondence between amplitude spectra and voice mask matrices;
a first no-voice-amplitude-spectrum determining submodule, configured to compute a no-voice amplitude spectrum from the voice mask matrix and the amplitude spectrum; and
a first no-voice-video determining submodule, configured to determine, based on the no-voice amplitude spectrum, the no-voice video corresponding to the to-be-dubbed video.
In one implementation, the audio-video processing apparatus further includes a second no-voice-video determining module, which includes:
a second amplitude-spectrum determining submodule, configured to determine an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
a first no-voice-audio determining submodule, configured to input the amplitude spectrum into a network model trained in advance to obtain no-voice audio corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding no-voice audio and encodes the correspondence between amplitude spectra and no-voice audio; and
a second no-voice-video determining submodule, configured to determine, based on the no-voice audio, the no-voice video corresponding to the to-be-dubbed video.
According to a fifth aspect of the embodiments of the present disclosure, an audio-video processing apparatus is provided, applied to a first electronic device, where the first electronic device is an electronic device with live-streaming permission in a virtual space. The apparatus includes:
a second dubbing-instruction obtaining module, configured to obtain a dubbing instruction in the virtual space;
a second preset-dubbing-type determining module, configured to determine a preset dubbing type corresponding to the dubbing instruction;
a second to-be-dubbed-video determining module, configured to determine a to-be-dubbed video;
a second no-voice-video playing module, configured to play, when a dubbing start instruction is obtained, a no-voice video corresponding to the to-be-dubbed video according to the preset dubbing type; and
a second dubbed-audio sending module, configured to obtain, while the no-voice video is playing, dubbed audio corresponding to the no-voice video and send the dubbed audio to a server.
In one implementation, the preset dubbing type is an anchor show type; and the second no-voice-video playing module includes:
a fourth no-voice-video playing submodule, configured to play the no-voice video corresponding to the to-be-dubbed video and control the second electronic device to play the same no-voice video simultaneously, where the second electronic device is an electronic device with permission to watch the live stream in the virtual space.
In one implementation, the preset dubbing type is a multi-anchor battle type; and the second no-voice-video playing module includes:
a battle-order determining submodule, configured to determine a battle order for the first electronic device corresponding to each anchor; and
a fifth no-voice-video playing submodule, configured to control, according to the battle order, each first electronic device and its corresponding second electronic devices to play the no-voice video corresponding to the to-be-dubbed video in turn.
In one implementation, the preset dubbing type is a multi-person dubbing type; and the second no-voice-video playing module includes:
a sixth no-voice-video playing submodule, configured to control each second electronic device corresponding to a user in an instant-messaging region of the virtual space to play the no-voice video corresponding to the to-be-dubbed video simultaneously.
In one implementation, the sixth no-voice-video playing submodule includes:
a second no-voice-video playing unit, configured to send a broadcast message to the server, so that the server sends the to-be-dubbed video and a start instruction to each second electronic device corresponding to a user in the instant-messaging region of the virtual space, and each second electronic device, upon receiving the start instruction, plays the no-voice video corresponding to the to-be-dubbed video at the same time.
In one implementation, the second to-be-dubbed-video determining module includes:
a second video obtaining submodule, configured to obtain a video uploaded by a user; and
a second to-be-dubbed-video determining submodule, configured to determine the uploaded video as the to-be-dubbed video.
In one implementation, the audio-video processing apparatus further includes a third no-voice-video determining module, which includes:
a third amplitude-spectrum determining submodule, configured to determine an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
a second voice-mask-matrix determining submodule, configured to input the amplitude spectrum into a network model trained in advance to obtain a voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding voice mask matrices and encodes the correspondence between amplitude spectra and voice mask matrices;
a second no-voice-amplitude-spectrum determining submodule, configured to compute a no-voice amplitude spectrum from the voice mask matrix and the amplitude spectrum; and
a third no-voice-video determining submodule, configured to determine, based on the no-voice amplitude spectrum, the no-voice video corresponding to the to-be-dubbed video.
In one implementation, the audio-video processing apparatus further includes a fourth no-voice-video determining module, which includes:
a fourth amplitude-spectrum determining submodule, configured to determine an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
a second no-voice-audio determining submodule, configured to input the amplitude spectrum into a network model trained in advance to obtain no-voice audio corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding no-voice audio and encodes the correspondence between amplitude spectra and no-voice audio; and
a fourth no-voice-video determining submodule, configured to determine, based on the no-voice audio, the no-voice video corresponding to the to-be-dubbed video.
According to a sixth aspect of the embodiments of the present disclosure, an audio-video processing apparatus is provided, applied to a second electronic device, where the second electronic device is an electronic device with permission to watch the live stream in the virtual space. The apparatus includes:
a third no-voice-video playing module, configured to play, when a dubbing start instruction is obtained in the virtual space, a pre-obtained no-voice video corresponding to a to-be-dubbed video; and
a dubbed-audio playing module, configured to play, while the no-voice video is playing, dubbed audio corresponding to the no-voice video when that dubbed audio is obtained.
In one implementation, the third no-voice-video playing module includes:
a seventh no-voice-video playing submodule, configured to play, upon receiving the to-be-dubbed video and a start instruction sent by the server in the virtual space, the no-voice video corresponding to the received to-be-dubbed video.
In one implementation, the audio-video processing apparatus further includes a fifth no-voice-video determining module, which includes:
a fifth amplitude-spectrum determining submodule, configured to determine an amplitude spectrum corresponding to the audio signal of the to-be-dubbed video;
a third voice-mask-matrix determining submodule, configured to input the amplitude spectrum into a network model trained in advance to obtain a voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude-spectrum samples and their corresponding voice mask matrices and encodes the correspondence between amplitude spectra and voice mask matrices;
a third no-voice-amplitude-spectrum determining submodule, configured to compute a no-voice amplitude spectrum from the voice mask matrix and the amplitude spectrum; and
a fifth no-voice-video determining submodule, configured to determine, based on the no-voice amplitude spectrum, the no-voice video corresponding to the to-be-dubbed video.
In one implementation, the audio/video processing apparatus further includes a sixth voice-free-video determining module, which includes:
a sixth amplitude-spectrum determining submodule, configured to determine the amplitude spectrum corresponding to the audio signal of the video to be dubbed;
a third voice-free-audio determining submodule, configured to input the amplitude spectrum into a pre-trained network model to obtain the voice-free audio corresponding to the video to be dubbed, wherein the network model is trained based on amplitude spectrum samples acquired in advance and their corresponding voice-free audio, and includes a correspondence between amplitude spectra and voice-free audio;
a sixth voice-free-video determining submodule, configured to determine, based on the voice-free audio, the voice-free video corresponding to the video to be dubbed.
According to a seventh aspect of the embodiments of the present disclosure, a server is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the audio/video processing method described in the first aspect above.
According to an eighth aspect of the embodiments of the present disclosure, an electronic device is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the audio/video processing method described in the second or third aspect above.
According to a ninth aspect of the embodiments of the present disclosure, a storage medium is provided. When instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the audio/video processing method described in any of the above aspects.
In the solutions provided by the embodiments of the present disclosure, the server may acquire a dubbing instruction issued by a first electronic device in the virtual space, determine a preset dubbing type corresponding to the dubbing instruction, and then determine a video to be dubbed. Upon acquiring a dubbing start instruction issued by the first electronic device, the server plays, according to the preset dubbing type, the voice-free video corresponding to the video to be dubbed; during playback of the voice-free video, the server acquires the dubbed audio corresponding to the voice-free video and sends the dubbed audio to the second electronic devices. The first electronic device is an electronic device with streaming permission in the virtual space, and a second electronic device is an electronic device with permission to watch the live stream in the virtual space. With this solution, users can interact in the virtual space by dubbing, which enriches the ways of interaction and improves user experience. It should be understood that the above general description and the following detailed description are merely exemplary and explanatory and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure; they do not constitute an improper limitation of the present disclosure.
Fig. 1 is a flowchart of a first audio/video processing method according to an exemplary embodiment;
Fig. 2 is a schematic diagram of a dubbing button according to an exemplary embodiment;
Fig. 3 is a first flowchart of step S104 of the embodiment shown in Fig. 1 according to an exemplary embodiment;
Fig. 4 is a first flowchart of a manner of acquiring a voice-free video according to an exemplary embodiment;
Fig. 5 is a second flowchart of a manner of acquiring a voice-free video according to an exemplary embodiment;
Fig. 6 is a flowchart of a second audio/video processing method according to an exemplary embodiment;
Fig. 7 is a flowchart of a third audio/video processing method according to an exemplary embodiment;
Fig. 8 is a structural block diagram of a first audio/video processing apparatus according to an exemplary embodiment;
Fig. 9 is a structural block diagram of a second audio/video processing apparatus according to an exemplary embodiment;
Fig. 10 is a structural block diagram of a third audio/video processing apparatus according to an exemplary embodiment;
Fig. 11 is a structural block diagram of an electronic device according to an exemplary embodiment.
Detailed Description
To help those of ordinary skill in the art better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described below clearly and completely with reference to the accompanying drawings.
It should be noted that the terms "first", "second", and the like in the specification, claims, and accompanying drawings of the present disclosure are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present disclosure described herein can be implemented in orders other than those illustrated or described herein. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as recited in the appended claims.
To enrich the ways of interaction in a virtual space and improve user experience, the embodiments of the present disclosure provide an audio/video processing method, apparatus, server, electronic device, and computer-readable storage medium.
The first audio/video processing method provided by the embodiments of the present disclosure is introduced first. It can be applied to a server of a live-streaming application.
As shown in Fig. 1, an audio/video processing method applied to a server includes the following steps.
In step S101, acquiring a dubbing instruction issued by a first electronic device in a virtual space, wherein the first electronic device is an electronic device with streaming permission in the virtual space.
In step S102, determining a preset dubbing type corresponding to the dubbing instruction.
In step S103, determining a video to be dubbed.
In step S104, upon acquiring a dubbing start instruction issued by the first electronic device, playing, according to the preset dubbing type, the voice-free video corresponding to the video to be dubbed.
In step S105, during playback of the voice-free video, acquiring the dubbed audio corresponding to the voice-free video, and sending the dubbed audio to a second electronic device, wherein the second electronic device is an electronic device with permission to watch the live stream in the virtual space.
As it can be seen that in scheme provided by the embodiment of the present disclosure, the first electronic equipment in the available Virtual Space of server
What is issued dubs instruction, determine dub instruction it is corresponding it is default dub type, then determine wait match audio-video, and then in acquisition the
When dubbing sign on of one electronic equipment sending plays according to default type of dubbing to regard with the corresponding no voice of audio-video
Frequently, it during playing without voice video, obtains and dubs audio without voice video is corresponding, while audio will be dubbed and be sent to the
Two electronic equipments.Wherein, the first electronic equipment is with the electronic equipment that permission is broadcast live in Virtual Space, the second electronic equipment
For the electronic equipment with the viewing live streaming permission in Virtual Space.Use this programme user can be in Virtual Space to dub
Mode interacts, and increases the diversity of interaction mode, and user experience is improved.
The first electronic device is the electronic device with streaming permission in the virtual space; a streamer can use it to stream. While streaming, the streamer may interact with viewers or other streamers by dubbing, in which case the streamer issues a dubbing instruction. For ease of operation, a user interface may be provided in the live-streaming interface of the first electronic device. For example, as shown in Fig. 2, the live-streaming interface of the first electronic device may display a "Play Dubbing" button 201, and the streamer can click the button 201 to issue the dubbing instruction.
Accordingly, in step S101, the server can acquire the dubbing instruction issued by the first electronic device in the virtual space, which indicates that the streamer wants to interact with viewers or other streamers by dubbing. Since the virtual space may support multiple dubbing modes, the server then determines the preset dubbing type corresponding to the acquired dubbing instruction, that is, executes step S102.
In one embodiment, different user interfaces may be provided in the live-streaming interface of the first electronic device, each corresponding to a different preset dubbing type. The preset dubbing type corresponding to a dubbing instruction can then be determined from the user interface through which the instruction was issued.
The preset dubbing types can be set according to user needs. For example, a type may be a one-person dubbing performance by the streamer, a dubbing battle among multiple streamers, or a dubbing completed jointly by the streamer and viewers; no specific limitation is imposed here.
After acquiring the dubbing instruction issued by the first electronic device in the virtual space, the server can execute step S103, that is, determine the video to be dubbed. To help the user select a suitable video as the video to be dubbed, a video selection panel may be displayed in the live-streaming interface of the first electronic device, which may include videos downloaded by the streamer, videos popular on the network, videos recommended as suitable for the user, and so on; no specific limitation is imposed here. The streamer can select one of the videos, and the server then determines that video as the video to be dubbed.
To help the user become familiar with the content of the video to be dubbed and thus achieve a better dubbing effect, the first electronic device may play the video to be dubbed for the streamer to watch; meanwhile, the server may control each second electronic device to play it simultaneously for the viewers. A second electronic device is an electronic device with permission to watch the live stream in the virtual space; viewers can use it to watch the streamer's live stream.
Next, upon acquiring the dubbing start instruction issued by the first electronic device, which indicates that the user wants to start dubbing, the server can play the voice-free video corresponding to the video to be dubbed according to the preset dubbing type. For ease of operation, a corresponding user interface may be provided in the live-streaming interface of the first electronic device; for example, it may display a "Start Dubbing" button, which the streamer clicks to issue the dubbing start instruction.
Upon acquiring the dubbing start instruction issued by the first electronic device, the server can, according to the preset dubbing type, control the first electronic device, the second electronic devices, and the first electronic devices used by other streamers to start playing the voice-free video corresponding to the video to be dubbed. The voice-free video is the video with the voice removed and only the background music retained.
In one implementation, the voice-free video may be pre-stored on the server or locally on the electronic device used by each user. When the voice-free video is stored on the server, the server can send it to the electronic device used by each user so that each device plays it. In another implementation, after the video to be dubbed is determined, the server can process it to obtain the corresponding voice-free video for later use. Both are reasonable.
During playback of the voice-free video, the server can acquire the dubbed audio corresponding to the voice-free video and send it to the second electronic devices, that is, execute step S105, so that viewers can watch the dubbing performance. During playback of the voice-free video, the streamer and/or viewers and/or other streamers can produce audio signals to dub the roles in the video; the corresponding user-side electronic devices collect these audio signals, that is, the dubbed audio, and send it to the server.
The server can then receive the dubbed audio sent by each user-side electronic device and forward it to each second electronic device. Since each second electronic device is playing the voice-free video corresponding to the video to be dubbed, the dubbed audio is played together with the voice-free video, and viewers can watch the dubbing performance.
As one implementation of the embodiments of the present disclosure, the preset dubbing type may be a streamer show type; that is, only the streamer dubs, and viewers watch the streamer's dubbing performance.
For the case where the preset dubbing type is the streamer show type, the step of playing the voice-free video corresponding to the video to be dubbed according to the preset dubbing type may include:
controlling the first electronic device and its corresponding second electronic devices to simultaneously play the voice-free video corresponding to the video to be dubbed.
Since in this case the streamer dubs and the viewers watch the performance, the server can control the first electronic device and its corresponding second electronic devices to simultaneously play the voice-free video corresponding to the video to be dubbed. In this way, while the streamer dubs, the server sends the dubbed audio to each second electronic device, and each second electronic device plays the dubbed audio while playing the voice-free video, so viewers can watch the streamer's dubbing performance.
As can be seen, in this embodiment the streamer can give a dubbing performance to interact with viewers, which can enhance the interactivity and fun of the virtual space and improve user experience.
As one implementation of the embodiments of the present disclosure, the preset dubbing type may be a multi-streamer battle type; that is, multiple streamers each dub the video to be dubbed, and viewers can watch the dubbing battle among them.
In one embodiment, a "Play Dubbing" button may be displayed in a secondary menu of the talent-battle feature in the live-streaming interface of the first electronic device; when a streamer clicks this button, it can be determined that the streamer wants to take part in a multi-streamer dubbing battle.
The server can match the first electronic devices of the streamers who currently choose the multi-streamer dubbing battle, taking them as the first electronic devices participating in the battle. The video to be dubbed may be selected by any one of the streamers, or determined by other rules; for example, it may be selected by the streamer with the fewest viewers in the virtual space, to increase that streamer's popularity, and so on.
For the case where the preset dubbing type is the multi-streamer battle type, as shown in Fig. 3, the step of playing the voice-free video corresponding to the video to be dubbed according to the preset dubbing type may include:
S301: determining the battle order corresponding to the first electronic device of each streamer.
Since multiple streamers take part in the dubbing battle, and each streamer needs to dub in turn so that viewers experience the battle, the server can determine the battle order corresponding to the first electronic device of each streamer.
In one embodiment, the server can randomly determine the battle order of the first electronic devices and inform each first electronic device of its order. In another embodiment, the battle order may be determined by one of the streamers. In yet another embodiment, the streamers may agree on the battle order by co-streaming (mic-linking). All of these are reasonable.
S302: controlling, according to the battle order, the first electronic devices and their corresponding second electronic devices to play the voice-free video corresponding to the video to be dubbed in turn.
After the battle order is determined, the streamers can start the dubbing battle; that is, starting from the first streamer and ending with the last, each dubs the video to be dubbed in battle order. During this process, the server can control each first electronic device and its corresponding second electronic devices to play the voice-free video corresponding to the video to be dubbed in order; the streamers dub, and viewers can watch each streamer's dubbing performance.
When each streamer dubs, the corresponding first electronic device can collect the voice signal produced by the streamer and send it to the server; the server can send it, as the dubbed audio, to the other first electronic devices and to the second electronic devices corresponding to all the first electronic devices, so that every streamer and viewer can watch the dubbing battle performance.
As can be seen, in this embodiment multiple streamers can hold a dubbing battle performance to interact with other streamers and viewers, which can further enhance the interactivity and fun of the virtual space and further improve user experience.
As one implementation of the embodiments of the present disclosure, the preset dubbing type may be a multi-person dubbing type; that is, the streamer and viewers each dub different roles in the video to be dubbed and complete the dubbing together. The viewers here are generally users in the instant messaging area of the virtual space, for example, users in the chat room of the live-streaming room.
In this case, the video to be dubbed may be selected by the streamer according to the number of participants, or recommended by the server according to the number of users in the instant messaging area of the virtual space; both are reasonable, and no specific limitation is imposed here. For convenience, the streamer and the participating users can agree on the role assignment in the instant messaging area.
For the case where the preset dubbing type is the multi-person dubbing type, the step of playing the voice-free video corresponding to the video to be dubbed according to the preset dubbing type may include:
controlling the second electronic devices corresponding to the users in the instant messaging area of the virtual space to simultaneously play the voice-free video corresponding to the video to be dubbed.
To ensure that the streamer and the users in the instant messaging area can smoothly complete the dubbing of the video to be dubbed, the first electronic device and the second electronic devices corresponding to the users in the instant messaging area need to simultaneously play the voice-free video corresponding to the video to be dubbed; in this way, the streamer and each user can smoothly complete the dubbing interaction.
As can be seen, in this embodiment the streamer and the users in the instant messaging area can cooperate to complete the dubbing performance, so the interaction between the streamer and viewers is stronger and the viewers' sense of participation is enhanced, which can further enhance the interactivity and fun of the virtual space and further improve user experience.
As one implementation of the embodiments of the present disclosure, the step of controlling the second electronic devices corresponding to the users in the instant messaging area of the virtual space to simultaneously play the voice-free video corresponding to the video to be dubbed may include:
upon acquiring a broadcast message sent by the first electronic device, sending the video to be dubbed and a start instruction to the second electronic devices corresponding to the users in the instant messaging area of the virtual space, so that each second electronic device, upon receiving the start instruction, simultaneously plays the voice-free video corresponding to the video to be dubbed.
To ensure that the first electronic device and the second electronic devices corresponding to the participating users can play the voice-free video simultaneously, the first electronic device can issue the dubbing start instruction by sending a broadcast message; upon acquiring the broadcast message sent by the first electronic device, the server sends the video to be dubbed and the start instruction to the second electronic devices corresponding to the users in the instant messaging area of the virtual space.
In this way, each second electronic device starts playing the voice-free video corresponding to the video to be dubbed only upon receiving the start instruction, ensuring that all second electronic devices start playing the voice-free video at the same moment.
During dubbing, to ensure that the dubbed audio played by each user side is synchronized, in one embodiment voice can be transmitted by real-time co-streaming (mic-linking). For example, the voice signal can be collected at 20-millisecond intervals, and the coded data transmitted in UDP (User Datagram Protocol) packets, with network packet loss handled by FEC (Forward Error Correction). After receiving the packets, the receiving end can reorder them by sequence number and recover lost packets by PLC (packet loss concealment). In this way, packets from the sending end can be delivered to the receiving end within 400 milliseconds, which also ensures that each user side plays the dubbed audio simultaneously.
As it can be seen that in the present embodiment, server can when obtaining the broadcast message that the first electronic equipment is sent, send to
With audio-video and sign on into Virtual Space corresponding each second electronic equipment of user in instant messaging region so that each
Two electronic equipments are played when receiving sign on to guarantee to be played simultaneously with the corresponding no voice video of audio-video
Without voice video, it is ensured that dubbing can go on smoothly.
As one implementation of the embodiments of the present disclosure, the step of determining the video to be dubbed may include:
acquiring a video uploaded by the first electronic device, and determining the uploaded video as the video to be dubbed.
When determining the video to be dubbed, the streamer can select a favorite video and upload it to the server through the first electronic device; the server can acquire the video uploaded by the first electronic device and, in turn, determine it as the video to be dubbed.
The server can also perform subtitle recognition on the video uploaded by the first electronic device, obtain a recognition result, and add the result to the uploaded video so that each user can read the subtitles while dubbing. The specific manner of subtitle recognition is not limited in the embodiments of the present disclosure, as long as the subtitles of the video can be recognized. The first electronic device can also store the video selected by the streamer locally, and the uploaded video can be reused in each live stream.
As can be seen, in this embodiment the server can acquire the video uploaded by the first electronic device and determine the uploaded video as the video to be dubbed. This satisfies the streamer's needs and further improves user experience.
As one implementation of the embodiments of the present disclosure, as shown in Fig. 4, the manner of acquiring the voice-free video may include:
S401: determining the amplitude spectrum corresponding to the audio signal of the video to be dubbed.
To process the video to be dubbed and obtain its corresponding voice-free video, the amplitude spectrum corresponding to the audio signal of the video to be dubbed must first be determined. Specifically, the audio signal of the video to be dubbed can be framed to obtain each frame of the audio signal, and each frame can then be transformed into a frequency-domain signal to obtain the amplitude spectrum of each frame.
For example, for an audio signal of the video to be dubbed sampled at 16 kHz, mono, with 16-bit quantization, the audio signal can first be framed with a frame length of 512 samples and a frame shift of 256 samples to obtain each frame of the audio signal; a short-time Fourier transform is then applied to each frame to obtain the phase spectrum and amplitude spectrum corresponding to each frame.
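The framing and transform described above can be sketched in NumPy as follows; this is a minimal illustration, and the Hann analysis window is an assumption, since the disclosure does not name a window function:

```python
import numpy as np

def frame_stft(signal, frame_len=512, hop=256):
    """Frame a mono 16 kHz signal (frame length 512 samples, frame shift 256
    samples) and apply a short-time Fourier transform to each frame,
    returning the amplitude spectrum and phase spectrum."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)       # one row per frame, 257 bins
    return np.abs(spectrum), np.angle(spectrum)  # amplitude spectrum, phase spectrum

# one second of 16 kHz audio yields a (61, 257) amplitude spectrum matrix
audio = np.random.randn(16000).astype(np.float32)
amplitude, phase = frame_stft(audio)
```

Each row of `amplitude` is the amplitude spectrum of one 512-sample frame; `phase` is kept aside so the separated signal can later be transformed back to the time domain.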
S402: inputting the amplitude spectrum into a pre-trained network model to obtain the voice mask matrix corresponding to the video to be dubbed.
Next, the server can input the amplitude spectrum corresponding to each frame of the audio signal into the pre-trained network model. The network model can be trained based on amplitude spectrum samples acquired in advance and their corresponding voice mask matrices, and can include a correspondence between amplitude spectra and voice mask matrices. The network model can therefore determine, according to this correspondence, the voice mask matrix corresponding to the amplitude spectrum of each frame of the audio signal.
The voice mask matrix is a mask matrix capable of removing voice. Each element of the voice mask matrix takes a value between 0 and 1: the closer to 1, the farther the corresponding part is from voice; the closer to 0, the closer it is to voice. Accordingly, a threshold can be set, and all elements of the voice mask matrix below the threshold set to 0, indicating that the corresponding parts of the audio signal are voice.
The network model may be a deep learning network model such as a convolutional neural network or a recurrent neural network, which is not specifically limited here.
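The thresholding just described can be illustrated as follows; the threshold value of 0.5 is an assumed example, as the disclosure does not fix a value:

```python
import numpy as np

def apply_threshold(voice_mask, threshold=0.5):
    """Set mask elements below the threshold to 0: values near 1 mark
    time-frequency bins far from voice, values near 0 mark voiced bins."""
    cleaned = voice_mask.copy()
    cleaned[cleaned < threshold] = 0.0
    return cleaned

mask = np.array([[0.9, 0.2],
                 [0.4, 0.7]])
# the 0.2 and 0.4 entries are treated as voice and zeroed out
assert np.array_equal(apply_threshold(mask), np.array([[0.9, 0.0],
                                                       [0.0, 0.7]]))
```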
S403: computing the voice-free amplitude spectrum from the voice mask matrix and the amplitude spectrum.
Next, the server can take the element-wise product of the voice mask matrix and the amplitude spectrum of each frame of the audio signal to obtain the amplitude spectrum of the separated signal; it can be understood that the separated signal is the voice-free audio.
S404: determining, based on the voice-free amplitude spectrum, the voice-free video corresponding to the video to be dubbed.
After obtaining the amplitude spectrum of the separated signal, the server can combine it with the phase spectrum described above and transform the result back into a time-domain signal, yielding the time-domain separated signal, that is, the voice-free audio. Combining this voice-free audio with the image part of the video to be dubbed then yields the voice-free video corresponding to the video to be dubbed.
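The signal path of steps S403 and S404 — element-wise masking, phase reattachment, inverse transform, and overlap-add back to a time-domain signal — can be sketched as follows; this is a simplified illustration, as the disclosure does not specify the synthesis procedure:

```python
import numpy as np

def remove_voice(amplitude, phase, voice_mask, frame_len=512, hop=256):
    """Multiply the amplitude spectrum element-wise by the voice mask matrix,
    recombine with the original phase spectrum, and overlap-add the inverse
    FFT of each frame to recover the time-domain voice-free signal."""
    masked = amplitude * voice_mask                  # voice-free amplitude spectrum
    spectrum = masked * np.exp(1j * phase)           # reattach the phase spectrum
    frames = np.fft.irfft(spectrum, n=frame_len, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):               # overlap-add synthesis
        out[i * hop:i * hop + frame_len] += frame
    return out

# an all-ones mask passes this synthetic spectrum through unchanged
amplitude = np.ones((3, 257))
phase = np.zeros((3, 257))
voice_free = remove_voice(amplitude, phase, np.ones_like(amplitude))
assert voice_free.shape == (1024,)  # hop * (3 - 1) + frame_len samples
```

The resulting time-domain signal would then be muxed with the image track of the video to be dubbed to form the voice-free video.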
As it can be seen that in the present embodiment, server can use the network model of training completion in advance and obtain wait match audio-video pair
Answer without voice video, can rapidly and accurately determine to further increase user's body with the corresponding no voice video of audio-video
It tests.
As one implementation of the embodiments of the present disclosure, as shown in Fig. 5, the manner of acquiring the voice-free video may include:
S501: determining the amplitude spectrum corresponding to the audio signal of the video to be dubbed.
Step S501 is the same as step S401 above; for details, refer to the description and explanation of step S401, which are not repeated here.
S502: inputting the amplitude spectrum into a pre-trained network model to obtain the voice-free audio corresponding to the video to be dubbed.
The server can input the amplitude spectrum obtained in step S501 into the pre-trained network model. The network model can be trained based on amplitude spectrum samples acquired in advance and their corresponding voice-free audio, and can include a correspondence between amplitude spectra and voice-free audio. The network model can therefore determine the voice-free audio corresponding to the input amplitude spectrum according to this correspondence and output it.
Specifically, the network model can first determine the voice mask matrix corresponding to the amplitude spectrum of each frame of the audio signal, then take the element-wise product of the voice mask matrix and the amplitude spectrum of each frame to obtain the amplitude spectrum of the separated signal, combine it with the phase spectrum described above, and transform the result back into a time-domain signal to obtain the time-domain separated signal, that is, the voice-free audio.
Above-mentioned network model may be convolutional neural networks, Recognition with Recurrent Neural Network even depth learning network model, again
It is not specifically limited.
In S503, the voice-free video corresponding to the to-be-dubbed video is determined based on the voice-free audio.
In turn, the server can combine the voice-free audio output by the above network model with the image portion of the to-be-dubbed video to obtain the voice-free video corresponding to the to-be-dubbed video.
It can be seen that, in this embodiment, the server can use the pre-trained network model to obtain the voice-free video corresponding to the to-be-dubbed video, so the voice-free video can be determined quickly and accurately, further improving the user experience.
The above-mentioned voice-free video may be a video in which the voice is removed and the background music is retained, a video in which both the voice and the background music are removed and only some rhythm information is retained, or a video with no sound at all; all of these are reasonable. Specifically, the voice mask matrix can be set according to the dubbing requirements so as to achieve the corresponding effect.
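The three effects described above differ only in how the playback mask is derived from the model's voice mask. A hypothetical sketch (the effect names, and the use of a threshold for the rhythm-only case, are illustrative assumptions, not from the disclosure):

```python
import numpy as np

def playback_mask(voice_mask: np.ndarray, effect: str) -> np.ndarray:
    """Derive the mask applied at playback time from the model's voice
    mask (values in [0, 1], near 0 in voice-dominated bins)."""
    if effect == "keep_background":      # remove voice, keep background music
        return voice_mask
    if effect == "silent":               # remove all sound
        return np.zeros_like(voice_mask)
    if effect == "rhythm_only":          # keep only clearly non-voice bins,
        return (voice_mask > 0.9).astype(float)  # binarized as a rough proxy
    raise ValueError(f"unknown effect: {effect}")

m = np.array([[1.0, 0.95],
              [0.2, 0.0]])
```

The key point is that no retraining is needed: one separation model can serve all three dubbing effects by post-processing its mask differently.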
As an implementation of the embodiment of the present disclosure, after the above-mentioned voice-free audio is obtained, in a first implementation, the above method may further include:
determining the amplitude spectrum corresponding to the voice-free audio; inputting the amplitude spectrum into a pre-trained network model to obtain the instrument mask matrix corresponding to the voice-free audio; calculating a target instrument amplitude spectrum using the instrument mask matrix and the amplitude spectrum; and determining the target instrument audio corresponding to the voice-free audio based on the target instrument amplitude spectrum.
Here, the network model is trained on pre-obtained amplitude spectrum samples and their corresponding instrument mask matrices, and includes a correspondence between amplitude spectra and instrument mask matrices. An instrument mask matrix is a matrix that removes other audio signals while retaining the audio signal of a certain instrument.
Since the manner of determining the target instrument audio is essentially the same as the first manner of determining the voice-free audio described above, it is not repeated here.
In a second implementation, the above method may further include:
determining the amplitude spectrum corresponding to the voice-free audio; and inputting the amplitude spectrum into a pre-trained network model to obtain the target instrument audio corresponding to the voice-free audio.
Here, the network model is trained on pre-obtained amplitude spectrum samples and their corresponding instrument audio, and includes a correspondence between amplitude spectra and instrument audio. Since the manner of determining the target instrument audio is essentially the same as the second manner of determining the voice-free audio described above, it is not repeated here.
The above-mentioned target instrument can be set according to actual needs; for example, it may be an instrument such as a piano, a guitar, or a drum.
It can be seen that various target instrument audios can be obtained in the above two manners. The server may replace the target instrument audio in the voice-free audio with the audio of another instrument, and may also determine the rhythm information of the voice-free audio from the instrument audio, among other uses. This provides convenience for diverse dubbing modes, further enriches the diversity of the dubbing interaction, and improves the user experience.
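Replacing the target instrument inside the voice-free audio can be sketched as subtracting the separated instrument track and adding the substitute, under a linear-mixing assumption that the disclosure does not state explicitly:

```python
import numpy as np

def replace_instrument(voice_free: np.ndarray,
                       target_instrument: np.ndarray,
                       substitute: np.ndarray) -> np.ndarray:
    """Remove the separated target-instrument track from the voice-free
    audio and mix in a substitute track, truncated to a common length."""
    n = min(len(voice_free), len(target_instrument), len(substitute))
    return voice_free[:n] - target_instrument[:n] + substitute[:n]

# Toy samples: the voice-free mix is drums + piano; swap piano for guitar.
drums = np.array([0.1, -0.2, 0.3])
piano = np.array([0.05, 0.05, -0.05])
guitar = np.array([0.2, 0.0, 0.1])
result = replace_instrument(drums + piano, piano, guitar)
```

In practice the separated track is only an estimate, so residual leakage of the original instrument would remain after subtraction; the identity is exact only in this idealized linear setting.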
As an implementation of the embodiment of the present disclosure, after the dubbing is completed and an upload instruction issued by the user is received, the server may encode the above-mentioned dubbing audio and the voice-free video into a dubbed video and distribute it to the live-streaming software platform for users to download and view.
The embodiment of the present disclosure further provides a second audio-video processing method. The second audio-video processing method provided by the embodiment of the present disclosure can be applied to a first electronic device on which a live-streaming application is installed.
Here, the first electronic device is an electronic device with live-streaming permission in the Virtual Space, and an anchor can conduct a live stream through the first electronic device.
As shown in Fig. 6, an audio-video processing method is applied to a first electronic device, where the first electronic device is an electronic device with live-streaming permission in a Virtual Space, and the method includes:
In step S601, a dubbing instruction in the Virtual Space is obtained;
In step S602, the preset dubbing type corresponding to the dubbing instruction is determined;
In step S603, a to-be-dubbed video is determined;
In step S604, when a dubbing start instruction is obtained, the voice-free video corresponding to the to-be-dubbed video is played according to the preset dubbing type;
In step S605, during playback of the voice-free video, the dubbing audio corresponding to the voice-free video is obtained and simultaneously sent to the server.
It can be seen that, in the solution provided by the embodiment of the present disclosure, the first electronic device can obtain a dubbing instruction in the Virtual Space, determine the preset dubbing type corresponding to the dubbing instruction, then determine the to-be-dubbed video, and, when a dubbing start instruction is obtained, play the voice-free video corresponding to the to-be-dubbed video according to the preset dubbing type; during playback of the voice-free video, it obtains the dubbing audio corresponding to the voice-free video and simultaneously sends the dubbing audio to the server. With this solution, users can interact in the Virtual Space by way of dubbing, which increases the diversity of interaction modes and improves the user experience.
During a live stream, an anchor may interact with the audience or with other anchors by way of dubbing; at this time, the anchor can issue a dubbing instruction through the first electronic device. In turn, in step S601 above, the first electronic device can obtain the dubbing instruction issued by the anchor in the Virtual Space, which indicates that the anchor needs to interact with the audience or other anchors by way of dubbing. Since there can be multiple dubbing modes in the Virtual Space, the first electronic device can then determine the preset dubbing type corresponding to the obtained dubbing instruction, that is, execute step S602.
After obtaining the dubbing instruction issued by the anchor in the Virtual Space, the first electronic device can execute step S603, that is, determine the to-be-dubbed video. Next, when the dubbing start instruction issued by the anchor is obtained, indicating that the anchor needs to start dubbing, the voice-free video corresponding to the to-be-dubbed video can be played according to the above-mentioned preset dubbing type.
In turn, when the dubbing start instruction issued by the anchor is obtained, the first electronic device can play the voice-free video corresponding to the to-be-dubbed video according to the preset dubbing type; during playback of the voice-free video, it obtains the dubbing audio corresponding to the voice-free video and simultaneously sends the dubbing audio to the server. The server can send the dubbing audio to the second electronic devices and to the first electronic devices used by other anchors. Here, the voice-free video is a video in which the voice is removed and only the background music is retained, and a second electronic device is an electronic device with live-stream viewing permission in the Virtual Space.
Since the manners in which the first electronic device determines the preset dubbing type corresponding to the dubbing instruction, determines the to-be-dubbed video, and obtains the dubbing audio corresponding to the voice-free video are respectively the same as the manners in which the above-mentioned server determines the preset dubbing type corresponding to the dubbing instruction, determines the to-be-dubbed video, and obtains the dubbing audio corresponding to the voice-free video, the details are not repeated here.
As an implementation of the embodiment of the present disclosure, the above-mentioned preset dubbing type may be an anchor show type.
Correspondingly, the above step of playing the voice-free video corresponding to the to-be-dubbed video according to the preset dubbing type may include:
playing the voice-free video corresponding to the to-be-dubbed video, and controlling the second electronic devices to simultaneously play the voice-free video corresponding to the to-be-dubbed video.
When playing the voice-free video corresponding to the to-be-dubbed video, the first electronic device can send a request to the server so that the server controls the second electronic devices to play the voice-free video corresponding to the to-be-dubbed video at the same time, ensuring that the anchor's audience can watch the anchor's dubbing performance simultaneously.
It can be seen that, in this embodiment, the anchor can give a dubbing performance so as to interact with the audience, which can enhance the interactivity and interest of the Virtual Space and improve the user experience.
As an implementation of the embodiment of the present disclosure, the above-mentioned preset dubbing type may be a multi-anchor battle type.
Correspondingly, the above step of playing the voice-free video corresponding to the to-be-dubbed video according to the preset dubbing type may include:
determining the battle order corresponding to the first electronic device of each anchor; and, according to the battle order, controlling the first electronic devices and their corresponding second electronic devices to play the voice-free video corresponding to the to-be-dubbed video in sequence.
Since multiple anchors need to conduct a dubbing battle, each anchor needs to give a dubbing performance in turn in order to ensure the audience's experience of watching the battle. Therefore, the first electronic device used by the above-mentioned anchor can determine the battle order corresponding to the first electronic device of each anchor, and then, according to the battle order, control the first electronic devices and their corresponding second electronic devices to play the voice-free video corresponding to the to-be-dubbed video in sequence.
In one implementation, a first electronic device can send a dubbing switch request to the server; after receiving the dubbing switch request, the server can control each first electronic device and its corresponding second electronic devices to play the voice-free video corresponding to the to-be-dubbed video in sequence, so that the anchors can dub in turn and the audience can watch each anchor's dubbing performance.
It can be seen that, in this embodiment, multiple anchors can give a dubbing battle performance so as to interact with other anchors and with the audience, which can further enhance the interactivity and interest of the Virtual Space and further improve the user experience.
As an implementation of the embodiment of the present disclosure, the above-mentioned preset dubbing type may be a multi-person dubbing type.
Correspondingly, the above step of playing the voice-free video corresponding to the to-be-dubbed video according to the preset dubbing type may include:
controlling each second electronic device corresponding to a user in the instant messaging region of the Virtual Space to simultaneously play the voice-free video corresponding to the to-be-dubbed video.
In order to ensure that the anchor and the users in the instant messaging region can smoothly complete the dubbing of the to-be-dubbed video, the first electronic device and each second electronic device corresponding to a user in the instant messaging region need to play the voice-free video corresponding to the to-be-dubbed video simultaneously; in this way, the anchor and each user can smoothly complete the dubbing interaction.
It can be seen that, in this embodiment, the anchor and the users in the instant messaging region can cooperate to complete a dubbing performance. The interaction between the anchor and the audience is stronger and the audience's sense of participation is enhanced, which can further enhance the interactivity and interest of the Virtual Space and further improve the user experience.
As an implementation of the embodiment of the present disclosure, the above step of controlling each second electronic device corresponding to a user in the instant messaging region of the Virtual Space to simultaneously play the voice-free video corresponding to the to-be-dubbed video may include:
sending a broadcast message to the server, so that the server sends the to-be-dubbed video and a start instruction to each second electronic device corresponding to a user in the instant messaging region of the Virtual Space, so that each second electronic device, upon receiving the start instruction, simultaneously plays the voice-free video corresponding to the to-be-dubbed video.
In order to ensure that the first electronic device and each second electronic device corresponding to a user participating in the dubbing can play the voice-free video simultaneously, the first electronic device can issue the dubbing start instruction by sending a broadcast message; upon obtaining the broadcast message sent by the first electronic device, the server sends the above-mentioned to-be-dubbed video and the start instruction to each second electronic device corresponding to a user in the instant messaging region of the Virtual Space.
In this way, each second electronic device starts to play the voice-free video corresponding to the to-be-dubbed video when it receives the start instruction, ensuring that all second electronic devices start playing the voice-free video at the same moment.
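The broadcast fan-out can be sketched as follows; the class and message names are hypothetical, and a real deployment would also have to account for network delay (for example by scheduling a common start timestamp rather than starting on arrival):

```python
class SecondDevice:
    """Stand-in for a second electronic device: starts playback as soon
    as it receives the start instruction for a video."""
    def __init__(self):
        self.playing = None

    def receive(self, video_id, instruction):
        if instruction == "START":
            self.playing = video_id

class Server:
    """On a broadcast message from the first electronic device, forward
    the to-be-dubbed video id and a start instruction to every second
    electronic device in the instant messaging region."""
    def __init__(self, second_devices):
        self.second_devices = second_devices

    def on_broadcast(self, video_id):
        for dev in self.second_devices:
            dev.receive(video_id, "START")

devices = [SecondDevice() for _ in range(3)]
Server(devices).on_broadcast("video_42")
```

The design choice here is that the server, not the first electronic device, fans the instruction out, so a single broadcast message reaches every participant through one trusted hop.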
It can be seen that, in this embodiment, upon obtaining the broadcast message sent by the first electronic device, the server can send the to-be-dubbed video and the start instruction to each second electronic device corresponding to a user in the instant messaging region of the Virtual Space, so that each second electronic device plays the voice-free video corresponding to the to-be-dubbed video when it receives the start instruction. This ensures that the voice-free video is played simultaneously and that the dubbing can proceed smoothly.
As an implementation of the embodiment of the present disclosure, the above step of determining the to-be-dubbed video may include:
obtaining a video uploaded by the user; and determining the uploaded video as the to-be-dubbed video.
When determining the to-be-dubbed video, the anchor can choose a video he or she likes and upload it; the first electronic device can obtain the video uploaded by the user and, in turn, determine it as the to-be-dubbed video.
It can be seen that, in this embodiment, the first electronic device can obtain the video uploaded by the user and then determine the uploaded video as the to-be-dubbed video. In this way, the anchor's needs can be met, further improving the user experience.
As an implementation of the embodiment of the present disclosure, the manner of obtaining the above-mentioned voice-free video may include:
determining the amplitude spectrum corresponding to the audio signal of the to-be-dubbed video; inputting the amplitude spectrum into a pre-trained network model to obtain the voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice mask matrices and includes a correspondence between amplitude spectra and voice mask matrices; calculating a voice-free amplitude spectrum using the voice mask matrix and the amplitude spectrum; and determining the voice-free video corresponding to the to-be-dubbed video based on the voice-free amplitude spectrum.
As an implementation of the embodiment of the present disclosure, the manner of obtaining the above-mentioned voice-free video may alternatively include:
determining the amplitude spectrum corresponding to the audio signal of the to-be-dubbed video; inputting the amplitude spectrum into a pre-trained network model to obtain the voice-free audio corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice-free audio and includes a correspondence between amplitude spectra and voice-free audio; and determining the voice-free video corresponding to the to-be-dubbed video based on the voice-free audio.
Since the manner in which the first electronic device obtains the voice-free video is the same as the manner in which the above-mentioned server obtains the voice-free video, refer to the explanation of the server's manner of obtaining the voice-free video; the details are not repeated here.
The embodiment of the present disclosure further provides a third audio-video processing method. The third audio-video processing method provided by the embodiment of the present disclosure can be applied to a second electronic device on which a live-streaming application is installed.
Here, the second electronic device is an electronic device with live-stream viewing permission in the Virtual Space, and audience members can watch the live stream through the second electronic device.
As shown in Fig. 7, an audio-video processing method is applied to a second electronic device, and the method includes:
In step S701, when a dubbing start instruction in the Virtual Space is obtained, a pre-obtained voice-free video corresponding to the to-be-dubbed video is played;
In step S702, during playback of the voice-free video, when the dubbing audio corresponding to the voice-free video is obtained, the dubbing audio is played.
It can be seen that, in the solution provided by the embodiment of the present disclosure, when obtaining a dubbing start instruction in the Virtual Space, the second electronic device can play the pre-obtained voice-free video corresponding to the to-be-dubbed video, and, during playback of the voice-free video, play the dubbing audio when the dubbing audio corresponding to the voice-free video is obtained. With this solution, users can interact in the Virtual Space by way of dubbing, which increases the diversity of interaction modes and improves the user experience.
Audience members can watch the anchor's live stream through the above-mentioned second electronic device. When the second electronic device obtains the dubbing start instruction in the Virtual Space, indicating that the anchor or other audience members are about to start a dubbing performance, it can play the pre-obtained voice-free video corresponding to the to-be-dubbed video.
The dubbing start instruction may be generated by the server and sent to the second electronic device, or it may be sent by the first electronic device to the server and forwarded by the server to the second electronic device; both are reasonable.
After the to-be-dubbed video is determined by the server or the first electronic device, the to-be-dubbed video itself can be sent to the second electronic device, or an identifier of the to-be-dubbed video can be sent instead; the second electronic device can then determine that the video corresponding to the identifier is the to-be-dubbed video and thereby obtain the voice-free video corresponding to it.
In step S702 above, during playback of the voice-free video, when the second electronic device obtains the dubbing audio corresponding to the voice-free video, it can play the dubbing audio, so that the audience members also perceive the dubbing performance. The dubbing audio may be dubbing audio that the server receives from the first electronic device, or from a second electronic device used by another audience member, and forwards to the second electronic device.
When an anchor gives a dubbing performance, the first electronic device can obtain the dubbing audio produced by the anchor and send it to the server. When other audience members give a dubbing performance, the second electronic devices they use can obtain the dubbing audio those audience members produce and send it to the server.
As an implementation of the embodiment of the present disclosure, the above step of playing the pre-obtained voice-free video corresponding to the to-be-dubbed video when a dubbing start instruction in the Virtual Space is obtained may include:
when receiving the to-be-dubbed video and the start instruction in the Virtual Space sent by the server, playing the received voice-free video corresponding to the to-be-dubbed video.
In order to ensure that the first electronic device and each second electronic device corresponding to a user participating in the dubbing can play the voice-free video simultaneously, the first electronic device can issue the dubbing start instruction by sending a broadcast message; upon obtaining the broadcast message sent by the first electronic device, the server sends the above-mentioned to-be-dubbed video and the start instruction to each second electronic device corresponding to a user in the instant messaging region of the Virtual Space.
In this way, each second electronic device starts to play the voice-free video corresponding to the to-be-dubbed video when it receives the start instruction, ensuring that all second electronic devices start playing the voice-free video at the same moment.
It can be seen that, in this embodiment, upon obtaining the broadcast message sent by the first electronic device, the server can send the to-be-dubbed video and the start instruction to each second electronic device corresponding to a user in the instant messaging region of the Virtual Space, so that each second electronic device plays the voice-free video corresponding to the to-be-dubbed video when it receives the start instruction. This ensures that the voice-free video is played simultaneously and that the dubbing can proceed smoothly.
As an implementation of the embodiment of the present disclosure, the manner of obtaining the above-mentioned voice-free video may include:
determining the amplitude spectrum corresponding to the audio signal of the to-be-dubbed video; inputting the amplitude spectrum into a pre-trained network model to obtain the voice mask matrix corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice mask matrices and includes a correspondence between amplitude spectra and voice mask matrices; calculating a voice-free amplitude spectrum using the voice mask matrix and the amplitude spectrum; and determining the voice-free video corresponding to the to-be-dubbed video based on the voice-free amplitude spectrum.
As an implementation of the embodiment of the present disclosure, the manner of obtaining the above-mentioned voice-free video may alternatively include:
determining the amplitude spectrum corresponding to the audio signal of the to-be-dubbed video; inputting the amplitude spectrum into a pre-trained network model to obtain the voice-free audio corresponding to the to-be-dubbed video, where the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice-free audio and includes a correspondence between amplitude spectra and voice-free audio; and determining the voice-free video corresponding to the to-be-dubbed video based on the voice-free audio.
Since the manner in which the second electronic device obtains the voice-free video is the same as the manner in which the above-mentioned server obtains the voice-free video, refer to the explanation of the server's manner of obtaining the voice-free video; the details are not repeated here.
Fig. 8 is a block diagram of a first audio-video processing apparatus according to an exemplary embodiment.
As shown in Fig. 8, an audio-video processing apparatus is applied to a server, and the apparatus includes:
a first dubbing instruction obtaining module 810, configured to obtain a dubbing instruction issued by a first electronic device in a Virtual Space,
where the first electronic device is an electronic device with live-streaming permission in the Virtual Space;
a first preset dubbing type determining module 820, configured to determine the preset dubbing type corresponding to the dubbing instruction;
a first to-be-dubbed video determining module 830, configured to determine a to-be-dubbed video;
a first voice-free video playing module 840, configured to play, according to the preset dubbing type, the voice-free video corresponding to the to-be-dubbed video when a dubbing start instruction issued by the first electronic device is obtained; and
a first dubbing audio sending module 850, configured to obtain, during playback of the voice-free video, the dubbing audio corresponding to the voice-free video and simultaneously send the dubbing audio to a second electronic device,
where the second electronic device is an electronic device with live-stream viewing permission in the Virtual Space.
It can be seen that, in the solution provided by the embodiment of the present disclosure, the server can obtain the dubbing instruction issued by the first electronic device in the Virtual Space, determine the preset dubbing type corresponding to the dubbing instruction, then determine the to-be-dubbed video, and, when the dubbing start instruction issued by the first electronic device is obtained, play the voice-free video corresponding to the to-be-dubbed video according to the preset dubbing type; during playback of the voice-free video, it obtains the dubbing audio corresponding to the voice-free video and simultaneously sends the dubbing audio to the second electronic device. Here, the first electronic device is an electronic device with live-streaming permission in the Virtual Space, and the second electronic device is an electronic device with live-stream viewing permission in the Virtual Space. With this solution, users can interact in the Virtual Space by way of dubbing, which increases the diversity of interaction modes and improves the user experience.
As an implementation of the embodiment of the present disclosure, the above-mentioned preset dubbing type may be an anchor show type;
the above-mentioned first voice-free video playing module 840 may include:
a first voice-free video playing submodule (not shown in Fig. 8), configured to control the first electronic device and the second electronic device to simultaneously play the voice-free video corresponding to the to-be-dubbed video.
As an implementation of the embodiment of the present disclosure, the above-mentioned preset dubbing type may be a multi-anchor battle type;
the above-mentioned first voice-free video playing module 840 may include:
a battle order determining submodule (not shown in Fig. 8), configured to determine the battle order corresponding to the first electronic device of each anchor; and
a second voice-free video playing submodule (not shown in Fig. 8), configured to control, according to the battle order, the first electronic devices and their corresponding second electronic devices to play the voice-free video corresponding to the to-be-dubbed video in sequence.
As an implementation of the embodiment of the present disclosure, the above-mentioned preset dubbing type may be a multi-person dubbing type;
the above-mentioned first voice-free video playing module 840 may include:
a third voice-free video playing submodule (not shown in Fig. 8), configured to control each second electronic device corresponding to a user in the instant messaging region of the Virtual Space to simultaneously play the voice-free video corresponding to the to-be-dubbed video.
As an implementation of the embodiment of the present disclosure, the above-mentioned third voice-free video playing submodule may include:
a first voice-free video playing unit (not shown in Fig. 8), configured to send, upon obtaining the broadcast message sent by the first electronic device, the to-be-dubbed video and the start instruction to each second electronic device corresponding to a user in the instant messaging region of the Virtual Space, so that each second electronic device simultaneously plays the voice-free video corresponding to the to-be-dubbed video upon receiving the start instruction.
As an implementation of the embodiment of the present disclosure, the above-mentioned first to-be-dubbed video determining module 830 may include:
a first video obtaining submodule (not shown in Fig. 8), configured to obtain the video uploaded by the first electronic device; and
a first to-be-dubbed video determining submodule (not shown in Fig. 8), configured to determine the uploaded video as the to-be-dubbed video.
As an implementation of the embodiments of the present disclosure, the audio/video processing apparatus may further include a first voice-free video determining module (not shown in Fig. 8);
The first voice-free video determining module may include:
A first amplitude spectrum determining submodule (not shown in Fig. 8), configured to determine the amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
A first voice mask matrix determining submodule (not shown in Fig. 8), configured to input the amplitude spectrum into a pre-trained network model to obtain the voice mask matrix corresponding to the audio/video to be dubbed, wherein the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice mask matrices, and encodes the correspondence between amplitude spectra and voice mask matrices;
A first voice-free amplitude spectrum determining submodule (not shown in Fig. 8), configured to calculate a voice-free amplitude spectrum from the voice mask matrix and the amplitude spectrum;
A first voice-free video determining submodule (not shown in Fig. 8), configured to determine, based on the voice-free amplitude spectrum, the voice-free video corresponding to the audio/video to be dubbed.
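The masking step can be illustrated with a minimal sketch. The patent does not specify how the mask is combined with the amplitude spectrum; a common convention, assumed here, is a ratio mask in [0, 1] per time-frequency bin, where `(1 - mask) * spectrum` keeps only the non-voice energy (the voice-free spectrum would then be converted back to a waveform using the mixture's original phase):

```python
import numpy as np

def voice_free_spectrum(amplitude_spec, voice_mask):
    """Element-wise removal of the voiced energy: the mask estimates, per
    time-frequency bin, the fraction of energy belonging to the human
    voice, so (1 - mask) * spectrum retains only the background."""
    return (1.0 - voice_mask) * amplitude_spec

# Toy 2-frame x 4-bin amplitude spectrum, and a mask such as a trained
# network model might emit (1.0 = bin fully voiced, 0.0 = no voice).
spec = np.array([[1.0, 2.0, 3.0, 4.0],
                 [4.0, 3.0, 2.0, 1.0]])
mask = np.array([[1.0, 0.5, 0.0, 0.0],
                 [0.0, 0.0, 0.5, 1.0]])

clean = voice_free_spectrum(spec, mask)
# clean == [[0., 1., 3., 4.],
#           [4., 3., 1., 0.]]
```

The fully voiced bins are zeroed, half-voiced bins are attenuated by half, and unvoiced bins pass through untouched, which is exactly the behavior the voice-free amplitude spectrum determining submodule relies on.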
As an implementation of the embodiments of the present disclosure, the apparatus may further include a second voice-free video determining module (not shown in Fig. 8);
The second voice-free video determining module may include:
A second amplitude spectrum determining submodule (not shown in Fig. 8), configured to determine the amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
A first voice-free audio determining submodule (not shown in Fig. 8), configured to input the amplitude spectrum into a pre-trained first network model to obtain the voice-free audio corresponding to the audio/video to be dubbed, wherein the first network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice-free audio, and encodes the correspondence between amplitude spectra and voice-free audio;
A second voice-free video determining submodule (not shown in Fig. 8), configured to determine, based on the voice-free audio, the voice-free video corresponding to the audio/video to be dubbed.
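Unlike the mask-based module, this variant has the first network model regress the voice-free output directly from the mixture spectrum, with no explicit mask stage. A stub of that interface, where an arbitrary fixed per-bin attenuation stands in for the learned weights (purely illustrative, not the trained model):

```python
import numpy as np

def first_network_model(amplitude_spec):
    """Stand-in for the pre-trained first network model: it maps the
    mixture amplitude spectrum directly to a voice-free amplitude
    spectrum. A fixed per-bin gain plays the role of learned weights."""
    learned_gain = np.array([0.0, 0.5, 1.0, 1.0])  # illustrative only
    return amplitude_spec * learned_gain

spec = np.array([[2.0, 2.0, 2.0, 2.0]])  # one frame, four bins
voice_free = first_network_model(spec)   # -> [[0., 1., 2., 2.]]
```

The trade-off between the two module designs: regressing the output directly gives the model more freedom, while a mask constrains the output to be an attenuation of the input, which tends to be easier to train for separation tasks.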
Fig. 9 is a block diagram of a second audio/video processing apparatus according to an exemplary embodiment.
As shown in Fig. 9, an audio/video processing apparatus is applied to a first electronic device, where the first electronic device is an electronic device with live-streaming permission in a virtual space, and the apparatus includes:
A second dubbing instruction acquisition module 910, configured to obtain a dubbing instruction in the virtual space;
A second preset dubbing type determining module 920, configured to determine the preset dubbing type corresponding to the dubbing instruction;
A second to-be-dubbed audio/video determining module 930, configured to determine the audio/video to be dubbed;
A second voice-free video playing module 940, configured to play, upon obtaining a dubbing start instruction, the voice-free video corresponding to the audio/video to be dubbed according to the preset dubbing type;
A second dubbed audio sending module 950, configured to obtain, while the voice-free video is playing, the dubbed audio corresponding to the voice-free video and send the dubbed audio to a server.
It can be seen that, in the solution provided by the embodiments of the present disclosure, the first electronic device can obtain a dubbing instruction in the virtual space, determine the preset dubbing type corresponding to the dubbing instruction, and determine the audio/video to be dubbed; then, upon obtaining a dubbing start instruction, it plays the voice-free video corresponding to the audio/video to be dubbed according to the preset dubbing type, and, while the voice-free video is playing, obtains the dubbed audio corresponding to the voice-free video and sends the dubbed audio to the server. With this solution, users can interact in the virtual space by dubbing, which diversifies the modes of interaction and improves the user experience.
As an implementation of the embodiments of the present disclosure, the preset dubbing type is an anchor show type, and the second voice-free video playing module 940 may include:
A fourth voice-free video playing submodule (not shown in Fig. 9), configured to play the voice-free video corresponding to the audio/video to be dubbed and control a second electronic device to play it simultaneously, where the second electronic device is an electronic device with live-streaming viewing permission in the virtual space.
As an implementation of the embodiments of the present disclosure, the preset dubbing type is a multi-anchor battle type, and the second voice-free video playing module 940 may include:
A battle order determining submodule (not shown in Fig. 9), configured to determine the battle order of the first electronic device corresponding to each anchor;
A fifth voice-free video playing submodule (not shown in Fig. 9), configured to control, according to the battle order, the first electronic devices and their corresponding second electronic devices to play, in turn, the voice-free video corresponding to the audio/video to be dubbed.
As an implementation of the embodiments of the present disclosure, the preset dubbing type is a multi-person dubbing type, and the second voice-free video playing module 940 may include:
A sixth voice-free video playing submodule (not shown in Fig. 9), configured to control the second electronic devices corresponding to the users in the instant messaging region of the virtual space to play the voice-free video corresponding to the audio/video to be dubbed simultaneously.
As an implementation of the embodiments of the present disclosure, the sixth voice-free video playing submodule may include:
A second voice-free video playing unit (not shown in Fig. 9), configured to send a broadcast message to the server, so that the server sends the audio/video to be dubbed and a start instruction to each second electronic device corresponding to a user in the instant messaging region of the virtual space, and each second electronic device plays the voice-free video corresponding to the audio/video to be dubbed simultaneously upon receiving the start instruction.
As an implementation of the embodiments of the present disclosure, the second to-be-dubbed audio/video determining module 930 may include:
A second video acquisition submodule (not shown in Fig. 9), configured to obtain a video uploaded by the user;
A second to-be-dubbed audio/video determining submodule (not shown in Fig. 9), configured to determine the uploaded video as the audio/video to be dubbed.
As an implementation of the embodiments of the present disclosure, the audio/video processing apparatus may further include a third voice-free video determining module (not shown in Fig. 9);
The third voice-free video determining module may include:
A third amplitude spectrum determining submodule (not shown in Fig. 9), configured to determine the amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
A second voice mask matrix determining submodule (not shown in Fig. 9), configured to input the amplitude spectrum into a pre-trained network model to obtain the voice mask matrix corresponding to the audio/video to be dubbed, wherein the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice mask matrices, and encodes the correspondence between amplitude spectra and voice mask matrices;
A second voice-free amplitude spectrum determining submodule (not shown in Fig. 9), configured to calculate a voice-free amplitude spectrum from the voice mask matrix and the amplitude spectrum;
A third voice-free video determining submodule (not shown in Fig. 9), configured to determine, based on the voice-free amplitude spectrum, the voice-free video corresponding to the audio/video to be dubbed.
As an implementation of the embodiments of the present disclosure, the audio/video processing apparatus may further include a fourth voice-free video determining module (not shown in Fig. 9);
The fourth voice-free video determining module may include:
A fourth amplitude spectrum determining submodule (not shown in Fig. 9), configured to determine the amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
A second voice-free audio determining submodule (not shown in Fig. 9), configured to input the amplitude spectrum into a pre-trained first network model to obtain the voice-free audio corresponding to the audio/video to be dubbed, wherein the first network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice-free audio, and encodes the correspondence between amplitude spectra and voice-free audio;
A fourth voice-free video determining submodule (not shown in Fig. 9), configured to determine, based on the voice-free audio, the voice-free video corresponding to the audio/video to be dubbed.
Fig. 10 is a block diagram of a third audio/video processing apparatus according to an exemplary embodiment.
As shown in Fig. 10, an audio/video processing apparatus is applied to a second electronic device, where the second electronic device is an electronic device with live-streaming viewing permission in the virtual space, and the apparatus includes:
A third voice-free video playing module 1010, configured to play, upon obtaining a dubbing start instruction in the virtual space, a pre-obtained voice-free video corresponding to the audio/video to be dubbed;
A dubbed audio playing module 1020, configured to play, while the voice-free video is playing, the dubbed audio corresponding to the voice-free video upon obtaining it.
It can be seen that, in the solution provided by the embodiments of the present disclosure, the second electronic device can, upon obtaining a dubbing start instruction in the virtual space, play a pre-obtained voice-free video corresponding to the audio/video to be dubbed, and, while the voice-free video is playing, play the dubbed audio corresponding to the voice-free video upon obtaining it. With this solution, users can interact in the virtual space by dubbing, which diversifies the modes of interaction and improves the user experience.
As an implementation of the embodiments of the present disclosure, the third voice-free video playing module 1010 may include:
A seventh voice-free video playing submodule (not shown in Fig. 10), configured to play, upon receiving the audio/video to be dubbed and the start instruction sent by the server in the virtual space, the voice-free video corresponding to the received audio/video to be dubbed.
As an implementation of the embodiments of the present disclosure, the audio/video processing apparatus may further include a fifth voice-free video determining module;
The fifth voice-free video determining module may include:
A fifth amplitude spectrum determining submodule (not shown in Fig. 10), configured to determine the amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
A third voice mask matrix determining submodule (not shown in Fig. 10), configured to input the amplitude spectrum into a pre-trained network model to obtain the voice mask matrix corresponding to the audio/video to be dubbed, wherein the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice mask matrices, and encodes the correspondence between amplitude spectra and voice mask matrices;
A third voice-free amplitude spectrum determining submodule (not shown in Fig. 10), configured to calculate a voice-free amplitude spectrum from the voice mask matrix and the amplitude spectrum;
A fifth voice-free video determining submodule (not shown in Fig. 10), configured to determine, based on the voice-free amplitude spectrum, the voice-free video corresponding to the audio/video to be dubbed.
As an implementation of the embodiments of the present disclosure, the audio/video processing apparatus may further include a sixth voice-free video determining module (not shown in Fig. 10);
The sixth voice-free video determining module may include:
A sixth amplitude spectrum determining submodule (not shown in Fig. 10), configured to determine the amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
A third voice-free audio determining submodule (not shown in Fig. 10), configured to input the amplitude spectrum into a pre-trained first network model to obtain the voice-free audio corresponding to the audio/video to be dubbed, wherein the first network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice-free audio, and encodes the correspondence between amplitude spectra and voice-free audio;
A sixth voice-free video determining submodule (not shown in Fig. 10), configured to determine, based on the voice-free audio, the voice-free video corresponding to the audio/video to be dubbed.
With regard to the apparatuses in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the corresponding methods, and will not be elaborated here.
The embodiments of the present disclosure further provide an electronic device. As shown in Fig. 11, the electronic device may include a processor 1101, a communication interface 1102, a memory 1103 and a communication bus 1104, where the processor 1101, the communication interface 1102 and the memory 1103 communicate with one another via the communication bus 1104.
The memory 1103 is configured to store a computer program.
The processor 1101 is configured to implement, when executing the program stored in the memory 1103, any audio/video processing method in the above embodiments. Specifically, the electronic device may be a server, in which case the processor 1101 implements the first audio/video processing method described in any of the above embodiments; it may be the first electronic device, in which case the processor 1101 implements the second audio/video processing method described in any of the above embodiments; or it may be the second electronic device, in which case the processor 1101 implements the third audio/video processing method described in any of the above embodiments.
It can be seen that, with this solution, users can interact in the virtual space by dubbing, which diversifies the modes of interaction and improves the user experience.
The communication bus of the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include a random access memory (RAM) or a non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), or the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The embodiments of the present disclosure further provide a computer-readable storage medium. When the instructions in the storage medium are executed by the processor of a server, the server is enabled to perform any audio/video processing method in the above embodiments.
It can be seen that, with this solution, users can interact in the virtual space by dubbing, which diversifies the modes of interaction and improves the user experience.
The embodiments of the present disclosure further provide an application program product, which, when run, performs any audio/video processing method in the above embodiments.
It can be seen that, with this solution, users can interact in the virtual space by dubbing, which diversifies the modes of interaction and improves the user experience.
Those skilled in the art will readily conceive of other embodiments of the present disclosure after considering the specification and practicing the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles and include common knowledge or conventional techniques in the art not disclosed herein. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (10)
1. An audio/video processing method, applied to a server, the method comprising:
obtaining a dubbing instruction issued by a first electronic device in a virtual space, wherein the first electronic device is an electronic device with live-streaming permission in the virtual space;
determining a preset dubbing type corresponding to the dubbing instruction;
determining an audio/video to be dubbed;
upon obtaining a dubbing start instruction issued by the first electronic device, playing a voice-free video corresponding to the audio/video to be dubbed according to the preset dubbing type;
while the voice-free video is playing, obtaining dubbed audio corresponding to the voice-free video and sending the dubbed audio to a second electronic device, wherein the second electronic device is an electronic device with live-streaming viewing permission in the virtual space.
2. The method according to claim 1, wherein the preset dubbing type is an anchor show type, and the step of playing the voice-free video corresponding to the audio/video to be dubbed according to the preset dubbing type comprises:
controlling the first electronic device and the second electronic device to play the voice-free video corresponding to the audio/video to be dubbed simultaneously.
3. The method according to claim 1, wherein the preset dubbing type is a multi-anchor battle type, and the step of playing the voice-free video corresponding to the audio/video to be dubbed according to the preset dubbing type comprises:
determining the battle order of the first electronic device corresponding to each anchor;
controlling, according to the battle order, the first electronic devices and their corresponding second electronic devices to play, in turn, the voice-free video corresponding to the audio/video to be dubbed.
4. The method according to claim 1, wherein the preset dubbing type is a multi-person dubbing type, and the step of playing the voice-free video corresponding to the audio/video to be dubbed according to the preset dubbing type comprises:
controlling the second electronic devices corresponding to the users in an instant messaging region of the virtual space to play the voice-free video corresponding to the audio/video to be dubbed simultaneously.
5. The method according to claim 4, wherein the step of controlling the second electronic devices corresponding to the users in the instant messaging region of the virtual space to play the voice-free video corresponding to the audio/video to be dubbed simultaneously comprises:
upon obtaining a broadcast message sent by the first electronic device, sending the audio/video to be dubbed and a start instruction to each second electronic device corresponding to a user in the instant messaging region of the virtual space, so that each second electronic device plays the voice-free video corresponding to the audio/video to be dubbed simultaneously upon receiving the start instruction.
6. The method according to any one of claims 1 to 5, wherein the step of determining the audio/video to be dubbed comprises:
obtaining a video uploaded by the first electronic device;
determining the uploaded video as the audio/video to be dubbed.
7. The method according to any one of claims 1 to 5, wherein the voice-free video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
inputting the amplitude spectrum into a pre-trained network model to obtain a voice mask matrix corresponding to the audio/video to be dubbed, wherein the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice mask matrices, and encodes the correspondence between amplitude spectra and voice mask matrices;
calculating a voice-free amplitude spectrum from the voice mask matrix and the amplitude spectrum;
determining, based on the voice-free amplitude spectrum, the voice-free video corresponding to the audio/video to be dubbed.
8. The method according to any one of claims 1 to 5, wherein the voice-free video is obtained by:
determining an amplitude spectrum corresponding to the audio signal of the audio/video to be dubbed;
inputting the amplitude spectrum into a pre-trained network model to obtain voice-free audio corresponding to the audio/video to be dubbed, wherein the network model is trained on pre-obtained amplitude spectrum samples and their corresponding voice-free audio, and encodes the correspondence between amplitude spectra and voice-free audio;
determining, based on the voice-free audio, the voice-free video corresponding to the audio/video to be dubbed.
9. An audio/video processing method, applied to a first electronic device, wherein the first electronic device is an electronic device with live-streaming permission in a virtual space, the method comprising:
obtaining a dubbing instruction in the virtual space;
determining a preset dubbing type corresponding to the dubbing instruction;
determining an audio/video to be dubbed;
upon obtaining a dubbing start instruction, playing a voice-free video corresponding to the audio/video to be dubbed according to the preset dubbing type;
while the voice-free video is playing, obtaining dubbed audio corresponding to the voice-free video and sending the dubbed audio to a server.
10. An audio/video processing method, applied to a second electronic device, wherein the second electronic device is an electronic device with live-streaming viewing permission in a virtual space, the method comprising:
upon obtaining a dubbing start instruction in the virtual space, playing a pre-obtained voice-free video corresponding to an audio/video to be dubbed;
while the voice-free video is playing, upon obtaining dubbed audio corresponding to the voice-free video, playing the dubbed audio.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910641537.9A CN110392273B (en) | 2019-07-16 | 2019-07-16 | Audio and video processing method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110392273A true CN110392273A (en) | 2019-10-29 |
CN110392273B CN110392273B (en) | 2023-08-08 |
Family
ID=68284991
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910641537.9A Active CN110392273B (en) | 2019-07-16 | 2019-07-16 | Audio and video processing method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110392273B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111640442A (en) * | 2020-06-01 | 2020-09-08 | 北京猿力未来科技有限公司 | Method for processing audio packet loss, method for training neural network and respective devices |
CN112261435A (en) * | 2020-11-06 | 2021-01-22 | 腾讯科技(深圳)有限公司 | Social interaction method, device, system, equipment and storage medium |
CN112954377A (en) * | 2021-02-04 | 2021-06-11 | 广州繁星互娱信息科技有限公司 | Live broadcast fighting picture display method, live broadcast fighting method and device |
Citations (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101261864A (en) * | 2008-04-21 | 2008-09-10 | 中兴通讯股份有限公司 | A method and system for mixing recording voice at a mobile terminal |
US20110008022A1 (en) * | 2009-07-13 | 2011-01-13 | Lee Alex Y | System and Methods for Recording a Compressed Video and Audio Stream |
US8010692B1 (en) * | 2009-11-05 | 2011-08-30 | Adobe Systems Incorporated | Adapting audio and video content for hardware platform |
CN102325173A (en) * | 2011-08-30 | 2012-01-18 | 重庆抛物线信息技术有限责任公司 | Mixed audio and video sharing method and system |
CN102752499A (en) * | 2011-12-29 | 2012-10-24 | 新奥特(北京)视频技术有限公司 | System dubbing through dubbing-free workstation |
US20140098177A1 (en) * | 2012-10-09 | 2014-04-10 | Tv Ears, Inc. | Mobile application for accessing television audio |
US20140143218A1 (en) * | 2012-11-20 | 2014-05-22 | Apple Inc. | Method for Crowd Sourced Multimedia Captioning for Video Content |
CN104135667A (en) * | 2014-06-10 | 2014-11-05 | 腾讯科技(深圳)有限公司 | Video remote explanation synchronization method, terminal equipment and system |
CN105847913A (en) * | 2016-05-20 | 2016-08-10 | 腾讯科技(深圳)有限公司 | Live video broadcast control method, mobile terminal and system |
WO2016184295A1 (en) * | 2015-05-19 | 2016-11-24 | 腾讯科技(深圳)有限公司 | Instant messenger method, user equipment and system |
CN106534618A (en) * | 2016-11-24 | 2017-03-22 | 广州爱九游信息技术有限公司 | Method, device and system for realizing pseudo field interpretation |
WO2017181594A1 (en) * | 2016-04-19 | 2017-10-26 | 乐视控股(北京)有限公司 | Video display method and apparatus |
CN107452389A (en) * | 2017-07-20 | 2017-12-08 | 大象声科(深圳)科技有限公司 | A kind of general monophonic real-time noise-reducing method |
CN107484016A (en) * | 2017-09-05 | 2017-12-15 | 深圳Tcl新技术有限公司 | Video dubs switching method, television set and computer-readable recording medium |
CN107492383A (en) * | 2017-08-07 | 2017-12-19 | 上海六界信息技术有限公司 | Screening technique, device, equipment and the storage medium of live content |
WO2018018482A1 (en) * | 2016-07-28 | 2018-02-01 | 北京小米移动软件有限公司 | Method and device for playing sound effects |
WO2018095219A1 (en) * | 2016-11-24 | 2018-05-31 | 腾讯科技(深圳)有限公司 | Media information processing method and device |
CN108668151A (en) * | 2017-03-31 | 2018-10-16 | 腾讯科技(深圳)有限公司 | Audio/video interaction method and device |
US20180318713A1 (en) * | 2016-03-03 | 2018-11-08 | Tencent Technology (Shenzhen) Company Limited | A content presenting method, user equipment and system |
CN109119063A (en) * | 2018-08-31 | 2019-01-01 | 腾讯科技(深圳)有限公司 | Video dubs generation method, device, equipment and storage medium |
CN109151592A (en) * | 2018-09-21 | 2019-01-04 | 广州华多网络科技有限公司 | Connect the interactive approach, device and server of wheat across channel |
CN109151565A (en) * | 2018-09-04 | 2019-01-04 | 北京达佳互联信息技术有限公司 | Play method, apparatus, electronic equipment and the storage medium of voice |
CN109361954A (en) * | 2018-11-02 | 2019-02-19 | 腾讯科技(深圳)有限公司 | Method for recording, device, storage medium and the electronic device of video resource |
CN109361930A (en) * | 2018-11-12 | 2019-02-19 | 广州酷狗计算机科技有限公司 | Method for processing business, device and computer readable storage medium |
CN109587509A (en) * | 2018-11-27 | 2019-04-05 | 广州市百果园信息技术有限公司 | Live-broadcast control method, device, computer readable storage medium and terminal |
CN109710798A (en) * | 2018-12-28 | 2019-05-03 | 北京金山安全软件有限公司 | Music performance evaluation method and device |
CN109830245A (en) * | 2019-01-02 | 2019-05-31 | 北京大学 | A kind of more speaker's speech separating methods and system based on beam forming |
Patent Citations (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101261864A (en) * | 2008-04-21 | 2008-09-10 | 中兴通讯股份有限公司 | A method and system for mixing recording voice at a mobile terminal |
US20110008022A1 (en) * | 2009-07-13 | 2011-01-13 | Lee Alex Y | System and Methods for Recording a Compressed Video and Audio Stream |
US8010692B1 (en) * | 2009-11-05 | 2011-08-30 | Adobe Systems Incorporated | Adapting audio and video content for hardware platform |
CN102325173A (en) * | 2011-08-30 | 2012-01-18 | 重庆抛物线信息技术有限责任公司 | Mixed audio and video sharing method and system |
CN102752499A (en) * | 2011-12-29 | 2012-10-24 | 新奥特(北京)视频技术有限公司 | System dubbing through dubbing-free workstation |
US20140098177A1 (en) * | 2012-10-09 | 2014-04-10 | Tv Ears, Inc. | Mobile application for accessing television audio |
US20140143218A1 (en) * | 2012-11-20 | 2014-05-22 | Apple Inc. | Method for Crowd Sourced Multimedia Captioning for Video Content |
CN104135667A (en) * | 2014-06-10 | 2014-11-05 | 腾讯科技(深圳)有限公司 | Remote video commentary synchronization method, terminal device, and system |
WO2016184295A1 (en) * | 2015-05-19 | 2016-11-24 | 腾讯科技(深圳)有限公司 | Instant messaging method, user equipment, and system |
US20180318713A1 (en) * | 2016-03-03 | 2018-11-08 | Tencent Technology (Shenzhen) Company Limited | A content presenting method, user equipment and system |
WO2017181594A1 (en) * | 2016-04-19 | 2017-10-26 | 乐视控股(北京)有限公司 | Video display method and apparatus |
CN105847913A (en) * | 2016-05-20 | 2016-08-10 | 腾讯科技(深圳)有限公司 | Live video broadcast control method, mobile terminal and system |
WO2018018482A1 (en) * | 2016-07-28 | 2018-02-01 | 北京小米移动软件有限公司 | Method and device for playing sound effects |
CN106534618A (en) * | 2016-11-24 | 2017-03-22 | 广州爱九游信息技术有限公司 | Method, apparatus, and system for simulated live commentary |
WO2018095219A1 (en) * | 2016-11-24 | 2018-05-31 | 腾讯科技(深圳)有限公司 | Media information processing method and device |
CN108668151A (en) * | 2017-03-31 | 2018-10-16 | 腾讯科技(深圳)有限公司 | Audio/video interaction method and device |
CN107452389A (en) * | 2017-07-20 | 2017-12-08 | 大象声科(深圳)科技有限公司 | General single-channel real-time noise reduction method |
CN107492383A (en) * | 2017-08-07 | 2017-12-19 | 上海六界信息技术有限公司 | Live content screening method, apparatus, device, and storage medium |
CN107484016A (en) * | 2017-09-05 | 2017-12-15 | 深圳Tcl新技术有限公司 | Video dubbing switching method, television, and computer-readable storage medium |
CN109119063A (en) * | 2018-08-31 | 2019-01-01 | 腾讯科技(深圳)有限公司 | Video dubbing generation method, apparatus, device, and storage medium |
CN109151565A (en) * | 2018-09-04 | 2019-01-04 | 北京达佳互联信息技术有限公司 | Voice playback method, apparatus, electronic device, and storage medium |
CN109151592A (en) * | 2018-09-21 | 2019-01-04 | 广州华多网络科技有限公司 | Cross-channel co-streaming (Lianmai) interaction method, apparatus, and server |
CN109361954A (en) * | 2018-11-02 | 2019-02-19 | 腾讯科技(深圳)有限公司 | Video resource recording method, apparatus, storage medium, and electronic device |
CN109361930A (en) * | 2018-11-12 | 2019-02-19 | 广州酷狗计算机科技有限公司 | Service processing method, apparatus, and computer-readable storage medium |
CN109587509A (en) * | 2018-11-27 | 2019-04-05 | 广州市百果园信息技术有限公司 | Live broadcast control method, apparatus, computer-readable storage medium, and terminal |
CN109710798A (en) * | 2018-12-28 | 2019-05-03 | 北京金山安全软件有限公司 | Music performance evaluation method and device |
CN109830245A (en) * | 2019-01-02 | 2019-05-31 | 北京大学 | Multi-speaker speech separation method and system based on beamforming |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111640442A (en) * | 2020-06-01 | 2020-09-08 | 北京猿力未来科技有限公司 | Method for processing audio packet loss, method for training neural network and respective devices |
CN111640442B (en) * | 2020-06-01 | 2023-05-23 | 北京猿力未来科技有限公司 | Method for processing audio packet loss, method for training neural network and respective devices |
CN112261435A (en) * | 2020-11-06 | 2021-01-22 | 腾讯科技(深圳)有限公司 | Social interaction method, device, system, equipment and storage medium |
CN112261435B (en) * | 2020-11-06 | 2022-04-08 | 腾讯科技(深圳)有限公司 | Social interaction method, device, system, equipment and storage medium |
CN112954377A (en) * | 2021-02-04 | 2021-06-11 | 广州繁星互娱信息科技有限公司 | Live battle screen display method, live battle method, and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN110392273B (en) | 2023-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107027050B (en) | Audio and video processing method and device for assisting live broadcast | |
Rottondi et al. | An overview on networked music performance technologies | |
CN110392273A (en) | Audio and video processing method, apparatus, electronic device, and storage medium | |
CN111818004B (en) | Cloud game live broadcast method and system and computer readable storage medium | |
CN110910860B (en) | Online KTV implementation method and device, electronic equipment and storage medium | |
CN112019874B (en) | Live co-streaming (Lianmai) method and related device | |
CN110390927B (en) | Audio processing method and device, electronic equipment and computer readable storage medium | |
Alexandraki et al. | Exploring new perspectives in network music performance: The DIAMOUSES framework | |
WO2007111842A2 (en) | Method and system for low latency high quality music conferencing | |
US20060242676A1 (en) | Live streaming broadcast method, live streaming broadcast device, live streaming broadcast system, program, recording medium, broadcast method, and broadcast device | |
CN102845076A (en) | Display apparatus, control apparatus, television receiver, method of controlling display apparatus, program, and recording medium | |
CN107018466A (en) | Enhanced audio recording | |
CN108616800A (en) | Audio playback method and apparatus, storage medium, and electronic device | |
CN103945258B (en) | Channel switching method and television receiver | |
US7559079B2 (en) | Realtime service system using the interactive data communication and method thereof | |
CN110099242A (en) | Remote live broadcast method and apparatus | |
CN114339302B (en) | Broadcast directing method, apparatus, device, and computer storage medium | |
Cairns et al. | Recording music in the metaverse: a case study of XR BBC Maida Vale Recording Studios | |
CN104954730B (en) | Video playback method and apparatus | |
CN110798640A (en) | Full high-definition recording and broadcasting method | |
RU2527732C2 (en) | Method of providing sound for a video broadcast | |
CN109951650B (en) | Campus radio station system | |
KR100874024B1 (en) | Station and method for internet broadcasting of interactive content, and recording medium storing a program implementing the same | |
CN112133300B (en) | Multi-device interaction method, related device and system | |
TW201630416A (en) | Network synchronous coordinating performance system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||