CN110062267A - Live-streaming data processing method and apparatus, electronic device, and readable storage medium - Google Patents
- Publication number
- CN110062267A (application CN201910368522.XA)
- Authority
- CN
- China
- Prior art keywords
- style
- timbre
- live streaming
- network parameter
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]; H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB] (common parents of all entries below)
- H04N21/42607—Internal components of the client for processing the incoming bitstream
- H04N21/4312—Generation of visual interfaces for content selection or interaction involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
- H04N21/4398—Processing of audio elementary streams involving reformatting operations of audio signals
- H04N21/44218—Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
- H04N21/4788—Supplemental services communicating with other users, e.g. chatting
- H04N21/4884—Data services for displaying subtitles
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Computer Networks & Wireless Communication (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Embodiments of the present application provide a live-streaming data processing method and apparatus, an electronic device, and a readable storage medium. First voice data having a target timbre style is processed by a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style, and a style switching network whose parameters have been adjusted to the target network parameters performs style conversion on second voice data input by the anchor. A live interactive data stream for a virtual live-streaming avatar is generated from the resulting third voice data having the target timbre style and sent to the live-streaming receiver terminal for playback. In this way, for any anchor, and without altering the audio content, the timbre heard during the virtual avatar's live broadcast can be converted into an arbitrary timbre style for interacting with viewers, thereby improving the interaction effect during the broadcast and better mobilizing interaction between viewers and the anchor.
Description
Technical field
This application relates to the field of internet live streaming, and in particular to a live-streaming data processing method and apparatus, an electronic device, and a readable storage medium.
Background
In internet live streaming, using a virtual avatar in place of the anchor's real appearance to participate in live interaction is currently a popular form of broadcasting.
In current live broadcasts, the timbre of the virtual avatar is usually either the anchor's original timbre or a single timbre style fixed in advance, and the live data stream cannot be converted into other timbre styles for interacting with viewers. This fails to satisfy particular demands of specific anchors or niche audiences, and therefore degrades the interactive live-streaming experience. For example, viewers may prefer to hear the timbre of a star they like or of someone they know. As another example, an anchor may not want to expose his or her own timbre to viewers out of privacy concerns.
Summary of the invention
In view of this, embodiments of the present application aim to provide a live-streaming data processing method and apparatus, an electronic device, and a readable storage medium to solve the above problems.
According to one aspect of the embodiments of the present application, an electronic device is provided, which may include one or more storage media and one or more processors in communication with the storage media. The one or more storage media store machine-executable instructions executable by the processors. When the electronic device runs, the processors execute the machine-executable instructions to perform the live-streaming data processing method.
According to another aspect of the embodiments, a live-streaming data processing method is provided, applied to a live-streaming provider terminal. The method comprises:
parsing a received timbre conversion request to obtain a target timbre style;
obtaining first voice data having the target timbre style, and inputting the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style;
adjusting the network parameters of a pre-stored style switching network to the target network parameters, and performing style conversion on second voice data input by the anchor with the adjusted style switching network, obtaining third voice data having the target timbre style;
generating a live interactive data stream for a virtual live-streaming avatar from the third voice data, and sending it to a live-streaming receiver terminal for playback.
According to another aspect of the embodiments, a live-streaming data processing apparatus is provided, applied to a live-streaming provider terminal. The apparatus comprises:
a parsing module, for parsing a received timbre conversion request to obtain a target timbre style;
an input module, for obtaining first voice data having the target timbre style and inputting the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style;
a style conversion module, for adjusting the network parameters of a pre-stored style switching network to the target network parameters and performing style conversion on second voice data input by the anchor with the adjusted style switching network, obtaining third voice data having the target timbre style;
a generating and sending module, for generating a live interactive data stream for a virtual live-streaming avatar from the third voice data and sending it to a live-streaming receiver terminal for playback.
According to another aspect of the embodiments, a readable storage medium is provided, on which machine-executable instructions are stored; when the instructions are run by a processor, the steps of the above live-streaming data processing method can be performed.
Based on any of the above aspects, compared with the prior art, the embodiments of the present application process first voice data having a target timbre style with a pre-trained network parameter learning model to obtain the target network parameters corresponding to that style, perform style conversion on second voice data input by the anchor with a style switching network adjusted to those target network parameters, generate a live interactive data stream for a virtual avatar from the resulting third voice data having the target timbre style, and send it to the receiver terminal for playback. Thus, for any anchor, and without altering the audio content, the timbre heard during the virtual avatar's live broadcast can be converted into an arbitrary timbre style for interacting with viewers, improving the interaction effect during the broadcast and better mobilizing interaction between viewers and the anchor.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of the application and are therefore not to be taken as limiting its scope; those of ordinary skill in the art may derive other related drawings from them without creative effort.
Fig. 1 is a schematic diagram of the live-streaming system provided by an embodiment of the present application;
Fig. 2 is the first flow diagram of the live-streaming data processing method provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of an interface for selecting the target timbre style in a live-streaming internet application provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of the style conversion process provided by an embodiment of the present application;
Fig. 5 is a schematic diagram of the live-streaming interface of the provider terminal provided by an embodiment of the present application;
Fig. 6 is the second flow diagram of the live-streaming data processing method provided by an embodiment of the present application;
Fig. 7 is a flow diagram of the sub-steps included in step S101 shown in Fig. 6;
Fig. 8 is a training flow diagram of the style transformation model provided by an embodiment of the present application;
Fig. 9 is a schematic diagram of the electronic device provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the application. The components of the embodiments, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations.
Unless indicated to the contrary, ordinal terms such as "first", "second", and "third" in the embodiments of the present application are used to distinguish multiple objects and do not limit the order, timing, position, priority, or importance of those objects.
Referring to Fig. 1, Fig. 1 is an architecture diagram of the live-streaming system 10 provided by embodiments of the present application. For example, the live-streaming system 10 may be a service platform for internet live streaming. The system 10 may include a live-streaming server 200, a live-streaming provider terminal 100, and a live-streaming receiver terminal 300, with the server 200 communicatively connected to both terminals to provide live-streaming services for them. For example, the provider terminal 100 may send a room's live video stream to the server 200, and viewers may pull the stream from the server 200 through the receiver terminal 300 to watch the room's live video. As another example, the server 200 may send a notification message to a viewer's receiver terminal 300 when a room the viewer subscribes to goes live. The live video stream may be the stream currently being broadcast on the platform, or the complete stream formed after the broadcast finishes.
The provider terminal 100 and the receiver terminal 300 may have internet products installed for providing internet live-streaming services, for example applications (apps), web pages, or mini-programs related to internet live streaming and used on a computer or smartphone.
In this embodiment, the system 10 may also include a video acquisition device 400 for capturing the anchor's video frames. The device 400 may be directly mounted on or integrated into the provider terminal 100, or may be independent of the provider terminal 100 and connected to it.
Referring to Fig. 2, Fig. 2 shows a flow diagram of the live-streaming data processing method provided by embodiments of the present application. The method may be executed by the provider terminal 100 shown in Fig. 1. Its detailed steps are described below.
Step S110: parse the received timbre conversion request to obtain the target timbre style.
Step S120: obtain first voice data having the target timbre style, and input the first voice data into the pre-trained network parameter learning model to obtain the target network parameters corresponding to the target timbre style.
Step S130: adjust the network parameters of the pre-stored style switching network to the target network parameters, and perform style conversion on the second voice data input by the anchor with the adjusted style switching network, obtaining third voice data having the target timbre style.
Step S140: generate the live interactive data stream for the virtual avatar from the third voice data, and send it to the receiver terminal 300 for playback.
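As a reading aid, the data flow of steps S110 to S140 can be stubbed out as below. Every function body is a placeholder that only shows what each step consumes and produces; none of the names, field keys, or the "gain" parameter come from the patent, and the real models and stream format are not represented.

```python
# Stubbed data flow of steps S110-S140; every body is a placeholder.
def process_live_data(request: dict, anchor_audio: list) -> dict:
    style = request["target_style"]                  # S110: parse the request
    ref_audio = fetch_reference_audio(style)         # S120: first voice data
    params = learn_network_params(ref_audio)         # S120: target parameters
    converted = convert_style(anchor_audio, params)  # S130: third voice data
    return {"style": style, "audio": converted}      # S140: interactive stream

def fetch_reference_audio(style: str) -> list:
    return [0.0, 0.1]                    # placeholder reference audio

def learn_network_params(ref: list) -> dict:
    return {"gain": 2.0}                 # placeholder "network parameters"

def convert_style(audio: list, params: dict) -> list:
    return [s * params["gain"] for s in audio]

stream = process_live_data({"target_style": "style-A"}, [0.5, 0.25])
```

The point of the stubs is the shape of the pipeline: the request determines the parameters, the parameters determine the conversion, and only the converted audio reaches the receiver terminal.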
In this embodiment, for step S110, after receiving a timbre conversion request, the provider terminal 100 may parse from it the target timbre style selected by the anchor or by a viewer who has entered the room. The target timbre style can be understood as the timbre the anchor, or a viewer in the anchor's room, wishes to hear in the live audio. For example, the anchor may want his or her output audio to sound like the timbre of an admired idol or a known friend, or like a particular speaking accent (such as a "Taiwanese accent" or "Beijing accent"). As another example, some viewers may want the anchor's output audio to sound like the timbre of an idol they like or of someone they know. Accordingly, the timbre conversion request may be issued either by the anchor's provider terminal 100 or by the receiver terminal 300 of a viewer who has entered the anchor's room.
For example, a selection interface for the target timbre style may be provided in the live-streaming internet application installed on the provider terminal 100 or the receiver terminal 300. The interface shows options for multiple different timbre styles; the anchor, or a viewer in the anchor's room, selects from the interface the option corresponding to the desired target timbre style, and the provider terminal 100 or the receiver terminal 300 then generates the corresponding timbre conversion request.
By way of example only, Fig. 3 shows a schematic diagram of the interface of the live-streaming internet application installed on the provider terminal 100 or the receiver terminal 300. Options for different timbre styles are shown in the interface, including timbre style A, timbre style B, timbre style C, and timbre style D; the anchor, or a viewer in the anchor's room, can select the option corresponding to the desired target timbre style. For example, if the anchor likes the timbre of a familiar friend A, and timbre style A is friend A's timbre, the anchor can choose timbre style A, and the provider terminal 100 then generates the corresponding timbre conversion request. As another example, if viewers in the anchor's room like the timbre of a certain singer, and timbre style B is that singer's timbre, a viewer can choose timbre style B, and the receiver terminal 300 then generates the corresponding timbre conversion request.
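The patent does not specify a wire format for the timbre conversion request exchanged between the terminals; a minimal JSON sketch, with every field name assumed for illustration, could look like this:

```python
import json

def make_timbre_request(requester_id: str, room_id: str, style_id: str) -> str:
    """Serialize a timbre-conversion request; field names are hypothetical."""
    return json.dumps({"requester": requester_id,
                       "room": room_id,
                       "target_style": style_id})

def parse_timbre_request(raw: str) -> str:
    """Step S110 as described above: parse the request and recover the
    target timbre style."""
    return json.loads(raw)["target_style"]

req = make_timbre_request("viewer-42", "room-7", "style-B")
target = parse_timbre_request(req)
```

Whether the request comes from the provider terminal or a receiver terminal, the parsing step only needs the target style identifier, which is why step S110 is described purely as "parse the request to obtain the target timbre style".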
Referring to Fig. 4, which shows a schematic diagram of the style conversion process in the embodiments of the present application, the foregoing embodiment is illustrated below with reference to Fig. 4.
For step S120, the provider terminal 100 may locally pre-store audio data corresponding to various timbre styles; after the target timbre style is determined, first voice data having the target timbre style can then be looked up locally. Alternatively, the live-streaming server 200 may provide audio data corresponding to various timbre styles, and the first voice data having the target timbre style may be obtained from the server 200 after the target style is determined.
On this basis, the provider terminal 100 can input the first voice data into the pre-trained network parameter learning model to obtain the target network parameters corresponding to the target timbre style.
The network parameter learning model can learn the style network parameters corresponding to various different timbre styles. For example, it may be obtained by training a deep-learning neural network using first speech samples of at least one timbre style and second speech samples of any anchor, where the at least one timbre style includes the target timbre style. In this way, for input audio data of any timbre style, the model can output the network parameters corresponding to that timbre style, so there is no need to train a separate style switching network for each timbre style, which greatly reduces the amount of training.
As a possible implementation of step S120, a reference style feature map corresponding to the first voice data is first extracted, and the reference style feature map is then input into the network parameter learning model to obtain the target network parameters corresponding to the target timbre style.
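The step just described (extract a reference style representation, feed it to the network parameter learning model, receive the target network parameters) behaves like what the deep-learning literature calls a hypernetwork: one network emits the weights of another. The sketch below illustrates only that data flow with a single linear layer; the dimensions, the random weights, and both function names are assumptions for illustration, not the patent's actual models.

```python
# Hypernetwork-style sketch of the "network parameter learning model":
# a reference style feature vector is mapped to the weights of a small
# style-switching layer. All shapes and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

STYLE_DIM = 16      # size of the reference style feature vector
CONTENT_DIM = 8     # size of one content feature frame

# Fixed (notionally pre-trained) weights of the parameter learning model.
W_hyper = rng.standard_normal((CONTENT_DIM * CONTENT_DIM, STYLE_DIM)) * 0.1

def predict_style_params(style_vec: np.ndarray) -> np.ndarray:
    """Emit a (CONTENT_DIM x CONTENT_DIM) weight matrix for the style
    switching network from one reference style feature vector."""
    flat = W_hyper @ style_vec
    return flat.reshape(CONTENT_DIM, CONTENT_DIM)

def apply_style_network(content: np.ndarray, params: np.ndarray) -> np.ndarray:
    """The style switching network: here a single linear map whose
    parameters came from the hypernetwork, not from per-style training."""
    return content @ params.T

style_vec = rng.standard_normal(STYLE_DIM)    # from the reference audio
params = predict_style_params(style_vec)      # the "target network parameters"
frame = rng.standard_normal(CONTENT_DIM)      # one content feature frame
converted = apply_style_network(frame, params)
```

Because the style switching network's weights come from `predict_style_params` rather than from per-style training, supporting a new timbre only requires a new reference style vector, which matches the training-reduction argument made above.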
The inventors found through research that any segment of audio data (such as the first voice data) can be represented by a continuous waveform diagram. Based on this, one exemplary way of extracting the reference style feature map corresponding to the first voice data is as follows: cut the first voice data at preset intervals (e.g. every 10 seconds) to obtain multiple data segments; then extract the audiogram, spectrogram, or speech spectrogram of each data segment, or the image obtained after applying an image-processing transformation to each segment's audiogram, spectrogram, or speech spectrogram, as the audio feature map. The audio feature map then comprises a content feature map and the aforementioned reference style feature map. The reference style feature map can be used to represent the style features of the first voice data, such as the timbre style; the content feature map can be used to represent the content features of the first voice data, such as the volume and speech content.
In this way, by cutting the first voice data into segments, this embodiment can avoid stutter on the provider terminal 100 caused by processing too much audio data at once; moreover, the segments obtained by cutting have a consistent duration, which facilitates subsequent processing.
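The fixed-interval cutting described above can be sketched as follows. The 10-second default, the zero-padding of the final segment, and the function name are assumptions; the patent only specifies that the segments have a consistent duration so that downstream feature maps stay uniform.

```python
# Sketch of the segmentation step: cut a waveform into fixed-length
# pieces (e.g. 10 s) so each piece yields a feature map of uniform size.
from typing import List

def split_waveform(samples: List[float], sample_rate: int,
                   segment_seconds: float = 10.0) -> List[List[float]]:
    """Cut `samples` into consecutive segments of `segment_seconds` each.

    The final partial segment is zero-padded so that every segment has
    the same length, keeping the downstream feature maps uniform.
    """
    seg_len = int(sample_rate * segment_seconds)
    segments = []
    for start in range(0, len(samples), seg_len):
        seg = samples[start:start + seg_len]
        if len(seg) < seg_len:                       # pad the tail segment
            seg = seg + [0.0] * (seg_len - len(seg))
        segments.append(seg)
    return segments

# A 25-second clip at 8 kHz becomes three uniform 10-second segments.
clip = [0.1] * (25 * 8000)
parts = split_waveform(clip, sample_rate=8000)
```

Streaming the segments one at a time, rather than the whole recording, is what bounds the per-step workload on the provider terminal.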
For step S130, after the network parameter learning model outputs the target network parameters corresponding to the target timbre style, the network parameters of the pre-stored style switching network can be adjusted to those target network parameters. The adjusted style switching network can then convert the timbre style of any anchor's audio data into the target timbre style, without a style switching network having to be trained separately for the target timbre.
After any anchor starts a broadcast by launching the live-streaming internet application installed on the provider terminal 100 and entering a room, data such as the live video stream, live pictures, live audio, and text bullet comments can be generated during the broadcast and sent via the server 200 to the receiver terminals 300 of the viewers in the room. In this process, first, the audio feature map of the second voice data input by the anchor is extracted by the feature extraction network; the audio feature map comprises a content feature map and a style feature map. Then, the adjusted style switching network processes the content feature map to obtain a style conversion feature map having the target timbre style. Finally, a feature inverse transform is performed on the content feature map and the style conversion feature map to obtain the third voice data having the target timbre style.
In detail, since the style conversion feature map replaces the style feature map of the original audio feature map, the content feature map of the original audio feature map together with the converted style conversion feature map can be understood as an audio feature map having the target timbre style. On this basis, to generate audio data the viewers can hear, this embodiment also performs a feature inverse transform on the content feature map and the converted style map, obtaining the third voice data having the target timbre style. The third voice data thus integrates the content feature map of the second voice data with the style features of the converted style map, achieving the auditory effect of the target timbre style while leaving the content of the second voice data unchanged.
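The patent does not disclose the internals of the style switching network or of the feature inverse transform. As a stand-in, the toy below performs the "swap the style, keep the content" step by re-normalizing per-channel statistics of a spectrogram-like feature map toward the target style, a common style-transfer trick (adaptive instance normalization); everything here is an assumption used only to make the swap-then-invert idea concrete.

```python
# Toy rendition of the step S130 swap: treat per-channel statistics of a
# spectrogram-like feature map as "style" and re-normalize the anchor's
# content features to the target statistics (an AdaIN-style operation).
import numpy as np

def adain_style_swap(content_feat: np.ndarray,
                     target_style_feat: np.ndarray,
                     eps: float = 1e-6) -> np.ndarray:
    """Re-scale each channel of `content_feat` (channels x frames) to
    match the per-channel mean/std of `target_style_feat`."""
    c_mu = content_feat.mean(axis=1, keepdims=True)
    c_sd = content_feat.std(axis=1, keepdims=True) + eps
    t_mu = target_style_feat.mean(axis=1, keepdims=True)
    t_sd = target_style_feat.std(axis=1, keepdims=True) + eps
    return (content_feat - c_mu) / c_sd * t_sd + t_mu

rng = np.random.default_rng(1)
anchor_feat = rng.standard_normal((32, 100))           # anchor's feature map
target_feat = rng.standard_normal((32, 100)) * 3 + 5   # target-timbre map

converted = adain_style_swap(anchor_feat, target_feat)
```

In the patent's terms, `converted` would play the role of the audio feature map carrying the target timbre style, and a feature inverse transform (e.g. an inverse spectrogram step) would then turn it back into the audible third voice data.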
It is worth noting that although some existing voice changers can alter a voice (for example into an old man's voice or a child's voice), the converted sound in such schemes is unsatisfactory: it falls short of a lifelike effect and still cannot be converted into a desired timbre style. With the technical solution provided by this embodiment, the converted timbre is the timbre of the desired target timbre style and has an extremely lifelike effect.
For step S140, to make the live interaction more engaging, a virtual avatar can replace the anchor's real appearance in the room's display interface to interact with viewers. The virtual avatar may be a virtual character consistent with the anchor's appearance, posture, and temperament; for example, a two-dimensional or three-dimensional character may be used, and it may also be a cartoon character or a realistic human figure. For example, the virtual avatar can imitate the anchor's expressions, movements, and other characteristic attributes in real time to interact with viewers on the anchor's behalf; that is, viewers, any one of whom may be among the anchor's many subscribed fans, can interact with the anchor through the virtual avatar. Specifically, during the broadcast, the anchor's limb movements, facial expressions, audio data, and so on can be captured and recognized, combined with the virtual avatar for playback, and then forwarded to the server 200, from which the receiver terminals 300 of viewers entering the room pull the live data stream for viewing. In this way, the virtual avatar gives viewers an impression similar to the actual anchor's real movements and voice. For example, viewers may see a cartoon-dinosaur virtual character, yet the movements and voice of this cartoon dinosaur are driven by real-time movement and audio data transmitted from the anchor.
After the live-streaming providing terminal 100 generates the aforementioned third voice data, it can generate a live interactive data stream of the virtual avatar in real time and send it to the live-streaming receiving terminal 300 for playback. For example, the third voice data can be cut into multiple audio data segments according to a set time interval (such as 5 seconds, 10 seconds, etc.), and for each audio data segment, the content parameters of that segment are identified. The content parameters may include a content feature, an emotion feature, and an amplitude feature, where the emotion feature is used to control the emotional state of the virtual avatar and the amplitude feature is used to control the opening and closing of the virtual avatar's mouth shape. For example, if the recognized emotion feature is a parameter corresponding to a happy state, the value of the virtual avatar's emotion attribute can be adjusted to "smile" according to the emotion feature, which in turn controls the avatar's expression, movement, and posture. As another example, if the recognized content feature is "I am very happy", the content of the virtual avatar's action attribute can be adjusted in real time to perform a "clapping" action, while the avatar's expression attribute is adjusted to "smile".
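As an illustration of how a segment's recognized content parameters might drive avatar attributes, the following minimal sketch maps the emotion, text, and amplitude features described above to attribute values. All names and the mapping itself are hypothetical stand-ins, not specified by the patent.

```python
def avatar_attributes(content_params):
    """Derive avatar attribute values from one audio segment's content parameters."""
    attrs = {"emotion": "neutral", "action": "idle", "mouth_open": 0.0}
    # emotion feature controls the avatar's emotional state
    if content_params.get("emotion") == "happy":
        attrs["emotion"] = "smile"
    # content feature can trigger an action attribute in real time
    if "happy" in content_params.get("text", ""):
        attrs["action"] = "clap"
    # amplitude feature controls the mouth-shape opening and closing
    attrs["mouth_open"] = min(1.0, content_params.get("amplitude", 0.0))
    return attrs
```

For the "I am very happy" example in the text, `avatar_attributes({"emotion": "happy", "text": "I am very happy", "amplitude": 0.7})` would yield a smiling, clapping avatar with a partially open mouth.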
In other possible embodiments, the anchor's real-time expressions, movements, and posture can also be collected by the video acquisition device 400 shown in Fig. 1. For example, the position and angle of the anchor's face, the contour of the face, the positions of the facial features, eyeball rotation, eyelids and eyebrows, the motion state of the lips, gestures, and so on can be recognized; this information collected in real time is analyzed, the analysis results are converted into a custom set of control instructions, and these control instructions cause the virtual avatar on the interactive interface to imitate the collected expressions, movements, and posture in real time. For example, when the collected gesture is a hand pointing downward, the value of the virtual avatar's action attribute is adjusted in real time to perform a "sit down" action.
In this way, an interactive video segment of the virtual avatar corresponding to each audio data segment can be generated according to the content feature, emotion feature, and amplitude feature; each audio data segment and its corresponding interactive video segment are then synthesized to obtain the live interactive data stream of the virtual avatar, and the live interactive data stream of the virtual avatar is sent to the live-streaming receiving terminal 300 for playback.
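The segmentation-and-synthesis flow above can be sketched as follows: the third voice data is cut into fixed-interval segments, and each segment is paired with a rendered video segment. `render_segment` is a hypothetical callback standing in for the avatar renderer; the patent does not specify these interfaces.

```python
def split_into_segments(samples, sample_rate, interval_s=5):
    """Cut audio samples into fixed-length segments (the last one may be shorter)."""
    step = sample_rate * interval_s
    return [samples[i:i + step] for i in range(0, len(samples), step)]

def build_interactive_stream(samples, sample_rate, render_segment, interval_s=5):
    """Pair each audio segment with its rendered avatar video segment."""
    segments = split_into_segments(samples, sample_rate, interval_s)
    return [(seg, render_segment(seg)) for seg in segments]
```

With a 5-second interval, 23 seconds of audio would yield four full segments and one 3-second remainder, each synthesized with its own interactive video segment.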
For example, referring to Fig. 5, which shows an example live-streaming interface of the live-streaming providing terminal 100, the live-streaming interface may include a live-streaming interface display frame, an anchor video frame display box, a bullet-screen area, a virtual image area, and the text content XXXXX of each audio frame. The live-streaming interface display frame shows the video stream currently being broadcast on the platform, or the complete video stream formed after the broadcast ends; the anchor video frame display box shows the anchor video frames collected in real time by the video acquisition device; the virtual image area shows the anchor's virtual avatar and the avatar's live interactive data stream; and the bullet-screen area shows the interaction content between the audience and the anchor (such as AAAAA, BBBBB, CCCCC, DDDDD, EEEEE).
On this basis, the live-streaming providing terminal can adjust the characteristic attributes of the virtual avatar according to interaction information received from the live-streaming receiving terminal 300, so that viewers can interact virtually with the avatar. Taking the interface shown in Fig. 5 as an example, a viewer can send interaction information through the live-streaming receiving terminal 300, and the live-streaming providing terminal 100 can adjust the characteristic attributes of the virtual avatar accordingly, thereby completing the interaction between the live-streaming receiving terminal 300 and the virtual avatar displayed on the interactive interface. It should be appreciated that a preset correspondence may exist between the interaction information and the characteristic attributes of the virtual avatar; this preset correspondence can be established through a prior learning process and is not elaborated item by item here.
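A minimal sketch of such a preset correspondence, with a static lookup table standing in for the learned mapping; the message names and attribute values are hypothetical, not from the patent.

```python
# Hypothetical preset correspondence between viewer interaction messages
# and virtual avatar characteristic attributes.
INTERACTION_TO_ATTRIBUTE = {
    "wave": {"action": "wave_back"},
    "dance": {"action": "dance"},
    "sad": {"emotion": "comfort"},
}

def apply_interaction(avatar_state, message):
    """Return a new avatar state with the attributes mapped from the message."""
    new_state = dict(avatar_state)
    new_state.update(INTERACTION_TO_ATTRIBUTE.get(message, {}))
    return new_state
```

Unknown messages leave the avatar state unchanged, so the table can be extended (or replaced by a learned model) without touching the update logic.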
In this way, for any anchor, the present embodiment can convert the timbre style used during the virtual avatar's live streaming to an arbitrary timbre style without changing the audio content, so as to interact with the audience, thereby improving the interaction effect during live streaming and better motivating viewers to interact with the anchor.
As a possible embodiment, referring to Fig. 6, before the aforementioned step S110, the live data processing method provided in this embodiment may further include the following steps:
Step S101: obtain the network parameter learning model trained in advance on training samples. Referring to Fig. 7, step S101 may include the following sub-steps:
Sub-step S1011: obtain training samples, the training samples including first speech samples of at least one timbre style and second speech samples of any anchor.
In the present embodiment, the aforementioned at least one timbre style may include the target timbre style and other timbre styles, and the first speech samples may be any speech samples having the target timbre style or the other timbre styles. For example, if the target timbre style is the timbre style of a certain well-known person A, a large amount of audio data of person A can be collected as one set of first speech samples.
In the present embodiment, the second speech samples are not specifically limited; audio data of any anchor, or of any other user, can be collected as the second speech samples.
Referring to Fig. 8, the training process of the present embodiment involves a feature extraction network, a feature vector extraction network, and an initial conversion network. The training process of the style conversion model in step S101 is illustrated below by way of example with reference to Fig. 8.
Sub-step S1012: extract a corresponding content feature sample map from the second speech samples of any anchor.
As shown in Fig. 8, the content feature map of the second speech samples can be extracted by the feature extraction network, in the same manner as the audio feature map is extracted from the second voice data input by the anchor, as described above.
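The patent does not specify the feature extraction network, so as an illustrative stand-in, the sketch below derives a time-frequency "feature map" from raw audio with a log-magnitude short-time Fourier transform, a common front end for speech feature extraction. All parameter values here are assumptions.

```python
import numpy as np

def feature_map(audio, frame=256, hop=128):
    """Log-magnitude STFT as a stand-in 'content feature map' (freq x time)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(audio) - frame) // hop
    # slice the signal into overlapping windowed frames
    frames = np.stack([audio[i * hop:i * hop + frame] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))   # (time, freq) magnitudes
    return np.log1p(spec).T                      # (freq, time) feature map

x = np.sin(2 * np.pi * 440 * np.arange(4096) / 16000)  # 440 Hz test tone
fm = feature_map(x)   # shape (129, 31): 129 frequency bins x 31 frames
```

A real system would feed such maps into the learned feature extraction network rather than use them directly.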
Sub-step S1013: for each timbre style, extract a corresponding style feature sample map from the first speech samples of that timbre style.
As shown in Fig. 8, the style feature sample map of the first speech samples corresponding to each timbre style can be extracted by the feature extraction network, in the same manner described above.
Sub-step S1014: train the meta-learning network according to the content feature sample map and the style feature sample map corresponding to each timbre style, obtain the network parameter learning model, and store it in the live-streaming providing terminal 100.
The detailed training process of sub-step S1014 is illustrated below by way of example with reference to Fig. 8.
First, the style feature sample map corresponding to each timbre style is input into the meta-learning network to obtain the style network parameters of that timbre style.
Second, the network parameters of a preset style conversion network are adjusted according to the style network parameters of each timbre style, and the content feature sample map is input into the adjusted style conversion network to obtain the corresponding style-converted feature sample map.
Third, the network parameters of the meta-learning network are adjusted according to the style feature sample map of each timbre style and the corresponding style-converted feature sample map, to obtain the network parameter learning model.
In detail, as one implementation, a loss function value between the style feature sample map of each timbre style and the corresponding style-converted feature sample map can be calculated, the network parameters of the meta-learning network are updated according to the loss function value, and training is iterated; when the meta-learning network meets a training termination condition, the trained network parameter learning model is output.
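The three-step procedure can be sketched as a toy training loop. Everything below is illustrative, not from the patent: the "feature maps" are flat vectors, the style conversion network is reduced to a single gain applied to the content features, the meta-learning network is a linear map from the style feature to that gain, and the loss is a mean squared error; real architectures, losses, and shapes are unspecified in the source.

```python
import numpy as np

rng = np.random.default_rng(0)
style_feat = rng.normal(size=4)    # style feature sample (flattened toy "map")
content = rng.normal(size=16)      # content feature sample (flattened toy "map")
target = 2.0 * content             # pretend the true style doubles the content
w_meta = np.zeros(4)               # meta-learning network parameters

def loss_and_grad(w):
    gain = w @ style_feat                 # step 1: meta net -> style network parameter
    converted = gain * content            # step 2: adjusted style conversion network
    err = converted - target
    loss = float(np.mean(err ** 2))       # loss vs. the style feature sample
    grad = 2.0 * np.mean(err * content) * style_feat  # step 3: backprop into meta net
    return loss, grad

losses = []
for _ in range(500):                      # iterate until a stop condition would fire
    loss, grad = loss_and_grad(w_meta)
    losses.append(loss)
    w_meta -= 0.05 * grad                 # gradient step on the meta-learning net
```

The loop shows only the parameter flow (style feature, through meta network, to conversion-network parameters, to loss, back to meta network); a practical model would use deep networks and automatic differentiation.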
The above training termination condition may include at least one of the following three conditions:
1) the number of training iterations reaches a set number; 2) the loss function value falls below a set threshold; 3) the loss function value no longer decreases.
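The three conditions, used individually or in combination as described below, can be expressed as a single check over the loss history. The threshold and patience values are illustrative, and a patience window is just one common way to operationalize "the loss no longer decreases".

```python
def should_stop(losses, max_iters=1000, threshold=1e-3, patience=5):
    """Return True when any of the three termination conditions holds."""
    if len(losses) >= max_iters:              # 1) iteration cap reached
        return True
    if losses and losses[-1] < threshold:     # 2) loss below the set threshold
        return True
    # 3) loss no longer decreasing: no value in the last `patience` steps
    #    improved on the value just before that window
    if len(losses) > patience and min(losses[-patience:]) >= losses[-patience - 1]:
        return True
    return False
```

Dropping any of the three branches recovers the single-condition variants discussed in the text.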
In condition 1), a maximum number of iterations can be set in order to save computation; if the number of iterations reaches the set number, iteration in this cycle can be stopped, and the deep learning network finally obtained is used as the timbre conversion model. In condition 2), if the loss function value is below the set threshold, the current timbre conversion model can basically satisfy the requirements, and iteration can be stopped. In condition 3), if the loss function value no longer decreases, an optimal timbre conversion model has been formed, and iteration can be stopped.
It should be noted that the above iteration stopping conditions can be used in combination or individually. For example, iteration may be stopped when the loss function value no longer decreases, or when the number of iterations reaches the set number; alternatively, iteration may be stopped when the loss function value is below the set threshold and no longer decreases. In addition, in actual implementation, the training termination condition is not limited to the above examples; those skilled in the art can design training termination conditions different from the above examples according to actual needs.
Based on the network parameter learning model obtained in the above steps, network parameters corresponding to a timbre style can be output from input audio data of any timbre style, and the style conversion network configured with those parameters can then convert the timbre style during the virtual avatar's live streaming to the corresponding timbre style without changing the audio content of any anchor's audio data, so as to interact with the audience, thereby improving the interaction effect during live streaming and better motivating viewers to interact with the anchor. Moreover, the present embodiment no longer needs to train a separate style conversion model for each anchor or for each timbre style, which greatly reduces the amount of training.
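Conceptually, inference with the trained model reduces to one forward pass: the meta model maps a reference style feature map to network parameters, and those parameters configure the conversion network applied to the anchor's speech. All classes and callables in this sketch are toy stand-ins for the unspecified networks.

```python
class StyleConversionNet:
    """Toy conversion network whose single parameter is set externally."""
    def load_params(self, params):
        self.gain = params

    def __call__(self, content_map):
        return [self.gain * c for c in content_map]

def convert_voice(reference_audio, anchor_audio, meta_model, extract_features, net):
    """Reference audio of the target timbre -> network parameters -> style
    conversion of the anchor's speech, with no per-style retraining."""
    net.load_params(meta_model(extract_features(reference_audio)))  # one forward pass
    return net(extract_features(anchor_audio))
```

Adding a new target timbre only requires a new reference clip, not a new training run, which is the training-cost saving claimed above.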
Fig. 9 shows a schematic diagram of an electronic device provided by an embodiment of the present application. In this embodiment, the electronic device may refer to the live-streaming providing terminal 100 shown in Fig. 1, which includes a storage medium 110, a processor 120, and a live data processing apparatus 500.
The processor 120 may be a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application-specific integrated circuit (application-specific integrated circuit, ASIC), or one or more integrated circuits for controlling the execution of programs of the live data processing method provided by the above method embodiments.
The storage medium 110 may be a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The storage medium 110 may exist independently and be connected to the processor 120 through a communication bus, or may be integrated with the processor. The storage medium 110 is used to store application program code for executing the solution of the present application, such as the live data processing apparatus 500 shown in Fig. 9, and execution is controlled by the processor 120. The processor 120 is configured to execute the application program code stored in the storage medium 110, such as the live data processing apparatus 500, to perform the live data processing method of the above method embodiments.
The present application may divide the live data processing apparatus 500 into functional modules according to the above method embodiments. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The above integrated modules may be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the present application is schematic and is only a division of logical functions; there may be other division manners in actual implementation. For the case where each functional module corresponds to one function, the live data processing apparatus 500 shown in Fig. 9 is a schematic apparatus diagram; the functions of each functional module of the live data processing apparatus 500 are described in detail below.
The parsing module 510 is configured to parse a received timbre conversion request to obtain a target timbre style.
The input module 520 is configured to obtain first voice data having the target timbre style, and input the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style.
The style conversion module 530 is configured to adjust the network parameters of a prestored style conversion network to the target network parameters, and perform style conversion on second voice data input by the anchor according to the adjusted style conversion network, to obtain third voice data having the target timbre style.
The generating and sending module 540 is configured to generate a live interactive data stream of the virtual avatar according to the third voice data, and send it to the live-streaming receiving terminal 300 for playback.
Since the live data processing apparatus 500 provided by the embodiments of the present application is another implementation form of the live data processing method shown in Fig. 2, and the live data processing apparatus 500 can be used to execute the method provided by the embodiment shown in Fig. 2, for the technical effects it can obtain, reference may be made to the above method embodiments, which are not repeated here.
Further, based on the same inventive concept, an embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when run by a processor, performs the steps of the above live data processing method.
Specifically, the storage medium may be a general storage medium, such as a removable disk or a hard disk; when the computer program on the storage medium is run, the above live data processing method can be executed.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of the method, the device (such as the electronic device of Fig. 9), and the computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the computer or the processor of the other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although the present application is described herein in conjunction with various embodiments, in the process of implementing the claimed application, those skilled in the art can, by studying the drawings, the disclosure, and the appended claims, understand and realize other variations of the disclosed embodiments. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill several functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that these measures cannot be combined to produce a good effect.
The above are only various embodiments of the present application, but the protection scope of the present application is not limited thereto; any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed in the present application, all of which should be covered within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (12)
1. A live data processing method, characterized in that it is applied to a live-streaming providing terminal, the method comprising:
parsing a received timbre conversion request to obtain a target timbre style;
obtaining first voice data having the target timbre style, and inputting the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style;
adjusting the network parameters of a prestored style conversion network to the target network parameters, and performing style conversion on second voice data input by an anchor according to the adjusted style conversion network, to obtain third voice data having the target timbre style;
generating a live interactive data stream of a virtual avatar according to the third voice data, and sending it to a live-streaming receiving terminal for playback.
2. The live data processing method according to claim 1, characterized in that the step of inputting the first voice data into the pre-trained network parameter learning model to obtain the target network parameters corresponding to the target timbre style comprises:
extracting a reference style feature map corresponding to the first voice data;
inputting the reference style feature map into the network parameter learning model to obtain the target network parameters corresponding to the target timbre style.
3. The live data processing method according to claim 1, characterized in that the step of performing style conversion on the second voice data input by the anchor according to the adjusted style conversion network to obtain the third voice data having the target timbre style comprises:
extracting an audio feature map of the second voice data, the audio feature map comprising a content feature map;
processing the content feature map through the adjusted style conversion network to obtain a style-converted feature map having the target timbre style;
performing an inverse feature transform on the content feature map and the style-converted feature map to obtain the third voice data having the target timbre style.
4. The live data processing method according to any one of claims 1-3, characterized in that the network parameter learning model is obtained by training a deep-learning-based neural network using first speech samples of at least one timbre style and second speech samples of any anchor, wherein the at least one timbre style includes the target timbre style.
5. The live data processing method according to any one of claims 1-3, characterized in that before the target timbre style is obtained from the received timbre conversion request, the method further comprises:
obtaining the network parameter learning model trained in advance on training samples, which specifically comprises:
obtaining training samples, the training samples including first speech samples of at least one timbre style and second speech samples of any anchor, wherein the at least one timbre style includes the target timbre style;
extracting a corresponding content feature sample map from the second speech samples of any anchor;
for each timbre style, extracting a corresponding style feature sample map from the first speech samples of that timbre style;
training a meta-learning network according to the content feature sample map and the style feature sample map corresponding to each timbre style, obtaining the network parameter learning model, and storing it in the live-streaming providing terminal.
6. The live data processing method according to claim 5, characterized in that the step of training the meta-learning network according to the content feature sample map and the style feature sample map corresponding to each timbre style comprises:
inputting the style feature sample map corresponding to each timbre style into the meta-learning network to obtain style network parameters of that timbre style;
adjusting the network parameters of a preset style conversion network according to the style network parameters of each timbre style, and inputting the content feature sample map into the adjusted style conversion network to obtain a corresponding style-converted feature sample map;
adjusting the network parameters of the meta-learning network according to the style feature sample map of each timbre style and the corresponding style-converted feature sample map, to obtain the network parameter learning model.
7. The live data processing method according to claim 6, characterized in that the step of adjusting the network parameters of the meta-learning network according to the style feature sample map of each timbre style and the corresponding style-converted feature sample map comprises:
calculating a loss function value between the style feature sample map of each timbre style and the corresponding style-converted feature sample map;
updating the network parameters of the meta-learning network according to the loss function value and iterating the training, until the meta-learning network meets a training termination condition, and outputting the trained network parameter learning model.
8. The live data processing method according to claim 7, characterized in that the training termination condition comprises at least one of the following conditions:
the loss function value no longer decreases;
the loss function value is below a set value;
the number of training iterations reaches a set number.
9. The live data processing method according to claim 1, characterized in that the step of generating the live interactive data stream of the virtual avatar according to the third voice data and sending it to the live-streaming receiving terminal for playback comprises:
cutting the third voice data into multiple audio data segments according to a set time interval;
for each audio data segment, identifying content parameters of the audio data segment, the content parameters including a content feature, an emotion feature, and an amplitude feature, the emotion feature being used to control the emotional state of the virtual avatar and the amplitude feature being used to control the opening and closing of the virtual avatar's mouth shape;
generating an interactive video segment of the virtual avatar corresponding to the audio data segment according to the content feature, the emotion feature, and the amplitude feature;
synthesizing each audio data segment and its corresponding interactive video segment to obtain the live interactive data stream of the virtual avatar, and sending the live interactive data stream of the virtual avatar to the live-streaming receiving terminal for playback.
10. A live data processing apparatus, characterized in that it is applied to a live-streaming providing terminal, the apparatus comprising:
a parsing module, configured to parse a received timbre conversion request to obtain a target timbre style;
an input module, configured to obtain first voice data having the target timbre style, and input the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style;
a style conversion module, configured to adjust the network parameters of a prestored style conversion network to the target network parameters, and perform style conversion on second voice data input by an anchor according to the adjusted style conversion network, to obtain third voice data having the target timbre style;
a generating and sending module, configured to generate a live interactive data stream of a virtual avatar according to the third voice data, and send it to a live-streaming receiving terminal for playback.
11. An electronic device, characterized in that the electronic device comprises one or more storage media and one or more processors in communication with the storage media, the one or more storage media storing machine-executable instructions executable by the processors; when the electronic device runs, the processors execute the machine-executable instructions to realize the live data processing method according to any one of claims 1-9.
12. A readable storage medium, characterized in that the readable storage medium stores machine-executable instructions, and when the machine-executable instructions are executed, the live data processing method according to any one of claims 1-9 is realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910368522.XA CN110062267A (en) | 2019-05-05 | 2019-05-05 | Live data processing method, device, electronic equipment and readable storage medium storing program for executing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910368522.XA CN110062267A (en) | 2019-05-05 | 2019-05-05 | Live data processing method, device, electronic equipment and readable storage medium storing program for executing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110062267A true CN110062267A (en) | 2019-07-26 |
Family
ID=67322286
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910368522.XA Pending CN110062267A (en) | 2019-05-05 | 2019-05-05 | Live data processing method, device, electronic equipment and readable storage medium storing program for executing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110062267A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111312267A (en) * | 2020-02-20 | 2020-06-19 | 广州市百果园信息技术有限公司 | Voice style conversion method, device, equipment and storage medium |
CN111343473A (en) * | 2020-02-25 | 2020-06-26 | 北京达佳互联信息技术有限公司 | Data processing method and device for live application, electronic equipment and storage medium |
CN112017698A (en) * | 2020-10-30 | 2020-12-01 | 北京淇瑀信息科技有限公司 | Method and device for optimizing manual recording adopted by voice robot and electronic equipment |
CN112019874A (en) * | 2020-09-09 | 2020-12-01 | 广州华多网络科技有限公司 | Live wheat-connecting method and related equipment |
CN112164407A (en) * | 2020-09-22 | 2021-01-01 | 腾讯音乐娱乐科技(深圳)有限公司 | Tone conversion method and device |
CN112446938A (en) * | 2020-11-30 | 2021-03-05 | 重庆空间视创科技有限公司 | Multi-mode-based virtual anchor system and method |
CN112672172A (en) * | 2020-11-30 | 2021-04-16 | 北京达佳互联信息技术有限公司 | Audio replacement system, method and device, electronic equipment and storage medium |
WO2021077663A1 (en) * | 2019-10-21 | 2021-04-29 | 南京创维信息技术研究院有限公司 | Method and system for automatically adjusting sound and image modes on basis of scene recognition |
CN112788359A (en) * | 2020-12-30 | 2021-05-11 | 北京达佳互联信息技术有限公司 | Live broadcast processing method and device, electronic equipment and storage medium |
CN112954378A (en) * | 2021-02-05 | 2021-06-11 | 广州方硅信息技术有限公司 | Method and device for playing voice barrage in live broadcast room, electronic equipment and medium |
CN112995530A (en) * | 2019-12-02 | 2021-06-18 | 阿里巴巴集团控股有限公司 | Video generation method, device and equipment |
CN113111791A (en) * | 2021-04-16 | 2021-07-13 | 深圳市格灵人工智能与机器人研究院有限公司 | Image filter conversion network training method and computer readable storage medium |
CN113259701A (en) * | 2021-05-18 | 2021-08-13 | 游艺星际(北京)科技有限公司 | Method and device for generating personalized timbre and electronic equipment |
CN115412773A (en) * | 2021-05-26 | 2022-11-29 | 武汉斗鱼鱼乐网络科技有限公司 | Method, device and system for processing audio data of live broadcast room |
CN115550503A (en) * | 2021-06-30 | 2022-12-30 | 华为技术有限公司 | Method and device for generating multiple sound effects and terminal equipment |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120316882A1 (en) * | 2011-06-10 | 2012-12-13 | Morgan Fiumi | System for generating captions for live video broadcasts |
CN107154069A (en) * | 2017-05-11 | 2017-09-12 | 上海微漫网络科技有限公司 | Data processing method and system based on a virtual character |
CN107248195A (en) * | 2017-05-31 | 2017-10-13 | 珠海金山网络游戏科技有限公司 | Augmented reality anchor method, device and system |
CN107481735A (en) * | 2017-08-28 | 2017-12-15 | 中国移动通信集团公司 | Method, server and computer-readable storage medium for audio voice conversion |
CN109120985A (en) * | 2018-10-11 | 2019-01-01 | 广州虎牙信息科技有限公司 | Image display method, apparatus and storage medium in live streaming |
CN109151366A (en) * | 2018-09-27 | 2019-01-04 | 惠州Tcl移动通信有限公司 | Sound processing method for video calls |
Applications Claiming Priority (1)

2019-05-05 CN CN201910368522.XA patent/CN110062267A/en active Pending
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021077663A1 (en) * | 2019-10-21 | 2021-04-29 | 南京创维信息技术研究院有限公司 | Method and system for automatically adjusting sound and image modes on basis of scene recognition |
CN112995530A (en) * | 2019-12-02 | 2021-06-18 | 阿里巴巴集团控股有限公司 | Video generation method, device and equipment |
CN111312267A (en) * | 2020-02-20 | 2020-06-19 | 广州市百果园信息技术有限公司 | Voice style conversion method, device, equipment and storage medium |
CN111312267B (en) * | 2020-02-20 | 2023-08-11 | 广州市百果园信息技术有限公司 | Voice style conversion method, device, equipment and storage medium |
CN111343473B (en) * | 2020-02-25 | 2022-07-01 | 北京达佳互联信息技术有限公司 | Data processing method and device for live application, electronic equipment and storage medium |
CN111343473A (en) * | 2020-02-25 | 2020-06-26 | 北京达佳互联信息技术有限公司 | Data processing method and device for live application, electronic equipment and storage medium |
CN112019874A (en) * | 2020-09-09 | 2020-12-01 | 广州华多网络科技有限公司 | Live co-hosting (mic-linking) method and related device |
CN113784163B (en) * | 2020-09-09 | 2023-06-20 | 广州方硅信息技术有限公司 | Live co-hosting (mic-linking) method and related device |
CN113784163A (en) * | 2020-09-09 | 2021-12-10 | 广州方硅信息技术有限公司 | Live co-hosting (mic-linking) method and related device |
CN112164407A (en) * | 2020-09-22 | 2021-01-01 | 腾讯音乐娱乐科技(深圳)有限公司 | Tone conversion method and device |
CN112017698B (en) * | 2020-10-30 | 2021-01-29 | 北京淇瑀信息科技有限公司 | Method and device for optimizing manual recording adopted by voice robot and electronic equipment |
CN112017698A (en) * | 2020-10-30 | 2020-12-01 | 北京淇瑀信息科技有限公司 | Method and device for optimizing manual recording adopted by voice robot and electronic equipment |
CN112672172B (en) * | 2020-11-30 | 2023-04-28 | 北京达佳互联信息技术有限公司 | Audio replacing system, method and device, electronic equipment and storage medium |
CN112446938B (en) * | 2020-11-30 | 2023-08-18 | 重庆空间视创科技有限公司 | Multi-mode-based virtual anchor system and method |
CN112672172A (en) * | 2020-11-30 | 2021-04-16 | 北京达佳互联信息技术有限公司 | Audio replacement system, method and device, electronic equipment and storage medium |
CN112446938A (en) * | 2020-11-30 | 2021-03-05 | 重庆空间视创科技有限公司 | Multi-mode-based virtual anchor system and method |
CN112788359A (en) * | 2020-12-30 | 2021-05-11 | 北京达佳互联信息技术有限公司 | Live broadcast processing method and device, electronic equipment and storage medium |
CN112954378A (en) * | 2021-02-05 | 2021-06-11 | 广州方硅信息技术有限公司 | Method and device for playing voice barrage in live broadcast room, electronic equipment and medium |
CN113111791A (en) * | 2021-04-16 | 2021-07-13 | 深圳市格灵人工智能与机器人研究院有限公司 | Image filter conversion network training method and computer readable storage medium |
CN113111791B (en) * | 2021-04-16 | 2024-04-09 | 深圳市格灵人工智能与机器人研究院有限公司 | Image filter conversion network training method and computer readable storage medium |
CN113259701B (en) * | 2021-05-18 | 2023-01-20 | 游艺星际(北京)科技有限公司 | Method and device for generating personalized timbre and electronic equipment |
CN113259701A (en) * | 2021-05-18 | 2021-08-13 | 游艺星际(北京)科技有限公司 | Method and device for generating personalized timbre and electronic equipment |
CN115412773A (en) * | 2021-05-26 | 2022-11-29 | 武汉斗鱼鱼乐网络科技有限公司 | Method, device and system for processing audio data of live broadcast room |
WO2023273440A1 (en) * | 2021-06-30 | 2023-01-05 | 华为技术有限公司 | Method and apparatus for generating plurality of sound effects, and terminal device |
CN115550503A (en) * | 2021-06-30 | 2022-12-30 | 华为技术有限公司 | Method and apparatus for generating multiple sound effects, and terminal device |
CN115550503B (en) * | 2021-06-30 | 2024-04-23 | 华为技术有限公司 | Method and apparatus for generating multiple sound effects, terminal device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110062267A (en) | Live streaming data processing method, apparatus, electronic device and readable storage medium | |
CN110085244A (en) | Live streaming interaction method, apparatus, electronic device and readable storage medium | |
CN106878820B (en) | Live broadcast interaction method and device | |
CN105450642B (en) | Data processing method, related apparatus and system based on online live streaming |
WO2022166709A1 (en) | Virtual video live broadcast processing method and apparatus, and storage medium and electronic device | |
US11113884B2 (en) | Techniques for immersive virtual reality experiences | |
JP2020034895A (en) | Responding method and device | |
CN111010589A (en) | Live broadcast method, device, equipment and storage medium based on artificial intelligence | |
CN106488311B (en) | Sound effect adjusting method and user terminal | |
WO2023011221A1 (en) | Blend shape value output method, storage medium and electronic apparatus | |
EP3826314A1 (en) | Electrical devices control based on media-content context | |
CN109348274A (en) | Live streaming interaction method, apparatus and storage medium |
CN106792013A (en) | Method and television for interactive television broadcast sound |
CN114128299A (en) | Template-based excerpts and presentations for multimedia presentations | |
US11671562B2 (en) | Method for enabling synthetic autopilot video functions and for publishing a synthetic video feed as a virtual camera during a video call | |
US20230039530A1 (en) | Automated generation of haptic effects based on haptics data | |
CN113439447A (en) | Room acoustic simulation using deep learning image analysis | |
CN113704390A (en) | Interaction method and device of virtual objects, computer readable medium and electronic equipment | |
Alexanderson et al. | Animated Lombard speech: Motion capture, facial animation and visual intelligibility of speech produced in adverse conditions | |
Sodoyer et al. | A study of lip movements during spontaneous dialog and its application to voice activity detection | |
CN110337041A (en) | Video playing method, apparatus, computer device and storage medium |
CN109286760A (en) | Entertainment video production method and terminal |
CN108965904A (en) | Volume adjusting method and client for a live streaming room |
US20230353707A1 (en) | Method for enabling synthetic autopilot video functions and for publishing a synthetic video feed as a virtual camera during a video call | |
CN116756285A (en) | Virtual robot interaction method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2019-07-26