CN110062267A - Live streaming data processing method and apparatus, electronic device, and readable storage medium - Google Patents

Live streaming data processing method and apparatus, electronic device, and readable storage medium

Info

Publication number
CN110062267A
Authority
CN
China
Prior art keywords
style
timbre
live streaming
network parameter
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910368522.XA
Other languages
Chinese (zh)
Inventor
徐子豪
刘炉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Information Technology Co Ltd
Original Assignee
Guangzhou Huya Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Information Technology Co Ltd
Priority to CN201910368522.XA
Publication of CN110062267A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/41 Structure of client; Structure of client peripherals
    • H04N 21/426 Internal components of the client; Characteristics thereof
    • H04N 21/42607 Internal components of the client; Characteristics thereof for processing the incoming bitstream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N 21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/439 Processing of audio elementary streams
    • H04N 21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/439 Processing of audio elementary streams
    • H04N 21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N 21/44213 Monitoring of end-user related data
    • H04N 21/44218 Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N 21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/488 Data services, e.g. news ticker
    • H04N 21/4884 Data services, e.g. news ticker for displaying subtitles

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Embodiments of the present application provide a live streaming data processing method and apparatus, an electronic device, and a readable storage medium. First voice data having a target timbre style is processed by a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style. A style conversion network whose parameters have been adjusted to the target network parameters then performs style conversion on second voice data input by the anchor, a live interaction data stream of a virtual live streaming avatar is generated from the resulting third voice data having the target timbre style, and the stream is sent to a live streaming receiving terminal for playback. In this way, for any anchor, the timbre style used while the virtual avatar is live streaming can be converted into an arbitrary timbre style for interacting with the audience without changing the audio content, thereby improving the interaction effect during live streaming and better motivating the audience to interact with the anchor.

Description

Live streaming data processing method and apparatus, electronic device, and readable storage medium
Technical field
The present application relates to the field of Internet live streaming, and in particular to a live streaming data processing method and apparatus, an electronic device, and a readable storage medium.
Background art
In Internet live streaming, using a virtual live streaming avatar in place of the anchor's real image to take part in live interaction is currently a fairly popular form of live streaming.
In current live streaming, the timbre of the virtual avatar is mostly either the anchor's original timbre style or a single timbre style fixed in advance when the live data stream is provided; it cannot be converted into other timbre styles for interacting with the audience. This fails to satisfy particular demands of specific anchors or audience groups and therefore reduces the effect of interactive live streaming. For example, an audience member may prefer to hear the timbre style of a favorite celebrity or of someone they know. As another example, an anchor may not want to expose his or her own timbre style to other viewers out of privacy concerns.
Summary of the invention
In view of this, the purpose of the embodiments of the present application is to provide a live streaming data processing method and apparatus, an electronic device, and a readable storage medium, so as to solve the above problems.
According to one aspect of the embodiments of the present application, an electronic device is provided, which may include one or more storage media and one or more processors in communication with the storage media. The one or more storage media store machine-executable instructions executable by the processors. When the electronic device runs, the processors execute the machine-executable instructions to perform the live streaming data processing method.
According to another aspect of the embodiments of the present application, a live streaming data processing method is provided, applied to a live streaming providing terminal. The method comprises:
parsing a received timbre conversion request to obtain a target timbre style;
obtaining first voice data having the target timbre style, and inputting the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style;
adjusting the network parameters of a pre-stored style conversion network to the target network parameters, and performing style conversion on second voice data input by an anchor according to the adjusted style conversion network to obtain third voice data having the target timbre style;
generating a live interaction data stream of a virtual live streaming avatar according to the third voice data, and sending it to a live streaming receiving terminal for playback.
According to another aspect of the embodiments of the present application, a live streaming data processing apparatus is provided, applied to a live streaming providing terminal. The apparatus comprises:
a parsing module, configured to parse a received timbre conversion request to obtain a target timbre style;
an input module, configured to obtain first voice data having the target timbre style and input the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style;
a style conversion module, configured to adjust the network parameters of a pre-stored style conversion network to the target network parameters and perform style conversion on second voice data input by an anchor according to the adjusted style conversion network to obtain third voice data having the target timbre style;
a generating and sending module, configured to generate a live interaction data stream of a virtual live streaming avatar according to the third voice data and send it to a live streaming receiving terminal for playback.
According to yet another aspect of the embodiments of the present application, a readable storage medium is provided. Machine-executable instructions are stored on the readable storage medium, and when the machine-executable instructions are run by a processor, the steps of the above live streaming data processing method can be executed.
Based on any of the above aspects, compared with the prior art, the embodiments of the present application process first voice data having a target timbre style through a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style, perform style conversion on second voice data input by an anchor through a style conversion network whose parameters have been adjusted to the target network parameters, generate a live interaction data stream of a virtual live streaming avatar according to the resulting third voice data having the target timbre style, and send it to a live streaming receiving terminal for playback. In this way, for any anchor, the timbre style used while the virtual avatar is live streaming can be converted into an arbitrary timbre style for interacting with the audience without changing the audio content, thereby improving the interaction effect during live streaming and better motivating the audience to interact with the anchor.
Detailed description of the invention
In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other relevant drawings can be obtained from these drawings without creative effort.
Fig. 1 shows a schematic diagram of a live streaming system provided by an embodiment of the present application;
Fig. 2 shows the first flow diagram of a live streaming data processing method provided by an embodiment of the present application;
Fig. 3 shows a schematic diagram of an interface for selecting a target timbre style in a live streaming Internet application provided by an embodiment of the present application;
Fig. 4 shows a schematic diagram of the style conversion process provided by an embodiment of the present application;
Fig. 5 shows a schematic diagram of the live streaming interface of the live streaming providing terminal provided by an embodiment of the present application;
Fig. 6 shows the second flow diagram of the live streaming data processing method provided by an embodiment of the present application;
Fig. 7 shows a flow diagram of the sub-steps included in step S101 shown in Fig. 6, provided by an embodiment of the present application;
Fig. 8 shows a training flow diagram of the style conversion model provided by an embodiment of the present application;
Fig. 9 shows a schematic diagram of an electronic device provided by an embodiment of the present application.
Specific embodiment
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. The components of the embodiments of the present application, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations.
Unless stated otherwise, ordinal terms such as "first", "second" and "third" referred to in the embodiments of the present application are used to distinguish multiple objects and are not used to limit the order, timing, position, priority or importance of those objects.
Referring to Fig. 1, Fig. 1 is an architecture diagram of a live streaming system 10 provided by an embodiment of the present application. For example, the live streaming system 10 may be a service platform for Internet live streaming or the like. The live streaming system 10 may include a live streaming server 200, a live streaming providing terminal 100 and a live streaming receiving terminal 300. The live streaming server 200 is communicatively connected to the live streaming providing terminal 100 and the live streaming receiving terminal 300 respectively, and provides live streaming services for both. For example, the live streaming providing terminal 100 may send the live video stream of a live streaming room to the live streaming server 200, and viewers may pull the live video stream from the live streaming server 200 through the live streaming receiving terminal 300 to watch the live video of that room. As another example, the live streaming server 200 may send a notification message to a viewer's live streaming receiving terminal 300 when a live streaming room subscribed to by that viewer goes live. The live video stream may be the video stream currently being broadcast on the live streaming platform, or the complete video stream formed after the live stream ends.
Internet products for providing Internet live streaming services may be installed on the live streaming providing terminal 100 and the live streaming receiving terminal 300, for example application programs (APPs), web pages, mini programs and the like related to Internet live streaming services used on a computer or a smartphone.
In this embodiment, the live streaming system 10 may further include a video capture device 400 for capturing the anchor's video frames. The video capture device 400 may be directly installed on or integrated into the live streaming providing terminal 100, or it may be independent of the live streaming providing terminal 100 and connected to it.
Referring to Fig. 2, Fig. 2 shows a flow diagram of the live streaming data processing method provided by an embodiment of the present application. The method may be executed by the live streaming providing terminal 100 shown in Fig. 1, and its detailed steps are described below.
Step S110: parse a received timbre conversion request to obtain a target timbre style.
Step S120: obtain first voice data having the target timbre style, and input the first voice data into a pre-trained network parameter learning model to obtain target network parameters corresponding to the target timbre style.
Step S130: adjust the network parameters of a pre-stored style conversion network to the target network parameters, and perform style conversion on second voice data input by the anchor according to the adjusted style conversion network to obtain third voice data having the target timbre style.
Step S140: generate a live interaction data stream of a virtual live streaming avatar according to the third voice data, and send it to the live streaming receiving terminal 300 for playback.
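For orientation only, the following Python sketch strings steps S110 to S140 together as they might run on the live streaming providing terminal 100. Every name in it (process_live_audio, param_learner, style_converter, build_interaction_stream and so on) is a hypothetical placeholder, and the helper objects are assumed to exist; the disclosure does not prescribe this code.

```python
# A minimal sketch of steps S110-S140, assuming hypothetical helper callables.
from typing import Callable
import numpy as np


def process_live_audio(
    timbre_request: dict,
    anchor_audio: np.ndarray,                       # second voice data from the anchor
    reference_library: dict,                        # timbre style -> first voice data
    param_learner: Callable[[np.ndarray], dict],    # pre-trained network parameter learning model
    style_converter,                                # style conversion network with settable parameters
    build_interaction_stream: Callable,             # packs audio into the avatar's interaction stream
):
    # S110: parse the timbre conversion request to get the target timbre style.
    target_style = timbre_request["target_style"]

    # S120: fetch first voice data of that style and predict its network parameters.
    first_voice = reference_library[target_style]
    target_params = param_learner(first_voice)

    # S130: load the predicted parameters and convert the anchor's voice.
    style_converter.load_parameters(target_params)
    third_voice = style_converter.convert(anchor_audio)

    # S140: generate the virtual avatar's live interaction data stream for playback.
    return build_interaction_stream(third_voice)
```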
In this embodiment, for step S110, after receiving the timbre conversion request, the live streaming providing terminal 100 may parse from it the target timbre style selected by the anchor or by a viewer who has entered the anchor's live streaming room. The target timbre style can be understood as the timbre style that the anchor, or a viewer in the anchor's live streaming room, wishes to hear when listening to the live audio. For example, the anchor may want the audio data he or she outputs to sound like the timbre style of a favorite celebrity, of a friend, or of a particular speaking accent (such as a "Taiwanese accent" or a "Beijing accent"). As another example, some viewers may want the audio data output by the anchor to sound like the timbre style of their favorite celebrity or of someone they know. Accordingly, the timbre conversion request may be issued either by the anchor's live streaming providing terminal 100 or by the live streaming receiving terminal 300 of a viewer who has entered the anchor's live streaming room.
For example, a selection interface for the target timbre style may be provided in the interface of the live streaming Internet application installed on the live streaming providing terminal 100 or the live streaming receiving terminal 300. The selection interface displays options for multiple different timbre styles, and the anchor or a viewer in the anchor's live streaming room can select the option corresponding to the desired target timbre style, after which the live streaming providing terminal 100 or the live streaming receiving terminal 300 generates the corresponding timbre conversion request.
As an example only, Fig. 3 shows a schematic diagram of the interface of the live streaming Internet application installed on the live streaming providing terminal 100 or the live streaming receiving terminal 300. Options for different timbre styles are displayed in the interface, including timbre style A, timbre style B, timbre style C, timbre style D and so on, and the anchor or a viewer in the anchor's live streaming room can select the option corresponding to the desired target timbre style. For example, if the anchor likes the timbre style of a friend A, and timbre style A is friend A's timbre style, the anchor can choose timbre style A, and the live streaming providing terminal 100 then generates the corresponding timbre conversion request. As another example, if a viewer in the anchor's live streaming room likes the timbre style of a certain singer, and timbre style B is that singer's timbre style, the viewer can choose timbre style B, and the live streaming receiving terminal 300 then generates the corresponding timbre conversion request.
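As a small illustration of the request flow just described, the sketch below builds and parses a timbre conversion request as a JSON message. The message format and field names are assumptions made for the example; the disclosure does not define a concrete request format.

```python
# A minimal sketch of a timbre conversion request, assuming a simple JSON message format.
import json

TIMBRE_STYLE_OPTIONS = ["timbre style A", "timbre style B", "timbre style C", "timbre style D"]


def make_timbre_request(sender_role: str, selected_option: str) -> str:
    """Build the request issued by the providing terminal (anchor) or a receiving terminal (viewer)."""
    if selected_option not in TIMBRE_STYLE_OPTIONS:
        raise ValueError(f"unknown timbre style: {selected_option}")
    return json.dumps({"type": "timbre_convert", "sender": sender_role,
                       "target_style": selected_option})


def parse_timbre_request(raw: str) -> str:
    """Step S110: parse the received request and return the target timbre style."""
    return json.loads(raw)["target_style"]


print(parse_timbre_request(make_timbre_request("anchor", "timbre style A")))
```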
Referring to Fig. 4, which shows a schematic diagram of the style conversion process in the embodiments of the present application, the foregoing embodiment is explained below with reference to Fig. 4.
For step S120, the live streaming providing terminal 100 may locally pre-store audio data corresponding to various timbre styles, and after the target timbre style is determined it can search locally for first voice data having that style. Alternatively, the live streaming server 200 may provide the audio data corresponding to the various timbre styles, and after the target timbre style is determined the first voice data having the target timbre style may be obtained from the live streaming server 200.
On this basis, the live streaming providing terminal 100 may input the first voice data into the pre-trained network parameter learning model to obtain the target network parameters corresponding to the target timbre style.
The network parameter learning model can learn the style network parameters corresponding to various different timbre styles. For example, it may be obtained by training a deep-learning neural network with first speech samples of at least one timbre style and second speech samples of any anchor, where the at least one timbre style includes the target timbre style. In this way, for input audio data of any timbre style, the model can output the network parameters corresponding to that timbre style, so there is no need to separately train a style conversion network for each timbre style, which greatly reduces the amount of training.
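The following PyTorch sketch illustrates the idea of such a network parameter learning model: a small meta-network that maps a reference style feature map to the weights of one convolution layer of a style conversion network. The layer sizes, the single predicted convolution and the overall architecture are assumptions made for illustration only and are not taken from the disclosure.

```python
# A minimal PyTorch sketch of a network parameter learning (meta-learning) model: it maps a
# reference style feature map of the first voice data to the weights of one convolution layer
# of a style conversion network. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

CONV_SHAPE = (8, 8, 3, 3)                                  # predicted weights: an 8->8 conv layer
N_PARAMS = 8 * 8 * 3 * 3


class NetworkParameterLearner(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.AdaptiveAvgPool2d(8), nn.Flatten(),
                                     nn.Linear(64, 256), nn.ReLU(),
                                     nn.Linear(256, N_PARAMS))

    def forward(self, style_feature_map):                  # (batch, 1, freq, time) style feature map
        return self.encoder(style_feature_map).view(-1, *CONV_SHAPE)


learner = NetworkParameterLearner()                        # stands in for the pre-trained model
style_map = torch.randn(1, 1, 80, 80)                      # reference style feature map (placeholder)
target_params = learner(style_map)                         # target network parameters for the style

# Step S130 then overwrites the pre-stored style conversion network's parameters with them:
prestored_conv = nn.Conv2d(8, 8, 3, padding=1, bias=False)
with torch.no_grad():
    prestored_conv.weight.copy_(target_params[0])
print(prestored_conv.weight.shape)                         # torch.Size([8, 8, 3, 3])
```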
As a possible implementation of step S120, a reference style feature map corresponding to the first voice data is first extracted, and the reference style feature map is then input into the network parameter learning model to obtain the target network parameters corresponding to the target timbre style.
The inventors of the present application have found that any segment of audio data (such as the first voice data) can be represented by a continuous waveform. Based on this, one exemplary way of extracting the reference style feature map corresponding to the first voice data is to cut the first voice data at preset time intervals (for example, every 10 seconds) to obtain multiple data fragments, and then to take the waveform diagram, frequency spectrum diagram or spectrogram of each data fragment, or the image obtained after applying image processing transformations to it, as an audio feature map. The audio feature map then includes a content feature map and the above-mentioned reference style feature map. The reference style feature map can be used to represent style features of the first voice data, such as its timbre style; the content feature map can be used to represent content features of the first voice data, such as volume and speech content.
By cutting the first voice data into fragments, this embodiment can avoid stuttering of the live streaming providing terminal 100 caused by processing too much audio data at once; in addition, the fragments obtained by cutting have a consistent duration, which facilitates subsequent processing.
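A minimal sketch of this slicing and feature-map extraction is given below, assuming librosa is available; the 10-second window and the STFT settings are illustrative choices, not values fixed by the disclosure.

```python
# Cut the first voice data into fixed-length fragments and turn each fragment into a
# spectrogram-style audio feature map, as described above.
import numpy as np
import librosa


def audio_feature_maps(waveform: np.ndarray, sample_rate: int, segment_seconds: float = 10.0):
    """Cut the waveform every `segment_seconds` and return one log-magnitude spectrogram per piece."""
    segment_len = int(segment_seconds * sample_rate)
    maps = []
    for start in range(0, len(waveform), segment_len):
        piece = waveform[start:start + segment_len]
        if len(piece) < segment_len:                      # pad the tail so all fragments match in length
            piece = np.pad(piece, (0, segment_len - len(piece)))
        spec = np.abs(librosa.stft(piece, n_fft=1024, hop_length=256))
        maps.append(librosa.amplitude_to_db(spec))        # image-like feature map of the fragment
    return maps


sr = 16000
demo = np.random.randn(sr * 25).astype(np.float32)        # 25 s of placeholder audio
print([m.shape for m in audio_feature_maps(demo, sr)])    # three equally sized feature maps
```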
For step S130, after the network parameter learning model outputs the target network parameters corresponding to the target timbre style, the network parameters of the pre-stored style conversion network can be adjusted to those target network parameters. In this way, the adjusted style conversion network can convert the timbre style of any anchor's audio data into the target timbre style, without separately training a style conversion network for the target timbre.
After any anchor starts a live stream by launching the live streaming Internet application installed on the live streaming providing terminal 100 and entering a live streaming room, data such as the live video stream, live pictures, live audio and text bullet comments can be generated during the stream and sent, through the live streaming server 200, to the live streaming receiving terminals 300 of the viewers who have entered the room. In the above process, the audio feature map of the second voice data input by the anchor is first extracted by a feature extraction network, the audio feature map including a content feature map and a style feature map. Then, the content feature map is processed by the adjusted style conversion network to obtain a style-converted feature map having the target timbre style. Finally, a feature inverse transform is performed on the content feature map and the style-converted feature map to obtain the third voice data having the target timbre style.
In detail, since the style feature map in the original audio feature map is replaced by the style-converted feature map, the content feature map of the original audio feature map together with the converted style-converted feature map can be understood as an audio feature map having the target timbre style. On this basis, in order to generate audio data that the audience can hear, this embodiment further performs a feature inverse transform on the content feature map and the converted style feature map to obtain the third voice data having the target timbre style. The third voice data thus combines the content feature map corresponding to the second voice data with the style features of the converted style feature map, achieving the auditory effect of the target timbre style without changing the content of the second voice data.
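The sketch below walks through the same three stages on placeholder data, using a spectrogram as the content feature map, a small convolutional network as the style conversion network, and Griffin-Lim as the feature inverse transform. All of these concrete choices are assumptions for illustration; the disclosure does not specify the networks or the inverse transform.

```python
# A minimal PyTorch sketch of the style conversion in step S130: extract a content feature map
# from the anchor's second voice data, push it through the parameter-adjusted style conversion
# network, and inverse-transform the result back to audio.
import torch
import torch.nn as nn
import torchaudio


class StyleConversionNet(nn.Module):
    def __init__(self, channels: int = 1):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, channels, 3, padding=1))

    def forward(self, content_feature_map):
        return self.net(content_feature_map)              # style-converted feature map


to_spec = torchaudio.transforms.Spectrogram(n_fft=1024, hop_length=256, power=1.0)
to_audio = torchaudio.transforms.GriffinLim(n_fft=1024, hop_length=256, power=1.0)

second_voice = torch.randn(1, 16000)                      # 1 s of placeholder anchor audio
content_map = to_spec(second_voice).unsqueeze(1)          # (1, 1, freq, time) content feature map

converter = StyleConversionNet()
# In the full method the converter's parameters would first be overwritten with the target
# network parameters produced by the network parameter learning model.
with torch.no_grad():
    converted_map = converter(content_map).clamp(min=0.0)

third_voice = to_audio(converted_map.squeeze(1))          # feature inverse transform to a waveform
print(third_voice.shape)                                  # e.g. torch.Size([1, 15872])
```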
It is worth noting that although some existing voice changers can change the voice (for example into an old man's voice or a child's voice), the converted sound in such schemes is unsatisfactory, cannot achieve a convincingly realistic effect, and still cannot be converted into a required timbre style. With the technical solution provided by this embodiment, the converted timbre is the timbre of the required target timbre style and has an extremely realistic effect.
For step S140, in order to make live interaction more interesting, a virtual live streaming avatar may replace the real image of the anchor in the display interface of the live streaming room to interact with the audience. The virtual live streaming avatar may be a virtual character consistent with the anchor's appearance, posture and temperament, for example a two-dimensional or three-dimensional character, and it may also be a cartoon character, a realistic human figure or the like. For example, the virtual live streaming avatar may imitate the anchor's expressions, movements and other characteristic attributes in real time to interact with the audience on the anchor's behalf; that is, viewers can interact with the anchor through the virtual live streaming avatar, and a viewer may be any one of the anchor's many subscribed fans. Specifically, during the live stream the anchor's body movements, facial expressions, audio data and so on may be captured, recognized and played in combination with the virtual live streaming avatar, then forwarded to the live streaming server 200, so that the live streaming receiving terminals 300 that have entered the room can pull the live data stream from the live streaming server 200 and watch it. In this way, the audience experiences the virtual live streaming avatar as having movements and a voice similar to those of the real anchor. For example, viewers may see a cartoon dinosaur as the virtual character, but the movements and voice of this cartoon dinosaur are driven by real-time data of the anchor's movements and audio data.
After the live streaming providing terminal 100 generates the aforementioned third voice data, the live interaction data stream of the virtual live streaming avatar can be generated in real time and sent to the live streaming receiving terminal 300 for playback. For example, the third voice data may be cut into multiple audio data segments at a set time interval (such as 5 seconds or 10 seconds), and for each audio data segment the content parameters of that segment are recognized. The content parameters may include a content feature, an emotion feature and an amplitude feature, where the emotion feature is used to control the emotional state of the virtual live streaming avatar and the amplitude feature is used to control the opening and closing of the avatar's mouth. For example, if the recognized emotion feature is a parameter corresponding to a happy state, the value of the avatar's emotion attribute can be adjusted to "smile" according to that emotion feature, and the expression, movement and posture of the virtual live streaming avatar are controlled accordingly. As another example, if the recognized content feature is "I am very happy", the avatar's action attribute can be adjusted to perform an "applause" action in real time, while the avatar's expression attribute is adjusted to "smile".
In other possible implementations, the anchor's real-time expressions, movements and posture may also be captured by the video capture device 400 shown in Fig. 1. For example, the position and angle of the anchor's face, the contour of the face, the positions of facial features, eyeball rotation, eyelids and eyebrows, the motion state of the lips, gestures and so on are recognized; the information captured in real time is analysed, the analysis results are converted into a customized set of control instructions, and through this set of control instructions the virtual live streaming avatar on the interaction interface imitates the captured expressions, movements and posture in real time. For example, when the captured gesture is a hand pointing downward, the avatar's action attribute is adjusted to perform a "sit down" action in real time.
In this way, the interaction video segment of the virtual live streaming avatar corresponding to each audio data segment can be generated according to the content feature, emotion feature and amplitude feature, and each audio data segment and its corresponding interaction video segment are synthesized to obtain the live interaction data stream of the virtual live streaming avatar, which is then sent to the live streaming receiving terminal 300 for playback.
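As a rough illustration of this step, the sketch below cuts the third voice data into segments and derives per-segment control parameters for the avatar. The RMS-based amplitude feature, the keyword-based emotion and content rules, and the attribute names are stand-ins for the recognition described above, not the disclosed models.

```python
# Derive per-segment content parameters from the third voice data and map them onto the
# virtual avatar's attributes (emotion, mouth opening, action).
from dataclasses import dataclass
import numpy as np


@dataclass
class AvatarFrameControl:
    emotion: str          # drives the avatar's emotional state (expression, posture)
    mouth_open: float     # drives mouth opening and closing, 0.0 (closed) to 1.0 (fully open)
    action: str           # optional action triggered by the recognised content


def segment_controls(third_voice: np.ndarray, sample_rate: int, transcript: str,
                     segment_seconds: float = 5.0):
    controls = []
    seg_len = int(segment_seconds * sample_rate)
    for start in range(0, len(third_voice), seg_len):
        piece = third_voice[start:start + seg_len]
        amplitude = float(np.sqrt(np.mean(piece ** 2)))            # amplitude feature (RMS)
        emotion = "smile" if "happy" in transcript else "neutral"  # stand-in for emotion recognition
        action = "applaud" if "happy" in transcript else "idle"    # stand-in for content recognition
        controls.append(AvatarFrameControl(emotion, min(1.0, amplitude * 10.0), action))
    return controls


sr = 16000
audio = 0.1 * np.random.randn(sr * 12).astype(np.float32)
for c in segment_controls(audio, sr, transcript="I am very happy"):
    print(c)
```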
For example, Fig. 5 shows an example of the live streaming interface of the live streaming providing terminal 100. The live streaming interface may include a live interface display frame, an anchor video frame display box, a bullet comment area, a virtual avatar area and the subtitle content XXXXX of each audio frame. The live interface display frame is used to display the video stream currently being broadcast on the live streaming platform or the complete video stream formed after the live stream ends; the anchor video frame display box is used to display the anchor video frames captured in real time by the video capture device; the virtual avatar area is used to display the anchor's virtual avatar and the live interaction data stream of the virtual avatar; and the bullet comment area is used to display the interaction content between the audience and the anchor (such as AAAAA, BBBBB, CCCCC, DDDDD, EEEEE).
On this basis, the live streaming providing terminal can adjust the characteristic attributes of the virtual live streaming avatar according to interaction information received from the live streaming receiving terminal 300, so that viewers can interact virtually with the virtual live streaming avatar. Taking the live streaming interface shown in Fig. 5 as an example, a viewer can send interaction information through the live streaming receiving terminal 300, and the live streaming providing terminal 100 can adjust the characteristic attributes of the virtual live streaming avatar accordingly, thereby completing the interaction between the live streaming receiving terminal 300 and the virtual live streaming avatar displayed on the interaction interface. It should be understood that a preset correspondence may exist between the interaction information and the characteristic attributes of the virtual live streaming avatar, and this preset correspondence can be established through a learning process in advance, which is not described item by item here.
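A minimal sketch of such a preset correspondence is shown below; the message types and attribute values are hypothetical examples, since the disclosure only requires that some pre-established mapping between interaction information and avatar attributes exists.

```python
# A hypothetical preset correspondence between viewer interaction messages and the
# virtual avatar's characteristic attributes.
PRESET_INTERACTION_MAP = {
    "send_gift": {"expression": "smile", "action": "bow"},
    "greeting":  {"expression": "smile", "action": "wave"},
    "question":  {"expression": "neutral", "action": "nod"},
}


def apply_interaction(avatar_attributes: dict, interaction_type: str) -> dict:
    """Adjust the avatar's attributes according to an interaction message from a receiving terminal."""
    updated = dict(avatar_attributes)
    updated.update(PRESET_INTERACTION_MAP.get(interaction_type, {}))
    return updated


print(apply_interaction({"expression": "neutral", "action": "idle"}, "send_gift"))
```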
In this way, for any anchor, this embodiment can convert the timbre style used during the virtual avatar's live stream into an arbitrary timbre style for interacting with the audience without changing the audio content, thereby improving the interaction effect during live streaming and better motivating the audience to interact with the anchor.
As a possible implementation, referring to Fig. 6, before the aforementioned step S110, the live streaming data processing method provided by this embodiment may further include the following step:
Step S101: train the network parameter learning model in advance according to training samples. Referring to Fig. 7, step S101 may include the following sub-steps:
Sub-step S1011: obtain training samples, the training samples including first speech samples of at least one timbre style and second speech samples of any anchor.
In this embodiment, the aforementioned at least one timbre style may include the target timbre style and other timbre styles, and the first speech samples may be any speech samples having the target timbre style or the other timbre styles. For example, if the target timbre style is the timbre style of a known friend A, a large amount of audio data of friend A may be collected as part of the first speech samples.
In this embodiment, the second speech samples are not specifically limited; the audio data of any anchor or of any other user may be collected as the second speech samples.
Referring to Fig. 8, the training process of this embodiment involves a feature extraction network, a feature vector extraction network and an initial conversion network. The training process of the style conversion model in step S101 is explained below by way of example based on Fig. 8.
Sub-step S1012: extract the corresponding content feature sample map from the second speech samples of any anchor.
As shown in Fig. 8, the content feature map of the second speech samples can be extracted by the feature extraction network, in the same way as the audio feature map is extracted above from the second voice data input by the anchor.
Sub-step S1013: for every timbre style, extract the corresponding style feature sample map from the first speech samples of that timbre style.
As shown in Fig. 8, the style feature sample map of the first speech samples corresponding to each timbre style can be extracted by the feature extraction network, again in the same way as the audio feature map is extracted above from the second voice data input by the anchor.
Sub-step S1014: train a meta-learning network according to the content feature sample map and the style feature sample map corresponding to each timbre style, obtain the network parameter learning model, and store it on the live streaming providing terminal 100.
The detailed training process of sub-step S1014 is explained below by way of example based on Fig. 8.
First, the style feature sample map corresponding to each timbre style is input into the meta-learning network to obtain the style network parameters of each timbre style.
Second, the network parameters of a preset style conversion network are adjusted according to the style network parameters of each timbre style, and the content feature sample map is input into the adjusted style conversion network to obtain the corresponding style-converted feature sample map.
Third, the network parameters of the meta-learning network are adjusted according to the style feature sample map of each timbre style and the corresponding style-converted feature sample map, yielding the network parameter learning model.
In detail, as one implementation, the loss function value between the style feature sample map of each timbre style and the corresponding style-converted feature sample map can be calculated, the network parameters of the meta-learning network are updated according to the loss function value, and training then iterates; when the meta-learning network meets the training termination condition, the network parameter learning model obtained by training is output.
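The following PyTorch sketch shows one such training iteration under simplifying assumptions: the meta-learning network predicts the weights of a one-layer style conversion network, the converted content feature sample map is compared against the style feature sample map with an MSE loss, and only the meta-learning network is updated. The layer sizes and the choice of MSE are illustrative, not taken from the disclosure.

```python
# One training iteration of the meta-learning network described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

CONV_SHAPE = (1, 1, 3, 3)                                  # predicted weights of a 1->1 conv layer

meta_net = nn.Sequential(nn.AdaptiveAvgPool2d(8), nn.Flatten(),
                         nn.Linear(64, 128), nn.ReLU(),
                         nn.Linear(128, 9))                # 9 = number of predicted weights
optimizer = torch.optim.Adam(meta_net.parameters(), lr=1e-3)


def train_step(style_sample_map, content_sample_map):
    optimizer.zero_grad()
    style_params = meta_net(style_sample_map).view(CONV_SHAPE)         # style network parameters
    converted = F.conv2d(content_sample_map, style_params, padding=1)  # adjusted conversion network
    loss = F.mse_loss(converted, style_sample_map)                     # loss between the two maps
    loss.backward()                                                    # gradients flow to the meta network only
    optimizer.step()
    return float(loss)


style_map = torch.randn(1, 1, 80, 80)                      # style feature sample map of one timbre style
content_map = torch.randn(1, 1, 80, 80)                    # content feature sample map of the anchor sample
for step in range(3):
    print(train_step(style_map, content_map))
```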
The above training termination condition may include at least one of the following three conditions:
1) the number of training iterations reaches a set number; 2) the loss function value is lower than a set threshold; 3) the loss function value no longer declines.
For condition 1), in order to save computation, a maximum number of iterations can be set; if the number of iterations reaches the set number, the iteration of this training cycle can be stopped and the deep-learning network finally obtained is used as the timbre conversion model. For condition 2), if the loss function value is lower than the set threshold, the current timbre conversion model basically satisfies the requirements and the iteration can be stopped. For condition 3), if the loss function value no longer declines, an optimal timbre conversion model has been formed and the iteration can be stopped.
It should be noted that the above iteration stopping conditions may be used in combination, or one of them may be used alone. For example, the iteration may be stopped when the loss function value no longer declines, or when the number of iterations reaches the set number; alternatively, the iteration may be stopped when the loss function value is lower than the set threshold and no longer declines.
In addition, in actual implementation the training termination condition is not limited to the above examples; those skilled in the art can design training termination conditions different from these examples according to actual needs.
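As a small illustration, the check below combines the three termination conditions; the patience window used to decide that the loss no longer declines is an assumption, since the disclosure does not fix how that is measured.

```python
# Combine the three training termination conditions listed above (they may be used
# alone or in combination).
def should_stop(loss_history, iteration, max_iterations=10000, loss_threshold=1e-3, patience=20):
    if iteration >= max_iterations:                          # condition 1: iteration count reached
        return True
    if loss_history and loss_history[-1] < loss_threshold:   # condition 2: loss below the set threshold
        return True
    if len(loss_history) > patience and min(loss_history[-patience:]) >= min(loss_history[:-patience]):
        return True                                          # condition 3: loss no longer declining
    return False


print(should_stop([0.5, 0.4, 0.39], iteration=3))            # False: keep iterating
print(should_stop([0.5, 0.0005], iteration=2))               # True: loss below threshold
```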
Based on the network parameter learning model obtained in the above steps, the network parameters corresponding to a timbre style can be output for input audio data of any timbre style, and the style conversion network using those parameters can subsequently convert the timbre style used during the virtual avatar's live stream into the corresponding timbre style for interacting with the audience, without changing the audio content of any anchor's audio data, thereby improving the interaction effect during live streaming and better motivating the audience to interact with the anchor. Moreover, this embodiment no longer needs to train a separate style conversion model for each anchor or for each timbre style, which greatly reduces the amount of training.
Fig. 9 shows a schematic diagram of an electronic device provided by an embodiment of the present application. In this embodiment, the electronic device may be the live streaming providing terminal 100 shown in Fig. 1, and it includes a storage medium 110, a processor 120 and a live streaming data processing apparatus 500.
The processor 120 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs of the live streaming data processing method provided by the above method embodiments.
The storage medium 110 may be a ROM or another type of static storage device capable of storing static information and instructions, a RAM or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compressed discs, laser discs, optical discs, digital versatile discs, Blu-ray discs and the like), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The storage medium 110 may exist independently and be connected to the processor 120 through a communication bus, or it may be integrated with the processor. The storage medium 110 is used to store the application program code for executing the solution of the present application, such as the live streaming data processing apparatus 500 shown in Fig. 9, and execution is controlled by the processor 120. The processor 120 is used to execute the application program code stored in the storage medium 110, such as the live streaming data processing apparatus 500, so as to execute the live streaming data processing method of the above method embodiments.
The present application may divide the live streaming data processing apparatus 500 into functional modules according to the above method embodiments. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated modules may be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of modules in the present application is schematic and is merely a logical functional division; there may be other division manners in actual implementation. For example, in the case where each functional module corresponds to one function, the live streaming data processing apparatus 500 shown in Fig. 9 is a schematic apparatus, and the functions of its functional modules are described in detail below.
The parsing module 510 is configured to parse a received timbre conversion request to obtain a target timbre style.
The input module 520 is configured to obtain first voice data having the target timbre style and input the first voice data into the pre-trained network parameter learning model to obtain the target network parameters corresponding to the target timbre style.
The style conversion module 530 is configured to adjust the network parameters of the pre-stored style conversion network to the target network parameters and perform style conversion on the second voice data input by the anchor according to the adjusted style conversion network to obtain the third voice data having the target timbre style.
The generating and sending module 540 is configured to generate the live interaction data stream of the virtual live streaming avatar according to the third voice data and send it to the live streaming receiving terminal 300 for playback.
Since the live streaming data processing apparatus 500 provided by the embodiments of the present application is another form of implementation of the live streaming data processing method shown in Fig. 2, and the apparatus 500 can be used to execute the method provided by the embodiment shown in Fig. 2, the technical effects it can obtain can refer to the above method embodiments and are not repeated here.
Further, based on the same inventive concept, an embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, it executes the steps of the above live streaming data processing method.
Specifically, the storage medium may be a general storage medium, such as a removable disk or a hard disk, and when the computer program on the storage medium is run, the above live streaming data processing method can be executed.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of the method, the device (such as the electronic device of Fig. 9) and the computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the computer or the processor of the other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the present application is described herein in conjunction with the embodiments, in the course of implementing the claimed application, those skilled in the art can, by studying the drawings, the disclosure and the appended claims, understand and realize other variations of the disclosed embodiments. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfil the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The above are only various embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily think of changes or substitutions within the technical scope disclosed in the present application, which should all be covered within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. a kind of live data processing method, which is characterized in that be applied to live streaming and provide terminal, which comprises
It parses the tone color convert requests received and obtains target tone color style;
First voice data with the target tone color style is obtained, and first voice data is input to preparatory training Network parameter learning model in, obtain the corresponding target network parameter of the target tone color style;
The network parameter of the style switching network prestored is adjusted to the target network parameter, and is turned according to style adjusted Switching network carries out style conversion to the second speech data that main broadcaster inputs, and obtains the third voice with the target tone color style Data;
The living broadcast interactive data flow of virtual live streaming image is generated according to the third voice data, and is sent to live streaming and is received terminal It plays out.
2. live data processing method according to claim 1, which is characterized in that described that first voice data is defeated Enter into network parameter learning model trained in advance, obtains the step of the corresponding target network parameter of the target tone color style Suddenly, comprising:
It is corresponding with reference to style and features figure to extract first voice data;
It is input to described in the network parameter learning model with reference to style and features figure, it is corresponding to obtain the target tone color style Target network parameter.
3. live data processing method according to claim 1, which is characterized in that described to be converted according to style adjusted Network carries out style conversion to the second speech data that main broadcaster inputs, and obtains the third voice number with the target tone color style According to the step of, comprising:
The audio frequency characteristics figure of the second speech data is extracted, the audio frequency characteristics figure includes content characteristic figure;
The content characteristic figure is handled by the style switching network adjusted, obtains that there is the target tone color The style converting characteristic figure of style;
Feature inverse transform is carried out to the content characteristic figure and the style converting characteristic figure, obtains that there is the target tone color style Third voice data.
4. live data processing method described in any one of -3 according to claim 1, which is characterized in that the network parameter Learning model is based on depth using the first speech samples of at least one tone color style and the second speech samples of any main broadcaster The neural metwork training of habit obtains, wherein at least one tone color style includes the target tone color style.
5. live data processing method described in any one of -3 according to claim 1, which is characterized in that described from reception To tone color convert requests in obtain target tone color style before, the method also includes:
The network parameter learning model is obtained previously according to training sample training, is specifically included:
Obtain training sample, the training sample include at least one tone color style the first speech samples and any main broadcaster the Two speech samples, wherein at least one tone color style includes the target tone color style;
Corresponding content characteristic sample graph is extracted from the second speech samples of any main broadcaster;
For every kind of tone color style, corresponding style and features sample graph is extracted from the first speech samples of the tone color style;
Meta learning network is carried out according to the content characteristic sample graph and every kind of tone color style corresponding style and features sample graph Training, obtains the network parameter learning model, and is stored in the live streaming and provides in terminal.
6. live data processing method according to claim 5, which is characterized in that described according to the content characteristic sample Scheme the step of style and features sample graph corresponding with every kind of tone color style is trained meta learning network, comprising:
The corresponding style and features sample graph of every kind of tone color style is input in the meta learning network, every kind of tone color style is obtained Style network parameter;
It is adjusted according to network parameter of the style network parameter of every kind of tone color style to preset style switching network, and will The content characteristic sample graph is input in style switching network adjusted, obtains corresponding style converting characteristic sample graph;
According to the style and features sample graph and the corresponding style converting characteristic sample graph adjustment meta learning of every kind of tone color style The network parameter of network obtains the network parameter learning model.
7. The live data processing method according to claim 6, wherein the step of adjusting the network parameters of the meta-learning network according to the style feature sample map of each tone color style and the corresponding style conversion feature sample map comprises:
calculating a loss function value between the style feature sample map of each tone color style and the corresponding style conversion feature sample map;
updating the network parameters of the meta-learning network according to the loss function value and continuing the iterative training, and outputting the trained network parameter learning model when the meta-learning network meets a training termination condition.
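Continuing the claim 6 sketch (reusing meta, style_switch, style_map and content_map from it), a hypothetical update step for claim 7. Mean-squared error over time-averaged feature statistics and the Adam optimizer are assumptions; the patent does not name a loss function or an optimizer.

```python
# Hypothetical update step for claim 7: compare the style conversion feature
# sample map with the style feature sample map and adjust the meta-learning network.
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(meta.parameters(), lr=1e-4)

def training_step() -> float:
    weight, bias = meta(style_map)                       # style network parameter
    converted = style_switch(content_map, weight, bias)  # style conversion feature sample map
    # Assumed loss: MSE between time-averaged statistics of the two maps.
    loss = F.mse_loss(converted.mean(dim=-1), style_map.mean(dim=-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```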
8. The live data processing method according to claim 7, wherein the training termination condition includes at least one of the following conditions:
the loss function value no longer decreases;
the loss function value is lower than a set value;
the number of training iterations reaches a set number.
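An illustrative termination check combining the three conditions of claim 8, driving the training_step sketch above; the thresholds are placeholders.

```python
# Iterate training_step (claim 7 sketch) until one termination condition holds.
previous = float("inf")
LOSS_FLOOR, MAX_ITERS = 1e-3, 10_000

for iteration in range(MAX_ITERS):             # condition 3: set iteration count reached
    current = training_step()
    if current >= previous:                    # condition 1: loss no longer decreases
        break
    if current < LOSS_FLOOR:                   # condition 2: loss below a set value
        break
    previous = current
```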
9. The live data processing method according to claim 1, wherein the step of generating a live interaction data stream of a virtual live streaming avatar according to the third voice data and sending it to a live streaming receiving terminal for playback comprises:
splitting the third voice data into a plurality of audio data segments according to a set time interval;
for each audio data segment, identifying content parameters of the audio data segment, the content parameters including a content feature, an emotion feature and an amplitude feature, the emotion feature being used to control the emotional state of the virtual live streaming avatar, and the amplitude feature being used to control the mouth opening and closing of the virtual live streaming avatar;
generating an interaction video segment of the virtual live streaming avatar corresponding to the audio data segment according to the content feature, the emotion feature and the amplitude feature;
synthesizing each audio data segment with its corresponding interaction video segment to obtain the live interaction data stream of the virtual live streaming avatar, and sending the live interaction data stream of the virtual live streaming avatar to the live streaming receiving terminal for playback.
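An illustrative sketch of the segmentation and parameter extraction of claim 9. The emotion classifier and the avatar renderer are stand-ins, since the patent only states that the emotion feature controls the avatar's emotional state and the amplitude feature controls mouth opening and closing; segment length and threshold values are placeholders.

```python
# Illustrative segmentation and content-parameter extraction for claim 9.
import torch

SAMPLE_RATE = 16000
SEGMENT_SECONDS = 0.5                          # the "set time interval"

def split_segments(third_voice_data: torch.Tensor):
    """third_voice_data: mono waveform of shape [samples]."""
    seg_len = int(SAMPLE_RATE * SEGMENT_SECONDS)
    return list(torch.split(third_voice_data, seg_len))

def content_parameters(segment: torch.Tensor) -> dict:
    amplitude = segment.abs().mean().item()    # drives mouth opening/closing
    return {
        "content_feature": segment,            # placeholder: raw audio as the content feature
        "emotion_feature": "neutral",          # placeholder classifier output
        "amplitude_feature": amplitude,
    }

def build_interaction_stream(third_voice_data: torch.Tensor):
    stream = []
    for segment in split_segments(third_voice_data):
        params = content_parameters(segment)
        video_segment = {"mouth_open": params["amplitude_feature"] > 0.05,
                         "emotion": params["emotion_feature"]}       # placeholder renderer
        stream.append({"audio": segment, "video": video_segment})    # synthesis step
    return stream
```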
10. A live data processing apparatus, applied to a live streaming providing terminal, the apparatus comprising:
a parsing module, configured to parse a received tone color conversion request to obtain a target tone color style;
an input module, configured to obtain first voice data with the target tone color style, and to input the first voice data into a pre-trained network parameter learning model to obtain a target network parameter corresponding to the target tone color style;
a style conversion module, configured to adjust the network parameters of a pre-stored style switching network to the target network parameter, and to perform style conversion on second speech data input by the anchor according to the adjusted style switching network to obtain third voice data with the target tone color style;
a generating and sending module, configured to generate a live interaction data stream of a virtual live streaming avatar according to the third voice data, and to send it to a live streaming receiving terminal for playback.
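A hypothetical wiring of the four modules of claim 10 as one orchestrating class; the callables passed in stand for the sketches above and are assumptions, not part of the patent.

```python
# Illustrative module decomposition mirroring claim 10.
class LiveDataProcessor:
    def __init__(self, parser, param_model, style_network, stream_builder):
        self.parse = parser                  # parsing module
        self.param_model = param_model       # input module's learning model
        self.style_network = style_network   # style conversion module
        self.build_stream = stream_builder   # generating & sending module

    def handle(self, tone_color_request, second_speech_data, voice_bank):
        target_style = self.parse(tone_color_request)                     # parsing module
        first_voice_data = voice_bank[target_style]                       # reference audio of that style
        target_params = self.param_model(first_voice_data)                # input module
        third_voice = self.style_network(second_speech_data, target_params)  # style conversion module
        return self.build_stream(third_voice)                             # live interaction data stream
```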
11. An electronic device, comprising one or more storage media and one or more processors in communication with the storage media, the one or more storage media storing machine-executable instructions executable by the processor, wherein when the electronic device runs, the processor executes the machine-executable instructions to implement the live data processing method according to any one of claims 1-9.
12. A readable storage medium storing machine-executable instructions, wherein the machine-executable instructions, when executed, implement the live data processing method according to any one of claims 1-9.
CN201910368522.XA 2019-05-05 2019-05-05 Live data processing method, device, electronic equipment and readable storage medium storing program for executing Pending CN110062267A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910368522.XA CN110062267A (en) 2019-05-05 2019-05-05 Live data processing method, device, electronic equipment and readable storage medium storing program for executing

Publications (1)

Publication Number Publication Date
CN110062267A true CN110062267A (en) 2019-07-26

Family

ID=67322286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910368522.XA Pending CN110062267A (en) 2019-05-05 2019-05-05 Live data processing method, device, electronic equipment and readable storage medium storing program for executing

Country Status (1)

Country Link
CN (1) CN110062267A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120316882A1 (en) * 2011-06-10 2012-12-13 Morgan Fiumi System for generating captions for live video broadcasts
CN107154069A (en) * 2017-05-11 2017-09-12 上海微漫网络科技有限公司 A kind of data processing method and system based on virtual role
CN107248195A (en) * 2017-05-31 2017-10-13 珠海金山网络游戏科技有限公司 A kind of main broadcaster methods, devices and systems of augmented reality
CN107481735A (en) * 2017-08-28 2017-12-15 中国移动通信集团公司 A kind of method, server and the computer-readable recording medium of transducing audio sounding
CN109151366A (en) * 2018-09-27 2019-01-04 惠州Tcl移动通信有限公司 A kind of sound processing method of video calling
CN109120985A (en) * 2018-10-11 2019-01-01 广州虎牙信息科技有限公司 Image display method, apparatus and storage medium in live streaming

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021077663A1 (en) * 2019-10-21 2021-04-29 南京创维信息技术研究院有限公司 Method and system for automatically adjusting sound and image modes on basis of scene recognition
CN112995530A (en) * 2019-12-02 2021-06-18 阿里巴巴集团控股有限公司 Video generation method, device and equipment
CN111312267A (en) * 2020-02-20 2020-06-19 广州市百果园信息技术有限公司 Voice style conversion method, device, equipment and storage medium
CN111312267B (en) * 2020-02-20 2023-08-11 广州市百果园信息技术有限公司 Voice style conversion method, device, equipment and storage medium
CN111343473B (en) * 2020-02-25 2022-07-01 北京达佳互联信息技术有限公司 Data processing method and device for live application, electronic equipment and storage medium
CN111343473A (en) * 2020-02-25 2020-06-26 北京达佳互联信息技术有限公司 Data processing method and device for live application, electronic equipment and storage medium
CN112019874A (en) * 2020-09-09 2020-12-01 广州华多网络科技有限公司 Live wheat-connecting method and related equipment
CN113784163B (en) * 2020-09-09 2023-06-20 广州方硅信息技术有限公司 Live wheat-connecting method and related equipment
CN113784163A (en) * 2020-09-09 2021-12-10 广州方硅信息技术有限公司 Live wheat-connecting method and related equipment
CN112164407A (en) * 2020-09-22 2021-01-01 腾讯音乐娱乐科技(深圳)有限公司 Tone conversion method and device
CN112017698B (en) * 2020-10-30 2021-01-29 北京淇瑀信息科技有限公司 Method and device for optimizing manual recording adopted by voice robot and electronic equipment
CN112017698A (en) * 2020-10-30 2020-12-01 北京淇瑀信息科技有限公司 Method and device for optimizing manual recording adopted by voice robot and electronic equipment
CN112672172B (en) * 2020-11-30 2023-04-28 北京达佳互联信息技术有限公司 Audio replacing system, method and device, electronic equipment and storage medium
CN112446938B (en) * 2020-11-30 2023-08-18 重庆空间视创科技有限公司 Multi-mode-based virtual anchor system and method
CN112672172A (en) * 2020-11-30 2021-04-16 北京达佳互联信息技术有限公司 Audio replacement system, method and device, electronic equipment and storage medium
CN112446938A (en) * 2020-11-30 2021-03-05 重庆空间视创科技有限公司 Multi-mode-based virtual anchor system and method
CN112788359A (en) * 2020-12-30 2021-05-11 北京达佳互联信息技术有限公司 Live broadcast processing method and device, electronic equipment and storage medium
CN112954378A (en) * 2021-02-05 2021-06-11 广州方硅信息技术有限公司 Method and device for playing voice barrage in live broadcast room, electronic equipment and medium
CN113111791A (en) * 2021-04-16 2021-07-13 深圳市格灵人工智能与机器人研究院有限公司 Image filter conversion network training method and computer readable storage medium
CN113111791B (en) * 2021-04-16 2024-04-09 深圳市格灵人工智能与机器人研究院有限公司 Image filter conversion network training method and computer readable storage medium
CN113259701B (en) * 2021-05-18 2023-01-20 游艺星际(北京)科技有限公司 Method and device for generating personalized timbre and electronic equipment
CN113259701A (en) * 2021-05-18 2021-08-13 游艺星际(北京)科技有限公司 Method and device for generating personalized timbre and electronic equipment
CN115412773A (en) * 2021-05-26 2022-11-29 武汉斗鱼鱼乐网络科技有限公司 Method, device and system for processing audio data of live broadcast room
WO2023273440A1 (en) * 2021-06-30 2023-01-05 华为技术有限公司 Method and apparatus for generating plurality of sound effects, and terminal device
CN115550503A (en) * 2021-06-30 2022-12-30 华为技术有限公司 Method and device for generating multiple sound effects and terminal equipment
CN115550503B (en) * 2021-06-30 2024-04-23 华为技术有限公司 Method and device for generating multiple sound effects, terminal equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110062267A (en) Live data processing method, device, electronic equipment and readable storage medium storing program for executing
CN110085244A (en) Living broadcast interactive method, apparatus, electronic equipment and readable storage medium storing program for executing
CN106878820B (en) Live broadcast interaction method and device
CN105450642B (en) It is a kind of based on the data processing method being broadcast live online, relevant apparatus and system
WO2022166709A1 (en) Virtual video live broadcast processing method and apparatus, and storage medium and electronic device
US11113884B2 (en) Techniques for immersive virtual reality experiences
JP2020034895A (en) Responding method and device
CN111010589A (en) Live broadcast method, device, equipment and storage medium based on artificial intelligence
CN106488311B (en) Sound effect adjusting method and user terminal
WO2023011221A1 (en) Blend shape value output method, storage medium and electronic apparatus
EP3826314A1 (en) Electrical devices control based on media-content context
CN109348274A (en) A kind of living broadcast interactive method, apparatus and storage medium
CN106792013A (en) A kind of method, the TV interactive for television broadcast sounds
CN114128299A (en) Template-based excerpts and presentations for multimedia presentations
US11671562B2 (en) Method for enabling synthetic autopilot video functions and for publishing a synthetic video feed as a virtual camera during a video call
US20230039530A1 (en) Automated generation of haptic effects based on haptics data
CN113439447A (en) Room acoustic simulation using deep learning image analysis
CN113704390A (en) Interaction method and device of virtual objects, computer readable medium and electronic equipment
Alexanderson et al. Animated Lombard speech: Motion capture, facial animation and visual intelligibility of speech produced in adverse conditions
Sodoyer et al. A study of lip movements during spontaneous dialog and its application to voice activity detection
CN110337041A (en) Video broadcasting method, device, computer equipment and storage medium
CN109286760A (en) A kind of entertainment video production method and its terminal
CN108965904A (en) A kind of volume adjusting method and client of direct broadcasting room
US20230353707A1 (en) Method for enabling synthetic autopilot video functions and for publishing a synthetic video feed as a virtual camera during a video call
CN116756285A (en) Virtual robot interaction method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190726