CN107846605A - Anchor-side streaming media data generation system and method, and network live broadcast system and method - Google Patents

Anchor-side streaming media data generation system and method, and network live broadcast system and method

Info

Publication number
CN107846605A
CN107846605A (application number CN201710037179.1A)
Authority
CN
China
Prior art keywords
video
module
audio
streaming media
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710037179.1A
Other languages
Chinese (zh)
Other versions
CN107846605B (en)
Inventor
黄志伟
卢哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Happly Sunshine Interactive Entertainment Media Co Ltd
Original Assignee
Hunan Happly Sunshine Interactive Entertainment Media Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Happly Sunshine Interactive Entertainment Media Co Ltd
Priority to CN201710037179.1A
Publication of CN107846605A
Application granted
Publication of CN107846605B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/65 Transmission of management data between client and server
    • H04N21/658 Transmission by the client directed to the server
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection

Abstract

The invention discloses an anchor-side streaming media data generation system and method, and a network live broadcast system and method. The anchor-side streaming media data generation system includes an audio/video capture module, an audio/video encoding module, a streaming media packaging module, a scene analysis module and a control module. The output of the audio/video capture module is connected to the input of the control module through the scene analysis module, and the output of the control module is connected to the audio/video encoding module. The scene analysis module analyzes the video data captured by the audio/video capture module, determines the current scene type Ni and passes Ni to the control module; the control module selects and executes the corresponding streaming media data generation scheme in M according to Ni. During a network live broadcast, the invention can intelligently distinguish the state of the anchor terminal and, on the premise of guaranteeing the users' viewing experience, apply different streaming media data generation schemes, thereby reducing the network bandwidth occupied by live streaming media data, saving traffic and lowering cost.

Description

Anchor-side streaming media data generation system and method, and network live broadcast system and method
Technical field
The invention belongs to the technical field of network live broadcasting, and in particular relates to an anchor-side streaming media data generation system and method, and a network live broadcast system and method.
Background technology
" network direct broadcasting " is substantially divided to two classes, and the first kind is to provide the viewing of TV signal, such as all kinds of physical culture ratio on the net Match is live with recreational activities, and this kind of live principle is that TV (simulation) signal is converted into data signal input by collection Computer, website is uploaded in real time and is watched for people, equivalent to " Web TV ";Second class is then " network direct broadcasting " truly: Independent signal collecting device collection multi-medium data (including voice data and video data) is set up at the scene, imports main broadcaster end (instructor in broadcasting's equipment or platform), then uploaded onto the server by network, it is distributed to network address and is watched for people.It is the present invention is directed above-mentioned The second class situation in network direct broadcasting, especially for personal live, i.e., the most common mode in current live market.Second Maximum difference of the class network direct broadcasting compared with the first kind is that live independence:Individually controllable audio-video collection, it is entirely different Watched in single (moreover viewing effect is not so good as the smoothness of television-viewing) of relay television signal.
Multimedia data in a network live broadcast generally includes video data and audio data. As shown in Fig. 1, a common network live broadcast system includes an anchor terminal 1, a server 2 and a client 3. The anchor terminal 1 captures the multimedia data, generates streaming media data, and uploads the streaming media data to the server 2 over the network; after receiving a live viewing request from the client 3, the server 2 transmits the streaming media data to the client 3 over the network.
As can be seen from Fig. 1, the streaming media data passes through two network transmission stages: upload and download. Because network transmission generally uses an operator's network, how to reduce the bandwidth occupied in a network live broadcast and save traffic for users, on the premise of guaranteeing user experience, has always been a problem to consider in the development of network live broadcast products.
The streaming media data transmitted over the network is produced by the streaming media data generation system of the anchor terminal 1. As shown in Fig. 2, a traditional anchor-side streaming media data generation system includes an audio/video capture module 4, an audio/video encoding module 5 and a streaming media packaging module 6; the output of the audio/video capture module 4 is connected to the input of the streaming media packaging module 6 through the audio/video encoding module 5. The audio and video data captured by the audio/video capture module 4 are compressed and encoded by the audio/video encoding module 5, then packaged by the streaming media packaging module 6, and finally output as streaming media data.
Generally, the streaming media data transmitted over the network includes compressed video data and compressed audio data. The compressed audio data is usually not large, while the compressed video data accounts for a very large proportion of the streaming media data. The factors that influence the size of the compressed video data are mainly the complexity of the video images (which generally includes motion complexity, texture complexity, etc.) and the encoder; once the encoder is fixed, the size is further related to the coding parameters.
During a network live broadcast the anchor can be in various states. The most common state is the anchor chatting and interacting with fans in front of the camera; sometimes the anchor falls asleep while the broadcast continues, sometimes the anchor leaves briefly while the broadcast continues, and sometimes the anchor actively covers the lens. Different states of the anchor terminal 1 mean that the complexity of the video images to be encoded differs from period to period. Current live broadcast solutions all use a single streaming media data generation scheme to handle every anchor state: the coding parameters of the audio/video encoding module 5 are preset, and the video and audio data generated after compression and encoding are packaged into streaming media data and transmitted over the network. The defect of this anchor-side streaming media data generation method is that it does not consider that the anchor's state can change and that the complexity of the live picture can vary significantly (most commonly, switching between static and dynamic scenes); for the viewing user, the streaming media data therefore contains redundancy and traffic is wasted.
Existing live broadcast techniques basically do not address this problem. The closest approaches may include the following: 1. When the anchor leaves briefly, the broadcast is closed and interrupted, and the anchor must manually reconnect after coming back, which has a large impact on user experience. 2. The live client provides an option that lets the anchor manually choose to transmit only audio data and no video data. This approach requires manual interaction by the anchor, the anchor can only choose between broadcasting with or without video, and the video encoding is not optimized; the solution is relatively coarse and a large amount of streaming media data redundancy still remains.
Summary of the invention
Existing anchor-side streaming media data generation systems use a single streaming media data generation scheme to handle all anchor states, which leaves considerable streaming media data redundancy and increases the cost for users to watch a network live broadcast. In view of the above deficiencies of the prior art, an object of the present invention is to provide an anchor-side streaming media data generation system and method, and a network live broadcast system and method, which can distinguish the state of the anchor terminal during a network live broadcast and automatically select a suitable streaming media data generation mode; on the premise of guaranteeing the users' viewing experience, different streaming media data generation schemes are applied to reduce the network bandwidth occupied by live streaming media data, thereby saving live traffic and lowering the cost for users to watch the network live broadcast.
In order to solve the above technical problems, the technical solution adopted by the present invention is as follows.
An anchor-side streaming media data generation system includes an audio/video capture module, an audio/video encoding module and a streaming media packaging module; the output of the audio/video capture module is connected to the input of the streaming media packaging module through the audio/video encoding module, and the streaming media packaging module outputs streaming media data. The system is characterized in that it further includes a scene analysis module and a control module: the output of the audio/video capture module is connected to the input of the control module through the scene analysis module, and the output of the control module is connected to the audio/video encoding module. The scene analysis module analyzes the video data captured by the audio/video capture module, determines the current scene type Ni and passes the scene type Ni to the control module. A scene type set N containing n elements and a streaming media data generation scheme set M are preset inside the control module, where the elements of M and N correspond one to one and Ni ∈ N; the control module selects and executes the corresponding streaming media data generation scheme in M according to Ni. The streaming media data generation scheme includes setting the coding parameters used by the audio/video encoding module for the video data.
Existing live broadcast applications generally use constant-bitrate encoding for the video data, i.e., the encoding bitrate within a certain time interval can only fluctuate slightly around the target bitrate. The subjective quality of the compressed image is related to the complexity of the image to be encoded and to the size of the compressed data, and the size of the compressed video data is in turn related to the coding parameter settings of the current encoder. The present invention performs image processing and analysis on the video data with the scene analysis module to determine to which preset scene type the current scene belongs, and then applies different streaming media data generation schemes according to the determined scene type. Because different scene types use different coding parameters, the coding loss is controlled by adjusting the coding parameters, so that video images of high complexity and video images of low complexity are compressed to the same subjective quality level; by increasing the coding loss of the low-complexity video images, the overall size of the compressed video data is reduced.
Further, the output of the control module is also connected to the streaming media packaging module; the streaming media data generation scheme includes controlling the streaming media packaging module to package only audio data, only video data, or both audio and video data.
With the above structure, whether video data is packaged is chosen according to the scene without affecting the users' viewing experience. Because the compressed audio data is usually not large while the compressed video data accounts for a very large proportion of the streaming media data, not packaging the video data in certain scenes greatly reduces the size of the streaming media data.
Based on the same inventive concept, the present invention also provides an anchor-side streaming media data generation method, comprising the steps of:
Step 1: the audio/video capture module captures audio data and video data;
Step 2: the scene analysis module analyzes the video data captured by the audio/video capture module, determines the current scene type Ni and passes the scene type Ni to the control module;
Step 3: the control module selects and executes the corresponding streaming media data generation scheme in M according to Ni, and sets the coding parameters used by the audio/video encoding module for the video data according to the streaming media data generation scheme; a scene type set N containing n elements and a streaming media data generation scheme set M are preset inside the control module, with the elements of M and N corresponding one to one;
Step 4: the audio/video encoding module encodes the video data according to the video coding parameters of the streaming media data generation scheme selected in step 3, and encodes the audio data at the same time.
Further, the method also includes step 5: the streaming media packaging module packages only audio data, only video data, or both audio and video data according to the streaming media data generation scheme selected in step 3.
As a preferred embodiment, N = {Ni | i = 1, 2, 3}, where N1 = normal scene, N2 = still-picture scene, N3 = lens-covered scene; M = {Mj | j = 1, 2, 3}, where M1 = the target bitrate of the audio/video encoding module for the video data is set to a constant value T1 and the streaming media packaging module packages both audio and video data, M2 = the target bitrate of the audio/video encoding module for the video data is set to a constant value T2 and the streaming media packaging module packages both audio and video data, M3 = the streaming media packaging module packages only audio data, where T2 < T1; N1 corresponds to M1, N2 to M2 and N3 to M3.
In step 2, the scene analysis module analyzes the video images and determines the current scene type as follows:
a. Compute the average luminance AVGluma of the video image; if AVGluma of S consecutive frames is less than the preset value Th1, determine that the current scene type is N3; otherwise go to step b;
b. Apply noise-reduction filtering to the video image, detect the amount of motion of the current image relative to the previous frame or the previous several frames by the frame difference method, then filter the frame-difference image and count the number of moving pixels Summov; if Summov of S consecutive frames is less than the preset value Th2, determine that the current scene type is N2; otherwise go to step c;
c. Determine that the current scene type is N1.
As another preferred embodiment, N = {Ni | i = 1 to n}, where N1 = lens-covered scene, Nm = stable picture with varying motion complexity, Nn = unstable picture state;
M = {Mj | j = 1 to n}, where M1 = the streaming media packaging module packages only audio data, Mm = the target bitrate of the audio/video encoding module for the video data is set to a constant value Tm and the streaming media packaging module packages both audio and video data, Mn = the target bitrate of the audio/video encoding module for the video data is set to a constant value T1 and the streaming media packaging module packages both audio and video data; where m ∈ [2, n-2]; the value of Tm is determined by the following rule:
if (T/(n-2))·m ≥ Th3, then Tm = (T/(n-2))·m; otherwise Tm = Th3, where T is the given target bitrate and Th3 is a preset target bitrate value.
In step 2, the scene analysis module analyzes the video images and determines the current scene type as follows:
a. Compute the average luminance AVGluma of the video image; if AVGluma of S consecutive frames is less than the preset value Th1, determine that the current scene type is N1; otherwise go to step b;
b. Apply noise-reduction filtering to the video image, detect the amount of motion of the current image relative to the previous frame or the previous several frames by the frame difference method, then filter the frame-difference image and count the number of moving pixels Summov; if Summov of S consecutive frames ∈ [Summ, Summ + ΔSum), determine that the current scene type is Nm; otherwise go to step c; where Summ = (m-1)·(W·H/(n-2)), ΔSum = W·H/(n-2), W is the width of the video image and H is its height;
c. Determine that the current scene type is Nn.
Based on the same inventive concept, the present invention also provides a network live broadcast system, including the anchor-side streaming media data generation system described above.
Based on the same inventive concept, the present invention also provides a network live broadcast method, including the anchor-side streaming media data generation method described above.
Compared with the prior art, the present invention can intelligently distinguish the state of the anchor terminal during a network live broadcast and automatically select a suitable streaming media data generation mode; on the premise of guaranteeing the users' viewing experience, different streaming media data generation schemes are applied to reduce the network bandwidth occupied by live streaming media data, thereby saving live traffic and lowering the cost for users to watch the network live broadcast.
Brief description of the drawings
Fig. 1 is a block diagram of a network live broadcast system.
Fig. 2 is a block diagram of a traditional anchor-side streaming media data generation system.
Fig. 3 is a block diagram of the anchor-side streaming media data generation system of the present invention.
In the figures, 1 is the anchor terminal, 2 is the server, 3 is the client, 4 is the audio/video capture module, 5 is the audio/video encoding module, 6 is the streaming media packaging module, 7 is the scene analysis module, and 8 is the control module.
Detailed description of the embodiments
As shown in Fig. 2 the main broadcaster end stream medium data generation system in network direct broadcasting system includes audio-video collection module 4th, audio/video coding module 5 and Streaming Media package module 6, the output end of audio-video collection module 4 pass through audio/video coding module 5 It is connected with the input of Streaming Media package module 6, Streaming Media package module 6 exports stream medium data;It is structurally characterized in that and also wrapped Scene analysis module 7 and control module 8 are included, the output end of audio-video collection module 4 passes through scene analysis module 7 and control module 8 input is connected, and the output end of control module 8 is connected with audio/video coding module 5;Wherein scene analysis module 7 be used for pair The video data that audio-video collection module 4 collects is analyzed, and judges current scene type NiAnd by scene type NiConveying To control module 8;The internal preset of control module 8 has the scene type set N comprising n element and stream medium data generation side Element in case set M, wherein M and N corresponds,Control module 8 is according to NiSelection performs corresponding stream matchmaker in M Volume data generates scheme;The stream medium data generation scheme includes setting coding of the audio/video coding module 5 to video data Parameter.
The output of the control module 8 is also connected to the streaming media packaging module 6; the streaming media data generation scheme includes controlling the streaming media packaging module 6 to package only audio data, only video data, or both audio and video data.
The scene analysis module 7 and the control module 8 perform algorithms and logic decisions; they typically run as software on a general-purpose processor (the algorithms and logic decisions may also be implemented and solidified in programmable hardware). The general flow of the present invention is: after the audio/video capture module 4 captures audio data and video data, the scene analysis module 7 analyzes the video images and intelligently determines to which preset scene type the current scene belongs. According to the determined scene type, different streaming media data generation schemes are applied; these schemes include different strategies for video encoding (adjusting coding parameters, etc.) and different combinations of audio/video data packaging, and finally the stream is pushed out for network transmission.
The implementation of the anchor-side streaming media data generation method in the network live broadcast method of the present invention is as follows:
(1) Setup stage
(1) A set N of state scenes that frequently occur at the anchor terminal 1 is preset. The criterion for partitioning live state scenes is whether a streaming media data generation scheme can be defined for the scene that reduces streaming media data redundancy; a live state scene is typically a state that the anchor will remain in for a period of time.
(2) A streaming media data generation scheme is defined for each state scene type; the number of streaming media data generation schemes equals the number of scene states, and the two correspond one to one. The set of streaming media data generation schemes is denoted M (M and N correspond one to one). A streaming media data generation scheme may use either of the following two approaches, or a combination of both:
First, the encoder is adjusted for the different state scenes, including but not limited to adjusting the encoder's coding parameters for the video data, with the goal of finding the best coding scheme for that state scene that does not affect the users' subjective experience.
Second, the streaming media packaging module 6 is controlled to package different combinations of audio data and video data: only audio data, only video data, or both audio and video data.
(2) Processing flow
(1) The anchor terminal 1 captures audio data and video data in the audio/video capture module 4 through the capture devices;
(2) For the captured video data, the state of the live terminal is analyzed using image processing methods and the scene type Si is marked. The analysis can be based on the motion complexity of the video images, the color characteristics of the video images, and so on. It should be noted that neither the specific image analysis method nor the features chosen for the image analysis are limited here; any scheme that classifies scenes through image analysis falls within the scope of protection of this patent;
(3) The analysis results from (2) are classified, and every scene is assigned to one of the preset live scene classes, i.e., Si is mapped to a live scene type Ni ∈ N. Two situations are covered: in one, the relation is one to one, i.e., each video image scene type is itself a live scene and uses its own streaming media generation method; in the other, the relation is many to one, i.e., several image scene types correspond to one live scene and use the same streaming media generation scheme.
(4) The streaming media data generation scheme Mi corresponding to the live scene type Ni is selected, where Mi ∈ M.
(5) Streaming media data is produced according to the streaming media data generation scheme selected in (4), which includes audio/video encoding and streaming media packaging.
(6) The generated streaming media data is transmitted according to the protocol, which completes the generation of streaming media data at the anchor terminal 1.
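A minimal sketch of this processing flow follows, assuming a Python implementation in which the capture, analysis, control, encoding and packaging modules are passed in as objects; all names (capture.next_segment, control.select_scheme, scheme.package_video, etc.) are illustrative assumptions rather than part of the patent:

```python
def live_loop(capture, scene_analyzer, control, encoder, packager, pusher):
    """Anchor-side flow: capture -> analyze -> select scheme -> encode -> package -> push."""
    while True:
        audio, video = capture.next_segment()        # (1) capture audio and video data
        scene = scene_analyzer.classify(video)       # (2)-(3) image analysis -> live scene Ni
        scheme = control.select_scheme(scene)        # (4) pick the matching scheme Mi
        encoder.apply(scheme.video_coding_params)    # (5) set video coding parameters per scheme
        enc_audio, enc_video = encoder.encode(audio, video)
        packet = packager.mux(enc_audio,             # package audio only or audio + video
                              enc_video if scheme.package_video else None)
        pusher.send(packet)                          # (6) transmit over the network
```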
To better describe the whole process, assume that the resolution of the live video images is 360 × 640, denoted W × H = 360 × 640, and that the normal live target bitrate is set to 600 kb/s, denoted T1 = 600.
The present invention classifies the live states that frequently occur at the anchor terminal 1. The classification is based on analysis of a large amount of live video data, and for each state scene a streaming media data generation scheme can be designed that saves more bitrate than the traditional scheme. Depending on whether the image-analysis scene types and the anchor states correspond one to one, there are the following two anchor-state classification schemes.
(1) The image-analysis scene types and the anchor states are in a many-to-one relation. Based on experience, the common anchor state scenes are divided into n (n = 3) classes: normal scene, still-picture scene and lens-covered scene (only three scenes are listed here; more scene types can be added as needed), and a streaming media data generation scheme is defined for each live state scene:
Scheme 1: for the normal scene, the traditional streaming media data generation method is kept, i.e., the streaming media packaging scheme is audio data + video data, and the target bitrate of the video encoding is not changed;
Scheme 2: for the still-picture scene (the anchor leaves briefly, the anchor is asleep, etc.), the complexity of the video images to be encoded is low, so the amount of video data can be reduced by adjusting the video coding parameters while the audio data keeps the traditional strategy. Concretely, the streaming media packaging scheme is audio data + video data, and the target bitrate of the video encoding is changed;
Scheme 3: for the lens-covered scene (usually the anchor actively covers the lens), the whole picture is black and transmitting video data is meaningless, so only audio data is transmitted.
It can be seen that, when the image-analysis scene types and the anchor states are in a many-to-one relation, N = {Ni | i = 1, 2, 3}, where N1 = normal scene, N2 = still-picture scene, N3 = lens-covered scene; M = {Mj | j = 1, 2, 3}, where M1 = the target bitrate of the audio/video encoding module for the video data is set to a constant value T1 and the streaming media packaging module packages both audio and video data, M2 = the target bitrate of the audio/video encoding module for the video data is set to a constant value T2 and the streaming media packaging module packages both audio and video data, M3 = the streaming media packaging module packages only audio data, where T2 < T1; N1 corresponds to M1, N2 to M2 and N3 to M3.
The specific implementation steps are as follows:
Step 1: the anchor terminal 1 captures audio data and video data in the audio/video capture module 4 through the capture devices;
Step 2: the scene analysis module 7 analyzes the state of the live terminal from the video data captured by the audio/video capture module 4 using image processing methods, determines the current scene type Ni and passes the scene type Ni to the control module 8. The analysis mainly considers the luminance and the degree of motion of the video images; the detailed decision process is as follows:
a. Compute the average luminance of the video image: count the luma component values of all pixels in a frame, denote the luminance of each pixel by Pi and the average luminance by AVGluma, where AVGluma = (Σ Pi) / (W × H), W is the width of the video image and H is its height. If AVGluma of S consecutive frames (S may be chosen as an integer multiple of the GOP length; e.g., if the GOP length is 50 frames, twice the GOP length is 100 frames) is less than the preset value Th1 (Th1 is an empirical value, typically in the range (0, 20]), determine that the current scene type is N3 (lens-covered scene); otherwise go to step b;
b. Apply preprocessing such as noise-reduction filtering to the video image, detect the amount of motion of the current image Fcur relative to the previous frame Flast on the time axis (or the previous several frames) by the frame difference method, then filter the frame-difference image and count the number of moving pixels Summov. If Summov of S consecutive frames is less than the preset value Th2 (Th2 is an empirical value, with a typical range of [0, W × H / 20]), determine that the current scene type is N2 (still-picture scene); otherwise go to step c;
c. Determine that the current scene type is N1 (any state that satisfies neither a nor b is classified as a normal scene).
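A compact sketch of this decision process, assuming NumPy arrays of luma values and reducing the noise-reduction filtering of step b to a simple per-pixel difference margin; the thresholds follow the empirical ranges quoted above, and all names are illustrative:

```python
import numpy as np

W, H = 360, 640                       # example resolution used in the text
TH1 = 20                              # luminance threshold Th1, upper end of (0, 20]
TH2 = (W * H) // 20                   # motion threshold Th2, upper end of [0, W*H/20]

def classify_window(luma_frames):
    """luma_frames: S consecutive luma planes (H x W uint8 arrays).
    Returns 'N3' (lens covered), 'N2' (still picture) or 'N1' (normal)."""
    # Step a: average luminance below Th1 for every frame in the window
    if all(float(f.mean()) < TH1 for f in luma_frames):
        return "N3"
    # Step b: frame difference; a pixel counts as moving if it changed by more than a noise margin
    still = True
    for prev, cur in zip(luma_frames, luma_frames[1:]):
        diff = np.abs(cur.astype(np.int16) - prev.astype(np.int16))
        if int((diff > 15).sum()) >= TH2:
            still = False
            break
    if still:
        return "N2"
    # Step c: everything else is the normal scene
    return "N1"
```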
Step 3: the control module 8 selects and executes the corresponding streaming media data generation scheme in M according to Ni, and sets the coding parameters used by the audio/video encoding module 5 for the video data according to the streaming media data generation scheme;
Step 4: the audio/video encoding module 5 encodes the video data according to the video coding parameters of the streaming media data generation scheme selected in step 3, and encodes the audio data at the same time.
Step 5: the streaming media packaging module 6 packages only audio data, only video data, or both audio and video data according to the streaming media data generation scheme selected in step 3.
The detailed process of steps 3 to 5 is as follows:
The control module 8 selects the corresponding preset streaming media data generation scheme according to the live scene type obtained in step 2. The selection is: the normal scene selects scheme 1, the still-picture scene selects scheme 2 and the lens-covered scene selects scheme 3.
Scheme 1: the implementation is consistent with the traditional live streaming media data generation method, i.e., the target bitrate of the audio/video encoding module 5 for the video data is set to 600 kb/s and the audio data is encoded according to the traditional scheme; after encoding, the streaming media packaging module 6 packages the compressed video data and the compressed audio data together;
Scheme 2: the target bitrate of the audio/video encoding module 5 for the video data is set lower than in the normal scene, to 200 kb/s; after encoding, the streaming media packaging module 6 packages the compressed video data and the compressed audio data together;
Scheme 3: in order not to change the audio and video encoding flow, the audio data and video data can still be encoded normally in this scheme; after the audio and video data have been encoded, the streaming media packaging module 6 packages only the compressed audio data and does not package the compressed video data.
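The three schemes of this embodiment can be summarized in a small lookup table; a sketch assuming bitrates in kb/s and a `package_video` flag read by the packaging module (the table form and names are illustrative):

```python
# Scheme table for the three-class embodiment (values taken from the example above).
STREAM_SCHEMES = {
    "N1": {"video_target_kbps": 600, "package_video": True},   # scheme 1: normal scene
    "N2": {"video_target_kbps": 200, "package_video": True},   # scheme 2: still picture
    "N3": {"video_target_kbps": 600, "package_video": False},  # scheme 3: lens covered, audio only
}
```

Note that in scheme 3 the video can still be encoded at the normal target bitrate; it is simply not packaged, so the encoding flow is left unchanged.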
The generated streaming media data is transmitted according to the protocol, which completes the generation of streaming media data at the anchor terminal 1.
(2) The image-analysis scene types and the anchor states are in a one-to-one relation, i.e., the anchor state categories can take into account the two factors of lens covering and scene motion complexity. Assume that the number of anchor states is preset to n (one class represents lens covering, n-2 live state classes are determined by different image motion complexities, and the remaining class represents a live scene whose picture state is unstable). Correspondingly, n streaming media data generation schemes should be defined:
Scheme 1: the streaming media data contains only audio data and no video data;
Schemes 2 to n-2: the streaming media data is audio data + video data, and the target bitrate of the video encoding is changed according to the motion complexity;
Scheme 3: the streaming media data is audio data + video data, and the target bitrate of the video encoding is not changed.
It can be seen that, when the image-analysis scene types and the anchor states are in a one-to-one relation, N = {Ni | i = 1 to n}, where N1 = lens-covered scene, Nm = stable picture with varying motion complexity, Nn = unstable picture state; M = {Mj | j = 1 to n}, where M1 = the streaming media packaging module packages only audio data, Mm = the target bitrate of the audio/video encoding module for the video data is set to a constant value Tm and the streaming media packaging module packages both audio and video data, Mn = the target bitrate of the audio/video encoding module for the video data is set to a constant value T1 and the streaming media packaging module packages both audio and video data; where m ∈ [2, n-2].
The specific implementation steps are as follows:
Step 1: the anchor terminal 1 captures audio data and video data in the audio/video capture module 4 through the capture devices;
Step 2: the scene analysis module 7 analyzes the state of the live terminal from the video data captured by the audio/video capture module 4 using image processing methods, determines the current scene type Ni and passes the scene type Ni to the control module 8. The analysis mainly considers the luminance and the degree of motion of the video images; the detailed decision process is as follows:
a. Compute the average luminance of the video image: count the luma component values of all pixels in a frame, denote the luminance of each pixel by Pi and the average luminance by AVGluma, where AVGluma = (Σ Pi) / (W × H), W is the width of the video image and H is its height. If AVGluma of S consecutive frames (S may be chosen as an integer multiple of the GOP length; e.g., if the GOP length is 50 frames, twice the GOP length is 100 frames) is less than the preset value Th1 (Th1 is an empirical value, typically in the range (0, 20]), determine that the current scene type is N1 (lens-covered scene); otherwise go to step b;
b. Apply preprocessing such as noise-reduction filtering to the video image, detect the amount of motion of the current image Fcur relative to the previous frame Flast on the time axis (or the previous several frames) by the frame difference method, then filter the frame-difference image and count the number of moving pixels Summov. Since there are n-2 states of different motion complexity in total, the decision rule for each state is: if Summov ∈ [Summ, Summ + ΔSum), where m ∈ [2, n-2], Summ = (m-1)·(W·H/(n-2)), ΔSum = W·H/(n-2), W is the width of the video image and H is its height, and this condition holds for S consecutive frames (S is chosen as in step a), then the current scene type is Nm; otherwise go to step c. It is worth noting that the classification based on motion complexity should not be divided too finely.
c. Determine that the current scene type is Nn (any state that satisfies neither a nor b is classified as a live scene whose picture state is unstable).
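A sketch of the binning used in step b, assuming the caller has already evaluated the luminance test and verified that the observed Summov persisted for S consecutive frames; the function name and the return convention (an index i into N) are illustrative:

```python
def scene_index(avg_luma_low, sum_mov, width, height, n):
    """Return the index i of the scene type Ni for the one-to-one embodiment.
    avg_luma_low: True if AVGluma < Th1 held for S consecutive frames.
    sum_mov: moving-pixel count Summov observed over that window."""
    if avg_luma_low:
        return 1                            # N1: lens-covered scene
    bin_width = (width * height) / (n - 2)  # DeltaSum = W*H/(n-2)
    m = int(sum_mov // bin_width) + 1       # [Summ, Summ + DeltaSum) with Summ = (m-1)*bin_width
    if 2 <= m <= n - 2:
        return m                            # Nm: stable picture, motion-complexity bin m
    return n                                # Nn: unstable picture state (step c)
```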
Step 3: the control module 8 selects and executes the corresponding streaming media data generation scheme in M according to Ni, and sets the coding parameters used by the audio/video encoding module 5 for the video data according to the streaming media data generation scheme;
Step 4: the audio/video encoding module 5 encodes the video data according to the video coding parameters of the streaming media data generation scheme selected in step 3, and encodes the audio data at the same time.
Step 5: the streaming media packaging module 6 packages only audio data, only video data, or both audio and video data according to the streaming media data generation scheme selected in step 3.
The detailed process of steps 3 to 5 is as follows:
The control module 8 selects the corresponding preset streaming media data generation scheme according to the live scene type obtained in step 2. The selection is: the lens-covered scene selects scheme 1, the stable-picture scenes with varying motion complexity select scheme 2, and the unstable picture state selects scheme 3.
Scheme 1: in order not to change the audio and video encoding flow, the audio data and video data can still be encoded normally in this scheme; after the audio and video data have been encoded, the streaming media packaging module 6 packages only the compressed audio data and does not package the compressed video data.
Scheme 2: this scheme mainly adjusts the target bitrate of the audio/video encoding module 5 for the video data. For a scene of type Nm, the corresponding target bitrate Tm is determined by the following rule:
if (T/(n-2))·m ≥ Th3, then Tm = (T/(n-2))·m; otherwise Tm = Th3, where T is taken as 600 (only this target bitrate setting scheme is given in the example; other similar schemes also fall within the scope of the present invention). The choice of Th3 is related to the image resolution; at 360 × 640 the value can be set to 100. After encoding, the streaming media packaging module 6 packages the compressed video data and the compressed audio data together.
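The Tm rule can be written as a one-line helper; a sketch using the example values T = 600 kb/s and Th3 = 100 kb/s (the defaults, the kb/s unit and the function name are illustrative assumptions drawn from the example above):

```python
def target_bitrate_kbps(m, n, T=600.0, Th3=100.0):
    """Target bitrate Tm for motion-complexity class m (2 <= m <= n-2):
    Tm = (T/(n-2))*m, floored at Th3."""
    tm = (T / (n - 2)) * m
    return tm if tm >= Th3 else Th3
```

For instance, with n = 6 this gives Tm = 300, 450 and 600 kb/s for m = 2, 3 and 4, so the most complex stable bin is encoded at the full normal bitrate.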
Scheme 3: the implementation is consistent with the traditional live streaming media data generation method, i.e., the target bitrate of the audio/video encoding module 5 for the video data is set to 600 kb/s and the audio data is encoded according to the traditional scheme; after encoding, the streaming media packaging module 6 packages the compressed video data and the compressed audio data together.
The generated streaming media data is transmitted according to the protocol, which completes the generation of streaming media data at the anchor terminal 1.

Claims (8)

1. An anchor-side streaming media data generation system, comprising an audio/video capture module (4), an audio/video encoding module (5) and a streaming media packaging module (6), the output of the audio/video capture module (4) being connected to the input of the streaming media packaging module (6) through the audio/video encoding module (5), and the streaming media packaging module (6) outputting streaming media data; characterized by further comprising a scene analysis module (7) and a control module (8), the output of the audio/video capture module (4) being connected to the input of the control module (8) through the scene analysis module (7), and the output of the control module (8) being connected to the audio/video encoding module (5);
wherein the scene analysis module (7) is configured to analyze the video data captured by the audio/video capture module (4), determine the current scene type Ni and pass the scene type Ni to the control module (8);
a scene type set N containing n elements and a streaming media data generation scheme set M are preset inside the control module (8), the elements of M and N corresponding one to one; the control module (8) selects and executes the corresponding streaming media data generation scheme in M according to Ni; the streaming media data generation scheme includes setting the coding parameters used by the audio/video encoding module (5) for the video data.
2. The anchor-side streaming media data generation system according to claim 1, characterized in that the output of the control module (8) is also connected to the streaming media packaging module (6); the streaming media data generation scheme includes controlling the streaming media packaging module (6) to package only audio data, only video data, or both audio and video data.
3. An anchor-side streaming media data generation method, characterized by comprising the steps of:
Step 1: an audio/video capture module (4) captures audio data and video data;
Step 2: a scene analysis module (7) analyzes the video data captured by the audio/video capture module (4), determines the current scene type Ni and passes the scene type Ni to a control module (8);
Step 3: the control module (8) selects and executes the corresponding streaming media data generation scheme in M according to Ni, and sets the coding parameters used by an audio/video encoding module (5) for the video data according to the streaming media data generation scheme; a scene type set N containing n elements and a streaming media data generation scheme set M are preset inside the control module (8), the elements of M and N corresponding one to one;
Step 4: the audio/video encoding module (5) encodes the video data according to the video coding parameters of the streaming media data generation scheme selected in step 3, and encodes the audio data at the same time.
4. The anchor-side streaming media data generation method according to claim 3, characterized by further comprising:
Step 5: a streaming media packaging module (6) packages only audio data, only video data, or both audio and video data according to the streaming media data generation scheme selected in step 3.
5. The anchor-side streaming media data generation method according to claim 4, characterized in that N = {Ni | i = 1, 2, 3}, where N1 = normal scene, N2 = still-picture scene, N3 = lens-covered scene; M = {Mj | j = 1, 2, 3}, where M1 = setting the target bitrate of the audio/video encoding module for the video data to a constant value T1 and packaging both audio and video data, M2 = setting the target bitrate of the audio/video encoding module for the video data to a constant value T2 and packaging both audio and video data, M3 = packaging only audio data, where T2 < T1; N1 corresponds to M1, N2 to M2 and N3 to M3;
in step 2, the scene analysis module (7) analyzes the video images and determines the current scene type as follows:
a. computing the average luminance AVGluma of the video image; if AVGluma of S consecutive frames is less than a preset value Th1, determining that the current scene type is N3; otherwise going to step b;
b. applying noise-reduction filtering to the video image, detecting the amount of motion of the current image relative to the previous frame or the previous several frames by the frame difference method, then filtering the frame-difference image and counting the number of moving pixels Summov; if Summov of S consecutive frames is less than a preset value Th2, determining that the current scene type is N2; otherwise going to step c;
c. determining that the current scene type is N1.
6. The anchor-side streaming media data generation method according to claim 4, characterized in that N = {Ni | i = 1 to n}, where N1 = lens-covered scene, Nm = stable picture with varying motion complexity, Nn = unstable picture state; M = {Mj | j = 1 to n}, where M1 = packaging only audio data, Mm = setting the target bitrate of the audio/video encoding module for the video data to a constant value Tm and packaging both audio and video data, Mn = setting the target bitrate of the audio/video encoding module for the video data to a constant value T1 and packaging both audio and video data; where m ∈ [2, n-2]; the value of Tm is determined by the following rule:
if (T/(n-2))·m ≥ Th3, then Tm = (T/(n-2))·m; otherwise Tm = Th3,
where T is the given target bitrate and Th3 is a preset target bitrate value;
in step 2, the scene analysis module (7) analyzes the video images and determines the current scene type as follows:
a. computing the average luminance AVGluma of the video image; if AVGluma of S consecutive frames is less than a preset value Th1, determining that the current scene type is N1; otherwise going to step b;
b. applying noise-reduction filtering to the video image, detecting the amount of motion of the current image relative to the previous frame or the previous several frames by the frame difference method, then filtering the frame-difference image and counting the number of moving pixels Summov; if Summov of S consecutive frames ∈ [Summ, Summ + ΔSum), determining that the current scene type is Nm; otherwise going to step c; where Summ = (m-1)·(W·H/(n-2)), ΔSum = W·H/(n-2), W is the width of the video image and H is its height;
c. determining that the current scene type is Nn.
7. A network live broadcast system, characterized by comprising the anchor-side streaming media data generation system according to claim 1 or 2.
8. A network live broadcast method, characterized by comprising the anchor-side streaming media data generation method according to any one of claims 3 to 6.
CN201710037179.1A 2017-01-19 2017-01-19 System and method for generating streaming media data of anchor terminal, and system and method for live network broadcast Active CN107846605B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710037179.1A CN107846605B (en) 2017-01-19 2017-01-19 System and method for generating streaming media data of anchor terminal, and system and method for live network broadcast

Publications (2)

Publication Number Publication Date
CN107846605A true CN107846605A (en) 2018-03-27
CN107846605B CN107846605B (en) 2020-09-04

Family

ID=61682781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710037179.1A Active CN107846605B (en) 2017-01-19 2017-01-19 System and method for generating streaming media data of anchor terminal, and system and method for live network broadcast

Country Status (1)

Country Link
CN (1) CN107846605B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6804295B1 (en) * 2000-01-07 2004-10-12 International Business Machines Corporation Conversion of video and audio to a streaming slide show
US20130007223A1 (en) * 2006-06-09 2013-01-03 Qualcomm Incorporated Enhanced block-request streaming system for handling low-latency streaming
CN101453642A (en) * 2007-11-30 2009-06-10 华为技术有限公司 Method, apparatus and system for image encoding/decoding
CN101330602A (en) * 2008-04-10 2008-12-24 王兴忠 System for monitoring digital video
CN102577308A (en) * 2009-09-22 2012-07-11 高通股份有限公司 Enhanced block-request streaming using scalable encoding
US20130167187A1 (en) * 2011-12-21 2013-06-27 Thomson Licensing Processing cluster and method for processing video content
CN102625106A (en) * 2012-03-28 2012-08-01 上海交通大学 Scene self-adaptive screen encoding rate control method and system
CN102780869A (en) * 2012-06-27 2012-11-14 宇龙计算机通信科技(深圳)有限公司 Video recording device and method
CN103617797A (en) * 2013-12-09 2014-03-05 腾讯科技(深圳)有限公司 Voice processing method and device
CN104243998A (en) * 2014-09-29 2014-12-24 广州华多网络科技有限公司 Data processing method, data processing device and related servers

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111836060A (en) * 2019-11-20 2020-10-27 高群 Series flow making system based on picture interception
CN113473148A (en) * 2020-03-30 2021-10-01 阿里巴巴集团控股有限公司 Computing system for video coding and video coding method
WO2023078204A1 (en) * 2021-11-02 2023-05-11 腾讯科技(深圳)有限公司 Data processing method and apparatus, device, readable storage medium, and program product
CN115529298A (en) * 2022-11-30 2022-12-27 湖南快乐阳光互动娱乐传媒有限公司 Intensive video and audio transmission system, method and device
CN115529298B (en) * 2022-11-30 2023-10-13 湖南快乐阳光互动娱乐传媒有限公司 System, method and device for transmitting dense video and audio

Also Published As

Publication number Publication date
CN107846605B (en) 2020-09-04

Similar Documents

Publication Publication Date Title
EP3846477B1 (en) Preprocessing image data
CN107846605A (en) Main broadcaster end stream medium data generation system and method, network direct broadcasting system and method
CN101743753B (en) A buffer-based rate control exploiting frame complexity, buffer level and position of intra frames in video coding
CN108495141A (en) A kind of synthetic method and system of audio and video
CN110139113B (en) Transmission parameter distribution method and device for video resources
CN101466035B (en) Method for distributing video image set bit based on H.264
US20060050970A1 (en) Method and apparatus for transmitting a coded video signal
CN102625106A (en) Scene self-adaptive screen encoding rate control method and system
CN102137258B (en) Method for controlling three-dimensional video code rates
CN105827633A (en) Video transmission method and device
CN110708570B (en) Video coding rate determining method, device, equipment and storage medium
CN108810530A (en) A kind of AVC bit rate control methods based on human visual system
CN110620924B (en) Method and device for processing coded data, computer equipment and storage medium
CN103051901A (en) Video data coding device and video data encoding method
JP2015513717A (en) Data, multimedia and video transmission update system
CN102780869A (en) Video recording device and method
CN100350801C (en) Method and equipment for controlling video frequency data quality
CN107087192A (en) Target bit rate method of adjustment and device
CN105516721A (en) Video encoder and bit rate control method thereof
CN113573140A (en) Code rate self-adaptive decision-making method supporting face detection and real-time super-resolution
CN105898403A (en) Online media service code stream self-adaptive method and system
CN107222748A (en) The treating method and apparatus of view data code check
WO2023134523A1 (en) Content adaptive video coding method and apparatus, device and storage medium
CN107197266B (en) HDR video coding method
CN110545418B (en) Self-adaptive video coding method based on scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant