CN106899860B - System and method for transmitting media over a network - Google Patents

System and method for transmitting media over a network

Info

Publication number
CN106899860B
CN106899860B CN201610377813.1A CN201610377813A CN106899860B
Authority
CN
China
Prior art keywords
model
user device
server
frame
eye frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610377813.1A
Other languages
Chinese (zh)
Other versions
CN106899860A (en)
Inventor
郭荣昌
杨昇龙
邓安伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yobeta Co.,Ltd.
Original Assignee
Ubida Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/976,239 (US9370718B2)
Application filed by Ubida Co
Publication of CN106899860A
Application granted
Publication of CN106899860B
Legal status: Active (current)
Anticipated expiration


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/20Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/275Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/61Network physical structure; Signal processing
    • H04N21/6106Network physical structure; Signal processing specially adapted to the downstream path of the transmission network
    • H04N21/6125Network physical structure; Signal processing specially adapted to the downstream path of the transmission network involving transmission via Internet
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g. 3D video

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Architecture (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Transfer Between Computers (AREA)
  • Processing Or Creating Images (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

The invention discloses a system and method for transmitting media from a server to a user device over a network, comprising: a virtual-reality (VR) application executing on the server generates a virtual VR 3D environment containing multiple 3D models; the server checks the state of each model in a predetermined order; the server then renders only the 3D models not yet pre-stored on the user device into a left-eye frame and a right-eye frame of a 2D video stream and sends these frames to the user device; for the remaining 3D models already pre-stored on the user device, the server does not render but only transmits their annotation data synchronously. After the user device receives the two eye frames and the annotation data, it uses the left-eye and right-eye frames as a background frame and, according to the annotation data, re-renders the 3D models pre-stored on the device itself to produce a foreground picture. Finally, the foreground and background are mixed to generate a mixed VR frame of an output video stream containing a VR scene, which is then output.

Description

System and method for transmitting media over a network
Technical field
The present invention relates to a system and method for transmitting media such as images and sound over a network, and more particularly to a method of rendering the 3D objects of a virtual-reality (VR) image on a user device, in which the user device renders 3D objects and combines them with a 2D video stream of the VR scene provided by a server.
Background technique
Over the past few years, online gaming has become a worldwide trend. With the development of cloud-computing systems and technologies, techniques that use a server to stream game content and provide it as a service have also emerged.
In a traditional approach to providing a cloud gaming service, the server is responsible for almost all of the computation. That is, to provide the service, the application must generate a virtual 3D environment containing multiple 3D objects that can be moved or controlled by the participants; in the known art these 3D objects may also include audio. According to the control actions of a participant (player), the server combines the virtual 3D environment with the 3D objects and renders the result into a 2D game picture with stereo sound. The server then transmits the rendered images, together with the sound, over the network to the player's device as a 2D video stream with stereo audio; on reception, the player's device only needs to decode and display the 2D video stream, without performing any additional 3D rendering computation. However, when this traditional technique performs the rendering for numerous players on the same server, the 3D rendering load on the server becomes excessive. In addition, because everything the player sees is transmitted as a lossily compressed 2D video stream, both the image quality and the sound quality fall noticeably short of directly displaying the original 3D objects, and the large network communication bandwidth required between the server and the players' devices is also a major problem.
Virtual-reality (VR) technology has recently become very popular. To provide a VR visual experience for the human eye, a virtual VR scene must contain one image dedicated to the viewer's left eye and another dedicated to the viewer's right eye. The present invention provides a system and method for transmitting media such as images and sound over a network, in which 3D objects are rendered on the user device and combined with a 2D video stream of the VR scene provided by the server.
Summary of the invention
The main purpose of the present invention is to provide a system and method for transmitting media such as images and sound over a network that can reduce the load on the server, improve the quality of the images and sound displayed on the user device, and save communication bandwidth between the server and the user device. The characteristic feature of the method is that the user device renders 3D objects (also called 3D models) and combines them with the 2D video stream of the VR scene provided by the server, thereby achieving the result of rendering the 3D objects of a virtual-reality (VR) image on the user device.
To achieve the above object, the present invention provides a system and method for transmitting media over a network, the media including multiple images. The system includes a server and a user device, and the method includes the following steps:
Step (A): execute a virtual-reality (VR) application on a server to generate a virtual VR 3D environment containing multiple 3D models, each 3D model being accompanied by a state indicating whether that 3D model is pre-stored on a user device;
Step (B): the server checks the states of the 3D models to determine which of them are to be encoded into a left-eye frame and a right-eye frame of a 2D video stream, the encoding being performed such that the 3D models not pre-stored on the user device are encoded into the left-eye frame and the right-eye frame;
Step (C): the server transmits at least the left-eye frame and the right-eye frame of the 2D video stream to the user device over the network; the server also sends the 3D models not pre-stored on the user device to the user device in a predetermined order; when the user device receives a 3D model transmitted by the server, it stores the model and sends a message to the server to change the state of the model, indicating that the model is now pre-stored on the user device; and
Step (D): the user device decodes the left-eye frame and the right-eye frame received from the server and uses them as a background frame on which it renders the 3D models that are pre-stored on the user device but not included in the left-eye and right-eye frames, thereby generating a mixed VR frame of an output video stream containing a VR scene.
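As an illustrative aid only (not the claimed implementation), steps (A) through (D) can be sketched as a per-frame split of the model set by state; all identifiers below are hypothetical, and rendering/encoding are reduced to placeholder strings.

```python
# Illustrative sketch of steps (A)-(D): the server encodes only the models
# that are not yet pre-stored on the user device; pre-stored models are
# rendered by the device itself on top of the decoded background frame.
# All names here are hypothetical, not taken from the patent's implementation.

def server_tick(models):
    """Split models by state: not-yet-stored ones go into the eye frames."""
    to_encode = [m for m in models if m["state"] != "Ready for Client"]
    annotations = [m["name"] for m in models if m["state"] == "Ready for Client"]
    left_frame = "L2D:" + "+".join(m["name"] for m in to_encode)
    right_frame = "R2D:" + "+".join(m["name"] for m in to_encode)
    return left_frame, right_frame, annotations

def client_tick(left_frame, right_frame, annotations):
    """Use the decoded eye frames as background; render cached models on top."""
    background = (left_frame, right_frame)
    foreground = ["rendered:" + name for name in annotations]
    return {"background": background, "foreground": foreground}  # mixed VR frame

models = [
    {"name": "house", "state": "Ready for Client"},
    {"name": "tree", "state": "Not Ready"},
]
l, r, ann = server_tick(models)
mixed = client_tick(l, r, ann)
print(mixed)
```

Only "tree" reaches the video stream; "house" travels as annotation data and is rendered locally, which is what saves server GPU time and bandwidth.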
In one embodiment, in step (D), after decoding the left-eye frame and the right-eye frame received from the server, the user device further merges the two frames into a combined VR frame; it then uses the merged VR frame as the background frame on which it renders the 3D models that are pre-stored on the user device but not included in the left-eye and right-eye frames, thereby generating the mixed VR frame of the output video stream containing the VR scene.
In one embodiment, the server further includes:
A VR scene transferer, which is a library compiled into the VR application or dynamically linked into the VR application at run time; the VR scene transferer includes a list of all the 3D models and the state of each 3D model, the state indicating whether the 3D model is "Not Ready", "Loading", or "Ready for Client" (already downloaded by the user); and
A VR scene server, which is a server program executed on the server together with the VR application; the VR scene server acts as a relay station for messages between the VR scene transferer and the user device, and also serves as the download server program from which the user device downloads the necessary 3D models from the server.
In one embodiment, the user device further includes:
A VR scene client, which is a program running on the user device that generates the output video stream and connects to the server through the network;
A frame combiner, which merges the left-eye frame and the right-eye frame into the combined VR frame; and
A VR scene cache, which stores at least one 3D model previously downloaded from the server.
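One plausible reading of the frame combiner described above is a side-by-side merge of the two eye frames into a single combined VR frame; the following minimal sketch, with frames represented as rows of pixel values, illustrates that assumption (the side-by-side layout and all names are hypothetical, not dictated by the patent).

```python
# Sketch of a frame combiner: merge a left-eye and a right-eye frame
# side by side into one combined VR frame. Frames are lists of rows of
# pixel values; the layout choice is an assumption made for illustration.

def combine_frames(left, right):
    if len(left) != len(right):
        raise ValueError("eye frames must have the same height")
    # each combined row is the left-eye row followed by the right-eye row
    return [l_row + r_row for l_row, r_row in zip(left, right)]

left = [[1, 2], [3, 4]]   # 2x2 left-eye frame
right = [[5, 6], [7, 8]]  # 2x2 right-eye frame
combined = combine_frames(left, right)
print(combined)
```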
Detailed description of the invention
Fig. 1 is a schematic diagram of an exemplary embodiment of the system of the present invention for transmitting media over a network;
Fig. 2 is a schematic diagram of an embodiment of the system structure of the present invention;
Fig. 3A is a flowchart of an embodiment of the method of the present invention for transmitting media over a network;
Fig. 3B is a flowchart of another embodiment of the method of the present invention for transmitting media over a network;
Figs. 4A, 4B and 4C are schematic diagrams of an embodiment showing how the method of the present invention transmits video-stream frames and 3D models;
Figs. 5A, 5B and 5C are schematic diagrams of an embodiment showing how the method of the present invention determines which 3D models must be encoded into a frame;
Figs. 6A, 6B and 6C are schematic diagrams of an embodiment showing how the method of the present invention transmits video-stream frames with sound and 3D sounds;
Figs. 7A, 7B and 7C are schematic diagrams of an embodiment showing how the method of the present invention determines which 3D sounds must be encoded into the video-stream frames with sound;
Fig. 8 is a schematic diagram of an embodiment of the system architecture of the virtual-reality (VR) scene system of the present invention;
Fig. 9 is a schematic diagram illustrating an embodiment of the function of the frame combiner of the virtual-reality (VR) scene system of the present invention;
Fig. 10 is a schematic diagram of a second embodiment of the system architecture of the VR scene system of the present invention;
Fig. 11 is a schematic diagram of a third embodiment of the system architecture of the VR scene system of the present invention.
Description of symbols: 1~server; 3~network (base station); 4~network; 21~user device (smartphone); 22~user device (laptop); 23~user device (desktop computer); 51~user's eyes; 52~projection plane; 70~server; 71, 71a~person; 72, 72a~house; 73, 73a~frame; 74~user device; 75~video-stream frame; 81, 81a~sound; 82, 82a~sound; 83, 83a, 1711, 1712, 1713~frame; 85~video-stream frame; 100, 1100~application; 110, 1110~scene transferer (library); 120, 1120~scene server; 121~data packer; 170, 1170~scene client (program); 1111, 1171~frame combiner; 190, 1190~scene cache; 60~67, 661, 60a~67a, 661a~step; 101~114, 122, 124, 172, 174, 176, 192, 194, 1101, 1112~1115, 1122, 1124, 1172, 1174, 1176, 1192, 1194~path.
Specific embodiment
To describe the system and method for transmitting media over a network proposed by the present invention more clearly, a detailed description is given below with reference to the drawings.
One application of the present invention is online gaming, in which a player uses a user device to play, over a network, a game running on a server. The server acts on the player's commands and generates the video displayed on the user device; for example, when a player performs an action on the user device, the action is sent to the server, an image is computed there, and the image is sent back to the user device. In many online games, the 2D images produced by the server include the 3D rendering of the other objects within the visible range.
In the present invention, the server provides the user device with the 3D models and 3D sounds it needs, so that the 3D rendering of objects within the visible range can be divided between the server and the user device. For example, the server provides some or all of the 3D models and 3D sounds to the user device, together with the annotation data associated with each 3D model or 3D sound, such as its position, orientation, and status data.
For example, at the start of a game, all game-related images on the user device (including the associated 3D rendering) are generated by the server and delivered over the network as a 2D video stream with stereo sound. The system of the invention then pushes media such as the 3D models and 3D sounds within the visible range, together with their spatial information, to the user device over the network, with objects nearer to the eyes pushed first. The system of the invention renders 3D models and 3D sounds on the user device whenever possible, and only falls back to rendering a 3D model or 3D sound on the server when necessary.
Once a 3D model or 3D sound has been stored on the user device, the server only needs to provide the annotation data of that object (the 3D model or 3D sound) to the user device; the user device can then render the object itself and present the result on top of the 2D video with stereo sound provided by the server. Unless the user device requests it, the server will not render that 3D model or 3D sound again. This arrangement of the method saves GPU computation on the server, and the server can maintain a dynamic database of 3D models and 3D sounds to improve the efficiency of communication with the user.
What the user device displays in the present invention is the combination of: (a) the 3D scene rendered on the server, delivered to the client as a 2D video stream with stereo sound and played by the user device, and (b) the 3D models and 3D sounds downloaded from the server, stored on the user device, and rendered by the user device itself. This mixture of the 2D video stream with stereo sound and the 3D models and 3D sounds rendered on the user device creates a rich 3D scene and surround audio while reducing bandwidth usage.
In one embodiment, the 2D video stream with stereo sound sent to the user device carries the annotation data of the 3D models and 3D sounds. The user device can check whether it already has each 3D model or 3D sound; if not, it downloads the needed 3D models and 3D sounds from the server and, once the download completes, stores them and builds a list, so that the scene can be rebuilt later. In this way, problems such as video-stream latency and the need for massive bandwidth are alleviated, and because the user device renders locally, better image quality can be obtained (since no video compression is involved).
The annotation data described above lets the user device correctly mix the 3D models and 3D sounds it rendered itself with the 2D video stream with stereo sound provided by the server, without omitting or duplicating any 3D model or 3D sound. As mentioned above, once the user device has stored all the needed 3D models and 3D sounds, it can rebuild the complete 3D scene and its audio; from that point the server no longer needs to perform any rendering, until a newly added 3D model or 3D sound that the user device has not stored appears. When a new 3D model is encountered, the server renders that new 3D model and all objects behind it, until the user device is able to render the new 3D model itself. Likewise, if a new 3D sound is encountered, the server renders that 3D sound until it becomes available on the user device.
The user device caches as many of the downloaded 3D models and 3D sounds as possible on its own storage, to avoid repeated downloads in later sessions; the network bandwidth cost is therefore further reduced. If they cannot be stored, the download and rendering are performed at execution time.
Fig. 1 is a schematic diagram of an exemplary embodiment of the system of the present invention for transmitting media over a network. The server 1 executes an application that provides a service, which may be (but is not limited to) a cloud online gaming service. Multiple user devices 21, 22, 23 can connect (log in) to the server 1 through a network 4 to use the service provided by the application running on the server 1. In this embodiment the network 4 is the Internet, and the user devices 21, 22, 23 can be any electronic devices that can connect to the Internet, such as (but not limited to) a smartphone 21, a tablet, a laptop 22, a desktop computer 23, a video game console, or a smart TV. Some user devices 21, 22 link to the network 4 wirelessly through a mobile base station 3, while other user devices link to the network 4 in a wired manner through a router. The application running on the server generates a virtual 3D environment containing multiple 3D models and 3D sounds, each 3D model or 3D sound being accompanied by a state indicating whether it is pre-stored on a user device 21, 22, 23. In a preferred embodiment of the invention, each user device has a corresponding independent application; that is, one application serves only one user device, but multiple applications can execute on the same server simultaneously to serve multiple user devices. As shown, the user devices 21, 22, 23 link to the server 1 through the network 4 to obtain the media generated by the application, which include at least some of the 3D models and 3D sounds. The features of this system architecture are detailed in Fig. 2 and the related description.
Fig. 2 is a schematic diagram of an embodiment of the system structure of the present invention.
In the present invention, an application 100 runs on a server 1 to generate 3D images and 3D sounds to be rendered, typically a 3D game. The 3D scene transferer 110 is a library that is statically linked during compilation of the application 100, or dynamically linked while the application 100 executes. The 3D scene client (program) 170 is a program executed on a user device 21, 22, 23 to generate and output the rendering results of the 3D images and 3D sounds produced by the application 100. In this embodiment, each user device 21, 22, 23 has its own corresponding application 100 and scene transferer 110.
In the present invention, the 3D scene client 170 and the 3D scene cache 190 form the client-side program and execution method, exploiting the user device's own computing capability to render 3D models and 3D sounds.
The 3D scene server 120 is a server program executed on the server 1 together with the application 100. It acts as the relay station for messages between the 3D scene transferer 110 on the server 1 and the 3D scene clients 170 on the user devices 21, 22, 23. It also serves as a file download server from which the 3D scene clients 170 of the user devices 21, 22, 23 download the necessary 3D models and 3D sounds from the server 1. The 3D scene transferer 110 stores an inventory listing all the 3D models and 3D sounds and the state of each model or sound; this state indicates whether each 3D model or 3D sound is (1) "Not Ready", (2) "Loading", or (3) "Ready for Client" (already downloaded by the user).
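The inventory and its three states can be sketched as a small state machine. This is an illustrative sketch only; the class and method names are hypothetical, and only the three state strings come from the text.

```python
# Sketch of the scene transferer's inventory: each 3D model or 3D sound
# carries one of the three states named in the description. Class and
# method names are hypothetical.
from enum import Enum

class State(Enum):
    NOT_READY = "Not Ready"
    LOADING = "Loading"
    READY_FOR_CLIENT = "Ready for Client"

class Inventory:
    def __init__(self):
        self.states = {}  # object name -> State

    def add(self, name):
        self.states.setdefault(name, State.NOT_READY)

    def mark_loading(self, name):
        # set together with the download instruction; a second instruction
        # is never issued once the state is already "Loading"
        if self.states[name] is State.NOT_READY:
            self.states[name] = State.LOADING

    def mark_ready(self, name):
        # set on the "3D model is ready on client" message
        self.states[name] = State.READY_FOR_CLIENT

inv = Inventory()
inv.add("house")
inv.mark_loading("house")
inv.mark_ready("house")
print(inv.states["house"].value)
```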
The main program of the application 100 sends the 3D scene information to the 3D scene transferer 110 by calling the library through its API (path 101 of Fig. 2). This 3D scene information includes the name, position, velocity, attributes, orientation, and all other data required to render each 3D model and 3D sound. After the 3D scene transferer 110 receives such data, it executes the following procedure.
Step (a): for 3D models, sort all the 3D models that need to be rendered; the sort order may be by distance from a virtual location (such as the 3D projection plane or the user's eyes), from near to far.
For 3D sounds, sort all the 3D sounds that need to be rendered; the sort order may likewise be by distance from a virtual location (such as the 3D projection plane or the user's eyes), from near to far.
In some cases, a 3D model A in the 3D scene may enclose or repeatedly overlap another 3D model B; for example, model A may be a house and model B a desk inside the house. In such a case, which model is closer to the virtual location is in fact an ambiguous question; model A and model B are then treated as a single 3D model, which may be denoted 3D model (A+B).
Some known information about the scene can be used to assist the sorting. For example, the ground in a game can be regarded as a large, flat 3D model lying under the other 3D objects; since the user's eyes are generally above the ground, the 3D model of the ground needs special handling in the ranking, to prevent it from being placed ahead of the other 3D models.
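The near-to-far sort of step (a), with the ground special-cased to the back of the ranking, can be sketched as follows; the distance metric and all names are illustrative assumptions, not the patented implementation.

```python
# Sketch of step (a): sort renderable models near-to-far from the eye,
# forcing the ground model to the back of the ranking as the text suggests.
# Names, the data layout, and the Euclidean metric are illustrative.
import math

def sort_models(models, eye):
    def key(m):
        d = math.dist(m["pos"], eye)
        # the ground is a large flat model under everything: rank it last
        return (1 if m.get("is_ground") else 0, d)
    return sorted(models, key=key)

models = [
    {"name": "ground", "pos": (0.0, -1.0, 0.0), "is_ground": True},
    {"name": "desk", "pos": (0.0, 0.0, 5.0)},
    {"name": "house", "pos": (0.0, 0.0, 2.0)},
]
ordered = [m["name"] for m in sort_models(models, eye=(0.0, 0.0, 0.0))]
print(ordered)
```

Without the `is_ground` flag the ground (distance 1.0) would sort first and shadow the whole ranking, which is exactly the problem the paragraph above describes.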
Step (b): for 3D models, find, starting from the nearest point (nearest to the eyes), the first 3D model "M" that does not have the "Ready for Client" state; in other words, the first 3D model "M" whose state is "Not Ready" (hereafter the "Not Ready" state is abbreviated as the NR state). Of course, such a 3D model may not exist (for example, all the 3D models to be displayed may already be marked "Ready for Client").
For 3D sounds, find, starting from the nearest point (nearest to the eyes), the first 3D sound "S" that does not have the "Ready for Client" state; in other words, the first 3D sound "S" whose state is "Not Ready" (NR). Of course, such a 3D sound may not exist (for example, all the 3D sounds to be played may already be marked "Ready for Client").
Step (c): for 3D models, the server renders the 3D model M and all the 3D models behind it, that is, all 3D models farther from the eyes than M (if there is no 3D model M, a blank screen is shown). The rendered result is encoded as one frame of a 2D video stream.
For 3D sounds, the server renders (plays) all the 3D sounds that do not have the "Ready for Client" state (if there is no such 3D sound, silence is generated), and the rendered result is encoded as the stereo audio accompanying the 2D video-stream frame of step (c). Note: the 3D sounds behind 3D sound S are rendered only when their state is not "Ready for Client", which differs from the treatment of the 3D models in step (c).
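Steps (b) and (c) amount to a scan for the first not-ready model and a split of the ordered list at that point, with a different selection rule for sounds; a minimal illustrative sketch (all names hypothetical):

```python
# Sketch of steps (b)-(c): scan the near-to-far list for the first model M
# whose state is not "Ready for Client"; the server renders M and every
# model behind it, while the models in front of M are left to the client.
# Sounds follow a different rule: the server renders exactly the not-ready
# sounds, wherever they sit in the order.

def split_models(ordered):
    for i, m in enumerate(ordered):
        if m["state"] != "Ready for Client":
            return ordered[:i], ordered[i:]   # (client renders, server renders)
    return ordered, []                        # no model M exists

def server_sounds(sounds):
    return [s for s in sounds if s["state"] != "Ready for Client"]

ordered = [
    {"name": "near", "state": "Ready for Client"},
    {"name": "mid", "state": "Not Ready"},        # this is model M
    {"name": "far", "state": "Ready for Client"}, # behind M: server renders anyway
]
client_side, server_side = split_models(ordered)
print([m["name"] for m in server_side])
```

Note that "far" is server-rendered even though the client has it, because it lies behind M; `server_sounds` would instead skip it.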
Step (d): transmit the following six pieces of information to the 3D scene server 120 (path 112): [Info 112-A], [Info 112-B], [Info 112-C], [Info 112-D], [Info 112-E], and [Info 112-F]; the 3D scene server 120 then forwards this information to the 3D scene client 170 (path 122).
[Info 112-A] is the state information (or annotation data) of all the 3D models in front of 3D model M. Note that no such model may exist. These models all have the "Ready for Client" state, meaning they have been pre-stored on the client device, and the 3D scene client (program) 170 on the client device 21, 22, 23 can render these models itself. To reduce the data transmission bandwidth, the 3D scene transferer 110 need not transmit the full state information; it is enough to transmit the difference between this rendering and the previous rendering.
[Info 112-B]: if the server finds that the state of 3D model M on the user device is "Not Ready", the server changes its user state to "Loading" and issues a download instruction for 3D model M, requesting that the client download it; if the user state is already "Loading", no instruction is issued, because the download instruction has already been sent.
[Info 112-C] is the encoded video-stream frame from step (c).
[Info 112-D] is the state information (or annotation data) of all the 3D sounds whose state is "Ready for Client" (no such 3D sound may exist). These sounds all have the "Ready for Client" state, meaning they have been pre-stored on the client device, and the 3D scene client (program) 170 on the client device 21, 22, 23 can render (play) these sounds itself. To reduce the data transmission bandwidth, the 3D scene transferer 110 need not transmit the full state information; it is enough to transmit the difference between this rendering and the previous rendering.
[Info 112-E]: if the server finds that the state of 3D sound S on the user device is "Not Ready", it changes the user state to "Loading" and issues a download instruction for 3D sound S, requesting that the client download it; if the user state is already "Loading", no instruction is issued, because the download instruction has already been sent.
[Info 112-F] is the encoded stereo audio from step (c).
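The six items of step (d) can be pictured as one per-frame packet from the scene transferer, relayed by the scene server to the scene client. The field names below are hypothetical shorthand for the patent's [Info 112-A]..[Info 112-F] labels; the packet layout is an illustration, not the disclosed wire format.

```python
# Sketch of the per-frame message of step (d): the six items bundled
# into one packet. Field names are hypothetical shorthand.
from dataclasses import dataclass
from typing import Optional

@dataclass
class FramePacket:
    model_annotations: list          # [Info 112-A]: states of models in front of M
    model_download: Optional[str]    # [Info 112-B]: download instruction for M, if any
    video_frame: bytes               # [Info 112-C]: encoded 2D video frame
    sound_annotations: list          # [Info 112-D]: states of ready sounds
    sound_download: Optional[str]    # [Info 112-E]: download instruction for S, if any
    stereo_audio: bytes              # [Info 112-F]: encoded stereo audio

pkt = FramePacket(
    model_annotations=[{"name": "house", "pos": (0, 0, 2)}],
    model_download="tree",           # request the client to fetch model "tree"
    video_frame=b"<encoded-frame>",
    sound_annotations=[],
    sound_download=None,             # no new sound this frame
    stereo_audio=b"<encoded-audio>",
)
print(pkt.model_download)
```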
Whenever the main program of an application 100 updates new 3D scene data to the 3D scene transferer 110, steps (a)~(d) are repeated; in general, the main program of the application 100 updates such data on every rendering.
Once the 3D scene client 170 receives the above data, it carries out the following rendering procedure.
Step (i): decode the video frame of [Info 112-C] and use this frame as the background for the subsequent 3D model rendering; likewise, decode the stereo audio of [Info 112-F] and use it as the background sound for the subsequent 3D sound rendering.
Step (ii): render the 3D models of [Info 112-A] on top of the video frame decoded in step (i). To reduce network bandwidth usage, the 3D scene client 170 stores this [Info 112-A] information in memory, so that next time the 3D scene transferer 110 can transmit only the difference in [Info 112-A] between the next rendering and this one, rather than the full state information. In the same way, all the 3D sounds belonging to [Info 112-D] are rendered and mixed with the stereo audio decoded in step (i); to reduce network bandwidth usage, the 3D scene client 170 stores this [Info 112-D] information in memory, so that next time the 3D scene transferer 110 can transmit only the difference in [Info 112-D] between the next rendering and this one, rather than the full state information.
Step (iii): in step (ii), the video frame with stereo audio transmitted by the server is mixed with the 3D models and 3D sounds rendered by the 3D scene client 170 itself, and the combined result is output as an output video stream with audio (path 176).
If the state [Info 112-B] is provided, the 3D scene client 170 processes 3D model M according to the following procedure.
Step (I): search the 3D scene cache 190 (path 174); the 3D scene cache 190 contains the 3D model database previously downloaded and stored on user devices 21, 22, 23.
Step (II): if 3D model M is already in the 3D scene cache 190, go to step (V).
Step (III): if 3D model M is not in the 3D scene cache 190, the 3D scene client 170 issues a download request to the 3D scene server 120 (path 172), and the 3D scene server 120 returns the data of 3D model M to the 3D scene client 170 (path 124).
Step (IV): once a 3D model has been fully downloaded, the 3D scene client 170 stores it into the 3D scene cache 190 (path 194), so that the next time it is needed it does not have to be downloaded again.
Step (V): the 3D scene client 170 extracts 3D model M from the 3D scene cache 190 (path 192).
Step (VI): once a download completes (or the model was downloaded earlier), the 3D scene client 170 can extract 3D model M; the 3D scene client 170 then sends a "3D model is ready on client (3D model on a user device)" message to the 3D scene server 120 (path 113), and the 3D scene server 120 forwards this message to the 3D scene transmitter 110 (path 114).
Step (VII): upon receiving this message, the 3D scene transmitter 110 changes the state of 3D model M from "Loading" to "Ready for Client".
Step (VIII): at the next rendering, the 3D scene transmitter 110 knows that 3D model M has been preloaded onto the user device, so it asks the 3D scene client 170 to render it itself; server 1 therefore no longer needs to render 3D model M.
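Steps (I)–(VIII) above amount to a cache-then-download-then-notify loop on the client. The following minimal sketch illustrates that flow; the class and method names, and the shape of the server object, are illustrative assumptions rather than anything defined in the patent.

```python
class SceneClient:
    """Sketch of steps (I)-(VIII): check the local cache (3D scene cache 190),
    download a missing model from the scene server, store it, and report
    readiness so the server can flip the model's state from 'Loading' to
    'Ready for Client'. All names here are illustrative."""

    def __init__(self, cache, server):
        self.cache = cache      # dict: model name -> model data (the cache 190)
        self.server = server    # assumed API: download(name), set_state(name, state)

    def prepare_model(self, name):
        if name not in self.cache:                        # steps (I)-(III)
            self.cache[name] = self.server.download(name)  # step (IV): store
        model = self.cache[name]                           # step (V): extract
        # steps (VI)-(VII): notify the server, which marks the model ready
        self.server.set_state(name, "Ready for Client")
        return model
```

A second call to `prepare_model` for the same model hits the cache and skips the download, which is exactly the point of step (IV).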
If the state [Info 112-E] is provided, the 3D scene client 170 prepares 3D sound S according to the following procedure (similar to the one described above for [Info 112-B]).
Step (I): search the 3D scene cache 190 (path 174); the 3D scene cache 190 contains the 3D audio database previously downloaded and stored on user devices 21, 22, 23.
Step (II): if the 3D sound is already in the 3D scene cache 190, go to step (V).
Step (III): if the 3D sound is not in the 3D scene cache 190, the 3D scene client 170 issues a download request to the 3D scene server 120 (path 172), and the 3D scene server 120 returns the data of the 3D sound to the 3D scene client 170 (path 124).
Step (IV): once a 3D sound has been fully downloaded, the 3D scene client 170 stores it into the 3D scene cache 190 (path 194), so that the next time it is needed it does not have to be downloaded again.
Step (V): the 3D scene client 170 extracts 3D sound S from the 3D scene cache 190 (path 192).
Step (VI): once a download completes (or the sound was downloaded earlier), the 3D scene client 170 can extract 3D sound S; the 3D scene client 170 then sends a "3D sound is ready on client (3D sound on a user device)" message to the 3D scene server 120 (path 113), and the 3D scene server 120 forwards this message to the 3D scene transmitter 110 (path 114).
Step (VII): upon receiving this message, the 3D scene transmitter 110 changes the state of 3D sound S from "Loading" to "Ready for Client".
Step (VIII): at the next rendering, the 3D scene transmitter 110 knows that 3D sound S has been preloaded onto the user device, so it asks the 3D scene client 170 to render (play) it itself; server 1 therefore no longer needs to render 3D sound S.
At the very beginning there are no 3D models or 3D sounds on user devices 21, 22, 23, so the 3D scene transmitter 110 renders all 3D models and 3D sounds and encodes the result into a 2D video stream with stereo audio. The 3D scene transmitter 110 issues the download requests for 3D models [Info 112-B] and 3D sounds [Info 112-E] in order, from the point closest to the 3D projection plane (or the user's eyes) outward, and the 3D scene client 170 downloads each 3D model or 3D sound from the 3D scene server 120, or extracts it one by one from the 3D scene cache 190. As more 3D models and 3D sounds become available to the 3D scene client 170, the 3D scene transmitter 110 automatically notifies the 3D scene client 170 to render these models and sounds itself, reducing the number of 3D models and 3D sounds rendered by the 3D scene transmitter 110. The 3D models and 3D sounds in the encoded 2D video stream thereby become fewer and fewer, until finally the 3D scene client 170 has obtained all of them; at that stage only a silent black screen remains. In other words, server 1 no longer needs to transmit the 2D video stream to user devices 21, 22, 23, and the communication bandwidth consumed between server 1 and user devices 21, 22, 23 is greatly reduced.
In the present invention, when a new 3D model N appears in the scene, the 3D scene transmitter 110 (1) notifies the 3D scene client 170 to render itself only the 3D models in front of this new 3D model N (relative to the user's eyes), (2) notifies the 3D scene client 170 to download this new 3D model N, and (3) renders this new 3D model N together with all models behind it, encodes the result into a 2D video stream with audio, and sends this 2D video stream with audio to the 3D scene client 170. The 3D scene client 170 can thus keep producing the rendered image and audio of application program 100 continuously until 3D model N is ready on the user device.
When a new 3D sound T appears in the scene, the 3D scene transmitter 110 (1) notifies the 3D scene client 170 to download this new 3D sound T, and (2) renders this new 3D sound T, encodes the result as stereo audio, and sends this stereo audio together with the 2D video stream to the 3D scene client 170; the 3D scene client 170 keeps producing the rendered image and audio of application program 100 until 3D sound T is ready on the user device. In this procedure, only the new 3D sound T is rendered; the 3D scene transmitter 110 does not need to re-render all the 3D sounds behind T. This is because sound is inherently different from imagery: an image can occlude the images behind it, but a sound cannot.
Background music can be treated as a 3D sound with a predefined 3D position; so that the background music can be downloaded as early as possible, this predefined 3D position should be as close to the user's eyes as possible.
To reduce server load, or to avoid noise caused by unstable network data transmission, the server may skip encoding all 3D sounds into the video; in that case, a 3D sound begins to be played on the user device only after it has been downloaded and pre-stored there.
As for 3D sound, server 1 examines the state of each 3D sound to determine which 3D sounds need to be encoded into the 2D video stream with stereo audio; the encoding rule is that 3D sounds not pre-stored on the user device are encoded into the video frames. When a 3D sound is encoded as stereo in a video frame, the volumes of the left and right channels are determined by its position and its velocity relative to the user's ears; background music may be defined as a 3D sound at a predefined position.
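The patent states that the left/right channel volumes depend on the sound's position (and relative velocity) but gives no formula. The sketch below shows one conventional way to derive such gains, using constant-power panning with inverse-distance attenuation; the function name, the 2D (x, z) coordinates, and the specific formulas are all illustrative assumptions, not the patent's method.

```python
import math

def stereo_gains(sound_pos, listener_pos, facing_deg=0.0):
    """Derive (left, right) channel gains from the sound's position relative
    to the listener. Constant-power panning by azimuth, plus a simple
    inverse-distance attenuation. Positions are (x, z) pairs; illustrative only."""
    dx = sound_pos[0] - listener_pos[0]
    dz = sound_pos[1] - listener_pos[1]
    azimuth = math.atan2(dx, dz) - math.radians(facing_deg)
    # Map azimuth to a pan value in [-1 (full left), +1 (full right)]
    pan = max(-1.0, min(1.0, math.sin(azimuth)))
    theta = (pan + 1.0) * math.pi / 4.0        # 0 .. pi/2
    left, right = math.cos(theta), math.sin(theta)
    # Attenuate with distance, clamped so nearby sounds are not amplified
    dist = max(1.0, math.hypot(dx, dz))
    return left / dist, right / dist
```

A sound directly ahead yields equal gains on both channels; a sound to the listener's right yields a larger right-channel gain, which matches the behavior the patent describes qualitatively.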
Fig. 3A is a flow chart of an embodiment of the method of the present invention for transmitting media through a network. When transmitting images through the network begins (step 60), an application program is executed on a server and generates a virtual 3D environment containing multiple 3D models (step 61); each 3D model is paired with a state, and the state indicates whether that 3D model is pre-stored on the user device.
The server then checks the states of the 3D models (step 62) to determine which 3D models need to be encoded into a 2D video stream frame; the 3D models not pre-stored on the user device are encoded into the frame. Taking a certain virtual location as the reference (usually the 3D projection plane or the user's eyes), the server examines the state of each 3D model one by one, from near to far. During this examination, when the first 3D model not pre-stored on the user device is found, that 3D model is marked with the NR state; then, regardless of whether the 3D models behind it are pre-stored on the user device, this 3D model M and all 3D models behind it are encoded into the frame (step 63). Whenever the position of any 3D model changes, or the reference virtual location used for sorting changes, the above examination is re-executed, and whether a 3D model must be encoded into the video frame is decided according to the latest result.
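Steps 62–63 can be sketched as a sort followed by a split at the first non-pre-stored model. The field names (`dist`, `prestored`) and the dictionary representation are illustrative assumptions; distances are taken as precomputed from the reference virtual location.

```python
def split_render_sets(models):
    """Sort models near-to-far from the reference virtual location (the 3D
    projection plane or the user's eyes), then split at the first model not
    pre-stored on the user device (the 'NR' model): that model and everything
    behind it are encoded into the frame by the server; everything in front
    of it is rendered by the user device."""
    ordered = sorted(models, key=lambda m: m["dist"])
    for i, m in enumerate(ordered):
        if not m["prestored"]:
            return ordered[:i], ordered[i:]   # (client-rendered, server-encoded)
    return ordered, []                        # everything pre-stored: client renders all
```

Run on the scenario of Fig. 5B (A and B pre-stored, C and D not), this yields client set {A, B} and server set {C, D}; inserting a non-pre-stored object E in front of B, as in Fig. 5C, moves B back into the server set even though B is pre-stored.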
Step 64: after the 2D video stream frame is encoded, the server sends this 2D video stream frame and the 3D models not pre-stored on the user device (that is, the 3D model with the NR state and all 3D models behind it) to the user device in a predetermined order, namely the order from a point closest to the 3D projection plane (or the user's eyes) to a point farthest from the 3D projection plane. A user device receives the 2D video stream frame (step 65), decodes the frame transmitted from the server, and uses this frame as the background for rendering the 3D models that are pre-stored on the user device but not included in the frame, to produce the mixed frames of an output video stream with audio (step 66). When the user device receives a 3D model transmitted by the server, it stores the 3D model and then sends a message to the server, notifying the server to change the state of the 3D model to "now pre-stored on the user device"; thereafter, the user device mixes the video stream transmitted by the server with its own rendering result and outputs the combination as the new video.
In step 62, when a new 3D model appears in the 3D environment, the new 3D model and all 3D models behind it are encoded into the frame, regardless of whether the models behind it are pre-stored on the user device.
In step 64, the server also sends the status information (or annotation data) of the 3D models that are not encoded into the video stream frame to the user device. The user device receives and examines the status information as follows: if any 3D model in the received status information is not pre-stored on the user device, the user device sends a request to the server to download that 3D model (step 661). The status information contains the annotation data of each 3D model not encoded into the frame; each annotation includes a name, a position, a velocity, an orientation, and an attribute of the 3D model, as well as the state of each 3D model.
Fig. 3B is a flow chart of another embodiment of the method of the present invention for transmitting media through a network. When transmitting sound through the network begins (step 60a), an application program is executed on a server and generates a virtual 3D environment containing multiple 3D sounds (step 61a); each 3D sound is paired with a state, and the state indicates whether that 3D sound is pre-stored on the user device.
The server then checks the states of the 3D sounds (step 62a) to determine which 3D sounds need to be encoded into the 2D video stream frame; the 3D sounds not pre-stored on the user device are encoded into the frame. Taking a certain virtual location as the reference (usually the 3D projection plane or the user's eyes), the server examines the state of each 3D sound one by one, from near to far; during this examination, when the first 3D sound not pre-stored on the user device is found, that 3D sound is marked with the NR state.
Step 64a: after the video stream frame containing sound is encoded, the server sends this 2D video stream frame with audio and the 3D sounds not pre-stored on the user device (that is, the 3D sounds with the NR state) to the user device in a predetermined order, namely the order from a point closest to the 3D projection plane (or the user's eyes) to a point farthest from the 3D projection plane. After a user device receives the video stream frame containing sound (step 65a), the user device decodes the audio (that is, the sound) contained in the video stream and uses this audio as the background for rendering the 3D sounds that are pre-stored on the user device but not included in the video stream frame, to produce a mixed audio (step 66a). When the user device receives a 3D sound transmitted by the server, it stores the 3D sound and then sends a message to the server, notifying the server to change the state of the 3D sound to "now pre-stored on the user device"; thereafter, the user device mixes the audio in the video stream transmitted by the server with the result of the 3D sounds it renders (plays) itself, and outputs the combination as the new audio.
In step 62a, when a new 3D sound appears in the 3D environment, the new 3D sound is encoded into the 2D video stream frame with audio; however, this new 3D sound does not affect whether other 3D sounds are rendered. In this respect it differs from the 3D models in step 62 above.
In step 64, the server also sends the status information of the 3D sounds that are not encoded into the frame to the user device. The user device receives and examines the status information as follows: if any 3D sound in the received status information is not pre-stored on the user device, the user device sends a request to the server to download the 3D sound (step 661a). The status information contains the annotation data of each 3D sound not encoded into the frame; each annotation includes a name, a position, a velocity, an orientation, and an attribute of the 3D sound, as well as the state of each 3D sound.
Figs. 4A, 4B, and 4C are schematic diagrams of an embodiment showing how the method of the present invention transmits a video stream and 3D models.
As shown in Fig. 4A, when user device 74 initially logs into the application program 70 running on the server, no 3D model is pre-stored on the user device. The server therefore renders all the 3D models that should be displayed on the screen of the user device (including a person 71 and a house 72 behind him), encodes the rendering result into a 2D video stream frame 73, and then sends this frame 73 to user device 74. At this stage, frame 73 contains both the person 71 and the house 72, and user device 74 simply outputs frame 73 without rendering anything else.
Then, as shown in Fig. 4B, server 70 begins sending 3D models to the user device, starting with the 3D model closest to the 3D projection plane of the user device's screen. In this embodiment, compared with the house 72, the person 71 is closer to the 3D projection plane (or the user's eyes), so the 3D model of the person 71 is sent to user device 74 first. Once the 3D model of the person 71 has been transmitted and stored on user device 74, user device 74 sends a message to server 70 announcing that the 3D model of the person 71 is now pre-stored on user device 74. Afterwards, server 70 renders the house 72, encodes the rendering result into a 2D video stream frame 73a, and transmits this frame 73a together with the annotation data 71a of the person to user device 74; user device 74 immediately renders the person from the annotation data, then combines its rendering of the person with frame 73a (containing the house), to obtain the same output result. This procedure (the server sending 3D models to user device 74 one at a time) is repeated until all the 3D models the client needs to display have been transmitted and pre-stored on user device 74.
As shown in Fig. 4C, once user device 74 possesses all the 3D models (including those of the person and the house), the server no longer needs to perform rendering operations or transmit video stream frames (component 75); it only needs to transmit the annotation data of the 3D models (including person 71a and house 72a) to user device 74. The user device can then render all the 3D models itself, obtaining the same output result.
Figs. 6A, 6B, and 6C are schematic diagrams of an embodiment showing how the method of the present invention transmits a video stream with audio and 3D sounds.
As shown in Fig. 6A, when user device 74 initially logs into the application program 70 running on the server, no 3D sound is pre-stored on the user device. The server therefore renders all the 3D sounds that should be played on the loudspeaker of the user device (including a sound 81 and a sound 82 behind it), encodes the rendering result into a video stream frame 83 with audio, and then sends this frame 83 with audio to user device 74. At this stage, the video stream frame 83 with audio contains both sound 81 and sound 82, and user device 74 simply outputs this frame 83 without rendering (playing) any other sound.
Then, as shown in Fig. 6B, server 70 begins sending 3D sounds to the user device, starting with the 3D sound closest to the 3D projection plane of the user device's screen. In this embodiment, compared with sound 82, sound 81 is closer to the 3D projection plane (or the user's eyes), so the 3D sound of sound 81 is sent to user device 74 first. Once the 3D sound of sound 81 has been transmitted and stored on user device 74, user device 74 sends a message to server 70 announcing that sound 81 is now pre-stored on user device 74. Afterwards, server 70 renders sound 82, encodes the rendering result into a 2D video stream frame 83a with audio, and transmits this frame 83a together with the annotation data of sound 81 to user device 74; user device 74 immediately renders (plays) the sound from the annotation data, then combines its rendering of the sound with frame 83a (containing sound 82), to obtain the same output result. This procedure (the server sending 3D sounds to user device 74 one at a time) is repeated until all the 3D sounds that need to be played on the loudspeaker of the user device have been transmitted and pre-stored on user device 74.
As shown in Fig. 6C, once user device 74 possesses all the 3D sounds (including sound 81 and sound 82), the server no longer needs to perform rendering operations, so the video stream frames (component 85) contain only images and no audio; the server only needs to transmit the annotation data of the 3D sounds (including sound 81a and sound 82a) to user device 74. The user device can then render (play) all the 3D sounds itself, obtaining the same output result.
Figs. 5A, 5B, and 5C are schematic diagrams of an embodiment showing how the method of the present invention determines which 3D models must be encoded into the frame.
In the present invention, the server sorts all 3D models to be rendered in a predetermined order, namely from near to far relative to a virtual location (such as the 3D projection plane 52 of the user device's screen, or the user's eyes 51). As shown in Fig. 5A, four objects A, B, C, and D need to be displayed on the screen of the user device, with object A closest to projection plane 52, followed in order by objects B, C, and D. When the user device initially logs into the application program running on the server, no 3D model is pre-stored on the user device; the server therefore renders all of objects A, B, C, and D, encodes the rendering result into a video stream frame, and sends this frame to the user device. Meanwhile, the server begins sending out the 3D models of objects A, B, C, and D one by one in the predetermined order, that is, the 3D model of object A is transmitted first, followed in turn by objects B, C, and D, until all the 3D models to be displayed on the user device have been transmitted.
As shown in Fig. 5B, after the 3D models of objects A and B are both pre-stored on the user device, when the server examines the states of the 3D models in the aforementioned near-to-far order, it finds that object C is the first object not pre-stored on the user device. The server therefore renders object C and every other object behind it (such as object D), regardless of whether the 3D model of object D is pre-stored on the user device. At this point the server does not render the 3D models of objects A and B, because objects A and B, which lie in front of object C, are both already pre-stored on the user device.
As shown in Fig. 5C, when a new object E appears in the virtual 3D environment created by the application program, object E and every object behind it are rendered by the server, regardless of whether those objects are pre-stored on the user device. For example, as shown in Fig. 5C, compared with objects B, C, and D, the new object E is closer to the 3D projection plane 52; although the 3D model of object B is already pre-stored on the user device, object B appears behind the new object E, so the server renders all of objects E, C, B, and D, even if object B may be only partially occluded by the objects in front of it.
Figs. 7A, 7B, and 7C are schematic diagrams of an embodiment showing how the method of the present invention determines which 3D sounds must be encoded into the video stream frame with audio.
In the present invention, the server sorts all 3D sounds to be rendered in a predetermined order, namely from near to far relative to a virtual location (such as the 3D projection plane 52 of the user device's screen, or the user's eyes 51). As shown in Fig. 7A, four 3D sounds A, B, C, and D need to be played on the loudspeaker of the user device, with sound A closest to projection plane 52, followed in order by sounds B, C, and D. When the user device initially logs into the application program running on the server, no 3D sound is pre-stored on the user device; the server therefore renders all of sounds A, B, C, and D, encodes the rendering result into a video stream frame with audio, and sends this frame to the user device. Meanwhile, the server begins transmitting the data of sounds A, B, C, and D one by one in the predetermined order, that is, the 3D sound of sound A is transmitted first, followed in turn by sounds B, C, and D, until all the 3D sounds have been stored on the user device.
As shown in Fig. 7B, after the 3D sounds of A and B are both pre-stored on the user device, when the server examines the states of the 3D sounds in the aforementioned near-to-far order, it finds that sound C is the first sound not pre-stored on the user device. The server therefore renders sound C and every other sound behind it (such as sound D), but does not render the 3D sounds of A and B, because at this stage sounds A and B are already pre-stored on the user device.
As shown in Fig. 7C, when a new sound E is added to the virtual 3D environment created by the application program, sound E is rendered by the server, but this rendering does not affect the rendering of the other sounds, unlike the handling of the 3D models in Fig. 5C. As shown in Fig. 7C, compared with sounds B, C, and D, the new sound E is closer to the 3D projection plane 52; yet unlike the 3D models in Fig. 5C, the sounds pre-stored on the user device (such as sounds A and B) are still rendered by the user device, while the sounds not pre-stored on the user device (such as sounds E, C, and D) are rendered by the server.
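Because a sound cannot occlude the sounds behind it, the selection rule for sounds reduces to a simple partition by the pre-stored flag, independent of depth order, in contrast with the model rule of Figs. 5B–5C. The sketch below shows this; the field names are illustrative assumptions.

```python
def sounds_to_encode(sounds):
    """Partition 3D sounds for rendering: sounds pre-stored on the user device
    are played by the device itself, and only the non-pre-stored sounds are
    rendered by the server and encoded into the video stream with audio.
    Unlike 3D models, a new nearby sound does not force pre-stored sounds
    behind it back onto the server (Figs. 7B-7C)."""
    client_set = [s for s in sounds if s["prestored"]]
    server_set = [s for s in sounds if not s["prestored"]]
    return client_set, server_set
```

On the Fig. 7C scenario (A and B pre-stored; E, C, D not), the client set is {A, B} and the server set is {E, C, D}, even though E is the closest sound.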
The above technique of the present invention can also be applied to a virtual-reality (VR) scene system, in which the 3D models and the VR video stream produced by a VR scene application program executed on the server are transmitted through the network to the user device, as described below.
In order to provide a VR visual experience for the human eye, a virtual VR scene must contain one image intended for viewing by the human left eye and another image intended for viewing by the human right eye. Fig. 8 is a schematic diagram of a first embodiment of the system architecture of the VR scene system of the present invention.
The VR scene server 1120 in the present invention is a piece of server computer software implemented on a server 1 that runs a VR scene application program 1100 (also called the VR application program or application program), which generates a virtual VR 3D environment containing multiple 3D models. The VR scene application program 1100, usually a VR game, also runs on server 1. The VR scene server 1120 is a server program executed on server 1, like the application program 1100, and acts as a relay station for information passed between the VR scene transmitter 1110 on server 1 and the VR scene clients 1170 of user devices 21, 22, 23. The VR scene server 1120 simultaneously serves as a file download server, from which the VR scene clients 1170 of user devices 21, 22, 23 download the necessary 3D models on server 1. The VR scene transmitter 1110 is a library, statically linked at compile time of the VR scene application program 1100 or dynamically linked while the VR scene application program 1100 executes. The VR scene client (program) 1170 is a program executed on user devices 21, 22, 23 to generate and output on the user device the 3D image rendering results produced by the VR scene application program 1100. In this embodiment, each user device 21, 22, 23 corresponds to its own VR scene application program 1100 and VR scene transmitter 1110. The VR scene transmitter 1110 stores an inventory listing all 3D models and, for each 3D model, a state indicating whether it has been stored on the user device; this state is one of (1) "Not Ready", (2) "Loading", and (3) "Ready for Client".
Server 1 checks these 3D model states to determine which 3D models need to be encoded into a left-eye frame of a 2D video stream and which need to be encoded into a right-eye frame of the 2D video stream; in the present invention, the 3D models that have not been stored on user devices 21, 22, 23 are encoded into both the left-eye frame and the right-eye frame. To achieve this, the main program of the VR scene application program 1100 sends the VR scene information to the VR scene transmitter 1110 through API calls into the library (path 1101 of Fig. 8); this VR scene information includes the name, position, velocity, attributes, orientation, and all other data required to render the 3D models. After the VR scene transmitter 1110 receives such data, it executes the following procedure.
Step (a): for all 3D models, sort the 3D models that need to be rendered in the left-eye frame, from near to far relative to a virtual location (such as the 3D projection plane, or the user's left eye).
Step (b): for the 3D models, find, starting from the closest point (closest to the user's left eye), the first 3D model "M" that does not have the "Ready for Client" state; in other words, the state of this first 3D model "M" is "Not Ready" (hereafter, the "Not Ready" state is abbreviated as the NR state). Of course, there may be no such 3D model at all (for example, when every displayed 3D model is marked "Ready for Client").
Step (c): for the 3D models, render 3D model M and all 3D models behind it by server 1, that is, all 3D models farther from the left eye than M (if there is no 3D model M, a blank screen is shown); the encoded rendering result is the left-eye frame of a 2D video stream, supplied for viewing by the user's left eye.
Step (d): repeat the above steps (a) to (c) for the right-eye frame, that is, perform the operations described for the left eye in steps (a) to (c) with respect to the right eye instead, thereby generating another frame of another 2D video stream, the right-eye frame, supplied for viewing by the user's right eye.
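Steps (a)–(d) run the same near-to-far scan once per eye, since the two eye positions can in principle produce different orderings. The sketch below illustrates this per-eye split; the data shapes and names are illustrative assumptions.

```python
def first_nr_split(models, eye_pos):
    """Steps (a)-(c) for one eye: sort near-to-far from the given eye position
    and split at the first model whose state is not 'Ready for Client' (the
    'NR' model M). Models from M onward are server-rendered into this eye's
    frame; models in front of M are client-rendered."""
    ex, ey, ez = eye_pos
    def dist(m):
        x, y, z = m["pos"]
        return ((x - ex) ** 2 + (y - ey) ** 2 + (z - ez) ** 2) ** 0.5
    ordered = sorted(models, key=dist)
    for i, m in enumerate(ordered):
        if m["state"] != "Ready for Client":
            return ordered[:i], ordered[i:]
    return ordered, []   # no NR model: the client renders everything

def per_eye_render_sets(models, left_eye, right_eye):
    """Step (d): apply the same procedure independently for each eye."""
    result = {}
    for name, eye in (("left", left_eye), ("right", right_eye)):
        client, server = first_nr_split(models, eye)
        result[name] = {"client": client, "server": server}
    return result
```

Each eye's server set becomes the content of that eye's encoded frame ([Info 1112-C] for the left, [Info 1113-C] for the right).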
Step (e): transmit the following three pieces of information for the left-eye frame to the VR scene server 1120 (path 1112): [Info 1112-A], [Info 1112-B], and [Info 1112-C]; and transmit the following three pieces of information for the right-eye frame to the VR scene server 1120 (path 1113): [Info 1113-A], [Info 1113-B], and [Info 1113-C].
Step (f): the data packager 121 in the VR scene server 1120 packs the left and right information ([Info 1112-A], [Info 1112-B], [Info 1112-C], [Info 1113-A], [Info 1113-B], and [Info 1113-C]) into a message packet.
Step (g): the VR scene server 1120 sends the message packet generated in step (f) to the VR scene clients 1170 in user devices 21, 22, 23 (path 1122).
[Info 1112-A] is the status information (or annotation data) of all 3D models in front of 3D model M. Note that there may be no such model. These models all have the "Ready for Client" state, meaning they have been preloaded onto the user device, and the VR scene client (program) 1170 on user devices 21, 22, 23 can render them itself. To reduce data transfer bandwidth, the VR scene transmitter 1110 need not transmit the complete status information; it is sufficient to transmit only the difference between this rendering and the previous one.
[Info 1112-B]: if the server finds 3D model M and its state of being pre-stored on the user device is "Not Ready", the server changes the state to "Loading" and issues a download instruction for 3D model M, asking the user device to download it; if the state is already "Loading", no instruction is issued, because the download instruction has already been sent.
[Info 1112-C] is the video streaming shadow lattice of the left eye after the coding in step (c), that is, left eye shadow lattice.
[Info 1113-A], [Info 1113-B] and [Info 1113-C] is substantially substantially identical in [Info respectively 1112-A], [Info 1112-B] and [Info 1112-C], only [Info 1113-A], [Info 1113-B] and [Info 1113-C] it is about right eye shadow lattice.
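As a rough illustration, the six pieces of information above can be thought of as a single packet carrying one bundle per eye. The sketch below is a hypothetical Python rendering of that packet; the class and field names (`EyeInfo`, `states`, `download_requests`, `encoded_frame`) are illustrative assumptions, not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class EyeInfo:
    states: dict             # [Info -A]: model name -> state, sent as a delta
    download_requests: list  # [Info -B]: models the client must download
    encoded_frame: bytes     # [Info -C]: encoded video frame for this eye

@dataclass
class MessagePacket:
    left: EyeInfo   # [Info 1112-A/B/C]
    right: EyeInfo  # [Info 1113-A/B/C]

packet = MessagePacket(
    left=EyeInfo({"M": "Loading"}, ["M"], b"<left frame>"),
    right=EyeInfo({"M": "Loading"}, ["M"], b"<right frame>"),
)
print(packet.left.download_requests)  # ['M']
```

The data packetizer 121 would serialize such a structure into one message packet before it is sent along path 1122.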
Whenever the main program of a VR scene application 1100 updates new VR scene data to the VR scene transmitter 1110, steps (a) through (g) are repeated; in general, the main program of the VR scene application 1100 updates such data once per rendering cycle.
After the VR scene client 1170 receives the aforementioned data, it carries out the rendering procedure described below.
Step (i): Decode the video frames in [Info 1112-C] and [Info 1113-C] (i.e., both the left-eye frame and the right-eye frame) and send the two frames to the frame combiner 1171.
Step (ii): The frame combiner 1171 merges the two frames (the left-eye frame 1711 and the right-eye frame 1712) into one combined VR frame 1713 (see Figure 9), which serves as the background for the subsequent 3D model rendering.
Step (iii): Render, on the combined VR frame produced in step (ii), all of the 3D models listed in [Info 1112-A] and [Info 1113-A]. To reduce network bandwidth usage, the VR scene client 1170 stores the [Info 1112-A] and [Info 1113-A] information in memory, so that next time the VR scene transmitter 1110 need only transmit the difference in [Info 1112-A] and [Info 1113-A] state between the next rendering and this rendering, rather than the complete state information.
Step (iv): Output the rendering result of step (iii) as a mixed VR frame in an output video stream containing the VR scene, i.e., the final output video stream (path 1176). In this embodiment, the user device is an electronic device shaped like eyeglasses or a helmet, comprising two display screens located in front of the user's left and right eyes; the left screen displays the image (frame) viewed by the user's left eye, and the right screen displays the image (frame) viewed by the user's right eye. The mixed VR frames in the output video stream are played on the two screens of the user device as follows: in each mixed VR frame, the left-half pixels of every row are shown on the left-eye screen, and the right-half pixels of every row are shown on the right-eye screen, thereby providing the user with a virtual-reality (VR) visual experience.
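The column-wise split described above can be sketched as follows; modeling a frame as a plain list of pixel rows is an illustrative simplification of the actual video frames.

```python
def split_mixed_frame(frame):
    """Split a side-by-side mixed VR frame (list of pixel rows) into the
    left-eye and right-eye images, as the two-screen headset would."""
    half = len(frame[0]) // 2
    left_eye = [row[:half] for row in frame]
    right_eye = [row[half:] for row in frame]
    return left_eye, right_eye

# A 2x4 "frame": left-eye pixels on the left half of each row,
# right-eye pixels on the right half.
mixed = [["L1", "L2", "R1", "R2"],
         ["L3", "L4", "R3", "R4"]]
left, right = split_mixed_frame(mixed)
print(left)   # [['L1', 'L2'], ['L3', 'L4']]
print(right)  # [['R1', 'R2'], ['R3', 'R4']]
```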
If the states in [Info 1112-B] and [Info 1113-B] indicate that 3D model M needs to be prepared by the VR scene client 1170, the VR scene client 1170 processes 3D model M according to the following procedure.
Step (I): Search the VR scene cache 1190 (path 1174); the VR scene cache 1190 contains the database of 3D models previously downloaded to and stored in the user devices 21, 22, 23.
Step (II): If 3D model M is already in the VR scene cache 1190, jump directly to step (V).
Step (III): If 3D model M is not in the VR scene cache 1190, the VR scene client 1170 issues a download request to the VR scene server 1120 (path 1172), and the VR scene server 1120 returns the data of 3D model M to the VR scene client 1170 (path 1124).
Step (IV): After a 3D model has been completely downloaded, the VR scene client 1170 stores it in the VR scene cache 1190 (path 1194), so that the next time the model is needed it does not have to be downloaded again.
Step (V): The VR scene client 1170 fetches 3D model M from the VR scene cache 1190 (path 1192).
Step (VI): Once a download is complete (or had completed earlier), i.e., the VR scene client 1170 can fetch 3D model M, the VR scene client 1170 sends a "3D model is ready on client (the 3D model is on the user device)" message to the VR scene server 1120 (path 1115), and the VR scene server 1120 forwards this message to the VR scene transmitter 1110 (path 1114).
Step (VII): After the VR scene transmitter 1110 receives this message, it changes the state of 3D model M from "Loading" to "Ready for Client".
Step (VIII): At the next rendering, the VR scene transmitter 1110 knows that 3D model M has been preloaded onto the user device and therefore asks the VR scene client 1170 to render it by itself; accordingly, the server 1 no longer needs to render this 3D model M.
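Steps (I) through (VII) amount to a cache-then-download lookup followed by a readiness report. A minimal sketch, in which all components are stood in by hypothetical callables rather than the patent's actual interfaces:

```python
def obtain_model(model_name, cache, download_from_server, notify_server):
    """Sketch of steps (I)-(VII): check the VR scene cache, download on a
    miss, store the result, and report readiness back to the server."""
    if model_name not in cache:                               # steps (I)-(III)
        cache[model_name] = download_from_server(model_name)  # step (IV)
    model = cache[model_name]                                 # step (V)
    notify_server(model_name)                                 # steps (VI)-(VII)
    return model

ready_reports = []
cache = {"A": "<model A>"}   # model A was downloaded earlier
model = obtain_model("M", cache,
                     download_from_server=lambda n: f"<model {n}>",
                     notify_server=ready_reports.append)
print(model)          # <model M>
print(ready_reports)  # ['M'] -- server told to mark M "Ready for Client"
```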
At the very beginning there is no 3D model in the user devices 21, 22, 23, so the VR scene transmitter 1110 renders all of the 3D models and encodes the result into a 2D video stream comprising the left-eye and right-eye frames. The VR scene transmitter 1110 issues the download requests [Info 1112-B] and [Info 1113-B] for the 3D models starting from the one closest to the 3D projection plane (or to the user's left or right eye). The VR scene client 1170 downloads each 3D model from the VR scene server 1120, or fetches it from the VR scene cache 1190, one by one. As more 3D models become available to the VR scene client 1170, the VR scene transmitter 1110 notifies the VR scene client 1170 to render these models and their sounds by itself, and reduces the number of 3D models rendered by the VR scene transmitter 1110. In this way, fewer and fewer 3D models are encoded in the 2D video stream comprising the left-eye and right-eye frames, until finally the VR scene client 1170 can obtain all of the 3D models. From that stage on, only a black screen remains to be encoded by the server 1; in other words, the server 1 no longer needs to transmit the 2D video stream to the user devices 21, 22, 23, and the communication bandwidth occupied between the server 1 and the user devices 21, 22, 23 is therefore greatly reduced.
When a new 3D model N appears in the VR scene, the VR scene transmitter 1110 (1) notifies the VR scene client 1170 to render only the 3D models located in front of this new 3D model N (relative to the user's left or right eye), (2) notifies the VR scene client 1170 to download this new 3D model N, and (3) renders this new 3D model N together with all of the models located behind it and encodes the result into a 2D video stream comprising a left-eye frame and a right-eye frame, which is then sent to the VR scene client 1170. Thus, even before 3D model N is ready on the user device, the VR scene client 1170 can continue to present the 3D rendering result of the VR scene application 1100.
Figure 10 is a schematic diagram of a second embodiment of the system architecture of the VR scene system of the present invention. In the second embodiment shown in Figure 10, most of the components and functions are substantially the same as, or similar to, those of the first embodiment of Figure 8, except that the frame combiner 1111 is located in the VR scene transmitter 1110 rather than in the VR scene client 1170; therefore, components in Figure 10 that are identical or similar to those of Figure 8 are given the same reference numerals as in Figure 8, and their details are not repeated.
As shown in Figure 10, the main program of the VR scene application 1100 sends the VR scene information to the VR scene transmitter 1110 by calling the API of a linked library; this VR scene information includes the name, position, velocity, attributes, orientation, and all other data needed to render each 3D model. After the VR scene transmitter 1110 receives such data, it executes the following procedure.
Step (a): Sort all of the 3D models that need to be rendered in the left-eye frame; the sorting may be from nearest to farthest relative to a virtual position (such as the 3D projection plane or the user's left eye).
Step (b): Among the 3D models, find, starting from the nearest point (closest to the user's left eye), the first 3D model "M" that does not have the "Ready for Client" state; in other words, the state of this first 3D model "M" is the "Not Ready" state (hereinafter, the "Not Ready" state is referred to as the NR state). Of course, no such 3D model may exist.
Step (c): Render 3D model "M" and all subsequent 3D models in the server 1 (or, if no such 3D model "M" exists, directly generate a blank screen), and store the result in memory.
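Steps (a) and (b) can be sketched as a sort by distance followed by a scan for the first model that is not "Ready for Client"; everything from that model onward is rendered by the server. The helper names and the Euclidean distance metric below are assumptions made for illustration only.

```python
def distance(p, q):
    """Euclidean distance between two 3D points (illustrative metric)."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def server_render_set(models, eye_position, state):
    """Sort models nearest-to-farthest from the viewpoint (step (a)), then
    find the first model M whose state is not 'Ready for Client' (step (b));
    the server renders M and every model behind it (step (c))."""
    ordered = sorted(models, key=lambda m: distance(m["pos"], eye_position))
    names = [m["name"] for m in ordered]
    for i, name in enumerate(names):
        if state.get(name) != "Ready for Client":
            return names[i:]   # model M and all subsequent models
    return []                  # no such M: nothing to render server-side

models = [{"name": "far",  "pos": (0, 0, 9)},
          {"name": "near", "pos": (0, 0, 1)},
          {"name": "mid",  "pos": (0, 0, 5)}]
state = {"near": "Ready for Client", "mid": "Not Ready",
         "far": "Ready for Client"}
print(server_render_set(models, (0, 0, 0), state))  # ['mid', 'far']
```

Note that "far", although already on the client, is still rendered by the server because it sits behind the not-ready model "mid", matching the rule in step (c).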
Step (d): Steps (a) to (c) above are repeated for the right-eye frame; that is, every operation described for the left eye in steps (a) to (c) is performed for the right eye instead, thereby generating a right-eye frame, which is provided for viewing by the user's right eye.
Step (e): The frame combiner 1111 merges the rendered left-eye frame with the right-eye frame into one combined VR frame of a 2D video stream.
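The merge performed by the frame combiner in step (e) can be sketched as a side-by-side join of pixel rows; modeling frames as lists of rows is an illustrative simplification of the actual video frames.

```python
def combine_frames(left, right):
    """Sketch of the frame combiner: join the left-eye and right-eye
    frames (lists of pixel rows) side by side into one combined VR frame."""
    return [l_row + r_row for l_row, r_row in zip(left, right)]

left = [["L1", "L2"], ["L3", "L4"]]
right = [["R1", "R2"], ["R3", "R4"]]
print(combine_frames(left, right))
# [['L1', 'L2', 'R1', 'R2'], ['L3', 'L4', 'R3', 'R4']]
```

Performing this join on the server halves the number of frames the client must decode, at the cost of fixing the side-by-side layout before transmission.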
Step (f): For the left-eye and right-eye frames, the following three pieces of information are transmitted to the VR scene server 1120 (path 1112): [Info 1112-A], [Info 1112-B], and [Info 1112-C]; the VR scene server 1120 then sends them to the VR scene client 1170 in the user devices 21, 22, 23 (path 1122).
[Info 1112-A] is the state information (or annotation data) of all 3D models located in front of 3D model M; note that no such model may exist. Every model in this class has the "Ready for Client" state, meaning that it has already been preloaded onto the user device, so the VR scene client (program) 1170 running on the user devices 21, 22, 23 can render these models by itself. To reduce the data transfer bandwidth, the VR scene transmitter 1110 need not transmit the complete state information; it suffices to transmit only the difference in state information between this rendering and the previous rendering.
[Info 1112-B]: if the server finds that the state of 3D model M as recorded for the user device is "Not Ready", the server changes that state to "Loading" and issues a download instruction for 3D model M, asking the user device to download it; if the state is already "Loading", no instruction is issued, because the download instruction has already been sent.
[Info 1112-C] is the combined VR frame of the video stream that was rendered in step (e) and that contains both the left-eye frame and the right-eye frame.
After the VR scene client 1170 receives the aforementioned data, it carries out the rendering procedure described below.
Step (i): Decode the combined VR frame in [Info 1112-C] and use it as the background for the subsequent 3D model rendering.
Step (ii): Render all of the 3D models in [Info 1112-A] on the combined VR frame. To reduce network bandwidth usage, the VR scene client 1170 stores the [Info 1112-A] information in memory, so that next time the VR scene transmitter 1110 need only transmit the difference in [Info 1112-A] state between the next rendering and this rendering, rather than the complete state information.
Step (iii): Output the rendering result of step (ii) as a mixed VR frame in an output video stream containing the VR scene, i.e., the final output video stream (path 1176).
Figure 11 is a schematic diagram of a third embodiment of the system architecture of the VR scene system of the present invention. In the third embodiment shown in Figure 11, most of the components and functions are substantially the same as, or similar to, those of the first embodiment of Figure 8, except that this third embodiment no longer has a frame combiner; therefore, components in Figure 11 that are identical or similar to those of Figure 8 are given the same reference numerals as in Figure 8, and their details are not repeated.
As shown in Figure 11, the VR scene server 1120 is server computer software implemented on the same server 1 as the VR scene application 1100, which generates a virtual VR 3D environment containing multiple 3D models. The VR scene server 1120 is a server program executing on the server 1 together with the VR scene application 1100, and serves as the relay station for information transmitted between the VR scene transmitter 1110 and the VR scene clients 1170 of the user devices 21, 22, 23. The VR scene server 1120 also serves as a file download server from which the VR scene clients 1170 of the user devices 21, 22, 23 download the necessary 3D models stored on the server 1. The VR scene transmitter 1110 stores a list of all of the 3D models together with the state of each 3D model indicating whether that model has already been stored in the user device; for each 3D model, this state on the user device is one of (1) "Not Ready", (2) "Loading", and (3) "Ready for Client" (downloaded by the user device).
The server 1 checks the states of these 3D models to determine which 3D models need to be encoded into a left-eye frame of the 2D video stream and which need to be encoded into a right-eye frame of the 2D video stream; in the present invention, all of the 3D models that are not pre-stored in the user devices 21, 22, 23 are encoded into both the left-eye frame and the right-eye frame. To achieve this, the main program of the VR scene application 1100 sends the VR scene information to the VR scene transmitter 1110 by calling the API of a linked library (path 1101 of Figure 11); this VR scene information includes the name, position, velocity, attributes, orientation, and all other data needed to render each 3D model. After the VR scene transmitter 1110 receives such data, it executes the following procedure.
Step (a): Sort all of the 3D models that need to be rendered in the left-eye frame; the sorting may be from nearest to farthest relative to a virtual position (such as the 3D projection plane or the user's left eye).
Step (b): Among the 3D models, find, starting from the nearest point (closest to the user's left eye), the first 3D model "M" that does not have the "Ready for Client" state; in other words, the state of this first 3D model "M" is the "Not Ready" state (hereinafter, the "Not Ready" state is referred to as the NR state). Of course, no such 3D model may exist (for example, when every displayed 3D model is already marked with the "Ready for Client" state).
Step (c): Among the 3D models, 3D model M and all of the 3D models behind it, that is, all 3D models farther from the left eye than M, are rendered by the server 1 (if there is no 3D model M, a blank screen is shown). After encoding, the rendering result is a left-eye frame of a 2D video stream, which is provided for viewing by the user's left eye.
Step (d): Steps (a) to (c) above are repeated for the right-eye frame; that is, every operation described for the left eye in steps (a) to (c) is performed for the right eye instead, thereby generating another frame of another 2D video stream, namely the right-eye frame, which is provided for viewing by the user's right eye.
Step (e): For the left-eye frame, the following three pieces of information are transmitted to the VR scene server 1120 (path 1112): [Info 1112-A], [Info 1112-B], and [Info 1112-C]; likewise, for the right-eye frame, the following three pieces of information are transmitted to the VR scene server 1120 (path 1113): [Info 1113-A], [Info 1113-B], and [Info 1113-C].
Step (f): The data packetizer 121 in the VR scene server 1120 packs the six pieces of information for the left and right eyes ([Info 1112-A], [Info 1112-B], [Info 1112-C], [Info 1113-A], [Info 1113-B], and [Info 1113-C]) into one message packet.
Step (g): The VR scene server 1120 sends the message packet generated in step (f) to the VR scene client 1170 in the user devices 21, 22, 23 (path 1122).
[Info 1112-A] is the state information (or annotation data) of all 3D models located in front of 3D model M; note that no such model may exist. Every model in this class has the "Ready for Client" state, meaning that it has already been preloaded onto the user device, so the VR scene client (program) 1170 running on the user devices 21, 22, 23 can render these models by itself. To reduce the data transfer bandwidth, the VR scene transmitter 1110 need not transmit the complete state information; it suffices to transmit only the difference in state information between this rendering and the previous rendering.
[Info 1112-B]: if the server finds that the state of 3D model M as recorded for the user device is "Not Ready", the server changes that state to "Loading" and issues a download instruction for 3D model M, asking the user device to download it; if the state is already "Loading", no instruction is issued, because the download instruction has already been sent.
[Info 1112-C] is the encoded left-eye frame of the video stream produced in step (c), i.e., the left-eye frame.
[Info 1113-A], [Info 1113-B], and [Info 1113-C] are substantially identical to [Info 1112-A], [Info 1112-B], and [Info 1112-C], respectively, except that they concern the right-eye frame.
Whenever the main program of a VR scene application 1100 updates new VR scene data to the VR scene transmitter 1110, steps (a) through (g) are repeated; in general, the main program of the VR scene application 1100 updates such data once per rendering cycle.
After the VR scene client 1170 receives the aforementioned data, it carries out the rendering procedure described below.
Step (i): Decode the video frames in [Info 1112-C] and [Info 1113-C] (i.e., both the left-eye frame and the right-eye frame) and store the two frames in separate memory spaces.
Step (ii): Render, on the decoded left-eye frame and right-eye frame respectively, all of the 3D models contained in [Info 1112-A] and [Info 1113-A] (if such 3D models exist). To reduce network bandwidth usage, the VR scene client 1170 stores the [Info 1112-A] and [Info 1113-A] information in memory, so that next time the VR scene transmitter 1110 need only transmit the difference in [Info 1112-A] and [Info 1113-A] state between the next rendering and this rendering, rather than the complete state information.
Step (iii): Output the rendering result of step (ii) as a mixed left-eye frame and a mixed right-eye frame in an output video stream containing the VR scene, i.e., the final output video stream (path 1176). The mixed left-eye frame and the mixed right-eye frame can be combined into what was previously referred to as the mixed VR frame.
In this embodiment, the user device is an electronic device shaped like eyeglasses or a helmet, comprising two display screens located in front of the user's left and right eyes; the left screen displays the image (frame) viewed by the user's left eye, and the right screen displays the image (frame) viewed by the user's right eye. The mixed VR frames in the output video stream are played on the two screens of the user device as follows: each mixed left-eye frame of the mixed VR frames is shown on the left-eye screen, and each mixed right-eye frame is shown on the right-eye screen, thereby providing the user with a virtual-reality (VR) visual experience.
In another embodiment, the output video stream is shown on a single screen of the user device, on which the mixed left-eye frames and the mixed right-eye frames are displayed sequentially in alternation. The user may wear an eyeglasses-shaped electronic device that, in correspondence with the mixed left-eye frame or mixed right-eye frame currently shown on the screen, alternately opens and closes its left-eye window and right-eye window in sequence, thereby providing the user with a virtual-reality (VR) visual experience.
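The alternating single-screen scheme can be sketched as follows; the convention that even-indexed frames are left-eye frames is an assumption made for illustration, not specified by the patent.

```python
def shutter_state(frame_index):
    """Sketch of the shutter-glasses scheme: while an even-indexed (left-eye)
    frame is on screen, open the left-eye window; while an odd-indexed
    (right-eye) frame is on screen, open the right-eye window."""
    return "left open" if frame_index % 2 == 0 else "right open"

print([shutter_state(i) for i in range(4)])
# ['left open', 'right open', 'left open', 'right open']
```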
The embodiments described above are not intended to limit the scope within which the present invention may be applied; the scope of protection of the present invention shall be defined by the technical spirit set out in the claims of the present invention and the equivalent variations they encompass. That is, equivalent variations and modifications made within the scope of the claims of the present invention do not depart from the spirit and scope of the invention, and shall all be regarded as further embodiments of the present invention.

Claims (15)

1. A method of transmitting media through a network, the media comprising a plurality of images, characterized in that the method comprises the following steps:
Step (A): executing a virtual-reality application on a server to generate a virtual VR 3D environment containing a plurality of 3D models, each 3D model being accompanied by a state indicating whether that 3D model is pre-stored in a user device;
Step (B): the server checking the states of the plurality of 3D models to determine which 3D models are to be encoded into a left-eye frame and a right-eye frame of a 2D video stream, the encoding being such that the 3D models not pre-stored in the user device are encoded into the left-eye frame and the right-eye frame;
Step (C): the server transmitting at least the left-eye frame and the right-eye frame of the 2D video stream through the network to the user device; wherein the server also transmits the 3D models not pre-stored in the user device to the user device in a predetermined order; and when the user device receives the 3D models transmitted by the server, the user device stores the 3D models and sends a message to the server to change the states of the 3D models so as to indicate that the 3D models are now pre-stored in the user device; and
Step (D): the user device decoding the left-eye frame and the right-eye frame received from the server, and using the left-eye frame and the right-eye frame as a background frame on which to render the 3D models that are pre-stored in the user device but not included in the left-eye frame and the right-eye frame, thereby generating a mixed VR frame of an output video stream containing a VR scene.
2. The method of transmitting media through a network according to claim 1, characterized in that, in step (B), the states of the plurality of 3D models are checked by the server in order from a point closest to a virtual position to another point farthest from the virtual position; and, during the check, when the first 3D model not pre-stored in the user device is found, all of the remaining 3D models including the found 3D model are encoded into the left-eye frame and the right-eye frame, regardless of whether the 3D models located behind it are pre-stored in the user device.
3. The method of transmitting media through a network according to claim 2, characterized in that, when a new 3D model appears in the VR 3D environment, all of the 3D models behind and including the new 3D model are encoded into the left-eye frame and the right-eye frame, regardless of whether the 3D models located behind it are pre-stored in the user device.
4. The method of transmitting media through a network according to claim 2, characterized in that the virtual position is a 3D projection plane; and, in step (D), after the user device decodes the left-eye frame and the right-eye frame received from the server, the user device further merges the left-eye frame and the right-eye frame into a combined VR frame, and then uses the combined VR frame as the background frame on which to render the 3D models that are pre-stored in the user device but not included in the left-eye frame and the right-eye frame, thereby generating the mixed VR frame of the output video stream containing the VR scene.
5. The method of transmitting media through a network according to claim 2, characterized in that:
in step (C), the predetermined order in which the server transmits the 3D models not pre-stored in the user device to the user device is an order from a point closest to the virtual position to another point farthest from the virtual position;
in step (C), the server also sends, to the user device, state information of the 3D models that are not encoded into the left-eye frame and the right-eye frame; upon receiving and examining the state information, the user device proceeds as follows: if any 3D model in the received state information is not pre-stored in the user device, the user device sends a request to the server to download that 3D model; wherein the state information comprises at least one item of annotation data for each 3D model not encoded into the left-eye frame and the right-eye frame of the 2D video stream, the annotation data of each 3D model comprising a name, a position, a velocity, an orientation, and an attribute of the 3D model.
6. A system for transmitting media through a network, characterized by comprising:
a server for executing a virtual-reality application to generate a virtual VR 3D environment containing a plurality of 3D models, each 3D model being accompanied by a state indicating whether that 3D model is pre-stored in a user device; and
the user device, linked to the server through a network to obtain media that is generated by the VR application and contains at least some of the 3D models;
wherein the media comprises a plurality of images, and the transmission of the plurality of images comprises the following steps:
Step (B): the server checking the states of the plurality of 3D models to determine which 3D models need to be encoded into a left-eye frame and a right-eye frame of a 2D video stream, the encoding being such that the 3D models not pre-stored in the user device are encoded into the left-eye frame and the right-eye frame;
Step (C): the server transmitting at least the left-eye frame and the right-eye frame of the 2D video stream through the network to the user device; wherein the server also transmits the 3D models not pre-stored in the user device to the user device in a predetermined order; and when the user device receives the 3D models transmitted by the server, the user device stores the 3D models and sends a message to the server to change the states of the 3D models so as to indicate that the 3D models are now pre-stored in the user device;
Step (D): the user device decoding the left-eye frame and the right-eye frame received from the server, merging the left-eye frame and the right-eye frame into a combined VR frame, and then using the combined VR frame as a background frame on which to render the 3D models that are pre-stored in the user device but not included in the combined VR frame, thereby generating a mixed VR frame of an output video stream containing a VR scene; and
Step (E): the user device outputting the mixed VR frame of the output video stream containing the VR scene.
7. The system for transmitting media through a network according to claim 6, characterized in that, in step (B), the states of the plurality of 3D models are checked by the server in order from a point closest to a virtual position to another point farthest from the virtual position; and, during the check, when the first 3D model not pre-stored in the user device is found, all of the remaining 3D models including the found 3D model are encoded into the left-eye frame and the right-eye frame, regardless of whether the 3D models located behind it are pre-stored in the user device.
8. The system for transmitting media through a network according to claim 7, characterized in that, when a new 3D model appears in the VR 3D environment, all of the remaining 3D models including the new 3D model are encoded into the left-eye frame and the right-eye frame, regardless of whether the 3D models located behind it are pre-stored in the user device.
9. The system for transmitting media through a network according to claim 7, characterized in that:
in step (C), the predetermined order in which the server transmits the 3D models not pre-stored in the user device to the user device is an order from the 3D model closest to the virtual position to the 3D model farthest from the virtual position;
in step (C), the server also sends, to the user device, state information of the 3D models that are not encoded into the left-eye frame and the right-eye frame; upon receiving and examining the state information, the user device proceeds as follows: if any 3D model in the received state information is not pre-stored in the user device, the user device sends a request to the server to download that 3D model; wherein the state information comprises one item of annotation data for each 3D model not encoded into the left-eye frame and the right-eye frame, the annotation data of each 3D model comprising a name, a position, a velocity, an orientation, and an attribute of the 3D model.
10. the system according to claim 6 by transmission of network media, which is characterized in that the server further include:
One VR scene transfer device, for be compiled in the VR application program or in runing time dynamic link in the VR application program On a chained library;Wherein, which includes the one of the state of all 3D models and each 3D model List, the state are one of " not being ready for ", " in downloading " and " user has downloaded " to the state for indicating the 3D model;And
One VR scene server is with the VR application program in the server program executed on the server;Wherein, VR scene The relay station that server is transmitted as information between the VR scene transfer device and the user apparatus, the VR scene server also conduct A downloading server program of the necessary 3D model is downloaded from the server for the user apparatus.
11. The system for delivering media over a network according to claim 10, wherein the user device further comprises:
a VR scene client, which is a program running on the user device, for generating the output video stream and connecting to the server through the network;
a frame combiner, for merging the left-eye frame and the right-eye frame into the merged VR frame; and
a VR scene cache, comprising the 3D model database stored in the user device, for storing at least one 3D model previously downloaded from the server.
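The frame combiner of claim 11 merges a left-eye frame and a right-eye frame into one merged VR frame. A minimal sketch, assuming a simple side-by-side layout and frames represented as lists of pixel rows (the claim itself does not fix a particular layout):

```python
def merge_frames(left_frame, right_frame):
    """Merge a left-eye frame and a right-eye frame side by side into a
    single merged VR frame. Each frame is a list of pixel rows."""
    if len(left_frame) != len(right_frame):
        raise ValueError("eye frames must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left_frame, right_frame)]

left = [[1, 2], [3, 4]]    # 2x2 left-eye frame
right = [[5, 6], [7, 8]]   # 2x2 right-eye frame
print(merge_frames(left, right))  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```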
12. A method for delivering media over a network, the media comprising a plurality of images, the method comprising the following steps:
Step (A): executing a virtual reality (VR) application program on a server to generate a VR 3D environment comprising a plurality of 3D models, each 3D model being accompanied by a state indicating whether that 3D model is pre-stored in a user device;
Step (B): the server checking the states of the plurality of 3D models to determine which 3D models need to be encoded into a left-eye frame and a right-eye frame of a 2D video stream, the encoding being performed by encoding the plurality of 3D models not pre-stored in the user device into the left-eye frame and the right-eye frame of the 2D video stream; the server then merging the left-eye frame and the right-eye frame into a merged VR frame of the 2D video stream;
Step (C): the server transmitting at least the merged VR frame of the 2D video stream to the user device over a network; wherein the server also sends the plurality of 3D models not pre-stored in the user device to the user device in a predetermined order; when the user device receives the plurality of 3D models transmitted by the server, the user device stores the plurality of 3D models and sends a message to the server to change the states of the plurality of 3D models, indicating that the plurality of 3D models are now pre-stored in the user device; and
Step (D): the user device decoding the merged VR frame of the 2D video stream received from the server, and rendering, over the merged VR frame used as a background frame, the plurality of 3D models that are pre-stored in the user device but not included in the merged VR frame, so as to generate an output video stream containing a mixed VR frame of a VR scene.
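Steps (B) and (D) together split the scene between server-side video encoding and client-side rendering: the server encodes only the models the client lacks, and the client composites its cached models over the decoded background, so every model appears exactly once. A schematic Python sketch under that assumption, with model names standing in for real encoding and rendering work:

```python
def server_encode(scene_models, client_cached):
    """Step (B): only models not pre-stored on the client go into the
    left-eye/right-eye frames of the 2D video stream."""
    return [m for m in scene_models if m not in client_cached]

def client_composite(encoded_background, scene_models, client_cached):
    """Step (D): decode the merged frame as the background, then render
    the cached models that are absent from it; the union is the scene."""
    rendered_locally = [m for m in scene_models
                        if m in client_cached and m not in encoded_background]
    return encoded_background + rendered_locally

scene = ["sky", "tree", "car"]
cached = {"tree"}
background = server_encode(scene, cached)            # ['sky', 'car']
mixed = client_composite(background, scene, cached)  # background + ['tree']
print(sorted(mixed) == sorted(scene))  # True: no model is lost or duplicated
```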
13. The method for delivering media over a network according to claim 12, wherein:
in step (B), the states of the plurality of 3D models are checked by the server in a sequence from the point closest to a virtual location to the point farthest from the virtual location; during the check, when the first 3D model not pre-stored in the user device is found, all remaining 3D models, including the found 3D model, are encoded into the left-eye frame and the right-eye frame of the 2D video stream, regardless of whether the plurality of 3D models located behind it are pre-stored in the user device;
in step (C), the server also sends the plurality of 3D models not pre-stored in the user device to the user device in a predetermined order from the point closest to the virtual location to the point farthest from the virtual location; when the user device receives the plurality of 3D models transmitted by the server, the user device stores the plurality of 3D models and sends a message to the server to change the states of the plurality of 3D models, indicating that the plurality of 3D models are now pre-stored in the user device.
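The check in claim 13 scans the models from nearest to farthest from the virtual location and, at the first model missing on the client, encodes that model and everything behind it, even models the client already holds. A sketch of that cutoff rule, with hypothetical names:

```python
def split_at_first_missing(models_near_to_far, cached):
    """Scan from nearest to farthest; at the first model not pre-stored
    on the user device, encode it and every model behind it into the
    video frames, regardless of whether later models are cached."""
    for i, model in enumerate(models_near_to_far):
        if model not in cached:
            return models_near_to_far[:i], models_near_to_far[i:]
    return list(models_near_to_far), []

rendered_locally, encoded = split_at_first_missing(
    ["a", "b", "c", "d"], cached={"a", "c", "d"})
print(rendered_locally, encoded)  # ['a'] ['b', 'c', 'd']
```

Note that "c" and "d" are cached yet still encoded, because they sit behind the missing model "b"; presumably this keeps occlusion between video content and locally rendered content consistent.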
14. The method for delivering media over a network according to claim 13, wherein when a new 3D model appears in the VR 3D environment, the new 3D model and all 3D models behind it are encoded into the left-eye frame and the right-eye frame, regardless of whether the plurality of 3D models located behind it are pre-stored in the user device; and wherein the virtual location is a 3D projection plane.
15. The method for delivering media over a network according to claim 12, wherein in step (C) the server also sends to the user device status information of the 3D models not encoded into the left-eye frame and the right-eye frame; upon receiving and examining the status information, the user device proceeds as follows: if any 3D model in the status information is not pre-stored in the user device, the user device issues a request to the server to download that 3D model; wherein the status information includes annotation data for each 3D model not encoded into the left-eye frame and the right-eye frame of the 2D video stream, the annotation data comprising a name, a position, a velocity, an orientation, and an attribute of the 3D model.
CN201610377813.1A 2015-12-21 2016-05-31 System and method for delivering media over network Active CN106899860B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/976,239 US9370718B2 (en) 2014-01-02 2015-12-21 System and method for delivering media over network
US14/976,239 2015-12-21

Publications (2)

Publication Number Publication Date
CN106899860A CN106899860A (en) 2017-06-27
CN106899860B true CN106899860B (en) 2019-10-11

Family

ID=59191073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610377813.1A Active CN106899860B (en) System and method for delivering media over network

Country Status (3)

Country Link
JP (1) JP6306089B2 (en)
CN (1) CN106899860B (en)
TW (1) TWI637772B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108271042B (en) * 2018-02-09 2020-04-24 大连天途有线电视网络股份有限公司 Video data transmission method based on cable television network, cloud VR system implementation method and cloud VR system
US11500455B2 (en) 2018-10-16 2022-11-15 Nolo Co., Ltd. Video streaming system, video streaming method and apparatus
CN111064981B (en) * 2018-10-16 2021-07-16 北京凌宇智控科技有限公司 System and method for video streaming
CN110728743B (en) * 2019-10-11 2022-09-06 长春理工大学 VR three-dimensional scene three-dimensional picture generation method combining cloud global illumination rendering
CN111757083A (en) * 2020-06-24 2020-10-09 南京东禾智汇信息技术有限公司 Automatic control data communication mode based on three-dimensional visualization

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102111633A (en) * 2009-12-28 2011-06-29 索尼公司 Image processing apparatus and image processing method
CN103888714A (en) * 2014-03-21 2014-06-25 国家电网公司 3D scene network video conference system based on virtual reality

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6377257B1 (en) * 1999-10-04 2002-04-23 International Business Machines Corporation Methods and apparatus for delivering 3D graphics in a networked environment
US6384821B1 (en) * 1999-10-04 2002-05-07 International Business Machines Corporation Method and apparatus for delivering 3D graphics in a networked environment using transparent video
US7290221B2 (en) * 2003-04-16 2007-10-30 Hewlett-Packard Development Company, L.P. User interface, method and apparatus for providing three-dimensional object fabrication status
US7675520B2 (en) * 2005-12-09 2010-03-09 Digital Steamworks, Llc System, method and computer program for creating two dimensional (2D) or three dimensional (3D) computer animation from video
CA2712483A1 (en) * 2008-01-17 2009-07-23 Vivox Inc. Scalable techniques for providing real-time per-avatar streaming data in virtual reality systems that employ per-avatar rendered environments
US20120016856A1 (en) * 2010-07-15 2012-01-19 Google Inc Content extractor
US9282321B2 (en) * 2011-02-17 2016-03-08 Legend3D, Inc. 3D model multi-reviewer system
JP2012186746A (en) * 2011-03-08 2012-09-27 Sony Corp Video transmitter, method for controlling video transmitter, video receiver and method for controlling video receiver
US9286711B2 (en) * 2011-09-30 2016-03-15 Microsoft Technology Licensing, Llc Representing a location at a previous time period using an augmented reality display
US8860720B1 (en) * 2014-01-02 2014-10-14 Ubitus Inc. System and method for delivering graphics over network


Also Published As

Publication number Publication date
TW201722520A (en) 2017-07-01
TWI637772B (en) 2018-10-11
JP6306089B2 (en) 2018-04-04
JP2017117431A (en) 2017-06-29
CN106899860A (en) 2017-06-27

Similar Documents

Publication Publication Date Title
CN106899860B (en) System and method for delivering media over network
US9370718B2 (en) System and method for delivering media over network
CN104768023B (en) System and method for delivering media over network
JP6310073B2 (en) Drawing system, control method, and storage medium
JP6069528B2 (en) Image processing apparatus, image processing system, image processing method, and storage medium
CN105740029B (en) Method, user equipment and system for presenting content
US9210372B2 (en) Communication method and device for video simulation image
KR101521655B1 (en) Apparatus and method for providing stereoscopic three-dimension image/video contents on terminal based on Lightweight Application Scene Representation
CN109510990A (en) Image processing method and device, computer readable storage medium, electronic equipment
CN109644294A (en) Live-broadcast sharing method, related device and system
US9233308B2 (en) System and method for delivering media over network
WO2017127562A1 (en) Generating a virtual reality environment for displaying content
KR20220068241A (en) Data model for representation and streaming of heterogeneous immersive media
CN107358659A (en) Multi-picture fusion display method based on 3D technology and storage device
CN112379769A (en) Processing method and system of virtual scene service information and cloud management platform
JP2016524730A (en) Information processing apparatus, control method therefor, and program
KR102598603B1 (en) Adaptation of 2D video for streaming to heterogeneous client endpoints
CN110213640A (en) Method, apparatus and device for generating virtual objects
US20230201715A1 (en) Interactive video player
CN104888454B (en) Data processing method and corresponding electronic equipment
US20120289262A1 (en) Method for providing visual effect messages and associated communication system and transmitting end
CN118537463A (en) Data processing method, device, computer equipment and readable storage medium
Corrêa et al. Development of an immersive challenge racing game for 3D-TV
CN118334204A (en) Method and device for rendering a three-dimensional scene
KR20140079936A (en) Method and apparatus for transmitting and receiving augmented reality service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180116

Address after: 4th Floor, Willow House, Cricket Square, P.O. Box 2804, Grand Cayman KY1-1112, Cayman Islands

Applicant after: Ubitus Inc.

Address before: Offshore Incorporations Centre, P.O. Box 957, Road Town, Tortola, British Virgin Islands

Applicant before: Ubitus Inc.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200903

Address after: 5 / F, Chi Park building, 2-3-18ym, Tokyo, Japan

Patentee after: Yobeta Co.,Ltd.

Address before: The Cayman Islands Dakaiman island KY1-1112 cricket square willow house 4 floor P.O. Box 2804th

Patentee before: Ubida
