CN112492357A - Method, device, medium and electronic equipment for processing multiple video streams - Google Patents


Info

Publication number
CN112492357A
CN112492357A
Authority
CN
China
Prior art keywords
video stream
timestamp
alignment
video
slave
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011270703.8A
Other languages
Chinese (zh)
Inventor
王珂晟
黄劲
黄钢
许巧龄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Anbo Shengying Education Technology Co ltd
Original Assignee
Beijing Anbo Shengying Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Anbo Shengying Education Technology Co ltd filed Critical Beijing Anbo Shengying Education Technology Co ltd
Priority to CN202011270703.8A
Publication of CN112492357A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/4104 Peripherals receiving signals from specially adapted client devices
    • H04N21/4122 Peripherals receiving signals from specially adapted client devices additional display device, e.g. video projector
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4305 Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present disclosure provides a method, apparatus, medium, and electronic device for processing multiple video streams. The method comprises the following steps: receiving first video streams acquired in real time by a plurality of terminals, each first video stream comprising timestamps; performing an alignment operation on each first video stream based on the timestamps to obtain an alignment start position for each first video stream; and synthesizing the first video streams, based on the alignment start position and the preset playing position of each first video stream at the terminal, to generate a second video stream. By combining the received video streams into a single stream, the method reduces the volume of data transmitted over the network, improves the fluency of live video, and leaves headroom for improving its picture and sound quality.

Description

Method, device, medium and electronic equipment for processing multiple video streams
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a medium, and an electronic device for processing multiple video streams.
Background
Network teaching is a network application in which teachers and students located in multiple physical spaces meet face to face in a virtual classroom through communication equipment and a network. Everyone in the virtual classroom can hear the others through loudspeakers, see their images, actions, and expressions on a display, and send and receive the content of an electronic presentation board, giving each participant the feeling of being present in person.
Generally, when each terminal in a virtual classroom needs to display multiple videos simultaneously, each terminal transmits its captured video to a video server as a video stream, the video server distributes the streams to every terminal in the virtual classroom, and each terminal renders the received streams in their corresponding video playing windows.
However, this way of processing video streams generates heavy network traffic and easily causes network congestion. To avoid congestion, picture quality and sound quality often have to be sacrificed.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
An object of the present disclosure is to provide a method, an apparatus, a medium, and an electronic device for processing multiple video streams, which can solve at least one of the above-mentioned technical problems. The specific scheme is as follows:
according to a specific embodiment of the present disclosure, in a first aspect, the present disclosure provides a method for processing multiple video streams, including:
receiving first video streams acquired in real time by a plurality of terminals; each first video stream comprises timestamps;
performing alignment operation on each first video stream based on the timestamp to acquire an alignment starting point position of each first video stream;
and synthesizing the first video streams based on the alignment starting point position and the preset playing position of each first video stream at the terminal to generate a second video stream.
According to a second aspect, the present disclosure provides an apparatus for processing multiple video streams, including:
a first video stream receiving unit, configured to receive first video streams acquired in real time by a plurality of terminals; each first video stream comprises timestamps;
an alignment starting point position obtaining unit, configured to perform an alignment operation on each first video stream based on the timestamp, and obtain an alignment starting point position of each first video stream;
and the second video stream generating unit is used for synthesizing the first video streams based on the alignment starting point position and the preset playing position of each first video stream at the terminal to generate a second video stream.
According to a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of processing multiple video streams according to any of the first aspect.
According to a fourth aspect, the present disclosure provides an electronic device, comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method of processing multiple video streams according to any one of the first aspect.
Compared with the prior art, the scheme of the embodiment of the disclosure at least has the following beneficial effects:
the present disclosure provides a method, apparatus, medium, and electronic device for processing multiple video streams. The method and the device combine the received multiple paths of video streams into one video stream, thereby reducing the data volume of network transmission, improving the fluency of live video and providing space for improving the picture quality and sound quality of live video.
Drawings
The above and other features, advantages, and aspects of various embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements are not necessarily drawn to scale. In the drawings:
fig. 1 shows a flow diagram of a method of processing multiple video streams according to an embodiment of the present disclosure;
fig. 2 shows a block diagram of elements of an apparatus for processing multiple video streams according to an embodiment of the present disclosure;
fig. 3 shows an electronic device connection structure schematic according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting; those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Alternative embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
A first embodiment, namely, an embodiment of a method for processing multiple video streams, is provided by the present disclosure.
The embodiments of the present disclosure are described in detail below with reference to fig. 1.
Step S101, receiving a first video stream collected by a plurality of terminals in real time.
A video stream is video delivered over a network in streaming mode. In practice, the sender divides the video into segments, packs them into data packets, and transmits them over the network; the receiver decompresses the data packets and plays them back in the time order they had before packing. When a terminal receives multiple video streams that must be played simultaneously, it must likewise play them according to their pre-packing time order.
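The packetize-and-reorder behavior described above can be sketched as follows (a minimal Python illustration; the function names and the list-of-frames representation are hypothetical, not part of the disclosure):

```python
def packetize(frames, size):
    """Sender side: divide the stream into segments and tag each
    data packet with its pre-packing sequence number."""
    return [(i // size, frames[i:i + size]) for i in range(0, len(frames), size)]

def reassemble(packets):
    """Receiver side: restore the pre-packing time order, then flatten
    the segments back into a playable sequence of frames."""
    ordered = sorted(packets, key=lambda p: p[0])
    return [frame for _, segment in ordered for frame in segment]

packets = packetize(["f1", "f2", "f3", "f4", "f5"], size=2)
restored = reassemble(list(reversed(packets)))  # packets may arrive out of order
```

Even when packets arrive out of order, sorting on the sequence numbers recovers the original play order.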
The first video stream includes time stamps.
A timestamp, or presentation timestamp (PTS), is usually a character sequence that uniquely identifies a moment in time; it marks the play relationship of a video stream. Within a single video stream, the PTS indicates playback order: the smaller the PTS value, the earlier the frame is played. Likewise, the PTS difference between two frames is the interval between their play times. Generally, the terminals participating in video playback obtain clock information from the same clock source as the basis of their timestamps, which ensures that multiple video streams can be played synchronously based on the timestamps. That is, videos captured at the same moment can be played synchronously on the same terminal.
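As a concrete sketch of these PTS rules (Python; the `(pts, data)` tuples and the millisecond PTS unit are simplifying assumptions — real containers often use a 90 kHz clock, but the ordering logic is identical):

```python
def play_order(frames):
    """Within one stream, frames play in ascending PTS order:
    the smaller the PTS value, the earlier the frame is played."""
    return sorted(frames, key=lambda f: f[0])

def play_interval(frame_a, frame_b):
    """The PTS difference between two frames is the interval
    between their play times."""
    return abs(frame_b[0] - frame_a[0])

frames = [(40, "f2"), (0, "f1"), (80, "f3")]  # (pts_ms, frame_data)
ordered = play_order(frames)
gap = play_interval(ordered[0], ordered[1])
```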
Step S102, based on the time stamp, performing alignment operation on each first video stream, and acquiring an alignment starting point position of each first video stream.
The moments at which timestamps are generated can differ across the first video streams. For example, timestamp A is generated at 8:00:00 in video stream A, timestamp B at 8:00:01 in video stream B, and timestamp C at 7:59:59 in video stream C. It is therefore necessary to perform an alignment operation on the first video streams to ensure that each can start playing from an alignment start position corresponding to the same start time, i.e., that all first video streams play synchronously.
The alignment start position indicates that the frame image of each first video stream at that position was captured at the same moment.
Optionally, the performing an alignment operation on each first video stream based on the timestamp to obtain an alignment start position of each first video stream includes the following steps:
step S102-1, determining a main video stream and a secondary video stream from the first video stream.
For alignment, the embodiment of the present disclosure selects, from the plurality of first video streams, a master video stream to serve as the reference for the slave video streams. Taking the first timestamp of the master video stream as the standard, the start position of each slave video stream is adjusted to the position corresponding to that first timestamp.
For example, continuing the above example, video stream a is determined to be the master video stream and video streams B and C are determined to be the slave video streams.
And step S102-2, determining, according to the master video stream, a first timestamp to serve as the reference for the slave video streams.
For example, the first timestamp of the master video stream is timestamp A, generated at 8:00:00.
And step S102-3, acquiring a second time stamp of the slave video stream.
For example, the second timestamp of video stream B is timestamp B, generated at 8:00:01, and that of video stream C is timestamp C, generated at 7:59:59.
Optionally, the obtaining the second timestamp of the slave video stream includes the following steps:
and step S102-3-1, acquiring the second timestamp based on the first timestamp and a preset alignment threshold.
The purpose of this step is to avoid an excessive time difference between the selected second timestamp and the first timestamp, which would increase computation time. The preset alignment threshold therefore constrains the selection so that the chosen second timestamp lies close to the first timestamp, reducing the computational cost.
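A minimal sketch of this constrained selection (Python; the function name, the scalar timestamps in seconds, and the threshold value are illustrative assumptions):

```python
def pick_second_timestamp(first_ts, candidate_ts, threshold):
    """Select the slave-stream timestamp closest to the master's first
    timestamp, considering only candidates within the preset alignment
    threshold; return None when no candidate qualifies."""
    in_range = [t for t in candidate_ts if abs(t - first_ts) <= threshold]
    if not in_range:
        return None  # handled by the lost-packet case of step S102-3-1-1
    return min(in_range, key=lambda t: abs(t - first_ts))

second_ts = pick_second_timestamp(first_ts=0, candidate_ts=[5, 1, -3], threshold=2)
```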
However, for network reasons, the alignment data packet of at least one of the multiple video streams may fail to reach the video server at the same time as those of the other streams; that is, the packet may arrive later than the current alignment operation, so that the video stream cannot complete the alignment. To solve this problem, optionally, the obtaining the second timestamp based on the first timestamp and a preset alignment threshold further includes:
step S102-3-1-1, when the second timestamp of the first slave video stream cannot be obtained based on the first timestamp and a preset alignment threshold, determining that the second timestamp is equal to the first timestamp, obtaining a third timestamp after the second timestamp from the first slave video stream, and inserting a preset frame image from the second timestamp to the third timestamp in the first slave video stream to generate a second slave video stream.
Determining that the second timestamp equals the first timestamp means assuming that the second timestamp of that video stream is the same as the first timestamp. Although the alignment packet arrives later than the current alignment operation, a packet transmitted after it may reach the video server before the alignment operation. The disclosed embodiment therefore selects a timestamp (the third timestamp) from that later packet. For example, continuing the above example: the first timestamp is timestamp A, generated at 8:00:00; the first alignment packet of video stream B is lost; and the video server receives a second packet that includes a third timestamp generated at 8:00:02. The second timestamp of video stream B is then set to 8:00:00, and 24 × 2 = 48 preset frame images are inserted between the second timestamp and the third timestamp, generating video stream B1.
That is, the frame images between the second and third timestamps are replaced by preset frame images, so that the alignment operation on the first slave video stream that lost its packet can still complete with a full set of frames. However, any frame images preceding the third timestamp within the second packet are also replaced by preset frame images, losing part of the picture. To reduce the loss of received frame images, the third timestamp carried in the second packet should preferably be as early as possible.
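The 24 × 2 = 48-frame calculation above can be sketched as follows (Python; the 24 fps rate comes from the disclosure's example, while the function name and the string stand-in for a preset frame image are assumptions):

```python
FPS = 24  # frame rate used in the disclosure's example

def pad_lost_span(second_ts, third_ts, preset_frame, fps=FPS):
    """Fill the span between the assumed second timestamp and the received
    third timestamp with preset frame images (step S102-3-1-1)."""
    missing = int((third_ts - second_ts) * fps)
    return [preset_frame] * missing

# Second timestamp assumed at 8:00:00, third timestamp received at 8:00:02:
padding = pad_lost_span(second_ts=0, third_ts=2, preset_frame="blank")
```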
And step S102-4, acquiring relative time relative to the second timestamp according to the first timestamp and the second timestamp.
For example, the relative time of timestamp A to timestamp B is 1 second, and the relative time of timestamp A to timestamp C is-1 second.
And step S102-5, acquiring the alignment starting point position of the slave video stream based on the relative time.
For example, if 24 frame images are generated per second, the alignment start position of video stream B is 24 frames before timestamp B, and that of video stream C is 24 frames after timestamp C.
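Steps S102-4 and S102-5 can be sketched together (Python; timestamps are expressed as seconds since 8:00:00, and the signed frame-offset convention is an assumption):

```python
FPS = 24  # frames per second, as in the disclosure's example

def alignment_start_offset(first_ts, second_ts, fps=FPS):
    """Step S102-4: relative time of the slave timestamp with respect to the
    master's first timestamp. Step S102-5: convert it to a frame offset from
    the slave's timestamp position (negative = frames before the timestamp)."""
    relative = second_ts - first_ts
    return -int(relative * fps)

offset_b = alignment_start_offset(first_ts=0, second_ts=1)   # timestamp B at 8:00:01
offset_c = alignment_start_offset(first_ts=0, second_ts=-1)  # timestamp C at 7:59:59
```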
Step S103, synthesizing the first video streams based on the alignment starting point position and the preset playing position of each first video stream at the terminal, and generating a second video stream.
I.e. combining multiple video streams into one video stream.
Optionally, the synthesizing, based on the alignment starting point position and the preset playing position of each first video stream at the terminal, the first video streams to generate a second video stream includes the following steps:
and step S103-1, sequentially acquiring first frame images at the same relative position of each first video stream from the alignment starting point position.
For example, continuing the above example: the first frame images of video stream A, video stream B, and video stream C are acquired; then the second frame images of video stream A, video stream B, and video stream C; and so on, up to the Nth frame images of video stream A, video stream B, and video stream C, where N is a positive integer greater than zero.
And S103-2, synthesizing the first frame images based on the preset playing position and the acquisition sequence, and sequentially generating second frame images.
For example, continuing the above example: the first frame images of video stream A, video stream B, and video stream C are synthesized into the first of the second frame images; the second frame images of video stream A, video stream B, and video stream C are synthesized into the second of the second frame images; and so on, until the Nth frame images of video stream A, video stream B, and video stream C are synthesized into the Nth of the second frame images.
The preset playing position can be fixed or adjusted dynamically in real time. For example, in live video teaching, a teacher dynamically adjusts the positions of the student videos on the display terminal as needed.
And step S103-3, synthesizing the second frame images into a second video stream based on the generation sequence.
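Steps S103-1 to S103-3 amount to zipping the aligned streams frame by frame and placing each source frame at its preset playing position. A minimal sketch (Python; the layout-slot names and string frames are hypothetical placeholders for real window coordinates and image data):

```python
def composite(streams, positions):
    """From the alignment start, take the Nth frame of every first video
    stream, place it at that stream's preset playing position, and emit
    the frames of the second video stream in generation order."""
    second_stream = []
    for nth_frames in zip(*streams.values()):          # step S103-1
        composed = {positions[name]: frame             # step S103-2
                    for name, frame in zip(streams, nth_frames)}
        second_stream.append(composed)                 # step S103-3
    return second_stream

streams = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1", "c2"]}
positions = {"A": "top", "B": "left", "C": "right"}    # hypothetical layout slots
second = composite(streams, positions)
```

In a real implementation the dict values would be image buffers blitted into a single canvas per output frame; the zip structure is the point here.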
In order to achieve the purpose of live teaching of the teacher, optionally, the method further includes the following steps:
and step S104, acquiring main audio information from the main video stream.
Step S105, synthesizing the main audio information into the second video stream.
For example, if the master video stream determined by the video server is the first video stream transmitted by the teacher's terminal, the teacher's teaching audio is obtained from the master video stream as the main audio information and synthesized into the second video stream, generating a complete video stream for the teacher's live teaching.
Optionally, the method further includes the following steps:
and step S106, respectively transmitting the second video stream to each terminal.
The embodiment of the disclosure combines the received video streams into a single stream, reducing the volume of data transmitted over the network, improving the fluency of live video, and leaving headroom for improving its picture and sound quality.
The present disclosure also provides a second embodiment, namely an apparatus for processing multiple video streams, corresponding to the first embodiment. Since the second embodiment is substantially similar to the first, its description is brief; for relevant details, refer to the corresponding parts of the first embodiment. The device embodiments described below are merely illustrative.
Fig. 2 illustrates an embodiment of an apparatus for processing multiple video streams provided by the present disclosure.
As shown in fig. 2, the present disclosure provides an apparatus for processing multiple video streams, comprising:
a first video stream receiving unit 201, configured to receive a first video stream acquired by multiple terminals in real time; the first video stream comprises time stamps;
an obtaining alignment starting point position unit 202, configured to perform an alignment operation on each first video stream based on the timestamp, and obtain an alignment starting point position of each first video stream;
a second video stream generating unit 203, configured to synthesize the first video streams based on the alignment starting point position and a preset playing position of each first video stream at the terminal, and generate a second video stream.
Optionally, the generating the second video stream unit 203 includes:
a sub-unit for obtaining frame images at the same position, which is used for sequentially obtaining the first frame images at the same relative position of each first video stream from the alignment starting point position;
a second frame image generation subunit, configured to synthesize the first frame images based on the preset playing position and the acquisition order, and sequentially generate second frame images;
and a composite second video stream subunit for compositing the second frame images into a second video stream based on the generation order.
Optionally, the obtaining of the alignment starting point position unit 202 includes:
a video stream determining subunit, for determining a master video stream and slave video streams from the first video streams;
a first timestamp determining subunit, for determining, according to the master video stream, a first timestamp to serve as the reference for the slave video streams;
a get second timestamp subunit for getting a second timestamp of the slave video stream;
a relative time acquiring subunit, configured to acquire a relative time with respect to the second timestamp according to the first timestamp and the second timestamp;
an acquire alignment start position subunit configured to acquire an alignment start position of the slave video stream based on the relative time.
Optionally, the second timestamp obtaining subunit includes:
a limit second timestamp subunit for obtaining the second timestamp based on the first timestamp and a preset alignment threshold.
Optionally, the second timestamp limiting subunit further includes:
and a lost data supplementation subunit, configured to, when the second timestamp of the first slave video stream cannot be obtained based on the first timestamp and a preset alignment threshold, determine that the second timestamp is equal to the first timestamp, obtain a third timestamp after the second timestamp from the first slave video stream, insert a preset frame image between the second timestamp and the third timestamp in the first slave video stream, and generate a second slave video stream.
Optionally, the apparatus further includes:
the main audio information acquisition unit is used for acquiring main audio information from the main video stream;
a main audio information synthesizing unit for synthesizing the main audio information into the second video stream.
Optionally, the apparatus further includes:
and the transmitting second video stream unit is used for respectively transmitting the second video stream to each terminal.
The embodiment of the disclosure combines the received video streams into a single stream, reducing the volume of data transmitted over the network, improving the fluency of live video, and leaving headroom for improving its picture and sound quality.
The present disclosure provides a third embodiment, that is, an electronic device, configured to implement a method for processing multiple video streams, where the electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method of processing multiple video streams as described in the first embodiment.
The present disclosure provides a fourth embodiment, namely a computer storage medium for processing multiple video streams, wherein the computer storage medium stores computer-executable instructions, and the computer-executable instructions can execute the method for processing multiple video streams as described in the first embodiment.
Referring now to FIG. 3, shown is a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 3, the electronic device may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage device 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the electronic apparatus are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication means 309 may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. While fig. 3 illustrates an electronic device having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 309, or installed from the storage device 308, or installed from the ROM 302. The computer program, when executed by the processing device 301, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, by contrast, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is merely an illustration of the preferred embodiments of the present disclosure and of the principles of the technology employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to technical solutions formed by the particular combination of features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, a technical solution formed by interchanging the above features with (but not limited to) features having similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (10)

1. A method of processing multiple video streams, comprising:
receiving first video streams acquired by a plurality of terminals in real time, wherein each first video stream comprises timestamps;
performing alignment operation on each first video stream based on the timestamp to acquire an alignment starting point position of each first video stream;
and synthesizing the first video streams based on the alignment starting point position and the preset playing position of each first video stream at the terminal to generate a second video stream.
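The three steps of claim 1 can be sketched in code as follows. This is a minimal illustrative model, not the claimed implementation: the names (Frame, VideoStream, composite) and the choice of the latest first-frame timestamp as the common alignment point are assumptions for demonstration.

```python
# Hypothetical sketch of claim 1: receive timestamped first video streams,
# find each stream's alignment starting point, and composite the aligned
# frames into a second video stream.
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    timestamp: float   # capture time carried in the first video stream
    pixels: str        # placeholder for the frame image data

@dataclass
class VideoStream:
    frames: List[Frame]

def alignment_start(stream: VideoStream, t0: float) -> int:
    """Index of the first frame whose timestamp is at or after t0."""
    for i, f in enumerate(stream.frames):
        if f.timestamp >= t0:
            return i
    return len(stream.frames)

def composite(streams: List[VideoStream]) -> List[List[str]]:
    """Align every first video stream on a common timestamp, then merge
    frames at the same relative position into one combined frame each."""
    # Use the latest first-frame timestamp as the common alignment point,
    # so every stream has data from that instant onward (an assumption).
    t0 = max(s.frames[0].timestamp for s in streams)
    starts = [alignment_start(s, t0) for s in streams]
    n = min(len(s.frames) - st for s, st in zip(streams, starts))
    return [[s.frames[st + k].pixels for s, st in zip(streams, starts)]
            for k in range(n)]
```

With two streams whose capture starts 0.1 s apart, the first stream's leading frame is skipped and the combined output begins where both streams have data.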
2. The method according to claim 1, wherein the synthesizing of the first video streams based on the alignment start position and the preset playing position of each first video stream at the terminal to generate a second video stream comprises:
sequentially acquiring first frame images at the same relative position of each first video stream from the alignment starting point position;
synthesizing the first frame images based on the preset playing position and the acquisition sequence, and sequentially generating second frame images;
and synthesizing the second frame images into a second video stream based on the generation sequence.
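The frame-by-frame synthesis of claim 2 can be sketched as below. The claim leaves the "preset playing position" abstract; left-to-right tiling stands in for it here, and the 2-D-list frame model and function names are illustrative assumptions.

```python
# Sketch of claim 2: from each stream's alignment starting point, take the
# first frame images at the same relative position, compose them into one
# second frame image according to a preset playing position (modeled as
# simple left-to-right tiling), and emit the second frames in order.
from typing import List

Frame2D = List[List[int]]   # a frame as a 2-D list of pixel values

def tile_horizontally(frames: List[Frame2D]) -> Frame2D:
    """Compose one second frame by placing the first frames side by side."""
    rows = len(frames[0])
    return [sum((f[r] for f in frames), []) for r in range(rows)]

def synthesize(streams: List[List[Frame2D]],
               starts: List[int]) -> List[Frame2D]:
    """Merge frame k (counted from each alignment start) of every stream
    into the k-th frame of the second video stream."""
    n = min(len(s) - st for s, st in zip(streams, starts))
    return [tile_horizontally([s[st + k] for s, st in zip(streams, starts)])
            for k in range(n)]
```

In practice the tiling step would be a real image composition (e.g. placing each tile at its preset coordinates), but the ordering logic is the same.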
3. The method according to claim 1, wherein performing an alignment operation on each first video stream based on the timestamp to obtain an alignment start position of each first video stream comprises:
determining a master video stream and a slave video stream from the first video stream;
determining, from the master video stream, a first timestamp serving as an alignment reference for the slave video stream;
obtaining a second time stamp of the slave video stream;
acquiring a relative time between the first timestamp and the second timestamp;
and acquiring the alignment starting point position of the slave video stream based on the relative time.
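A hedged sketch of the master/slave alignment in claim 3: the master stream supplies the first (reference) timestamp, the slave stream yields the second timestamp at or after it, and their difference, the relative time, locates the slave's alignment starting point. The function name, the use of sorted timestamp lists, and the bisect-based search are assumptions for illustration.

```python
# Illustrative master/slave alignment (claim 3). Both timestamp lists are
# assumed to be sorted in ascending order.
from bisect import bisect_left
from typing import List, Tuple

def align_slave(master_ts: List[float],
                slave_ts: List[float]) -> Tuple[float, int]:
    """Return (relative_time, alignment_start_index) for one slave stream."""
    first_ts = master_ts[0]                 # first timestamp: master reference
    i = bisect_left(slave_ts, first_ts)     # earliest slave ts >= first_ts
    i = min(i, len(slave_ts) - 1)
    second_ts = slave_ts[i]                 # second timestamp: from the slave
    relative_time = second_ts - first_ts    # slave offset vs. master
    return relative_time, i
```

A relative time near zero means the slave is already aligned with the master; a larger value shifts the slave's starting frame accordingly.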
4. The method of claim 3, wherein the obtaining the second timestamp of the slave video stream comprises:
and acquiring the second timestamp based on the first timestamp and a preset alignment threshold.
5. The method of claim 4, wherein the obtaining the second timestamp based on the first timestamp and a preset alignment threshold further comprises:
when the second timestamp of the first slave video stream cannot be acquired based on the first timestamp and a preset alignment threshold, determining that the second timestamp is equal to the first timestamp, acquiring a third timestamp after the second timestamp from the first slave video stream, and inserting a preset frame image from the second timestamp to the third timestamp in the first slave video stream to generate a second slave video stream.
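The fallback of claim 5 can be sketched as follows: when no slave timestamp falls within the preset alignment threshold of the first timestamp, the second timestamp is taken to equal the first timestamp, and preset frame images (a "black" placeholder here) fill the gap up to the third timestamp, the slave's next real timestamp. The fixed frame period and all names are assumptions.

```python
# Illustrative padding fallback (claim 5): generate a second slave video
# stream by inserting preset frames between the assumed second timestamp
# and the third timestamp.
from typing import List, Tuple

TimedFrame = Tuple[float, str]   # (timestamp, frame image placeholder)

def pad_slave(slave: List[TimedFrame], first_ts: float,
              threshold: float, frame_period: float,
              preset: str = "black") -> List[TimedFrame]:
    # If some slave timestamp lies within the alignment threshold,
    # normal alignment succeeds and no padding is needed.
    if any(abs(ts - first_ts) <= threshold for ts, _ in slave):
        return slave
    # Third timestamp: the slave's first real timestamp after first_ts.
    third_ts = next(ts for ts, _ in slave if ts > first_ts)
    # Insert preset frames from the assumed second timestamp (== first_ts)
    # up to, but not including, the third timestamp.
    pad, t = [], first_ts
    while t < third_ts - 1e-9:
        pad.append((t, preset))
        t += frame_period
    return pad + [f for f in slave if f[0] >= third_ts]
```

For a slave stream that starts 0.5 s late, this produces five preset frames at a 0.1 s period followed by the real frames.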
6. The method of claim 3, further comprising:
acquiring main audio information from the main video stream;
synthesizing the primary audio information into the second video stream.
7. The method of claim 1, further comprising:
and transmitting the second video stream to each terminal respectively.
8. An apparatus for processing multiple video streams, comprising:
a first video stream receiving unit configured to receive first video streams acquired by a plurality of terminals in real time, wherein each first video stream comprises timestamps;
an alignment starting point position obtaining unit, configured to perform an alignment operation on each first video stream based on the timestamp, and obtain an alignment starting point position of each first video stream;
and a second video stream generating unit configured to synthesize the first video streams based on the alignment starting point position and the preset playing position of each first video stream at the terminal to generate a second video stream.
9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method of any one of claims 1 to 7.
CN202011270703.8A 2020-11-13 2020-11-13 Method, device, medium and electronic equipment for processing multiple video streams Pending CN112492357A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011270703.8A CN112492357A (en) 2020-11-13 2020-11-13 Method, device, medium and electronic equipment for processing multiple video streams


Publications (1)

Publication Number Publication Date
CN112492357A true CN112492357A (en) 2021-03-12

Family

ID=74930355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011270703.8A Pending CN112492357A (en) 2020-11-13 2020-11-13 Method, device, medium and electronic equipment for processing multiple video streams

Country Status (1)

Country Link
CN (1) CN112492357A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105430537A (en) * 2015-11-27 2016-03-23 刘军 Method and server for synthesis of multiple paths of data, and music teaching system
CN105491393A (en) * 2015-12-02 2016-04-13 北京暴风科技股份有限公司 Method for implementing multi-user live video business
US20170208220A1 (en) * 2016-01-14 2017-07-20 Disney Enterprises, Inc. Automatically synchronizing multiple real-time video sources
CN111107299A (en) * 2019-12-05 2020-05-05 视联动力信息技术股份有限公司 Method and device for synthesizing multi-channel video
CN111787365A (en) * 2020-07-17 2020-10-16 易视腾科技股份有限公司 Multi-channel audio and video synchronization method and device

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113329255A (en) * 2021-06-02 2021-08-31 重庆锐明信息技术有限公司 Multi-channel video synchronous playing method, system and medium based on browser
CN115442646A (en) * 2021-06-04 2022-12-06 上海擎感智能科技有限公司 Video processing method, storage medium and vehicle-mounted terminal for processing video
CN115442646B (en) * 2021-06-04 2023-12-15 上海擎感智能科技有限公司 Video processing method, storage medium and vehicle-mounted terminal for processing video
CN113794844A (en) * 2021-09-09 2021-12-14 北京字节跳动网络技术有限公司 Free view video acquisition system, method, apparatus, server and medium
CN113518260A (en) * 2021-09-14 2021-10-19 腾讯科技(深圳)有限公司 Video playing method and device, electronic equipment and computer readable storage medium
CN113518260B (en) * 2021-09-14 2022-05-03 腾讯科技(深圳)有限公司 Video playing method and device, electronic equipment and computer readable storage medium
CN114697303A (en) * 2022-03-16 2022-07-01 北京金山云网络技术有限公司 Multimedia data processing method and device, electronic equipment and storage medium
CN114697303B (en) * 2022-03-16 2023-11-03 北京金山云网络技术有限公司 Multimedia data processing method and device, electronic equipment and storage medium
CN115695883A (en) * 2022-09-27 2023-02-03 北京奇艺世纪科技有限公司 Video data processing method, device, equipment and storage medium
CN115767130A (en) * 2022-09-27 2023-03-07 北京奇艺世纪科技有限公司 Video data processing method, device, equipment and storage medium
CN115767130B (en) * 2022-09-27 2024-07-23 北京奇艺世纪科技有限公司 Video data processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112492357A (en) Method, device, medium and electronic equipment for processing multiple video streams
CN111064987B (en) Information display method and device and electronic equipment
CN111970524B (en) Control method, device, system, equipment and medium for interactive live broadcast and microphone connection
CN110719516A (en) Video synchronization method and device, terminal and storage medium
CN111787365A (en) Multi-channel audio and video synchronization method and device
CN112291502B (en) Information interaction method, device and system and electronic equipment
CN112286610A (en) Interactive processing method and device, electronic equipment and storage medium
WO2023071598A1 (en) Audio and video synchronous monitoring method and apparatus, electronic device, and storage medium
CN112770159A (en) Multi-screen interaction system, method, device, equipment and storage medium
CN112416289A (en) Audio synchronization method, device, equipment and storage medium
CN114095671A (en) Cloud conference live broadcast system, method, device, equipment and medium
CN114125358A (en) Cloud conference subtitle display method, system, device, electronic equipment and storage medium
CN113037751B (en) Method and system for creating audio/video receiving stream
CN113891168A (en) Subtitle processing method, subtitle processing device, electronic equipment and storage medium
CN115086686A (en) Video processing method and related device
CN107710754B (en) Audio and video data synchronization method and device
CN112261349A (en) Image processing method and device and electronic equipment
CN113923530B (en) Interactive information display method and device, electronic equipment and storage medium
CN114584822B (en) Synchronous playing method and device, terminal equipment and storage medium
CN107852523B (en) Method, terminal and equipment for synchronizing media rendering between terminals
CN111787226B (en) Remote teaching method, device, electronic equipment and medium
CN113542688B (en) Audio and video monitoring method, device, equipment, storage medium and system
CN113542792B (en) Audio merging method, audio uploading method, device and program product
CN112330996A (en) Control method, device, medium and electronic equipment for live broadcast teaching
CN112383810A (en) Lyric video display method and device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210312