CN116915937A - Multipath video input method and system - Google Patents

Publication number
CN116915937A
Authority
CN
China
Prior art keywords
video
video streams
receiving end
code rate
streams
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311011211.0A
Other languages
Chinese (zh)
Inventor
刘丽君
赵兴国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sailian Information Technology Co ltd
Original Assignee
Shanghai Sailian Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sailian Information Technology Co ltd filed Critical Shanghai Sailian Information Technology Co ltd
Priority to CN202311011211.0A priority Critical patent/CN116915937A/en
Publication of CN116915937A publication Critical patent/CN116915937A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a multipath video input method and system. The method comprises: receiving at least two video streams as original video streams; selecting, based on the requirement of a receiving end, at least two video streams from the original video streams as selected video streams, and selecting a code rate for each selected video stream to obtain video streams with specified code rates; transmitting the video streams with the specified code rates to the receiving end; and laying out, by the receiving end, the video pictures corresponding to the video streams with the specified code rates and displaying the laid-out video pictures. The method and system allow video quality to be adjusted flexibly according to the requirement of the receiving end, ensure stable transmission of higher-priority video streams, and improve the user's viewing experience.

Description

Multipath video input method and system
Technical Field
The invention relates to the technical field of video communication, in particular to a multipath video input method and a multipath video input system.
Background
Internet-based video communication technology is widely used in video conference scenes for work and life. Where multiple video inputs (typically referred to as the ability to simultaneously receive and process multiple video signals in one system or device) are used to simultaneously process video data from multiple cameras or sources, such as video stream composition, video segmentation, object tracking, etc., more options and possibilities are provided for the fields of video conferencing systems, surveillance systems, real-time video processing, and virtual and augmented reality.
In a video conferencing system, multiple video inputs allow multiple participants to share their video simultaneously so that other participants can view multiple video pictures simultaneously. This is important for intra-enterprise teleconferencing, cross-regional team collaboration, and distance education scenarios.
In a monitoring system, multiple paths of video inputs can simultaneously receive video signals from multiple monitoring cameras, so that monitoring personnel can simultaneously and comprehensively monitor multiple areas, safety is improved, and potential risks are prevented.
In real-time video processing, for example on live video platforms or at sporting events, multiple video inputs may simultaneously receive pictures from multiple cameras and combine them in real time into a unified video stream that gives viewers a multi-angle viewing experience. As another example, multi-channel video input is used in remote medicine, allowing remote consultation between a medical professional and a patient, or remote medical and/or visual teaching between a medical professional and other doctors, e.g., real-time transmission of the operating room panorama, B-mode ultrasound or electrocardiogram images, and the surgical field.
Virtual reality and augmented reality: in Virtual Reality (VR) and Augmented Reality (AR) applications, multiple video inputs may provide video pictures from multiple perspectives, enhancing the user's immersive and interactive experience.
In the prior art, multi-path video input is based on the AVC (Advanced Video Coding) architecture: the control unit receives multiple video streams but outputs only one video stream to the receiving end. The receiving end is effectively in a forced conference mode; it can only view the video picture already laid out by the control unit and cannot itself adjust the picture layout or the definition of each sub-picture.
Moreover, AVC is a single-stream video coding standard in which each video stream is coded independently. It usually generates a fixed code stream that cannot be dynamically adjusted according to network conditions and device capabilities. It is not scalable and cannot adjust decoding complexity to the performance and bandwidth of the receiving end, so video quality may degrade or playback may stall when the network is unstable or bandwidth is limited.
In short, the AVC-based multi-path video input system of the prior art is inflexible. First, the receiving end cannot adjust the layout itself as required. Second, the video quality at the receiving end cannot be dynamically adjusted according to the network condition and device capability of the receiving end, and the picture becomes corrupted as soon as packet loss occurs. Third, the encoding and layout tasks are borne by the control unit, which places high demands on the control unit; the number of sending ends and receiving ends it can handle is small, the video stream capacity is small, and scalability is poor.
Disclosure of Invention
Unlike the prior art, in which the receiving end can receive only one video stream even though multiple video streams are collected, the present invention provides a multi-path video input method and system based on the SVC framework. The sending end can transmit multiple video streams to the receiving end simultaneously, and video streams with specified code rates can be dynamically selected for sending based on the layout setting, device capability and network condition of the receiving end, so that video quality is adjusted flexibly according to the requirement of the receiving end.
In a first aspect, the present invention provides a multi-path video input method, which is characterized in that a multi-path video input system is provided, and the multi-path video input system includes a receiving end, and the method includes:
receiving at least two paths of video streams as original video streams;
selecting at least two paths of video streams from the original video streams as selected video streams based on the requirement of a receiving end, and selecting the code rate of the selected video streams to obtain video streams with specified code rates;
transmitting the video stream with the specified code rate to the receiving end;
and the receiving end lays out the video picture corresponding to the video stream with the specified code rate and displays the laid video picture.
In a second aspect, the present invention further provides a multi-path video input system, characterized in that the multi-path video input system includes a processing device and a receiving end, where the processing device includes a receiving unit, a selecting unit and a sending unit, wherein:
the receiving unit is used for receiving at least two paths of video streams as original video streams;
the selecting unit is used for selecting at least two paths of video streams from the original video streams to serve as selected video streams based on the requirement of a receiving end, and selecting the code rate of the selected video streams to obtain video streams with specified code rates;
the sending unit is used for sending the video stream with the specified code rate to the receiving end;
the receiving end is used for laying out the video pictures corresponding to the video stream with the specified code rate and displaying the laid video pictures.
The multi-path video input method and the system based on the SVC framework provided by the invention have the advantages that the sending end can simultaneously transmit the multi-path video streams to the receiving end, which is different from the prior art that only one path of video stream combined by the control unit can be transmitted; the control unit does not need to bear the task of coding layout, has low requirement on the control unit, can cope with a large number of sending ends and receiving ends, has large video stream capacity and strong expansibility; the multipath video streams can adapt to different layout settings, equipment capacities and network conditions of the receiving end, and the matched specified code rate video streams are dynamically selected in real time to be sent, so that the problems of slow video buffer loading and the like when network connection is slower are avoided, the flexible adjustment of video quality according to the requirements of the receiving end is effectively realized, the stability of transmission of video streams with higher priority (such as main picture video streams) is ensured, and the watching experience and watching effect of a user are improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a multi-channel video input method provided by an embodiment of the invention;
FIG. 2 is a block diagram of a multi-channel video input system provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the architecture of a video conference SVC in the prior art.
Detailed Description
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Summary of The Invention
As described above, the present invention provides a method and a system for dynamically adjusting multiple video inputs of a video stream in real time, which avoid the problems of slow video buffer loading when network connection is slower, effectively realize flexible adjustment of video quality according to the requirement of a receiving end, ensure the stability of video stream (such as a main picture video stream) transmission with higher priority, and improve the viewing experience and viewing effect of users.
Exemplary method
Fig. 1 is a flowchart of a multi-channel video input method according to an embodiment of the present invention, where the method is based on SVC (Scalable Video Coding) architecture.
Unlike the AVC architecture, which does not have scalability, SVC is a multi-stream video coding standard, and a video stream can be divided into a plurality of progressive sub-streams. Each substream is a separate coding layer with scalability and one or more layers of the multiple substreams can be dynamically selected for decoding based on device capabilities (processing, storage, decoding and network connectivity). This feature enables SVC to provide better video quality and higher coding efficiency while accommodating different network conditions and devices.
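As a rough illustration of the layered selection described above, the following Python sketch picks as many progressive sub-streams as fit a receiver's bitrate budget. The `Layer` type, the layer names and the per-layer bitrates are hypothetical and are not taken from the patent; an actual SVC decoder selects layers through the bitstream's own signalling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str   # hypothetical label, e.g. "base" or "enh1"
    kbps: int   # bitrate contributed by this layer alone (assumed values)


def decodable_layers(layers, budget_kbps):
    """Return the longest prefix of progressive layers that fits the budget.

    Layers are ordered base layer first; each enhancement layer is only
    useful when every layer before it is also decoded.
    """
    chosen, total = [], 0
    for layer in layers:
        if total + layer.kbps > budget_kbps:
            break
        chosen.append(layer)
        total += layer.kbps
    return chosen


layers = [Layer("base", 300), Layer("enh1", 1700), Layer("enh2", 3000)]
selected = decodable_layers(layers, budget_kbps=2500)
```

With a 2500 kbps budget the sketch keeps the base layer and the first enhancement layer and drops the second.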
For ease of understanding, the architecture principles of the video conferencing SVC are described below in connection with fig. 3.
In the figure, three conference terminals A-C respectively send a blue figure, a green figure and a purple figure, each at several resolutions, to a background MCU. The MCU forwards video pictures of a particular resolution to each conference terminal according to that terminal's actual demand; the three white frames in conference terminals A-C are the display pictures corresponding to those video pictures. Conference terminal A selects the large, high-resolution green figure to encode and display; conference terminal B selects the small, low-resolution blue figure and the small purple figure; and conference terminal C selects the medium-size, medium-resolution green figure and the small blue figure.
With SVC, first, the conference terminals splice the pictures directly: each conference terminal is responsible for encoding its own picture, while the background MCU only forwards streams and does not splice pictures. The MCU does not need to encode multiple times, so the demands on it are low. Second, the raw material for encoding is the set of pictures at various resolutions sent by each conference terminal. A conference terminal directly selects a large, high-resolution picture when it needs a large window and a small, low-resolution picture when it needs a small window. The selection can change in real time and is resistant to packet loss: when packet loss occurs, the resolution is reduced automatically to keep the picture smooth. SVC therefore encodes only once, giving high video quality after encoding, low network delay, strong network adaptability, a large terminal capacity and strong scalability. Pictures of different resolutions can be selected flexibly, and superposition of both video pictures and shared content is supported.
SVC also supports multi-scenario cascading across public cloud, private cloud and hybrid cloud, and each cascaded conference can be hosted by a different cloud service. The two cascaded conferences can come from any combination of public, private and hybrid clouds, for example public cloud to public cloud, public cloud to private cloud, or private cloud A to private cloud B. Functionally, cascading of basic video conference functions such as audio and video pictures and shared content is supported. Because the SVC scheme is adopted, the invention inherits the characteristics of SVC and supports ultra-large-scale concurrency, ultra-low delay, and resistance to network packet loss and jitter.
Specifically, a multi-channel video input system is provided, which comprises a transmitting end, a control unit and a receiving end.
The transmitting end is equipment or software for collecting and transmitting video data in the video communication or transmission process, and the equipment or software comprises, but is not limited to, a computer, a mobile phone, a tablet personal computer, a vehicle-mounted video conference terminal and the like.
The transmitting end comprises at least two cameras. The cameras include, but are not limited to, consumer cameras, security cameras, sports cameras, industrial cameras, infrared cameras, fish-eye cameras, 3D cameras. The camera can be connected with the transmitting end through wires or wirelessly.
The control unit (a video conference media multipoint control unit, MCU, Multipoint Control Unit) is responsible for managing the forwarding of multiple video streams and for rate control. The control unit may be a conventional MCU (typically a hardware device) or a cloud-based virtual MCU. Compared with the conventional MCU, the cloud-based virtual MCU solution makes the video conference more flexible and extensible.
The receiving end is equipment or software for receiving video data in the video communication or transmission process, and the equipment or software comprises, but is not limited to, a computer, a mobile phone, a tablet personal computer, a television, a vehicle-mounted video conference terminal, a monitoring display, a virtual reality head display and the like.
The following steps S101-S103 are performed by the control unit.
This embodiment comprises the steps of:
S101: at least two video streams are received as original video streams.
The original video streams are the video streams that the sending end has selected to send, and each of the original video streams may comprise video streams with different code rates from the same camera. By default, the name of a video stream and of its corresponding picture is the name of the corresponding camera or camera interface.
For example, the original video streams include 4 video streams A-D acquired by cameras A-D, respectively, wherein each video stream includes four code rates of 300 kbps, 2 Mbps, 5 Mbps and 12 Mbps for standard definition 480p, high definition 720p, full high definition 1080p and ultra high definition 4K display, respectively.
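One way to picture the original video streams in this example is as a small data model. The sketch below is only illustrative: the class name `OriginalStream` and the constant `BITRATE_LADDER` are assumptions introduced here, while the code rates and resolutions come from the example.

```python
from dataclasses import dataclass

# Code rate (kbps) -> target display, as listed in the example.
BITRATE_LADDER = {300: "480p", 2000: "720p", 5000: "1080p", 12000: "4K"}


@dataclass
class OriginalStream:
    camera: str                                 # camera or camera interface name
    rates_kbps: tuple = tuple(BITRATE_LADDER)   # code rates offered by this stream

    @property
    def name(self) -> str:
        # By default a stream is named after the camera that produced it.
        return self.camera


# Four original video streams A-D, one per camera.
original_streams = [OriginalStream(camera) for camera in "ABCD"]
```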
The step S101 further includes:
first, the transmitting end selects at least two cameras and configures parameters of the cameras.
Wherein, the sending end can select one or a plurality of cameras connected with the sending end, and can also select all cameras.
In particular, the camera parameters include, but are not limited to, camera position, resolution, frame rate, focal length, aperture, focus mode, white balance, automatic gain control, automatic exposure, and the like.
The camera position generally comprises a location (e.g., high, low, front, side or rear) and an angle (e.g., top view, bottom view, eye-level view or oblique view).
For example, in surgical vision teaching, the fisheye camera that shoots the operating room panorama is set to a short focal length so that a wide-angle, wide-range view can be captured. For the camera that shoots the surgical field (the specific area on which the doctor operates during surgery), a suitably high resolution of 1080p or 4K is selected to provide clear images that capture key operative details; a high frame rate of 30 fps or more ensures smooth motion; and high color reproducibility ensures that colors in the surgical field are restored accurately and the captured video looks true to life.
And secondly, respectively acquiring at least two paths of video streams by utilizing the at least two cameras, and selecting at least two paths of video streams from the at least two paths of video streams to transmit.
For example, the transmitting end is connected with 7 cameras a-g, and selects to perform parameter configuration on the camera a and the camera d, and respectively collects 5 paths of video streams by using the camera a, the camera b, the camera d, the camera f and the camera g, and selects 3 paths of video streams from the 5 paths of video streams to transmit, such as video stream a, video stream b and video stream f collected by the camera a, the camera b and the camera f respectively.
Specifically, at least two video streams may be automatically or manually selected for transmission according to a specific rule.
The specific rule includes selecting a video stream with a fluctuation range of shooting content exceeding a certain threshold, for example, in a security camera monitoring system, a person enters a monitoring area a to cause the fluctuation range of shooting content of a camera a to exceed the threshold, and a transmitting end can automatically select the video stream acquired by the camera a to transmit. Setting a suitable fluctuation range threshold is important for accurately capturing critical events or motions, and specifically, the threshold can be adjusted to adapt to different shooting contents and targets according to application scenes and requirements.
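The fluctuation rule can be sketched as a frame-difference check. The patent does not say how the fluctuation range is measured; the mean absolute pixel difference used below, the function names and the threshold value are all assumptions chosen for illustration.

```python
def fluctuation(prev_frame, cur_frame):
    """Mean absolute per-pixel difference of two equal-sized grayscale frames."""
    diffs = [
        abs(a - b)
        for prev_row, cur_row in zip(prev_frame, cur_frame)
        for a, b in zip(prev_row, cur_row)
    ]
    return sum(diffs) / len(diffs)


def streams_to_send(frames_by_camera, threshold):
    """Return the cameras whose picture changed by more than the threshold."""
    return [
        camera
        for camera, (prev, cur) in frames_by_camera.items()
        if fluctuation(prev, cur) > threshold
    ]


# Tiny 2x2 frames: something enters camera a's view, camera b is static.
frames = {
    "a": ([[10, 10], [10, 10]], [[10, 200], [10, 200]]),
    "b": ([[10, 10], [10, 10]], [[10, 10], [10, 10]]),
}
triggered = streams_to_send(frames, threshold=20)
```

Raising or lowering `threshold` corresponds to tuning the rule for different scenes, as the paragraph above notes.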
The specific rule may further comprise selecting a video stream having a sound variation amplitude exceeding a certain threshold. In this case, the video stream contains not only images but also audio information. The transmitting end detects the sound in the video stream in real time and analyzes the change of the sound. When the sound fluctuation range (such as increase or decrease of volume) in the video stream collected by a certain camera exceeds a preset threshold, the sending end is triggered to select the video stream for sending. For example, in monitoring video, when the volume increase in the video stream c collected by the camera c exceeds the threshold by 100dB, the transmitting end operator manually selects the video stream c to transmit in order to help identify a possible emergency or abnormal situation.
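A comparable sketch for the sound rule follows. It assumes that the volume change is measured as the ratio of RMS levels of two consecutive audio windows, expressed in decibels; this measure, the function names and the 6 dB threshold in the usage lines are illustrative assumptions, not values from the patent.

```python
import math

EPS = 1e-9  # floor that avoids taking the logarithm of zero for silent windows


def rms(samples):
    """Root-mean-square level of a window of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def volume_change_db(prev_window, cur_window):
    """Change in level between two windows, in dB (positive means louder)."""
    return 20 * math.log10(max(rms(cur_window), EPS) / max(rms(prev_window), EPS))


def sound_triggered(prev_window, cur_window, threshold_db):
    """True when the volume rose or fell by more than the threshold."""
    return abs(volume_change_db(prev_window, cur_window)) > threshold_db


quiet = [0.01, -0.01, 0.01, -0.01]
loud = [0.1, -0.1, 0.1, -0.1]
change = volume_change_db(quiet, loud)      # roughly a tenfold amplitude increase
send_stream_c = sound_triggered(quiet, loud, threshold_db=6.0)
```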
The specific rule may further include selecting specific video streams to send according to the level of the receiving end. For example, receiving ends are divided into VIP paying users and normal users, whose rights to view video stream content differ: when the receiving end is a normal user, only two video streams are selected for sending; when the receiving end is a VIP paying user, all four video streams of the sending end are selected.
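The level-based rule amounts to a lookup from receiver level to the number of streams it may view. In the sketch below the names `STREAM_QUOTA` and `streams_for_level` are hypothetical, and the choice of which two streams a normal user receives (the first two listed) is an assumption, since the text does not specify it.

```python
# Receiver level -> number of video streams it may view (from the example).
STREAM_QUOTA = {"normal": 2, "vip": 4}


def streams_for_level(level, sender_streams):
    """Return the streams sent to a receiver of the given level.

    Assumption: streams are listed in priority order, so a lower quota
    simply keeps the first entries.
    """
    return list(sender_streams[: STREAM_QUOTA[level]])


sender_streams = ["a", "b", "c", "d"]
normal_view = streams_for_level("normal", sender_streams)
vip_view = streams_for_level("vip", sender_streams)
```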
S102: and selecting at least two paths of video streams from the original video streams as selected video streams based on the requirement of a receiving end, and selecting the code rate of the selected video streams to obtain the video streams with the specified code rate.
The selected video streams are selected from the original video streams based on the requirement of the receiving end, and each selected video stream is a video stream available at multiple code rates. A video stream with a specified code rate is a selected video stream whose code rate has been determined.
For example, the sending end selects video streams a-d to send as the original video streams. Based on the demand of the receiving end, the control unit selects original video streams a, b and d from the original video streams a-d as the selected video streams, and selects a high code rate, a low code rate and a medium code rate for selected video streams a, b and d respectively. The video streams with specified code rates are then the high-code-rate video stream a, the low-code-rate video stream b and the medium-code-rate video stream d.
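The two selections in step S102 (which streams, and at which code rate) can be sketched as a single mapping step. The label-to-kbps table `RATES_KBPS` and the function name `specify_streams` are assumptions made for illustration; the patent only speaks of high, medium and low code rates here.

```python
# Hypothetical mapping from code-rate label to kbps.
RATES_KBPS = {"low": 300, "medium": 2000, "high": 5000}


def specify_streams(original_names, demand):
    """Return {stream name: kbps} for every demanded stream that is on offer.

    `demand` maps a stream name to the code-rate label the receiving end needs.
    Streams that were not sent as original streams are ignored.
    """
    return {
        name: RATES_KBPS[label]
        for name, label in demand.items()
        if name in original_names
    }


original_names = ["a", "b", "c", "d"]
demand = {"a": "high", "b": "low", "d": "medium"}
specified = specify_streams(original_names, demand)
```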
And step S102 may further include converting the original video stream into a video stream of a plurality of code rates. For example, the original video stream has a code rate between 300kbps and 3Mbps, and the control unit may convert it into four code rates of 300kbps, 2Mbps, 5Mbps and 12Mbps based on the demand of the receiving end.
When the code rate of the original video stream is matched with the requirement of the receiving end, a control unit is not required to perform code rate conversion additionally.
Step S102 may further include: and obtaining the requirements of the receiving end based on the layout setting, the equipment capability and the network condition of the receiving end.
Specifically, the demand of the receiving end may be obtained by the receiving end through comprehensive analysis of its layout setting, device capability and network condition and then sent to the control unit; alternatively, it may be obtained by the control unit directly through comprehensive analysis of the receiving end's layout setting, device capability and network condition. Of these three factors, device capability and network condition take priority, and layout setting is considered second. For example, suppose the layout setting of the receiving end selects video stream a as the main picture and the three video streams b-d as three small picture-in-picture windows, but the network bandwidth is only 1 Mbps. Even if all four video streams use the low code rate of 300 kbps, their total of 1.2 Mbps exceeds the bandwidth and the layout setting cannot be satisfied. In this case the control unit automatically switches to sending only the original main-picture video stream a, at a code rate of 1 Mbps, to the receiving end as the video stream with the specified code rate, and the receiving end displays video stream a in a full-screen layout. That is, when the network condition does not allow the requested layout, display of the main picture is guaranteed first.
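The bandwidth check in this example can be written out as a short sketch. The function name `plan_delivery` and the rule for the feasible case (small windows at the lowest code rate, main picture given the remaining bandwidth) are assumptions; only the fallback to a full-screen main picture comes from the text.

```python
def plan_delivery(main, others, bandwidth_kbps, low_kbps=300):
    """Return the layout to display and a per-stream code-rate cap in kbps."""
    needed = low_kbps * (1 + len(others))   # every stream at the lowest code rate
    if needed > bandwidth_kbps:
        # The network cannot carry the requested layout:
        # keep only the main picture and show it full screen.
        return {"layout": "full screen", "rates": {main: bandwidth_kbps}}
    rates = {name: low_kbps for name in others}             # small windows stay low
    rates[main] = bandwidth_kbps - low_kbps * len(others)   # main picture gets the rest
    return {"layout": "picture-in-picture", "rates": rates}


# The example from the text: 4 x 300 kbps = 1200 kbps > 1000 kbps.
constrained = plan_delivery("a", ["b", "c", "d"], bandwidth_kbps=1000)
roomy = plan_delivery("a", ["b", "c", "d"], bandwidth_kbps=5000)
```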
The network may be a Local Area Network (LAN), a Wide Area Network (WAN), the internet, or other specific network to which the receiving end is connected. Network conditions include, but are not limited to, bandwidth, delay, packet loss rate, network topology, security, and routing and switching.
The device capabilities include processing capabilities, storage capabilities, decoding capabilities, and network connection capabilities.
Processing power: the processor performance and the computing power of the receiving end are referred, for example, the stronger processing power can better process the decoding work of the multipath video stream, thereby realizing higher-quality video output;
Storage capability: the multi-path video stream based on SVC needs larger decoding and buffering space, and the stronger the storage capability of a receiving end is, the more favorable the storage and processing of more video stream data with higher code rate are;
decoding capability: the decoder of the device must support the SVC standard and be capable of decoding multiple layers of multiple video streams, and different decoding capabilities determine whether the receiving end can simultaneously decode multiple video layers, so as to obtain video outputs with different quality;
network connection capability: the network connection capability of the receiving end is crucial to the effect of receiving and decoding multiple video streams, and the higher network connection capability can better receive and transmit data of multiple video layers.
Wherein the layout setting includes layout content and a layout mode.
The layout content refers to specific video stream content required by a receiving end user.
The layout modes include tiling layout, presenter mode, picture-in-picture layout, custom layout, full screen mode and sidebar layout.
Tiling layout: all video pictures are displayed side by side in equal divisions of the screen, typically as a grid. For example, in a small-scale conference, all participants can see the video of other participants at the same time.
Presenter mode: the video picture of the current speaker is enlarged and displayed, and other pictures are smaller or hidden. This draws attention to the current speaker, making the meeting more focused and orderly.
Picture-in-picture layout: one main video picture is displayed in the center of the screen while small window videos of other video pictures are displayed at other positions of the screen. This arrangement enables the participants to focus on both the presenter and other important video pictures.
Custom layout: the user customizes the layout as desired. The user can manually adjust and arrange the different video pictures to meet specific requirements.
Full screen mode: in some cases, the current presenter or some important video frame may be set to a full screen display to highlight emphasis or provide a greater visual effect.
Sidebar layout: the main video picture is displayed on one side of the screen while the widget video of the other video picture is displayed on the other side or bottom. The layout can provide the picture of the presenter and other pictures at the same time, and has more interactivity.
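As an illustration of how a receiving end might compute one of these layouts, the sketch below derives tile rectangles for the tiling layout. The near-square grid rule and the function name `tiled_layout` are assumptions; the patent does not prescribe how the grid is chosen.

```python
import math


def tiled_layout(count, width, height):
    """Return (x, y, w, h) rectangles that tile `count` pictures in a grid.

    Assumes count >= 1 and picks a near-square grid; any leftover cells
    in the last row stay empty.
    """
    cols = math.ceil(math.sqrt(count))
    rows = math.ceil(count / cols)
    w, h = width // cols, height // rows
    return [((i % cols) * w, (i // cols) * h, w, h) for i in range(count)]


# Four pictures on a 1920x1080 screen form a 2x2 grid.
tiles = tiled_layout(4, 1920, 1080)
```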
Specifically, the layout setting can be done by the user dragging pictures or by setting parameters; dragging is more convenient, while parameter setting is more precise. The receiving end user can adjust the layout at any time. For example, a receiving end user who wants to view small picture c in the current picture-in-picture layout more clearly can drag small picture c directly into place as the new main picture; the original main picture automatically switches to the position of the original small picture c and is displayed as a small picture.
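The swap in this example can be sketched as exchanging two slots in a layout table. The slot names (`main`, `small1`, and so on) and the function name `promote` are hypothetical.

```python
def promote(layout, stream):
    """Return a new layout in which `stream` and the main picture swap slots.

    `layout` maps a slot name to the stream shown there and must contain
    a "main" slot.
    """
    slot = next(s for s, shown in layout.items() if shown == stream)
    swapped = dict(layout)
    swapped["main"], swapped[slot] = stream, layout["main"]
    return swapped


layout = {"main": "a", "small1": "b", "small2": "c", "small3": "d"}
after_drag = promote(layout, "c")
```

After the swap, a receiving end would typically also request a higher code rate for the new main picture and a lower one for the demoted picture, in line with step S102.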
S103: and sending the video stream with the specified code rate to the receiving end.
Steps S101-S103 may also be performed by the sender.
For example, the receiving end is classified into different classes, such as high, medium, and low, according to the receiving end device capabilities and network conditions. The transmitting end prepares an original video stream with high resolution and high code rate for a high-level receiving end, and prepares an original video stream with low resolution and low code rate for a low-level receiving end. When the transmitting end and the receiving end establish connection, the transmitting end can dynamically select an original video stream with a certain code rate which is closest to or accords with the equipment capacity and the network condition of the receiving end as a video stream with a specified code rate according to the grade of the receiving end through exchanging equipment information and network conditions, and the original video stream is transmitted to the receiving end. According to the real-time network condition and equipment capability, the transmitting end can continuously adjust the transmitted video stream with the specified code rate so as to adapt to the fluctuation of network bandwidth or the change of the equipment state of the receiving end.
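A minimal sketch of this grading and selection is given below. The grade thresholds, the function names and the rule of picking the highest code rate that does not exceed the usable bandwidth are assumptions; the code rates reuse the four values from the earlier example.

```python
LADDER_KBPS = [300, 2000, 5000, 12000]   # code rates from the earlier example


def receiver_grade(decode_limit_kbps, bandwidth_kbps):
    """Classify a receiving end as high, medium or low (assumed thresholds)."""
    usable = min(decode_limit_kbps, bandwidth_kbps)
    if usable >= 5000:
        return "high"
    if usable >= 2000:
        return "medium"
    return "low"


def closest_rate(usable_kbps, ladder=LADDER_KBPS):
    """Highest code rate that fits; fall back to the lowest if none fits."""
    fitting = [rate for rate in ladder if rate <= usable_kbps]
    return max(fitting) if fitting else min(ladder)


# A receiver that can decode 12 Mbps but only has 3.5 Mbps of bandwidth.
grade = receiver_grade(decode_limit_kbps=12000, bandwidth_kbps=3500)
rate = closest_rate(3500)
```

Calling `closest_rate` again whenever the measured bandwidth changes corresponds to the continuous adjustment described above.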
By selecting and transmitting the specific video stream with the specified code rate according to the grade of the receiving end, the video transmission efficiency can be optimized to the greatest extent, better video quality and smooth playing experience are provided, and the method is applied to the fields of video streaming media, video conferences, live broadcasting and the like.
Compared with the implementation in which steps S101-S103 are executed by the control unit, in the implementation in which they are executed by the sending end, the sending end directly selects the video stream with the specified code rate based on the requirement of the receiving end. The control unit only needs to forward video stream data and information and does not need to select the selected video streams from the original video streams or select their code rates. The processing capability required of the control unit is therefore lower, the number of sending ends and receiving ends that can be handled is larger, the video stream capacity is larger, and scalability is stronger.
S104: The receiving end lays out the video pictures corresponding to the video streams with the specified code rate and displays the laid-out video pictures.
Specifically, the receiving end lays out the video pictures corresponding to the video streams with the specified code rate received from the control unit or the transmitting end.
For example, in a video conference that presents a product, four video streams with specified code rates need to be laid out and displayed at the receiving end. The four video streams are: a high-code-rate video stream at 5 Mbps, with very high image quality, showing the details of the product demonstration; a medium-high-code-rate video stream at 3 Mbps, with high image quality, showing a close-up of the presenter's face; a medium-low-code-rate video stream at 2 Mbps, with average image quality, showing a panoramic view of the conference room; and a low-code-rate video stream at 1 Mbps, with lower image quality, showing the video of remote participants. The receiving end applies a custom layout to the four video streams: it places the high-code-rate video stream in a large picture occupying the left 3/4 of the screen, places the medium-high-code-rate video stream as a picture-in-picture small picture in the lower-right corner of the large picture, places the medium-low-code-rate video stream in the upper-right 1/8, and places the low-code-rate video stream in the lower-right 1/8. The high-code-rate video stream thus occupies the large picture on the left 3/4 and shows product details at higher image quality, the medium-high-code-rate video stream shows the presenter close-up inside the large picture using a picture-in-picture layout, and the medium-low-code-rate and low-code-rate video streams together occupy the remaining 1/4 on the right and show the panoramic picture and the video of remote participants.
The receiving-end user can watch multiple video streams at the same time. Because the picture layout is customized according to the division and priority of the pictures, the user can focus on the important content while the panorama and the remote participants remain visible, which increases the interactivity of the conference and the sense of participation.
The layout of the receiving end can be flexibly adjusted according to the requirements of the user of the receiving end, so that better user experience and viewing effect are provided.
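The four-stream custom layout from the example above can be expressed as pixel rectangles. The sketch below is illustrative only: the `Rect` type, the `conference_layout` function, the stream keys, and the picture-in-picture scale of 1/4 are assumptions, since the embodiment does not specify the size of the small picture.

```python
from typing import Dict, NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def conference_layout(canvas_w: int, canvas_h: int, pip_scale: float = 0.25) -> Dict[str, Rect]:
    """Compute tile rectangles for the four-stream example layout."""
    main_w = canvas_w * 3 // 4        # large picture: left 3/4 of the screen
    side_w = canvas_w - main_w        # right column: remaining 1/4
    half_h = canvas_h // 2            # right column split into two 1/8 tiles
    pip_w = int(main_w * pip_scale)   # assumed picture-in-picture size
    pip_h = int(canvas_h * pip_scale)
    return {
        "high": Rect(0, 0, main_w, canvas_h),
        # small picture in the lower-right corner of the large picture
        "medium_high": Rect(main_w - pip_w, canvas_h - pip_h, pip_w, pip_h),
        "medium_low": Rect(main_w, 0, side_w, half_h),
        "low": Rect(main_w, half_h, side_w, canvas_h - half_h),
    }


for name, rect in conference_layout(1920, 1080).items():
    print(name, rect)
```

On a 1920x1080 canvas each right-hand tile is 480x540 pixels, which is 1/8 of the screen area, matching the proportions in the example.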
In summary, the multi-path video input method based on the SVC architecture has strong code rate adaptability and can dynamically select the video stream with the specified code rate according to the layout setting, network condition, and device capability of the receiving end. For example, when bandwidth is limited or device performance is low, the code rate can be reduced to keep the video smooth and watchable, while with sufficient bandwidth or on a high-performance device, a higher code rate can be provided to obtain a higher-quality video picture. By jointly considering the user's subjective layout requirements and the objective network and device conditions of the receiving end, the method provides a better experience for real-time video transmission.
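One way to picture this adaptation is a greedy allocation of a bandwidth budget across the selected streams, upgrading the streams that matter most in the layout first. The sketch below is an assumption-laden illustration rather than the method of the embodiment: the function `allocate_rates`, the greedy strategy, and the example code-rate ladders are all hypothetical.

```python
from typing import Dict, List


def allocate_rates(
    ladders: Dict[str, List[int]],  # stream name -> available code rates in kbit/s
    priority: List[str],            # stream names, most important layout tile first
    budget_kbps: int,
) -> Dict[str, int]:
    """Start every stream at its lowest rate, then upgrade in priority order."""
    choice = {name: min(rates) for name, rates in ladders.items()}
    spent = sum(choice.values())
    for name in priority:
        for rate in sorted(ladders[name]):
            extra = rate - choice[name]
            # Upgrade one step at a time while the total stays within budget.
            if extra > 0 and spent + extra <= budget_kbps:
                spent += extra
                choice[name] = rate
    return choice


ladders = {
    "product": [1000, 3000, 5000],
    "presenter": [500, 1500, 3000],
    "room": [500, 1000, 2000],
    "remote": [300, 600, 1000],
}
order = ["product", "presenter", "room", "remote"]
print(allocate_rates(ladders, order, budget_kbps=8000))
```

If the budget is below the sum of the lowest rates, this sketch still returns the lowest rate for every stream; a real system would instead have to drop a stream or pause it.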
Exemplary System
Correspondingly, an embodiment of the invention also provides a multi-path video input system. Fig. 2 is a block diagram of a multi-path video input system according to an embodiment of the present invention. As shown in Fig. 2, the system 100 of this embodiment includes a processing apparatus 101 and a receiving end 102, the processing apparatus 101 comprising a receiving unit 103, a selecting unit 104, and a sending unit 105, wherein:
the receiving unit 103 is configured to receive at least two paths of video streams as original video streams;
the selecting unit 104 is configured to select at least two paths of video streams from the original video streams as selected video streams based on a requirement of the receiving end, and to select a code rate of the selected video streams to obtain video streams with a specified code rate;
the sending unit 105 is configured to send the video streams with the specified code rate to the receiving end;
the receiving end 102 is configured to lay out the video pictures corresponding to the video streams with the specified code rate, and to display the laid-out video pictures.
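The division of the processing apparatus into receiving, selecting, and sending units can be sketched as a small class. This is a simplified illustration under stated assumptions: the class and field names (`VideoStream`, `Requirement`, `ProcessingApparatus`), the per-stream code-rate cap, and the use of a plain list to stand in for the receiving end are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VideoStream:
    source: str  # e.g. a camera name
    kbps: int    # code rate in kbit/s


@dataclass
class Requirement:
    wanted_sources: List[str]   # which pictures the receiving end wants to lay out
    max_kbps_per_stream: int    # assumed cap derived from device and network state


class ProcessingApparatus:
    def __init__(self) -> None:
        self.original: Dict[str, List[VideoStream]] = {}

    def receive(self, stream: VideoStream) -> None:
        """Receiving unit: store an incoming stream as an original video stream."""
        self.original.setdefault(stream.source, []).append(stream)

    def select(self, req: Requirement) -> List[VideoStream]:
        """Selecting unit: pick the wanted sources, then a code rate for each."""
        if len(req.wanted_sources) < 2:
            raise ValueError("at least two video streams must be selected")
        selected = []
        for source in req.wanted_sources:
            variants = self.original[source]
            fitting = [v for v in variants if v.kbps <= req.max_kbps_per_stream]
            # Highest rate that fits, otherwise fall back to the lowest available.
            best = max(fitting, key=lambda v: v.kbps) if fitting else min(variants, key=lambda v: v.kbps)
            selected.append(best)
        return selected

    def send(self, streams: List[VideoStream], receiving_end: List[VideoStream]) -> None:
        """Sending unit: deliver the specified-code-rate streams to the receiving end."""
        receiving_end.extend(streams)
```

The same class could run in either the control unit or the transmitting end, since it does not depend on where the original streams come from.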
The multi-path video input system 100 further comprises a transmitting end 106, wherein the transmitting end 106 comprises at least two cameras 107 and a selection transmitting unit 108, wherein:
the at least two cameras 107 are used for respectively acquiring at least two paths of video streams;
the selection transmitting unit 108 is configured to select at least two paths of video streams from the collected video streams for transmission.
The selection transmitting unit 108 is configured to select at least two paths of video streams for transmission according to a specific rule.
Specifically, the rule is to select video streams whose captured content varies by more than a certain threshold.
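The threshold rule above can be illustrated with a simple frame-difference measure. The embodiment does not say how the variation of captured content is measured, so the mean absolute pixel difference used here, along with the function names and the toy four-pixel frames, is purely an assumption for illustration.

```python
from typing import Dict, List, Sequence

Frame = Sequence[int]  # flattened grayscale pixel values, an assumed representation


def mean_abs_diff(prev: Frame, curr: Frame) -> float:
    """Average absolute per-pixel change between two frames of equal size."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)


def fluctuation(frames: List[Frame]) -> float:
    """Average frame-to-frame change over a short window of frames."""
    if len(frames) < 2:
        return 0.0
    diffs = [mean_abs_diff(p, c) for p, c in zip(frames, frames[1:])]
    return sum(diffs) / len(diffs)


def select_active_streams(streams: Dict[str, List[Frame]], threshold: float) -> List[str]:
    """Keep only the streams whose content varies by more than the threshold."""
    return [name for name, frames in streams.items() if fluctuation(frames) > threshold]


streams = {
    "cam_a": [[0, 0, 0, 0], [10, 10, 10, 10], [0, 0, 0, 0]],  # changing scene
    "cam_b": [[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]],      # static scene
    "cam_c": [[0, 0, 0, 0], [0, 40, 0, 0], [0, 0, 0, 0]],     # local motion
}
print(select_active_streams(streams, threshold=5.0))
```

Because the method requires at least two streams to be transmitted, a practical version would need a fallback (for example, keeping the two most active streams) when fewer than two exceed the threshold.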
The cameras 107 include: ordinary consumer-grade cameras, security cameras, action cameras, industrial cameras, infrared cameras, fisheye cameras, and 3D cameras.
Each of the original video streams is a video stream with a different code rate from the same camera 107.
The processing apparatus 101 further comprises a requirement acquisition unit 109 for obtaining a requirement of the receiving end based on the layout setting, the device capabilities and the network conditions of the receiving end.
The device capabilities include processing capabilities, storage capabilities, decoding capabilities, and network connection capabilities;
the layout setting comprises layout content and a layout mode;
the layout modes include tiled layout, presenter mode, picture-in-picture layout, custom layout, full-screen mode, and sidebar layout.
The layout setting is performed by the user through dragging.
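How a requirement might be derived from the layout setting, device capability, and network condition can be sketched as follows. Everything here is an assumption for illustration: the type names, the choice to model only decoding capability, and the idea of splitting bandwidth in proportion to each tile's share of the screen are not taken from the embodiment.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class LayoutMode(Enum):
    TILED = "tiled"
    PRESENTER = "presenter"
    PICTURE_IN_PICTURE = "picture_in_picture"
    CUSTOM = "custom"
    FULL_SCREEN = "full_screen"
    SIDEBAR = "sidebar"


@dataclass
class LayoutSetting:
    mode: LayoutMode
    tiles: Dict[str, float]  # layout content: stream name -> share of screen area


@dataclass
class DeviceCapability:
    max_decode_streams: int  # only decoding capability is modeled in this sketch
    max_decode_height: int


def derive_requirement(
    layout: LayoutSetting, device: DeviceCapability, bandwidth_kbps: int, screen_height: int
) -> Dict[str, Dict[str, int]]:
    """Turn layout, device capability, and network condition into per-stream limits."""
    # Keep the largest tiles if the device cannot decode every requested stream.
    names = sorted(layout.tiles, key=layout.tiles.get, reverse=True)[: device.max_decode_streams]
    total_share = sum(layout.tiles[n] for n in names)
    return {
        n: {
            "max_kbps": int(bandwidth_kbps * layout.tiles[n] / total_share),
            "max_height": min(device.max_decode_height, screen_height),
        }
        for n in names
    }
```

For instance, with tiles of 0.75, 0.125, and 0.125, a device that decodes at most two streams, and 7000 kbit/s of bandwidth, the largest tile would be allowed 6000 kbit/s and the next one 1000 kbit/s.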
The multi-path video input system 100 further includes a control unit 110;
the processing apparatus 101 is located in the control unit 110 or the transmitting end 106.
The video stream name and the name of the picture corresponding to the video stream default to the name of the corresponding camera 107 or camera interface.
The transmitting end 106 further includes a parameter configuration unit 111, which is configured to select at least two cameras 107 and configure the parameters of the cameras.
The camera parameters comprise camera position, resolution, frame rate, focal length, aperture, focusing mode, white balance, automatic gain control and automatic exposure.
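The camera parameters listed above map naturally onto a configuration record. The sketch below is a hypothetical illustration: the `CameraConfig` fields, their units and default values, and the `configure_cameras` helper are assumptions and do not describe the parameter configuration unit 111 itself.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class CameraConfig:
    position: str                            # e.g. "front of room"
    resolution: Tuple[int, int] = (1920, 1080)
    frame_rate: int = 30                     # frames per second
    focal_length_mm: float = 4.0
    aperture_f: float = 2.0                  # f-number
    focus_mode: str = "auto"
    white_balance: str = "auto"
    auto_gain_control: bool = True
    auto_exposure: bool = True

    def __post_init__(self) -> None:
        # Basic sanity checks on the numeric parameters.
        if self.frame_rate <= 0:
            raise ValueError("frame_rate must be positive")
        if min(self.resolution) <= 0:
            raise ValueError("resolution must be positive")


def configure_cameras(configs: Dict[str, CameraConfig]) -> Dict[str, CameraConfig]:
    """Accept a selection of cameras only if at least two are configured."""
    if len(configs) < 2:
        raise ValueError("at least two cameras must be selected")
    return dict(configs)


selected = configure_cameras({
    "cam0": CameraConfig(position="front"),
    "cam1": CameraConfig(position="side", resolution=(1280, 720), frame_rate=60),
})
print(sorted(selected))
```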
The system 100 is based on the SVC architecture.
The selecting unit 104 is further configured to convert the original video streams into video streams with multiple code rates.
It should be noted that although the operations of the multi-path video input method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Furthermore, although several devices, units, or modules of the multi-path video input system are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, according to embodiments of the present invention, the features and functions of two or more modules described above may be embodied in one module. Conversely, the features and functions of one module described above may be further divided among a plurality of modules.
While the spirit and principles of the present invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, and that the division into aspects does not mean that features of those aspects cannot be combined to advantage; the division is made merely for convenience of description. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
The invention provides:
1. A multi-path video input method, wherein a multi-path video input system comprising a receiving end is provided, the method comprising:
receiving at least two paths of video streams as original video streams;
selecting at least two paths of video streams from the original video streams as selected video streams based on the requirement of a receiving end, and selecting the code rate of the selected video streams to obtain video streams with specified code rates;
transmitting the video stream with the specified code rate to the receiving end;
and the receiving end lays out the video picture corresponding to the video stream with the specified code rate and displays the laid video picture.
2. The multi-path video input method according to claim 1, wherein the multi-path video input system further comprises a transmitting end, and the transmitting end comprises at least two cameras;
the step of receiving at least two video streams as original video streams further includes:
and the transmitting end respectively collects at least two paths of video streams by utilizing the at least two cameras and selects at least two paths of video streams from the video streams to transmit.
3. The multi-path video input method according to claim 2, wherein the step of selecting at least two paths of video streams for transmission comprises: and selecting at least two paths of video streams to transmit according to a specific rule.
4. The multi-path video input method of claim 2 or 3, wherein the specific rule includes:
selecting video streams whose captured content varies by more than a certain threshold.
5. The multi-path video input method according to any one of claims 2 to 4, wherein the cameras include: ordinary consumer-grade cameras, security cameras, action cameras, industrial cameras, infrared cameras, fisheye cameras, and 3D cameras.
6. The multi-path video input method according to any one of claims 2 to 5, wherein each path of video stream in the original video stream is a video stream of a different code rate from the same camera.
7. The multi-path video input method according to any one of claims 1 to 6, wherein, before the step of selecting at least two paths of video streams from the original video streams as selected video streams based on the requirement of the receiving end and selecting the code rate of the selected video streams to obtain the video streams with the specified code rate, the method further comprises: obtaining the requirement of the receiving end based on the layout setting, the device capabilities, and the network conditions of the receiving end.
8. The multi-way video input method of claim 7 wherein the device capabilities include processing capabilities, storage capabilities, decoding capabilities, and network connection capabilities;
the layout setting comprises layout content and a layout mode;
the layout modes include tiled layout, presenter mode, picture-in-picture layout, custom layout, full-screen mode, and sidebar layout.
9. The multi-path video input method of claim 7 or 8, wherein the layout setting is accomplished by user dragging.
10. The multi-path video input method according to claim 1, wherein the multi-path video input system further comprises a control unit;
the control unit or the transmitting end performs the steps of: receiving at least two paths of video streams as original video streams; selecting at least two paths of video streams from the original video streams as selected video streams based on the requirement of the receiving end, and selecting the code rate of the selected video streams to obtain video streams with specified code rates; and transmitting the video streams with the specified code rate to the receiving end.
11. The multi-path video input method according to any one of claims 2 to 10, wherein the video stream name and the picture name corresponding to the video stream default to a corresponding camera or camera interface name.
12. The multi-path video input method according to any one of claims 2 to 11, wherein the step of the transmitting end respectively acquiring at least two paths of video streams by using the at least two cameras and selecting at least two paths of video streams from the at least two paths of video streams for transmission specifically comprises:
the transmitting end selects at least two cameras and carries out camera parameter configuration on the cameras;
and respectively acquiring at least two paths of video streams by using the selected at least two cameras, and selecting at least two paths of video streams from the at least two paths of video streams to transmit.
13. The method of claim 12, wherein the camera parameters include camera position, resolution, frame rate, focal length, aperture, focus mode, white balance, automatic gain control, automatic exposure.
14. The multi-path video input method of any one of claims 1-13, wherein the method is based on an SVC architecture.
15. The multi-path video input method according to any one of claims 1 to 14, wherein the step of selecting at least two paths of video streams from the original video streams as selected video streams based on the demand of the receiving end, and selecting the code rate of the selected video streams, and obtaining the video stream with the specified code rate further comprises: and converting the original video stream into video streams with various code rates.
16. A multi-path video input system, characterized by comprising a processing device and a receiving end, wherein the processing device comprises a receiving unit, a selecting unit, and a sending unit, wherein:
the receiving unit is used for receiving at least two paths of video streams as original video streams;
the selecting unit is used for selecting at least two paths of video streams from the original video streams to serve as selected video streams based on the requirement of a receiving end, and selecting the code rate of the selected video streams to obtain video streams with specified code rates;
the sending unit is used for sending the video stream with the specified code rate to the receiving end;
the receiving end is used for laying out the video pictures corresponding to the video stream with the specified code rate and displaying the laid video pictures.
17. The multi-path video input system of claim 16, further comprising a transmitting end, wherein the transmitting end comprises at least two cameras and a selection transmitting unit, wherein:
the at least two cameras are used for respectively collecting at least two paths of video streams;
the selection transmitting unit is used for selecting at least two paths of video streams from the acquired video streams to transmit.
18. The multi-channel video input system of claim 17 wherein the selection transmitting unit is configured to select at least two channels of video streams for transmission according to a specific rule.
19. The multi-path video input system of claim 17 or 18, wherein the specific rule is to select video streams whose captured content varies by more than a certain threshold.
20. The multi-path video input system of any one of claims 17-19, wherein the cameras include: ordinary consumer-grade cameras, security cameras, action cameras, industrial cameras, infrared cameras, fisheye cameras, and 3D cameras.
21. The multi-path video input system of any one of claims 17-20, wherein each of the original video streams is a video stream of a different code rate from the same camera.
22. The multi-path video input system according to any one of claims 16 to 21, wherein the processing apparatus further comprises a requirement acquisition unit for obtaining a requirement of the receiving end based on a layout setting, a device capability, and a network condition of the receiving end.
23. The multi-way video input system of claim 22 wherein the device capabilities include processing capabilities, storage capabilities, decoding capabilities, and network connection capabilities;
the layout setting comprises layout content and a layout mode;
the layout modes include tiled layout, presenter mode, picture-in-picture layout, custom layout, full-screen mode, and sidebar layout.
24. The multi-way video input system of claim 22 or 23 wherein the layout setting is accomplished by user drag.
25. The multiple video input system of claim 16, further comprising a control unit;
the processing device is located in the control unit or the transmitting end.
26. The multi-way video input system of any one of claims 17-25 wherein the video stream names and picture names corresponding to the video streams default to corresponding camera or camera interface names.
27. The multi-channel video input system of any one of claims 17-26, wherein the transmitting end further comprises a parameter configuration unit for selecting at least two cameras and configuring the camera parameters thereof.
28. The multi-way video input system of claim 27 wherein the camera parameters include camera position, resolution, frame rate, focal length, aperture, focus mode, white balance, automatic gain control, automatic exposure.
29. The multiple video input system of any one of claims 16-28, wherein the system is based on an SVC architecture.
30. The multi-path video input system of any one of claims 16 to 29, wherein the selection unit is further configured to convert the original video stream into a video stream of a plurality of code rates.

Claims (10)

1. A multi-path video input method, wherein a multi-path video input system comprising a receiving end is provided, the method comprising:
receiving at least two paths of video streams as original video streams;
selecting at least two paths of video streams from the original video streams as selected video streams based on the requirement of a receiving end, and selecting the code rate of the selected video streams to obtain video streams with specified code rates;
transmitting the video stream with the specified code rate to the receiving end;
and the receiving end lays out the video picture corresponding to the video stream with the specified code rate and displays the laid video picture.
2. The multi-path video input method of claim 1, wherein the multi-path video input system further comprises a transmitting end comprising at least two cameras;
the step of receiving at least two video streams as original video streams further includes:
and the transmitting end respectively collects at least two paths of video streams by utilizing the at least two cameras and selects at least two paths of video streams from the video streams to transmit.
3. The multi-path video input method according to claim 2, wherein the step of selecting at least two paths of video streams for transmission comprises: and selecting at least two paths of video streams to transmit according to a specific rule.
4. A multi-path video input method as claimed in claim 2 or 3, wherein the specific rule comprises:
selecting video streams whose captured content varies by more than a certain threshold.
5. The multi-path video input method according to any one of claims 2 to 4, wherein the cameras include: ordinary consumer-grade cameras, security cameras, action cameras, industrial cameras, infrared cameras, fisheye cameras, and 3D cameras.
6. The multi-path video input method according to any one of claims 2 to 5, wherein each of the original video streams is a video stream of a different code rate from the same camera.
7. The multi-path video input method according to any one of claims 1 to 6, wherein, before the step of selecting at least two paths of video streams from the original video streams as selected video streams based on the requirement of the receiving end and selecting the code rate of the selected video streams to obtain the video streams with the specified code rate, the method further comprises: obtaining the requirement of the receiving end based on the layout setting, the device capabilities, and the network conditions of the receiving end.
8. The multi-way video input method of claim 7 wherein the device capabilities include processing capabilities, storage capabilities, decoding capabilities, and network connection capabilities;
the layout setting comprises layout content and a layout mode;
the layout modes include tiled layout, presenter mode, picture-in-picture layout, custom layout, full-screen mode, and sidebar layout.
9. The multi-path video input method according to claim 7 or 8, wherein the layout setting is done by user dragging.
10. A multi-path video input system, characterized by comprising a processing device and a receiving end, wherein the processing device comprises a receiving unit, a selecting unit, and a sending unit, wherein:
the receiving unit is used for receiving at least two paths of video streams as original video streams;
the selecting unit is used for selecting at least two paths of video streams from the original video streams to serve as selected video streams based on the requirement of a receiving end, and selecting the code rate of the selected video streams to obtain video streams with specified code rates;
the sending unit is used for sending the video stream with the specified code rate to the receiving end;
the receiving end is used for laying out the video pictures corresponding to the video stream with the specified code rate and displaying the laid video pictures.
CN202311011211.0A 2023-08-10 2023-08-10 Multipath video input method and system Pending CN116915937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311011211.0A CN116915937A (en) 2023-08-10 2023-08-10 Multipath video input method and system

Publications (1)

Publication Number Publication Date
CN116915937A true CN116915937A (en) 2023-10-20

Family

ID=88364941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311011211.0A Pending CN116915937A (en) 2023-08-10 2023-08-10 Multipath video input method and system

Country Status (1)

Country Link
CN (1) CN116915937A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination