CN115550680A - Course recording and playing method and system - Google Patents

Course recording and playing method and system

Info

Publication number
CN115550680A
Authority
CN
China
Prior art keywords
video
recording
video frame
broadcasting
small window
Prior art date
Legal status
Pending
Application number
CN202211217708.3A
Other languages
Chinese (zh)
Inventor
吴敏
王贵闯
郭建森
庞新春
郭岳鹏
柴博文
Current Assignee
Henan Huafu Packaging Technology Co ltd
Lucky Huaguang Graphics Co Ltd
Original Assignee
Henan Huafu Packaging Technology Co ltd
Lucky Huaguang Graphics Co Ltd
Priority date
Filing date
Publication date
Application filed by Henan Huafu Packaging Technology Co ltd, Lucky Huaguang Graphics Co Ltd
Priority to CN202211217708.3A
Publication of CN115550680A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/08Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations
    • G09B5/12Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations different stations being capable of presenting different information simultaneously
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23406Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving management of server-side video buffer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643Communication protocols
    • H04N21/6437Real-time Transport Protocol [RTP]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Marketing (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention provides a course recording and broadcasting method and system, belonging to the technical field of course recording and broadcasting. The recording and broadcasting method comprises: S1, obtaining recording parameters; S2, adjusting the resolution of the relevant camera mechanisms according to the recording parameters, the camera mechanisms being used to capture teaching video; S3, caching the video frame images and audio stream data obtained by decoding each video source; S4, obtaining a timestamp for the sampling moment of each video frame image; S5, processing the video frame images according to the recording parameters; S6, sequentially extracting the video frame images with the closest timestamps in timestamp order and splicing them into new video frame images according to the recording parameters; and S7, encoding, storing, and live-streaming the spliced video frame images in combination with the audio stream data. The invention can splice, record, and live-stream multiple video sources in a classroom, reduces the consumption of processing resources by adjusting the resolution of the camera mechanisms, and synchronizes different video sources by obtaining the sampling time of each video frame.

Description

Course recording and playing method and system
Technical Field
The invention belongs to the technical field of recording and broadcasting, and particularly relates to a course recording and broadcasting method and a course recording and broadcasting system.
Background
A recording and broadcasting system integrates and synchronously records the video, audio, and other signals captured on site, generating standardized streaming-media files for storage, external live broadcast, later editing, and on-demand playback. When such a system is combined with school teaching, cameras record and store the signals captured from quality courses so that students or teachers can review the course video files. The course videos can also be organized and uploaded to the Internet, so that online learning resources are fully shared.
For example, patent document CN113422933A proposes a teaching-oriented recording and broadcasting system, recording and broadcasting method, and medium, in which the client can change the video picture, supports configuring the video picture, and can freely change the video source, position, size, and number of the video windows. However, to allow free switching between multiple videos during playback, that system must store one video file per video source, which occupies a large amount of storage space. Moreover, when it plays back in split-screen mode, it does not consider the synchronization problem caused by the differing delays of the different video streams.
Disclosure of Invention
The invention aims to solve the technical problem of providing a course recording and broadcasting method and a course recording and broadcasting system aiming at the defects of the prior art.
In order to solve the above technical problems, the technical solution adopted by the invention is as follows:
A course recording and broadcasting method comprises the following steps:
S1, acquiring recording parameters;
S2, adjusting the resolution of the relevant camera mechanism according to the recording parameters, the camera mechanism being used to capture teaching video;
S3, caching the video frame images and audio stream data obtained by decoding each video source;
S4, acquiring a timestamp for the sampling moment of each video frame image;
S5, processing the video frame images according to the recording parameters;
S6, sequentially extracting the video frame images with the closest timestamps in timestamp order, and splicing them into new video frame images according to the recording parameters;
and S7, encoding, storing, and live-streaming the spliced video frame images in combination with the audio stream data.
Further, the method for generating the recording parameters in step S1 comprises:
S11, acquiring the video resolution W × H;
S12, determining the reference value of each small window according to the selected split-screen mode;
S13, determining the position parameter of each small window according to the video resolution and the small windows' reference values;
S14, acquiring the video source number associated with each small window;
S15, acquiring the position parameter of each video source's video picture;
and S16, generating the recording parameters.
Further, the reference value of the small window in step S12 is S_m = (X′_m, Y′_m, W′_m, H′_m), where m is the number of the small window, (X′_m, Y′_m) is the position coordinate reference value of the small window m, and (W′_m, H′_m) is the width and height reference value of the small window m; X′_m is the ratio of the X-axis coordinate value of the small window m to the width of the entire video window, Y′_m is the ratio of the Y-axis coordinate value of the small window m to the height of the entire video window, W′_m is the ratio of the width of the small window m to the width of the entire video window, and H′_m is the ratio of the height of the small window m to the height of the entire video window.
Further, the position parameter of the small window in step S13 is P_m = (X_m, Y_m, W_m, H_m), where X_m = X′_m × W, Y_m = Y′_m × H, W_m = W′_m × W, and H_m = H′_m × H.
Further, the position parameter of the video picture in step S15 is P_n = (X_n, Y_n, W_n, H_n), where n is the number of the video source, (X_n, Y_n) is the coordinate value of the origin of the video picture, and (W_n, H_n) are the width value and the height value of the video picture.
Further, step S2 adjusts the resolution of the camera mechanism to W_n × H_n, or slightly larger than W_n × H_n.
Further, the recording parameter is (n, P_n, P_m), where video source n is associated with small window m.
Further, the processing of the video frame images in step S5 includes resizing and cropping.
Further, the video frame images are in YUV format.
A course recording and broadcasting system is used for executing the above recording and broadcasting method, and comprises a camera mechanism, a recording and broadcasting mechanism, and a storage mechanism;
the camera mechanisms are used to capture teaching video, and a pickup is built into, or connected to, one of the camera mechanisms;
the recording and broadcasting mechanism acquires the audio and video captured by the camera mechanisms over the network, splices the multiple video streams into one video, stores that video in the storage mechanism, and pushes the spliced video outwards;
the storage mechanism stores the course videos recorded by the recording and broadcasting mechanism;
the recording and broadcasting mechanism also provides a network time calibration service for the camera mechanisms;
and the computer program running on the recording and broadcasting mechanism is also used to generate the recording parameters.
In the field of course recording and broadcasting, ultra-high-resolution recordings are unnecessary; the conventional 1080P (1920 × 1080) resolution is fully adequate. However, with the rapid development of imaging technology, the output resolution of the camera mechanism 101 can reach 4K (e.g., 3840 × 2160), 6 megapixels (e.g., 3072 × 2048), 4 megapixels (e.g., 2560 × 1440), 3 megapixels (e.g., 2048 × 1536), and so on. These exceed the resolution of the final video (1920 × 1080), and far exceed the resolution of some small-window videos within the picture.
When splicing video pictures, the original video pictures must be scaled to the size of a small window; if only the quality of the scaled image mattered, using the full-size original image would give the best result. However, because of limits on system memory and processor performance, directly processing high-resolution original images increases the system's burden and consumes a large amount of computing resources. For example, scaling a 4K video image to 1080P requires more processor resources than scaling a 3-megapixel video image to 1080P, yet the resulting 1080P images are essentially the same (at the same bit rate).
Moreover, transmitting a high-resolution video stream consumes more power and requires more network bandwidth. The real-time video processing involved in the present invention is typically a CPU/GPU-intensive task that consumes significant processor resources, and the large volume of video stream data also occupies significant memory. In addition, video is transmitted over the network using a streaming-media protocol; even with mature video compression algorithms, real-time video streaming still consumes a large amount of network resources, especially when transmitting high-resolution real-time streams.
To solve these problems, those skilled in the art have sought more efficient video image processing algorithms to reduce processor and memory usage, configured higher-performance processors, or deployed edge computing devices to pre-process video images and reduce the resource consumption of the central processing device. However, existing video image processing algorithms are already mature and hard to improve upon, and adding edge computing devices or high-performance processors increases system cost.
Instead, the present application directly adjusts the output resolution of the front-end camera to be equal to, or slightly greater than, the resolution of its small-window video in the final picture. The camera mechanism thus acts as an image pre-processing device, distributing the computing load otherwise borne by the recording and broadcasting mechanism: the recording and broadcasting mechanism processes lower-resolution video images, which require less processing power and memory. At the same time, the lower-resolution video output by the camera mechanism consumes fewer network resources during transmission, preventing heavy use of network transmission capacity or switch capacity from affecting real-time video delivery.
When a camera mechanism outputs a video stream, it must capture and encode the images, and because the camera mechanisms in a classroom are adjusted to different output resolutions, the time their encoding takes differs. A camera mechanism encoding high-resolution video consumes more processing resources, which also affects network transmission; high-resolution video likewise consumes more network resources in transit, increasing its delay on the way to the recording and broadcasting mechanism. The video streams of different camera mechanisms therefore arrive with different delays. When recording a course, the different video streams must be spliced, but they are not synchronized; even if the delay difference between streams is under 1 second, splicing unsynchronized pictures still degrades the viewing experience of the course video. In current recording and broadcasting systems, however, the front-end cameras all use the same or very similar resolutions, so the real-time difference between video sources is not obvious and synchronization between them is mostly ignored.
Therefore, to synchronize different video streams, video frames captured at the same or nearly the same moment must be aligned across the streams. This requires obtaining the sampling time of each video frame and aligning the frames with the closest sampling times.
In the prior art, the Real-time Transport Protocol (RTP) and RTP Control Protocol (RTCP) can be used to obtain an NTP timestamp for the sampling moment of a video frame, but this requires clock synchronization across all camera mechanisms (i.e., consistent time), and some camera mechanisms do not support RTP or RTCP at all. In that case, the invention uses OCR to recognize the time characters overlaid on the video images. Although the overlaid time is displayed only to the second, the timestamp of a frame's sampling moment can still be obtained at the moment the seconds digit jumps, and the timestamps of subsequent frames can be determined from the video's frame rate.
Taking the 25 fps frame rate used by the invention as an example, with the camera mechanisms' clocks synchronized, OCR recognition of the time characters yields sampling times with an error of no more than 40 ms. Although frames processed this way may carry some error, an error below 40 ms does not affect the recording or viewing of the course video.
Compared with the prior art, the invention has the following beneficial effects:
the recording and broadcasting mechanism runs a computer program which is provided with a recording parameter configuration window, and the recording parameter configuration window can directly select a split screen mode. When the split screen mode is selected, the whole video picture is split into small windows in the left preview window according to the mode, and each small window in the split screen displays the video picture of one video source. For the small window without the video source added, the adding button (+) can be clicked to add the video source; the small window with the video source added can click the delete button (-) to delete the video source. Therefore, the invention can conveniently set the screen splitting mode during recording, can add video sources of the camera mechanism in a classroom as required, and finally splices multiple video sources into a new video according to the screen splitting mode.
In the preview window of the recording parameter configuration window, the video sources of two small windows can be swapped directly by dragging with the mouse: for example, click and drag small window C onto small window B, release the mouse, and the video sources of windows B and C are exchanged automatically. The video shown in a small window can also be enlarged, reduced, or cropped with the mouse: move the pointer over a small window and scroll the wheel to enlarge or reduce the corresponding picture, or click and drag the video to adjust where the picture is displayed within the window. The invention thus makes it convenient to adjust the size and position of each small window's video source within the split screen.
The recording parameter configuration window is used to set the resolution of the recorded video, the split-screen mode, the video source of each small window, and the size and position of the video image in each small window; the recording parameters are generated automatically from these settings. The recording parameters are then used to process and splice the images of the associated multi-channel video, which is re-encoded into a new video; the encoded video is recorded, stored, and live-streamed.
The invention adjusts the resolution of each camera mechanism to the resolution of its small-window video, or slightly larger, which lowers the processor requirements of the recording and broadcasting mechanism, reduces wasted processing resources, reduces the bandwidth the camera mechanisms occupy during network transmission, improves transmission efficiency, and avoids problems with the transmitted video pictures. The invention also obtains the sampling time of every video frame image, so that when splicing multiple video sources it can align the video frames with the closest sampling times, achieving synchronization across the sources.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings.
FIG. 1: schematic diagram of the recording and broadcasting system of embodiment 1 of the invention;
FIG. 2: schematic diagram of the recording and broadcasting mechanism of embodiment 1 of the invention;
FIG. 3: schematic diagram of the recording parameter configuration window of the computer program of embodiment 1 of the invention;
FIG. 4: flowchart of the recording and broadcasting method of embodiment 2 of the invention;
FIG. 5: flowchart of the method for generating the recording parameters in step S1 of embodiment 2 of the invention.
Detailed Description
For a better understanding of the present invention, the following examples and the accompanying drawings are used to further clarify the content of the present invention, but the content of the present invention is not limited to the following examples. In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without one or more of these specific details.
Example 1:
the purpose of this embodiment is to provide a course recorded broadcast system, which is used for recording, storing and live broadcasting the course of giving lessons of teachers in a classroom. As shown in fig. 1, the recording and playing system includes: an image pickup mechanism 101, a recording and playing mechanism 102, and a storage mechanism 103.
The camera mechanism 101 is a surveillance camera, a video conference terminal, or another network camera; it captures images during teaching, encodes them into video, and transmits the encoded video over the network. At least one camera mechanism 101 is provided in a classroom as needed; for example, four camera mechanisms 101 can be provided to capture a teacher close-up, a student close-up, a teacher panorama, and a student panorama. A pickup is built into, or connected to, one camera mechanism 101 to capture the teacher's voice.
The recording and playing mechanism 102 acquires the video streams transmitted by the camera mechanisms 101, combines the video images of the multiple streams into one video image, and, after encoding the combined images into a video, stores and pushes it. Specifically, the recording and playing mechanism 102 may be a computer or a server whose internal structure is shown in fig. 2. The recording and playing mechanism 102 includes a processor and memory coupled via a system bus; the processor provides computing and control capabilities and includes a central processing unit (CPU), and may also include a graphics processing unit (GPU). The memory comprises a non-volatile storage medium and internal memory: the non-volatile storage medium stores an operating system, a computer program, and a database, while the internal memory provides the environment in which the operating system and computer program run.
The recording and playing mechanism 102 is also connected to a network interface and a video interface through the system bus. The network interface connects to and communicates with external devices over the network; for example, the camera mechanism 101 can send captured video images to the recording and playing mechanism 102 through it. The video interface (such as HDMI, VGA, or DVI) outputs video images, for example sending the screen of the recording and playing mechanism 102 to a display device, or the encoded video to other equipment for live broadcast.
The recording and playing mechanism 102 can also provide a network time calibration service for the camera mechanisms 101: the NTP server address of each camera mechanism 101 is set to the IP address of the recording and playing mechanism 102, so the camera mechanisms 101 keep the same time as the recording and playing mechanism 102.
The storage mechanism 103 stores the course videos recorded by the recording and playing mechanism 102. The storage mechanism 103 may be a readable storage medium built into the recording and playing mechanism 102, or an external storage device or cloud storage space communicatively connected to the recording and playing mechanism 102.
When executed by the central processing unit, the computer program on the recording and playing mechanism 102 implements recording, storage, and live streaming of teaching. The program can use a stand-alone architecture, operated on the recording and playing mechanism 102 itself, or a B/S or C/S architecture, operated from a browser or client on another computer terminal connected to the recording and playing mechanism 102 over the network.
Fig. 3 shows the recording parameter configuration window of the computer program: the left side is a video preview window, and the right side sets the resolution of the recorded video and selects the split-screen mode. When a split-screen mode is selected, the whole video frame in the left preview window is divided into small windows according to that mode, each displaying the picture of one video source (one camera mechanism 101). For a small window with no video source yet, the add button (+) can be clicked to add one; for a small window with a video source, the delete button (-) can be clicked to remove it. If the video source added to small window A in fig. 3 is "teacher close-up", it can be deleted with the delete button to the right of window A and then re-added with the add button.
The video sources of the small windows in the video preview window can also be swapped directly by dragging with the mouse: for example, click and drag small window C onto small window B, release the mouse, and the sources of windows B and C are exchanged automatically. The video shown in a small window can likewise be enlarged, reduced, or cropped with the mouse: move the pointer over a window and scroll the wheel to enlarge or reduce the picture, or click and drag the video to adjust where the picture is displayed within the window; the part of the picture outside the small window is cropped off and discarded during encoding.
After the video parameters are configured, clicking the apply button automatically generates the recording parameters; the computer program then encodes the video according to those parameters, records and stores the encoded video, and streams it live.
Example 2:
the purpose of this embodiment is to provide a course recording and playing method, where the course recording and playing method is executed by the recording and playing system described in embodiment 1. As shown in fig. 4, the recording and playing method includes:
1. and S1, acquiring recording parameters.
In this step, the computer program described in embodiment 1 is used to configure the parameters of the recorded video. The recording parameters are generated by the following steps:
1.1, step S11, acquiring the set video resolution.
In this step, the video resolution W × H set in the recording parameter configuration window is obtained.
1.2, step S12, determining the reference value of each small window according to the selected split-screen mode.
In the invention, each split-screen mode is preconfigured with a reference value for each small window: S_m = (X′_m, Y′_m, W′_m, H′_m), where m is the number of the small window (A, B, C, and so on), S_m is the reference value of small window m, (X′_m, Y′_m) is the position coordinate reference value of small window m, and (W′_m, H′_m) is its width and height reference value. The coordinate reference values are the ratios of the actual coordinates of the small window's origin (upper-left corner) to the whole window's width and height, and the width and height reference values are the ratios of the small window's actual width and height to the whole window's width and height. Specifically, X′_m is the ratio of the X-axis coordinate of small window m to the width of the entire video window, Y′_m is the ratio of its Y-axis coordinate to the height of the entire video window, W′_m is the ratio of its width to the width of the entire video window, and H′_m is the ratio of its height to the height of the entire video window.
In the split-screen mode shown in fig. 3, small window A occupies the upper two-thirds of the whole video window, small windows B, C, and D together occupy the lower third, and the areas of B, C, and D are equal. The small-window reference values (S_A, S_B, S_C, S_D) of this layout are therefore:
S_A = (0, 0, 1, 2/3),
S_B = (0, 2/3, 1/3, 1/3),
S_C = (1/3, 2/3, 1/3, 1/3),
S_D = (2/3, 2/3, 1/3, 1/3).
1.3, step S13, determining the position parameter of each small window according to the video resolution and the small window's reference value.
The position information of each small window comprises a coordinate point and width and height values: P_m = (X_m, Y_m, W_m, H_m), where X_m = X′_m × W, Y_m = Y′_m × H, W_m = W′_m × W, and H_m = H′_m × H, and P_m is the position parameter of small window m.
1.4, step S14, acquiring the video source number associated with each small window.
This step acquires the numbers of the video sources of the camera mechanism 101 displayed in each small window, and associates the numbers of the small windows with the numbers of the video sources.
1.5, step S15, obtaining the position parameter of the video picture of each video source.
This step obtains the coordinates, width, and height of each video picture within the whole video window; the width and height are those at which the source's picture is displayed in its small window, possibly after enlargement or reduction. Specifically, P_n = (X_n, Y_n, W_n, H_n), where n is the number of the video source, P_n is the position parameter of video source n's picture, (X_n, Y_n) is the coordinate value of the picture's origin (upper-left corner), and (W_n, H_n) are the picture's width value and height value.
When a video source is first added to a small window, its image is automatically displayed centered in the window, and the image's initial position information can be determined from the window's position parameter and the image's aspect ratio. The position information after enlargement or reduction is then updated in response to mouse-wheel events; on each release of the left mouse button, the program checks whether the display position was adjusted and, if so, computes the image's new position from the cursor coordinates before and after the drag.
1.6, step S16, generating recording parameters.
The generated recording parameters are (n, P_n, P_m), where n is the number of the video source associated with small window m, P_n is the position parameter of video source n's picture, and P_m is the position parameter of small window m.
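As an illustration, the generation of these parameters can be sketched in a few lines of Python. This is a minimal sketch, not the patent's implementation: the window reference table assumes the fig. 3 layout described above, and all function and variable names are illustrative.

```python
# Minimal sketch of steps S11-S16 (illustrative names; assumes the Fig. 3
# layout): converts small-window reference values S_m into pixel position
# parameters P_m and bundles them with each window's video source number.
from fractions import Fraction as F

WINDOW_REFS = {                         # S_m = (X'_m, Y'_m, W'_m, H'_m)
    "A": (F(0),    F(0),    F(1),    F(2, 3)),
    "B": (F(0),    F(2, 3), F(1, 3), F(1, 3)),
    "C": (F(1, 3), F(2, 3), F(1, 3), F(1, 3)),
    "D": (F(2, 3), F(2, 3), F(1, 3), F(1, 3)),
}

def window_position(ref, W, H):
    """P_m = (X_m, Y_m, W_m, H_m) from S_m and the video resolution W x H."""
    x, y, w, h = ref
    return (int(x * W), int(y * H), int(w * W), int(h * H))

def recording_params(W, H, sources, pictures):
    """Build (n, P_n, P_m) triples.

    sources:  window letter m -> associated video source number n (step S14)
    pictures: source number n -> picture position parameter P_n (step S15)
    """
    return [(n, pictures[n], window_position(WINDOW_REFS[m], W, H))
            for m, n in sources.items()]
```

For a 1920 × 1080 recording, for example, window_position(WINDOW_REFS["B"], 1920, 1080) yields (0, 720, 640, 360): small window B starts two-thirds of the way down the picture and occupies one third of its width and height.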
2. Step S2, adjusting the resolution of the relevant camera mechanism 101 according to the recording parameters.
This embodiment combines multiple video sources into one video according to the recording parameters: first, each video image is enlarged or reduced to W_n × H_n; then its edges are cropped according to (X_m, Y_m) and (X_n, Y_n); the cropped video images are spliced together; and finally the spliced video image is encoded.
Because a video image cannot be enlarged without loss, the resolution of a video source should exceed W_n × H_n so that sharpness is preserved as far as possible; that is, the video image is brought to W_n × H_n by reduction.
In the field of course recording and broadcasting, ultra-high-resolution recordings are unnecessary, and the 1920 × 1080 (1080P) resolution set in fig. 3 is fully adequate. However, with the rapid development of imaging technology, the output resolution of the camera mechanism 101 can reach 4K, 6 megapixels, 4 megapixels, 3 megapixels, and so on; these exceed the 1920 × 1080 resolution of the final video and far exceed the resolution of the video image in some small windows.
Although reducing a high-resolution source preserves sharpness, the reduction depends on processor performance: the higher the resolution of the processed image, the more processor resources it occupies. For example, scaling a 4K video image to 1080P requires more processor resources than scaling a 3-megapixel image to 1080P, yet the resulting 1080P images are essentially the same (at the same bit rate).
An ultra-high-resolution video also occupies more bandwidth in transit. In some schools, for example, a single recording and playing mechanism 102 in a master control room records all classrooms, and the camera mechanisms 101 in the classrooms send video data to it over a local area network. If a camera mechanism 101 outputs high- or ultra-high-resolution video, the limits of network transmission capacity or switch capacity can cause stuttering, corrupted frames, tearing, or black screens, affecting the final recording.
Therefore, in this step the resolution of the camera mechanism 101 is adjusted to the corresponding W_n × H_n, or slightly larger. For a camera mechanism 101 whose resolution is fully customizable, the resolution can be set directly to W_n × H_n; for a camera mechanism 101 that only offers a fixed set of resolutions, the resolution slightly larger than W_n × H_n can be selected.
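A minimal sketch of this selection rule follows; the list of camera modes is an illustrative assumption, not a list from the patent.

```python
# Minimal sketch of the resolution rule in step S2: use W_n x H_n exactly
# if the camera is fully customizable, otherwise the smallest supported
# mode that is at least W_n x H_n in both dimensions ("slightly larger").
def pick_resolution(wn, hn, supported=None):
    if supported is None:                           # fully customizable camera
        return (wn, hn)
    fits = [(w, h) for (w, h) in supported if w >= wn and h >= hn]
    return min(fits, key=lambda wh: wh[0] * wh[1])  # least oversized mode

modes = [(3840, 2160), (2560, 1440), (1920, 1080), (1280, 720), (640, 360)]
print(pick_resolution(640, 360, modes))    # (640, 360) - exact match
print(pick_resolution(1280, 704, modes))   # (1280, 720) - slightly larger
```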
Meanwhile, the frame rate of every camera mechanism 101 is uniformly set to 25 fps, which satisfies the needs of course recording, avoids the high bandwidth occupancy of higher frame rates, and simplifies the processing of video frame images in subsequent steps.
Setting the resolution of each camera mechanism 101 according to W_n × H_n reduces the demands on processor capacity and the waste of processing resources; it also reduces the bandwidth occupied when the captured video is transmitted over the network, improves transmission efficiency, and avoids problems with the transmitted video pictures.
3. Step S3, caching the video frame images and audio stream data obtained by decoding the video sources.
The video stream output by each video source is decoded, each video frame's image is obtained, and the video frame images are cached; the audio stream output by the video source that carries the teaching audio is decoded into audio stream data and cached as well.
The decoded video frame images use the YUV format; YUV exploits human perceptual characteristics to reduce chroma bandwidth, so it needs less storage space and is well suited to imaging systems.
4. Step S4, acquiring a timestamp for the sampling moment of each video frame image.
Step S2 sets the resolution of each camera mechanism 101 according to W_n × H_n, so different camera mechanisms 101 end up with different resolutions; their models and performance also differ, as may the network paths their video takes. Consequently, the video images output to the recording and playing mechanism 102 arrive with different delays. If the recording and playing mechanism 102 spliced and recorded the video frames directly under these differing delays, the pictures in different small windows would not be synchronized, affecting the final recording.
Therefore, when splicing and recording different video sources, their video images are synchronized: video frames captured at the same or nearly the same moment are aligned so that the pictures stay in step. The sampling moment of each video frame is obtained, the timestamp of that moment is associated with the corresponding video frame image, and the frames of different video sources are aligned by timestamp.
When a camera mechanism 101 supports the Real-time Transport Protocol (RTP) and RTP Control Protocol (RTCP), the RTP timestamp of each video frame is extracted, and the timestamp of the frame's sampling moment is calculated from the NTP timestamp information in the RTCP stream corresponding to the RTP stream.
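For illustration, the standard RTCP Sender Report mapping behind this calculation can be sketched as follows. This is a minimal sketch under the usual 90 kHz video clock-rate assumption, with illustrative names; it shows the general RTP-to-wall-clock calculation, not the patent's exact procedure.

```python
# Minimal sketch: derive a frame's sampling time from its RTP timestamp
# and the (NTP time, RTP timestamp) pair carried in the latest RTCP
# Sender Report for the same stream.
VIDEO_CLOCK_RATE = 90_000                       # usual RTP clock rate for video

def sampling_time(rtp_ts, sr_ntp_seconds, sr_rtp_ts,
                  clock_rate=VIDEO_CLOCK_RATE):
    """Seconds (NTP epoch) at which the frame with rtp_ts was sampled."""
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF   # RTP timestamps are 32-bit
    if delta > 0x7FFFFFFF:                      # unwrap a negative difference
        delta -= 0x100000000
    return sr_ntp_seconds + delta / clock_rate
```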
If the camera mechanism 101 does not support the RTP protocol and the RTCP protocol, the following method may be adopted to obtain the timestamp of the video frame image sampling time:
the camera mechanism 101 is first enabled to perform network timing by using the recording and broadcasting mechanism 102, and then the time character superimposing function of the camera mechanism 101 is enabled, the superimposed time is accurate to seconds, and the superimposing position is set at the same position (for example, uniformly set at the upper right corner).
The overlaid time is recognized from the video frame image using OCR. If the time recognized in the current frame is one second later than in the previous frame, the timestamp converted from the recognized time is the timestamp of the current frame's sampling moment; the timestamps of subsequent frames can then be determined from the inter-frame interval (40 ms at this embodiment's 25 fps frame rate).
Since the time characters are always overlaid at the same position, only the image near that position needs to be captured and recognized, which improves OCR efficiency.
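A minimal sketch of this second-jump anchoring is given below, assuming OpenCV and pytesseract are available; the overlay crop box and the time format are illustrative assumptions, not values from the patent.

```python
# Minimal sketch: OCR the overlaid clock in a fixed corner region, detect
# the frame where the seconds digit jumps, and extrapolate later frame
# timestamps from the 40 ms inter-frame interval (25 fps).
import datetime
import cv2            # pip install opencv-python
import pytesseract    # pip install pytesseract

FRAME_INTERVAL_MS = 40                      # 25 fps

def read_overlay_time(frame, box=(1500, 0, 420, 60)):
    """OCR the time characters inside the fixed overlay region."""
    x, y, w, h = box
    gray = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    text = pytesseract.image_to_string(gray, config="--psm 7").strip()
    return datetime.datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

def frame_timestamps(frames):
    """Yield (index, timestamp_ms) once a second jump anchors the clock."""
    prev, anchor = None, None
    for i, frame in enumerate(frames):
        t = read_overlay_time(frame)
        if prev is not None and t == prev + datetime.timedelta(seconds=1):
            anchor = t.timestamp() * 1000 - i * FRAME_INTERVAL_MS
        prev = t
        if anchor is not None:
            yield i, anchor + i * FRAME_INTERVAL_MS
```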
5. Step S5, processing the video frame images according to the recording parameters.
First the video frame image of video source n is resized (reduced or enlarged) to W_n × H_n; then, according to the corresponding coordinates (X_m, Y_m) and (X_n, Y_n), it is cropped to W_m × H_m. Specifically: a strip of height (Y_m − Y_n) is cropped from the top of the image, a strip of height (H_n − H_m − Y_m + Y_n) from the bottom, a strip of width (X_m − X_n) from the left, and a strip of width (W_n − W_m − X_m + X_n) from the right; the remaining portion has size W_m × H_m.
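A minimal sketch of this resize-and-crop follows, assuming OpenCV/NumPy frames in height × width × channel layout and the coordinate conventions of steps S13 and S15; the function name is illustrative, and the picture is assumed to cover its window.

```python
# Minimal sketch of step S5: scale the decoded frame of source n to
# W_n x H_n, then keep only the part that falls inside small window m.
import cv2
import numpy as np

def process_frame(img: np.ndarray, pn, pm) -> np.ndarray:
    """pn = (X_n, Y_n, W_n, H_n); pm = (X_m, Y_m, W_m, H_m)."""
    xn, yn, wn, hn = pn
    xm, ym, wm, hm = pm
    img = cv2.resize(img, (wn, hn))           # enlarge or reduce to W_n x H_n
    top, left = ym - yn, xm - xn              # window offset inside the picture
    return img[top:top + hm, left:left + wm]  # remaining part is W_m x H_m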
6. Step S6, sequentially extracting the video frame images with the closest timestamps in timestamp order, and splicing them into new video frame images according to the recording parameters.
The video frames of the video source that carries the teaching audio are used as reference frame images and are extracted from the cache in timestamp order. When a reference frame image is extracted, the video frame image of each other source whose timestamp is closest to it is extracted at the same time; the extracted video frame images are then spliced at their coordinate points (X_m, Y_m) into a new video frame image of size W × H.
The extracted reference frame image is automatically removed from the buffer.
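A minimal sketch of this alignment and splicing follows, assuming each source's cache holds (timestamp in milliseconds, frame already processed in step S5) pairs; the parameter layout and all names are illustrative.

```python
# Minimal sketch of step S6: the reference stream (the source carrying the
# teaching audio) drives extraction; each other source contributes the
# cached frame whose timestamp is nearest, and frames are pasted at their
# window origins (X_m, Y_m) onto a W x H canvas.
import numpy as np

def splice_frames(ref_cache, other_caches, origins, W, H):
    """origins maps source n (and "ref") -> (X_m, Y_m); yields composites."""
    while ref_cache:
        ts, ref_frame = ref_cache.pop(0)            # reference frame, in order
        canvas = np.zeros((H, W, 3), dtype=np.uint8)
        picks = [("ref", ref_frame)]
        for n, cache in other_caches.items():
            _, frame = min(cache, key=lambda tf: abs(tf[0] - ts))
            picks.append((n, frame))                # nearest-timestamp frame
        for n, frame in picks:
            x, y = origins[n]
            h, w = frame.shape[:2]
            canvas[y:y + h, x:x + w] = frame        # paste at (X_m, Y_m)
        yield ts, canvas
```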
7. Step S7, encoding, storing, and live-streaming the spliced video frame images in combination with the audio stream data.
The audio stream data corresponding to each reference frame is extracted from the cache, the spliced video frame images are assembled into a video stream, the audio and video data are synchronized, and the two are encoded together into a new audio/video stream.
This audio/video stream is then stored in the storage mechanism 103, completing storage of the course video. At the same time, the stream is pushed out through the network interface or the video interface so that others can watch the course video in real time.
Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention and are not limiting; other modifications or equivalent substitutions made by those skilled in the art to the technical solutions of the present invention, provided they do not depart from the spirit and scope of those solutions, shall be covered by the scope of the claims of the present invention.

Claims (10)

1. A course recording and playing method, characterized in that the method comprises the following steps:
S1, acquiring recording parameters;
S2, adjusting the resolution of the relevant camera mechanism according to the recording parameters, the camera mechanism being used to capture teaching video;
S3, caching the video frame images and audio stream data obtained by decoding each video source;
S4, acquiring a timestamp for the sampling moment of each video frame image;
S5, processing the video frame images according to the recording parameters;
S6, sequentially extracting the video frame images with the closest timestamps in timestamp order, and splicing them into new video frame images according to the recording parameters;
and S7, encoding, storing, and live-streaming the spliced video frame images in combination with the audio stream data.
2. The course recording and playing method as claimed in claim 1, wherein the method for generating the recording parameters in step S1 comprises:
S11, acquiring the video resolution W × H;
S12, determining the reference value of each small window according to the selected split-screen mode;
S13, determining the position parameter of each small window according to the video resolution and the small windows' reference values;
S14, acquiring the video source number associated with each small window;
S15, acquiring the position parameter of each video source's video picture;
and S16, generating the recording parameters.
3. The course recording and playing method as claimed in claim 2, wherein: the reference value of the small window in step S12 is S_m = (X′_m, Y′_m, W′_m, H′_m), where m is the number of the small window, (X′_m, Y′_m) is the position coordinate reference value of the small window m, and (W′_m, H′_m) is the width and height reference value of the small window m; X′_m is the ratio of the X-axis coordinate value of the small window m to the width of the entire video window, Y′_m is the ratio of the Y-axis coordinate value of the small window m to the height of the entire video window, W′_m is the ratio of the width of the small window m to the width of the entire video window, and H′_m is the ratio of the height of the small window m to the height of the entire video window.
4. The course recording and playing method as claimed in claim 3, wherein: the position parameter of the small window in step S13 is P_m = (X_m, Y_m, W_m, H_m), where X_m = X′_m × W, Y_m = Y′_m × H, W_m = W′_m × W, and H_m = H′_m × H.
5. The course recording and playing method as claimed in claim 4, wherein: the position parameter of the video picture in step S15 is P_n = (X_n, Y_n, W_n, H_n), where n is the number of the video source, (X_n, Y_n) is the coordinate value of the origin of the video picture, and (W_n, H_n) are the width value and the height value of the video picture.
6. The course recording and playing method as claimed in claim 5, wherein: step S2 adjusts the resolution of the camera mechanism to W_n × H_n, or slightly larger than W_n × H_n.
7. The course recording and playing method as claimed in claim 6, wherein: the recording parameter is (n, P_n, P_m), where video source n is associated with small window m.
8. The course recording and playing method as claimed in claim 1, wherein: the processing of the video frame images in step S5 includes resizing and cropping.
9. The course recording and playing method as claimed in claim 1, wherein: the video frame images are in YUV format.
10. A course recording and playing system for executing the recording and playing method of any one of claims 1 to 9, characterized in that: the system comprises a camera mechanism, a recording and broadcasting mechanism, and a storage mechanism;
the camera mechanisms are used to capture teaching video, and a pickup is built into, or connected to, one of the camera mechanisms;
the recording and broadcasting mechanism acquires the audio and video captured by the camera mechanisms over the network, splices the multiple video streams into one video, stores that video in the storage mechanism, and pushes the spliced video outwards;
the storage mechanism stores the course videos recorded by the recording and broadcasting mechanism;
the recording and broadcasting mechanism also provides a network time calibration service for the camera mechanisms;
and the computer program run on the recording and broadcasting mechanism is also used to generate the recording parameters.
CN202211217708.3A 2022-09-30 2022-09-30 Course recording and playing method and system Pending CN115550680A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211217708.3A CN115550680A (en) 2022-09-30 2022-09-30 Course recording and playing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211217708.3A CN115550680A (en) 2022-09-30 2022-09-30 Course recording and playing method and system

Publications (1)

Publication Number Publication Date
CN115550680A true CN115550680A (en) 2022-12-30

Family

ID=84730834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211217708.3A Pending CN115550680A (en) 2022-09-30 2022-09-30 Course recording and playing method and system

Country Status (1)

Country Link
CN (1) CN115550680A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination