CN117294683A - Video processing method, transmitting end, receiving end, storage medium and program product


Info

Publication number
CN117294683A
CN117294683A (application number CN202210678897.8A)
Authority
CN
China
Prior art keywords
video
target
frame rate
content type
identification information
Prior art date
Legal status
Pending
Application number
CN202210678897.8A
Other languages
Chinese (zh)
Inventor
张殿凯
高思
刘少丽
Current Assignee
ZTE Corp
Original Assignee
ZTE Corp
Priority date
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN202210678897.8A priority Critical patent/CN117294683A/en
Priority to PCT/CN2023/099449 priority patent/WO2023241485A1/en
Publication of CN117294683A publication Critical patent/CN117294683A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Abstract

The application discloses a video processing method, a transmitting end, a receiving end, a storage medium and a program product. The method includes the following steps: the video transmitting end obtains a first video content type of a target video, determines first identification information and a first frame rate parameter according to the first video content type, and takes the first identification information as the target identification information and the first frame rate parameter as the target frame rate parameter; it then captures the target video according to the target frame rate parameter and encodes the target identification information together with the captured video to obtain a video code stream; finally, it transmits the video code stream to a video receiving end, which decodes the code stream to obtain the decoded video and the target identification information and displays the decoded video according to the target frame rate parameter corresponding to that identification information. Because the target frame rate parameter is determined from the first video content type of the target video, the method can save bandwidth resources, reduce playback stuttering, and enhance the subjective experience.

Description

Video processing method, transmitting end, receiving end, storage medium and program product
Technical Field
The present application relates to the field of communications technologies, and in particular, to a video processing method, a transmitting end, a receiving end, a storage medium, and a program product.
Background
At present, in order to better balance working efficiency and safety, remote work and working from home have gradually become the new normal. To keep operating normally, many enterprises have introduced video conferencing systems. The construction of video conference systems solves, to a certain extent, the communication problem among different enterprises, provides a convenient and fast communication channel for branches distributed across different regions, reduces enterprise travel costs, and allows people to hold face-to-face audio and video communication without leaving home. When using a video conference system, participants often need to share data such as documents, videos, and pictures with the other participants via the video conference auxiliary stream in order to communicate better.
However, existing video conferences often suffer from wasted bandwidth resources or playback stuttering. For example, when document content is shared, the page may be turned only once every few seconds and the content changes slowly, so transmitting it at a high frame rate wastes bandwidth; conversely, when video content is shared at a low frame rate, playback stutters and the subjective experience is poor.
Disclosure of Invention
The embodiment of the application provides a video processing method, a video sending end, a video receiving end, a storage medium and a program product, which can not only save bandwidth resources but also reduce playback stuttering and enhance the subjective experience.
In a first aspect, an embodiment of the present application provides a video processing method applied to a video sending end, where the method includes: acquiring a first video content type of a target video, determining corresponding first identification information and a first frame rate parameter according to the first video content type, and taking the first identification information as the target identification information and the first frame rate parameter as the target frame rate parameter; capturing the target video according to the target frame rate parameter, and encoding the target identification information and the captured target video to obtain a video code stream; and sending the video code stream to a video receiving end, so that the video receiving end decodes the video code stream to obtain the decoded video and the target identification information, and then displays the decoded video according to the target frame rate parameter corresponding to the target identification information.
In a second aspect, an embodiment of the present application further provides a video processing method, applied to a video receiving end, where the method includes: receiving a video code stream from a video sending end; decoding the video code stream to obtain decoded video and target identification information; and determining a corresponding target frame rate parameter according to the target identification information, and displaying the decoded video according to the target frame rate parameter.
In a third aspect, an embodiment of the present application further provides a video sending end, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the video processing method of the first aspect as described above when the computer program is executed.
In a fourth aspect, an embodiment of the present application further provides a video receiving end, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the video processing method of the second aspect as described above when the computer program is executed.
In a fifth aspect, embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions for performing the video processing method as described above.
In a sixth aspect, embodiments of the present application further provide a computer program product comprising a computer program or computer instructions stored in a computer readable storage medium, from which the computer program or computer instructions are read by a processor of a computer device, the processor running the computer program or computer instructions such that the computer device performs the video processing method as described above.
In the embodiment of the application, the video sending end first obtains a first video content type of the target video, determines the corresponding first identification information and first frame rate parameter according to the first video content type, and takes the first identification information as the target identification information and the first frame rate parameter as the target frame rate parameter; the video sending end then captures the target video according to the target frame rate parameter, encodes the target identification information and the captured target video to obtain a video code stream, and transmits the video code stream to the video receiving end; finally, the video receiving end decodes the video code stream to obtain the decoded video and the target identification information, and displays the decoded video according to the target frame rate parameter corresponding to the target identification information. Because the technical scheme of the embodiment determines the target frame rate parameter according to the first video content type of the target video and processes different video content types with different frame rate parameters, it can save bandwidth resources when playing content with a low frame rate requirement, such as shared document content, and reduce playback stuttering when playing content with a high frame rate requirement, such as shared video content, thereby enhancing the subjective experience.
Drawings
FIG. 1 is a schematic diagram of an implementation environment for performing a video processing method according to one embodiment of the present application;
fig. 2 is a flowchart of a video processing method at a video transmitting end provided in an embodiment of the present application;
fig. 3 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 4 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 5 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 6 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 7 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 8 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 9 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 10 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application;
fig. 11 is a flowchart of a video processing method at a video receiving end provided in an embodiment of the present application;
FIG. 12 is a schematic diagram of an implementation environment for performing a video processing method according to one embodiment of the present application;
FIG. 13 is a general flow chart of a video processing method provided in one embodiment of the present application;
FIG. 14 is a general flow chart of a video processing method provided in another embodiment of the present application;
fig. 15 is a schematic structural diagram of a video transmitting end according to an embodiment of the present application;
fig. 16 is a schematic structural diagram of a video receiving end according to an embodiment of the present application.
Detailed Description
In order to make the objectives, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
It should be noted that although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different from that in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
At present, in order to better balance working efficiency and safety, remote work and working from home have gradually become the new normal. To keep operating normally, many enterprises have introduced video conferencing systems. The construction of video conference systems solves, to a certain extent, the communication problem among different enterprises, provides a convenient and fast communication channel for branches distributed across different regions, reduces enterprise travel costs, and allows people to hold face-to-face audio and video communication without leaving home. When using a video conference system, participants often need to share data such as documents, videos, and pictures with the other participants via the video conference auxiliary stream in order to communicate better.
However, existing video conferences often suffer from wasted bandwidth resources or playback stuttering. For example, when document content is shared, the page may be turned only once every few seconds and the content changes slowly, so transmitting it at a high frame rate wastes bandwidth; conversely, when video content is shared at a low frame rate, playback stutters and the subjective experience is poor.
Based on this, the embodiment of the application provides a video processing method, a video sending end, a video receiving end, a computer-readable storage medium and a computer program product, which can not only save bandwidth resources but also reduce playback stuttering and enhance the subjective experience.
Embodiments of the present application are further described below with reference to the accompanying drawings.
As shown in fig. 1, fig. 1 is a schematic structural diagram of an implementation environment for performing a video processing method according to an embodiment of the present application.
In the example of fig. 1, the implementation environment includes, but is not limited to, a video sender 100 and a video receiver 200, where the video sender 100 and the video receiver 200 are communicatively coupled.
In an embodiment, the relative positions, the number, and the like of the video transmitting end 100 and the video receiving end 200 may be set correspondingly in a specific application scenario, and the relative positions and the number of the video transmitting end 100 and the video receiving end 200 are not specifically limited in this embodiment.
It will be appreciated by those skilled in the art that the implementation environment for performing the video processing method may be applied to 3G, LTE, 5G and 6G communication network systems, subsequently evolved mobile/fixed communication network systems, and the like, which are not particularly limited in this embodiment.
It will be appreciated by those skilled in the art that the implementation environment shown in fig. 1 is not limiting of the embodiments of the present application and may include more or fewer components than shown, or certain components may be combined, or a different arrangement of components.
Based on the above-described implementation environment, various embodiments of the video processing method on the video transmitting side of the present application are set forth below.
As shown in fig. 2, fig. 2 is a flowchart of a video processing method at a video transmitting end side according to an embodiment of the present application, and the video processing method may be applied to the video transmitting end in fig. 1, including but not limited to step S100, step S200, and step S300.
Step S100, acquiring a first video content type of a target video, determining corresponding first identification information and first frame rate parameters according to the first video content type, and taking the first identification information as target identification information and the first frame rate parameters as target frame rate parameters;
step S200, acquiring a target video according to a target frame rate parameter, and coding target identification information and the acquired target video to obtain a video code stream;
and step S300, the video code stream is sent to a video receiving end, so that the video receiving end decodes the video code stream to obtain decoded video and target identification information, and then the decoded video is displayed according to target frame rate parameters corresponding to the target identification information.
Specifically, in the video processing process, firstly, a video transmitting end acquires a first video content type of a target video, determines corresponding first identification information and first frame rate parameters according to the first video content type, and takes the first identification information as target identification information and takes the first frame rate parameters as target frame rate parameters; then, the video transmitting end collects target videos according to the target frame rate parameters, encodes the target identification information and the collected target videos to obtain video code streams, and transmits the video code streams to the video receiving end; and finally, the video receiving end decodes the video code stream to obtain decoded video and target identification information, obtains corresponding target frame rate parameters according to the target identification information, and displays the decoded video according to the target frame rate parameters.
The above-mentioned target video refers to the content displayed at the video sending end, in other words, the content that the video sending end needs to share with the video receiving end.
In addition, it should be noted that, regarding the above-mentioned first video content type, the type of the first video content may be a document type, or may be a non-document type such as a video image, or may be another content type.
It should be noted that the first video content type, the first identification information and the first frame rate parameter are in one-to-one correspondence. For example, when the first video content type is a document type, the first identification information takes the value 1 and the first frame rate parameter is f1; when the first video content type is a non-document type, the first identification information takes the value 0 and the first frame rate parameter is f2. Here f1 is set smaller than f2, because document-type content requires a lower playback frame rate than non-document-type content.
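The one-to-one correspondence described above can be sketched as a simple lookup table. The concrete frame rate values below (5 fps for document content, 30 fps for non-document content) and the function name are assumed purely for illustration; the embodiment only requires that f1 be smaller than f2:

```python
# Illustrative mapping from video content type to identification information
# and frame rate parameter. The values 5 and 30 stand in for f1 and f2.
CONTENT_TYPE_TABLE = {
    "document": {"identification": 1, "frame_rate": 5},       # f1: slow-changing content
    "non_document": {"identification": 0, "frame_rate": 30},  # f2: full-motion content
}

def lookup_parameters(content_type: str) -> tuple[int, int]:
    """Return (identification information, frame rate parameter) for a content type."""
    entry = CONTENT_TYPE_TABLE[content_type]
    return entry["identification"], entry["frame_rate"]
```

A local lookup of this kind corresponds to the "local table look-up mode" mentioned below; the same mapping could equally be served by a cloud server.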
It should be noted that, regarding the method for acquiring the first video content type of the target video in step S100, the first video content type may be input by the user at the time of initial sharing, and the first identification information and the first frame rate parameter may be input by the user at the same time; it is also possible to determine the first video content type by means of motion vector parameters of the encoded frames during video encoding, which is similar to the method steps in fig. 3 described below; the first video content type may also be determined by image features of a video image in the target video, in a process similar to the method steps in fig. 7 described below; other acquisition modes are also possible, and the embodiment of the present application does not specifically limit the acquisition mode of the first video content type.
In addition, it is noted that after the video transmitting end obtains the first video content type, the corresponding first identification information and the first frame rate parameter can be determined by a local table look-up mode; or the first video content type is sent to the cloud server, and the first identification information and the first frame rate parameter fed back by the cloud server are received; other acquisition modes are also possible, and the embodiment of the present application does not specifically limit the acquisition modes of the first identification information and the first frame rate parameter.
In addition, it is noted that, after determining the first frame rate parameter as the target frame rate parameter, the video sending end can adaptively adjust the video capture and encoding frame rate based on it; the sending end also encodes the first identification information together with the target video, writes it into a specific position in the code stream, and then transmits the code stream to the video receiving end over the network for sharing; the video receiving end decodes the received video code stream, extracts the first identification information, determines the first frame rate parameter from it, and adaptively adjusts the video display frame rate based on that parameter.
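The sender/receiver exchange described above can be sketched as follows. This is a hypothetical simplification: a real implementation would write the identification information into a codec-specific field of the bitstream, whereas here the "bitstream" is a plain Python object and the identification-to-frame-rate mapping is assumed:

```python
from dataclasses import dataclass

# Assumed mapping: 1 = document type (low rate), 0 = non-document type (high rate).
ID_TO_FRAME_RATE = {1: 5, 0: 30}

@dataclass
class VideoBitstream:
    identification: int  # target identification information, written into the stream
    frames: list         # encoded target video frames (placeholder)

def sender_encode(frames: list, identification: int) -> VideoBitstream:
    # Capture/encode at the frame rate implied by the identification, then
    # write the identification into the bitstream alongside the video.
    return VideoBitstream(identification=identification, frames=frames)

def receiver_display_rate(stream: VideoBitstream) -> int:
    # Extract the identification from the decoded stream and look up the
    # frame rate parameter used to display the decoded video.
    return ID_TO_FRAME_RATE[stream.identification]
```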
Notably, the embodiment of the application can determine the corresponding target frame rate parameter according to the first video content type of the target video and process different video content types with different frame rate parameters, so that bandwidth resources can be saved when playing content with a low frame rate requirement, such as shared document content, and playback stuttering can be reduced when playing content with a high frame rate requirement, such as shared video content, thereby enhancing the subjective experience.
In addition, as shown in fig. 3, fig. 3 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application, and after encoding the target identification information and the acquired target video in the step S200, the video processing method according to the embodiment of the present application further includes, but is not limited to, step S410, step S420, and step S430.
Step S410, obtaining a coding frame in the coding process, and determining motion vector parameters of the coding frame;
step S420, determining a second video content type of the target video according to the motion vector parameter, and determining corresponding second identification information and a second frame rate parameter according to the second video content type;
step S430, updating the target identification information to the second identification information, and updating the target frame rate parameter to the second frame rate parameter.
Specifically, in an embodiment of the present application, during encoding of the target identification information and the target video, the video transmitting end determines a second video content type of the target video according to a motion vector parameter of an encoding frame in an encoding process, and then determines corresponding second identification information and a second frame rate parameter according to the second video content type, so as to update the target identification information and the target frame rate parameter, respectively, so that the second identification information and the second frame rate parameter are used in a subsequent acquisition and encoding process.
In addition, it should be noted that, regarding the second video content type, the second video content type may be a document type, a non-document type such as a video image, or another content type, as in the first video content type, and the type of the second video content type is not particularly limited in the embodiments of the present application.
It should be noted that the second video content type, the second identification information and the second frame rate parameter are in one-to-one correspondence. For example, when the second video content type is a document type, the second identification information takes the value 1 and the second frame rate parameter is f1; when the second video content type is a non-document type, the second identification information takes the value 0 and the second frame rate parameter is f2. Here f1 is set smaller than f2, because document-type content requires a lower playback frame rate than non-document-type content.
In addition, it is noted that after the video transmitting end obtains the second video content type, the corresponding second identification information and the second frame rate parameter can be determined by a local table look-up mode; or the second video content type is sent to the cloud server, and second identification information and second frame rate parameters fed back by the cloud server are received; other acquisition modes are also possible, and the embodiment of the present application does not specifically limit the acquisition modes of the second identification information and the second frame rate parameter.
In addition, as shown in fig. 4, fig. 4 is a flowchart of a video processing method at a video transmitting end according to another embodiment of the present application, and regarding the second video content type of the target video determined according to the motion vector parameter in the above step S420, the method includes, but is not limited to, step S510, step S520, and step S530.
Step S510, performing image division on the coded frame to obtain a plurality of block images, and determining the total image number of all the block images;
step S520, calculating the difference values of the motion vector parameters of all adjacent two block images, and determining the number of motion blocks of the motion block images in the block images according to all the difference values;
step S530, determining a second video content type of the target video according to the total image number and the motion block number.
Specifically, the procedure for determining the second video content type of the target video according to the motion vector parameters is as follows: first, the video sending end divides the encoded frame into a plurality of block images and counts the total number of block images; then, for each pair of adjacent block images, the video sending end calculates the difference between their motion vector parameters, determines from that difference whether the latter block image of the pair is a motion block image, and counts the number of motion block images across all pairs; finally, the second video content type of the target video is determined from the total image number and the motion block number.
In addition, as shown in fig. 5, fig. 5 is a flowchart of a video processing method at the video transmitting end side according to another embodiment of the present application, and regarding the determination of the number of moving blocks of the moving block image in the block image according to all the differences in the above step S520, including but not limited to step S610 and step S620.
Step S610, for each pair of adjacent block images, if the difference between the motion vector parameter of the current block image and that of the next block image is larger than a first preset threshold, recording the next block image as a motion block image;
step S620, counting the number of the motion blocks of all the motion block images.
Specifically, for each pair of adjacent block images, if the difference between the motion vector parameter of the previous block image and that of the next block image is larger than the first preset threshold, the video sending end marks the next block image as a motion block image; conversely, if the difference is smaller than or equal to the first preset threshold, the two images change little from one to the other, and the video sending end does not record the next block image as a motion block image.
It may be understood that, regarding the above-mentioned first preset threshold, the first preset threshold may be input in advance by a user, or may be determined according to a preset rule according to a current video transmission situation, and a determining manner of the first preset threshold in the embodiment of the present application is not specifically limited.
In addition, as shown in fig. 6, fig. 6 is a flowchart of a video processing method at the video transmitting end side according to another embodiment of the present application, and regarding the second video content type of the target video determined according to the total number of images and the number of motion blocks in the above step S530, includes, but is not limited to, step S710 and step S720.
Step S710, calculating the ratio of the number of the motion blocks to the total image number;
and step S720, comparing the ratio with a second preset threshold value, and determining a second video content type of the target video according to the comparison result.
Specifically, after the total image number and the motion block number are calculated, the video transmitting end calculates the ratio between the motion block number and the total image number, and if the ratio is greater than a second preset threshold, it can be determined that the second video content type of the target video is a non-document type; if the ratio is less than or equal to a second preset threshold, it may be determined that the second video content type of the target video is a document type.
It may be understood that, regarding the above-mentioned second preset threshold, the second preset threshold may be input in advance by a user, or may be determined according to a preset rule according to a current video transmission situation, and a determining manner of the second preset threshold in the embodiment of the present application is not specifically limited.
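Steps S510 through S720 above (counting motion blocks by comparing the motion vector parameters of adjacent blocks, then classifying by the ratio of motion blocks to total blocks) can be sketched as follows. The threshold values `t1` and `t2` and the representation of motion vectors as scalar magnitudes in scan order are assumptions for illustration only:

```python
def classify_frame(motion_vectors, t1=4.0, t2=0.2):
    """Classify one encoded frame as 'document' or 'non_document'.

    motion_vectors: per-block motion vector magnitudes in scan order.
    t1: first preset threshold on the difference between adjacent blocks.
    t2: second preset threshold on the motion-block ratio.
    (Both default threshold values are assumed purely for illustration.)
    """
    total_blocks = len(motion_vectors)
    motion_blocks = 0
    # Steps S510/S520/S610: compare each block with its predecessor; when the
    # motion vector difference exceeds t1, count the latter block as a motion block.
    for prev, curr in zip(motion_vectors, motion_vectors[1:]):
        if abs(curr - prev) > t1:
            motion_blocks += 1
    # Steps S710/S720: classify by the ratio of motion blocks to total blocks.
    return "non_document" if motion_blocks / total_blocks > t2 else "document"
```

A static document page yields near-zero differences everywhere and is classified as document content, while full-motion video pushes the ratio above `t2`.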
In addition, as shown in fig. 7, fig. 7 is a flowchart of a video processing method at the video transmitting end side according to another embodiment of the present application, and regarding the second video content type of the target video determined according to the total image number and the motion block number in the above step S530, includes, but is not limited to, step S810 and step S820.
Step S810, for each encoded frame, determining a video content type according to the total number of images and the number of motion blocks;
step S820, selecting the most frequent video content type as the second video content type of the target video.
Specifically, to make the second video content type more accurate, the video transmitting end determines a video content type for each encoded frame, so that a plurality of video content types are obtained across the plurality of encoded frames; the video transmitting end then selects the most frequent video content type as the second video content type of the target video.
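The per-frame voting described above can be sketched with a frequency count; the function name and labels are assumptions for illustration:

```python
from collections import Counter

def majority_content_type(per_frame_types):
    """Select the most frequent per-frame content type as the second
    video content type of the whole target video."""
    return Counter(per_frame_types).most_common(1)[0][0]
```

With per-frame results ["document", "document", "non-document"], the majority type "document" would be chosen.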
In addition, as shown in fig. 8, fig. 8 is a flowchart of a video processing method on the video transmitting end side according to another embodiment of the present application; after the target video is collected according to the target frame rate parameter in step S200, the video processing method of the embodiment of the present application further includes, but is not limited to, step S910, step S920, and step S930.
Step S910, extracting features of a video image of a target video to obtain image features;
step S920, determining a second video content type of the target video according to the image characteristics, and determining corresponding second identification information and a second frame rate parameter according to the second video content type;
step S930, updating the target identification information to the second identification information, and updating the target frame rate parameter to the second frame rate parameter.
Specifically, in the embodiment of the present application, after the target video is acquired according to the target frame rate parameter, the video transmitting end performs feature extraction on the video images of the target video to obtain image features; it then determines the second video content type according to the image features, and determines the second identification information and second frame rate parameter corresponding to the second video content type, so as to update the target identification information and the target frame rate parameter respectively, such that the second identification information and the second frame rate parameter are adopted in the subsequent acquisition and encoding process.
In addition, it should be noted that, like the first video content type, the second video content type may be a document type, a non-document type such as a video image, or another content type; the embodiments of the present application do not specifically limit the type of the second video content type.
It should be noted that the second video content type, the second identification information, and the second frame rate parameter are in one-to-one correspondence. For example, when the second video content type is a document type, the second identification information takes the value 1 and the second frame rate parameter is f1; when the second video content type is a non-document type, the second identification information takes the value 0 and the second frame rate parameter is f2. Here, f1 is set smaller than f2 because the playback frame rate requirement of the document type is lower than that of the non-document type.
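The one-to-one correspondence can be realized as a small lookup table; the concrete frame rates (5 and 30 fps) are illustrative assumptions standing in for f1 and f2:

```python
F1, F2 = 5, 30  # assumed values; f1 < f2, since documents need a lower playback frame rate

# Hypothetical table tying content type, identification flag, and frame rate together.
TYPE_TABLE = {
    "document":     {"flag": 1, "frame_rate": F1},
    "non-document": {"flag": 0, "frame_rate": F2},
}

def params_for(content_type):
    """Resolve the identification information and frame rate parameter for a type."""
    entry = TYPE_TABLE[content_type]
    return entry["flag"], entry["frame_rate"]
```

This mirrors the local table lookup described in this application as one possible acquisition manner.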
In addition, it should be noted that, after obtaining the second video content type, the video transmitting end may determine the corresponding second identification information and second frame rate parameter through a local table lookup; alternatively, it may send the second video content type to a cloud server and receive the second identification information and second frame rate parameter fed back by the cloud server. Other acquisition manners are also possible, and the embodiments of the present application do not specifically limit how the second identification information and the second frame rate parameter are acquired.
In addition, as shown in fig. 9, fig. 9 is a flowchart of a video processing method on the video transmitting end side according to another embodiment of the present application; determining the second video content type of the target video according to the image features in step S920 includes, but is not limited to, step S1010 and step S1020.
Step S1010, classifying and identifying the image characteristics to obtain a probability value of the video content type of the target video;
and step S1020, comparing the probability value with a third preset threshold value, and determining a second video content type of the target video according to the comparison result.
Specifically, after extracting the image features, the video transmitting end performs classification and identification on the image features to obtain a probability value of the video content type of the target video; the video transmitting end then compares the probability value with a third preset threshold and determines the second video content type of the target video according to the comparison result.
When the probability value represents the probability that the content belongs to a non-document category such as video, if the probability value is greater than the third preset threshold, the second video content type of the target video is determined to be a video type; if the probability value is less than or equal to the third preset threshold, the second video content type of the target video is determined to be a document type.
Conversely, when the probability value represents the probability that the content belongs to a document category, if the probability value is greater than the third preset threshold, the second video content type of the target video is determined to be a document type; if the probability value is less than or equal to the third preset threshold, the second video content type of the target video is determined to be a non-document type.
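Both threshold comparisons can be captured in one sketch; the function name and the default threshold are illustrative assumptions:

```python
def classify_by_probability(prob, positive_label, threshold=0.5):
    """Map a classifier probability to a content type.

    `positive_label` names the class whose probability `prob` represents
    ("document" or "non-document"); `threshold` stands in for the third
    preset threshold.
    """
    other = "document" if positive_label == "non-document" else "non-document"
    return positive_label if prob > threshold else other
```

For instance, a non-document probability of 0.8 yields "non-document", while 0.3 yields "document".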
It may be understood that the third preset threshold may be input in advance by a user, or may be determined from the current video transmission situation according to a preset rule; the manner of determining the third preset threshold is not specifically limited in the embodiments of the present application.
In addition, as shown in fig. 10, fig. 10 is a flowchart of a video processing method on the video transmitting end side according to another embodiment of the present application; determining the second video content type of the target video according to the comparison result in step S1020 includes, but is not limited to, step S1110 and step S1120.
Step S1110, for each video image, determining a comparison result according to the probability value and a third preset threshold value;
step S1120, determining the second video content type of the target video according to the comparison result that occurs most frequently.
Specifically, to make the second video content type more accurate, the video transmitting end determines a video content type for each video image, so that a plurality of video content types are obtained across the plurality of video images; the video transmitting end then selects the most frequent video content type as the second video content type of the target video.
Based on the above-described implementation environment and the video processing method on the video transmitting side, various embodiments of the video processing method on the video receiving side of the present application are set forth below.
As shown in fig. 11, fig. 11 is a flowchart of a video processing method on the video receiving end side according to an embodiment of the present application; the video processing method may be applied to the video receiving end in fig. 1 and includes, but is not limited to, step S1200, step S1300, and step S1400.
Step S1200, receiving a video code stream from a video transmitting end;
step S1300, decoding the video code stream to obtain decoded video and target identification information;
step S1400, determining corresponding target frame rate parameters according to the target identification information, and displaying the decoded video according to the target frame rate parameters.
Specifically, in the video processing process, the video transmitting end first acquires the first video content type of the target video, determines the corresponding first identification information and first frame rate parameter according to the first video content type, and takes the first identification information as the target identification information and the first frame rate parameter as the target frame rate parameter. The video transmitting end then collects the target video according to the target frame rate parameter, encodes the target identification information together with the collected target video to obtain a video code stream, and transmits the video code stream to the video receiving end. Finally, the video receiving end decodes the video code stream to obtain the decoded video and the target identification information, obtains the corresponding target frame rate parameter according to the target identification information, and displays the decoded video according to the target frame rate parameter.
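The receiving end's mapping from the decoded identification information to a display frame rate can be sketched as a table lookup; the flag values follow the convention used elsewhere in this application, and the frame rates are assumed placeholders for f1 and f2:

```python
# Hypothetical local lookup table at the video receiving end:
# flag 1 = document type (frame rate f1), flag 0 = non-document type (frame rate f2).
FLAG_TO_RATE = {1: 5, 0: 30}  # 5 and 30 fps are illustrative values

def display_frame_rate(decoded_flag):
    """Resolve the target frame rate parameter from the decoded target
    identification information."""
    return FLAG_TO_RATE[decoded_flag]
```

The decoded video would then be rendered at `display_frame_rate(flag)` frames per second.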
In addition, it should be noted that, after the second video content type is obtained, the video receiving end may determine the corresponding second identification information and second frame rate parameter through a local table lookup; alternatively, the second video content type may be sent to a cloud server, and the second identification information and second frame rate parameter fed back by the cloud server may be received. Other acquisition manners are also possible, and the embodiments of the present application do not specifically limit how the second identification information and the second frame rate parameter are acquired.
The video receiving end decodes the received video code stream, extracts the target identification information, determines the target frame rate parameter according to the target identification information, and adaptively adjusts the video display frame rate parameter based on the target frame rate parameter.
It should be noted that, since the video processing method at the video receiving side of the embodiment of the present application corresponds to the video processing method at the video transmitting side of the above embodiment, the specific implementation and the technical effect of the video processing method at the video receiving side of the embodiment of the present application may correspond to the specific implementation and the technical effect of the video processing method at the video transmitting side described above.
Based on the above implementation environment, the video processing method at the video transmitting side, and the video processing method at the video receiving side, overall embodiments of the video processing method of the present application are presented below.
As shown in fig. 12, fig. 12 is a schematic diagram of an implementation environment for performing a video processing method according to an embodiment of the present application. The method comprises the following: first, the video transmitting end sets related parameters, including the video content type flag and the frame rate; it then encodes the video content type flag together with the acquired original video to output a video code stream and transmits the code stream to the receiving end through the network. The video receiving end decodes the received code stream, extracts the video content type flag, and adjusts the frame rate parameter of the video display according to it. Meanwhile, the transmitting end detects and analyzes the video content using the original video and the information generated during encoding, and updates the video content type flag and the frame rate parameter according to the detection result for subsequent video processing.
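The feedback loop of fig. 12 can be condensed into a schematic sender-side loop; every name here (the `detect` callback, the tuple standing in for an encoded frame) is an illustrative placeholder, not a real API:

```python
def sender_loop(frames, detect, f1=5, f2=30):
    """Sketch of the fig. 12 pipeline: each frame is encoded together with
    the current flag and rate, after which content detection updates both
    parameters for subsequent processing. The f1/f2 values are assumptions."""
    flag, rate = 1, f1                       # initial parameters: document type
    encoded = []
    for frame in frames:
        encoded.append((frame, flag, rate))  # stands in for video coding + transmission
        detected = detect(frame)             # content detection module
        flag, rate = (1, f1) if detected == "document" else (0, f2)
    return encoded
```

Note that a detection result only takes effect from the next frame onward, matching the update-for-subsequent-processing behavior described above.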
The principal steps in fig. 12 are as follows:
Parameter setting: set parameters such as the video content type flag and the video frame rate.
Video acquisition: acquire video at the set frame rate.
Video coding: encode the acquired video and the video content type flag using a common video encoder such as H.26x, AV1, or AVS.
Network transmission: transmit the video code stream from the transmitting end to the receiving end.
Video display: render and display the decoded video.
Video decoding: decode the video code stream.
Content detection: analyze and determine the video content type using the video and its related information, and output the video content type detection result.
In addition, as shown in fig. 13, fig. 13 is an overall flowchart of a video processing method provided in an embodiment of the present application. The method specifically comprises the following steps:
step S1-1: setting related parameters of auxiliary stream video of a video conference system, wherein the type of the initial auxiliary stream video content is a document, the flag bit flag of the document is set to be 1, and the initial frame rate parameter is set to be f1;
step S1-2: acquiring an original video according to a frame rate f1, and encoding the original video and a video content type flag bit flag by adopting an H.265 video encoder to obtain a video code stream, wherein video type flag information is written into a protocol reserved position, and meanwhile, motion vector information of an encoded frame is stored for a subsequent content detection module to use;
step S1-3: transmitting the video code stream encoded in step S1-2 to the video receiving end through a TCP network;
step S1-4: the video receiving end performs video decoding on the received code stream using a corresponding H.265 video decoder to obtain the decoded video and the content type flag;
step S1-5: the video receiving end displays and plays the decoded video according to the video type flag; if the video content type flag bit is 0, the frame rate is adjusted to f2 (f2 >> f1), otherwise the frame rate is adjusted to f1;
step S1-6: the content detection algorithm analyzes and processes the motion vector of the coding frame generated in the step S1-2, and the specific method is as follows: dividing the image into blocks, wherein the width W and the height H of the input image are equal, and dividing the image into blocks by adopting the preset patch size to obtain the number of blocks of the whole imageComparing the motion vector in each block with the motion vector value of the corresponding block of the previous frame, and if the difference value of the motion vector and the motion vector value is larger than a certain threshold value, considering the current block as motion, and recording by using a motionblock; calculating the ratio of motion block=motionblock/num in the whole image, if the ratio is>T, can be obtained through experiments according to the requirements, if the current image content type is video, if the current image content type is document type, the current frame type zone bit is stored; voting the frame type flag bits of the N frames stored continuously to obtain the type with the highest number of votes, namely the final video content type result of the current frame;
step S1-7: according to the detection result of the previous step, modify the video content type flag bit of the video transmitting end and the frame rate parameters for video acquisition and coding, specifically: if the video content type detection result is a document, set the video content type flag bit to 1 and the video frame rate parameter to f1; if the video content type detection result is a non-document, set the video content type flag bit to 0 and the video frame rate parameter to f2 (f2 >> f1).
As shown in fig. 14, fig. 14 is an overall flowchart of a video processing method according to another embodiment of the present application. The method specifically comprises the following steps:
step S2-1: setting related parameters of auxiliary stream video of a video conference system, wherein the type of the initial auxiliary stream video content is a document, the flag bit flag of the document is set to be 1, and the initial frame rate parameter is set to be f1;
step S2-2: acquiring an original video according to a frame rate f1, and adopting an H.265 hardware video encoder to encode the original video and a video content type flag bit flag to obtain a video code stream, wherein video type flag information is written into a protocol reserved position;
step S2-3: transmitting the video code stream coded in the step S2-2 to a video receiving end through a UDP network;
step S2-4: the video receiving end performs video decoding on the received code stream using a corresponding H.265 video decoder to obtain the decoded video and the video content type flag bit;
step S2-5: the video receiving end adjusts the video frame rate parameter according to the video type flag to display and play the decoded video; if the video content type flag is 0, the frame rate is adjusted to f2 (f2 >> f1), otherwise the frame rate is adjusted to f1;
step S2-6: the content detection module adopts a convolutional neural network-based video content detection, the network input is the original video acquired in the step S2-2, the image characteristics are extracted through a deep learning network (such as mobilenet, resnet and the like), and a classifier (such as softmax, sigmoid and the like) judges the category attribute of the image characteristics, and the specific method is as follows: training the neural network resnet50 by using the labeling data set to acquire network parameters for subsequent feature extraction; performing feature extraction on the input image by using a trained neural network resnet50 to obtain image features; inputting image features into a sigmoid classifier, and acquiring a probability value prob of a video of a category to which the current feature belongs; if the video category probability value prob is more than 0.5, judging the current frame type as a document, otherwise, judging the current frame type as a non-document, and storing the current frame type flag bit; voting the frame type flag bits of the N frames stored continuously to obtain the type with the highest number of votes, namely the final video content type detection result of the current frame;
step S2-7: according to the detection result of the previous step, modify the video content type flag bit of the video transmitting end and the frame rate parameters for video acquisition and coding, specifically: if the video content type detection result is a document, set the video content type flag bit to 1 and the video frame rate parameter to f1; if the video content type detection result is a non-document, set the video content type flag bit to 0 and the video frame rate parameter to f2 (f2 >> f1).
According to the technical solutions of the embodiments of the present application, dynamically adjusting the video frame rate through video content detection ensures a good subjective experience for different video contents; the adaptive frame rate can also control links such as video acquisition and video coding, effectively utilizing and saving bandwidth resources; and since the video receiving end shares the detection result of the transmitting end, synchronization is ensured and computational complexity is reduced.
Based on the above-described implementation environment, the video processing method on the video transmitting side, and the video processing method on the video receiving side, various embodiments of the video transmitting side, the video receiving side, the computer-readable storage medium, and the computer program product of the present application are presented below.
As shown in fig. 15, fig. 15 is a schematic structural diagram of a video transmitting end according to an embodiment of the present application; an embodiment of the present application further discloses a video transmitting terminal 100, including: a first memory 120, a first processor 110, and a computer program stored on the first memory 120 and executable on the first processor 110, where the first processor 110 performs the video processing method on the video transmitting end side as in any of the foregoing embodiments when executing the computer program.
The first memory 120, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the first memory 120 may include high-speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the first memory 120 optionally includes memory remotely located with respect to the first processor 110, which may be connected to the implementation environment via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The video transmitting terminal 100 in this embodiment may correspond to a video transmitting terminal in the implementation environment of the embodiment shown in fig. 1, and both belong to the same application concept, so that both have the same implementation principle and beneficial effects, which are not described in detail herein.
The non-transitory software program and instructions required to implement the video processing method on the video sender side of the above-described embodiment are stored in the first memory 120, and when executed by the first processor 110, perform the video processing method on the video sender side of the above-described embodiment, for example, perform the method steps in fig. 2 to 10 described above.
It should be noted that, since the video sender 100 of the embodiment of the present application is capable of executing the video processing method of the video sender side of the embodiment, the specific implementation and the technical effect of the video sender 100 of the embodiment of the present application may correspond to the specific implementation and the technical effect of the video processing method of the video sender side described above.
In addition, as shown in fig. 16, fig. 16 is a schematic structural diagram of a video receiving end according to an embodiment of the present application; an embodiment of the present application further discloses a video receiving terminal 200, including: a second memory 220, a second processor 210, and a computer program stored in the second memory 220 and executable on the second processor 210, where the second processor 210 performs the video processing method on the video receiving end side as in any of the foregoing embodiments when executing the computer program.
The second memory 220, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the second memory 220 may include high-speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the second memory 220 optionally includes memory remotely located relative to the second processor 210, which may be connected to the implementation environment via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The video receiving end 200 in this embodiment may correspond to a video receiving end in the implementation environment of the embodiment shown in fig. 1, and both belong to the same application concept, so that both have the same implementation principle and beneficial effects, which are not described in detail herein.
The non-transitory software program and instructions required to implement the video processing method on the video receiving side of the above-described embodiment are stored in the second memory 220, and when executed by the second processor 210, perform the video processing method on the video receiving side of the above-described embodiment, for example, perform the method steps in fig. 11 described above.
It should be noted that, since the video receiving terminal 200 of the embodiment of the present application is capable of executing the video processing method of the video receiving terminal side of the embodiment, the specific implementation and the technical effect of the video receiving terminal 200 of the embodiment of the present application may correspond to those of the video processing method of the video receiving terminal side.
In addition, an embodiment of the present application also discloses a computer-readable storage medium having stored therein computer-executable instructions for performing the video processing method of any of the previous embodiments.
It should be noted that, since the computer readable storage medium of the embodiment of the present application can perform the video processing method on the video transmitting side or the video receiving side of the above embodiment, the specific implementation and the technical effect of the computer readable storage medium of the embodiment of the present application may correspond to those of the video processing method on the video transmitting side or the video receiving side described above.
Furthermore, an embodiment of the present application also discloses a computer program product comprising a computer program or computer instructions stored in a computer readable storage medium, the computer program or computer instructions being read from the computer readable storage medium by a processor of a computer device, the processor executing the computer program or computer instructions to cause the computer device to perform the video processing method as in any of the previous embodiments.
It should be noted that, since the computer program product of the embodiment of the present application is capable of executing the video processing method on the video transmitting side or the video receiving side of the above embodiment, the specific implementation and the technical effect of the computer program product of the embodiment of the present application may correspond to the specific implementation and the technical effect of the video processing method on the video transmitting side or the video receiving side described above.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

Claims (14)

1. A video processing method applied to a video transmitting end, the method comprising:
acquiring a first video content type of a target video, determining corresponding first identification information and a first frame rate parameter according to the first video content type, taking the first identification information as target identification information and taking the first frame rate parameter as target frame rate parameter;
acquiring the target video according to the target frame rate parameter, and coding the target identification information and the acquired target video to obtain a video code stream;
and sending the video code stream to a video receiving end, so that the video receiving end decodes the video code stream to obtain the decoded video and the target identification information, and then displays the decoded video according to the target frame rate parameter corresponding to the target identification information.
2. The video processing method according to claim 1, wherein after the encoding of the target identification information and the acquired target video, the method further comprises:
acquiring a coding frame in the coding process, and determining motion vector parameters of the coding frame;
Determining a second video content type of the target video according to the motion vector parameter, and determining corresponding second identification information and a second frame rate parameter according to the second video content type;
updating the target identification information to the second identification information and updating the target frame rate parameter to the second frame rate parameter.
3. The video processing method according to claim 2, wherein said determining a second video content type of the target video from the motion vector parameters comprises:
performing image division on the coded frame to obtain a plurality of block images, and determining the total image number of all the block images;
calculating the difference value of motion vector parameters of all adjacent two block images, and determining the number of motion blocks of the motion block images in the block images according to all the difference values;
and determining a second video content type of the target video according to the total image number and the motion block number.
4. A video processing method according to claim 3, wherein said determining the number of motion blocks of the motion block image in the block image based on all the differences comprises:
for every two adjacent block images, when the difference value between the motion vector parameter of the current block image and the motion vector parameter of the next block image is greater than the first preset threshold value, recording the current block image as a motion block image;
and counting the number of the motion blocks of all the motion block images.
5. The video processing method according to claim 3, wherein the determining the second video content type of the target video according to the total number of block images and the number of motion block images comprises:
calculating a ratio of the number of motion block images to the total number of block images;
and comparing the ratio with a second preset threshold, and determining the second video content type of the target video according to the comparison result.
6. The video processing method according to any one of claims 3 to 5, wherein the determining the second video content type of the target video according to the total number of block images and the number of motion block images comprises:
for each of the encoded frames, determining a video content type according to the total number of block images and the number of motion block images of that frame;
and selecting the video content type determined for the largest number of encoded frames as the second video content type of the target video.
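The motion-vector path of claims 3 to 6 can be sketched end to end. The sketch below is illustrative only: the patent does not disclose concrete threshold values or type names, so `first_threshold`, `second_threshold`, and the two type labels are assumptions, and the motion vector parameter is modeled as a single scalar per block image.

```python
# Sketch of claims 3-6: per-frame motion-block counting, ratio thresholding,
# and a majority vote across frames. Thresholds and labels are hypothetical.
from collections import Counter


def classify_frame(mv_params, first_threshold=4.0, second_threshold=0.3):
    """Classify one encoded frame from the motion vector parameters of its block images."""
    total_blocks = len(mv_params)  # total number of block images (claim 3)
    # Claim 4: a block image is a motion block image when its motion vector
    # parameter differs from the next adjacent block's by more than the
    # first preset threshold.
    motion_blocks = sum(
        1 for cur, nxt in zip(mv_params, mv_params[1:])
        if abs(cur - nxt) > first_threshold
    )
    # Claim 5: compare the ratio with the second preset threshold.
    ratio = motion_blocks / total_blocks
    return "natural_video" if ratio > second_threshold else "screen_content"


def classify_video(frames_mv_params):
    """Claim 6: pick the content type determined for the most encoded frames."""
    votes = Counter(classify_frame(f) for f in frames_mv_params)
    return votes.most_common(1)[0][0]
```

A frame whose adjacent block motion vectors all match yields a ratio of 0 and is treated as screen content; a frame of alternating large motion vectors crosses the ratio threshold and votes for natural video.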
7. The video processing method according to claim 1, wherein after the acquiring of the target video according to the target frame rate parameter, the method further comprises:
performing feature extraction on video images of the target video to obtain image features;
determining a second video content type of the target video according to the image features, and determining corresponding second identification information and a second frame rate parameter according to the second video content type;
and updating the target identification information to the second identification information, and updating the target frame rate parameter to the second frame rate parameter.
8. The video processing method according to claim 7, wherein the determining a second video content type of the target video according to the image features comprises:
performing classification and recognition on the image features to obtain a probability value of a video content type of the target video;
and comparing the probability value with a third preset threshold, and determining the second video content type of the target video according to the comparison result.
9. The video processing method according to claim 8, wherein the determining the second video content type of the target video according to the comparison result comprises:
for each video image, determining a comparison result according to the probability value and the third preset threshold;
and determining the second video content type of the target video according to the comparison result that occurs the largest number of times.
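The feature-based path of claims 7 to 9 reduces to a per-image threshold test followed by a majority vote. In the sketch below the classifier itself is elided (the patent leaves it open); the threshold value and the two type labels are assumptions introduced for illustration.

```python
# Sketch of claims 8-9: compare each image's probability value with the third
# preset threshold, then take the most frequent comparison result. The 0.5
# threshold and the type labels are hypothetical.
from collections import Counter


def decide_content_type(probability, third_threshold=0.5):
    """Per-image comparison result: probability vs. the third preset threshold."""
    return "screen_content" if probability >= third_threshold else "natural_video"


def classify_by_features(prob_per_image, third_threshold=0.5):
    """Claim 9: the comparison result occurring the most times wins."""
    results = [decide_content_type(p, third_threshold) for p in prob_per_image]
    return Counter(results).most_common(1)[0][0]
```

For example, probability values of 0.9, 0.8, and 0.2 across three video images give two "screen content" results and one "natural video" result, so the video is classified as screen content.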
10. A video processing method, applied to a video receiving end, the method comprising:
receiving a video code stream from a video sending end;
decoding the video code stream to obtain a decoded video and target identification information;
and determining a corresponding target frame rate parameter according to the target identification information, and displaying the decoded video according to the target frame rate parameter.
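The receiving-end flow of claim 10 can be sketched as a table lookup followed by display scheduling. The identification-to-frame-rate table and the timestamp-pairing representation below are illustrative assumptions; a real receiver would schedule rendering rather than return a list.

```python
# Sketch of claim 10: the identification information recovered from the code
# stream selects the target frame rate used to display the decoded video.
# The table values are hypothetical examples.

FRAME_RATE_TABLE = {
    0: 15,   # e.g. mostly-static screen content
    1: 30,   # mixed content
    2: 60,   # high-motion natural video
}


def display_decoded_video(identification, decoded_frames):
    """Pair each decoded frame with a display timestamp at the target frame rate."""
    fps = FRAME_RATE_TABLE[identification]
    interval = 1.0 / fps
    # (timestamp_seconds, frame) pairs in presentation order.
    return [(i * interval, frame) for i, frame in enumerate(decoded_frames)]
```

With identification 2, consecutive frames are spaced 1/60 s apart; with identification 0 the same frames would be spaced 1/15 s apart, so the sender's content-type decision directly controls display cadence.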
11. A video transmitting end, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the video processing method according to any one of claims 1 to 9 when executing the computer program.
12. A video receiving end, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the video processing method according to claim 10 when executing the computer program.
13. A computer-readable storage medium storing computer-executable instructions for performing the video processing method of any one of claims 1 to 9 or the video processing method of claim 10.
14. A computer program product, comprising a computer program or computer instructions, wherein the computer program or the computer instructions are stored in a computer-readable storage medium; a processor of a computer device reads the computer program or the computer instructions from the computer-readable storage medium, and runs the computer program or the computer instructions, such that the computer device performs the video processing method according to any one of claims 1 to 9, or the video processing method according to claim 10.
CN202210678897.8A 2022-06-16 2022-06-16 Video processing method, transmitting end, receiving end, storage medium and program product Pending CN117294683A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210678897.8A CN117294683A (en) 2022-06-16 2022-06-16 Video processing method, transmitting end, receiving end, storage medium and program product
PCT/CN2023/099449 WO2023241485A1 (en) 2022-06-16 2023-06-09 Video processing method, video sending end, video receiving end, storage medium and program product

Publications (1)

Publication Number Publication Date
CN117294683A true CN117294683A (en) 2023-12-26

Family

ID=89192202

Country Status (2)

Country Link
CN (1) CN117294683A (en)
WO (1) WO2023241485A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104811722B (en) * 2015-04-16 2019-05-07 华为技术有限公司 A kind of decoding method and device of video data
CN108833923B (en) * 2018-06-20 2022-03-29 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding device, video decoding device, storage medium and computer equipment
CN113315973B (en) * 2019-03-06 2022-12-20 深圳市道通智能航空技术股份有限公司 Encoding method, image encoder, and image transmission system
CN112995776B (en) * 2021-01-26 2023-04-07 北京字跳网络技术有限公司 Method, device, equipment and storage medium for determining screen capture frame rate of shared screen content
CN113965751B (en) * 2021-10-09 2023-03-24 腾讯科技(深圳)有限公司 Screen content coding method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication