WO2020249100A1 - Video processing method, apparatus and device - Google Patents


Publication number
WO2020249100A1
WO2020249100A1 PCT/CN2020/095882 CN2020095882W WO2020249100A1 WO 2020249100 A1 WO2020249100 A1 WO 2020249100A1 CN 2020095882 W CN2020095882 W CN 2020095882W WO 2020249100 A1 WO2020249100 A1 WO 2020249100A1
Authority
WO
WIPO (PCT)
Prior art keywords
video frame
text content
text
video
information
Prior art date
Application number
PCT/CN2020/095882
Other languages
French (fr)
Chinese (zh)
Inventor
曾以亮
毛春静
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2020249100A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Definitions

  • This application relates to the field of computer technology, and in particular to a video processing method, device and equipment.
  • the video includes multiple video frames, and each video frame may include text content.
  • the text content may include subtitles, bullet comments (barrage), and prompt information.
  • the resolution of video continues to increase, making the bandwidth pressure of the video transmission channel more and more severe.
  • To relieve this pressure, the video can be compressed before it is transmitted.
  • However, after compression the text content in the video becomes blurred, which makes it harder for the user to recognize the text and lowers the quality of the compressed video.
  • This application provides a video processing method, device and equipment that improve the quality of compressed video.
  • In a first aspect, an embodiment of the present application provides a video processing method.
  • a first device receives a second video frame sent by a second device and text information extracted from a first video frame, where the second video frame is obtained by compressing the first video frame.
  • the text information includes text content and attribute information of the text content.
  • the first device adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played.
  • the second device first extracts text information from the first video frame, and compresses the first video frame to obtain the second video frame.
  • the second device sends the second video frame and text information to the first device. Since the second video frame is a compressed video frame, the bandwidth pressure of the transmission channel between the second device and the first device is reduced.
  • the first device may merge the second video frame and the text information, so that the text content in the text information is merged into the second video frame to obtain the third video frame. Since the text content is not compressed, the definition of the text content in the third video frame is higher, and the quality of the compressed video is improved.
  • the first device may add the text content to the second video frame to obtain the third video frame to be played through the following feasible implementation: the first device generates a first image corresponding to the text content according to the text content and the attribute information, where the resolution of the first image is greater than the resolution of the second video frame; the first device adds the first image to the second video frame according to the attribute information to obtain the third video frame.
  • the first image generated according to the text content and attribute information includes the text content. Since the resolution of the first image is greater than the resolution of the second video frame, the text content in the first image is clear. In this way, even if the video frame is compressed, the definition of the text content in the video frame remains high.
  • the first device generating the first image corresponding to the text content according to the text content and attribute information includes: the first device determines at least one group of text content in the text content, where each group of text content includes at least one character, and the font, size, color, and font effects of each character in a group of text content are the same;
  • the font effects include at least one of affine, rotation, or projection; the first device generates the first image corresponding to each group of text content according to the attribute information of that group.
  • the text content in the first video frame is divided into at least one group of text content. Since the font, size, color, and font effects of the characters in each group are the same, the first image corresponding to each group can be generated separately. In this way, each generated first image can be more accurate.
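
As a non-limiting illustration, the grouping step can be sketched as follows. The sketch assumes each recognized character is a dictionary with the hypothetical keys `char`, `font`, `size`, `color`, and `effects`; none of these names come from the application. Consecutive characters that share all four attributes are merged into one group.

```python
from itertools import groupby


def group_text(chars):
    """Split recognized characters into groups whose font, size, color and effects match."""
    def style(c):
        # Characters belong to the same group only if all four attributes are equal.
        return (c["font"], c["size"], c["color"], tuple(c["effects"]))

    groups = []
    for key, run in groupby(chars, key=style):
        font, size, color, effects = key
        groups.append({
            "content": "".join(c["char"] for c in run),
            "font": font,
            "size": size,
            "color": color,
            "effects": list(effects),
        })
    return groups


def make_chars(text, font, size, color, effects=()):
    """Helper for the example: give every character of `text` the same attributes."""
    return [{"char": ch, "font": font, "size": size, "color": color,
             "effects": list(effects)} for ch in text]


# Illustrative values only.
chars = (make_chars("Happy", "Kai Ti", 48, (255, 255, 255))
         + make_chars("good", "Song Ti", 24, (0, 0, 0), ["rotation"]))
groups = group_text(chars)
print([g["content"] for g in groups])
```

Because `groupby` only merges adjacent items, two runs of text with the same style that are separated by differently styled text remain separate groups in this sketch.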
  • the area of the first image other than the text content is transparent.
  • since the area other than the text content in the first image is transparent, the first image does not cover the video picture in the second video frame when it is added to the second video frame.
  • the first device adding the first image to the second video frame according to the attribute information to obtain the third video frame to be played includes: the first device obtains, from the attribute information, the location information of the text content in the first video frame; the first device adds the first image to the second video frame according to the location information to obtain the third video frame.
  • the first image can be accurately added to the second video frame, so that the position of the text content in the second video frame is the same as its position in the first video frame.
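
As a non-limiting illustration, adding a first image with a transparent background at a given location can be sketched as follows. The sketch assumes a frame is a list of rows of `(r, g, b)` tuples, the first image is a list of rows of `(r, g, b, a)` tuples in which `a == 0` marks a transparent pixel, and both are already in the same coordinate space (any scaling between resolutions is omitted). The function name `overlay_text_image` is hypothetical.

```python
def overlay_text_image(frame, text_image, top_left):
    """Return a copy of `frame` with the opaque pixels of `text_image` pasted at `top_left`."""
    x0, y0 = top_left
    out = [list(row) for row in frame]  # copy so the input frame is not modified
    for dy, row in enumerate(text_image):
        for dx, (r, g, b, a) in enumerate(row):
            if a == 0:
                continue  # transparent pixel: keep the underlying video picture
            y, x = y0 + dy, x0 + dx
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = (r, g, b)
    return out


background = (10, 20, 30)
frame = [[background] * 4 for _ in range(3)]   # 4 x 3 "second video frame"
T = (0, 0, 0, 0)                               # transparent pixel
W = (255, 255, 255, 255)                       # opaque white "text" pixel
text_image = [[W, T],
              [T, W]]
third_frame = overlay_text_image(frame, text_image, top_left=(1, 1))
```

Only the opaque pixels replace frame pixels, so the video picture around the characters stays visible.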
  • before the first device adds the text content to the second video frame according to the attribute information, the first device obtains a first identifier from the second video frame; the first device obtains a second identifier from the text information; and the first device determines that the first identifier and the second identifier are the same.
  • the first identifier and the second identifier are the same time stamp.
  • when the first identifier and the second identifier are the same, the text content and the second video frame correspond to the same first video frame. In this way, the first image corresponding to the text content can be added to the correct second video frame.
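
As a non-limiting illustration, the identifier check on the first device can be sketched as follows. The sketch assumes frames and text information arrive as dictionaries carrying a hypothetical `id` field; the field names and identifier strings are illustrative only.

```python
def pair_frames_with_text(frames, text_infos):
    """Pair each compressed frame with the text information carrying the same identifier."""
    text_by_id = {t["id"]: t for t in text_infos}
    pairs = []
    for frame in frames:
        text = text_by_id.get(frame["id"])
        if text is not None:  # frames without matching text information are skipped here
            pairs.append((frame, text))
    return pairs


frames = [{"id": "t1+a", "data": b"frame-1"},
          {"id": "t1+b", "data": b"frame-2"}]
# Text information may arrive in a different order than the frames.
text_infos = [{"id": "t1+b", "content": "Today's weather is good"},
              {"id": "t1+a", "content": "Happy House"}]
pairs = pair_frames_with_text(frames, text_infos)
```

Looking the text information up by identifier means the arrival order of the two streams does not matter.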
  • the first device receiving the second video frame sent by the second device and the text information extracted from the first video frame includes: the first device receives, from a first transmission channel, the second video frame sent by the second device; the first device receives, from a second transmission channel, the text information sent by the second device, where the second transmission channel is a parallel bypass small bandwidth channel of the first transmission channel.
  • the second device sends the second video frame and the text information to the first device on different transmission channels.
  • the first device receives the second video frame and the text information from different transmission channels. In this way, not only is the data transmission efficiency higher, but the data transmission method in each transmission channel is also simpler.
  • the attribute information includes the position, font, size, color, and font effects of the text content in the video frame, and the font effects include at least one of affine, rotation, or projection.
  • after the first device adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played, the method further includes:
  • the first device plays the third video frame; or
  • the first device sends the third video frame to a third device, where the third device is used to play the third video frame.
  • In a second aspect, an embodiment of the present application provides a video processing method.
  • a second device extracts text information from a first video frame.
  • the text information includes text content and attribute information; the second device compresses the first video frame to obtain a second video frame; the second device sends the second video frame and the text information to the first device.
  • the second device first extracts text information from the first video frame, and compresses the first video frame to obtain the second video frame.
  • the second device sends the second video frame and text information to the first device. Since the second video frame is a compressed video frame, the bandwidth pressure of the transmission channel between the second device and the first device is reduced.
  • the first device may merge the second video frame and the text information, so that the text content in the text information is merged into the second video frame to obtain the third video frame. Since the text content is not compressed, the definition of the text content in the third video frame is higher, and the quality of the compressed video is improved.
  • before the second device sends the second video frame and the text information to the first device, the method further includes: the second device generates a first identifier; and the second device adds the first identifier to the second video frame and to the text information respectively.
  • the first identifier is a timestamp generated by the second device.
  • the first device can determine the correspondence between the second video frame and the text information according to the first identifier, so that the first device can add the first image corresponding to the text content to the correct second video frame.
  • the second device sending the second video frame and text information to the first device includes: the second device sends the second video frame to the first device through a first transmission channel;
  • the second device sends the text information to the first device through a second transmission channel, where the second transmission channel is a parallel bypass small bandwidth channel of the first transmission channel.
  • the second device sends the second video frame and the text information to the first device on different transmission channels.
  • in this way, not only is the data transmission efficiency higher, but the data transmission method in each transmission channel is also relatively simple.
  • the attribute information includes the position, font, size, color, and font effects of the text content in the video frame, and the font effects include at least one of affine, rotation, or projection.
  • an embodiment of the present application provides a video processing device, including a receiving module and a processing module, where:
  • the receiving module is configured to receive a second video frame sent by a second device and text information extracted from the first video frame, where the second video frame is obtained by compressing the first video frame, and
  • the text information includes text content and attribute information of the text content
  • the processing module is configured to add the text content to the second video frame according to the attribute information to obtain a third video frame to be played.
  • the processing module is specifically configured to:
  • the first image is added to the second video frame to obtain the third video frame.
  • the processing module is specifically configured to:
  • each group of text content includes at least one character, the font, size, color, and font special effects of each character in a group of text content are the same, and the font special effects include at least one of affine, rotation, or projection;
  • the first image corresponding to each group of text content is generated according to the attribute information of each group of text content respectively.
  • the area in the first image other than the text content is transparent.
  • the processing module is specifically configured to:
  • the first image is added to the second video frame to obtain the third video frame.
  • before the processing module adds the text content to the second video frame according to the attribute information, the processing module is further configured to:
  • the first identifier and the second identifier are the same time stamp.
  • the receiving module is specifically configured to:
  • the attribute information includes the position, font, size, color, and font effects of the text content in the video frame, and the font effects include at least one of affine, rotation, or projection.
  • after the processing module adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played, the processing module is further configured to:
  • an embodiment of the present application provides a video processing device, including a processing module and a sending module, where:
  • the processing module is configured to extract text information from a first video frame, where the text information includes text content and attribute information;
  • the processing module is further configured to perform compression processing on the first video frame to obtain a second video frame;
  • the sending module is configured to send the second video frame and the text information to the first device.
  • before the sending module sends the second video frame and the text information to the first device, the processing module is further configured to:
  • the first identifier is added to the second video frame and the text information respectively.
  • the first identifier is a timestamp generated by the second device.
  • the sending module is specifically configured to:
  • the text information is sent to the first device through a second transmission channel, and the second transmission channel is a parallel bypass small bandwidth channel of the first transmission channel.
  • the attribute information includes the position, font, size, color, and font effects of the text content in the video frame, and the font effects include at least one of affine, rotation, or projection.
  • an embodiment of the present application provides a video processing device, including: a memory, a processor, and a computer program.
  • the computer program is stored in the memory, and the processor runs the computer program to execute the video processing method according to any one of the first aspect.
  • an embodiment of the present application provides a video processing device, including a memory, a processor, and a computer program, where the computer program is stored in the memory, and the processor runs the computer program to execute the video processing method according to any one of the second aspect.
  • an embodiment of the present application provides a storage medium, the storage medium includes a computer program, and the computer program is used to implement the video processing method according to any one of the first aspect.
  • an embodiment of the present application provides a storage medium, where the storage medium includes a computer program, and the computer program is used to implement the video processing method according to any one of the second aspect.
  • an embodiment of the present application also provides a chip or integrated circuit, including: a memory and a processor;
  • the memory is used for storing program instructions and sometimes also used for storing intermediate data
  • the processor is configured to call the program instructions stored in the memory to implement the video processing method according to any one of the first aspect.
  • an embodiment of the present application also provides a chip or integrated circuit, including a memory and a processor;
  • the memory is used for storing program instructions and sometimes also used for storing intermediate data
  • the processor is configured to call the program instructions stored in the memory to implement the video processing method according to any one of the second aspect.
  • an embodiment of the present application also provides a program product, the program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the video processing method according to any one of the first aspect.
  • an embodiment of the present application further provides a program product, the program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the video processing method according to any one of the second aspect.
  • the second device first extracts text information from the first video frame and compresses the first video frame to obtain the second video frame.
  • the second device sends the second video frame and text information to the first device. Since the second video frame is a compressed video frame, the bandwidth pressure of the transmission channel between the second device and the first device is reduced.
  • the first device may merge the second video frame and the text information, so that the text content in the text information is merged into the second video frame to obtain the third video frame. Since the text content is not compressed, the definition of the text content in the third video frame is higher, and the quality of the compressed video is improved.
  • FIG. 1 is a system architecture diagram provided by an embodiment of the application
  • FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the application
  • FIG. 3 is a schematic diagram of a video frame provided by an embodiment of the application.
  • FIG. 4A is a schematic diagram of another video frame provided by an embodiment of this application.
  • FIG. 4B is a schematic diagram of another video frame provided by an embodiment of this application.
  • FIG. 4C is a schematic diagram of still another video frame provided by an embodiment of this application.
  • FIG. 5 is an architecture diagram of video processing provided by an embodiment of the application.
  • FIG. 6 is a schematic flowchart of another video processing method provided by an embodiment of the application.
  • FIG. 7 is a schematic diagram of a first image provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of a video processing process provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of another video processing process provided by an embodiment of the application.
  • FIG. 10 is a video processing device provided by an embodiment of this application.
  • FIG. 11 is another video processing device provided by an embodiment of this application.
  • FIG. 12 is a schematic diagram of the hardware structure of a video processing device provided by an embodiment of the application.
  • FIG. 13 is a schematic diagram of the hardware structure of a video processing device provided by an embodiment of the application.
  • FIG. 1 is a system architecture diagram provided by an embodiment of the application. Please refer to FIG. 1, which includes a first device 101 and a second device 102. There is a video transmission channel between the second device 102 and the first device 101, the second device 102 can send the video stream to the first device 101 through the transmission channel, and the first device 101 can play the received video stream.
  • the second device 102 may first extract text information from the video frame, and compress the video frame.
  • the second device 102 sends the compressed video frame and the extracted text information to the first device 101. Since the second device 102 compresses the video frame, the bandwidth pressure on the transmission channel between the second device 102 and the first device 101 can be reduced.
  • the first device 101 may merge the compressed video frame and the text information, so that the text content in the text information is merged into the video frame, and play the video frame into which the text content has been merged. Because the text content in the text information is not compressed, the text content in the video played by the first device 101 has a higher definition, which avoids blurred text content in the video.
  • the second device 102 may be a video server, and the first device 101 may be a device capable of playing videos such as a mobile phone, a computer, or a TV.
  • for example, in a scenario where a user uses a mobile phone to project a video to a TV (video projection), the second device 102 may be a video server or a mobile phone, and the first device 101 may be a TV.
  • FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the application. See Figure 2.
  • the method can include:
  • S201 The second device extracts text information from the first video frame.
  • the second device may be a server, a mobile phone, a computer, etc.
  • the first video frame is any frame in the video stream to be sent by the second device to the first device.
  • the second device has the same processing process for each frame in the video to be sent to the first device, and this application takes any first video frame as an example for description.
  • the second device may process the first video frame by using character recognition (CR) technology to extract text information from the first video frame.
  • CR technology may include optical character recognition (OCR) technology and the like.
  • the text information includes text content and attribute information.
  • the text content includes one or more characters, for example, the characters can be Chinese characters, numbers, letters, etc.
  • the text information may include multiple sets of text content and attribute information of each set of text content.
  • Each set of text content includes at least one character. Within a set of text content, the spacing between every two adjacent characters is the same, and the font, size, color, and font effects of each character are the same.
  • the font effects include at least one of affine, rotation, or projection.
  • the attribute information may include the position, font, size, color, and font special effects of the text content in the video frame.
  • the position of the text content in the video frame can be represented by at least two coordinates of the area occupied by the text content in the video frame. Fonts can include Song Ti, Hei Ti, Li Shu, Kai Ti, etc.
  • the font effect may include at least one of affine, rotation, or projection.
  • the attribute information of the group of text content may also include character spacing.
  • the attribute information and the font special effects may also include other items, which are not specifically limited in the embodiment of the present application.
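
As a non-limiting illustration, the text information described above can be modeled as a small data structure. The class and field names (`TextGroup`, `TextInfo`, `position`, `spacing`, and so on) are hypothetical, and the coordinates and style values in the example are made up for illustration; they are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinate in the video frame


@dataclass
class TextGroup:
    """One set of text content whose characters share the same attributes."""
    content: str
    position: Tuple[Point, Point]      # two corner coordinates of the occupied area
    font: str
    size: int
    color: Tuple[int, int, int]        # RGB
    effects: List[str] = field(default_factory=list)  # e.g. "affine", "rotation", "projection"
    spacing: Optional[int] = None      # character spacing, if recorded


@dataclass
class TextInfo:
    """Text information extracted from one video frame."""
    groups: List[TextGroup]


info = TextInfo(groups=[
    TextGroup(content="Happy House",
              position=((100, 50), (400, 120)),
              font="Kai Ti", size=48, color=(255, 255, 255)),
])
```

Representing the position with two corner points follows the statement that the position can be represented by at least two coordinates of the occupied area.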
  • FIG. 3 is a schematic diagram of a video frame provided by an embodiment of the application.
  • the video frame 301 includes the text content "Happy House" and "Today's weather is good". Since the attribute information of each character in "Happy House" is the same, and the attribute information of each character in "Today's weather is good" is the same, two sets of text content can be extracted from the video frame shown in FIG. 3. Referring to video frame 302, "Happy House" can be regarded as one group of text content.
  • the position of this group of text content in video frame 302 is a rectangular area with points A1 and A2 as vertices.
  • "Today's weather is good" can be regarded as another group of text content, and the position of this group of text content in video frame 302 is a rectangular area with points B1 and B2 as vertices.
  • the two sets of text content are shown in Table 1:
  • the first video frame may remain unchanged. In this way, the workload of video frame processing can be reduced.
  • the text content in the text information can also be removed from the first video frame.
  • the text content in the first video frame can be removed by the following feasible implementations:
  • the color of the pixel where the text content is located in the first video frame is updated to realize the removal of the text content in the first video frame.
  • the area where the text content is located in the first video frame can present a complete background pattern.
  • the background pattern of the area where the text content is located may be a solid color, a preset regular shape, a preset image, and the like.
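
As a non-limiting illustration, removing text by updating pixel colors can be sketched as follows. The sketch assumes a frame is a list of rows of RGB tuples, that the coordinates of the text pixels are already known, and that a hypothetical callback `background_at(x, y)` returns the background color expected at a pixel (a constant for a solid background, or a pattern lookup for stripes or an image).

```python
def remove_text(frame, text_pixels, background_at):
    """Return a copy of `frame` in which every text pixel takes the background color."""
    out = [list(row) for row in frame]  # copy so the input frame is not modified
    for x, y in text_pixels:
        out[y][x] = background_at(x, y)
    return out


GRAY, WHITE = (128, 128, 128), (255, 255, 255)
# A tiny gray frame with two white "text" pixels.
frame = [[GRAY, WHITE, GRAY],
         [GRAY, GRAY, WHITE]]
text_pixels = [(1, 0), (2, 1)]
clean = remove_text(frame, text_pixels, lambda x, y: GRAY)
```

For a patterned background, the callback could return a color that depends on `x` and `y`, so that the area where the text was located shows the complete pattern.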
  • FIG. 4A is a schematic diagram of another video frame provided by an embodiment of the application.
  • the video frame A1 includes the text content "Happy House" and "Today's weather is good".
  • the background pattern of the area where the text content "Happy House" is located is pure gray, and the background pattern of the area where the text content "Today's weather is good" is located is pure red.
  • the color of the pixels where "Happy House" is located in video frame A1 can be replaced with gray, and the color of the pixels where "Today's weather is good" is located in video frame A1 can be replaced with red, to obtain video frame A2.
  • video frame A2 does not include text content.
  • FIG. 4B is a schematic diagram of another video frame provided by an embodiment of the application.
  • the video frame B1 includes the text content "Happy House" and "Today's weather is good".
  • the background pattern of the area where the text content "Happy House" is located is vertical stripes, and the background pattern of the area where the text content "Today's weather is good" is located is petals.
  • the color of the pixels where "Happy House" is located in video frame B1 can be replaced with the color of the pixels in the vertical stripes, and the color of the pixels where "Today's weather is good" is located can be replaced with the color of the pixels in the petals, to obtain video frame B2.
  • video frame B2 does not include text content; it includes complete vertical stripes and complete petals.
  • the preset color can be white, gray, etc.
  • FIG. 4C is a schematic diagram of still another video frame provided by an embodiment of this application.
  • the video frame C1 includes the text content "Happy House" and "Today's weather is good". The background pattern of the area where the text content "Happy House" is located is vertical stripes, and the background pattern of the area where the text content "Today's weather is good" is located is petals.
  • assuming the preset color is white, the color of the pixels where the text content is located in video frame C1 is replaced with white to obtain video frame C2. Video frame C2 does not include text content.
  • the white that replaces the text content covers some pixels in the vertical stripes and petals: in video frame C2, part of the vertical stripes is covered with white, and part of the petals is covered with white.
  • S202 The second device performs compression processing on the first video frame to obtain a second video frame.
  • the resolution of the second video frame is smaller than the resolution of the first video frame.
  • the second device may determine the compression ratio for compressing the first video frame according to the bandwidth of the transmission channel between the second device and the first device, and compress the first video frame according to the compression ratio to obtain the second video frame.
  • the bit rate (unit bps) of the second video is smaller than the bandwidth of the transmission channel between the second device and the first device.
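
As a non-limiting illustration, choosing a compression ratio from the channel bandwidth can be sketched as follows. The sketch assumes the ratio is expressed as the fraction of the source bit rate that is kept, and it reserves a safety margin through a hypothetical `headroom` parameter so that the resulting bit rate stays strictly below the bandwidth. The application does not specify a formula; this is only one possible choice.

```python
def choose_compression_ratio(source_bps, channel_bps, headroom=0.9):
    """Return the fraction of the source bit rate to keep so the result fits the channel."""
    if source_bps <= 0 or channel_bps <= 0:
        raise ValueError("bit rates must be positive")
    if not 0 < headroom < 1:
        raise ValueError("headroom must be strictly between 0 and 1")
    target_bps = channel_bps * headroom      # stay strictly below the channel bandwidth
    return min(1.0, target_bps / source_bps)  # never "compress" above the source rate


# Illustrative numbers: a 100 Mbps source over a 20 Mbps channel.
ratio = choose_compression_ratio(source_bps=100_000_000, channel_bps=20_000_000)
compressed_bps = ratio * 100_000_000
```

With these illustrative numbers, about 18% of the source bit rate is kept, which leaves the compressed stream below the 20 Mbps channel bandwidth.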
  • S203 The second device sends the second video frame and text information to the first device.
  • One video includes multiple video frames.
  • the second device can obtain the corresponding second video frame and text information, and send the second video frame and text information to the first device.
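  • the first device can therefore receive multiple second video frames and multiple pieces of text information.
  • so that the first device can determine which text information corresponds to which second video frame, the second device needs to send the second video frame and the text information according to a preset rule.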
  • the second device may send the second video frame and text information to the first device in the following feasible implementation manners:
  • the second device generates the first identifier, and adds the first identifier to the second video frame and the text information respectively, and the second device sends the second video frame including the first identifier and the text information including the first identifier to the first device.
  • the first identifier may be a timestamp generated by the second device according to the current time.
  • the second device may first generate a time stamp according to the current time, and then add an identifier to the time stamp, so that each video frame corresponds to a different time stamp. For example, suppose that the second device needs to generate the timestamps corresponding to video frame 1, video frame 2 and video frame 3 at the same time, and the timestamp generated according to the current time is timestamp 1, then the second device adds the identifier to the timestamp 1.
  • the timestamp corresponding to video frame 1 may be timestamp 1+a
  • the timestamp corresponding to video frame 2 may be timestamp 1+b
  • the timestamp corresponding to video frame 3 may be timestamp 1+c.
  • the second device adds the same identifier to the second video frame and the text information, so that the first device can determine the correspondence between the second video frame and the text information according to the identifier, and the process is simple and convenient.
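  • The identifier scheme described above (a timestamp plus a distinguishing suffix, attached to both the video frame and its text information) can be sketched as follows. The class and field names are hypothetical, and a numeric counter is assumed in place of the suffixes a, b and c used in the example.

```python
import time
from dataclasses import dataclass
from itertools import count
from typing import Callable, Tuple


@dataclass
class TaggedFrame:
    identifier: str
    payload: bytes


@dataclass
class TaggedText:
    identifier: str
    text_info: dict


class IdentifierGenerator:
    """Produce identifiers of the form '<timestamp>+<suffix>'.

    Frames stamped at the same clock reading receive different suffixes,
    so every video frame still gets a distinct identifier.
    """

    def __init__(self, clock: Callable[[], int] = time.time_ns) -> None:
        self._clock = clock
        self._last_timestamp = None
        self._suffix = count()

    def next_identifier(self) -> str:
        timestamp = self._clock()
        if timestamp != self._last_timestamp:
            # New clock reading: restart the suffix sequence.
            self._last_timestamp = timestamp
            self._suffix = count()
        return f"{timestamp}+{next(self._suffix)}"


def tag_pair(generator: IdentifierGenerator, frame_payload: bytes,
             text_info: dict) -> Tuple[TaggedFrame, TaggedText]:
    """Attach the same identifier to a compressed frame and its text information."""
    identifier = generator.next_identifier()
    return TaggedFrame(identifier, frame_payload), TaggedText(identifier, text_info)
```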
  • the second device may send the second video frame and text information to the first device on the same transmission channel.
  • the second device may also send the second video frame and text information to the first device on different transmission channels.
  • the second device can send a second video frame to the first device through the first transmission channel, and send text information to the first device through the second transmission channel.
  • the second transmission channel is a parallel bypass small-bandwidth channel of the first transmission channel.
  • the first transmission channel may be an Embedded DisplayPort (eDP) channel, a High-Definition Multimedia Interface (HDMI) channel, etc.
  • the second transmission channel may be an auxiliary (AUX) channel, a Universal Serial Bus (USB) channel, etc.
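  • Sending the video frame and the text information over two separate channels can be sketched as follows. This is a minimal illustration, assuming in-process queues as stand-ins for the two physical channels; the class name and the `max_text_bytes` limit are hypothetical and only model the fact that the second channel has a small bandwidth.

```python
import queue


class DualChannelSender:
    """Send frames on a wide channel and text information on a narrow one."""

    def __init__(self, video_channel: queue.Queue, text_channel: queue.Queue,
                 max_text_bytes: int = 1024) -> None:
        self.video_channel = video_channel
        self.text_channel = text_channel
        self.max_text_bytes = max_text_bytes

    def send(self, identifier: str, frame: bytes, text: bytes) -> None:
        # Reject the pair before sending anything, so that a frame is never
        # delivered without its matching text information.
        if len(text) > self.max_text_bytes:
            raise ValueError("text information does not fit the small-bandwidth channel")
        self.video_channel.put((identifier, frame))
        self.text_channel.put((identifier, text))
```

  • Both messages carry the same identifier, so the receiving side can pair them even though they arrive on different channels.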
  • the first device adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played.
  • the first device may first determine the second video frame and the corresponding text information. For example, when the second device sends the second video frame and text information through the method shown in S203, the first device may determine the correspondence between the second video frame and the text information according to the identifier included in the received second video frame and the identifier included in the text information. For example, suppose that a second video frame received by the first device includes a first identifier, and a piece of text information received includes a second identifier. If the first identifier and the second identifier are the same, the first device determines that the second video frame corresponds to the text information.
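  • The matching step on the first device can be sketched as follows. This is an illustration only: the `PairingBuffer` class is hypothetical, and it assumes that a frame and its text information may arrive in either order, which the application does not state explicitly.

```python
from typing import Any, Dict, Optional, Tuple


class PairingBuffer:
    """Pair second video frames with text information by identifier."""

    def __init__(self) -> None:
        self._frames: Dict[str, Any] = {}
        self._texts: Dict[str, Any] = {}

    def on_frame(self, identifier: str, frame: Any) -> Optional[Tuple[Any, Any]]:
        """Return (frame, text_info) if the matching text already arrived."""
        if identifier in self._texts:
            return frame, self._texts.pop(identifier)
        self._frames[identifier] = frame
        return None

    def on_text(self, identifier: str, text_info: Any) -> Optional[Tuple[Any, Any]]:
        """Return (frame, text_info) if the matching frame already arrived."""
        if identifier in self._frames:
            return self._frames.pop(identifier), text_info
        self._texts[identifier] = text_info
        return None
```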
  • the first device may add text content to the second video frame to obtain the third video frame through the following feasible implementation: the first device generates the first image corresponding to the text content according to the text content and attribute information, and the first device adds the first image to the second video frame according to the attribute information to obtain the third video frame.
  • the first device may add the first image to the second video frame according to the location information in the attribute information.
  • the first device may overlay the first image in the second video frame according to the location information.
  • the area except the text content in the first image is transparent.
  • the resolution of the first image is greater than the resolution of the second video frame.
  • the resolution of the first image may be equal to the resolution of the first video frame, so that the definition of the text content in the third video frame can be made higher.
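  • Overlaying a first image whose non-text area is transparent can be sketched as follows. The sketch assumes frames are lists of RGB tuples, the first image is a list of RGBA tuples, and the second video frame is first enlarged by an integer factor with nearest-neighbour sampling so that the higher-resolution first image can be placed on it. These representations and function names are assumptions made for illustration, not the method of the application.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def upscale_nearest(frame: List[List[RGB]], factor: int) -> List[List[RGB]]:
    """Enlarge a frame by an integer factor using nearest-neighbour sampling."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [[pixel for pixel in row for _ in range(factor)]
            for row in frame for _ in range(factor)]


def overlay(frame: List[List[RGB]], image: List[List[RGBA]],
            x: int, y: int) -> List[List[RGB]]:
    """Alpha-blend `image` onto a copy of `frame` with its top-left corner at (x, y)."""
    out = [list(row) for row in frame]
    for dy, image_row in enumerate(image):
        for dx, (r, g, b, a) in enumerate(image_row):
            ty, tx = y + dy, x + dx
            # Fully transparent pixels and pixels outside the frame are skipped.
            if a == 0 or not (0 <= ty < len(out) and 0 <= tx < len(out[ty])):
                continue
            br, bg, bb = out[ty][tx]
            out[ty][tx] = (
                (r * a + br * (255 - a) + 127) // 255,
                (g * a + bg * (255 - a) + 127) // 255,
                (b * a + bb * (255 - a) + 127) // 255,
            )
    return out
```

  • Because transparent pixels leave the underlying frame untouched, only the text strokes of the first image replace pixels of the second video frame.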
  • the first device may play the third video frame, or the first device may send the third video frame to the third device, so that the third device may play the third video frame.
  • the second device first extracts text information from the first video frame, and compresses the first video frame to obtain the second video frame.
  • the second device sends the second video frame and text information to the first device. Since the second video frame is a compressed video frame, the bandwidth pressure of the transmission channel between the second device and the first device is reduced.
  • the first device may merge the second video frame and the text information, so that the text content in the text information is merged into the second video frame to obtain the third video frame. Since the text content is not compressed, the definition of the text content in the third video frame can be made higher, and the quality of the compressed video can be improved.
  • Fig. 5 is an architecture diagram of video processing provided by an embodiment of the application.
  • the second device extracts text information from the first video frame, and compresses the first video frame after the text information is extracted to obtain the second video frame.
  • the second device also generates a time stamp, and adds the same time stamp to the text information and the second video frame, that is, the text information and the second video frame are respectively stamped with the same time stamp.
  • the second device sends the time-stamped text information and the time-stamped second video frame to the first device.
  • the first device may determine the correspondence between the text information and the second video frame according to the time stamp. After the first device determines the correspondence between the text information and the second video frame, the first device may generate the first image according to the text information, and perform merging processing on the first image and the second video frame to obtain the third video frame.
  • FIG. 6 is a schematic flowchart of another video processing method provided by an embodiment of the application. Referring to Figure 6, the method may include:
  • the second device obtains the first video frame.
  • the second device obtains the first video frame in the video stream to be sent to the first device.
  • the video stream may be a local video stream of the second device, or a video stream received by the second device from other devices.
  • the second device extracts text information from the first video frame.
  • the second device performs compression processing on the first video frame to obtain a second video frame.
  • the second device generates a time stamp.
  • the second device may generate a time stamp according to the current time.
  • the second device adds a time stamp to the text information and the second video frame respectively.
  • the time stamp is included in the text information.
  • the second video frame includes the time stamp.
  • the second device sends the second video frame including the time stamp to the first device through the first transmission channel.
  • the second device sends the text information including the time stamp to the first device through the second transmission channel.
  • S608 The first device determines the correspondence between the text information and the second video frame according to the timestamp.
  • the first device generates a first image according to the text information.
  • the first device may generate a first image corresponding to each set of text content.
  • the first device generates a first image corresponding to the set of text content according to the set of text content and the font, size, color, and font effects of the set of text content, and the resolution of the first image is greater than the resolution of the second video frame.
  • the size of the first image may be the same as the size of the area occupied by the text content in the first video frame.
  • FIG. 7 is a schematic diagram of a first image provided by an embodiment of the application. Referring to FIG. 7, assume that the first video frame is as shown at 701.
  • the first video frame 701 includes the text content "Happy House" and "It's nice today".
  • Assuming that the text information extracted from the first video frame is as shown in Table 1, the first device can generate an image 702 according to "Happy House" and "Position 1, Font 1, Size 1, Color 1, Spacing 1, Special Effects 1" in Table 1.
  • the image 702 includes the text content "Happy House".
  • the attribute information of the text content in the image 702 is the same as the attribute information of the text content in the first video frame 701.
  • the size of the image 702 is the same as the size of the area occupied by the text content "Happy House” in the first video frame 701.
  • the first device can also generate an image 703 according to "It's nice today" and "Position 2, Font 2, Size 2, Color 2, Spacing 2, Special Effects 2" in Table 1.
  • the image 703 includes the text content "It's nice today".
  • the attribute information of the text content in the image 703 is the same as the attribute information of the text content in the first video frame 701.
  • the size of the image 703 is the same as the size of the area occupied by the text content "It's nice today" in the first video frame 701.
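  • The text information described above (groups of text content, each with position, font, size, color, spacing and special effects) can be represented by a small data structure such as the following. The field names, field types, and the JSON encoding are assumptions chosen for illustration; the application does not define a wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class TextGroup:
    content: str
    position: Tuple[int, int]          # top-left corner in the first video frame
    font: str
    size: int
    color: Tuple[int, int, int]
    spacing: int = 0
    effects: List[str] = field(default_factory=list)  # e.g. "affine", "rotation", "projection"


@dataclass
class TextInfo:
    timestamp: str
    groups: List[TextGroup]

    def to_json(self) -> str:
        """Serialise for transmission; tuples become JSON arrays."""
        return json.dumps(asdict(self), ensure_ascii=False)

    @classmethod
    def from_json(cls, raw: str) -> "TextInfo":
        data = json.loads(raw)
        groups = [
            TextGroup(
                content=g["content"],
                position=tuple(g["position"]),
                font=g["font"],
                size=g["size"],
                color=tuple(g["color"]),
                spacing=g["spacing"],
                effects=list(g["effects"]),
            )
            for g in data["groups"]
        ]
        return cls(timestamp=data["timestamp"], groups=groups)
```

  • A structure of this kind is only a few hundred bytes per frame, which is consistent with sending it over a small-bandwidth channel.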
  • the first device combines the first image and the second video frame to obtain a third video frame.
  • the first device respectively performs merging processing on each first image and second video frame to obtain a third video frame.
  • the first device plays the third video frame.
  • the second device first extracts text information from the first video frame, and compresses the first video frame to obtain the second video frame.
  • the second device sends the second video frame and text information to the first device. Since the second video frame is a compressed video frame, the bandwidth pressure of the transmission channel between the second device and the first device is reduced.
  • the first device may merge the second video frame and the text information, so that the text content in the text information is merged into the second video frame to obtain the third video frame, and play the third video frame. Since the text content is not compressed, the definition of the text content in the third video frame played by the first device can be made higher, so that the quality of the video played by the first device is higher.
  • FIG. 8 is a schematic diagram of a video processing process provided by an embodiment of this application.
  • the first video frame is shown as 801, and the first video frame 801 includes the text information "Happy House” and "It's nice today.”
  • the font, size, color, and font effects of the characters in the text content "Happy House” are the same, and the font, size, color, and font effects of the characters in the text content "It's nice today” are the same.
  • After acquiring the first video frame 801, the second device extracts the text information 802 from the first video frame 801, where the text information 802 includes two groups of text content, "Happy House" and "It's nice today", and the attribute information of each group of text content. After the second device extracts the text information from the first video frame 801, it is assumed that the first video frame 801 remains unchanged.
  • the second device performs compression processing on the first video frame 801 from which the text information has been extracted, to obtain a second video frame 803.
  • the second device adds the same time stamp to the second video frame 803 and the text information 802, and transmits the second video frame 803 including the time stamp and the text information 802 including the time stamp.
  • the first device determines that the second video frame 803 corresponds to the text information 802 according to the time stamp.
  • the first device generates an image 804 according to the text content "Happy House" and the font, size, color, and font effects of the text content.
  • the font, size, color, and font effects of the text content "Happy House" in the image 804 are the same as those of the corresponding text content in the first video frame 801.
  • the first device generates an image 805 according to the text content "It's nice today" and the font, size, color, and font effects of the text content.
  • the font, size, color, and font effects of the text content "It's nice today" in the image 805 are the same as those of the corresponding text content in the first video frame 801.
  • the first device overlays the image 804 on the second video frame 803 according to the position information of the text content "Happy House” in the first video frame 801.
  • the first device also overlays the image 805 on the second video frame 803 according to the position information of the text content "It's nice today” in the first video frame 801 to obtain the third video frame 806.
  • the first device can play the third video frame 806.
  • Not only can the bandwidth pressure between the second device and the first device be reduced, but the definition of the text content in the third video frame played by the first device can also be made higher, so that the quality of the video played by the first device is higher. Further, after the second device extracts the text information from the first video frame, the first video frame remains unchanged (the first video frame is not processed), so that the video processing workload of the second device is small and the video processing efficiency is higher.
  • FIG. 9 is a schematic diagram of another video processing process provided by an embodiment of this application.
  • the first video frame is shown as 901, and the first video frame 901 includes text information "Happy House” and "It's nice today.”
  • the font, size, color, and font effects of the characters in the text content "Happy House” are the same, and the font, size, color, and font effects of the characters in the text content "It's nice today” are the same.
  • After acquiring the first video frame 901, the second device extracts the text information 902 from the first video frame 901, where the text information 902 includes two groups of text content, "Happy House" and "It's nice today", and the attribute information of each group of text content. After the second device extracts the text information from the first video frame 901, the second device removes the text content from the first video frame 901 to obtain the first video frame 903. Referring to FIG. 9, the first video frame 903 does not include text content.
  • the second device performs compression processing on the first video frame 903 to obtain the second video frame 904.
  • the second device adds the same time stamp to the second video frame 904 and the text information 902, and transmits the second video frame 904 including the time stamp and the text information 902 including the time stamp.
  • the first device determines that the second video frame 904 corresponds to the text information 902 according to the time stamp.
  • the first device generates an image 905 according to the text content "Happy House" and the font, size, color, and font effects of the text content.
  • the font, size, color, and font effects of the text content "Happy House" in the image 905 are the same as those of the corresponding text content in the first video frame 901.
  • the first device generates an image 906 according to the text content "It's nice today" and the font, size, color, and font effects of the text content.
  • the font, size, color, and font effects of the text content "It's nice today" in the image 906 are the same as those of the corresponding text content in the first video frame 901.
  • the first device overlays the image 905 on the second video frame 904 according to the position information of the text content "Happy House” in the first video frame 901.
  • the first device also overlays the image 906 on the second video frame 904 according to the position information of the text content "It's nice today” in the first video frame 901 to obtain the third video frame 907.
  • the first device can play the third video frame 907.
  • Not only can the bandwidth pressure between the second device and the first device be reduced, but the definition of the text content in the third video frame played by the first device can also be made higher, so that the quality of the video played by the first device is higher.
  • Further, after the second device extracts the text information from the first video frame, the text content is removed from the first video frame. In this way, when the first device adds the first image (image 905 and image 906) to the second video frame, the problem that the text content in the first image cannot completely cover the text content in the second video frame is avoided, so that the quality of the video processing is higher.
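  • Removing the text content on the second device before compression can be sketched as follows. This is one simple possibility, assuming that a per-pixel text mask is available and that text pixels are replaced with a preset color (white by default here); the function name and the mask representation are hypothetical, and the application may fill the removed area differently.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def remove_text(frame: List[List[RGB]], text_mask: List[List[bool]],
                preset_color: RGB = (255, 255, 255)) -> List[List[RGB]]:
    """Return a copy of `frame` in which every masked (text) pixel is set to `preset_color`."""
    if len(frame) != len(text_mask) or any(
            len(row) != len(mask_row) for row, mask_row in zip(frame, text_mask)):
        raise ValueError("frame and mask must have the same shape")
    return [
        [preset_color if is_text else pixel
         for pixel, is_text in zip(row, mask_row)]
        for row, mask_row in zip(frame, text_mask)
    ]
```

  • The frame returned by this step is the one that would be compressed, so no residue of the original text survives compression underneath the overlaid first image.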
  • Fig. 10 is a video processing device provided by an embodiment of the application.
  • the video processing device 10 may be provided in the first device.
  • the video processing device 10 may include a receiving module 11 and a processing module 12, where:
  • the receiving module 11 is configured to receive a second video frame sent by a second device and text information extracted from the first video frame, where the second video frame is obtained by compressing the first video frame, and the text information includes text content and attribute information of the text content;
  • the processing module 12 is configured to add the text content to the second video frame according to the attribute information to obtain a third video frame to be played.
  • the receiving module 11 may execute S203 in the embodiment in FIG. 2 and S606-S607 in the embodiment in FIG. 6.
  • processing module 12 may execute S204 in the embodiment in FIG. 2 and S608-S611 in the embodiment in FIG. 6.
  • processing module 12 is specifically configured to:
  • the first image is added to the second video frame to obtain the third video frame.
  • processing module 12 is specifically configured to:
  • each group of text content includes at least one character, the font, size, color, and font special effects of each character in the group of text content are the same, and the font special effects include at least one of affine, rotation, or projection;
  • the first image corresponding to each group of text content is generated according to the attribute information of each group of text content respectively.
  • the area in the first image other than the text content is transparent.
  • processing module 12 is specifically configured to:
  • the first image is added to the second video frame to obtain the third video frame.
  • the processing module 12 is further configured to:
  • the first identifier and the second identifier are the same time stamp.
  • the receiving module 11 is specifically configured to:
  • the attribute information includes the position, font, size, color, and font effects of the text content in the video frame, and the font effects include at least one of affine, rotation, or projection.
  • After the processing module 12 adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played, the processing module 12 is further configured to:
  • FIG. 11 is another video processing device provided by an embodiment of this application.
  • the video processing device 20 may be provided in a second device.
  • the video processing device 20 may include a processing module 21 and a sending module 22, where:
  • the processing module 21 is configured to extract text information from a first video frame, where the text information includes text content and attribute information;
  • the processing module 21 is further configured to perform compression processing on the first video frame to obtain a second video frame;
  • the sending module 22 is configured to send the second video frame and the text information to the first device.
  • processing module 21 may execute S201-S202 in the embodiment of FIG. 2 and S601-S605 in the embodiment of FIG. 6.
  • the sending module 22 may execute S203 in the embodiment in FIG. 2 and S606-S607 in the embodiment in FIG. 6.
  • the processing module 21 is further configured to:
  • the first identifier is added to the second video frame and the text information respectively.
  • the first identifier is a timestamp generated by the second device.
  • the sending module 22 is specifically used for:
  • the text information is sent to the first device through a second transmission channel, and the second transmission channel is a parallel bypass small bandwidth channel of the first transmission channel.
  • the attribute information includes the position, font, size, color, and font effects of the text content in the video frame, and the font effects include at least one of affine, rotation, or projection.
  • FIG. 12 is a schematic diagram of the hardware structure of a video processing device provided by an embodiment of the application.
  • the video processing device 30 includes: a memory 31, a processor 32, and a receiver 33, where the memory 31 and the processor 32 communicate with each other; for example, the memory 31, the processor 32, and the receiver 33 can communicate through the bus 44. The memory 31 is used to store a computer program, and the processor 32 executes the computer program to implement the foregoing video processing method.
  • the processor 32 shown in the present application may implement the function of the processing module 12 in the embodiment of FIG. 10, and the receiver 33 may implement the function of the receiving module 11 in the embodiment of FIG. 10, which will not be repeated here.
  • the foregoing processor may be a CPU, or other general-purpose processors, DSPs, ASICs, and so on.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps in the embodiment of the video processing method disclosed in this application may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • FIG. 13 is a schematic diagram of the hardware structure of a video processing device provided by an embodiment of the application.
  • the video processing device 40 includes: a memory 41, a processor 42, and a transmitter 43, where the memory 41 and the processor 42 communicate with each other; for example, the memory 41, the processor 42, and the transmitter 43 can communicate through the bus 44. The memory 41 is used to store a computer program, and the processor 42 executes the computer program to implement the foregoing video processing method.
  • the processor 42 shown in the present application may implement the function of the processing module 21 in the embodiment of FIG. 11, and the transmitter 43 may implement the function of the sending module 22 in the embodiment of FIG. 11, which will not be repeated here.
  • the foregoing processor may be a CPU, or other general-purpose processors, DSPs, ASICs, and so on.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps in the embodiment of the video processing method disclosed in this application may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • the present application provides a storage medium, the storage medium is used to store a computer program, and the computer program is used to implement the video processing method described in the foregoing embodiment.
  • the embodiment of the present application also provides a chip or integrated circuit, including: a memory and a processor;
  • the memory is used to store program instructions, and sometimes also to store intermediate data;
  • the processor is configured to call the program instructions stored in the memory to implement the video processing method described above.
  • the memory can be independent or integrated with the processor.
  • the memory may also be located outside the chip or integrated circuit.
  • An embodiment of the present application also provides a program product, the program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the above-mentioned video processing method.
  • All or part of the steps in the foregoing method embodiments can be implemented by a program instructing relevant hardware.
  • the aforementioned program can be stored in a readable memory.
  • when the program is executed, the steps of the foregoing method embodiments are performed; and the foregoing memory (storage medium) includes: read-only memory (ROM), RAM, flash memory, hard disks, solid-state drives, magnetic tapes, floppy disks, optical discs, and any combination thereof.
  • These computer program instructions can be provided to the processing unit of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processing unit of the computer or other programmable data processing equipment produce a device for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • These computer program instructions can also be stored in a computer-readable memory that can guide a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • the term “including” and its variations may refer to non-limiting inclusion; the term “or” and its variations may refer to “and/or”.
  • the terms “first”, “second”, etc. in this application are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence.
  • “plurality” means two or more.
  • “And/or” describes the association relationship of the associated objects, indicating that there can be three types of relationships, for example, A and/or B, which can mean: A alone exists, A and B exist at the same time, and B exists alone.
  • the character “/” generally indicates that the associated objects are in an "or” relationship.

Abstract

Provided are a video processing method, apparatus and device. The method comprises: a first device receiving a second video frame sent by a second device and text information extracted from a first video frame, wherein the second video frame is obtained by compressing the first video frame, and the text information comprises text content and attribute information of the text content; and the first device adding, according to the attribute information, the text content to the second video frame in order to obtain a third video frame to be played. The quality of a compressed video is improved.

Description

Video processing method, apparatus and device
This application claims priority to Chinese patent application No. 201910517023.2, filed with the China National Intellectual Property Administration on June 14, 2019 and entitled "Video processing method, apparatus and device", which is incorporated herein by reference in its entirety.
Technical field
This application relates to the field of computer technology, and in particular to a video processing method, apparatus and device.
Background
A video includes multiple video frames, and each video frame may include text content. For example, the text content may include subtitles, bullet comments, prompt information, and the like.
At present, video resolution keeps increasing, which puts growing bandwidth pressure on the video transmission channel. To relieve this pressure, a video can be compressed before it is transmitted. However, in practice, after a video is compressed, the text content in the video becomes blurred, which makes the text content harder for the user to recognize and lowers the quality of the compressed video.
Summary of the invention
This application provides a video processing method, apparatus and device, which improve the quality of compressed video.
In a first aspect, an embodiment of this application provides a video processing method. A first device receives a second video frame sent by a second device and text information extracted from a first video frame, where the second video frame is obtained by compressing the first video frame, and the text information includes text content and attribute information of the text content. The first device adds the text content to the second video frame according to the attribute information to obtain a third video frame to be played.
In the above process, for any first video frame in the video stream, the second device first extracts text information from the first video frame, and compresses the first video frame to obtain the second video frame. The second device sends the second video frame and the text information to the first device. Since the second video frame is a compressed video frame, the bandwidth pressure of the transmission channel between the second device and the first device is reduced. After the first device receives the second video frame and the text information, the first device may merge them, so that the text content in the text information is merged into the second video frame to obtain the third video frame. Since the text content is not compressed, the definition of the text content in the third video frame can be made higher, and the quality of the compressed video is improved.
In a possible implementation, the first device may add the text content to the second video frame to obtain the third video frame to be played as follows: the first device generates a first image corresponding to the text content according to the text content and the attribute information, where the resolution of the first image is greater than the resolution of the second video frame; and the first device adds the first image to the second video frame according to the attribute information to obtain the third video frame.
In the above process, the first image generated according to the text content and the attribute information includes the text content. Because the resolution of the first image is greater than the resolution of the second video frame, the text content in the first image is clear. In this way, the text content in the video frame remains clear even though the video frame has been compressed.
In a possible implementation, the first device generating the first image corresponding to the text content according to the text content and the attribute information includes: the first device determines at least one group of text content in the text content, where each group includes at least one character, the characters in a group have the same font, size, color, and font effect, and the font effect includes at least one of affine transformation, rotation, or projection; and the first device generates, according to the attribute information of each group of text content, a first image corresponding to that group.
In the above process, the text content in the first video frame is divided into at least one group of text content. Because the characters in each group have the same font, size, color, and font effect, a first image can be generated separately for each group, which makes each generated first image more accurate.
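The grouping step can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `Char` record, the `style_key` helper, and the choice to group only consecutive characters are assumptions introduced here.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Char:
    """Hypothetical per-character record; field names are illustrative."""
    text: str
    font: str
    size: int
    color: str
    effect: str  # e.g. "none", "affine", "rotation", "projection"


def style_key(ch):
    # Characters belong to the same group only if all four attributes match.
    return (ch.font, ch.size, ch.color, ch.effect)


def group_by_style(chars):
    """Split a character sequence into runs of consecutive characters sharing one style."""
    return [("".join(c.text for c in run), key)
            for key, run in groupby(chars, key=style_key)]
```

Each resulting group carries a single style tuple, so one image can be rendered per group with one set of rendering parameters.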
In a possible implementation, the area of the first image other than the text content is transparent.
In the above process, because the area of the first image other than the text content is transparent, the first image does not cover the video picture in the second video frame when the first image is added to the second video frame.
In a possible implementation, the first device adding the first image to the second video frame according to the attribute information to obtain the third video frame to be played includes: the first device obtains, from the attribute information, position information of the text content in the first video frame; and the first device adds the first image to the second video frame according to the position information to obtain the third video frame.
In the above process, the position information of the text content in the first video frame allows the first image to be added to the second video frame accurately, so that the text content in the first image occupies the same position in the second video frame as the text content occupied in the first video frame.
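A minimal sketch of position-based merging with a transparent text image is given below. It assumes frames are nested lists of RGB tuples and the text image is a nested list of RGBA tuples already at the frame's pixel scale; the function name `overlay` and this pixel representation are assumptions for illustration, and any scaling of a higher-resolution text image is left out.

```python
def overlay(frame, image, left, top):
    """Composite an RGBA image onto an RGB frame at (left, top).

    Fully transparent pixels (alpha 0) leave the frame pixel unchanged.
    Returns a new frame; the input frame is not modified.
    """
    out = [row[:] for row in frame]
    for dy, img_row in enumerate(image):
        for dx, (r, g, b, a) in enumerate(img_row):
            y, x = top + dy, left + dx
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                br, bg, bb = out[y][x]
                # Standard "source over" blend with 8-bit alpha.
                out[y][x] = (
                    (r * a + br * (255 - a)) // 255,
                    (g * a + bg * (255 - a)) // 255,
                    (b * a + bb * (255 - a)) // 255,
                )
    return out
```

Because only the pixels that carry text are opaque, the surrounding picture of the compressed frame shows through unchanged.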
In a possible implementation, before the first device adds the text content to the second video frame according to the attribute information, the first device obtains a first identifier from the second video frame, obtains a second identifier from the text information, and determines that the first identifier and the second identifier are the same. Optionally, the first identifier and the second identifier are the same timestamp.
In the above process, when the first identifier and the second identifier are the same, the text content and the second video frame correspond to the same first video frame. In this way, the first image corresponding to the text content can be added to the correct second video frame.
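One way to pair frames and text information by identifier is sketched below, assuming the two kinds of data may arrive in any order and carry a comparable identifier such as a timestamp. The class name `IdentifierMatcher` and its `offer` method are hypothetical, introduced only for this example.

```python
class IdentifierMatcher:
    """Buffers frames and text information until both halves with one identifier arrive."""

    def __init__(self):
        self._pending = {"frame": {}, "text": {}}

    def offer(self, kind, identifier, payload):
        """Register a payload of kind "frame" or "text".

        Returns (frame, text) once both payloads with the same identifier
        have been seen, otherwise None.
        """
        other = "text" if kind == "frame" else "frame"
        if identifier in self._pending[other]:
            match = self._pending[other].pop(identifier)
            return (payload, match) if kind == "frame" else (match, payload)
        self._pending[kind][identifier] = payload
        return None
```

A real receiver would also need to discard entries whose counterpart never arrives; that policy is outside the scope of this sketch.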
In a possible implementation, the first device receiving the second video frame sent by the second device and the text information extracted from the first video frame includes: the first device receives the second video frame from the second device over a first transmission channel; and the first device receives the text information from the second device over a second transmission channel, where the second transmission channel is a low-bandwidth bypass channel parallel to the first transmission channel.
In the above process, the second device sends the second video frame and the text information to the first device over different transmission channels, and the first device correspondingly receives them from different transmission channels. This not only makes data transmission more efficient but also keeps the transmission scheme on each channel simple.
In a possible implementation, the attribute information includes the position, font, size, color, and font effect of the text content in the video frame, and the font effect includes at least one of affine transformation, rotation, or projection.
In a possible implementation, after the first device adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played, the method further includes:
the first device plays the third video frame;
or,
the first device sends the third video frame to a third device, where the third device is configured to play the third video frame.
In a second aspect, an embodiment of this application provides a video processing method. A second device extracts text information from a first video frame, where the text information includes text content and attribute information; the second device compresses the first video frame to obtain a second video frame; and the second device sends the second video frame and the text information to a first device.
In the above process, for any first video frame in a video stream, the second device first extracts text information from the first video frame and compresses the first video frame to obtain the second video frame. The second device sends the second video frame and the text information to the first device. Because the second video frame is a compressed video frame, the bandwidth pressure on the transmission channel between the second device and the first device is reduced. After receiving the second video frame and the text information, the first device can merge them, adding the text content in the text information to the second video frame to obtain a third video frame. Because the text content itself is not compressed, the text content in the third video frame remains clear, which improves the quality of the compressed video.
In a possible implementation, before the second device sends the second video frame and the text information to the first device, the method further includes: the second device generates a first identifier; and the second device adds the first identifier to both the second video frame and the text information. Optionally, the first identifier is a timestamp generated by the second device.
In the above process, adding the first identifier to the second video frame and the text information allows the first device to determine, according to the first identifier, which text information corresponds to which second video frame, so that the first device can add the first image corresponding to the text content to the correct second video frame.
In a possible implementation, the second device sending the second video frame and the text information to the first device includes: the second device sends the second video frame to the first device over a first transmission channel; and the second device sends the text information to the first device over a second transmission channel, where the second transmission channel is a low-bandwidth bypass channel parallel to the first transmission channel.
In the above process, the second device sends the second video frame and the text information to the first device over different transmission channels. This not only makes data transmission more efficient but also keeps the transmission scheme on each channel simple.
In a possible implementation, the attribute information includes the position, font, size, color, and font effect of the text content in the video frame, and the font effect includes at least one of affine transformation, rotation, or projection.
In a third aspect, an embodiment of this application provides a video processing apparatus, including a receiving module and a processing module, where:
the receiving module is configured to receive a second video frame sent by a second device and text information extracted from a first video frame, where the second video frame is obtained by compressing the first video frame, and the text information includes text content and attribute information of the text content;
the processing module is configured to add the text content to the second video frame according to the attribute information to obtain a third video frame to be played.
In a possible implementation, the processing module is specifically configured to:
generate a first image corresponding to the text content according to the text content and the attribute information, where the resolution of the first image is greater than the resolution of the second video frame;
add the first image to the second video frame according to the attribute information to obtain the third video frame.
In a possible implementation, the processing module is specifically configured to:
determine at least one group of text content in the text content, where each group includes at least one character, the characters in a group have the same font, size, color, and font effect, and the font effect includes at least one of affine transformation, rotation, or projection;
generate, according to the attribute information of each group of text content, a first image corresponding to that group.
In a possible implementation, the area of the first image other than the text content is transparent.
In a possible implementation, the processing module is specifically configured to:
obtain, from the attribute information, position information of the text content in the first video frame;
add the first image to the second video frame according to the position information to obtain the third video frame.
In a possible implementation, before the processing module adds the text content to the second video frame according to the attribute information, the processing module is further configured to:
obtain a first identifier from the second video frame;
obtain a second identifier from the text information;
determine that the first identifier and the second identifier are the same.
In a possible implementation, the first identifier and the second identifier are the same timestamp.
In a possible implementation, the receiving module is specifically configured to:
receive the second video frame from the second device over a first transmission channel;
receive the text information from the second device over a second transmission channel, where the second transmission channel is a low-bandwidth bypass channel parallel to the first transmission channel.
In a possible implementation, the attribute information includes the position, font, size, color, and font effect of the text content in the video frame, and the font effect includes at least one of affine transformation, rotation, or projection.
In a possible implementation, after the processing module adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played, the processing module is further configured to:
play the third video frame;
or,
send the third video frame to a third device, where the third device is configured to play the third video frame.
In a fourth aspect, an embodiment of this application provides a video processing apparatus, including a processing module and a sending module, where:
the processing module is configured to extract text information from a first video frame, where the text information includes text content and attribute information;
the processing module is further configured to compress the first video frame to obtain a second video frame;
the sending module is configured to send the second video frame and the text information to the first device.
In a possible implementation, before the sending module sends the second video frame and the text information to the first device, the processing module is further configured to:
generate a first identifier;
add the first identifier to both the second video frame and the text information.
In a possible implementation, the first identifier is a timestamp generated by the second device.
In a possible implementation, the sending module is specifically configured to:
send the second video frame to the first device over a first transmission channel;
send the text information to the first device over a second transmission channel, where the second transmission channel is a low-bandwidth bypass channel parallel to the first transmission channel.
In a possible implementation, the attribute information includes the position, font, size, color, and font effect of the text content in the video frame, and the font effect includes at least one of affine transformation, rotation, or projection.
In a fifth aspect, an embodiment of this application provides a video processing apparatus, including a memory, a processor, and a computer program, where the computer program is stored in the memory, and the processor runs the computer program to perform the video processing method according to any implementation of the first aspect.
In a sixth aspect, an embodiment of this application provides a video processing apparatus, including a memory, a processor, and a computer program, where the computer program is stored in the memory, and the processor runs the computer program to perform the video processing method according to any implementation of the second aspect.
In a seventh aspect, an embodiment of this application provides a storage medium, where the storage medium includes a computer program, and the computer program is used to implement the video processing method according to any implementation of the first aspect.
In an eighth aspect, an embodiment of this application provides a storage medium, where the storage medium includes a computer program, and the computer program is used to implement the video processing method according to any implementation of the second aspect.
In a ninth aspect, an embodiment of this application further provides a chip or an integrated circuit, including a memory and a processor;
the memory is configured to store program instructions and, in some cases, intermediate data;
the processor is configured to call the program instructions stored in the memory to implement the video processing method according to any implementation of the first aspect.
In a tenth aspect, an embodiment of this application further provides a chip or an integrated circuit, including a memory and a processor;
the memory is configured to store program instructions and, in some cases, intermediate data;
the processor is configured to call the program instructions stored in the memory to implement the video processing method according to any implementation of the second aspect.
In an eleventh aspect, an embodiment of this application further provides a program product, where the program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the video processing method according to any implementation of the first aspect.
In a twelfth aspect, an embodiment of this application further provides a program product, where the program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the video processing method according to any implementation of the second aspect.
According to the video processing method, apparatus, and device provided in the embodiments of this application, for any first video frame in a video stream, the second device first extracts text information from the first video frame and compresses the first video frame to obtain the second video frame. The second device sends the second video frame and the text information to the first device. Because the second video frame is a compressed video frame, the bandwidth pressure on the transmission channel between the second device and the first device is reduced. After receiving the second video frame and the text information, the first device can merge them, adding the text content in the text information to the second video frame to obtain the third video frame. Because the text content itself is not compressed, the text content in the third video frame remains clear, which improves the quality of the compressed video.
Description of the drawings
FIG. 1 is a system architecture diagram according to an embodiment of this application;
FIG. 2 is a schematic flowchart of a video processing method according to an embodiment of this application;
FIG. 3 is a schematic diagram of a video frame according to an embodiment of this application;
FIG. 4A is a schematic diagram of another video frame according to an embodiment of this application;
FIG. 4B is a schematic diagram of another video frame according to an embodiment of this application;
FIG. 4C is a schematic diagram of still another video frame according to an embodiment of this application;
FIG. 5 is an architecture diagram of video processing according to an embodiment of this application;
FIG. 6 is a schematic flowchart of another video processing method according to an embodiment of this application;
FIG. 7 is a schematic diagram of a first image according to an embodiment of this application;
FIG. 8 is a schematic diagram of a video processing procedure according to an embodiment of this application;
FIG. 9 is a schematic diagram of another video processing procedure according to an embodiment of this application;
FIG. 10 shows a video processing apparatus according to an embodiment of this application;
FIG. 11 shows another video processing apparatus according to an embodiment of this application;
FIG. 12 is a schematic diagram of the hardware structure of a video processing apparatus according to an embodiment of this application;
FIG. 13 is a schematic diagram of the hardware structure of a video processing apparatus according to an embodiment of this application.
Detailed description
For ease of understanding, the system architecture used in this application is described first.
FIG. 1 is a system architecture diagram according to an embodiment of this application. Referring to FIG. 1, the system includes a first device 101 and a second device 102. A video transmission channel exists between the second device 102 and the first device 101. The second device 102 can send a video stream to the first device 101 over the transmission channel, and the first device 101 can play the received video stream.
Before the second device 102 sends the video stream to the first device 101, for any video frame in the video stream, the second device 102 may first extract text information from the video frame and compress the video frame. The second device 102 sends the compressed video frame and the extracted text information to the first device 101. Because the second device 102 has compressed the video frame, the bandwidth pressure on the transmission channel between the second device 102 and the first device 101 is reduced. After receiving the compressed video frame and the text information, the first device 101 can merge them, adding the text content in the text information to the video frame, and play the video frame with the text content merged in. Because the text content in the text information is not compressed, the text content in the video played by the first device 101 remains clear and does not become blurred.
In a possible application scenario, when a user watches online video on the first device 101, the second device 102 may be a video server, and the first device 101 may be a device capable of video playback, such as a mobile phone, a computer, or a television.
In a possible application scenario, a user casts video from a mobile phone to a television (screen casting). The second device 102 may be a video server or the mobile phone, and the first device 101 may be the television.
The technical solutions of this application are described in detail below through specific embodiments. It should be noted that the following embodiments may exist independently or may be combined with each other. The same or similar content is not described repeatedly in different embodiments.
FIG. 2 is a schematic flowchart of a video processing method according to an embodiment of this application. Referring to FIG. 2, the method may include:
S201. The second device extracts text information from the first video frame.
Optionally, the second device may be a server, a mobile phone, a computer, or a similar device.
Optionally, the first video frame is any frame in the video stream that the second device is to send to the first device. The second device processes every frame of the video to be sent to the first device in the same way; this application uses an arbitrary first video frame as an example.
Optionally, the second device may process the first video frame using character recognition (CR) technology to extract the text information from the first video frame. For example, CR technology may include optical character recognition (OCR) technology.
The text information includes text content and attribute information.
The text content includes one or more characters; for example, a character may be a Chinese character, a digit, or a letter. The text information may include multiple groups of text content and the attribute information of each group. Each group of text content includes at least one character. Within a group, the spacing between every two adjacent characters is the same, and the characters have the same font, size, color, and font effect. The font effect includes at least one of affine transformation, rotation, or projection.
Optionally, the attribute information may include the position, font, size, color, and font effect of the text content in the video frame. The position of the text content in the video frame may be represented by at least two coordinates of the area that the text content occupies in the video frame. Fonts may include Songti, Heiti, Lishu, Kaiti, and so on. The font effect may include at least one of affine transformation, rotation, or projection. Optionally, when a group of text content includes multiple characters, the attribute information of the group may further include the character spacing.
It should be noted that the attribute information and the font effects may also include other items, which are not specifically limited in the embodiments of this application.
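The text information described above could be represented and serialized roughly as follows. This is a sketch under stated assumptions: the `TextGroup` class, its field names, the two-corner position encoding, and the use of JSON are all illustrative choices, not something the application specifies.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TextGroup:
    """Hypothetical record for one group of text content and its attributes."""
    content: str
    top_left: tuple       # (x, y) of one corner of the occupied area
    bottom_right: tuple   # (x, y) of the opposite corner
    font: str
    size: int
    color: str
    effects: list         # e.g. ["rotation"]; empty if no font effect
    spacing: int = 0      # character spacing, relevant for multi-character groups


def encode(groups, identifier):
    """Serialize text groups plus a frame identifier (e.g. a timestamp) to JSON."""
    payload = {"id": identifier, "groups": [asdict(g) for g in groups]}
    return json.dumps(payload, ensure_ascii=False)


def decode(message):
    """Inverse of encode; restores coordinate tuples that JSON turns into lists."""
    data = json.loads(message)
    groups = [
        TextGroup(**{**g,
                     "top_left": tuple(g["top_left"]),
                     "bottom_right": tuple(g["bottom_right"])})
        for g in data["groups"]
    ]
    return data["id"], groups
```

A message of this kind is small compared with a video frame, which is consistent with sending it over a separate low-bandwidth channel.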
FIG. 3 is a schematic diagram of a video frame according to an embodiment of this application. Referring to FIG. 3, video frame 301 includes the text content "快乐之家" ("Happy House") and "今天的天气不错哦" ("The weather is nice today"). Because the characters in "快乐之家" all have the same attribute information, and the characters in "今天的天气不错哦" all have the same attribute information, two groups of text content can be extracted from the video frame shown in FIG. 3. Referring to video frame 302, "快乐之家" can be treated as one group of text content whose position in video frame 302 is the rectangular area with points A1 and A2 as vertices, and "今天的天气不错哦" can be treated as another group whose position in video frame 302 is the rectangular area with points B1 and B2 as vertices. The two groups of text content are shown in Table 1:
Table 1
[Table 1 appears as an image in the original publication: Figure PCTCN2020095882-appb-000001]
Optionally, after the text information is extracted from the first video frame, the first video frame may be left unchanged. This reduces the amount of video frame processing.
Optionally, after the text information is extracted from the first video frame, the text content in the text information may instead be removed from the first video frame. Optionally, the text content may be removed from the first video frame in the following feasible ways:
One feasible implementation:
The color of the pixels occupied by the text content in the first video frame is updated according to the background pattern of the area where the text content is located, so that the text content is removed from the first video frame. After the text content is removed, the area where the text content was located shows the complete background pattern.
Optionally, the background pattern of the area where the text content is located may be a solid color, a preset regular shape, a preset image, or the like.
下面,结合图4A-图4B,对该种去除文本信息的视频帧进行说明。Hereinafter, this video frame with text information removed will be described with reference to FIGS. 4A-4B.
FIG. 4A is a schematic diagram of another video frame according to an embodiment of this application. Referring to FIG. 4A, video frame A1 includes the text content "Happy House" and "Today's weather is nice". The background pattern of the area containing "Happy House" is solid gray, and the background pattern of the area containing "Today's weather is nice" is solid red. The color of the pixels occupied by "Happy House" in video frame A1 can therefore be replaced with gray, and the color of the pixels occupied by "Today's weather is nice" can be replaced with red, to obtain video frame A2. As shown, video frame A2 does not include the text content.
FIG. 4B is a schematic diagram of another video frame according to an embodiment of this application. Referring to FIG. 4B, video frame B1 includes the text content "Happy House" and "Today's weather is nice". The background pattern of the area containing "Happy House" is vertical stripes, and the background pattern of the area containing "Today's weather is nice" is petals. The color of the pixels occupied by "Happy House" in video frame B1 can therefore be replaced with the colors of the corresponding pixels of the vertical stripes, and the color of the pixels occupied by "Today's weather is nice" can be replaced with the colors of the corresponding pixels of the petals, to obtain video frame B2. As shown, video frame B2 does not include the text content, and it contains complete vertical stripes and complete petals.
Another feasible implementation:
The color of the pixels occupied by the text content in the first video frame is updated to a preset color. For example, the preset color may be white, gray, or the like.
The following describes a video frame from which text content has been removed in this way, with reference to FIG. 4C.
FIG. 4C is a schematic diagram of yet another video frame according to an embodiment of this application. Referring to FIG. 4C, video frame C1 includes the text content "Happy House" and "Today's weather is nice". The background pattern of the area containing "Happy House" is vertical stripes, and the background pattern of the area containing "Today's weather is nice" is petals. Assuming that the preset color is white, the color of the pixels occupied by "Happy House" in video frame C1 can be replaced with white, and the color of the pixels occupied by "Today's weather is nice" can be replaced with white, to obtain video frame C2. As shown, video frame C2 does not include the text content. In video frame C2, the white-filled text pixels cover some pixels of the vertical stripes and of the petals, so part of each pattern is covered in white.
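The two removal options described above can be sketched in a few lines. This is a minimal illustration rather than the method of the embodiments: the frame representation (rows of RGB tuples), the function name `remove_text`, and the way the background color is estimated (the most common non-text color in the region) are all assumptions, and the background branch only handles the solid-color case of FIG. 4A, not the patterned backgrounds of FIG. 4B.

```python
from collections import Counter


def remove_text(frame, text_pixels, region, preset_color=None):
    """Return a copy of `frame` with the text pixels recolored.

    frame        -- list of rows, each a list of (r, g, b) tuples
    text_pixels  -- set of (x, y) coordinates covered by the text strokes
    region       -- (left, top, right, bottom), right/bottom exclusive
    preset_color -- if given, fill with it (FIG. 4C style); otherwise fill
                    with the region's most common non-text color, which
                    is only adequate for a solid background (FIG. 4A style)
    """
    left, top, right, bottom = region
    if preset_color is None:
        background = Counter(
            frame[y][x]
            for y in range(top, bottom)
            for x in range(left, right)
            if (x, y) not in text_pixels
        )
        fill = background.most_common(1)[0][0]
    else:
        fill = preset_color
    cleaned = [row[:] for row in frame]  # leave the input frame untouched
    for x, y in text_pixels:
        cleaned[y][x] = fill
    return cleaned
```

Restoring a striped or image background would need a real inpainting step in place of the single `fill` color.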
S202. The second device compresses the first video frame to obtain a second video frame.
The resolution of the second video frame is lower than the resolution of the first video frame.
Optionally, the second device may determine a compression ratio for the first video frame according to the bandwidth of the transmission channel between the second device and the first device, and compress the first video frame at that compression ratio to obtain the second video frame. The bit rate (in bps) of the resulting second video is lower than the bandwidth of the transmission channel between the second device and the first device.
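As a worked example of deriving a compression ratio from channel bandwidth, the sketch below picks the smallest whole-number ratio that keeps the bit rate under the channel capacity. The formula, the function name, and the `headroom` safety factor are assumptions made for illustration; the application does not specify how the ratio is computed.

```python
import math


def compression_ratio(width, height, bits_per_pixel, fps, bandwidth_bps,
                      headroom=0.9):
    """Smallest whole-number ratio that fits the stream into the channel.

    `headroom` (hypothetical) reserves part of the bandwidth so that the
    compressed bit rate stays strictly below the channel capacity.
    """
    raw_bps = width * height * bits_per_pixel * fps
    budget_bps = bandwidth_bps * headroom
    return max(1, math.ceil(raw_bps / budget_bps))
```

For instance, a 1920x1080 stream at 24 bits per pixel and 60 frames per second has a raw bit rate of 2,985,984,000 bps. Over a 1 Gbps channel with 10% headroom the budget is 900,000,000 bps, so the function returns a ratio of 4, giving about 746 Mbps after compression.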
S203. The second device sends the second video frame and the text information to the first device.
A video includes multiple video frames. For each video frame, the second device can obtain a corresponding second video frame and text information, and send them to the first device. Accordingly, the first device may receive multiple second video frames and multiple pieces of text information. So that the first device can determine the correspondence between second video frames and text information, the second device needs to send them according to a preset rule.
Optionally, the second device may send the second video frame and the text information to the first device in the following feasible implementation:
The second device generates a first identifier and adds it to both the second video frame and the text information. The second device then sends the second video frame including the first identifier and the text information including the first identifier to the first device.
Different video frames correspond to different identifiers.
Optionally, the first identifier may be a timestamp generated by the second device according to the current time. When the second device needs to generate timestamps for multiple video frames at the same time, it may first generate a timestamp according to the current time and then append a distinguishing mark to it, so that each video frame has a different timestamp. For example, if the second device needs to generate timestamps for video frame 1, video frame 2, and video frame 3 at the same time, and the timestamp generated according to the current time is timestamp 1, then after the marks are appended the timestamp of video frame 1 may be timestamp 1+a, that of video frame 2 may be timestamp 1+b, and that of video frame 3 may be timestamp 1+c.
In this feasible implementation, the second device adds the same identifier to the second video frame and the text information, so that the first device can determine their correspondence from the identifier. The process is simple and convenient.
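The "timestamp plus distinguishing mark" scheme described above can be sketched as follows. The millisecond resolution, the string format, and the function name `make_identifiers` are assumptions made for illustration; the application only requires that different video frames receive different identifiers.

```python
import time
from string import ascii_lowercase


def make_identifiers(frame_count, now=None):
    """Return one distinct identifier per frame generated at the same instant.

    `now` lets a caller inject a fixed timestamp; by default the current
    time in milliseconds is used (an illustrative choice).
    """
    stamp = int(time.time() * 1000) if now is None else now
    if frame_count == 1:
        return [str(stamp)]
    if frame_count > len(ascii_lowercase):
        raise ValueError("this sketch supports at most 26 frames per instant")
    # Append a, b, c, ... so frames sharing a timestamp stay distinguishable.
    return [f"{stamp}+{ascii_lowercase[i]}" for i in range(frame_count)]
```

The same identifier would then be attached to both the second video frame and its text information before sending.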
Optionally, the second device may send the second video frame and the text information to the first device over the same transmission channel.
Optionally, the second device may instead send the second video frame and the text information to the first device over different transmission channels. For example, the second device may send the second video frame to the first device over a first transmission channel and send the text information to the first device over a second transmission channel, where the second transmission channel is a parallel, low-bandwidth bypass channel of the first transmission channel.
For example, the first transmission channel may be an embedded DisplayPort (eDP) channel, a High Definition Multimedia Interface (HDMI) channel, or the like. The second transmission channel may be an auxiliary (AUX) channel, a Universal Serial Bus (USB) channel, or the like.
S204. The first device adds the text content to the second video frame according to the attribute information, to obtain a third video frame to be played.
Optionally, the first device may first determine which text information corresponds to the second video frame. For example, when the second device sends the second video frame and the text information by the method shown in S203, the first device may determine the correspondence between the second video frame and the text information according to the identifier included in the received second video frame and the identifier included in the received text information. For example, if a second video frame received by the first device includes a first identifier and a piece of text information received by the first device includes a second identifier, and the first identifier and the second identifier are the same, the first device determines that the second video frame corresponds to that text information.
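The identifier comparison on the first device can be sketched as a simple lookup. The dictionary-based message layout with an `"id"` key is a hypothetical wire format chosen for illustration; the application does not define how the identifier is carried.

```python
def pair_by_identifier(frames, text_infos):
    """Pair each received frame with the text information carrying the same identifier.

    Both arguments are lists of dicts with an "id" key (hypothetical
    format). Frames whose text information has not arrived are left out.
    """
    text_by_id = {info["id"]: info for info in text_infos}
    return [
        (frame, text_by_id[frame["id"]])
        for frame in frames
        if frame["id"] in text_by_id
    ]
```

Because the lookup is keyed on the identifier, the pairing does not depend on the order in which frames and text information arrive, which matters when they travel over different channels.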
Optionally, the first device may add the text content to the second video frame to obtain the third video frame in the following feasible implementation: the first device generates a first image corresponding to the text content according to the text content and the attribute information, and then adds the first image to the second video frame according to the attribute information, to obtain the third video frame. For example, the first device may add the first image to the second video frame according to the position information in the attribute information, for instance by overlaying the first image on the second video frame at the indicated position.
Optionally, the area of the first image other than the text content is transparent.
Optionally, the resolution of the first image is higher than the resolution of the second video frame. For example, the resolution of the first image may be equal to the resolution of the first video frame, so that the text content in the third video frame has high definition.
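The overlay step can be illustrated with a small sketch in which transparent pixels of the first image leave the underlying frame visible. The pixel representation (`None` for a transparent pixel) and the function name `overlay` are assumptions, and the sketch further assumes that the second video frame has already been scaled to the resolution in which the position information is expressed; the application does not describe that scaling.

```python
def overlay(frame, image, position):
    """Paste `image` onto a copy of `frame` with its top-left corner at `position`.

    frame    -- list of rows of (r, g, b) tuples
    image    -- list of rows of (r, g, b) tuples, with None marking a
                transparent pixel (hypothetical representation)
    position -- (x, y) of the image's top-left corner in the frame
    """
    x0, y0 = position
    merged = [row[:] for row in frame]  # leave the input frame untouched
    for dy, image_row in enumerate(image):
        for dx, pixel in enumerate(image_row):
            if pixel is not None:  # transparent pixels keep the frame pixel
                merged[y0 + dy][x0 + dx] = pixel
    return merged
```

When the text information contains several groups of text content, the function can be applied once per first image, feeding each result into the next call.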
Optionally, after the first device obtains the third video frame, the first device may play the third video frame, or the first device may send the third video frame to a third device so that the third device plays it.
In the video processing method provided by the embodiments of this application, for any first video frame in a video stream, the second device first extracts text information from the first video frame and compresses the first video frame to obtain a second video frame. The second device sends the second video frame and the text information to the first device. Because the second video frame is a compressed video frame, the bandwidth pressure on the transmission channel between the second device and the first device is reduced. After receiving the second video frame and the text information, the first device can merge them, combining the text content in the text information into the second video frame to obtain a third video frame. Because the text content is not compressed, the text content in the third video frame has high definition, which improves the quality of the compressed video.
For ease of understanding, the following describes the architecture in which the first device and the second device process video, with reference to FIG. 5.
FIG. 5 is an architecture diagram of video processing according to an embodiment of this application. Referring to FIG. 5, after obtaining the first video frame, the second device extracts text information from the first video frame, and compresses the first video frame from which the text information has been extracted, to obtain the second video frame. The second device also generates a timestamp and adds the same timestamp to both the text information and the second video frame. The second device sends the timestamped text information and the timestamped second video frame to the first device.
After receiving the timestamped text information and the timestamped second video frame, the first device can determine the correspondence between the text information and the second video frame according to the timestamp. After determining the correspondence, the first device can generate the first image according to the text information, and merge the first image and the second video frame to obtain the third video frame.
Based on the architecture shown in FIG. 5, the following describes the video processing procedure with reference to FIG. 6.
FIG. 6 is a schematic flowchart of another video processing method according to an embodiment of this application. Referring to FIG. 6, the method may include the following steps:
S601. The second device obtains a first video frame.
The second device obtains the first video frame from a video stream to be sent to the first device. The video stream may be a video stream local to the second device, or a video stream that the second device has received from another device.
S602. The second device extracts text information from the first video frame.
It should be noted that, for the execution of S602, reference may be made to the execution of S201. Details are not repeated here.
S603. The second device compresses the first video frame to obtain a second video frame.
It should be noted that, for the execution of S603, reference may be made to the execution of S202. Details are not repeated here.
S604. The second device generates a timestamp.
Optionally, the second device may generate the timestamp according to the current time.
S605. The second device adds the timestamp to the text information and to the second video frame.
After the timestamp is added to the text information, the text information includes the timestamp. After the timestamp is added to the second video frame, the second video frame includes the timestamp.
S606. The second device sends the second video frame including the timestamp to the first device over a first transmission channel.
S607. The second device sends the text information including the timestamp to the first device over a second transmission channel.
S608. The first device determines the correspondence between the text information and the second video frame according to the timestamp.
Optionally, the first device may receive multiple pieces of text information and multiple second video frames at the same time, so it needs to determine the correspondence between them according to the timestamps. For example, the first device determines that a piece of text information and a second video frame with the same timestamp correspond to each other.
S609. The first device generates a first image according to the text information.
Optionally, if the text information includes multiple groups of text content, the first device may generate a first image for each group of text content.
Optionally, for any group of text content, the first device generates the corresponding first image according to the text content of the group and its font, size, color, and font effects. The resolution of the first image is higher than the resolution of the second video frame.
Optionally, the size of the first image may be the same as the size of the area occupied by the text content in the first video frame.
The following describes the first image with reference to FIG. 7.
FIG. 7 is a schematic diagram of the first image according to an embodiment of this application. Referring to FIG. 7, assume that the first video frame is as shown at 701. The first video frame 701 includes the text content "Happy House" and "Today's weather is nice".
Assuming that the text information extracted from the first video frame is as shown in Table 1, the first device can generate image 702 according to "Happy House" and "position 1, font 1, size 1, color 1, spacing 1, effect 1" in Table 1. Image 702 includes the text content "Happy House", and the attribute information of the text content in image 702 is the same as the attribute information of that text content in the first video frame 701. The size of image 702 is the same as the size of the area occupied by "Happy House" in the first video frame 701.
The first device can also generate image 703 according to "Today's weather is nice" and "position 2, font 2, size 2, color 2, spacing 2, effect 2" in Table 1. Image 703 includes the text content "Today's weather is nice", and the attribute information of the text content in image 703 is the same as the attribute information of that text content in the first video frame 701. The size of image 703 is the same as the size of the area occupied by "Today's weather is nice" in the first video frame 701.
S610. The first device merges the first image and the second video frame to obtain a third video frame.
Optionally, if there are multiple first images, the first device merges each first image with the second video frame to obtain the third video frame.
S611. The first device plays the third video frame.
In the embodiment shown in FIG. 6, for any first video frame in the video stream, the second device first extracts text information from the first video frame and compresses the first video frame to obtain a second video frame. The second device sends the second video frame and the text information to the first device. Because the second video frame is a compressed video frame, the bandwidth pressure on the transmission channel between the second device and the first device is reduced. After receiving the second video frame and the text information, the first device can merge them, combining the text content in the text information into the second video frame to obtain a third video frame, and play the third video frame. Because the text content is not compressed, the text content in the third video frame played by the first device has high definition, and the quality of the video played by the first device is high.
Based on any of the foregoing embodiments, the following describes the video processing method with reference to FIG. 8 and FIG. 9.
FIG. 8 is a schematic diagram of a video processing procedure according to an embodiment of this application. Referring to FIG. 8, the first video frame is shown at 801 and includes the text content "Happy House" and "Today's weather is nice". All characters in "Happy House" have the same font, size, color, and font effects, and all characters in "Today's weather is nice" have the same font, size, color, and font effects.
After obtaining the first video frame 801, the second device extracts text information 802 from it. The text information 802 includes the two groups of text content, "Happy House" and "Today's weather is nice", and the attribute information of each group. Assume that the first video frame 801 is left unchanged after the second device extracts the text information from it.
The second device compresses the first video frame 801, from which the text information has been extracted, to obtain a second video frame 803. The second device adds the same timestamp to the second video frame 803 and the text information 802, and sends the second video frame 803 including the timestamp and the text information 802 including the timestamp.
After receiving the second video frame 803 including the timestamp and the text information 802 including the timestamp, the first device determines, according to the timestamp, that the second video frame 803 corresponds to the text information 802. The first device generates image 804 according to the text content "Happy House" and its font, size, color, and font effects. The font, size, color, and font effects of "Happy House" in image 804 are the same as those of that text content in the first video frame 801. The first device generates image 805 according to the text content "Today's weather is nice" and its font, size, color, and font effects. The font, size, color, and font effects of "Today's weather is nice" in image 805 are the same as those of that text content in the first video frame 801.
The first device overlays image 804 on the second video frame 803 according to the position information of "Happy House" in the first video frame 801. The first device also overlays image 805 on the second video frame 803 according to the position information of "Today's weather is nice" in the first video frame 801, to obtain a third video frame 806. The first device can play the third video frame 806.
In the embodiment shown in FIG. 8, the bandwidth pressure between the second device and the first device is reduced, the text content in the third video frame played by the first device has high definition, and the quality of the video played by the first device is high. Further, because the first video frame is left unchanged after the second device extracts the text information from it (the text content is not removed from the first video frame), the video processing workload of the second device is small and the video processing efficiency is high.
FIG. 9 is a schematic diagram of another video processing procedure according to an embodiment of this application. Referring to FIG. 9, the first video frame is shown at 901 and includes the text content "Happy House" and "Today's weather is nice". All characters in "Happy House" have the same font, size, color, and font effects, and all characters in "Today's weather is nice" have the same font, size, color, and font effects.
After obtaining the first video frame 901, the second device extracts text information 902 from it. The text information 902 includes the two groups of text content, "Happy House" and "Today's weather is nice", and the attribute information of each group. After extracting the text information from the first video frame 901, the second device removes the text content from the first video frame 901 to obtain a first video frame 903. As shown in FIG. 9, the first video frame 903 does not include the text content.
The second device compresses the first video frame 903 to obtain a second video frame 904. The second device adds the same timestamp to the second video frame 904 and the text information 902, and sends the second video frame 904 including the timestamp and the text information 902 including the timestamp.
After receiving the second video frame 904 including the timestamp and the text information 902 including the timestamp, the first device determines, according to the timestamp, that the second video frame 904 corresponds to the text information 902. The first device generates image 905 according to the text content "Happy House" and its font, size, color, and font effects. The font, size, color, and font effects of "Happy House" in image 905 are the same as those of that text content in the first video frame 901. The first device generates image 906 according to the text content "Today's weather is nice" and its font, size, color, and font effects. The font, size, color, and font effects of "Today's weather is nice" in image 906 are the same as those of that text content in the first video frame 901.
The first device overlays image 905 on the second video frame 904 according to the position information of "Happy House" in the first video frame 901. The first device also overlays image 906 on the second video frame 904 according to the position information of "Today's weather is nice" in the first video frame 901, to obtain a third video frame 907. The first device can play the third video frame 907.
In the embodiment shown in FIG. 9, the bandwidth pressure between the second device and the first device is reduced, the text content in the third video frame played by the first device has high definition, and the quality of the video played by the first device is high. Further, the text content is removed from the first video frame after the second device extracts the text information from it. As a result, when the first device adds the first images (image 905 and image 906) to the second video frame, the problem of the text content in a first image failing to fully cover the text content in the second video frame is avoided, and the quality of the video processing is high.
FIG. 10 shows a video processing apparatus provided by an embodiment of this application. The video processing apparatus 10 may be disposed in the first device. Referring to FIG. 10, the video processing apparatus 10 may include a receiving module 11 and a processing module 12, where:
the receiving module 11 is configured to receive a second video frame sent by a second device and text information extracted from a first video frame, where the second video frame is obtained by compressing the first video frame, and the text information includes text content and attribute information of the text content;
the processing module 12 is configured to add the text content to the second video frame according to the attribute information, to obtain a third video frame to be played.
Optionally, the receiving module 11 may perform S203 in the embodiment of FIG. 2 and S606-S607 in the embodiment of FIG. 6.
Optionally, the processing module 12 may perform S204 in the embodiment of FIG. 2 and S608-S611 in the embodiment of FIG. 6.
It should be noted that the video processing apparatus shown in the embodiments of this application can execute the technical solutions shown in the foregoing method embodiments. The implementation principles and beneficial effects are similar and are not repeated here.
In a possible implementation, the processing module 12 is specifically configured to:
generate a first image corresponding to the text content according to the text content and the attribute information, where the resolution of the first image is greater than the resolution of the second video frame;
add the first image to the second video frame according to the attribute information, to obtain the third video frame.
In a possible implementation, the processing module 12 is specifically configured to:
determine at least one group of text content in the text content, where each group of text content includes at least one character, the characters in one group of text content have the same font, size, color, and font special effect, and the font special effect includes at least one of affine, rotation, or projection;
generate, according to the attribute information of each group of text content, a first image corresponding to that group of text content.
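The grouping step can be sketched as follows. The sketch assumes that each character arrives paired with its own style record and that only consecutive characters are merged into one group; the names `Style` and `group_by_style` and the string encoding of the font special effect are hypothetical and are not defined in this application.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple


@dataclass(frozen=True)
class Style:
    font: str
    size: int
    color: str
    effect: str  # assumed encoding: "none", "affine", "rotation" or "projection"


def group_by_style(chars: List[Tuple[str, Style]]) -> List[Tuple[str, Style]]:
    """Merge consecutive characters that share font, size, color and effect."""
    groups = []
    for style, run in groupby(chars, key=lambda item: item[1]):
        groups.append(("".join(ch for ch, _ in run), style))
    return groups
```

Each returned pair is one group of text content together with the attributes shared by its characters, so one first image could then be rendered per pair.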
In a possible implementation, the area of the first image other than the text content is transparent.
In a possible implementation, the processing module 12 is specifically configured to:
obtain, from the attribute information, position information of the text content in the first video frame;
add the first image to the second video frame according to the position information, to obtain the third video frame.
In a possible implementation, before the processing module 12 adds the text content to the second video frame according to the attribute information, the processing module 12 is further configured to:
obtain a first identifier from the second video frame;
obtain a second identifier from the text information;
determine that the first identifier and the second identifier are the same.
In a possible implementation, the first identifier and the second identifier are the same timestamp.
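The identifier check can be illustrated with a small buffering sketch. It assumes that the second video frame and the text information may arrive in either order, and that whichever arrives first is held until its counterpart with the same identifier (for example, the same timestamp) arrives. The class `Pairer` and its methods are hypothetical names introduced only for this illustration.

```python
from typing import Dict, Optional, Tuple


class Pairer:
    """Hold frames and text information until identifiers match."""

    def __init__(self) -> None:
        self._frames: Dict[int, bytes] = {}
        self._texts: Dict[int, dict] = {}

    def add_frame(self, ident: int, frame: bytes) -> Optional[Tuple[bytes, dict]]:
        # If text information with the same identifier is waiting, pair it.
        if ident in self._texts:
            return frame, self._texts.pop(ident)
        self._frames[ident] = frame
        return None

    def add_text(self, ident: int, text: dict) -> Optional[Tuple[bytes, dict]]:
        # If a frame with the same identifier is waiting, pair it.
        if ident in self._frames:
            return self._frames.pop(ident), text
        self._texts[ident] = text
        return None
```

A pair is returned only when the two identifiers are equal; unmatched items remain buffered. A practical receiver would presumably also discard stale entries, which this sketch omits.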
In a possible implementation, the receiving module 11 is specifically configured to:
receive, from a first transmission channel, the second video frame sent by the second device;
receive, from a second transmission channel, the text information sent by the second device, where the second transmission channel is a parallel, low-bandwidth bypass channel of the first transmission channel.
In a possible implementation, the attribute information includes the position, font, size, color, and font special effect of the text content in the video frame, and the font special effect includes at least one of affine, rotation, or projection.
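One way to picture the text information is as a small record that is serialized before transmission. The following is a sketch under assumptions: the field names, the use of JSON, and the example values are illustrative only, since this application does not specify an encoding.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TextInfo:
    content: str   # the text content
    x: int         # assumed: left edge of the text in the video frame
    y: int         # assumed: top edge of the text in the video frame
    font: str
    size: int
    color: str
    effects: List[str] = field(default_factory=list)  # e.g. ["rotation"]
    timestamp: int = 0  # identifier shared with the matching video frame


def encode(info: TextInfo) -> bytes:
    """Serialize the record to UTF-8 JSON (an assumed wire format)."""
    return json.dumps(asdict(info), ensure_ascii=False).encode("utf-8")


def decode(payload: bytes) -> TextInfo:
    """Rebuild the record from its UTF-8 JSON form."""
    return TextInfo(**json.loads(payload.decode("utf-8")))
```

A record of this kind occupies on the order of a hundred bytes, which is consistent with sending the text information over a low-bandwidth channel.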
In a possible implementation, after the processing module 12 adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played, the processing module 12 is further configured to:
play the third video frame;
or,
send the third video frame to a third device, where the third device is configured to play the third video frame.
It should be noted that the video processing apparatus shown in the embodiments of this application can execute the technical solutions shown in the foregoing method embodiments. The implementation principles and beneficial effects are similar and are not repeated here.
FIG. 11 shows another video processing apparatus provided by an embodiment of this application. The video processing apparatus 20 may be disposed in the second device. Referring to FIG. 11, the video processing apparatus 20 may include a processing module 21 and a sending module 22, where:
the processing module 21 is configured to extract text information from a first video frame, where the text information includes text content and attribute information;
the processing module 21 is further configured to compress the first video frame to obtain a second video frame;
the sending module 22 is configured to send the second video frame and the text information to the first device.
Optionally, the processing module 21 may perform S201-S202 in the embodiment of FIG. 2 and S601-S605 in the embodiment of FIG. 6.
Optionally, the sending module 22 may perform S203 in the embodiment of FIG. 2 and S606-S607 in the embodiment of FIG. 6.
It should be noted that the video processing apparatus shown in the embodiments of this application can execute the technical solutions shown in the foregoing method embodiments. The implementation principles and beneficial effects are similar and are not repeated here.
In a possible implementation, before the sending module 22 sends the second video frame and the text information to the first device, the processing module 21 is further configured to:
generate a first identifier;
add the first identifier to each of the second video frame and the text information.
In a possible implementation, the first identifier is a timestamp generated by the second device.
In a possible implementation, the sending module 22 is specifically configured to:
send the second video frame to the first device through a first transmission channel;
send the text information to the first device through a second transmission channel, where the second transmission channel is a parallel, low-bandwidth bypass channel of the first transmission channel.
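The sending side can be sketched end to end under several stated assumptions: `queue.Queue` objects stand in for the two transmission channels, `zlib.compress` stands in for video compression (it is a general-purpose compressor, not a video codec), and `extract_text` is a hypothetical caller-supplied function, because this sketch does not implement text extraction. Removing the text content from the frame before compression, as in the embodiment of FIG. 9, is omitted.

```python
import queue
import zlib
from typing import Callable, Dict


def send_frame(frame: bytes,
               timestamp: int,
               extract_text: Callable[[bytes], Dict],
               video_channel: queue.Queue,
               text_channel: queue.Queue) -> None:
    """Extract text, compress the frame, tag both, and send on two channels."""
    text_info = dict(extract_text(frame))  # hypothetical extractor result
    compressed = zlib.compress(frame)      # stand-in for video compression
    # The same identifier is added to both the frame and the text information.
    text_info["timestamp"] = timestamp
    video_channel.put({"timestamp": timestamp, "data": compressed})
    text_channel.put(text_info)
```

Because both messages carry the same timestamp, a receiver can later determine which text information corresponds to which compressed frame even though they travel on separate channels.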
In a possible implementation, the attribute information includes the position, font, size, color, and font special effect of the text content in the video frame, and the font special effect includes at least one of affine, rotation, or projection.
It should be noted that the video processing apparatus shown in the embodiments of this application can execute the technical solutions shown in the foregoing method embodiments. The implementation principles and beneficial effects are similar and are not repeated here.
FIG. 12 is a schematic diagram of the hardware structure of a video processing apparatus provided by an embodiment of this application. Referring to FIG. 12, the video processing apparatus 30 includes a memory 31, a processor 32, and a receiver 33, where the memory 31 and the processor 32 communicate with each other. For example, the memory 31, the processor 32, and the receiver 33 may communicate through a communication bus 44. The memory 31 is configured to store a computer program, and the processor 32 executes the computer program to implement the foregoing video processing method.
Optionally, the processor 32 shown in this application may implement the function of the processing module 12 in the embodiment of FIG. 10, and the receiver 33 may implement the function of the receiving module 11 in the embodiment of FIG. 10. Details are not repeated here.
Optionally, the foregoing processor may be a CPU, or may be another general-purpose processor, a DSP, an ASIC, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method embodiments disclosed in this application may be directly performed by a hardware processor, or may be performed by a combination of hardware and software modules in the processor.
FIG. 13 is a schematic diagram of the hardware structure of a video processing apparatus provided by an embodiment of this application. Referring to FIG. 13, the video processing apparatus 40 includes a memory 41, a processor 42, and a transmitter 43, where the memory 41 and the processor 42 communicate with each other. For example, the memory 41, the processor 42, and the transmitter 43 may communicate through a communication bus 44. The memory 41 is configured to store a computer program, and the processor 42 executes the computer program to implement the foregoing video processing method.
Optionally, the processor 42 shown in this application may implement the function of the processing module 21 in the embodiment of FIG. 11, and the transmitter 43 may implement the function of the sending module 22 in the embodiment of FIG. 11. Details are not repeated here.
Optionally, the foregoing processor may be a CPU, or may be another general-purpose processor, a DSP, an ASIC, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method embodiments disclosed in this application may be directly performed by a hardware processor, or may be performed by a combination of hardware and software modules in the processor.
This application provides a storage medium. The storage medium is configured to store a computer program, and the computer program is used to implement the video processing method described in the foregoing embodiments.
An embodiment of this application further provides a chip or an integrated circuit, including a memory and a processor;
the memory is configured to store program instructions, and in some cases also to store intermediate data;
the processor is configured to call the program instructions stored in the memory to implement the video processing method described above.
Optionally, the memory may be independent or may be integrated with the processor. In some implementations, the memory may also be located outside the chip or integrated circuit.
An embodiment of this application further provides a program product. The program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the foregoing video processing method.
All or some of the steps of the foregoing method embodiments may be implemented by hardware related to program instructions. The program may be stored in a readable memory. When executed, the program performs the steps of the foregoing method embodiments. The memory (storage medium) includes: a read-only memory (ROM), a RAM, a flash memory, a hard disk, a solid-state drive, a magnetic tape, a floppy disk, an optical disc, and any combination thereof.
The embodiments of this application are described with reference to flowcharts and/or block diagrams of the methods, devices (systems), and computer program products according to the embodiments of this application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and any combination of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processing unit of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus. The instruction apparatus implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the embodiments of this application without departing from the spirit and scope of this application. If these changes and modifications fall within the scope of the claims of this application and their equivalent technologies, this application is also intended to include them.
In this application, the term "include" and its variants may denote non-limiting inclusion; the term "or" and its variants may denote "and/or". The terms "first", "second", and the like in this application are used to distinguish between similar objects, and are not necessarily used to describe a specific order or sequence. In this application, "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: A exists alone, both A and B exist, or B exists alone. The character "/" generally indicates an "or" relationship between the associated objects.

Claims (30)

  1. A video processing method, characterized by comprising:
    receiving, by a first device, a second video frame sent by a second device and text information extracted from a first video frame, wherein the second video frame is obtained by compressing the first video frame, and the text information includes text content and attribute information of the text content;
    adding, by the first device, the text content to the second video frame according to the attribute information, to obtain a third video frame to be played.
  2. The method according to claim 1, wherein the adding, by the first device, the text content to the second video frame according to the attribute information, to obtain the third video frame to be played, comprises:
    generating, by the first device, a first image corresponding to the text content according to the text content and the attribute information, wherein the resolution of the first image is greater than the resolution of the second video frame;
    adding, by the first device, the first image to the second video frame according to the attribute information, to obtain the third video frame.
  3. The method according to claim 2, wherein the generating, by the first device, the first image corresponding to the text content according to the text content and the attribute information comprises:
    determining, by the first device, at least one group of text content in the text content, wherein each group of text content includes at least one character, the characters in one group of text content have the same font, size, color, and font special effect, and the font special effect includes at least one of affine, rotation, or projection;
    generating, by the first device according to the attribute information of each group of text content, a first image corresponding to that group of text content.
  4. The method according to claim 2 or 3, wherein the area of the first image other than the text content is transparent.
  5. The method according to any one of claims 2-4, wherein the adding, by the first device, the first image to the second video frame according to the attribute information, to obtain the third video frame to be played, comprises:
    obtaining, by the first device from the attribute information, position information of the text content in the first video frame;
    adding, by the first device, the first image to the second video frame according to the position information, to obtain the third video frame.
  6. The method according to any one of claims 1-5, wherein before the first device adds the text content to the second video frame according to the attribute information, the method further comprises:
    obtaining, by the first device, a first identifier from the second video frame;
    obtaining, by the first device, a second identifier from the text information;
    determining, by the first device, that the first identifier and the second identifier are the same.
  7. The method according to claim 6, wherein the first identifier and the second identifier are the same timestamp.
  8. The method according to any one of claims 1-7, wherein the receiving, by the first device, the second video frame sent by the second device and the text information extracted from the first video frame comprises:
    receiving, by the first device from a first transmission channel, the second video frame sent by the second device;
    receiving, by the first device from a second transmission channel, the text information sent by the second device, wherein the second transmission channel is a parallel, low-bandwidth bypass channel of the first transmission channel.
  9. The method according to any one of claims 1-8, wherein the attribute information includes the position, font, size, color, and font special effect of the text content in the video frame, and the font special effect includes at least one of affine, rotation, or projection.
  10. The method according to any one of claims 1-9, wherein after the first device adds the text content to the second video frame according to the attribute information to obtain the third video frame to be played, the method further comprises:
    playing, by the first device, the third video frame;
    or,
    sending, by the first device, the third video frame to a third device, wherein the third device is configured to play the third video frame.
  11. A video processing method, characterized by comprising:
    extracting, by a second device, text information from a first video frame, wherein the text information includes text content and attribute information;
    compressing, by the second device, the first video frame to obtain a second video frame;
    sending, by the second device, the second video frame and the text information to a first device.
  12. The method according to claim 11, wherein before the second device sends the second video frame and the text information to the first device, the method further comprises:
    generating, by the second device, a first identifier;
    adding, by the second device, the first identifier to each of the second video frame and the text information.
  13. The method according to claim 12, wherein the first identifier is a timestamp generated by the second device.
  14. The method according to any one of claims 11-13, wherein the sending, by the second device, the second video frame and the text information to the first device comprises:
    sending, by the second device, the second video frame to the first device through a first transmission channel;
    sending, by the second device, the text information to the first device through a second transmission channel, wherein the second transmission channel is a parallel, low-bandwidth bypass channel of the first transmission channel.
  15. The method according to any one of claims 11-14, wherein the attribute information includes the position, font, size, color, and font special effect of the text content in the video frame, and the font special effect includes at least one of affine, rotation, or projection.
  16. A video processing apparatus, characterized by comprising a receiving module and a processing module, wherein:
    the receiving module is configured to receive a second video frame sent by a second device and text information extracted from a first video frame, wherein the second video frame is obtained by compressing the first video frame, and the text information includes text content and attribute information of the text content;
    the processing module is configured to add the text content to the second video frame according to the attribute information, to obtain a third video frame to be played.
  17. The apparatus according to claim 16, wherein the processing module is specifically configured to:
    generate a first image corresponding to the text content according to the text content and the attribute information, wherein the resolution of the first image is greater than the resolution of the second video frame;
    add the first image to the second video frame according to the attribute information, to obtain the third video frame.
  18. The apparatus according to claim 17, wherein the processing module is specifically configured to:
    determine at least one group of text content in the text content, wherein each group of text content includes at least one character, the characters in one group of text content have the same font, size, color, and font special effect, and the font special effect includes at least one of affine, rotation, or projection;
    generate, according to the attribute information of each group of text content, a first image corresponding to that group of text content.
  19. The apparatus according to claim 17 or 18, wherein the area of the first image other than the text content is transparent.
  20. The apparatus according to any one of claims 17-19, wherein the processing module is specifically configured to:
    obtain, from the attribute information, position information of the text content in the first video frame;
    add the first image to the second video frame according to the position information, to obtain the third video frame.
  21. The apparatus according to any one of claims 16-20, wherein before the processing module adds the text content to the second video frame according to the attribute information, the processing module is further configured to:
    obtain a first identifier from the second video frame;
    obtain a second identifier from the text information;
    determine that the first identifier and the second identifier are the same.
  22. The apparatus according to any one of claims 16-21, wherein the receiving module is specifically configured to:
    receive, from a first transmission channel, the second video frame sent by the second device;
    receive, from a second transmission channel, the text information sent by the second device, wherein the second transmission channel is a parallel, low-bandwidth bypass channel of the first transmission channel.
  23. The apparatus according to any one of claims 16-22, wherein the attribute information includes the position, font, size, color, and font special effect of the text content in the video frame, and the font special effect includes at least one of affine, rotation, or projection.
  24. 一种视频处理装置,其特征在于,包括处理模块和发送模块,其中,A video processing device, characterized in that it comprises a processing module and a sending module, wherein:
    所述处理模块用于,在第一视频帧中提取文本信息,所述文本信息包括文本内容和属性信息;The processing module is configured to extract text information from a first video frame, where the text information includes text content and attribute information;
    所述处理模块还用于,对所述第一视频帧进行压缩处理,得到第二视频帧;The processing module is further configured to perform compression processing on the first video frame to obtain a second video frame;
    所述发送模块用于,向第一设备发送所述第二视频帧和所述文本信息。The sending module is configured to send the second video frame and the text information to the first device.
  25. 根据权利要求24所述的装置,其特征在于,在所述发送模块向所述第一设备发送所述第二视频帧和所述文本信息之前,所述处理模块还用于:The apparatus according to claim 24, wherein before the sending module sends the second video frame and the text information to the first device, the processing module is further configured to:
    生成第一标识;Generate the first identification;
    分别在所述第二视频帧和所述文本信息中添加所述第一标识。The first identifier is added to the second video frame and the text information respectively.
  26. 根据权利要求24或25所述的装置,其特征在于,所述发送模块具体用于:The device according to claim 24 or 25, wherein the sending module is specifically configured to:
    通过第一传输通道向所述第一设备发送所述第二视频帧;Sending the second video frame to the first device through a first transmission channel;
    通过第二传输通道向所述第一设备发送所述文本信息,所述第二传输通道为所述第一传输通道的并行旁路小带宽通道。The text information is sent to the first device through a second transmission channel, and the second transmission channel is a parallel bypass small bandwidth channel of the first transmission channel.
  27. 根据权利要求24-26任一项所述的装置,其特征在于,所述属性信息包括所述文本内容在所述视频帧中的位置、字体、尺寸、颜色和字体特效,所述字体特效包括仿射、旋转或投影中的至少一种。The device according to any one of claims 24-26, wherein the attribute information includes the position, font, size, color, and font special effects of the text content in the video frame, and the font special effects include At least one of affine, rotation, or projection.
  28. A video processing apparatus, comprising a memory, a processor, and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to perform the video processing method according to any one of claims 1-10.
  29. A video processing apparatus, comprising a memory, a processor, and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to perform the video processing method according to any one of claims 11-15.
  30. A storage medium, comprising a computer program, wherein the computer program is used to implement the video processing method according to any one of claims 1-15.
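
The sender-side flow recited in claims 24-27 can be illustrated with a short sketch. This is not the applicant's implementation, only one possible reading under stated assumptions: the names `TextInfo`, `CompressedFrame`, and `process_and_send` are hypothetical, `zlib` stands in for a real (typically lossy) video encoder, text extraction is delegated to a caller-supplied function, and the two transmission channels are modeled as plain callables.

```python
from dataclasses import dataclass, field, asdict
from typing import Callable, List, Tuple
import json
import uuid
import zlib


@dataclass
class TextInfo:
    """Text content plus the attribute information listed in claim 27."""
    content: str
    position: Tuple[int, int]          # (x, y) of the text in the frame
    font: str
    size: int
    color: str
    # Font special effects, e.g. "affine", "rotation", "projection".
    effects: List[str] = field(default_factory=list)
    frame_id: str = ""                 # the "first identifier" of claim 25


@dataclass
class CompressedFrame:
    """The "second video frame": compressed payload plus shared identifier."""
    payload: bytes
    frame_id: str = ""


def process_and_send(
    frame: bytes,
    extract_text: Callable[[bytes], TextInfo],
    video_channel: Callable[[CompressedFrame], None],
    text_channel: Callable[[bytes], None],
) -> str:
    """Extract text, compress the frame, tag both, send on two channels."""
    # Claim 24: extract text information from the first video frame.
    text_info = extract_text(frame)

    # Claim 24: compress the first frame to obtain the second frame.
    # Assumption: zlib is only a placeholder for a video codec.
    compressed = CompressedFrame(payload=zlib.compress(frame))

    # Claim 25: generate one identifier and add it to both parts so the
    # receiver can match the text information to its frame.
    frame_id = uuid.uuid4().hex
    compressed.frame_id = frame_id
    text_info.frame_id = frame_id

    # Claim 26: the frame goes over the main channel, the text over the
    # parallel low-bandwidth bypass channel.
    video_channel(compressed)
    text_channel(json.dumps(asdict(text_info)).encode("utf-8"))
    return frame_id
```

Serializing the text information as JSON is a design choice made here for readability; the claims do not specify a wire format. Because the text travels separately from the compressed frame, a receiver could redraw it at full sharpness after decoding, using the shared identifier to pair the two.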
PCT/CN2020/095882 2019-06-14 2020-06-12 Video processing method, apparatus and device WO2020249100A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910517023.2 2019-06-14
CN201910517023.2A CN112087660A (en) 2019-06-14 2019-06-14 Video processing method, device and equipment

Publications (1)

Publication Number Publication Date
WO2020249100A1 (en) 2020-12-17

Family

ID=73734071

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/095882 WO2020249100A1 (en) 2019-06-14 2020-06-12 Video processing method, apparatus and device

Country Status (2)

Country Link
CN (1) CN112087660A (en)
WO (1) WO2020249100A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112887781A (en) * 2021-01-27 2021-06-01 维沃移动通信有限公司 Subtitle processing method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011010023A (en) * 2009-06-25 2011-01-13 Sharp Corp Image compressing apparatus, image compressing method, image decompressing apparatus, image decompressing method, image forming apparatus, computer program and recording medium
CN102630043A (en) * 2012-04-01 2012-08-08 北京捷成世纪科技股份有限公司 Object-based video transcoding method and device
CN103731609A (en) * 2012-10-11 2014-04-16 百度在线网络技术(北京)有限公司 Video playing method and system
CN103905837A (en) * 2014-03-26 2014-07-02 小米科技有限责任公司 Image processing method and device and terminal
CN105516539A (en) * 2014-10-10 2016-04-20 柯尼卡美能达株式会社 History generating apparatus and history generating method
WO2018103568A1 (en) * 2016-12-08 2018-06-14 中兴通讯股份有限公司 Methods of encoding and decoding cloud desktop content, device, and system
CN109168006A (en) * 2018-09-05 2019-01-08 高新兴科技集团股份有限公司 The video coding-decoding method that a kind of figure and image coexist

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102246532B (en) * 2008-12-15 2014-04-02 爱立信电话股份有限公司 Method and apparatus for avoiding quality deterioration of transmitted media content
CN102957892A (en) * 2011-08-24 2013-03-06 三星电子(中国)研发中心 Method, system and device for realizing audio and video conference
CN103873877A (en) * 2012-12-14 2014-06-18 华为技术有限公司 Image transmission method and device for remote desktop
US9471990B1 (en) * 2015-10-20 2016-10-18 Interra Systems, Inc. Systems and methods for detection of burnt-in text in a video
CN105979169A (en) * 2015-12-15 2016-09-28 乐视网信息技术(北京)股份有限公司 Video subtitle adding method, device and terminal

Also Published As

Publication number Publication date
CN112087660A (en) 2020-12-15

Similar Documents

Publication Publication Date Title
EP3996381A1 (en) Cover image determination method and apparatus, and device
US6704042B2 (en) Video processing apparatus, control method therefor, and storage medium
JP6283108B2 (en) Image processing method and apparatus
US20220014819A1 (en) Video image processing
US11863801B2 (en) Method and device for generating live streaming video data and method and device for playing live streaming video
US9076071B2 (en) Logo recognition
US20190222806A1 (en) Communication system and method
WO2019192509A1 (en) Media data processing method and apparatus
CN106851386B (en) Method and device for realizing augmented reality in television terminal based on Android system
CN106303289A (en) A kind of real object and virtual scene are merged the method for display, Apparatus and system
WO2017118078A1 (en) Image processing method, playing method and related device and system
CN106713942B (en) Video processing method and device
EP3681144A1 (en) Video processing method and apparatus based on augmented reality, and electronic device
CN106027886B (en) A kind of panoramic video realizes the method and system of synchronization frame
WO2020249100A1 (en) Video processing method, apparatus and device
US11570453B2 (en) Switchable chroma sampling for wireless display
WO2023241459A1 (en) Data communication method and system, and electronic device and storage medium
EP3671657A1 (en) File generation apparatus, image generation apparatus, file generation method, and program
WO2023065961A1 (en) Video implantation method and apparatus, device, and computer readable storage medium
WO2023066054A1 (en) Image processing method, cloud server, vr terminal and storage medium
CN100518253C (en) A manuscript writing system
CN116962742A (en) Live video image data transmission method, device and live video system
CN113938617A (en) Multi-channel video display method and equipment, network camera and storage medium
KR20210052884A (en) Personalized Video Production System and Method Using Chroma Key
CN110619362A (en) Video content comparison method and device based on perception and aberration

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20822490

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20822490

Country of ref document: EP

Kind code of ref document: A1