WO2024078064A1 - Image processing method and apparatus, and terminal - Google Patents

Image processing method and apparatus, and terminal

Info

Publication number
WO2024078064A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
auxiliary stream
current frame
processed
information
Application number
PCT/CN2023/105927
Other languages
English (en)
Chinese (zh)
Inventor
鄢玉民
宋晨
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2024078064A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • the present application relates to but is not limited to the field of image processing technology.
  • the terminal will interact with the user by displaying multiple frames of images.
  • users' demand for real-time interaction is becoming more and more prominent.
  • Traditional video conferencing displays only individual acquired frames and cannot synchronously update the differences between consecutive frames, which reduces the interactivity between the terminal and the user and cannot meet users' needs.
  • the present application provides an image processing method, device, terminal, electronic device and storage medium.
  • the present application provides an image processing method, the method comprising: synthesizing an auxiliary stream image and its corresponding annotation information to generate a current frame synthesized image; detecting the current frame synthesized image and the previous frame synthesized image to determine difference information; encoding the current frame synthesized image based on the difference information to generate encoded data; sending the encoded data to a peer device so that the peer device processes the encoded data to obtain and display a decoded image including the annotation information corresponding to the auxiliary stream image.
  • the present application provides an image processing method, the method comprising: obtaining encoded data, which is the data sent by the image processing method in the first aspect; decoding the encoded data to obtain a decoded image, which is an image carrying an auxiliary stream image and its corresponding annotation information; and displaying the decoded image.
  • the present application provides an encoding device, which includes: a synthesis module, configured to synthesize an auxiliary stream image and its corresponding annotation information to generate a current frame synthesized image; a detection module, configured to detect the current frame synthesized image and the previous frame synthesized image to determine difference information; an encoding module, configured to encode the current frame synthesized image according to the difference information to generate encoded data; and a sending module, configured to send the encoded data to a peer device, so that the peer device processes the encoded data and obtains and displays a decoded image including the annotation information corresponding to the auxiliary stream image.
  • the present application provides a decoding device, comprising: an acquisition module, configured to acquire encoded data, the encoded data being the data sent by the image processing method in the first aspect; a decoding module, configured to decode the encoded data to obtain a decoded image, the decoded image being an image carrying an auxiliary stream image and its corresponding annotation information; and a display module, configured to display the decoded image.
  • the present application provides a terminal, comprising: an encoding device and/or a decoding device; the encoding device is configured to execute the image processing method in the first aspect of the present application; and the decoding device is configured to execute the image processing method in the second aspect of the present application.
  • the present application provides an image processing system, the image processing system comprising: a plurality of terminals connected in communication, the terminals being configured to implement any one of the image processing methods in the present application.
  • the present application provides an electronic device, comprising: one or more processors; a memory on which one or more programs are stored, and when the one or more programs are executed by one or more processors, the one or more processors implement any image processing method in the present application.
  • the present application provides a readable storage medium, which stores a computer program, and when the computer program is executed by a processor, any one of the image processing methods in the present application is implemented.
  • FIG. 1 is a schematic flow chart of an image processing method provided in the present application.
  • FIG. 2 is a schematic diagram showing the flow of image processing by the image synthesis device provided in the present application.
  • FIG. 3 is a schematic diagram showing the flow of detecting auxiliary stream images provided by the present application.
  • FIG. 4 is a schematic flow chart showing the image processing method provided by the present application.
  • FIG. 5 is a block diagram showing the composition of the image processing system provided by the present application.
  • FIG. 6 shows a block diagram of the image processing system provided by the present application.
  • FIG. 7 shows a block diagram of the image processing system provided by the present application.
  • FIG. 8 shows a schematic diagram of a display interface for auxiliary stream images provided in the present application.
  • FIG. 9 shows a block diagram of the encoding device provided in the present application.
  • FIG. 10 is a block diagram showing the composition of the decoding device provided in the present application.
  • FIG. 11 shows a block diagram of the components of the terminal provided in the present application.
  • FIG. 12 is a block diagram showing the composition of the image processing system provided by the present application.
  • FIG. 13 is a block diagram showing an exemplary hardware architecture of a computing device capable of implementing the image processing method and apparatus according to the present application.
  • FIG. 1 is a flow chart of an image processing method provided by the present application. The method can be applied to an encoding device. As shown in FIG. 1, the image processing method in the present application includes but is not limited to the following steps S101 to S104.
  • Step S101: synthesize the auxiliary stream image and its corresponding annotation information to generate a current frame synthesized image.
  • Step S102: detect the current frame synthesized image and the previous frame synthesized image to determine difference information.
  • the previous frame composite image is an image generated by synthesizing the previous frame image of the auxiliary stream image and the annotation information corresponding to the previous frame image of the auxiliary stream image.
  • Step S103: encode the current frame synthesized image according to the difference information to generate encoded data.
  • Step S104: send the encoded data to the peer device so that the peer device can obtain and display the decoded image including the annotation information corresponding to the auxiliary stream image.
  • the peer device is a device that can process the encoded data and obtain and display the decoded image including the annotation information corresponding to the auxiliary stream image.
  • the peer device can be a decoding device, a receiving terminal, or another such device.
  • the peer device can be chosen based on the actual application scenario; other unspecified peer devices are also within the scope of protection of this application and will not be repeated here.
  • by synthesizing the auxiliary stream image and its corresponding annotation information, the information annotated by the terminal on the auxiliary stream image can be clearly identified; detecting the current frame synthesized image and the previous frame synthesized image to determine the difference information lets the user synchronously obtain the difference between two consecutive frames, thereby improving the interactivity between the terminal and the user; encoding the current frame synthesized image based on the difference information to generate encoded data speeds up image encoding and reduces its energy consumption; and sending the encoded data to the peer device lets the peer device process the encoded data and obtain and display the decoded image including the annotation information corresponding to the auxiliary stream image, so that the peer device can view the decoded image with the annotation information and display the annotation information more clearly.
  • the synthesis of the auxiliary stream image and its corresponding annotation information to generate a current frame synthesized image in step S101 can be implemented in the following manner: based on multiple frame rates, obtaining the annotation information corresponding to the auxiliary stream image; processing the annotation information corresponding to the auxiliary stream image according to a preset container and a preset image format to generate an annotated image; and integrating the auxiliary stream image and the annotated image to generate a current frame synthesized image.
  • the annotation information corresponding to the auxiliary stream image may be information based on multiple frame rates in the form of point set data.
  • Frame rate refers to the number of frames or images shown or displayed per second.
  • Frame rate is mainly used to refer to the number of frames of an image played per second in the synchronized audio and/or image of a movie, television or video.
  • the frame rate may be 120 frames per second, or 24 frames per second (or 25 frames per second, 30 frames per second), etc.
  • the real-time change of the auxiliary stream image can be clarified, and then the annotation information corresponding to the auxiliary stream image is processed according to a preset container (such as a bitmap container, etc.) and a preset image format (such as an image format of a red green blue alpha (RGBA) color space; a YUV image format, etc.), so that the obtained annotated image can better reflect the real-time change characteristics and meet the user's needs.
  • the "Y” in the YUV image format represents brightness (Luminance or Luma), which is the grayscale value; while “U” and “V” represent chrominance (Chrominance or Chroma), which describes the color and saturation of the image and is used to specify the color of the pixel.
  • the auxiliary stream image and the annotated image are integrated (for example, by superimposition synthesis or differential synthesis) to generate a current frame composite image, which facilitates subsequent processing and improves image processing efficiency.
  • the auxiliary stream image and the annotated image are integrated to generate a current frame composite image, including: converting the image formats of the auxiliary stream image and the annotated image respectively to obtain a converted image set; scaling each image in the converted image set according to a preset image resolution to obtain a scaled image set; synchronizing each image in the scaled image set according to a preset frame rate to obtain a processed auxiliary stream image and a processed annotated image; and superimposing and synthesizing the processed auxiliary stream image and the processed annotated image to generate a current frame composite image.
  • the processed auxiliary stream images and the processed annotated images can be more conveniently superimposed and synthesized, thereby ensuring the accuracy of the superimposed images and improving the image processing efficiency.
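  • A sketch of the format-conversion and scaling steps just listed, assuming Pillow is available and that 1920x1080 stands in for the preset image resolution:

```python
from PIL import Image

PRESET_RESOLUTION = (1920, 1080)  # assumed preset image resolution

def normalize(image: Image.Image) -> Image.Image:
    """Convert a frame to a common image format (RGBA here) and stretch
    it to the preset resolution, mirroring the format conversion and
    scaling steps that precede superimposition."""
    return image.convert("RGBA").resize(PRESET_RESOLUTION)

# e.g. aux = normalize(Image.open("aux_frame.png"))  # auxiliary stream image (hypothetical file)
#      ann = normalize(Image.fromarray(annotated))   # annotated image from the earlier sketch
```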
  • FIG. 2 shows a schematic diagram of the process by which the image synthesis device provided by the present application processes an image.
  • the image synthesis device 200 includes but is not limited to the following modules: an annotation collector 201, a data conversion module 202, an auxiliary stream image acquisition module 203, an image format conversion module 204, an image scaling module 205, a frame rate synchronization module 206, and an image overlay module 207.
  • the annotation collector 201 is configured to collect annotation information and support the collection of annotation information at multiple frame rates, obtain annotation information presented in the form of point set data, or passively receive point set data pushed by an annotation source.
  • the auxiliary stream image acquisition module 203 is configured to acquire auxiliary stream images, support acquisition of auxiliary stream images of various frame rates, and support various image formats, and can actively acquire auxiliary stream images or passively receive auxiliary stream image data push.
  • the data conversion module 202 is configured to process the point set data, for example, to convert the point set data based on a preset container such as a bitmap into an annotated image; the synthesized annotated image supports output in preset image formats such as the image format of the RGBA color space and the YUV image format.
  • the image format conversion module 204 is configured to convert the format of the auxiliary stream image and the format of the annotated image into the same type of image format to avoid image synthesis failure caused by different image formats.
  • the image scaling module 205 is configured to stretch the auxiliary stream image and the annotated image to the same image resolution according to a preset image resolution, and is applied to the scenario where the resolution of the auxiliary stream image, the resolution of the annotated image and the resolution of the target image are inconsistent.
  • the frame rate synchronization module 206 is configured to synchronize the acquisition frequencies of the auxiliary stream image and the annotated image according to a preset frame rate, and control the frequency of the synthesized current frame synthesized image by dropping frames and/or inserting frames, thereby reducing the data processing pressure of the image synthesis device 200 and improving the efficiency and stability of image synthesis.
  • the image synthesis device 200 may process the input auxiliary stream image and annotation information in the following manner.
  • the auxiliary stream image is acquired by the auxiliary stream image acquisition module 203, and the annotation information is collected by the annotation collector 201. Then, the annotation information corresponding to the auxiliary stream image is processed by the data conversion module 202 according to the preset container and the preset image format to generate an annotated image.
  • the image format conversion module 204 performs image format conversion on the auxiliary stream image and the annotated image respectively to obtain a converted image set, wherein the converted image set includes: the auxiliary stream image after image format conversion and the annotated image after image format conversion.
  • the image scaling module 205 performs scaling processing on the auxiliary stream image after the image format conversion and the annotated image after the image format conversion, for example, according to the preset image resolution, the resolution of the auxiliary stream image after the image format conversion is adjusted to obtain the scaled auxiliary stream image; according to the preset image resolution, the annotated image after the image format conversion is adjusted to obtain the scaled annotated image. This ensures that the image resolutions of the scaled annotated image and the scaled auxiliary stream image are both the preset image resolutions, which facilitates the subsequent image processing.
  • the scaled annotated image and the scaled auxiliary stream image are synchronized by the frame rate synchronization module 206 to obtain a processed auxiliary stream image and a processed annotated image with the same frame rate (for example, both are preset frame rates).
  • each image in the scaled image set is synchronized according to a preset frame rate to obtain a processed auxiliary stream image and a processed annotated image, including: when it is determined that the actual frame rate of the images in the scaled image set is greater than the preset frame rate, frame dropping processing is performed on each image in the scaled image set based on a sampling method to obtain a processed auxiliary stream image and a processed annotated image; when it is determined that the actual frame rate of the images in the scaled image set is less than the preset frame rate, internal interpolation is used to process each image in the scaled image set to obtain a processed auxiliary stream image and a processed annotated image.
  • by processing auxiliary stream images and annotated images in different frame rate synchronization modes, the success rate of the superposition synthesis process can be increased and the image processing efficiency improved.
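  • A simplified sketch of this frame rate synchronization, where frame repetition stands in for internal interpolation (the actual interpolation method is not specified by the text):

```python
def synchronize(frames, actual_fps, preset_fps):
    """Resample a frame sequence to the preset frame rate.

    When the actual frame rate exceeds the preset rate, frames are
    dropped by sampling; when it is below the preset rate, frames are
    repeated as a crude stand-in for internal interpolation."""
    frames = list(frames)
    if actual_fps == preset_fps or not frames:
        return frames
    step = actual_fps / preset_fps
    count = round(len(frames) * preset_fps / actual_fps)
    return [frames[min(int(i * step), len(frames) - 1)] for i in range(count)]

# A 30 fps capture resampled to a preset 15 fps keeps every second frame:
assert synchronize(list(range(30)), 30, 15) == list(range(0, 30, 2))
```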
  • the image overlay module 207 is used to superimpose the processed auxiliary stream image and the processed annotated image to generate a current frame composite image.
  • the processed auxiliary stream image and the processed annotated image are superimposed and synthesized to generate a current frame synthesized image, including: using the processed auxiliary stream image as the background image, superimposing the annotation features in the processed annotated image onto the processed auxiliary stream image, and obtaining the current frame synthesized image.
  • the background of the processed annotated image is set to be fully transparent, so that only the annotation features in the processed annotated image are retained.
  • the annotation features in the processed annotated image are superimposed on the processed auxiliary stream image to obtain the current frame composite image.
  • the current frame composite image can have both the annotation features in the processed annotated image and the image features of the processed auxiliary stream image, thereby enriching the content of the current frame composite image.
  • the processed auxiliary stream image and the processed annotated image are superimposed and synthesized to generate a current frame synthesized image, including: processing the processed auxiliary stream image according to preset transparency information to obtain image features of the processed auxiliary stream image, wherein the image features of the processed auxiliary stream image match the annotated information;
  • the processed annotated image is used as the background image, and the image features of the processed auxiliary stream image are superimposed on the processed annotated image to obtain a current frame composite image.
  • processing the processed auxiliary stream image according to the preset transparency information yields image features of the processed auxiliary stream image that match the annotation information, so the characteristics of the annotation information can be represented and the processed auxiliary stream image can be processed further.
  • the image features of the processed auxiliary stream image are superimposed on the processed annotated image, so that the current frame synthetic image has both the annotated features in the processed annotated image and the image features of the processed auxiliary stream image.
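  • Both superimposition modes reduce to alpha compositing with the roles of background and foreground swapped; a sketch of the first mode, assuming both inputs have already been normalized to the same RGBA format and resolution as above:

```python
from PIL import Image

def superimpose(aux: Image.Image, ann: Image.Image) -> Image.Image:
    """First mode: the processed auxiliary stream image is the background
    and the annotation features are pasted on top. Wherever the annotated
    image is fully transparent, the auxiliary stream image shows through,
    so only the annotation strokes land on the composite. Both inputs must
    share the same size (guaranteed by the scaling step above)."""
    return Image.alpha_composite(aux.convert("RGBA"), ann.convert("RGBA"))
```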
  • the detection of the current frame composite image and the previous frame composite image to determine the difference information in step S102 can be implemented in the following manner: based on preset sizes, the current frame composite image and the pre-stored previous frame composite image are partitioned to obtain a first region image set corresponding to the current frame composite image and a second region image set corresponding to the previous frame composite image; based on the number of regions, the first region image and the second region image are compared to obtain the difference information.
  • the first region image set includes a plurality of first region images
  • the second region image set includes a plurality of second region images
  • the preset size can be a predefined minimum size for partitioning or blocking an image. For example, if the preset size is 16*16, the current frame composite image can be divided into multiple 16*16 first area images. At the same time, the previous frame composite image can also be divided into multiple 16*16 second area images. This allows the image to be divided in detail and the differences between different images to be more prominent.
  • the number of regions in the first region image set is the same as the number of regions in the second region image set, which can facilitate block-by-block comparison of the region images in the two region image sets, thereby making the obtained difference information more accurate.
  • the difference information includes: at least one difference region.
  • Encoding the current frame synthesized image according to the difference information to generate encoded data includes: determining difference contour information according to the at least one difference region; cropping the current frame synthesized image according to the difference contour information to obtain a changed region image; encoding the changed region image to generate encoded data.
  • the difference region is used to characterize an image region where the image features of the first region image are different from the image features of the second region image, and can accurately measure the difference between the two frames of images, making it convenient to process the current frame synthesized image.
  • At least one difference area is merged to the maximum extent to obtain difference contour information, and the image boundary with difference is clarified, so as to crop the current frame synthetic image based on the difference contour information to obtain the image change information that only includes the difference information.
  • the encoded data can reflect the difference between the previous and next two frames of image, thereby improving the encoding speed of the current frame synthesized image.
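  • A numpy sketch of the block-based detection and cropping described above, assuming the two composite frames have identical shapes; merging the difference regions into their circumscribed rectangle is one plausible reading of "merged to the maximum extent":

```python
import numpy as np

BLOCK = 16  # preset size for partitioning, e.g. 16*16

def diff_regions(curr: np.ndarray, prev: np.ndarray):
    """Compare the current and previous frame composite images block by
    block and return the changed blocks as (y0, x0, y1, x1) regions."""
    regions = []
    h, w = curr.shape[:2]
    for y in range(0, h, BLOCK):
        for x in range(0, w, BLOCK):
            a = curr[y:y + BLOCK, x:x + BLOCK]
            b = prev[y:y + BLOCK, x:x + BLOCK]
            if not np.array_equal(a, b):
                regions.append((y, x, min(y + BLOCK, h), min(x + BLOCK, w)))
    return regions

def crop_changed(curr: np.ndarray, prev: np.ndarray):
    """Merge all difference regions into their circumscribed rectangle
    (the difference contour) and crop the current frame to it. Returns
    None when the frames are identical, in which case the frame can be
    skipped rather than encoded."""
    regions = diff_regions(curr, prev)
    if not regions:
        return None
    y0 = min(r[0] for r in regions); x0 = min(r[1] for r in regions)
    y1 = max(r[2] for r in regions); x1 = max(r[3] for r in regions)
    return curr[y0:y1, x0:x1], (y0, x0, y1, x1)
```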
  • FIG. 3 shows a schematic diagram of a process for detecting auxiliary stream images provided by the present application.
  • the input image of the region detection device 300 is an auxiliary stream annotated image F1, which is a processed image obtained after being processed by the image synthesis device 200, and can simultaneously reflect the features of the auxiliary stream image and the annotated image.
  • after acquiring the auxiliary stream annotated image F1, the region detection device 300 performs block (or partition) processing on it to obtain a first region image set, where the first region image set includes a plurality of first region images.
  • by dividing the auxiliary stream annotated image F1 into blocks (or partitions), local information of the image can be reflected, so as to facilitate subsequent comparison of features of different local images and realize detection of changed areas.
  • the region detection device 300 also pre-stores a second region image set, which includes multiple second region images.
  • the second region image set is an image set obtained by performing block (or partition) processing on the previous frame synthetic image, which can reflect the image features in different regions of the previous frame synthetic image.
  • when difference information (for example, different feature information in a certain area) is detected, the image block in that area is cached and the area where the image block is located is recorded.
  • the process of caching image blocks can be performed synchronously by multiple threads or by scanning the image blocks with differences line by line.
  • the contour of the image with differences can be extracted (for example, the circumscribed rectangular contour of the image block is extracted, etc.), and then based on the contour, the image within the contour is cropped to generate a difference image corresponding to the difference information.
  • the difference image and the auxiliary stream annotated image F1 are both input to the encoding module 310 for encoding, so that the encoded data can be obtained quickly and accurately.
  • the image of the changed area within the contour is cropped to obtain difference information, thereby improving the accuracy of judging the difference changes of the auxiliary stream image.
  • the method further includes: skipping the current frame synthetic image when it is determined that the difference information indicates that there is no difference between the current frame synthetic image and the previous frame synthetic image.
  • sending the encoded data to the peer device includes: sending the encoded data to the peer device through a first channel; and, after sending the encoded data to the peer device, sending annotation data corresponding to the annotation information to the peer device through a second channel.
  • the annotation data corresponding to the annotation information may be data obtained by packaging the annotation information and complying with the transmission rules of the second channel.
  • the annotation information is represented by binary data, and a data packet header (e.g., a data packet header representing information such as the network address of the peer device) is added in front of the binary data, thereby obtaining the annotation data corresponding to the annotation information.
  • sending the encoded data and the annotation data corresponding to the annotation information to the peer device through different transmission channels facilitates the peer device's processing of different data, so that the peer device can analyze the obtained encoded data more quickly, thereby improving data processing efficiency.
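  • An illustrative sketch of this dual-channel sending: the header layout below (a peer identifier plus a point count) is an assumption, not the patent's packet format, and the channel objects are assumed to expose a sendall()-style interface:

```python
import struct

def pack_annotation(points, peer_id: int) -> bytes:
    """Serialize the annotation point set as binary data and prepend a
    simple packet header. The '!IH' header (peer id, point count) is a
    hypothetical layout chosen for the sketch."""
    header = struct.pack("!IH", peer_id, len(points))
    payload = b"".join(struct.pack("!HH", x, y) for x, y in points)
    return header + payload

def send_frame(first_channel, second_channel, encoded_data: bytes, points):
    first_channel.sendall(encoded_data)                          # encoded composite image
    second_channel.sendall(pack_annotation(points, peer_id=1))   # annotation data
```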
  • FIG. 4 is a flow chart of the image processing method provided by the present application. The method can be applied to a decoding device. As shown in FIG. 4, the image processing method in the embodiment of the present application includes but is not limited to the following steps S401 to S404.
  • Step S401: obtain encoded data.
  • the encoded data is data sent by the peer device (such as an encoding device) and encoded by any image processing method in the present application.
  • the encoded data is data obtained by the encoding device by encoding the current frame synthetic image based on the difference information
  • the difference information is information obtained by the encoding device by detecting the current frame synthetic image and the previous frame synthetic image
  • the current frame synthetic image is an image synthesized by the encoding device on the auxiliary stream image and its corresponding annotation information.
  • Step S402: decode the encoded data to obtain a decoded image.
  • the decoded image is an image that carries the auxiliary stream image and its corresponding annotation information.
  • the decoding device since the encoded data sent by the encoding device is data encoded by any image processing method in the present application, that is, the encoded data already carries the annotation information and the auxiliary stream image, the decoding device only needs to perform corresponding decoding on the encoded data, thereby ensuring that the decoded image includes the characteristics of the auxiliary stream image and the characteristics of the annotation information corresponding to the auxiliary stream image.
  • the decoding method used by the decoding device to decode the encoded data matches the encoding method of the encoded data to ensure that an accurate decoded image is obtained.
  • the encoding device can use a specific compression technology to encode the current frame synthetic image based on the difference information to obtain encoded data, and the decoding device needs to use the same compression technology to decode the encoded data so that the obtained decoded image can simultaneously include the characteristics of the auxiliary stream image and the characteristics of the annotation information corresponding to the auxiliary stream image.
  • Step S403: superimpose the decoded image and the previous frame synthesized image to generate an image to be displayed.
  • the decoded image includes an auxiliary stream image and annotation information, and the decoded image can reflect the features of the annotation information and the auxiliary stream image.
  • the decoded image is superimposed with the previous frame composite image to generate an image to be displayed, so that the image to be displayed can reflect the annotation information.
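  • One plausible realization, continuing the detection sketch above: if the encoded data carries only the cropped changed region and its recorded location, the decoder can patch it onto its cached previous composite to produce the image to be displayed:

```python
import numpy as np

def apply_decoded_patch(prev_frame: np.ndarray, patch: np.ndarray, rect):
    """Superimpose the decoded (cropped) difference image onto the
    previous frame composite image at its recorded region, producing the
    image to be displayed. `rect` uses the (y0, x0, y1, x1) convention
    from the detection sketch above."""
    y0, x0, y1, x1 = rect
    display = prev_frame.copy()
    display[y0:y1, x0:x1] = patch
    return display
```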
  • Step S404: display the image to be displayed.
  • before performing step S404 of displaying the image to be displayed, the method further includes: rendering the image to be displayed to obtain a rendered image to be displayed.
  • the surface shading effect of the image to be displayed can be reflected intuitively and in real time, thereby showing the texture characteristics of the image to be displayed and the influence of the light source on the image to be displayed, so that the user can also view the rendered image to be displayed, thereby improving the user's viewing experience.
  • in this method, the coded data is data sent by the coding device and encoded by any one of the image processing methods in the present application, which facilitates subsequent processing; the coded data is decoded to obtain a decoded image carrying the auxiliary stream image and its corresponding annotation information, so that the decoded image can reflect the annotation information and the characteristics of the auxiliary stream image; and the decoded image is superimposed with the previous frame composite image to generate the image to be displayed, which is then displayed, so that the displayed image reflects the annotation information.
  • obtaining the encoded data in step S401 includes: receiving the encoded data through a first channel, wherein the encoded data is data corresponding to a synthesized image of the current frame, and the synthesized image of the current frame is an image synthesized from an auxiliary stream image and its corresponding annotation information; and, before decoding the encoded data to obtain the decoded image, the method also includes: receiving annotation data corresponding to the annotation information through a second channel.
  • the annotation data is data corresponding to the annotation information.
  • the annotation data can be represented by binary data and is used to represent the annotation information.
  • in this way, the specific meaning of the annotation information corresponding to the auxiliary stream image can be clarified, facilitating subsequent analysis of the data and improving data processing efficiency; and by separately handling the data transmitted in different channels, different types of data can be processed, improving the accuracy of data processing.
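  • The receiving side of the hypothetical packet format sketched earlier could be parsed as follows (again, the header layout is an assumption for illustration):

```python
import struct

def unpack_annotation(packet: bytes):
    """Inverse of pack_annotation() above: strip the assumed 6-byte
    header and recover the annotation point set from the payload."""
    peer_id, count = struct.unpack_from("!IH", packet, 0)
    points = [struct.unpack_from("!HH", packet, 6 + 4 * i) for i in range(count)]
    return peer_id, points
```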
  • FIG. 5 is a block diagram of the image processing system provided by the present application. As shown in FIG. 5, a first terminal 510 is communicatively connected to a second terminal 520 (e.g., via the Internet or a communication network).
  • the first terminal 510 includes: an image synthesis device 511, a region detection device 512, an encoding module 513 and an auxiliary stream data sending module 514.
  • the second terminal 520 includes: a receiving module 521, a decoding module 522 and an image rendering module 523. The functions of each module can refer to the description in the above embodiment.
  • the image synthesis device 511 can simultaneously obtain the annotation information and the auxiliary stream image, process the annotation information, generate the annotation image, and then synthesize the annotation image with the auxiliary stream image to generate the current frame synthesized image, so that the current frame synthesized image can simultaneously reflect the image features of the auxiliary stream image and the features corresponding to the annotation information.
  • the superimposed auxiliary stream image output by the decoding module 522 can also reflect the characteristics of the annotation information; however, without the original annotation data, the finally obtained image cannot represent those characteristics accurately and clearly.
  • FIG. 6 shows a block diagram of the image processing system provided by the present application. As shown in FIG. 6, a first terminal 610 is communicatively connected to a second terminal 620 (e.g., via the Internet or a communication network).
  • the first terminal 610 includes: an image synthesis device 611, a region detection device 612, an encoding module 613, an auxiliary stream data sending module 614 and an annotation information sending module 615.
  • the second terminal 620 includes: a receiving module 621, a decoding module 622, an image rendering module 623 and an annotation information receiving module 624. The functions of each module can refer to the description in the above embodiment.
  • the image synthesis device 611 can simultaneously obtain the annotation information and the auxiliary stream image, process the annotation information, generate the annotation image, and then synthesize the annotation image with the auxiliary stream image to generate the current frame synthesized image, so that the current frame synthesized image can simultaneously reflect the image features of the auxiliary stream image and the features corresponding to the annotation information.
  • annotation information sending module 615 can also obtain annotation information and send the annotation information to the second terminal 620 to facilitate the second terminal 620 to analyze the superimposed auxiliary stream image output by the decoding module 622, so that the image input to the image rendering module 623 can clearly and accurately reflect the characteristics of the annotation information.
  • the terminals can all support decoding of the annotation information, so that the user can obtain the characteristics of the annotation information.
  • FIG. 7 shows a block diagram of the image processing system provided by the present application.
  • the first terminal 710 is communicatively connected to the second terminal 720 and the third terminal 730 (e.g., via the Internet or a communication network).
  • the first terminal 710 includes: an image synthesis device 711, a region detection device 712, an encoding module 713, an auxiliary stream data sending module 714, and an annotation information sending module 715.
  • the second terminal 720 includes: a receiving module 721, a decoding module 722, an image rendering module 723, and an annotation information receiving module 724.
  • the third terminal 730 includes: a receiving module 731, a decoding module 732, and an image rendering module 733. The functions of each module can refer to the description in the above embodiment.
  • the encoding module 713 is responsible for converting the image output by the area detection device 712 into a suitable encoded form for transmission; the auxiliary stream data sending module 714 and the annotation information sending module 715 then transmit the data to the second terminal 720 (or the third terminal 730) through a wired communication network or a wireless communication network (such as an optical network composed of optical fibers).
  • the image data processing may be implemented in the following manner.
  • the image synthesis device 711 can obtain the annotation information and the auxiliary stream image at the same time, process the annotation information to generate the annotated image, and then synthesize the annotated image with the auxiliary stream image to generate the current frame synthesized image, so that the current frame synthesized image can simultaneously reflect the image features of the auxiliary stream image and the features corresponding to the annotation information.
  • the annotation information is a series of annotated point set data generated by the annotation source.
  • the area detection device 712 performs difference detection on different areas of the input current frame composite image to obtain difference information, and inputs both the difference information and the auxiliary stream image into the encoding module 713 for encoding, generates encoded data, and outputs the encoded data to the auxiliary stream data sending module 714, so that the auxiliary stream data sending module 714 sends the obtained encoded data to the second terminal 720 (and/or the third terminal 730) through the communication network, so that the second terminal 720 and/or the third terminal 730 can obtain the encoded data synchronized with the auxiliary stream image and annotation information.
  • the region detection device 712 may obtain a first region image set including a plurality of first region images by dividing the input current frame synthetic image into blocks, and then compare the plurality of first region images with a plurality of second region images cached therein to obtain the changed region information.
  • the plurality of second region images are images obtained by dividing the previous frame synthetic image into blocks by the region detection device 712.
  • the second terminal 720 and/or the third terminal 730 will decode the encoded data, but the difference is that the second terminal 720 can also obtain the original annotation information at the same time to facilitate its analysis of the encoded data and obtain accurate auxiliary stream images and annotation information.
  • the annotated image and the auxiliary stream image are superimposed and synthesized, which ensures the content consistency of the synthesized current frame image; by comparing the differences between adjacent image frames, the range of image superposition corresponding to the annotation information can be limited, improving the speed of image synthesis. This meets the needs of users in different application scenarios, improves product competitiveness, and solves the problem of inconsistent interaction with auxiliary stream content between the first terminal 710, which supports annotation, and the third terminal 730, which does not.
  • FIG. 8 shows a schematic diagram of a display interface for auxiliary stream images provided by the present application.
  • (A) in FIG. 8 represents a display interface of an auxiliary stream image with annotation information sent by the first terminal 710, or a display interface of an auxiliary stream image with annotation information displayed by the second terminal 720.
  • (B) in FIG. 8 shows a display interface in the prior art in which a terminal displays only an auxiliary stream image (i.e., an auxiliary stream image without annotation information).
  • (C) in FIG. 8 shows a display interface of the auxiliary stream image displayed by the third terminal 730.
  • in this way, the display status of the annotation information can be clearly presented, which facilitates user viewing and improves the user experience.
  • FIG. 9 shows a block diagram of the encoding device provided by the present application. As shown in FIG. 9, in one embodiment, the encoding device 900 includes but is not limited to the following modules.
  • the synthesis module 901 is configured to synthesize the auxiliary stream image and its corresponding annotation information to generate a current frame synthesized image;
  • the detection module 902 is configured to detect the current frame synthesized image and the previous frame synthesized image to determine the difference information;
  • the encoding module 903 is configured to encode the current frame synthesized image according to the difference information to generate encoded data;
  • the sending module 904 is configured to send the encoded data to the peer device so that the peer device processes the encoded data to obtain and display a decoded image including the annotation information corresponding to the auxiliary stream image.
  • the encoding device 900 in this embodiment can implement any image processing method applied to the encoding device in the embodiments of the present application.
  • the synthesis module synthesizes the auxiliary stream image and its corresponding annotation information to generate the current frame synthetic image, which can clarify the information of the terminal annotating the auxiliary stream image;
  • the detection module detects the current frame synthetic image and the previous frame synthetic image to determine the difference information, so that the user can synchronously obtain the difference information between two consecutive frames, thereby improving the interactivity between the terminal and the user;
  • the encoding module encodes the current frame synthesized image according to the difference information to generate encoded data, which can speed up image encoding and reduce its energy consumption; and the sending module sends the encoded data to the peer device, so that the peer device processes the encoded data and obtains and displays the decoded image including the annotation information corresponding to the auxiliary stream image, allowing the peer device to view the decoded image with the annotation information and display the annotation information more clearly.
  • FIG. 10 shows a block diagram of a decoding device provided by the present application. As shown in FIG. 10, in one embodiment, the decoding device 1000 includes but is not limited to the following modules.
  • the acquisition module 1001 is configured to acquire encoded data, which is data sent after encoding by any image processing method applied to the encoding device in the present application; the decoding module 1002 is configured to decode the encoded data to obtain a decoded image, which is an image carrying an auxiliary stream image and its corresponding annotation information; the generation module 1003 is configured to superimpose the decoded image and the previous frame composite image to generate an image to be displayed; and the display module 1004 is configured to display the image to be displayed.
  • the decoding device 1000 in this embodiment can implement any image processing method applied to a decoding device in the embodiments of the present application.
  • in the decoding device of this implementation of the present application, using the acquisition module to obtain the encoded data makes it possible to clarify the processing requirements for the encoded data, where the encoded data is the data sent by the encoding device after encoding with any one of the image processing methods in the present application, which facilitates subsequent processing; the encoded data is decoded to obtain and display a decoded image carrying the auxiliary stream image and its corresponding annotation information, so that the decoded image reflects the characteristics of the annotation information and the auxiliary stream image, which is convenient for users.
  • FIG. 11 is a block diagram showing a terminal provided by the present application.
  • the terminal 1100 includes but is not limited to the following modules: an encoding device 1101 and/or a decoding device 1102 .
  • (A) in FIG. 11 indicates that the terminal 1100 includes only the encoding device 1101 ; (B) in FIG. 11 indicates that the terminal 1100 includes only the decoding device 1102 ; and (C) in FIG. 11 indicates that the terminal 1100 includes the encoding device 1101 and the decoding device 1102 .
  • the encoding device 1101 is configured to execute any image processing method applied to an encoding device in the embodiments of the present application.
  • the decoding device 1102 is configured to execute any image processing method applied to a decoding device in the embodiments of the present application.
  • the terminal 1100 may be a terminal supporting audio/video conferencing functions (such as a smart phone, etc.), or a tablet computer supporting online teaching (or a personal computer, etc.).
  • the above terminal categories are only examples, and specific settings can be made according to actual needs. Other unspecified terminal categories are also within the scope of protection of this application and will not be repeated here.
  • the auxiliary stream image and its corresponding annotation information are synthesized by the encoding device to generate the current frame synthesized image, which can clarify the information annotated by the terminal on the auxiliary stream image; the current frame synthesized image and the previous frame synthesized image are detected to determine the difference information, so that the user can synchronously obtain the difference information between two consecutive frames, thereby improving the interactivity between the terminal and the user; and the current frame synthesized image is encoded according to the difference information to generate encoded data, which speeds up image encoding and reduces its energy consumption.
  • the encoded data and its corresponding annotation information are obtained by the decoding device, which can clarify the processing requirements of the encoded data.
  • the encoded data is the data obtained by the encoding device encoding the current frame synthesized image according to the difference information.
  • the difference information is the information obtained by the encoding device detecting the current frame synthesized image and the previous frame synthesized image, which enables the user to synchronously obtain the difference information between two consecutive frames, thereby improving the interactivity between the terminal and the user.
  • the encoded data is decoded to obtain the image to be analyzed, thereby speeding up the processing of the image to be analyzed.
  • the image to be analyzed is processed according to the annotation information corresponding to the encoded data to obtain a decoded image, so that the decoded image can reflect the characteristics of the annotation information and the auxiliary stream image, which is convenient for users.
  • FIG. 12 shows a block diagram of the image processing system provided by the present application.
  • the image processing system includes a plurality of terminals connected in communication; wherein the terminals can implement any one of the image processing methods in the embodiments of the present application.
  • the image processing system includes but is not limited to the following devices: at least one sending terminal 1201, and at least one first receiving terminal 1202 and/or second receiving terminal 1203 communicatively connected to it.
  • (A) in FIG. 12 indicates that the image processing system includes: a sending terminal 1201 and a first receiving terminal 1202 that are communicatively connected; (B) in FIG. 12 indicates that the image processing system includes: a sending terminal 1201 and a second receiving terminal 1203 that are communicatively connected; (C) in FIG. 12 indicates that the image processing system includes: a sending terminal 1201, and a first receiving terminal 1202 and a second receiving terminal 1203 that are each communicatively connected to the sending terminal 1201.
  • the sending terminal 1201 is configured to execute any one of the image processing methods applied to an encoding device in the embodiments of the present application.
  • the first receiving terminal 1202 is configured to execute any one of the image processing methods applied to a decoding device in the embodiments of the present application.
  • the second receiving terminal 1203 is configured to obtain the encoded data sent by the sending terminal 1201, decode it, and obtain and display a decoded image including the annotation information corresponding to the auxiliary stream image, wherein the encoded data is data obtained by the encoding device by encoding the current frame synthesized image according to the difference information, the difference information is information obtained by the encoding device by detecting the current frame synthesized image and the previous frame synthesized image, and the current frame synthesized image is an image synthesized by the encoding device from the auxiliary stream image and its corresponding annotation information.
  • the auxiliary stream image and its corresponding annotation information are synthesized by the sending terminal to generate the current frame synthesized image, which can clearly identify the information annotated by the sending terminal on the auxiliary stream image; the current frame synthesized image and the previous frame synthesized image are detected to determine the difference information, so that the user can synchronously obtain the difference information between two consecutive frames, thereby improving the interactivity between the terminal and the user; and the current frame synthesized image is encoded according to the difference information to generate encoded data, which speeds up image encoding and reduces its energy consumption.
  • different receiving terminals can receive the encoded data and process the encoded data to obtain and display the decoded image including the annotation information corresponding to the auxiliary stream image, so that the first receiving terminal and/or the second receiving terminal can view the decoded image with the annotation information, so as to display the annotation information more clearly.
  • FIG. 13 is a block diagram showing an exemplary hardware architecture of a computing device capable of implementing the image processing method and apparatus according to the present application.
  • the computing device 1300 includes an input device 1301, an input interface 1302, a central processing unit 1303, a memory 1304, an output interface 1305, and an output device 1306.
  • the input interface 1302, the central processing unit 1303, the memory 1304, and the output interface 1305 are interconnected via a bus 1307.
  • the input device 1301 and the output device 1306 are connected to the bus 1307 through the input interface 1302 and the output interface 1305, respectively, and are thereby connected to the other components of the computing device 1300.
  • the input device 1301 receives input information from the outside and transmits the input information to the central processing unit 1303 through the input interface 1302; the central processing unit 1303 processes the input information based on the computer executable instructions stored in the memory 1304 to generate output information, temporarily or permanently stores the output information in the memory 1304, and then transmits the output information to the output device 1306 through the output interface 1305; the output device 1306 outputs the output information to the outside of the computing device 1300 for user use.
  • the computing device shown in FIG. 13 can be implemented as an electronic device, which may include: a memory configured to store a program; and a processor configured to run the program stored in the memory to execute the image processing method described in the above embodiments.
  • the computing device shown in FIG. 13 can also be implemented as an image processing system, which may include: a memory configured to store a program; and a processor configured to run the program stored in the memory to execute the image processing method described in the above embodiments.
  • Embodiments of the present application may be implemented by a data processor of a mobile device executing computer program instructions, for example in a processor entity, or by hardware, or by a combination of software and hardware.
  • the computer program instructions may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages.
  • the block diagrams of any logic flow in the drawings of this application may represent program steps, or may represent interconnected logic circuits, modules and functions, or may represent a combination of program steps and logic circuits, modules and functions.
  • the computer program may be stored in a memory.
  • the memory may be of any type suitable for the local technical environment and may be implemented using any suitable data storage technology, including, but not limited to, read-only memory (ROM), random access memory (RAM), and optical storage devices and systems (digital versatile discs (DVD) or CD discs).
  • Computer-readable media may include non-transitory storage media.
  • the data processor may be of any type suitable for the local technical environment, such as, but not limited to, a general-purpose computer, a special-purpose computer, a microprocessor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a processor based on a multi-core processor architecture.
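
To make the encode-side flow described above concrete, the following is a minimal Python sketch (assuming numpy is available): annotation pixels are composited onto the auxiliary stream image to form the current frame synthesized image, the synthesized frame is compared block by block with the previous synthesized frame to obtain the difference information, and only the changed blocks are handed to the encoder. The block size, the threshold, and all function names here are illustrative assumptions, not the implementation specified by this application.

```python
import numpy as np

BLOCK = 16        # block size in pixels (illustrative assumption)
THRESHOLD = 2.0   # mean absolute difference counted as "changed" (assumption)

def synthesize(aux_frame, annotation, mask):
    """Composite annotation pixels onto the auxiliary stream image,
    producing the current frame synthesized image."""
    out = aux_frame.copy()
    out[mask] = annotation[mask]   # boolean mask marks annotated pixels
    return out

def changed_blocks(curr, prev):
    """Detect the difference information: coordinates of blocks whose
    content differs between the current and previous synthesized frames."""
    coords = []
    h, w = curr.shape[:2]
    for y in range(0, h, BLOCK):
        for x in range(0, w, BLOCK):
            a = curr[y:y + BLOCK, x:x + BLOCK].astype(np.int16)
            b = prev[y:y + BLOCK, x:x + BLOCK].astype(np.int16)
            if np.abs(a - b).mean() > THRESHOLD:
                coords.append((y, x))
    return coords

def encode(curr, coords):
    """Stand-in encoder: pack only the changed blocks, so the difference
    information limits how much data must actually be compressed."""
    return [(y, x, curr[y:y + BLOCK, x:x + BLOCK].copy()) for y, x in coords]

# Per-frame usage:
#   curr = synthesize(aux, note, mask)
#   payload = encode(curr, changed_blocks(curr, prev))
```

Skipping unchanged blocks is what lets the difference information shorten the encoding work, which is the speed and energy benefit referred to in the list above.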
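On the receive side, a matching sketch (under the same illustrative assumptions) rebuilds the current frame by applying the changed blocks to the previously decoded frame, yielding the decoded image, annotation included, that the first and/or second receiving terminal displays.

```python
def decode(prev_decoded, payload):
    """Apply the changed blocks carried in the encoded payload onto the
    previous decoded frame to rebuild the current synthesized frame."""
    out = prev_decoded.copy()
    for y, x, block in payload:
        out[y:y + block.shape[0], x:x + block.shape[1]] = block
    return out

# A first frame can be bootstrapped from a full-frame payload; afterwards
# each call needs only the difference blocks:
#   frame = decode(frame, payload)
```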

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

Provided in the present application are an image processing method and apparatus, and a terminal. The method comprises: synthesizing an auxiliary stream image and annotation information corresponding thereto, so as to generate a current frame synthesized image; performing detection on the current frame synthesized image and the previous frame synthesized image, so as to determine difference information; encoding the current frame synthesized image according to the difference information, so as to generate encoded data; and sending the encoded data to a peer device, so that the peer device processes the encoded data, so as to obtain and display a decoded image including the annotation information corresponding to the auxiliary stream image.
PCT/CN2023/105927 2022-10-11 2023-07-05 Image processing method and apparatus, and terminal WO2024078064A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211239520.9 2022-10-11
CN202211239520.9A CN117915022A (zh) 2022-10-11 2022-10-11 Image processing method, apparatus and terminal

Publications (1)

Publication Number Publication Date
WO2024078064A1 true WO2024078064A1 (fr) 2024-04-18

Family

ID=90668671

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/105927 WO2024078064A1 (fr) 2022-10-11 2023-07-05 Image processing method and apparatus, and terminal

Country Status (2)

Country Link
CN (1) CN117915022A (fr)
WO (1) WO2024078064A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070183679A1 (en) * 2004-02-05 2007-08-09 Vodafone K.K. Image processing method, image processing device and mobile communication terminal
JP2012156820A (ja) * 2011-01-27 2012-08-16 Nippon Telegr & Teleph Corp <Ntt> Video communication system and operating method thereof
CN103281539A (zh) * 2013-06-07 2013-09-04 华为技术有限公司 Image encoding and decoding processing method, apparatus and terminal
CN105744281A (zh) * 2016-03-28 2016-07-06 飞依诺科技(苏州)有限公司 Method and apparatus for processing continuous images
CN106791937A (zh) * 2016-12-15 2017-05-31 广东威创视讯科技股份有限公司 Video image annotation method and system
CN114419502A (zh) * 2022-01-12 2022-04-29 深圳力维智联技术有限公司 Data analysis method, apparatus and storage medium


Also Published As

Publication number Publication date
CN117915022A (zh) 2024-04-19

Similar Documents

Publication Publication Date Title
US11200426B2 (en) Video frame extraction method and apparatus, computer-readable medium
US11729465B2 (en) System and method providing object-oriented zoom in multimedia messaging
CN112565627B (zh) Design method for centralized display of multi-channel videos based on bitmap overlay
US10720091B2 (en) Content mastering with an energy-preserving bloom operator during playback of high dynamic range video
EP3751862A1 (fr) Display method and device, television set, and storage medium
US20120013717A1 (en) Moving picture decoding method, moving picture decoding program, moving picture decoding apparatus, moving picture encoding method, moving picture encoding program, and moving picture encoding apparatus
CN110868625A (zh) Video playback method and apparatus, electronic device, and storage medium
CN102474639A (zh) Transforming video data in accordance with human visual system feedback metrics
CN108235055B (zh) Method and device for implementing transparent video in an AR scene
TWI539790B (zh) Apparatus, method and software product for generating and reconstructing a video stream
US20230325987A1 (en) Tone mapping method and apparatus
WO2023138226A1 (fr) Sending card and receiving card for a display system, display control method, and storage medium
CN111479154A (zh) Device and method for realizing audio-and-picture synchronization, and computer-readable storage medium
US8854435B2 (en) Viewpoint navigation
JP2012522285A (ja) System and format for encoding data and three-dimensional rendering
WO2024078064A1 (fr) Image processing method and apparatus, and terminal
WO2011134373A1 (fr) Method, device and system for synchronous transmission of multi-channel videos
CN111406404B (zh) Compression method, decompression method, system and storage medium for obtaining video files
CN111970564B (zh) Optimization method and apparatus for HDR video display processing, storage medium and terminal
CN116962805A (zh) Video synthesis method and apparatus, electronic device, and readable storage medium
CN116962742A (zh) Video image data transmission method and apparatus for live streaming, and live streaming system
WO2015132957A1 (fr) Video device and video processing method
KR101927865B1 (ko) Method for providing video augmented reality service, set-top box, and computer program
US11706394B1 (en) Method of real-time variable refresh rate video processing for ambient light rendering and system using the same
CN115118922B (zh) Method and apparatus for inserting animated images into a composited screen of real-time video in a cloud conference

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23876272

Country of ref document: EP

Kind code of ref document: A1