CN110582021A - Information processing method and device, electronic equipment and storage medium - Google Patents

Information processing method and device, electronic equipment and storage medium

Info

Publication number
CN110582021A
Authority
CN
China
Prior art keywords
rendering
data frame
target detection
real
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910920031.1A
Other languages
Chinese (zh)
Other versions
CN110582021B (en)
Inventor
袁瑞
郭治姣
潘逸雯
吴军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN201910920031.1A
Publication of CN110582021A
Application granted
Publication of CN110582021B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present disclosure relates to an information processing method and apparatus, an electronic device, and a storage medium. The method includes: acquiring at least one data frame included in multimedia information to be played; rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame, to obtain at least one data frame after real-time rendering; and playing the at least one data frame after real-time rendering. The embodiments of the disclosure can improve the efficiency of rendering multimedia information.

Description

Information processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer vision technologies, and in particular, to an information processing method and apparatus, an electronic device, and a storage medium.
Background
Currently, the field of computer vision includes video understanding technology: target detection can be performed on video information so that the video information is interpreted automatically. For example, face recognition or gesture recognition may be performed on video information to determine whether it contains a face or a gesture, which can help users extract more useful information from the video.
However, there is as yet no suitable solution for rendering onto the video information the detection results obtained through video understanding.
Disclosure of Invention
The present disclosure proposes an information processing technical solution.
According to an aspect of the present disclosure, there is provided an information processing method including:
acquiring at least one data frame included in multimedia information to be played; rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame to obtain at least one data frame after real-time rendering; and playing the at least one data frame after real-time rendering.
In a possible implementation manner, before the obtaining at least one data frame included in the multimedia information to be played, the method further includes:
acquiring multimedia information to be played; analyzing the multimedia information to obtain at least one media stream; and decoding the at least one media stream according to the coding format of each media stream to obtain at least one data frame included in each media stream. Therefore, the data frame of at least one media stream can be obtained by analyzing the multimedia information, so that each data frame can be flexibly processed, and the diversified requirements of users can be met.
In one possible implementation, the at least one data frame includes one or more of an audio frame, a video frame, and a text frame. In this way, real-time rendering of multiple data frame types may be supported.
In a possible implementation manner, the rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame to obtain the at least one data frame after being rendered in real time includes:
obtaining a target detection result obtained by performing target detection on the at least one data frame; based on the target detection result, obtaining rendering information for rendering the at least one data frame in real time; and performing real-time rendering on the at least one data frame by using the rendering information to obtain at least one data frame after real-time rendering. Therefore, rendering information corresponding to each target detection result can be looked up according to that result, and the data frame can be rendered in real time using the rendering information.
In a possible implementation manner, the obtaining rendering information for rendering the at least one data frame in real time based on the target detection result includes:
identifying the target detection result according to the data format of the target detection result to obtain an identified target detection result; and acquiring rendering information for rendering the at least one data frame in real time based on the identified target detection result. Therefore, the electronic device can recognize target detection results in multiple data formats, avoiding the failure that occurs when only a single data format can be recognized.
In a possible implementation manner, the rendering the at least one data frame in real time by using the rendering information to obtain the at least one data frame after real-time rendering includes:
Determining a rendering position for performing real-time rendering on the at least one data frame according to the target detection result; and rendering the rendering position of the at least one data frame in real time by using the rendering information to obtain at least one data frame after real-time rendering. Therefore, the position of the human face in the video frame can be obtained by analyzing the target detection result, and then the position can be used as a rendering position for performing real-time rendering on the video frame.
In a possible implementation manner, in a case that the at least one data frame includes a video frame, the rendering the rendering position of the at least one data frame in real time by using the rendering information to obtain the at least one data frame after real-time rendering includes:
Determining a target pixel value corresponding to the rendering information; and rendering the pixel points of the rendering position into the target pixel values in real time according to the target pixel values corresponding to the rendering information to obtain the video frames rendered in real time. Therefore, a target pixel value corresponding to the rendering information can be determined, and then the pixel point of the rendering position of the video frame can be rendered into the target pixel value in real time according to the target pixel value, so that the video frame after real-time rendering is obtained.
In one possible implementation, the rendering information includes a custom pattern. In this way, the custom pattern can be set according to the user's actual needs, satisfying diverse rendering requirements.
According to an aspect of the present disclosure, there is provided an information processing apparatus including:
The acquisition module is used for acquiring at least one data frame included in the multimedia information to be played;
The rendering module is used for rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame to obtain at least one data frame after real-time rendering;
And the playing module is used for playing the at least one data frame after real-time rendering.
In one possible implementation manner, the obtaining module is further configured to,
acquiring multimedia information to be played;
analyzing the multimedia information to obtain at least one media stream;
and decoding the at least one media stream according to the coding format of each media stream to obtain at least one data frame included in each media stream.
in one possible implementation, the at least one data frame includes one or more of an audio frame, a video frame, and a text frame.
In one possible implementation, the rendering module is specifically configured to,
obtaining a target detection result obtained by performing target detection on the at least one data frame;
based on the target detection result, obtaining rendering information for rendering the at least one data frame in real time;
and performing real-time rendering on the at least one data frame by using the rendering information to obtain at least one data frame after real-time rendering.
In one possible implementation, the rendering module is specifically configured to,
Identifying the target detection result according to the data format of the target detection result to obtain an identified target detection result;
And acquiring rendering information for rendering the at least one data frame in real time based on the identified target detection result.
In one possible implementation, the rendering module is specifically configured to,
Determining a rendering position for performing real-time rendering on the at least one data frame according to the target detection result;
And rendering the rendering position of the at least one data frame in real time by using the rendering information to obtain at least one data frame after real-time rendering.
In a possible implementation, in the case where the at least one data frame comprises a video frame, the rendering module is specifically configured to,
determining a target pixel value corresponding to the rendering information;
and rendering the pixel points of the rendering position into the target pixel values in real time according to the target pixel values corresponding to the rendering information to obtain the video frames rendered in real time.
In one possible implementation, the rendering information includes a custom pattern.
According to an aspect of the present disclosure, there is provided an electronic device including:
A processor;
A memory for storing processor-executable instructions;
wherein the processor is configured to execute the above-described information processing method.
According to an aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described information processing method.
In the embodiment of the present disclosure, at least one data frame included in multimedia information to be played may be acquired, then based on a target detection result obtained by performing target detection on the at least one data frame, the at least one data frame may be rendered in real time to obtain the at least one data frame rendered in real time, and the at least one data frame rendered in real time may be played. Therefore, in the process of playing the multimedia information, the target detection result can be rendered on at least one data frame included in the multimedia information in real time, so that the normal playing of the multimedia information can be maintained, the time and processing resources consumed in the process of rendering the multimedia information can be saved, and the efficiency of rendering the multimedia information is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flowchart of an information processing method according to an embodiment of the present disclosure.
Fig. 2 illustrates an example flow diagram for real-time rendering of data frames in accordance with an embodiment of this disclosure.
Fig. 3 illustrates an exemplary flow diagram for real-time rendering of video frames in accordance with an embodiment of the present disclosure.
Fig. 4 shows a block diagram of an information processing apparatus according to an embodiment of the present disclosure.
Fig. 5 shows a block diagram of an example of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
According to the information processing scheme provided by the embodiments of the present disclosure, at least one data frame included in the multimedia information to be played can be acquired, and the at least one data frame can then be rendered in real time based on the target detection result obtained by performing target detection on it, yielding the at least one data frame after real-time rendering. The target detection result can thus be rendered onto the data frames of the multimedia information in real time, saving the time and processing resources consumed by rendering the multimedia information. The at least one data frame rendered in real time is then played, ensuring that the real-time-rendered multimedia information plays normally and providing a useful reference for the user.
In the related art, video information is typically rendered as a whole and played back only after the entire video has been rendered. This involves decoding and re-encoding the video, consumes substantial processing resources and time, and therefore cannot support real-time playback of the video information.
The information processing scheme provided by the embodiments of the present disclosure can render the data frames included in multimedia information in real time during playback and play each frame immediately after rendering. This avoids re-encoding the rendered multimedia information, saves processing resources and time, improves the rendering efficiency of the multimedia information, and achieves real-time playback of the rendered multimedia information.
The information processing scheme provided by the embodiments of the present disclosure may be applied to playing multimedia information, automatic rendering, extended generation of multimedia files, and the like; the embodiments of the present disclosure do not limit this.
Fig. 1 shows a flowchart of an information processing method according to an embodiment of the present disclosure. The information processing method may be performed by a terminal device, a server, or other types of electronic devices, where the terminal device may be a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the information processing method may be implemented by a processor calling computer readable instructions stored in a memory. The information processing method according to the embodiment of the present disclosure is described below by taking an electronic device as an execution subject.
S11, at least one data frame included in the multimedia information to be played is obtained.
In the embodiment of the disclosure, the electronic device may obtain multimedia information to be played and then extract from it one or more data frames. The multimedia information may include one or more of audio information, video information, and text information; accordingly, the at least one data frame included in the multimedia information may include one or more of an audio frame, a video frame, and a text frame.
here, the multimedia information may be multimedia information to be played, for example, in a face recognition scene, the multimedia information may be video information captured and played in a current scene, and for example, in a license plate recognition scene, the multimedia information may be a license plate image captured in the current scene.
In a possible implementation manner, before obtaining at least one data frame included in the multimedia information to be played, the multimedia information to be played may be obtained, then the multimedia information is analyzed to obtain at least one media stream, and then the at least one media stream is decoded according to the coding format of each media stream to obtain at least one data frame included in each media stream.
In this implementation manner, the obtained multimedia information to be played may be packaged in a certain encapsulation format. After obtaining the multimedia information to be played, the electronic device may parse the encapsulated multimedia information to obtain one or more media streams after decapsulation. The encoding formats of the media streams may be the same or different, for example, a video stream in H.264 format and an audio stream in Advanced Audio Coding (AAC) format.
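The parse-then-decode flow above can be sketched as follows. This is a minimal illustration, not a real demuxing API: the container layout, the codec names used as dictionary keys, and the decoder registry are all hypothetical stand-ins for an actual demuxer and per-codec decoders.

```python
# Illustrative sketch: demultiplex encapsulated multimedia info into media
# streams, then decode each stream with the decoder matching its coding format.

def demux(container):
    """Split the encapsulated multimedia information into its media streams."""
    return container["streams"]

# Hypothetical per-codec decoders: each turns an encoded packet into a frame.
DECODERS = {
    "h264": lambda pkt: {"type": "video", "data": pkt},
    "aac":  lambda pkt: {"type": "audio", "data": pkt},
}

def decode_streams(streams):
    """Decode every packet of every stream, dispatching on the stream's codec."""
    frames = []
    for stream in streams:
        decode = DECODERS[stream["codec"]]
        frames.extend(decode(pkt) for pkt in stream["packets"])
    return frames

container = {"streams": [
    {"codec": "h264", "packets": ["v0", "v1"]},
    {"codec": "aac",  "packets": ["a0"]},
]}
frames = decode_streams(demux(container))
```

In a real system the demux and decode steps would be performed by a media framework; the point here is only the shape of the flow: one container, several streams, one decoder per coding format.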
S12, rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame to obtain at least one data frame after real-time rendering.
In the embodiment of the present disclosure, a target detection result obtained by performing target detection on at least one data frame may be obtained, and then each of the at least one data frame may be rendered in real time according to its target detection result, yielding a data frame after real-time rendering. Rendering may mean adding multimedia effects to the data frame, such as coloring the data frame or adding lighting effects, shading effects, surface texture effects, and the like.
For example, in a face recognition scene, for a current data frame to be played, a face detection result of the current data frame may be obtained, where the current data frame may be a video frame and the face detection result is the above-mentioned target detection result. When the face detection result indicates that the current data frame includes the target face, the current data frame is rendered in real time, for example, by rendering the video frame region that includes the target face in a preset color, or by adding an indication label to that region, so that the current data frame after real-time rendering is obtained. In this way, the data frames included in the multimedia information can be rendered in real time, avoiding the re-encoding step of a conventional rendering pipeline.
Here, the target detection result may come in any of several data formats, for example JSON, XML, or YAML. The electronic device can recognize target detection results in multiple data formats, avoiding the failure that occurs when only a single data format can be recognized.
And S13, playing the at least one data frame after real-time rendering.
In the embodiment of the present disclosure, after each data frame is rendered, it may be output and played, for example, a video frame carrying an indication tag of the target face is output, or a text frame with a colored-text effect is output. In this way, the real-time-rendered data frame can be played immediately, the re-encoding step is avoided, and normal playback of the multimedia information is ensured.
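The S11 to S13 loop (acquire a frame, render it with its detection result, play it immediately) can be sketched as below. All names are illustrative; in particular, the `sink` list stands in for an actual playback device, and attaching an "overlay" field stands in for drawing on the frame.

```python
# Minimal sketch of the per-frame pipeline: each frame is rendered with its
# detection result and handed straight to playback, with no re-encoding step.

def render_frame(frame, detection):
    """Attach a rendered overlay only when a target was detected in the frame."""
    if detection:
        return {**frame, "overlay": detection}
    return frame

def play_stream(frames, detections, sink):
    """Render and 'play' (append to sink) each frame as soon as it is ready."""
    for frame, det in zip(frames, detections):
        sink.append(render_frame(frame, det))

played = []
frames = [{"id": 0}, {"id": 1}]
detections = [{"face": (10, 20)}, None]  # second frame: nothing detected
play_stream(frames, detections, played)
```

The design point mirrored here is that rendering happens per frame on the playback path, so the pipeline never needs to decode, render, and re-encode the whole file before playback can begin.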
Fig. 2 illustrates an example flow diagram for real-time rendering of data frames in accordance with an embodiment of this disclosure. In a possible implementation manner, the step S12 may include the following steps:
and S121, obtaining a target detection result obtained by performing target detection on the at least one data frame.
Here, the electronic device may obtain the target detection result of the at least one data frame from another device, for example, by receiving the target detection result sent by that device. Alternatively, the electronic device may itself perform target detection on at least one data frame included in the multimedia information to obtain the target detection result. One data frame may correspond to one or more target detection results; for example, in a face detection scene, one video frame may contain detection results for multiple faces. Accordingly, when a data frame contains no target object, its target detection result may be set to a preset value, such as 0.
And S122, based on the target detection result, obtaining rendering information for rendering the at least one data frame in real time.
Here, after the target detection result of at least one data frame is acquired, the rendering information corresponding to each target detection result may be looked up according to that result. For example, the category of the target object, such as a face, a vehicle, or a building, may be determined from the target detection result, and the rendering information may then be looked up by category. When the category is a face, the target detection result may be a face detection result, and the corresponding rendering information may be found in local storage or in a database; the rendering information may be a custom pattern, for example a square, an arrow, or another custom pattern. The custom pattern can be set according to the user's actual needs, satisfying diverse requirements.
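The category-based lookup described above can be sketched as a simple mapping. The table contents, field names, and patterns are examples only; a real system might query local storage or a database as the text notes.

```python
# Hedged sketch of S122: look up rendering information by detected category.
# The mapping below is an illustrative stand-in for a local store or database.

RENDER_INFO = {
    "face":    {"pattern": "square", "color": "green"},
    "vehicle": {"pattern": "arrow",  "color": "red"},
}

def rendering_info_for(detection):
    """Return custom-pattern rendering info for the detection's category,
    or None when no rendering is configured for that category."""
    return RENDER_INFO.get(detection["category"])

info = rendering_info_for({"category": "face", "bbox": (4, 4, 20, 20)})
```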
In a possible implementation manner, the target detection result includes at least one data format, and the electronic device may identify the target detection result according to the data format of the target detection result to obtain an identified target detection result, and then obtain rendering information for rendering the at least one data frame in real time based on the identified target detection result.
In this implementation manner, the electronic device can recognize target detection results in multiple data formats, such as JSON, XML, and YAML. After obtaining the target detection result of the current data frame, the electronic device may identify its data format, parse the result accordingly, and extract the effective information it contains, for example the type of the target object and the position of the target object. The identified target detection result is then used to obtain the rendering information for the current data frame.
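Format-aware parsing of a detection result can be sketched with the Python standard library; JSON and XML are handled below, while YAML would need an additional parser. The serialized shapes and field names are assumptions for illustration.

```python
# Sketch of recognizing detection results in more than one data format and
# normalizing them into a plain dict before the rendering-info lookup.
import json
import xml.etree.ElementTree as ET

def parse_detection(raw, fmt):
    """Parse a serialized target detection result according to its format."""
    if fmt == "json":
        return json.loads(raw)
    if fmt == "xml":
        root = ET.fromstring(raw)
        return {child.tag: child.text for child in root}
    raise ValueError(f"unsupported format: {fmt}")

d1 = parse_detection('{"category": "face", "x": 10}', "json")
d2 = parse_detection("<det><category>face</category></det>", "xml")
```

Normalizing every format to the same dict shape is what lets the rest of the pipeline stay format-agnostic, which is the property the text claims for the electronic device.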
And S123, performing real-time rendering on the at least one data frame by using the rendering information to obtain at least one data frame after real-time rendering.
Here, at least one data frame may be rendered in real time using the obtained rendering information. For example, when the rendering information corresponding to the current data frame is a custom pattern, the custom pattern, such as a square or an arrow, may be added to the current data frame. In this way, real-time rendering of the data frame using the rendering information is achieved.
In a possible implementation manner, a rendering position for performing real-time rendering on at least one data frame may be determined according to a target detection result, and then the rendering position of the at least one data frame is rendered in real time by using rendering information, so as to obtain at least one data frame after real-time rendering.
In this implementation manner, the position in the data frame of the target object may be obtained by parsing the target detection result of that frame, and the position may be used as the rendering position for real-time rendering. For example, in a face recognition scene, the target object may be a face; its position in the video frame is obtained by parsing the target detection result and is then used as the rendering position for real-time rendering of the video frame. As another example, in a text recognition scenario, the target object may be a specific word, such as the word "flower"; by parsing the target detection result, the position of "flower" in the text frame is obtained and used as the rendering position, for example to add a flower pattern at that position.
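Deriving a rendering position from a parsed detection result can be sketched as below. The bounding-box field name and the clamping behavior are assumptions; clamping is added only to illustrate keeping the rendering position inside the frame.

```python
# Illustrative sketch of the first half of S123: turn a target detection
# result into a rendering position, clamped to the frame boundaries.

def rendering_position(detection, frame_size):
    """Return the detection's bounding box clipped to the frame dimensions."""
    w, h = frame_size
    x0, y0, x1, y1 = detection["bbox"]
    return (max(0, x0), max(0, y0), min(w, x1), min(h, y1))

# A detection whose box spills past the 640x480 frame edges.
pos = rendering_position({"bbox": (-5, 10, 700, 400)}, frame_size=(640, 480))
```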
In an example, when at least one data frame includes a video frame, a target pixel value corresponding to the rendering information may be determined, and then, according to the target pixel value corresponding to the rendering information, a pixel point at a rendering position is rendered in real time as the target pixel value, so as to obtain a video frame after being rendered in real time.
In this example, the rendering information may be used to change the pixel values at the rendering position of the video frame. Assume a video frame has four channel layers, denoted the Y, U, V, and A layers. The Y layer represents luminance, and the U and V layers represent chrominance; every four components of the Y layer may correspond to one component pair of the U and V layers, so the U and V layers may each be 1/4 the size of the Y layer. The A layer is the alpha layer, which represents transparency. Denote the video frame before real-time rendering as frame1, so the pixel value at the rendering position of the video frame is frame1(i, j), where i and j are positive integers. Denote the superimposed pixel value indicated by the rendering information as frame2(i, j), and the superimposition weight as alpha. The pixel value at the rendering position of the video frame can then be superimposed with the pixel value indicated by the rendering information, thereby changing the pixel value at that position. Denoting the rendered video frame as newframe, the pixel value at the rendering position of the rendered frame is newframe(i, j), which is the determined target pixel value. The pixel value of each channel layer of the rendered video frame can be expressed as formula (1):
newframe(i, j)[k] = (1 - alpha) × frame1(i, j)[k] + alpha × frame2(i, j)[k]    (1)
where k = 0, 1, 2, 3 indexes the Y, U, V, and A layers, respectively. Thus, using formula (1), the target pixel value corresponding to the rendering information can be determined, and the pixel point at the rendering position of the video frame can be rendered to that target pixel value in real time, yielding the video frame after real-time rendering.
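Formula (1) is an ordinary per-channel alpha blend. A minimal sketch in Python with NumPy follows; the function name and the assumption that the frame is stored as a full-resolution H × W × 4 YUVA array are ours, for illustration only (real YUV420 data keeps the U and V layers at quarter resolution, as noted above).

```python
import numpy as np

def render_region(frame, overlay, alpha, top, left):
    """Superimpose `overlay` onto `frame` at (top, left) using
    formula (1): new = (1 - alpha) * frame1 + alpha * frame2,
    applied to every channel of every pixel in the region."""
    h, w = overlay.shape[:2]
    # Blend in float to avoid uint8 overflow, then cast back.
    region = frame[top:top + h, left:left + w].astype(np.float32)
    blended = (1 - alpha) * region + alpha * overlay.astype(np.float32)
    frame[top:top + h, left:left + w] = blended.astype(frame.dtype)
    return frame
```

With alpha = 0.5, a pixel of value 100 blended with an overlay value of 200 becomes 150, while pixels outside the rendering position are left unchanged.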
Fig. 3 illustrates an exemplary flow diagram for real-time rendering of video frames in accordance with an embodiment of the present disclosure. In one example, an information processing scheme provided by an embodiment of the present disclosure may include the following steps:
S31, acquiring the multimedia information to be played.
Here, the multimedia information to be played may be packaged in a certain encapsulation format, for example FLV, MKV, ASF, or MP4.
S32, the multimedia information is analyzed to obtain at least one media stream.
Here, the encapsulated multimedia information may be parsed, and one or more media streams are obtained after the multimedia information is decapsulated. The encoding formats of the media streams may be the same or different, for example, a video stream encoded in H.264 and an audio stream encoded in AAC.
S33, determining whether a media stream is a video stream; if so, performing step S34; otherwise, performing step S37.
Here, whether any one media stream is a video stream may be determined according to the encoding format of each media stream. In the case where any one of the media streams is a video stream, step S34 may be performed. In the case where any one of the media streams is an audio stream, step S37 may be performed.
S34, decoding the video stream by using the video decoding thread to obtain at least one video frame included in the video stream, and storing the obtained at least one video frame in the storage queue.
Here, when any one of the media streams is a video stream, the video stream may be decoded by the video decoding thread in a decoding manner that matches the encoding format of the video stream, so that the video frames included in the video stream are obtained. During decoding, each time a video frame is decoded, it is stored in the storage queue.
S35, extracting the current video frame from the storage queue, obtaining the parsed rendering information from the rendering file or database according to the target detection result of the current video frame, and rendering the current video frame in real time based on the rendering information by using the video playing thread.
Here, the earliest stored video frame may be extracted from the storage queue and taken as the current video frame for real-time rendering. The electronic device may obtain the target detection result of the current video frame and, according to that result, search the stored rendering file or database for rendering information matching the target detection result. After the matching rendering information is found, it can be parsed according to the data format of the protocol to obtain the parsed rendering information, and the video playing thread can then render the current video frame in real time based on that rendering information, for example, by adding a custom pattern to the current video frame.
S36, playing the current video frame after real-time rendering.
Here, after the current video frame is rendered in real time, the current video frame rendered in real time may be played using a video playing thread.
S37, decoding the audio stream by using the audio decoding thread to obtain at least one audio frame included in the audio stream.
Here, when any one of the media streams is an audio stream, the audio stream may be decoded by the audio decoding thread in a decoding manner that matches the encoding format of the audio stream, so that the audio frames included in the audio stream are obtained.
S38, playing the decoded audio frame.
Here, the decoded audio frame may be played using an audio playback thread.
Through the above example, when the multimedia information includes a video stream, the video frames of the video stream can be rendered in real time, so that during playback the video frames in the multimedia information are rendered and played in real time while normal playing of the multimedia information is ensured.
In some implementations, when the multimedia information includes an audio stream, real-time rendering may be performed on the audio frames of the audio stream; when the multimedia information includes both a video stream and an audio stream, real-time rendering may be performed on the video frames and the audio frames, respectively.
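The flow of steps S31 to S38 is a standard decode, queue, render, play pipeline. The following schematic sketch mirrors steps S34 to S36 with a storage queue shared between a decoding thread and a playing thread; the decoder, renderer, and player are stand-in callables for illustration, not the disclosure's actual components.

```python
from queue import Queue
from threading import Thread

def video_decode_thread(video_stream, frame_queue):
    # S34: decode the video stream frame by frame and store each
    # decoded frame in the storage queue as soon as it is ready.
    for frame in video_stream:      # stand-in for a real decoder loop
        frame_queue.put(frame)
    frame_queue.put(None)           # sentinel marking end of stream

def video_play_thread(frame_queue, render, play):
    # S35/S36: extract the earliest stored frame, render it in real
    # time based on its rendering information, then play it.
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        play(render(frame))

# Usage with stand-in callables in place of the real decoder,
# renderer, and player:
storage_queue = Queue()
played = []
producer = Thread(target=video_decode_thread,
                  args=(["frame1", "frame2"], storage_queue))
consumer = Thread(target=video_play_thread,
                  args=(storage_queue, lambda f: f + "+overlay",
                        played.append))
producer.start(); consumer.start()
producer.join(); consumer.join()
```

The queue decouples decoding speed from playback speed, which is why the disclosure can decode ahead while the playing thread renders and plays frames in order.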
It is understood that the above method embodiments of the present disclosure can be combined with each other to form combined embodiments without departing from the principle and logic; for brevity, details are not repeated in the present disclosure.
In addition, the present disclosure also provides an apparatus, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any information processing method provided by the present disclosure; for the corresponding technical solutions, refer to the descriptions in the method sections, which are not repeated here.
It will be understood by those skilled in the art that, in the methods of the present disclosure, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible internal logic.
Fig. 4 shows a block diagram of an information processing apparatus according to an embodiment of the present disclosure. As shown in Fig. 4, the apparatus includes:
an obtaining module 41, configured to obtain at least one data frame included in multimedia information to be played;
A rendering module 42, configured to perform real-time rendering on the at least one data frame based on a target detection result obtained by performing target detection on the at least one data frame, so as to obtain at least one data frame after real-time rendering;
And a playing module 43, configured to play the at least one data frame after real-time rendering.
In a possible implementation manner, the obtaining module 41 is further configured to,
Acquiring multimedia information to be played;
Analyzing the multimedia information to obtain at least one media stream;
And decoding the at least one media stream according to the coding format of each media stream to obtain at least one data frame included in each media stream.
In one possible implementation, the at least one data frame includes one or more of an audio frame, a video frame, and a text frame.
In one possible implementation, the rendering module 42 is specifically configured to,
obtaining a target detection result obtained by performing target detection on the at least one data frame;
based on the target detection result, obtaining rendering information for rendering the at least one data frame in real time;
And performing real-time rendering on the at least one data frame by using the rendering information to obtain at least one data frame after real-time rendering.
In one possible implementation, the rendering module 42 is specifically configured to,
identifying the target detection result according to the data format of the target detection result to obtain an identified target detection result;
And acquiring rendering information for rendering the at least one data frame in real time based on the identified target detection result.
In one possible implementation, the rendering module 42 is specifically configured to,
Determining a rendering position for performing real-time rendering on the at least one data frame according to the target detection result;
and rendering the rendering position of the at least one data frame in real time by using the rendering information to obtain at least one data frame after real-time rendering.
In a possible implementation, in a case where the at least one data frame includes a video frame, the rendering module 42 is specifically configured to,
Determining a target pixel value corresponding to the rendering information;
and rendering the pixel points of the rendering position into the target pixel values in real time according to the target pixel values corresponding to the rendering information to obtain the video frames rendered in real time.
In one possible implementation, the rendering information includes a custom pattern.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments; for specific implementations, refer to the descriptions of the above method embodiments, which, for brevity, are not repeated here.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
The electronic device may be provided as a terminal, a server, or a device in another form.
Fig. 5 is a block diagram illustrating an electronic device 1900 according to an example embodiment. For example, the electronic device 1900 may be provided as a server. Referring to fig. 5, electronic device 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, executable by processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the above-described method.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 1932, is also provided, including computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), can execute the computer-readable program instructions by utilizing state information of the instructions to personalize the electronic circuitry, thereby implementing aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other devices to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. An information processing method characterized by comprising:
Acquiring at least one data frame included in multimedia information to be played;
Rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame to obtain at least one data frame after real-time rendering;
and playing the at least one data frame after real-time rendering.
2. The method according to claim 1, wherein before the obtaining at least one data frame included in the multimedia information to be played, the method further comprises:
Acquiring multimedia information to be played;
Analyzing the multimedia information to obtain at least one media stream;
And decoding the at least one media stream according to the coding format of each media stream to obtain at least one data frame included in each media stream.
3. The method of claim 1 or 2, wherein the at least one data frame comprises one or more of an audio frame, a video frame, and a text frame.
4. The method according to any one of claims 1 to 3, wherein the rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame to obtain at least one data frame after being rendered in real time comprises:
Obtaining a target detection result obtained by performing target detection on the at least one data frame;
Based on the target detection result, obtaining rendering information for rendering the at least one data frame in real time;
And performing real-time rendering on the at least one data frame by using the rendering information to obtain at least one data frame after real-time rendering.
5. The method of claim 4, wherein the obtaining rendering information for rendering the at least one data frame in real time based on the target detection result comprises:
Identifying the target detection result according to the data format of the target detection result to obtain an identified target detection result;
and acquiring rendering information for rendering the at least one data frame in real time based on the identified target detection result.
6. The method of claim 4, wherein the rendering the at least one data frame in real time using the rendering information to obtain at least one data frame after real-time rendering comprises:
Determining a rendering position for performing real-time rendering on the at least one data frame according to the target detection result;
and rendering the rendering position of the at least one data frame in real time by using the rendering information to obtain at least one data frame after real-time rendering.
7. The method of claim 6, wherein in a case that the at least one data frame includes a video frame, the rendering the rendering position of the at least one data frame in real time using the rendering information to obtain at least one data frame after real-time rendering, includes:
Determining a target pixel value corresponding to the rendering information;
And rendering the pixel points of the rendering position into the target pixel values in real time according to the target pixel values corresponding to the rendering information to obtain the video frames rendered in real time.
8. An information processing apparatus characterized by comprising:
the acquisition module is used for acquiring at least one data frame included in the multimedia information to be played;
The rendering module is used for rendering the at least one data frame in real time based on a target detection result obtained by performing target detection on the at least one data frame to obtain at least one data frame after real-time rendering;
And the playing module is used for playing the at least one data frame after real-time rendering.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
Wherein the processor is configured to invoke the memory-stored instructions to perform the method of any of claims 1 to 7.
10. A computer readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1 to 7.
CN201910920031.1A 2019-09-26 2019-09-26 Information processing method and device, electronic equipment and storage medium Active CN110582021B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910920031.1A CN110582021B (en) 2019-09-26 2019-09-26 Information processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110582021A true CN110582021A (en) 2019-12-17
CN110582021B CN110582021B (en) 2021-11-05

Family

ID=68813821

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910920031.1A Active CN110582021B (en) 2019-09-26 2019-09-26 Information processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110582021B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112218108A (en) * 2020-09-18 2021-01-12 广州虎牙科技有限公司 Live broadcast rendering method and device, electronic equipment and storage medium
CN114063966A (en) * 2021-11-04 2022-02-18 厦门雅基软件有限公司 Audio processing method and device, electronic equipment and computer readable storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080101456A1 (en) * 2006-01-11 2008-05-01 Nokia Corporation Method for insertion and overlay of media content upon an underlying visual media
CN103329526A (en) * 2011-08-17 2013-09-25 史克威尔·艾尼克斯控股公司 Moving image distribution server, moving image reproduction apparatus, control method, program and recording medium
US20150264416A1 (en) * 2014-03-11 2015-09-17 Amazon Technologies, Inc. Real-time rendering of targeted video content
US20160119669A1 (en) * 2014-06-24 2016-04-28 Google Inc. Efficient Frame Rendering
US20180068178A1 (en) * 2016-09-05 2018-03-08 Max-Planck-Gesellschaft Zur Förderung D. Wissenschaften E.V. Real-time Expression Transfer for Facial Reenactment
CN107801093A (en) * 2017-10-26 2018-03-13 深圳市量子视觉科技有限公司 Video Rendering method, apparatus, computer equipment and readable storage medium storing program for executing
CN109429078A (en) * 2017-08-24 2019-03-05 北京搜狗科技发展有限公司 Method for processing video frequency and device, for the device of video processing
CN109462776A (en) * 2018-11-29 2019-03-12 北京字节跳动网络技术有限公司 A kind of special video effect adding method, device, terminal device and storage medium
CN109600559A (en) * 2018-11-29 2019-04-09 北京字节跳动网络技术有限公司 A kind of special video effect adding method, device, terminal device and storage medium
US20190246165A1 (en) * 2016-10-18 2019-08-08 Robert Brouwer Messaging and commenting for videos


Also Published As

Publication number Publication date
CN110582021B (en) 2021-11-05

Similar Documents

Publication Publication Date Title
US10110936B2 (en) Web-based live broadcast
US20180063501A1 (en) Method and system of displaying a popping-screen
US11930202B2 (en) Method and apparatus for video watermarking, and storage medium
KR102463304B1 (en) Video processing method and device, electronic device, computer-readable storage medium and computer program
CN107295352B (en) Video compression method, device, equipment and storage medium
CN108924491B (en) Video stream processing method and device, electronic equipment and storage medium
US9584761B2 (en) Videoconference terminal, secondary-stream data accessing method, and computer storage medium
CN110582021B (en) Information processing method and device, electronic equipment and storage medium
CN110211030B (en) Image generation method and device
US10290110B2 (en) Video overlay modification for enhanced readability
CN111225288A (en) Method and device for displaying subtitle information and electronic equipment
JP7261732B2 (en) Method and apparatus for determining character color
JP7471510B2 (en) Method, device, equipment and storage medium for picture to video conversion - Patents.com
CN110769241B (en) Video frame processing method and device, user side and storage medium
CN111669476B (en) Watermark processing method, device, electronic equipment and medium
CN111626922B (en) Picture generation method and device, electronic equipment and computer readable storage medium
CN110636331B (en) Method and apparatus for processing video
CN113691835B (en) Video implantation method, device, equipment and computer readable storage medium
CN114125485B (en) Image processing method, device, equipment and medium
CN117319736A (en) Video processing method, device, electronic equipment and storage medium
CN108924588B (en) Subtitle display method and device
CN110312171B (en) Video clip extraction method and device
CN113891135A (en) Multimedia data playing method and device, electronic equipment and storage medium
CN112269957A (en) Picture processing method, device, equipment and storage medium
CN111212196B (en) Information processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant