CN113163227A - Method and device for obtaining video data and electronic equipment - Google Patents


Info

Publication number
CN113163227A
Authority
CN
China
Prior art keywords
original image
visualization
obtaining
data
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010072877.7A
Other languages
Chinese (zh)
Inventor
巫英才
唐谈
唐俊修
高占宁
王攀
任沛然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202010072877.7A
Publication of CN113163227A
Legal status: Pending

Classifications

    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD] (section H: Electricity; class H04: Electric communication technique; subclass H04N: Pictorial communication, e.g. television)
    • H04N21/233: Processing of audio elementary streams (server side)
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs (server side)
    • H04N21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/4307: Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/4312: Generation of visual interfaces for content selection or interaction involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/439: Processing of audio elementary streams (client side)
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs (client side)
    • H04N21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Abstract

The application provides a method, an apparatus, and an electronic device for obtaining video data. The method comprises the following steps: obtaining an original image and target data associated with the original image; obtaining a visualization of the target data from the target data; obtaining display position information of the visualization in the original image; and synthesizing the original image and the visualization according to the display position information to obtain target video data. Because the display position information of the visualization of the target data associated with the original image is obtained, the video data obtained in this way can intuitively display the relationship between the target data and the original image.

Description

Method and device for obtaining video data and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for obtaining video data, and an electronic device.
Background
With the rapid development of science and technology and the continuous improvement of living standards, more and more objects are presented to users for selection. When recommending an object to a user, information related to the object is generally presented to the user, and one way to present it is to show the user a video containing the object. In order to highlight the quantified information features of the object, the related quantized data may be combined with the object, and a video combining the two may be presented to the user. How to combine the related quantized data and the object into a video therefore becomes the key to showing the object to the user.
In view of the above, the prior art mainly combines related quantized data with an object to form a video in the following manner. When a video file is produced, the data is converted into video units, the video units are sequenced, and a video segment corresponding to the related quantized data is obtained from the sequenced video units. This segment is then simply spliced with a video segment containing only the object, yielding a video file containing both the related quantized data and the object. However, a video file synthesized in this manner cannot intuitively exhibit the relationship between the related quantized data and the object, so a user watching such a file cannot clearly perceive that relationship.
Disclosure of Invention
The application provides a method for obtaining video data, aiming to solve the problem that a video file synthesized in the prior-art manner cannot intuitively display the relationship between related quantized data and an object.
The application provides a method for obtaining video data, which comprises the following steps:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.
Optionally, the method further includes:
obtaining original audio data;
obtaining presentation time information of the visualization relative to the original audio data;
the synthesizing the original image and the visualization according to the display position information to obtain target video data includes:
and synthesizing the original image, the original audio data and the visualization according to the display position information and the display time information to obtain target video data.
Optionally, the method further includes:
according to the original image, obtaining characteristic information of a target main body in the original image;
the obtaining target data associated with the original image comprises:
and obtaining target data associated with the target subject according to the characteristic information of the target subject.
Optionally, the obtaining of the display position information of the visualization in the original image includes:
and obtaining the visualized display position information in the original image according to the characteristic information of the target subject and the characteristic information of the reference object in the original image.
Optionally, the obtaining, according to the feature information of the target subject and the feature information of the reference object in the original image, display position information of the visualization in the original image includes:
and inputting the characteristic information of the target subject, the characteristic information of the reference object and the visualization into a network model for obtaining display position information of the visualization in an original image, and obtaining the display position information of the visualization in the original image.
Optionally, the method further includes:
obtaining rhythm characteristic information of the original audio data according to the original audio data;
the obtaining presentation time information of the visualization relative to the raw audio data comprises:
and obtaining the display time information of the visualization relative to the original audio data according to the rhythm characteristic information.
Optionally, the obtaining, according to the rhythm feature information, display time information of the visualization relative to the original audio data includes:
and taking the rhythm characteristic information and the visualization as input data of a network model for obtaining visualized display time information, and obtaining the display time information of the visualization relative to the original audio data.
Optionally, the method further includes:
rendering the original image, the original audio data and the visualization to obtain a rendering result;
the synthesizing the original image, the original audio data and the visualization according to the display position information and the display time information to obtain target video data includes: and synthesizing the original image, the original audio data and the visualization according to the rendering result, the display position information and the display time information to obtain target video data.
Optionally, the synthesizing, according to the display position information and the display time information, the original image, the original audio data, and the visualization to obtain target video data includes:
and according to the display position information and the display time information, encoding the original image, the original audio data and the visualization to obtain the target video data.
Optionally, the feature information of the target subject includes at least one of the following information: position information of the target subject in the original image, size information of the target subject, and a type of the target subject.
Optionally, the feature information of the reference object at least includes position information of the reference object in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is constructed based on the principle that the display position of the visualization in the original image stays close to the position of the reference object in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is constructed based on the principle that the display position of the visualization in the original image is balanced against the position distribution of the target subject in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is constructed based on the principle that the display position of the visualization in the original image is kept aligned with the position of the target subject in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is constructed based on the principle that the display position of the visualization in the original image does not occlude the target subject in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is constructed based on the principle that the display position of the visualization remains consistent across original images.
Optionally, the network model for obtaining the presentation time information of the visualization is constructed based on the principle that the presentation time information of the visualization relative to the original audio data matches the rhythm feature information.
Optionally, the type information of the visualization includes at least one of the following information: pie charts, line charts, bar charts, word cloud statistical charts.
Correspondingly, the application provides a device for obtaining video data, comprising:
a data obtaining unit for obtaining an original image and target data associated with the original image;
a visualization obtaining unit, configured to obtain a visualization of the target data according to the target data;
the display information obtaining unit is used for obtaining the display position information of the visualization in the original image;
and the target video data obtaining unit is used for synthesizing the original image and the visualization according to the display position information to obtain target video data.
Correspondingly, the present application also provides an electronic device, comprising:
a processor;
a memory for storing a computer program for execution by the processor to perform a method of obtaining video data, the method comprising the steps of:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.
Accordingly, the present application provides a computer storage medium storing a computer program to be executed by a processor to perform a method of obtaining video data, the method comprising:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.
Compared with the prior art, the method has the following advantages:
the application provides a method for obtaining video data, which comprises the following steps: obtaining an original image and target data associated with the original image; obtaining a visualization of the target data from the target data; obtaining display position information of the visualization in the original image; and synthesizing the original image and the visualization according to the display position information to obtain target video data. According to the method and the device, the display position information of the visualization of the target data associated with the original image in the original image is obtained, and the visualization of the target data is actually bound with the original image, so that the obtained video data can visually display the relationship between the target data and the original image, and the problem that the relationship between the related quantized data and the object cannot be visually displayed in a video file synthesized by the prior art is solved.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them.
Fig. 1-a is a schematic view of an application scenario of a method for obtaining video data according to a first embodiment of the present application.
Fig. 1-B is a schematic diagram of one of the key video frames in a synthesized target video file according to the first embodiment of the present application.
Fig. 1 is a flowchart of a method for obtaining video data according to a first embodiment of the present application.
Fig. 1-C is an exemplary diagram of a display position of a visualization of target data in an original image according to a first embodiment of the present application.
Fig. 1-D is a schematic view of another application scenario of the method for obtaining video data according to the first embodiment of the present application.
Fig. 2 is a diagram illustrating an apparatus for obtaining video data according to a second embodiment of the present application.
Fig. 3 is a schematic diagram of an electronic device for obtaining video data according to a third embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.
The method for obtaining video data provided by the application can be applied to a server or a client. Please refer to fig. 1-A, which is a schematic view of an application scenario of the method. Taking execution at the server as an example, the scenario embodiment can be used as follows: the user inputs an original image for obtaining a target video file, original audio data, and target data associated with the original image. The server then obtains, from the original image, the original audio data, and the target data, the display position information of the visualization of the target data in the original image and the presentation time information of the visualization of the target data relative to the original audio data.
After the display position information of the visualization of the target data in the original image, the presentation time information of the visualization relative to the original audio data, and the visualization itself are obtained, the original image, the original audio data, and the visualization are synthesized on the basis of the display position information and the presentation time information to obtain the target video file.
The display position information of the visualization of the target data in the original image, the presentation time information of the visualization relative to the original audio data, and the visualization are all prerequisites for obtaining the target video file. It should be noted that the target video file is one form of target video data; this embodiment mainly describes the method in terms of obtaining a target video file. In addition, "the display position information of the visualization of the target data in the original image" and "the presentation time information of the visualization of the target data relative to the original audio data" may be referred to simply as the display position information of the visualization in the original image and the presentation time information of the visualization relative to the original audio data.
In the present application, "visualization" is used as a proper term to be understood as "a form of expression that can be visually perceived" for the sake of convenience of expression and in accordance with a common expression manner of those skilled in the art; by "visualization of target data" is meant that the target data is presented according to a predetermined rule as a representation that can be visually perceived, and the specific representation may be various static or dynamic data graphs, such as a histogram, a pie chart, a variation trend chart, a distribution chart, and the like; these representations enable the target data to be more easily understood by the user, and to be more compatible with the media format of video.
In the above method of obtaining target video data, the target video data may refer to a target video file. An original image refers to an original image file used to generate the target video file, in which it may constitute a video frame. The target data associated with the original image may refer to data associated with a target subject identified in the original image, for example data describing that subject. The display position information of the visualization of the target data in the original image refers to where the visualization is placed in the original image when it is displayed together with the original image; for example, it may be the coordinate information of the visualization in the original image, or the size information of the visualization in the original image. The visualization of the target data refers to the form in which the target data is presented in the target video file, which may be any of a pie chart, a line chart, a bar chart, or a word-cloud statistical chart. The original audio data may refer to the background music file used in synthesizing the target video file. The presentation time information of the visualization relative to the original audio data is the correspondence between the start and end time points of the visualization in the target video file and the time points of the beat points of the background music file; specifically, the start and end time points of the visualization's display may each correspond to the time point of a beat point of the background music.
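For concreteness, the placement and timing metadata defined above can be collected into a simple record. Below is a minimal Python sketch; the class and field names are hypothetical, since the patent does not prescribe any particular data structure.

from dataclasses import dataclass

@dataclass
class VisualizationPlacement:
    """Hypothetical record for one visualization's placement and timing.

    Coordinates and sizes are in pixels of the original image; times are in
    seconds relative to the original audio data (the background music file).
    """
    chart_type: str     # e.g. "line", "pie", "bar", "word_cloud"
    x: float            # abscissa of the visualization in the original image
    y: float            # ordinate of the visualization in the original image
    width: float        # display width of the visualization
    height: float       # display height of the visualization
    start_time: float   # presentation start, matched to a beat of the music
    end_time: float     # presentation end, matched to another beat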
Specifically, referring to fig. 1-A, in the scenario of the present embodiment, the target video file is a video file synthesized for an electric shaver. The target video file obtained in fig. 1-A shows the shaver from five different angles; the five images, i.e., the original images, are: a partial image of the shaver blade, a partial image of the shaver body, an image of the shaver in use, an image of how the blade portion connects to the body portion, and an image of the shaver's internal structure. For these five images, the associated target data are, respectively: data about the blade portion, data about the shaver's monthly sales, data about using the shaver, data about how the blade portion connects to the body portion, and data about the shaver's internal structure.
More specifically, before the five angle images are acquired, the user may input an image of the electric shaver, and the five angle images are acquired from it; they may in fact be sub-images of the shaver image the user input. Each angle image is then combined with the visualization of its associated target data, and the combined image serves as a key video frame of the target video file. As shown in fig. 1-A, the finally synthesized target video file contains five key video frames, the first through the fifth.
The detailed process of synthesizing each of the five angle images with the visualization of its associated target data is described below. After the five angle images are obtained, a data table of the data associated with each image can be obtained from it. For example, after the shaver body partial image is obtained, a data table containing the shaver's monthly sales data can be obtained, such as a table recording the monthly sales for each month of the second half of the year, July through December.
Take as an example the image combining the shaver body partial image with the visualization of the shaver's monthly sales data, which serves as the second key video frame of the target video file in this embodiment. After the body partial image is obtained, a data table storing the shaver's monthly sales data can be obtained from it, and the monthly sales for a certain period can be selected, for example the monthly sales for each month from July to December. These data can then be converted into the line chart shown in fig. 1-A, which is the visualization: a trend chart with the months on the abscissa and the shaver's monthly sales on the ordinate. Of course, the monthly sales data could also be converted into a bar chart, a pie chart, a distribution chart, or the like; the line chart is presented here merely as one example of a data visualization.
Before one of the key video frames of the target video file can be obtained, the display position information of the visualization of the target data in the original image must be obtained; once it is obtained, the visualization and the original image can be synthesized into a key video frame. Take the display position of the visualization of the shaver's monthly sales data in the shaver body partial image as an example. First, the position information of the target subject (the shaver body portion) in the image, the size information of the body portion, and the type of the body portion are obtained. Then the display position is determined so that the visualization and the body portion conform to five basic aesthetic principles of planar layout: the visualization is balanced against the position distribution of the body portion in the image; the visualization is kept aligned with the position of the body portion; the visualization stays close to the position of a reference object (here, an image edge); the visualization does not occlude the body portion; and the positions of the visualizations of different target data in their respective sub-images remain consistent. The position determined according to these five principles is shown in fig. 1-B, a schematic diagram of one of the key video frames in the synthesized target video file according to the first embodiment. In fig. 1-B the visualization of the monthly sales data sits at the lower left of the target subject (the shaver body portion). Of course, after this position is preliminarily determined, the user can adjust it as needed via adjustment region 2 on the right side of fig. 1-A, and the visualization type of the monthly sales data can likewise be adjusted via adjustment region 1 on the left side of fig. 1-A.
The display position information of the visualization of the data related to each angle or part of the shaver in its corresponding image is obtained in turn, in the same manner as the display position of the monthly sales visualization in the body partial image.
After the display position information of each part's data visualization in its corresponding image has been obtained in turn, the key video frames used to synthesize the target video file can be obtained. In the present embodiment, the five key video frames shown in fig. 1-A are synthesized; once they are synthesized, the synthesis of the target video file can be completed.
In addition, audio data needs to be added when synthesizing the target video file. In the present embodiment the added audio is a music file. Since the synthesized target video file involves images of the shaver at five angles, it necessarily involves transitions between images. When adding audio data, whether the audio rhythm time points match the image transition time points must be considered; based on this requirement, the audio rhythm time point information and the image transition time point information must be obtained and matched before the target video file is synthesized.
In this embodiment, the visualization of the data related to a given part of the shaver and its corresponding image are displayed synchronously, i.e., the data and the image are bound together for display. Therefore, the audio rhythm time points and the image transition time points can be matched through the presentation time information of the part's data visualization relative to the audio data.
For example, when the shaver video file is synthesized, the start and end time points of the visualization of the monthly sales data are necessarily also the start and end time points at which the shaver body partial image is displayed. To match the start time point to one rhythm time point of the audio data and the end time point to another, the presentation time information of each part's data visualization relative to the audio data must be obtained. Specifically, it can be determined according to the principle that the presentation time points of the visualization match the rhythm time points of the audio data.
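The following Python sketch illustrates this matching step, assuming the beat timestamps have already been extracted from the audio; the function and variable names are illustrative, not taken from the patent.

import bisect

def snap_to_nearest_beat(t: float, beat_times: list[float]) -> float:
    """Return the beat timestamp (in seconds) closest to time t."""
    i = bisect.bisect_left(beat_times, t)
    candidates = beat_times[max(0, i - 1):i + 1]
    return min(candidates, key=lambda b: abs(b - t))

# Align a visualization's presentation interval to the music's rhythm.
beat_times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # illustrative beat grid
start = snap_to_nearest_beat(1.23, beat_times)     # -> 1.0
end = snap_to_nearest_beat(2.61, beat_times)       # -> 2.5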
In addition, in order to facilitate the synthesis of the target video file, the original image, the original audio data, and the visualization are rendered to obtain a rendering result. Finally, after the display position information, the presentation time information, the visualization, and the rendering result are obtained, the original image, the original audio data, and the visualization are encoded, and the encoded result is synthesized into the target video file according to the rendering result.
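As an illustration of the rendering-and-synthesis step, the following Python sketch composites a chart image onto an original image at a computed display position and muxes in the background music. It uses the third-party moviepy library (v1.x API) with hypothetical file names and coordinates; the patent itself does not specify a rendering or encoding tool.

from moviepy.editor import AudioFileClip, CompositeVideoClip, ImageClip

# Original image shown for 4 s; the chart overlay appears from t = 1.0 s to
# t = 3.5 s at the display position (x = 40, y = 320) obtained in the
# placement step, both time points snapped to beats of the music.
base = ImageClip("shaver_body.png").set_duration(4.0)
chart = (ImageClip("monthly_sales_chart.png")
         .set_start(1.0)
         .set_end(3.5)
         .set_position((40, 320)))

frame = CompositeVideoClip([base, chart])
frame = frame.set_audio(AudioFileClip("background_music.mp3").subclip(0, 4.0))
frame.write_videofile("target_video.mp4", fps=25)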
After the preliminary synthesis of the target video file, the user may also make the following adjustments to it. For example, the ordering of the images can be adjusted; since each visualization is "bound" to its image, moving an image moves its visualization with it, and the presentation time of the visualization relative to the audio data is adjusted accordingly.
With the method for obtaining video data described above, because the display position information of the visualization of the target data in the original image is obtained, and the visualization is in effect bound to the original image, the obtained video data can intuitively display the relationship between the target data and the original image, solving the problem that video files synthesized in the prior art cannot intuitively display the relationship between the related quantized data and the object. Meanwhile, because the presentation time information of the visualization relative to the original audio data is obtained, the video data obtained by the method keeps the presentation of the audio data and the target data synchronized. It should be noted that this application scenario is provided only as an embodiment to facilitate understanding of the method for obtaining video data, and is not used to limit it.
The application provides a method, a device, an electronic device and a computer storage medium for obtaining video data. The following are specific examples.
Fig. 1 is a flowchart illustrating a method for obtaining video data according to a first embodiment of the present application. The method comprises the following steps.
Step S101: an original image and associated target data for the original image are obtained.
With the method for obtaining video data of the present application, the original image used to synthesize the target video file is obtained first. In the present embodiment, the obtained target video data mainly refers to a synthesized target video file. Since the synthesized target video file contains data visualizations, the target data associated with the original image is also obtained in advance.
Obtaining the original image and the original audio data is straightforward, and any of the many existing methods for obtaining images and audio data may be used. Taking the original image as an example: it may be obtained directly from an image library in which it is stored in advance, or a real object may be photographed and the photographed image used as the original image. It should be noted that in this embodiment the original image may refer to a plurality of continuous image files, i.e., the original image may substantially refer to an original video file. For example, to obtain the shaver video file of the scenario embodiment, images of the shaver blade portion, the shaver body portion, the shaver in use, the connection between the blade and body portions, and the shaver's internal structure may be photographed in advance and used as the original images, i.e., the original video file. When the original video file is obtained, the audio data to be added to the synthesized target video file, i.e., the original audio data used to generate the target video file, can be obtained at the same time. Obtaining it is also straightforward: an existing audio file can be used directly as the original audio data. In the present embodiment, the audio file generally refers to a music file.
There are many ways to obtain the target data associated with the original image; one embodiment is as follows. First, feature information of a target subject in the original image is obtained from the original image. Then, target data associated with the target subject is obtained according to the feature information of the target subject.
For example, to obtain the shaver's monthly sales data in the scenario embodiment, the shaver body partial image may be obtained in advance; after the target subject is determined to be the shaver body, the monthly sales data can be obtained directly from a database containing data about each part of the shaver. Specifically, such a database may be organized by the feature information of each part of the shaver, with each part's related data stored under it. Thus, to obtain the shaver's monthly sales data, the feature information of the shaver body can be used directly as a search key in the database.
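The following Python sketch illustrates such a lookup with a small in-memory database; the keys, field names, and sales figures are all illustrative, not from the patent.

# Illustrative per-part database: subject type -> associated target data.
part_database = {
    "shaver_body": {"monthly_sales": {"Jul": 1200, "Aug": 1350, "Sep": 1500,
                                      "Oct": 1700, "Nov": 2600, "Dec": 2100}},
    "shaver_blade": {"blade_count": 3, "blade_material": "stainless steel"},
}

def target_data_for(subject_type: str) -> dict:
    """Retrieve the target data associated with a detected target subject."""
    return part_database.get(subject_type, {})

print(target_data_for("shaver_body")["monthly_sales"]["Nov"])  # 2600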
The feature information of the target subject may be any one, or a combination, of the following: the position information of the target subject in the original image, the size information of the target subject, or the type of the target subject. For example, after the shaver body partial image is obtained as described above, the feature information of the target subject may refer to the position coordinates of the shaver body portion in the image, its size in the image, and its type.
Step S102: obtaining the visualization of the target data according to the target data.
After the target data associated with the original image is obtained in step S101, the visualization of the target data is obtained according to the target data. In an embodiment of the present application, the type of visualization includes at least one of: a pie chart, a line chart, a bar chart, or a word-cloud statistical chart. Once the target data is converted into a visualization and the visualization is synthesized into the finally generated target video file, the meaning expressed by the target data can be displayed more intuitively.
For example, please refer to fig. 1-A, the schematic diagram of the shaver video file synthesized in the scenario embodiment above. The target video file presents five images of the shaver at different angles; one of them is the shaver body partial image, whose associated target data is the shaver's monthly sales data.
The visualization of the monthly sales data, obtained before the target video file, can be the line chart shown in fig. 1-B. The chart clearly shows the shaver's monthly sales for the half year from July to December, presenting the month-to-month variation of sales in visual form.
Besides the line chart, the visualization of the monthly sales data may also take the form of a pie chart, a bar chart, a word-cloud statistical chart, or the like, so that different visualizations of the data can be synthesized into the target video file for display.
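To illustrate step S102, the following Python sketch converts an illustrative monthly sales table into the kind of line chart described above; matplotlib is one possible charting backend, chosen here only for the example, and the sales figures are hypothetical.

import matplotlib.pyplot as plt

# Illustrative monthly sales for July through December (the example period).
months = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
sales = [1200, 1350, 1500, 1700, 2600, 2100]

fig, ax = plt.subplots(figsize=(3, 2))  # small, overlay-sized chart
ax.plot(months, sales, marker="o")
ax.set_xlabel("Month")                  # abscissa: months
ax.set_ylabel("Monthly sales")          # ordinate: units sold per month
fig.savefig("monthly_sales_chart.png", transparent=True, bbox_inches="tight")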
Step S103: obtaining the display position information of the visualization in the original image.
This step obtains the display position information of the visualization of the target data in the original image, after the original image and the visualization of the target data have been obtained. Since the original audio data can be obtained together with the original image in step S101, to serve as the background music of the target video file, the presentation time information of the visualization relative to the original audio data is obtained at the same time as the display position information.
After the original image, the original audio data, and the visualization of the target data associated with the original image are obtained, the resulting target video file must display the target data in visualized form without the visualization affecting the display of the target subject of the original image, and the original audio data must be presented in coordination with the plurality of original images. Therefore, the display position information of the visualization of the target data in the original image and the presentation time information of the visualization of the target data relative to the original audio data need to be obtained. In this embodiment these may be referred to simply as the display position information of the visualization in the original image and the presentation time information of the visualization relative to the original audio data.
One implementation of obtaining the display position information of the visualization of the target data in the original image is described below: the display position information is obtained according to the feature information of the target subject and the feature information of the reference object in the original image.
The feature information of the target subject was elaborated under step S101; when obtaining the display position information of the visualization in the original image, the feature information of the reference object in the original image is consulted as well. Specifically, the feature information of the reference object may refer to the position information of the reference object in the original image. For example, when obtaining the display position of the monthly sales visualization in the shaver body partial image, the upper edge of the shaver body portion can be taken as the reference object, and the reference object's position information in the original image obtained.
Specifically, the display position information may be obtained by inputting the feature information of the target subject, the feature information of the reference object, and the visualization of the target data into a network model for obtaining the display position information of the visualization in the original image, which outputs that display position information. The network model is constructed based on the following five principles; each principle is modeled quantitatively and its energy function is computed. Fig. 1-C gives example diagrams of display positions obtained in the original image according to the five principles.
First principle: the display position of the visualization of the target data in the original image is balanced against the position distribution of the target subject in the original image, i.e., it complies with the balance principle of fig. 1-C. The distribution balance may be lateral symmetric or longitudinal symmetric. This principle corresponds to the balance principle among the basic aesthetic principles of planar layout; it is modeled quantitatively, with the following model energy function.
[Equation image in the original: the balance energy function E_balance.]
where i indexes the i-th visualization in the original image and q the q-th target subject in the original image; x_i, y_i are the abscissa and ordinate of the visualization in the original image; w_i, h_i are its width and height; x_q, y_q are the abscissa and ordinate of the q-th target subject; w_q, h_q are its width and height; Q is the number of target subjects in the original image; W and H are the width and height of the original image; and ω is a weight coefficient of the model.
Second principle: the display position of the visualization of the target data in the original image is kept aligned with the position of the target subject in the original image, i.e., it complies with the alignment principle of fig. 1-C; a reference may be selected as the alignment criterion. The alignment modes include left, right, top, bottom, vertical-center, and horizontal-center alignment. This principle corresponds to the alignment principle among the basic aesthetic principles of planar layout; it is modeled quantitatively, with the following model energy function.
[Equation image in the original: the alignment energy function E_alignment.]
where v_i is the v-th visualization type in the i-th visualization, a is an alignment type, and C_a(v_i, q) is the alignment distance between the i-th visualization and the q-th target subject, defined as follows.
[Equation image in the original: the definition of the alignment distance C_a(v_i, q).]
Third principle: the display position of the visualization of the target data in the original image is kept close to the position of the reference object in the original image, i.e., it complies with the proximity principle of fig. 1-C. This corresponds to the proximity principle among the basic aesthetic principles of planar layout; it is modeled quantitatively, with the following model energy function.
[Equation image in the original: the proximity energy function E_proximity.]
where r denotes a reference object in the original image, and VD denotes the longitudinal relative distance between the visualization of the target data and the reference object, defined as follows:
[Equation image in the original: the definition of the longitudinal relative distance VD.]
HD represents the lateral relative distance between the visualization of the target data and the reference object, which is defined in the following way:
[Equation image in the original: the definition of the lateral relative distance HD.]
where x_i^v, y_i^v are the abscissa and ordinate, in the original image, of the v-th visualization type in the i-th visualization; x_r, y_r are the abscissa and ordinate of the reference object in the original image; w_i^v, h_i^v are the width and height, in the original image, of the v-th visualization type in the i-th visualization; and w_r, h_r are the width and height of the reference object in the original image.
Fourth principle: the display position of the visualization of the target data in the original image does not occlude the target subject in the original image, i.e., it complies with the readability principle of fig. 1-C. This non-occlusion principle corresponds to the readability principle among the basic aesthetic principles of planar layout; it is modeled quantitatively, with the following model energy function.
[Equation image in the original: the readability (non-occlusion) energy function E_readable.]
In a fifth principle, the display position of the visualization of the target data in the original image is kept consistent over time; that is, the display position in the current original image stays close to the position of the visualization of the target data in the previous original image (i.e. the previous visualization of the target data), complying with the coherence principle of fig. 1-C. This consistency-keeping principle corresponds to the coherence principle among the basic aesthetic principles of planar layout and is modeled quantitatively; the model energy function is:

E(coherent) = (xi − xpre)² + (yi − ypre)²

where xpre, ypre are the abscissa and ordinate of the visualization of the previous target data in the previous original image.
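This term is the only one whose formula survives the filing intact, so it can be transcribed directly; the function below is a worked example (names are assumptions):

```python
def coherent_energy(x_i, y_i, x_pre, y_pre):
    # E(coherent) = (xi - xpre)^2 + (yi - ypre)^2, exactly as given above.
    return (x_i - x_pre) ** 2 + (y_i - y_pre) ** 2

# A visualization that drifts from (100, 40) to (112, 46) between frames pays
# coherent_energy(112, 46, 100, 40) == 144 + 36 == 180.
```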
Based on these five principles, an energy function E for the display position information of the visualization of the target data in the original image is constructed:

E = ω1·E(balance) + ω2·E(alignment) + ω3·E(proximity) + ω4·E(readable) + ω5·E(coherent)

where ω1 to ω5 are the weights assigned to the respective models.
The display position information of the visualization of the target data in the original image is computed by minimizing the energy function E. Since all other parameters in E are constants, only the display position of the visualization in the original image affects the value of E; the position at which E attains its minimum is therefore taken as the display position information. It should be noted that this display position information may refer to the display position of the ith visualization in the original image, or, more specifically, to the display position of the vth visualization type in the ith visualization in the original image.
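Since the component energies above depend only on the candidate position once everything else is fixed, a straightforward (if brute-force) way to realize the minimization, sketched under assumed names and an assumed grid step, is to scan candidate positions and keep the one with the lowest weighted sum:

```python
# Hedged sketch: brute-force minimization of E over a grid of candidate
# positions. `energies` are five callables e(x, y) -> float standing in for
# E(balance) ... E(coherent); `weights` are omega_1 ... omega_5.
def best_position(image_w, image_h, vis_w, vis_h, energies, weights, step=16):
    best_e, best_xy = float("inf"), (0, 0)
    for x in range(0, image_w - vis_w + 1, step):
        for y in range(0, image_h - vis_h + 1, step):
            e = sum(w * f(x, y) for w, f in zip(weights, energies))
            if e < best_e:
                best_e, best_xy = e, (x, y)
    return best_xy
```

The patent itself leaves the optimizer unspecified; gradient-based or sampling-based solvers would serve equally well for a network-model formulation.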
In addition to the display position information of the visualization of the target data in the original image, it is also necessary to obtain the presentation time information of the visualization relative to the original audio data.

In one embodiment, the presentation time information of the visualization of the target data relative to the original audio data is obtained as follows. First, rhythm feature information of the original audio data is obtained from the original audio data. Then, the presentation time information of the visualization relative to the original audio data is obtained from the rhythm feature information.

Specifically, the rhythm feature information and the visualization of the target data may be used as input data of a network model for obtaining visualization presentation time information, and that model outputs the presentation time information of the visualization relative to the original audio data.

The network model for obtaining the presentation time information of the visualization of the target data is constructed on the principle that the presentation time information of the visualization relative to the original audio data is matched with the rhythm feature information.
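The patent does not say how the rhythm feature information is extracted from the raw audio. Purely as an illustration, a common way to obtain beat (rhythm) points in practice is librosa's beat tracker; the file name here is an assumption:

```python
import librosa

# Illustrative only: estimate rhythm points b_0 ... b_l (in seconds) from the
# original audio. The patent names no specific extraction method.
y, sr = librosa.load("original_audio.wav")
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
print("estimated tempo:", tempo, "- rhythm points:", len(beat_times))
```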
In order that the obtained video data keeps the audio data and the visualization of the target data synchronized in display, this embodiment keeps the transition times between the multiple original images (i.e. the video segments in which those images are displayed in the target video file) synchronized with the beat information of the audio data. The energy function on which the network model is constructed, matching the presentation time information of the visualization of the target data with the rhythm feature information of the original audio data, is specified as follows.
The presentation-time energy function is given in the original filing as formula image BDA0002377743380000161, in which the term shown as formula image BDA0002377743380000162 is the maximum number of matches between the rhythm of the video segments and the beat information of the audio data, defined by formula image BDA0002377743380000171, and the term shown as formula image BDA0002377743380000172 is the matching function between the video segments and the audio data, defined by formula image BDA0002377743380000173. Here T = [t1, …, tn], where tn is the presentation duration of each of the n video segments; Bl = [b0, b1, …, bl] denotes the rhythm points of the rhythm information in the rhythm feature information; the quantities shown as formula images BDA0002377743380000174 and BDA0002377743380000175 denote the start and end presentation times of each of the m video segments, related as shown in formula image BDA0002377743380000176; and n is less than or equal to m.
Specifically, in order to make full use of the rhythm points of the rhythm information in the rhythm feature information while avoiding modification of the presentation duration of the visualization, the boundary conditions shown as formula image BDA0002377743380000177 must be satisfied.

Because of differences in duration, the transition time points (formula image BDA0002377743380000178) and the music beats (formula image BDA0002377743380000179) cannot correspond exactly; the presentation time of a video segment is therefore allowed to be slightly lengthened or shortened to match a music beat point. The matching function is defined as formula image BDA00023777433800001710, where α is a threshold, 0.1 by default. In practice, one transition time point may satisfy the matching function together with several beat points; in this embodiment only one is taken, namely the beat point with the smallest value of the quantity shown as formula image BDA00023777433800001711.
For any m and l, the transition time points and the music beat points form two ascending sequences, so finding the largest number of coinciding transition and beat points is the same problem as finding the longest common subsequence, except that the 'common' elements are defined by the matching function rather than by equality. The longest common subsequence is a classical dynamic-programming problem; its state transition equation is given in the original filing as formula image BDA00023777433800001712. The corresponding points solved by this function are the matched transition time points and rhythm points (formula images BDA00023777433800001713 and BDA00023777433800001715), and the duration of each video segment carrying a visualization (a video segment has only one visualization) is updated as shown in formula image BDA00023777433800001714.

If the rhythm points are not dense enough, some transition points cannot be matched with any rhythm point. In that case, time can be allocated between two rhythm points with the presentation durations of the video segments as weights. For example, if there are p transition points within the interval shown as formula image BDA00023777433800001716, the allocation follows formula image BDA0002377743380000181.
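A hedged sketch of the dynamic program and the fallback allocation described above; the tolerance-based matching stands in for the patent's matching function (its exact form is a formula image), and α = 0.1 follows the text:

```python
# LCS-style DP over two ascending sequences, with equality replaced by a
# tolerance-based match. dp[i][j] = max matched pairs using the first i
# transition points and the first j beat points.
def matches(t, b, alpha=0.1):
    # Assumed reading of the matching function: a transition time t and a
    # beat point b match when they lie within alpha seconds of each other.
    return abs(t - b) <= alpha

def max_matching(transitions, beats, alpha=0.1):
    n, m = len(transitions), len(beats)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if matches(transitions[i - 1], beats[j - 1], alpha):
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

def allocate(beat_start, beat_end, durations):
    # Fallback for sparse rhythm points: split [beat_start, beat_end] among the
    # unmatched transitions, weighting by each segment's presentation duration.
    span, total = beat_end - beat_start, sum(durations)
    times, t = [], beat_start
    for d in durations:
        t += span * d / total
        times.append(t)
    return times
```

For example, max_matching([1.0, 2.1, 3.5], [0.95, 2.0, 3.0]) returns 2 with the default tolerance: the first two transitions snap to beats, and the third would be re-timed via allocate.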
By solving for the time information that minimizes this energy function, the presentation time information of the visualization of the target data relative to the original audio data is obtained.
Step S104: and synthesizing the original image and the visualization according to the display position information to obtain target video data.
After the original image, the visualization corresponding to the target data, and the display position information of the visualization in the original image have been obtained in steps S101-S103, the original image and the visualization are synthesized according to the display position information to obtain the target video data. Since audio data can serve as background music of the target video file, when audio data is present the original image, the original audio data and the visualization are synthesized according to both the display position information and the display time information to obtain the target video data.
When synthesizing the target video file, the original image, the original audio data and the visualization may first be rendered to obtain a rendering result; the three are then synthesized according to the rendering result, the display position information and the display time information to obtain the target video data.

In addition, when obtaining the target video data, the original image, the original audio data and the visualization may first be encoded, and the target video data is then obtained from the encoded image, audio data and visualization.
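The patent does not name a synthesis tool. Purely as an illustration of this compositing step, the sketch below overlays a rendered visualization onto a video at the computed position and muxes in the original audio with FFmpeg; every file name and the coordinates are assumptions:

```python
import subprocess

# Illustrative only: composite overlay.png (the rendered visualization) onto
# original_video.mp4 at position (x, y) and mux in original_audio.wav.
x, y = 640, 80  # display position obtained by minimizing the energy function E
subprocess.run([
    "ffmpeg", "-y",
    "-i", "original_video.mp4",
    "-i", "overlay.png",
    "-i", "original_audio.wav",
    "-filter_complex", f"[0:v][1:v]overlay={x}:{y}[v]",
    "-map", "[v]", "-map", "2:a",
    "-c:v", "libx264", "-c:a", "aac", "-shortest",
    "target_video.mp4",
], check=True)
```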
Please refer to fig. 1-D, a schematic view of another application scenario of the method for obtaining video data provided by the present application, which shows a specific scenario for obtaining target video data. First, the input original images, original audio data, video segment data and target data are obtained. The target data is preprocessed according to the method of the first embodiment; that is, the display position information of the visualization of the target data in the original image or video segment data, the display time information of the visualization relative to the original audio data or video segment data, and the visualization of the target data are obtained. The preprocessing results are then optimized, for example by reselecting the visualization type of the target data. Next, the optimized original image, original audio data, video segment data and visualization are rendered to obtain a rendering result. Finally, according to the rendering result, the original image, original audio data, video segment data and visualization are encoded to obtain the target video data.
According to the method for obtaining video data described above, the display position information in the original image of the visualization of the target data associated with that image is obtained, and the visualization is thereby bound to the original image; the obtained video data can therefore visually display the relationship between the target data and the original image, solving the problem that video files synthesized in the prior art cannot visually display the relationship between the relevant quantized data and the object. Meanwhile, because the display time information of the visualization relative to the original audio data is obtained, the obtained video data keeps the audio data and the visualization of the target data synchronized in display. In addition, the target video data obtained by synthesizing the original image, the original audio data and the visualization according to the display position information and the display time information realizes the visual embedding of the target data in the video.
In the first embodiment described above, a method for obtaining video data is provided, and correspondingly, the present application provides an apparatus for obtaining video data. Fig. 2 is a schematic diagram of an apparatus for obtaining video data according to a second embodiment of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
A second embodiment of the present application provides an apparatus for obtaining video data, the apparatus comprising:
a data obtaining unit 201 for obtaining an original image and target data associated with the original image;
a visualization obtaining unit 202, configured to obtain a visualization of the target data according to the target data;
a display information obtaining unit 203, configured to obtain display position information of the visualization in the original image;
a target video data obtaining unit 204, configured to perform synthesis processing on the original image and the visualization according to the display position information, so as to obtain target video data.
Optionally, the data obtaining unit is further configured to obtain original audio data;
the display information obtaining unit is further configured to obtain display time information of the visualization relative to the original audio data;
the target video data obtaining unit is specifically configured to:
and synthesizing the original image, the original audio data and the visualization according to the display position information and the display time information to obtain target video data.
Optionally, the apparatus further includes a feature information obtaining unit:
the characteristic information obtaining unit is used for obtaining the characteristic information of the target main body in the original image according to the original image;
the data obtaining unit is specifically configured to:
and obtaining target data associated with the target subject according to the characteristic information of the target subject.
Optionally, the display information obtaining unit is specifically configured to:
and obtaining the visualized display position information in the original image according to the characteristic information of the target subject and the characteristic information of the reference object in the original image.
Optionally, the display information obtaining unit is specifically configured to:
inputting the feature information of the target subject, the feature information of the reference object and the visualization into a network model for obtaining display position information of the visualization in an original image, and obtaining the display position information of the visualization in the original image.
Optionally, the apparatus further comprises a rhythm feature information obtaining unit;
the rhythm characteristic information obtaining unit is used for obtaining rhythm characteristic information of the original audio data according to the original audio data;
the display information obtaining unit is specifically configured to:
and obtaining the display time information of the visualization relative to the original audio data according to the rhythm characteristic information.
Optionally, the display information obtaining unit is specifically configured to:
and taking the rhythm characteristic information and the visualization as input data of a network model for obtaining visualized display time information, and obtaining the display time information of the visualization relative to the original audio data.
Optionally, the apparatus further includes:
the rendering unit is configured to render the original image, the original audio data, and the visualization to obtain a rendering result;
the target video data obtaining unit is specifically configured to: and synthesizing the original image, the original audio data and the visualization according to the rendering result, the display position information and the display time information to obtain target video data.
Optionally, the target video data obtaining unit is specifically configured to:
and according to the display position information and the display time information, encoding the original image, the original audio data and the visualization to obtain the target video data.
Optionally, the feature information of the target subject includes at least one of the following information: position information of the target subject in the original image, size information of the target subject, and a type of the target subject.
Optionally, the feature information of the reference object at least includes position information of the reference object in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is a network model constructed in a manner that the display position information of the visualization in the original image and the position information of the reference object in the original image are kept close to each other.
Optionally, the network model for obtaining the display position information of the visualization in the original image is a network model that is constructed based on the display position information of the visualization in the original image and the position information distribution of the target subject in the original image in a balanced manner.
Optionally, the network model for obtaining the display position information of the visualization in the original image is a network model constructed in a manner that the display position information of the visualization in the original image is aligned with the position information of the target subject in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is a network model constructed based on the display position information of the visualization in the original image without obstructing the position information of the target subject in the original image.
Optionally, the network model for obtaining the display position information of the visualization in the original image is a network model constructed based on the display position information of the visualization in the original image to maintain consistency.
Optionally, the network model for obtaining the visualized presentation time information is a network model constructed based on the visualization and matching the presentation time information relative to the original audio data with the rhythm feature information.
Optionally, the type information of the visualization includes at least one of the following information: pie charts, line charts, bar charts, word cloud statistical charts.
A first embodiment of the present application provides a method for obtaining video data, and a third embodiment of the present application provides an electronic device corresponding to the method of the first embodiment.
As shown in fig. 3, a schematic diagram of the electronic device of the present embodiment is shown.
The present embodiment provides an electronic device, including:
a processor 301;
a memory 302 for storing a computer program to be executed by a processor for performing a method of obtaining video data, said method comprising the steps of:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.
A first embodiment of the present application provides a method for obtaining video data, and a fourth embodiment of the present application provides a computer storage medium corresponding to the method of the first embodiment. The computer storage medium stores a computer program that is executed by a processor to perform the method for obtaining video data, the method comprising the following steps:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.

Although the present application has been described with reference to the preferred embodiments, these are not intended to limit the present application; those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application, and therefore the scope of the present application should be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.

Claims (21)

1. A method of obtaining video data, comprising:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.
2. The method of claim 1, further comprising:
obtaining original audio data;
obtaining presentation time information of the visualization relative to the original audio data;
the synthesizing the original image and the visualization according to the display position information to obtain target video data includes:
and synthesizing the original image, the original audio data and the visualization according to the display position information and the display time information to obtain target video data.
3. The method of claim 1, further comprising:
according to the original image, obtaining characteristic information of a target main body in the original image;
the obtaining target data associated with the original image comprises:
and obtaining target data associated with the target subject according to the characteristic information of the target subject.
4. The method according to claim 3, wherein the obtaining of the information of the display position of the visualization in the original image comprises:
and obtaining the visualized display position information in the original image according to the characteristic information of the target subject and the characteristic information of the reference object in the original image.
5. The method according to claim 4, wherein obtaining the display position information of the visualization in the original image according to the feature information of the target subject and the feature information of the reference object in the original image comprises:
inputting the feature information of the target subject, the feature information of the reference object and the visualization into a network model for obtaining display position information of the visualization in an original image, and obtaining the display position information of the visualization in the original image.
6. The method of claim 2, further comprising:
obtaining rhythm characteristic information of the original audio data according to the original audio data;
the obtaining presentation time information of the visualization relative to the raw audio data comprises:
and obtaining the display time information of the visualization relative to the original audio data according to the rhythm characteristic information.
7. The method of claim 6, wherein obtaining presentation time information of the visualization relative to the original audio data according to the rhythm feature information comprises:
and taking the rhythm characteristic information and the visualization as input data of a network model for obtaining visualized display time information, and obtaining the display time information of the visualization relative to the original audio data.
8. The method of claim 2, further comprising:
rendering the original image, the original audio data and the visualization to obtain a rendering result;
the synthesizing the original image, the original audio data and the visualization according to the display position information and the display time information to obtain target video data includes: and synthesizing the original image, the original audio data and the visualization according to the rendering result, the display position information and the display time information to obtain target video data.
9. The method according to claim 2, wherein the synthesizing the original image, the original audio data, and the visualization according to the presentation position information and the presentation time information to obtain target video data comprises:
and according to the display position information and the display time information, encoding the original image, the original audio data and the visualization to obtain the target video data.
10. The method according to any one of claims 3-5, wherein the characteristic information of the target subject includes at least one of the following information: position information of the target subject in the original image, size information of the target subject, and a type of the target subject.
11. The method according to any one of claims 4 to 5, wherein the feature information of the reference object includes at least position information of the reference object in the original image.
12. The method according to claim 11, wherein the network model for obtaining the display position information of the visualization in the original image is a network model constructed based on a manner that the display position information of the visualization in the original image is kept close to the position information of the reference object in the original image.
13. The method according to claim 10, wherein the network model for obtaining the presentation position information of the visualization in the original image is a network model constructed based on the distribution balance between the presentation position information of the visualization in the original image and the position information of the target subject in the original image.
14. The method according to claim 10, wherein the network model for obtaining the display position information of the visualization in the original image is a network model constructed based on a manner that the display position information of the visualization in the original image is aligned with the position information of the target subject in the original image.
15. The method according to claim 10, wherein the network model for obtaining the presentation position information of the visualization in the original image is a network model constructed based on the presentation position information of the visualization in the original image without obstructing the position information of the target subject in the original image.
16. The method according to claim 5, wherein the network model for obtaining the presentation position information of the visualization in the original image is a network model constructed based on the presentation position information of the visualization in the original image to maintain consistency.
17. The method according to claim 7, wherein the network model for obtaining the presentation time information of the visualization is a network model constructed based on matching the presentation time information of the visualization relative to the raw audio data with the rhythm feature information.
18. The method of claim 1, wherein the type information of the visualization comprises at least one of the following information: pie charts, line charts, bar charts, word cloud statistical charts.
19. An apparatus for obtaining video data, comprising:
a data obtaining unit for obtaining an original image and target data associated with the original image;
a visualization obtaining unit, configured to obtain a visualization of the target data according to the target data;
the display information obtaining unit is used for obtaining the display position information of the visualization in the original image;
and the target video data obtaining unit is used for synthesizing the original image and the visualization according to the display position information to obtain target video data.
20. An electronic device, comprising:
a processor;
a memory for storing a computer program for execution by the processor to perform a method of obtaining video data, the method comprising the steps of:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.
21. A computer storage medium storing a computer program for execution by a processor to perform a method of obtaining video data, the method comprising:
obtaining an original image and target data associated with the original image;
obtaining a visualization of the target data from the target data;
obtaining display position information of the visualization in the original image;
and synthesizing the original image and the visualization according to the display position information to obtain target video data.
CN202010072877.7A 2020-01-22 2020-01-22 Method and device for obtaining video data and electronic equipment Pending CN113163227A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010072877.7A CN113163227A (en) 2020-01-22 2020-01-22 Method and device for obtaining video data and electronic equipment

Publications (1)

Publication Number Publication Date
CN113163227A true CN113163227A (en) 2021-07-23

Family

ID=76882396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010072877.7A Pending CN113163227A (en) 2020-01-22 2020-01-22 Method and device for obtaining video data and electronic equipment

Country Status (1)

Country Link
CN (1) CN113163227A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103220500A (en) * 2013-03-20 2013-07-24 积成电子股份有限公司 Overlay display method of power grid equipment monitoring image and service analysis image
US20160182971A1 (en) * 2009-12-31 2016-06-23 Flickintel, Llc Method, system and computer program product for obtaining and displaying supplemental data about a displayed movie, show, event or video game
CN107066605A (en) * 2017-04-26 2017-08-18 国家电网公司 Facility information based on image recognition has access to methods of exhibiting automatically
CN107124624A (en) * 2017-04-21 2017-09-01 腾讯科技(深圳)有限公司 The method and apparatus of video data generation
CN110418078A (en) * 2019-06-13 2019-11-05 浙江大华技术股份有限公司 Video generation method, device, computer equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210723