CN112969043B - Media file generation and playing method and equipment - Google Patents


Info

Publication number
CN112969043B
Authority
CN
China
Prior art keywords
data
picture
producer
media file
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110463444.9A
Other languages
Chinese (zh)
Other versions
CN112969043A (en
Inventor
段君
李东朔
徐灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youmu Technology Co ltd
Original Assignee
Beijing Youmu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youmu Technology Co ltd
Priority to CN202110463444.9A
Publication of CN112969043A
Priority to PCT/CN2021/111384 (WO2022227329A1)
Application granted
Publication of CN112969043B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2387 Stream processing in response to a playback request from an end-user, e.g. for trick-play
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/433 Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334 Recording operations
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N21/47217 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks

Abstract

The invention provides a method and a device for generating and playing a media file. The generation method comprises: acquiring a picture sequence; recording audio data and, during recording, presenting the picture sequence to a producer through a display interface, acquiring picture switching data according to the producer's switching actions on the picture sequence, acquiring graffiti data according to the producer's graffiti actions in the display interface, and acquiring text adding data according to the producer's operations of adding text in the display interface, wherein the picture switching data, the graffiti data and the text adding data each comprise at least time information based on the recording process; and packaging the picture switching data, the graffiti data, the text adding data, the audio data and the picture sequence, and/or their address information, into a media file.

Description

Media file generation and playing method and equipment
Technical Field
The invention relates to the field of media file editing and playing, in particular to a media file generating and playing method and device.
Background
Traditional knowledge content, when organized digitally, usually takes the form of a video recording. For knowledge content, video is an unstructured form of organization. For example, the content of a class is typically stored in one or more video files lasting from several minutes to several hours, and courses are usually captured as video, that is, by recording.
A video file uses a general-purpose format that is not optimized for this kind of content, so it consumes a large amount of storage space while having low information density. Structured content cannot be extracted from a video file for independent use or for reprocessing and reuse: pictures, circled annotations, spoken explanation and so on are mixed together, which makes it inconvenient for computer programs to retrieve, query and process the knowledge content.
Disclosure of Invention
In view of the above, the present invention provides a method for generating a media file, including:
acquiring a picture sequence;
recording audio data and, during recording, presenting the picture sequence to a producer through a display interface, acquiring picture switching data according to the producer's switching actions on the picture sequence, acquiring graffiti data according to the producer's graffiti actions in the display interface, and acquiring text adding data according to the producer's operations of adding text in the display interface, wherein the picture switching data, the graffiti data and the text adding data each include at least time information based on the recording process;
and packaging the picture switching data, the graffiti data, the text adding data, the audio data and the picture sequence, and/or their address information, into a media file.
Optionally, after finishing recording the audio data, the method further includes:
playing back the audio data according to the operation of a producer;
presenting the picture sequence to the producer through the display interface during playback, acquiring picture switching data according to the producer's switching actions on the picture sequence, acquiring graffiti data according to the producer's graffiti actions in the display interface, and acquiring text adding data according to the producer's operations of adding text in the display interface, wherein the picture switching data, the graffiti data and the text adding data each include at least time information based on the playback process.
The invention also provides a media file playing method, which comprises the following steps:
acquiring a media file, wherein the media file comprises audio data, a picture sequence, picture switching data, graffiti data and text adding data, and/or their address information, and the picture switching data, the graffiti data and the text adding data each include at least time information based on the recording process of the audio data;
parsing the media file and playing the audio data;
and displaying the pictures in the picture sequence, the graffiti and the text in a display interface according to the time information during playback.
Optionally, the time information in the picture switching data includes a time point when a producer switches a picture in a recording process; the picture switching data further comprises the sequence number of the switched picture in the picture sequence.
Optionally, the graffiti data further includes position information of a graffiti track in the display interface, and the time information in the graffiti data includes appearance time and disappearance time of the graffiti track.
Optionally, the graffiti data further includes color information and/or width information of the graffiti track.
Optionally, in the process of obtaining the graffiti data, sparse sampling is performed on the position information of the graffiti track, so as to compress the data size of the position information.
Optionally, the time information of the text adding data includes the appearance time and disappearance time of the text; the text adding data also includes position information of the text in the display interface.
Optionally, the text adding data further includes color information and/or size information of the text.
Optionally, the media file includes a header file, an element index, frame index data, element data, frame data, and the audio data, wherein the header file includes version information, header length information, length information of the element index, length information of the frame index data, length information of the element data, and length information of the frame data; the element index includes position information of each piece of element data; the frame index data includes the playing time and position information corresponding to the frame data; the element data are the picture switching data, the graffiti data, the text adding data and the picture sequence; the frame data includes key frame data and change frame data, wherein key frame data is frame data established for the current picture when the current picture changes substantially relative to the previous frame, and change frame data contains the attributes that differ from the elements in the most recent preceding key frame.
Optionally, parsing the media file specifically includes:
acquiring the header length information to determine the length of the header file;
acquiring the header file to determine the length and starting position of each subsequent part;
acquiring the element index and the frame index data, and parsing the frame index data to determine the playing time corresponding to the subsequent key frame data and change frame data;
acquiring the frame data, parsing the element information it depends on and asynchronously loading the corresponding element data, playing the key frame data and the change frame data as time progresses, and playing the audio data.
Accordingly, the present invention provides a media file generating device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the processor, the instructions being executable by the at least one processor to cause the at least one processor to perform the media file generation method described above.
Accordingly, the present invention provides a media file playing device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the processor, and the instructions are executed by the at least one processor to cause the at least one processor to execute the media file playing method.
According to the media file generating and playing method and device provided by the invention, while audio data is being recorded, corresponding data are obtained from the producer's switching operations, graffiti actions and text adding operations on the displayed pictures, and each of these data includes time information based on the recording process, so that the producer's operations on the displayed content during recording are captured. While giving a spoken explanation, the producer can freely select material, browse pictures and add content at any time, and can edit and produce knowledge content in a what-you-see-is-what-you-get manner, as if editing a visual document, without shooting video. The resulting media file, which includes at most audio data, picture data and some data recording the producer's operations, is 5-10 times smaller in data size than a video file and makes it easy for computer programs to retrieve, query and process the knowledge content.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an application interface in an embodiment of the invention;
FIG. 2 is a schematic diagram of a producer adding graffiti content to a display interface according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a producer adding text content to a display interface according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The embodiment of the invention provides a media file generation method, which can be executed by electronic equipment such as a computer, a tablet computer, a smart phone, a server and the like, and comprises the following steps:
S1, acquiring the picture sequence. A producer may prepare a file containing any content, such as a slide file, and convert it into a sequence of pictures. In a preferred embodiment, the method supports importing a customized slide file, and a real-time transcoding queue transcodes each page into one picture; this processing can be completed on the device executing the method, or the file can be uploaded to a remote server for processing. To balance storage space and definition, the picture resolution can be preset to 720p, which gives a picture data size of about 200 kb.
In an alternative embodiment, a real-time document making function is provided: the producer selects a template from a plurality of preset templates and adds text content to it, and the template and text are then automatically converted into pictures. Specifically, text can be added through a text editor; as the number of input lines increases, the text size is automatically reduced so that all the text remains clearly displayed. Text input supports undo and redo, which makes it easy for the producer to adjust the content.
S2, recording audio data and, during recording, presenting the picture sequence to a producer through a display interface, acquiring picture switching data according to the producer's switching actions on the picture sequence, acquiring graffiti data according to the producer's graffiti actions in the display interface, and acquiring text adding data according to the producer's operations of adding text in the display interface, wherein the picture switching data, the graffiti data and the text adding data each include at least time information based on the recording process.
The data acquired while recording audio serve as descriptions of the producer's operations. Picture switching data, for example, describes that the producer switched the display to a certain picture in the picture sequence at a certain time in the recorded audio. Specifically, FIG. 1 shows an application interface based on the method. A picture from the picture sequence is displayed in a display interface 13, and a progress display column 11 shows the current recording progress of the audio data. Assume that the producer selects the 1st picture in a picture selection column 12 when recording starts, so that the first picture is displayed in the display interface 13. When the recording reaches a time point t and the producer clicks "picture 2" in the picture selection column 12, picture switching data including a switching action identifier and the time t is generated, and the second picture is displayed in the display interface 13. The producer need not follow the order of the sequence; for example, the producer may skip some pictures and display the nth picture at the time point t, so the generated data should also indicate which picture is to be displayed.
Regarding the graffiti data, the scheme allows the producer to draw marks such as circles, points and lines in the display interface; these are the producer's graffiti operations on the picture content currently displayed. Assuming that recording has reached the time point t, the producer clicks the add graffiti button 14 and draws graffiti content 21, such as a line, in the interface shown in FIG. 2, thereby generating graffiti data describing the content, position, time and so on of the graffiti.
Regarding the text adding data, the scheme allows the producer to add text content in the display interface, such as an annotation on the picture content currently displayed. Assuming that recording has reached the time point t, the producer clicks the add text button 15 and adds text content 22 in the interface shown in FIG. 3, thereby generating text adding data describing the content, position, time and so on of the added text.
While recording audio, the producer may click a pause button in the interface to pause the recording and adjust previous actions. To modify an action, the producer selects a time point in the recorded audio and then deletes the corresponding picture switching action, graffiti or added text, or re-edits that content.
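The recording and editing behaviour described above can be pictured as a log of timestamped actions. The following is a minimal sketch only: the class name `ActionLog`, its methods and the `kind` field are hypothetical and are not taken from the patent.

```python
class ActionLog:
    """Timestamped producer actions collected while audio is recorded."""

    def __init__(self):
        self.actions = []

    def record(self, kind, t_ms, **fields):
        # kind is a hypothetical label such as "switch", "graffiti" or "text";
        # t_ms is the time point in the audio recording, in milliseconds.
        action = {"kind": kind, "t": t_ms, **fields}
        self.actions.append(action)
        return action

    def delete(self, kind, t_ms):
        """Drop actions of one kind at a time point; return how many were removed."""
        before = len(self.actions)
        self.actions = [
            a for a in self.actions
            if not (a["kind"] == kind and a["t"] == t_ms)
        ]
        return before - len(self.actions)


log = ActionLog()
log.record("switch", 0, idx=1)
log.record("switch", 3529, idx=2)
log.record("text", 3529, v="note")
removed = log.delete("switch", 3529)  # the producer removes one switch action
```

Re-editing an action could be modelled in the same way, as a delete followed by a new record at the same time point.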
S3, packaging the picture switching data, the graffiti data, the text adding data, the audio data and the picture sequence, and/or their address information, into a media file. After the audio recording is finished, the data describing the various actions recorded during the recording process, the picture sequence and the audio data are packaged into a media file.
In an alternative embodiment, the producer can add new actions to completed content. Specifically, the producer may choose to play back the audio data. During playback, the display interface presents the pictures and reproduces the actions the producer performed at each time point. As in the recording process, the producer can switch pictures, draw graffiti and add text at any time point, thereby obtaining corresponding data, each of which includes at least time information based on the playback progress.
The audio data, the pictures, the picture switching, the graffiti and the text are each treated as independent elements, and each element corresponds to different fields. The field corresponding to the audio data is the storage address of the audio data, which can be a local hard disk address or an address on a website (server). The picture sequence is stored as a form file, which includes the address information and serial number of each picture as well as a thumbnail for each picture, used for switching and indexing. The picture switching data, the graffiti data and the text adding data are likewise independent elements, each containing the fields that describe the corresponding actions.
As for the content of the media file, it may contain only the internet storage addresses of the above elements rather than the element content itself. In an optional embodiment, the elements are generated locally; after the audio data is recorded, the picture switching data, the graffiti data, the text adding data, the audio data and the picture sequence are uploaded to a remote storage device to obtain their storage addresses, that is, addresses on the internet, and only this address information is packaged into the media file, which minimizes the data volume of the file. When such a media file is played, the actual content of each element is downloaded via its address information.
According to the media file generation method provided by the embodiment of the invention, while audio data is being recorded, corresponding data are obtained from the producer's switching operations, graffiti actions and text adding operations on the displayed pictures, and each of these data includes time information based on the recording process, so that the producer's operations on the displayed content during recording are captured. While giving a spoken explanation, the producer can freely select material, browse pictures and add content at any time, and can edit and produce knowledge content in a what-you-see-is-what-you-get manner, as if editing a visual document, without shooting video. The resulting media file, which includes at most audio data, picture data and some data recording the producer's operations, is 5-10 times smaller in data size than a video file and makes it easy for computer programs to retrieve, query and process the knowledge content.
In a preferred embodiment, the picture switching data specifically includes the time point at which the producer switches pictures during recording and the serial number of the switched-to picture in the picture sequence. For example, the picture switching data {t:0, idx:5} indicates that the 5th picture in the picture sequence is displayed at the start (0 ms) of the audio recording; the picture switching data {t:3529, idx:1} indicates that the 1st picture in the picture sequence is displayed at 3529 ms of the audio recording. Such picture switching data lets the producer jump to any picture.
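To illustrate how a player might use such records, the sketch below looks up which picture should be on screen at a given playing time. It assumes that t is in milliseconds and that a switch stays in effect until the next one; the function name `picture_at` and the third record are hypothetical.

```python
from bisect import bisect_right

# Picture switching records in the form used by the examples above:
# t is the switch time in milliseconds, idx the serial number of the picture.
switches = [
    {"t": 0, "idx": 5},
    {"t": 3529, "idx": 1},
    {"t": 9000, "idx": 2},  # hypothetical extra record for illustration
]


def picture_at(switches, t_ms):
    """Return the idx shown at t_ms, or None before the first switch."""
    ordered = sorted(switches, key=lambda s: s["t"])
    times = [s["t"] for s in ordered]
    pos = bisect_right(times, t_ms)  # number of switches at or before t_ms
    if pos == 0:
        return None
    return ordered[pos - 1]["idx"]
```

Because the lookup is a binary search over the switch times, jumping to an arbitrary time point costs no more than normal playback.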
The graffiti data includes position information of the graffiti track in the display interface and color information and/or width information of the track, and its time information includes the appearance time and disappearance time of the graffiti track. An example is {c:3, w:3, s:17275, d:2229, p:[(36.97, 15.89), (37.09, 16.73), …]}, where c represents color, w represents width, s represents the start time (milliseconds), d represents the end time (milliseconds), and p represents the coordinate positions of the points making up the graffiti track. In this example d is smaller than s, which suggests that d is stored relative to s, as a duration, rather than as an absolute time.
To further reduce the data volume of the media file, the position information is recorded with sparse sampling, for example by keeping 1 of every 5 points in the graffiti track; in addition, the number of significant digits is reduced and the points are stored as relative coordinates, which compresses the position information.
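The three ideas mentioned here (sparse sampling, fewer digits, relative coordinates) can be combined as in the sketch below. This is an illustration under assumptions, not the patented encoding: the function names, the default of two decimal places and the choice to always keep the final point are all hypothetical.

```python
def compress_track(points, step=5, digits=2):
    """Keep every `step`-th point, round it, and store deltas after the first."""
    if not points:
        return []
    kept = list(points[::step])
    if (len(points) - 1) % step != 0:
        kept.append(points[-1])  # assumption: always keep the end of the stroke
    rounded = [(round(x, digits), round(y, digits)) for x, y in kept]
    out = [rounded[0]]  # the first point stays absolute
    for (px, py), (x, y) in zip(rounded, rounded[1:]):
        out.append((round(x - px, digits), round(y - py, digits)))
    return out


def restore_track(compressed, digits=2):
    """Rebuild absolute coordinates from the first point plus the deltas."""
    if not compressed:
        return []
    pts = [compressed[0]]
    for dx, dy in compressed[1:]:
        x, y = pts[-1]
        pts.append((round(x + dx, digits), round(y + dy, digits)))
    return pts


track = [(float(i), float(2 * i)) for i in range(12)]
packed = compress_track(track)
unpacked = restore_track(packed)
```

Deltas between neighbouring points are usually small numbers, so they serialize to fewer characters than absolute coordinates.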
The text adding data includes position information of the text in the display interface and color information and size information of the text, and its time information includes the appearance time and disappearance time of the text. An example is {"v":"xxx", "c":3, "x":0, "y":0, "s":1, "te":49040, "tl":55542}, where v denotes the text content, c denotes color, x and y denote relative coordinates, s denotes the magnification (text size), te denotes the appearance time, and tl denotes the disappearance time.
Regarding the packaging format (also referred to as a packaging container) of the media file described above, the present invention provides a preferred packaging structure. Specifically, in addition to the audio and video streams, a media file needs to include some auxiliary information and information about how the audio and video are organized. The package container in this embodiment includes six parts:
The first part is a header file including version information (1 byte), format information (2 bytes), header length information (8 bytes), length information of the element index (8 bytes), length information of the frame index data (8 bytes), length information of the element data (8 bytes), and length information of the frame data (8 bytes). The element data refers to the picture switching data, the graffiti data, the text adding data and the picture sequence.
The second part is the element index. Each element index entry is 8 bytes and includes element type information and the position of the corresponding element data in the media file.
The third part is frame index data, which includes the playing time corresponding to each piece of frame data and its position in the media file. The frame index data handles frames whose length is not fixed or not known in advance, makes seek playback possible, and serves as the basis for determining whether the stream can be played.
The fourth part is element data, i.e., the above-described picture switching data, graffiti data, text adding data and picture sequence.
The fifth part is frame data, which comprises key frame data and change frame data. Key frame data is frame data established for the current picture when it changes substantially relative to the previous frame, for example when the producer switches pictures; it includes all attributes of all elements displayed on the current interface. For example, if the interface displayed at 300 ms includes a new picture a and text adding data b, the corresponding key frame data can be represented as
300: {
a: picture index id, picture data, background color, initial width and height, displayed or hidden state;
b: text font, text color, text size, position of the text relative to the width and height of the current interface, displayed or hidden state
}
Change frame data records, relative to the elements in the most recent key frame, only the attributes that differ, such as a change in a picture's address information or width. For example, if in the interface displayed at 310 ms only an attribute of the text adding data b (for example, the display position of the text) has changed relative to 300 ms, the corresponding change frame data can be represented as
310: {
b: new position (top, left)
}
The sixth part is audio data, which may be the binary data of an audio file in a format such as MP3 or AAC, appended directly after the frame data.
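The header fields listed for the first part add up to 1 + 2 + 5 × 8 = 43 bytes. The sketch below packs such a header with Python's `struct` module and derives where each later part starts. Several points are assumptions rather than statements from the patent: big-endian byte order, unsigned integer fields, the header length field holding the size of the header itself, and the six parts being stored back to back in the order listed.

```python
import struct

# version (1 byte), format (2 bytes), then five 8-byte lengths.
# ">" selects big-endian with no padding; the byte order is an assumption.
HEADER = struct.Struct(">BH5Q")


def pack_header(version, fmt, element_index_len, frame_index_len,
                element_data_len, frame_data_len):
    """Build the 43-byte header; the header length field is its own size."""
    return HEADER.pack(version, fmt, HEADER.size, element_index_len,
                       frame_index_len, element_data_len, frame_data_len)


def section_offsets(header_bytes):
    """Map each later part to (start, length), assuming the parts are contiguous."""
    _, _, header_len, ei, fi, ed, fd = HEADER.unpack(header_bytes[:HEADER.size])
    offsets = {}
    pos = header_len
    for name, length in (("element_index", ei), ("frame_index", fi),
                         ("element_data", ed), ("frame_data", fd)):
        offsets[name] = (pos, length)
        pos += length
    offsets["audio"] = (pos, None)  # audio runs to the end of the file
    return offsets


header = pack_header(1, 0, 16, 24, 100, 50)
layout = section_offsets(header)
```

Storing every length in the header is what allows a player to compute the start of any part without scanning the file.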
The invention also provides a media file playing method which is used for playing the media file generated according to the method. The method can be executed by electronic equipment such as a computer, a tablet computer, a smart phone and the like, and comprises the following steps:
acquiring a media file, wherein the media file comprises audio data, a picture sequence, picture switching data, graffiti data and text adding data, and/or their address information, and these data include at least time information based on the audio data recording process;
parsing the media file and playing the audio data;
and displaying the pictures in the picture sequence, the graffiti and the text in the display interface according to the time information during playback.
For the viewer, the experience is the same as watching an ordinary video, but the playback mechanism is completely different from that of an ordinary video file. The audio data in the media file is played continuously; for the current playing time t, the picture switching data, graffiti data and text adding data corresponding to t are obtained and rendered in layers, with each kind of content drawn in its own layer of the display interface according to the attribute information in the data.
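A minimal sketch of the per-time lookup behind layered rendering is given below, using the field names from the data examples in this description. It makes three assumptions that the text does not settle: the graffiti field d is treated as a duration counted from s, the layers are ordered picture, then graffiti, then text, and the function name `visible_at` is hypothetical.

```python
def visible_at(t_ms, switches, graffiti, texts):
    """Return what should be drawn at t_ms, bottom layer first."""
    shown = [s for s in switches if s["t"] <= t_ms]
    picture = max(shown, key=lambda s: s["t"])["idx"] if shown else None
    # Assumption: d is a duration after s, so a stroke is live in [s, s + d).
    strokes = [g for g in graffiti if g["s"] <= t_ms < g["s"] + g["d"]]
    # te and tl are the appearance and disappearance times of a text item.
    labels = [x for x in texts if x["te"] <= t_ms < x["tl"]]
    return [("picture", picture), ("graffiti", strokes), ("text", labels)]


switches = [{"t": 0, "idx": 5}, {"t": 3529, "idx": 1}]
graffiti = [{"c": 3, "w": 3, "s": 17275, "d": 2229,
             "p": [(36.97, 15.89), (37.09, 16.73)]}]
texts = [{"v": "xxx", "c": 3, "x": 0, "y": 0, "s": 1,
          "te": 49040, "tl": 55542}]

layers = visible_at(18000, switches, graffiti, texts)
```

A player would call such a function from its audio progress callback and hand each layer to its own drawing surface; if d turns out to be an absolute time, the stroke test becomes a direct comparison against d.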
In addition, the entire picture sequence and all picture switching data, graffiti data and text adding data can be presented to the viewer. The viewer can select any item, such as a particular piece of added text or a particular picture in the picture sequence, to obtain the corresponding audio time point and thus quickly locate the content of interest. With the preferred packaging format, the length of the header file can be determined from the header length information during playback, and once the element index data and frame index data have been read, playback can start after loading only a few frames of data. When playback reaches a given time point, the interface content at that time can be restored directly by comparing the current state with the initial state. Even if the viewer skips to another position, the positions and times of the preceding and following frames can be calculated, so playback can continue from the target time point by loading only the corresponding data.
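One way to locate the frames needed after a skip is a binary search over the frame index. The sketch below assumes that the index is sorted by playing time and that each change frame is a difference against the most recent key frame rather than against the previous change frame; the `FrameEntry` fields are hypothetical.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional, Tuple


class FrameEntry(NamedTuple):
    time_ms: int   # playing time of the frame
    is_key: bool   # True for a key frame, False for a change frame
    offset: int    # byte position of the frame inside the frame data part
    length: int    # byte length of the frame


def frames_for_seek(
    index: List[FrameEntry], t_ms: int
) -> Tuple[Optional[FrameEntry], Optional[FrameEntry]]:
    """Return (key frame, change frame) needed to rebuild the display at t_ms."""
    times = [entry.time_ms for entry in index]
    pos = bisect_right(times, t_ms) - 1  # last entry with time_ms <= t_ms
    if pos < 0:
        return None, None                # t_ms lies before the first frame
    latest = index[pos]
    if latest.is_key:
        return latest, None
    k = pos
    while k >= 0 and not index[k].is_key:  # walk back to the governing key frame
        k -= 1
    return (index[k] if k >= 0 else None), latest


# Hypothetical frame index with two key frames and two change frames.
index = [
    FrameEntry(0, True, 0, 100),
    FrameEntry(310, False, 100, 20),
    FrameEntry(500, False, 120, 20),
    FrameEntry(900, True, 140, 100),
]
key, change = frames_for_seek(index, 600)
```

Only the byte ranges of the two returned entries then need to be loaded before rendering resumes.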
The following describes a method for parsing and playing a media file that contains the six parts described above:
S1, downloading the first few bytes of the media file to determine the length of the header file.
S2, downloading the complete header file to determine the length and starting position of each subsequent part.
S3, downloading the complete element index data and frame index data, and parsing the frame index data to determine the playing time corresponding to each piece of key frame data and change frame data.
S4, downloading the frame data, parsing the elements that each frame depends on, and asynchronously loading the corresponding element data. Because the element data may be large, it is not loaded in advance. When playback reaches a given time point, the position of the required element data is looked up in the element index data and the element data is loaded asynchronously, which shortens the download time needed before playback can start.
S5, playing the key frames and change frames as time progresses. When a key frame is encountered, it only needs to be parsed and drawn according to its element attributes; when a change frame is encountered, its data is overlaid once on the most recent key frame and merged, after which the current change frame can be drawn. Playback continues in this way until the end.
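Steps S1 and S2 can be illustrated with a small header parser. The byte layout used here (little-endian, a 16-bit version and a 16-bit header length followed by four 32-bit part lengths) is purely an assumption for the sketch; the description states which fields the header contains but not how they are encoded.

```python
import struct

# Assumed layout: version and header length come first (read in step S1),
# followed by the lengths of the four parts that precede the audio data.
PREFIX = struct.Struct("<HH")
LENGTHS = struct.Struct("<4I")
PART_NAMES = ("element_index", "frame_index", "element_data", "frame_data")


def header_length(first_bytes: bytes) -> int:
    """S1: read the header length from the first few bytes of the file."""
    _version, length = PREFIX.unpack_from(first_bytes, 0)
    return length


def section_ranges(header: bytes) -> dict:
    """S2: compute the [start, end) byte range of each part that follows the header."""
    _version, start = PREFIX.unpack_from(header, 0)
    lengths = LENGTHS.unpack_from(header, PREFIX.size)
    ranges = {}
    for name, length in zip(PART_NAMES, lengths):
        ranges[name] = (start, start + length)
        start += length
    ranges["audio"] = (start, None)  # the audio data runs to the end of the file
    return ranges


# Hypothetical header: version 1, 20-byte header, then the four part lengths.
header = PREFIX.pack(1, 20) + LENGTHS.pack(10, 30, 1000, 200)
ranges = section_ranges(header)
```

With these ranges known, the element index and frame index can be fetched in full (S3), while element data is fetched piece by piece only when a frame needs it (S4).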
In addition, the audio data can use a common audio format such as mp3, which is supported by the platform's built-in decoder, and the mp3 data can be loaded for playback through range requests.
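A range request of this kind could be built as shown below, using the standard HTTP `Range` header. The URL and the byte offsets are placeholders, and the helper `range_request` is a hypothetical name; the server must support range requests and answer with a partial response for this to work.

```python
from typing import Optional
from urllib.request import Request


def range_request(url: str, start: int, end: Optional[int] = None) -> Request:
    """Build a GET request for bytes [start, end) of url; end=None means up to the end of the file."""
    # HTTP byte ranges are inclusive, so the last byte requested is end - 1.
    last = "" if end is None else str(end - 1)
    return Request(url, headers={"Range": f"bytes={start}-{last}"})


# Hypothetical example: request the audio part, assumed to start at byte 1260.
audio_request = range_request("https://example.com/lesson.media", 1260)
```

Passing the request to `urllib.request.urlopen` would then return only the requested bytes, which can be handed to the platform's audio decoder.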
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Obvious variations or modifications derived therefrom remain within the scope of the invention.

Claims (11)

1. A method for generating a media file, comprising:
acquiring a picture sequence;
recording audio data, presenting the picture sequence to a producer through a display interface in the recording process, acquiring picture switching data according to the switching action of the producer on the picture sequence, acquiring graffiti data according to the graffiti action of the producer aiming at the presented picture content in the display interface, and acquiring character adding data according to the operation of the producer aiming at adding characters to the presented picture content in the display interface, wherein the picture switching data, the graffiti data and the character adding data respectively at least comprise time information based on a recording process;
packaging the picture switching data, the graffiti data, the text adding data, the audio data, the picture sequence and/or the address information thereof into a media file, wherein the media file comprises a header file, an element index, frame index data, element data, frame data and the audio data, wherein:
the header file comprises version information, header length information, length information of the element index, length information of the frame index data, length information of the element data and length information of the frame data;
the element index includes position information of each element data;
the frame index data comprises playing time and position information corresponding to the frame data;
the element data are the picture switching data, the doodle data, the character adding data and the picture sequence;
the frame data comprises key frame data and change frame data, wherein the key frame data refers to frame data established for a current picture when the current picture has larger change relative to the previous frame, specifically frame data established for a picture to be displayed currently when a producer switches pictures, and the key frame data comprises attributes of all the element data in the current picture; the change frame data refers to a difference attribute relative to the element data in the last key frame data, that is, only the difference attribute is set in the change frame data, and the difference attribute includes a position change attribute in the text-adding data and a width change attribute of the picture sequence.
2. The method of claim 1, further comprising, after completing recording the audio data:
playing back the audio data according to the operation of a producer;
presenting the picture sequence to the producer through a display interface during playback, acquiring picture switching data according to the switching action of the producer on the picture sequence, acquiring graffiti data according to the graffiti action of the producer in the display interface, and acquiring character adding data according to the operation of the producer adding characters in the display interface, wherein the picture switching data, the graffiti data and the character adding data each comprise at least time information based on the playback process.
3. The method according to claim 1 or 2, wherein the time information in the picture switching data includes a time point when a producer switches a picture in a recording process; the picture switching data further comprises the sequence number of the switched picture in the picture sequence.
4. The method of claim 1 or 2, wherein the graffiti data further comprises location information of a graffiti track in the display interface, and the time information in the graffiti data comprises an appearance time and a disappearance time of the graffiti track.
5. The method of claim 4, wherein the graffiti data further comprises color information and/or width information of the graffiti track.
6. The method of claim 4, wherein during the process of obtaining the graffiti data, sparse sampling is performed on the position information of the graffiti track for compressing the data volume of the position information.
7. The method according to claim 1 or 2, wherein the time information of the character adding data includes the appearance time and disappearance time of the characters; the character adding data also comprises position information of the characters in the display interface.
8. The method of claim 7, wherein the text-adding data further comprises color information and/or size information of the text.
9. A method for playing a media file, comprising:
acquiring a media file generated by the media file generation method according to any one of claims 1 to 8;
and analyzing and playing the media file, asynchronously loading corresponding element data by analyzing the element index data, playing key frame data and change frame data along with the time progress, and playing audio data.
10. A media file generation device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the media file generation method of any of claims 1-8.
11. A media file playback apparatus, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the media file playback method of claim 9.
CN202110463444.9A 2021-04-28 2021-04-28 Media file generation and playing method and equipment Active CN112969043B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110463444.9A CN112969043B (en) 2021-04-28 2021-04-28 Media file generation and playing method and equipment
PCT/CN2021/111384 WO2022227329A1 (en) 2021-04-28 2021-08-09 Media file generation method and device, and media file playback method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110463444.9A CN112969043B (en) 2021-04-28 2021-04-28 Media file generation and playing method and equipment

Publications (2)

Publication Number Publication Date
CN112969043A CN112969043A (en) 2021-06-15
CN112969043B true CN112969043B (en) 2021-08-24

Family

ID=76281341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110463444.9A Active CN112969043B (en) 2021-04-28 2021-04-28 Media file generation and playing method and equipment

Country Status (2)

Country Link
CN (1) CN112969043B (en)
WO (1) WO2022227329A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112969043B (en) * 2021-04-28 2021-08-24 北京优幕科技有限责任公司 Media file generation and playing method and equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5613057A (en) * 1994-01-14 1997-03-18 International Business Machines Corporation Method for creating a multimedia application using multimedia files stored in directories that are characteristics of display surface areas
CN102902709A (en) * 2012-08-02 2013-01-30 何建亿 Space allocation fixing file memory system and implementation method
CN104575547A (en) * 2013-10-17 2015-04-29 深圳市云帆世纪科技有限公司 Multi-media file making method, as well as multi-media file playing method and system
CN104952471A (en) * 2015-06-16 2015-09-30 深圳新创客电子科技有限公司 Method, device and equipment for synthesizing media file
CN105049920A (en) * 2015-07-27 2015-11-11 青岛海信移动通信技术股份有限公司 Method and device for recording multimedia files
CN107302715A (en) * 2017-08-10 2017-10-27 北京元心科技有限公司 Multimedia file playing method, multimedia file packaging method, corresponding device and terminal
CN107622118A (en) * 2017-09-20 2018-01-23 互联天下科技发展(深圳)有限公司 The making application method of full media electronic teaching material
CN112395835A (en) * 2019-07-31 2021-02-23 上海狼道信息科技有限公司 Multimedia notepad implementation method and system

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100353450C (en) * 2003-01-10 2007-12-05 华为技术有限公司 Processing method of multi-media data
CN100573419C (en) * 2003-12-19 2009-12-23 英特尔公司 With printing material and the related method and system of response that produces by computer system
CN105450944A (en) * 2015-11-13 2016-03-30 北京自由坊科技有限责任公司 Method and device for synchronously recording and reproducing slides and live presentation speech
CN106802896A (en) * 2015-11-26 2017-06-06 沈阳东软睿道教育服务有限公司 The making of micro- class, playing method and device and learning platform based on mobile terminal
US20190132650A1 (en) * 2017-10-27 2019-05-02 Facebook, Inc. Providing a slide show in a live video broadcast
CN109348156B (en) * 2018-11-29 2020-07-17 广州视源电子科技股份有限公司 Courseware recording and playing method and device, intelligent interactive panel and storage medium
CN109782986A (en) * 2018-12-14 2019-05-21 浙江学海教育科技有限公司 A kind of production method of teaching courseware, storage medium and application system
CN110673777A (en) * 2019-08-28 2020-01-10 北京大米科技有限公司 Online teaching method and device, storage medium and terminal equipment
CN112738617A (en) * 2020-12-28 2021-04-30 慧科教育科技集团有限公司 Audio slide recording and playing method and system
CN112969043B (en) * 2021-04-28 2021-08-24 北京优幕科技有限责任公司 Media file generation and playing method and equipment

Also Published As

Publication number Publication date
CN112969043A (en) 2021-06-15
WO2022227329A1 (en) 2022-11-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant