CN112040142B - Method for video authoring on mobile terminal - Google Patents

Method for video authoring on mobile terminal

Info

Publication number
CN112040142B
CN112040142B (application CN202010650204.5A)
Authority
CN
China
Prior art keywords
text
video
image material
list
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010650204.5A
Other languages
Chinese (zh)
Other versions
CN112040142A (en)
Inventor
翟佳璐
李嘉良
李大任
曹顺达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhizhe Sihai Beijing Technology Co ltd
Original Assignee
Zhizhe Sihai Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhizhe Sihai Beijing Technology Co ltd filed Critical Zhizhe Sihai Beijing Technology Co ltd
Priority to CN202010650204.5A priority Critical patent/CN112040142B/en
Publication of CN112040142A publication Critical patent/CN112040142A/en
Application granted granted Critical
Publication of CN112040142B publication Critical patent/CN112040142B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/189Automatic justification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2628Alteration of picture size, shape, position or orientation, e.g. zooming, rotation, rolling, perspective, translation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/278Subtitling

Abstract

The invention relates to a method for video authoring on a mobile terminal, comprising: displaying an icon list related to image material in a first area of the screen of the mobile terminal and displaying a text list in a second area of the screen; adjusting at least one of the icons in the icon list and the texts in the text list to set the correspondence between the image materials and the texts; and generating a video including the text as subtitle information based on the set correspondence between the image material and the text. According to embodiments of the invention, the display space of the mobile terminal screen can be fully utilized, and a user can conveniently browse and edit the complete image and text materials, set their correspondence, and author video.

Description

Method for video authoring on mobile terminal
Technical Field
The present invention relates to the field of mobile terminals, and in particular, to a method, an apparatus, an electronic device, and a computer readable storage medium for video authoring on a mobile terminal.
Background
With the rapid development of video platforms, video has become a dominant media form. Accordingly, video authoring tools have become important production tools for self-media creators of all kinds. Existing video authoring tools arrange information horizontally and mainly provide professional image-material editing functions.
With the rapid development of the mobile internet and video technologies, more users wish to compose video content conveniently on mobile terminals. However, existing video authoring tools impose high learning costs on them. Meanwhile, a narration-style video format is emerging in which the video picture is not the main element; instead, a text manuscript is the main element, matched with pictures or animations of the corresponding subject to form video content with subtitles. Since the main body of such content is text, existing video authoring tools cannot provide convenient text editing and picture-matching functions.
Disclosure of Invention
In view of this, to solve the problems that existing video authoring tools are unsuitable for the mobile environment, inefficient, and error-prone in operation, the invention provides a tool for conveniently authoring video on a mobile terminal. With this tool, the display space of the mobile terminal screen can be fully utilized, and a user can conveniently browse and edit the complete image and text materials, set their correspondence, and author video; for example, graphic knowledge-class content can be quickly and efficiently converted into high-quality video.
According to a first aspect of the present invention, there is provided a method for video authoring on a mobile terminal, comprising: displaying an icon list related to image material in a first area of a screen of the mobile terminal and displaying a text list in a second area of the screen; adjusting at least one of the icons in the icon list and the texts in the text list to set the correspondence between the image materials and the texts; and generating a video including the text as subtitle information based on the set correspondence between the image material and the text.
In one possible embodiment, the first area and the second area may be located at left and right sides of the screen, and the icons in the icon list and the text in the text list are displayed in a top-down layout.
In one possible embodiment, setting the correspondence between the image material and the text may include: setting the correspondence based on the horizontal positional relationship between the icon of the image material and the text.
In one possible embodiment, the icon may have a control for resizing, and the method may further comprise: adjusting the size of the icon through the control to set the correspondence between the image material and the text.
In one possible embodiment, the method may further comprise: moving the positions of the icons and/or the texts to set the correspondence between the image materials and the texts.
In one possible embodiment, the method may further comprise: selecting the icon and/or the text and editing the corresponding image material and text.
In one possible embodiment, generating the video may specifically include: providing speech associated with the text; calibrating the time axis based on the time axis of the speech and the time axis of the image material, using the content with the longer duration as the reference; and generating a video including the speech and the image material.
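The calibration rule just described — when the narration speech and the image material disagree in duration, the longer of the two governs the segment length — can be sketched as follows. This is an illustrative sketch, not the patent's implementation; all function and variable names are assumptions.

```python
def calibrate_segment(audio_duration: float, image_duration: float) -> float:
    """Return the segment duration on the shared timeline.

    Per the rule above: the longer of the narration audio and the
    image material governs, so neither track is truncated.
    """
    return max(audio_duration, image_duration)


def calibrate_timeline(segments):
    """Lay segments out back-to-back; each entry is (audio_s, image_s).

    Returns (start, duration) pairs on the unified video timeline.
    """
    timeline, cursor = [], 0.0
    for audio_s, image_s in segments:
        duration = calibrate_segment(audio_s, image_s)
        timeline.append((cursor, duration))
        cursor += duration
    return timeline
```

A segment with 3 s of narration but 5 s of image material thus occupies 5 s, and the following segment starts where it ends.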
In one possible embodiment, the image material may be any of a still picture, a moving picture, and a video image.
In one possible embodiment, the method may further comprise: acquiring image-text content comprising image material and text content; segmenting the text content and forming an initial correspondence between the image material and paragraphs of the text content; forming the sentences of the paragraphs into the text list; and displaying an icon list related to the image material in a first area of the screen of the mobile terminal and displaying the text list in a second area of the screen according to the image material, the text list, and the initial correspondence.
In one possible embodiment, the method may further comprise: image material corresponding to text in the text list is selected and imported from a material library.
In one possible embodiment, the method may further comprise: selecting text in the text list and, according to the semantics of the text, computing semantically related image material to serve as the image material corresponding to the text.
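The patent does not specify how semantically related image material is computed, so the following is only an illustrative stand-in: it scores each library image by how many of its descriptive tags appear in the text. The tag-set library representation and all names are assumptions.

```python
def match_image_by_semantics(text, library):
    """Pick the library image whose tags best overlap the text.

    `library` maps an image identifier to a set of descriptive tags
    (an assumed representation). Returns the best-scoring image id,
    or None when no tag of any image occurs in the text.
    """
    best_id, best_score = None, 0
    for image_id, tags in library.items():
        # count how many descriptive tags literally occur in the text
        score = sum(1 for tag in tags if tag in text)
        if score > best_score:
            best_id, best_score = image_id, score
    return best_id
```

A production system would use embeddings or an image-retrieval model instead of literal tag overlap; the sketch only shows the interface such a matcher would expose.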
According to a second aspect of the present invention, there is provided an apparatus for video authoring on a mobile terminal, comprising: a display unit for displaying an icon list related to the image material in a first area of a screen of the mobile terminal and displaying a text list in a second area of the screen; an adjusting unit, configured to adjust at least one of an icon in the icon list and a text in the text list, so as to set a correspondence between an image material and the text; and a video generation unit for generating a video including the text as subtitle information based on the set correspondence of the image material and the text.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to the first aspect when executing the program.
According to a fourth aspect of the present disclosure there is provided a computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to the first aspect.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art. The above and other objects, features and advantages of the present application will become more apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the several views of the drawings. The drawings are not intended to be drawn to scale, with emphasis instead being placed upon illustrating the principles of the present application.
Fig. 1 shows a schematic operation interface diagram of an editing mode of authoring video at a mobile terminal according to an embodiment of the present invention.
Fig. 2 shows a schematic operation interface diagram of another editing mode of authoring video at a mobile terminal according to an embodiment of the present invention.
Fig. 3 shows a schematic operation interface diagram for adjusting the correspondence of image material and text at a mobile terminal according to an embodiment of the invention.
Fig. 4 shows a schematic operation interface diagram for editing text at a mobile terminal according to an embodiment of the present invention.
Fig. 5 shows a schematic operation interface diagram of image-text content and imported image material to be used for generating a video according to an embodiment of the invention.
Fig. 6A-6C are schematic diagrams illustrating the generation of image material, text, and their correspondence from image-text content, according to an embodiment of the invention.
Fig. 7 shows a schematic flow chart of a method for video authoring on a mobile terminal in accordance with an embodiment of the present invention.
FIG. 8 shows a schematic block diagram of an apparatus for video authoring on a mobile terminal in accordance with an embodiment of the invention.
Fig. 9 shows a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. It should be understood that the description is only illustrative and is not intended to limit the scope of the invention. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The words "a", "an", and "the" as used herein are also intended to include the meaning of "a plurality", etc., unless the context clearly indicates otherwise. Furthermore, the terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Fig. 1 shows a schematic operation interface diagram of an editing mode of authoring video at a mobile terminal according to an embodiment of the present invention. The editing mode shown in fig. 1 may also be referred to as a timeline mode; conventional video authoring tools typically use this view mode.
As shown, a series of materials for video authoring, including image material 110, subtitle text 120, and music material 130, is shown in the lower part of the mobile terminal screen. The image material 110 may be a still picture, a moving picture, or a video image. The subtitle text 120 may be any type of text data, divided into a plurality of sentences or paragraphs to be displayed in the video image as the video plays. The music material may be any custom audio, for example narration audio corresponding to the subtitle text 120, which may be human speech generated by any intelligent algorithm.
In the editing interface of the timeline mode, a preview video window 140 is displayed in the upper portion of the mobile terminal. The user may align the image material 110, subtitle text 120, and music material 130 with the current timeline 150. In one embodiment, the user may drag any one or more of the image material 110, subtitle text 120, and music material 130 left and right to align the materials, i.e., to synchronize images, text, and sound. In addition, the user may touch the image material 110, the subtitle text 120, or the music material 130 to edit the material at the touch location, for example to beautify the video material, modify the subtitle text content, or adjust the corresponding audio.
It can be seen that in the editing mode shown in fig. 1, material information such as the image material 110, the subtitle text 120, and the audio material is arranged horizontally: the larger space in the upper part of the screen is used to display the video material, and the text content is displayed in thumbnail form. The disadvantages of this horizontal arrangement are:
1. The video occupies a large display space, so the text content is displayed abbreviated and an author cannot browse the full text; the user must repeatedly swipe right to browse the complete text, which does not match the user's reading habits.
2. When new text needs to be inserted, it must be attached to an image material; that is, the existing image or video comes first and text follows it, whereas a narration video is usually prepared from an existing text. Such an authoring process cannot provide a smooth authoring experience.
3. Text is treated as auxiliary content: no convenient editing mode is provided, only single-sentence operations are supported, and functions such as batch-editing text content, batch-moving text positions, and batch-synthesizing dubbing are not supported.
4. The video editing function focuses on editing a single video and provides no convenient way to adjust the correspondence between materials and text.
5. Automatic conversion of image-text content into a video draft is not supported.
In view of the above, the present invention provides an "outline editing" mode on top of conventional video authoring tools. For example, the mode may be entered via the mode-switch control 160 on the interface shown in fig. 1. In outline mode, the image and video materials and the text are arranged vertically, the video is displayed reduced, and the text is given more display space, so that the user can conveniently browse the whole text, and the display form of the text accords with traditional reading habits.
Fig. 2 shows a schematic operation interface diagram of another editing mode (also referred to as outline mode) of authoring video at a mobile terminal according to an embodiment of the present invention. It should be noted that editing of a single material can still be completed in the traditional timeline mode; the outline mode focuses on adjusting the correspondence between materials and text.
As shown in fig. 2, a plurality of icons 210 associated with image video material are shown in the left region of the screen of the mobile terminal; a plurality of texts 220 divided into sentence forms with separation lines therebetween are shown in the right region of the screen. In the left area, the icons 210 related to the image material are formed as an icon list; text 220 is formed as a text list in the right region. Also shown in FIG. 2 is a current timeline 250 that is similar to timeline 150 of FIG. 1. The current video preview 240 may be displayed in an upper portion of the screen according to the location of the current timeline 250.
According to an embodiment of the present invention, the correspondence between image/video material and text may be set according to the positional relationship of the icon 210 and the text 220 on the screen. Specifically, the correspondence is set according to the size or height of the icon 210, for example the horizontal positional relationship between the icon's upper and lower end positions and the text in the right area. For example, after the user selects a material, two controls such as the handles 211 and 212 are displayed on it; dragging the handles resizes the icon 210, quickly adjusting the correspondence between the image/video material and the text. It will be appreciated that enlarging the icon 210 makes the image material correspond to more text content, while shrinking it makes the corresponding text content less.
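The handle-dragging mechanism described above can be modeled as a mapping from the icon's vertical extent to the text rows it covers. The sketch below assumes a fixed row height and a midpoint-overlap rule, neither of which is specified in the patent; all names are illustrative.

```python
def texts_for_icon(icon_top: float, icon_bottom: float,
                   row_height: float, n_rows: int):
    """Map an icon's vertical extent to the text rows it covers.

    Dragging the icon's handles changes (icon_top, icon_bottom), and
    every text row whose vertical midpoint falls inside that span is
    treated as corresponding to the icon's image material. The
    midpoint rule is an assumption, not taken from the patent.
    """
    rows = []
    for i in range(n_rows):
        midpoint = (i + 0.5) * row_height
        if icon_top <= midpoint < icon_bottom:
            rows.append(i)
    return rows
```

Enlarging the icon (a larger `icon_bottom`) covers more rows, i.e. more subtitle text for that material; raising `icon_top` releases the first rows, matching the behavior described for handles 211 and 212.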
It should be noted that in both the timeline mode of fig. 1 and the outline mode of fig. 2, the overall effect of the video can be quickly previewed using the timeline. The outline mode in particular markedly improves the efficiency of previewing and of operating on the correspondence, making user operations more focused and efficient.
Fig. 3 shows a schematic operation interface diagram for adjusting the correspondence of image material and text at a mobile terminal according to an embodiment of the invention. In addition to adjusting the correspondence by resizing the icon 210, the correspondence and the order of the materials can be quickly adjusted by long-pressing a material.
The left side of fig. 3 shows long-pressing and dragging the icon 310 to move the corresponding video material and thereby adjust the correspondence between image/video material and text. The user long-presses the icon 310; a floating effect of the icon 310 is then displayed on the screen, and the user can drag the icon 310 to insert it at a desired position. For example, the icon 310 may be dragged upward and inserted at an earlier position to play the video material earlier; or conversely, it may be dragged downward and inserted at a later position to play the video material later.
In one embodiment, after the icon 310 is dragged away from its original position, that position may be displayed in black or any other color, indicating that the text there has no corresponding image/video material. Those skilled in the art will appreciate that the dragged icon may be further resized to increase or decrease its corresponding text.
The right side of fig. 3 shows long-pressing and dragging the text 320 "i am from twelve years old" (e.g., the user holds a menu item on the right side of the text) to adjust the correspondence between image/video material and the text. Similarly, a floating effect of the text 320 is displayed on the screen, and the user can drag the text 320 to insert it at a desired position. For example, the text 320 may be dragged upward and inserted at an earlier position to display the text earlier when the video is played; or conversely, it may be dragged downward and inserted at a later position to be displayed later.
In one embodiment, after the text 320 is dragged away from its original position, that position may be rendered blank or in any other color, indicating that the image/video material there has no corresponding text.
It can be seen that in outline mode, the correspondence between the added subtitle text and the image/video material can be clearly previewed, reducing the errors and inefficiency that arise when editing video material.
Fig. 4 shows a schematic operation interface diagram for editing text at a mobile terminal according to an embodiment of the present invention. Fig. 4 shows interface diagrams of text editing, more operations, and multiple choice text operations in order from left to right.
In the outline mode shown in fig. 2, any text 220 can be edited by tapping it. As shown in the left diagram of fig. 4, compared with the horizontal information display of fig. 1, the text editing view of the present invention displays more text content, including the edited text itself and its context, which is very convenient and better suits the user's reading habits.
As shown in the middle diagram of fig. 4, the user may select a text sentence so that menu items including "multi-select", "copy", "paste", etc. are presented on the screen for quick editing. If "multi-select" is chosen, the interface shown in the right diagram of fig. 4 is presented, and the user can select multiple texts and edit them together, for example copying, cutting, or deleting them.
Fig. 5 shows a schematic operation interface diagram of image-text content and imported image material to be used for generating a video according to an embodiment of the invention.
In knowledge Q&A applications, users often need to generate narration-type video from mixed image-text content including pictures (still, moving, or video images) and text. As shown on the left of fig. 5, the user's answer content includes several pieces of text and pictures. As shown on the right of fig. 5, the user may also select several image materials from a material library for generating the narration-type video. The operation bar at the bottom supports quick management of the currently selected material, such as editing, replacing, deleting, and batch-importing materials. In one embodiment, materials may be imported in batches, and after editing the manuscript, a matching picture can be quickly selected for it from the bottom material bar. That is, the user is free to add and delete image material and is not limited to the material within the original image-text content.
It can be seen that the efficiency with which the user manages and edits image/video material and text is improved: content and text are added quickly, and subtitle text is modified and adjusted quickly and conveniently.
A typical example of generating the initial correspondence of the image video material and the subtitle text described with reference to fig. 1 to 4 is described below in connection with fig. 6A to 6C.
Fig. 6A-6C are schematic diagrams illustrating the generation of image material, text, and their correspondence from image-text content, according to an embodiment of the invention. The method initializes the image materials, the texts, and their correspondence in three stages: material preprocessing, intelligent segmentation, and intelligent clause splitting.
First, material preprocessing.
● Preprocessing logic for picture materials: only still images and GIFs are imported; when an import fails, a black-frame video placeholder is displayed. Videos are not imported, and material that is not imported is given no placeholder display.
● Preprocessing logic for text materials: discarded text includes code blocks, comments, links (the link text is kept when it differs from the address), formulas, and natural paragraphs or list items containing only spaces. Retained text includes titles, quote blocks, ordered-list items (sequence numbers retained; a period is appended when an item does not end with a separator or comma), and unordered-list items (bullet marks removed; a period appended likewise). All text is converted to plain text and formatting is cleared.
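The text-preprocessing rules above amount to a filter over typed source blocks. A minimal sketch, assuming a simplified (kind, text) block representation; the kind labels and the exact punctuation set are illustrative, not from the patent.

```python
import re

def preprocess_text_blocks(blocks):
    """Filter source blocks per the preprocessing rules above (simplified).

    Each block is a (kind, text) pair. Code blocks, formulas, and
    whitespace-only entries are dropped; unordered-list items lose
    their bullet marks; list items gain a terminal period when one
    is missing.
    """
    kept = []
    for kind, text in blocks:
        if kind in ("code", "formula") or not text.strip():
            continue  # discarded material
        if kind == "unordered_item":
            text = re.sub(r"^\s*[-*•]\s*", "", text)  # remove the bullet mark
        if kind in ("ordered_item", "unordered_item") and \
                not text.rstrip().endswith(("。", ".", "!", "?")):
            text = text.rstrip() + "."  # append a period when none is present
        kept.append(text.strip())
    return kept
```

Link handling (keeping link text only when it differs from the address) is omitted here for brevity; it would be one more branch on a `link` kind.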
Second, intelligent segmentation.
● Segmentation is performed first: natural paragraphs are the basis of segmentation, each natural paragraph forming one segment. If there are two or more consecutive natural paragraphs each with a character count <= 5, these paragraphs are merged into one segment.
● Then pictures and text paragraphs are put in one-to-one correspondence: for an isolated picture, the corresponding text is the nearest natural paragraph above it (after merging); for consecutive pictures, pictures and paragraphs are matched in reverse order (the last picture corresponds to the nearest paragraph above, the second-to-last to the paragraph before it, and so on); text with no corresponding picture corresponds to a default black-frame video, such as the first, second, and third segments in fig. 6A. If there are more pictures than natural paragraphs in the corresponding text section, the surplus pictures correspond to blank subtitles, as with "fig. 3" in fig. 6B. If only spaces or line breaks lie between two pictures, they are regarded as consecutive; if entirely discarded text or a divider line lies between them, they are regarded as non-consecutive, as in fig. 6C, where "fig. 2" does not correspond to any subtitle text.
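The two segmentation rules above — merging runs of short natural paragraphs, and matching a run of consecutive pictures to the paragraphs above it in reverse order — can be sketched as follows. The description is ambiguous about which pictures in a surplus run go unmatched; following the reverse-order rule literally, this sketch leaves the earliest ones with blank captions (`None`). All names are illustrative.

```python
def merge_short_paragraphs(paragraphs, max_len=5):
    """Merge runs of 2+ consecutive paragraphs, each <= max_len
    characters, into one segment, per the rule above."""
    merged, run = [], []
    for p in paragraphs:
        if len(p) <= max_len:
            run.append(p)
            continue
        if run:  # flush the pending run of short paragraphs
            merged.append("".join(run) if len(run) >= 2 else run[0])
            run = []
        merged.append(p)
    if run:
        merged.append("".join(run) if len(run) >= 2 else run[0])
    return merged


def match_picture_run(pictures, paragraphs_above):
    """Match a run of consecutive pictures to the paragraphs above it
    in reverse order: the last picture takes the nearest paragraph,
    the second-to-last the one before it, and so on. Leftover pictures
    get a blank caption (None)."""
    pairs = []
    pics, paras = list(pictures), list(paragraphs_above)
    while pics and paras:
        pairs.append((pics.pop(), paras.pop()))
    for pic in reversed(pics):  # surplus pictures -> blank caption
        pairs.append((pic, None))
    pairs.reverse()  # restore document order
    return pairs
```

Paragraphs left unmatched after this step would receive the default black-frame video described above; that bookkeeping is omitted here.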
Third, intelligent clause splitting.
● Each segment of text is cut into shorter sentences, each sentence corresponding to one line of subtitles.
● First, sentences are divided by segmenters. The segmenters include "。", "；", "：", "？", "！", the line break, and a run of two or more "…". A segmenter followed by a closing quotation mark is treated together as one segmenter; if after splitting a sentence ends with an opening quotation mark, that mark is moved to the head of the next sentence, or deleted if there is no next sentence. A segmenter followed by a closing bracket is treated together as one segmenter; if a sentence ends with an opening bracket, the bracket is moved to the head of the next sentence, or deleted if there is none. Several consecutive segmenters are treated as one. Text between two segmenters is considered continuous if it is entirely discarded or contains only spaces. If a segmenter has digits on both sides, it is not processed (e.g., a score of 1:1, or a total annual yield of 194,211,400). Other punctuation is not processed and is displayed normally in the subtitles.
● Overlong sentences are divided again: if a split sentence exceeds 26 Chinese characters (English letters, punctuation, and digits each counted as half a character), it is cut at the last "，" or "、" within the first 26 characters, repeatedly, until every sentence is <= 26 characters. Several consecutive "，" are merged into one. If there is no comma, the sentence is not cut even when it exceeds 26 characters.
● Punctuation processing: for each subtitle line in the video, the sentence-final "。" and "，" are hidden.
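The clause-splitting rules above (segmenter punctuation, the 26-character limit with half-width characters counted as half, and hiding of line-final "。"/"，") can be sketched as follows. This is a simplified sketch: quote/bracket migration and the digit-adjacent exception are omitted, and the character-weight rule (any non-ASCII character counts as 1) is an approximation.

```python
import re

SEGMENTERS = r"[。；：？！\n]+"  # runs of segmenters count as one

def char_weight(ch):
    """Full-width (CJK) characters count as 1; ASCII letters, digits,
    and punctuation count as half, per the 26-character rule."""
    return 1.0 if ord(ch) > 0x7F else 0.5

def weighted_len(s):
    return sum(char_weight(c) for c in s)

def split_long(sentence, limit=26):
    """Repeatedly cut at the last comma within the first `limit`
    weighted characters; leave the sentence whole if no comma."""
    parts = []
    while weighted_len(sentence) > limit:
        cut, weight = -1, 0.0
        for i, ch in enumerate(sentence):
            weight += char_weight(ch)
            if weight > limit:
                break
            if ch in "，,、":
                cut = i  # last comma seen within the limit
        if cut < 0:
            break  # no comma: keep the overlong sentence intact
        parts.append(sentence[:cut + 1])
        sentence = sentence[cut + 1:]
    parts.append(sentence)
    return parts

def to_subtitles(text, limit=26):
    """Segment text into subtitle lines, hiding line-final '。'/'，'."""
    lines = []
    for sent in re.split(SEGMENTERS, text):
        if not sent.strip():
            continue  # continuous/empty text between segmenters
        for part in split_long(sent.strip(), limit):
            lines.append(part.rstrip("。，,"))
    return lines
```

For example, "今天天气很好。我们去公园玩！" yields two subtitle lines with the sentence-final punctuation hidden.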
The image/video material, the text, and their correspondence are thus generated. Those skilled in the art will appreciate that the user may further adjust, edit, and import image/video material, text, and their correspondence as described above with reference to figs. 1-5. For example, for subtitle text lacking corresponding image/video material, the invention also provides an automatic picture-matching function: according to the semantics of the text, matching related pictures are automatically computed, simplifying the user's search for matching material.
Next, based on the set correspondence between the image material and the text, a video including the text as subtitle information is generated. In one embodiment, a one-touch dubbing function is also provided, with which the user can generate corresponding narration audio from the text. For example, a male, female, or child voice is selected according to the user's preference and a volume is configured to produce the narration audio. The narration audio may be combined with the image/video material and the text to form a narration-type video. According to an embodiment of the invention, an automatic calibration function is also provided when the video is generated: when the video duration is inconsistent with the text duration (for example, the playing duration of the narration audio corresponding to the text), the timeline is automatically calibrated using the longer content as the reference, which simplifies the user's operation steps.
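The automatic-calibration idea above can be sketched as follows. This is a minimal illustration under my own assumptions (per-segment durations are known; `Segment` and `calibrate` are hypothetical names): for each text segment, the clip duration and the narration-audio duration may differ, and the longer of the two becomes the segment's length on the timeline.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    clip_duration: float   # seconds of image/video material
    audio_duration: float  # seconds of narration audio for the text

def calibrate(segments: list[Segment]) -> list[tuple[float, float]]:
    """Return (start, end) for each segment, using the longer duration as reference."""
    timeline, t = [], 0.0
    for seg in segments:
        length = max(seg.clip_duration, seg.audio_duration)
        timeline.append((t, t + length))
        t += length
    return timeline
```

For example, a 3-second clip paired with 5 seconds of narration occupies 5 seconds on the calibrated timeline.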
Fig. 7 shows a schematic flow chart of a method 700 for video authoring on a mobile terminal in accordance with an embodiment of the present invention.
The method 700 includes: at step 710, displaying a list of icons related to image material in a first area of a screen of the mobile terminal and displaying a list of text in a second area of the screen;
at step 720, adjusting at least one of the icons in the icon list and the text in the text list to set the correspondence between the image material and the text; and
at step 730, generating a video including the text as subtitle information based on the set correspondence between the image material and the text.
In one possible embodiment, the first area and the second area may be located at left and right sides of the screen, and the icons in the icon list and the text in the text list are displayed in a top-down layout.
In one possible embodiment, setting the correspondence between the image material and the text may include: setting the correspondence based on the horizontal positional relationship between the icon of the image material and the text.
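One way to read this embodiment: each icon occupies a vertical span in the first (left) column and each sentence a row in the second (right) column, so a sentence corresponds to the icon whose span covers the row's vertical midpoint. The sketch below is an assumption about one plausible mapping, not the patent's definitive implementation; all names are illustrative.

```python
def map_text_to_icons(icon_spans: list[tuple[float, float]],
                      row_tops: list[float],
                      row_height: float) -> list[int]:
    """For each text row, return the index of the icon whose vertical span
    [y0, y1) contains the row's midpoint, or -1 if no icon is alongside it."""
    mapping = []
    for top in row_tops:
        mid = top + row_height / 2
        idx = next((i for i, (y0, y1) in enumerate(icon_spans)
                    if y0 <= mid < y1), -1)
        mapping.append(idx)
    return mapping
```

Under this reading, resizing an icon (stretching its span over more rows) or moving a row changes which sentences it covers, which is exactly how the embodiments below adjust the correspondence.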
In one possible embodiment, the icon may have a control for resizing, and the method may further comprise: adjusting the size of the icon through the control to set the correspondence between the image material and the text.
In one possible embodiment, the method may further comprise: moving the positions of the icons and/or the text to set the correspondence between the image material and the text.
In one possible embodiment, the method may further comprise: selecting the icon and/or the text, and editing the corresponding image material and text.
In one possible embodiment, generating the video may specifically include: providing speech associated with the text; and calibrating the timeline based on the timeline of the speech and the timeline of the image material, using the content with the longer duration as the reference, to generate a video including the speech and the image material.
In one possible embodiment, the image material may be any of a still picture, a moving picture, and a video image.
In one possible embodiment, the method may further comprise: acquiring image-text content comprising image material and text content; segmenting the text content and forming an initial correspondence between the image material and the paragraphs of the text content; forming the text list from the paragraph sentences of the text content; and displaying the icon list related to the image material in the first area of the screen of the mobile terminal and the text list in the second area of the screen according to the image material, the text list, and the initial correspondence.
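The initialization step can be sketched as follows, under one plausible policy (this passage does not fix the exact policy, so associating each paragraph with the nearest preceding image is my assumption, and the names are illustrative): a mixed image-text article is a sequence of blocks, each either an image or a text paragraph.

```python
def initialize(blocks: list[tuple[str, str]]) -> tuple[list[str], list[tuple[int, str]]]:
    """blocks: ("image", path) or ("text", paragraph) in article order.
    Returns (image_list, pairs), where each pair is
    (index of the nearest preceding image, or -1 if none, paragraph)."""
    images, pairs = [], []
    current = -1                       # no image seen yet
    for kind, payload in blocks:
        if kind == "image":
            images.append(payload)
            current = len(images) - 1
        else:
            pairs.append((current, payload))
    return images, pairs
```

Paragraphs with index -1 are exactly those lacking corresponding material, for which the automatic picture-matching function described earlier can fill in an image.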
In one possible embodiment, the method may further comprise: selecting and importing image material corresponding to the text in the text list from a material library.
In one possible embodiment, the method may further comprise: selecting text in the text list, and computing image material related to its semantics as the image material corresponding to the text.
FIG. 8 shows a schematic block diagram of an apparatus 800 for video authoring on a mobile terminal in accordance with an embodiment of the invention.
Apparatus 800 for video authoring on a mobile terminal, comprising: a display unit 810 for displaying a list of icons related to image materials in a first area of a screen of the mobile terminal and displaying a list of texts in a second area of the screen;
an adjusting unit 820, configured to adjust at least one of an icon in the icon list and a text in the text list, so as to set a correspondence between image materials and the text; and
the video generating unit 830 is configured to generate a video including the text as subtitle information based on the set correspondence between the image material and the text.
In one possible embodiment, the adjusting unit 820 may also be used to: move the positions of the icons and/or the text to set the correspondence between the image material and the text.
In one possible embodiment, the adjusting unit 820 may also be used to: select the icon and/or the text, and edit the corresponding image material and text.
In one possible embodiment, the video generating unit 830 may be configured to: provide speech associated with the text; and calibrate the timeline based on the timeline of the speech and the timeline of the image material, using the content with the longer duration as the reference, to generate a video including the speech and the image material.
In one possible embodiment, the image material may be any of a still picture, a moving picture, and a video image.
In one possible embodiment, the apparatus 800 may include an initialization unit (not shown in the figures). The initialization unit is used for: acquiring image-text content comprising image material and text content; segmenting the text content and forming an initial correspondence between the image material and the paragraphs of the text content; and forming the text list from the paragraph sentences of the text content. Thus, the display unit 810 may be configured to display the icon list related to the image material in the first area of the screen of the mobile terminal and the text list in the second area of the screen according to the image material, the text list, and the initial correspondence. The initialization unit may be further configured to select and import image material corresponding to the text in the text list from a material library; and to select text in the text list and compute image material related to its semantics as the image material corresponding to the text.
Based on the detailed description above, it can be seen that the invention breaks through the information organization of traditional video creation tools: it arranges information vertically with the text as the core and uses more space to display the text, so that the user can conveniently browse the whole, in a browsing mode that better matches traditional reading habits.
A more convenient text editing function is provided: the input of single-sentence text is more native and consistent with the traditional way of entering text, reducing the learning cost; a batch text editing function is also provided to meet more complex text editing requirements; meanwhile, a one-key dubbing function is provided, generating the dubbing by speech synthesis and saving the user's dubbing time.
On top of the traditional video editing functions, a more macroscopic material editing function is provided that adjusts the correspondence between materials and text and supports batch picture-text matching, better fitting the habits of image-text authors when creating videos. An automatic calibration function is provided: when the video and text durations are inconsistent, the timeline is automatically calibrated using the longer content as the reference, simplifying the user's operation steps. An automatic picture-matching function is provided: pictures related to the text are automatically computed from its semantics, simplifying the user's task of searching for material to match pictures.
Existing image-text content is converted into a video draft with one key, with intelligent paragraph and sentence segmentation; meanwhile, materials are imported in batches to generate the video draft, saving the time spent importing materials.
Fig. 9 shows a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device 900 includes a Central Processing Unit (CPU) 901 that can execute various appropriate actions and processes as shown in fig. 7 in accordance with a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are also stored. The CPU 901, ROM 902, and RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output portion 907 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage portion 908 including a hard disk or the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 910 so that a computer program read out therefrom is installed into the storage section 908 as needed.
Embodiments of the present invention provide a computer readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform any of the methods shown in fig. 7. By way of example, a computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), etc.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above. The specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, which are not described herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The foregoing is merely an illustrative embodiment of the present invention, but the scope of the present invention is not limited thereto; any variation or substitution that a person skilled in the art can easily conceive within the technical scope disclosed by the present invention should be covered by the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (13)

1. A method for video authoring on a mobile terminal, comprising:
displaying an icon list related to the image material in a first area of a screen of the mobile terminal and a text list in a second area of the screen, and displaying a video preview in an upper portion of the screen;
adjusting at least one of the icons in the icon list and the text in the text list to set the correspondence between the image material and the text, wherein the icon has a control for resizing, and the size of the icon is adjusted through the control to set the correspondence between the image material and the text; and
based on the set correspondence of the image material and the text, a video including the text as subtitle information is generated.
2. The method of claim 1, wherein the first region and the second region are located on left and right sides of the screen, and icons in the icon list and text in the text list are displayed in a top-down layout.
3. The method of claim 2, wherein the setting of the correspondence between the image material and the text comprises: setting the correspondence based on the horizontal positional relationship between the icon of the image material and the text.
4. A method according to any one of claims 1-3, the method further comprising: moving the positions of the icons and/or the text to set the correspondence between the image material and the text.
5. A method according to any one of claims 1-3, the method further comprising: selecting the icon and/or the text, and editing the corresponding image material and text.
6. The method according to claim 1, wherein generating the video specifically comprises: providing speech associated with the text; and calibrating the timeline based on the timeline of the speech and the timeline of the image material, using the content with the longer duration as the reference, to generate a video including the speech and the image material.
7. The method of claim 1, wherein the image material is any one of a still picture, a moving picture, and a video image.
8. The method of claim 1, the method further comprising:
acquiring image-text content comprising image materials and text content;
segmenting the text content and forming an initial corresponding relation between the image material and a paragraph of the text content;
forming a paragraph sentence of the text content into the text list; and
and displaying an icon list related to the image material in a first area of a screen of the mobile terminal and displaying the text list in a second area of the screen according to the image material, the text list and the initial corresponding relation.
9. The method of claim 1, the method further comprising: image material corresponding to text in the text list is selected and imported from a material library.
10. The method of claim 1, the method further comprising: selecting text in the text list, and computing image material related to its semantics as the image material corresponding to the text.
11. An apparatus for video authoring on a mobile terminal, comprising:
a display unit for displaying an icon list related to the image material in a first area of a screen of the mobile terminal and a text list in a second area of the screen, and displaying a video preview in an upper portion of the screen;
an adjusting unit, configured to adjust at least one of an icon in the icon list and a text in the text list, so as to set a correspondence between an image material and the text; the icon is provided with a control used for adjusting the size, and the size of the icon is adjusted through the control so as to set the corresponding relation between the image material and the text; and
a video generation unit for generating a video including the text as subtitle information based on the set correspondence between the image material and the text.
12. An electronic device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to any one of claims 1-11 when the program is executed.
13. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-11.
CN202010650204.5A 2020-07-08 2020-07-08 Method for video authoring on mobile terminal Active CN112040142B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010650204.5A CN112040142B (en) 2020-07-08 2020-07-08 Method for video authoring on mobile terminal


Publications (2)

Publication Number Publication Date
CN112040142A CN112040142A (en) 2020-12-04
CN112040142B true CN112040142B (en) 2023-05-02

Family

ID=73578953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010650204.5A Active CN112040142B (en) 2020-07-08 2020-07-08 Method for video authoring on mobile terminal

Country Status (1)

Country Link
CN (1) CN112040142B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113473204B (en) * 2021-05-31 2023-10-13 北京达佳互联信息技术有限公司 Information display method and device, electronic equipment and storage medium
CN115811632A (en) * 2021-09-15 2023-03-17 北京字跳网络技术有限公司 Video processing method, device, equipment and storage medium
CN114125553A (en) * 2021-12-31 2022-03-01 深圳市爱剪辑科技有限公司 Video editing system based on mobile terminal and application method thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004199696A (en) * 2002-12-17 2004-07-15 Ricoh Co Ltd Method for displaying information stored in multiple multimedia document
JP2009094586A (en) * 2007-10-03 2009-04-30 Sony Corp Video playback device, video playback method and program
CN101751215A (en) * 2008-11-27 2010-06-23 索尼株式会社 Information processing apparatus, display control method and program
CN102592298A (en) * 2010-12-02 2012-07-18 索尼公司 Visual treatment for a user interface in a content integration framework
CN104885036A (en) * 2013-07-11 2015-09-02 三星电子株式会社 User terminal device for displaying application and methods thereof
CN108965737A (en) * 2017-05-22 2018-12-07 腾讯科技(深圳)有限公司 media data processing method, device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8510656B2 (en) * 2009-10-29 2013-08-13 Margery Kravitz Schwarz Interactive storybook system and method
JP5818445B2 (en) * 2011-01-26 2015-11-18 京セラ株式会社 Mobile terminal device
US9996566B2 (en) * 2012-02-20 2018-06-12 Wix.Com Ltd. Visual design system for generating a visual data structure associated with a semantic composition based on a hierarchy of components
US9843823B2 (en) * 2012-05-23 2017-12-12 Yahoo Holdings, Inc. Systems and methods involving creation of information modules, including server, media searching, user interface and/or other features
US10185468B2 (en) * 2015-09-23 2019-01-22 Microsoft Technology Licensing, Llc Animation editor




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant