CN113727024B - Method, device, electronic equipment and storage medium for generating multimedia information - Google Patents

Info

Publication number
CN113727024B
Authority
CN
China
Prior art keywords
video
effect
predetermined display
display area
characteristic
Prior art date
Legal status
Active
Application number
CN202111005750.4A
Other languages
Chinese (zh)
Other versions
CN113727024A
Inventor
徐悦然
龚烨菲
闫鑫
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111005750.4A priority Critical patent/CN113727024B/en
Publication of CN113727024A publication Critical patent/CN113727024A/en
Application granted granted Critical
Publication of CN113727024B publication Critical patent/CN113727024B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/64 Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs, involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205 End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The present disclosure provides a multimedia information generation method, apparatus, electronic device, storage medium, and program product. The method includes: acquiring first multimedia information, where the first multimedia information is acquired based on a shooting template having a plurality of predetermined display areas, the shooting template is generated by adding an effect marker in each of the predetermined display areas, a first object is displayed in a part of the predetermined display areas, and at least one of the remaining predetermined display areas is used for shooting and displaying a second object; and generating second multimedia information, in which the first object is displayed in that part of the predetermined display areas and the second object is displayed in the at least one predetermined display area. The method increases interaction between objects, raises the creative desire of objects participating in the co-shoot, strengthens interactivity, and improves the user experience.

Description

Method, device, electronic equipment and storage medium for generating multimedia information
Technical Field
The present disclosure relates to the field of multimedia information processing, and in particular, to a method, an apparatus, an electronic device, a storage medium, and a program product for generating multimedia information.
Background
In the related art, taking video as an example of multimedia information, when a first object wants to co-shoot with a second object appearing in an already-published video, the existing shooting mode displays the first object to be shot side by side with the second object's video (for example, the first object on the left and the second object's video on the right). The first object completes the co-shoot by recording its own video within the playback period of the second object's video, and the resulting co-shot video shows the two videos side by side in one display frame. However, such a shooting template (for example, the fixed left-and-right side-by-side layout of shooting positions in the prior art) cannot be changed, and a co-shoot completed through it amounts to one object (for example, a user) unilaterally shooting against another object's video. Interaction between the objects is lacking, so the objects' desire to create is low and interactivity is weak.
Disclosure of Invention
The present disclosure provides a method and apparatus for generating multimedia information, to at least solve the problems in the related art of lacking interaction between objects, low creative desire of objects, and weak interactivity, though it need not solve any particular one of these problems. The technical solution of the present disclosure is as follows:
According to a first aspect of embodiments of the present disclosure, there is provided a multimedia information generating method including: acquiring first multimedia information, wherein the first multimedia information is acquired based on a photographing template having a plurality of predetermined display areas, and the photographing template is generated by adding an effect flag in each of the plurality of predetermined display areas, wherein a first object is displayed in a part of the predetermined display areas of the plurality of predetermined display areas, and at least one predetermined display area among the remaining predetermined display areas of the plurality of predetermined display areas is used for photographing and displaying a second object; generating second multimedia information, wherein the first object is displayed in the part of the predetermined display area in the second multimedia information, and the second object is displayed in the at least one predetermined display area in the second multimedia information.
Optionally, the step of acquiring the first multimedia information may include: generating the first multimedia information based on the shooting template, wherein the part of the predetermined display areas among the plurality of predetermined display areas is used for shooting and displaying the first object.
Optionally, the generating of the shooting template may include: performing the following in each of the plurality of predetermined display areas: obtaining a virtual display object, wherein the virtual display object is a virtual object generated based on the characteristics of the object; determining a characteristic position of the virtual display object, wherein the characteristic position is position information corresponding to a characteristic of the object; and adding an effect mark corresponding to the characteristic position in each preset display area.
Alternatively, the generating of the multimedia information may include: determining a characteristic position of an object in a preset display area, wherein the characteristic position of the object is position information corresponding to the characteristic of a first object or a second object obtained through shooting; and synthesizing the effect mark in the preset display area for shooting the object with the object according to the characteristic position of the object and the characteristic position corresponding to the effect mark in the preset display area for shooting the object, thereby generating the multimedia information comprising the object.
Alternatively, the operation of combining the effect marker in the predetermined display area for photographing the subject with the subject may include: determining characteristic points of the effect marks meeting preset conditions; determining object feature matching points corresponding to the feature points of the effect mark based on the feature points of the effect mark; and synthesizing the effect marks in the preset display area for shooting the object onto the object in a layer superposition mode when the distance between the positions of the feature matching points of the object and the feature point positions of the corresponding effect marks is smaller than or equal to a preset threshold value.
Optionally, the feature points of the effect marker that satisfy the preset condition may be determined in the following ways: by sampling the feature points of the effect marker, or by computing, from the feature points of the effect marker, feature points that characterize the marker.
Optionally, the condition that the distance between the object's feature matching points and the feature points of the corresponding effect marker is less than or equal to the predetermined threshold may further require that at least a predetermined number of the object's feature matching points each lie within the predetermined threshold of their corresponding marker feature points.
Alternatively, the first object and the second object may be located on different display layers, and the display layer on which the first object is located and the display layer on which the second object is located have a predefined sequential relationship therebetween, wherein the predefined sequential relationship varies according to different object requirements.
Alternatively, the total photographing duration of the second object may be equal to or less than the total duration of the first multimedia information.
Optionally, the effect mark may include at least one of a graffiti mark, a magic expression mark, and a sticker mark.
According to a second aspect of the embodiments of the present disclosure, there is provided a multimedia information generating apparatus including: an acquisition module configured to: acquiring first multimedia information, wherein the first multimedia information is acquired based on a photographing template having a plurality of predetermined display areas, and the photographing template is generated by adding an effect flag in each of the plurality of predetermined display areas, wherein a first object is displayed in a part of the predetermined display areas among the plurality of predetermined display areas, and at least one predetermined display area among the remaining predetermined display areas of the plurality of predetermined display areas is used for photographing and displaying a second object; a generation module configured to: generating second multimedia information, wherein the first object is displayed in the part of the predetermined display area in the second multimedia information, and the second object is displayed in the at least one predetermined display area in the second multimedia information.
Optionally, the operation of the acquisition module acquiring the first multimedia information may include: generating the first multimedia information based on the shooting template, wherein the part of the predetermined display areas among the plurality of predetermined display areas is used for shooting and displaying the first object.
Alternatively, the photographing template may be generated by: performing the following in each of the plurality of predetermined display areas: obtaining a virtual display object, wherein the virtual display object is a virtual object generated based on the characteristics of the object; determining a characteristic position of the virtual display object, wherein the characteristic position is position information corresponding to a characteristic of the object; and adding an effect mark corresponding to the characteristic position in each preset display area.
Alternatively, the generating operation of the multimedia information may include: determining a characteristic position of an object in a preset display area, wherein the characteristic position of the object is position information corresponding to the characteristic of a first object or a second object obtained through shooting; and synthesizing the effect mark in the preset display area for shooting the object with the object according to the characteristic position of the object and the characteristic position corresponding to the effect mark in the preset display area for shooting the object, thereby generating the multimedia information comprising the object.
Alternatively, the operation of combining the effect marker in the predetermined display area for photographing the subject with the subject may include: determining characteristic points of the effect marks meeting preset conditions; determining object feature matching points corresponding to the feature points of the effect mark based on the feature points of the effect mark; and synthesizing the effect marks in the preset display area for shooting the object onto the object in a layer superposition mode when the distance between the positions of the feature matching points of the object and the feature point positions of the corresponding effect marks is smaller than or equal to a preset threshold value.
Alternatively, the feature points of the effect marker that satisfy the preset condition may be determined in the following ways: by sampling the feature points of the effect marker, or by computing, from the feature points of the effect marker, feature points that characterize the marker.
Optionally, the condition that the distance between the object's feature matching points and the feature points of the corresponding effect marker is less than or equal to the predetermined threshold may further require that at least a predetermined number of the object's feature matching points each lie within the predetermined threshold of their corresponding marker feature points.
Alternatively, the first object and the second object may be located on different display layers, and the display layer on which the first object is located and the display layer on which the second object is located have a predefined sequential relationship therebetween, wherein the predefined sequential relationship varies according to different object requirements.
Alternatively, the total photographing duration of the second object may be equal to or less than the total duration of the first multimedia information.
Optionally, the effect mark may include at least one of a graffiti mark, a magic expression mark, and a sticker mark.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device comprising: a processor; a memory for storing the processor-executable instructions, wherein the processor is configured to execute the instructions to implement the multimedia information generating method as described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions which, when executed by a processor of an electronic device/server, enable the electronic device/server to perform the multimedia information generation method described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program/instructions which, when executed by a processor, implement the multimedia information generation method described above.
The technical solution provided by the embodiments of the present disclosure brings at least the following beneficial effects: through the multimedia information generation method and apparatus, interaction between objects is increased, so the creative desire of objects participating in the co-shoot is raised, interactivity is enhanced, and the user experience is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
FIG. 1 is an exemplary system architecture diagram in which exemplary embodiments of the present disclosure may be applied;
fig. 2 is a flowchart illustrating a multimedia information generation method according to an exemplary embodiment of the present disclosure;
fig. 3 is a schematic diagram illustrating an example of a multimedia information generation method according to an exemplary embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating editable content according to an example embodiment of the disclosure;
fig. 5 is a schematic diagram illustrating an example of a multimedia information generation method according to an exemplary embodiment of the present disclosure;
fig. 6 is a block diagram illustrating a multimedia information generating apparatus according to an exemplary embodiment of the present disclosure;
fig. 7 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that terms such as "first" and "second" in the description and claims of the present disclosure are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It is to be understood that the data so used may be interchanged where appropriate, so that the embodiments of the disclosure described herein can be practiced in sequences other than those illustrated or described herein. The embodiments described below do not represent all embodiments consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure, as detailed in the appended claims.
It should be noted that, in this disclosure, "at least one of the items" covers three parallel cases: "any one of the items", "a combination of some of the items", and "all of the items". For example, "including at least one of A and B" covers three cases: (1) including A; (2) including B; (3) including A and B. Likewise, "at least one of step one and step two is executed" covers: (1) executing step one; (2) executing step two; (3) executing step one and step two.
As mentioned in the background of the present disclosure, in the related art, taking video as an example of multimedia information, when an object wishing to participate in a co-shoot wants to appear together with an object in an existing video, it may enter a shooting window through a predetermined input (e.g., tapping the entry icon of the co-shoot function), where the left part of the window is the shooting frame and the right part is the existing video; in response to another predetermined input (e.g., tapping the shoot button), shooting starts while the existing video starts playing, and the co-shoot is completed. However, this conventional co-shoot (which may be called shared-frame co-shooting) is weakly interactive, and the participating object has a poor connection with the object in the original video, so this approach does little for re-exposure and retention of the original video work.
In view of this, according to an exemplary embodiment, the present disclosure proposes a multimedia information generation method and apparatus. Taking video as an example, by setting the number of co-shoot slots and their display areas when the original video is shot, different display effects can be arranged and combined, and an object joining the co-shoot can shoot together with the object in the original video within the original video's shooting environment (rather than within its own shooting environment). This increases the fun for objects joining the co-shoot, improves both their interactivity and their connection with the objects in the original video, and allows multiple joining objects (i.e., re-creations of the original video) to be added over time, so scalability is enhanced.
Hereinafter, a multimedia information generation method and apparatus according to exemplary embodiments of the present disclosure will be described in detail with reference to fig. 1 to 7.
FIG. 1 is an exemplary system architecture diagram to which exemplary embodiments of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others. A user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages, such as multimedia information (e.g. video) data upload requests, multimedia information data acquisition requests, etc. Various communication client applications, such as a video recording class application, a video compression class application, a video and audio editing class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103. The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and capable of playing, recording, and editing video, including but not limited to smart phones, tablet computers, laptop and desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they may be installed in the above-listed electronic devices, which may be implemented as a plurality of software or software modules (e.g. to provide distributed services), or as a single software or software module. The present invention is not particularly limited herein.
The terminal devices 101, 102, 103 may be mounted with image pickup means (e.g., a camera) to collect multimedia information (e.g., video) data, and furthermore, the terminal devices 101, 102, 103 may also be mounted with components (e.g., a speaker) for converting electric signals into sound to play sound, and may also be mounted with means (e.g., a microphone) for converting analog audio signals into digital audio signals to collect sound.
The terminal devices 101, 102, 103 may perform acquisition of multimedia information (e.g., video) data using an image acquisition apparatus mounted thereon, perform acquisition of audio data using an audio acquisition apparatus mounted thereon, and the terminal devices 101, 102, 103 may encode, store, and transmit the acquired video data and audio data, and may decode and play the encoded video and audio received from another terminal device or from the server 105.
The server 105 may be a server providing various services, such as a background server providing support for video recording class applications, video compression class applications, video editing class applications, and the like installed on the terminal devices 101, 102, 103, or may be a storage server storing encoded video and audio uploaded by the terminal devices 101, 102, 103, and transmitting the stored encoded video and audio to the terminal devices 101, 102, 103 in response to a request of the terminal devices 101, 102, 103.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., to provide distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be noted that the method for generating multimedia information provided in the embodiments of the present application is generally performed by the terminal devices 101, 102, 103, and accordingly, the multimedia information generating apparatus is generally provided in the terminal devices 101, 102, 103.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers as desired, and the disclosure is not limited in this regard.
Fig. 2 is a flowchart illustrating a multimedia information generation method according to an exemplary embodiment of the present disclosure. The multimedia information generating method 200 according to the exemplary embodiment of the present disclosure may be applied to clients (e.g., the terminal devices 101, 102, 103 shown in fig. 1), but is not limited thereto, and the multimedia information generating method 200 may also be applied to servers (e.g., the server 105 shown in fig. 1).
Referring to fig. 2, according to an exemplary embodiment of the present disclosure, first multimedia information is acquired at step S201. Specifically, the first multimedia information is acquired based on a photographing template having a plurality of predetermined display areas, and the photographing template is generated by adding an effect flag in each of the plurality of predetermined display areas. The first object is displayed in a part of the predetermined display areas of the plurality of predetermined display areas, and at least one predetermined display area of the remaining predetermined display areas of the plurality of predetermined display areas is used for photographing and displaying the second object.
According to an exemplary embodiment of the present disclosure, the step of acquiring the first multimedia information may include: based on the shooting template, first multimedia information is generated. Further, a part of the predetermined display areas of the plurality of predetermined display areas is used for photographing and displaying the first object.
In this way, by reserving predetermined display areas when the original video is shot, objects that later join the co-shoot can shoot within the original video's shooting environment, which improves the user experience.
According to an exemplary embodiment of the present disclosure, the generating of the photographing template may include: the following operations are performed in each of the plurality of predetermined display areas: obtaining a virtual display object, wherein the virtual display object is a virtual object generated based on characteristics of the object; determining a characteristic position of the virtual display object, wherein the characteristic position is position information corresponding to a characteristic of the object; an effect flag corresponding to the feature position is added in each predetermined display area.
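To make the template-generation flow concrete, the following is a minimal Python sketch. It is illustrative only: the `detect_landmarks` stub, the data structures, and the eye-midpoint characteristic position are assumptions for this sketch, not the patent's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class EffectMarker:
    kind: str                          # e.g. "graffiti", "magic_expression", "sticker"
    anchor: Tuple[float, float]        # characteristic position the marker is bound to

@dataclass
class DisplayArea:
    bounds: Tuple[int, int, int, int]  # (x, y, w, h) of the area inside the frame
    marker: Optional[EffectMarker] = None
    occupied: bool = False             # set True once an object has been shot here

def detect_landmarks(model_image):
    """Hypothetical stand-in for a face/body landmark detector."""
    return {"eyes": ((40.0, 50.0), (60.0, 50.0))}  # dummy landmarks for the sketch

def build_template(areas, model_image, marker_kind="graffiti"):
    # For each predetermined display area: place a virtual display object,
    # take one characteristic position (here: the midpoint between the eyes),
    # and add an effect marker bound to that position.
    for area in areas:
        (lx, ly), (rx, ry) = detect_landmarks(model_image)["eyes"]
        anchor = ((lx + rx) / 2.0, (ly + ry) / 2.0)
        area.marker = EffectMarker(marker_kind, anchor)
    return areas

template = build_template([DisplayArea((0, 0, 540, 480)) for _ in range(6)], None)
```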
Alternatively, the related operation of generating the photographing template may be performed on the client, but is not limited thereto. Alternatively, the related operation of generating the photographing template may be performed in the server, in which case the multimedia information generation method 200 is performed on the client and the photographing template is received from the server to achieve the acquisition of the photographing template.
According to an exemplary embodiment of the present disclosure, the generating of the multimedia information may include: determining a characteristic position of an object in a preset display area, wherein the characteristic position of the object is position information corresponding to the characteristic of a first object or a second object obtained through shooting; and synthesizing the effect mark in the preset display area for shooting the object with the object according to the characteristic position of the object and the characteristic position corresponding to the effect mark in the preset display area for shooting the object, thereby generating the multimedia information comprising the object.
According to an exemplary embodiment of the present disclosure, an operation of synthesizing an effect marker in a predetermined display area for photographing an object with the object may include: determining characteristic points of the effect marks meeting preset conditions; determining object feature matching points corresponding to the feature points of the effect mark based on the feature points of the effect mark; when the distance between the position of the feature matching point of the object and the position of the feature point of the corresponding effect marker is less than or equal to a predetermined threshold value, the effect marker in a predetermined display area for photographing the object is synthesized onto the object in a layer overlapping manner.
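A sketch of the distance gate just described, under the assumption that the marker feature points and the object's feature matching points come in corresponding pairs; the all-points rule here is one possible reading (a count-based variant appears after the face-matching example below).

```python
import math

def marker_snaps(object_points, marker_points, threshold):
    # The marker is composited onto the object only when each object feature
    # matching point lies within `threshold` of its corresponding marker point.
    return all(math.dist(o, m) <= threshold
               for o, m in zip(object_points, marker_points))

def composite(layers, marker_layer, object_points, marker_points, threshold):
    if marker_snaps(object_points, marker_points, threshold):
        layers.append(marker_layer)  # layer superposition: marker drawn on top
    return layers
```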
According to an exemplary embodiment of the present disclosure, the feature points of the effect marker that satisfy the preset condition may be determined by sampling the feature points of the effect marker, or by computing, from those feature points, feature points that characterize the marker.
Specifically, the preset condition may be interval sampling over all feature points, or a computation based on all feature points, finally yielding a preset number of feature points capable of characterizing the effect marker.
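Two illustrative realizations of the preset condition: interval sampling over all marker feature points, or computing a characteristic point (here, the centroid) from them. Both are sketches of the options named above, not the claimed method.

```python
def sample_feature_points(points, step=5):
    # Interval sampling: keep every `step`-th feature point of the effect marker.
    return points[::step]

def centroid(points):
    # One computed characteristic point derived from all feature points.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```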
According to an exemplary embodiment of the present disclosure, the condition that the distance between the object's feature matching points and the feature points of the corresponding effect marker is less than or equal to the predetermined threshold may further require that at least a predetermined number of the object's feature matching points each lie within the predetermined threshold of their corresponding marker feature points.
As an example, when the object's features are a person's facial features, the whole face can be matched (e.g., snapped to the marker) once a plurality of facial features (e.g., at least three, such as the two eyes and the nose) are each within the threshold. Alternatively, for a single feature (e.g., the nose), the nose snaps only when a sufficient number of its feature points are within the distance threshold.
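A sketch of the count-based gate in this example; the per-feature distances and the minimum of three matched features are assumptions drawn from the example above.

```python
def face_can_snap(feature_distances, threshold, min_matches=3):
    # The whole face snaps only when at least `min_matches` facial features
    # (e.g. the two eyes and the nose) are each within the distance threshold.
    return sum(d <= threshold for d in feature_distances) >= min_matches

distances = {"left_eye": 2.0, "right_eye": 3.1, "nose": 4.9}
print(face_can_snap(distances.values(), threshold=5.0))  # True
```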
According to an exemplary embodiment of the present disclosure, the first object and the second object may be located at different display layers, and there is a predefined sequential relationship between the display layer at which the first object is located and the display layer at which the second object is located. Alternatively, the predefined order relationship may be changed according to different object requirements.
Here, regarding the setting of the display layer, assuming that the first object is user a and the second object is user B, the video shot by user a is referred to as an original video, there may be three exemplary cases:
In case one, concerning a single-person layer setting: user A is one person, user B is one person, and user B, who joins the co-shoot, wants to appear in front of user A (in terms of their relative positions). In this case, when user B shoots, the portrait-segmented user B is shot normally to obtain a co-shot video of user B and user A, so that user B joins user A's shooting environment to complete the co-shoot. Here the original video has at least two predetermined display areas, and in the co-shot video user A and user B are displayed in two of them respectively. Further, in this case user B does not need to set a layer order at shooting time, because during layer-overlay display the layer holding user B sits above the layer holding the original video by default (i.e., user A's layer is at the bottom of all layers).
In case two, concerning a single-person layer setting: user A is one person, user B is one person, and user B, who joins the co-shoot, wants to appear behind user A (in terms of their relative positions). In this case, when user B shoots, user A and user B are first portrait-segmented and the order of their layers is set so that user B's layer is below user A's (i.e., user B is at the bottom of all layers); then user B, after segmentation and layer-order setting, is shot to obtain a co-shot video of user B and user A, so that user B joins user A's shooting environment to complete the co-shoot. Here the original video has at least two predetermined display areas, and in the co-shot video user A and user B are displayed in two of them respectively.
In case three, concerning a multi-person layer setting: user A is at least two people and user B one person, or user A one person and user B at least two people, or both are at least two people. In this case, before user B formally shoots the video, all people are first portrait-segmented and the order of each person's layer is set; then user B, after segmentation and layer-order setting, is shot to obtain a co-shot video of user B and user A, so that user B joins user A's shooting environment to complete the co-shoot.
Further, as an example of case three, when user A is two people (users A1 and A2) and user B is one person, before user B formally shoots the video, users A1, A2 and user B are first portrait-segmented, the order of the layer holding users A1 and A2 (here they share one layer) and the layer holding user B is set, and then user B, after segmentation and layer-order setting, is shot to obtain a co-shot video of user B and user A. In this case user B may be in front of both A1 and A2 as in case one, or behind both as in case two.
Here, it should be understood that the above cases are exemplary, and the layer order may be arranged in any combination.
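The layer cases above reduce to a painter's-algorithm composite over portrait-segmented layers. The following toy sketch (a 2x2 "frame" of characters) shows how a higher z puts user B in front of the original video, as in case one; the data layout is invented for illustration.

```python
def composite_layers(base_frame, layers):
    # Draw layers bottom-up by z; a later (higher-z) layer covers earlier ones
    # wherever its portrait-segmentation mask is True.
    out = [row[:] for row in base_frame]
    for layer in sorted(layers, key=lambda l: l["z"]):
        for y, mask_row in enumerate(layer["mask"]):
            for x, inside in enumerate(mask_row):
                if inside:
                    out[y][x] = layer["pixels"][y][x]
    return out

frame_a = [["A", "A"], ["A", "A"]]          # original video frame (user A)
user_b = {"z": 1,                           # case one: B in front of A
          "mask": [[True, False], [False, False]],
          "pixels": [["B", "B"], ["B", "B"]]}
print(composite_layers(frame_a, [user_b]))  # [['B', 'A'], ['A', 'A']]
```

For case two, user A would also be carried as a segmented layer and given a higher z than user B.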
According to an exemplary embodiment of the present disclosure, a total photographing duration of the second object may be equal to or less than a total duration of the first multimedia information.
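The duration constraint can be stated as a one-line check; a sketch with illustrative names:

```python
def can_finish_co_shoot(second_object_shooting_s: float,
                        first_media_total_s: float) -> bool:
    # The joining object's total shooting duration may not exceed the
    # total duration of the first multimedia information.
    return second_object_shooting_s <= first_media_total_s
```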
According to an exemplary embodiment of the present disclosure, the effect mark may include at least one of a graffiti mark, a magic expression mark, and a sticker mark.
Here, the effect markers in the present disclosure may be provided by an application on the client, may be received from any other device, or may be input by a user (e.g., user-made effect markers).
As an example, the multimedia information may include information in any multimedia form, such as video or images. For ease of description, the examples below use video.
According to an exemplary embodiment of the present disclosure, in step S202, second multimedia information is generated. Specifically, the first object is displayed in a part of the predetermined display area in the second multimedia information, and the second object is displayed in at least one of the predetermined display areas in the second multimedia information.
According to an exemplary embodiment of the present disclosure, the generation step of the second multimedia information is similar to the generation step of the multimedia information described in step S201, and is not described here again.
Alternatively, the "subject" referred to in the present disclosure may be any subject that can be used for photographing, such as a person, an animal, a still, or the like. In the embodiments in the present disclosure, a specific description will be given with the object being a human example.
The multimedia information generation method 200 will be described below with reference to fig. 3 by taking a video as an example of the multimedia information. Fig. 3 is a schematic diagram illustrating an example of a multimedia information generation method according to an exemplary embodiment of the present disclosure. In the example of fig. 3, the photographic template includes 6 predetermined display areas, and the effect marker in each display area is a graffiti marker.
Referring to fig. 3: in the first step, the shooting operation is entered and the shooting window is displayed. In the second step, the graffiti operation starts: before the video is formally shot, the 1st of the 6 predetermined display areas is set together with its graffiti marker. In the third step, the remaining 5 predetermined display areas and their graffiti markers are set. In the fourth step, formal shooting starts; in this example the shooting objects are three people, displayed in 3 different predetermined display areas (one person per area), leaving the remaining 3 predetermined display areas available for other people to shoot into later. In the fifth step, the original video is finished and edited further, e.g., adding more graffiti, magic expressions, stickers, music, or text (see the editable content shown in fig. 4). In the sixth and seventh steps, video-related information can be edited and personalized settings applied. In the eighth and ninth steps, the video is published, optionally to a double-column or single-column feed. It should be understood that the fifth to seventh steps may be omitted, and that one of the eighth and ninth steps is performed.
The process of generating a shooting template is described below using the graffiti example above. The specific process is to perform the following in each of the plurality of predetermined display areas: obtain a virtual display object (for example, a character model image, used as the example below); capture the feature points of the character model image and identify their positions (the midpoint between the two eyes is used as the example feature point below); and add a graffiti marker corresponding to the feature point in each predetermined display area. Alternatively, the feature point may be a point corresponding to some position on the model's body, or any other point from which the character's approximate position can be determined.
After the shooting template is generated, when a real person is to be shot with it, the shooting process may be as follows: during shooting, capture the real person's feature points and identify the midpoint between the eyes; when the distance between the real person's feature point position and the model image's feature point position is less than or equal to the predetermined threshold, composite the graffiti marker of that predetermined display area onto the real person by layer superposition. Taking fig. 3 as an example, with three people and 6 graffiti markers, the three people can be displayed composited with 3 different graffiti markers, respectively.
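A per-frame sketch of the real-person shooting flow just described, reusing the eye-midpoint anchor; `detect_landmarks` is again a hypothetical detector, and keeping the graffiti snapped once matched is an assumption of this sketch.

```python
import math

def eye_midpoint(landmarks):
    (lx, ly), (rx, ry) = landmarks["eyes"]
    return ((lx + rx) / 2.0, (ly + ry) / 2.0)

def shoot_with_template(frames, anchor, graffiti, threshold, detect_landmarks):
    # Per frame: locate the live person's eye midpoint; once it comes within
    # `threshold` of the template anchor, composite the graffiti onto the person.
    out = []
    snapped = False
    for frame in frames:
        point = eye_midpoint(detect_landmarks(frame))
        snapped = snapped or math.dist(point, anchor) <= threshold
        out.append((frame, graffiti if snapped else None))
    return out
```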
Alternatively, a graffiti marker that has been composited may change as the corresponding person's expression changes.
In this way, reserving predetermined display areas when the original video is shot lets objects that later join the co-shoot do so within the original video's shooting environment, which improves the user experience.
According to embodiments of the present disclosure, a series of expression templates is provided (e.g., templates with pre-made graffiti); the expressions may be customized by a user, drawn as online graffiti, or taken from pre-set expression templates, and a co-shooting object (e.g., a co-shooting user) completes the co-shoot by matching expressions with the corresponding positions in the expression template, which improves the co-shooting user's experience.
The multimedia information generation method 200 will be described below with reference to fig. 5, again taking video as the example of multimedia information. Fig. 5 is a schematic diagram illustrating an example of a multimedia information generation method according to an exemplary embodiment of the present disclosure. The example of fig. 5 continues the example of fig. 3: the shooting template again includes 6 predetermined display areas, the effect marker in each display area is a graffiti marker, the shooting objects in the original video are three people, and the shooting object to join the co-shoot (hereinafter, the joining user) is one person.
Referring to fig. 5: in the first step, the co-shoot joining window is entered. In the second step, video shooting starts: before formal shooting, the joining user is shot into a 4th predetermined display area, different from the 3 predetermined display areas already displaying the three people in the original video, according to the joining user's needs. Here it should be understood that the joining user's display content has already undergone the object segmentation and layer-order setting described above, which is not repeated here. In the third step, the shot video is re-created, i.e., edited further; the editing is similar to the fifth step described with reference to fig. 3 and is not repeated here. The fourth and fifth steps are similar to the sixth to ninth steps described with reference to fig. 3 and are likewise not repeated. It should be understood that the third and fourth steps described above may be omitted.
According to an exemplary embodiment of the present disclosure, after the second multimedia information (e.g., a video) is generated in step S202, as long as predetermined display areas remain in it, all the co-shoot joining operations can be performed again, and so on, until no predetermined display area remains in the current multimedia information.
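Bookkeeping for the "join until no area remains" behavior; a sketch using a minimal stand-in for the `DisplayArea` structure from the template sketch above.

```python
from dataclasses import dataclass

@dataclass
class DisplayArea:                 # minimal stand-in, as in the template sketch
    occupied: bool = False

def free_areas(areas):
    return [a for a in areas if not a.occupied]

def join_co_shoot(areas):
    # Each joining object takes one free predetermined display area; joining
    # can repeat until no area remains in the current multimedia information.
    free = free_areas(areas)
    if not free:
        raise RuntimeError("no predetermined display area left to join")
    free[0].occupied = True
    return free[0]

areas = [DisplayArea() for _ in range(6)]
for _ in range(3):                 # the three original shooters
    join_co_shoot(areas)
print(len(free_areas(areas)))      # 3 areas left for later joining users
```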
The above procedure is illustrated in tabular form below.
TABLE 1
                 User       Work          User          Work
Original author  User a     Work a1       User aa       Work aa1
                 User b     Work b1       User ba       Work ba1
                 User c     Work c1       User ca       Work ca1
                 User n+1   Work (n+1)1   User (n+1)a   Work (n+1)a1
Referring to table 1, the original author publishes the original work; user a watches it and joins a co-shoot, generating work a1 and completing one forwarding. After a co-shoot is added to a work, the original author may be notified in various possible ways, such as (but not limited to) a private message (with support for a blocking function). User aa watches user a's co-shot work and joins it, producing work aa1, and so on. Sharing to n people can thus generate n different works that develop into different user chains.
Here, it should be noted that the original author's video cannot be modified: the current shooter can change only the editable content related to himself and cannot change the content of any earlier shooters. This is achieved by setting the order of each shooter's layer as described above, and it preserves the integrity of the video shot by each shooter (i.e., it is not modified by other shooters).
In the manner of table 1, one user's video can be expanded into more videos by more people through co-shoot joining, which raises users' interest in producing content, effectively brings user creativity into play, and improves scalability.
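The forwarding chain of table 1 is naturally a tree of works, each pointing at the work it joined; a toy model with names taken from the table:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Work:
    author: str
    parent: Optional["Work"] = None  # the work this co-shoot joined, if any

original = Work("original author")
a1 = Work("user a", parent=original)  # user a joins the original work
aa1 = Work("user aa", parent=a1)      # user aa joins user a's co-shot work
```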
According to the exemplary embodiments of the present disclosure, the multimedia information generation method increases interaction between objects, thereby raising the creative desire of objects participating in the co-shoot, enhancing interactivity, and improving the user experience.
Further, according to an exemplary embodiment of the present disclosure, during a co-shoot an effect marker (e.g., a graffiti expression, sticker, or magic expression) automatically snaps to a predetermined region of an object (e.g., a person's face), and the effect marker interacts with the object, making the co-shot video more fun. By continually adding new objects, virtual interaction among objects is realized, creative inspiration is stimulated, and the video's appeal increases.
Fig. 6 is a block diagram illustrating a multimedia information generating apparatus according to an exemplary embodiment of the present disclosure.
Referring to fig. 6, the multimedia information generating apparatus 600 includes an acquisition module 601 and a generation module 602. Specifically, the acquisition module 601 is configured to: the method includes acquiring first multimedia information, wherein the first multimedia information is acquired based on a photographing template having a plurality of predetermined display areas, and the photographing template is generated by adding an effect flag in each of the plurality of predetermined display areas, wherein a first object is displayed in a portion of the predetermined display areas among the plurality of predetermined display areas, and at least one predetermined display area among remaining predetermined display areas among the plurality of predetermined display areas is used for photographing and displaying a second object. The generation module 602 is configured to: the second multimedia information is generated, wherein the first object is displayed in a portion of the predetermined display area in the second multimedia information, and the second object is displayed in at least one of the predetermined display areas in the second multimedia information.
According to an exemplary embodiment of the present disclosure, the operation of the acquisition module acquiring the first multimedia information may include: based on the photographing template, first multimedia information is generated, wherein a portion of the predetermined display areas of the plurality of predetermined display areas are used for photographing and displaying the first object.
In this way, by reserving predetermined display areas when the original video is shot, objects that later join the co-shoot can shoot within the original video's shooting environment, which improves the user experience.
According to an exemplary embodiment of the present disclosure, a photographing template may be generated by: the following operations are performed in each of the plurality of predetermined display areas: obtaining a virtual display object, wherein the virtual display object is a virtual object generated based on characteristics of the object; determining a characteristic position of the virtual display object, wherein the characteristic position is position information corresponding to a characteristic of the object; an effect flag corresponding to the feature position is added in each predetermined display area.
According to an exemplary embodiment of the present disclosure, the generating operation of the multimedia information may include: determining a characteristic position of an object in a preset display area, wherein the characteristic position of the object is position information corresponding to the characteristic of a first object or a second object obtained through shooting; and synthesizing the effect mark in the preset display area for shooting the object with the object according to the characteristic position of the object and the characteristic position corresponding to the effect mark in the preset display area for shooting the object, thereby generating the multimedia information comprising the object.
According to an exemplary embodiment of the present disclosure, an operation of synthesizing an effect marker in a predetermined display area for photographing an object with the object may include: determining characteristic points of the effect marks meeting preset conditions; determining object feature matching points corresponding to the feature points of the effect mark based on the feature points of the effect mark; when the distance between the position of the feature matching point of the object and the position of the feature point of the corresponding effect marker is less than or equal to a predetermined threshold value, the effect marker in a predetermined display area for photographing the object is synthesized onto the object in a layer overlapping manner.
According to an exemplary embodiment of the present disclosure, the feature points of the effect marker that satisfy the preset condition may be determined by sampling the feature points of the effect marker, or by computing, from those feature points, feature points that characterize the marker.
According to an exemplary embodiment of the present disclosure, the condition that the distance between the object's feature matching points and the feature points of the corresponding effect marker is less than or equal to the predetermined threshold may further require that at least a predetermined number of the object's feature matching points each lie within the predetermined threshold of their corresponding marker feature points.
According to an exemplary embodiment of the present disclosure, the first object and the second object may be located at different display layers, and there is a predefined sequential relationship between the display layer at which the first object is located and the display layer at which the second object is located. Alternatively, the predefined order relationship may be changed according to different object requirements.
According to an exemplary embodiment of the present disclosure, a total photographing duration of the second object may be equal to or less than a total duration of the first multimedia information.
According to an exemplary embodiment of the present disclosure, the effect mark may include at least one of a graffiti mark, a magic expression mark, and a sticker mark.
The processes specifically performed by the above-described respective modules and the related process information have been described above with reference to fig. 2, and a description thereof will not be repeated.
With the multimedia information generating apparatus according to the embodiments of the present disclosure, interaction between objects is increased, so the creative desire of objects participating in the co-shoot is raised, interactivity is enhanced, and the user experience is improved.
In addition, when the apparatus provided in the above embodiments implements its functions, the division into the functional modules described above is merely illustrative; in practical applications, the functions may be allocated to different functional modules as needed. That is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above.
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in connection with the method embodiments and will not be repeated here.
Fig. 7 is a block diagram illustrating an electronic device 700 according to an exemplary embodiment of the present disclosure. The electronic device 700 may be, for example, a smartphone, tablet computer, notebook computer, or desktop computer. The electronic device 700 may also be referred to by other names, such as user device, portable terminal, laptop terminal, or desktop terminal.
In general, the electronic device 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, for example a 4-core or 8-core processor. The processor 701 may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processor), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor: the main processor, also referred to as a CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 701 may integrate a GPU (Graphics Processing Unit) responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, the processor 701 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 702 may include one or more computer-readable storage media, which may be non-transitory. The memory 702 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 702 stores at least one instruction that is executed by the processor 701 to implement the method of the present disclosure shown in Fig. 2.
In some embodiments, the electronic device 700 may optionally further include a peripheral interface 703 and at least one peripheral device. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Each peripheral device may be connected to the peripheral interface 703 via a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of: a radio frequency circuit 704, a touch display screen 705, a camera assembly 706, an audio circuit 707, a positioning assembly 708, and a power supply 709.
The peripheral interface 703 may be used to connect at least one I/O (Input/Output)-related peripheral device to the processor 701 and the memory 702. In some embodiments, the processor 701, the memory 702, and the peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of them may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 704 is configured to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 704 communicates with communication networks and other communication devices via electromagnetic signals, converting electrical signals into electromagnetic signals for transmission and converting received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 704 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol, including but not limited to metropolitan area networks, the various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuitry, which is not limited by the present disclosure.
The display screen 705 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display 705 is a touch display, it can also collect touch signals at or above its surface; such a touch signal may be input to the processor 701 as a control signal for processing. In this case, the display 705 may also provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 705, disposed on the front panel of the electronic device 700; in other embodiments, there may be at least two displays 705, disposed on different surfaces of the electronic device 700 or in a folded design; in still other embodiments, the display 705 may be a flexible display disposed on a curved or folded surface of the electronic device 700. The display 705 may even be arranged in a non-rectangular irregular pattern, i.e., an irregularly shaped screen. The display 705 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 706 is used to capture images or video. Optionally, the camera assembly 706 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main and depth-of-field cameras can be fused for a background blurring function, and the main and wide-angle cameras can be fused for panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 706 may also include a flash, which can be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 707 may include a microphone and a speaker. The microphone collects sound waves from users and the environment, converts them into electrical signals, and inputs them to the processor 701 for processing, or to the radio frequency circuit 704 for voice communication. For stereo acquisition or noise reduction, there may be multiple microphones disposed at different locations of the electronic device 700. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker converts electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 707 may also include a headphone jack.
The positioning component 708 is used to locate the current geographic position of the electronic device 700 for navigation or LBS (Location Based Service). The positioning component 708 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 709 is used to power the various components in the electronic device 700. The power supply 709 may use alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 709 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging, and may also support fast-charging technology.
In some embodiments, the electronic device 700 further includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyroscope sensor 712, pressure sensor 713, fingerprint sensor 714, optical sensor 715, and proximity sensor 716.
The acceleration sensor 711 can detect the magnitudes of acceleration on the three coordinate axes of a coordinate system established with respect to the electronic device 700. For example, the acceleration sensor 711 may be used to detect the components of gravitational acceleration along the three coordinate axes. The processor 701 may control the touch display screen 705 to display the user interface in landscape or portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 711. The acceleration sensor 711 may also be used to collect motion data for games or the user.
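As a toy illustration of this orientation logic (an assumption-laden sketch, not device firmware): pick landscape or portrait from the dominant axis of the gravity vector reported by the accelerometer, assuming x runs along the screen width and y along its height.

```python
def choose_orientation(gx: float, gy: float) -> str:
    """Return the UI orientation implied by the gravity components on the x/y axes."""
    return "landscape" if abs(gx) > abs(gy) else "portrait"
```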
The gyro sensor 712 may detect the body orientation and rotation angle of the electronic device 700, and may cooperate with the acceleration sensor 711 to collect the user's 3D motions on the electronic device 700. Based on the data collected by the gyro sensor 712, the processor 701 may implement functions such as motion sensing (e.g., changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 713 may be disposed on a side frame of the electronic device 700 and/or at a lower layer of the touch display screen 705. When disposed on a side frame, it can detect the user's grip signal on the electronic device 700, and the processor 701 performs left/right-hand recognition or shortcut operations according to the collected grip signal. When disposed at the lower layer of the touch display screen 705, the processor 701 controls the operability controls on the UI according to the user's pressure operations on the screen. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 714 is used to collect the user's fingerprint; either the processor 701 identifies the user's identity from the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 itself identifies the user's identity from the collected fingerprint. Upon recognizing the user's identity as trusted, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 714 may be provided on the front, back, or side of the electronic device 700. When a physical key or vendor logo is provided on the electronic device 700, the fingerprint sensor 714 may be integrated with it.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the touch display 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the intensity of the ambient light is high, the display brightness of the touch display screen 705 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 705 is turned down. In another embodiment, the processor 701 may also dynamically adjust the shooting parameters of the camera assembly 706 based on the ambient light intensity collected by the optical sensor 715.
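A minimal sketch of such brightness control, assuming ambient illuminance is reported in lux and display brightness is normalized to [0, 1]; the clamp bounds and the logarithmic response are illustrative assumptions, not values from the disclosure.

```python
import math

def brightness_from_lux(lux: float, lo: float = 10.0, hi: float = 1000.0) -> float:
    """Map ambient illuminance (lux) to a display brightness in [0, 1]."""
    lux = min(max(lux, lo), hi)  # clamp to the working range
    # A logarithmic response roughly tracks perceived brightness.
    return (math.log(lux) - math.log(lo)) / (math.log(hi) - math.log(lo))
```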
The proximity sensor 716, also referred to as a distance sensor, is typically provided on the front panel of the electronic device 700 and is used to measure the distance between the user and the front of the device. In one embodiment, when the proximity sensor 716 detects that this distance is gradually decreasing, the processor 701 controls the touch display screen 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 716 detects that the distance is gradually increasing, the processor 701 controls the touch display screen 705 to switch from the screen-off state back to the screen-on state.
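The proximity behavior amounts to a small state machine; below is a sketch with illustrative thresholds and hysteresis (so the screen does not flicker near the boundary), none of which are specified by the disclosure.

```python
class ProximityScreenController:
    """Toggle the screen off as the device nears the user's face and back on as it recedes."""

    def __init__(self, off_below_cm: float = 3.0, on_above_cm: float = 5.0):
        self.off_below = off_below_cm   # turn the screen off below this distance
        self.on_above = on_above_cm     # turn the screen back on above this distance
        self.screen_on = True

    def update(self, distance_cm: float) -> bool:
        if self.screen_on and distance_cm < self.off_below:
            self.screen_on = False
        elif not self.screen_on and distance_cm > self.on_above:
            self.screen_on = True
        return self.screen_on
```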
Those skilled in the art will appreciate that the structure shown in fig. 7 is not limiting of the electronic device 700 and may include more or fewer components than shown, or may combine certain components, or may employ a different arrangement of components.
According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions, wherein the instructions, when executed by at least one processor, cause the at least one processor to perform the multimedia information generating method according to the present disclosure. Examples of the computer-readable storage medium here include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disk storage, hard disk drives (HDD), solid-state drives (SSD), card memory (such as multimedia cards, Secure Digital (SD) cards, or eXtreme Digital (XD) cards), magnetic tape, floppy disks, magneto-optical data storage devices, hard disks, solid-state disks, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide the computer program and any associated data, data files, and data structures to a processor or computer so that the processor or computer can execute the program. The computer program in the computer-readable storage medium described above can run in an environment deployed on a computer device, such as a client, a host, a proxy device, or a server; further, in one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems so that they are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
In accordance with embodiments of the present disclosure, there may also be provided a computer program product in which instructions are executable by a processor of a computer device to perform the above-described multimedia information generating method.
According to the multimedia information generation method of the present disclosure, interaction between objects can be increased, thereby stimulating the creative desire of the objects participating in the shooting, enhancing interactivity, and optimizing the user experience.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following its general principles and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (20)

1. A video generation method, comprising:
acquiring a first video, wherein the first video is acquired based on a photographing template having a plurality of predetermined display areas, and the photographing template is generated by adding an effect mark in each of the plurality of predetermined display areas, wherein a first object is displayed in a part of the predetermined display areas of the plurality of predetermined display areas, and at least one predetermined display area of the remaining predetermined display areas of the plurality of predetermined display areas is used for photographing and displaying a second object;
generating a second video, wherein a first object is displayed in the portion of the predetermined display area in the second video and a second object is displayed in the at least one predetermined display area in the second video,
the video generation step comprises the following steps:
determining a characteristic position of an object in a preset display area, wherein the characteristic position of the object is position information corresponding to the characteristic of a first object or a second object obtained through shooting;
and synthesizing the effect mark in the predetermined display area for photographing the object with the object according to the characteristic position of the object and the characteristic position corresponding to the effect mark in the predetermined display area for photographing the object, so as to generate a video comprising the object, wherein, in the process of this synthesis, the effect mark in the predetermined display area for photographing the object automatically snaps onto a predetermined area of the object.
2. The video generation method of claim 1, wherein the step of acquiring the first video comprises:
and generating a first video based on the shooting template, wherein the part of the predetermined display areas of the plurality of predetermined display areas are used for shooting and displaying a first object.
3. The video generation method according to claim 1, wherein the generation step of the photographing template includes:
performing the following in each of the plurality of predetermined display areas:
obtaining a virtual display object, wherein the virtual display object is a virtual object generated based on the characteristics of the object;
determining a characteristic position of the virtual display object, wherein the characteristic position is position information corresponding to a characteristic of the object;
and adding an effect mark corresponding to the characteristic position in each preset display area.
4. The video generation method according to claim 1, wherein the operation of synthesizing the effect marker in the predetermined display area for capturing the subject with the subject includes:
determining characteristic points of the effect marks meeting preset conditions;
determining object feature matching points corresponding to the feature points of the effect mark based on the feature points of the effect mark;
and when the distance between the positions of the object feature matching points and the positions of the corresponding effect mark feature points is less than or equal to a predetermined threshold, synthesizing the effect mark in the predetermined display area for photographing the object onto the object by layer overlaying.
5. The video generation method of claim 4, wherein the determining feature points of the effect markers that satisfy the preset condition includes: and sampling the characteristic points of the effect marks or determining the characteristic points of the characteristic effect marks based on the calculation processing of the characteristic points of the effect marks.
6. The video generation method of claim 4, wherein when a distance between a feature matching point of an object and a feature point of a corresponding effect marker is less than or equal to a predetermined threshold, further comprising: the distance between the object matching feature points satisfying the predetermined number and the feature points of the corresponding effect markers is less than or equal to a predetermined threshold.
7. The video generation method of claim 1, wherein the first object and the second object are located at different display layers, and wherein the display layers at which the first object is located and the display layers at which the second object is located have a predefined sequential relationship therebetween, wherein the predefined sequential relationship varies according to different object requirements.
8. The video generation method according to claim 1, wherein a total photographing duration of the second object is equal to or less than a total duration of the first video.
9. The video generation method of claim 1, wherein the effect mark comprises at least one of a graffiti mark, a magic expression mark, and a sticker mark.
10. A video generating apparatus, comprising:
an acquisition module configured to: acquire a first video, wherein the first video is acquired based on a photographing template having a plurality of predetermined display areas, and the photographing template is generated by adding an effect mark in each of the plurality of predetermined display areas, wherein a first object is displayed in a part of the predetermined display areas among the plurality of predetermined display areas, and at least one predetermined display area among the remaining predetermined display areas of the plurality of predetermined display areas is used for photographing and displaying a second object;
a generation module configured to: generating a second video, wherein a first object is displayed in the portion of the predetermined display area in the second video and a second object is displayed in the at least one predetermined display area in the second video,
The video generation operation comprises the following steps:
determining a characteristic position of an object in a preset display area, wherein the characteristic position of the object is position information corresponding to the characteristic of a first object or a second object obtained through shooting;
and synthesizing the effect mark in the predetermined display area for photographing the object with the object according to the characteristic position of the object and the characteristic position corresponding to the effect mark in the predetermined display area for photographing the object, so as to generate a video comprising the object, wherein, in the process of this synthesis, the effect mark in the predetermined display area for photographing the object automatically snaps onto a predetermined area of the object.
11. The video generation apparatus of claim 10, wherein the operation of the acquisition module to acquire the first video comprises:
and generating a first video based on the shooting template, wherein the part of the predetermined display areas of the plurality of predetermined display areas are used for shooting and displaying a first object.
12. The video generating apparatus of claim 11, wherein the photographing template is generated by:
Performing the following in each of the plurality of predetermined display areas:
obtaining a virtual display object, wherein the virtual display object is a virtual object generated based on the characteristics of the object;
determining a characteristic position of the virtual display object, wherein the characteristic position is position information corresponding to a characteristic of the object;
and adding an effect mark corresponding to the characteristic position in each preset display area.
13. The video generating apparatus according to claim 10, wherein the operation of combining the effect mark in the predetermined display area for photographing the subject with the subject comprises:
determining characteristic points of the effect marks meeting preset conditions;
determining object feature matching points corresponding to the feature points of the effect mark based on the feature points of the effect mark;
and when the distance between the positions of the object feature matching points and the positions of the corresponding effect mark feature points is less than or equal to a predetermined threshold, synthesizing the effect mark in the predetermined display area for photographing the object onto the object by layer overlaying.
14. The video generating apparatus according to claim 13, wherein the determining the feature point of the effect flag satisfying the preset condition includes: and sampling the characteristic points of the effect marks or determining the characteristic points of the characteristic effect marks based on the calculation processing of the characteristic points of the effect marks.
15. The video generating apparatus according to claim 13, wherein when a distance between a feature matching point of the object and a feature point of the corresponding effect marker is less than or equal to a predetermined threshold value, further comprising: the distance between the object matching feature points satisfying the predetermined number and the feature points of the corresponding effect markers is less than or equal to a predetermined threshold.
16. The video generating apparatus of claim 10, wherein the first object and the second object are located at different display layers, and wherein the display layers at which the first object is located and the display layers at which the second object is located have a predefined sequential relationship therebetween, wherein the predefined sequential relationship varies according to different object requirements.
17. The video generating apparatus according to claim 10, wherein a total photographing duration of the second object is equal to or less than a total duration of the first video.
18. The video generating apparatus of claim 10, wherein the effect indicia comprises at least one of a graffiti indicia, a magic expression indicia, a decal indicia.
19. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions,
Wherein the processor is configured to execute the instructions to implement the video generation method of any of claims 1 to 9.
20. A computer-readable storage medium storing instructions which, when executed by a processor of an electronic device/server, cause the electronic device/server to perform the video generation method of any one of claims 1 to 9.
CN202111005750.4A 2021-08-30 2021-08-30 Method, device, electronic equipment and storage medium for generating multimedia information Active CN113727024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111005750.4A CN113727024B (en) 2021-08-30 2021-08-30 Method, device, electronic equipment and storage medium for generating multimedia information

Publications (2)

Publication Number Publication Date
CN113727024A CN113727024A (en) 2021-11-30
CN113727024B true CN113727024B (en) 2023-07-25

Family

ID=78679203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111005750.4A Active CN113727024B (en) 2021-08-30 2021-08-30 Method, device, electronic equipment and storage medium for generating multimedia information

Country Status (1)

Country Link
CN (1) CN113727024B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114584599B (en) * 2022-03-18 2023-05-16 北京字跳网络技术有限公司 Game data processing method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016103832A (en) * 2015-12-21 2016-06-02 フリュー株式会社 Photo sticker creation device, photo sticker creation method and program
CN109218630A (en) * 2017-07-06 2019-01-15 腾讯科技(深圳)有限公司 A kind of method for processing multimedia information and device, terminal, storage medium
CN110868639A (en) * 2019-11-28 2020-03-06 北京达佳互联信息技术有限公司 Video synthesis method and device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104580930B (en) * 2013-10-28 2019-09-27 腾讯科技(深圳)有限公司 Group photo method and system
KR20160097844A (en) * 2015-02-10 2016-08-18 신유원 Group photo amednding system, and group photo amednding method using thereof
CN110166799A (en) * 2018-07-02 2019-08-23 腾讯科技(深圳)有限公司 Living broadcast interactive method, apparatus and storage medium
CN109068055B (en) * 2018-08-10 2021-01-08 维沃移动通信有限公司 Composition method, terminal and storage medium
CN109089059A (en) * 2018-10-19 2018-12-25 北京微播视界科技有限公司 Method, apparatus, electronic equipment and the computer storage medium that video generates
CN112188074B (en) * 2019-07-01 2022-08-05 北京小米移动软件有限公司 Image processing method and device, electronic equipment and readable storage medium
CN110458916A (en) * 2019-07-05 2019-11-15 深圳壹账通智能科技有限公司 Expression packet automatic generation method, device, computer equipment and storage medium
CN116320721A (en) * 2019-08-29 2023-06-23 腾讯科技(深圳)有限公司 Shooting method, shooting device, terminal and storage medium
CN110602396B (en) * 2019-09-11 2022-03-22 腾讯科技(深圳)有限公司 Intelligent group photo method and device, electronic equipment and storage medium
CN112004034A (en) * 2020-09-04 2020-11-27 北京字节跳动网络技术有限公司 Method and device for close photographing, electronic equipment and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fan Kaixi. Dynamic Composition. Ocean University of China Press, 2013, pp. 84-89. *

Also Published As

Publication number Publication date
CN113727024A (en) 2021-11-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant