WO2017157135A1 - Media information processing method, media information processing apparatus and storage medium - Google Patents

Media information processing method, media information processing apparatus and storage medium

Info

Publication number
WO2017157135A1
WO2017157135A1 (application PCT/CN2017/074174; priority CN2017074174W)
Authority
WO
WIPO (PCT)
Prior art keywords
media information
segment
information segment
target
target media
Prior art date
Application number
PCT/CN2017/074174
Other languages
English (en)
French (fr)
Inventor
邬振海
傅斌
崔凌睿
汪倩怡
戴阳刚
时峰
吴发强
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2017157135A1 publication Critical patent/WO2017157135A1/zh
Priority to US16/041,585 priority Critical patent/US10652613B2/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Definitions

  • the present invention relates to media information processing technologies, and in particular, to a media information processing method, a media information processing apparatus, and a storage medium.
  • Users often share media information in various forms. Using the camera of a mobile terminal device such as a smart phone or tablet, a user can capture video information, store it locally on the device, or share it, for example on a social network or with a specific contact.
  • Users also want to imitate existing media information, including segments (or all) of finished film and television works such as movies and TV dramas, and shoot corresponding media information segments themselves.
  • In the related art, professional media editing software is used to process the captured media information segment and substitute it for the original segment in the media information, so as to fuse the user-captured segment with the media information.
  • The embodiments of the invention provide a media information processing method, a media information processing apparatus and a storage medium, which enable efficient and seamless synthesis of the media information segment captured from the user with the media information the user imitates.
  • an embodiment of the present invention provides a media information processing method, including:
  • an embodiment of the present invention provides a media information processing apparatus, where the media information processing apparatus includes:
  • a first determining module configured to determine media information segments of the target media information and a feature of each media information segment;
  • an acquiring module configured to collect, from the first user side and based on the determined feature, a first media information segment corresponding to the target media information segment;
  • a second determining module configured to determine the media information segments other than the target media information segment in the target media information, and acquire second media information segments corresponding to the features of the determined media information segments;
  • a third determining module configured to determine a splicing manner of each media information segment in the target media information;
  • a splicing module configured to splice the first media information segment and the second media information segments based on the determined splicing manner to obtain spliced media information.
  • an embodiment of the present invention provides a media information processing apparatus, where the media information processing apparatus includes a memory and a processor, the memory storing executable instructions for causing the processor to perform operations including:
  • an embodiment of the present invention provides a non-volatile computer storage medium, where the computer storage medium stores executable instructions, and the executable instructions are used to execute the media information processing method provided by the embodiments of the present invention.
  • By prompting the first user side with the features (such as lines) of the media information segment it desires to perform, the first user side can imitate the target media information segment without memorizing all of its features.
  • After the first user side's imitation of the target media information segment is collected, it is spliced with the media information segments that the first user side did not imitate, based on the features of those segments; beyond imitating the target media information segment, the entire process requires no operation on the first user side, after which the complete media information is obtained.
  • This solves the problem that the user side cannot operate professional media editing software and therefore cannot generate complete media information, and improves the processing efficiency for media information.
  • FIG. 1 is a schematic structural diagram of an optional hardware of a first device in an embodiment of the present invention
  • FIG. 2 is an optional schematic flowchart of a method for processing media information in an embodiment of the present invention
  • FIG. 3 is a schematic implementation diagram of media information segmentation in an embodiment of the present invention.
  • FIG. 4 is another schematic implementation diagram of media information segmentation in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an optional implementation of media information splicing in an embodiment of the present invention.
  • FIG. 6 is still another schematic implementation diagram of media information splicing in the embodiment of the present invention.
  • FIG. 7 is a schematic diagram of synchronizing the process of collecting the first media information segment with the process of acquiring the second media information segment according to an embodiment of the present invention;
  • FIG. 8 is a schematic diagram of a scenario for sharing media information after splicing according to an embodiment of the present invention.
  • FIG. 9 is still another optional schematic flowchart of a method for processing media information in an embodiment of the present invention.
  • FIG. 10 is another schematic implementation diagram of media information segmentation in an embodiment of the present invention.
  • FIG. 11 is another schematic implementation diagram of media information splicing in the embodiment of the present invention.
  • FIG. 12 is another schematic implementation diagram of media information splicing in the embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of an optional system of a media information processing apparatus according to an embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of still another optional system of the media information processing apparatus in the embodiment of the present invention.
  • The apparatus provided in the embodiments of the present invention may be implemented in various manners, for example, by implementing all components of the apparatus in a mobile terminal device such as a smart phone, tablet computer or notebook computer, or by implementing the components in a coupled manner across the mobile terminal device and the server side described above.
  • The microphone 130 collects sound in the audio collection mode and outputs audio information that may be processed by the processor 110; the camera 140, in the image acquisition mode, performs image or video capture of the user side of the first device 100 (hereinafter the first user side; the first user side corresponds to at least one user) and outputs video information to be processed by the processor 110.
  • The memory 150 is used to store the audio information output by the microphone 130 and the video information output by the camera 140, and to store the results of the processor 110 processing that audio information and video information.
  • The communication module 160 supports data communication between the processor 110 and the server side, such as transmitting the processing result of the media information stored in the memory 150 to the server on the network side, or receiving media information issued by the server side for the processor 110 to process; the power supply module 170 provides operating power to the other modules in the first device 100.
  • the display module 120 can be implemented as a liquid crystal display module, an organic light emitting diode display module, or the like.
  • the camera 140 can be implemented as a single camera, a dual camera, or a 3D camera.
  • the microphone 130 can be implemented as a single microphone or a dual microphone (including a main microphone and a secondary microphone).
  • the memory 150 can be implemented as a flash memory, a read-only memory, a removable storage device, etc.
  • The communication module 160 can be implemented as a cellular communication chip with peripheral modules (such as a phone card holder and a radio frequency module) and a cellular antenna, or as a wireless fidelity (WiFi) communication chip with peripheral modules (such as a radio frequency module) and a WiFi antenna.
  • Not all of the modules of the first device 100 shown in FIG. 1 are required to implement the embodiments of the present invention; some or all of the hardware structures shown in FIG. 1 may be adopted according to the functions implemented by the first device 100 in the embodiments of the present invention.
  • The embodiment of the present invention describes a media information processing method, taking media information that includes video information and audio information as an example.
  • the media information processing method according to the embodiment of the present invention includes the following steps:
  • Step 101 The first device determines a media information segment of the target media information and a feature of each media information segment.
  • Media information may be stored in databases of the first device and the server side. The media information to be processed (that is, the target media information) is the media information that the user side of the first device (that is, the first user side, which includes the user of the first device and may further include other users who cooperate with that user in the imitation) desires to imitate; the first user side may imitate one or more segments of the target media information, or of course all of its segments.
  • the media information segment is determined by segmenting the target media information based on the characteristics of the target media information. For example, the following manner may be adopted:
  • Method 1: when the feature of the target media information characterizes its duration, divide the target media information into media information segments along the time axis;
  • Method 2: when the feature of the target media information characterizes the person characters it carries, extract from the target media information the segments that include each person character, obtaining media information segments in which each segment carries only one character, with different segments carrying different characters.
  • In Method 1, segmentation is based on the time axis: the target media information is divided evenly (or unevenly) according to its duration, following the time-axis order. This includes segmenting both the video information and the audio information in the target media information (the video information and audio information can be separated from the target media information in advance), and each resulting media information segment includes a video information segment and an audio information segment.
  • When the media information is segmented according to the scenario of the target media information (that is, the time segments corresponding to different plots on the time axis), it is more convenient for the first user side to select the media information segment it desires to imitate.
  • Referring to FIG. 3, a schematic diagram of segmenting the target media information based on Method 1: the target media information is divided into four media information segments, namely media information segment A, media information segment B, media information segment C and media information segment D, where each media information segment comprises a video information segment and an audio information segment; for example, media information segment A comprises video information segment A and audio information segment A.
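The time-axis segmentation of Method 1 can be sketched as follows; this is a minimal illustration, in which the function name, the `Segment` record and the 120-second example with its plot boundaries are assumptions, not part of the patent:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    # One media information segment; each such segment comprises a video
    # information segment and an audio information segment sharing the
    # same span on the time axis.
    label: str
    start: float  # seconds
    end: float    # seconds

def split_on_time_axis(total_duration, boundaries):
    # Divide target media information of total_duration seconds into
    # segments at the given (possibly uneven) boundary timestamps,
    # e.g. the boundaries between plots.
    points = [0.0] + sorted(boundaries) + [total_duration]
    labels = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return [Segment(labels[i], s, e)
            for i, (s, e) in enumerate(zip(points, points[1:]))]

# Splitting a hypothetical 120-second clip at plot boundaries 25 s, 60 s
# and 95 s yields four segments A-D, as in FIG. 3.
segments = split_on_time_axis(120.0, [25.0, 60.0, 95.0])
```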
  • In Method 2, the target media information is segmented based on the different person characters it carries, sequentially extracting from it the media information segments (including video information segments and audio information segments) that each carry only one character. Each frame image of the video information of the target media information is identified by image recognition technology to determine the person character it carries (in the example, the video information carries character 1 and character 2); referring to FIG. 4.
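A sketch of the grouping step of Method 2, assuming image recognition has already labeled each frame with the single character it carries; the per-frame labels, timestamps and function name are illustrative only:

```python
def extract_character_segments(labeled_frames):
    # labeled_frames: list of (timestamp, character) pairs produced by
    # running image recognition over each frame image of the video.
    # Consecutive frames carrying the same character are merged into one
    # per-character media information segment with a start and end time.
    segments = []
    for t, character in labeled_frames:
        if segments and segments[-1]["character"] == character:
            segments[-1]["end"] = t  # extend the current segment
        else:
            segments.append({"character": character, "start": t, "end": t})
    return segments

# Frames alternate between character 1 and character 2, as in FIG. 4.
frames = [(0.0, "character 1"), (0.5, "character 1"),
          (1.0, "character 2"), (1.5, "character 2"),
          (2.0, "character 1")]
per_character = extract_character_segments(frames)
```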
  • Besides segmenting the target media information according to its characteristics, the first device may display the features of the target media information, such as its name, a per-segment story summary, the duration, the designed person roles, and so on, and receive an instruction from the first user side for dividing the target media information, such as an instruction to segment based on the time axis or an operation instruction to segment by different person characters; the first device responds to the first user side's instruction and obtains the media information segments of the target media information accordingly.
  • After determining the media information segments included in the target media information, the first device analyzes the feature of each media information segment in the target media information, including at least one of the following: the identifier (number) and duration of each media information segment; the persona involved in each sub-segment (one or more frame images) of the media information segment, where the character can be represented by images extracted from the media information segment; and the lines of each person character, which can be recovered from the audio information by speech recognition technology.
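The per-segment features enumerated above can be carried in a simple record; the field names and the example values for segment C are assumptions chosen to mirror the list (identifier, duration, personas, lines), not data from the patent:

```python
from dataclasses import dataclass

@dataclass
class SegmentFeatures:
    segment_id: str   # identifier (number) uniquely naming the segment
    duration: float   # duration of the segment in seconds
    personas: list    # person characters in the sub-segments, e.g.
                      # represented by images extracted from the segment
    lines: list       # lines of each character, recovered from the
                      # audio information by speech recognition

# A hypothetical feature record for media information segment C.
features_c = SegmentFeatures(
    segment_id="C",
    duration=30.0,
    personas=["character 1"],
    lines=["line spoken by character 1"],
)
```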
  • Step 102: Collect, from the first user side and based on the determined feature, a first media information segment corresponding to the target media information segment.
  • The first device 100 loads a list of target media information in a graphical interface, for the first user side to select the media information that needs to be imitated, that is, the target media information. After the first user side selects the target media information, the first device loads the media information segments of the target media information and the feature of each media information segment, for the first user side to further select the media information segment it desires to imitate, that is, the target media information segment.
  • When the first device determines that the first user side has selected media information segment C, it prompts that collection of the first user side's performance of segment C is starting, and stores the result as the first user side's imitation of segment C (that is, a first media information segment; since a first media information segment is the first user side's imitation of a target media information segment, the number of first media information segments matches the number of target media information segments the first user side selects and imitates). Because media information segment C may involve a large number of lines, actions and so on, in order to improve the quality of the imitation, after prompting that collection is starting the first device may load the features of segment C in the graphical interface, including the collection start time of segment C, the personas carried by segment C and the corresponding lines.
  • The first user side performs based on the prompted features while the first device performs video collection and audio collection of the first user side, obtaining a video information segment and an audio information segment, which are synthesized into a piece of media information (that is, a first media information segment).
  • When the first user side imitates two media information segments of the target media information (such as media information segment A and media information segment C) or more (such as media information segments A, B and C), the processing corresponds to the description for media information segment C and is not repeated here.
  • Step 103 Determine a media information segment other than the target media information segment in the target media information, and acquire a second media information segment corresponding to the determined feature of the media information segment.
  • For example, the first user side selects media information segment C as the target media information segment and imitates it, while the media information further includes media information segments A, B and D, which the first user side does not imitate. In order to obtain complete media information corresponding to the media information shown in FIG. 2, the first device also needs to acquire media information segments corresponding to the features of segments A, B and D, that is, pieces of media information carrying the performances (including person roles, actions and lines) of segments A, B and D; these are the second media information segments.
  • The first device may directly splice the original media information segments A, B and D of the target media information shown in FIG. 2 with the media information segment C that the first user side performed.
  • Alternatively, media information segments A, B and D may be collected from a corresponding user side as imitated performances; for example, a second device collects the second user side's performance based on the features of the media information segments other than segment C (that is, segments A, B and D), so that the media information segment C imitated by the first user side can be spliced with the segments A, B and D imitated by the second user side.
  • In practice, the imitated media information segments collected by other devices may involve multiple pieces of media information (that is, not only the aforementioned target media information); therefore, among the media information segments collected by other devices from their corresponding user sides, it is necessary to determine which segments were performed by imitating the target media information segments of the target media information.
  • The first device acquires the media information segments that the second device collected from the second user side, and matches the features of the segments collected by the second device against the features of media information segments A, B and D of the target media information (that is, each media information segment of the target media information except the target media information segment), for example by matching the identifier of the media information segment (such as a number or name that uniquely identifies it); the segments collected by the second device that match successfully are used as the second media information segments.
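The identifier matching described above reduces to filtering the collected segments against the identifiers of the remaining target segments; a minimal sketch in which the function name and example payloads are assumptions:

```python
def match_second_segments(remaining_ids, collected_segments):
    # remaining_ids: identifiers of the target media information segments
    # other than the target media information segment (e.g. A, B, D).
    # collected_segments: maps the identifier carried by each segment a
    # second device collected to that segment's payload; segments whose
    # identifier belongs to a different piece of media information fail
    # the match and are discarded.
    return {sid: seg for sid, seg in collected_segments.items()
            if sid in remaining_ids}

collected = {"A": "imitated A", "B": "imitated B",
             "D": "imitated D", "X": "segment of other media"}
second_segments = match_second_segments({"A", "B", "D"}, collected)
# "X" is dropped: its identifier matches no segment of the target media.
```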
  • Step 104 Determine a splicing manner of each media information segment in the target media information, and splicing the first media information segment and the second media information segment according to the determined splicing manner to obtain the spliced media information.
  • Suppose the target media information is segmented in the time-axis manner, and the first device collects the media information segment that the first user side performs for media information segment C (that is, the first media information segment). The segment C imitated by the first user side is spliced with media information segments A, B and D, which (as before) may be the segments of the target media information shown in FIG. 2, or may be segments collected by other devices; for example, the second device 200 shown in FIG. 5 collects the second user side's imitation of segments A, B and D of the target media information. The first device 100 then uses the time-axis splicing manner: media information segments A, B, C and D are spliced sequentially according to their order on the time axis (media information segment A, media information segment B, media information segment C, media information segment D).
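The time-axis splicing step amounts to concatenating segments in their original order, taking the first user side's take wherever one exists; a minimal sketch with assumed names and string placeholders for the segment payloads:

```python
def splice_in_order(time_axis_order, first_segments, second_segments):
    # time_axis_order: segment identifiers in their sequence on the
    # time axis. first_segments: segments the first user side imitated;
    # second_segments: the remaining segments. The first user side's
    # performance replaces the corresponding original segment.
    spliced = []
    for sid in time_axis_order:
        source = first_segments if sid in first_segments else second_segments
        spliced.append(source[sid])
    return spliced

first = {"C": "first user's imitation of C"}
second = {"A": "segment A", "B": "segment B", "D": "segment D"}
result = splice_in_order(["A", "B", "C", "D"], first, second)
# result: segment A, segment B, the imitated C, then segment D.
```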
  • For the synchronous splicing manner, the first device determines, according to an operation instruction of the first user side, that the first user side desires to imitate the media information segment of character 1 in the target media information. It then loads the features of that segment (such as its start and end times, the lines at different time points, prompts of actions, and so on), collects the first user side's imitation of the segment of character 1 (including video capture and audio capture), and obtains the media information segment in which the first user side imitates character 1. Referring to FIG. 6, the first device 100 takes the segment performed by the first user side imitating character 1 (that is, the first media information segment) and the segment including character 2 (that is, the second media information segment, which may be the segment extracted for character 2 from the target media information shown in FIG. 4, or may be a second user side's imitation of the segment of character 2 collected by another device such as the second device 200 shown in FIG. 6), and splices them synchronously based on the positions at which the carried characters were extracted from the target media information.
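Synchronous splicing re-merges the per-character tracks at the positions from which they were extracted on the shared time axis; a sketch under the assumption that each track is a list of (timestamp, frame) pairs, with illustrative names:

```python
def splice_synchronously(track_character_1, track_character_2):
    # Each track holds (timestamp, frame) pairs extracted for one person
    # character (track 1 being the first user side's imitation of
    # character 1). Frames are interleaved back into a single stream
    # ordered by their original time-axis positions.
    return sorted(track_character_1 + track_character_2,
                  key=lambda timed_frame: timed_frame[0])

track1 = [(0.0, "c1 frame"), (2.0, "c1 frame")]
track2 = [(1.0, "c2 frame"), (3.0, "c2 frame")]
merged = splice_synchronously(track1, track2)
# merged alternates character 1 and character 2 frames by timestamp.
```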
  • The media information segments are determined by segmenting based on the features of the target media information; that is, the first device does not need to store the original data of the target media information when segmenting it.
  • The first device determines the target media information segment among the media information segments of the target media information and collects the first user side's imitation of it. To speed up subsequent splicing of the media segments, the first device 100 synchronously acquires, during collection, the media information segments of the target media information that the first user side does not imitate, for example media information segments A, B and D of the target media information; these may be another device's (such as the second device 200's) collection of a second user side's imitation of segments A, B and D (that is, the media information segments other than the target media information segment in the target media information).
  • the media information obtained by the splicing includes the performance of the first user side.
  • the first user side needs to upload and share the spliced media information.
  • The first device 100 can be set in various forms of terminals such as a smart phone.
  • The first device supports selecting, according to actual needs, the media information segments uploaded by different second user sides, and splices according to the media information segments returned by the server side. After uploading the spliced media information to the server side, the sharing link returned by the server side can be obtained, and sharing can be performed in the form of a uniform resource locator (URL) based on the sharing link; a receiving device, for example the second device 200, can access the media information via the URL, or the link can be shared on an HTML5 page, where a page visitor can watch the media information by clicking the link.
  • By loading the features of the media information segment that the first user side desires to perform, the first user side is supported in imitating the performance of the target media information segment without having to memorize all the features (such as the lines) of the target media information segment; after the target media information segment is determined, the imitated performance can be carried out on the basis of the loaded features.
  • Based on the features of the media information segments that are not imitated by the first user side, those media information segments are spliced with the segment in which the first user side imitates the target media information segment.
  • In the foregoing embodiments, video collection and audio collection are performed simultaneously when the media information segment in which the first user side imitates the target media information segment is collected. The first user side may also have the following requirement: the first user side only imitates the action gestures in the target media information segment and does not perform the lines in the target media information segment, and it is desirable that the first device collects the first user side's performance, generates the images of the media information segment in which the first user side imitates the target media information segment, and uses the original audio information of the target media information corresponding to the target media information segment. The following embodiment of the present invention describes the processing of this case.
  • the media information processing method includes the following steps:
  • Step 201: The first device determines the video information segments and the audio information of the target media information, and the features of each video information segment.
  • Because the first device does not collect the audio information of the first user but uses the original audio information of the target media information, the video information and the audio information can first be separated from the target media information; the video information is segmented into video information segments, and the audio information is not processed.
  • In one manner, the video information segments are obtained by evenly (or unevenly) segmenting only the video information in the target media information along the time axis, based on the duration of the target media information.
  • In another manner, the video information is segmented by using the scenario of the target media information (including the time segments corresponding to the different plots on the time axis) to obtain the video information segments, and the audio information is not segmented.
  • Referring to FIG. 10, a schematic diagram of splitting the target media information based on mode 1): the target media information is divided into four video information segments, video information segment A, video information segment B, video information segment C, and video information segment D, and the audio information in the target media information is separated from the video information and is not subjected to segmentation processing.
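As an illustrative sketch (not part of the patent text), the even time-axis segmentation described above can be expressed as computing segment boundaries from the duration of the target media information; the segment names A-D, the 120-second duration, and the data model are assumptions for illustration only:

```python
def split_by_time_axis(duration_s: float, n_segments: int):
    """Return (start, end) time boundaries for n even video segments.

    Only the video track is segmented this way; the audio track is kept
    whole, matching the segmentation mode described above.
    """
    step = duration_s / n_segments
    return [(round(i * step, 3), round((i + 1) * step, 3))
            for i in range(n_segments)]

# A hypothetical 120-second target video split into the four segments A-D
# of FIG. 10; the audio stays a single unsegmented [0, 120] track.
boundaries = split_by_time_axis(120.0, 4)
segments = dict(zip("ABCD", boundaries))
print(segments)
# {'A': (0.0, 30.0), 'B': (30.0, 60.0), 'C': (60.0, 90.0), 'D': (90.0, 120.0)}
```

An uneven (scenario-based) split would simply replace the computed boundaries with the time segments corresponding to the different plots on the time axis.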
  • Mode 4) separates the video information and the audio information from the target media information, divides the video information based on the different personas carried by the video information, and sequentially extracts, from the video information of the target media information, the video information segments each carrying only one persona.
  • The audio information in the target media information is separated from the video information and is not split.
  • Each frame image of the video information in the target media information is identified by image recognition technology, and the persona carried by each frame image of the video information is determined (here, the video information carries persona 1 and persona 2); see FIG. 4.
  • Taking the extraction of a video information segment whose target persona is persona 1 as an example: a frame image including only persona 1 is extracted directly; for a frame image including a plurality of personas (including persona 1 and persona 2), the image region including the target persona is extracted from the frame image by image recognition technology (such as face recognition or edge detection), so that the frame images extracted from the video information of the target media information include only persona 1.
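A minimal sketch of this per-frame extraction logic, assuming a `detect_personas` recognizer exists (it stands in for the face recognition / edge detection step, which is not specified in code by the patent); the toy frame model below is purely illustrative:

```python
def extract_persona_frames(frames, target, detect_personas):
    """Keep frames that carry the target persona.

    Frames that also carry other personas are flagged so that the target's
    image region can additionally be cropped out of them.
    """
    kept = []
    for idx, frame in enumerate(frames):
        personas = detect_personas(frame)
        if target not in personas:
            continue  # frame does not show the target persona at all
        needs_crop = len(personas) > 1  # e.g. persona 1 and 2 share the frame
        kept.append((idx, needs_crop))
    return kept

# Toy "frames": each frame is represented only by the set of personas in it,
# so the identity function serves as the detector.
frames = [{"p1"}, {"p1", "p2"}, {"p2"}, {"p1"}]
print(extract_persona_frames(frames, "p1", lambda f: f))
# [(0, False), (1, True), (3, False)]
```

In a real pipeline the `needs_crop` flag would trigger the region extraction (face recognition, edge detection) described above rather than keeping the whole frame.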
  • Step 202: Collect, based on the determined features, a first video information segment in which the first user side imitates the target video information segment.
  • The first device loads a list of target media information in a graphical interface for the first user side to select the video information that needs to be imitated, that is, the target video information. After the target video information is selected on the first user side, the first device loads the video information segments of the target video information and the features of each video information segment, so that the first user side can continue to select a video information segment whose performance it desires to imitate, that is, the target video information segment.
  • When the first device determines that the first user side has selected video information segment C, it prompts that collection of the first user side's performance of video information segment C will start, and stores the result as the video information segment in which the first user side imitates the performance, that is, the first video information segment. Each first video information segment corresponds to a target video information segment whose performance the first user side imitates, so the number of first video information segments is consistent with the number of target video information segments selected and imitated by the first user side.
  • After prompting to start collecting the performance of the first user side, the first device may load the features of video information segment C in the graphical interface, including the start and end time of video information segment C, the persona carried by video information segment C, and the corresponding action prompts, so that the first user side performs based on the prompted features; at the same time, the first device performs video collection of the first user side to obtain the video information segment in which the first user side imitates video information segment C (that is, the first video information segment).
  • When the first user side imitates two video information segments of the target media information (such as video information segment A and video information segment C) or multiple video information segments (such as video information segment A, video information segment B, and video information segment C), reference may be made to the above description of video information segment C; details are not repeated here.
  • Step 203: Determine the video information segments other than the target video information segment in the target media information, and acquire second video information segments corresponding to the features of the determined video information segments.
  • In the foregoing example, the first user side selects video information segment C in the target media information as the target video information segment and performs an imitated performance, while the target media information further includes video information segment A, video information segment B, and video information segment D, which the first user side does not imitate. In order to obtain complete media information corresponding to the media information shown in FIG. 10, the first device also needs to acquire video information segments corresponding to the features of video information segment A, video information segment B, and video information segment D, that is, video information segments whose performances (including personas, motions, lines, etc.) are consistent with those in video information segment A, video information segment B, and video information segment D (that is, second video information segments).
  • As one implementation of video splicing, the first device may directly splice the original video information segment A, video information segment B, and video information segment D of the target media information shown in FIG. 10 with the video information segment C performed by the first user side.
  • As another implementation of video splicing, in view of the fact that other devices (hereinafter the second device is taken as an example) may also collect, on their corresponding user sides, video information segments imitating video information segment A, video information segment B, and video information segment D (for example, the second device collects performances based on the features of the video information segments other than video information segment C), the video information segment C imitated by the first user side may be spliced with the video information segment A, video information segment B, and video information segment D imitated by the second user side, and with the audio information in the target media information.
  • Because the video information segments of imitated performances collected by other devices in practical applications may involve multiple pieces of media information (that is, not only the aforementioned target media information), among the video information segments collected by the other devices on the corresponding user sides it is necessary to determine which video information segments are performed by imitating the video information segments of the target media information.
  • For example, when acquiring the video information segments collected by the second device from the second user side, the features of the video information segments collected by the second device are matched with the features of video information segment A, video information segment B, and video information segment D in the target media information (that is, the video information segments other than the target video information segment in the target media information), for example by matching the identifier of the video information segment (such as a number or name that uniquely identifies the video information segment); the successfully matched video information segments collected by the second device are used as the second video information segments.
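A hedged sketch of this matching step: segments collected by a second device may belong to many different media items, so only those whose identifiers match the non-target segments of this target media information are kept. The field names (`media_id`, `segment_id`, `user`) are illustrative assumptions, not terminology from the patent:

```python
def match_second_segments(collected, wanted_ids, target_media_id):
    """Return collected segments that imitate the non-target segments
    (e.g. A/B/D) of the given target media information."""
    return [seg for seg in collected
            if seg["media_id"] == target_media_id
            and seg["segment_id"] in wanted_ids]

# Hypothetical pool of segments uploaded by second user sides.
collected = [
    {"media_id": "m1", "segment_id": "A", "user": "u2"},
    {"media_id": "m9", "segment_id": "A", "user": "u3"},  # different media item
    {"media_id": "m1", "segment_id": "C", "user": "u2"},  # the target segment itself
    {"media_id": "m1", "segment_id": "D", "user": "u4"},
]
second = match_second_segments(collected, {"A", "B", "D"}, "m1")
print([s["segment_id"] for s in second])  # ['A', 'D']
```

Matching on richer features (duration, persona, lines) would follow the same shape, with additional predicates in the filter.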
  • Step 204: Determine the splicing manner of each video information segment in the target media information, and splice the first video information segment with the second video information segments and the audio information according to the determined splicing manner to obtain the spliced media information.
  • Regarding the time-axis sequential splicing method: the target media information is segmented based on the time axis, and the first device collects the video information segment performed by the first user side for video information segment C (that is, the first video information segment); then the video information segment C imitated by the first user side is spliced with video information segment A, video information segment B, and video information segment D (as described above, video information segment A, video information segment B, and video information segment D may be the video information segments in the target media information shown in FIG. 10, or may be collected by another device, such as the second device 200 shown in FIG. 12, while the second user side imitates them in the target media information) and with the audio information.
  • The first device 100 uses the time-axis sequential splicing method to splice video information segment A, video information segment B, video information segment C, and video information segment D, together with the audio information, in sequence based on their order on the time axis (video information segment A, video information segment B, video information segment C, video information segment D).
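The time-axis sequential splicing above can be sketched as follows; the segment identifiers, the `orig:`/`user1:` labels, and the data model are illustrative assumptions, not the patent's implementation:

```python
def splice_sequential(original_order, first_segments, audio):
    """original_order: segment ids in time-axis order;
    first_segments: {segment id: user-performed clip} replacing originals;
    audio: the unsegmented audio track, attached to the whole result."""
    video = [first_segments.get(seg_id, f"orig:{seg_id}")
             for seg_id in original_order]
    return {"video": video, "audio": audio}

result = splice_sequential(
    ["A", "B", "C", "D"],
    {"C": "user1:C"},          # the first video information segment
    "orig:audio(A+B+C+D)",     # audio separated from video, never segmented
)
print(result["video"])  # ['orig:A', 'orig:B', 'user1:C', 'orig:D']
```

Segments A, B, and D could equally be clips returned by the server from a second user side; only the dictionary of replacements would change.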
  • Regarding the synchronous splicing method: the first device determines, according to an operation instruction of the first user side, that the first user side desires to imitate the video information segments of the target media information that include persona 1; it then loads the features of the video information segments of persona 1 (e.g., start and end times, action prompts, etc.), collects the first user side's performance imitating the video information segments of persona 1 (performing video collection only), and obtains the video information segments in which the first user side imitates persona 1.
  • The first device 100 splices the video information segment performed by the first user side imitating persona 1 (that is, the first video information segment) with the video information segment including persona 2 (that is, the second video information segment, which may be the video information segment extracted based on persona 2 from the target media information shown in FIG. 4, or may be collected by another device, such as the second device 200 shown in FIG. 12, while the second user side imitates the video information segment of persona 2); the two are synchronously spliced based on the extracted positions of the carried personas in the target media information.
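A hedged sketch of synchronous splicing: each persona's clip re-enters the composite at the frame positions from which that persona was originally extracted, so persona 1 (the first user's performance) and persona 2 play back simultaneously. The frame/position model here is purely illustrative:

```python
def splice_synchronous(n_frames, placements):
    """placements: {persona: (clip frames, original frame positions)}.
    Returns a per-frame timeline mapping each persona to its frame."""
    timeline = [dict() for _ in range(n_frames)]
    for persona, (clip, positions) in placements.items():
        for frame, pos in zip(clip, positions):
            timeline[pos][persona] = frame  # composite at extracted position
    return timeline

timeline = splice_synchronous(4, {
    "persona1": (["u1_f0", "u1_f1"], [0, 1]),  # first user's imitation
    "persona2": (["p2_f1", "p2_f2"], [1, 2]),  # extracted or second-user clip
})
print(timeline)
# [{'persona1': 'u1_f0'}, {'persona1': 'u1_f1', 'persona2': 'p2_f1'},
#  {'persona2': 'p2_f2'}, {}]
```

A real implementation would composite pixel regions rather than dictionary entries, but the key point is the same: placement is driven by the positions recorded at extraction time, not by concatenation order.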
  • The video information segments are segmented and determined based on the features of the target media information; that is, the first device does not need to store the raw data of the target media information locally when segmenting the target media information.
  • The first device determines the target video information segment among the video information segments of the target media information and collects the first user side's imitated performance of the target video information segment. In order to improve the efficiency of subsequently splicing the media segments, the first device synchronously acquires, during the collection, the video information segments of the target media information that are not imitated by the first user side, for example, video information segment A, video information segment B, and video information segment D in the target media information (that is, the video information segments other than the target video information segment in the target media information); these may also be video information segments collected by other devices, such as the second device, while the second user side imitates the performances of video information segment A, video information segment B, and video information segment D.
  • the media information obtained by the splicing includes the performance of the first user side.
  • the first user side needs to upload and share the spliced media information.
  • When acquiring the video information segments of the target media information that are not performed by the first user side, the first device may first query, on the server side (which carries the social function between different user sides), whether a user side having a social relationship with the first user side has uploaded matching video information segments.
  • The first device supports the selection of video information segments uploaded by different second user sides, splices according to the video information segments returned by the server side, and uploads the spliced media information to the server side.
  • The sharing link returned by the server side can then be obtained, and sharing can be performed in the form of a URL based on the sharing link.
  • The receiving party can access the media information based on the URL, or the link can be shared on an HTML5 page, where a page visitor can view the media information by clicking the link.
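An illustrative sketch of this sharing step: after upload, the server returns a URL, which can be shared as-is or wrapped in a minimal HTML5 page containing a playable video element. The URL format and the page markup are assumptions for illustration; the patent does not prescribe them:

```python
def build_share_page(share_url: str, title: str) -> str:
    """Wrap a server-returned sharing URL in a minimal HTML5 page."""
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{title}</title></head><body>"
        f"<video src='{share_url}' controls></video>"
        f"<p><a href='{share_url}'>{title}</a></p>"
        "</body></html>"
    )

# Hypothetical link returned by the server side after upload.
page = build_share_page("https://example.com/v/abc123", "Spliced performance")
print("abc123" in page, page.startswith("<!DOCTYPE html>"))  # True True
```

Sharing "in the form of a URL" then means sending `share_url` directly, while the HTML5 option serves this generated page so a visitor can watch by clicking.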
  • an optional system architecture diagram of the media information processing apparatus 100 includes:
  • the video recording module 101 can be implemented by at least the camera 140 shown in FIG. 1.
  • The network adaptation module 102 can be implemented by at least the processor 110 and the communication module 160 in FIG. 1 in cooperation.
  • The audio and video splicing module 103 can be implemented by at least the processor 110 shown in FIG. 1.
  • The audio and video splicing module 103 separates the sound and the video of a part of a movie or a part of a series, and divides the video part into a plurality of segments (e.g., segments A, B, C, and D); the user can record certain segments of the video part (such as segments A and C) through the video recording module 101 running on the media information processing device.
  • The network adaptation module 102 automatically pulls the other segments of the video of the movie or series from the background.
  • The audio and video splicing module 103 performs splicing to generate a complete video (A+B+C+D), and the user's performance is integrated into the video to achieve the effect of performing together with a star or other people.
  • The upload module 104 saves the segments performed by the user (segments A and C) in the background, and when other users record the other segments (segments B and D), those user-recorded clips can also be pulled from the background and stitched together to achieve the effect of a joint performance.
  • the user performs video recording through the recording system.
  • the video recording module 101 can add subtitles, filters, logos, and the like to the video.
  • The network adaptation module 102 automatically downloads, according to the segments of the movie or series that the user has recorded, the segments that the user has not recorded, together with the sound.
  • The audio and video splicing module 103 first splices the video segments, and then splices the video and the audio to complete the full video.
  • The upload module 104 uploads the spliced video and the video segments recorded by the user to the background, where the spliced complete video is used for sharing with others, and the user-recorded clips are used to be matched and composed with those of other users to form new videos.
  • The video recording module 101 records the video. In the process of video recording through the camera, only part of the clips are recorded. For example, a movie or a series is divided into four segments A, B, C, and D.
  • The video recording module 101 records segment A and segment C, and the recorded video is kept locally, where segments A and C of the video do not include the voice portion; subtitles, logos, etc. can be pressed into the video during the recording process.
  • The network adaptation module 102 adapts the segments: when the user is recording the video of segments A and C, the network adaptation module 102 automatically downloads from the background, according to the movie or series currently being recorded and the segments the user is recording, the video of segments B and D (without voice) and the voice of the entire video (A+B+C+D).
  • The audio and video splicing module 103 performs video segment splicing: it splices segments A, B, C, and D to complete a full video without voice.
  • The audio and video splicing module 103 performs video and audio splicing: after the video segments are spliced, the application immediately splices the video and the voice to form a complete video. At this time the video contains segments A and C of the user's own performance as well as segments B and D from the TV series; of course, segments B and D can also be replaced by other users' performances, and are not limited to the original footage of movies and TV series.
  • Upload module 104: video upload.
  • Segments B and D can be not only clips prepared in advance from a drama or a movie, but also clips performed by other users. Therefore, after completing the video recording, the video recording module 101 not only provides the user with the function of keeping the video locally, but also provides the user with the function of uploading the video.
  • The upload module 104 uploads the recorded segments A and C and the complete video (A+B+C+D+voice) to the background; after the complete video (A+B+C+D+voice) is uploaded, the address URL of the saved video is returned for users to share.
  • In step 5, the video URL is returned; the video URL can be used for sharing, either as a plain URL or through a generated H5 webpage, and the form can be various.
  • Steps 5 and 6 of the above processing logic can be implemented as needed, and only serve as extended function options for the media information processing device to improve the user experience.
  • the media information processing apparatus 300 includes: a first determining module 310, an acquiring module 320, a second determining module 330, a third determining module 340, and a splicing module 350.
  • the first determining module 310 determines the media information segment of the target media information and the feature of each media information segment.
  • The features of each media information segment include: an identifier (number) of each media information segment, a duration, the personas involved in each media information segment, and the lines of each persona.
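One possible shape for the per-segment features just listed (identifier, duration, personas, lines), as a hedged sketch; the field names are illustrative assumptions, not terminology fixed by the patent:

```python
from dataclasses import dataclass, field

@dataclass
class SegmentFeature:
    segment_id: str                # identifier (number) of the segment
    start_s: float                 # acquisition start time on the time axis
    end_s: float                   # acquisition end time on the time axis
    personas: list = field(default_factory=list)  # personas involved
    lines: dict = field(default_factory=dict)     # persona -> its lines

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s

feat = SegmentFeature("C", 60.0, 90.0, ["persona1"], {"persona1": ["line 1"]})
print(feat.duration_s)  # 30.0
```

Such a record is what would be loaded in the graphical interface so the first user side can perform without memorizing the lines, and what the matching step compares against segments collected by second devices.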
  • The analysis module 360 obtains the media information segments of the target media information in response to an operation instruction of the first user side on the media information, or automatically segments the target media information. When performing segmentation, both the video information and the audio information in the target media information may be segmented, and the segmented video information segments and audio information segments are merged into media information segments; alternatively, the video information and the audio information are separated from the target media information, the audio information is not processed, and the video information is segmented to obtain video information segments. Combining the above segmentation methods, it can be seen that there are two cases of target media information segments: 1) including both video information and audio information; 2) including only video information.
  • the analysis module 360 analyzes the target media information to obtain the characteristics of the target media information, and divides the target media information from the following two dimensions:
  • Dimension 1: characterizing the duration of the target media information based on the features of the target media information, and segmenting the target media information into media information segments based on the time axis;
  • Dimension 2: characterizing the personas of the target media information based on the features of the target media information, extracting the media information segments including each persona from the target media information, and obtaining media information segments in which each segment carries only one persona, the carried personas being different. Based on the above dimensions, both the video information and the audio information in the target media information are segmented, or only the video information is segmented, corresponding to the following segmentation modes:
  • Mode 1): the first determining module 310 performs even (or uneven) segmentation along the time axis based on the duration of the target media information (both the video information and the audio information in the target media information are segmented); the video information and the audio information may be pre-separated from the target media information to obtain the media information segments, and each segmented media information segment includes a video information segment and an audio information segment.
  • Mode 2): the first determining module 310 characterizes the carried personas of the target media information based on the features of the target media information, extracts the media information segments including each persona from the target media information, and obtains the media information segments, where each media information segment carries only one persona and the carried personas are different.
  • Mode 3): the first determining module 310 performs even (or uneven) segmentation along the time axis (only the video information in the target media information is segmented) to obtain the video information segments.
  • Mode 4): the first determining module 310 separates the video information and the audio information from the target media information, divides the video information based on the different personas carried by the video information, and sequentially extracts, from the video information of the target media information, the video information segments each carrying only one persona; the audio information in the target media information is separated from the video information and is not divided.
  • The collection module 320 collects the first media information segment corresponding to the target media information segment according to the determined features; the manner of collection is related to the segmentation manner of the first determining module 310. When the first determining module 310 adopts the foregoing mode 1) or mode 2), the collection module 320 performs synchronous video and audio collection of the first user side's performance, and the collected media information segment (the first media information segment) includes video information and audio information; when the first determining module 310 adopts the foregoing mode 3) or mode 4), the collection module 320 performs only video collection of the first user side's performance, and the collected media information segment (the first media information segment) includes only video information.
  • The collection module 320 loads the features of each media information segment in the target media information (for example, loading the identifier of each media information segment for the first user side to select), and determines the target media information segment among the media information segments according to the first user side's selection operation.
  • The collection module 320 then loads the features of the target media information segment (for example, the acquisition start time of the target media information segment, the personas carried by the target media information segment, and the corresponding lines) to facilitate the first user side performing based on the features, and collects the performance carried out by the first user side based on the features of the target media information segment (including performing video collection and audio collection of the first user side's performance).
  • The collection module 320 may determine the target media information segments among the media information segments of the target media information according to the user's selection operation; the number of target media information segments is at least one, and correspondingly, the number of first media information segments is at least one.
  • The second determining module 330 determines the media information segments other than the target media information segment in the target media information, and acquires second media information segments corresponding to the features of the determined media information segments.
  • The second media information segments are spliced with the first media information segment collected by the collection module 320 to form the spliced media information. Therefore, the information types (such as video information) included in the second media information segments determined by the second determining module 330 correspond to those of the first media information segment: when the first media information segment collected by the collection module 320 includes video information and audio information, the second media information segments acquired by the second determining module 330 include video information and audio information; when the first media information segment collected by the collection module 320 includes only video information, the second media information segments acquired by the second determining module 330 also include only video information.
  • the second determining module 330 acquires each media information segment except the first media information segment in the target media information as a second media information segment.
  • In another manner, the media information segment obtained by collecting the performance of the second user side is used as the second media information segment.
  • The second determining module 330 is further configured to acquire the media information segments collected by the second device 200 from the second user side, match the features of the media information segments collected by the second device 200 with the features of the media information segments other than the target media information segment in the target media information, and use the successfully matched media information segments collected by the second device 200 as the second media information segments.
  • The second determining module 330 can acquire the second media information segments of the target media information, other than the first media information segment, while the collection module 320 is collecting the first media information segment, thereby shortening the time to obtain the spliced media information and avoiding a long wait after the first user side imitates the target media information segment.
  • the third determining module 340 determines a splicing manner of each media information segment in the target media information; the splicing module 350 splices the first media information segment and the second media information segment based on the determined splicing manner to obtain the spliced media information.
  • the splicing module 350 uses the time-axis sequential splicing method to splice the first media information segment and the second media information segment in order on the time axis;
  • the splicing module 350 uses the synchronous splicing method to splice the first media information segment and the second media information segment synchronously, based on the positions in the target media information from which the carried character roles were extracted.
  • the splicing module 350 uses the time-axis sequential splicing method to splice the audio information in the target media information (which is not segmented), the first media information segment, and the second media information segment in order on the time axis.
  • the splicing module 350 uses the synchronous splicing method to splice the audio information, the first media information segment, and the second media information segment synchronously, based on the positions in the target media information from which the carried character roles were extracted.
  • the uploading module 370 (not shown in FIG. 14; connected to the splicing module 350) uploads the spliced media information to the server side and obtains the sharing link returned by the server side; the sharing module 380 (not shown in FIG. 14; connected to the splicing module 350) is configured to respond to the sharing operation command of the first user side based on the sharing link, for example by sending the sharing link to a terminal device of a second user side socially associated with the first user side, so that the second user side can view the spliced media information via the sharing link.
  • each functional module of the media information processing apparatus is supported by corresponding hardware.
  • the first determining module 310, the second determining module 330, the third determining module 340, and the splicing module 350 can be implemented at least by the processor 110 and the communication module 160 of FIG. 1 in cooperation.
  • the collection module 320 can be implemented at least by the microphone 130 and the camera 140 shown in FIG. 1.
  • the above-described integrated unit of the present invention, if implemented in the form of a software function module and sold or used as a stand-alone product, can also be stored in a computer readable storage medium.
  • the technical solution of the embodiments of the present invention may be embodied, in essence or in the part contributing to the related art, in the form of a software product, which is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program code, such as a mobile storage device, a RAM, a ROM, a magnetic disk, or an optical disk.
  • presenting the features of the media information segment the first user side wishes to perform supports the first user side in imitating the target media information segment, so the imitation can be performed without memorizing all the features of the target media information segment (such as lines of dialogue); after the media information segment is determined, the media information segments to be spliced with the first user side's imitation of the target media information segment are acquired based on the features of the segments the first user side did not perform, and the splicing is carried out without requiring any operation by the first user side. The first user side only needs to imitate the target media information segment to obtain the complete media information. This solves the problem that a first user side unable to operate professional media editing software cannot generate complete media information, and improves the processing efficiency for media information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Disclosed are a media information processing method, a media information processing apparatus, and a storage medium. The method includes: determining, at a first device side, the media information segments of target media information and the features of each media information segment, and collecting the first user side based on the determined features to obtain a first media information segment corresponding to a target media information segment; determining the media information segments of the target media information other than the target media information segment, and acquiring second media information segments corresponding to the features of the determined segments; and determining a splicing manner for the media information segments of the target media information, and splicing the first media information segment with the second media information segments based on the determined splicing manner to obtain spliced media information. This enables a media information segment shot by a user to be synthesized efficiently and seamlessly with the media information the user imitates.

Description

Media information processing method, media information processing apparatus, and storage medium

Technical Field

The present invention relates to media information processing technology, and in particular to a media information processing method, a media information processing apparatus, and a storage medium.
Background Art

Networks such as the Internet are developing rapidly and have become an important medium for users to obtain and share information. Ever-increasing access bandwidth and the rapid development of mobile communications make it possible for users to share media information (such as video and audio information) anytime and anywhere.

Users share media information in many forms. Using a device, for example the camera of a mobile terminal such as a smartphone or tablet, a user shoots video that is stored locally on the device or shared, for example on a social network or with specific contacts.

With the rise of self-media, users want to imitate and perform segments (or the whole) of existing media information, including finished film and television works such as movies and TV series, and to record the performance as corresponding media information segments. After shooting such a segment, the user processes it with professional media editing software, replacing the original segment of the media information with the shot segment, to merge the user's segment into the media information.

However, for how to efficiently and seamlessly synthesize a media information segment shot by a user with the media information the user imitates, the related art has no effective solution.
Summary of the Invention

Embodiments of the present invention provide a media information processing method, a media information processing apparatus, and a storage medium, capable of efficiently and seamlessly synthesizing a media information segment shot by a user with the media information the user imitates.

The technical solutions of the embodiments of the present invention are implemented as follows:

In a first aspect, an embodiment of the present invention provides a media information processing method, including:

determining, by a first device, media information segments of target media information and features of each media information segment, and collecting the first user side based on the determined features to obtain a first media information segment corresponding to a target media information segment;

determining media information segments of the target media information other than the target media information segment, and acquiring second media information segments corresponding to the features of the determined segments;

determining a splicing manner for the media information segments of the target media information, and splicing the first media information segment with the second media information segments based on the determined splicing manner to obtain spliced media information.
In a second aspect, an embodiment of the present invention provides a media information processing apparatus, including:

a first determining module configured to determine media information segments of target media information and features of each media information segment;

a collection module configured to collect the first user side based on the determined features to obtain a first media information segment corresponding to a target media information segment;

a second determining module configured to determine media information segments of the target media information other than the target media information segment, and to acquire second media information segments corresponding to the features of the determined segments;

a third determining module configured to determine a splicing manner for the media information segments of the target media information;

a splicing module configured to splice the first media information segment with the second media information segments based on the determined splicing manner to obtain spliced media information.
In a third aspect, an embodiment of the present invention provides a media information processing apparatus, including a memory and a processor, the memory storing executable instructions for causing the processor to perform operations including:

determining, at a first device side, media information segments of target media information and features of each media information segment;

collecting the first user side based on the determined features to obtain a first media information segment corresponding to the target media information segment;

determining media information segments of the target media information other than the target media information segment;

acquiring second media information segments corresponding to the features of the determined segments;

determining a splicing manner for the media information segments of the target media information;

splicing the first media information segment with the second media information segments based on the determined splicing manner to obtain spliced media information.

In a fourth aspect, an embodiment of the present invention provides a non-volatile computer storage medium storing executable instructions for performing the media information processing method provided by the embodiments of the present invention.
In the embodiments of the present invention, presenting the features of the media information segment the first user side wishes to perform supports the first user side in imitating the target media information segment, so the imitation can be performed without memorizing all the features of the target media information segment (such as lines of dialogue). After the media information segment is determined, the segments to be spliced with the first user side's performance are acquired based on the features of the segments the first user side did not perform; the whole process requires no operation by the first user side, who only needs to imitate the target media information segment to obtain the complete media information. This solves the problem that a first user side unable to operate professional media editing software cannot generate complete media information, and improves the processing efficiency for media information.
Brief Description of the Drawings

FIG. 1 is a schematic diagram of an optional hardware structure of a first device in an embodiment of the present invention;

FIG. 2 is an optional schematic flowchart of a media information processing method in an embodiment of the present invention;

FIG. 3 is a schematic diagram of an optional implementation of media information splitting in an embodiment of the present invention;

FIG. 4 is a schematic diagram of another optional implementation of media information splitting in an embodiment of the present invention;

FIG. 5 is a schematic diagram of an optional implementation of media information splicing in an embodiment of the present invention;

FIG. 6 is a schematic diagram of another optional implementation of media information splicing in an embodiment of the present invention;

FIG. 7 is a schematic diagram of synchronously collecting the first media information segment and acquiring the second media information segments in an embodiment of the present invention;

FIG. 8 is a schematic diagram of a scenario of sharing spliced media information in an embodiment of the present invention;

FIG. 9 is another optional schematic flowchart of a media information processing method in an embodiment of the present invention;

FIG. 10 is a schematic diagram of another optional implementation of media information splitting in an embodiment of the present invention;

FIG. 11 is a schematic diagram of another optional implementation of media information splicing in an embodiment of the present invention;

FIG. 12 is a schematic diagram of another optional implementation of media information splicing in an embodiment of the present invention;

FIG. 13 is a schematic diagram of an optional system structure of a media information processing apparatus in an embodiment of the present invention;

FIG. 14 is a schematic diagram of another optional system structure of a media information processing apparatus in an embodiment of the present invention.
Detailed Description

The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it.

The apparatus implementing the embodiments of the present invention (implemented in the following embodiments as a first apparatus and a second apparatus) is described first.

The apparatus provided in the embodiments of the present invention can be implemented in various ways, for example by implementing all of its components in a mobile terminal device such as a smartphone, tablet, or laptop, or by implementing its components in a coupled manner across such mobile terminal devices and the server side.

Taking the implementation of all components in a mobile terminal device as an example, see the optional hardware structure of the first apparatus 100 shown in FIG. 1 (the hardware structure of the second apparatus can be implemented with reference to FIG. 1). The display module 120 displays information processed by the processor 110 of the first apparatus, such as media information (including video and images). The microphone 130 can capture sound in an audio collection mode and process it into audio information the processor 110 can handle. The camera 140 can, in an image collection mode, capture the environment, for example performing image or video collection of the user side of the first apparatus 100 (hereafter the first user side, which corresponds to at least one user) and outputting video information the processor 110 can handle. The memory 150 stores the audio information output by the microphone 130 and the video information output by the camera 140, as well as the results of the processor 110 processing them. The communication module 160 supports data communication between the processor 110 and the server side, for example sending the processing results of the media information stored in the memory 150 to a server on the network side, or receiving information such as media information delivered by the server side for the processor 110 to process. The power supply module 170 provides operating power to the other modules of the first apparatus 100.

In FIG. 1, the processor 110 transfers commands and data with the components of the first apparatus 100 via a bus 180. The specific implementation of the display module 120, processor 110, microphone 130, camera 140, and memory 150 is not limited: for example, the display module 120 may be a liquid crystal or organic light-emitting diode display module; the camera 140 may be a single, dual, or 3D camera; the microphone 130 may be a single microphone or a dual microphone (a main microphone plus a noise-cancelling microphone); the memory 150 may be flash memory, read-only memory, and the like; and the communication module 160 may be a cellular communication chip with peripheral modules (such as a SIM card holder and a radio-frequency module) and a cellular antenna, or a wireless fidelity (WiFi) communication chip with peripheral modules (such as a radio-frequency module) and a WiFi antenna.

It should be noted that not all the modules of the first apparatus 100 shown in FIG. 1 are required to implement the embodiments of the present invention; some or all of the hardware structure shown in FIG. 1 may be used depending on the functions the first apparatus 100 implements in the embodiments.
An embodiment of the present invention describes a media information processing method, explained with media information to be processed that includes both video information and audio information. Referring to FIG. 2, the method includes the following steps:

Step 101: The first apparatus determines the media information segments of target media information and the features of each media information segment.

In this embodiment, media information can be stored both locally on the first apparatus and in a database on the server side. The media information to be processed (the target media information) is the media information the user side of the first apparatus (the first user side, which includes the user of the first apparatus and may also include other users performing the imitation of the target media information together with that user) wishes to imitate. The first user side wishes to imitate some of the media information segments of the target media information (and may, of course, imitate all of them).

The media information segments are determined by splitting the target media information based on its features, for example in the following ways:

Mode 1) the features of the target media information characterize its duration, and the target media information is split into media information segments along the time axis;

Mode 2) the features of the target media information characterize the character roles it carries, and media information segments containing each character role are extracted from the target media information, where each media information segment carries only one character role and the roles carried by different segments differ.
The determination of media information segments is described below with reference to the above splitting modes.

Mode 1) Time-axis splitting: according to the duration of the target media information, it is split evenly (or unevenly) in sequence along the time axis (splitting both the video information and the audio information of the target media information, which may be separated from it in advance) to obtain media information segments; each resulting media information segment includes a video information segment and an audio information segment.

Optionally, when splitting along the time axis, the plot of the target media information (including the time ranges each plot section occupies on the time axis) can be used to split the media information, making it easier for the first user side to choose the segment it wishes to imitate.

See FIG. 3 for a schematic of splitting the target media information according to mode 1). In FIG. 3 the target media information is split into four media information segments, A, B, C, and D, each including a video information segment and an audio information segment; for example, media information segment A includes video information segment A and audio information segment A.
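The timeline-based splitting of mode 1) can be sketched as follows; this is a minimal illustration, in which the segment names, the even split, and the 120-second duration are assumptions for the example, not taken from the application:

```python
def split_by_timeline(duration_s, n_segments):
    """Split a media duration into n equal, consecutive (start, end) ranges in seconds."""
    step = duration_s / n_segments
    return [(round(i * step, 3), round((i + 1) * step, 3)) for i in range(n_segments)]

# A hypothetical 120-second target media item split into segments A..D.
segments = dict(zip("ABCD", split_by_timeline(120.0, 4)))
```

An uneven or plot-based split would simply supply the boundary list directly instead of computing equal steps.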
Mode 2) The target media information is split according to the different character roles it carries, extracting in turn media information segments (including video information segments and audio information segments) that each carry a different single character role.

Taking the extraction of the video information segments first as an example: image recognition identifies each frame of the video information of the target media information and determines the character roles each frame carries (suppose the video carries role 1 and role 2). Referring to FIG. 4, when extracting the media information segment containing the target role (say role 1), frames containing only role 1 are extracted directly; for frames containing multiple roles (frames with both role 1 and role 2), image recognition techniques (such as face recognition and edge detection) extract the part of the frame containing the target role, so that the frames extracted from the video information contain only role 1. Based on the positions on the time axis of the extracted frames containing the target role, the audio information of the target media information is extracted synchronously, so the extracted audio information segment corresponds to the extracted video information segment on the time axis and to the target role (role 1).
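The per-frame extraction for a target character can be sketched as follows. The sketch assumes an upstream recognition step has already labeled each frame with the set of characters it contains (`frame_labels` is a hypothetical input); it returns the time ranges of consecutive frames carrying the target character, which are also the ranges used to cut the matching audio so the extracted audio and video stay synchronized:

```python
def extract_character_ranges(frame_labels, target, fps=25):
    """Given per-frame character labels (a set of characters per frame), return
    the (start_s, end_s) time ranges of runs of consecutive frames that contain
    `target`. The same ranges cut the audio track, keeping A/V in sync."""
    ranges, start = [], None
    for i, chars in enumerate(frame_labels + [set()]):  # sentinel closes the last run
        if target in chars and start is None:
            start = i
        elif target not in chars and start is not None:
            ranges.append((start / fps, i / fps))
            start = None
    return ranges
```

Cropping a multi-role frame down to the target role's region (face recognition, edge detection) would happen per frame inside each returned range.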
In practice, besides splitting the target media information according to its features, the first apparatus can present the features of the target media information, such as its name, a per-section plot summary, its duration, and the character roles involved, for the first user side to issue instructions to split the target media information, such as a time-axis splitting instruction or an instruction to split by character role; the first apparatus responds to the first user side's segmenting operation on the media information to obtain the media information segments of the target media information.

After determining the media information segments of the target media information, the first apparatus analyzes the features of each segment, including at least one of: the identifier (number) and duration of each segment; the character roles involved in each sub-segment (one or more frames) of each segment (a role may be represented by an image extracted from the segment); and the lines of each role (which can be obtained by speech recognition on the audio information segment, or fetched directly from the server side).
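The segment features listed above (identifier, duration, character roles, lines) map naturally onto a small data structure; the field names here are illustrative, not from the application:

```python
from dataclasses import dataclass, field

@dataclass
class SegmentFeatures:
    segment_id: str                                  # identifier (number) of the segment
    duration_s: float                                # segment length in seconds
    characters: list = field(default_factory=list)   # roles appearing in the segment
    lines: dict = field(default_factory=dict)        # role -> its lines of dialogue

# A hypothetical segment C carrying one role with its lines.
seg_c = SegmentFeatures("C", 30.0, ["role1"], {"role1": ["line one", "line two"]})
```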
Step 102: Collect the first user side based on the determined features to obtain a first media information segment corresponding to the target media information segment.

Referring to FIG. 5, the first apparatus 100 loads a list of target media information on its graphical interface for the first user side to choose the media information to imitate, that is, the target media information. After the first user side selects it, the first apparatus loads the media information segments of the target media information and the features of each segment, for the first user side to choose the segment to imitate, that is, the target media information segment.

Suppose the first apparatus determines that the first user side has selected media information segment C. It prompts to start collecting the first user side's performance of segment C and stores it as the first user side's imitation of segment C (the first media information segment; since the first media information segment is the target media information segment as imitated by the first user side, its count equals the number of target segments the first user side selected and imitated). Since segment C may involve many lines and actions, to improve the quality of the imitation, after prompting to start collection the first apparatus can load the features of segment C on the graphical interface, including its collection start time, the character roles it carries, and the corresponding lines, so the first user side performs based on the prompted features. Meanwhile the first apparatus performs video collection and audio collection of the first user side to obtain a video information segment and an audio information segment, and synthesizes them into the media information segment the first user side performed in imitation of segment C (the first media information segment).

The case where the target media information comprises two media information segments (such as segments A and C) or more (such as segments A, B, and C) can be implemented with reference to the description for segment C and is not repeated here.
Step 103: Determine the media information segments of the target media information other than the target media information segment, and acquire second media information segments corresponding to the features of the determined segments.

Still taking the media information shown in FIG. 2 as the target media information: the first user side selected media information segment C as the target segment and imitated it, while segments A, B, and D were not imitated. To obtain complete media information corresponding to that of FIG. 2 based on the imitated segment C, the first apparatus must also acquire media information segments corresponding to the features of segments A, B, and D, that is, segments whose performances (including character roles, actions, and lines) are consistent with those in A, B, and D (the second media information segments).

In one implementation, the first apparatus can directly splice the original media information segments A, B, and D of the target media information shown in FIG. 2 with segment C imitated by the first user side.

In another implementation, considering that other apparatuses (the second apparatus is taken as an example below) may also collect imitations of segments A, B, and D from their corresponding user sides — for example, the second apparatus collects the second user side's performance based on the features of the segments other than segment C (namely segments A, B, and D) — segment C imitated by the first user side can be spliced with segments A, B, and D imitated by the second user side.

In practice, the imitated segments collected by other apparatuses from their user sides may relate to multiple pieces of media information (not only the aforementioned target media information), so among the segments collected by other apparatuses it is necessary to determine which were performed in imitation of segments of the target media information. For example, the segments collected by the second apparatus from the second user side are acquired, and their features are matched against those of segments A, B, and D of the target media information (the features of each segment of the target media information other than the target segment), for example by matching segment identifiers (such as numbers or names uniquely characterizing a segment); the successfully matched segments collected by the second apparatus are used as the second media information segments.
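The matching step can be sketched as a filter on segment identifiers; the dictionary shape of a collected segment is an assumption for the example:

```python
def match_second_segments(needed_ids, candidate_segments):
    """Pick, from segments collected on second devices, those whose identifier
    matches a segment of the target media the first user side did not perform."""
    needed = set(needed_ids)
    return [seg for seg in candidate_segments if seg["segment_id"] in needed]
```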
Step 104: Determine the splicing manner of the media information segments of the target media information, and splice the first media information segment with the second media information segments based on the determined splicing manner to obtain spliced media information.

Taking FIG. 2 as an example, the target media information is split along the time axis, and the first apparatus has collected the first user side's performance of media information segment C (the first media information segment). When splicing the imitated segment C with segments A, B, and D (which, as above, can be the segments of the target media information shown in FIG. 2, or segments collected by another apparatus such as the second apparatus 200 shown in FIG. 5 from the second user side imitating segments A, B, and D of the target media information), referring to FIG. 5 the first apparatus 100 uses the time-axis sequential splicing manner, splicing segments C, A, B, and D in their order on the time axis (A, then B, then C, then D).
Taking FIG. 4 as an example, when the first apparatus splits the target media information according to the character roles it carries — suppose the first apparatus determines from the first user side's operation instruction that the first user side wishes to imitate the segment of the target media information containing role 1 — it loads the features of role 1's segment (such as start and end times, and prompts for the lines and actions at different points in time), collects (video and audio) the first user side's performance imitating role 1's segment, and obtains the segment the first user side performed as role 1. Here the synchronous splicing manner is used: referring to FIG. 6, the first apparatus 100 splices the segment performed by the first user side as role 1 (the first media information segment) with the segment containing role 2 (the second media information segment, which can be the segment extracted for role 2 from the target media information shown in FIG. 4, or a segment collected by another apparatus such as the second apparatus 200 shown in FIG. 6 from the second user side imitating role 2's segment) synchronously, based on the positions in the target media information from which the carried roles were extracted.
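Synchronous splicing can be sketched at the frame level: each role's extracted track is placed back at the frame indices it was extracted from, so the two performances line up in time. Actual pixel composition (positioning, blending) is abstracted away, and the track shape is an assumption for the example:

```python
def splice_synchronously(n_frames, role_tracks):
    """Rebuild each output frame from the per-role extracts, placing every role's
    part back at the frame positions it was extracted from. `role_tracks` maps
    role -> {frame_index: frame_part}; composition itself is abstracted to
    collecting the parts present at each frame index."""
    return [
        {role: track[i] for role, track in role_tracks.items() if i in track}
        for i in range(n_frames)
    ]
```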
As in the foregoing step 101, the media information segments are determined by splitting the target media information based on its features, which means the first apparatus does not need to store the raw data of the target media information locally when splitting it. In this case, when the first apparatus determines the target segment among the segments of the target media information and collects the first user side's imitation of it, to improve the efficiency of the subsequent splicing — referring to FIG. 7 — the first apparatus 100 synchronously acquires, during collection, the segments of the target media information the first user side did not imitate. These can be segments A, B, and D of the target media information (the segments other than the target segment), or segments performed by other user sides, such as segments collected by another apparatus like the second apparatus 200 from the second user side imitating segments A, B, and D.

Since the spliced media information includes the first user side's performance, in practice the first user side wants to upload and share it. A common scenario, referring to FIG. 8: when acquiring the segments of the target media information the first user side did not perform, the first apparatus 100 (which can be set in terminals of various forms such as smartphones) first queries the server side (which carries the social functions between user sides) whether user sides having a social relationship with the first user side have uploaded corresponding segments. Of course, the first apparatus supports choosing among segments uploaded by different second user sides as needed, and splices according to the segments returned by the server side. After uploading the media information to the server side, a sharing link returned by the server side can be obtained; based on the sharing link, sharing can be done in the form of a uniform resource locator (URL), which a recipient such as the second apparatus 200 can use to access the media information, or the link can be shared on an HTML5 page, whose visitors can watch the media information by clicking the link.
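The two sharing forms mentioned (a plain URL and an HTML5 page) can be sketched as pure string construction from the share link returned by the server; the markup is illustrative, not the application's actual page:

```python
def build_share(share_url):
    """Given the share URL returned by the server after upload, produce the two
    sharing forms described: a plain-URL message and a minimal HTML5 page that
    embeds the spliced video (markup is illustrative)."""
    plain = share_url
    html5 = f'<html><body><video src="{share_url}" controls></video></body></html>'
    return plain, html5
```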
In the embodiments of the present invention, presenting the features of the media information segment the first user side wishes to perform supports the first user side in imitating the target media information segment, so the imitation can be performed without memorizing all its features (such as lines). After the media information segment is determined, the segments to be spliced with the first user side's performance are acquired based on the features of the segments the first user side did not imitate; the whole process requires no operation by the first user side, who only needs to imitate the target media information segment to obtain the complete media information. This solves the problem that a first user side unable to operate professional media editing software cannot generate complete media information, and improves the processing efficiency for media information.

In the foregoing description, video and audio were collected simultaneously when collecting the first user side's imitation of the target media information segment. The first user side may also have the following need: to imitate only the actions and poses in the target segment without performing its lines, wishing the first apparatus to generate a media information segment that contains the footage of the first user side imitating the target segment but uses the original audio information of the target media information corresponding to that segment. The embodiment below addresses this case.
Referring to FIG. 9, the media information processing method of this embodiment includes the following steps:

Step 201: The first apparatus determines the video information segments of the target media information, its audio information, and the features of each video information segment.

The determination of the video information segments is described below with reference to the splitting modes. The difference from the previous embodiment is that, since the first apparatus does not collect the first user's audio information but uses the original audio information of the target media information, the video information and audio information can first be separated from the target media information; the video information is split into video information segments while the audio information is left unprocessed.

Mode 3) Time-axis splitting: according to the duration of the target media information, it is split evenly (or unevenly) in sequence along the time axis (only the video information of the target media information) to obtain video information segments.

Optionally, when splitting along the time axis, the plot of the target media information (including the time ranges each plot section occupies on the time axis) can be used to split the video information into segments, leaving the audio information unprocessed, making it easier for the first user side to choose the video information segment it wishes to imitate.

See FIG. 10 for a schematic of splitting the target media information in this way. In FIG. 10 the target media information is split into four video information segments, A, B, C, and D, while the audio information of the target media information, once separated from the video information, is not split.

Mode 4) The video information and audio information are separated from the target media information, the video information is split according to the different character roles it carries, and video information segments each carrying a different single role are extracted in turn from the video information of the target media information, while the separated audio information is not split.

For example, image recognition identifies each frame of the video information of the target media information and determines the character roles each frame carries (suppose the video carries role 1 and role 2). Referring to FIG. 4, when extracting the video information segment containing the target role (say role 1), frames containing only role 1 are extracted directly; for frames containing multiple roles (both role 1 and role 2), image recognition techniques (such as face recognition and edge detection) extract the part of the frame containing the target role, so the frames extracted from the video information contain only role 1.
Step 202: Collect the first user side based on the determined features to obtain a first video information segment corresponding to the target video information segment.

Referring to FIG. 5, the first apparatus loads a list of target media information on its graphical interface for the first user side to choose the video information to imitate, that is, the target video information. After the first user side selects it, the first apparatus loads the video information segments of the target video information and the features of each, for the first user side to choose the video information segment to imitate, that is, the target video information segment.

For example, suppose the first apparatus determines that the first user side has selected video information segment C. It prompts to start collecting the first user side's performance of segment C and stores it as the imitation of segment C (the first video information segment; since the first video information segment is the target video information segment as imitated by the first user side, its count equals the number of target video segments the first user side selected and imitated). Since segment C may involve many actions, to improve the quality of the imitation, after prompting to start collection the first apparatus can load the features of segment C on the graphical interface, including its collection start and end times, the roles it carries, and the corresponding action prompts, so the first user side performs based on the prompted features. Meanwhile the first apparatus performs video collection of the first user side, obtaining the video information segment the first user side performed in imitation of segment C (the first video information segment).

The case of two target video information segments (such as segments A and C) or more (such as segments A, B, and C) can be implemented with reference to the description for segment C and is not repeated here.
Step 203: Determine the video information segments of the target media information other than the target video information segment, and acquire second video information segments corresponding to the features of the determined segments.

Still taking the media information of FIG. 10 as the target media information: the first user side selected video information segment C as the target video segment and imitated it, while video information segments A, B, and D were not imitated. To obtain complete media information corresponding to that of FIG. 10 based on the imitated segment C, the first apparatus must also acquire video information segments corresponding to the features of segments A, B, and D, that is, video segments whose performances (including character roles, actions, and lines) are consistent with those in A, B, and D (the second video information segments).

As one implementation of video splicing, the first apparatus can directly splice the original video information segments A, B, and D of the target media information shown in FIG. 10 with segment C imitated by the first user side.

As another implementation, given that other apparatuses (the second apparatus as an example) may also collect imitations of video segments A, B, and D from their user sides — for example, the second apparatus collects the second user side's performance based on the features of the video segments other than segment C (namely A, B, and D) — segment C imitated by the first user side can be spliced with video segments A, B, and D imitated by the second user side, together with the audio information of the target media information.

In practice, the imitated video segments collected by other apparatuses from their user sides may relate to multiple pieces of media information (not only the aforementioned target media information), so it is necessary to determine which of the video segments collected by other apparatuses were performed in imitation of target video segments of the target media information.

For example, the video segments collected by the second apparatus from the second user side are acquired, and their features are matched against those of video segments A, B, and D of the target media information (the features of each video segment other than the target video segment), for example by matching segment identifiers (numbers or names uniquely characterizing a segment); the successfully matched segments collected by the second apparatus are used as the second video information segments.
Step 204: Determine the splicing manner of the video information segments of the target media information, and splice the first video information segment, the second video information segments, and the audio information based on the determined splicing manner to obtain spliced media information.

Taking FIG. 10 as an example, the target media information is split along the time axis, and the first apparatus has collected the first user side's performance of video information segment C (the first video information segment). When splicing the imitated segment C with segments A, B, and D (which, as before, can be the video segments of the target media information shown in FIG. 10, or segments collected by another apparatus such as the second apparatus 200 shown in FIG. 11 from the second user side imitating video segments A, B, and D) and with the audio information, referring to FIG. 11 the first apparatus 100 uses the time-axis sequential splicing manner, splicing video segments C, A, B, and D and the audio information in their order on the time axis (A, then B, then C, then D).

Taking FIG. 4 as an example, when the first apparatus splits the target media information according to the character roles it carries — suppose it determines from the first user side's operation instruction that the first user side wishes to imitate the video information segment containing role 1 — it loads the features of role 1's video segment (such as start and end times and action prompts), collects (video only) the first user side's performance imitating role 1's segment, and obtains the video information segment the first user side performed as role 1.

Here the synchronous splicing manner is used: referring to FIG. 12, the first apparatus 100 splices the video segment performed by the first user side as role 1 (the first video information segment) with the video segment containing role 2 (the second video information segment, which can be the segment extracted for role 2 from the target media information shown in FIG. 4, or a segment collected by another apparatus such as the second apparatus 200 shown in FIG. 12 from the second user side imitating role 2's segment) synchronously, based on the positions in the target media information from which the carried roles were extracted.
As in the foregoing step 201, the video information segments are determined by splitting the video information of the target media information based on its features, which means the first apparatus does not need to store the raw data of the target media information locally when splitting it.

In this case, when the first apparatus determines the target video information segment among the video segments of the target media information and collects the first user side's imitation of it, to improve the efficiency of the subsequent splicing — referring to FIG. 7 — the first apparatus synchronously acquires, during collection, the video segments of the target media information the first user side did not imitate. These can be video segments A, B, and D of the target media information (the segments other than the target segment), or video segments collected by another apparatus such as the second apparatus from the second user side imitating video segments A, B, and D.

For example, since the spliced media information includes the first user side's performance, in practice the first user side wants to upload and share it. A common scenario: when acquiring the video segments the first user side did not perform, the first apparatus first queries the server side (which carries the social functions between user sides) whether user sides having a social relationship with the first user side have uploaded corresponding video segments; the first apparatus supports choosing among segments uploaded by different second user sides as needed, and splices according to the segments returned by the server side. After uploading the media information to the server side, a sharing link returned by the server side can be obtained and shared in URL form — a recipient can access the media information via the URL — or shared on an HTML5 page, whose visitors can watch the media information by clicking the link.
An embodiment of the present invention describes the system architecture of the aforementioned media information processing apparatus. See the optional system architecture of the media information processing apparatus 100 shown in FIG. 13, which includes:

a video recording module 101, a network adaptation module 102, an audio-video splicing module 103, and an upload module 104. The modules in the architecture of FIG. 13 are a division of the apparatus's functions at the level of logical function modules; corresponding hardware in the apparatus's hardware structure supports each module. For example, the video recording module 101 can be implemented at least by the camera 140 shown in FIG. 1; the network adaptation module 102 at least by the processor 110 and the communication module 160 of FIG. 1 in cooperation; and the audio-video splicing module 103 at least by the processor 110 shown in FIG. 1.

In this embodiment, the audio-video splicing module 103 separates the sound and video parts of a film or series clip and divides the video part into several sections (e.g. sections A, B, C, D). The user can use the video recording module 101 running on the media information processing apparatus to record certain sections of the video part (e.g. sections A and C). After recording, the network adaptation module 102 automatically pulls the other video sections of the clip (e.g. sections B and D) from the backend, and the audio-video splicing module 103 splices them to generate the complete video (A+B+C+D), merging the user's performance into the video and achieving the effect of performing together with stars or other people. Meanwhile, when the user shares, the upload module 104 saves the sections the user performed (sections A and C) in the backend; when other users record the other sections (B and D), this user's sections can also be pulled from the backend for splicing, achieving the effect of a joint performance.

The user records video through the recording system; the video recording module 101 can add subtitles, filters, logos, and so on to the video. The network adaptation module 102 automatically downloads and stores the sections the user did not record, according to the film or series clip the user is recording. The audio-video splicing module 103 first splices the video sections and then splices the video with the audio to complete the full video. The upload module 104 uploads the spliced video and the user-recorded sections to the backend: the complete spliced video is for sharing with others, and the user-recorded sections are for matching and splicing with other users' recordings to form new videos.
Processing logic:

1. The video recording module 101 records video. Video is recorded through the camera, and only some sections are recorded. For example: a film or series clip is divided into four sections A, B, C, and D; the video recording module 101 records sections A and C and keeps the recordings locally. Sections A and C contain no speech, and subtitles, logos, and so on can be pressed into the video during recording.

2. The network adaptation module 102 adapts the sections.

While the user is recording sections A and C, the network adaptation module 102 tells the backend which film or series clip is being recorded and which sections the user is recording, and the backend automatically delivers sections B and D (without speech) and the speech of the whole clip (A+B+C+D).

3. The audio-video splicing module 103 splices the video sections.

When the user finishes recording sections A and C, sections B and D have in theory already been downloaded. At this point the audio-video splicing module 103 splices sections A, B, C, and D into a complete video without speech.

4. The audio-video splicing module 103 splices video and audio.

After the video splicing is complete, the application immediately splices the video with the speech to form the complete video. This video then contains sections A and C performed by the user as well as sections B and D from the series; of course, sections B and D can also be other users' performances and are not limited to the original sections of the film or series.
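Steps 3 and 4 of the processing logic — splice the silent video sections first, then pair the result with the full-length speech track pulled from the backend — can be sketched as follows, with containers and codecs abstracted to simple values:

```python
def assemble(video_segments, audio_track):
    """Step 3: concatenate the video sections A..D into one silent video.
    Step 4: pair the silent video with the full-length audio track.
    Real muxing (container format, codecs) is abstracted away here."""
    silent_video = "".join(video_segments)                # step 3: video-only splice
    return {"video": silent_video, "audio": audio_track}  # step 4: add the speech track
```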
5. The upload module 104 uploads the video.

As noted above, sections B and D can be not only prepared sections from a series or film but also sections performed by other users. Therefore, after recording is complete, the video recording module 101 not only lets the user keep the video locally but also offers an upload function. The upload module 104 uploads the user-recorded sections A and C and the complete video (A+B+C+D+speech) to the backend; after the complete video is uploaded, the backend returns the URL where the video is saved, for the user to share.

6. Sharing.

Step 5 returned the video URL, which can be used for sharing: plain-URL sharing, or generating an H5 web page to share; the forms can vary.

Steps 5 and 6 of the above processing logic can be implemented as needed, merely as an optional extension that adds a friendly experience to the media information processing apparatus.
Corresponding to the foregoing embodiments of the present invention, another optional system structure of the media information processing apparatus is described. The first apparatus 100 and the second apparatus 200 described above can be implemented according to this system architecture. Referring to FIG. 14, the media information processing apparatus 300 includes: a first determining module 310, a collection module 320, a second determining module 330, a third determining module 340, and a splicing module 350.

1) First determining module 310 and analysis module 360

The first determining module 310 determines the media information segments of the target media information and the features of each segment, for example: the identifier (number) of each segment, its duration, the character roles involved in each segment, and the lines of each role.

The analysis module 360 obtains the media information segments of the target media information in response to the first user side's operation instruction on the media information; or it splits the target media information automatically. When splitting, both the video information and the audio information of the target media information can be split, and the resulting video and audio segments merged into media information segments; or the video information and audio information can be separated from the target media information, the audio information left unprocessed, and the video information split into video information segments. Combining these splitting modes, a target media information segment can be of two kinds: 1) including both video information and audio information; 2) including only video information.

The analysis module 360 analyzes the target media information to obtain its features and splits the target media information along the following two dimensions:

Dimension 1) the features of the target media information characterize its duration, and the target media information is split into media information segments along the time axis;

Dimension 2) the features of the target media information characterize the character roles it carries, and media information segments containing each role are extracted from the target media information, where each segment carries only one role and the roles carried by different segments differ. Based on these dimensions, and depending on whether both the video information and the audio information are split or only the video information, the splitting modes are as follows:
Mode 1) The first determining module 310 splits along the time axis: according to the duration of the target media information, it splits evenly (or unevenly) in sequence along the time axis (splitting both the video information and the audio information, which can be separated from the target media information in advance) to obtain media information segments, each including a video information segment and an audio information segment.

Mode 2) Based on the features characterizing the character roles carried by the target media information, the first determining module 310 extracts media information segments containing each role, where each segment carries only one role and the roles carried by different segments differ.

Mode 3) The first determining module 310 splits along the time axis: according to the duration of the target media information, it splits evenly (or unevenly) in sequence along the time axis (only the video information of the target media information) to obtain video information segments.

Mode 4) The first determining module 310 separates the video information and the audio information from the target media information, splits the video information according to the different roles it carries, and extracts in turn video information segments each carrying a different single role from the video information of the target media information, while the separated audio information is not split.
2) Collection module 320

The collection module 320 collects the first user side based on the determined features to obtain the first media information segment corresponding to the target media information segment. The collection manner depends on the splitting manner of the first determining module 310: with mode 1) or mode 2), since the media information segments include both video and audio information, the collection module 320 synchronously collects video and audio of the first user side's performance, and the collected segment (the first media information segment) includes video and audio information; with mode 3) or mode 4), since the segments include only video information, the collection module 320 collects only video of the first user side's performance, and the collected segment includes only video information.

To help the first user side choose a target media information segment to imitate, the collection module 320 loads the features of each segment of the target media information (for example, loading each segment's identifier for the first user side to select); determines the target segment among the segments according to the first user side's selection operation; loads the features of the target segment (for example, its collection start time, the roles it carries, and the corresponding lines) so the first user side can perform based on them; and collects the performance the first user side gives based on the features of the target segment (including video collection and audio collection).

As can be seen from the above, the collection module 320 can select the target media information segment among the segments of the target media information according to the user's selection operation; there is at least one target segment and, correspondingly, at least one first media information segment.
3) Second determining module 330

The second determining module 330 determines the media information segments of the target media information other than the target media information segment and acquires second media information segments corresponding to the features of the determined segments.

Since the second media information segment is spliced with the first media information segment collected by the collection module 320 to form the spliced media information, the information types (such as video and audio information) included in the second segment determined by the second determining module 330 correspond to those of the first segment: when the first segment collected by the collection module 320 includes both video and audio information, the second segment acquired by the second determining module 330 includes video and audio information; when the first segment includes only video information, the second segment likewise includes only video information.

The second determining module 330 acquires each media information segment of the target media information other than the first media information segment as a second media information segment.

Alternatively, media information segments obtained by collecting the second user side's performance, based on the features of the segments other than the target segment, are used as the second media information segments. The second determining module 330 is further configured to acquire the segments the second apparatus 200 collected from the second user side, match their features against the features of each segment of the target media information other than the target segment, and use the successfully matched segments collected by the second apparatus 200 as the second media information segments.

While the collection module 320 is collecting the first media information segment, the second determining module 330 synchronously acquires the second media information of the target media information other than the first segment, which shortens the time needed to produce the spliced media information and avoids a long wait for the first user side after imitating the target media information segment.
4) Third determining module 340 and splicing module 350

The third determining module 340 determines the splicing manner of the media information segments of the target media information; the splicing module 350 splices the first media information segment with the second media information segments based on the determined manner to obtain the spliced media information.

Corresponding to mode 1): when the segments of the target media information are obtained by splitting in time order along the time axis, the splicing module 350 uses the time-axis sequential splicing manner, splicing the first and second media information segments in their order on the time axis;

Corresponding to mode 2): when the segments are extracted from the target media information according to the different character roles they carry, the splicing module 350 uses the synchronous splicing manner, splicing the first and second segments synchronously based on the positions in the target media information from which the carried roles were extracted.

Corresponding to mode 3): when the segments are obtained by splitting in time order along the time axis, the splicing module 350 uses the time-axis sequential splicing manner, splicing the audio information of the target media information (which was not split), the first media information segment, and the second media information segments in their order on the time axis.

Corresponding to mode 4): when the segments are extracted from the target media information according to the different character roles they carry, the splicing module 350 uses the synchronous splicing manner, splicing the audio information, the first media information segment, and the second media information segments synchronously based on the positions in the target media information from which the carried roles were extracted.
5) Upload module 370 and sharing module 380

The upload module 370 (not shown in FIG. 14; connected to the splicing module 350) uploads the spliced media information to the server side and obtains the sharing link returned by the server side; the sharing module 380 (not shown in FIG. 14; connected to the splicing module 350) is configured to respond to the first user side's sharing operation instruction based on the sharing link, for example by sending the sharing link to a terminal device of a second user side socially associated with the first user side, so the second user side can view the spliced media information via the link.

Each functional module of the media information processing apparatus is supported by corresponding hardware: for example, the first determining module 310, the second determining module 330, the third determining module 340, and the splicing module 350 can be implemented at least by the processor 110 and the communication module 160 of FIG. 1 in cooperation, and the collection module 320 at least by the microphone 130 and the camera 140 shown in FIG. 1.
Those skilled in the art will understand that all or part of the steps of the above method embodiments can be completed by hardware related to program instructions; the program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a mobile storage device, random access memory (RAM), read-only memory (ROM), magnetic disk, or optical disk.

In addition, if the above integrated unit of the present invention is implemented in the form of a software function module and sold or used as a stand-alone product, it can also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the related art, can be embodied in the form of a software product stored in a storage medium and including a number of instructions for causing a computer apparatus (which may be a personal computer, a server, or a network apparatus) to perform all or part of the methods described in the embodiments of the present invention. The storage medium includes the various media listed above that can store program code.

In summary, in the embodiments of the present invention, presenting the features of the media information segment the first user side wishes to perform supports the first user side in imitating the target media information segment without memorizing all its features (such as lines); after the media information segment is determined, the segments to be spliced with the first user side's performance are acquired based on the features of the segments the first user side did not imitate; the whole process requires no operation by the first user side, who only needs to imitate the target segment to obtain the complete media information. This solves the problem that a first user side unable to operate professional media editing software cannot generate complete media information, and improves the processing efficiency for media information.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (16)

  1. A media information processing method, comprising:
    determining, at a first device side, media information segments of target media information and features of each of the media information segments;
    collecting the first user side based on the determined features to obtain a first media information segment corresponding to the target media information segment;
    determining media information segments of the target media information other than the target media information segment;
    acquiring a second media information segment corresponding to the features of the determined media information segments;
    determining a splicing manner of the media information segments of the target media information;
    splicing the first media information segment with the second media information segment based on the determined splicing manner to obtain spliced media information.
  2. The method according to claim 1, wherein collecting the first user side based on the determined features to obtain the first media information segment corresponding to the target media information segment comprises:
    loading the features of each of the media information segments of the target media information;
    determining the target media information segment among the media information segments according to a selection operation of the first user side;
    loading the features of the target media information segment, and collecting the performance the first user side gives based on the features of the target media information segment to form the first media information segment.
  3. The method according to claim 1, wherein neither the first media information segment nor the second media information segment carries audio information;
    splicing the first media information segment with the second media information segment based on the determined splicing manner comprises:
    acquiring the audio information in the target media information;
    splicing the acquired audio information, the first media information segment, and the second media information segment based on the determined splicing manner.
  4. The method according to claim 3, wherein splicing the first media information segment and the second media information segment synthesized with the corresponding audio information based on the determined splicing manner comprises:
    when the media information segments of the target media information are obtained by splitting in time order along the time axis, using a time-axis sequential splicing manner to splice the audio information, the first media information segment, and the second media information segment in their order on the time axis;
    when the media information segments of the target media information are extracted from the target media information according to the different character roles carried by each of the media information segments, using a synchronous splicing manner to splice the audio information, the first media information segment, and the second media information segment synchronously based on the positions in the target media information from which the carried character roles were extracted.
  5. The method according to claim 1, wherein acquiring the second media information segment corresponding to the features of the determined media information segments comprises:
    acquiring each media information segment of the target media information other than the first media information segment as the second media information segment; or,
    acquiring, as the second media information segment, a media information segment obtained at a second device side by collecting a performance of a second user side, wherein the second device collects the second user side's performance based on the features of the media information segments other than the target media information segment.
  6. The method according to claim 1, further comprising:
    analyzing the target media information to obtain the features of the target media information, and performing at least one of the following splitting operations:
    when the features of the target media information characterize the duration of the target media information, splitting the target media information into the media information segments along the time axis;
    when the features of the target media information characterize the character roles carried by the target media information, extracting media information segments containing each of the character roles from the target media information to obtain the media information segments, wherein each media information segment carries only one character role and the roles carried by different segments differ.
  7. The method according to claim 1, further comprising:
    uploading the spliced media information to a server side, and acquiring a sharing link returned by the server side;
    responding to a sharing operation instruction of the first user side based on the sharing link.
  8. A media information processing apparatus, comprising:
    a first determining module configured to determine, at a first device side, media information segments of target media information and features of each of the media information segments;
    a collection module configured to collect the first user side based on the determined features to obtain a first media information segment corresponding to a target media information segment;
    a second determining module configured to determine media information segments of the target media information other than the target media information segment, and acquire a second media information segment corresponding to the features of the determined segments;
    a third determining module configured to determine a splicing manner of the media information segments of the target media information;
    a splicing module configured to splice the first media information segment with the second media information segment based on the determined splicing manner to obtain spliced media information.
  9. The media information processing apparatus according to claim 8, wherein
    the collection module is further configured to load the features of each of the media information segments of the target media information;
    the collection module is further configured to determine the target media information segment among the media information segments according to a selection operation of the first user side;
    the collection module is further configured to load the features of the target media information segment and collect the performance the first user side gives based on those features to form the first media information segment.
  10. The media information processing apparatus according to claim 8, wherein neither the first media information segment nor the second media information segment carries audio information;
    the splicing module is further configured to acquire the audio information in the target media information;
    the splicing module is further configured to splice the acquired audio information, the first media information segment, and the second media information segment based on the determined splicing manner.
  11. The media information processing apparatus according to claim 10, wherein
    the splicing module is further configured to, when the media information segments of the target media information are obtained by splitting in time order along the time axis, use a time-axis sequential splicing manner to splice the audio information, the first media information segment, and the second media information segment in their order on the time axis;
    the splicing module is further configured to, when the media information segments of the target media information are extracted from the target media information according to the different character roles carried by each of the media information segments, use a synchronous splicing manner to splice the audio information, the first media information segment, and the second media information segment synchronously based on the positions in the target media information from which the carried character roles were extracted.
  12. The media information processing apparatus according to claim 8, wherein
    the second determining module is further configured to acquire each media information segment of the target media information other than the first media information segment as the second media information segment; or to acquire, as the second media information segment, a media information segment obtained at a second device side by collecting a performance of a second user side, wherein the second device collects the second user side's performance based on the features of the media information segments other than the target media information segment.
  13. The media information processing apparatus according to claim 8, further comprising:
    an analysis module configured to analyze the target media information to obtain the features of the target media information and perform at least one of the following splitting operations:
    when the features of the target media information characterize the duration of the target media information, splitting the target media information into the media information segments along the time axis;
    when the features of the target media information characterize the character roles carried by the target media information, extracting media information segments containing each of the character roles from the target media information to obtain the media information segments, wherein each media information segment carries only one character role and the roles carried by different segments differ.
  14. The media information processing apparatus according to claim 8, further comprising:
    an upload module configured to upload the spliced media information to a server side and acquire a sharing link returned by the server side;
    a sharing module configured to respond to a sharing operation instruction of the first user side based on the sharing link.
  15. A media information processing apparatus, comprising a memory and a processor, the memory storing executable instructions for causing the processor to perform operations comprising:
    determining, at a first device side, media information segments of target media information and features of each of the media information segments;
    collecting the first user side based on the determined features to obtain a first media information segment corresponding to the target media information segment;
    determining media information segments of the target media information other than the target media information segment;
    acquiring a second media information segment corresponding to the features of the determined media information segments;
    determining a splicing manner of the media information segments of the target media information;
    splicing the first media information segment with the second media information segment based on the determined splicing manner to obtain spliced media information.
  16. A storage medium storing executable instructions for performing the media information processing method according to any one of claims 1 to 7.
PCT/CN2017/074174 2016-03-14 2017-02-20 媒体信息处理方法及媒体信息处理装置、存储介质 WO2017157135A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/041,585 US10652613B2 (en) 2016-03-14 2018-07-20 Splicing user generated clips into target media information

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610143913.8 2016-03-14
CN201610143913.8A CN105812920B (zh) Media information processing method and media information processing apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/041,585 Continuation-In-Part US10652613B2 (en) 2016-03-14 2018-07-20 Splicing user generated clips into target media information

Publications (1)

Publication Number Publication Date
WO2017157135A1 true WO2017157135A1 (zh) 2017-09-21

Family

ID=56467379

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/074174 WO2017157135A1 (zh) Media information processing method, media information processing apparatus, and storage medium

Country Status (3)

Country Link
US (1) US10652613B2 (zh)
CN (1) CN105812920B (zh)
WO (1) WO2017157135A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105812920B (zh) * 2016-03-14 2019-04-16 腾讯科技(深圳)有限公司 Media information processing method and media information processing apparatus
JP6980177B2 (ja) * 2018-01-09 2021-12-15 トヨタ自動車株式会社 Audio device
CN110300274B (zh) * 2018-03-21 2022-05-10 腾讯科技(深圳)有限公司 Video file recording method, apparatus, and storage medium
US10706347B2 (en) * 2018-09-17 2020-07-07 Intel Corporation Apparatus and methods for generating context-aware artificial intelligence characters
CN109379633B (zh) * 2018-11-08 2020-01-10 北京微播视界科技有限公司 Video editing method and apparatus, computer device, and readable storage medium
CN114697700A (zh) * 2020-12-28 2022-07-01 北京小米移动软件有限公司 Video clipping method, video clipping apparatus, and storage medium
CN114302253B (zh) * 2021-11-25 2024-03-12 北京达佳互联信息技术有限公司 Media data processing method, apparatus, device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009124004A1 (en) * 2008-03-31 2009-10-08 Dolby Laboratories Licensing Corporation Associating information with media content using objects recognized therein
CN202034042U (zh) * 2010-12-10 2011-11-09 深圳市同洲电子股份有限公司 Multimedia information processing system applied to an instant video-on-demand system
CN103916700A (zh) * 2014-04-12 2014-07-09 深圳市晟江科技有限公司 Method and system for identifying information in a video file
CN103945234A (zh) * 2014-03-27 2014-07-23 百度在线网络技术(北京)有限公司 Method and device for providing video-related information
CN104967902A (zh) * 2014-09-17 2015-10-07 腾讯科技(北京)有限公司 Video sharing method, apparatus, and system
CN105812920A (zh) * 2016-03-14 2016-07-27 腾讯科技(深圳)有限公司 Media information processing method and media information processing apparatus

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7142645B2 (en) * 2002-10-04 2006-11-28 Frederick Lowe System and method for generating and distributing personalized media
US7123696B2 (en) * 2002-10-04 2006-10-17 Frederick Lowe Method and apparatus for generating and distributing personalized media clips
CN1719872A (zh) * 2005-08-11 2006-01-11 上海交通大学 Movie-show entertainment system based on whole-body fusion
US20070122786A1 (en) * 2005-11-29 2007-05-31 Broadcom Corporation Video karaoke system
US9083938B2 (en) * 2007-02-26 2015-07-14 Sony Computer Entertainment America Llc Media player with networked playback control and advertisement insertion
CN101051457A (zh) * 2007-03-26 2007-10-10 中山大学 Electronic entertainment system for movie karaoke
CN101946500B (zh) * 2007-12-17 2012-10-03 伊克鲁迪控股公司 Real-time video inclusion system
US8555169B2 (en) * 2009-04-30 2013-10-08 Apple Inc. Media clip auditioning used to evaluate uncommitted media content
US20110126103A1 (en) * 2009-11-24 2011-05-26 Tunewiki Ltd. Method and system for a "karaoke collage"
KR20130127338A (ko) * 2012-05-14 2013-11-22 삼성전자주식회사 Display apparatus, server, and control method thereof
US20140164507A1 (en) * 2012-12-10 2014-06-12 Rawllin International Inc. Media content portions recommended
WO2014194488A1 (en) * 2013-06-05 2014-12-11 Intel Corporation Karaoke avatar animation based on facial motion data
CN104376589A (zh) * 2014-12-04 2015-02-25 青岛华通国有资本运营(集团)有限责任公司 Method for replacing characters in film and television dramas


Also Published As

Publication number Publication date
CN105812920A (zh) 2016-07-27
US10652613B2 (en) 2020-05-12
CN105812920B (zh) 2019-04-16
US20180352293A1 (en) 2018-12-06

Similar Documents

Publication Publication Date Title
WO2017157135A1 (zh) Media information processing method, media information processing apparatus, and storage medium
CN109547819B (zh) Live streaming list display method, apparatus, and electronic device
US10685460B2 (en) Method and apparatus for generating photo-story based on visual context analysis of digital content
KR101664754B1 (ko) Information acquisition method, apparatus, program, and recording medium
US20170257414A1 (en) Method of creating a media composition and apparatus therefore
JP6385447B2 (ja) Video providing method and video providing system
US20150058709A1 (en) Method of creating a media composition and apparatus therefore
CN111930994A (zh) Video editing processing method and apparatus, electronic device, and storage medium
KR20150099774A (ko) Digital platform for synchronized editing of user-generated video
CN112188117B (zh) Video synthesis method, client, and system
CN105893412A (zh) Image sharing method and apparatus
TWI522823B (zh) Intelligent media presentation technology across multiple devices
CN103428555A (zh) Multimedia file synthesis method, system, and application method
CN112188267B (zh) Video playback method, apparatus, device, and computer storage medium
WO2019114330A1 (zh) Video playback method, apparatus, and terminal device
JP7074891B2 (ja) Shooting method and terminal device
WO2017185584A1 (zh) Playback optimization method and apparatus
WO2018050021A1 (zh) Virtual reality scene adjustment method, apparatus, and storage medium
CN113938620B (zh) Image processing method, mobile terminal, and storage medium
KR102038938B1 (ko) Toy SNS enabling creation and sharing of short-clip videos
WO2013116163A1 (en) Method of creating a media composition and apparatus therefore
WO2023241377A1 (zh) Video data processing method, apparatus, device, system, and storage medium
CN108616768B (zh) Synchronized playback method and apparatus for multimedia resources, storage medium, and electronic apparatus
WO2023045430A1 (zh) QR-code-based data processing method, apparatus, and system
US9084011B2 (en) Method for advertising based on audio/video content and method for creating an audio/video playback application

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17765677

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17765677

Country of ref document: EP

Kind code of ref document: A1