WO2020215776A1 - Method and apparatus for processing multimedia data - Google Patents

Method and apparatus for processing multimedia data

Info

Publication number
WO2020215776A1
WO2020215776A1 PCT/CN2019/128161 CN2019128161W WO2020215776A1 WO 2020215776 A1 WO2020215776 A1 WO 2020215776A1 CN 2019128161 W CN2019128161 W CN 2019128161W WO 2020215776 A1 WO2020215776 A1 WO 2020215776A1
Authority
WO
WIPO (PCT)
Prior art keywords
multimedia data
information
scene
multimedia
shooting
Prior art date
Application number
PCT/CN2019/128161
Other languages
English (en)
French (fr)
Inventor
袁明飞
Original Assignee
珠海格力电器股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 珠海格力电器股份有限公司
Priority to US17/605,460 (US11800217B2)
Priority to EP19926007.6A (EP3941075A4)
Publication of WO2020215776A1 (zh)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/48 Matching video sequences
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/27 Server based end-user applications
    • H04N21/274 Storing end-user multimedia data in response to end-user request, e.g. network recorder
    • H04N21/2743 Video hosting of uploaded data from client
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223 Cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/64 Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/667 Camera operation mode switching, e.g. between still and video, sport and normal or high- and low-resolution modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2621 Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content

Definitions

  • the present disclosure relates to the field of multimedia, and in particular to a method and device for processing multimedia data.
  • the embodiments of the present disclosure provide a method and apparatus for processing multimedia data, so as to at least solve the technical problem of poor video imitation effect due to the inability to obtain information of the imitation video in the method known to the inventor.
  • a method for processing multimedia data, including: acquiring first multimedia data; performing multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information; and performing video shooting according to the multimedia dimension information to obtain second multimedia data.
  • the multimedia data processing method further includes: detecting the number of scenes that make up the first multimedia data; in the case where it is detected that the number of scenes is more than one, acquiring the switching effects between the multiple scenes and the scene information corresponding to each scene.
  • the method for processing multimedia data further includes: detecting the number of scenes that make up the first multimedia data; in a case where it is detected that the number of scenes is one, acquiring scene information corresponding to the scene.
  • the multimedia data processing method further includes: detecting a scene corresponding to each frame in the first multimedia data; determining the number of scenes that make up the first multimedia data according to the matching degree of the scenes corresponding to two adjacent frames.
  • the multimedia data processing method further includes: identifying object information of the preset object, wherein the object information includes at least one of the following: the expression, actions, and special effects of the preset object.
  • the multimedia data processing method further includes: acquiring scene information corresponding to each scene in the first multimedia data and the switching effects between multiple scenes; performing video shooting according to the scene information to obtain third multimedia data corresponding to each scene; and setting the switching effects between the multiple third multimedia data according to the acquired switching effects to obtain the second multimedia data.
  • the multimedia data processing method further includes: obtaining scene information corresponding to the first multimedia data; and performing video shooting according to the scene information to obtain the second multimedia data.
  • the multimedia data processing method further includes: detecting the degree of matching between the second multimedia data and the corresponding scene; in the case where the matching degree is less than the preset matching degree, generating camera control information; and generating prompt information according to the camera control information, where the prompt information is used by the user to control the shooting device to perform video shooting according to the camera control information.
  • the multimedia data processing method further includes: detecting a video shooting instruction; in the case of detecting that the video shooting instruction is an imitation shooting instruction, controlling the shooting device to enter the imitation shooting mode, wherein the imitation shooting mode is used to shoot according to existing multimedia data so as to obtain multimedia data with the same shooting effect as the existing multimedia data; in the case where the video shooting instruction is detected as a conventional shooting instruction, controlling the shooting device to enter the conventional shooting mode.
  • an apparatus for processing multimedia data, including: an acquisition module for acquiring first multimedia data; an analysis module for performing multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information; and a processing module for performing video shooting according to the multimedia dimension information to obtain second multimedia data.
  • a storage medium includes a stored program, wherein the device where the storage medium is located is controlled to execute the multimedia data processing method when the program is running.
  • a processor for running a program wherein the multimedia data processing method is executed when the program is running.
  • the multimedia data is parsed and a video is shot according to the parsed information: after the first multimedia data is acquired, it is analyzed in multiple dimensions to obtain multimedia dimension information, and video shooting is then performed according to the multimedia dimension information to obtain the second multimedia data.
  • by analyzing the first multimedia data, information such as its filters, special effects, and transitions can be obtained; the user then shoots a video using the same multimedia dimension information as the first multimedia data, obtaining second multimedia data with the same effect as the first multimedia data. Since the second multimedia data is shot based on information parsed from the first multimedia data, it has the same effect as the first multimedia data.
  • the solution provided by this application achieves the purpose of imitating multimedia data, thereby generating a video with the same effect as the source multimedia data, improving the user's shooting experience, and solving the technical problem, in the method known to the inventor, of poor video imitation effect due to the inability to obtain the information of the imitated video.
  • Fig. 1 is a flowchart of a method for processing multimedia data according to an embodiment of the present disclosure
  • Fig. 2 is a flowchart of an optional multimedia data processing method according to an embodiment of the present disclosure.
  • Fig. 3 is a schematic diagram of a multimedia data processing device according to an embodiment of the present disclosure.
  • an embodiment of a method for processing multimedia data is provided. It should be noted that the steps shown in the flowchart of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and, although a logical sequence is shown in the flowchart, in some cases the steps shown or described may be performed in a different order than the one described here.
  • Fig. 1 is a flowchart of a method for processing multimedia data according to an embodiment of the present disclosure. As shown in Fig. 1, the method includes the following steps:
  • Step S102 Acquire first multimedia data.
  • first multimedia data is multimedia data that the user wants to imitate and shoot
  • first multimedia data is video data
  • the mobile device may obtain the first multimedia data.
  • a mobile device is a device with multimedia data processing capabilities, which can be, but is not limited to, interactive devices such as smart phones and tablets.
  • the mobile device caches the video (that is, the first multimedia data).
  • the client obtains multimedia data corresponding to the video.
  • the client can also obtain the network address corresponding to the resource of the first multimedia data through the Internet, and obtain the first multimedia data from the Internet according to the network address.
  • the mobile device does not need to download or cache the multimedia data, which reduces the local memory usage of mobile devices.
  • Step S104 Perform multi-dimensional analysis on the first multimedia data to obtain multimedia dimensional information.
  • the multimedia dimension information includes at least one of the following: scene information of a scene included in the first multimedia data, and the switching effects between multiple scenes when the multimedia data includes multiple scenes. The scene information includes at least one of the following: background music, scene objects (for example, people, animals, scenery), scene effects (for example, filters, text, special effects, etc.), and camera information (for example, the position and angle of the camera).
  • the client uses AI (Artificial Intelligence) video analysis technology to analyze the first multimedia data, mainly performing multi-dimensional analysis of its voice, text, faces, objects, and scenes.
  • Step S106 Perform video shooting according to the multimedia dimension information to obtain second multimedia data.
  • after obtaining the multimedia dimension information of the first multimedia data, the client performs video shooting according to that information. For example, if the client determines by analyzing the first multimedia data that it uses the "Autumn Fairy Tale" filter, the client applies the "Autumn Fairy Tale" filter during video shooting to obtain the second multimedia data.
  • the first multimedia data is a video that is imitated
  • the second multimedia data is a video shot by imitating the first multimedia data.
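  • as a minimal sketch of steps S102 to S106 under stated assumptions: the class names `SceneInfo` and `DimensionInfo`, the function `imitate`, and the `analyze` and `shoot` callables are hypothetical stand-ins that do not appear in the disclosure; the fields simply mirror the scene information listed above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class SceneInfo:
    # Fields follow the scene information listed in the text; names are illustrative.
    background_music: Optional[str] = None
    scene_objects: List[str] = field(default_factory=list)  # people, animals, scenery
    scene_effects: List[str] = field(default_factory=list)  # filters, text, special effects
    camera_info: Optional[dict] = None                       # camera position and angle


@dataclass
class DimensionInfo:
    scenes: List[SceneInfo]
    # Switching effects between adjacent scenes; empty for a single scene.
    switching_effects: List[str] = field(default_factory=list)


def imitate(
    first_data: Any,
    analyze: Callable[[Any], DimensionInfo],
    shoot: Callable[[DimensionInfo], Any],
) -> Any:
    """Run S104 and S106 on data already acquired in S102."""
    dimension_info = analyze(first_data)  # S104: multi-dimensional analysis
    return shoot(dimension_info)          # S106: shoot using the parsed information
```

  • the analysis and shooting stages are passed in as callables because the disclosure does not specify how either is implemented; any recognizer or camera pipeline could be substituted.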
  • as can be seen from the solution defined in steps S102 to S106, the multimedia data is parsed and a video is shot according to the parsed information: after the first multimedia data is acquired, multi-dimensional analysis is performed on it to obtain multimedia dimension information, and video shooting is then performed according to the multimedia dimension information to obtain the second multimedia data.
  • by analyzing the first multimedia data, information such as its filters, special effects, and transitions can be obtained; the user then shoots a video using the same multimedia dimension information as the first multimedia data, obtaining second multimedia data with the same effect as the first multimedia data. Since the second multimedia data is shot based on information parsed from the first multimedia data, it has the same effect as the first multimedia data.
  • the solution provided by this application achieves the purpose of imitating multimedia data, thereby generating a video with the same effect as the source multimedia data, improving the user's shooting experience, and solving the technical problem, in the method known to the inventor, of poor video imitation effect due to the inability to obtain the information of the imitated video.
  • before acquiring the first multimedia data, the client also detects a video shooting instruction. When the video shooting instruction is detected to be an imitation shooting instruction, the shooting device is controlled to enter the imitation shooting mode; when the video shooting instruction is detected to be a conventional shooting instruction, the shooting device is controlled to enter the conventional shooting mode.
  • the imitation shooting mode is used for shooting based on existing multimedia data so as to obtain multimedia data with the same shooting effect as the existing multimedia data.
  • the user may select the mode of video shooting through the client before video shooting, for example, the user may select the desired shooting mode on the client through voice control or touch operation. If the user selects the imitation shooting mode, the client will receive the imitation shooting instruction. After receiving the imitating shooting instruction, the client will obtain the imitated multimedia data (ie, the first multimedia data), and analyze the first multimedia data.
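  • the mode selection described above amounts to a simple dispatch on the detected instruction. The sketch below is illustrative only: the `ShootingMode` enum, the `select_mode` function, and the instruction strings `"imitate"` and `"conventional"` are assumptions, not identifiers from the disclosure.

```python
from enum import Enum


class ShootingMode(Enum):
    IMITATION = "imitation"
    CONVENTIONAL = "conventional"


def select_mode(instruction: str) -> ShootingMode:
    """Map a detected video shooting instruction to a shooting mode."""
    if instruction == "imitate":        # hypothetical imitation shooting instruction
        return ShootingMode.IMITATION
    if instruction == "conventional":   # hypothetical conventional shooting instruction
        return ShootingMode.CONVENTIONAL
    raise ValueError(f"unknown video shooting instruction: {instruction!r}")
```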
  • the first multimedia data may include multiple scenes.
  • the first scene is in a park and the second scene is at home.
  • the multimedia dimension information corresponding to different numbers of scenes may also be different. Therefore, in the process of multi-dimensional analysis of the first multimedia data, the client needs to detect the number of scenes included in the first multimedia data.
  • the client detects the number of scenes constituting the first multimedia data. When multiple scenes are detected, it obtains the switching effects between the scenes and the scene information corresponding to each scene; when a single scene is detected, it obtains the scene information corresponding to that scene.
  • the aforementioned switching effects between multiple scenes include, but are not limited to, a black-screen flip transition and a gap in which no scene is shown for a preset time period while two scenes are switched.
  • the client detects the scene corresponding to each frame in the first multimedia data, and then determines the number of scenes that make up the first multimedia data according to the matching degree of the scenes corresponding to two adjacent frames. For example, the client detects that the scene corresponding to the first video frame is the first scene and the scene corresponding to the second video frame is the second scene, where the first and second video frames are adjacent and the first and second scenes are different; the client then determines that the first multimedia data includes two scenes. At this time, the client obtains the switching effect used when the two scenes are switched.
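  • the adjacent-frame comparison above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names, the `match` callable that returns a matching degree, and the default `threshold` of 0.5 are all assumptions.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def scene_boundaries(
    frames: Sequence[Frame],
    match: Callable[[Frame, Frame], float],
    threshold: float,
) -> List[int]:
    """Return the frame indices at which a new scene starts (frame 0 excluded)."""
    boundaries = []
    for i in range(1, len(frames)):
        # A low matching degree between two adjacent frames marks a scene change.
        if match(frames[i - 1], frames[i]) < threshold:
            boundaries.append(i)
    return boundaries


def count_scenes(
    frames: Sequence[Frame],
    match: Callable[[Frame, Frame], float],
    threshold: float = 0.5,  # assumed value; the text does not specify one
) -> int:
    """Number of scenes = number of detected scene changes + 1."""
    if not frames:
        return 0
    return len(scene_boundaries(frames, match, threshold)) + 1
```

  • the boundary indices are returned separately because they also locate the frames around which a switching effect would be looked for.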
  • after determining the number of scenes included in the first multimedia data, the client further recognizes each scene. When the first multimedia data contains only one scene, the client performs recognition on the entire first multimedia data.
  • the recognition of the first multimedia data includes recognizing whether each scene of the first multimedia data includes a preset object, where the preset object may be a person or an animal.
  • the client recognizes the object information of the preset object, where the object information includes at least one of the following: expressions, actions, and special effects of the preset object.
  • when detecting that the first multimedia data contains a person, the client recognizes the person's facial expressions, actions, and beauty effects, and recognizes whether there are filters, text, special effects, etc. in the multimedia data corresponding to the scene.
  • when detecting that the first multimedia data does not contain people and only contains scenery, the client only recognizes whether there are filters, text, special effects, etc. in the multimedia data corresponding to the scene.
  • the user can directly use all the data identified above when making imitations.
  • the client obtains the scene information corresponding to each scene in the first multimedia data and the switching effects between the multiple scenes, then performs video shooting according to the scene information to obtain the third multimedia data corresponding to each scene, and sets the switching effects between the multiple third multimedia data according to the obtained switching effects to obtain the second multimedia data.
  • for example, suppose the first multimedia data includes two scenes: the first scene uses the "Autumn Fairy Tale" filter, the second scene uses italic text, and the switching effect between the first scene and the second scene is a black-screen flip. When the user uses the client to shoot the video of the first scene, the client turns on the "Autumn Fairy Tale" filter; during video shooting of the second scene, the client adds the text in italics; and the client sets the switching effect between these two scenes to a black-screen flip.
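  • the assembly of per-scene clips (the third multimedia data) into the final video can be sketched as interleaving clips with their switching effects. The function `assemble` and the tuple-based timeline below are hypothetical and only illustrate the ordering; they are not taken from the disclosure.

```python
from typing import Any, List, Sequence, Tuple


def assemble(
    clips: Sequence[Any],
    switching_effects: Sequence[str],
) -> List[Tuple[str, Any]]:
    """Interleave per-scene clips with the switching effect between each pair."""
    expected = max(len(clips) - 1, 0)
    if len(switching_effects) != expected:
        raise ValueError(
            f"{len(clips)} clips need {expected} switching effects, "
            f"got {len(switching_effects)}"
        )
    timeline: List[Tuple[str, Any]] = []
    for i, clip in enumerate(clips):
        timeline.append(("clip", clip))
        if i < len(switching_effects):
            # The effect at index i sits between clip i and clip i + 1.
            timeline.append(("transition", switching_effects[i]))
    return timeline
```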
  • in the case that the first multimedia data includes one scene, the client obtains the scene information corresponding to the first multimedia data and performs video shooting according to the scene information to obtain the second multimedia data.
  • when the first multimedia data includes multiple scenes, the client imitates each scene separately according to the multimedia dimension information corresponding to the first multimedia data; when the first multimedia data includes one scene, the client directly shoots a video.
  • after acquiring the multimedia dimension information corresponding to the first multimedia data as described above, the client can start imitation shooting.
  • during imitation shooting, the user can directly use the filters, effects, special scenes, beauty effects, actions, text, music, and other information of the source video (that is, the first multimedia data) identified above.
  • the user only needs to follow the recognized template and shoot the people or landscapes as in the source video.
  • the client's display interface tracks and displays the user's shooting situation in real time, and intelligently reminds the user how to control the camera.
  • the client detects the degree of matching between the second multimedia data and the corresponding scene; when the degree of matching is less than the preset degree of matching, it generates camera control information and then generates prompt information according to the camera control information, where the prompt information is used by the user to control the shooting device to perform video shooting according to the camera control information.
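  • the threshold check and prompt generation can be illustrated as below. The disclosure does not state how the matching degree is computed or what form the camera control information takes, so this sketch assumes the matching degree is supplied as a number and represents the control information as a single hypothetical rotation angle; `camera_prompt` and its parameters are illustrative names.

```python
from typing import Optional


def camera_prompt(
    match_degree: float,
    preset_degree: float,
    target_angle: float,   # assumed: camera angle parsed from the source scene
    current_angle: float,  # assumed: camera angle of the current shot
) -> Optional[str]:
    """Return prompt text when the shot matches the scene less than the preset degree."""
    if match_degree >= preset_degree:
        return None  # match is good enough, no prompt needed

    # Hypothetical camera control information: how far to rotate the camera.
    delta = target_angle - current_angle
    if delta == 0:
        return "Adjust the framing to match the source scene"
    direction = "right" if delta > 0 else "left"
    return f"Rotate the camera {abs(delta):.0f} degrees to the {direction}"
```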
  • the client may also receive a control instruction from the user, and the control instruction is used to instruct the client to enter the imitation mode or the co-production mode.
  • the user can browse the shooting effects through the display interface of the client, or edit the video of a certain scene separately to use other effects, etc. After the video editing is completed, save it to complete the imitation video.
  • FIG. 2 shows a flowchart of a method for processing multimedia data provided by this application.
  • the user turns on the camera of the mobile device to control the client to enter the video shooting mode.
  • if the user selects the conventional shooting mode, the client will receive the conventional shooting instruction and perform conventional video shooting; if the user selects the imitation shooting mode, the client will receive the imitation shooting instruction and perform imitation shooting.
  • the client prompts the user to add a video that needs to be imitated. After the video is added, the client parses the video to obtain the multimedia dimension information of the video, such as filters, special effects, transition effects, camera information, etc. Then the user starts to shoot one or several videos.
  • the client will process the shot video according to the relevant information of the source video and remind the user in real time how to control the position and angle of the camera; it can also prompt the user whether to perform imitation or co-production simultaneously.
  • the user can browse the shooting effect through the display interface of the client, or edit the video of a certain scene separately to use other effects, etc. After the video editing is completed, the user saves it to complete the imitation video.
  • the solution provided by this application uses AI technology to intelligently analyze the video that the user wants to imitate, including its filters, special effects, transitions, and camera control, guides the user to shoot after the analysis, and lets the user edit the video after shooting. This increases the user's interest in shooting video, helps the user keep up with current trends, and improves and enriches the user's video experience.
  • an embodiment of a device for processing multimedia data is also provided. It should be noted that the device can execute the method for processing multimedia data in Embodiment 1. FIG. 3 is a schematic diagram of a multimedia data processing device according to an embodiment of the present disclosure.
  • the acquisition module 301 is used to obtain the first multimedia data; the analysis module 303 is used to perform multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information; and the processing module 305 is used to perform video shooting according to the multimedia dimension information to obtain the second multimedia data.
  • the acquisition module 301, the analysis module 303, and the processing module 305 correspond to steps S102 to S106 of the foregoing embodiment. The three modules implement the same examples and application scenarios as the corresponding steps, but are not limited to the content disclosed in the above embodiment.
  • the analysis module includes: a first detection module and a first acquisition module.
  • the first detection module is used to detect the number of scenes that make up the first multimedia data;
  • the first acquisition module is used to obtain the switching effects between multiple scenes and the scene information corresponding to each scene when the number of scenes is detected to be more than one.
  • the analysis module includes: a second detection module and a second acquisition module.
  • the second detection module is used to detect the number of scenes constituting the first multimedia data;
  • the second acquisition module is used to obtain scene information corresponding to the scene when the number of scenes is detected as one.
  • the second detection module includes: a third detection module and a first determination module.
  • the third detection module is used to detect the scene corresponding to each frame in the first multimedia data;
  • the first determination module is used to determine the number of scenes that make up the first multimedia data according to the matching degree of the scenes corresponding to two adjacent frames.
  • the multimedia data processing apparatus further includes: an identification module.
  • the recognition module is used to recognize object information of a preset object, where the object information includes at least one of the following: the expression, action, and special effects of the preset object.
  • the processing module includes: a third acquisition module, a first processing module, and a second processing module.
  • the third acquisition module is used to acquire scene information corresponding to each scene in the first multimedia data and the switching effect between multiple scenes;
  • the first processing module is used to perform video shooting according to the scene information to obtain the third multimedia data corresponding to each scene;
  • the second processing module is used to set a switching effect between a plurality of third multimedia data according to the switching effect to obtain the second multimedia data.
  • the processing module includes: a fourth acquisition module and a third processing module.
  • the fourth obtaining module is used to obtain the scene information corresponding to the first multimedia data
  • the third processing module is used to perform video shooting according to the scene information to obtain the second multimedia data.
  • the multimedia data processing device further includes: a fourth detection module, a first generation module, and a second generation module.
  • the fourth detection module is used to detect the matching degree between the second multimedia data and the corresponding scene;
  • the first generation module is used to generate camera control information when the matching degree is less than the preset matching degree;
  • the second generation module is used to generate prompt information according to the camera control information, where the prompt information is used by the user to control the shooting device to perform video shooting according to the camera control information.
  • the multimedia data processing device further includes: a fifth detection module, a first control module, and a second control module.
  • the fifth detection module is used to detect the video shooting instruction
  • the first control module is used to control the shooting device to enter the imitation shooting mode when the video shooting instruction is detected as an imitation shooting instruction, wherein the imitation shooting mode is used for shooting according to existing multimedia data to obtain multimedia data with the same shooting effect as the existing multimedia data;
  • the second control module is used to control the shooting device to enter the conventional shooting mode when the video shooting instruction is detected as a conventional shooting instruction.
  • a storage medium includes a stored program, wherein the device where the storage medium is located is controlled to execute the multimedia data processing method in Embodiment 1 when the program runs.
  • a processor for running a program wherein the multimedia data processing method in Embodiment 1 is executed when the program is running.
  • the disclosed technical content can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the units may be a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • a storage medium includes a number of instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: USB flash drives, read-only memory (ROM), random access memory (RAM), removable hard disks, magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Television Signal Processing For Recording (AREA)
  • Studio Devices (AREA)

Abstract

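The present application discloses a multimedia data processing method and apparatus. The method includes: acquiring first multimedia data; performing multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information; and performing video shooting according to the multimedia dimension information to obtain second multimedia data. The present application solves the technical problem of poor imitation-shooting results caused by the inability to obtain information about the video being imitated.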

Description

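Multimedia data processing method and apparatus
Related Application
The present disclosure claims priority to Chinese patent application No. 201910324559.2, filed on April 22, 2019 and entitled "Multimedia data processing method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to the field of multimedia, and in particular to a multimedia data processing method and apparatus.
Background
With the deepening popularity of the mobile Internet, online videos, and short videos in particular, have become a widespread part of daily life and have had a profound influence on people's lives. In their spare time, people can watch short videos through third-party clients and can also imitate and shoot short videos through those clients. However, because information about the source video, such as the filters, special effects, and camera shots it uses, cannot be obtained, the video a person shoots while imitating differs considerably from the source video in shooting effect. As a result, people give up shooting, and the user experience is degraded.
No effective solution to the above problem has yet been proposed.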
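Summary
Embodiments of the present disclosure provide a multimedia data processing method and apparatus, to at least solve the technical problem, in methods known to the inventor, of poor imitation-shooting results caused by the inability to obtain information about the video being imitated.
According to one aspect of the embodiments of the present disclosure, a multimedia data processing method is provided, including: acquiring first multimedia data; performing multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information; and performing video shooting according to the multimedia dimension information to obtain second multimedia data.
In one embodiment, the multimedia data processing method further includes: detecting the number of scenes that make up the first multimedia data; and, when the number of scenes is detected to be more than one, acquiring the switching effects between the multiple scenes and the scene information corresponding to each scene.
In one embodiment, the multimedia data processing method further includes: detecting the number of scenes that make up the first multimedia data; and, when the number of scenes is detected to be one, acquiring the scene information corresponding to that scene.
In one embodiment, the multimedia data processing method further includes: detecting the scene corresponding to each frame in the first multimedia data; and determining the number of scenes that make up the first multimedia data according to the matching degree between the scenes corresponding to two adjacent frames.
In one embodiment, when it is detected that the scene objects include a preset object, the multimedia data processing method further includes: identifying object information of the preset object, where the object information includes at least one of the following: an expression, an action, and a special effect of the preset object.
In one embodiment, the multimedia data processing method further includes: acquiring the scene information corresponding to each scene in the first multimedia data and the switching effects between the multiple scenes; performing video shooting according to the scene information to obtain third multimedia data corresponding to each scene; and setting the switching effects between the multiple pieces of third multimedia data according to the acquired switching effects to obtain the second multimedia data.
In one embodiment, the multimedia data processing method further includes: acquiring the scene information corresponding to the first multimedia data; and performing video shooting according to the scene information to obtain the second multimedia data.
In one embodiment, in the process of performing video shooting according to the multimedia dimension information to obtain the second multimedia data, the multimedia data processing method further includes: detecting the matching degree between the second multimedia data and the corresponding scene; generating camera control information when the matching degree is less than a preset matching degree; and generating prompt information according to the camera control information, where the prompt information is used to prompt the user to control the shooting device to perform video shooting according to the camera control information.
In one embodiment, the multimedia data processing method further includes: detecting a video shooting instruction; when the video shooting instruction is detected to be an imitation shooting instruction, controlling the shooting device to enter an imitation shooting mode, where the imitation shooting mode is used for shooting according to existing multimedia data to obtain multimedia data with the same shooting effect as the existing multimedia data; and, when the video shooting instruction is detected to be a normal shooting instruction, controlling the shooting device to enter a normal shooting mode.
According to another aspect of the embodiments of the present disclosure, a multimedia data processing apparatus is further provided, including: an acquisition module, configured to acquire first multimedia data; an analysis module, configured to perform multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information; and a processing module, configured to perform video shooting according to the multimedia dimension information to obtain second multimedia data.
According to another aspect of the embodiments of the present disclosure, a storage medium is further provided. The storage medium includes a stored program, where, when the program runs, the device on which the storage medium is located is controlled to execute the multimedia data processing method.
According to another aspect of the embodiments of the present disclosure, a processor is further provided. The processor is configured to run a program, where the multimedia data processing method is executed when the program runs.
In the embodiments of the present disclosure, multimedia data is analyzed and video is shot according to the analyzed information: after the first multimedia data is obtained, multi-dimensional analysis is performed on it to obtain multimedia dimension information, and video shooting is then performed according to the multimedia dimension information to obtain the second multimedia data. It is easy to see that analyzing the first multimedia data yields information such as its filters, special effects, and transitions, so that the user shoots video with the same multimedia dimension information as the first multimedia data and obtains second multimedia data with the same effect as the first multimedia data. Because the second multimedia data is shot according to the information obtained by analyzing the first multimedia data, the second multimedia data has the same effect as the first multimedia data. The solution provided by the present application therefore achieves the purpose of imitation shooting of multimedia data, realizes the technical effect of generating a video with the same effect as the source multimedia data, improves the user's shooting experience, and solves the technical problem, in methods known to the inventor, of poor imitation-shooting results caused by the inability to obtain information about the video being imitated.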
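Brief Description of the Drawings
The drawings described here are provided for further understanding of the present disclosure and form a part of the present application. The exemplary embodiments of the present disclosure and their descriptions are used to explain the present disclosure and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of a multimedia data processing method according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of an optional multimedia data processing method according to an embodiment of the present disclosure; and
FIG. 3 is a schematic diagram of a multimedia data processing apparatus according to an embodiment of the present disclosure.
Detailed Description
To help those skilled in the art better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative effort shall fall within the scope of protection of the present disclosure.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way can be interchanged where appropriate, so that the embodiments of the present disclosure described here can be implemented in an order other than those illustrated or described here. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units that are explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.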
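Embodiment 1
According to an embodiment of the present disclosure, an embodiment of a multimedia data processing method is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.
FIG. 1 is a flowchart of the multimedia data processing method according to an embodiment of the present disclosure. As shown in FIG. 1, the method includes the following steps:
Step S102: acquire first multimedia data.
It should be noted that the first multimedia data is the multimedia data that the user wants to imitate. In one embodiment, the first multimedia data is video data.
Optionally, a mobile device acquires the first multimedia data. The mobile device is a device with multimedia data processing capability and may be, but is not limited to, an interactive device such as a smartphone or a tablet. Specifically, while the user watches a video, the mobile device caches the video (that is, the first multimedia data). When the user wants to imitate the video, the user inputs the video into a client installed on the mobile device, and the client acquires the multimedia data corresponding to the video. In addition, the client can obtain, through the Internet, the network address of the resource of the first multimedia data and acquire the first multimedia data from the Internet according to that address. In this case, the mobile device does not need to download or cache the multimedia data, which reduces the use of the mobile device's local storage.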
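Step S104: perform multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information.
It should be noted that the multimedia dimension information includes at least one of the following: the scene information of the scenes contained in the first multimedia data and, when the multimedia data includes multiple scenes, the switching effects between the scenes. The scene information includes at least one of the following: background music, scene objects (for example, people, animals, or landscapes), scene effects (for example, filters, text, or special effects), and camera information (for example, the position and angle of the camera).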
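The composition of the multimedia dimension information described above can be pictured as a small data model. The following is only an illustrative sketch: the class and field names (`CameraInfo`, `SceneInfo`, `MultimediaDimensionInfo`, `switching_effects`, and so on) are assumptions made for this example and do not appear in the application, and every field is optional because the text only requires "at least one of" the listed items.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CameraInfo:
    """Hypothetical container for camera position and angle."""
    position: Tuple[float, float, float]
    angle_deg: float


@dataclass
class SceneInfo:
    """Scene information; every field is optional ("at least one of")."""
    background_music: Optional[str] = None
    scene_objects: List[str] = field(default_factory=list)   # e.g. person, animal, landscape
    scene_effects: List[str] = field(default_factory=list)   # e.g. filter, text, special effect
    camera: Optional[CameraInfo] = None


@dataclass
class MultimediaDimensionInfo:
    """Scene information per scene plus the switching effects between scenes."""
    scenes: List[SceneInfo] = field(default_factory=list)
    switching_effects: List[str] = field(default_factory=list)

    @property
    def is_multi_scene(self) -> bool:
        # Switching effects only matter when there is more than one scene.
        return len(self.scenes) > 1


info = MultimediaDimensionInfo(
    scenes=[
        SceneInfo(scene_objects=["person"], scene_effects=["filter"]),
        SceneInfo(scene_objects=["landscape"], scene_effects=["text"]),
    ],
    switching_effects=["black-screen flip"],
)
```

Keeping the switching effects separate from the per-scene records mirrors the wording of the text, which treats them as a property of the multimedia data as a whole rather than of any single scene.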
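Optionally, the client uses AI (Artificial Intelligence) video analysis technology to analyze the first multimedia data, mainly performing multi-dimensional analysis of the speech, text, faces, objects, and scenes of the first multimedia data.
Step S106: perform video shooting according to the multimedia dimension information to obtain second multimedia data.
Optionally, after the multimedia dimension information of the first multimedia data is obtained, the client performs video shooting according to the multimedia dimension information. For example, if the client determines by analyzing the first multimedia data that the first multimedia data uses the "Autumn Fairy Tale" filter, the client uses the "Autumn Fairy Tale" filter during video shooting to obtain the second multimedia data. The first multimedia data is the video being imitated, and the second multimedia data is the video shot in imitation of the first multimedia data.
From the solution defined by steps S102 to S106, it can be seen that multimedia data is analyzed and video is shot according to the analyzed information: after the first multimedia data is obtained, multi-dimensional analysis is performed on it to obtain multimedia dimension information, and video shooting is then performed according to the multimedia dimension information to obtain the second multimedia data.
It is easy to see that analyzing the first multimedia data yields information such as its filters, special effects, and transitions, so that the user shoots video with the same multimedia dimension information as the first multimedia data and obtains second multimedia data with the same effect as the first multimedia data. Because the second multimedia data is shot according to the information obtained by analyzing the first multimedia data, the second multimedia data has the same effect as the first multimedia data.
The solution provided by the present application therefore achieves the purpose of imitation shooting of multimedia data, realizes the technical effect of generating a video with the same effect as the source multimedia data, improves the user's shooting experience, and solves the technical problem, in methods known to the inventor, of poor imitation-shooting results caused by the inability to obtain information about the video being imitated.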
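In an optional embodiment, before acquiring the first multimedia data, the client also detects a video shooting instruction. When the video shooting instruction is detected to be an imitation shooting instruction, the shooting device is controlled to enter the imitation shooting mode; when the video shooting instruction is detected to be a normal shooting instruction, the shooting device is controlled to enter the normal shooting mode.
It should be noted that the imitation shooting mode is used for shooting according to existing multimedia data to obtain multimedia data with the same shooting effect as the existing multimedia data.
Optionally, before shooting video, the user can select the shooting mode through the client. For example, the user selects the desired shooting mode on the client by voice control or a touch operation. If the user selects the imitation shooting mode, the client receives an imitation shooting instruction. After receiving the imitation shooting instruction, the client acquires the multimedia data to be imitated (that is, the first multimedia data) and analyzes the first multimedia data.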
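The mode selection just described amounts to a two-way dispatch on the detected instruction. A minimal sketch follows; the instruction strings `"imitation"` and `"normal"` and the names `ShootingMode` and `mode_for_instruction` are hypothetical, chosen only for illustration, since the application does not specify how instructions are encoded.

```python
from enum import Enum


class ShootingMode(Enum):
    IMITATION = "imitation"  # shoot according to existing multimedia data
    NORMAL = "normal"        # ordinary video shooting


def mode_for_instruction(instruction: str) -> ShootingMode:
    """Map a detected video shooting instruction to a shooting mode.

    The instruction encoding is an assumption made for this sketch.
    """
    if instruction == "imitation":
        return ShootingMode.IMITATION
    if instruction == "normal":
        return ShootingMode.NORMAL
    raise ValueError(f"unknown video shooting instruction: {instruction!r}")
```

Only the imitation branch would go on to acquire and analyze the first multimedia data; the normal branch proceeds directly to shooting.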
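In one embodiment, the first multimedia data may include multiple scenes, for example, a first scene in a park and a second scene at home. The multimedia dimension information may differ depending on the number of scenes, so, in the process of multi-dimensional analysis of the first multimedia data, the client needs to detect the number of scenes that the first multimedia data includes.
Specifically, the client detects the number of scenes that make up the first multimedia data. When the number of scenes is detected to be more than one, the client acquires the switching effects between the multiple scenes and the scene information corresponding to each scene; when the number of scenes is detected to be one, the client acquires the scene information corresponding to that scene.
It should be noted that, when there are multiple scenes, scene switching is required between them, and different switching effects change the visual effect of the final video. Therefore, when there are multiple scenes, the switching effects between the scenes must be acquired in addition to the scene information. Optionally, the switching effects between multiple scenes include, but are not limited to, a black-screen flip between scenes and an interval of a preset duration with no scene when switching between two scenes.
In an optional embodiment, the client detects the scene corresponding to each frame in the first multimedia data and then determines the number of scenes that make up the first multimedia data according to the matching degree between the scenes corresponding to two adjacent frames. For example, the client detects that the scene corresponding to a first video frame is a first scene and that the scene corresponding to a second video frame is a second scene, where the first and second video frames are adjacent and the first and second scenes are two different scenes. The client then determines that the first multimedia data includes two scenes and acquires the switching effect used when switching between these two scenes.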
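The adjacent-frame comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the application's algorithm: the per-frame scene detector is assumed to have already produced one descriptor per frame, the `match` function and the `threshold` value are hypothetical, and the toy matcher simply compares scene labels. The sketch counts contiguous scene segments, so a video that returns to an earlier location is counted as a new scene.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Scene = TypeVar("Scene")


def count_scenes(
    frame_scenes: Sequence[Scene],
    match: Callable[[Scene, Scene], float],
    threshold: float = 0.5,
) -> Tuple[int, List[int]]:
    """Count scenes from per-frame scene descriptors.

    A scene boundary is placed before frame i whenever the matching degree
    between the scenes of frames i-1 and i falls below `threshold`.
    Returns the scene count and the indices where new scenes start.
    """
    if not frame_scenes:
        return 0, []
    boundaries = []
    for i in range(1, len(frame_scenes)):
        if match(frame_scenes[i - 1], frame_scenes[i]) < threshold:
            boundaries.append(i)
    return len(boundaries) + 1, boundaries


def label_match(a: str, b: str) -> float:
    """Toy matching degree: 1.0 for identical scene labels, otherwise 0.0."""
    return 1.0 if a == b else 0.0


# Two frames in a park followed by two frames at home.
scene_count, starts = count_scenes(["park", "park", "home", "home"], label_match)
```

The boundary indices are returned alongside the count because they mark where a switching effect would have to be looked for in the source video.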
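After determining the number of scenes contained in the first multimedia data, the client further identifies each scene; when the first multimedia data contains only one scene, the client identifies the first multimedia data as a whole. Identifying the first multimedia data includes identifying whether each scene of the first multimedia data contains a preset object, where the preset object can be a person or an animal. When it is detected that the scene objects include a preset object, the client identifies the object information of the preset object, where the object information includes at least one of the following: an expression, an action, and a special effect of the preset object.
Optionally, when it is detected that the first multimedia data contains a person, the client identifies the person's expression, actions, and beautification effect, and identifies whether the multimedia data corresponding to the scene contains filters, text, special effects, and the like. When it is detected that the first multimedia data contains no person and only a landscape, the client only identifies whether the multimedia data corresponding to the scene contains filters, text, special effects, and the like. The user can directly use all of the identified data during imitation shooting.
In an optional embodiment, when the first multimedia data includes multiple scenes, the client acquires the scene information corresponding to each scene in the first multimedia data and the switching effects between the scenes, performs video shooting according to the scene information to obtain third multimedia data corresponding to each scene, and sets the switching effects between the multiple pieces of third multimedia data according to the acquired switching effects to obtain the second multimedia data. For example, suppose the first multimedia data includes two scenes, the first scene uses the "Autumn Fairy Tale" filter, the second scene uses text in regular script, and the switching effect between the two scenes is a black-screen flip. Then, while the user shoots the first scene with the client, the client turns on the "Autumn Fairy Tale" filter; while the user shoots the second scene, the client annotates the video with regular-script text; and the client sets the switching effect between the two scenes to a black-screen flip.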
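The multi-scene case can be sketched as "shoot one clip per scene, then interleave the clips with the acquired switching effects". The code below is a simplified illustration: `shoot_scene` is a stand-in that only records which effects would be applied (no real camera API is used), and the names `ScenePlan`, `shoot_scene`, and `assemble` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ScenePlan:
    """Effects recovered from one scene of the source video."""
    effects: List[str]


def shoot_scene(index: int, plan: ScenePlan) -> Dict[str, object]:
    # Placeholder for real capture: returns a record of the clip
    # (the "third multimedia data") and the effects applied to it.
    return {"scene": index, "effects": list(plan.effects)}


def assemble(clips: List[dict], switching_effects: List[str]) -> List[Tuple[str, object]]:
    """Interleave per-scene clips with switching effects into one timeline."""
    if len(switching_effects) != max(len(clips) - 1, 0):
        raise ValueError("need exactly one switching effect between adjacent clips")
    timeline: List[Tuple[str, object]] = []
    for i, clip in enumerate(clips):
        timeline.append(("clip", clip))
        if i < len(switching_effects):
            timeline.append(("switch", switching_effects[i]))
    return timeline


# The two-scene example from the text.
plans = [ScenePlan(["filter: Autumn Fairy Tale"]), ScenePlan(["text: regular script"])]
clips = [shoot_scene(i, p) for i, p in enumerate(plans)]
second_multimedia_data = assemble(clips, ["black-screen flip"])
```

With a single scene the same `assemble` call degenerates to a timeline holding one clip and no switching effects, which corresponds to the single-scene embodiment.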
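In another optional embodiment, when the first multimedia data includes one scene, the client acquires the scene information corresponding to the first multimedia data and performs video shooting according to the scene information to obtain the second multimedia data.
It should be noted that, when the first multimedia data includes multiple scenes, the client performs imitation shooting of each scene separately according to the multimedia dimension information corresponding to the first multimedia data; when the first multimedia data includes one scene, the client directly shoots one video segment.
In one embodiment, after the multimedia dimension information corresponding to the first multimedia data is obtained as described above, the client can start imitation shooting. During shooting, the filters, effects, transitions, beautification, actions, text, music, and other information identified from the source video (that is, the first multimedia data) can be used directly; the user only needs to imitate the people or landscape of the source video on the identified template. To achieve a better shooting effect, during video shooting the display interface of the client tracks and displays the user's shooting in real time and intelligently reminds the user how to control the camera.
Specifically, in the process of performing video shooting according to the multimedia dimension information to obtain the second multimedia data, the client detects the matching degree between the second multimedia data and the corresponding scene, generates camera control information when the matching degree is less than a preset matching degree, and generates prompt information according to the camera control information, where the prompt information is used to prompt the user to control the shooting device to perform video shooting according to the camera control information.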
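The check-then-prompt loop described above can be sketched as follows. Everything numeric here is an assumption: the application does not define how the matching degree is computed, so this example uses a toy metric based only on camera pose, a hypothetical preset matching degree of 0.8, and invented names (`CameraPose`, `matching_degree`, `camera_control_info`, `prompt_for`).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CameraPose:
    x: float
    y: float
    angle_deg: float


def matching_degree(current: CameraPose, target: CameraPose) -> float:
    """Toy metric in (0, 1]: 1.0 when the poses coincide, lower as they diverge."""
    distance = (
        abs(target.x - current.x)
        + abs(target.y - current.y)
        + abs(target.angle_deg - current.angle_deg) / 90.0
    )
    return 1.0 / (1.0 + distance)


def camera_control_info(current: CameraPose, target: CameraPose) -> Dict[str, float]:
    """Adjustment that would bring the current pose to the target pose."""
    return {
        "dx": target.x - current.x,
        "dy": target.y - current.y,
        "rotate_deg": target.angle_deg - current.angle_deg,
    }


def prompt_for(current: CameraPose, target: CameraPose, preset: float = 0.8) -> Optional[str]:
    """Return prompt text when the matching degree is below the preset value."""
    if matching_degree(current, target) >= preset:
        return None  # close enough, no prompt needed
    info = camera_control_info(current, target)
    return (
        f"Move the camera by ({info['dx']:+.1f}, {info['dy']:+.1f}) "
        f"and rotate it by {info['rotate_deg']:+.1f} degrees"
    )


source_pose = CameraPose(x=1.0, y=0.0, angle_deg=10.0)
user_pose = CameraPose(x=0.0, y=0.0, angle_deg=0.0)
prompt = prompt_for(user_pose, source_pose)
```

In a real client the prompt would be regenerated continuously while recording; a practical matching degree would likely compare image content as well as pose.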
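It should be noted that, while the user is shooting video according to the multimedia dimension information, the client can also receive a control instruction from the user, which instructs the client to enter an imitation shooting mode or a co-shooting mode. In addition, after shooting is complete, the user can review the shooting effect through the display interface of the client and can also edit the video of an individual scene, for example to apply other effects. The imitation video is finished when it is saved after editing.
In an optional embodiment, FIG. 2 shows a flowchart of the multimedia data processing method provided by the present application. Specifically, the user opens the camera on the mobile device and controls the client to enter the video shooting mode. If the user selects the normal shooting mode, the client receives a normal shooting instruction and performs normal video shooting; if the user selects the imitation shooting mode, the client receives an imitation shooting instruction and performs imitation shooting. In the imitation shooting mode, the client prompts the user to add the video to be imitated. After the video is added, the client analyzes it to obtain its multimedia dimension information, for example, filters, special effects, transition effects, and camera information. The user then starts shooting one or more video segments. While the user is shooting, the client processes the video being shot according to the relevant information of the source video, reminds the user in real time how to control the position and angle of the camera, and can also ask the user whether to perform imitation shooting or co-shooting synchronously. After shooting the one or more segments, the user can review the shooting effect through the display interface of the client and can also edit the video of an individual scene, for example to apply other effects. The imitation video is finished when it is saved after editing.
As can be seen from the above, the solution provided by the present application uses AI technology to analyze the video that the user wants to imitate, including its filters, special effects, transitions, and camera control, guides the user during shooting after the analysis, and allows the video to be edited after shooting. This increases the user's interest in shooting video, brings the user closer to current trends, improves the user's video experience, and enriches the user's impressions.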
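Embodiment 2
According to an embodiment of the present disclosure, an embodiment of a multimedia data processing apparatus is further provided. It should be noted that the apparatus can execute the multimedia data processing method of Embodiment 1. FIG. 3 is a schematic diagram of the multimedia data processing apparatus according to an embodiment of the present disclosure. As shown in FIG. 3, the apparatus includes: an acquisition module 301, an analysis module 303, and a processing module 305.
The acquisition module 301 is configured to acquire first multimedia data; the analysis module 303 is configured to perform multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information; and the processing module 305 is configured to perform video shooting according to the multimedia dimension information to obtain second multimedia data.
It should be noted here that the acquisition module 301, the analysis module 303, and the processing module 305 correspond to steps S102 to S106 of the above embodiment. The three modules share the examples and application scenarios of the corresponding steps but are not limited to what is disclosed in the above embodiment.
In an optional solution, the analysis module includes: a first detection module and a first acquisition module. The first detection module is configured to detect the number of scenes that make up the first multimedia data; the first acquisition module is configured to acquire, when the number of scenes is detected to be more than one, the switching effects between the multiple scenes and the scene information corresponding to each scene.
In an optional solution, the analysis module includes: a second detection module and a second acquisition module. The second detection module is configured to detect the number of scenes that make up the first multimedia data; the second acquisition module is configured to acquire, when the number of scenes is detected to be one, the scene information corresponding to that scene.
In an optional solution, the second detection module includes: a third detection module and a first determination module. The third detection module is configured to detect the scene corresponding to each frame in the first multimedia data; the first determination module is configured to determine the number of scenes that make up the first multimedia data according to the matching degree between the scenes corresponding to two adjacent frames.
In an optional solution, when it is detected that the scene objects include a preset object, the multimedia data processing apparatus further includes an identification module. The identification module is configured to identify object information of the preset object, where the object information includes at least one of the following: an expression, an action, and a special effect of the preset object.
In an optional solution, the processing module includes: a third acquisition module, a first processing module, and a second processing module. The third acquisition module is configured to acquire the scene information corresponding to each scene in the first multimedia data and the switching effects between the multiple scenes; the first processing module is configured to perform video shooting according to the scene information to obtain third multimedia data corresponding to each scene; and the second processing module is configured to set the switching effects between the multiple pieces of third multimedia data according to the acquired switching effects to obtain the second multimedia data.
In an optional solution, the processing module includes: a fourth acquisition module and a third processing module. The fourth acquisition module is configured to acquire the scene information corresponding to the first multimedia data; the third processing module is configured to perform video shooting according to the scene information to obtain the second multimedia data.
In an optional solution, in the process of performing video shooting according to the multimedia dimension information to obtain the second multimedia data, the multimedia data processing apparatus further includes: a fourth detection module, a first generation module, and a second generation module. The fourth detection module is configured to detect the matching degree between the second multimedia data and the corresponding scene; the first generation module is configured to generate camera control information when the matching degree is less than a preset matching degree; and the second generation module is configured to generate prompt information according to the camera control information, where the prompt information is used to prompt the user to control the shooting device to perform video shooting according to the camera control information.
In an optional solution, the multimedia data processing apparatus further includes: a fifth detection module, a first control module, and a second control module. The fifth detection module is configured to detect a video shooting instruction; the first control module is configured to control the shooting device to enter the imitation shooting mode when the video shooting instruction is detected to be an imitation shooting instruction, where the imitation shooting mode is used for shooting according to existing multimedia data to obtain multimedia data with the same shooting effect as the existing multimedia data; and the second control module is configured to control the shooting device to enter the normal shooting mode when the video shooting instruction is detected to be a normal shooting instruction.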
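Embodiment 3
According to another aspect of the embodiments of the present disclosure, a storage medium is further provided. The storage medium includes a stored program, where, when the program runs, the device on which the storage medium is located is controlled to execute the multimedia data processing method of Embodiment 1.
Embodiment 4
According to another aspect of the embodiments of the present disclosure, a processor is further provided. The processor is configured to run a program, where the multimedia data processing method of Embodiment 1 is executed when the program runs.
The sequence numbers of the above embodiments of the present disclosure are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present disclosure, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of the units may be a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection of units or modules through some interfaces, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes over the technology known to the inventor, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage media include: USB flash drives, read-only memory (ROM), random access memory (RAM), removable hard disks, magnetic disks, optical disks, and other media that can store program code.
The above are only preferred implementations of the present disclosure. It should be pointed out that a person of ordinary skill in the art can make several improvements and refinements without departing from the principles of the present disclosure, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present disclosure.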

Claims (15)

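  1. A multimedia data processing method, characterized by comprising:
    acquiring first multimedia data;
    performing multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information;
    performing video shooting according to the multimedia dimension information to obtain second multimedia data.
  2. The processing method according to claim 1, characterized in that performing multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information comprises:
    detecting the number of scenes that make up the first multimedia data;
    when the number of scenes is detected to be more than one, acquiring the switching effects between the multiple scenes and the scene information corresponding to each scene.
  3. The processing method according to claim 1, characterized in that performing multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information comprises:
    detecting the number of scenes that make up the first multimedia data;
    when the number of scenes is detected to be one, acquiring the scene information corresponding to that scene.
  4. The processing method according to claim 2, characterized in that detecting the number of scenes that make up the first multimedia data comprises:
    detecting the scene corresponding to each frame in the first multimedia data;
    determining the number of scenes that make up the first multimedia data according to the matching degree between the scenes corresponding to two adjacent frames.
  5. The processing method according to claim 3, characterized in that detecting the number of scenes that make up the first multimedia data comprises:
    detecting the scene corresponding to each frame in the first multimedia data;
    determining the number of scenes that make up the first multimedia data according to the matching degree between the scenes corresponding to two adjacent frames.
  6. The processing method according to claim 2, characterized in that, when it is detected that the scene objects include a preset object, the method further comprises:
    identifying object information of the preset object, wherein the object information includes at least one of the following: an expression, an action, and a special effect of the preset object.
  7. The processing method according to claim 3, characterized in that, when it is detected that the scene objects include a preset object, the method further comprises:
    identifying object information of the preset object, wherein the object information includes at least one of the following: an expression, an action, and a special effect of the preset object.
  8. The processing method according to claim 2, characterized in that performing video shooting according to the multimedia dimension information to obtain second multimedia data comprises:
    acquiring the scene information corresponding to each scene in the first multimedia data and the switching effects between the multiple scenes;
    performing video shooting according to the scene information to obtain third multimedia data corresponding to each scene;
    setting the switching effects between the multiple pieces of third multimedia data according to the switching effects to obtain the second multimedia data.
  9. The processing method according to claim 3, characterized in that performing video shooting according to the multimedia dimension information to obtain second multimedia data comprises:
    acquiring the scene information corresponding to the first multimedia data;
    performing video shooting according to the scene information to obtain the second multimedia data.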
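  10. The processing method according to claim 8, characterized in that, in the process of performing video shooting according to the multimedia dimension information to obtain second multimedia data, the method further comprises:
    detecting the matching degree between the second multimedia data and the corresponding scene;
    generating camera control information when the matching degree is less than a preset matching degree;
    generating prompt information according to the camera control information, wherein the prompt information is used to prompt the user to control the shooting device to perform video shooting according to the camera control information.
  11. The processing method according to claim 9, characterized in that, in the process of performing video shooting according to the multimedia dimension information to obtain second multimedia data, the method further comprises:
    detecting the matching degree between the second multimedia data and the corresponding scene;
    generating camera control information when the matching degree is less than a preset matching degree;
    generating prompt information according to the camera control information, wherein the prompt information is used to prompt the user to control the shooting device to perform video shooting according to the camera control information.
  12. The method according to claim 1, characterized in that, before acquiring the first multimedia data, the method further comprises:
    detecting a video shooting instruction;
    when the video shooting instruction is detected to be an imitation shooting instruction, controlling the shooting device to enter an imitation shooting mode, wherein the imitation shooting mode is used for shooting according to existing multimedia data to obtain multimedia data with the same shooting effect as the existing multimedia data;
    when the video shooting instruction is detected to be a normal shooting instruction, controlling the shooting device to enter a normal shooting mode.
  13. A multimedia data processing apparatus, characterized by comprising:
    an acquisition module, configured to acquire first multimedia data;
    an analysis module, configured to perform multi-dimensional analysis on the first multimedia data to obtain multimedia dimension information;
    a processing module, configured to perform video shooting according to the multimedia dimension information to obtain second multimedia data.
  14. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the multimedia data processing method according to any one of claims 1 to 12.
  15. A processor, characterized in that the processor is configured to run a program, wherein the multimedia data processing method according to any one of claims 1 to 12 is executed when the program runs.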
PCT/CN2019/128161 2019-04-22 2019-12-25 多媒体数据的处理方法以及装置 WO2020215776A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/605,460 US11800217B2 (en) 2019-04-22 2019-12-25 Multimedia data processing method and apparatus
EP19926007.6A EP3941075A4 (en) 2019-04-22 2019-12-25 MULTIMEDIA DATA PROCESSING METHOD AND APPARATUS

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910324559.2A CN110062163B (zh) 2019-04-22 2019-04-22 多媒体数据的处理方法以及装置
CN201910324559.2 2019-04-22

Publications (1)

Publication Number Publication Date
WO2020215776A1 true WO2020215776A1 (zh) 2020-10-29

Family

ID=67320130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/128161 WO2020215776A1 (zh) 2019-04-22 2019-12-25 多媒体数据的处理方法以及装置

Country Status (4)

Country Link
US (1) US11800217B2 (zh)
EP (1) EP3941075A4 (zh)
CN (1) CN110062163B (zh)
WO (1) WO2020215776A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112423112A (zh) * 2020-11-16 2021-02-26 北京意匠文枢科技有限公司 一种发布视频信息的方法与设备

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110062163B (zh) 2019-04-22 2020-10-20 珠海格力电器股份有限公司 多媒体数据的处理方法以及装置
CN110855893A (zh) * 2019-11-28 2020-02-28 维沃移动通信有限公司 一种视频拍摄的方法及电子设备
CN113935388A (zh) * 2020-06-29 2022-01-14 北京达佳互联信息技术有限公司 匹配模型训练方法、装置、电子设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007213668A (ja) * 2006-02-08 2007-08-23 Matsushita Electric Ind Co Ltd ダビング装置、ダビング方法、ダビングプログラム、コンピュータ読み取り可能な記録媒体、及び録画再生装置
CN105893412A (zh) * 2015-11-24 2016-08-24 乐视致新电子科技(天津)有限公司 图像分享方法及装置
CN108566519A (zh) * 2018-04-28 2018-09-21 腾讯科技(深圳)有限公司 视频制作方法、装置、终端和存储介质
CN110062163A (zh) * 2019-04-22 2019-07-26 珠海格力电器股份有限公司 多媒体数据的处理方法以及装置

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001011920A (ja) * 1999-06-30 2001-01-16 Kazutomi Ito 水洗便所の節水装置
US6970199B2 (en) * 2001-10-05 2005-11-29 Eastman Kodak Company Digital camera using exposure information acquired from a scene
JP2003263095A (ja) * 2002-01-04 2003-09-19 Masanobu Kujirada 歴史又は地理の学習のためのシステム、方法、及びプログラム
JP4792985B2 (ja) * 2006-01-18 2011-10-12 カシオ計算機株式会社 カメラ装置、撮影条件設定方法、及び、プログラム
CN103530788A (zh) * 2012-07-02 2014-01-22 纬创资通股份有限公司 多媒体评价系统、多媒体评价装置以及多媒体评价方法
CN105898133A (zh) * 2015-08-19 2016-08-24 乐视网信息技术(北京)股份有限公司 一种视频拍摄方法及装置
CN105681891A (zh) * 2016-01-28 2016-06-15 杭州秀娱科技有限公司 移动端为用户视频嵌套场景的方法
CN106060655B (zh) * 2016-08-04 2021-04-06 腾讯科技(深圳)有限公司 一种视频处理方法、服务器及终端
CN106657810A (zh) * 2016-09-26 2017-05-10 维沃移动通信有限公司 一种视频图像的滤镜处理方法和装置
CN107657228B (zh) * 2017-09-25 2020-08-04 中国传媒大学 视频场景相似性分析方法及系统、视频编解码方法及系统
CN108024071B (zh) * 2017-11-24 2022-03-08 腾讯数码(天津)有限公司 视频内容生成方法、视频内容生成装置及存储介质
CN108109161B (zh) * 2017-12-19 2021-05-11 北京奇虎科技有限公司 基于自适应阈值分割的视频数据实时处理方法及装置
CN108566191B (zh) * 2018-04-03 2022-05-27 擎先电子科技(苏州)有限公司 一种通用管脚复用电路
CN108600825B (zh) * 2018-07-12 2019-10-25 北京微播视界科技有限公司 选择背景音乐拍摄视频的方法、装置、终端设备和介质
CN109145840B (zh) * 2018-08-29 2022-06-24 北京字节跳动网络技术有限公司 视频场景分类方法、装置、设备及存储介质
CN109145882A (zh) * 2018-10-10 2019-01-04 惠州学院 一种多模态视频目标检测系统
CN109379623A (zh) * 2018-11-08 2019-02-22 北京微播视界科技有限公司 视频内容生成方法、装置、计算机设备和存储介质
CN109547694A (zh) 2018-11-29 2019-03-29 维沃移动通信有限公司 一种图像显示方法及终端设备
CN113727025B (zh) * 2021-08-31 2023-04-14 荣耀终端有限公司 一种拍摄方法、设备和存储介质

Also Published As

Publication number Publication date
US11800217B2 (en) 2023-10-24
CN110062163B (zh) 2020-10-20
EP3941075A1 (en) 2022-01-19
EP3941075A4 (en) 2022-05-18
CN110062163A (zh) 2019-07-26
US20220217266A1 (en) 2022-07-07

Similar Documents

Publication Publication Date Title
WO2020215776A1 (zh) 多媒体数据的处理方法以及装置
TWI751161B (zh) 終端設備、智慧型手機、基於臉部識別的認證方法和系統
WO2022001593A1 (zh) 视频生成方法、装置、存储介质及计算机设备
US11321385B2 (en) Visualization of image themes based on image content
CN107862315B (zh) 字幕提取方法、视频搜索方法、字幕分享方法及装置
WO2017177768A1 (zh) 一种信息处理方法及终端、计算机存储介质
CN107786549B (zh) 音频文件的添加方法、装置、系统及计算机可读介质
US20220312048A1 (en) Video editing method, terminal and readable storage medium
CN105335465B (zh) 一种展示主播账户的方法和装置
CN106127167B (zh) 一种增强现实中目标对象的识别方法、装置及移动终端
WO2022227393A1 (zh) 图像拍摄方法及装置、电子设备和计算机可读存储介质
CN106200918B (zh) 一种基于ar的信息显示方法、装置和移动终端
US11778263B2 (en) Live streaming video interaction method and apparatus, and computer device
CN111643900B (zh) 一种展示画面控制方法、装置、电子设备和存储介质
CN109064387A (zh) 图像特效生成方法、装置和电子设备
KR100886489B1 (ko) 영상 통화 시 얼굴의 표정에 따라 꾸미기 효과를 합성하는방법 및 시스템
CN110557678A (zh) 视频处理方法、装置及设备
CN105847735A (zh) 一种基于人脸识别的即时弹幕视频通信方法及系统
US11076091B1 (en) Image capturing assistant
CN103747177A (zh) 视频拍摄的处理方法及装置
TW202141446A (zh) 一種多媒體互動方法、設備及電腦可讀儲存介質
CN109286848B (zh) 一种终端视频信息的交互方法、装置及存储介质
US11889127B2 (en) Live video interaction method and apparatus, and computer device
CN106162222B (zh) 一种视频镜头切分的方法及装置
CN114051116A (zh) 一种驾考车辆的视频监控方法、装置以及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19926007

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019926007

Country of ref document: EP

Effective date: 20211011

NENP Non-entry into the national phase

Ref country code: DE