WO2018133321A1 - Method and apparatus for generating lens information - Google Patents

Method and apparatus for generating lens information

Info

Publication number
WO2018133321A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
lens
identifier
frame
frame picture
Prior art date
Application number
PCT/CN2017/089313
Other languages
English (en)
French (fr)
Inventor
宋磊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP17892150.8A (published as EP3565243A4)
Priority to US16/479,762 (published as US20190364196A1)
Priority to CN201780082709.2A (published as CN110169055B)
Publication of WO2018133321A1

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/62Control of parameters via user interfaces
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/667Camera operation mode switching, e.g. between still and video, sport and normal or high- and low-resolution modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80Camera processing pipelines; Components thereof
    • H04N23/84Camera processing pipelines; Components thereof for processing colour signals
    • H04N23/88Camera processing pipelines; Components thereof for processing colour signals for colour balance, e.g. white-balance circuits or colour temperature control
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2621Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability

Definitions

  • the present application relates to the field of data processing technologies, and in particular, to a method and apparatus for generating lens information.
  • the technical problem to be solved by the present application is to provide a method and apparatus for generating lens information, so as to provide relevant lens information for shot segments in a video source, so that the lens information can be used to search for shot segments, thereby enabling the user to complete video editing more conveniently.
  • a method of generating lens information including:
  • the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
  • the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
  • in a case where a plurality of objects are identified in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
  • it also includes:
  • if the target object corresponding to the target frame picture is not recognized, mark the target frame picture as a frame picture without a target object;
  • the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
  • the lens information of the second shot segment includes: an identifier indicating that there is no target object, and a position identifier of the second shot segment in the video source.
  • the location identifier of the first shot segment in the video source includes: an identifier of a start frame position of the first shot segment, and an identifier of an end frame position of the first shot segment.
  • the target lens category is a fixed field lens
  • the target lens category is a panoramic lens
  • the target lens category is a medium shot lens
  • the target lens category is a close-range lens
  • the target lens category is a close-up lens
  • the target lens category is a large close-up lens
  • first ratio range is smaller than the second ratio range
  • second ratio range is smaller than the third ratio range
  • third ratio range is smaller than the fourth ratio range
  • fourth ratio range is smaller than the fifth ratio range
  • fifth ratio range is smaller than the sixth ratio range.
  • it also includes:
  • receive a query instruction of a shot segment, where the query instruction carries a query identifier, and the query identifier includes an identifier of the target object and/or an identifier of the target shot category;
  • search for lens information having the query identifier, to obtain the lens information of the first shot segment;
  • the first shot segment is fed back according to the position identifier of the first shot segment in the video source in the lens information of the first shot segment.
  • an apparatus for generating lens information including:
  • An identification unit configured to perform object recognition on the target frame screen
  • a determining unit configured to determine, according to a size ratio occupied by the target object in the target frame picture, a target lens type corresponding to the target frame picture, if the target object corresponding to the target frame picture is identified;
  • a first generating unit configured to generate lens information of the first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in the video source;
  • the first shot segment is composed of a first group of frame pictures including the target frame picture, and the first group frame picture includes a plurality of consecutive frame pictures in the video source, the first group frame
  • the screens each correspond to the target object and the target lens category;
  • the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
  • in a case where a plurality of objects are identified in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
  • it also includes:
  • a marking unit configured to mark the target frame picture as a frame picture without a target object if the target object corresponding to the target frame picture is not recognized;
  • a second generating unit configured to generate lens information of the second shot segment according to the frame image of the untargeted object
  • the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
  • the lens information of the second shot segment includes: an identifier for indicating a targetless object, and a position identifier of the second shot segment in the video source.
  • the location identifier of the first shot segment in the video source includes: an identifier of a start frame position of the first shot segment, and an identifier of an end frame position of the first shot segment.
  • the target lens category is a fixed field lens
  • the target lens category is a panoramic lens
  • the target lens category is a medium shot lens
  • the target lens category is a close-range lens
  • the target lens category is a close-up lens
  • the target lens category is a large close-up lens
  • first ratio range is smaller than the second ratio range
  • second ratio range is smaller than the third ratio range
  • third ratio range is smaller than the fourth ratio range
  • fourth ratio range is smaller than the fifth ratio range
  • fifth ratio range is smaller than the sixth ratio range.
  • it also includes:
  • a receiving unit configured to receive a query instruction of a shot segment, where the query instruction carries a query identifier, where the query identifier includes an identifier of the target object and/or an identifier of the first target lens category;
  • a searching unit configured to search for lens information having the query identifier, to obtain lens information of the first shot segment
  • a feedback unit configured to feed back the first shot segment according to a position identifier of the first shot segment in the video source in the lens information of the first shot segment.
  • an electronic device including a processor and a memory coupled to the processor;
  • the memory for storing program instructions and data
  • the processor is configured to read instructions and data stored in the memory, and perform the following operations:
  • the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
  • the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
  • in a case where a plurality of objects are identified in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
  • the processor is further configured to:
  • if the target object corresponding to the target frame picture is not recognized, mark the target frame picture as a frame picture without a target object;
  • the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
  • the lens information of the second shot segment includes: an identifier indicating that there is no target object, and a position identifier of the second shot segment in the video source.
  • the location identifier of the first shot segment in the video source includes: an identifier of a start frame position of the first shot segment, and an identifier of an end frame position of the first shot segment.
  • the target lens category is a fixed field lens
  • the target lens category is a panoramic lens
  • the target lens category is a medium shot lens
  • the target lens category is a close-range lens
  • the target lens category is a close-up lens
  • the target lens category is a large close-up lens
  • first ratio range is smaller than the second ratio range
  • second ratio range is smaller than the third ratio range
  • third ratio range is smaller than the fourth ratio range
  • fourth ratio range is smaller than the fifth ratio range
  • fifth ratio range is smaller than the sixth ratio range.
  • the electronic device further includes a transceiver connected to the processor, where the processor is further configured to:
  • triggering the transceiver to receive a query instruction of a shot segment, where the query instruction carries a query identifier, and the query identifier includes an identifier of the target object and/or an identifier of the target shot category;
  • searching for lens information having the query identifier, to obtain the lens information of the first shot segment;
  • the first shot segment is fed back according to the position identifier of the first shot segment in the video source in the lens information of the first shot segment.
  • in the present application, by identifying the target object corresponding to a frame picture in the video source and identifying, according to the size ratio occupied by the target object in the frame picture, the target lens category corresponding to the frame picture, shot segments can be identified from the video source according to the target object and the target lens category, and lens information that can be used to mark the target object corresponding to the shot segment, the target lens category corresponding to the shot segment, and the position of the shot segment in the video source is generated for the shot segment. Therefore, in the video editing work, through the lens information, the user can quickly and easily find the corresponding shot segment by using the target object and/or the target lens category, so that the user can find a suitable shot segment in less time. This makes it easier to complete video editing.
  • FIG. 1 is a schematic diagram of a network system framework involved in an application scenario in an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a method for generating lens information according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of an example of a frame picture of different lens categories with a person as an object in the embodiment of the present application;
  • FIG. 4 is a schematic flowchart of a method for generating lens information according to an embodiment of the present application
  • FIG. 5 is a schematic flowchart of a method for querying a shot segment according to an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of an apparatus for generating lens information according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of hardware of an electronic device according to an embodiment of the present application.
  • the video source may be processed as follows: identify the target object corresponding to each frame picture in the video source, and identify the target shot type corresponding to the frame picture according to the size ratio occupied by the target object in the frame picture. If the target object can be recognized in a continuous set of frame pictures and they all correspond to the same target shot type, the set of consecutive frame pictures is used as a shot segment, and lens information is generated for the shot segment.
  • the lens information includes: a target object identifier, a target lens category identifier, and a location identifier corresponding to the first shot segment.
  • the embodiment of the present application can be applied to, for example, the scenario shown in FIG. 1 .
  • the user 101 can perform video shooting work and editing work by interacting with the terminal 102.
  • the terminal 102 collects a video source.
  • taking each frame picture in the video source in turn as the target frame picture, the terminal 102 may perform the following operations: performing object recognition on the target frame picture; if the target object corresponding to the target frame picture is identified, determining, according to the size ratio occupied by the target object in the target frame picture, the target lens category corresponding to the target frame picture; and generating lens information of the first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in the video source;
  • the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
  • the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
  • the lens information of each shot segment in the video source is saved in the terminal 102.
  • the target object and/or the target lens category may be selected on the terminal 102, and the terminal 102 may find the lens segment corresponding to the target object and/or the target lens category according to the lens information. And presented to the user 101.
  • FIG. 2 a schematic flowchart of a method for generating lens information in an embodiment of the present application is shown.
  • it can be understood that video editing is to remix different shot segments in the video source, which requires cutting, merging and re-encoding the video source according to the shot segments, thereby generating new videos with different expressive power.
  • the premise of cutting the video source is to enable the user to find the corresponding lens segment in the video source.
  • each frame picture in the video source may be processed to determine the shot segment to which each frame picture belongs, thereby generating lens information used for searching for shot segments.
  • the video source is composed of a series of frame pictures; therefore, any frame picture in the video source can be taken as the target frame picture and the following steps 201 to 203 can be performed on it.
  • the object to be recognized from the target frame screen may be a character, or may be other objects than the person, such as an animal, a plant, an airplane, a car, a tank, a table, a chair, and the like.
  • if a person is used as the recognition object, face recognition technology can be used to perform face recognition on the target frame picture, so that the recognized face is used as the recognized person object; if an object other than a person is used as the recognition object, object recognition can be performed on the target frame picture by using a corresponding object recognition technology according to the relevant features of the object to be identified.
  • the object recognition of the target frame picture is used to identify the target object corresponding to the target frame picture.
  • the target object can be understood as the object to be described in the target frame picture. It can be understood that the target object corresponding to the target frame picture belongs to the object identified in the target frame picture, but all the objects identified in the target frame picture are not necessarily the target objects corresponding to the target frame picture.
  • the object recognition result in the target frame picture may include the following three cases:
  • 1) no object is identified in the target frame picture, and in this case the target object corresponding to the target frame picture cannot be identified;
  • 2) only one object is identified in the target frame picture, and this object is the target object corresponding to the target frame picture;
  • 3) a plurality of objects are identified in the target frame picture, and at this time, the target object corresponding to the target frame picture can further be determined from the plurality of objects.
  • the target object corresponding to the target frame picture may be determined according to the previous frame picture of the target frame picture. Specifically, when a plurality of objects are identified in the target frame screen, if an object corresponding to a previous frame of the target frame screen exists in the plurality of objects, the target object is the front object The object corresponding to one frame of the picture.
  • for example: when the user is shooting the character A, at first only the character A appears in the shot, but later some passers-by appear in the shot; at this time, the target object to be described in the shot is still the character A.
  • in this case, when object recognition is performed on some target frame pictures, a plurality of persons are recognized in the target frame picture, and the target object corresponding to the previous frame picture of the target frame picture can be referred to in order to determine the target object corresponding to the target frame picture.
  • since the target object corresponding to the previous frame picture is the character A, and the character A is included among the objects recognized in the target frame picture, the target object corresponding to the target frame picture can be determined as the character A.
  • the target object corresponding to the target frame picture may be determined according to the next frame picture of the target frame picture. Specifically, when a plurality of objects are identified in the target frame screen, if an object corresponding to a subsequent frame of the target frame screen exists in the plurality of objects, the target object is the The object corresponding to one frame of the picture.
  • for example: when the user is shooting the character A, at first a plurality of persons including the character A appear in the shot, and later the shot gradually shifts to showing only the character A; it can be seen that the object the shot actually needs to describe is the character A.
  • in this case, when object recognition is performed on some target frame pictures, a plurality of persons are recognized in the target frame picture, and the target object corresponding to the next frame picture of the target frame picture can be referred to in order to determine the target object corresponding to the target frame picture.
  • since the target object corresponding to the next frame picture is the character A, and the character A is included among the objects recognized in the target frame picture, the target object corresponding to the target frame picture can be determined as the character A.
  • the target object corresponding to the target frame picture may be determined according to the previous frame picture and the subsequent frame picture of the target frame picture. Specifically, when a plurality of objects are identified in the target frame screen, if an object corresponding to a previous frame of the target frame and an object corresponding to a subsequent frame of the target frame are objects A and the object A exists in the plurality of objects, and the target object is the object A.
  • the lens refers to a basic constituent unit of the movie.
  • the lens category can include: fixed-field lens, panoramic lens, medium-range lens, close-range lens, close-up lens, large close-up lens, and so on.
  • the target lens category may be any of the lens categories mentioned above.
  • FIG. 3 shows a schematic diagram of an example of a frame picture of different shot categories with a person as an object.
  • the fixed-field lens can also be understood as the main lens, usually located at the beginning of the film or at the beginning of a scene and used to clearly establish the location; for example, it can be a long-range lens with a wide field of view.
  • the panoramic lens is mainly used to express the whole body of the character; the character has a large range of motion in the panoramic lens, the build, clothing and identity of the character can be conveyed fairly clearly in the panoramic lens, and the environment and props can also be clearly shown in the panoramic lens; when shooting interior scenes, the panoramic lens can usually serve as the overall angle of the scene for the camera. The medium shot lens can contain a smaller range of scenery than the panoramic lens.
  • the environment in which the character is located is in a secondary position in the mid-range lens.
  • the focus of the mid-range lens is to express the upper body movement of the character.
  • the close-range lens can clearly show the subtle movements of the characters and can focus on the facial expressions of the characters. Therefore, the close-range lens can convey the inner world of the characters and is the most powerful lens for portraying character.
  • a close-up lens is used to shoot the face of a person, a part of the human body, or a detail of an item. In a large close-up lens, a certain detail of the subject fills the entire frame.
  • the size ratio occupied by the target object in the target frame picture may be the ratio of the area of the whole target object to the size of the target frame picture, or may be the ratio of the area of a certain part of the target object to the size of the target frame picture.
  • the size ratio of the character A occupying in the target frame picture may be the size ratio of the face area of the character A to the target frame picture. Therefore, the size ratio occupied by the character A in the target picture can be calculated by first analyzing the face contour of the character A and determining the face area of the character A and the size of the target frame picture based on the analyzed face contour, and then, The face area can be divided by the size of the target frame picture, and the obtained ratio is the size ratio of the character A occupying the target frame picture.
  • the face area may be, for example, a pixel area occupied by a human face
  • the size of the target frame picture may be, for example, a pixel size of the target frame picture.
  • the target lens category corresponding to the target frame picture may be determined by setting a corresponding size scale range for different lens categories.
  • the first scale range may be set for the fixed field lens, and if the size ratio belongs to the first scale range, the target lens category is a fixed field lens.
  • a second scale range may be set for the panoramic lens, and if the size ratio belongs to the second scale range, the target lens category is a panoramic lens.
  • a third scale range may be set for the medium shot lens, and if the size ratio belongs to the third scale range, the target lens category is a medium shot lens.
  • a fourth scale range may be set for the close-up lens, and if the size ratio belongs to the fourth scale range, the target lens category is a close-range lens.
  • a fifth scale range may be set for the close-up lens, and if the size ratio belongs to the fifth scale range, the target lens category is a close-up lens.
  • a sixth scale range may be set for the large close-up lens, and if the size ratio belongs to the sixth scale range, the target lens category is a large close-up lens.
  • the first ratio range is smaller than the second ratio range
  • the second ratio range is smaller than the third ratio range
  • the third ratio range is smaller than the fourth ratio range
  • the fourth ratio range Less than the fifth ratio range
  • the fifth ratio range is smaller than the sixth ratio range.
  • for example: assume that the target object identified in the target frame picture is the character A, the face area of the character A is s, and the size of the target frame picture is q; the ratio of the face area to the size of the target frame picture is then r = s/q. If r < 0.01, the target lens category corresponding to the target frame picture may be a fixed field lens; if 0.01 ≤ r ≤ 0.02, the target lens may be a panoramic lens; if 0.02 ≤ r ≤ 0.1, the target lens may be a medium shot lens; if 0.1 ≤ r ≤ 0.2, the target lens is a close-range lens; if 0.2 ≤ r ≤ 0.33, the target lens may be a close-up lens; if r ≥ 0.75, the target lens may be a large close-up lens.
  • it can be seen that, in this example, the first ratio range is r < 0.01, the second ratio range is 0.01 ≤ r ≤ 0.02, the third ratio range is 0.02 ≤ r ≤ 0.1, the fourth ratio range is 0.1 ≤ r ≤ 0.2, the fifth ratio range is 0.2 ≤ r ≤ 0.33, and the sixth ratio range is r ≥ 0.75.
  • the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
  • the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
  • the position identifier of the first shot segment in the video source may include, for example: an identifier of the start frame position of the first shot segment, and/or an identifier of the end frame position of the first shot segment.
  • the identifier of the target object can be used to distinguish different objects, and different objects can use different numbers, letters or symbols as the identifier.
  • the identification of the target lens category can be used to distinguish different lens categories, and different lens categories can be represented by different numbers, letters or symbols.
  • for example: assuming that the target objects are characters, after 201 to 203 are performed on every frame picture in the video source, the lens information of each shot segment in the video source can be obtained:
  • Lens 1: character A, from frame n1 to frame n2;
  • Lens 2: character B, from frame n3 to frame n4;
  • Lens 3: character C, from frame n5 to frame n6;
  • Lens 4: character A, from frame n7 to frame n8;
  • lens 1, lens 2, lens 3 and lens 4 respectively represent four different target lens categories; the character A, the character B and the character C respectively represent three different target objects; frame n1 to frame n2, frame n3 to frame n4, frame n5 to frame n6, and frame n7 to frame n8 respectively indicate the positions of four different shot segments in the video source.
  • the target frame picture may be marked with information.
  • the lens information of the first shot segment is generated based on the mark information of each frame picture.
  • the marking information of the target frame screen may include: an identifier of the target lens category, an identifier of the target object, and a position of the target frame image in the video source.
  • the tag information of the target frame picture can be ⁇ n, a, X ⁇ .
  • n is the position of the target frame picture, that is, the target frame picture is the nth frame picture in the video source;
  • a represents the identified target object, and assuming that the identified target object is the character A, a can be specifically A, assuming The identified target object is the character B, then a can be specifically B;
  • X represents the identified target lens category.
  • one shot segment in the video source is composed of a group of consecutive frame pictures in the video source, and the frame pictures describe the same object in the same shot category. Therefore, according to the marking information of each frame picture in the video source, a set of consecutive frame pictures corresponding to the same target object and the same target lens category can be combined into one shot segment, and the position of the lens segment in the video source can be this The position that a group of consecutive frame pictures occupy in the video source.
  • the lens segment included in the video source can be determined after determining the corresponding target object and the target lens category for each frame of the video source, so that the video source can be sequentially Each frame of the frame performs steps 201-202, and then determines each lens segment included in the video source based on the position of each frame of the entire video source, the corresponding target object, and the target lens category, and then Each lens segment generates lens information.
  • it can be understood that, when object recognition is performed on the target frame picture, there may be cases where no target object can be recognized; such a target frame picture cannot be assigned to a shot segment according to a target object. In order that these frame pictures can also be assigned to specific shot segments, so that the user can search for those shot segments, the method can also include:
  • if the target object corresponding to the target frame picture is not recognized, the target frame picture is marked as a frame picture without a target object.
  • the second shot segment is composed of a second group of frame pictures including the target frame picture, and the second group frame picture includes a plurality of consecutive frame pictures in the video source, the second The framing screen is a frame picture without a target object;
  • the lens information of the second shot segment includes: an identifier for indicating a targetless object, and a position identifier of the second shot segment in the video source.
  • the case where no target object corresponding to the target picture can be recognized may include two situations: one is that no object exists in the target frame picture; the other is that a plurality of objects exist in the target frame picture, but the target object corresponding to the target frame picture cannot be determined from the plurality of objects.
  • for example: during video recording, for a period of time the user shot only scenery without any person, and for another period of time the user shot a scene containing multiple people but the target person to be described by the shot could not be determined from the multiple people. It can be seen that, for the frame pictures produced in these two periods, the target object corresponding to the frame picture cannot be identified. In this case, multiple consecutive frame pictures in which no target object can be identified can be used as the second shot segment, and the corresponding lens information of the second shot segment is generated.
  • the lens information generated for the lens segment can be used for the video editing work of the user.
  • the user can query the corresponding lens segment through the target object and/or the target lens category, thereby greatly improving the efficiency of the user querying the lens segment.
  • the method may further include:
  • 401 Receive a query instruction of a shot segment, where the query instruction carries a query identifier, where the query identifier includes an identifier of the target object and/or an identifier of the target shot category.
  • in specific implementation, when the user inputs the corresponding target object and/or target lens category in order to query a shot segment, a query instruction carrying the identifier of the target object and/or the identifier of the first target lens category may be generated.
  • the lens category of the shot segment to be queried is the target lens category, and the shot segment to be queried corresponds to the target object.
  • the lens information of the first shot segment may include: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source. Therefore, when looking for the lens information having the query identifier, the lens information of the first shot segment can be found.
  • through the identifier of the target object and/or the identifier of the first target lens category, the lens information of the first shot segment is found, and the position information of the first shot segment can then be obtained from that lens information.
  • the lens information of the first shot segment can be understood as a correspondence.
  • the identifier of the target object, the identifier of the target lens category, and the location identifier of the first shot segment in the video source correspond to each other. Therefore, according to the identifier of the target object and/or the identifier of the target lens category, the location identifier of the first shot segment in the video source can be found in the correspondence, thereby providing the first shot segment to the user, thereby facilitating The user looks up the shot segment in the video source.
  • in addition, when the user wants to query the second shot segment, which has no corresponding target object, the user can query through the identifier indicating that there is no target object to obtain the lens information of the second shot segment, and the second shot segment is fed back according to the position identifier of the second shot segment in the video source included in the lens information of the second shot segment.
  • in this embodiment, by identifying the target object corresponding to a frame picture in the video source and determining, according to the size ratio occupied by the target object in the frame picture, the target lens category corresponding to the frame picture, shot segments can then be identified from the video source according to the target object and the target lens category, and lens information that can be used to mark the target object corresponding to the shot segment, the target lens category corresponding to the shot segment, and the position of the shot segment in the video source is generated for the shot segment. Therefore, in the video editing work, through the lens information, the user can quickly and easily find the corresponding shot segment by using the target object and/or the target lens category, so that the user can find a suitable shot segment in less time. This makes it easier to complete video editing.
  • the device may include, for example:
  • the identifying unit 601 is configured to perform object recognition on the target frame screen
  • a determining unit 602 configured to determine, according to a size ratio occupied by the target object in the target frame picture, a target lens type corresponding to the target frame picture, if the target object corresponding to the target frame picture is identified;
  • a first generating unit 603 configured to generate lens information of the first shot segment according to the target lens type and the target object corresponding to the target frame picture and the position of the target frame picture in the video source;
  • the first shot segment is composed of a first group of frame pictures including the target frame picture, and the first group frame picture includes a plurality of consecutive frame pictures in the video source, the first group frame
  • the screens each correspond to the target object and the target lens category;
  • the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
  • in a case where a plurality of objects are identified in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
  • the device further includes:
  • a marking unit configured to mark the target frame picture as a frame picture without a target object if the target object corresponding to the target frame picture is not recognized;
  • a second generating unit configured to generate lens information of the second shot segment according to the frame pictures without a target object;
  • the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
  • the lens information of the second shot segment includes: an identifier for indicating a targetless object, and a position identifier of the second shot segment in the video source.
  • the location identifier of the first shot segment in the video source includes: an identifier of a start frame position of the first shot segment, and an identifier of an end frame position of the first shot segment.
  • the target lens category is a fixed field lens
  • the target lens category is a panoramic lens
  • the target lens category is a medium shot lens
  • the target lens category is a close-range lens
  • the target lens category is a close-up lens
  • the target lens category is a large close-up lens
  • first ratio range is smaller than the second ratio range
  • second ratio range is smaller than the third ratio range
  • third ratio range is smaller than the fourth ratio range
  • fourth ratio range is smaller than the fifth ratio range
  • fifth ratio range is smaller than the sixth ratio range.
  • the device further includes:
  • a receiving unit configured to receive a query instruction of a shot segment, where the query instruction carries a query identifier, where the query identifier includes an identifier of the target object and/or an identifier of the first target lens category;
  • a searching unit configured to search for lens information having the query identifier, to obtain lens information of the first shot segment
  • a feedback unit configured to feed back the first shot segment according to a position identifier of the first shot segment in the video source in the lens information of the first shot segment.
  • with the apparatus provided in this embodiment, in the video editing work, through the generated lens information, the user can quickly and easily find the corresponding shot segment by using the target object and/or the target lens category; therefore, the user can find a suitable shot segment in less time, which makes video editing work easier.
  • the electronic device 700 includes a processor 701 and a memory 702 coupled to the processor 701.
  • the memory 702 is configured to store program instructions and data.
  • the processor 701 is configured to read instructions and data stored in the memory 702, and perform the following operations:
  • the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
  • the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
  • in a case where a plurality of objects are identified in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
  • processor 701 is further configured to:
  • if the target object corresponding to the target frame picture is not recognized, mark the target frame picture as a frame picture without a target object;
  • the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
  • the lens information of the second shot segment includes: an identifier for indicating a targetless object, and a position identifier of the second shot segment in the video source.
  • the location identifier of the first shot segment in the video source includes: an identifier of a start frame position of the first shot segment, and an identifier of an end frame position of the first shot segment.
  • the target lens category is a fixed field lens
  • the target lens category is a panoramic lens
  • the target lens category is a medium shot lens
  • the target lens category is a close-range lens
  • the target lens category is a close-up lens
  • the target lens category is a large close-up lens
  • first ratio range is smaller than the second ratio range
  • second ratio range is smaller than the third ratio range
  • third ratio range is smaller than the fourth ratio range
  • fourth ratio range is smaller than the fifth ratio range
  • fifth ratio range is smaller than the sixth ratio range.
  • the electronic device further includes a transceiver 703 connected to the processor 701, where the processor 701 is further configured to:
  • triggering the transceiver to receive a query instruction of a shot segment, where the query instruction carries a query identifier, and the query identifier includes an identifier of the target object and/or an identifier of the target shot category;
  • searching for lens information having the query identifier, to obtain the lens information of the first shot segment;
  • the first shot segment is fed back according to the position identifier of the first shot segment in the video source in the lens information of the first shot segment.
  • the electronic device 700 can specifically be a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS) terminal, a vehicle-mounted computer, a laptop personal computer, a desktop personal computer, a minicomputer, a midrange computer, or a mainframe computer.
  • the processor 701 can be a central processing unit (CPU), a network processor, or a combination thereof.
  • the processor 701 can also include a hardware chip.
  • the memory 702 can be a random access memory (RAM), a read only memory (ROM), a hard disk, a solid state hard disk, a flash memory, an optical disk, or any combination thereof.
  • the transceiver 703 can include a wired physical interface, a wireless physical interface, or a combination thereof.
  • the wired physical interface may be an electrical interface, an optical interface, or a combination thereof, such as an Ethernet interface or an Asynchronous Transfer Mode (ATM) interface.
  • the wireless physical interface can be a wireless local area network interface, a cellular mobile network interface, or a combination thereof.
  • the processor 701, the memory 702, and the transceiver 703 can be integrated in one or more separate circuits.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present application discloses a method for generating lens information, including: performing object recognition on a target frame picture; if a target object corresponding to the target frame picture is recognized, determining, according to the size ratio occupied by the target object in the target frame picture, a target lens category corresponding to the target frame picture; and generating lens information of a first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in a video source, where the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category; the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source. In addition, the present application further discloses an apparatus for generating lens information.

Description

Method and apparatus for generating lens information
This application claims priority to Chinese Patent Application No. CN201710052627.5, filed with the China Patent Office on January 20, 2017 and entitled "Method and device for video classification by lens effect", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and apparatus for generating lens information.
Background
As more and more electronic devices can provide a video capture function, users can record videos more and more conveniently. Usually, for an originally captured video source, the user still needs to edit it in order to obtain a target video that meets the user's needs. During video editing, many shot segments can be cut out of the video source, and these shot segments can be recombined and then re-encoded to generate the target video. In this process, the user needs to spend a great deal of time searching the video source for suitable shot segments; video editing is therefore not convenient enough for the user.
Summary
The technical problem to be solved by the present application is to provide a method and apparatus for generating lens information, so as to provide relevant lens information for shot segments in a video source, so that the lens information can be used to search for shot segments, thereby enabling the user to complete video editing more conveniently.
According to a first aspect, a method for generating lens information is provided, including:
performing object recognition on a target frame picture;
if a target object corresponding to the target frame picture is recognized, determining, according to a size ratio occupied by the target object in the target frame picture, a target lens category corresponding to the target frame picture;
generating lens information of a first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in a video source;
where the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
Optionally,
in a case where a plurality of objects are recognized in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
Optionally, the method further includes:
if no target object corresponding to the target frame picture is recognized, marking the target frame picture as a frame picture without a target object;
generating lens information of a second shot segment according to the frame pictures without a target object;
where the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
the lens information of the second shot segment includes: an identifier indicating that there is no target object, and a position identifier of the second shot segment in the video source.
Optionally, the position identifier of the first shot segment in the video source includes: an identifier of the start frame position of the first shot segment, and an identifier of the end frame position of the first shot segment.
Optionally,
if the size ratio falls within a first ratio range, the target lens category is a fixed-field lens;
if the size ratio falls within a second ratio range, the target lens category is a panoramic lens;
if the size ratio falls within a third ratio range, the target lens category is a medium shot lens;
if the size ratio falls within a fourth ratio range, the target lens category is a close-range lens;
if the size ratio falls within a fifth ratio range, the target lens category is a close-up lens;
if the size ratio falls within a sixth ratio range, the target lens category is a large close-up lens;
where the first ratio range is smaller than the second ratio range, the second ratio range is smaller than the third ratio range, the third ratio range is smaller than the fourth ratio range, the fourth ratio range is smaller than the fifth ratio range, and the fifth ratio range is smaller than the sixth ratio range.
Optionally, the method further includes:
receiving a query instruction for a shot segment, where the query instruction carries a query identifier, and the query identifier includes an identifier of the target object and/or an identifier of the target lens category;
searching for lens information having the query identifier, to obtain the lens information of the first shot segment;
feeding back the first shot segment according to the position identifier of the first shot segment in the video source in the lens information of the first shot segment.
According to a second aspect, an apparatus for generating lens information is provided, including:
an identification unit, configured to perform object recognition on a target frame picture;
a determining unit, configured to: if a target object corresponding to the target frame picture is recognized, determine, according to a size ratio occupied by the target object in the target frame picture, a target lens category corresponding to the target frame picture;
a first generating unit, configured to generate lens information of a first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in a video source;
where the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
Optionally,
in a case where a plurality of objects are recognized in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
Optionally, the apparatus further includes:
a marking unit, configured to mark the target frame picture as a frame picture without a target object if no target object corresponding to the target frame picture is recognized;
a second generating unit, configured to generate lens information of a second shot segment according to the frame pictures without a target object;
where the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
the lens information of the second shot segment includes: an identifier indicating that there is no target object, and a position identifier of the second shot segment in the video source.
Optionally, the position identifier of the first shot segment in the video source includes: an identifier of the start frame position of the first shot segment, and an identifier of the end frame position of the first shot segment.
Optionally,
if the size ratio falls within a first ratio range, the target lens category is a fixed-field lens;
if the size ratio falls within a second ratio range, the target lens category is a panoramic lens;
if the size ratio falls within a third ratio range, the target lens category is a medium shot lens;
if the size ratio falls within a fourth ratio range, the target lens category is a close-range lens;
if the size ratio falls within a fifth ratio range, the target lens category is a close-up lens;
if the size ratio falls within a sixth ratio range, the target lens category is a large close-up lens;
where the first ratio range is smaller than the second ratio range, the second ratio range is smaller than the third ratio range, the third ratio range is smaller than the fourth ratio range, the fourth ratio range is smaller than the fifth ratio range, and the fifth ratio range is smaller than the sixth ratio range.
Optionally, the apparatus further includes:
a receiving unit, configured to receive a query instruction for a shot segment, where the query instruction carries a query identifier, and the query identifier includes an identifier of the target object and/or an identifier of the first target lens category;
a searching unit, configured to search for lens information having the query identifier, to obtain the lens information of the first shot segment;
a feedback unit, configured to feed back the first shot segment according to the position identifier of the first shot segment in the video source in the lens information of the first shot segment.
According to a third aspect, an electronic device is provided, including a processor and a memory connected to the processor;
the memory is configured to store program instructions and data;
the processor is configured to read the instructions and data stored in the memory and perform the following operations:
performing object recognition on a target frame picture;
if a target object corresponding to the target frame picture is recognized, determining, according to a size ratio occupied by the target object in the target frame picture, a target lens category corresponding to the target frame picture;
generating lens information of a first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in a video source;
where the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
Optionally, in a case where a plurality of objects are recognized in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
Optionally, the processor is further configured to perform the following operations:
if no target object corresponding to the target frame picture is recognized, marking the target frame picture as a frame picture without a target object;
generating lens information of a second shot segment according to the frame pictures without a target object;
where the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
the lens information of the second shot segment includes: an identifier indicating that there is no target object, and a position identifier of the second shot segment in the video source.
Optionally, the position identifier of the first shot segment in the video source includes: an identifier of the start frame position of the first shot segment, and an identifier of the end frame position of the first shot segment.
Optionally,
if the size ratio falls within a first ratio range, the target lens category is a fixed-field lens;
if the size ratio falls within a second ratio range, the target lens category is a panoramic lens;
if the size ratio falls within a third ratio range, the target lens category is a medium shot lens;
if the size ratio falls within a fourth ratio range, the target lens category is a close-range lens;
if the size ratio falls within a fifth ratio range, the target lens category is a close-up lens;
if the size ratio falls within a sixth ratio range, the target lens category is a large close-up lens;
where the first ratio range is smaller than the second ratio range, the second ratio range is smaller than the third ratio range, the third ratio range is smaller than the fourth ratio range, the fourth ratio range is smaller than the fifth ratio range, and the fifth ratio range is smaller than the sixth ratio range.
Optionally, the electronic device further includes a transceiver connected to the processor, and the processor is further configured to perform the following operations:
triggering the transceiver to receive a query instruction for a shot segment, where the query instruction carries a query identifier, and the query identifier includes an identifier of the target object and/or an identifier of the target lens category;
searching for lens information having the query identifier, to obtain the lens information of the first shot segment;
feeding back the first shot segment according to the position identifier of the first shot segment in the video source in the lens information of the first shot segment.
In the present application, by recognizing the target object corresponding to a frame picture in the video source and identifying, according to the size ratio occupied by the target object in the frame picture, the target lens category corresponding to the frame picture, shot segments can be identified from the video source according to the target object and the target lens category, and lens information that can be used to mark the target object corresponding to the shot segment, the target lens category corresponding to the shot segment, and the position of the shot segment in the video source is generated for each shot segment. Therefore, in video editing work, through the lens information, the user can quickly and easily find the corresponding shot segment by using the target object and/or the target lens category, so that the user can find a suitable shot segment in less time and thus complete video editing more conveniently.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description show merely some embodiments recorded in the present application, and a person of ordinary skill in the art may further derive other drawings from these accompanying drawings.
FIG. 1 is a schematic diagram of a network system framework involved in an application scenario according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a method for generating lens information according to an embodiment of the present application;
FIG. 3 is a schematic diagram of example frame pictures of different lens categories with a person as the object according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a method for generating lens information according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of a method for querying a shot segment according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an apparatus for generating lens information according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Through research, the inventor found that, during video editing, the user needs to spend a great deal of time searching the video source for suitable shot segments, which makes video editing inconvenient for the user. On this basis, in the embodiments of the present application, in order to enable the user to quickly find suitable shot segments, the video source may be processed as follows: recognize the target object corresponding to each frame picture in the video source, and identify the target lens category corresponding to the frame picture according to the size ratio occupied by the target object in the frame picture; if the target object can be recognized in a group of consecutive frame pictures and they all correspond to the same target lens category, this group of consecutive frame pictures is taken as a shot segment, and lens information of the shot segment is generated. The lens information includes: an identifier of the target object, an identifier of the target lens category, and a position identifier corresponding to the first shot segment. After the above processing, when the user needs to use a shot segment in the video source, the user can, by means of the lens information, quickly and easily find the corresponding shot segment by using the target object and/or the target lens category; therefore, the user can find a suitable shot segment in less time and complete video editing more conveniently.
For example, the embodiments of the present application can be applied to the scenario shown in FIG. 1. In this scenario, a user 101 can carry out video shooting and editing by interacting with a terminal 102. Specifically, after the user 101 operates the terminal 102 to shoot a video, the terminal 102 captures a video source. Taking each frame picture in the video source in turn as a target frame picture, the terminal 102 may perform the following operations: performing object recognition on the target frame picture; if a target object corresponding to the target frame picture is recognized, determining, according to the size ratio occupied by the target object in the target frame picture, the target lens category corresponding to the target frame picture; and generating lens information of a first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in the video source, where the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category; the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source. After all the frame pictures in the video source have been processed, the lens information of each shot segment in the video source is stored in the terminal 102. When the user 101 needs to query a shot segment on the terminal 102, the user can select a target object and/or a target lens category on the terminal 102, and the terminal 102 can find, according to the lens information, the shot segment corresponding to the target object and/or the target lens category and present it to the user 101.
It can be understood that the above scenario is merely an example scenario provided by the embodiments of the present application, and the embodiments of the present application are not limited to this scenario.
The specific implementations of the method and apparatus for generating lens information in the embodiments of the present application are described in detail below through embodiments with reference to the accompanying drawings.
Referring to FIG. 2, a schematic flowchart of a method for generating lens information in an embodiment of the present application is shown. It can be understood that video editing is the remixing of different shot segments in the video source, which requires cutting, merging and re-encoding the video source according to shot segments, thereby generating new videos with different expressive power. The precondition for cutting the video source is that the user can find the corresponding shot segments in the video source. In this embodiment, to make it convenient for the user to find the required shot segments, before the editing work, each frame picture in the video source may be processed to determine the shot segment to which each frame picture belongs, thereby generating lens information used for searching for shot segments. The video source is composed of a series of frame pictures; therefore, any frame picture in the video source can be taken as the target frame picture, and the following steps 201 to 203 can be performed on it.
201. Perform object recognition on the target frame picture.
In this embodiment, the object to be recognized from the target frame picture may be a person, or may be an object other than a person, such as an animal, a plant, an airplane, a car, a tank, a table or a chair.
If a person is taken as the recognition object, face recognition technology can be used to perform face recognition on the target frame picture, so that a recognized face is taken as a recognized person object; if an object other than a person is taken as the recognition object, object recognition can be performed on the target frame picture by using a corresponding object recognition technology according to the relevant features of the object to be recognized (a minimal sketch of the face-recognition case is given below).
In this embodiment, the object recognition on the target frame picture is used to recognize the target object corresponding to the target frame picture. The target object can be understood as the object that the target frame picture is intended to describe. It can be understood that the target object corresponding to the target frame picture belongs to the objects recognized in the target frame picture, but not all the objects recognized in the target frame picture are necessarily the target object corresponding to the target frame picture. To recognize the target object corresponding to the target frame picture, the object recognition result in the target frame picture may include the following three cases:
1) No object is recognized in the target frame picture; in this case, no target object corresponding to the target frame picture can be recognized.
2) Only one object is recognized in the target frame picture; this object is the target object corresponding to the target frame picture.
3) A plurality of objects are recognized in the target frame picture; in this case, the target object corresponding to the target frame picture may further be determined from the plurality of objects.
As an example, for case 3) above, the target object corresponding to the target frame picture may be determined according to the previous frame picture of the target frame picture. Specifically, in a case where a plurality of objects are recognized in the target frame picture, if an object corresponding to the previous frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the previous frame picture.
For example: when the user is shooting character A, at first only character A appears in the shot, but later some passers-by appear in the shot; at this time, the target object that the shot is intended to describe is still character A. In this case, when object recognition is performed on some target frame pictures, a plurality of persons are recognized in the target frame picture, and the target object corresponding to the previous frame picture of the target frame picture can be referred to in order to determine the target object corresponding to the target frame picture. Since the target object corresponding to the previous frame picture is character A, and character A is included among the objects recognized in the target frame picture, the target object corresponding to the target frame picture can be determined to be character A.
As another example, for case 3) above, the target object corresponding to the target frame picture may be determined according to the next frame picture of the target frame picture. Specifically, in a case where a plurality of objects are recognized in the target frame picture, if an object corresponding to the next frame picture of the target frame picture exists among the plurality of objects, the target object is the object corresponding to the next frame picture.
For example: when the user is shooting character A, at first a plurality of persons including character A appear in the shot, and later the shot gradually shifts to showing only character A; it can be seen that the object the shot actually intends to describe is character A. In this case, when object recognition is performed on some target frame pictures, a plurality of persons are recognized in the target frame picture, and the target object corresponding to the next frame picture of the target frame picture can be referred to in order to determine the target object corresponding to the target frame picture. Since the target object corresponding to the next frame picture is character A, and character A is included among the objects recognized in the target frame picture, the target object corresponding to the target frame picture can be determined to be character A.
As yet another example, for case 3) above, the target object corresponding to the target frame picture may be determined according to both the previous frame picture and the next frame picture of the target frame picture. Specifically, in a case where a plurality of objects are recognized in the target frame picture, if the object corresponding to the previous frame picture of the target frame picture and the object corresponding to the next frame picture of the target frame picture are both object A, and object A exists among the plurality of objects, the target object is object A.
In addition, for case 3) above, although a plurality of objects are recognized in the target frame picture, it may not be possible to determine, from the plurality of objects, the target object that the target frame picture is intended to describe. Therefore, even when a plurality of objects are recognized in the target frame picture, it is possible that no target object corresponding to the target frame picture can be recognized. The selection rule for cases 2) and 3) is sketched below.
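The selection rule described above can be summarized in a short sketch. Object identifiers are assumed to be simple labels such as "A" or "B", and the function name pick_target_object is illustrative; the previous-frame and next-frame rules are applied in that order, and None is returned when the target object cannot be decided.

```python
# Sketch of the target-object selection rule for one frame picture.
from typing import Optional, Set

def pick_target_object(candidates: Set[str],
                       prev_target: Optional[str],
                       next_target: Optional[str]) -> Optional[str]:
    if not candidates:
        return None                    # case 1): no object recognized
    if len(candidates) == 1:
        return next(iter(candidates))  # case 2): the single object is the target
    if prev_target in candidates:
        return prev_target             # rule based on the previous frame picture
    if next_target in candidates:
        return next_target             # rule based on the next frame picture
    return None                        # case 3) with no decision possible
```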
202. If a target object corresponding to the target frame picture is recognized, determine, according to the size ratio occupied by the target object in the target frame picture, the target lens category corresponding to the target frame picture.
In this embodiment, a lens (shot) refers to a basic constituent unit of a film. Lens categories may include: a fixed-field lens, a panoramic lens, a medium shot lens, a close-range lens, a close-up lens, a large close-up lens, and so on. The target lens category may be any of the lens categories mentioned above.
FIG. 3 shows a schematic diagram of example frame pictures of different lens categories with a person as the object. The fixed-field lens can also be understood as the main lens; it is usually located at the beginning of a film or at the beginning of a scene and is used to clearly establish the location, and may, for example, be a long-range lens with a wide field of view. The panoramic lens is mainly used to show the whole body of a character; the character has a large range of motion in the panoramic lens, the character's build, clothing and identity can be conveyed fairly clearly, and the environment and props can also be shown clearly; when shooting interior scenes, the panoramic lens can usually serve as the overall angle of the scene for the camera. The medium shot lens covers a smaller range of scenery than the panoramic lens; the environment in which the character is located takes a secondary position in the medium shot lens, whose focus is on showing the upper-body movements of the character. The close-range lens can clearly show the subtle movements of a character and can focus on the character's facial expressions; therefore, it can convey the character's inner world and is the most powerful lens for portraying character. The close-up lens is used to shoot the face of a person, a part of the human body, or a detail of an object. In the large close-up lens, a certain detail of the subject fills the entire frame.
In this embodiment, the size ratio occupied by the target object in the target frame picture may be the ratio of the area of the whole target object to the size of the target frame picture, or may be the ratio of the area of a certain part of the target object to the size of the target frame picture.
For example: assuming the target object is character A, the size ratio occupied by character A in the target frame picture may be the ratio of the face area of character A to the size of the target frame picture. Therefore, the size ratio occupied by character A in the target frame picture can be calculated as follows: first, analyze the face contour of character A and, based on the analyzed face contour, determine the face area of character A and the size of the target frame picture; then, divide the face area by the size of the target frame picture, and the resulting ratio is the size ratio occupied by character A in the target frame picture. The face area may be, for example, the pixel area occupied by the face, and the size of the target frame picture may be, for example, the pixel size of the target frame picture (see the sketch below).
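For example, with the face approximated by its bounding box, the size ratio r can be computed as a quotient of pixel areas; the exact face-contour area described above would only change how the face area is obtained. The function name and box layout are illustrative assumptions.

```python
# Sketch of the size-ratio computation: r = face area / frame picture size,
# both measured in pixels. face_box is (x, y, w, h) as returned by detect_faces.
def size_ratio(face_box, frame_width: int, frame_height: int) -> float:
    _, _, w, h = face_box
    return (w * h) / (frame_width * frame_height)
```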
As an example, the target lens category corresponding to the target frame picture may be determined by setting a corresponding size ratio range for each lens category. For example, a first ratio range may be set for the fixed-field lens; if the size ratio falls within the first ratio range, the target lens category is the fixed-field lens. Likewise, a second ratio range may be set for the panoramic lens; if the size ratio falls within the second ratio range, the target lens category is the panoramic lens. A third ratio range may be set for the medium shot lens; if the size ratio falls within the third ratio range, the target lens category is the medium shot lens. A fourth ratio range may be set for the close-range lens; if the size ratio falls within the fourth ratio range, the target lens category is the close-range lens. A fifth ratio range may be set for the close-up lens; if the size ratio falls within the fifth ratio range, the target lens category is the close-up lens. A sixth ratio range may be set for the large close-up lens; if the size ratio falls within the sixth ratio range, the target lens category is the large close-up lens. The first ratio range is smaller than the second ratio range, the second ratio range is smaller than the third ratio range, the third ratio range is smaller than the fourth ratio range, the fourth ratio range is smaller than the fifth ratio range, and the fifth ratio range is smaller than the sixth ratio range.
For example: assume that the target object recognized in the target frame picture is character A, the face area of character A is s, and the size of the target frame picture is q; the ratio of the face area to the size of the target frame picture is then r = s/q. If r < 0.01, the target lens category corresponding to the target frame picture may be the fixed-field lens; if 0.01 ≤ r ≤ 0.02, the target lens may be the panoramic lens; if 0.02 ≤ r ≤ 0.1, the target lens may be the medium shot lens; if 0.1 ≤ r ≤ 0.2, the target lens is the close-range lens; if 0.2 ≤ r ≤ 0.33, the target lens may be the close-up lens; if r ≥ 0.75, the target lens may be the large close-up lens. It can be seen that, in this example, the first ratio range is r < 0.01, the second ratio range is 0.01 ≤ r ≤ 0.02, the third ratio range is 0.02 ≤ r ≤ 0.1, the fourth ratio range is 0.1 ≤ r ≤ 0.2, the fifth ratio range is 0.2 ≤ r ≤ 0.33, and the sixth ratio range is r ≥ 0.75 (this mapping is sketched below).
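Using the example thresholds above, the mapping from the ratio r to a lens category can be written directly; note that the example leaves the interval 0.33 < r < 0.75 unassigned, so the sketch returns None there. The category labels are the ones used in this description.

```python
# Sketch of step 202 with the example ratio ranges given above.
from typing import Optional

def classify_lens_category(r: float) -> Optional[str]:
    if r < 0.01:
        return "fixed-field lens"      # first ratio range
    if r <= 0.02:
        return "panoramic lens"        # second ratio range
    if r <= 0.1:
        return "medium shot lens"      # third ratio range
    if r <= 0.2:
        return "close-range lens"      # fourth ratio range
    if r <= 0.33:
        return "close-up lens"         # fifth ratio range
    if r >= 0.75:
        return "large close-up lens"   # sixth ratio range
    return None                        # 0.33 < r < 0.75 is not covered by the example
```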
203. Generate lens information of the first shot segment according to the target lens category and the target object corresponding to the target frame picture and the position of the target frame picture in the video source;
where the first shot segment is composed of a first group of frame pictures including the target frame picture, the first group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the first group corresponds to the target object and the target lens category;
the lens information of the first shot segment includes: an identifier of the target object, an identifier of the target lens category, and a position identifier of the first shot segment in the video source.
It can be understood that the position identifier of the first shot segment in the video source may include, for example: an identifier of the start frame position of the first shot segment and/or an identifier of the end frame position of the first shot segment. The identifier of the target object can be used to distinguish different objects, and different objects may use different numbers, letters or symbols as identifiers. The identifier of the target lens category can be used to distinguish different lens categories, and different lens categories may be represented by different numbers, letters or symbols.
For example: assuming that the target objects are characters, after 201 to 203 are performed on every frame picture in the video source, the lens information of each shot segment in the video source can be obtained:
Lens 1: character A, from frame n1 to frame n2;
Lens 2: character B, from frame n3 to frame n4;
Lens 3: character C, from frame n5 to frame n6;
Lens 4: character A, from frame n7 to frame n8;
……
where lens 1, lens 2, lens 3 and lens 4 represent four different target lens categories; character A, character B and character C represent three different target objects; and frame n1 to frame n2, frame n3 to frame n4, frame n5 to frame n6, and frame n7 to frame n8 respectively indicate the positions of the four different shot segments in the video source.
As an example, after the target frame picture has been recognized, information marking may be performed for the target frame picture; after all the frame pictures of the video source have been marked, the lens information of the first shot segment is generated based on the marking information of each frame picture.
The marking information of the target frame picture may include: the identifier of the target lens category, the identifier of the target object, and the position of the target frame picture in the video source. For example, the marking information of the target frame picture may be {n, a, X}, where n represents the position of the target frame picture, that is, the target frame picture is the n-th frame picture in the video source; a represents the recognized target object (assuming the recognized target object is character A, a may specifically be A; assuming the recognized target object is character B, a may specifically be B); and X represents the recognized target lens category.
It can be understood that one shot segment in the video source is composed of a group of consecutive frame pictures in the video source, and these frame pictures describe the same object with the same lens category. Therefore, according to the marking information of each frame picture in the video source, a group of consecutive frame pictures corresponding to the same target object and the same target lens category can be combined into one shot segment, and the position of the shot segment in the video source can be the position occupied by this group of consecutive frame pictures in the video source.
It should be noted that, in practical application, the shot segments contained in the video source can only be determined after the corresponding target object and target lens category have been determined for every frame picture in the video source. Therefore, steps 201 to 202 may be performed in turn on each frame picture in the video source, and then, based on the position of each frame picture in the whole video source and the target object and target lens category corresponding to it, the shot segments contained in the video source are determined, and lens information is then generated for each shot segment (see the grouping sketch below).
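The grouping of per-frame marks {n, a, X} into shot segments can be sketched as follows; the mark layout mirrors the example above, and the dictionary field names are illustrative assumptions. Frame pictures without a target object are assumed to carry a dedicated label (for example "NO_TARGET") so that consecutive runs of them form the second shot segment in the same way.

```python
# Sketch of step 203: merge consecutive marks {n, a, X} that share the same
# target object a and lens category X into shot segments, and record each
# segment's start and end frame positions as its position identifier.
from typing import List, Tuple

def build_lens_info(marks: List[Tuple[int, str, str]]) -> List[dict]:
    segments: List[dict] = []
    for n, a, x in sorted(marks):
        last = segments[-1] if segments else None
        if (last and last["object"] == a and last["category"] == x
                and last["end_frame"] == n - 1):
            last["end_frame"] = n                     # extend the current segment
        else:
            segments.append({"object": a, "category": x,
                             "start_frame": n, "end_frame": n})
    return segments
```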
It can be understood that, when object recognition is performed on the target frame picture, there may be cases where no target object can be recognized. When no target object can be recognized in the target frame picture, the target frame picture cannot be assigned to a shot segment according to a target object. In order that target frame pictures in which no target object can be recognized can also be assigned to specific shot segments, so that the user can search for these shot segments, in some implementations of this embodiment, as shown in FIG. 4, after 201, the method may further include:
301. If no target object corresponding to the target frame picture is recognized, mark the target frame picture as a frame picture without a target object.
302. Generate lens information of the second shot segment according to the frame pictures without a target object;
where the second shot segment is composed of a second group of frame pictures including the target frame picture, the second group of frame pictures includes a plurality of consecutive frame pictures in the video source, and each frame picture in the second group is a frame picture without a target object;
the lens information of the second shot segment includes: an identifier indicating that there is no target object, and a position identifier of the second shot segment in the video source.
It should be noted that the case where no target object corresponding to the target picture can be recognized may include two situations: one is that no object exists in the target frame picture; the other is that a plurality of objects exist in the target frame picture but the target object corresponding to the target frame picture cannot be determined from the plurality of objects.
For example: during video recording, for a period of time the user shot only scenery without any person, and for another period of time the user shot a scene containing multiple people but the target person that the shot was intended to describe could not be determined from the multiple people. It can be seen that, for the frame pictures produced during these two periods, the target object corresponding to the frame picture cannot be recognized; in this case, multiple consecutive frame pictures in which no target object can be recognized can be taken as the second shot segment, and the corresponding lens information of the second shot segment is generated.
In this embodiment, the lens information generated for shot segments can be used in the user's video editing work. In the video editing work, the user can query the corresponding shot segment by means of the target object and/or the target lens category, which greatly improves the efficiency with which the user queries shot segments. Specifically, in some implementations, as shown in FIG. 5, after 203, the method may further include:
401. Receive a query instruction for a shot segment, where the query instruction carries a query identifier, and the query identifier includes an identifier of the target object and/or an identifier of the target lens category.
In specific implementation, when the user inputs the corresponding target object and/or target lens category in order to query a shot segment, a query instruction carrying the identifier of the target object and/or the identifier of the first target lens category may be generated, where the lens category of the shot segment to be queried is the target lens category and the shot segment to be queried corresponds to the target object.
402. Search for lens information having the query identifier to obtain the lens information of the first shot segment.
As can be seen from 203, the lens information of the first shot segment may include: the identifier of the target object, the identifier of the target lens category, and the position identifier of the first shot segment in the video source. Therefore, when lens information having the query identifier is searched for, the lens information of the first shot segment can be found.
403. Feed back the first shot segment according to the position identifier of the first shot segment in the video source in the lens information of the first shot segment.
Through the identifier of the target object and/or the identifier of the first target lens category, the lens information of the first shot segment is found, and the position information of the first shot segment can then be obtained from that lens information.
As an example, the lens information of the first shot segment can be understood as a correspondence, in which the identifier of the target object, the identifier of the target lens category and the position identifier of the first shot segment in the video source correspond to one another. Therefore, according to the identifier of the target object and/or the identifier of the target lens category, the position identifier of the first shot segment in the video source can be found in the correspondence, so that the first shot segment can be provided to the user, which makes it convenient for the user to search for shot segments in the video source.
In addition, when the user wants to query the second shot segment, which has no corresponding target object, the user can query through the identifier indicating that there is no target object to obtain the lens information of the second shot segment, and the second shot segment is fed back according to the position identifier of the second shot segment in the video source included in the lens information of the second shot segment (the query flow is sketched below).
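Steps 401 to 403 can be sketched as a filter over the stored lens information; both query identifiers are optional, the second shot segment can be queried with the assumed "NO_TARGET" label, and the record layout follows the build_lens_info sketch above. The function name and field names are illustrative assumptions.

```python
# Sketch of the query flow (401-403): find lens information matching the query
# identifiers and feed back the matching segments' position identifiers.
from typing import List, Optional

def query_segments(lens_info: List[dict],
                   target_object: Optional[str] = None,
                   lens_category: Optional[str] = None) -> List[dict]:
    hits = []
    for info in lens_info:
        if target_object is not None and info["object"] != target_object:
            continue
        if lens_category is not None and info["category"] != lens_category:
            continue
        hits.append({"start_frame": info["start_frame"],
                     "end_frame": info["end_frame"]})
    return hits
```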
In this embodiment, by recognizing the target object corresponding to a frame picture in the video source and determining, according to the size ratio occupied by the target object in the frame picture, the target lens category corresponding to the frame picture, shot segments can then be identified from the video source according to the target object and the target lens category, and lens information that can be used to mark the target object corresponding to the shot segment, the target lens category corresponding to the shot segment, and the position of the shot segment in the video source is generated for each shot segment. Therefore, in video editing work, through the lens information, the user can quickly and easily find the corresponding shot segment by using the target object and/or the target lens category, so that the user can find a suitable shot segment in less time and complete video editing more conveniently.
参见图6,示出了本申请实施例中一种生成镜头信息的装置的结构示意图。在本实施例中,所述装置例如可以包括:
识别单元601,用于对目标帧画面进行对象识别;
确定单元602,用于若识别到所述目标帧画面对应的目标对象,根据所述目标对象在所述目标帧画面中占据的尺寸比例,确定所述目标帧画面对应的目标镜头类别;
第一生成单元603,用于根据所述目标帧画面对应的目标镜头类别和目标对象以及所述目标帧画面在视频源中的位置,生成第一镜头片段的镜头信息;
其中,所述第一镜头片段由包括所述目标帧画面的第一组帧画面组成,所述第一组帧画面包括所述视频源中的多个连续的帧画面,所述第一组帧画面均对应于所述目标对象和所述目标镜头类别;
所述第一镜头片段的镜头信息包括:所述目标对象的标识,所述目标镜头类别的标识,所述第一镜头片段在所述视频源中的位置标识。
可选的,在所述目标帧画面中识别出多个对象的情况下,若所述多个对象中存在所述目标帧画面的前一帧画面对应的对象,则所述目标对象为所述前一帧画面对应的对象。
可选的,所述装置还包括:
标记单元,用于若识别不到所述目标帧画面对应的目标对象,将所述目标帧画面标记为无目标对象的帧画面;
第二生成单元,用于根据所述无目标对象的帧画面,生成第二镜头片段的镜头信息;
其中,所述第二镜头片段由包括所述目标帧画面在内的第二组帧画面组成,所述第二组帧画面包括所述视频源中的多个连续的帧画面,所述第二组帧画面均为无目标对象的帧画面;
所述第二镜头片段的镜头信息包括:用于表示无目标对象的标识,所述第二镜头片段在所述视频源中的位置标识。
可选的,所述第一镜头片段在所述视频源中的位置标识,包括:所述第一镜头片段的起始帧位置的标识,所述第一镜头片段的结束帧位置的标识。
可选的,
若所述尺寸比例属于第一比例范围,所述目标镜头类别为定场镜头;
若所述尺寸比例属于第二比例范围,所述目标镜头类别为全景镜头;
若所述尺寸比例属于第三比例范围,所述目标镜头类别为中景镜头;
若所述尺寸比例属于第四比例范围,所述目标镜头类别为近景镜头;
若所述尺寸比例属于第五比例范围,所述目标镜头类别为特写镜头;
若所述尺寸比例属于第六比例范围,所述目标镜头类别为大特写镜头;
其中,所述第一比例范围小于所述第二比例范围,所述第二比例范围小于所述第三比例范围,所述第三比例范围小于所述第四比例范围,所述第四比例范围小于所述第五比例范围,所述第五比例范围小于所述第六比例范围。
可选的,所述装置还包括:
接收单元,用于接收镜头片段的查询指令,所述查询指令中携带有查询标识,所述查询标识包括所述目标对象的标识和/或所述目标镜头类别的标识;
查找单元,用于查找具有所述查询标识的镜头信息,得到所述第一镜头片段的镜头信息;
反馈单元,用于按照所述第一镜头片段的镜头信息中所述第一镜头片段在所述视频源中的位置标识,反馈所述第一镜头片段。
通过本实施例提供的装置,在视频剪辑工作中,用户可以借助所生成的镜头信息,利用目标对象和/或目标镜头类别简单快速地查找到相应的镜头片段,花费更少的时间就可以查找到合适的镜头片段,从而更便捷地完成视频剪辑工作。
参见图7,示出了本申请实施例中一种电子设备的硬件结构示意图。所述电子设备700包括处理器701以及与所述处理器701连接的存储器702。
所述存储器702,用于存储程序指令和数据。
所述处理器701,用于读取存储器702中存储的指令和数据,执行以下操作:
对目标帧画面进行对象识别;
若识别到所述目标帧画面对应的目标对象,根据所述目标对象在所述目标帧画面中占据的尺寸比例,确定所述目标帧画面对应的目标镜头类别;
根据所述目标帧画面对应的目标镜头类别和目标对象以及目标帧画面在视频源中的位置,生成第一镜头片段的镜头信息;
其中,所述第一镜头片段由包括所述目标帧画面的第一组帧画面组成,所述第一组帧画面包括视频源中的多个连续的帧画面,所述第一组帧画面均对应于所述目标对象和所述目标镜头类别;
所述第一镜头片段的镜头信息包括:所述目标对象的标识,所述目标镜头类别的标识,所述第一镜头片段在所述视频源中的位置标识。
可选的,在所述目标帧画面中识别出多个对象的情况下,若所述多个对象中存在所述目标帧画面的前一帧画面对应的对象,则所述目标对象为所述前一帧画面对应的对象。
可选的,所述处理器701还用于执行以下操作:
若识别不到所述目标帧画面对应的目标对象,将所述目标帧画面标记为无目标对象的帧画面;
根据所述无目标对象的帧画面,生成第二镜头片段的镜头信息;
其中,所述第二镜头片段由包括所述目标帧画面在内的第二组帧画面组成,所述第二组帧画面包括所述视频源中的多个连续的帧画面,所述第二组帧画面均为无目标对象的帧画面;
所述第二镜头片段的镜头信息包括:用于表示无目标对象的标识,所述第二镜头片段在所述视频源中的位置标识。
可选的,所述第一镜头片段在所述视频源中的位置标识,包括:所述第一镜头片段的起始帧位置的标识,所述第一镜头片段的结束帧位置的标识。
可选的,
若所述尺寸比例属于第一比例范围,所述目标镜头类别为定场镜头;
若所述尺寸比例属于第二比例范围,所述目标镜头类别为全景镜头;
若所述尺寸比例属于第三比例范围,所述目标镜头类别为中景镜头;
若所述尺寸比例属于第四比例范围,所述目标镜头类别为近景镜头;
若所述尺寸比例属于第五比例范围,所述目标镜头类别为特写镜头;
若所述尺寸比例属于第六比例范围,所述目标镜头类别为大特写镜头;
其中,所述第一比例范围小于所述第二比例范围,所述第二比例范围小于所述第三比例范围,所述第三比例范围小于所述第四比例范围,所述第四比例范围小于所述第五比例范围,所述第五比例范围小于所述第六比例范围。
可选的,所述电子设备还包括与所述处理器701连接的收发器703,所述处理器701还用于执行以下操作:
触发所述收发器703接收镜头片段的查询指令,所述查询指令中携带有查询标识,所述查询标识包括所述目标对象的标识和/或所述目标镜头类别的标识;
查找具有所述查询标识的镜头信息,得到所述第一镜头片段的镜头信息;
按照所述第一镜头片段的镜头信息中所述第一镜头片段在所述视频源中的位置标识,反馈所述第一镜头片段。
可选的,所述电子设备700具体可以为手机、平板电脑、个人数字助理(Personal Digital Assistant,PDA)、销售终端(Point of Sales,POS)、车载电脑、膝上型个人计算机、桌面型个人计算机、小型计算机、中型计算机或大型计算机等。所述处理器701可以为中央处理器(central processing unit,CPU),网络处理器或其组合。处理器701还可以包括硬件芯片。所述存储器702可以为随机存取存储器(random access memory,RAM)、只读存储器(ROM)、硬盘、固态硬盘、闪存、光盘或其任意组合。所述收发器703可以包括有线物理接口、无线物理接口或其组合。所述有线物理接口可以为电接口、光接口或其组合,例如为以太网接口或异步传输模式(Asynchronous Transfer Mode,ATM)接口。所述无线物理接口可以为无线局域网接口、蜂窝移动网络接口或其组合。所述处理器701、所述存储器702和所述收发器703可以集成在一个或多个独立的电路中。
本申请实施例中提到的“第一镜头片段”、“第一比例范围”、“第一生成单元”等名称中的“第一”只是用来做名字标识,并不代表顺序上的第一。该规则同样适用于“第二”等。
通过以上的实施方式的描述可知,本领域的技术人员可以清楚地了解到上述实施例方法中的全部或部分步骤可借助软件加通用硬件平台的方式来实现。基于这样的理解,本申请的技术方案可以以软件产品的形式体现出来,该计算机软件产品可以存储在存储介质中,如只读存储器(英文:read-only memory,ROM)/RAM、磁碟、光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者诸如路由器等网络通信设备)执行本申请各个实施例或者实施例的某些部分所述的方法。
本说明书中的各个实施例均采用递进的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于装置实施例而言,由于其基本相似于方法实施例,所以描述得比较简单,相关之处参见方法实施例的部分说明即可。以上所描述的设备及系统实施例仅仅是示意性的,其中作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理模块,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性劳动的情况下,即可以理解并实施。
以上所述仅是本申请示例性的实施方式,并非用于限定本申请的保护范围。

Claims (18)

  1. 一种生成镜头信息的方法,其特征在于,包括:
    对目标帧画面进行对象识别;
    若识别到所述目标帧画面对应的目标对象,根据所述目标对象在所述目标帧画面中占据的尺寸比例,确定所述目标帧画面对应的目标镜头类别;
    根据所述目标帧画面对应的目标镜头类别和目标对象以及目标帧画面在视频源中的位置,生成第一镜头片段的镜头信息;
    其中,所述第一镜头片段由包括所述目标帧画面的第一组帧画面组成,所述第一组帧画面包括视频源中的多个连续的帧画面,所述第一组帧画面均对应于所述目标对象和所述目标镜头类别;
    所述第一镜头片段的镜头信息包括:所述目标对象的标识,所述目标镜头类别的标识,所述第一镜头片段在所述视频源中的位置标识。
  2. 根据权利要求1所述的方法,其特征在于,
    在所述目标帧画面中识别出多个对象的情况下,若所述多个对象中存在所述目标帧画面的前一帧画面对应的对象,则所述目标对象为所述前一帧画面对应的对象。
  3. 根据权利要求1所述的方法,其特征在于,还包括:
    若识别不到所述目标帧画面对应的目标对象,将所述目标帧画面标记为无目标对象的帧画面;
    根据所述无目标对象的帧画面,生成第二镜头片段的镜头信息;
    其中,所述第二镜头片段由包括所述目标帧画面在内的第二组帧画面组成,所述第二组帧画面包括所述视频源中的多个连续的帧画面,所述第二组帧画面均为无目标对象的帧画面;
    所述第二镜头片段的镜头信息包括:用于表示无目标对象的标识,所述第二镜头片段在所述视频源中的位置标识。
  4. 根据权利要求1所述的方法,其特征在于,所述第一镜头片段在所述视频源中的位置标识,包括:所述第一镜头片段的起始帧位置的标识,所述第一镜头片段的结束帧位置的标识。
  5. 根据权利要求1所述的方法,其特征在于,
    若所述尺寸比例属于第一比例范围,所述目标镜头类别为定场镜头;
    若所述尺寸比例属于第二比例范围,所述目标镜头类别为全景镜头;
    若所述尺寸比例属于第三比例范围,所述目标镜头类别为中景镜头;
    若所述尺寸比例属于第四比例范围,所述目标镜头类别为近景镜头;
    若所述尺寸比例属于第五比例范围,所述目标镜头类别为特写镜头;
    若所述尺寸比例属于第六比例范围,所述目标镜头类别为大特写镜头;
    其中,所述第一比例范围小于所述第二比例范围,所述第二比例范围小于所述第三比例范围,所述第三比例范围小于所述第四比例范围,所述第四比例范围小于所述第五比例范围,所述第五比例范围小于所述第六比例范围。
  6. 根据权利要求1所述的方法,其特征在于,还包括:
    接收镜头片段的查询指令,所述查询指令中携带有查询标识,所述查询标识包括所述目标对象的标识和/或所述目标镜头类别的标识;
    查找具有所述查询标识的镜头信息,得到所述第一镜头片段的镜头信息;
    按照所述第一镜头片段的镜头信息中所述第一镜头片段在所述视频源中的位置标识,反馈所述第一镜头片段。
  7. 一种生成镜头信息的装置,其特征在于,包括:
    识别单元,用于对目标帧画面进行对象识别;
    确定单元,用于若识别到所述目标帧画面对应的目标对象,根据所述目标对象在所述目标帧画面中占据的尺寸比例,确定所述目标帧画面对应的目标镜头类别;
    第一生成单元,用于根据所述目标帧画面对应的目标镜头类别和目标对象以及所述目标帧画面在视频源中的位置,生成第一镜头片段的镜头信息;
    其中,所述第一镜头片段由包括所述目标帧画面的第一组帧画面组成,所述第一组帧画面包括所述视频源中的多个连续的帧画面,所述第一组帧画面均对应于所述目标对象和所述目标镜头类别;
    所述第一镜头片段的镜头信息包括:所述目标对象的标识,所述目标镜头类别的标识,所述第一镜头片段在所述视频源中的位置标识。
  8. 根据权利要求7所述的装置,其特征在于,
    在所述目标帧画面中识别出多个对象的情况下,若所述多个对象中存在所述目标帧画面的前一帧画面对应的对象,则所述目标对象为所述前一帧画面对应的对象。
  9. 根据权利要求7所述的装置,其特征在于,还包括:
    标记单元,用于若识别不到所述目标帧画面对应的目标对象,将所述目标帧画面标记为无目标对象的帧画面;
    第二生成单元,用于根据所述无目标对象的帧画面,生成第二镜头片段的镜头信息;
    其中,所述第二镜头片段由包括所述目标帧画面在内的第二组帧画面组成,所述第二组帧画面包括所述视频源中的多个连续的帧画面,所述第二组帧画面均为无目标对象的帧画面;
    所述第二镜头片段的镜头信息包括:用于表示无目标对象的标识,所述第二镜头片段在所述视频源中的位置标识。
  10. 根据权利要求7所述的装置,其特征在于,所述第一镜头片段在所述视频源中的位置标识,包括:所述第一镜头片段的起始帧位置的标识,所述第一镜头片段的结束帧位置的标识。
  11. 根据权利要求7所述的装置,其特征在于,
    若所述尺寸比例属于第一比例范围,所述目标镜头类别为定场镜头;
    若所述尺寸比例属于第二比例范围,所述目标镜头类别为全景镜头;
    若所述尺寸比例属于第三比例范围,所述目标镜头类别为中景镜头;
    若所述尺寸比例属于第四比例范围,所述目标镜头类别为近景镜头;
    若所述尺寸比例属于第五比例范围,所述目标镜头类别为特写镜头;
    若所述尺寸比例属于第六比例范围,所述目标镜头类别为大特写镜头;
    其中,所述第一比例范围小于所述第二比例范围,所述第二比例范围小于所述第三比例范围,所述第三比例范围小于所述第四比例范围,所述第四比例范围小于所述第五比例范围,所述第五比例范围小于所述第六比例范围。
  12. 根据权利要求7所述的装置,其特征在于,还包括:
    接收单元,用于接收镜头片段的查询指令,所述查询指令中携带有查询标识,所述查询标识包括所述目标对象的标识和/或所述目标镜头类别的标识;
    查找单元,用于查找具有所述查询标识的镜头信息,得到所述第一镜头片段的镜头信息;
    反馈单元,用于按照所述第一镜头片段的镜头信息中所述第一镜头片段在所述视频源中的位置标识,反馈所述第一镜头片段。
  13. 一种电子设备,其特征在于,包括处理器以及与所述处理器连接的存储器;
    所述存储器,用于存储程序指令和数据;
    所述处理器,用于读取存储器中存储的指令和数据,执行以下操作:
    对目标帧画面进行对象识别;
    若识别到所述目标帧画面对应的目标对象,根据所述目标对象在所述目标帧画面中占据的尺寸比例,确定所述目标帧画面对应的目标镜头类别;
    根据所述目标帧画面对应的目标镜头类别和目标对象以及目标帧画面在视频源中的位置,生成第一镜头片段的镜头信息;
    其中,所述第一镜头片段由包括所述目标帧画面的第一组帧画面组成,所述第一组帧画面包括视频源中的多个连续的帧画面,所述第一组帧画面均对应于所述目标对象和所述目标镜头类别;
    所述第一镜头片段的镜头信息包括:所述目标对象的标识,所述目标镜头类别的标识,所述第一镜头片段在所述视频源中的位置标识。
  14. 根据权利要求13所述的电子设备,其特征在于,
    在所述目标帧画面中识别出多个对象的情况下,若所述多个对象中存在所述目标帧画面的前一帧画面对应的对象,则所述目标对象为所述前一帧画面对应的对象。
  15. 根据权利要求13所述的电子设备,其特征在于,所述处理器还用于执行以下操作:
    若识别不到所述目标帧画面对应的目标对象,将所述目标帧画面标记为无目标对象的帧画面;
    根据所述无目标对象的帧画面,生成第二镜头片段的镜头信息;
    其中,所述第二镜头片段由包括所述目标帧画面在内的第二组帧画面组成,所述第二组帧画面包括所述视频源中的多个连续的帧画面,所述第二组帧画面均为无目标对象的帧画面;
    所述第二镜头片段的镜头信息包括:用于表示无目标对象的标识,所述第二镜头片段在所述视频源中的位置标识。
  16. 根据权利要求13所述的电子设备,其特征在于,所述第一镜头片段在所述视频源中的位置标识,包括:所述第一镜头片段的起始帧位置的标识,所述第一镜头片段的结束帧位置的标识。
  17. 根据权利要求13所述的电子设备,其特征在于,
    若所述尺寸比例属于第一比例范围,所述目标镜头类别为定场镜头;
    若所述尺寸比例属于第二比例范围,所述目标镜头类别为全景镜头;
    若所述尺寸比例属于第三比例范围,所述目标镜头类别为中景镜头;
    若所述尺寸比例属于第四比例范围,所述目标镜头类别为近景镜头;
    若所述尺寸比例属于第五比例范围,所述目标镜头类别为特写镜头;
    若所述尺寸比例属于第六比例范围,所述目标镜头类别为大特写镜头;
    其中,所述第一比例范围小于所述第二比例范围,所述第二比例范围小于所述第三比例范围,所述第三比例范围小于所述第四比例范围,所述第四比例范围小于所述第五比例范围,所述第五比例范围小于所述第六比例范围。
  18. 根据权利要求13所述的电子设备,其特征在于,所述电子设备还包括与所述处理器连接的收发器,所述处理器还用于执行以下操作:
    触发所述收发器接收镜头片段的查询指令,所述查询指令中携带有查询标识,所述查询标识包括所述目标对象的标识和/或所述目标镜头类别的标识;
    查找具有所述查询标识的镜头信息,得到所述第一镜头片段的镜头信息;
    按照所述第一镜头片段的镜头信息中所述第一镜头片段在所述视频源中的位置标识,反馈所述第一镜头片段。
PCT/CN2017/089313 2017-01-20 2017-06-21 一种生成镜头信息的方法和装置 WO2018133321A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP17892150.8A EP3565243A4 (en) 2017-01-20 2017-06-21 METHOD AND DEVICE FOR GENERATING IMAGE RECORDING INFORMATION
US16/479,762 US20190364196A1 (en) 2017-01-20 2017-06-21 Method and Apparatus for Generating Shot Information
CN201780082709.2A CN110169055B (zh) 2017-01-20 2017-06-21 一种生成镜头信息的方法和装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710052627.5 2017-01-20
CN201710052627 2017-01-20

Publications (1)

Publication Number Publication Date
WO2018133321A1 true WO2018133321A1 (zh) 2018-07-26

Family

ID=62907731

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/089313 WO2018133321A1 (zh) 2017-01-20 2017-06-21 一种生成镜头信息的方法和装置

Country Status (4)

Country Link
US (1) US20190364196A1 (zh)
EP (1) EP3565243A4 (zh)
CN (1) CN110169055B (zh)
WO (1) WO2018133321A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113515981A (zh) * 2020-05-22 2021-10-19 阿里巴巴集团控股有限公司 识别方法、装置、设备和存储介质
KR102605070B1 (ko) 2020-07-06 2023-11-24 한국전자통신연구원 인식 모델 학습 장치, 촬영본 영상 분석 장치 및 촬영본 검색 서비스 제공 장치
CN111757149B (zh) * 2020-07-17 2022-07-05 商汤集团有限公司 视频剪辑方法、装置、设备及存储介质
CN112601008B (zh) * 2020-11-17 2022-03-25 中兴通讯股份有限公司 一种摄像头切换方法、终端、装置及计算机可读存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101021904A (zh) * 2006-10-11 2007-08-22 鲍东山 视频内容分析系统
CN101604325A (zh) * 2009-07-17 2009-12-16 北京邮电大学 基于主场景镜头关键帧的体育视频分类方法
CN101783882A (zh) * 2009-01-15 2010-07-21 华晶科技股份有限公司 情境模式自动判断的方法及其影像撷取装置
CN102004386A (zh) * 2009-08-28 2011-04-06 鸿富锦精密工业(深圳)有限公司 摄影装置及其影像摄取方法
US20120308209A1 (en) * 2011-06-03 2012-12-06 Michael Edward Zaletel Method and apparatus for dynamically recording, editing and combining multiple live video clips and still photographs into a finished composition
CN103210651A (zh) * 2010-11-15 2013-07-17 华为技术有限公司 用于视频概要的方法和系统
CN104320670A (zh) * 2014-11-17 2015-01-28 东方网力科技股份有限公司 一种网络视频的摘要信息提取方法及系统

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006092765A2 (en) * 2005-03-04 2006-09-08 Koninklijke Philips Electronics N.V. Method of video indexing

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101021904A (zh) * 2006-10-11 2007-08-22 鲍东山 视频内容分析系统
CN101783882A (zh) * 2009-01-15 2010-07-21 华晶科技股份有限公司 情境模式自动判断的方法及其影像撷取装置
CN101604325A (zh) * 2009-07-17 2009-12-16 北京邮电大学 基于主场景镜头关键帧的体育视频分类方法
CN102004386A (zh) * 2009-08-28 2011-04-06 鸿富锦精密工业(深圳)有限公司 摄影装置及其影像摄取方法
CN103210651A (zh) * 2010-11-15 2013-07-17 华为技术有限公司 用于视频概要的方法和系统
US20120308209A1 (en) * 2011-06-03 2012-12-06 Michael Edward Zaletel Method and apparatus for dynamically recording, editing and combining multiple live video clips and still photographs into a finished composition
CN104320670A (zh) * 2014-11-17 2015-01-28 东方网力科技股份有限公司 一种网络视频的摘要信息提取方法及系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3565243A4

Also Published As

Publication number Publication date
CN110169055B (zh) 2021-06-15
CN110169055A (zh) 2019-08-23
EP3565243A4 (en) 2020-01-01
EP3565243A1 (en) 2019-11-06
US20190364196A1 (en) 2019-11-28

Similar Documents

Publication Publication Date Title
WO2021088510A1 (zh) 视频分类方法、装置、计算机以及可读存储介质
JP5358083B2 (ja) 人物画像検索装置及び画像検索装置
EP3063730B1 (en) Automated image cropping and sharing
JP5612310B2 (ja) 顔認識のためのユーザーインターフェース
WO2018133321A1 (zh) 一种生成镜头信息的方法和装置
WO2016054989A1 (zh) 建立拍照模板数据库、提供拍照推荐信息的方法及装置
CN105303161A (zh) 一种多人拍照的方法及装置
JP2005250950A (ja) マーカ提示用携帯端末および拡張現実感システムならびにその動作方法
JP2016119508A (ja) 方法、システム及びプログラム
JP2017526989A (ja) 関連ユーザー確定方法および装置
US10594930B2 (en) Image enhancement and repair using sample data from other images
CN112954450A (zh) 视频处理方法、装置、电子设备和存储介质
TW201448585A (zh) 利用行動電話及雲端可視化搜尋引擎之即時物體掃描
CN110929063A (zh) 相册生成方法、终端设备及计算机可读存储介质
CN108958592B (zh) 视频处理方法及相关产品
JP7293735B2 (ja) 文書及びテーブルの周囲の人物の検出に基づく文書及び人物を検索するためのシステム、方法並びにプログラム
JP2004280254A (ja) コンテンツ分類方法および装置
WO2015096015A1 (zh) 一种照片显示方法及用户终端
CN113194256B (zh) 拍摄方法、装置、电子设备和存储介质
CN111800574B (zh) 成像方法、装置和电子设备
US20200074218A1 (en) Information processing system, information processing apparatus, and non-transitory computer readable medium
KR20120080379A (ko) 디지털 카메라의 이미지 어노테이션 처리 방법 및 장치
JP2021077131A (ja) 構図アドバイスシステム、構図アドバイス方法、ユーザ端末、プログラム
CN111522990A (zh) 群组分享式摄影方法、拍摄设备、电子设备、存储介质
WO2015185479A1 (en) Method of and system for determining and selecting media representing event diversity

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17892150

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017892150

Country of ref document: EP

Effective date: 20190731