WO2022116772A1 - Video clipping method and apparatus, storage medium, and electronic device - Google Patents


Publication number
WO2022116772A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
target
frame
cropping
position coordinates
Prior art date
Application number
PCT/CN2021/128711
Other languages
French (fr)
Chinese (zh)
Inventor
吴昊
王长虎
Original Assignee
北京有竹居网络技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2022116772A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20224 Image subtraction

Definitions

  • the present application is based on the Chinese application with the application number of 202011401472.X and the filing date of December 2, 2020, and claims its priority.
  • the disclosure of the Chinese application is hereby incorporated into the present application as a whole.
  • the present disclosure relates to the technical field of video processing, and in particular, to a video cropping method, apparatus, storage medium and electronic device.
  • Video cropping is a technique required in scenarios where the playback size of the video is inconsistent with the original video.
  • the video cropping algorithm in the related art usually crops each video frame of the video with a cropping frame of the target playback size. Specifically, a loss function is applied to the text information included in each video frame. The result of the loss function is smallest when the text is completely inside or completely outside the cropping frame, and largest when half of the text is inside the cropping frame and half is outside it; minimizing this loss is intended to improve the cropping effect.
  • the present disclosure provides a video cropping method, the method comprising:
  • for each shot segment, determining a cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment, where the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the position of the target cropping frame moves, in the width direction or the length direction of the video frame, across all video frames included in the shot segment;
  • cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment to obtain a cropped target video.
  • the present disclosure provides a video cropping device, the device comprising:
  • an acquisition module configured to acquire the original video to be cropped and the size information of the target cropping frame;
  • a first determining module configured to perform shot detection on the original video to determine the shot segments in the original video;
  • a second determining module configured to, for each shot segment, determine the cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment, where the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the position of the target cropping frame moves, in the width direction or the length direction of the video frame, across all video frames included in the shot segment;
  • a cropping module configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, so as to obtain a cropped target video.
  • the present disclosure provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processing apparatus, implements the steps of the method described in the first aspect.
  • the present disclosure provides an electronic device, comprising:
  • a processing device is configured to execute the computer program in the storage device to implement the steps of the method in the first aspect.
  • FIG. 1 is a schematic diagram of a clipping result of a video clipping method in the related art
  • FIG. 2 is a flowchart of a video cropping method according to an exemplary embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of interpolation calculation in a video cropping method according to an exemplary embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of smoothing filtering processing in a video cropping method according to an exemplary embodiment of the present disclosure
  • FIG. 5 is a block diagram of a video cropping apparatus according to an exemplary embodiment of the present disclosure.
  • Fig. 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
  • the term "including" and variations thereof are open-ended inclusions, i.e., "including but not limited to".
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • the video cropping algorithm in the related art applies a loss function to the text information included in each video frame. The result of the loss function is smallest when the text is completely inside or completely outside the cropping frame, and largest when half of the text is inside the cropping frame and half is outside it; this is intended to improve the cropping effect.
  • for example, the first video frame, the second video frame and the third video frame are three consecutive frames of the same video. The cropping frame of the first video frame is A. In the second video frame, in order to frame the subtitles, the cropping frame B is moved down as a whole compared with the cropping frame A.
  • in the third video frame, the cropping frame C is moved up as a whole compared with the cropping frame B. As a result, the picture frequently shakes up and down while the cropped video is played, and the playback effect is poor.
  • FIG. 2 is a flowchart of a video cropping method according to an exemplary embodiment of the present disclosure.
  • the video cropping method may include:
  • Step 201: obtain the original video to be cropped and the size information of the target cropping frame.
  • the user can input a URL (Uniform Resource Locator) corresponding to the original video into the electronic device, and the electronic device can then download the original video from the corresponding resource server according to the URL and crop it.
  • alternatively, the electronic device may, in response to a video cropping request triggered by the user, acquire a stored video from the memory as the original video to be cropped.
  • the embodiment of the present disclosure does not limit the acquisition method of the original video.
  • the size information of the target cropping frame may define the width and length of the cropped video, which may be determined according to the playback size of the video playback device.
  • for example, if the size of the original video is 720×1280 pixels and the playback aspect ratio of the video playback device is 1:1, the size information of the target cropping frame can be determined to be 720×720 pixels.
  • the size information of the target cropping frame may be customized according to actual business requirements, etc., which is not limited in this embodiment of the present disclosure.
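
The 720×1280 to 720×720 example above can be sketched as a small calculation. This is only an illustrative sketch: the function name `target_crop_size` and the rule "largest box of the requested aspect ratio that fits inside the source frame" are assumptions, not something the disclosure prescribes.

```python
from fractions import Fraction


def target_crop_size(src_w, src_h, ratio_w, ratio_h):
    """Largest crop box with aspect ratio ratio_w:ratio_h that fits in the source frame.

    Hypothetical helper: the sizing rule is an assumption made for illustration.
    """
    target = Fraction(ratio_w, ratio_h)
    if Fraction(src_w, src_h) > target:
        # Source is wider than the target ratio: keep the full height, shrink the width.
        return int(src_h * target), src_h
    # Source is taller than (or equal to) the target ratio: keep the full width.
    return src_w, int(src_w / target)


print(target_crop_size(720, 1280, 1, 1))  # (720, 720)
```

Exact fractions are used instead of floats so that ratios such as 16:9 do not pick up rounding error before the final truncation to whole pixels.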
  • Step 202: perform shot detection on the original video to determine the shot segments in the original video.
  • shot detection can determine the different shot segments in the original video.
  • a shot segment includes multiple video frames that correspond to the same or a similar scene. Therefore, if the video frames in the same shot segment are cropped by cropping frames with large positional deviations, the picture of the cropped video will shake frequently, which degrades the playback effect of the cropped video.
  • video frames in different shot segments correspond to different scenes. Therefore, even if the picture shifts between shot segments, the video playback effect is not greatly affected.
  • in the embodiments of the present disclosure, in order to reduce frequent picture shaking during playback of the cropped video and improve its playback effect, shot detection can be performed on the original video to determine its shot segments, so that in subsequent processing a cropping path can be planned separately for each shot segment and the video frames within each shot segment correspond to cropping frames whose positions change only slightly.
  • Step 203: for each shot segment, determine a cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment.
  • the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the position of the target cropping frame moves, in the width direction or the length direction of the video frame, across all video frames included in the shot segment.
  • the target video frames can be obtained by performing frame extraction on the shot segment, and can include all or only some of the video frames in the shot segment.
  • alternatively, frame extraction may first be performed on the original video and the extracted video frames marked. Shot detection is then performed to obtain the shot segments, and the marked video frames in each shot segment are used as its target video frames.
  • the embodiment of the present disclosure does not limit the method for determining the target video frame.
  • the video frames included in each shot segment correspond to the same or a similar scene, so their main content does not differ greatly. Determining the cropping path according to the main content of the target video frames in the shot segment therefore keeps the positional deviation of the target cropping frame small along the cropping path of the same shot segment, which reduces frequent picture shaking during playback of the cropped video and improves the video playback effect.
  • Step 204: crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment to obtain a cropped target video.
  • the size information of the target cropping frame defines the size of the cropping frame, and the cropping path corresponding to each shot segment defines the position of the cropping frame, so each frame of the original video can be cropped according to the size information of the target cropping frame and the cropping path corresponding to each shot segment to obtain the cropped target video.
  • for each shot segment, each video frame in the shot segment is cropped according to the size information of the target cropping frame and the cropping path corresponding to the shot segment, and the cropped video frames are then spliced in chronological order to obtain the cropped target video. That is, the video frames in each shot segment are cropped first, and the cropped video frames of the different shot segments are then spliced in chronological order.
  • for example, the first shot segment includes video frame 1, video frame 2 and video frame 3, and the second shot segment includes video frame 4 and video frame 5. The times corresponding to video frames 1 to 5 increase in sequence, that is, video frames 1, 2, 3, 4 and 5 are played in that order during video playback.
  • in this case, video frames 1, 2 and 3 of the first shot segment can be cropped while video frames 4 and 5 of the second shot segment are cropped, and the cropped video frames of the two shot segments are then spliced in chronological order to obtain the cropped target video.
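
The crop-then-splice step for the five-frame example can be sketched as follows. The representation is an assumption made for illustration: each frame is a list of pixel rows, each segment is paired with a path of per-frame `(top, left)` offsets, and the names `crop_frame` and `crop_video` are hypothetical.

```python
def crop_frame(frame, top, left, crop_h, crop_w):
    """Cut a crop_h x crop_w window out of a frame stored as a list of pixel rows."""
    return [row[left:left + crop_w] for row in frame[top:top + crop_h]]


def crop_video(segments, crop_h, crop_w):
    """Crop every frame of every segment, then splice the results in order.

    segments: list of (frames, path) pairs in chronological order, where
    path[i] is the (top, left) offset of the cropping frame for frames[i].
    """
    output = []
    for frames, path in segments:
        for frame, (top, left) in zip(frames, path):
            output.append(crop_frame(frame, top, left, crop_h, crop_w))
    return output


# A toy 4-row x 2-column frame whose pixel value encodes its row and column.
frame = [[r * 10 + c for c in range(2)] for r in range(4)]
first_segment = ([frame, frame, frame], [(0, 0), (1, 0), (1, 0)])  # frames 1-3
second_segment = ([frame, frame], [(2, 0), (2, 0)])                # frames 4-5
cropped = crop_video([first_segment, second_segment], crop_h=2, crop_w=2)
print(len(cropped))  # 5
print(cropped[3])    # [[20, 21], [30, 31]]
```

Because the segments are processed in chronological order and frames are appended as they are cropped, the output list is already spliced in playback order.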
  • in this way, a cropping path corresponding to each shot segment can be determined according to the main content of the target video frames in the shot segment, and the video is cropped according to the cropping path of each shot segment. Because the video frames included in each shot segment correspond to the same or a similar scene, their main content does not differ greatly, so determining the cropping path according to the main content of the target video frames keeps the positional deviation of the cropping frame small along the cropping path of the same shot segment, which reduces frequent picture shaking during playback of the cropped video and improves the video playback effect.
  • the original video may be a vertical (portrait) video, and the size information of the target cropping frame may be the size information corresponding to a horizontal (landscape) video.
  • performing shot detection on the original video to determine the shot segments in the original video may be: performing shot detection on the original video by a frame difference method to determine the shot segments; or inputting the original video into a pre-trained shot detection model and determining the shot segments in the original video according to the output of the shot detection model.
  • the shot detection model is obtained by training on sample videos and the sample shot segments corresponding to the sample videos.
  • the frame difference method is a method for detecting and segmenting moving objects.
  • its basic principle is to compute pixel-wise temporal differences between two or three adjacent frames of an image sequence and to extract the moving region in the image by thresholding.
  • the specific calculation of the frame difference method is similar to that in the related art and is not repeated here.
  • the frame difference method can determine the motion regions corresponding to the different video frames in the original video, so that video frames with the same or similar motion regions are assigned to the same shot segment, thereby obtaining at least one shot segment corresponding to the original video.
  • alternatively, a shot detection model may be trained on sample videos and their corresponding sample shot segments, so that at least one shot segment corresponding to the original video is determined by the trained shot detection model.
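
A heavily simplified sketch of frame-difference shot detection is shown below. It is not the calculation used in the disclosure, which leaves the details to the related art: instead of comparing motion regions, it starts a new shot segment whenever the mean absolute pixel difference between two adjacent grayscale frames exceeds a threshold. The function names and the threshold value of 30 are assumptions.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def split_into_shots(frames, threshold=30.0):
    """Return (start, end) frame-index pairs, end exclusive, one per shot segment."""
    if not frames:
        return []
    boundaries = [0]
    for i in range(1, len(frames)):
        # A large jump between adjacent frames is treated as a shot change.
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            boundaries.append(i)
    boundaries.append(len(frames))
    return list(zip(boundaries[:-1], boundaries[1:]))


dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
print(split_into_shots([dark, dark, dark, bright, bright]))  # [(0, 3), (3, 5)]
```

A fixed global threshold is the simplest possible choice; real footage with fast motion or flashes would likely need an adaptive threshold or the trained detection model mentioned above.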
  • a cropping path corresponding to the shot segment may be determined according to the main content of the target video frames in the shot segment.
  • if the target video frames are all the video frames in the shot segment, then for each target video frame, a target cropping frame that can include the main content of the target video frame can be determined, and the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame can then be determined to obtain the cropping path.
  • in this case, a target cropping frame has to be determined for every video frame in the shot segment, which requires a large amount of computation and reduces the efficiency of video cropping.
  • if the target video frames are only some of the video frames in the shot segment, then for each target video frame, a target cropping frame that can include the main content of the target video frame can be determined, and the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame are determined.
  • interpolation is then performed according to the position coordinates of each target video frame to obtain the position coordinates of the target cropping frame corresponding to the video frames in the shot segment other than the target video frames.
  • finally, the cropping path corresponding to the shot segment is determined according to the position coordinates corresponding to each video frame in the shot segment.
  • the main content may be the main picture content occupying most of the image area, for example, in a video in which a character is explaining a story, the character is the main content in the target video frame.
  • at least one of the following detection methods may be performed to determine the main content: saliency detection, face detection, text detection, and logo detection.
  • saliency detection is used to detect the position of the main component of the target video frame.
  • Face detection is used to detect the location of the face in the target video frame.
  • Text detection is used to detect the position and content of text in the target video frame.
  • logo detection is used to detect the location of the logo, watermark, etc. in the target video frame.
  • determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame may be: if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is performed along the width direction, determining the position coordinate of the target cropping frame in the width direction of the target video frame; if it is determined, according to the same size information, that cropping is performed along the length direction, determining the position coordinate of the target cropping frame in the length direction of the target video frame.
  • the length and width of the corresponding frame picture in the original video can be cropped respectively according to the size information of the target cropping frame, so the position coordinates of the target cropping frame in the width direction and the length direction of the target video frame can be determined, That is, if the width direction of the target video frame is taken as the X axis, and the length direction of the target video frame is taken as the Y axis, the position coordinates include the X coordinate value and the Y coordinate value.
  • cropping may be performed along the length or the width of the corresponding frame picture according to the size information of the target cropping frame and the size information of the original video. For example, if the aspect ratio of the target cropping frame is 1:1 and the size of the original video is 720×1280 pixels, the frame picture can be cropped along its length (the Y-axis direction), and the size of the cropped video is 720×720 pixels.
  • the position coordinates of the target cropping frame in the length direction of the target video frame can be determined, that is, if the width direction of the target video frame is taken as the X axis, and the length direction of the target video frame is taken as the Y axis, then the position coordinates Include the Y coordinate value.
  • the position coordinates of the target cropping frame in the width direction of the target video frame can be determined, that is, the position coordinates include the X coordinate value.
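
The choice of cropping direction and the resulting position coordinate can be sketched as below. The disclosure does not say how the target cropping frame is placed around the main content, so centering the frame on the content and clamping it to the picture is an assumption, as are the helper names `crop_axis` and `crop_offset`.

```python
def crop_axis(src_w, src_h, crop_w, crop_h):
    """Return which coordinates of the cropping frame can vary: 'x', 'y', 'xy' or ''."""
    axis = ""
    if crop_w < src_w:
        axis += "x"  # cropping along the width direction
    if crop_h < src_h:
        axis += "y"  # cropping along the length direction
    return axis


def crop_offset(content_start, content_end, crop_len, src_len):
    """Offset along one axis that centers the cropping frame on the main content.

    The offset is clamped so the cropping frame stays inside the picture.
    """
    center = (content_start + content_end) / 2
    offset = round(center - crop_len / 2)
    return max(0, min(offset, src_len - crop_len))


print(crop_axis(720, 1280, 720, 720))      # y
print(crop_offset(300, 500, 720, 1280))    # 40
print(crop_offset(900, 1100, 720, 1280))   # 560
```

In the 720×1280 example only the Y coordinate varies, so a cropping path for that video is a single sequence of Y offsets, one per frame.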
  • when interpolation is performed according to the position coordinates of each target video frame, the position coordinates may be any one of: the position coordinates of the target cropping frame in the width direction of the target video frame, the position coordinates of the target cropping frame in the length direction of the target video frame, or the position coordinates of the target cropping frame in both the width direction and the length direction of the target video frame. The choice can be made according to actual business requirements in specific applications.
  • the following takes as an example the case where the position coordinates are those of the target cropping frame in the length direction of the target video frame (that is, the position coordinates include the Y coordinate value).
  • the position coordinates of the target video frames are y_0, y_1, y_2, ..., y_{n-1} (n is the number of target video frames), so the position coordinates can be interpolated to obtain the original video
  • the interpolation calculation method may be any interpolation calculation method in the related art, which is not limited in the present disclosure.
  • the present disclosure adopts a cubic spline interpolation calculation method, so that the cropping path obtained by the interpolation calculation is smoother, and the shaking of the video picture after cropping is reduced.
  • the objective function can be determined according to the position coordinates of each target video frame, wherein the objective function includes a plurality of segment functions, and each segment function is determined according to the position coordinates of every two adjacent target video frames.
  • each piecewise function and objective function are cubic equations in which the independent variable is time and the dependent variable is the position coordinate, and the first-order derivative and second-order derivative of the objective function are continuous in time.
  • the position coordinates of the target cropping frame corresponding to the other video frames can then be determined according to the objective function and the times corresponding to the video frames in the shot segment other than the target video frames.
  • the position coordinates of every two adjacent target video frames delimit one segment interval, and each segment interval corresponds to one piecewise function, which is a cubic equation with time as the independent variable and the position coordinate as the dependent variable, so the piecewise curve corresponding to each piecewise function can be obtained.
  • the objective function consists of these piecewise functions, each applying on its own segment interval. Like each piecewise function, the objective function is cubic in time, with time as the independent variable and the position coordinate as the dependent variable.
  • because the first-order and second-order derivatives of the objective function are continuous in time, the piecewise curves corresponding to the piecewise functions can be joined into one smooth curve, which reduces the shaking of the cropped video picture.
  • based on cubic spline interpolation, the embodiment of the present disclosure can calculate the position coordinates of the target cropping frame corresponding to the other video frames in the shot segment from the position coordinates of the target cropping frame in each target video frame.
  • the second-order continuity of cubic spline interpolation makes the cropping path obtained by interpolation smoother, thereby reducing the shaking of the cropped video picture.
  • by performing the interpolation in the above manner, the position coordinates of the other video frames between every two adjacent target video frames can be obtained, and the cropping path corresponding to the shot segment can thereby be determined.
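
The interpolation described above can be sketched with a plain-Python cubic spline. The disclosure requires only that the first and second derivatives be continuous; the boundary condition is not stated, so the "natural" condition (zero second derivative at both ends) used here is an assumption, as is the function name `natural_cubic_spline`.

```python
from bisect import bisect_right


def natural_cubic_spline(times, values):
    """Return a function y(t) through the knots (times[i], values[i]).

    times must be strictly increasing. Each interval between adjacent knots
    gets its own cubic polynomial; first and second derivatives match at knots.
    """
    n = len(times) - 1  # number of piecewise cubic segments
    if n < 1:
        raise ValueError("need at least two target video frames")
    h = [times[i + 1] - times[i] for i in range(n)]
    # Forward sweep of the tridiagonal system for the quadratic coefficients c[i].
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        alpha = 3 * ((values[i + 1] - values[i]) / h[i]
                     - (values[i] - values[i - 1]) / h[i - 1])
        diag = 2 * (times[i + 1] - times[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag
        z[i] = (alpha - h[i - 1] * z[i - 1]) / diag
    # Back substitution; c[0] = c[n] = 0 is the natural boundary condition.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (values[j + 1] - values[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(t):
        # Pick the segment containing t, clamping to the first or last segment.
        j = min(max(bisect_right(times, t) - 1, 0), n - 1)
        dt = t - times[j]
        return values[j] + b[j] * dt + c[j] * dt ** 2 + d[j] * dt ** 3

    return evaluate


# Target video frames sampled every 5 frames, with their Y coordinates.
key_times = [0, 5, 10, 15]
key_y = [100, 130, 110, 150]
spline = natural_cubic_spline(key_times, key_y)
path = [spline(t) for t in range(16)]  # one Y coordinate per frame in the segment
```

Here frame indices stand in for time. The resulting `path` passes through the four sampled coordinates and fills in the frames between them.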
  • performing interpolation according to the position coordinates of each target video frame to obtain the position coordinates of the target cropping frame corresponding to the other video frames in the shot segment may also be:
  • performing smoothing filtering on the position coordinates of each target video frame to obtain the smoothed position coordinates of each target video frame, and then performing interpolation according to the smoothed position coordinates to obtain the position coordinates of the target cropping frame corresponding to the video frames in the shot segment other than the target video frames.
  • the position coordinates of each target video frame may be processed through Gaussian smoothing filtering to obtain the smooth position coordinates of each target video frame.
  • for example, if the position coordinates of the target cropping frame in the target video frames are y_0, y_1, y_2, ..., y_{n-1}, Gaussian smoothing filtering with a window of length 2M+1 (M is a positive integer) can be used.
  • the weight corresponding to the position at offset Δy from the center of the window conforms to a Gaussian distribution G(Δy).
  • the sliding window convolution kernel of length 2M+1 is [G(-M), G(-M+1),...,G(0),...,G(M-1),G(M)].
  • the position coordinates of each target video frame may also be processed through other smoothing filtering manners, such as mean filtering, etc., which is not limited in this embodiment of the present disclosure.
  • interpolation calculation may be performed according to the smooth position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame corresponding to other video frames in the mirroring segment except the target video frame.
  • because the position coordinates of the target cropping frame in the target video frames are smoothed before the interpolation, the positional offset between the target cropping frames of the target video frames can be reduced.
  • the smooth position coordinates Y1', Y2', Y3' and Y4' can be obtained.
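
A minimal sketch of the sliding-window Gaussian smoothing is given below. Several details are assumptions because the text does not state them: the unnormalized weight is taken as exp(-Δy² / (2σ²)), the standard deviation `sigma` is a free parameter, the kernel is normalized to sum to 1, and coordinates near the ends are handled by repeating the first and last values.

```python
import math


def gaussian_kernel(m, sigma):
    """Weights [G(-M), ..., G(0), ..., G(M)], normalized so that they sum to 1."""
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-m, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(coords, m=1, sigma=1.0):
    """Smooth position coordinates with a sliding window of length 2M+1."""
    kernel = gaussian_kernel(m, sigma)
    last = len(coords) - 1
    smoothed = []
    for i in range(len(coords)):
        acc = 0.0
        for k in range(-m, m + 1):
            j = min(max(i + k, 0), last)  # repeat edge values (an assumption)
            acc += kernel[k + m] * coords[j]
        smoothed.append(acc)
    return smoothed


raw = [100, 140, 100, 140]  # four jittery Y coordinates
smooth = gaussian_smooth(raw, m=1, sigma=1.0)
print(max(raw) - min(raw), max(smooth) - min(smooth))
```

With these inputs the spread of the four coordinates shrinks from 40 pixels to roughly 18, which is the reduction in positional offset that the smoothing step is meant to achieve before interpolation.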
  • the present disclosure also provides a video cropping apparatus, which can be implemented as part or all of an electronic device through software, hardware, or a combination of the two.
  • the video cropping device 500 includes:
  • an acquisition module 501 configured to acquire the original video to be cropped and the size information of the target cropping frame;
  • a first determining module 502 configured to perform shot detection on the original video to determine the shot segments in the original video;
  • a second determining module 503 configured to, for each shot segment, determine a cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment, where the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the position of the target cropping frame moves, in the width direction or the length direction of the video frame, across all video frames included in the shot segment;
  • a cropping module 504 configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment to obtain a cropped target video.
  • the second determining module 503 is used for:
  • For each of the target video frames determine a target cropping frame that can include the main content in the target video frame, and determine the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame;
  • the clipping path corresponding to the mirrored clip is determined.
  • the second determining module 503 is used for:
  • the position coordinates of the target cropping frame in the length direction of the target video frame are determined.
  • the second determining module 503 is used for:
  • An objective function is determined according to the position coordinates of each target video frame, wherein the objective function includes a plurality of piecewise functions, each of which is determined according to the position coordinates of two adjacent target video frames; each piecewise function and the objective function are cubic equations whose independent variable is time and whose dependent variable is the position coordinate, and the first-order derivative and second-order derivative of the objective function are continuous in time;
  • the second determining module 503 is used for:
  • Interpolation calculation is performed according to the smoothed position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame corresponding to the video frames in the shot segment other than the target video frames.
  • the first determining module 502 is configured to:
  • the cropping module 504 is used for:
  • Each clipped video frame is spliced in time sequence to obtain the clipped target video.
  • an embodiment of the present disclosure also provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processing device, implements the steps of any of the above video cropping methods.
  • an electronic device including:
  • a processing device is configured to execute the computer program in the storage device to implement the steps of any of the above video cropping methods.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players) and vehicle-mounted terminals (e.g., in-vehicle navigation terminals), and stationary terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • an electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600.
  • the processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to bus 604 .
  • The following devices may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 607 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 608 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 609.
  • The communication device 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 6 shows the electronic device 600 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 609, or from the storage device 608, or from the ROM 602.
  • When the computer program is executed by the processing device 601, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • any currently known or future developed network protocol such as HTTP (HyperText Transfer Protocol)
  • communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: obtain the original video to be cropped and the size information of the target cropping frame; perform storyboard detection on the original video to determine the storyboard segments in the original video; for each storyboard segment, determine the cropping path corresponding to the storyboard segment according to the main content of the target video frames in the storyboard segment, where the target video frames are some or all of the video frames in the storyboard segment, and the cropping path represents the position movement path of the target cropping frame, along the width direction or the length direction of the video frames, across all the video frames included in the storyboard segment; and crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each storyboard segment to obtain the cropped target video.
  • Computer program code for performing operations of the present disclosure may be written in one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • the modules involved in the embodiments of the present disclosure may be implemented in software or hardware. Among them, the name of the module does not constitute a limitation of the module itself under certain circumstances.
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • Example 1 provides a video cropping method, including:
  • for each storyboard segment, determining a cropping path corresponding to the storyboard segment according to the main content of the target video frames in the storyboard segment, where the target video frames are some or all of the video frames in the storyboard segment, and the cropping path represents the position movement path of the target cropping frame, along the width direction or the length direction of the video frames, across all the video frames included in the storyboard segment;
  • cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each storyboard segment to obtain a cropped target video.
  • Example 2 provides the method of Example 1, wherein determining the cropping path corresponding to the storyboard segment according to the main content of the target video frames in the storyboard segment includes:
  • for each target video frame, determining a target cropping frame that can contain the main content of the target video frame, and determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame;
  • the cropping path corresponding to the storyboard segment is determined.
  • Example 3 provides the method of Example 2, wherein the determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame includes:
  • the position coordinates of the target cropping frame in the length direction of the target video frame are determined.
  • Example 4 provides the method of Example 2, wherein performing the interpolation calculation according to the position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame corresponding to the other video frames in the storyboard segment (the video frames other than the target video frames), includes:
  • A target function is determined according to the position coordinates of each target video frame, wherein the target function includes a plurality of piecewise functions, each piecewise function is determined from the position coordinates of two adjacent target video frames, both each piecewise function and the target function are cubic functions whose independent variable is time and whose dependent variable is the position coordinate, and the first-order and second-order derivatives of the target function are continuous in time;
  • Example 5 provides the method of any one of Examples 2-4, wherein performing the interpolation calculation according to the position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame corresponding to the other video frames in the storyboard segment (the video frames other than the target video frames), includes:
  • Smoothing filtering is performed on the position coordinates of each target video frame to obtain the smooth position coordinates of each target video frame;
  • Interpolation calculation is performed according to the smoothed position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame corresponding to the other video frames in the storyboard segment, that is, the video frames other than the target video frames.
  • Example 6 provides the method of any one of Examples 1-4, wherein performing storyboard detection on the original video to determine the storyboard segments in the original video includes:
  • Example 7 provides the method of any one of Examples 1-4, wherein cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each storyboard segment to obtain the cropped target video includes:
  • the cropped video frames are spliced in chronological order to obtain the cropped target video.
  • Example 8 provides a video cropping apparatus, the apparatus comprising:
  • an acquisition module configured to acquire the original video to be cropped and the size information of the target cropping frame;
  • a first determining module configured to perform storyboard detection on the original video to determine the storyboard segments in the original video;
  • a second determining module configured to, for each storyboard segment, determine the cropping path corresponding to the storyboard segment according to the main content of the target video frames in the storyboard segment, where the target video frames are some or all of the video frames in the storyboard segment, and the cropping path represents the position movement path of the target cropping frame, along the width direction or the length direction of the video frames, across all the video frames included in the storyboard segment;
  • a cropping module configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each storyboard segment, so as to obtain a cropped target video.
  • Example 9 provides the apparatus of Example 8, the second determining module is configured to:
  • for each target video frame, determine a target cropping frame that can contain the main content of the target video frame, and determine the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame;
  • the cropping path corresponding to the storyboard segment is determined.
  • Example 10 provides the apparatus of Example 9, the second determining module being configured to:
  • the position coordinates of the target cropping frame in the length direction of the target video frame are determined.
  • Example 11 provides the apparatus of Example 9, the second determining module being configured to:
  • A target function is determined according to the position coordinates of each target video frame, wherein the target function includes a plurality of piecewise functions, each piecewise function is determined from the position coordinates of two adjacent target video frames, both each piecewise function and the target function are cubic functions whose independent variable is time and whose dependent variable is the position coordinate, and the first-order and second-order derivatives of the target function are continuous in time;
  • Example 12 provides the apparatus of any one of Examples 9-11, wherein the second determination module is configured to:
  • Interpolation calculation is performed according to the smoothed position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame corresponding to the other video frames in the storyboard segment, that is, the video frames other than the target video frames.
  • Example 13 provides the apparatus of any one of Examples 8-11, wherein the first determining module is configured to:
  • Example 14 provides the apparatus of any one of Examples 8-11, the cropping module for:
  • each video frame in the storyboard segment is cropped;
  • the cropped video frames are spliced in chronological order to obtain the cropped target video.
  • Example 15 provides a computer-readable medium having stored thereon a computer program that, when executed by a processing apparatus, implements the steps of the method of any one of Examples 1 to 7.
  • Example 16 provides an electronic device comprising:
  • a processing device for executing the computer program in the storage device to implement the steps of the method in any one of Examples 1 to 7.
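The interpolation described in Examples 4 and 11 (piecewise cubic functions of time whose first-order and second-order derivatives are continuous) matches the definition of a cubic spline. The following is a minimal sketch of one such spline in plain Python, not the applicant's implementation. The function name `natural_cubic_spline`, the sample key-frame values, and the "natural" boundary condition (zero second derivative at both ends) are assumptions made for illustration; the disclosure does not state a boundary condition.

```python
from bisect import bisect_right


def natural_cubic_spline(times, coords):
    """Return f(t) interpolating (times[i], coords[i]) with a natural cubic spline.

    times  : strictly increasing time stamps (or frame indices) of the target frames
    coords : crop-box position coordinate at each of those frames
    Assumption: zero second derivative at both end points ("natural" spline).
    """
    n = len(times) - 1
    if n < 1 or len(coords) != len(times):
        raise ValueError("need at least two samples and matching lengths")
    h = [times[i + 1] - times[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("times must be strictly increasing")

    # Right-hand side of the tridiagonal system for the second-order coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (3.0 / h[i]) * (coords[i + 1] - coords[i]) \
                 - (3.0 / h[i - 1]) * (coords[i] - coords[i - 1])

    # Forward sweep of the tridiagonal solve.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (times[i + 1] - times[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]

    # Back substitution: per-segment coefficients b (linear), c (quadratic), d (cubic).
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (coords[j + 1] - coords[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(t):
        # Pick the segment containing t, clamping to the first/last segment.
        j = min(max(bisect_right(times, t) - 1, 0), n - 1)
        dt = t - times[j]
        return coords[j] + b[j] * dt + c[j] * dt ** 2 + d[j] * dt ** 3

    return evaluate


# Hypothetical key frames: crop-box coordinate sampled every 10 frames.
key_frames = [0, 10, 20, 30]
key_coords = [100.0, 140.0, 120.0, 160.0]
path = natural_cubic_spline(key_frames, key_coords)
# Crop-box coordinate for every frame of the segment, rounded to whole pixels.
per_frame = [round(path(t)) for t in range(31)]
```

The spline passes through every sampled coordinate, so the target video frames keep the positions determined for them, and the frames in between receive positions that change without abrupt jumps in speed or acceleration.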

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Geometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Television Signal Processing For Recording (AREA)
  • Image Processing (AREA)

Abstract

A video clipping method and apparatus, a storage medium, and an electronic device. The video clipping method comprises: obtaining size information of an original video to be clipped and a target clipping frame (201); performing storyboard detection on the original video to determine storyboard clips in the original video (202); for each storyboard clip, determining, according to the main content of a target video frame in the storyboard clip, a clipping path corresponding to the storyboard clip (203), the target video frame being a part or all of video frames in the storyboard clip, and the clipping path being used for representing a position movement path of the target clipping frame along the width directions or the length directions of the video frames in all the video frames comprised in the storyboard clip; and according to the size information of the target clipping frame and the clipping path corresponding to each storyboard clip, clipping the original video to obtain a clipped target video (204). The method can reduce the problem of frequent shaking of a picture in the playing process of a clipped video, thereby improving the playing effect of the clipped video.

Description

Video cropping method, apparatus, storage medium, and electronic device
Cross-Reference to Related Applications
The present application is based on, and claims priority to, Chinese application No. 202011401472.X, filed on December 2, 2020, the disclosure of which is hereby incorporated into the present application in its entirety.
Technical Field
The present disclosure relates to the technical field of video processing, and in particular, to a video cropping method, apparatus, storage medium, and electronic device.
Background
Video cropping is needed when the playback size of a video differs from the size of the original video. Video cropping algorithms in the related art usually crop each video frame of the video with a cropping frame of the target playback size. Specifically, a loss function is applied to the text information included in each video frame: the loss is smallest when the text lies entirely inside or entirely outside the cropping frame, and largest when the text is half inside and half outside the cropping frame. Minimizing this loss is intended to improve the cropping result.
However, when a vertical video is cropped into a horizontal video, satisfying the above loss function for text information such as subtitles and logos yields different cropping frame positions in different video frames. As a result, the picture shakes frequently while the cropped video is played, which degrades the playback effect of the cropped video.
Summary
This Summary is provided to introduce, in a simplified form, concepts that are described in detail in the Detailed Description section below. This Summary is not intended to identify key features or essential features of the claimed technical solution, nor is it intended to be used to limit the scope of the claimed technical solution.
In a first aspect, the present disclosure provides a video cropping method, the method comprising:
obtaining the original video to be cropped and the size information of the target cropping frame;
performing storyboard detection on the original video to determine the storyboard segments in the original video;
for each storyboard segment, determining a cropping path corresponding to the storyboard segment according to the main content of the target video frames in the storyboard segment, where the target video frames are some or all of the video frames in the storyboard segment, and the cropping path represents the position movement path of the target cropping frame, along the width direction or the length direction of the video frames, across all the video frames included in the storyboard segment;
cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each storyboard segment to obtain a cropped target video.
In a second aspect, the present disclosure provides a video cropping apparatus, the apparatus comprising:
an acquisition module configured to acquire the original video to be cropped and the size information of the target cropping frame;
a first determining module configured to perform storyboard detection on the original video to determine the storyboard segments in the original video;
a second determining module configured to, for each storyboard segment, determine a cropping path corresponding to the storyboard segment according to the main content of the target video frames in the storyboard segment, where the target video frames are some or all of the video frames in the storyboard segment, and the cropping path represents the position movement path of the target cropping frame, along the width direction or the length direction of the video frames, across all the video frames included in the storyboard segment;
a cropping module configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each storyboard segment, so as to obtain a cropped target video.
In a third aspect, the present disclosure provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processing device, implements the steps of the method described in the first aspect.
In a fourth aspect, the present disclosure provides an electronic device, comprising:
a storage device on which a computer program is stored;
a processing device configured to execute the computer program in the storage device to implement the steps of the method described in the first aspect.
Other features and advantages of the present disclosure are described in detail in the Detailed Description section that follows.
Brief Description of the Drawings
The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent when taken in conjunction with the accompanying drawings and with reference to the following detailed description. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale. In the drawings:
FIG. 1 is a schematic diagram of a cropping result of a video cropping method in the related art;
FIG. 2 is a flowchart of a video cropping method according to an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram of interpolation calculation in a video cropping method according to an exemplary embodiment of the present disclosure;
FIG. 4 is a schematic diagram of smoothing filtering in a video cropping method according to an exemplary embodiment of the present disclosure;
FIG. 5 is a block diagram of a video cropping apparatus according to an exemplary embodiment of the present disclosure;
FIG. 6 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the protection scope of the present disclosure.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit the illustrated steps. The scope of the present disclosure is not limited in this regard.
As used herein, the term "including" and variations thereof are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different devices, modules, or units, and are not used to limit the order of, or the interdependence between, the functions performed by these devices, modules, or units. It should also be noted that the modifiers "a" and "a plurality of" mentioned in the present disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be read as "one or more".
The names of messages or information exchanged between multiple devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
As noted in the Background, video cropping algorithms in the related art apply a loss function to the text information included in each video frame: the loss is smallest when the text lies entirely inside or entirely outside the cropping frame, and largest when the text is half inside and half outside the cropping frame. Minimizing this loss is intended to improve the cropping result.
However, when a vertical video is cropped into a horizontal video, satisfying the above loss function for text information such as subtitles and logos yields different cropping frame positions in different video frames, so the picture shakes frequently while the cropped video is played, which degrades the playback effect. For example, referring to FIG. 1, the first, second, and third video frames are three temporally consecutive frames of the same video. The cropping frame of the first video frame is A. In the second video frame, cropping frame B is shifted downward as a whole relative to cropping frame A in order to enclose the subtitles. In the third video frame, cropping frame C is shifted upward as a whole relative to cropping frame B in order to enclose all the faces. Consequently, the picture frequently shakes up and down while the cropped video is played, and the playback effect is poor.
In view of this, the present disclosure provides a video cropping method, apparatus, storage medium, and electronic device to reduce frequent picture shaking during playback of the cropped video, thereby improving its playback effect. FIG. 2 is a flowchart of a video cropping method according to an exemplary embodiment of the present disclosure. Referring to FIG. 2, the video cropping method may include:
步骤201,获取待裁剪的原始视频和目标裁剪框的尺寸信息。Step 201: Obtain the size information of the original video to be cropped and the target cropping frame.
示例地,用户可以在电子设备中输入原始视频对应的URL(Uniform Resource Locator,统一资源定位器),然后电子设备可以根据该URL从对应的资源服务器中下载原始视频进行视频裁剪。或者,电子设备可以响应于用户触发的视频裁剪请求,从存储器中获取存储的视频作为原始视频进行视频裁剪,等等,本公开实施例对于原始视频的获取方式不作限定。For example, the user can input a URL (Uniform Resource Locator) corresponding to the original video in the electronic device, and then the electronic device can download the original video from the corresponding resource server according to the URL to perform video trimming. Alternatively, the electronic device may, in response to a video trimming request triggered by the user, acquire the stored video from the memory as the original video for video trimming, etc. The embodiment of the present disclosure does not limit the acquisition method of the original video.
For example, the size information of the target cropping frame defines the width and length of the cropped video, and may be determined according to the playback size of the video playback device. For instance, if the size of the original video is 720×1280 pixels and the playback size of the video playback device is 1:1, the size of the target cropping frame can be determined to be 720×720 pixels. Alternatively, the size information of the target cropping frame may be customized according to actual service requirements, which is not limited in the embodiments of the present disclosure.
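The 720×1280 example above can be sketched as a small calculation. This is only an illustration: the function name `target_crop_size` and the rule "take the largest frame of the requested aspect ratio that fits inside the original frame" are assumptions, since the disclosure only states the resulting size.

```python
from fractions import Fraction
from typing import Tuple


def target_crop_size(src_w: int, src_h: int, aspect_w: int, aspect_h: int) -> Tuple[int, int]:
    """Largest (width, height) with ratio aspect_w:aspect_h that fits in the source frame."""
    aspect = Fraction(aspect_w, aspect_h)
    if Fraction(src_w, src_h) > aspect:
        # Source is wider than the requested ratio: keep the full height, trim the width.
        return int(src_h * aspect), src_h
    # Source is taller than (or equal to) the requested ratio: keep the full width.
    return src_w, int(src_w / aspect)


print(target_crop_size(720, 1280, 1, 1))  # (720, 720)
```

Exact rational arithmetic is used so that ratios such as 16:9 do not pick up floating-point rounding before the final truncation to whole pixels.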
Step 202: perform shot detection on the original video to determine the shot segments in the original video.
For example, shot detection can identify the different shot segments in the original video. A shot segment includes multiple video frames whose scenes are the same or similar. If the video frames of one shot segment are cropped with cropping frames whose positions differ greatly, the picture of the cropped video shakes frequently, which degrades its playback effect. Video frames in different shot segments correspond to different scenes, so a shot cut already exists between them, and picture shaking at that point does not noticeably affect the playback effect. Therefore, in the embodiments of the present disclosure, in order to reduce frequent picture shaking during playback of the cropped video and improve its playback effect, shot detection may be performed on the original video to determine the shot segments in it. In subsequent processing, a cropping path can then be planned separately for each shot segment, so that the video frames in each shot segment correspond to cropping frames whose positions change only slightly.
Step 203: for each shot segment, determine the cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment. The target video frames are some or all of the video frames in the shot segment, and the cropping path represents the movement path of the position of the target cropping frame, along the width direction or the length direction of the video frame, across all video frames included in the shot segment.
For example, the target video frames may be obtained by performing frame extraction on the shot segment, and may include all or only some of the video frames in the shot segment, which is not limited in the embodiments of the present disclosure. Alternatively, frame extraction may first be performed on the original video and the extracted video frames marked. Shot detection is then performed to obtain the shot segments, and the marked video frames in each shot segment are used as the target video frames. The embodiments of the present disclosure do not limit the manner of determining the target video frames.
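Frame extraction within one shot segment might look like the sketch below. The fixed sampling step and the choice to always keep the first and last frame of the segment (so that later interpolation covers the whole segment) are assumptions made for illustration; the disclosure leaves the extraction strategy open.

```python
from typing import List


def sample_target_frames(start: int, end: int, step: int) -> List[int]:
    """Indices of target frames in the half-open range [start, end).

    Every `step`-th frame is taken, and the last frame of the segment is
    appended if the regular sampling did not already include it.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if end <= start:
        return []
    indices = list(range(start, end, step))
    if indices[-1] != end - 1:
        indices.append(end - 1)
    return indices


print(sample_target_frames(0, 10, 4))  # [0, 4, 8, 9]
```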
For example, the video frames in each shot segment correspond to the same or similar scenes, so their main content differs little. Determining the cropping path according to the main content of the target video frames in the shot segment therefore keeps the positional movement of the target cropping frame along the cropping path of that shot segment small, which reduces frequent picture shaking during playback of the cropped video and improves the playback effect.
Step 204: crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, to obtain the cropped target video.
For example, the size information of the target cropping frame defines the size of the cropping frame, and the cropping path corresponding to each shot segment defines the position of the cropping frame. Each frame of the original video can therefore be cropped according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, to obtain the cropped target video.
In a possible implementation, for each shot segment, each video frame in the shot segment is cropped according to the size information of the target cropping frame and the cropping path corresponding to the shot segment, and the cropped video frames are then spliced in chronological order to obtain the cropped target video. In other words, every video frame in every shot segment is cropped first, and the cropped video frames of the different shot segments are then spliced in chronological order to obtain the cropped target video.
For example, a first shot segment includes video frame 1, video frame 2, and video frame 3, and a second shot segment includes video frame 4 and video frame 5. The times corresponding to video frames 1 to 5 increase in that order, that is, video frames 1, 2, 3, 4, and 5 are played in sequence during video playback. In this case, video frames 1, 2, and 3 of the first shot segment can be cropped while video frames 4 and 5 of the second shot segment are cropped, and the cropped video frames of the first shot segment and of the second shot segment are then spliced in chronological order to obtain the cropped target video.
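The crop-then-splice procedure can be sketched as follows, assuming cropping along the length (Y) direction only. Frames are modelled as lists of pixel rows, and the names `crop_rows` and `crop_video` as well as the clamping of the offset to the valid range are illustrative assumptions, not details taken from the disclosure.

```python
from typing import List, Sequence, Tuple

Frame = List[list]      # one frame as a list of pixel rows
Shot = Tuple[int, int]  # (start, end) frame indices, end exclusive


def crop_rows(frame: Frame, top: float, crop_h: int) -> Frame:
    """Keep crop_h rows starting at `top`, clamped so the window stays inside the frame."""
    top_px = max(0, min(int(round(top)), len(frame) - crop_h))
    return frame[top_px:top_px + crop_h]


def crop_video(frames: Sequence[Frame], shots: Sequence[Shot],
               paths: Sequence[Sequence[float]], crop_h: int) -> List[Frame]:
    """Crop every frame of every shot along its path, then splice in time order.

    `paths[k][j]` is the Y offset of the cropping frame for the j-th frame of shot k.
    Shots are assumed to be listed chronologically, so appending in order
    yields the spliced target video.
    """
    output: List[Frame] = []
    for (start, end), path in zip(shots, paths):
        for frame, top in zip(frames[start:end], path):
            output.append(crop_rows(frame, top, crop_h))
    return output
```

With the five-frame example above, `shots` would be `[(0, 3), (3, 5)]` and `paths` would hold three offsets for the first shot segment and two for the second.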
In this way, for each shot segment, the cropping path corresponding to the shot segment can be determined according to the main content of the target video frames in that shot segment, and the video is cropped according to the cropping path of each shot segment. Because the video frames in each shot segment correspond to the same or similar scenes, their main content differs little. Determining the cropping path according to the main content of the target video frames in the shot segment therefore keeps the positional movement of the cropping frame along the cropping path of that shot segment small, which reduces frequent picture shaking during playback of the cropped video and improves the playback effect. For example, the original video may be a vertical video and the size information of the target cropping frame may be that of a horizontal video, so that the video cropping method provided by the present disclosure can resolve picture shaking in the cropped video when a vertical video is cropped into a horizontal video, improving the playback effect.
To help those skilled in the art better understand the video cropping method provided by the present disclosure, the above steps are described in detail below with examples.
In a possible implementation, performing shot detection on the original video to determine the shot segments in the original video may be: performing shot detection on the original video by the frame difference method to determine the shot segments in the original video; or inputting the original video into a pre-trained shot detection model and determining the shot segments in the original video according to the output of the shot detection model, where the shot detection model is obtained by training on sample videos and the sample shot segments corresponding to the sample videos.
For example, the frame difference method is one of the methods for moving-object detection and segmentation. Its basic principle is to compute pixel-based temporal differences between two or three adjacent frames of an image sequence and to extract the moving regions in the image by thresholding. The specific calculation is similar to that in the related art and is not repeated here. In the embodiments of the present disclosure, the frame difference method can be used to determine the moving regions corresponding to different video frames in the original video, so that video frames with the same or similar moving regions are determined to belong to the same shot segment, thereby obtaining at least one shot segment corresponding to the original video. Alternatively, a shot detection model may be trained on sample videos and their corresponding sample shot segments, and the trained shot detection model is then used to determine the at least one shot segment corresponding to the original video.
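A heavily simplified frame-difference sketch is given below. Instead of extracting and comparing moving regions, it starts a new shot segment whenever the mean absolute pixel difference between adjacent frames exceeds a threshold. That criterion, the threshold value of 30, and the function names are all assumptions chosen to keep the example short; frames are modelled as grayscale images stored as nested lists.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # grayscale image: rows of 0-255 pixel values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two frames of equal size."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def split_into_shots(frames: List[Frame], threshold: float = 30.0) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, one pair per shot segment."""
    if not frames:
        return []
    shots = []
    start = 0
    for i in range(1, len(frames)):
        # A large difference between adjacent frames is treated as a shot cut.
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            shots.append((start, i))
            start = i
    shots.append((start, len(frames)))
    return shots
```

In practice the threshold would need tuning per content type, and fast motion within one shot can trigger false cuts, which is one reason a trained shot detection model is offered as the alternative.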
After the at least one shot segment corresponding to the original video is obtained, the cropping path corresponding to each shot segment can be determined according to the main content of the target video frames in that shot segment. In a possible implementation, if the target video frames are all of the video frames in the shot segment, then for each target video frame a target cropping frame that can contain the main content of the target video frame is determined, and the position coordinates of that target cropping frame in the width direction or the length direction of the target video frame are then determined, thereby obtaining the cropping path. However, this approach requires determining the target cropping frame for every video frame in the shot segment one by one, which is computationally expensive and reduces the efficiency of video cropping.
To solve this problem and improve cropping efficiency, in a possible implementation, if the target video frames are only some of the video frames in the shot segment, then for each target video frame a target cropping frame that can contain the main content of the target video frame is determined, and the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame are determined. Interpolation is then performed on the position coordinates of the target video frames to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment. Finally, the cropping path corresponding to the shot segment is determined according to the position coordinates corresponding to each video frame in the shot segment.
For example, the main content may be the principal picture content occupying most of the image area. In a video in which a person tells a story, for instance, that person is the main content of the target video frame. For each target video frame, at least one of the following detection methods may be performed to determine the main content: saliency detection, face detection, text detection, and logo detection. Saliency detection detects the position of the main component of the target video frame. Face detection detects the positions of faces in the target video frame. Text detection detects the position and content of text in the target video frame. Logo detection detects the positions of logos, watermarks, and similar content in the target video frame. In addition, before the main content is detected, border detection may be performed on the target video frame and useless borders such as black bars and Gaussian-blurred borders may be removed, which improves the accuracy of the subsequent main content detection.
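One simple way to turn detection results into a cropping frame position is sketched below for cropping along the length (Y) direction. It assumes the detectors (saliency, face, text, logo) have already produced vertical extents, and it centres the cropping frame on the union of those extents. This centring rule is a hypothetical stand-in: the disclosure refers to a loss function for placing the cropping frame and does not prescribe this heuristic.

```python
from typing import Sequence, Tuple


def crop_top_for_boxes(boxes: Sequence[Tuple[float, float]],
                       frame_h: int, crop_h: int) -> float:
    """Y offset of a crop_h-tall cropping frame centred on the detected content.

    `boxes` holds (top, bottom) extents of detected main content in pixels.
    The result is clamped so the cropping frame stays inside the video frame.
    """
    if not boxes:
        # Nothing detected: fall back to a centred crop.
        return (frame_h - crop_h) / 2
    top = min(box[0] for box in boxes)
    bottom = max(box[1] for box in boxes)
    centre = (top + bottom) / 2
    offset = centre - crop_h / 2
    return max(0.0, min(offset, float(frame_h - crop_h)))


print(crop_top_for_boxes([(300, 500), (600, 900)], 1280, 720))  # 240.0
```

If the union of the detected extents is taller than the cropping frame, some content is necessarily cut off; weighting the detections differently is one way a loss function could resolve that trade-off.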
In a possible implementation, determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame may be as follows. If it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the width direction, the position coordinates of the target cropping frame in the width direction of the target video frame are determined. If it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the length direction, the position coordinates of the target cropping frame in the length direction of the target video frame are determined.
It should be understood that, in general, both the length and the width of the corresponding frame of the original video can be cropped according to the size information of the target cropping frame, so the position coordinates of the target cropping frame in both the width direction and the length direction of the target video frame can be determined. That is, if the width direction of the target video frame is taken as the X axis and the length direction as the Y axis, the position coordinates include an X coordinate value and a Y coordinate value.
In the embodiments of the present disclosure, to improve cropping efficiency, cropping may be performed along either the length or the width of the corresponding frame, according to the size information of the target cropping frame and the size information of the original video. For example, if the size information of the target cropping frame is 1:1 and the size of the original video is 720×1280 pixels, the corresponding frame can be cropped along its length (the Y-axis direction), giving a cropped video of 720×720 pixels. In this case, the position coordinates of the target cropping frame in the length direction of the target video frame are determined; that is, if the width direction of the target video frame is taken as the X axis and the length direction as the Y axis, the position coordinates include a Y coordinate value. In other cases, if it is determined according to the size information of the target cropping frame and the size information of the original video that cropping is to be performed along the width direction, the position coordinates of the target cropping frame in the width direction of the target video frame are determined; that is, the position coordinates include an X coordinate value.
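The choice of cropping direction can be illustrated with a short sketch. The function names and the string return values are assumptions introduced here; the disclosure only states that the direction follows from comparing the two sizes.

```python
def crop_axis(src_w: int, src_h: int, crop_w: int, crop_h: int) -> str:
    """Which coordinate(s) the cropping path has to track.

    Returns "x" (width direction), "y" (length direction), "xy" (both),
    or "none" when the cropping frame already matches the source size.
    """
    if crop_w > src_w or crop_h > src_h:
        raise ValueError("cropping frame must fit inside the source frame")
    trim_x = crop_w < src_w
    trim_y = crop_h < src_h
    if trim_x and trim_y:
        return "xy"
    if trim_y:
        return "y"
    if trim_x:
        return "x"
    return "none"


def max_offset(src_len: int, crop_len: int) -> int:
    """Largest valid offset of the cropping frame along one axis."""
    return src_len - crop_len


print(crop_axis(720, 1280, 720, 720))  # y
print(max_offset(1280, 720))           # 560
```

For the 720×1280 source and 720×720 cropping frame, only the Y coordinate varies, and it can range from 0 to 560 pixels.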
For example, interpolation is performed on the position coordinates of the target video frames. These position coordinates may be any of the following: the position coordinates of the target cropping frame in the width direction of the target video frame, the position coordinates of the target cropping frame in the length direction of the target video frame, or the position coordinates of the target cropping frame in both the width direction and the length direction of the target video frame. The choice can be made according to actual service requirements in a specific application. For ease of understanding, the following description takes the position coordinates to be those of the target cropping frame in the length direction of the target video frame (that is, the position coordinates include a Y coordinate value).
For example, if the position coordinates of the target video frames are $y_0, y_1, y_2, \dots, y_{n-1}$ (where $n$ is the number of target video frames), interpolation can be performed on these position coordinates to obtain the position coordinates of the target cropping frame for the video frames of the original video other than the target video frames. The interpolation may be any interpolation method in the related art, which is not limited in the present disclosure.
In a possible implementation, ordinary linear interpolation may produce abrupt changes at the interpolation points, which may cause the target cropping frame to move sharply along the interpolated cropping path and thus make the cropped video picture shake. The embodiments of the present disclosure therefore adopt cubic spline interpolation, so that the interpolated cropping path is smoother and shaking of the cropped video picture is reduced.
Specifically, a target function can be determined according to the position coordinates of the target video frames. The target function includes multiple piecewise functions, each piecewise function is determined according to the position coordinates of two adjacent target video frames, each piecewise function and the target function are cubic equations whose independent variable is time and whose dependent variable is the position coordinate, and the first derivative and second derivative of the target function are continuous in time. Accordingly, the position coordinates of the target cropping frame for the other video frames can be determined according to the target function and the times corresponding to the video frames in the shot segment other than the target video frames.
For example, the position coordinates of every two adjacent target video frames can be taken as one segment interval, and each segment interval corresponds to one piecewise function. The piecewise function is a cubic equation whose independent variable is time and whose dependent variable is the position coordinate, so a segment curve corresponding to each piecewise function can be obtained. The target function includes the multiple piecewise functions; that is, the target function is the sum of the multiple piecewise functions, each defined on its own segment interval. Moreover, the target function is a cubic equation whose independent variable is time and whose dependent variable is the position coordinate, and its first and second derivatives are continuous in time. The segment curves corresponding to the piecewise functions can therefore be joined into one smooth curve, which reduces shaking of the cropped video picture.
For example, for the interval variable time $t$ with $t_0 \le t_1 \le t_2 \le \dots \le t_{n-1}$ and the corresponding position coordinates $y_0, y_1, y_2, \dots, y_{n-1}$, the target function $S(t) = S_0(t) + S_1(t) + S_2(t) + \dots + S_{n-1}(t)$ satisfies: 1) on each segment interval $[t_i, t_{i+1}]$, the piecewise function $S_i(t)$ is a cubic function; 2) $S_i(t_i) = y_i$; 3) the first derivative $S'(t)$ and the second derivative $S''(t)$ of the target function $S(t)$ are continuous on $[t_0, t_{n-1}]$, and the target function $S(t)$ is a cubic function. The expression of $S_i(t)$ may be $S_i(t) = a_i + b_i (t - t_i) + c_i (t - t_i)^2 + d_i (t - t_i)^3$, where $i$ takes the values $0, 1, 2, \dots, (n-1)$.
That is, the embodiments of the present disclosure can use cubic spline interpolation to calculate, from the position coordinates of the target cropping frame in each target video frame, the position coordinates of the target cropping frame for the other video frames in the shot segment. The second-order continuity of cubic spline interpolation makes the interpolated cropping path smoother, which reduces shaking of the cropped video picture. For example, referring to FIG. 3, for the position coordinates Y1, Y2, Y3, and Y4 of the target video frames, interpolation in the above manner yields the position coordinates of the other video frames between every two adjacent target video frames, thereby determining the cropping path corresponding to the shot segment.
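The cubic spline interpolation described above can be sketched in plain Python as follows. Each piece has the form $a_i + b_i(t - t_i) + c_i(t - t_i)^2 + d_i(t - t_i)^3$ with $a_i = y_i$. The disclosure does not state the boundary conditions, so this sketch assumes a natural spline (second derivative zero at both ends) and strictly increasing times; the function name is hypothetical.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def natural_cubic_spline(t: Sequence[float], y: Sequence[float]) -> Callable[[float], float]:
    """Build S(t) through the points (t[i], y[i]) with continuous S' and S''."""
    n = len(t)
    if n < 2 or n != len(y):
        raise ValueError("need at least two points and matching lengths")
    h = [t[i + 1] - t[i] for i in range(n - 1)]

    # Right-hand side of the tridiagonal system for the c coefficients.
    alpha = [0.0] * n
    for i in range(1, n - 1):
        alpha[i] = 3 * (y[i + 1] - y[i]) / h[i] - 3 * (y[i] - y[i - 1]) / h[i - 1]

    # Forward sweep; natural boundary conditions give c[0] = c[n-1] = 0.
    mu = [0.0] * n
    z = [0.0] * n
    for i in range(1, n - 1):
        pivot = 2 * (t[i + 1] - t[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / pivot
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / pivot

    # Back substitution for the per-piece coefficients b, c, d.
    b = [0.0] * (n - 1)
    c = [0.0] * n
    d = [0.0] * (n - 1)
    for j in range(n - 2, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (y[j + 1] - y[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x: float) -> float:
        # Locate the piece containing x; clamp to the first or last piece.
        j = min(max(bisect_right(t, x) - 1, 0), n - 2)
        dx = x - t[j]
        return y[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


# Target frames at frame indices 0, 10, 20, 30 with cropping-frame Y offsets.
spline = natural_cubic_spline([0, 10, 20, 30], [100, 140, 120, 130])
path = [spline(frame) for frame in range(31)]  # one Y offset per frame of the segment
```

Here frame indices stand in for time, which is equivalent for a constant frame rate. The resulting `path` passes through the four key positions and changes smoothly in between.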
In a possible implementation, performing interpolation on the position coordinates of the target video frames to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment may also be: performing smoothing filtering on the position coordinates of each target video frame to obtain the smoothed position coordinates of each target video frame, and then performing interpolation on the smoothed position coordinates of the target video frames to obtain the position coordinates of the target cropping frame for the video frames in the shot segment other than the target video frames.
For example, the position coordinates of each target video frame may be processed by Gaussian smoothing filtering to obtain the smoothed position coordinates of each target video frame. For instance, if the position coordinates of the target cropping frame in the target video frames are $y_0, y_1, y_2, \dots, y_{n-1}$, Gaussian smoothing filtering with a window of $2M+1$ ($M$ being a positive integer) may be used, where the weight at an offset $\Delta y$ from the window center follows the Gaussian distribution below:
Figure PCTCN2021128711-appb-000001
The sliding-window convolution kernel of length $2M+1$ is $[G(-M), G(-M+1), \dots, G(0), \dots, G(M-1), G(M)]$. For the meaning of the parameters in the Gaussian distribution formula, reference may be made to the related art, and details are not repeated here. Of course, in other possible implementations, the position coordinates of each target video frame may also be processed by other smoothing filters, such as mean filtering, which is not limited in the embodiments of the present disclosure.
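A minimal sketch of the Gaussian smoothing step is shown below. It assumes the usual zero-mean Gaussian weight $G(k) \propto \exp(-k^2 / (2\sigma^2))$, normalises the kernel so the weights sum to 1, and replicates the first and last coordinates at the borders. The value of $\sigma$, the normalisation, and the border handling are assumptions, since the formula itself appears only as an image in the publication.

```python
import math
from typing import List, Sequence


def gaussian_kernel(m: int, sigma: float) -> List[float]:
    """Normalised kernel [G(-m), ..., G(0), ..., G(m)] of length 2m + 1."""
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-m, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_positions(ys: Sequence[float], m: int = 2, sigma: float = 1.0) -> List[float]:
    """Slide the Gaussian window over the key-frame coordinates.

    Indices that fall outside the sequence are clamped to the nearest
    valid index (edge replication).
    """
    kernel = gaussian_kernel(m, sigma)
    n = len(ys)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, weight in zip(range(-m, m + 1), kernel):
            j = min(max(i + k, 0), n - 1)
            acc += weight * ys[j]
        smoothed.append(acc)
    return smoothed


smoothed = smooth_positions([100, 140, 120, 130], m=1, sigma=1.0)
```

The smoothed coordinates no longer pass exactly through the detected positions; that is the intended trade-off, since a smaller spread between key positions gives the subsequent interpolation less movement to reproduce.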
Interpolation can then be performed on the smoothed position coordinates of the target video frames to obtain the position coordinates of the target cropping frame for the video frames in the shot segment other than the target video frames. Because the position coordinates of the target cropping frame in the target video frames are smoothed before interpolation, the positional offsets between the target cropping frames of the target video frames are reduced. For example, referring to FIG. 4, smoothing the initial position coordinates Y1, Y2, Y3, and Y4 of the target cropping frame in the target video frames yields the smoothed position coordinates Y1', Y2', Y3', and Y4'. As FIG. 4 shows, the positional offsets between the smoothed position coordinates are smaller than those between the initial position coordinates. This reduces the positional offsets between the interpolated target cropping frames of the other video frames, further improves the smoothness of the cropping path, and reduces shaking of the cropped video picture.
Based on the same inventive concept, the present disclosure further provides a video cropping apparatus, which may be implemented as part or all of an electronic device by software, hardware, or a combination of the two. Referring to FIG. 5, the video cropping apparatus 500 includes:
an acquisition module 501, configured to acquire the original video to be cropped and the size information of the target cropping frame;
a first determining module 502, configured to perform shot detection on the original video to determine the shot segments in the original video;
a second determining module 503, configured to determine, for each shot segment, the cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment, where the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the movement path of the position of the target cropping frame, along the width direction or the length direction of the video frame, across all video frames included in the shot segment; and
a cropping module 504, configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, to obtain the cropped target video.
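The division of the apparatus into four modules can be pictured as four pluggable callables wired together, as in the sketch below. The class name, field names, and call signatures are purely illustrative assumptions; the disclosure allows the modules to be realised in software, hardware, or a combination of both.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Shot = Tuple[int, int]  # (start, end) frame indices, end exclusive
Size = Tuple[int, int]  # (width, height) of the target cropping frame


@dataclass
class VideoCroppingApparatus:
    acquire: Callable[[], Tuple[list, Size]]                       # module 501
    detect_shots: Callable[[list], List[Shot]]                     # module 502
    plan_path: Callable[[list, Size], List[float]]                 # module 503
    crop: Callable[[list, Size, List[Shot], List[List[float]]], object]  # module 504

    def run(self):
        """Run the four modules in the order of steps 201 to 204."""
        frames, crop_size = self.acquire()
        shots = self.detect_shots(frames)
        # One cropping path per shot segment, planned independently.
        paths = [self.plan_path(frames[start:end], crop_size) for start, end in shots]
        return self.crop(frames, crop_size, shots, paths)
```

Keeping each module behind a plain callable makes it straightforward to swap, for example, a frame-difference shot detector for a trained model without touching the other three modules.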
Optionally, the second determining module 503 is configured to:
for each target video frame, determine a target cropping frame that can contain the main content of the target video frame, and determine the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame;
perform interpolation on the position coordinates of the target video frames to obtain the position coordinates of the target cropping frame for the video frames in the shot segment other than the target video frames; and
determine the cropping path corresponding to the shot segment according to the position coordinates corresponding to each video frame in the shot segment.
Optionally, the second determining module 503 is configured to:
if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the width direction, determine the position coordinates of the target cropping frame in the width direction of the target video frame; and
if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the length direction, determine the position coordinates of the target cropping frame in the length direction of the target video frame.
Optionally, the second determining module 503 is configured to:
determine a target function according to the position coordinates of each target video frame, wherein the target function comprises a plurality of piecewise functions, each piecewise function is determined according to the position coordinates of two adjacent target video frames, each piecewise function and the target function are cubic functions whose independent variable is time and whose dependent variable is the position coordinate, and the first derivative and the second derivative of the target function are continuous in time;
determine the position coordinates of the target cropping frame for the other video frames according to the target function and the times corresponding to the other video frames in the shot segment, that is, the video frames other than the target video frames.
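A piecewise cubic function of time with continuous first and second derivatives is what is usually called a cubic spline. The sketch below fits one through the key-frame positions using the standard tridiagonal algorithm. The "natural" boundary condition (zero second derivative at both ends) is an assumption, since the text does not specify boundary conditions, and the names `natural_cubic_spline`, `ts` and `xs` are illustrative.

```python
from bisect import bisect_right


def natural_cubic_spline(ts, xs):
    """Return f(t) interpolating the points (ts[i], xs[i]).

    ts must be strictly increasing. Each interval [ts[j], ts[j+1]] gets its
    own cubic; the first and second derivatives match at interior knots.
    """
    n = len(ts) - 1
    if n < 1:
        raise ValueError("need at least two key frames")
    h = [ts[i + 1] - ts[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (xs[i + 1] - xs[i]) / h[i] - 3.0 * (xs[i] - xs[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the quadratic coefficients.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (ts[i + 1] - ts[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution; c[0] = c[n] = 0 is the natural boundary condition.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (xs[j + 1] - xs[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(t):
        # Locate the interval containing t, clamped to the valid range.
        j = min(max(bisect_right(ts, t) - 1, 0), n - 1)
        dt = t - ts[j]
        return xs[j] + b[j] * dt + c[j] * dt * dt + d[j] * dt ** 3

    return evaluate


# Key frames at t = 0, 1, 2, 3 s with crop offsets in pixels (made-up values).
path = natural_cubic_spline([0.0, 1.0, 2.0, 3.0], [100.0, 140.0, 120.0, 160.0])
offsets = [path(k / 4) for k in range(13)]  # one offset per intermediate frame
```

The returned function can be evaluated at the timestamp of every non-key frame to fill in the cropping path for the whole segment.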
Optionally, the second determining module 503 is configured to:
perform smoothing filtering on the position coordinates of each target video frame to obtain smoothed position coordinates of each target video frame;
perform interpolation according to the smoothed position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment, that is, the video frames other than the target video frames.
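The smoothing filter is not named in the text. As a minimal sketch, a centered moving average is used below to damp frame-to-frame jitter in the key-frame positions before interpolation; the choice of filter, the function name `smooth_positions`, and the `radius` parameter are all assumptions.

```python
def smooth_positions(positions, radius=1):
    """Centered moving average; the window shrinks at the sequence ends."""
    n = len(positions)
    smoothed = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = positions[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A jittery sequence of crop offsets becomes noticeably flatter.
print(smooth_positions([0, 30, 0, 30, 0], radius=1))
```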
Optionally, the first determining module 502 is configured to:
perform shot detection on the original video by a frame difference method to determine the shot segments in the original video; or input the original video into a pre-trained shot detection model and determine the shot segments in the original video according to the output of the shot detection model, wherein the shot detection model is trained on sample videos and the sample shot segments corresponding to the sample videos.
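The frame difference method can be illustrated with a small sketch: consecutive frames are compared pixel by pixel, and a large mean absolute difference is treated as a shot boundary. Here frames are flat lists of grayscale values, and the threshold of 30 is an arbitrary illustrative value; neither is specified in the disclosure.

```python
def detect_shots(frames, threshold=30.0):
    """Split a frame sequence into shots as (start, end) index pairs, end exclusive.

    frames: list of equally sized flat lists of grayscale pixel values.
    A boundary is placed before frame i when the mean absolute difference
    between frame i-1 and frame i exceeds the (assumed) threshold.
    """
    if not frames:
        return []
    boundaries = [0]
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            boundaries.append(i)
    boundaries.append(len(frames))
    return list(zip(boundaries[:-1], boundaries[1:]))


# Two dark frames followed by three bright frames: one cut before index 2.
frames = [[10] * 4, [12] * 4, [200] * 4, [198] * 4, [199] * 4]
print(detect_shots(frames))
```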
Optionally, the cropping module 504 is configured to:
for each shot segment, crop each video frame in the shot segment according to the size information of the target cropping frame and the cropping path corresponding to the shot segment;
concatenate the cropped video frames in chronological order to obtain the cropped target video.
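The per-frame cropping and concatenation can be sketched as follows. Frames are modeled as lists of pixel rows, and the cropping path is a list of offsets, one per frame, along the chosen direction. The sketch assumes the cropping frame spans the full frame in the other direction; the function names and the clamping of offsets to the frame bounds are illustrative choices, not taken from the disclosure.

```python
def crop_frame(frame, offset, box_w, box_h, direction):
    """Cut a box_w x box_h window out of one frame (a list of pixel rows)."""
    if direction == "width":
        # Clamp so the window never leaves the frame horizontally.
        left = min(max(int(round(offset)), 0), len(frame[0]) - box_w)
        return [row[left:left + box_w] for row in frame[:box_h]]
    # Otherwise slide along the length (vertical) direction.
    top = min(max(int(round(offset)), 0), len(frame) - box_h)
    return [row[:box_w] for row in frame[top:top + box_h]]


def crop_video(shots, paths, box_w, box_h, direction):
    """Crop every frame of every shot, then concatenate in time order."""
    output = []
    for frames, path in zip(shots, paths):
        for frame, offset in zip(frames, path):
            output.append(crop_frame(frame, offset, box_w, box_h, direction))
    return output


# Two one-frame shots of a 4x2 image, cropped to 2x2 at different offsets.
frame = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(crop_video([[frame], [frame]], [[0], [2]], 2, 2, "width"))
```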
Regarding the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the method embodiments and is not repeated here.
Based on the same inventive concept, an embodiment of the present disclosure further provides a computer-readable medium storing a computer program which, when executed by a processing device, implements the steps of any of the above video cropping methods.
Based on the same inventive concept, an embodiment of the present disclosure further provides an electronic device, including:
a storage device on which a computer program is stored;
a processing device configured to execute the computer program in the storage device to implement the steps of any of the above video cropping methods.
Referring now to FIG. 6, a schematic structural diagram of an electronic device 600 suitable for implementing embodiments of the present disclosure is shown. Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 607 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 608 including, for example, a magnetic tape and a hard disk; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 6 shows the electronic device 600 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 609, or installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to electrical wire, optical cable, RF (radio frequency), or any suitable combination of the foregoing.
In some embodiments, communication may use any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain an original video to be cropped and size information of a target cropping frame; perform shot detection on the original video to determine the shot segments in the original video; for each shot segment, determine a cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment, where the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the target cropping frame moves, in the width direction or the length direction of the video frames, across all video frames included in the shot segment; and crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, so as to obtain a cropped target video.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a module does not constitute a limitation on the module itself.
The functions described above may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, Example 1 provides a video cropping method, including:
obtaining an original video to be cropped and size information of a target cropping frame;
performing shot detection on the original video to determine the shot segments in the original video;
for each shot segment, determining a cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment, where the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the target cropping frame moves, in the width direction or the length direction of the video frames, across all video frames included in the shot segment;
cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, so as to obtain a cropped target video.
According to one or more embodiments of the present disclosure, Example 2 provides the method of Example 1, wherein determining the cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment includes:
for each target video frame, determining a target cropping frame that can contain the main content of the target video frame, and determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame;
performing interpolation according to the position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment, that is, the video frames other than the target video frames;
determining the cropping path corresponding to the shot segment according to the position coordinates corresponding to each video frame in the shot segment.
According to one or more embodiments of the present disclosure, Example 3 provides the method of Example 2, wherein determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame includes:
if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the width direction, determining the position coordinates of the target cropping frame in the width direction of the target video frame;
if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the length direction, determining the position coordinates of the target cropping frame in the length direction of the target video frame.
According to one or more embodiments of the present disclosure, Example 4 provides the method of Example 2, wherein performing interpolation according to the position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment, includes:
determining a target function according to the position coordinates of each target video frame, wherein the target function comprises a plurality of piecewise functions, each piecewise function is determined according to the position coordinates of two adjacent target video frames, each piecewise function and the target function are cubic functions whose independent variable is time and whose dependent variable is the position coordinate, and the first derivative and the second derivative of the target function are continuous in time;
determining the position coordinates of the target cropping frame for the other video frames according to the target function and the times corresponding to the other video frames in the shot segment, that is, the video frames other than the target video frames.
According to one or more embodiments of the present disclosure, Example 5 provides the method of any one of Examples 2-4, wherein performing interpolation according to the position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment, includes:
performing smoothing filtering on the position coordinates of each target video frame to obtain smoothed position coordinates of each target video frame;
performing interpolation according to the smoothed position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment, that is, the video frames other than the target video frames.
According to one or more embodiments of the present disclosure, Example 6 provides the method of any one of Examples 1-4, wherein performing shot detection on the original video to determine the shot segments in the original video includes:
performing shot detection on the original video by a frame difference method to determine the shot segments in the original video; or inputting the original video into a pre-trained shot detection model and determining the shot segments in the original video according to the output of the shot detection model, wherein the shot detection model is trained on sample videos and the sample shot segments corresponding to the sample videos.
According to one or more embodiments of the present disclosure, Example 7 provides the method of any one of Examples 1-4, wherein cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, so as to obtain the cropped target video, includes:
for each shot segment, cropping each video frame in the shot segment according to the size information of the target cropping frame and the cropping path corresponding to the shot segment;
concatenating the cropped video frames in chronological order to obtain the cropped target video.
According to one or more embodiments of the present disclosure, Example 8 provides a video cropping apparatus, the apparatus including:
an obtaining module, configured to obtain an original video to be cropped and size information of a target cropping frame;
a first determining module, configured to perform shot detection on the original video to determine the shot segments in the original video;
a second determining module, configured to, for each shot segment, determine a cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment, where the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the target cropping frame moves, in the width direction or the length direction of the video frames, across all video frames included in the shot segment;
a cropping module, configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, so as to obtain a cropped target video.
According to one or more embodiments of the present disclosure, Example 9 provides the apparatus of Example 8, wherein the second determining module is configured to:
for each target video frame, determine a target cropping frame that can contain the main content of the target video frame, and determine the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame;
perform interpolation according to the position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment, that is, the video frames other than the target video frames;
determine the cropping path corresponding to the shot segment according to the position coordinates corresponding to each video frame in the shot segment.
According to one or more embodiments of the present disclosure, Example 10 provides the apparatus of Example 9, wherein the second determining module is configured to:
if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the width direction, determine the position coordinates of the target cropping frame in the width direction of the target video frame;
if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the length direction, determine the position coordinates of the target cropping frame in the length direction of the target video frame.
According to one or more embodiments of the present disclosure, Example 11 provides the apparatus of Example 9, wherein the second determining module is configured to:
determine a target function according to the position coordinates of each target video frame, wherein the target function comprises a plurality of piecewise functions, each piecewise function is determined according to the position coordinates of two adjacent target video frames, each piecewise function and the target function are cubic functions whose independent variable is time and whose dependent variable is the position coordinate, and the first derivative and the second derivative of the target function are continuous in time;
determine the position coordinates of the target cropping frame for the other video frames according to the target function and the times corresponding to the other video frames in the shot segment, that is, the video frames other than the target video frames.
According to one or more embodiments of the present disclosure, Example 12 provides the apparatus of any one of Examples 9-11, wherein the second determining module is configured to:
perform smoothing filtering on the position coordinates of each target video frame to obtain smoothed position coordinates of each target video frame;
perform interpolation according to the smoothed position coordinates of each target video frame, so as to obtain the position coordinates of the target cropping frame for the other video frames in the shot segment, that is, the video frames other than the target video frames.
According to one or more embodiments of the present disclosure, Example 13 provides the apparatus of any one of Examples 8-11, wherein the first determining module is configured to:
perform shot detection on the original video by a frame difference method to determine the shot segments in the original video; or input the original video into a pre-trained shot detection model and determine the shot segments in the original video according to the output of the shot detection model, wherein the shot detection model is trained on sample videos and the sample shot segments corresponding to the sample videos.
According to one or more embodiments of the present disclosure, Example 14 provides the apparatus of any one of Examples 8-11, wherein the cropping module is configured to:
for each shot segment, crop each video frame in the shot segment according to the size information of the target cropping frame and the cropping path corresponding to the shot segment;
splice the cropped video frames in chronological order to obtain the cropped target video.
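As a rough sketch of these two steps, assume each frame is a list of pixel rows, each shot is a list of frames already in chronological order, and each cropping path holds one offset per frame. The helper names `crop_frame` and `crop_video` are illustrative only:

```python
def crop_frame(frame, offset, crop_w, crop_h, along_width=True):
    """Cut a crop_w x crop_h window out of a frame stored as a list of pixel rows.

    The offset moves the window along the width direction (columns) or,
    when along_width is False, along the other direction (rows).
    """
    left, top = (offset, 0) if along_width else (0, offset)
    return [row[left:left + crop_w] for row in frame[top:top + crop_h]]


def crop_video(shots, paths, crop_w, crop_h, along_width=True):
    """Crop every frame of every shot along its path and join the results in time order."""
    output = []
    for frames, path in zip(shots, paths):
        if len(frames) != len(path):
            raise ValueError("each cropping path needs one offset per frame")
        for frame, offset in zip(frames, path):
            # Offsets produced by interpolation may be fractional; snap to a pixel.
            output.append(crop_frame(frame, int(round(offset)), crop_w, crop_h, along_width))
    return output
```

Because the shots are processed in order and the frames within a shot are kept in order, appending to a single list is enough to realise the chronological splicing.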
According to one or more embodiments of the present disclosure, Example 15 provides a computer-readable medium storing a computer program that, when executed by a processing device, implements the steps of the method of any one of Examples 1 to 7.
According to one or more embodiments of the present disclosure, Example 16 provides an electronic device, comprising:
a storage device on which a computer program is stored;
a processing device configured to execute the computer program in the storage device to implement the steps of the method of any one of Examples 1 to 7.
The above description is merely a description of preferred embodiments of the present disclosure and of the technical principles employed. Those skilled in the art should understand that the scope of the present disclosure is not limited to technical solutions formed by the specific combinations of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions that are disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims. As for the apparatus in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the method embodiments and is not repeated here.

Claims (11)

  1. A video cropping method, the method comprising:
    obtaining an original video to be cropped and size information of a target cropping frame;
    performing shot detection on the original video to determine shot segments in the original video;
    for each shot segment, determining a cropping path corresponding to the shot segment according to the main content of target video frames in the shot segment, wherein the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the position of the target cropping frame moves, in the width direction or the length direction of the video frames, across all video frames included in the shot segment;
    cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, to obtain a cropped target video.
  2. The method according to claim 1, wherein determining the cropping path corresponding to the shot segment according to the main content of the target video frames in the shot segment comprises:
    for each target video frame, determining a target cropping frame that can contain the main content in the target video frame, and determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame;
    performing interpolation according to the position coordinates of each target video frame to obtain the position coordinates of the target cropping frame for the video frames in the shot segment other than the target video frames;
    determining the cropping path corresponding to the shot segment according to the position coordinates corresponding to each video frame in the shot segment.
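The first step of claim 2 can be illustrated with a small sketch. It assumes that the main content has already been located as an interval `[subject_min, subject_max]` along the cropping direction (how it is located is outside this sketch), and it assumes one plausible placement rule: centre the cropping frame on the content and clamp it to the video frame. The function name and this rule are illustrative and are not taken from the claims:

```python
def crop_offset_for_subject(subject_min, subject_max, crop_len, frame_len):
    """Offset of a cropping window of length crop_len along one axis.

    Assumption: the window is centred on the main content and then clamped
    so that it never extends beyond the video frame.
    """
    if crop_len > frame_len:
        raise ValueError("cropping window cannot be larger than the frame")
    centre = (subject_min + subject_max) / 2
    offset = centre - crop_len / 2
    return min(max(offset, 0), frame_len - crop_len)
```

Clamping keeps the coordinate valid when the main content sits near an edge of the picture, in which case the content is no longer centred but is still contained as long as it is not wider than the window.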
  3. The method according to claim 2, wherein determining the position coordinates of the target cropping frame in the width direction or the length direction of the target video frame comprises:
    if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the width direction, determining the position coordinates of the target cropping frame in the width direction of the target video frame;
    if it is determined, according to the size information of the original video and the size information of the target cropping frame, that cropping is to be performed along the length direction, determining the position coordinates of the target cropping frame in the length direction of the target video frame.
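Claim 3 does not state how the direction is derived from the two sizes. One plausible reading, shown here only as a sketch, compares aspect ratios: if the original video is relatively wider than the target cropping frame, the excess lies in the width direction, otherwise in the length direction. The function name, the returned labels, and the reading of "length" as the dimension perpendicular to the width are assumptions:

```python
def crop_direction(orig_w, orig_h, target_w, target_h):
    """Decide along which direction the cropping frame has to move.

    Assumption: the decision is an aspect-ratio comparison. The ratios
    orig_w / orig_h and target_w / target_h are compared by
    cross-multiplication, which keeps integer inputs free of rounding error.
    """
    lhs = orig_w * target_h
    rhs = target_w * orig_h
    if lhs > rhs:
        return "width"   # original is relatively wider than the target
    if lhs < rhs:
        return "length"  # original is relatively longer than the target
    return "none"        # same aspect ratio, nothing has to be cut off
```

For instance, a 1920x1080 landscape video and a 9:16 portrait target give "width", so only the horizontal position of the cropping frame needs to be tracked.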
  4. The method according to claim 2, wherein performing interpolation according to the position coordinates of each target video frame to obtain the position coordinates of the target cropping frame for the video frames in the shot segment other than the target video frames comprises:
    determining an objective function according to the position coordinates of each target video frame, wherein the objective function comprises a plurality of piecewise functions, each piecewise function is determined according to the position coordinates of two adjacent target video frames, each piecewise function and the objective function are cubic equations with time as the independent variable and the position coordinate as the dependent variable, and the first derivative and the second derivative of the objective function are continuous in time;
    determining the position coordinates of the target cropping frame for the other video frames according to the objective function and the times corresponding to the video frames in the shot segment other than the target video frames.
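A piecewise cubic function with continuous first and second derivatives is what is usually called a cubic spline. The claim does not fix the boundary conditions, so the sketch below assumes a natural spline (zero second derivative at both ends); the function name `natural_cubic_spline` is illustrative. Each segment has the form

$$S_j(t) = a_j + b_j (t - t_j) + c_j (t - t_j)^2 + d_j (t - t_j)^3, \qquad t_j \le t \le t_{j+1},$$

where $a_j$ is the position coordinate of the $j$-th target video frame.

```python
from bisect import bisect_right


def natural_cubic_spline(times, positions):
    """Build a natural cubic spline through (times[i], positions[i]).

    Returns a function that evaluates the spline at a given time.
    Assumption: natural boundary conditions (second derivative is zero at
    the first and last key frame); times must be strictly increasing.
    """
    n = len(times) - 1
    if n < 1 or len(positions) != len(times):
        raise ValueError("need at least two key frames with one position each")
    h = [times[i + 1] - times[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("times must be strictly increasing")

    # Right-hand side of the tridiagonal system for the c coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (3 * (positions[i + 1] - positions[i]) / h[i]
                    - 3 * (positions[i] - positions[i - 1]) / h[i - 1])

    # Forward elimination.
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        pivot = 2 * (times[i + 1] - times[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / pivot
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / pivot

    # Back substitution gives the per-segment coefficients b, c, d.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = ((positions[j + 1] - positions[j]) / h[j]
                - h[j] * (c[j + 1] + 2 * c[j]) / 3)
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(t):
        # Pick the segment containing t; clamp to the first or last segment.
        j = min(max(bisect_right(times, t) - 1, 0), n - 1)
        dt = t - times[j]
        return positions[j] + b[j] * dt + c[j] * dt ** 2 + d[j] * dt ** 3

    return evaluate
```

Calling the returned function with the timestamp of any frame between two target video frames yields that frame's interpolated position coordinate, so the resulting cropping path has no abrupt changes in speed or acceleration at the key frames.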
  5. The method according to any one of claims 2-4, wherein performing interpolation according to the position coordinates of each target video frame to obtain the position coordinates of the target cropping frame for the video frames in the shot segment other than the target video frames comprises:
    performing smoothing filtering on the position coordinates of each target video frame to obtain smoothed position coordinates of each target video frame;
    performing interpolation according to the smoothed position coordinates of each target video frame to obtain the position coordinates of the target cropping frame for the video frames in the shot segment other than the target video frames.
  6. The method according to any one of claims 1-5, wherein performing shot detection on the original video to determine the shot segments in the original video comprises:
    performing shot detection on the original video by a frame difference method to determine the shot segments in the original video; or inputting the original video into a pre-trained shot detection model and determining the shot segments in the original video according to the output of the shot detection model, the shot detection model being trained on sample videos and the sample shot segments corresponding to the sample videos.
  7. The method according to any one of claims 1-6, wherein cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment to obtain the cropped target video comprises:
    for each shot segment, cropping each video frame in the shot segment according to the size information of the target cropping frame and the cropping path corresponding to the shot segment;
    splicing the cropped video frames in chronological order to obtain the cropped target video.
  8. A video cropping apparatus, the apparatus comprising:
    an obtaining module configured to obtain an original video to be cropped and size information of a target cropping frame;
    a first determining module configured to perform shot detection on the original video to determine shot segments in the original video;
    a second determining module configured to, for each shot segment, determine a cropping path corresponding to the shot segment according to the main content of target video frames in the shot segment, wherein the target video frames are some or all of the video frames in the shot segment, and the cropping path represents the path along which the position of the target cropping frame moves, in the width direction or the length direction of the video frames, across all video frames included in the shot segment;
    a cropping module configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each shot segment, to obtain a cropped target video.
  9. A computer-readable medium storing a computer program that, when executed by a processing device, implements the steps of the method of any one of claims 1-7.
  10. An electronic device, comprising:
    a storage device on which a computer program is stored;
    a processing device configured to execute the computer program in the storage device to implement the steps of the method of any one of claims 1-7.
  11. A computer program product comprising a computer program that, when executed by a processing device, implements the steps of the method of any one of claims 1-7.
PCT/CN2021/128711 2020-12-02 2021-11-04 Video clipping method and apparatus, storage medium, and electronic device WO2022116772A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011401472.XA CN112561839B (en) 2020-12-02 2020-12-02 Video clipping method and device, storage medium and electronic equipment
CN202011401472.X 2020-12-02

Publications (1)

Publication Number Publication Date
WO2022116772A1 true WO2022116772A1 (en) 2022-06-09

Family

ID=75047904

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/128711 WO2022116772A1 (en) 2020-12-02 2021-11-04 Video clipping method and apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN112561839B (en)
WO (1) WO2022116772A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115883818A (en) * 2022-11-29 2023-03-31 北京优酷科技有限公司 Automatic statistical method and device for video frame number, electronic equipment and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561839B (en) * 2020-12-02 2022-08-19 北京有竹居网络技术有限公司 Video clipping method and device, storage medium and electronic equipment
CN112995757B (en) * 2021-05-08 2021-08-10 腾讯科技(深圳)有限公司 Video clipping method and device
CN113840159A (en) * 2021-09-26 2021-12-24 北京沃东天骏信息技术有限公司 Video processing method, device, computer system and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030086496A1 (en) * 2001-09-25 2003-05-08 Hong-Jiang Zhang Content-based characterization of video frame sequences
CN102541494A (en) * 2010-12-30 2012-07-04 中国科学院声学研究所 Video size switching system and video size switching method facing display terminal
CN110708606A (en) * 2019-09-29 2020-01-17 新华智云科技有限公司 Method for intelligently editing video
CN111815645A (en) * 2020-06-23 2020-10-23 广州筷子信息科技有限公司 Method and system for cutting advertisement video picture
CN112561839A (en) * 2020-12-02 2021-03-26 北京有竹居网络技术有限公司 Video clipping method and device, storage medium and electronic equipment

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8531535B2 (en) * 2010-10-28 2013-09-10 Google Inc. Methods and systems for processing a video for stabilization and retargeting
US9041819B2 (en) * 2011-11-17 2015-05-26 Apple Inc. Method for stabilizing a digital video
US20130129192A1 (en) * 2011-11-17 2013-05-23 Sen Wang Range map determination for a video frame
US20130321675A1 (en) * 2012-05-31 2013-12-05 Apple Inc. Raw scaler with chromatic aberration correction
US9071756B2 (en) * 2012-12-11 2015-06-30 Facebook, Inc. Systems and methods for digital video stabilization via constraint-based rotation smoothing
JP6372696B2 (en) * 2014-10-14 2018-08-15 ソニー株式会社 Information processing apparatus, information processing method, and program
CN104537833B (en) * 2014-12-19 2017-03-29 深圳大学 A kind of accident detection method and system
CN104980665A (en) * 2015-06-29 2015-10-14 北京金山安全软件有限公司 Multi-video-clip merging method and multi-video-clip merging device
US10679584B1 (en) * 2017-11-01 2020-06-09 Gopro, Inc. Systems and methods for transforming presentation of visual content
CN108447021B (en) * 2018-03-19 2021-06-08 河北工业大学 Video scaling method based on block division and frame-by-frame optimization
CN108765317B (en) * 2018-05-08 2021-08-27 北京航空航天大学 Joint optimization method for space-time consistency and feature center EMD self-adaptive video stabilization
CN110933488A (en) * 2018-09-19 2020-03-27 传线网络科技(上海)有限公司 Video editing method and device
JP7138935B2 (en) * 2018-10-19 2022-09-20 株式会社朋栄 HDR wide color gamut video conversion device and HDR wide color gamut video conversion method for converting HDR video into SDR video
CN110189378B (en) * 2019-05-23 2022-03-04 北京奇艺世纪科技有限公司 Video processing method and device and electronic equipment
CN110809189B (en) * 2019-12-03 2022-01-04 北京字节跳动网络技术有限公司 Video playing method and device, electronic equipment and computer readable medium
CN111405200B (en) * 2020-03-31 2022-07-29 深圳市奥拓电子股份有限公司 Video shrinking device, method and system and electronic equipment thereof
CN111510630B (en) * 2020-04-24 2021-09-28 Oppo广东移动通信有限公司 Image processing method, device and storage medium
CN111586473B (en) * 2020-05-20 2023-01-17 北京字节跳动网络技术有限公司 Video clipping method, device, equipment and storage medium
CN111610193A (en) * 2020-05-29 2020-09-01 武汉至科检测技术有限公司 System and method for inspecting structural defects of subway tunnel segment by adopting multi-lens shooting
CN111695540B (en) * 2020-06-17 2023-05-30 北京字节跳动网络技术有限公司 Video frame identification method, video frame clipping method, video frame identification device, electronic equipment and medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115883818A (en) * 2022-11-29 2023-03-31 北京优酷科技有限公司 Automatic statistical method and device for video frame number, electronic equipment and storage medium
CN115883818B (en) * 2022-11-29 2023-09-19 北京优酷科技有限公司 Video frame number automatic counting method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112561839A (en) 2021-03-26
CN112561839B (en) 2022-08-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899806

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899806

Country of ref document: EP

Kind code of ref document: A1