CN110119652A - Shot segmentation method and device for video - Google Patents

Shot segmentation method and device for video

Publication number
CN110119652A
Authority
CN
China
Prior art keywords
frame
video
candidate frame
shot
shot start
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810118861.8A
Other languages
Chinese (zh)
Other versions
CN110119652B (en)
Inventor
许伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Shanghai Quan Toodou Cultural Communication Co Ltd
Application filed by Shanghai Quan Toodou Cultural Communication Co Ltd
Priority to CN201810118861.8A
Publication of CN110119652A
Application granted
Publication of CN110119652B
Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440236 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This disclosure relates to a shot segmentation method and device for video. The method comprises: when a transcoding instruction is detected, decoding the video to obtain a decoding result; determining candidate frames in the decoding result; extracting a feature of each candidate frame; determining the first shot start frame of the video; for each candidate frame, calculating the distance between the candidate frame and its previous shot start frame, based on the feature of the candidate frame and the feature of its previous shot start frame; and, when the distance is greater than a threshold, determining the candidate frame to be a shot start frame. The disclosure can perform shot segmentation after the decoding stage and before the encoding stage of the transcoding process, so no separate decoding operation is needed for shot segmentation and repeated decoding is avoided.

Description

Shot segmentation method and device for video
Technical field
This disclosure relates to the field of video technology, and in particular to a shot segmentation method and device for video.
Background
In the related art, shot segmentation of a video is performed with a standalone tool: the video must first be decoded and then analyzed frame by frame to determine the shot segmentation points in the video. Because the video transcoding process also requires decoding, the shot segmentation methods of the related art result in duplicated decoding operations.
Summary of the invention
In view of this, the present disclosure proposes a shot segmentation method and device for video.
According to one aspect of the disclosure, a shot segmentation method for video is provided, comprising:
when a transcoding instruction is detected, decoding the video to obtain a decoding result;
determining candidate frames in the decoding result;
extracting a feature of each candidate frame;
determining the first shot start frame of the video;
for each candidate frame, calculating the distance between the candidate frame and the previous shot start frame of the candidate frame, based on the feature of the candidate frame and the feature of the previous shot start frame of the candidate frame;
when the distance is greater than a threshold, determining the candidate frame to be a shot start frame.
In one possible implementation, determining candidate frames in the decoding result comprises:
determining the key frames in the decoding result to be candidate frames.
In one possible implementation, determining candidate frames in the decoding result comprises:
determining one candidate frame every N video frames in the decoding result, where N is a positive integer.
In one possible implementation, determining the first shot start frame of the video comprises:
determining the first video frame of the video to be the first shot start frame of the video.
In one possible implementation, determining the first shot start frame of the video comprises:
determining the first candidate frame of the video to be the first shot start frame of the video.
According to another aspect of the disclosure, a shot segmentation device for video is provided, comprising:
a decoding module, configured to decode the video to obtain a decoding result when a transcoding instruction is detected;
a first determining module, configured to determine candidate frames in the decoding result;
an extraction module, configured to extract a feature of each candidate frame;
a second determining module, configured to determine the first shot start frame of the video;
a computing module, configured to calculate, for each candidate frame, the distance between the candidate frame and the previous shot start frame of the candidate frame, based on the feature of the candidate frame and the feature of the previous shot start frame of the candidate frame;
a third determining module, configured to determine the candidate frame to be a shot start frame when the distance is greater than a threshold.
In one possible implementation, the first determining module is configured to:
determine the key frames in the decoding result to be candidate frames.
In one possible implementation, the first determining module is configured to:
determine one candidate frame every N video frames in the decoding result, where N is a positive integer.
In one possible implementation, the second determining module is configured to:
determine the first video frame of the video to be the first shot start frame of the video.
In one possible implementation, the second determining module is configured to:
determine the first candidate frame of the video to be the first shot start frame of the video.
According to another aspect of the disclosure, a shot segmentation device for video is provided, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the above method.
According to another aspect of the disclosure, a non-volatile computer-readable storage medium is provided, on which computer program instructions are stored, wherein the computer program instructions implement the above method when executed by a processor.
The shot segmentation method and device for video of the various aspects of the disclosure decode the video to obtain a decoding result when a transcoding instruction is detected, determine candidate frames in the decoding result, extract a feature of each candidate frame, and determine the first shot start frame of the video. For each candidate frame, the distance between the candidate frame and its previous shot start frame is calculated from the feature of the candidate frame and the feature of its previous shot start frame, and the candidate frame is determined to be a shot start frame when the distance is greater than a threshold. Shot segmentation can thus be performed after the decoding stage and before the encoding stage of the transcoding process, so no separate decoding operation is needed for shot segmentation and repeated decoding is avoided.
Other features and aspects of the disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features and aspects of the disclosure together with the specification, and serve to explain the principles of the disclosure.
Fig. 1 shows a flow chart of the shot segmentation method for video according to an embodiment of the disclosure.
Fig. 2 shows a schematic diagram of inserting a shot segmentation filter between the decoding and the encoding of the transcoding process, in the shot segmentation method for video according to an embodiment of the disclosure.
Fig. 3 shows a block diagram of the shot segmentation device for video according to an embodiment of the disclosure.
Fig. 4 is a block diagram of a device 800 for shot segmentation of video according to an exemplary embodiment.
Fig. 5 is a block diagram of a device 1900 for shot segmentation of video according to an exemplary embodiment.
Detailed description
Various exemplary embodiments, features and aspects of the disclosure are described in detail below with reference to the accompanying drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless specifically noted.
The word "exemplary" as used herein means "serving as an example, embodiment or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred over or superior to other embodiments.
In addition, numerous specific details are given in the detailed description below in order to better illustrate the disclosure. Those skilled in the art will understand that the disclosure can equally be practiced without certain of these details. In some instances, methods, means, elements and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the disclosure.
Fig. 1 shows a flow chart of the shot segmentation method for video according to an embodiment of the disclosure. The method can be applied in a server or in a terminal device, without limitation. As shown in Fig. 1, the method comprises steps S11 to S16.
In step S11, when a transcoding instruction is detected, the video is decoded to obtain a decoding result.
Here, decoding the video refers to the decoding performed as part of the transcoding process.
In this embodiment, shot segmentation is performed after the decoding stage and before the encoding stage of the transcoding process. The decoding operation of the transcoding process can therefore be reused, no separate decoding operation is needed for shot segmentation, and repeated decoding is avoided.
In one possible implementation, a shot segmentation filter can be inserted between the decoding and the encoding of the transcoding process, and shot segmentation can be performed by this filter. Fig. 2 shows a schematic diagram of inserting a shot segmentation filter between the decoding and the encoding of the transcoding process, in the shot segmentation method for video according to an embodiment of the disclosure. Steps S12 to S16 can be implemented by this shot segmentation filter.
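As an illustrative sketch only, and not the implementation of the disclosure, the filter arrangement can be pictured as a pass-through stage that observes each decoded frame before handing it to the encoder. The names `shot_filter`, `transcode`, `decode`, `encode` and `on_frame` are hypothetical, and frames are represented as plain Python objects.

```python
from typing import Any, Callable, Iterable, Iterator, List

Frame = Any  # placeholder type for a decoded frame (assumption)


def shot_filter(decoded_frames: Iterable[Frame],
                on_frame: Callable[[int, Frame], None]) -> Iterator[Frame]:
    """Pass decoded frames through unchanged while a callback inspects them."""
    for index, frame in enumerate(decoded_frames):
        on_frame(index, frame)  # shot analysis would hook in here
        yield frame             # the encoder still receives every frame


def transcode(packets: Iterable[Any],
              decode: Callable[[Any], Frame],
              encode: Callable[[Frame], Any],
              on_frame: Callable[[int, Frame], None]) -> List[Any]:
    """Decode once, run the filter, then encode; there is no second decode."""
    decoded = (decode(p) for p in packets)
    return [encode(f) for f in shot_filter(decoded, on_frame)]
```

Because the filter only observes frames, the transcoded output is the same as it would be without the filter; the analysis is a side effect of the single decoding pass.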
In step S12, candidate frames are determined in the decoding result.
In one possible implementation, the video frames of the video in the decoding result include candidate frames and non-candidate frames. In other words, in this implementation not every video frame is treated as a candidate frame, which reduces the computation required for shot segmentation and increases its speed.
In one possible implementation, determining candidate frames in the decoding result may include: determining the key frames in the decoding result to be candidate frames. In this implementation, because the key frames serve as candidate frames, shot start frames are determined from among the key frames of the video rather than from all frames of the video. This substantially reduces the computation and the time taken by shot segmentation while preserving the accuracy of shot segmentation.
In another possible implementation, determining candidate frames in the decoding result may include: determining one candidate frame every N video frames in the decoding result, where N is a positive integer. For example, N is equal to 9. By determining one candidate frame every N video frames, this implementation treats only a fraction of the video frames (roughly one in every N) as candidate frames and determines shot start frames only from that fraction, which substantially reduces the computation and the time taken by shot segmentation.
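The two candidate selection strategies above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, key frames are represented by a list of boolean flags, and "one candidate every N frames" is read as taking frame indices 0, N, 2N and so on (the source wording could also be read as skipping N frames between candidates).

```python
from typing import List, Sequence


def candidates_from_key_frames(is_key_frame: Sequence[bool]) -> List[int]:
    """Return the indices of frames flagged as key frames."""
    return [i for i, flag in enumerate(is_key_frame) if flag]


def candidates_every_n(frame_count: int, n: int) -> List[int]:
    """Return one candidate index per run of n frames: 0, n, 2n, ..."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return list(range(0, frame_count, n))
```

For a 20-frame clip with N equal to 9, `candidates_every_n(20, 9)` selects three of the twenty frames, so features are extracted for only those three.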
In step S13, a feature of each candidate frame is extracted.
In one possible implementation, extracting the feature of each candidate frame may include: extracting the gray values of each candidate frame.
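One way to turn gray values into a fixed-length feature is a normalized gray-level histogram. The sketch below is an assumption made for illustration: the source states only that gray values are extracted, so the histogram form, the bin count, the ITU-R BT.601 luma weights and the name `gray_histogram` are not taken from the disclosure.

```python
from typing import List, Sequence, Tuple


def gray_histogram(pixels: Sequence[Tuple[int, int, int]],
                   bins: int = 16) -> List[float]:
    """Build a normalized gray-level histogram from 8-bit RGB pixels."""
    if bins < 1:
        raise ValueError("bins must be a positive integer")
    hist = [0.0] * bins
    if not pixels:
        return hist
    for r, g, b in pixels:
        gray = 0.299 * r + 0.587 * g + 0.114 * b        # BT.601 luma (assumption)
        index = min(int(gray * bins / 256), bins - 1)   # clamp to the last bin
        hist[index] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]
```

Normalizing by the pixel count keeps the feature comparable across frames of different resolutions.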
In another possible implementation, extracting the feature of each candidate frame may include: extracting local features of each candidate frame.
As one example of this implementation, extracting the local features of each candidate frame may include: extracting SIFT (Scale-Invariant Feature Transform) features of each candidate frame.
As another example of this implementation, extracting the local features of each candidate frame may include: extracting SURF (Speeded Up Robust Features) features of each candidate frame.
As another example of this implementation, extracting the local features of each candidate frame may include: extracting KAZE features of each candidate frame.
As another example of this implementation, extracting the local features of each candidate frame may include: extracting VLAD (Vector of Locally Aggregated Descriptors) features of each candidate frame.
As another example of this implementation, extracting the local features of each candidate frame may include: extracting VLAT (Vector of Locally Aggregated Tensors) features of each candidate frame.
As another example of this implementation, extracting the local features of each candidate frame may include: extracting LLC (Locality-constrained Linear Coding) features of each candidate frame.
As another example of this implementation, extracting the local features of each candidate frame may include: extracting LSH (Locality Sensitive Hashing) features of each candidate frame.
It should be noted that although the extraction of local features of each candidate frame has been described by way of the implementations above, those skilled in the art will understand that the disclosure is not limited thereto. Those skilled in the art may flexibly choose the specific type of local feature to extract according to the needs of the actual application scenario and/or personal preference.
In another possible implementation, extracting the feature of each candidate frame may include: extracting deep features of each candidate frame. In this implementation, a deep feature can refer to a feature extracted by a deep learning network. The deep learning network can be ResNet, a VGG network, AlexNet or the like, without limitation.
In another possible implementation, extracting the feature of each candidate frame may include: extracting both local features and deep features of each candidate frame.
In step S14, the first shot start frame of the video is determined.
In one possible implementation, determining the first shot start frame of the video may include: determining the first video frame of the video to be the first shot start frame of the video.
In another possible implementation, determining the first shot start frame of the video may include: determining the first candidate frame of the video to be the first shot start frame of the video.
In step S15, for each candidate frame, the distance between the candidate frame and the previous shot start frame of the candidate frame is calculated from the feature of the candidate frame and the feature of the previous shot start frame of the candidate frame.
In one possible implementation, the Euclidean distance between the candidate frame and the previous shot start frame of the candidate frame can be calculated.
In this embodiment, the candidate frames can be examined one by one, in the order in which they occur in the video, to judge whether each is a shot start frame.
In step S16, when the distance is greater than a threshold, the candidate frame is determined to be a shot start frame.
In this embodiment, when the distance between a candidate frame and its previous shot start frame is greater than the threshold, the candidate frame can be considered to differ substantially from the previous shot start frame. The candidate frame can therefore be determined to be a shot start frame, that is, the start frame of a new shot. In this embodiment, a shot can refer to the video clip corresponding to that shot.
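Steps S14 to S16 can be sketched as a single pass over the candidate frames in playback order. The code below is an illustration under stated assumptions, not the disclosure's implementation: the names `euclidean` and `find_shot_starts` are hypothetical, features are plain lists of floats, the first candidate frame is taken as the first start frame (one of the two options described above), and the threshold value is left to the caller because the source does not give one.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_shot_starts(candidates: Sequence[Tuple[int, Sequence[float]]],
                     threshold: float) -> List[int]:
    """Return frame indices at which a new shot starts.

    `candidates` holds (frame_index, feature) pairs in playback order.
    """
    if not candidates:
        return []
    first_index, start_feature = candidates[0]
    starts = [first_index]                 # S14: first start frame
    for index, feature in candidates[1:]:
        # S15: compare with the most recent start frame, not the previous candidate
        if euclidean(feature, start_feature) > threshold:
            starts.append(index)           # S16: a new shot begins here
            start_feature = feature
    return starts
```

Comparing each candidate with the most recent start frame, rather than with its immediate predecessor, means a slow drift within a shot still triggers a new shot once the accumulated difference exceeds the threshold.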
In this embodiment, the video is decoded to obtain a decoding result when a transcoding instruction is detected, candidate frames are determined in the decoding result, a feature of each candidate frame is extracted, and the first shot start frame of the video is determined. For each candidate frame, the distance between the candidate frame and its previous shot start frame is calculated from the feature of the candidate frame and the feature of its previous shot start frame, and the candidate frame is determined to be a shot start frame when the distance is greater than a threshold. Shot segmentation can thus be performed after the decoding stage and before the encoding stage of the transcoding process, so no separate decoding operation is needed for shot segmentation and repeated decoding is avoided.
Fig. 3 shows a block diagram of the shot segmentation device for video according to an embodiment of the disclosure. As shown in Fig. 3, the device includes: a decoding module 31, configured to decode the video to obtain a decoding result when a transcoding instruction is detected; a first determining module 32, configured to determine candidate frames in the decoding result; an extraction module 33, configured to extract a feature of each candidate frame; a second determining module 34, configured to determine the first shot start frame of the video; a computing module 35, configured to calculate, for each candidate frame, the distance between the candidate frame and its previous shot start frame from the feature of the candidate frame and the feature of its previous shot start frame; and a third determining module 36, configured to determine the candidate frame to be a shot start frame when the distance is greater than a threshold.
In one possible implementation, the first determining module 32 is configured to: determine the key frames in the decoding result to be candidate frames.
In one possible implementation, the first determining module 32 is configured to: determine one candidate frame every N video frames in the decoding result, where N is a positive integer.
In one possible implementation, the second determining module 34 is configured to: determine the first video frame of the video to be the first shot start frame of the video.
In one possible implementation, the second determining module 34 is configured to: determine the first candidate frame of the video to be the first shot start frame of the video.
In this embodiment, the video is decoded to obtain a decoding result when a transcoding instruction is detected, candidate frames are determined in the decoding result, a feature of each candidate frame is extracted, and the first shot start frame of the video is determined. For each candidate frame, the distance between the candidate frame and its previous shot start frame is calculated from the feature of the candidate frame and the feature of its previous shot start frame, and the candidate frame is determined to be a shot start frame when the distance is greater than a threshold. Shot segmentation can thus be performed after the decoding stage and before the encoding stage of the transcoding process, so no separate decoding operation is needed for shot segmentation and repeated decoding is avoided.
Fig. 4 is a block diagram of a device 800 for shot segmentation of video according to an exemplary embodiment. For example, the device 800 can be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 4, the device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 802 may include one or more processors 820 to execute instructions so as to perform all or some of the steps of the method described above. In addition, the processing component 802 may include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and so on. The memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc.
The power supply component 806 provides power to the various components of the device 800. The power supply component 806 may include a power management system, one or more power sources, and other components associated with generating, managing and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC), which is configured to receive external audio signals when the device 800 is in an operating mode, such as a call mode, a recording mode or a voice recognition mode. The received audio signals can be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 further includes a loudspeaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which can be a keyboard, a click wheel, buttons, and so on. These buttons may include, but are not limited to: a home button, a volume button, a start button and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the device 800. For example, the sensor component 814 can detect the open/closed state of the device 800 and the relative positioning of components, such as the display and the keypad of the device 800. The sensor component 814 can also detect a change in position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 can also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 can also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In the exemplary embodiment, device 800 can be believed by one or more application specific integrated circuit (ASIC), number Number processor (DSP), digital signal processing appts (DSPD), programmable logic device (PLD), field programmable gate array (FPGA), controller, microcontroller, microprocessor or other electronic components are realized, for executing the above method.
In the exemplary embodiment, a kind of non-volatile computer readable storage medium storing program for executing is additionally provided, for example including calculating The memory 804 of machine program instruction, above-mentioned computer program instructions can be executed above-mentioned to complete by the processor 820 of device 800 Method.
Fig. 5 is a block diagram of a device 1900 for shot segmentation of video according to an exemplary embodiment. For example, the device 1900 may be provided as a server. Referring to Fig. 5, the device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions to perform the above method.
The device 1900 may also include a power supply module 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, for example the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the device 1900 to complete the above method.
The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
Computer program instructions for carrying out operations of the present disclosure may be assembly instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information of the computer-readable program instructions; the electronic circuitry can execute the computer-readable program instructions to implement aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other devices to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other devices implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
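As a minimal illustration of the point about block ordering, the Python sketch below runs two independent blocks in the drawn order, in reverse order, and in parallel, and obtains the same result each time. The functions `block_a` and `block_b` are hypothetical stand-ins that do not appear in the disclosure; the sketch only assumes that neither block depends on the other's output.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Sequence, Tuple


def block_a(frame: Sequence[int]) -> float:
    """Hypothetical block: mean gray level of a frame."""
    return sum(frame) / len(frame)


def block_b(frame: Sequence[int]) -> int:
    """Hypothetical block: gray-level range of a frame."""
    return max(frame) - min(frame)


def run_in_order(frame: Sequence[int]) -> Tuple[float, int]:
    # Blocks executed in the order drawn in a flowchart.
    a = block_a(frame)
    b = block_b(frame)
    return a, b


def run_reversed(frame: Sequence[int]) -> Tuple[float, int]:
    # Same blocks, executed in the opposite order.
    b = block_b(frame)
    a = block_a(frame)
    return a, b


def run_parallel(frame: Sequence[int]) -> Tuple[float, int]:
    # Same blocks, submitted to a thread pool so they may overlap in time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(block_a, frame)
        future_b = pool.submit(block_b, frame)
        return future_a.result(), future_b.result()
```

Reordering is only safe here because both blocks read the same input and write nothing shared; blocks with a data dependency would have to keep their drawn order.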
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (12)

1. A shot segmentation method for video, characterized by comprising:
upon detecting a transcoding instruction, decoding a video to obtain a decoding result;
determining candidate frames in the decoding result;
extracting a feature of each candidate frame;
determining the first shot start frame of the video;
for each candidate frame, calculating the distance between the candidate frame and the previous shot start frame of the candidate frame according to the feature of the candidate frame and the feature of the previous shot start frame of the candidate frame;
when the distance is greater than a threshold, determining the candidate frame as a shot start frame.
2. The method according to claim 1, characterized in that determining candidate frames in the decoding result comprises:
determining the key frames in the decoding result as candidate frames.
3. The method according to claim 1, characterized in that determining candidate frames in the decoding result comprises:
determining one candidate frame every N video frames in the decoding result, where N is a positive integer.
4. The method according to claim 1, characterized in that determining the first shot start frame of the video comprises:
determining the first video frame of the video as the first shot start frame of the video.
5. The method according to claim 1, characterized in that determining the first shot start frame of the video comprises:
determining the first candidate frame of the video as the first shot start frame of the video.
6. A shot segmentation device for video, characterized by comprising:
a decoding module, configured to decode a video upon detecting a transcoding instruction, to obtain a decoding result;
a first determining module, configured to determine candidate frames in the decoding result;
an extraction module, configured to extract a feature of each candidate frame;
a second determining module, configured to determine the first shot start frame of the video;
a computing module, configured to, for each candidate frame, calculate the distance between the candidate frame and the previous shot start frame of the candidate frame according to the feature of the candidate frame and the feature of the previous shot start frame of the candidate frame;
a third determining module, configured to determine the candidate frame as a shot start frame when the distance is greater than a threshold.
7. The device according to claim 6, characterized in that the first determining module is configured to:
determine the key frames in the decoding result as candidate frames.
8. The device according to claim 6, characterized in that the first determining module is configured to:
determine one candidate frame every N video frames in the decoding result, where N is a positive integer.
9. The device according to claim 6, characterized in that the second determining module is configured to:
determine the first video frame of the video as the first shot start frame of the video.
10. The device according to claim 6, characterized in that the second determining module is configured to:
determine the first candidate frame of the video as the first shot start frame of the video.
11. A shot segmentation device for video, characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method according to any one of claims 1 to 5.
12. A non-volatile computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 5.
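
The steps recited in claims 1, 3 and 5 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the claims do not fix the feature or the distance measure, so a normalized gray-level histogram and an L1 distance are assumed here, and the decoding step is replaced by an already-decoded list of frames. All names (`Frame`, `select_candidates`, `extract_feature`, `distance`, `segment_shots`) and the default `threshold` value are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: a decoded frame is a flat sequence of gray levels in 0..255.
Frame = Sequence[int]


def select_candidates(num_frames: int, n: int) -> List[int]:
    """Pick one candidate frame index every n video frames (claim 3)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return list(range(0, num_frames, n))


def extract_feature(frame: Frame, bins: int = 8) -> List[float]:
    """Assumed feature: a normalized gray-level histogram."""
    hist = [0.0] * bins
    for pixel in frame:
        hist[min(pixel * bins // 256, bins - 1)] += 1.0
    total = float(len(frame)) or 1.0
    return [count / total for count in hist]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed distance: L1 distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def segment_shots(frames: Sequence[Frame], n: int = 1,
                  threshold: float = 0.5) -> List[int]:
    """Return the indices of the shot start frames."""
    if not frames:
        return []
    candidates = select_candidates(len(frames), n)
    features: Dict[int, List[float]] = {
        i: extract_feature(frames[i]) for i in candidates
    }
    # The first candidate frame is taken as the first shot start frame (claim 5).
    shot_starts = [candidates[0]]
    previous_start_feature = features[candidates[0]]
    for i in candidates[1:]:
        # Compare each candidate with its previous shot start frame.
        if distance(features[i], previous_start_feature) > threshold:
            shot_starts.append(i)
            previous_start_feature = features[i]
    return shot_starts
```

Each candidate is compared against the most recent shot start frame rather than against its immediate predecessor, which follows the wording of claim 1. With `n = 1` the first candidate frame is also the first video frame, so the choices in claims 4 and 5 coincide.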
CN201810118861.8A 2018-02-06 2018-02-06 Video shot segmentation method and device Active CN110119652B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810118861.8A CN110119652B (en) 2018-02-06 2018-02-06 Video shot segmentation method and device

Publications (2)

Publication Number Publication Date
CN110119652A true CN110119652A (en) 2019-08-13
CN110119652B CN110119652B (en) 2021-11-12

Family

ID=67519945

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810118861.8A Active CN110119652B (en) 2018-02-06 2018-02-06 Video shot segmentation method and device

Country Status (1)

Country Link
CN (1) CN110119652B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007134986A (en) * 2005-11-10 2007-05-31 Kddi Corp Shot boundary detection device
CN101650722A (en) * 2009-06-01 2010-02-17 南京理工大学 Method based on audio/video combination for detecting highlight events in football video
CN101236604B (en) * 2008-01-11 2010-06-09 北京航空航天大学 Fast lens boundary detection method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
汪翔: "基于内容的视频检索关键技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
那幼超 等: "镜头分割在H.264视频压缩编码及视频服务器中的应用", 《电子器件》 *
郭永磊: "移动流媒体选播系统中媒体数据服务器的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Also Published As

Publication number Publication date
CN110119652B (en) 2021-11-12

Similar Documents

Publication Publication Date Title
TWI749423B (en) Image processing method and device, electronic equipment and computer readable storage medium
TWI766286B (en) Image processing method and image processing device, electronic device and computer-readable storage medium
TWI747325B (en) Target object matching method, target object matching device, electronic equipment and computer readable storage medium
TWI740309B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN109189987A (en) Video searching method and device
CN109089133A (en) Method for processing video frequency and device, electronic equipment and storage medium
CN109934275B (en) Image processing method and device, electronic equipment and storage medium
CN110458218B (en) Image classification method and device and classification network training method and device
CN109887515B (en) Audio processing method and device, electronic equipment and storage medium
CN108804980A (en) Switching detection method of video scene and device
CN109729435A (en) The extracting method and device of video clip
CN108985176A (en) image generating method and device
CN110532956B (en) Image processing method and device, electronic equipment and storage medium
TW202107337A (en) Face image recognition method and device, electronic device and storage medium
CN110209877A (en) Video analysis method and device
CN108924644A (en) Video clip extracting method and device
CN105354560A (en) Fingerprint identification method and device
CN110399934A (en) A kind of video classification methods, device and electronic equipment
CN110121106A (en) Video broadcasting method and device
TW202141352A (en) Character recognition method, electronic device and computer readable storage medium
CN110930984A (en) Voice processing method and device and electronic equipment
CN111259967A (en) Image classification and neural network training method, device, equipment and storage medium
CN109635142A (en) Image-selecting method and device, electronic equipment and storage medium
CN110070049A (en) Facial image recognition method and device, electronic equipment and storage medium
CN110781905A (en) Image detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200515

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 200241 room 1162, building 555, Dongchuan Road, Shanghai, Minhang District

Applicant before: SHANGHAI QUANTUDOU CULTURE COMMUNICATION Co.,Ltd.

GR01 Patent grant