WO2016098430A1 - Information processing method, video processing apparatus, and program
- Publication number: WO2016098430A1 (PCT/JP2015/078845)
- Authority: WIPO (PCT)
- Prior art keywords: unit, video, music, information, editing
Classifications
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes
- G06V20/47—Detecting features for summarising video content
- G10H1/368—Recording/reproducing of accompaniment displaying animated or moving pictures synchronized with the music or audio part
- G10H1/383—Chord detection and/or recognition
- G10H1/40—Rhythm
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G11B27/034—Electronic editing of digitised analogue information signals on discs
- G11B27/28—Indexing; Addressing; Timing or synchronising by using information signals recorded by the same method as the main recording
- H04N5/91—Television signal processing for recording
- G10H2210/021—Background music, e.g. for video sequences
- G10H2210/061—Musical analysis for extraction of musical phrases or temporal structure analysis of a musical piece
- G10H2210/071—Musical analysis for rhythm pattern analysis or rhythm style recognition
- G10H2210/076—Musical analysis for extraction of timing, tempo; Beat detection
- G10H2220/355—Geolocation input, e.g. provided by GPS
- G10H2220/391—Angle sensing using data from a gyroscope or other angular movement sensing device
- G10H2220/395—Acceleration sensing or accelerometer use
- G10H2220/455—Camera input, e.g. analyzing pictures from a video camera as control data
- G10H2240/085—Mood, i.e. detection or selection of a particular emotional content in a musical piece
Definitions
- the present disclosure relates to an information processing method, a video processing apparatus, and a program.
- Patent Document 1 discloses a technique of switching video data at the timing of each music phrase break, or at some of the phrase breaks when there are many.
- In that technique, the video is switched only at timings corresponding to the phrase breaks of the BGM.
- While switching the video at the phrase breaks of the BGM realizes natural transitions that match the BGM, with such switching alone it is difficult, for example, to excite the viewer's emotions.
- the present disclosure proposes a new and improved information processing method, video processing apparatus, and program capable of more effectively stimulating the viewer's emotions.
- According to the present disclosure, there is provided an information processing method including: analyzing beats of input music; extracting a plurality of unit videos from an input video; and generating, by a processor, editing information for switching the extracted unit videos according to the analyzed beats.
- According to the present disclosure, there is also provided a video processing apparatus including: a music analysis unit that analyzes beats of input music; an extraction unit that extracts a plurality of unit videos from an input video; and an editing unit that generates editing information for switching the unit videos extracted by the extraction unit according to the beats analyzed by the music analysis unit.
- According to the present disclosure, there is also provided a program for causing a computer to function as: a music analysis unit that analyzes beats of input music; an extraction unit that extracts a plurality of unit videos from an input video; and an editing unit that generates editing information for switching the unit videos extracted by the extraction unit according to the beats analyzed by the music analysis unit.
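The method, apparatus, and program above share the same three-stage structure: beat analysis, unit-video extraction, and editing-information generation. The following Python sketch illustrates that structure only; it is not the patented implementation, and all function names and data shapes are assumptions.

```python
from dataclasses import dataclass

@dataclass
class EditEntry:
    """One switch point: which unit video starts at which time in the BGM."""
    start_sec: float
    unit_video_id: int

def analyze_beats(beat_times):
    # A real system would run beat tracking on the audio signal; here we
    # assume beat timestamps (in seconds) are already available.
    return sorted(beat_times)

def extract_unit_videos(scene_ids):
    # Group consecutive frames sharing a scene-segment ID into unit videos.
    units, start = [], 0
    for i in range(1, len(scene_ids) + 1):
        if i == len(scene_ids) or scene_ids[i] != scene_ids[i - 1]:
            units.append((start, i))  # [start, end) frame range
            start = i
    return units

def generate_editing_info(beats, units):
    # Switch to the next unit video on every analyzed beat.
    return [EditEntry(t, i % len(units)) for i, t in enumerate(beats)]

beats = analyze_beats([0.0, 0.5, 1.0, 1.5])
units = extract_unit_videos([7, 7, 3, 3, 3, 9])
editing_info = generate_editing_info(beats, units)
print(len(units), len(editing_info))  # → 3 4
```

Switching on every beat is only one policy; as described later, switching could equally be tied to bars or to important sections of the music.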
- Elements having substantially the same functional configuration may be distinguished by appending different letters to the same reference numeral.
- a plurality of elements having substantially the same functional configuration are distinguished as necessary, such as the video processing devices 100A, 100B, and 100C.
- the video processing devices 100A, 100B, and 100C are simply referred to as the video processing device 100 when it is not necessary to distinguish them.
- FIG. 1 is a diagram for explaining an overview of a video processing apparatus 100 according to the present embodiment.
- FIG. 1 shows a user's operation using the video processing device 100 and a transition of processing performed in the video processing device 100, and time flows from left to right.
- the video processing apparatus 100 generates a summary video 50 from a video 10 taken by a user.
- the summary video 50 is a digest version video that summarizes video shot by the user.
- The video processing apparatus 100 generates the summary video 50 by switching and connecting sections adopted from the captured video 10 according to an adoption criterion, in time with the input music 30.
- the video includes image (still image / moving image) data and audio data.
- an outline of the generation process of the summary video 50 executed in the video processing apparatus 100 will be described.
- the video processing apparatus 100 performs recording processing for recording the shot video 10 and video analysis processing for analyzing the video 10.
- In the video analysis processing, the video processing device 100 analyzes user operations during shooting, performs image analysis such as smile detection, color detection, and motion vector detection, and analyzes the motion of subjects based on sensor information acquired during shooting.
- the video processing apparatus 100 performs an editing information generation process based on the video analysis result information 20 indicating the result of the video analysis process and the input music 30.
- the video processing apparatus 100 selects a unit video to be adopted as the summary video 50 from the video 10 by evaluating the video analysis result information 20 using an arbitrary adoption standard.
- the unit video is a series of video and is also called a shot.
- the video processing apparatus 100 generates editing information 40 for switching the adopted unit video according to the music 30.
- The editing information 40 is information that defines which section of which piece of music 30 is used as BGM (background music), and which unit video is switched to at which timing.
- the video processing apparatus 100 analyzes the music 30 based on the music theory, and generates editing information 40 so that the unit video is switched at a timing according to the melody, rhythm, beat, or excitement of the music 30.
- the video processing apparatus 100 performs a summary video generation process based on the editing information 40.
- the video processing apparatus 100 generates the summary video 50 by switching and connecting the unit video specified by the editing information 40 at the specified timing with the music 30 specified by the editing information 40 as BGM.
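The summary video generation step above can be pictured as turning the editing information into a cut list: each switch timing opens a slot in the BGM that one unit video fills. The sketch below is a minimal illustration under assumed data shapes, not the patented renderer.

```python
def build_cut_list(beat_times, unit_durations, music_end):
    """Assign each inter-switch interval one unit video, trimmed to fit.

    beat_times: switch timings (seconds) taken from the editing information.
    unit_durations: duration of each adopted unit video, in adoption order.
    Returns (bgm_start, bgm_end, unit_index) triples for a renderer.
    """
    cuts = []
    boundaries = list(beat_times) + [music_end]
    for i in range(len(beat_times)):
        slot = boundaries[i + 1] - boundaries[i]
        unit = i % len(unit_durations)
        # Trim the unit video to the slot; a real renderer would also need
        # an in-point within the unit video, which is omitted for brevity.
        length = min(slot, unit_durations[unit])
        cuts.append((boundaries[i], boundaries[i] + length, unit))
    return cuts

cuts = build_cut_list([0.0, 2.0, 4.0], [5.0, 1.5, 3.0], music_end=6.0)
print(cuts)  # → [(0.0, 2.0, 0), (2.0, 3.5, 1), (4.0, 6.0, 2)]
```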
- the video processing apparatus 100 can also reproduce, record, and transmit the summary video 50 to other devices.
- Note that the video processing apparatus 100 may also generate a summary video 50 from a plurality of videos 10, with the music 30 as BGM.
- FIG. 2 is a diagram for explaining the outline of the video analysis process executed in the video processing apparatus 100 according to the present embodiment.
- the video 10 is a video of the user's day
- the video analysis result information 20 includes information indicating video attributes such as a highlight 21 and a scene segment 22.
- the video 10 includes a video arriving at the sea, a video surfing, a video during a break, a video at lunch, a video at a hotel, and a video at sunset.
- The highlight 21 is a section showing a highlight of the video 10. Highlights include, for example, specific actions such as jumps and turns, smiles, lively scenes with cheering, and important scenes in specific events such as the cake cut or ring exchange at a wedding.
- the scene segment 22 is a section in which the video 10 is divided under a predetermined condition.
- For example, the scene segment 22 may be a section delimited based on color, grouping video of similar colors.
- The scene segment 22 may be a section delimited based on camera work, grouping video with the same camera work.
- The scene segment 22 may be a section delimited based on date and time, grouping video shot at the same or close dates and times.
- The scene segment 22 may be a section delimited based on place, grouping video shot at the same or nearby places.
- FIG. 2 shows an example in which the scene segment 22 is divided based on colors.
- the colors to be segmented may be, for example, white, blue, green, and red.
- the video processing apparatus 100 analyzes video attributes such as the highlight 21 and the scene segment 22 by video analysis processing.
- FIG. 3 is a diagram for explaining an overview of the editing information generation process and the summary video generation process executed in the video processing apparatus 100 according to the present embodiment.
- the video processing apparatus 100 extracts a series of videos having the same scene segment 22 as a unit video.
- the video processing apparatus 100 adopts the unit video according to a predetermined policy while preferentially adopting the highlight 21 from the unit videos.
- the video processing apparatus 100 may employ a unit video in which the scene segments 22 are dispersed in order to reduce visual bias.
- the video processing apparatus 100 may adopt unit videos in accordance with themes such as surfing and snowboard specified by the user.
- For example, when the theme is surfing, the video processing apparatus 100 may adopt unit videos so that the proportion of highlights such as turns during surfing becomes higher than that of scenes such as meals, and so that the proportion of segments with blue colors, places close to the sea, and time periods with high waves increases.
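The adoption policy described above combines several pressures: prefer highlights, prefer theme-matching segments, and disperse scene segments to reduce visual bias. A greedy scoring sketch of such a policy is shown below; the weights, fields, and surfing-like theme are illustrative assumptions, not values from the patent.

```python
def adopt_units(units, budget, theme_colors):
    """Greedily adopt `budget` unit videos under a hypothetical theme.

    units: dicts with 'id', 'color', 'is_highlight' (assumed attributes).
    Highlights outweigh ordinary scenes, theme-matching colors get a bonus,
    and already-adopted colors are penalised to disperse scene segments.
    """
    used_colors, adopted = set(), []
    remaining = list(units)
    for _ in range(min(budget, len(units))):
        def score(u):
            s = 2.0 if u["is_highlight"] else 0.0
            s += 1.0 if u["color"] in theme_colors else 0.0
            s -= 0.5 if u["color"] in used_colors else 0.0
            return s
        best = max(remaining, key=score)
        remaining.remove(best)
        used_colors.add(best["color"])
        adopted.append(best["id"])
    return adopted

units = [
    {"id": 0, "color": "white", "is_highlight": False},
    {"id": 1, "color": "blue", "is_highlight": True},
    {"id": 2, "color": "blue", "is_highlight": False},
    {"id": 3, "color": "green", "is_highlight": True},
]
print(adopt_units(units, budget=2, theme_colors={"blue"}))  # → [1, 3]
```

Note how the dispersion penalty makes the second pick the green highlight rather than the second blue shot, matching the "reduce visual bias" goal.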
- In addition, the video processing apparatus 100 analyzes the music 30 (BGM) based on music theory and sets the timings for switching the unit videos. Through these processes, the video processing apparatus 100 generates editing information 40 for switching the adopted unit videos at the set timings, and then generates the summary video 50 based on the editing information 40. Note that the unit videos included in the summary video 50 may or may not follow the time series of the original video.
- For example, the video processing apparatus 100 can be realized as a camera such as an action camera or a wearable camera. Such cameras often shoot continuously for a long time, and the composition tends to be monotonous, so it is desirable to edit the shot video into a summary video that gathers the highlights. However, since such cameras are often small or have a simple UI, it may be difficult to edit manually while checking the video. It is therefore desirable that an appropriate summary video be generated even from monotonous video shot continuously for a long time.
- Even for such video, the video processing apparatus 100 can generate a summary video in which attributes are dispersed and shots including highlights are switched in time with the BGM, along a theme specified by the user.
- the video processing apparatus 100 may be realized as a general video camera or the like, or may be realized as an information processing apparatus such as a PC (Personal Computer) separate from the camera or a server on a network.
- FIG. 4 is a block diagram illustrating an example of a logical configuration of the video processing apparatus 100 according to the present embodiment.
- the video processing apparatus 100 includes an input unit 110, a storage unit 120, an output unit 130, and a control unit 140.
- the input unit 110 has a function of accepting input of various information from the outside. As shown in FIG. 4, the input unit 110 includes a sensor unit 111, an operation unit 112, a video acquisition unit 113, and a music acquisition unit 114.
- the sensor unit 111 has a function of detecting the motion of the subject.
- the sensor unit 111 may include a gyro sensor, an acceleration sensor, and a gravity sensor.
- A subject is a target of shooting, and may include the photographer (user) himself or herself.
- the sensor unit 111 may include an arbitrary sensor such as a GPS (Global Positioning System), an infrared sensor, a proximity sensor, or a touch sensor.
- the sensor unit 111 outputs sensor information indicating the sensing result to the control unit 140.
- the sensor unit 111 may not be formed integrally with the video processing apparatus 100.
- the sensor unit 111 may acquire sensor information from a sensor attached to the subject via wired or wireless communication.
- the operation unit 112 has a function of accepting user operations.
- the operation unit 112 is realized by a button, a touch pad, and the like.
- the operation unit 112 can accept operations such as a zoom operation during shooting and a shooting mode setting operation.
- As the shooting mode for example, a normal mode for shooting a moving image and a simultaneous shooting mode for simultaneously shooting a moving image and a still image can be considered.
- the operation unit 112 can accept an editing instruction for designating a section to be included in the summary video during or after shooting.
- the operation unit 112 outputs operation information indicating the content of the user operation to the control unit 140.
- Video acquisition unit 113 has a function of acquiring video.
- the video acquisition unit 113 is realized as an imaging device and outputs data of a captured image (moving image / still image) that is a digital signal.
- the video acquisition unit 113 may further include a microphone that collects ambient sound and acquires sound data converted into a digital signal via an amplifier and an ADC (Analog Digital Converter). In that case, the video acquisition unit 113 outputs video data accompanied by surrounding sounds.
- the music acquisition unit 114 has a function of acquiring music data to be BGM of the summary video.
- the music acquisition unit 114 is realized as a wired or wireless interface, and acquires music data from another device such as a PC or a server.
- a wired interface for example, a connector compliant with a standard such as USB (Universal Serial Bus) can be cited.
- a wireless interface for example, a communication device compliant with a communication standard such as Bluetooth (registered trademark) or Wi-Fi (registered trademark) can be cited.
- the music acquisition unit 114 outputs the acquired music data to the control unit 140.
- Storage unit 120 The storage unit 120 has a function of storing various information.
- the storage unit 120 stores information output from the input unit 110 and information generated by the control unit 140.
- Output unit 130 has a function of outputting various information.
- the output unit 130 may have a function of playing back a summary video generated by a summary video generation unit 146 described later.
- the output unit 130 may include a display unit and a speaker.
- the output unit 130 may have a function of outputting editing information generated by the editing unit 144 described later.
- the output unit 130 may include a wired or wireless interface.
- Control unit 140 functions as an arithmetic processing device and a control device, and controls the overall operation in the video processing device 100 according to various programs. As shown in FIG. 4, the control unit 140 includes a music analysis unit 141, a video analysis unit 142, an extraction unit 143, an editing unit 144, an operation mode control unit 145, and a summary video generation unit 146.
- the music analysis unit 141 has a function of analyzing the content of the input music. Specifically, the music analysis unit 141 performs analysis based on music theory for the music data acquired by the music acquisition unit 114.
- For example, the music analysis unit 141 may analyze the structure of the music. Specifically, the music analysis unit 141 identifies portions that satisfy a predetermined condition by analyzing the structure of the music. For example, based on music theory, the music analysis unit 141 can identify components of the music such as an intro part, a verse (melody) part, a chorus (hook) part, an interlude part, a solo part, and an ending (outro) part.
- the melody portion may be divided into A melody (Melody A) and B melody (Melody B).
- Furthermore, the music analysis unit 141 may detect the chord progression in each identified component of the music, and may identify a particularly important portion (section) of the chorus based on the detected chord progression.
- the music analysis unit 141 may specify a section in which the vocal singing starts, a section having the highest vocal pitch, and the like as particularly important parts in the chorus part.
- the music analysis unit 141 may analyze the rhythm of music.
- For example, the music analysis unit 141 analyzes the beats and bars (measures) of the music.
- Among the analyzed beats, some coincide with the beginning of a bar.
- Hereinafter, a beat that coincides with the beginning of a bar is also referred to as a beat at the head of a bar.
- the music analysis unit 141 outputs music analysis result information indicating the analysis result to the editing unit 144.
- the music analysis result information includes, for example, information indicating the position of each component in the music data, particularly the position of an important part, the position of each beat, and the position of each bar.
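Given the beat and bar positions in the music analysis result information, the beats at the head of a bar can be singled out as candidate switch timings. The sketch below assumes a fixed meter and already-known beat timestamps; both the function name and the parameters are illustrative assumptions.

```python
def bar_head_beats(beat_times, beats_per_bar=4, first_head_index=0):
    """Return the beats that coincide with the beginning of a bar.

    Assumes a fixed meter; beats_per_bar and the index of the first
    bar-head beat would come from the bar analysis in practice.
    """
    return [t for i, t in enumerate(beat_times)
            if i >= first_head_index
            and (i - first_head_index) % beats_per_bar == 0]

beats = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
print(bar_head_beats(beats))  # → [0.0, 2.0]
```

Switching only at bar-head beats gives slower, steadier cuts than switching at every beat, which is one way the analysis result could drive different editing rhythms.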
- Video analysis unit 142 has a function of analyzing the content of the input video. Specifically, the video analysis unit 142 analyzes the content for the video data acquired by the video acquisition unit 113. Then, the video analysis unit 142 outputs video analysis result information indicating the analysis result of the video content to the extraction unit 143.
- the video analysis unit 142 detects a highlight based on information input from the input unit 110, and outputs information indicating the detected highlight in the video analysis result information.
- the video analysis unit 142 detects highlights related to subject motion, user operation, and face and smile.
- the video analysis unit 142 detects a predetermined motion of the subject based on the sensor information acquired by the sensor unit 111.
- the video analysis unit 142 can detect the movement of the subject such as jumping of the subject (jump), change of traveling direction (turn), running, acceleration, or deceleration based on the sensor information.
- the video analysis unit 142 may detect a predetermined motion of the subject by performing an image recognition process on the video data acquired by the video acquisition unit 113.
- the video analysis result information may include information indicating the detected motion of the subject and information indicating a section in which the motion is detected in the video data.
- the video analysis unit 142 detects a user operation based on the operation information acquired by the operation unit 112. For example, the video analysis unit 142 detects a predetermined operation such as a zoom operation or a shooting mode setting operation based on operation information acquired during shooting.
- the video analysis result information may include information indicating the detected user operation and information indicating a section in which the user operation is detected in the video data.
- the video analysis unit 142 detects an editing instruction based on operation information acquired during or after shooting.
- the video analysis result information may include information indicating a section designated as a section to be included in the summary video by the user.
- the video analysis unit 142 performs image recognition processing on the video data acquired by the video acquisition unit 113 to detect the face and smile of the subject.
- the video analysis result information may include information indicating a section and a region where the face and smile are detected in the video data, and the number of faces and smiles.
- For example, the video analysis unit 142 performs voice recognition processing on the video data acquired by the video acquisition unit 113 to detect sections in which cheering occurs.
- the video analysis result information may include information indicating the section in which the cheer is detected in the video data and the volume.
- the video analysis unit 142 detects an important scene in a specific event by performing an image recognition process on the video data acquired by the video acquisition unit 113.
- Important scenes include wedding cake cuts and ring exchange.
- the video analysis result information may include information indicating the section in which the important scene is detected in the video data and the importance.
- The video analysis unit 142 also detects information for scene segmentation based on the information input from the input unit 110, and outputs the detected scene-segment information in the video analysis result information.
- the video analysis unit 142 detects information for a scene segment related to color, camera work, date / time, and location.
- the video analysis unit 142 can detect the color of the video by performing image recognition processing on the video data acquired by the video acquisition unit 113. Specifically, the video analysis unit 142 analyzes YUV or RGB of the video and detects a color histogram for each frame or a plurality of frames. Then, the video analysis unit 142 detects a dominant color in each frame as the color of the frame.
- the identification information for identifying the detected color is also referred to as a color ID.
- the video analysis result information may include information indicating the color ID of each section.
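The dominant-color detection described above (a per-frame color histogram whose most frequent bin becomes the frame's color ID) can be sketched as follows. The pixel classifier and the color-ID table are deliberately crude illustrative assumptions, not the patented analysis.

```python
from collections import Counter

COLOR_IDS = {"white": 0, "blue": 1, "green": 2, "red": 3}

def classify_pixel(r, g, b):
    # Very coarse RGB classification, for illustration only.
    if r > 200 and g > 200 and b > 200:
        return "white"
    if b >= r and b >= g:
        return "blue"
    if g >= r:
        return "green"
    return "red"

def frame_color_id(pixels):
    """Detect the dominant color of a frame from its RGB pixels."""
    hist = Counter(classify_pixel(*p) for p in pixels)
    return COLOR_IDS[hist.most_common(1)[0][0]]

frame = [(10, 20, 200), (30, 40, 220), (250, 250, 250)]
print(frame_color_id(frame))  # → 1 (blue)
```

A real implementation would histogram YUV or RGB bins over full frames, as the text describes; the structure, however, is the same: histogram, then take the dominant bin.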
- the video analysis unit 142 can detect camerawork by performing image recognition processing on the video data acquired by the video acquisition unit 113.
- For example, the video analysis unit 142 detects camera work such as still, tilting up and down, and panning left and right by detecting motion vectors for each frame or for every several frames.
- the identification information for identifying the detected camera work is also referred to as a camera work ID.
- the video analysis result information may include information indicating the camera work ID of each section.
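Classifying camera work from detected motion vectors can be reduced to thresholding the average global motion per frame. The labels and thresholds below are illustrative assumptions, not values from the patent.

```python
def classify_camerawork(mean_dx, mean_dy, threshold=1.0):
    """Classify per-frame global motion into a camerawork label.

    mean_dx / mean_dy: average motion-vector components (pixels/frame),
    as would be obtained from frame-to-frame motion vector detection.
    """
    if abs(mean_dx) < threshold and abs(mean_dy) < threshold:
        return "still"
    if abs(mean_dx) >= abs(mean_dy):
        return "pan_right" if mean_dx > 0 else "pan_left"
    return "tilt_down" if mean_dy > 0 else "tilt_up"

print(classify_camerawork(0.2, -0.3))  # → still
print(classify_camerawork(5.0, 1.0))   # → pan_right
```

Runs of frames with the same label would then share one camera work ID, giving the camera-work scene segments described above.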
- the video analysis unit 142 can detect the shooting date and time acquired by a GPS included in the sensor unit 111 or a clock built in a camera or the like included in the video acquisition unit 113.
- the identification information for identifying the detected shooting date / time is also referred to as a shooting date / time ID. It is assumed that the same shooting date / time ID is assigned to sections shot at the same or close date / time.
- the video analysis result information may include information indicating the shooting date / time ID and the section of each shooting date / time segment.
- the video analysis unit 142 can detect the location where the image was taken based on the position information acquired by the GPS included in the sensor unit 111.
- the identification information for identifying the detected shooting location is also referred to as a shooting location ID. It is assumed that the same shooting location ID is attached to sections shot at the same or close locations.
- the video analysis result information may include information indicating the shooting location ID of each section.
- the extraction unit 143 has a function of extracting a plurality of unit videos from the input video. Specifically, the extraction unit 143 extracts a plurality of unit videos from the video data acquired by the video acquisition unit 113 based on the analysis result by the video analysis unit 142: it extracts, as one unit video, a series of frames having the same video attributes indicated by the analysis result information.
- the extraction unit 143 may extract a series of videos belonging to the same scene segment as a unit video. Further, the extraction unit 143 may extract a video in which a highlight is detected as a unit video. Specifically, the extraction unit 143 may extract, as one unit video, a section in which a predetermined motion such as a jump of the subject is detected. In addition, the extraction unit 143 may extract, as one unit video, a section in which a predetermined operation such as a zoom operation or a shooting mode setting operation is detected, or a section designated by the user as a section to be included in the summary video.
- For a zoom operation, the extraction unit 143 may extract the section after zooming as a unit video; for a shooting mode setting operation, it may extract the section shot in the simultaneous shooting mode as a unit video.
- the extraction unit 143 may extract, as one unit video, a section in which the face or smile of the subject is detected, that is, a section in which the subject is detected to be in a predetermined state such as smiling or facing the camera, or the sections before and after it.
- the extraction unit 143 may extract a section in which cheering breaks out as one unit video.
- the extraction unit 143 may extract a section in which an important scene in a specific event is captured as one unit video.
- the extraction unit 143 may use a combination of these extraction criteria.
- the extraction unit 143 may set a degree of attention for each extracted unit video based on the analysis result by the video analysis unit 142. For example, the extraction unit 143 sets a high degree of attention for the unit video of a section corresponding to a highlight. Specifically, the extraction unit 143 sets a high degree of attention for a unit video when the video analysis unit 142 has analyzed that the motion of the subject in the unit video's shooting section is a predetermined motion, that the subject is in a predetermined state, or that a predetermined operation has been performed.
- Likewise, the extraction unit 143 sets a high degree of attention for a unit video when the video analysis unit 142 has analyzed that cheering occurred during the unit video's shooting period, or that the section is an important scene.
- a high degree of attention is set for the unit video corresponding to the section in which the predetermined motion such as the jump of the subject is detected.
- a high degree of attention is set for a unit video corresponding to a section in which the subject is detected to be in a predetermined state such as a smiling face or a face facing the camera.
- a high degree of attention is set for a unit video corresponding to a section in which a predetermined operation such as a zoom operation or a shooting mode setting operation is detected.
- a high degree of attention is set for the unit video corresponding to a section in which cheering is detected.
- a high degree of attention is set for a unit video corresponding to a section in which an important scene in a specific event such as a wedding cake cut or ring exchange is detected.
- the extraction unit 143 may set a high degree of attention to a unit video corresponding to a section designated as a section to be included in the summary video by the user. Then, the extraction unit 143 sets a low degree of attention in cases other than those described above.
- a unit video with a high degree of attention is also referred to as a highlight shot.
- a unit video with a low level of attention is also referred to as a sub-shot.
- the identification information for identifying the type of the extracted highlight shot is also referred to as a highlight ID.
- As a highlight ID, a different ID can be set according to the type of highlight, such as a jump, a zoom operation, cheering, an important scene, or a user-designated highlight.
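As an illustrative sketch, the assignment of attention degrees and highlight IDs described above could be expressed as follows; the ID values and the section representation are assumptions:

```python
# Illustrative highlight IDs, one per highlight type named in the text
HIGHLIGHT_IDS = {"jump": 1, "zoom": 2, "cheer": 3, "important_scene": 4, "user": 5}

def classify_section(section):
    """Return ('highlight', id) for a section matching any highlight
    criterion (high degree of attention) and ('subshot', None) otherwise
    (low degree of attention)."""
    kind = section.get("highlight")
    if kind in HIGHLIGHT_IDS:
        return ("highlight", HIGHLIGHT_IDS[kind])
    return ("subshot", None)
```

A section would carry a `highlight` key only when one of the detectors above (jump, zoom, cheering, important scene, user designation) fired for it.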
- the editing unit 144 has a function of generating editing information for switching the unit videos extracted by the extraction unit 143 in accordance with the input music. For example, the editing unit 144 sets which section of which piece of music is used as BGM. The editing unit 144 then divides the music used as BGM based on the music analysis result by the music analysis unit 141, and assigns the unit videos extracted by the extraction unit 143 to the resulting sections. As a result, in the summary video, the unit video switches at the timings at which the music is divided. When allocating unit videos, the editing unit 144 may determine all or part of the unit videos extracted by the extraction unit 143 as the unit videos to be adopted for the summary video, and assign the adopted unit videos to the sections.
- the editing unit 144 assigns unit videos in the order of shooting times in principle. Of course, the editing unit 144 may assign a unit video without depending on the shooting time. In this way, the editing unit 144 generates editing information by setting which unit video is to be switched at which timing with which section of which music is input as BGM. Details of the processing by the editing unit 144 will be described in detail later.
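As an illustrative sketch, assigning unit videos to the music's divided sections in shooting-time order could look like the following; the dictionary keys and time units are assumptions:

```python
def build_edit_info(switch_times, unit_videos):
    """Divide the BGM at the given switching timings and assign the adopted
    unit videos to the resulting sections in shooting-time order."""
    ordered = sorted(unit_videos, key=lambda v: v["shot_at"])
    sections = list(zip(switch_times[:-1], switch_times[1:]))
    return [{"start": s, "end": e, "video": v["name"]}
            for (s, e), v in zip(sections, ordered)]

info = build_edit_info(
    [0, 2, 5],
    [{"name": "B", "shot_at": 20}, {"name": "A", "shot_at": 10}],
)
```

The sort implements the "in principle, shooting-time order" rule; removing it would correspond to assignment independent of shooting time.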
- The operation mode control unit 145 has a function of controlling the operation modes of the extraction unit 143 and the editing unit 144.
- the operation mode control unit 145 controls the operation mode according to the unit video extraction result by the extraction unit 143 and the switching timing setting result by the editing unit 144. Details of processing by the operation mode control unit 145 will be described in detail later.
- the summary video generation unit 146 has a function of generating a summary video composed of music and unit videos that are switched based on editing information. For example, the summary video generation unit 146 generates a summary video by switching and connecting the unit video specified by the editing information at the specified timing with the music specified by the editing information as BGM.
- the extraction unit 143 extracts a plurality of unit videos from the video data acquired by the video acquisition unit 113 based on the analysis result by the video analysis unit 142. Specifically, the extraction unit 143 extracts a unit video according to the video attribute analyzed by the video analysis unit 142. For example, the extraction unit 143 extracts highlight shots and sub-shots from the video data based on information for scene segments and information indicating highlights.
- the unit video extraction process based on the video analysis result will be described in detail with reference to FIG.
- FIG. 5 is a diagram for explaining unit video extraction processing according to the present embodiment.
- FIG. 5 schematically shows a process in which the extraction unit 143 extracts highlight shots 240A to 240E and sub-shots 250A to 250G.
- the extraction unit 143 first generates a scene segment 210 based on information for a scene segment.
- the extraction unit 143 generates a scene segment 210 by segmenting sections having the same color ID.
- the extraction unit 143 may use a plurality of pieces of information for a scene segment.
- For example, the extraction unit 143 may generate the scene segment 210 by segmenting sections having the same color ID, camera work ID, shooting location ID, and shooting date/time ID.
- the extraction unit 143 associates the scene segment 210 with the highlight 220 and extracts highlight shots 240A to 240E from the input video 230. The extraction unit 143 then extracts the sections of the input video 230 divided by the scene segment 210 as sub-shots 250. In doing so, the extraction unit 143 may exclude sections that overlap a highlight shot 240, sections that are short (for example, shorter than the longest allocation section described later), sections that are extremely bright or dark, and sections in which the camera work is unstable.
- the number of unit videos extracted by the extraction unit 143 based on the video analysis result information, that is, the number of highlight shots and sub-shots, is also referred to below as the number of extractions.
- the editing unit 144 sets the switching timing of the unit video according to the input music. For example, the editing unit 144 may generate editing information for switching the unit video according to the components analyzed by the music analysis unit 141, according to the measures, or according to the beat. Specifically, the editing unit 144 divides the input music at the timing at which a component switches, the timing at which a measure switches, or a timing according to the beat, and sets the unit video switching timing at the divided positions.
- the editing unit 144 may generate editing information for switching the unit video for each beat as the timing according to the beat.
- Thereby, the unit video is switched with a sense of speed at a high tempo, which can heighten the viewer's emotions.
- the editing unit 144 may generate editing information for switching the unit video every several beats when the beat rate of the music exceeds a threshold value.
- the unit video may be switched every two beats.
- the editing unit 144 may set the number of beat-synchronized unit video switching operations for each type of musical structure analyzed by the music analysis unit 141. Specifically, the editing unit 144 may set this number for each musical component, such as the intro part and the chorus part. Further, the editing unit 144 may switch the unit video according to the beat in parts that satisfy a predetermined condition specified by the music analysis unit. Specifically, the editing unit 144 may switch the unit video in accordance with the beat in particularly important parts, such as the part of the chorus where the vocal starts or where the vocal pitch is highest. Thereby, the unit video can be switched according to the beat in step with the excitement of the BGM, which can heighten the viewer's emotions more effectively.
- the editing unit 144 may select whether or not to switch the unit video according to the beat in units of music measures analyzed by the music analysis unit 141.
- the unit video is switched according to the beat in units of measures.
- A person listens to music while consciously or unconsciously keeping track of the measures and anticipating how the music will develop. Switching the unit video according to the beat in units of measures is therefore easily accepted by the viewer, which makes it easier to heighten the viewer's emotions.
- Switching the unit video according to the beat within a measure is also consistent with switching the unit video at measure boundaries.
- the editing unit 144 may space out the bars in which the unit video is switched according to the beat. As a result, beat-synchronized switching is not performed in consecutive bars, which prevents excessive switching.
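As an illustrative sketch, beat-synchronized switching timings within one bar, including the rule of switching only every second beat at high tempo, could be computed as follows; the threshold value and the quadruple-time default are assumptions:

```python
def beat_switch_times(bar_start, bpm, beats_per_bar=4, threshold_bpm=140):
    """Return the switching timestamps (seconds) within one bar: one per beat
    at moderate tempo, and one every two beats when the tempo exceeds the
    threshold, to avoid overly rapid switching."""
    beat_len = 60.0 / bpm
    step = 2 if bpm > threshold_bpm else 1
    return [bar_start + i * beat_len for i in range(0, beats_per_bar, step)]
```

Applying this only to selected bars (and skipping the bars in between) would implement the spacing-out rule above.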
- a section in which music is divided by the set switching timing is also referred to as an allocated section below.
- setting the switching timing is equivalent to setting the allocation sections, that is, how long each unit video is allocated in the summary video.
- the longest section among the allocation sections is also referred to as the longest allocation section below.
- the unit video switching timing described above may be set based on, for example, a preset probability table.
- the editing unit 144 may follow rules such as switching the unit video at the timing at which a music component switches, and setting the length of the longest allocation section.
- Similar to each other means, for example, that at least one of the movement of the subject, the shooting date (ie, shooting time information), the shooting location (ie, shooting position information), color information, or camera work is close.
- For example, a unit video in which the camera work moves from right to left and a unit video in which it moves from left to right, with the same colors, are similar to each other.
- the unit videos in which the subject is jumping are similar to each other.
- being similar to each other may indicate that a specific subject is included in the unit video, for example.
- unit videos including the same person or the same team are similar.
- at least one of the unit videos switched according to the beat within one measure may be adopted twice or more in that measure.
- For example, the unit videos may be adopted in the order unit video A, unit video B, unit video A, unit video B, or in the order unit video A, unit video A, unit video A, unit video A. This makes it easier to avoid giving the viewer a cluttered impression.
- the unit video that switches according to the beat in one measure may be different.
- the unit video A, the unit video B, the unit video C, and the unit video D may be employed in this order.
- FIG. 6 is a diagram for explaining the unit video switching timing setting processing according to the present embodiment.
- FIG. 6 shows the components 320 of the section 310 of the music used as BGM, and the set switching timings 330.
- the dividing lines of the switching timing 330 indicate the switching timings, and the sections divided by the dividing lines indicate the allocation sections.
- the component 320 includes a melody portion, a chorus portion, and an ending portion.
- the music shown in FIG. 6 is in quadruple time, in which one measure 343 contains four beats: a downbeat 342 followed by three beats 341.
- the editing unit 144 sets the unit video switching timing at the timing when the component 320 switches from the melody to the chorus, and at the timing when the component 320 switches from the chorus to the ending.
- the editing unit 144 sets allocation sections 351A to 351D in units of one bar, an allocation section 352 in units of two bars, an allocation section 353 in units of three bars, and an allocation section 354 in units of one beat. Therefore, in the section 354, the unit video is switched for each beat. In this case, the longest allocation section 360 is three bars long.
- Table 1 below shows the number of unit videos employed in the entire BGM and in each component in the example shown in FIG. 6 for each switching timing type (allocation section length).
- when the unit video is switched for each beat, one unit video may be adopted multiple times, so the number of unit videos to be selected for that section is at most four.
- As shown in Table 1, in the example shown in FIG. 6, a total of up to ten unit videos are adopted for the summary video. Further, in the example shown in FIG. 6, the longest allocation section is three bars long.
- the number of unit videos employed in the summary video is determined by the number of allocation sections determined by the switching timing set by the editing unit 144 based on the music analysis result information, that is, the number of divided music.
- the number of pieces into which the editing unit 144 divides the music based on the music analysis result information is also referred to below as the adopted number. For example, in the example shown in FIG. 6, the adopted number is at most ten. More specifically, if the beat-synchronized switching uses unit video A, unit video B, unit video C, and unit video D, the adopted number is ten; if it uses unit video A, unit video B, unit video A, and unit video B, the adopted number is eight.
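As an illustrative sketch, the adopted-number bookkeeping of the FIG. 6 example could be expressed as follows; the section encoding is an assumption:

```python
def adopted_count(sections):
    """sections: list of ('bars', n) allocation sections (one unit video
    each) or ('beats', n) per-beat sections (up to n unit videos).
    Returns the maximum number of unit videos adopted for the summary."""
    return sum(n if kind == "beats" else 1 for kind, n in sections)

# FIG. 6: four 1-bar sections, one 2-bar and one 3-bar section,
# plus one per-beat section of four beats -> up to 10 adopted videos
fig6 = [("bars", 1)] * 4 + [("bars", 2), ("bars", 3), ("beats", 4)]
```

Reusing a unit video within the per-beat section (A, B, A, B) would reduce the count below this maximum, matching the text.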
- the editing unit 144 may switch the unit video extracted by the extraction unit 143 at the switching timing set in the switching timing setting process.
- the editing unit 144 may change the switching timing set in the switching timing setting process.
- the editing unit 144 may change the order of the allocation sections while maintaining the total number of allocation sections (corresponding to the number employed) and the number of allocation sections for each length set in the switching timing setting process. Such an example will be described later in the adopted section setting process.
- the unit video extraction process is restricted by the switching timing setting process.
- For example, a restriction may be imposed that the extraction unit 143 extract at least the adopted number of unit videos. Due to this restriction, the unit videos in the summary video can be switched without repetition. Further, a restriction may be imposed that the extraction unit 143 extract unit videos having a length equal to or longer than the longest allocation section (three bars in the example shown in FIG. 6), so that each extracted unit video can be used at any timing. According to this restriction, any extracted unit video can be allocated to the longest allocation section.
- Conversely, the unit video extraction process imposes restrictions on the switching timing setting process.
- For example, the editing unit 144 is restricted to set the switching timing so that the number of unit videos to be allocated does not exceed the number of unit videos extracted by the extraction unit 143. Due to this restriction, the unit videos in the summary video can be switched without repetition.
- A restriction may also be imposed that the editing unit 144 set the switching timing so that the allocation sections have lengths corresponding to the lengths of the unit videos extracted by the extraction unit 143. According to this restriction, a suitable allocation section can be assigned to each extracted unit video.
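As an illustrative sketch, the two mutual restrictions above can be checked together; equating the adopted number with the number of allocation sections is a simplifying assumption:

```python
def restrictions_satisfied(unit_video_lengths, allocation_lengths):
    """Check the two restrictions: at least as many unit videos as
    allocation sections, and every unit video at least as long as the
    longest allocation section (so any video fits any section)."""
    return (len(unit_video_lengths) >= len(allocation_lengths)
            and min(unit_video_lengths, default=0) >= max(allocation_lengths))
```

The operation mode control unit would invoke a check like this to decide whether a mode change is needed.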
- the operation mode control unit 145 can change the operation mode of the extraction unit 143 and the editing unit 144 in order to satisfy such a restriction.
- a case where the switching timing setting process is performed first will be described.
- the operation mode control unit 145 operates the extraction unit 143 and the editing unit 144 with the operation mode as the normal processing mode (first operation mode).
- the editing unit 144 sets the unit video switching timing using the music analysis result information as described above.
- the extraction unit 143 extracts the unit video using the video analysis result information as described above.
- the operation mode control unit 145 determines, according to the magnitude relationship between the number of extractions and the adopted number in the normal processing mode, whether to change the operation mode and re-execute at least one of the extraction process by the extraction unit 143 and the adoption process by the editing unit 144.
- the extraction processing here refers to the above-described unit video extraction processing.
- the adoption process here refers to the switching timing setting process described above.
- Regarding the magnitude relationship between the number of extractions and the adopted number, there is the restriction described above that the number of extractions be equal to or greater than the adopted number.
- the operation mode control unit 145 can satisfy the restriction by changing the operation mode when the restriction is not satisfied.
- the operation mode control unit 145 determines not to change the operation mode when the adopted number and the number of extractions in the normal processing mode are equal, or when the number of extractions is larger. That is, the operation mode control unit 145 determines not to change the operation mode when the number of extractions is equal to or greater than the adopted number, because the restriction described above is then satisfied without changing the operation mode.
- the operation mode control unit 145 can change the operation mode to another operation mode when the number of extractions in the normal processing mode is smaller than the number employed.
- the operation mode control unit 145 can change the operation mode to the division processing mode (second operation mode) or the retry processing mode (fifth operation mode).
- In the division processing mode, the extraction unit 143 divides at least one of the unit videos extracted in the normal processing mode into two or more unit videos. For example, the extraction unit 143 may select as division targets those unit videos whose length exceeds a threshold. In addition, the extraction unit 143 may determine the number of divisions so that each divided unit video is still equal to or longer than the longest allocation section. Since the number of extractions increases in the division processing mode, the restriction that the number of extractions be equal to or greater than the adopted number can be satisfied.
- In the retry processing mode, the editing unit 144 sets the switching timing by dividing the music at predetermined intervals, and the extraction unit 143 extracts unit videos by dividing the video at predetermined intervals. For example, the editing unit 144 divides the input music at equal or preset intervals and sets the division timings as the switching timings, and the extraction unit 143 divides the input video at equal or preset intervals and extracts the divided videos as unit videos; that is, the extraction unit 143 extracts unit videos without considering highlights. In the retry processing mode, the number of extractions and the adopted number can be adjusted arbitrarily by adjusting the division intervals, so the restriction that the number of extractions be equal to or greater than the adopted number can be satisfied.
- FIG. 7 is a diagram for explaining an example of an operation mode of the video processing apparatus 100 according to the present embodiment.
- In the normal processing mode, the video analysis result information and the music analysis result information are used as they are, and a summary video with a video quality of “high” is generated.
- In the division processing mode, the video analysis result information is modified before use.
- the unit video 410 extracted in the normal processing mode is divided into unit videos 411 and 412.
- the unit video 420 is divided into unit videos 421, 422, and 423.
- the unit video 430 is divided into unit videos 431, 432, and 433.
- a unit video that was originally one can be divided into a plurality of pieces and each can be adopted as a summary video. That is, since similar unit videos can be adopted for the summary video, the video quality of the summary video is “medium”.
- In the retry processing mode, the video analysis result information and the music analysis result information are ignored. Specifically, as shown in FIG. 7, the switching timings are equally spaced, and the unit videos are obtained by dividing the input video at equal intervals. Therefore, the summary video generated in the retry processing mode is monotonous, and its video quality is “low”.
- the operation mode control unit 145 may change the operation mode to an operation mode other than the division processing mode and the retry processing mode when the number of extractions in the normal processing mode is smaller than the number of adoption.
- the operation mode control unit 145 can change the operation mode to the longest allocation interval shortening processing mode (third operation mode) or the sub-shot condition relaxation processing mode (fourth operation mode).
- In the longest allocation section shortening processing mode, the editing unit 144 shortens the longest allocation section compared with the normal processing mode.
- Accordingly, the extraction unit 143 extracts unit videos having a length equal to or longer than this shortened longest allocation section.
- Suppose, for example, that the extraction unit 143 extracts unit videos with a length of three bars or more in the normal processing mode.
- In the longest allocation section shortening processing mode, the extraction unit 143 instead extracts unit videos with a length of, for example, two bars or more.
- The extraction unit 143 can thus newly extract, as a sub-shot, a section that is only two bars long and was not extracted in the normal processing mode because it was too short. In this way, the number of extractions increases in the longest allocation section shortening processing mode, so the restriction that the number of extractions be equal to or greater than the adopted number can be satisfied.
- In the sub-shot condition relaxation processing mode, the extraction unit 143 relaxes the conditions, based on the analysis result by the video analysis unit 142, for extracting unit videos, compared with the normal processing mode. For example, the extraction unit 143 extracts a section as a unit video even if it is short, extremely bright or dark, or shot with unstable camera work. In this way, the number of extractions increases in the sub-shot condition relaxation processing mode, so the restriction that the number of extractions be equal to or greater than the adopted number can be satisfied.
- the order of the operation modes listed above is arbitrary.
- the operation mode control unit 145 may change the operation mode in the order of the division processing mode, the longest allocated section shortening processing mode, the sub-shot condition relaxation processing mode, and the retry processing mode after the normal processing mode.
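As an illustrative sketch, the sequential fallback through the operation modes in the order above could be expressed as follows; the mode names and the `counts` interface are assumptions:

```python
# Modes in the fallback order given in the text
MODES = ["normal", "division", "shorten_longest", "relax_subshot", "retry"]

def select_operation_mode(counts):
    """counts maps mode name -> (extraction count, adopted count); walk the
    modes in order and return the first that satisfies the restriction that
    the extraction count is at least the adopted count."""
    for mode in MODES:
        extracted, adopted = counts[mode]
        if extracted >= adopted:
            return mode
    # The retry mode's division intervals can always be tuned to satisfy
    # the restriction, so this fallback should not normally be reached.
    return "retry"

result = select_operation_mode({
    "normal": (5, 8), "division": (9, 8),
    "shorten_longest": (7, 8), "relax_subshot": (8, 8), "retry": (8, 8),
})
```

Running the modes in parallel and picking the highest-quality satisfying one, as also described above, would replace the early return with a filtered selection.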
- the operation mode control unit 145 may use any combination of the above-described operation modes.
- the operation mode control unit 145 may perform processing that employs all or part of the above-described operation modes in parallel, and may select an operation mode that provides the highest quality result.
- the editing unit 144 selects the unit videos to be adopted for the summary video from the unit videos extracted by the extraction unit 143. For example, the editing unit 144 gives priority to highlight shots and selects as many unit videos as the adopted number.
- the unit video selection process will be described with reference to FIGS.
- FIG. 8 is a diagram for explaining unit video selection processing according to the present embodiment.
- the editing unit 144 first selects one or more sub-shots 510 as unit video candidates to be used for the summary video.
- the selected shot 520 is a unit video selected as a unit video candidate to be adopted for the summary video.
- the editing unit 144 may select the sub-shots 510 so that, for example, the scene segments are distributed and / or follow the theme specified by the user.
- the editing unit 144 selects the sub-shots 510 in descending order of evaluation values based on an evaluation function described later. [1], [2], [3], [4], [5], [6], and [7] in the figure indicate the selection order using the evaluation function.
- the number of adoption is seven.
- the editing unit 144 arranges the selected unit videos along the time at which the selected unit videos were taken.
- FIG. 9 is a view for explaining unit video selection processing according to the present embodiment.
- the editing unit 144 selects the highlight shot 530 as a candidate for a unit video to be adopted for the summary video.
- the editing unit 144 can select the highlight shots 530 so that adjacent unit videos among the selected shots do not have the same highlight ID.
- the editing unit 144 selects the highlight shots 530 in descending order of evaluation values based on an evaluation function described later.
- Each time it selects a highlight shot 530, the editing unit 144 removes a low-priority sub-shot 540 from the already selected sub-shots. An example of a low-priority sub-shot 540 is one that was selected late. [1] and [2] in the figure indicate the selection order and removal order based on the evaluation functions.
- W_si·Si and W_ss·Ss in Equation 1 are the terms related to the scene segment.
- W_si and W_ss are the weights of the respective terms, and can be set arbitrarily by the editing unit 144.
- the symbol Si is a value (score) related to the segment ID of the scene segment.
- the symbol Si is calculated based on the color ID, camera work ID, shooting date/time ID, and/or shooting location ID used for the scene segment.
- For example, the score may be calculated so that the proportions of the segment IDs approach those defined by a preset theme.
- Alternatively, the score may be calculated so that each segment ID is selected evenly, in order to reduce visual bias.
- the symbol Ss is a score related to the stability of the scene segment.
- the symbol Ss is calculated based on the stability (little change over time) of the color and/or camera work used for the scene segment. For example, a higher score may be calculated for higher stability.
- the editing unit 144 may add a term related to the selection-source video file to Equation 1, in order to diversify the files from which shots are selected. Further, the editing unit 144 may add a term related to the shooting times of the already selected shots to Equation 1, in order to disperse the shooting times.
- the editing unit 144 calculates the evaluation function shown in Formula 1 for each unselected sub-shot, and selects the sub-shot having the highest evaluation value. Note that the score of each symbol may vary in relation to the already selected sub-shot.
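As an illustrative sketch, the greedy sub-shot selection with the evaluation function of Equation 1 could look like the following; it simplifies the function to E = W_si·Si + W_ss·Ss and, as an assumption, ignores the dependence of the scores on already selected shots:

```python
def select_subshots(candidates, count, w_si=1.0, w_ss=1.0):
    """Repeatedly evaluate every unselected sub-shot with
    E = w_si * Si + w_ss * Ss and select the highest-valued one,
    until `count` sub-shots have been selected."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < count:
        best = max(remaining, key=lambda c: w_si * c["Si"] + w_ss * c["Ss"])
        remaining.remove(best)
        selected.append(best)
    return selected

picked = select_subshots(
    [{"name": "a", "Si": 0.9, "Ss": 0.1},
     {"name": "b", "Si": 0.5, "Ss": 0.9},
     {"name": "c", "Si": 0.2, "Ss": 0.2}],
    2,
)
```

In the full scheme, the Si and Ss scores would be recomputed after each pick, since they can depend on the shots already selected.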
- W_hi·Hi and W_hs·Hs in Equation 2 are the terms related to the highlight.
- W_hi and W_hs are the weights of the respective terms, and can be set arbitrarily by the editing unit 144.
- the symbol Hi is a score related to the highlight ID.
- the symbol Hi is calculated based on the highlight ID.
- For example, the score may be calculated so that the proportions of the highlight IDs approach those defined by a preset theme.
- Alternatively, the score may be calculated so that each highlight ID is selected evenly, in order to reduce visual bias.
- the symbol Hs is a score related to the value of the highlight shot. For a snowboard jump, for example, the longer the air time and the greater the amount of rotation, the higher the score that can be calculated.
- Other symbols are the same as those in Equation 1 above.
- the editing unit 144 calculates the evaluation function shown in Equation 2 for each unselected highlight shot, and selects the highlight shot with the highest evaluation value. Then, the editing unit 144 removes sub-shots whose selection order is late from the already selected sub-shots. Note that the score of each symbol may vary depending on the relationship with the already selected highlight shot.
- the editing unit 144 can avoid a series of jump highlight shots by using the symbol Hi, for example. Note that the score related to the symbol Hi may be ignored for the highlight shot related to the section designated as the section to be included in the summary video by the user. In this case, for example, unit videos of jumps designated as highlights by the user can be continuous. In addition, the editing unit 144 can preferentially select a high-value highlight shot by using the symbol Hs.
- The editing unit 144 may limit the number of highlight shots with the same highlight ID to a preset number or less. For example, the editing unit 144 may select only highlight shots that satisfy the following formula. According to the formula, even if highlight shots of jumps would normally be limited to, for example, two selections, a jump with a high Hs score can be selected three times or more, while a jump with a low Hs score may be selected fewer than two times. Highlight score Hs − attenuation coefficient × number of selections ≥ threshold (Formula 3)
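The selection-count limit of Formula 3 can be sketched as follows. This is a minimal illustration; the attenuation coefficient and threshold values are hypothetical, since the patent leaves the concrete values to the implementation.

```python
def may_select_again(highlight_score, times_selected, attenuation=1.0, threshold=5.0):
    """Formula 3: a highlight shot may be selected again while
    Hs - attenuation_coefficient * number_of_selections >= threshold."""
    return highlight_score - attenuation * times_selected >= threshold
```

A jump with Hs = 9.0 survives a third selection (9.0 - 3 >= 5), while a jump with Hs = 5.5 is already excluded after its first selection (5.5 - 1 < 5), matching the behavior described above.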
- the editing unit 144 may first select a highlight shot and then select a sub-shot. In that case, the editing unit 144 first selects a highlight shot, and selects the number of sub-shots obtained by subtracting the number of selected highlight shots from the number of adopted shots. In addition, the editing unit 144 may simultaneously select a highlight shot and a sub-shot. In that case, the editing unit 144 can apply a common evaluation function between the highlight shot and the sub-shot.
- The editing unit 144 sets, in each unit video extracted by the extraction unit 143, an adopted section corresponding to the content of the unit video, and generates editing information for adopting the adopted section set for each of the plurality of unit videos.
- the editing unit 144 sets an adopted section to be adopted for the summary video according to the content of the unit video, and generates editing information for connecting the set adopted sections.
- The adopted section is the section of the unit video that is adopted for the summary video.
- The adopted section may be the whole of the unit video or a part of it.
- the editing unit 144 may set the position of the adopted section in the unit video according to the content of the unit video.
- For example, the editing unit 144 can set the adopted section according to the content of the unit video, such as whether the unit video is a highlight shot or a sub-shot and what attributes it has (e.g., a highlight ID, a color ID, and a camera work ID).
- The position of the adopted section indicates where in the whole unit video the section set as the adopted section lies: for example, the first half, the middle, or the second half of the unit video. In this way, a section better suited to exciting the viewer's emotions is set according to the content of the unit video and adopted for the summary video.
- The editing unit 144 may set the position of the adopted section in the unit video according to the motion of the subject analyzed by the video analysis unit 142. For example, consider a highlight shot of a snowboard jump. For a unit video in which the video analysis unit 142 has determined that the subject's motion is a jump, the editing unit 144 may set the adopted section at any position covering the approach, from the approach to being airborne, while airborne, from airborne to landing, or from landing onward. In that case, the editing unit 144 can set an adopted section focusing on any of the various notable highlights of the jump.
- a highlight shot relating to a snowboard turn (change in the direction of movement) is assumed.
- For a unit video in which the subject's motion is analyzed as a turn, the editing unit 144 may set the adopted section at any position covering from before the turn to during the turn, during the turn, or from during the turn to after the turn. In that case, the editing unit 144 can set an adopted section focusing on any of the various notable highlights of the turn.
- The editing unit 144 may distribute the positions of the adopted sections across two or more highlight shots. For example, when the selected shots include highlight shots of a plurality of snowboard jumps, the editing unit 144 may disperse the positions of the adopted sections among the approach, from the approach to being airborne, while airborne, from airborne to landing, and from landing onward. Similarly, when the selected shots include highlight shots of a plurality of snowboard turns, the editing unit 144 may disperse the positions of the adopted sections among from before the turn to during the turn, during the turn, and from during the turn to after the turn. In that case, since the adopted sections are set from different viewpoints even for highlight shots of the same type, the viewer can enjoy the summary video without getting bored.
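The dispersal of adopted-section positions can be sketched as a simple round-robin over the phases of the motion. The phase names below are illustrative labels for the jump phases listed above, not terms defined by the patent.

```python
# Illustrative phase labels for a snowboard jump (see the description above).
JUMP_PHASES = ["approach", "approach-to-airborne", "airborne",
               "airborne-to-landing", "landing-onward"]

def disperse_positions(num_shots, phases=JUMP_PHASES):
    """Assign each same-type highlight shot a different adopted-section
    position by cycling through the phases."""
    return [phases[i % len(phases)] for i in range(num_shots)]
```

With seven jump highlight shots, the first five each receive a distinct phase before the cycle repeats, so consecutive jumps are shown from different viewpoints.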
- The editing unit 144 may generate the editing information so that a highlight shot is connected to a highlight shot of another type or to a sub-shot. For example, the editing unit 144 arranges highlight shots with the same highlight ID so that they are not consecutive, or inserts a sub-shot between them when they would be consecutive. As a result, the summary video gains variation, so that the viewer can enjoy it without getting bored.
- The editing unit 144 may set the length of the adopted section of a highlight shot longer than the length of the adopted section of a sub-shot. For example, the editing unit 144 preferentially assigns highlight shots to long allocation sections. The viewer can then watch each highlight shot for a longer time, which excites the viewer more effectively.
- FIGS. 10 to 12 are diagrams for explaining the adopted section setting processing according to the present embodiment.
- FIG. 10 illustrates an example in which highlight shots are preferentially assigned to long allocation sections.
- Assume that the breakdown of the allocation sections 710 set in the switching timing setting processing is: two one-bar allocation sections 711, four two-bar allocation sections 712, and one three-bar allocation section 713.
- the editing unit 144 preferentially allocates highlight shots to long allocation sections according to the rules shown in Table 2 below. Note that the rules shown in Table 2 below may be further subdivided according to the type of highlight, the type of scene segment, and the like.
- the breakdown of the selected shot 720 is in the order of sub-shot 721A, highlight shot 722A, sub-shot 721B, highlight shot 722B, sub-shot 721C, sub-shot 721D, and highlight shot 722C.
- the editing unit 144 generates editing information 730 for setting which unit video is switched at which timing by assigning an allocation section to each unit video as described below.
- The editing unit 144 assigns the allocation section 711A, which has the highest priority among the remaining allocation sections, to the sub-shot 721A, the first of the selected shots 720.
- The editing unit 144 assigns the three-bar allocation section 713, which has the highest priority among the remaining allocation sections, to the highlight shot 722A, the second of the selected shots 720.
- The editing unit 144 assigns the allocation section 711B, which has the highest priority among the remaining allocation sections, to the sub-shot 721B, the third of the selected shots 720.
- The editing unit 144 assigns the two-bar allocation section 712A, which has the highest priority among the remaining allocation sections, to the highlight shot 722B, the fourth of the selected shots 720.
- The editing unit 144 assigns the allocation section 712B, which has the highest priority among the remaining allocation sections, to the sub-shot 721C, the fifth of the selected shots 720.
- The editing unit 144 assigns the allocation section 712C, which has the highest priority among the remaining allocation sections, to the sub-shot 721D, the sixth of the selected shots 720.
- The editing unit 144 assigns the remaining two-bar allocation section 712D to the highlight shot 722C, the seventh of the selected shots 720.
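The assignment walkthrough above can be reproduced with a small greedy routine: highlight shots take the longest remaining allocation section and sub-shots the shortest, with ties broken by earliest position. The tie-breaking rule is an assumption inferred from the walkthrough (e.g., highlight shot 722B receives 712A, the earliest of the remaining two-bar sections), not an explicit rule from Table 2.

```python
def allocate(selected_shots, sections):
    """Greedy allocation: each shot, in selection order, takes the longest
    (highlight) or shortest (sub-shot) remaining allocation section.
    selected_shots: list of (kind, shot_name); sections: list of (name, bars).
    Returns a dict mapping shot name -> allocation section name."""
    remaining = list(sections)
    plan = {}
    for kind, shot in selected_shots:
        # max()/min() return the earliest element among ties, which models
        # priority by position within equal-length sections (an assumption).
        pick = (max if kind == "highlight" else min)(remaining, key=lambda s: s[1])
        remaining.remove(pick)
        plan[shot] = pick[0]
    return plan

shots = [("sub", "721A"), ("highlight", "722A"), ("sub", "721B"),
         ("highlight", "722B"), ("sub", "721C"), ("sub", "721D"),
         ("highlight", "722C")]
sections = [("711A", 1), ("711B", 1), ("712A", 2), ("712B", 2),
            ("712C", 2), ("712D", 2), ("713", 3)]
plan = allocate(shots, sections)
```

Running this reproduces the assignments above: 722A receives the three-bar section 713, 722B receives 712A, and 722C receives the remaining two-bar section 712D.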
- the editing unit 144 basically sets an adopted section 750 in the center portion of the unit video 740.
- the editing unit 144 may set an adoption section 750 in the first half, the center, or the second half of the unit video 740 for a highlight shot such as a turn.
- The length of the adopted section 750 set by the editing unit 144 corresponds to the length of the allocation section assigned to each unit video as described with reference to FIG. 10.
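The default centering of the adopted section described above can be written down directly. The function below is a sketch; times are in seconds, and the section is simply clipped to the unit video if it would not fit.

```python
def center_adopted_section(unit_length, section_length):
    """Place the adopted section in the center of the unit video
    (the default placement); returns (start, end) within the unit video."""
    section_length = min(section_length, unit_length)
    start = (unit_length - section_length) / 2
    return start, start + section_length
```

For a 10-second unit video and a 4-second allocation section, the adopted section runs from second 3 to second 7 of the unit video.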
- FIG. 13 is a flowchart illustrating an example of a summary video generation process executed in the video processing apparatus 100 according to the present embodiment.
- the music analysis unit 141 analyzes the input music.
- For example, the music analysis unit 141 analyzes the structure of the music, such as the intro part and the chorus part, based on music theory, identifies a particularly important portion of the chorus part, and analyzes beats and bars.
- the video analysis unit 142 analyzes the input video. For example, the video analysis unit 142 detects a subject motion, detects a user operation, detects a face and a smile, detects a color, and detects camera work.
- In step S106, the editing unit 144 sets the unit video switching timings. For example, the editing unit 144 sets the switching timings for each beat, for each bar, or for each structural component, based on the music analysis result in step S102. At that time, the editing unit 144 can make the setting so that switching according to the beat is performed in the particularly important portion of the chorus part. This step determines the length of the longest allocation section.
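Step S106 can be sketched as follows: switch on every bar line by default, and on every beat inside the bars that the music analysis marked as the particularly important portion of the chorus. The function and its arguments are illustrative, not part of the patent's interface.

```python
def switching_timings(beat_times, beats_per_bar=4, per_beat_bars=frozenset()):
    """Derive unit-video switching timestamps from analyzed beat times.
    Bars listed in per_beat_bars (e.g. the key part of the chorus) switch
    on every beat; all other bars switch only on the bar line."""
    timings = []
    for i, t in enumerate(beat_times):
        bar = i // beats_per_bar
        if i % beats_per_bar == 0 or bar in per_beat_bars:
            timings.append(t)
    return timings

# Two 4/4 bars at 120 BPM; the second bar switches on every beat.
beats = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
timings = switching_timings(beats, per_beat_bars={1})
```

Here the first bar yields a single switch at its bar line (0.0 s), while the second bar yields switches at every beat (2.0, 2.5, 3.0, and 3.5 s).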
- In step S108, the editing unit 144 calculates the number of unit videos to be adopted for the summary video (the adopted number). For example, the editing unit 144 calculates the adopted number based on the number of allocation sections determined by the switching timings set in step S106. Specifically, when no unit video is used in overlap, the adopted number is the same as the number of allocation sections, so the editing unit 144 sets the adopted number to the number of allocation sections.
- step S110 the extraction unit 143 extracts a unit video.
- the extraction unit 143 extracts highlight shots and sub-shots based on the video analysis result in step S104.
- the extraction unit 143 extracts a unit video with a length equal to or longer than the longest allocation section among the allocation sections determined by the switching timing set in step S106. Further, the extraction unit 143 calculates the total number of extracted highlight shots and sub-shots as the number of extractions.
- In step S112, the operation mode control unit 145 determines whether or not the extracted number is equal to or greater than the adopted number.
- In step S114, the operation mode control unit 145 changes the operation mode. For example, when the operation mode before the change is the normal operation mode, the operation mode control unit 145 changes it to the division processing mode. The processing then returns to step S106. In this way, the operation mode control unit 145 changes the operation mode and returns the processing to step S106 until the extracted number becomes equal to or greater than the adopted number. Note that if the extracted number does not reach the adopted number in any operation mode, the video processing apparatus 100 may output an error and stop the processing.
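The loop of steps S110 to S114 can be sketched as a mode-escalation routine. The mode names and the idea that later modes extract more unit videos follow the description above; the concrete mode list is an assumption for illustration.

```python
def choose_operation_mode(extracted_counts, adopted_count,
                          modes=("normal", "division")):
    """Escalate through operation modes until the extracted number is
    equal to or greater than the adopted number (steps S112/S114).
    extracted_counts maps mode -> number of unit videos it yields."""
    for mode in modes:
        if extracted_counts[mode] >= adopted_count:
            return mode
    # No mode yields enough unit videos: output an error and stop.
    raise RuntimeError("extracted number never reaches adopted number")

mode = choose_operation_mode({"normal": 10, "division": 25}, adopted_count=18)
```

With 18 allocation sections and only 10 unit videos extractable in the normal mode, the routine escalates to the division processing mode, which yields 25.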
- step S116 the editing unit 144 selects a unit video to be adopted for the summary video.
- For example, from the unit videos extracted by the extraction unit 143, the editing unit 144 selects unit videos whose attributes are dispersed in order to reduce visual bias, or selects unit videos that follow a theme specified by the user. Note that the editing unit 144 may adopt highlight shots preferentially over sub-shots.
- step S118 the editing unit 144 sets the adopted section of each unit video.
- the editing unit 144 sets an adopted section to be adopted for the summary video among the unit videos selected in step S116.
- the editing unit 144 sets an adopted section at an appropriate position so that, for example, a section to be particularly noted is adopted in the summary video according to the content of the unit video.
- the editing unit 144 stores the processing result described above in the editing information.
- step S120 the summary video generation unit 146 generates a summary video.
- the summary video generation unit 146 generates a summary video by switching and connecting the unit video specified by the editing information at the specified timing with the music specified by the editing information as BGM.
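The overall flow of FIG. 13 can be summarized as a pipeline in which each step is a pluggable stage. This sketch assumes, per step S108, that the adopted number equals the number of allocation sections when unit videos are not reused, and it omits the operation-mode loop of steps S112/S114 for brevity.

```python
def generate_summary_video(music, video, *, analyze_music, analyze_video,
                           set_timing, extract, select, set_sections, render):
    """Pipeline sketch of the summary video generation flow (S102-S120)."""
    music_info = analyze_music(music)            # S102: beats, bars, chorus
    video_info = analyze_video(video)            # S104: highlights, segments
    sections = set_timing(music_info)            # S106: allocation sections
    adopted_count = len(sections)                # S108: adopted number
    unit_videos = extract(video_info)            # S110: highlight/sub-shots
    chosen = select(unit_videos, adopted_count)  # S116: unit video selection
    edit_info = set_sections(chosen, sections)   # S118: adopted sections
    return render(music, edit_info)              # S120: cut to the music
```

Each stage corresponds to one of the units in FIG. 4 (music analysis unit 141, video analysis unit 142, editing unit 144, extraction unit 143, and summary video generation unit 146), injected here as callables for illustration.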
- FIG. 14 is a block diagram illustrating an example of a hardware configuration of the information processing apparatus according to the present embodiment.
- the information processing apparatus 900 illustrated in FIG. 14 can realize the video processing apparatus 100 illustrated in FIG. 4, for example.
- Information processing by the video processing apparatus 100 according to the present embodiment is realized by cooperation of software and hardware described below.
- the information processing apparatus 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, and a host bus 904a.
- the information processing apparatus 900 includes a bridge 904, an external bus 904b, an interface 905, an input device 906, an output device 907, a storage device 908, a drive 909, a connection port 911, a communication device 913, and a sensor 915.
- the information processing apparatus 900 may include a processing circuit such as a DSP or an ASIC in place of or in addition to the CPU 901.
- the CPU 901 functions as an arithmetic processing unit and a control unit, and controls the overall operation in the information processing apparatus 900 according to various programs. Further, the CPU 901 may be a microprocessor.
- the ROM 902 stores programs used by the CPU 901, calculation parameters, and the like.
- the RAM 903 temporarily stores programs used in the execution of the CPU 901, parameters that change as appropriate during the execution, and the like.
- the CPU 901 can form the control unit 140 shown in FIG.
- the CPU 901, ROM 902, and RAM 903 are connected to each other by a host bus 904a including a CPU bus.
- the host bus 904 a is connected to an external bus 904 b such as a PCI (Peripheral Component Interconnect / Interface) bus via a bridge 904.
- the host bus 904a, the bridge 904, and the external bus 904b do not necessarily have to be configured separately, and these functions may be mounted on one bus.
- The input device 906 is realized by a device by which the user inputs information, such as a mouse, a keyboard, a touch panel, buttons, a microphone, switches, and levers.
- the input device 906 may be, for example, a remote control device using infrared rays or other radio waves, or may be an external connection device such as a mobile phone or a PDA that supports the operation of the information processing device 900.
- the input device 906 may include, for example, an input control circuit that generates an input signal based on information input by the user using the above-described input means and outputs the input signal to the CPU 901.
- a user of the information processing apparatus 900 can input various data and instruct a processing operation to the information processing apparatus 900 by operating the input device 906.
- the input device 906 can form, for example, the operation unit 112 shown in FIG.
- The output device 907 is formed of a device that can notify the user of acquired information visually or audibly. Examples of such devices include display devices such as CRT display devices, liquid crystal display devices, plasma display devices, EL display devices, and lamps, audio output devices such as speakers and headphones, and printer devices.
- the output device 907 outputs results obtained by various processes performed by the information processing device 900.
- the display device visually displays results obtained by various processes performed by the information processing device 900 in various formats such as text, images, tables, and graphs.
- the audio output device converts an audio signal composed of reproduced audio data, acoustic data, and the like into an analog signal and outputs it aurally.
- the display device and the audio output device can form, for example, the output unit 130 shown in FIG.
- the storage device 908 is a data storage device formed as an example of a storage unit of the information processing device 900.
- the storage apparatus 908 is realized by, for example, a magnetic storage device such as an HDD, a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like.
- the storage device 908 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deletion device that deletes data recorded on the storage medium, and the like.
- the storage device 908 stores programs executed by the CPU 901, various data, various data acquired from the outside, and the like.
- the storage device 908 can form, for example, the storage unit 120 shown in FIG.
- the drive 909 is a storage medium reader / writer, and is built in or externally attached to the information processing apparatus 900.
- the drive 909 reads information recorded on a removable storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and outputs the information to the RAM 903.
- the drive 909 can also write information to a removable storage medium.
- connection port 911 is an interface connected to an external device, and is a connection port with an external device capable of transmitting data by USB (Universal Serial Bus), for example.
- the connection port 911 can form, for example, the music acquisition unit 114 shown in FIG.
- the communication device 913 is a communication interface formed by a communication device or the like for connecting to the network 920, for example.
- the communication device 913 is, for example, a communication card for wired or wireless LAN (Local Area Network), LTE (Long Term Evolution), Bluetooth (registered trademark), or WUSB (Wireless USB).
- the communication device 913 may be a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), a modem for various communication, or the like.
- the communication device 913 can transmit and receive signals and the like according to a predetermined protocol such as TCP / IP, for example, with the Internet and other communication devices.
- the communication device 913 can form, for example, the music acquisition unit 114 illustrated in FIG.
- the network 920 is a wired or wireless transmission path for information transmitted from a device connected to the network 920.
- the network 920 may include a public line network such as the Internet, a telephone line network, and a satellite communication network, various LANs including the Ethernet (registered trademark), a wide area network (WAN), and the like.
- the network 920 may include a dedicated line network such as an IP-VPN (Internet Protocol-Virtual Private Network).
- the sensor 915 is various sensors such as an acceleration sensor, a gyro sensor, a geomagnetic sensor, an optical sensor, a sound sensor, a distance measuring sensor, and a force sensor.
- the sensor 915 acquires information on the state of the information processing apparatus 900 itself, such as the posture and movement speed of the information processing apparatus 900, and information on the surrounding environment of the information processing apparatus 900, such as brightness and noise around the information processing apparatus 900.
- Sensor 915 may also include a GPS sensor that receives GPS signals and measures the latitude, longitude, and altitude of the device.
- the sensor 915 can form, for example, the sensor unit 111 shown in FIG.
- the sensor 915 may be separated from the information processing apparatus 900.
- the sensor 915 may be attached to a subject, and the information processing apparatus 900 may acquire information indicating a result of sensing the subject by wired or wireless communication.
- The imaging device 917 includes a lens system composed of an imaging lens, a diaphragm, a zoom lens, a focus lens, and the like, a drive system that causes the lens system to perform focus and zoom operations, and a solid-state imaging element array that photoelectrically converts the imaging light obtained by the lens system to generate an imaging signal.
- the solid-state imaging device array may be realized by a CCD (Charge Coupled Device) sensor array or a CMOS (Complementary Metal Oxide Semiconductor) sensor array, for example.
- the imaging device 917 outputs captured image data that is a digital signal.
- the imaging device 917 can form, for example, the video acquisition unit 113 illustrated in FIG.
- each of the above components may be realized using a general-purpose member, or may be realized by hardware specialized for the function of each component. Therefore, it is possible to change the hardware configuration to be used as appropriate according to the technical level at the time of carrying out this embodiment.
- a computer program for realizing each function of the information processing apparatus 900 according to the present embodiment as described above can be produced and mounted on a PC or the like.
- a computer-readable recording medium storing such a computer program can be provided.
- the recording medium is, for example, a magnetic disk, an optical disk, a magneto-optical disk, a flash memory, or the like.
- the above computer program may be distributed via a network, for example, without using a recording medium.
- As described above, the video processing apparatus 100 can generate a summary video capable of exciting a viewer's emotions by switching appropriate unit videos at appropriate timings in accordance with the music.
- Specifically, the video processing apparatus 100 analyzes the beats of the input music, extracts a plurality of unit videos from the input video, and generates editing information for switching the extracted unit videos in accordance with the beats.
- the unit video is switched at a fast timing according to the beat, so that the viewer's emotion can be raised more effectively.
- the video processing apparatus 100 sets an adopted section corresponding to the content of the unit video in the extracted unit video, and generates editing information for adopting the adopted section set for each of the plurality of unit videos.
- Thus, for each section extracted as a candidate for adoption in the summary video, the video processing apparatus 100 can set the section actually used for the summary video at the part that is particularly worth viewing. Therefore, a more appropriate section is used for the summary video to excite the viewer's emotions.
- the video processing apparatus 100 controls an operation mode related to a process of extracting a unit video from the input video and a process of setting a timing for switching the unit video according to the input music.
- the video processing apparatus 100 can generate a summary video that switches videos according to music in an appropriate operation mode.
- By switching the operation mode so that the extracted number becomes equal to or greater than the adopted number, the video processing apparatus 100 can switch to a different unit video at each of the set switching timings.
- each device described in this specification may be realized as a single device, or a part or all of the devices may be realized as separate devices.
- For example, the storage unit 120 and the control unit 140 may be provided in a device such as a server that is connected to the input unit 110 and the output unit 130 via a network or the like.
- (1) An information processing method including: analyzing beats of input music; extracting a plurality of unit videos from an input video; and generating, by a processor, editing information for switching the extracted unit videos in accordance with the analyzed beats.
- (2) The information processing method according to (1), further including analyzing bars of the music, wherein in generating the editing information, whether or not to switch the unit videos in accordance with beats is selected for each analyzed bar of the music.
- (8) The information processing method according to any one of (2) to (7), wherein in generating the editing information, bars in which the unit videos are switched in accordance with beats are separated from each other.
- (9) The information processing method according to any one of (1) to (8), further including analyzing the structure of the music, wherein in generating the editing information, the number of times the unit videos are switched in accordance with beats is set for each type of the analyzed structure of the music.
- (10) The information processing method according to any one of (1) to (9), further including identifying a portion of the music that satisfies a predetermined condition, wherein generating the editing information includes switching the unit videos in accordance with beats in the identified portion satisfying the predetermined condition.
- (16) A video processing apparatus including: a music analysis unit that analyzes beats of input music; an extraction unit that extracts a plurality of unit videos from an input video; and an editing unit that generates editing information for switching the unit videos extracted by the extraction unit in accordance with the beats analyzed by the music analysis unit.
- (17) The video processing apparatus according to (16), wherein the music analysis unit analyzes bars of the music, and the editing unit selects whether or not to switch the unit videos in accordance with beats for each bar of the music analyzed by the music analysis unit.
Abstract
Description
1. Overview
2. Basic configuration
3. Function details
3.1. Unit video extraction processing
3.2. Switching timing setting processing
3.3. Operation mode determination processing
3.4. Unit video selection processing
3.5. Adopted section setting processing
4. Operation processing
5. Hardware configuration example
6. Conclusion
First, an overview of the video processing apparatus according to the present embodiment will be described with reference to FIGS. 1 to 3.
FIG. 4 is a block diagram illustrating an example of the logical configuration of the video processing apparatus 100 according to the present embodiment. As illustrated in FIG. 4, the video processing apparatus 100 includes an input unit 110, a storage unit 120, an output unit 130, and a control unit 140.
The input unit 110 has a function of receiving input of various types of information from the outside. As illustrated in FIG. 4, the input unit 110 includes a sensor unit 111, an operation unit 112, a video acquisition unit 113, and a music acquisition unit 114.
The sensor unit 111 has a function of detecting motion of a subject. For example, the sensor unit 111 may include a gyro sensor, an acceleration sensor, and a gravity sensor. The subject is the target of shooting and also includes the photographer (user). The sensor unit 111 may include any sensor such as a GPS (Global Positioning System) sensor, an infrared sensor, a proximity sensor, or a touch sensor. The sensor unit 111 outputs sensor information indicating sensing results to the control unit 140. Note that the sensor unit 111 does not have to be formed integrally with the video processing apparatus 100. For example, the sensor unit 111 may acquire sensor information from a sensor attached to the subject via wired or wireless communication.
The operation unit 112 has a function of receiving user operations. For example, the operation unit 112 is realized by buttons, a touch pad, and the like. The operation unit 112 can receive operations such as a zoom operation during shooting and a shooting mode setting operation. Conceivable shooting modes include, for example, a normal mode for shooting a moving image and a simultaneous shooting mode for shooting a moving image and a still image at the same time. In addition, the operation unit 112 can receive, during or after shooting, an editing instruction specifying a section to be included in the summary video. The operation unit 112 outputs operation information indicating the content of the user operation to the control unit 140.
The video acquisition unit 113 has a function of acquiring video. For example, the video acquisition unit 113 is realized as an imaging device and outputs data of captured images (moving images/still images) converted into digital signals. The video acquisition unit 113 may further include a microphone that picks up surrounding sound and acquires sound data converted into a digital signal via an amplifier and an ADC (Analog Digital Converter). In that case, the video acquisition unit 113 outputs video data accompanied by the surrounding sound.
The music acquisition unit 114 has a function of acquiring music data to be used as the BGM of the summary video. For example, the music acquisition unit 114 is realized as a wired or wireless interface and acquires music data from another device such as a PC or a server. Examples of the wired interface include a connector conforming to a standard such as USB (Universal Serial Bus). Examples of the wireless interface include a communication device conforming to a communication standard such as Bluetooth (registered trademark) or Wi-Fi (registered trademark). The music acquisition unit 114 outputs the acquired music data to the control unit 140.
The storage unit 120 has a function of storing various types of information. For example, the storage unit 120 stores information output from the input unit 110 and information generated by the control unit 140.
The output unit 130 has a function of outputting various types of information. For example, the output unit 130 may have a function of reproducing the summary video generated by the summary video generation unit 146 described later. In that case, the output unit 130 may include a display unit and a speaker. The output unit 130 may also have a function of outputting the editing information generated by the editing unit 144 described later. In that case, the output unit 130 may include a wired or wireless interface.
The control unit 140 functions as an arithmetic processing device and a control device, and controls the overall operation in the video processing apparatus 100 according to various programs. As illustrated in FIG. 4, the control unit 140 includes a music analysis unit 141, a video analysis unit 142, an extraction unit 143, an editing unit 144, an operation mode control unit 145, and a summary video generation unit 146.
The music analysis unit 141 has a function of analyzing the content of input music. Specifically, the music analysis unit 141 performs analysis based on music theory on the music data acquired by the music acquisition unit 114.
The video analysis unit 142 has a function of analyzing the content of input video. Specifically, the video analysis unit 142 analyzes the content of the video data acquired by the video acquisition unit 113. The video analysis unit 142 then outputs video analysis result information indicating the analysis result of the video content to the extraction unit 143.
For example, the video analysis unit 142 detects highlights based on the information input via the input unit 110, and outputs information indicating the detected highlights as part of the video analysis result information. As an example, a case will be described in which the video analysis unit 142 detects highlights related to subject motion, user operations, and faces and smiles.
For example, the video analysis unit 142 detects information for scene segments based on the information input via the input unit 110, and outputs the detected information for scene segments as part of the video analysis result information. As an example, a case will be described in which the video analysis unit 142 detects information for scene segments related to color, camera work, date and time, and location.
The extraction unit 143 has a function of extracting a plurality of unit videos from the input video. Specifically, the extraction unit 143 extracts a plurality of unit videos from the video data acquired by the video acquisition unit 113 based on the analysis result by the video analysis unit 142. More specifically, the extraction unit 143 extracts, as a unit video, a continuous stretch of video in which the video attributes indicated by the analysis result information are the same.
The editing unit 144 has a function of generating editing information for switching the unit videos extracted by the extraction unit 143 in accordance with the input music. For example, the editing unit 144 sets which section of which input music is to be used as BGM. The editing unit 144 then divides the music to be used as BGM according to the music analysis result by the music analysis unit 141, and assigns the unit videos extracted by the extraction unit 143 to the respective sections. As a result, in the summary video, the unit videos are switched at the timings at which the music is divided. When assigning unit videos, the editing unit 144 may determine all or some of the unit videos extracted by the extraction unit 143 as the unit videos to be adopted for the summary video, and assign the adopted unit videos to the respective sections. In principle, the editing unit 144 assigns the unit videos in order of shooting time. Of course, the editing unit 144 may assign the unit videos independently of shooting time. In this way, the editing unit 144 generates the editing information by setting which section of which input music is to be used as BGM, and which unit video is to be switched at which timing. Details of the processing by the editing unit 144 will be described later.
The operation mode control unit 145 has a function of controlling the operation mode in the extraction unit 143 and the editing unit 144. The operation mode control unit 145 controls the operation mode according to the extraction result of unit videos by the extraction unit 143 and the setting result of switching timings by the editing unit 144. Details of the processing by the operation mode control unit 145 will be described later.
The summary video generation unit 146 has a function of generating a summary video composed of the music and the unit videos switched based on the editing information. For example, the summary video generation unit 146 generates the summary video by switching and connecting the unit videos specified by the editing information at the specified timings, with the music specified by the editing information as BGM.
The basic configuration of the video processing apparatus 100 according to the present embodiment has been described above. Next, the functions of the video processing apparatus 100 will be described in detail.
The extraction unit 143 extracts a plurality of unit videos from the video data acquired by the video acquisition unit 113 based on the analysis result by the video analysis unit 142. Specifically, the extraction unit 143 extracts unit videos according to the video attributes analyzed by the video analysis unit 142. For example, the extraction unit 143 extracts highlight shots and sub-shots from the video data based on the information for scene segments and the information indicating highlights. Hereinafter, the unit video extraction processing based on the video analysis result will be described in detail with reference to FIG. 5.
The editing unit 144 sets the switching timings of the unit videos in accordance with the input music, based on the music analysis result information output from the music analysis unit 141. For example, the editing unit 144 may generate editing information for switching the unit videos extracted by the extraction unit 143 according to the structural components analyzed by the music analysis unit 141, according to bars, or according to beats. Specifically, the editing unit 144 divides the input music at the timings at which the structural components switch, at the timings at which bars switch, or at timings corresponding to beats, and sets the switching timings of the unit videos at the divided positions.
The order of the switching timing setting processing and the unit video extraction processing described above is arbitrary.
(Overview)
The editing unit 144 selects the unit videos to be adopted for the summary video from among the unit videos extracted by the extraction unit 143. For example, the editing unit 144 gives priority to highlights and selects as many unit videos as the adopted number. Hereinafter, the unit video selection processing will be described with reference to FIGS. 8 and 9.
In the following, an example of an evaluation function used for selecting sub-shots will be described. For example, the editing unit 144 can select sub-shots using the evaluation function shown in Formula 1 below.
In the following, an example of an evaluation function used for selecting highlight shots will be described. For example, the editing unit 144 can select highlight shots using the evaluation function shown in Formula 2 below.
Highlight score Hs − attenuation coefficient × number of selections ≥ threshold … (Formula 3)
The editing unit 144 sets, in each unit video extracted by the extraction unit 143, an adopted section corresponding to the content of the unit video, and generates editing information for adopting the adopted section set for each of the plurality of unit videos. For example, the editing unit 144 sets an adopted section to be adopted for the summary video according to the content of the unit video, and generates editing information for connecting the set adopted sections. Note that the adopted section is the section of the unit video that is adopted for the summary video. The adopted section may be the whole or a part of the unit video.
FIG. 13 is a flowchart illustrating an example of the flow of the summary video generation processing executed in the video processing apparatus 100 according to the present embodiment.
Finally, the hardware configuration of the information processing apparatus according to the present embodiment will be described with reference to FIG. 14. FIG. 14 is a block diagram illustrating an example of the hardware configuration of the information processing apparatus according to the present embodiment. The information processing apparatus 900 illustrated in FIG. 14 can realize, for example, the video processing apparatus 100 illustrated in FIG. 4. Information processing by the video processing apparatus 100 according to the present embodiment is realized by cooperation between software and the hardware described below.
An embodiment of the present disclosure has been described above in detail with reference to FIGS. 1 to 14. As described above, the video processing apparatus 100 according to the present embodiment can generate a summary video capable of exciting a viewer's emotions by switching appropriate unit videos at appropriate timings in accordance with the music.
(1)
An information processing method including:
analyzing beats of input music;
extracting a plurality of unit videos from an input video; and
generating, by a processor, editing information for switching the extracted unit videos in accordance with the analyzed beats.
(2)
The information processing method according to (1), further including analyzing bars of the music,
wherein in generating the editing information, whether or not to switch the unit videos in accordance with beats is selected for each analyzed bar of the music.
(3)
The information processing method according to (2), wherein the unit videos switched in accordance with beats within one bar are similar to one another.
(4)
The information processing method according to (3), wherein the unit videos switched in accordance with beats within one bar are close in at least one of subject motion, shooting time information, shooting position information, color information, or camera work.
(5)
The information processing method according to (3), wherein the unit videos switched in accordance with beats within one bar include a specific subject.
(6)
The information processing method according to any one of (2) to (5), wherein at least one of the unit videos switched in accordance with beats within one bar is adopted two or more times within the bar.
(7)
The information processing method according to any one of (2) to (6), wherein the unit videos switched in accordance with beats within one bar are different from one another.
(8)
The information processing method according to any one of (2) to (7), wherein in generating the editing information, bars in which the unit videos are switched in accordance with beats are separated from each other.
(9)
The information processing method according to any one of (1) to (8), further including analyzing the structure of the music,
wherein in generating the editing information, the number of times the unit videos are switched in accordance with beats is set for each type of the analyzed structure of the music.
(10)
The information processing method according to any one of (1) to (9), further including identifying a portion of the music that satisfies a predetermined condition,
wherein generating the editing information includes switching the unit videos in accordance with beats in the identified portion satisfying the predetermined condition.
(11)
The information processing method according to (10), wherein in identifying the portion of the music that satisfies the predetermined condition, a chorus portion of the music is identified based on music theory.
(12)
The information processing method according to any one of (1) to (11), wherein in generating the editing information, the editing information for switching the unit videos every beat is generated.
(13)
The information processing method according to any one of (1) to (11), wherein in generating the editing information, when the beat speed of the music exceeds a threshold, the editing information for switching the unit videos every plurality of beats is generated.
(14)
The information processing method according to any one of (1) to (13), further including generating a summary video composed of the music and the unit videos switched based on the editing information.
(15)
The information processing method according to (14), further including reproducing the generated summary video.
(16)
A video processing apparatus including:
a music analysis unit that analyzes beats of input music;
an extraction unit that extracts a plurality of unit videos from an input video; and
an editing unit that generates editing information for switching the unit videos extracted by the extraction unit in accordance with the beats analyzed by the music analysis unit.
(17)
The video processing apparatus according to (16), wherein the music analysis unit analyzes bars of the music, and the editing unit selects whether or not to switch the unit videos in accordance with beats for each bar of the music analyzed by the music analysis unit.
(18)
The video processing apparatus according to (17), wherein the unit videos switched in accordance with beats within one bar are similar to one another.
(19)
The video processing apparatus according to (18), wherein the unit videos switched in accordance with beats within one bar are close in at least one of subject motion, shooting time information, shooting position information, color information, or camera work.
(20)
A program for causing a computer to function as:
a music analysis unit that analyzes beats of input music;
an extraction unit that extracts a plurality of unit videos from an input video; and
an editing unit that generates editing information for switching the unit videos extracted by the extraction unit in accordance with the beats analyzed by the music analysis unit.
20 video analysis result information
30 music
40 editing information
50 summary video
100 video processing apparatus
110 input unit
111 sensor unit
112 operation unit
113 video acquisition unit
114 music acquisition unit
120 storage unit
130 output unit
140 control unit
141 music analysis unit
142 video analysis unit
143 extraction unit
144 editing unit
145 operation mode control unit
146 summary video generation unit
Claims (20)
- An information processing method including:
analyzing a beat of input music;
extracting a plurality of unit videos from an input video; and
generating, by a processor, editing information for switching the extracted unit videos in accordance with the analyzed beat.
- The information processing method according to claim 1, further including analyzing bars of the music,
wherein, in generating the editing information, whether to perform beat-based switching of the unit videos is selected in units of the analyzed bars of the music.
- The information processing method according to claim 2, wherein the unit videos switched according to beats within one bar are similar to one another.
- The information processing method according to claim 3, wherein the unit videos switched according to beats within one bar are close to one another in at least one of subject motion, shooting time information, shooting position information, color information, and camera work.
- The information processing method according to claim 3, wherein the unit videos switched according to beats within one bar each include a specific subject.
- The information processing method according to claim 2, wherein at least one of the unit videos switched according to beats within one bar is adopted two or more times within the bar.
- The information processing method according to claim 2, wherein the unit videos switched according to beats within one bar are different from one another.
- The information processing method according to claim 2, wherein, in generating the editing information, bars in which beat-based switching of the unit videos is performed are spaced apart from one another.
- The information processing method according to claim 1, further including analyzing a structure of the music,
wherein, in generating the editing information, the number of times beat-based switching of the unit videos is performed is set for each type of the analyzed structure of the music.
- The information processing method according to claim 1, further including identifying a portion of the music that satisfies a predetermined condition,
wherein generating the editing information includes performing beat-based switching of the unit videos in the identified portion satisfying the predetermined condition.
- The information processing method according to claim 10, wherein, in identifying the portion of the music that satisfies the predetermined condition, a chorus portion of the music is identified based on music theory.
- The information processing method according to claim 1, wherein, in generating the editing information, the editing information for switching the unit videos every beat is generated.
- The information processing method according to claim 1, wherein, in generating the editing information, when the beat tempo of the music exceeds a threshold, the editing information for switching the unit videos every plural beats is generated.
- The information processing method according to claim 1, further including generating a summary video composed of the music and the unit videos switched based on the editing information.
- The information processing method according to claim 14, further including playing back the generated summary video.
- A video processing apparatus including:
a music analysis unit that analyzes a beat of input music;
an extraction unit that extracts a plurality of unit videos from an input video; and
an editing unit that generates editing information for switching the unit videos extracted by the extraction unit in accordance with the beat analyzed by the music analysis unit.
- The video processing apparatus according to claim 16, wherein the music analysis unit analyzes bars of the music, and
the editing unit selects, in units of the bars analyzed by the music analysis unit, whether to perform beat-based switching of the unit videos.
- The video processing apparatus according to claim 17, wherein the unit videos switched according to beats within one bar are similar to one another.
- The video processing apparatus according to claim 18, wherein the unit videos switched according to beats within one bar are close to one another in at least one of subject motion, shooting time information, shooting position information, color information, and camera work.
- A program for causing a computer to function as:
a music analysis unit that analyzes a beat of input music;
an extraction unit that extracts a plurality of unit videos from an input video; and
an editing unit that generates editing information for switching the unit videos extracted by the extraction unit in accordance with the beat analyzed by the music analysis unit.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016564717A JP6569687B2 (ja) | 2014-12-15 | 2015-10-09 | Information processing method, video processing device, and program |
EP15869638.5A EP3217655A4 (en) | 2014-12-15 | 2015-10-09 | Information processing method, video processing device and program |
US15/531,732 US10325627B2 (en) | 2014-12-15 | 2015-10-09 | Information processing method and image processing apparatus |
CN201580066673.XA CN107409193A (zh) | 2014-12-15 | 2015-10-09 | Information processing method, video processing device, and program |
US16/410,079 US10847185B2 (en) | 2014-12-15 | 2019-05-13 | Information processing method and image processing apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014253214 | 2014-12-15 | ||
JP2014-253214 | 2014-12-15 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/531,732 A-371-Of-International US10325627B2 (en) | 2014-12-15 | 2015-10-09 | Information processing method and image processing apparatus |
US16/410,079 Continuation US10847185B2 (en) | 2014-12-15 | 2019-05-13 | Information processing method and image processing apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016098430A1 true WO2016098430A1 (ja) | 2016-06-23 |
Family
ID=56126332
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2015/078845 WO2016098430A1 (ja) | 2014-12-15 | 2015-10-09 | Information processing method, video processing device, and program |
Country Status (5)
Country | Link |
---|---|
US (2) | US10325627B2 (ja) |
EP (1) | EP3217655A4 (ja) |
JP (1) | JP6569687B2 (ja) |
CN (1) | CN107409193A (ja) |
WO (1) | WO2016098430A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018167706A1 (en) * | 2017-03-16 | 2018-09-20 | Sony Mobile Communications Inc. | Method and system for automatically creating a soundtrack to a user-generated video |
CN110099300A (zh) * | 2019-03-21 | 2019-08-06 | 北京奇艺世纪科技有限公司 | 视频处理方法、装置、终端及计算机可读存储介质 |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201601143D0 (en) | 2016-01-21 | 2016-03-09 | Oxehealth Ltd | Method and apparatus for health and safety monitoring of a subject in a room |
GB201601140D0 (en) | 2016-01-21 | 2016-03-09 | Oxehealth Ltd | Method and apparatus for estimating heart rate |
GB201601142D0 (en) | 2016-01-21 | 2016-03-09 | Oxehealth Ltd | Method and apparatus for estimating breathing rate |
GB201601217D0 (en) | 2016-01-22 | 2016-03-09 | Oxehealth Ltd | Signal processing method and apparatus |
GB201615899D0 (en) | 2016-09-19 | 2016-11-02 | Oxehealth Ltd | Method and apparatus for image processing |
US10885349B2 (en) | 2016-11-08 | 2021-01-05 | Oxehealth Limited | Method and apparatus for image processing |
GB201706449D0 (en) | 2017-04-24 | 2017-06-07 | Oxehealth Ltd | Improvements in or realting to in vehicle monitoring |
JP2019004927A (ja) * | 2017-06-20 | 2019-01-17 | Casio Computer Co., Ltd. | Electronic device, rhythm information notification method, and program |
GB201803508D0 (en) * | 2018-03-05 | 2018-04-18 | Oxehealth Ltd | Method and apparatus for monitoring of a human or animal subject |
US11508393B2 (en) * | 2018-06-12 | 2022-11-22 | Oscilloscape, LLC | Controller for real-time visual display of music |
GB201900034D0 (en) | 2019-01-02 | 2019-02-13 | Oxehealth Ltd | Method and apparatus for monitoring of a human or animal subject |
GB201900032D0 (en) | 2019-01-02 | 2019-02-13 | Oxehealth Ltd | Method and apparatus for monitoring of a human or animal subject |
GB201900033D0 (en) | 2019-01-02 | 2019-02-13 | Oxehealth Ltd | Mrthod and apparatus for monitoring of a human or animal subject |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004096617A * | 2002-09-03 | 2004-03-25 | Sharp Corp | Video editing method, video editing device, video editing program, and program recording medium |
JP2004159331A * | 2002-11-01 | 2004-06-03 | Microsoft Corp | System and method for automatically editing video |
JP2006127574A * | 2004-10-26 | 2006-05-18 | Sony Corp | Content use device, content use method, distribution server device, information distribution method, and recording medium |
JP2008048054A * | 2006-08-11 | 2008-02-28 | Fujifilm Corp | Moving image generation method, program, and device |
JP2012084957A * | 2010-10-07 | 2012-04-26 | Moso Inc | Content editing device and method, and program |
JP2012253619A * | 2011-06-03 | 2012-12-20 | Casio Comput Co Ltd | Moving image playback device, moving image playback method, and program |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS63253774A (ja) * | 1987-04-10 | 1988-10-20 | Ekuserutoronikusu Kk | Video image control device |
JP3325809B2 (ja) | 1997-08-15 | 2002-09-17 | Nippon Telegraph and Telephone Corp | Video production method and device, and recording medium recording the method |
JP2005506643A (ja) * | 2000-12-22 | 2005-03-03 | Muvee Technologies Pte Ltd | Media production system and method |
AU2003249663A1 (en) * | 2002-05-28 | 2003-12-12 | Yesvideo, Inc. | Summarization of a visual recording |
JP4196816B2 (ja) | 2003-12-08 | 2008-12-17 | Sony Corp | Data editing device and data editing method |
JP2005269605A (ja) * | 2004-02-20 | 2005-09-29 | Fuji Photo Film Co Ltd | Digital pictorial book system, pictorial book search method, and pictorial book search program |
JP4465534B2 (ja) * | 2004-03-31 | 2010-05-19 | Pioneer Corp | Image search method and device, and recording medium recording the program |
JP4622479B2 (ja) * | 2004-11-25 | 2011-02-02 | Sony Corp | Playback device and playback method |
US20060159370A1 (en) * | 2004-12-10 | 2006-07-20 | Matsushita Electric Industrial Co., Ltd. | Video retrieval system and video retrieval method |
JP4940588B2 (ja) * | 2005-07-27 | 2012-05-30 | Sony Corp | Beat extraction device and method, music-synchronized image display device and method, tempo value detection device and method, rhythm tracking device and method, and music-synchronized display device and method |
US7558809B2 (en) * | 2006-01-06 | 2009-07-07 | Mitsubishi Electric Research Laboratories, Inc. | Task specific audio classification for identifying video highlights |
US7945142B2 (en) * | 2006-06-15 | 2011-05-17 | Microsoft Corporation | Audio/visual editing tool |
US8370747B2 (en) * | 2006-07-31 | 2013-02-05 | Sony Mobile Communications Ab | Method and system for adapting a visual user interface of a mobile radio terminal in coordination with music |
JP4660861B2 (ja) * | 2006-09-06 | 2011-03-30 | Fujifilm Corp | Music-image synchronized moving image scenario generation method, program, and device |
JP2008217254A (ja) * | 2007-03-01 | 2008-09-18 | Fujifilm Corp | Playlist creation device and playlist creation method |
US7904798B2 (en) * | 2007-08-13 | 2011-03-08 | Cyberlink Corp. | Method of generating a presentation with background music and related system |
JP5104709B2 (ja) * | 2008-10-10 | 2012-12-19 | Sony Corp | Information processing device, program, and information processing method |
WO2011078379A1 (ja) * | 2009-12-25 | 2011-06-30 | Rakuten, Inc. | Image generation device, image generation method, image generation program, and recording medium |
CN102117638A (zh) * | 2009-12-30 | 2011-07-06 | 北京华旗随身数码股份有限公司 | Method and playback device for video output controlled by music rhythm |
US9060673B2 (en) * | 2010-04-28 | 2015-06-23 | Given Imaging Ltd. | System and method for displaying portions of in-vivo images |
JP2013200750A (ja) * | 2012-03-26 | 2013-10-03 | Sony Corp | Image processing device, image processing method, and computer program |
US20130330062A1 (en) * | 2012-06-08 | 2013-12-12 | Mymusaic Inc. | Automatic creation of movie with images synchronized to music |
KR101477486B1 (ko) * | 2013-07-24 | 2014-12-30 | (주) 프람트 | User interface device and method for video playback and editing |
2015
- 2015-10-09 EP EP15869638.5A patent/EP3217655A4/en not_active Withdrawn
- 2015-10-09 JP JP2016564717A patent/JP6569687B2/ja active Active
- 2015-10-09 US US15/531,732 patent/US10325627B2/en active Active
- 2015-10-09 CN CN201580066673.XA patent/CN107409193A/zh active Pending
- 2015-10-09 WO PCT/JP2015/078845 patent/WO2016098430A1/ja active Application Filing
2019
- 2019-05-13 US US16/410,079 patent/US10847185B2/en active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP3217655A4 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018167706A1 (en) * | 2017-03-16 | 2018-09-20 | Sony Mobile Communications Inc. | Method and system for automatically creating a soundtrack to a user-generated video |
US10902829B2 (en) | 2017-03-16 | 2021-01-26 | Sony Corporation | Method and system for automatically creating a soundtrack to a user-generated video |
CN110099300A (zh) * | 2019-03-21 | 2019-08-06 | 北京奇艺世纪科技有限公司 | 视频处理方法、装置、终端及计算机可读存储介质 |
CN110099300B (zh) * | 2019-03-21 | 2021-09-03 | 北京奇艺世纪科技有限公司 | 视频处理方法、装置、终端及计算机可读存储介质 |
Also Published As
Publication number | Publication date |
---|---|
JP6569687B2 (ja) | 2019-09-04 |
US20170323665A1 (en) | 2017-11-09 |
JPWO2016098430A1 (ja) | 2017-09-28 |
US10847185B2 (en) | 2020-11-24 |
EP3217655A4 (en) | 2018-07-18 |
EP3217655A1 (en) | 2017-09-13 |
US20190267040A1 (en) | 2019-08-29 |
CN107409193A (zh) | 2017-11-28 |
US10325627B2 (en) | 2019-06-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6569687B2 (ja) | Information processing method, video processing device, and program | |
EP2710594B1 (en) | Video summary including a feature of interest | |
US8643746B2 (en) | Video summary including a particular person | |
US10170157B2 (en) | Method and apparatus for finding and using video portions that are relevant to adjacent still images | |
US20100094441A1 (en) | Image selection apparatus, image selection method and program | |
JP4988011B2 (ja) | Electronic apparatus and image processing method | |
JP6583285B2 (ja) | Information processing method, video processing device, and program | |
CN105874780A (zh) | Method and apparatus for generating a text color for a group of images | |
CN104618446A (zh) | Method and apparatus for implementing multimedia push | |
CN102906818A (zh) | Storing a video summary as metadata | |
KR102045575B1 (ko) | Smart mirror display device | |
US20200251146A1 (en) | Method and System for Generating Audio-Visual Content from Video Game Footage | |
JP2011239141A (ja) | Information processing method, information processing device, scene metadata extraction device, missing-information complementing device, and program | |
WO2012160771A1 (ja) | Information processing device, information processing method, program, storage medium, and integrated circuit | |
US20230290382A1 (en) | Method and apparatus for matching music with video, computer device, and storage medium | |
CN105556947A (zh) | Method and apparatus for color detection to generate text color | |
WO2014020816A1 (en) | Display control device, display control method, and program | |
JP2016116073A (ja) | Video processing method, video processing device, and program | |
JP5550447B2 (ja) | Electronic apparatus and method | |
JP2012137560A (ja) | Karaoke device, karaoke device control method, and control program | |
JP2017017387A (ja) | Video processing device and video processing method | |
JP2016131329A (ja) | Image and audio recording device, image and audio recording method, and image and audio recording program | |
JP2019201244A (ja) | Moving image processing device, moving image production method, and program | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15869638 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2016564717 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15531732 Country of ref document: US |
|
REEP | Request for entry into the european phase |
Ref document number: 2015869638 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |