WO2021240678A1 - Video processing device, video processing method, and recording medium - Google Patents
Video processing device, video processing method, and recording medium
- Publication number
- WO2021240678A1 (PCT/JP2020/020868, JP2020020868W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- scene
- audience
- important
- video
- spectator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/142—Detection of scene cut or scene change
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/87—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8549—Creating video summaries, e.g. movie trailer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
Definitions
- the present invention relates to processing video data.
- A technology to generate a video digest from a moving image has been proposed.
- Patent Document 1 discloses an extraction device that creates a learning data file from a training moving image prepared in advance and an important scene moving image specified by a user, and detects important scenes from a target moving image based on the learning data file.
- A digest video edited by a human often includes not only footage of the athletes playing but also footage of the spectators in the stands and of the spectators' message boards.
- However, because such spectator scenes are far fewer than the scenes of the players in action, it is difficult to learn them as important scenes by machine learning, and therefore difficult to include them in the digest video.
- One object of the present invention is to provide a video processing device capable of generating a digest video that includes audience scenes from a sports video or the like.
- The video processing device includes: video acquisition means for acquiring a material video; audience scene extraction means for extracting, from the material video, an audience scene showing the audience; important scene extraction means for extracting an important scene from the material video; associating means for associating the audience scene with the important scene; and generation means for generating a digest video including the important scene and the audience scene associated with the important scene.
- The video processing method: acquires a material video; extracts, from the material video, an audience scene showing the audience; extracts an important scene from the material video; associates the audience scene with the important scene; and generates a digest video including the important scene and the audience scene associated with the important scene.
- The recording medium records a program that causes a computer to execute a process of: acquiring a material video; extracting, from the material video, an audience scene showing the audience; extracting an important scene from the material video; associating the audience scene with the important scene; and generating a digest video including the important scene and the audience scene associated with the important scene.
- the overall configuration of the digest generator according to the embodiment is shown.
- An example of a digest video is shown.
- the configuration of the digest generator during training and inference is shown.
- A block diagram showing the hardware configuration of the digest generation device.
- An example of an image of the audience seats is shown.
- the method of including the audience scene in the digest video is schematically shown.
- The functional configuration of the digest generation device according to the first embodiment is shown.
- A flowchart of the digest generation process.
- A flowchart of the audience scene extraction process.
- The functional configuration of the training device for the audience scene extraction model is shown.
- A flowchart of the training process.
- A block diagram showing the functional configuration of the video processing device according to the second embodiment.
- FIG. 1 shows the overall configuration of the digest generation device 100 according to the embodiment.
- the digest generation device 100 is connected to a material video database (hereinafter, “database” is also referred to as “DB”) 2.
- the material video DB 2 stores various material videos, that is, moving images.
- the material video may be, for example, a video such as a television program broadcast from a broadcasting station, or a video distributed on the Internet or the like.
- the material video may or may not include audio.
- the digest generation device 100 generates and outputs a digest video using a part of the material video stored in the material video DB 2.
- the digest video is a video that connects important scenes in the material video in chronological order.
- The digest generation device 100 generates the digest video using a digest generation model trained by machine learning (hereinafter also simply referred to as the "generative model").
- For example, a model using a neural network can be used as the generative model.
- FIG. 2 shows an example of a digest video.
- the digest generation device 100 extracts scenes A to D included in the material video as important scenes, and generates a digest video in which these are connected in chronological order.
- the important scenes extracted from the material video may be repeatedly used in the digest video depending on the content.
- FIG. 3A is a block diagram showing a configuration for training a generative model used by the digest generator 100.
- a pre-prepared training data set is used to train the generative model.
- the training data set is a pair of training material video and correct answer data showing the correct answer to the training material video.
- the correct answer data is data in which a tag indicating the correct answer (hereinafter referred to as “correct answer tag”) is attached to the position of an important scene in the training material video.
- the correct answer tag is added to the correct answer data by an experienced editor or the like. For example, for a baseball commentary material video, a baseball commentator or the like selects a highlight scene during the game and assigns a correct answer tag.
- the method of assigning the correct answer tag by the editor may be learned by machine learning or the like, and the correct answer tag may be automatically assigned.
- the training material video is input to the generative model M.
- The generative model M extracts important scenes from the material video. Specifically, the generative model M extracts a feature amount from a set of one or more frames constituting the material video, and calculates the importance (importance score) of that part of the material video based on the extracted feature amount. The generative model M then outputs the portions whose importance is equal to or higher than a predetermined threshold as important scenes.
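The thresholding step described above can be illustrated with a minimal sketch. It assumes the importance scores have already been computed per frame set by some model; the function name `extract_important_scenes`, the score values, and the threshold are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def extract_important_scenes(scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges, end exclusive, where score >= threshold."""
    scenes: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i                      # a run of important frame sets begins
        elif score < threshold and start is not None:
            scenes.append((start, i))      # the run ended at the previous index
            start = None
    if start is not None:                  # a run that lasts until the end
        scenes.append((start, len(scores)))
    return scenes


# Hypothetical importance scores, one per frame set.
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.7]
print(extract_important_scenes(scores, threshold=0.7))  # prints [(1, 3), (4, 6)]
```

Consecutive frame sets at or above the threshold are merged into one scene here; how the actual model groups frames is not specified in the text.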
- the training unit 4 optimizes the generative model M by using the output of the generative model M and the correct answer data.
- Specifically, the training unit 4 compares the important scenes output by the generative model M with the scenes indicated by the correct answer tags in the correct answer data, and updates the parameters of the generative model M so as to reduce the error (loss).
- The trained generative model M thus obtained can extract from a material video, as important scenes, scenes close to those to which an editor would attach correct answer tags.
- FIG. 3B shows a configuration at the time of inference by the digest generation device 100.
- The material video from which a digest video is to be generated is input to the trained generative model M.
- The generative model M calculates the importance from the material video, extracts the portions whose importance is equal to or higher than a predetermined threshold as important scenes, and outputs them to the digest generation unit 5.
- The digest generation unit 5 generates and outputs a digest video by connecting the important scenes extracted by the generative model M. In this way, the digest generation device 100 generates a digest video from the material video by using the trained generative model M.
- FIG. 4 is a block diagram showing a hardware configuration of the digest generation device 100.
- the digest generator 100 includes an interface (IF) 11, a processor 12, a memory 13, a recording medium 14, and a database (DB) 15.
- IF11 inputs and outputs data to and from an external device.
- the material video stored in the material video DB 2 is input to the digest generation device 100 via the IF 11.
- the digest video generated by the digest generation device 100 is output to an external device through the IF 11.
- the processor 12 is a computer such as a CPU (Central Processing Unit), and controls the entire digest generation device 100 by executing a program prepared in advance. Specifically, the processor 12 executes a training process and a digest generation process, which will be described later.
- the memory 13 is composed of a ROM (Read Only Memory), a RAM (Random Access Memory), and the like.
- the memory 13 is also used as a working memory during execution of various processes by the processor 12.
- The recording medium 14 is a non-volatile, non-transitory recording medium such as a disk-shaped recording medium or a semiconductor memory, and is configured to be removable from the digest generation device 100.
- the recording medium 14 records various programs executed by the processor 12. When the digest generator 100 executes various processes, the program recorded in the recording medium 14 is loaded into the memory 13 and executed by the processor 12.
- the database 15 temporarily stores the material video input through the IF 11, the digest video generated by the digest generator 100, and the like. Further, the database 15 stores information on the trained generative model used by the digest generator 100, a training data set used for training the generative model, and the like.
- the digest generation device 100 may include an input unit such as a keyboard and a mouse for the creator to give instructions and inputs, and a display unit such as a liquid crystal display.
- When generating a digest video from a material video such as a sports game video, the digest generation device 100 extracts scenes showing the audience seats (hereinafter referred to as "audience scenes") and includes them in the digest video.
- The digest generation device 100 is characterized in that each audience scene extracted from the material video is associated with an important scene extracted from the material video and is then included in the digest video.
- FIG. 5A shows an example of an image of the audience seats. This video is a video of the audience seats including a large number of spectators.
- FIG. 6 schematically shows a method of including the audience scene in the digest video.
- the time in the material image is shown on the horizontal axis.
- the digest generation device 100 extracts the audience scene from the material image by preprocessing.
- the audience scenes A and B are extracted from the material video.
- the digest generation device 100 extracts important scenes from the material video by the above-mentioned method.
- important scenes 1 to 3 are extracted from the material video.
- the digest generation device 100 performs a process of associating the audience scenes A and B with any of the important scenes. Then, when the association is possible, the digest generation device 100 arranges the audience scenes before or after the important scenes associated with each other on the time axis to generate the digest video.
- the audience scene is associated with an important scene based on the time in the material image. Specifically, the first method associates the audience scene with the important scene with the closest time in the material video.
- the audience scene may be associated with the important scene only when the time interval (time difference) between the audience scene and the important scene is equal to or less than a predetermined threshold value. In this case, if the time interval between the spectator scene and the nearest important scene is larger than the threshold value, the spectator scene is not associated with the important scene.
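The time-based association with a threshold can be sketched as follows. This is only an illustration: the `Scene` class, the use of a single representative time in seconds for each scene, and the example values are assumptions, not details given in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scene:
    name: str
    time: float  # representative time in the material video, in seconds (assumed)


def associate_by_time(audience: Scene, important: List[Scene], t_th: float) -> Optional[Scene]:
    """Return the important scene closest in time, or None if it is farther than t_th."""
    if not important:
        return None
    nearest = min(important, key=lambda s: abs(s.time - audience.time))
    if abs(nearest.time - audience.time) <= t_th:
        return nearest
    return None  # too far from every important scene: no association


# Hypothetical times chosen only for illustration.
important_scenes = [Scene("important scene 1", 20.0), Scene("important scene 2", 100.0)]
print(associate_by_time(Scene("audience scene A", 10.0), important_scenes, t_th=30.0))
print(associate_by_time(Scene("audience scene B", 200.0), important_scenes, t_th=30.0))
```

Returning `None` for an unassociated audience scene leaves the caller free to try another association method or to drop the scene.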
- The position of the audience scene relative to the important scene follows their positional relationship in the material video.
- For example, since audience scene A precedes important scene 1 in the material video, audience scene A is placed before important scene 1 in the digest video.
- Conversely, when an audience scene follows its associated important scene in the material video, the audience scene is placed after that important scene.
- The digest generation device 100 recognizes the colors of the clothes, hats, and the like worn by the people in an audience scene extracted from the material video, or of the objects those people hold (for example, megaphones and cheering flags), and extracts information about the color that occupies most of the audience seats.
- The digest generation device 100 acquires information about the color from the audience scene and associates the scene with an important scene of a team whose team color is the same as or similar to that color. For example, suppose that the material video is a match between Team A and Team B, that the team color of Team A is red, and that the team color of Team B is blue.
- In this case, the digest generation device 100 associates an audience scene in which most of the audience seats are red with an important scene related to Team A (for example, a scoring scene of Team A), and associates an audience scene in which most of the audience seats are blue with an important scene related to Team B.
- Each audience scene may be associated with the important scene of that team that is closest in time.
- each spectator scene may be associated with a randomly selected important scene from a plurality of important scenes of the team.
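A minimal sketch of the color-based association is shown below. It assumes that color recognition has already produced one color label per detected person or object; the function names and the label vocabulary are hypothetical. Only exact color matches are handled here, whereas the text also allows similar colors.

```python
from collections import Counter
from typing import Dict, List, Optional


def dominant_color(color_labels: List[str]) -> Optional[str]:
    """Return the most frequent color label in the audience scene, if any."""
    if not color_labels:
        return None
    return Counter(color_labels).most_common(1)[0][0]


def team_for_audience_scene(color_labels: List[str], team_colors: Dict[str, str]) -> Optional[str]:
    """Return the team whose team color equals the dominant color, else None."""
    color = dominant_color(color_labels)
    for team, team_color in team_colors.items():
        if team_color == color:
            return team
    return None


team_colors = {"Team A": "red", "Team B": "blue"}
# Hypothetical labels recognized from clothes, hats, megaphones, and flags.
print(team_for_audience_scene(["red", "red", "blue", "red"], team_colors))  # prints Team A
```

Once the team is known, the audience scene can be attached to one of that team's important scenes, for example the one closest in time or a randomly selected one, as the text describes.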
- The digest generation device 100 recognizes character strings, such as cheering messages written on message boards, placards, and cheering flags, in an audience scene extracted from the material video, and associates the audience scene with an important scene related to the character string.
- Specifically, when a team name, a player name, a player's uniform number, or the like is written on a message board shown in the audience scene, the digest generation device 100 associates the audience scene with an important scene of the team indicated by the character string or of the team to which the indicated player belongs. For example, as shown in FIG. 5B, when the message "Go! GIANTS!" is written on a message board shown in the audience scene, the digest generation device 100 associates this audience scene with an important scene of the team "GIANTS".
- In this case as well, the digest generation device 100 may associate each audience scene with the important scene of that team that is closest in time, or with an important scene randomly selected from a plurality of important scenes of that team.
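The character-string association can be sketched as a lookup from recognized text to a team. The sketch assumes that text recognition has already been run on the audience scene; the function name, the roster mapping, and the player name "Suzuki" are hypothetical placeholders, and matching is a simple case-insensitive substring test.

```python
from typing import Dict, List, Optional


def team_from_strings(strings: List[str], teams: List[str], roster: Dict[str, str]) -> Optional[str]:
    """Map recognized strings to a team via team names first, then player names."""
    for text in strings:
        lowered = text.lower()
        for team in teams:
            if team.lower() in lowered:
                return team                 # the string names the team directly
        for player, team in roster.items():
            if player.lower() in lowered:
                return team                 # the string names a player on that team
    return None


teams = ["GIANTS", "Team B"]
roster = {"Suzuki": "Team B"}  # hypothetical player-to-team mapping
print(team_from_strings(["Go! GIANTS!"], teams, roster))  # prints GIANTS
```

Uniform numbers could be handled the same way by adding them as keys of the roster mapping, although a number alone may be ambiguous between teams.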
- the digest generation device 100 associates the audience scene A with the important scene 1 by the first method and arranges the audience scene A in front of the important scene 1.
- For audience scene A, since the time interval Δt12 between time t1 of audience scene A and time t2 of important scene 1 in the material video is smaller than the predetermined threshold Tth, audience scene A is associated with important scene 1.
- For audience scene B, since the time interval Δt35 to important scene 2 and the time interval Δt45 to important scene 3 are both larger than the predetermined threshold Tth, audience scene B is not associated with any important scene by the first method.
- the audience scene B is associated with the important scene 2 by either the second method or the third method.
- any one of the above-mentioned first to third methods may be used, or two or more may be used in combination.
- the priority when two or more are used in combination can be arbitrarily determined.
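One way to combine several association methods with a priority order is to try them in sequence and keep the first result. The sketch below is an assumption about how such a combination could be organized; the stand-in methods `by_time` and `by_color` simply return fixed answers that mirror the example of audience scenes A and B.

```python
from typing import Callable, Iterable, Optional

# An association method takes an audience scene identifier and returns
# the identifier of an important scene, or None if it cannot associate.
Method = Callable[[str], Optional[str]]


def associate_with_priority(audience_scene: str, methods: Iterable[Method]) -> Optional[str]:
    """Try each method in priority order and return the first association found."""
    for method in methods:
        result = method(audience_scene)
        if result is not None:
            return result
    return None


# Stand-in methods with fixed results, for illustration only.
def by_time(scene: str) -> Optional[str]:
    return {"A": "important scene 1"}.get(scene)


def by_color(scene: str) -> Optional[str]:
    return {"B": "important scene 2"}.get(scene)


print(associate_with_priority("A", [by_time, by_color]))  # prints important scene 1
print(associate_with_priority("B", [by_time, by_color]))  # prints important scene 2
```

Changing the order of the list changes the priority, which matches the statement that the priority can be determined arbitrarily.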
- The digest generation device 100 need not associate every audience scene extracted from the material video with an important scene and include it in the digest video. If there are many audience scenes, some of them may be selected, associated with important scenes, and included in the digest video. In addition, only the audience scenes that were successfully associated with an important scene by one or more of the first to third methods may be included in the digest video, and the audience scenes that could not be associated may be left out.
- FIG. 7 is a block diagram showing a functional configuration of the digest generation device 100 according to the first embodiment.
- the digest generation device 100 includes an audience scene extraction unit 21, an audience scene DB 22, an important scene extraction unit 23, an association unit 24, and a digest generation unit 25.
- the material video is input to the audience scene extraction unit 21 and the important scene extraction unit 23.
- the spectator scene extraction unit 21 extracts the spectator scene from the material video and stores it in the spectator scene DB 22.
- the spectator scene is a video showing the spectator seats in a sports game video or the like.
- the spectator scene extraction unit 21 uses, for example, a neural network to extract spectator scenes using a pre-trained model. The model training method will be described later.
- the spectator scene extraction unit 21 extracts the spectator scene from the material video as a preprocessing for generating the digest video, and stores it in the spectator scene DB 22.
- the spectator scene extraction unit 21 also extracts the time information of each spectator scene used in the above-mentioned first method as accompanying information and stores it in the spectator scene DB 22 in association with the spectator scene. Further, the spectator scene extraction unit 21 also extracts information on colors and information on character strings used in the second method described above as accompanying information, and stores the information in the spectator scene DB 22 in association with the spectator scene.
- the important scene extraction unit 23 extracts important scenes from the material video by the method described with reference to FIG. 3 and outputs the important scenes to the association unit 24.
- the associating unit 24 associates the spectator scenes stored in the spectator scene DB 22 with the important scenes extracted by the important scene extraction unit 23. Specifically, the association unit 24 associates the audience scene with the important scene by using any one or a plurality of combinations of the above-mentioned first to third methods, and outputs the spectator scene to the digest generation unit 25.
- For an important scene with which an audience scene is associated, the association unit 24 outputs the pair of the audience scene and the important scene to the digest generation unit 25; for an important scene with which no audience scene is associated, it outputs only the important scene to the digest generation unit 25.
- the digest generation unit 25 generates a digest video by connecting important scenes input from the association unit 24 in chronological order. At that time, the digest generation unit 25 inserts each audience scene before or after the associated important scene.
- the associating unit 24 may generate arrangement information indicating whether each spectator scene is arranged before or after the important scene, and output the spectator scene and the important scene to the digest generation unit 25. In this case, the digest generation unit 25 may determine the insertion position of the audience scene with reference to the input arrangement information. In this way, the digest generation unit 25 generates and outputs a digest video including the audience scene.
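The assembly of the digest from important scenes, associated audience scenes, and arrangement information can be sketched as below. The `ImportantScene` class, the boolean `audience_before` flag standing in for the arrangement information, and the limit of one audience scene per important scene are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImportantScene:
    name: str
    time: float                      # time in the material video (assumed unit: seconds)
    audience: Optional[str] = None   # associated audience scene, if any
    audience_before: bool = True     # arrangement information: before or after


def build_digest(scenes: List[ImportantScene]) -> List[str]:
    """Order important scenes chronologically and insert audience scenes around them."""
    order: List[str] = []
    for scene in sorted(scenes, key=lambda s: s.time):
        if scene.audience and scene.audience_before:
            order.append(scene.audience)
        order.append(scene.name)
        if scene.audience and not scene.audience_before:
            order.append(scene.audience)
    return order


digest = build_digest([
    ImportantScene("important scene 2", 100.0, "audience scene B", audience_before=False),
    ImportantScene("important scene 1", 20.0, "audience scene A", audience_before=True),
    ImportantScene("important scene 3", 130.0),
])
print(digest)
```

The returned list is only the playback order; cutting and joining the actual video segments is outside the scope of this sketch.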
- FIG. 8 is a flowchart of the digest generation process executed by the digest generation device 100. This process is realized by the processor 12 shown in FIG. 4 executing a program prepared in advance and operating as each element shown in FIG. 7.
- the spectator scene extraction unit 21 performs the spectator scene extraction process as a pre-process (step S11).
- FIG. 9 is a flowchart of the audience scene extraction process.
- the audience scene extraction unit 21 acquires the material image (step S21) and detects the audience scene from the material image (step S22). Then, when the spectator scene is detected (step S23: Yes), the spectator scene extraction unit 21 saves the spectator scene in the spectator scene DB 22 (step S24).
- The audience scene extraction unit 21 then determines whether steps S21 to S24 have been performed to the end of the material video (step S25). If not (step S25: No), it repeats steps S21 to S24.
- When the processing has reached the end of the material video (step S25: Yes), the process ends.
- In this way, audience scenes are extracted from the material video.
- As accompanying information of the audience scenes, the time of each audience scene, the information about the colors and character strings included in the audience scene, and the like are also acquired.
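The loop of steps S21 to S25 can be sketched as a pass over the material video in segments. The segment dictionaries and the `is_audience` callable, which stands in for the trained audience scene extraction model, are hypothetical; the step numbers in the comments map the code to the flowchart as described in the text.

```python
from typing import Callable, Dict, Iterable, List


def extract_audience_scenes(
    segments: Iterable[Dict],
    is_audience: Callable[[Dict], bool],
) -> List[Dict]:
    """Scan the material video segment by segment and collect audience scenes."""
    audience_db: List[Dict] = []          # stands in for the audience scene DB 22
    for segment in segments:              # S21: acquire the next part of the material video
        if is_audience(segment):          # S22/S23: detect whether it is an audience scene
            audience_db.append(segment)   # S24: save it together with its accompanying info
    return audience_db                    # S25: the loop ends at the end of the video


# Hypothetical segments; the "label" field replaces a real detector for illustration.
segments = [
    {"time": 0.0, "label": "play"},
    {"time": 5.0, "label": "audience", "color": "red"},
    {"time": 9.0, "label": "play"},
]
print(extract_audience_scenes(segments, lambda s: s["label"] == "audience"))
```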
- the important scene extraction unit 23 extracts important scenes from the material video (step S12).
- The association unit 24 associates the extracted important scenes with the audience scenes stored in the audience scene DB 22 by using any one or more of the above-mentioned first to third methods (step S13).
- the association unit 24 outputs the important scene to which the spectator scene is associated and the important scene to which the spectator scene is not associated to the digest generation unit 25.
- the digest generation unit 25 connects the important scenes in chronological order, inserts the corresponding audience scenes before or after the important scenes, and generates the digest video (step S14). In this way, the digest video generation process is completed.
- FIG. 10 shows a functional configuration of a training device for training an audience scene extraction model Mx.
- the training device 200 includes an audience scene extraction model Mx and a training unit 4x.
- a training data set for training the audience scene extraction model Mx is prepared.
- the training data set includes training material images and correct answer data.
- the correct answer data is data in which a correct answer tag indicating the correct answer is attached to the audience scene included in the training material video.
- Training material video is input to the audience scene extraction model Mx.
- the spectator scene extraction model Mx extracts the feature amount from the input training material image, extracts the spectator scene based on the feature amount, and outputs it to the training unit 4x.
- the training unit 4x optimizes the spectator scene extraction model Mx by using the spectator scene output by the spectator scene extraction model Mx and the correct answer data. Specifically, the training unit 4x calculates the loss by comparing the audience scene extracted by the audience scene extraction model Mx with the scene to which the correct answer tag is attached, and the audience scene extraction model Mx so as to reduce the loss. Update the parameters. In this way, a trained audience scene extraction model Mx is obtained.
- FIG. 11 is a flowchart of the training process by the training device 200. This process is actually realized by the processor 12 shown in FIG. 4 executing a program prepared in advance and operating as each element shown in FIG.
- the audience scene extraction model Mx extracts the audience scene from the training material video (step S31).
- The training unit 4x optimizes the audience scene extraction model Mx using the audience scene output from the audience scene extraction model Mx and the correct answer data (step S32).
- the training device 200 determines whether or not the training end condition is satisfied (step S33).
- The training end condition is, for example, that all the training data sets prepared in advance have been used, or that the loss value calculated by the training unit 4x has converged within a predetermined range. The audience scene extraction model Mx is trained until the training end condition is satisfied, and when the condition is satisfied, the training process ends.
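The training loop of steps S31 to S33 can be illustrated with a deliberately small stand-in model. The text describes a neural network; the sketch below swaps in a one-feature logistic classifier trained by gradient descent so that it stays self-contained. The feature values, the learning rate, and the convergence tolerance are assumptions.

```python
import math
from typing import List, Tuple


def train_step(w: float, b: float, xs: List[float], ys: List[int], lr: float) -> Tuple[float, float, float]:
    """One update: predict (S31), compute the loss against the tags, update parameters (S32)."""
    n = len(xs)
    loss = grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))       # predicted probability of "audience scene"
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        grad_w += (p - y) * x
        grad_b += (p - y)
    # The returned loss is the mean loss measured before this update was applied.
    return w - lr * grad_w / n, b - lr * grad_b / n, loss / n


def train(xs: List[float], ys: List[int], lr: float = 0.5,
          max_epochs: int = 1000, tol: float = 1e-6) -> Tuple[float, float, float]:
    """Repeat updates until the loss change falls below tol (S33) or epochs run out."""
    w = b = 0.0
    prev = float("inf")
    loss = prev
    for _ in range(max_epochs):
        w, b, loss = train_step(w, b, xs, ys, lr)
        if abs(prev - loss) < tol:     # end condition: the loss has converged
            break
        prev = loss
    return w, b, loss


# Hypothetical one-dimensional features; label 1 marks a tagged audience scene.
w, b, loss = train([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
print(w > 0, loss < math.log(2))
```

The two end conditions named in the text correspond here to the epoch limit (all prepared data has been used a fixed number of times) and to the tolerance check on the loss.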
- FIG. 12 is a block diagram showing a functional configuration of the video processing apparatus according to the second embodiment.
- The video processing device 70 includes video acquisition means 71, audience scene extraction means 72, important scene extraction means 73, associating means 74, and generation means 75.
- The video acquisition means 71 acquires a material video.
- The audience scene extraction means 72 extracts an audience scene showing the audience from the material video.
- the important scene extraction means 73 extracts important scenes from the material video.
- the associating means 74 associates the audience scene with the important scene.
- the generation means 75 generates a digest video including an important scene and an audience scene associated with the important scene.
- Appendix 1 A video processing device comprising: video acquisition means for acquiring a material video; audience scene extraction means for extracting, from the material video, an audience scene showing the audience; important scene extraction means for extracting an important scene from the material video; associating means for associating the audience scene with the important scene; and generation means for generating a digest video including the important scene and the audience scene associated with the important scene.
- Appendix 2 The video processing device according to Appendix 1, wherein the generation means arranges the important scenes in chronological order and places the audience scene associated with each important scene before or after that important scene to generate the digest video.
- Appendix 3 The video processing device according to Appendix 1 or 2, wherein the associating means associates an audience scene existing at a position within a predetermined time before and after the important scene with the important scene.
- the spectator scene extraction means extracts information about colors included in the spectator scene and obtains information.
- the video processing device according to any one of Supplementary note 1 to 3, wherein the association means associates the audience scene with the important scene based on the information regarding the color.
- the material image is a sports image and is The audience scene extraction means extracts the colors of people's clothes or objects possessed by the people included in the audience scene.
- the image processing apparatus according to any one of Supplementary note 1 to 3, wherein the association means associates the audience scene with an important scene in which a team whose team color is a color extracted from the audience scene is projected.
- the spectator scene extraction means extracts a character string included in the spectator scene and obtains a character string.
- the video processing device according to any one of Supplementary note 1 to 5, wherein the association means associates the audience scene with the important scene based on the character string.
- Appendix 7 The material video is a sports video; the audience scene extraction means extracts a character string shown on a message board included in the audience scene, or on an object worn or held by a person included in the audience scene; and the associating means associates the audience scene with an important scene showing the team indicated by the extracted character string, or the team to which the player indicated by that character string belongs.
- Appendix 8 The video processing device according to any one of Appendices 1 to 7, wherein the audience scene extraction means extracts the audience scene using a model trained on a training data set that includes training material videos prepared in advance and correct answer data indicating the audience scenes in those training material videos.
- a recording medium recording a program that causes a computer to execute a process of generating a digest video including the important scene and an audience scene associated with the important scene.
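The time-window association and chronological arrangement described in the appendices above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the names `Scene`, `gap`, `associate`, `build_digest`, and `window` are hypothetical, scenes are assumed to be already extracted as start/end times in seconds, each audience scene is assumed to be linked to its nearest important scene only, and audience scenes are placed after (rather than before) the important scene.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Scene:
    """A time span in the material video, in seconds (hypothetical type)."""
    start: float
    end: float


def gap(a: Scene, b: Scene) -> float:
    """Time between two scenes; 0.0 if they touch or overlap."""
    return max(0.0, a.start - b.end, b.start - a.end)


def associate(important: List[Scene], audience: List[Scene],
              window: float) -> Dict[int, List[Scene]]:
    """Map the index of each important scene to its audience scenes.

    Assumption: an audience scene is linked to the nearest important scene,
    and only if the gap between them is at most `window` seconds.
    """
    linked: Dict[int, List[Scene]] = {i: [] for i in range(len(important))}
    if not important:
        return linked
    for aud in audience:
        nearest = min(range(len(important)),
                      key=lambda i: gap(aud, important[i]))
        if gap(aud, important[nearest]) <= window:
            linked[nearest].append(aud)
    return linked


def build_digest(important: List[Scene], audience: List[Scene],
                 window: float) -> List[Scene]:
    """Order important scenes by start time, each followed by its audience scenes."""
    linked = associate(important, audience, window)
    order = sorted(range(len(important)), key=lambda i: important[i].start)
    digest: List[Scene] = []
    for i in order:
        digest.append(important[i])
        # Assumption: audience scenes go after the important scene.
        digest.extend(sorted(linked[i], key=lambda s: s.start))
    return digest
```

With a 5-second window, for example, an audience scene at 21 to 25 s would be attached to an important scene at 10 to 20 s, while an audience scene at 60 to 65 s with no important scene nearby would be left out of the digest. Placing the audience scene before the important scene instead would only require swapping the `append` and `extend` calls.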
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Television Signal Processing For Recording (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/926,694 US20230199194A1 (en) | 2020-05-27 | 2020-05-27 | Video processing device, video processing method, and recording medium |
| JP2022527349A JP7420245B2 (ja) | 2020-05-27 | 2020-05-27 | Video processing device, video processing method, and program |
| PCT/JP2020/020868 WO2021240678A1 (ja) | 2020-05-27 | 2020-05-27 | Video processing device, video processing method, and recording medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/020868 WO2021240678A1 (ja) | 2020-05-27 | 2020-05-27 | Video processing device, video processing method, and recording medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021240678A1 (ja) | 2021-12-02 |
Family
ID=78723076
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/020868 (Ceased) WO2021240678A1 (ja) | 2020-05-27 | 2020-05-27 | Video processing device, video processing method, and recording medium |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20230199194A1 |
| JP (1) | JP7420245B2 |
| WO (1) | WO2021240678A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2024142883A1 (ja) * | 2022-12-27 | 2024-07-04 | NEC Corporation | Search device, search method, and recording medium |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12354356B2 (en) * | 2020-05-27 | 2025-07-08 | Nec Corporation | Video processing device, video processing method, and recording medium |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2006008923A2 (ja) * | 2004-06-29 | 2006-01-26 | Matsushita Electric Industrial Co Ltd | Video editing device and method |
| JP2011504702A (ja) * | 2007-11-22 | 2011-02-10 | Koninklijke Philips Electronics N.V. | Method of generating a video summary |
| JP2014229092A (ja) * | 2013-05-23 | 2014-12-08 | Nikon Corporation | Image processing device, image processing method, and program therefor |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150297949A1 (en) * | 2007-06-12 | 2015-10-22 | Intheplay, Inc. | Automatic sports broadcasting system |
| US10318575B2 (en) * | 2014-11-14 | 2019-06-11 | Zorroa Corporation | Systems and methods of building and using an image catalog |
| US20170109584A1 (en) * | 2015-10-20 | 2017-04-20 | Microsoft Technology Licensing, Llc | Video Highlight Detection with Pairwise Deep Ranking |
| JP2021511729A (ja) * | 2018-01-18 | 2021-05-06 | Gumgum, Inc. | Augmentation of regions detected in image or video data |
2020
- 2020-05-27: WO application PCT/JP2020/020868, published as WO2021240678A1 (ja); status: not active (Ceased)
- 2020-05-27: US application US17/926,694, published as US20230199194A1 (en); status: not active (Abandoned)
- 2020-05-27: JP application JP2022527349A, published as JP7420245B2 (ja); status: active
Also Published As
| Publication number | Publication date |
|---|---|
| US20230199194A1 (en) | 2023-06-22 |
| JPWO2021240678A1 | 2021-12-02 |
| JP7420245B2 (ja) | 2024-01-23 |
Similar Documents
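| Publication | Publication Date | Title |
|---|---|---|
| US8121462B2 (en) | | Video edition device and method |
| JP6535323B2 (ja) | | Game video distribution device, game video distribution method, and game video distribution program |
| CN109691124B (zh) | | Method and system for automatically generating video highlights |
| JP2020127714A (ja) | | Method and system for generating audiovisual content from video game footage |
| JP2022521120A5 | | |
| US20080269924A1 (en) | | Method of summarizing sports video and apparatus thereof |
| US20250299487A1 (en) | | Video processing device, video processing method, and recording medium |
| US20200175457A1 (en) | | Evaluation of actor auditions |
| JP7420245B2 (ja) | | Video processing device, video processing method, and program |
| JP7086331B2 (ja) | | Digest video generation device and digest video generation program |
| WO2021240644A1 (ja) | | Information output program, device, and method |
| KR20240003876A (ko) | | System for recognizing player actions and game situations in sports game video |
| US12010371B2 (en) | | Information processing apparatus, video distribution system, information processing method, and recording medium |
| JP2005109566A (ja) | | Video summarization device, description generation device, video summarization method, description generation method, and program |
| CN116170651A (zh) | | Method, system, and storage medium for generating highlight-moment video from video and text input |
| JP7485023B2 (ja) | | Video processing device, video processing method, training device, and program |
| JP7780465B2 (ja) | | Learning device, moving image generation device, trained model generation method, moving image generation method, and program |
| US20250177858A1 (en) | | Apparatus, systems and methods for visual description |
| US10213691B2 (en) | | Schemes for using audio updates to link real-life events to game events after release of the game |
| KR20220157033A (ko) | | Video streaming service server for displaying advertisement information associated with video, and operating method thereof |
| JP7550949B1 (ja) | | Program |
| CN118692006A (zh) | | Method, apparatus, device, and medium for constructing a digital human for companion game viewing |
| JP7285349B2 (ja) | | Message output device, message output method, and program |
| WO2022149216A1 (ja) | | Information processing device, information processing method, and recording medium |
| JP6900792B2 (ja) | | Automatic generation device for dialogue video |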
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20937536; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022527349; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 20937536; Country of ref document: EP; Kind code of ref document: A1 |