WO2023047657A1 - Information processing apparatus and information processing method - Google Patents
Information processing apparatus and information processing method
- Publication number
- WO2023047657A1 (application PCT/JP2022/012459)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- emotion
- scene
- user
- information processing
- metadata
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—Two-dimensional [2D] image generation
- G06T11/60—Creating or editing images; Combining images with text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06V20/47—Detecting features for summarising video content
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
- H04N5/92—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/91—Television signal processing therefor
- H04N5/93—Regeneration of the television signal or of selected parts thereof
Definitions
- the present technology relates to an information processing device and an information processing method, and more particularly to an information processing device and the like that processes information related to video content.
- the purpose of this technology is to make it possible to effectively use emotion data that indicates the user's emotion for each scene of video content.
- An information processing apparatus comprising an extraction unit for extracting emotion-representing scenes based on emotion metadata having user emotion information for each scene of video content.
- the extraction unit extracts emotion representative scenes based on emotion metadata having user emotion information for each scene of video content. For example, the extraction unit may extract an emotion representative scene based on the type of user's emotion.
- the extraction unit may extract an emotion-representative scene based on the degree of user's emotion.
- the extraction unit may extract a scene in which the level of user's emotion exceeds a threshold value as an emotion representative scene.
- the extraction unit may extract an emotion-representing scene based on the statistical value of the user's emotional level of the entire video content.
- the statistical values may include, for example, maximum values, sorting results, average values or standard deviation values.
- emotion representative scenes are extracted based on emotion metadata having user emotion information for each scene of video content. This makes it possible to use the emotion metadata effectively when reproducing and editing the content.
- the present technology may further include, for example, a reproduction control unit that reproduces the emotion-representing scene extracted from the moving image content. This allows the user to view only the extracted emotion-representing scene.
- the present technology may further include, for example, an editing control unit that takes the extracted emotion-representative scenes out of the video content and generates new video content.
- the user can obtain new video content that includes only the extracted emotion-representative scenes.
- the present technology may further include, for example, a display control unit that displays the temporal position of the extracted emotion representative scene relative to the entire video content. This allows the user to easily recognize the temporal position of the extracted emotion-representing scene relative to the entire moving image content.
- the display control unit may display the type and degree of the user's emotion in the extracted emotion-representing scene at the time position, on a time axis slide bar corresponding to the entire video content, that corresponds to that scene.
- the user can then recognize the temporal position of the extracted emotion-representing scene relative to the entire video content from its position on the time axis slide bar, and can also easily recognize the type and degree of the user's emotion in that scene.
- the display control unit may display the type of user's emotion as a mark. This allows the user to intuitively recognize the type of emotion from the mark.
- FIG. 1 is a block diagram showing a configuration example of an information processing device that generates emotion metadata.
- FIG. 2 is a block diagram showing another configuration example of an information processing device that generates emotion metadata.
- FIG. 3 is a block diagram showing a configuration example of an information processing device that uses emotion metadata.
- FIG. 4 is a diagram for explaining a case where a scene in which the degree of the user's emotion exceeds a threshold is extracted as an emotion-representing scene.
- FIG. 5 is a diagram for explaining a case of extracting an emotion-representing scene based on a statistical value of the degree of the user's emotion in the entire moving image content.
- FIG. 6 is a diagram for explaining a display example and the like for displaying the position of an emotion-representing scene with respect to the entire moving image content.
- FIG. 7 is a block diagram showing another configuration example of an information processing device that uses emotion metadata.
- FIG. 1 shows a configuration example of an information processing device 100A that generates emotion metadata.
- This information processing device 100A includes a content database (content DB) 101, a content reproduction display unit 102, a facial image capturing camera 103, a biological information sensor 104, a user emotion analysis unit 105, a metadata generation unit 106, and a metadata rewriting unit 107.
- the content database 101 stores a plurality of video content files.
- the content database 101 supplies the moving image content file corresponding to the reproduced moving image file name to the content reproduction display unit 102.
- the name of the reproduced moving image file is designated by, for example, the user of the information processing apparatus 100A.
- the content reproduction display unit 102 reproduces the moving image content included in the moving image content file supplied from the content database 101, and displays the moving image on a display unit (not shown).
- the content playback display unit 102 also supplies a frame number (time code) to the metadata generation unit 106 in synchronization with the playback frame. This frame number is information that can specify a scene of moving image content.
- the facial image capturing camera 103 is a camera that captures the facial image of the user viewing the moving image displayed on the display unit by the content reproduction display unit 102. Face images of the respective frames obtained by the facial image capturing camera 103 are sequentially supplied to the user emotion analysis unit 105.
- the biometric information sensor 104 is a sensor attached to the user viewing the moving image displayed by the content reproduction display unit 102, and acquires biometric information such as heart rate, respiration rate, and perspiration amount.
- the biometric information of each frame acquired by the biometric information sensor 104 is sequentially supplied to the user emotion analysis unit 105.
- based on the face image of each frame sequentially supplied from the facial image capturing camera 103 and the biometric information of each frame sequentially supplied from the biometric information sensor 104, the user emotion analysis unit 105 analyzes the degree (level) of each predetermined type of user emotion for each frame and supplies the resulting user emotion information to the metadata generation unit 106.
- the user emotion information is not limited to secondary information obtained by analyzing facial images and biometric information, such as "happiness", "anger", "sorrow", and "comfort". It may also be primary information, that is, the biometric information itself, such as heart rate, respiration rate, and perspiration amount.
- the metadata generation unit 106 associates the user emotion information of each frame obtained by the user emotion analysis unit 105 with a frame number (time code) to generate emotion metadata having user emotion information for each frame of the video content, and supplies this emotion metadata to the metadata rewriting unit 107.
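The association of per-frame emotion information with frame numbers can be sketched as follows. This is only an illustration: the source does not specify a data format, so the `EmotionMetadata` layout (frame number mapped to emotion levels) and the function name `generate_emotion_metadata` are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical layout (not from the source): frame number -> {emotion type -> level}.
EmotionMetadata = Dict[int, Dict[str, float]]


def generate_emotion_metadata(
    samples: Iterable[Tuple[int, Dict[str, float]]],
) -> EmotionMetadata:
    """Associate each frame number (time code) with that frame's emotion levels."""
    metadata: EmotionMetadata = {}
    for frame_number, emotion_levels in samples:
        # Copy the levels so later changes to the analyzer output do not alter the metadata.
        metadata[frame_number] = dict(emotion_levels)
    return metadata


# Example: two frames as they might arrive from the analysis step.
samples = [
    (1, {"happiness": 0.25, "anger": 0.0, "sorrow": 0.0, "comfort": 0.5}),
    (2, {"happiness": 0.75, "anger": 0.0, "sorrow": 0.0, "comfort": 0.5}),
]
metadata = generate_emotion_metadata(samples)
```

Keying by frame number keeps each entry directly addressable by the time code that the content reproduction display unit supplies.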
- the metadata rewriting unit 107 adds the emotion metadata supplied from the metadata generation unit 106 as it is when no emotion metadata has yet been added to the moving image content file corresponding to the playback moving image file name. If emotion metadata has already been added to that file, the metadata rewriting unit 107 updates the existing emotion metadata using the emotion metadata supplied from the metadata generation unit 106.
- in this case, the metadata rewriting unit 107 updates with emotion metadata obtained by combining the emotion metadata supplied from the metadata generation unit 106 with the already added emotion metadata.
- weighted averaging is one conceivable combining method, but the method is not limited to this and other methods may be used. In the case of weighted averaging, when the already added emotion metadata relates to m users, the already added emotion metadata and the emotion metadata supplied from the metadata generation unit 106 are averaged with weights of m:1.
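The m:1 weighted averaging described above can be sketched as follows. The dictionary layout and the function name `merge_emotion_metadata` are assumptions for illustration, and the sketch further assumes that both sets of metadata cover the same frames and the same emotion types.

```python
from typing import Dict

# Hypothetical layout (not from the source): frame number -> {emotion type -> level}.
EmotionMetadata = Dict[int, Dict[str, float]]


def merge_emotion_metadata(
    existing: EmotionMetadata, m: int, new: EmotionMetadata
) -> EmotionMetadata:
    """Average metadata covering m earlier users with one new user's metadata (m:1)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    merged: EmotionMetadata = {}
    for frame, old_levels in existing.items():
        new_levels = new[frame]  # assumption: the new metadata covers the same frames
        merged[frame] = {
            emotion: (m * old + new_levels[emotion]) / (m + 1)
            for emotion, old in old_levels.items()
        }
    return merged


# Existing metadata from 3 users, combined with a 4th user's viewing.
combined = merge_emotion_metadata(
    {1: {"happiness": 1.0}}, 3, {1: {"happiness": 0.0}}
)
# combined[1]["happiness"] is (3 * 1.0 + 0.0) / 4 = 0.75
```

After such an update the stored metadata would relate to m + 1 users, so the user count would need to be kept alongside the metadata for the next update.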
- the information processing apparatus 100A shown in FIG. 1 generates emotion metadata having user emotion information for each frame of moving image content, and adds this emotion metadata to the moving image content file.
- This emotion metadata can be used when reproducing and viewing content, or when editing video content.
- FIG. 2 shows a configuration example of an information processing device 100B that generates emotion metadata.
- parts corresponding to those in FIG. 1 are denoted by the same reference numerals, and detailed description thereof will be omitted as appropriate.
- This information processing apparatus 100B includes a content database (content DB) 101, a content reproduction display unit 102, a facial image capturing camera 103, a biological information sensor 104, a user emotion analysis unit 105, a metadata generation unit 106, and a metadata database (metadata DB) 108.
- the metadata generation unit 106 associates the user emotion information of each frame obtained by the user emotion analysis unit 105 with a frame number (time code) to generate emotion metadata having user emotion information for each frame of the video content, and supplies this emotion metadata to the metadata database 108.
- the metadata database 108 stores emotion metadata corresponding to multiple video content files.
- the metadata database 108 stores the emotion metadata supplied from the metadata generation unit 106 in association with, for example, the moving image file name, so that the moving image content file to which the emotion metadata relates can be identified.
- the metadata database 108 stores the emotion metadata supplied from the metadata generation unit 106 as it is when emotion metadata corresponding to the name of the reproduced moving image file is not yet stored. If emotion metadata corresponding to that name is already stored, the metadata database 108 updates the stored emotion metadata using the emotion metadata supplied from the metadata generation unit 106.
- in this case, the metadata database 108 updates with emotion metadata obtained by combining the emotion metadata supplied from the metadata generation unit 106 with the already stored emotion metadata. Although detailed description is omitted, the combining method is the same as that of the metadata rewriting unit 107 in the information processing apparatus 100A of FIG. 1 described above.
- in this example, the emotion metadata stored in the metadata database 108 and the video content files stored in the content database 101 are linked by video file names, but they may instead be linked by link information such as URLs.
- in that case, for example, link information such as a URL for accessing the emotion metadata stored in the metadata database 108 is recorded as metadata in the corresponding moving image content file of the content database 101 to perform the linking.
- the rest of the information processing apparatus 100B shown in FIG. 2 is configured similarly to the information processing apparatus 100A shown in FIG. 1.
- emotion metadata having user emotion information for each frame of video content is generated, and this emotion metadata is stored in the metadata database 108 in association with the video content file.
- This emotion metadata can be used when playing back and watching moving image content or when editing moving image content.
- emotion metadata corresponding to a plurality of moving image content files are stored in the metadata database 108.
- because the emotion metadata is held separately in the metadata database 108, the process of extracting it from the video content file is unnecessary, so processing that uses only the emotion metadata, such as analysis, can be performed efficiently.
- FIG. 3 shows a configuration example of an information processing device 200A that uses emotion metadata.
- This information processing device 200A has a content database (content DB) 201, a content reproduction/editing section 202, a metadata extraction section 203, and an emotion representative scene extraction section 204.
- the content database 201 corresponds to the content database 101 shown in FIG. 1, and stores a plurality of moving image content files. Emotion metadata having user emotion information for each frame of the moving image content has been added to each moving image content file.
- the content database 201 supplies the moving image content file corresponding to the reproduced moving image file name to the content reproduction/editing unit 202 and the metadata extraction unit 203.
- the playback moving image file name is specified by, for example, the user of the information processing device 200A.
- the metadata extraction unit 203 extracts emotion metadata from the video content file supplied from the content database 201 and supplies it to the emotion representative scene extraction unit 204.
- the emotion representative scene extraction unit 204 extracts an emotion representative scene from the emotion metadata supplied from the metadata extraction unit 203.
- the emotion-representative scene extraction unit 204 extracts an emotion-representative scene based on the type of user's emotion.
- for example, when the emotion metadata has "happiness", "anger", "sorrow", and "comfort" as user emotion information for each frame of video content, one of these emotions is selected, and scenes in which the degree (level) of the selected emotion is equal to or greater than a threshold value are extracted as emotion representative scenes.
- selection of emotions and setting of thresholds can be arbitrarily performed by user operations, for example.
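Extraction by emotion type can be sketched as follows, assuming the emotion metadata is held as a mapping from frame number to per-emotion levels. That layout and the function name `extract_by_emotion_type` are illustrative assumptions; only the "equal to or greater than a threshold" condition comes from the text.

```python
from typing import Dict, List

# Hypothetical layout (not from the source): frame number -> {emotion type -> level}.
EmotionMetadata = Dict[int, Dict[str, float]]


def extract_by_emotion_type(
    metadata: EmotionMetadata, emotion: str, threshold: float
) -> List[int]:
    """Return frame numbers whose level for the selected emotion is >= threshold."""
    return [
        frame
        for frame in sorted(metadata)
        # Assumption: an emotion type absent from a frame is treated as level 0.
        if metadata[frame].get(emotion, 0.0) >= threshold
    ]


metadata = {
    1: {"happiness": 0.25, "sorrow": 0.75},
    2: {"happiness": 0.5, "sorrow": 0.25},
    3: {"happiness": 1.0, "sorrow": 0.0},
}
# The user selects "happiness" and sets the threshold to 0.5.
happy_scenes = extract_by_emotion_type(metadata, "happiness", 0.5)  # frames 2 and 3
```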
- the emotion-representative scene extraction unit 204 extracts an emotion-representative scene based on the degree of user's emotion.
- possible approaches include (1) extracting scenes in which the degree of the user's emotion exceeds a threshold value as emotion-representing scenes, and (2) extracting emotion-representing scenes based on statistical values of the degree of the user's emotion over the entire video content.
- first, the case of extracting scenes in which the degree of the user's emotion exceeds a threshold value as emotion-representing scenes is described.
- in this case, for example, when the emotion metadata has "happiness", "anger", "sorrow", and "comfort" as user emotion information for each frame of video content, scenes in which the degree (level) of each emotion exceeds the threshold value are extracted as emotion representative scenes.
- the threshold can be arbitrarily set by, for example, a user's operation.
- FIG. 4(a) shows an example of a change in the degree (level) of predetermined user emotion for each frame.
- the horizontal axis indicates the frame number fr
- the vertical axis indicates the degree Em(fr) of the user's emotion.
- the degree Em(fr_a) at the frame number fr_a exceeds the threshold th, so the frame number fr_a is stored as the emotion representative scene information L(1).
- likewise, the degree Em(fr_b) at the frame number fr_b exceeds the threshold th, so the frame number fr_b is stored as the emotion representative scene information L(2).
- the flowchart of FIG. 4(b) shows an example of the processing procedure of the emotion-representing scene extraction unit 204 when extracting a scene in which the level of user's emotion exceeds a threshold value as an emotion-representing scene.
- the emotion representative scene extraction unit 204 starts processing in step ST1.
- in step ST3, the emotion representative scene extraction unit 204 determines whether the degree Em(fr) is greater than the threshold th.
- when Em(fr) > th, the emotion representative scene extraction unit 204 stores emotion representative scene information in step ST4, that is, stores the frame number fr as the emotion representative scene L(n).
- the emotion representative scene extraction unit 204 also increments n to n+1.
- in step ST6, the emotion representative scene extraction unit 204 determines whether or not the frame number fr is greater than the last frame number fr_end, that is, makes the end determination.
- when fr > fr_end does not hold, the emotion representative scene extraction unit 204 returns to the processing of step ST3 and repeats the same processing as described above.
- when fr > fr_end holds, the emotion representative scene extraction unit 204 terminates the process in step ST7.
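The threshold procedure of FIG. 4(b) can be sketched as a simple loop. The text here does not state how fr and n are initialized or how fr advances, so this sketch assumes both start at 1 and that fr increases by one per iteration; the function name `extract_over_threshold` and the list-based representation of Em(fr) are also assumptions.

```python
from typing import Dict, Sequence


def extract_over_threshold(em: Sequence[float], th: float) -> Dict[int, int]:
    """Return {n: fr} for every frame whose degree Em(fr) exceeds the threshold th.

    em[0] holds Em(1), em[1] holds Em(2), and so on.
    """
    scenes: Dict[int, int] = {}
    fr, n = 1, 1            # assumed initial values (not stated in the text)
    fr_end = len(em)        # last frame number
    while fr <= fr_end:     # ST6: finish once fr > fr_end
        if em[fr - 1] > th:     # ST3: is Em(fr) greater than th?
            scenes[n] = fr      # ST4: store fr as emotion representative scene L(n)
            n += 1              # then increment n to n + 1
        fr += 1             # assumed: move on to the next frame
    return scenes


# Frames 2 and 4 exceed th = 0.5, so L(1) = 2 and L(2) = 4.
scenes = extract_over_threshold([0.1, 0.8, 0.3, 0.9], 0.5)
```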
- the statistical values in this case are maximum values, sorting results, mean values or standard deviation values.
- when the statistical value is the maximum value, for example, if the emotion metadata has "happiness", "anger", "sorrow", and "comfort" as user emotion information for each frame of video content, the scene in which the degree (level) of each emotion is at its maximum is extracted as the emotion representative scene.
- when the statistical value is the result of sorting, for the same emotion metadata, not only the scene in which the degree (level) of each emotion is highest but also the scenes ranked second and third are extracted as emotion representative scenes.
- when the statistical value is the average value or the standard deviation value, for the same emotion metadata, scenes in which the degree (level) of each emotion deviates greatly from the average are extracted as emotion representative scenes.
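The sorting-based and deviation-based extraction can be sketched as follows for a single emotion type. The text does not define what counts as deviating "greatly" from the average, so the `num_std` multiplier, the use of the population standard deviation, and both function names are assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def top_ranked_frames(em: Sequence[float], k: int = 3) -> List[int]:
    """Return the frame numbers of the k highest degrees, best first (frames start at 1)."""
    ranked = sorted(range(1, len(em) + 1), key=lambda fr: em[fr - 1], reverse=True)
    return ranked[:k]


def outlier_frames(em: Sequence[float], num_std: float = 2.0) -> List[int]:
    """Return frames whose degree differs from the mean by more than num_std deviations."""
    mu = mean(em)
    sigma = pstdev(em)  # assumption: population standard deviation over all frames
    return [
        fr
        for fr, level in enumerate(em, start=1)
        if abs(level - mu) > num_std * sigma
    ]


ranked = top_ranked_frames([0.1, 0.9, 0.4, 0.6], k=3)               # first to third rank
outliers = outlier_frames([0.1, 0.1, 0.1, 0.1, 0.9], num_std=1.5)   # frames far from the mean
```

With `k=1`, `top_ranked_frames` reduces to the maximum-value case.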
- FIG. 5(a) shows an example of a change in the degree (level) of predetermined user emotion for each frame.
- the horizontal axis indicates the frame number fr
- the vertical axis indicates the degree Em(fr) of the user's emotion.
- the degree Em(fr_a) of the frame number fr_a is the maximum value em_max, so the frame number fr_a is stored as the emotion representative scene information L.
- the flowchart of FIG. 5(b) shows an example of the processing procedure of the emotion-representing scene extraction unit 204 when extracting, as an emotion-representing scene, a scene in which the degree of user's emotion in the entire video content is the maximum value.
- the emotion representative scene extraction unit 204 starts processing in step ST11.
- in step ST13, the emotion representative scene extraction unit 204 determines whether the degree Em(fr) is greater than the maximum value em_max.
- when Em(fr) > em_max, the emotion representative scene extraction unit 204 stores emotion representative scene information in step ST14, that is, stores the frame number fr as the emotion representative scene L. The emotion representative scene extraction unit 204 also updates em_max to Em(fr) in step ST14.
- in step ST16, the emotion representative scene extraction unit 204 determines whether or not the frame number fr is greater than the last frame number fr_end, that is, makes the end determination.
- when fr > fr_end does not hold, the emotion representative scene extraction unit 204 returns to the processing of step ST13 and repeats the same processing as described above.
- when fr > fr_end holds, the emotion representative scene extraction unit 204 terminates the process in step ST17.
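The maximum-value procedure of FIG. 5(b) can be sketched in the same way. The initial values of fr and em_max and the per-iteration increment of fr are not given in the text above, so they are assumptions here, as is the function name `extract_max_frame`.

```python
from typing import Optional, Sequence


def extract_max_frame(em: Sequence[float]) -> Optional[int]:
    """Return the frame number L at which Em(fr) is largest, or None if there are no frames.

    em[0] holds Em(1), em[1] holds Em(2), and so on.
    """
    scene: Optional[int] = None
    fr = 1                      # assumed initial frame number
    em_max = float("-inf")      # assumed initial maximum, below any possible degree
    fr_end = len(em)            # last frame number
    while fr <= fr_end:         # ST16: finish once fr > fr_end
        if em[fr - 1] > em_max:     # ST13: is Em(fr) greater than em_max?
            scene = fr              # ST14: store fr as emotion representative scene L
            em_max = em[fr - 1]     # ST14: update em_max to Em(fr)
        fr += 1                 # assumed: move on to the next frame
    return scene


best = extract_max_frame([0.2, 0.9, 0.9, 0.1])
```

Because the comparison in ST13 is strict, the earliest frame is kept when several frames share the maximum degree.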
- the emotion-representative scene extraction unit 204 supplies the emotion-representative scene information to the content reproduction/editing unit 202.
- the content reproduction/editing unit 202 reproduces video content included in a video content file supplied from the content database 201.
- the content reproduction/editing unit 202 can reproduce part of the moving image content included in the moving image content file supplied from the content database 201 according to the user's operation or automatically.
- for example, under the control of a control unit (not shown), only the emotion representative scenes extracted by the emotion representative scene extraction unit 204 are reproduced. This allows the user to view only the extracted emotion-representing scenes.
- also, under the control of a control unit (not shown), the position of each extracted emotion-representing scene relative to the entire moving image content is displayed. As a result, the user can easily recognize the temporal position of the extracted emotion-representing scene with respect to the entire video content and can efficiently perform playback operations, for example reproducing only the extracted emotion representative scenes.
- the content reproduction/editing unit 202 edits the video content included in the video content file supplied from the content database 201 according to the user's operation or automatically to generate new video content.
- for example, under the control of a control unit (not shown), the emotion-representative scenes extracted by the emotion representative scene extraction unit 204 are taken out and new video content is generated. As a result, new video content that includes only the extracted emotion-representative scenes can be obtained automatically.
- also, under the control of a control unit (not shown), the position of each emotion-representing scene extracted by the emotion representative scene extraction unit 204 relative to the entire video content is displayed.
- FIG. 6(a) shows an example of displaying the position of the emotion-representing scene extracted by the emotion representative scene extraction unit 204 relative to the entire video content.
- a time axis slide bar 301 indicating progress of reproduction of moving image content is displayed at the bottom, and a reproduced image 302 is displayed at the top.
- this time axis slide bar 301 corresponds to the entire video content. At the time position on the slide bar corresponding to each extracted emotion representative scene, the type and degree of the user's emotion in that scene are displayed. The user can thus recognize the time position of the extracted emotion-representing scene with respect to the entire video content from its position on the time axis slide bar, and can also easily recognize the type and degree of the user's emotion in that scene.
- the type is indicated by a mark (icon) so that the user can intuitively recognize it, and the degree is indicated by a numerical value, but the display mode is not limited to this.
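Placing the marks on the time axis slide bar can be sketched as follows. The linear mapping from frame number to horizontal offset, the pixel units, and the names `slider_position` and `build_markers` are all assumptions for illustration; the source only states that the type and degree are displayed at the time position corresponding to the scene.

```python
from typing import Dict, List, Sequence, Tuple


def slider_position(frame_number: int, last_frame: int, bar_width: float) -> float:
    """Map a frame number (1..last_frame) to a horizontal offset along the slide bar."""
    if last_frame <= 1:
        return 0.0  # a single-frame content has only one possible position
    return (frame_number - 1) / (last_frame - 1) * bar_width


def build_markers(
    scenes: Sequence[int],
    metadata: Dict[int, Dict[str, float]],
    emotion: str,
    last_frame: int,
    bar_width: float,
) -> List[Tuple[float, str, float]]:
    """Return (offset, emotion type, degree) for each emotion representative scene."""
    return [
        (slider_position(fr, last_frame, bar_width), emotion, metadata[fr][emotion])
        for fr in scenes
    ]


# One representative scene at frame 51 of 101, drawn on a 200-unit-wide bar.
markers = build_markers([51], {51: {"sorrow": 0.75}}, "sorrow", 101, 200.0)
```

A renderer could then draw the mark (icon) for the emotion type and the numerical degree at each returned offset.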
- instead of displaying the type and degree of the user's emotion at the time position corresponding to the emotion-representative scene extracted by the emotion representative scene extraction unit 204, it is also conceivable to display the user emotion information for each frame of the moving image content as it is. In the illustrated example, only the information of "sorrow" and "comfort" is shown to simplify the drawing. In this case, as indicated by broken lines in FIG. 3, the emotion metadata extracted by the metadata extraction unit 203 is supplied to the content reproduction/editing unit 202, and display is performed based on this emotion metadata.
- as described above, in the information processing device 200A shown in FIG. 3, the emotion representative scene extraction unit 204 extracts emotion representative scenes based on emotion metadata having user emotion information for each frame of the moving image content. Emotion data indicating the user's emotion for each frame of the content can therefore be used effectively in playback and editing of the video content.
- FIG. 7 shows a configuration example of an information processing device 200B that uses emotion metadata. In FIG. 7, parts corresponding to those in FIG. 3 are denoted by the same reference numerals, and detailed description thereof will be omitted as appropriate.
- This information processing device 200B has a content database (content DB) 201, a content reproduction/editing unit 202, a metadata database (metadata DB) 205, and an emotion representative scene extraction unit 204.
- the metadata database 205 corresponds to the metadata database 108 shown in FIG. 2, and stores emotion metadata linked to each of the plurality of video content files stored in the content database 201. Note that this example shows an example in which the linking is performed by the video file name.
- the same playback video file name as that input to the content database 201 is input to the metadata database 205, so that the emotion metadata associated with the video content file supplied from the content database 201 to the content reproduction/editing unit 202 is supplied to the emotion representative scene extraction unit 204.
- the emotion-representative scene extraction unit 204 extracts an emotion-representative scene from the emotion metadata supplied from the metadata database 205 and supplies the emotion-representative scene information to the content reproduction/editing unit 202.
- the rest of the information processing device 200B shown in FIG. 7 is configured similarly to the information processing device 200A shown in FIG. 3. The information processing device 200B also provides the same effects as the information processing device 200A shown in FIG. 3.
- emotion metadata generated from viewing by one user has the emotion information of that one user, whereas emotion metadata generated from viewing by a large number of users has statistically representative emotion information on the emotional reactions of those users.
- An information processing apparatus including an extraction unit that extracts an emotion-representing scene based on emotion data representing a user's emotion for each scene of video content.
- the extraction unit extracts the emotion representative scene based on the type of the user's emotion.
- the extraction unit extracts the emotion representative scene based on the degree of the user's emotion.
- the extracting unit extracts a scene in which the level of the user's emotion exceeds a threshold as the emotion representative scene.
- (10) The information processing apparatus according to (9) above, wherein the display control unit displays the type and degree of the user's emotion in the extracted emotion-representing scene at the time position, on a time-axis slide bar corresponding to the entire moving image content, that corresponds to the extracted emotion-representing scene.
- (11) The information processing apparatus according to (10), wherein the display control unit displays the type of the user's emotion with a mark.
- An information processing method having a procedure of extracting an emotion-representing scene based on emotion data representing user's emotion for each scene of video content.
- 100A, 100B: information processing apparatus
- 101: content database (content DB)
- 102: content reproduction display unit
- 103: face image capturing camera
- 104: biometric information sensor
- 105: user emotion analysis unit
- 106: metadata generation unit
- 107: metadata rewriting unit
- 108: metadata database (metadata DB)
- 200A, 200B: information processing apparatus
- 201: content database (content DB)
- 202: content reproduction/editing unit
- 203: metadata extraction unit
- 204: emotion representative scene extraction unit
- 205: metadata database (metadata DB)
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Television Signal Processing For Recording (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/692,510 US20240404322A1 (en) | 2021-09-22 | 2022-03-17 | Information processing apparatus and information processing method |
| JP2023549350A JPWO2023047657A1 (ja) | 2021-09-22 | 2022-03-17 | |
| CN202280062511.9A CN117941341A (zh) | 2021-09-22 | 2022-03-17 | Information processing apparatus and information processing method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021-153856 | 2021-09-22 | | |
| JP2021153856 | 2021-09-22 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023047657A1 (ja) | 2023-03-30 |
Family
ID=85720379
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2022/012459 (Ceased), WO2023047657A1 (ja) | Information processing apparatus and information processing method | 2021-09-22 | 2022-03-17 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20240404322A1 (en) |
| JP (1) | JPWO2023047657A1 (ja) |
| CN (1) | CN117941341A (zh) |
| WO (1) | WO2023047657A1 (ja) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
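| JP2007328675A (ja) * | 2006-06-09 | 2007-12-20 | Sony Corp | Information processing apparatus, information processing method, and computer program |
| JP2008060622A (ja) * | 2006-08-29 | 2008-03-13 | Sony Corp | Video editing system, video processing apparatus, video editing apparatus, video processing method, video editing method, program, and data structure |
| JP2015527668A (ja) * | 2012-09-25 | 2015-09-17 | Intel Corporation | Video indexing with viewer reaction estimation and visual cue detection |
| JP2019186707A (ja) * | 2018-04-06 | 2019-10-24 | Media System Co., Ltd. | Telephone system and program |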
| WO2020206487A1 (en) * | 2019-04-12 | 2020-10-15 | Pinch Labs Pty Ltd | Methods and apparatus for displaying, compressing and/or indexing information relating to a meeting |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR100828371B1 (ko) * | 2006-10-27 | 2008-05-08 | Samsung Electronics Co., Ltd. | Method and apparatus for generating metadata of content |
| US11042729B2 (en) * | 2017-05-01 | 2021-06-22 | Google Llc | Classifying facial expressions using eye-tracking cameras |
| KR102814131B1 (ko) * | 2020-07-09 | 2025-05-29 | Samsung Electronics Co., Ltd. | Device and method for generating summary video |
| US12126791B1 (en) * | 2022-05-20 | 2024-10-22 | Nvidia Corporation | Conversational AI-encoded language for data compression |
2022
- 2022-03-17 JP JP2023549350A patent/JPWO2023047657A1/ja active Pending
- 2022-03-17 US US18/692,510 patent/US20240404322A1/en active Pending
- 2022-03-17 CN CN202280062511.9A patent/CN117941341A/zh active Pending
- 2022-03-17 WO PCT/JP2022/012459 patent/WO2023047657A1/ja not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN117941341A (zh) | 2024-04-26 |
| JPWO2023047657A1 (ja) | 2023-03-30 |
| US20240404322A1 (en) | 2024-12-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
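| CN100462975C (zh) | | Information presentation method and information presentation apparatus |
| KR100983840B1 (ko) | | Method, program, and apparatus for generating a scenario for a motion picture in which music and images are synchronized |
| US7873258B2 (en) | | Method and apparatus for reviewing video |
| JP6369706B1 (ja) | | Medical video processing system |
| WO2003102953A1 (en) | | Authoring device and authoring method |
| JP2016119600A (ja) | | Editing apparatus and editing method |
| JP2005267279A (ja) | | Information processing system, information processing method, and computer program |
| JP2012105012A (ja) | | Video playback apparatus, video playback method, computer program, and storage medium |
| JP2003078868A (ja) | | Media work production support apparatus and program |
| JP2010157961A (ja) | | Subtitle creation system and program |
| WO2023047658A1 (ja) | | Information processing apparatus and information processing method |
| JP2017211862A (ja) | | Information processing apparatus, information processing method, and program |
| JP2005109566A (ja) | | Video summarization apparatus, explanatory text generation apparatus, video summarization method, explanatory text generation method, and program |
| JP2010268195A (ja) | | Video content editing program, server, apparatus, and method |
| JP2018180519A (ja) | | Speech recognition error correction support apparatus and program therefor |
| WO2023047657A1 (ja) | | Information processing apparatus and information processing method |
| US20050262527A1 (en) | | Information processing apparatus and information processing method |
| JP2022117505A (ja) | | Content correction apparatus, content distribution server, content correction method, content correction program, and recording medium |
| US11200919B2 (en) | | Providing a user interface for video annotation tools |
| JP2005228297A (ja) | | Method for producing a real-character-type moving image information product, method for reproducing a real-character-type moving image information product, and recording medium |
| KR20070066878A (ko) | | Content distribution apparatus |
| JP2005167822A (ja) | | Information reproducing apparatus and information reproducing method |
| US20210287433A1 (en) | | Providing a 2-dimensional dataset from 2-dimensional and 3-dimensional computer vision techniques |
| JP2005080000A (ja) | | Indexing apparatus, video playback apparatus, and method |
| JP2005150923A (ja) | | Image editing method and apparatus |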
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22872416; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023549350; Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 202280062511.9; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 22872416; Country of ref document: EP; Kind code of ref document: A1 |