WO2022244180A1 - Video manual creation device, video manual creation method, and video manual creation program
- Publication number: WO2022244180A1 (PCT/JP2021/019151)
- Authority: WO — WIPO (PCT)
Classifications
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
- G06F16/73—Information retrieval of video data; Querying
- G06T7/70—Image analysis; Determining position or orientation of objects or cameras
- G06V20/20—Scene-specific elements in augmented reality scenes
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/60—Type of objects
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G10L15/04—Speech recognition; Segmentation; Word boundary detection
- G10L15/26—Speech to text systems
Definitions
- The present disclosure relates to a video manual creation device, a video manual creation method, and a video manual creation program.
- An object of the present disclosure is to provide a video manual creation device, a video manual creation method, and a video manual creation program capable of creating a video manual in which the correspondence between objects and human motions is easy to understand.
- A video manual creation device of the present disclosure includes: a document analysis unit that analyzes a work procedure manual file in which work procedures are described and generates text information data indicating the structure of the sentences contained in the file; a moving image analysis unit that analyzes a moving image file of work performed according to the work procedures, generates object information data indicating objects included in the moving image, and generates motion information data indicating human motions included in the moving image; a link information generation unit that collects, from the text information data, first sets each consisting of a noun and a verb contained in a sentence, collects, from the object information data and the motion information data, second sets each consisting of an object and a motion contained in the moving image, searches the collected first and second sets for a first set and a second set in which the noun corresponds to the object and the verb corresponds to the motion, and generates link information data indicating the correspondence between the position in the work procedure manual where the retrieved first set is described and the scene in the moving image containing the retrieved second set; and a moving image manual generation unit that generates moving image manual data for displaying, on a display, a moving image manual including the work procedure, the moving image, the noun, and the verb.
- A moving image manual creation method of the present disclosure is a method executed by a moving image manual creation apparatus that creates moving image manual data. The method analyzes a work procedure manual file in which work procedures are described to generate text information data indicating the structure of the sentences in the document; analyzes a moving image file of work performed according to the work procedures to generate object information data indicating objects included in the moving image and motion information data indicating human motions included in the moving image; collects, from the text information data, first sets of nouns and verbs contained in the sentences; collects, from the object information data and the motion information data, second sets of objects and motions contained in the moving image; and searches the collected first sets and the collected second sets for corresponding pairs to generate link information data indicating the correspondence between them.
- FIG. 1 is a functional block diagram schematically showing the configuration of a moving image manual creation apparatus according to Embodiment 1;
- FIG. 2 is a diagram showing an example (part 1) of a moving image manual created by the moving image manual creation apparatus according to Embodiment 1;
- FIG. 3 is a diagram showing an example (part 2) of a moving image manual created by the moving image manual creation apparatus according to Embodiment 1;
- FIG. 4 is a diagram showing an example of the hardware configuration of a system (for example, a computer) that implements the moving image manual creation device and the display control device according to Embodiment 1;
- FIG. 5 is a diagram showing a configuration example of text information data generated by a document analysis unit;
- FIG. 6 is a diagram showing a configuration example of object information data generated by an object detection unit of a moving image analysis unit;
- FIG. 7 is a diagram showing a configuration example of motion information data generated by a motion detection unit of the moving image analysis unit;
- FIG. 8 is a diagram showing a configuration example of link information data generated by a link information generation unit;
- FIG. 9 is a flowchart showing processing for generating text information data by the document analysis unit;
- FIG. 10 is a diagram showing an example of the tree structure of text information data generated by the document analysis unit;
- FIG. 11 is a flowchart showing object information data generation processing by the object detection unit of the moving image analysis unit;
- FIG. 12 is a diagram showing an example of the tree structure of object information data generated by the object detection unit;
- FIG. 13 is a flowchart showing processing for generating motion information data by the motion detection unit of the moving image analysis unit;
- FIG. 14 is a diagram showing an example of the tree structure of motion information data generated by the motion detection unit;
- FIG. 15 is a flowchart showing link information data generation processing by the link information generation unit;
- FIG. 16 is a diagram showing processing for generating the tree structure of link information data by the link information generation unit;
- FIG. 17 is a flowchart showing processing for generating a moving image manual by a moving image manual generation unit;
- FIG. 18 is a flowchart showing display processing of a moving image manual by a display control device;
- FIG. 19 is a functional block diagram schematically showing the configuration of a moving image manual creation device according to Embodiment 2;
- FIG. 20 is a diagram showing an example of the hardware configuration of a system (for example, a computer) that implements the moving image manual creation device and the display control device according to Embodiment 2;
- FIG. 21 is a flowchart showing processing performed by a voice analysis unit of the moving image manual creation apparatus according to Embodiment 2;
- FIG. 13 is a functional block diagram schematically showing the configuration of a moving image manual creation device according to Embodiment 4;
- FIG. 11 is a diagram showing the configuration of a display control device for displaying a moving image manual generated by a moving image manual creation device according to Embodiment 4 on AR (augmented reality) glasses;
- FIG. 12 is a diagram showing an example of a hardware configuration of a system (for example, a computer) that implements the moving image manual creation device and the display control device according to the fourth embodiment;
- FIG. 16 is a flowchart showing processing by a superimposition alignment control unit of the moving image manual creation apparatus according to Embodiment 4;
- A video manual creation device, a video manual creation method, and a video manual creation program according to the embodiments are described below with reference to the drawings.
- The following embodiments are merely examples; the embodiments can be combined as appropriate, and each embodiment can be modified as appropriate.
- FIG. 1 is a functional block diagram schematically showing the configuration of a moving image manual creating apparatus 100 according to Embodiment 1.
- The moving image manual creation apparatus 100 is an apparatus capable of executing the moving image manual creation method according to Embodiment 1.
- Moving image manual data D107 created by the moving image manual creation device 100 is output to a display control device 110.
- The display control device 110 has a moving image reproduction control unit 112 that controls moving image reproduction and a moving image manual display control unit 111 that controls the display operation of the moving image manual.
- The display control device 110 displays the video manual on the display 120, which serves as a video display device.
- The moving image manual creation device 100, the display control device 110, and the display 120 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker). The display control device 110 may also be a part of the moving image manual creation device 100.
- The video manual creation device 100 has a document analysis unit 101, a video analysis unit 102, a link information generation unit 106, and a video manual generation unit 107.
- The moving image analysis unit 102 has an object detection unit 103 and a motion detection unit 104.
- The document analysis unit 101 analyzes the work procedure manual file in which the work procedures are described, and generates text information data D101 indicating the structure of the sentences contained in the file.
- The document analysis unit 101 also collects the nouns and verbs contained in the sentences. For example, a noun is a word indicating the name of an object, and a verb is a word indicating the action of a person (for example, a worker).
- The video analysis unit 102 analyzes the video file of the work performed according to the work procedures.
- The object detection unit 103 of the moving image analysis unit 102 detects objects included in the moving image and generates object information data D103 indicating the objects.
- The motion detection unit 104 of the video analysis unit 102 detects human motions included in the video and generates motion information data D104 indicating the motions.
- The objects include at least one of a work target, a tool used in the work, and a human body part.
- The link information generation unit 106 collects, from the text information data D101, first sets of nouns and verbs contained in the sentences, and collects, from the object information data D103 and the motion information data D104, second sets of objects and human motions contained in the moving image.
- The link information generation unit 106 searches the collected first sets (150 in FIG. 4, described later) and the collected second sets (160a and 160b in FIGS. 6 and 7, described later) for a first set and a second set in which the noun corresponds to the object and the verb corresponds to the motion.
- The link information generation unit 106 generates link information data D106 indicating the correspondence between the position in the work procedure manual where the retrieved first set is described (151 in FIG. 8, described later) and the scene in the moving image containing the retrieved second set (161 in FIG. 8, described later).
- Based on the work procedure manual file, the video file, and the link information data D106, the video manual generation unit 107 generates video manual data D107 used to display, on a display, a video manual including the work procedures, the video, the nouns, and the verbs.
- FIGS. 2 and 3 are diagrams showing examples (parts 1 and 2) of moving image manuals created by the moving image manual creation apparatus 100 and displayed on the display 120.
- In FIGS. 2 and 3, the contents of the work procedure manual are displayed on the left half of the display 120, and the moving image is displayed on the right half.
- FIG. 2 shows an example in which the nouns "left hand", "board", "right hand", "screwdriver", and "screw" are extracted from the sentence "While holding the board with the left hand, turn the screws at the four corners with the screwdriver in the right hand," and the objects corresponding to these nouns in the moving image, such as work targets (e.g., the board and screws), tools used in the work (e.g., the screwdriver), and body parts of the worker (e.g., the left and right hands), are identified.
- FIG. 3 shows an example in which the verbs "press" and "turn" are extracted from the sentences in item 3 of the work procedure manual, and each verb is displayed near the body part that performs the corresponding action in the moving image.
- FIG. 4 is a diagram showing an example of the hardware configuration of a computer, which is a system that implements the video manual creation device 100 and the display control device 110.
- The computer includes a CPU (Central Processing Unit) 510, which is a processor for processing information; a main memory 520 such as a RAM (Random Access Memory); a storage device 530 such as a hard disk drive (HDD) or a solid state drive (SSD); an output interface (I/F) 540; an input I/F 550; a communication I/F 560; and an image processor 570.
- A display 120, a touch panel 121, a keyboard/mouse 122, a video camera/microphone 123, a tablet/smartphone 124, and a network 125 are connected to the computer.
- Each function of the video manual creation device 100 and the display control device 110 may be implemented by a processing circuit.
- The processing circuit may be dedicated hardware, or may be a processor that executes programs stored in the main memory (for example, a moving image manual creation program and a display control program).
- Each function of the video manual creation device 100 and the display control device 110 may be partly realized by dedicated hardware and partly realized by software or firmware.
- the processing circuitry may implement the functionality of each functional block shown in FIG. 1 through hardware, software, firmware, or any combination thereof.
- FIG. 5 is a diagram showing a configuration example of the text information data D101 generated by the document analysis unit 101. In FIG. 5, the arrows indicate the hierarchical structure between the definition data, with the arrowheads pointing to lower layers. In FIG. 5, a "procedure" corresponds to one sentence.
- The text information data D101 is composed of one or more "documents"; each "document" is composed of one or more "chapters"; each "chapter" is composed of one or more "sections"; each "section" is composed of one or more "procedures" (sentences); and each "procedure" is composed of "objects", "tools", and "actions".
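The hierarchy above (document → chapter → section → procedure → objects/tools/actions) can be sketched as nested Python dictionaries. The node names follow the description; the concrete document title and contents are illustrative, not taken from the patent's figures:

```python
# Illustrative sketch of the FIG. 5 hierarchy: a work procedure manual broken
# down into chapters, sections, and per-sentence "procedures", each holding
# the nouns (objects, tools) and verbs (actions) extracted from the sentence.
text_info = {
    "document": "assembly manual",          # hypothetical document title
    "chapters": [{
        "sections": [{
            "procedures": [{
                "sentence": ("While holding the board with the left hand, "
                             "turn the screws at the four corners with the screwdriver."),
                "objects": ["board", "screw"],                       # nouns: work targets
                "tools": ["left hand", "right hand", "screwdriver"], # nouns: tools/body parts
                "actions": ["hold", "turn"],                         # verbs
            }],
        }],
    }],
}

proc = text_info["chapters"][0]["sections"][0]["procedures"][0]
```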
- FIG. 6 is a diagram showing a configuration example of the object information data D103 generated by the object detection unit 103 of the moving image analysis unit 102.
- FIG. 6 shows an example in which each "moving image" is composed of one or more "frame images", each "frame image" includes one or more "objects" (targets or tools), and each "object" is composed of one or more coordinates indicating its in-screen coordinate position. Also, by multiplying the frame number by the reciprocal of the frame rate, the playback time from the beginning of the moving image can be obtained.
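The playback-time relation just described (frame number multiplied by the reciprocal of the frame rate) amounts to a one-line helper; the function name below is ours, not the patent's:

```python
# Playback time from the beginning of the moving image, given a frame number
# and the frame rate: time = frame_number * (1 / fps) = frame_number / fps.
def frame_to_seconds(frame_number: int, fps: float) -> float:
    return frame_number / fps

elapsed = frame_to_seconds(45, 30.0)  # frame 45 at 30 fps
```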
- FIG. 7 is a diagram showing a configuration example of the motion information data D104 generated by the motion detection unit 104 of the video analysis unit 102.
- In the example of FIG. 7, each "moving image" is composed of one or more "frame images", and each "frame image" includes one or more "motions", each having a coordinate position and a movement direction.
- FIG. 8 is a diagram showing a configuration example of the link information data D106 generated by the link information generation unit 106.
- The link information data D106 of FIG. 8 is constructed from the text information data D101 of FIG. 5, the object information data D103 of FIG. 6, and the motion information data D104 of FIG. 7.
- The link information data D106 in FIG. 8 shows an example in which each "procedure" is composed of an "object" (work target), an "object" (tool or body part), and an "action".
- FIG. 9 is a flowchart showing the process of generating the text information data D101 by the document analysis unit 101.
- The document analysis unit 101 reads a work procedure manual file as a document file (step S101) and determines whether the chapter/section number at the current reading position is the last chapter/section number of the work procedure manual (step S102). If it is the last chapter/section number (YES in step S102), the document analysis unit 101 ends the process of generating the text information data D101. If it is not the last chapter/section number (NO in step S102), the document analysis unit 101 generates a node in the hierarchical tree of chapter and section numbers (shown in FIG. 10, described later) (step S103).
- The document analysis unit 101 then determines whether the current "procedure" (that is, sentence) is the last one (step S104). If it is the last "procedure" (YES in step S104), the process returns to step S102. If it is not the last "procedure" (NO in step S104), the document analysis unit 101 cuts out the "procedure" of the current section number (step S105).
- Next, the document analysis unit 101 segments the text into words by morphological analysis and determines the part of speech of each word (step S106).
- Then, the document analysis unit 101 performs syntactic analysis using a dictionary and generates a node for each object (noun) that is a work target, each tool (noun) that is an object, and each action (verb) (step S107).
- The document analysis unit 101 repeats the processing of steps S104 to S107 until the last "procedure" is processed.
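Steps S106 to S107 can be imitated with a toy part-of-speech lookup. A real implementation would use a morphological analyzer (for example, MeCab for Japanese) and dictionary-based syntactic analysis; the hand-written `POS_DICT` and function name below are stand-ins only:

```python
# Toy stand-in for steps S106-S107: segment a sentence into words, look up
# each word's part of speech, and collect the nouns (objects/tools) and
# verbs (actions) that become nodes of the text information data.
POS_DICT = {  # hand-written dictionary; a real system would use a morphological analyzer
    "board": "noun", "screw": "noun", "screwdriver": "noun",
    "hold": "verb", "turn": "verb", "press": "verb",
}

def extract_nouns_verbs(sentence):
    nouns, verbs = [], []
    for raw in sentence.lower().split():
        word = raw.strip(",.")
        pos = POS_DICT.get(word)
        if pos == "noun":
            nouns.append(word)
        elif pos == "verb":
            verbs.append(word)
    return nouns, verbs

nouns, verbs = extract_nouns_verbs("hold the board, turn the screw with the screwdriver")
```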
- FIG. 10 is a diagram showing an example of the tree structure of the text information data D101 generated by the document analysis unit 101.
- A "document" is composed of one or more "chapter" nodes, and each "chapter" is composed of one or more "section" nodes.
- FIG. 11 is a flowchart showing the process of generating the object information data D103 by the object detection unit 103 of the moving image analysis unit 102.
- The object detection unit 103 reads a moving image file (that is, a video file) captured by a camera (step S111) and determines whether the current frame image of the read moving image file has the frame number of the last frame image (step S112). If it is the last frame number (YES in step S112), the object detection unit 103 ends the process of generating the object information data D103. If it is not the last frame number (NO in step S112), the object detection unit 103 generates a node for the frame image (shown in FIG. 12, described later) (step S113).
- Next, the object detection unit 103 detects objects (for example, work targets, tools, and body parts) in the image by image analysis processing. The object detection unit 103 then determines whether there is another undetected object (step S115); if there is no other object (NO in step S115), the process returns to step S112. If there is another object (YES in step S115), the object detection unit 103 generates a node for the object (shown in FIG. 12, described later) (step S116) and generates a coordinate-position node for the object (also shown in FIG. 12) (step S117).
- FIG. 12 is a diagram showing an example of the tree structure of the object information data D103 generated by the object detection unit 103.
- In the example of FIG. 12, a "moving image" is composed of one or more "frame image" nodes (frame numbers 0001 to 1234), each "frame image" is composed of one or more "object" nodes, and each "object" is composed of one or more "coordinate position" nodes.
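The tree of FIG. 12 can be sketched as a builder that walks the frames and records, per frame, each detected object and its in-screen coordinates. Here `detect_objects` is a placeholder for the image-analysis step, not a real API:

```python
# Sketch of the object-information tree of FIG. 12 (names are illustrative):
# moving image -> frame images -> objects -> coordinate positions.
def build_object_info(frames, detect_objects):
    """detect_objects(frame) stands in for the image-analysis step; it
    returns a list of (label, (x, y)) detections for one frame image."""
    tree = {"frames": []}
    for frame_number, frame in enumerate(frames):
        node = {"frame": frame_number, "objects": []}
        for label, (x, y) in detect_objects(frame):
            node["objects"].append({"label": label, "position": (x, y)})
        tree["frames"].append(node)
    return tree

# Usage with a dummy detector that ignores the frame contents:
fake_detect = lambda frame: [("board", (120, 80)), ("right hand", (200, 150))]
tree = build_object_info([None, None], fake_detect)
```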
- FIG. 13 is a flowchart showing the process of generating the motion information data D104 by the motion detection unit 104.
- The motion detection unit 104 reads the moving image file (step S121) and determines whether the current frame image of the read moving image file has the frame number of the last frame image (step S122). If it is the last frame number (YES in step S122), the motion detection unit 104 ends the process of generating the motion information data D104. If it is not the last frame number (NO in step S122), the motion detection unit 104 generates a node for the frame image (shown in FIG. 14, described later) (step S123).
- Next, the motion detection unit 104 detects motions (for example, hand motions) in the image by skeleton extraction processing (step S124).
- The motion detection unit 104 then generates a node for each motion (shown in FIG. 14, described later) (step S125) and generates nodes for the coordinate position and movement direction of each motion (also shown in FIG. 14) (step S126).
- FIG. 14 is a diagram showing an example of the tree structure of the motion information data D104 generated by the motion detection unit 104.
- In the example of FIG. 14, a "moving image" is composed of one or more "frame image" nodes (frame numbers 0001 to 1234), each "frame image" includes one or more "motion" nodes (including posture), and each "motion" is composed of one or more nodes, namely a "coordinate position" and a "movement direction".
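One plausible way to fill the "movement direction" node (step S126) is to derive an angle from a motion's coordinate positions in two consecutive frames. This is our sketch under that assumption, not the patent's stated method:

```python
import math

# Derive a motion's movement direction as an angle in degrees (0 = rightward
# in image coordinates) from its coordinate positions in two consecutive frames.
def movement_direction(prev_pos, cur_pos):
    dx = cur_pos[0] - prev_pos[0]
    dy = cur_pos[1] - prev_pos[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0

# e.g. a hand moving straight to the right:
angle = movement_direction((100, 200), (140, 200))
```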
- FIG. 15 is a flowchart showing the process of generating the link information data D106 by the link information generation unit 106.
- The link information generation unit 106 reads the tree of the text information data D101 (for example, FIG. 10) (step S131) and determines whether the current position in the read tree structure is the last chapter/section number (step S132). If it is the last chapter/section number (YES in step S132), the link information generation unit 106 saves the link information data D106 in the storage device 530 (step S133) and ends the process of generating the link information data D106.
- If it is not the last chapter/section number (NO in step S132), the link information generation unit 106 acquires the node of the three-element set {object as a work target, tool as an object, human action} of the "procedure" (step S134).
- Next, the link information generation unit 106 searches a mixed tree of object/motion information, consisting of the tree of the object information data D103 (for example, FIG. 12) and the tree of the motion information data D104 (for example, FIG. 14) (step S135), and determines whether a scene in which the three elements match has started (step S136). If such a scene has started (YES in step S136), the link information generation unit 106 saves the scene information of the start scene (step S137) and returns the process to step S136. If there is no scene with matching three elements (NO in step S136), the link information generation unit 106 determines whether there is an end scene of a scene with matching three elements (step S138). If there is an end scene (YES in step S138), the link information generation unit 106 saves the scene information of the end frame (step S139).
- Next, the link information generation unit 106 generates a link between the "procedure" node and the scene information nodes (scene start time and end time) (step S140).
- Then, the link information generation unit 106 generates links from the three-element nodes of the "procedure" to the corresponding coordinate-position and movement-direction information (step S141).
- FIG. 16 is a diagram showing the process of generating the tree structure of the link information data D106 by the link information generation unit 106.
- FIG. 16 shows that the link information generation unit 106 creates the tree of the link information data D106 by linking each "procedure" of the text information data D101 tree to the corresponding frame images of the mixed object/motion information tree.
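Steps S135 to S139 amount to scanning the per-frame detections for the span in which all three elements of a "procedure" co-occur. A minimal sketch, assuming each frame has already been reduced to a set of object/motion labels:

```python
# Hedged sketch of steps S135-S139: record the start and end frames of the
# scene in which all three elements {object, tool, action} of a "procedure"
# are present. Data layout and names are illustrative.
def find_scene(frames, triple):
    """frames: list of per-frame label sets; triple: set of three labels."""
    start = end = None
    for frame_number, labels in enumerate(frames):
        if triple <= labels:           # all three elements match in this frame
            if start is None:
                start = frame_number   # scene start (S137)
            end = frame_number         # keep extending the end frame (S139)
        elif start is not None:
            break                      # the matching scene has ended
    return start, end

frames = [{"board"}, {"board", "screwdriver", "turn"},
          {"board", "screwdriver", "turn"}, {"board"}]
scene = find_scene(frames, {"board", "screwdriver", "turn"})
```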
- FIG. 17 is a flowchart showing the process of generating a moving image manual by the moving image manual generation unit 107.
- The moving image manual generation unit 107 reads the work procedure manual file (step S151), reads the link information data D106 (step S152), and determines whether the current chapter/section number of the read work procedure manual file is the last one (step S153). If it is the last chapter/section number (YES in step S153), the moving image manual generation unit 107 ends the process of generating the moving image manual data D107. If it is not the last chapter/section number (NO in step S153), the moving image manual generation unit 107 identifies the text position of the chapter/section number in the work procedure manual (step S154).
- Next, the moving image manual generation unit 107 acquires the scene information (for example, the playback start time) corresponding to the chapter/section number from the link information data, and generates a scene-information link at the text position of the chapter/section number (that is, embeds a link code).
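The "embed link code" step can be pictured as attaching each chapter/section's playback start time to its text position, so that clicking the text seeks the video. The data layout below is illustrative:

```python
# Illustrative sketch of embedding scene-information links: each procedure's
# text position gets the playback start time of its linked video scene.
def embed_links(procedures, link_info):
    """procedures: list of (chapter_section, text);
    link_info: {chapter_section: playback start time in seconds}."""
    manual = []
    for number, text in procedures:
        manual.append({
            "number": number,
            "text": text,
            "seek_to": link_info.get(number),  # None if no scene was linked
        })
    return manual

manual = embed_links([("1.1", "Hold the board."), ("1.2", "Turn the screws.")],
                     {"1.1": 0.0, "1.2": 12.5})
```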
- FIG. 18 is a flowchart showing display processing of a moving image manual by the display control device 110.
- The display control device 110 accepts designation of a chapter/section number (for example, a click by the user) on the work procedure manual screen of the moving image manual data D107 (step S161).
- The display control device 110 jumps to the playback start position on the work moving image screen by executing the link code (step S162).
- The display control device 110 reads one image frame from the moving image (step S163).
- The display control device 110 determines whether the playback position is the playback end position (step S164); if it is the playback end position (YES in step S164), it stops playing the moving image (step S169).
- If the playback position is not the playback end position (NO in step S164), the display control device 110 accepts, on the work procedure manual screen, designation (for example, a click) of an object or tool that is an object in the "procedure", or of an action (step S165).
- The display control device 110 acquires the coordinate position and movement-direction information of the designated item by referring to the link information table (step S166). Next, the display control device 110 superimposes an emphasis mark at the corresponding position within the current image frame (step S167), and reproduces and displays the moving image (step S168).
- As described above, the video manual creation apparatus 100 according to Embodiment 1 associates each "procedure" in the chapters and sections of the work procedure manual with the corresponding scene (for example, the time) of the work video. In this association, search and collation are performed on data in which at least two of the three pieces of information, namely the object as a work target, the tool as an object, and the human action, match each other. Therefore, the rate of occurrence of collation errors can be reduced.
- Further, each "procedure" in a "chapter/section" of the work procedure manual and the corresponding scene in the moving image are uniquely linked via the link information data D106. Therefore, when the worker designates, for example by a mouse click, a "procedure" (sentence) in a "chapter/section" of the work procedure manual, or any part of an object (materials, parts, etc.), a tool (a tool, the right hand, etc.), or an action written in that "procedure", the designated object in the image of the corresponding scene of the work video is immediately highlighted (for example, by a frame, coloring, blinking, or a superimposed arrow).
- Furthermore, since each "procedure" in a "chapter/section" of the work procedure manual and the corresponding scene of the work moving image are bidirectionally linked via the "link information data" and can be uniquely identified, when a scene is designated by pausing playback of the work video, the corresponding "procedure" part in the "chapter/section" of the work procedure manual can be identified, and screen transition and highlighting can be performed automatically.
- In addition, the object detection unit 103 and the motion detection unit 104 detect and retain, in the object information data D103 and the motion information data D104, the object information in each image and the worker's motions along the time axis of the work moving image. Therefore, a desired work scene can be searched for by character or voice input of a keyword indicating the target object, the tool, or the human action related to the work content. As a result, jumping to a desired scene in the moving image and cued playback can be performed easily and accurately.
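A frame-indexed keyword search of the kind described (finding scenes from a keyword naming an object, tool, or action) might look like the following sketch. The dictionary layout and the function name `find_scenes` are illustrative assumptions, not part of the patent.

```python
from typing import Dict, List

# Hypothetical frame index: frame number -> detected labels (objects,
# tools, actions), as the object/motion detection units might retain
# them along the video's time axis.
def find_scenes(index: Dict[int, List[str]], keyword: str) -> List[int]:
    """Return frame numbers whose detections contain the keyword,
    enabling cue playback from a text or voice query."""
    return [frame for frame, labels in sorted(index.items())
            if keyword in labels]

index = {10: ["screw", "screwdriver", "turn"],
         11: ["screw", "right hand", "hold"],
         42: ["panel", "screwdriver", "remove"]}
print(find_scenes(index, "screwdriver"))  # [10, 42]
```

The first returned frame number can be used directly as the cue point for playback.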
- FIG. 19 is a functional block diagram schematically showing the configuration of moving image manual creating apparatus 200 according to the second embodiment.
- a moving image manual creating apparatus 200 is an apparatus capable of executing the moving image manual creating method according to the second embodiment.
- The video manual creating apparatus 200 differs from the video manual creating apparatus 100 according to the first embodiment in that the video analysis unit 202 includes an audio analysis unit 105 that analyzes the audio of the video file, and in that the link information generation unit 106 further uses the audio information data D105.
- Moving image manual data D107 created by the moving image manual creation device 200 is output to the display control device 110.
- the moving image manual creation device 200, the display control device 110, and the display 120 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker). Also, the display control device 110 may be a part of the moving image manual creation device 200 .
- FIG. 20 is a diagram showing an example of the hardware configuration of a system (for example, a computer) 200a that implements the video manual creation device 200 and the display control device 110. In FIG. 20, configurations that are the same as or correspond to those shown in FIG. 4 are given the same reference numerals as in FIG. 4.
- the system 200a of FIG. 20 is different from the system 100a of FIG. 4 in that the audio of the moving image file is analyzed and the moving image manual is created using the audio information data D105.
- FIG. 21 is a flow chart showing the processing performed by the audio analysis unit 105 of the video manual creation device 200.
- First, the audio analysis unit 105 reads the audio of the moving image file (step S201) and determines whether the frame image of the read moving image file is the last frame image (step S202). If it is the last frame image (YES in step S202), the audio analysis unit 105 ends the process of generating the audio information data D105. If it is not the last frame image (NO in step S202), the audio analysis unit 105 acquires the speech start time (step S203) and generates a node (shown in FIG. 22, described later) for each frame image (step S204).
- the speech analysis unit 105 converts the speech into text based on speech recognition processing (step S205).
- Next, the speech analysis unit 105 identifies, for each sentence of a "procedure", the object (the target), the tool (the instrument), and the action (the verb) (step S206).
- Next, nodes for the identified object, tool, and action are generated (step S207).
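The per-sentence identification of an object, tool, and action could be sketched as below. A real system would use morphological analysis of the recognized speech; the regex pattern here is a stand-in that only handles English sentences of the assumed form "<action> the <object> with the <tool>", and the function name `extract_triple` is hypothetical.

```python
import re

def extract_triple(sentence: str):
    """Illustrative extraction of (object, tool, action) from one
    procedure sentence; returns None when the pattern does not apply."""
    m = re.match(r"(\w+) the (\w+) with the (\w+)", sentence.strip().lower())
    if not m:
        return None
    action, obj, tool = m.groups()
    return (obj, tool, action)

print(extract_triple("Tighten the screw with the screwdriver"))
# ('screw', 'screwdriver', 'tighten')
```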
- FIG. 22 is a diagram showing an example of the tree structure of the voice information data D105 generated by the voice analysis unit 105.
- As shown in FIG. 22, a "moving image" is composed of one or more "frame image" nodes (for example, frame numbers 0001 to 1234), and each "frame image" is composed of one or more "object (target)", "object (tool)", and "action" nodes.
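The tree of FIG. 22 can be modeled with two node types, as in the following sketch. The class and field names are illustrative assumptions about the data layout, not the patent's own structures.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FrameNode:
    """One "frame image" node holding its object/tool/action leaf nodes."""
    frame_no: int
    objects: List[str] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)
    actions: List[str] = field(default_factory=list)

@dataclass
class MovieNode:
    """The root "moving image" node holding frame-image nodes in order."""
    frames: List[FrameNode] = field(default_factory=list)

movie = MovieNode([FrameNode(1, ["screw"], ["screwdriver"], ["tighten"])])
print(movie.frames[0].tools)  # ['screwdriver']
```

Keeping the frame number on each node preserves the time axis, which is what later allows links from a "procedure" to a scene's start and end times.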
- FIG. 23 is a flow chart showing the process of generating the link information data D106 by the link information generation unit 106 of the moving picture manual creating apparatus 200.
- First, the link information generation unit 106 reads the tree of the text information data D101 (step S211) and determines whether the tree of the read text information data D101 is at the last chapter/section number (that is, the last chapter number and section number) (step S212). If it is the last chapter/section number (YES in step S212), the link information generation unit 106 saves the link information data D106 in the storage device 530 (step S213) and ends the process of generating the link information data D106.
- If it is not the last chapter/section number (NO in step S212), the link information generation unit 106 acquires the node of the three-element set {target object, tool, human action} of the "procedure" (step S214).
- Next, the link information generation unit 106 searches a mixed tree of object information, motion information, and audio information, formed from the tree of the object information data D103, the tree of the motion information data D104 (for example, FIG. 14), and the audio information data D105 (step S215), and determines whether a scene in which the three elements match has started (step S216). If such a scene has started (YES in step S216), the link information generation unit 106 saves the scene information of the start scene (step S217) and returns the process to step S216. If no scene with matching three elements has started (NO in step S216), the link information generation unit 106 determines whether there is an end scene of a scene with matching three elements (step S218). If there is an end scene (YES in step S218), the link information generation unit 106 saves the scene information of the end frame (step S219).
- the link information generation unit 106 generates a link between the "procedure" node and the scene information node (scene start time and end time) (step S220).
- Next, the link information generation unit 106 generates links between the three-element nodes of the "procedure" and their coordinate position and movement direction information (step S221).
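The link-generation loop of steps S212 to S221 can be condensed into the following hedged sketch: for each "procedure" triple, scan the frames for the span where the triple matches, then record a link from the procedure to that span's start and end. The data shapes and names (`generate_links`, the dictionaries) are assumptions for illustration only.

```python
def generate_links(procedures, frame_triples):
    """Map each procedure id to the (start, end) time of the span of
    frames whose (object, tool, action) triple matches the procedure's."""
    links = {}
    for proc_id, triple in procedures.items():
        start = end = None
        for t, scene in sorted(frame_triples.items()):
            if scene == triple:
                start = t if start is None else start  # start of span
                end = t                                # keep extending end
        if start is not None:
            links[proc_id] = (start, end)  # scene start/end time
    return links

procs = {"1.1": ("screw", "screwdriver", "tighten")}
frames = {0: ("panel", "hand", "hold"),
          1: ("screw", "screwdriver", "tighten"),
          2: ("screw", "screwdriver", "tighten")}
print(generate_links(procs, frames))  # {'1.1': (1, 2)}
```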
- the video manual creating apparatus 200 is provided with the audio analysis unit 105.
- The audio analysis unit 105 analyzes the audio in the video (for example, utterances describing objects, tools, and human actions) and outputs the audio information data D105, structured under the time axis of the moving image. Therefore, in the case of a cooking video, for example, the worker can create a video manual with voice commentary simply by proceeding with the work while speaking the work procedure aloud during recording.
- In addition, since the link information generation unit 106 associates the work procedure manual with the work video using the audio information data D105, the accuracy of the association processing between the work procedure manual and the video can be improved.
- In other respects, Embodiment 2 is the same as Embodiment 1 described above.
- FIG. 24 is a functional block diagram schematically showing the configuration of moving image manual creating apparatus 300 according to the third embodiment.
- a moving image manual creating apparatus 300 is an apparatus capable of executing the moving image manual creating method according to the third embodiment.
- The moving image manual creating apparatus 300 differs from the moving image manual creating apparatus 200 according to Embodiment 2 in that it includes a moving image recording unit 308 that records moving images captured by a camera, and in that the recorded moving images are analyzed by the moving image analysis unit 202.
- Moving image manual data D107 created by the moving image manual creation device 300 is output to the display control device 110.
- the moving image manual creation device 300, the display control device 110, and the display 120 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker). Also, the display control device 110 may be a part of the moving image manual creation device 300 .
- FIG. 25 is a diagram showing an example of the hardware configuration of a system (for example, a computer) 300a that implements the moving image manual creation device 300 and the display control device 110 according to the third embodiment.
- In FIG. 25, configurations that are the same as or correspond to those shown in FIG. 20 are given the same reference numerals as in FIG. 20.
- The system 300a of FIG. 25 differs from the system 200a of FIG. 20 in that the moving images captured by the camera are recorded and the moving image manual data is created from the recorded moving images.
- FIG. 26 is a flow chart showing parallel processing of the moving image recording unit 308 and the object detecting unit 103 in the moving image manual creation device 300 according to the third embodiment.
- In FIG. 26, the same processes as those shown in FIG. 11 are given the same step numbers as in FIG. 11.
- In the moving image manual creating apparatus 300, the moving image recording unit 308 reads the images captured by the camera and writes the moving image file, up to the last frame, to the storage device 530, and the moving image analysis unit 202 reads the moving image file received from the moving image recording unit 308. In these respects, the apparatus differs from the moving image manual creating apparatus 200 according to the second embodiment.
- the moving image manual creating apparatus 300 is equipped with a moving image recording program, and the moving image captured by the camera is recorded in the storage device 530 .
- As described above, the moving image manual creation device 300 associates the work procedure manual with the work moving image captured by the camera, in accordance with the moving image manual generation program. Therefore, at the work site, the worker can start and stop camera recording and, on the video manual created and displayed on the spot (in this case, the video part is a video of the worker himself or herself), check the recorded video against the work contents of the work procedure manual, their correspondence, points for improvement, and so on.
- In addition, with the video manual creation device 300, a video camera newly installed at the work site captures the worker's own work situation, and the work procedure manual is associated with the video that has just been shot. By checking this against the contents of the work procedure manual, omissions and errors in work items can be flagged and corrected. In this way, by providing the moving image recording function, the moving image manual creating apparatus 300 can not only present a work video manual but also have an educational effect on the worker himself or herself.
- In other respects, Embodiment 3 is the same as Embodiment 1 or 2 described above.
- FIG. 27 is a functional block diagram schematically showing the configuration of moving image manual creating apparatus 400 according to the fourth embodiment.
- In FIG. 27, configurations that are the same as or correspond to those shown in FIG. 24 are given the same reference numerals as in FIG. 24.
- a video manual creation device 400 is a device capable of executing the video manual creation method according to the fourth embodiment.
- the moving image manual creation device 400, the display control device 410, and the AR (augmented reality) glasses 420 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker).
- the moving image manual presentation system according to the fourth embodiment differs from the moving image manual presentation system according to the third embodiment in that AR glasses 420 and a display control device 410 that displays an image on the AR glasses 420 are used. Also, the display control device 110 may be a part of the moving image manual creating device 400 .
- FIG. 28 is a diagram showing the configuration of the display control device 410 for displaying the moving image manual generated by the moving image manual creation device 400 according to the fourth embodiment on the AR glasses 420.
- the AR glasses 420 also called smart glasses, allow a person to see the real world in front of them and an image superimposed on the real world (for example, a commentary text superimposed on an object in the real world) at the same time. It has a function to make
- the AR glasses also have a camera (that is, a video camera) 421 that captures moving images in the same direction as the line of sight of the person wearing the AR glasses.
- the display control device 410 displays an AR image by superimposing a moving image manual or a part of the moving image manual on an object in the real world that can be seen ahead of a person's line of sight.
- the AR image includes, for example, display components such as frames and arrows that highlight objects in the real world.
- the display control device 410 controls the display state such as the color of the display component and whether or not the display component blinks.
- The display control device 410 has an alignment unit that aligns CG (computer graphics) with the real scene, and a superimposition unit that displays the CG superimposed on the camera image or the real world. The alignment unit and the superimposition unit constitute a superimposition alignment control unit 113.
- Alignment processing is performed, for example, according to a superposition alignment program.
- In the alignment processing, the moving image captured by the camera 421 is analyzed, and, based on the position information of each object in the camera image, the position at which each object appears from the worker's line of sight is sequentially calculated.
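The per-frame alignment step can be illustrated with a minimal coordinate transform: mapping an object's position in the camera image to a position on the AR-glasses screen. A real system would compute this from calibration between the camera and the wearer's line of sight; the scale and offset values below are made-up placeholders, and `camera_to_screen` is a hypothetical name.

```python
def camera_to_screen(x, y, scale=0.9, dx=12.0, dy=-8.0):
    """Map camera-image coordinates to screen coordinates using a
    precalibrated scale and offset between the camera axis and the
    wearer's line of sight (illustrative values only)."""
    return (scale * x + dx, scale * y + dy)

print(camera_to_screen(100.0, 200.0))  # approximately (102.0, 172.0)
```

Repeating this transform for every detected object in every frame is what lets the superimposition unit keep the CG aligned as the worker's head moves.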
- FIG. 29 is a diagram showing an example of the hardware configuration of a system (for example, a computer) 400a that implements the moving image manual creation device 400 and the display control device 110 according to the fourth embodiment.
- In FIG. 29, configurations that are the same as or correspond to those shown in FIG. 25 are given the same reference numerals as in FIG. 25.
- the system 400a of FIG. 29 differs from the system 300a of FIG. 25 in that the moving image manual is displayed on the AR glasses 420.
- FIG. 30 is a flow chart showing processing by the superimposition alignment control unit 113 of the moving image manual creating apparatus 400 according to the fourth embodiment.
- First, the moving image recording unit 308 reads the frame images for the moving image manual captured by the camera (step S401), and the superimposition alignment control unit 113 determines whether there is an end instruction (step S402). If there is an end instruction (YES in step S402), the superimposition alignment control unit 113 causes the AR glasses 420 to display a see-through screen and ends the AR image display processing.
- If there is no end instruction (NO in step S402), the superimposition alignment control unit 113 causes the object detection unit 103 to detect objects (for example, targets and tools) in the frame image (step S404) and acquires the positional information of each detected object within the frame image (step S405).
- Next, the superimposition alignment control unit 113 performs alignment control for posture correction (step S406), superimposes (that is, synthesizes) CG as an AR image at the proper position of each detected object (step S407), and displays the result on the screen of the AR glasses 420 (step S408).
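The per-frame loop of steps S401 to S408 can be sketched as follows: detect objects, get their positions, and emit the highlight components (frames, arrows) to composite onto the see-through screen. Detection is stubbed out here, and all names and return shapes are illustrative assumptions.

```python
def detect(frame):
    """Stand-in for the object detection unit (steps S404/S405):
    returns (label, position) pairs for the current frame."""
    return [("screw", (120, 80)), ("screwdriver", (200, 150))]

def ar_overlays(frame):
    """Build the highlight components to superimpose (step S407)."""
    overlays = []
    for label, (x, y) in detect(frame):
        overlays.append({"type": "frame", "at": (x, y), "label": label})
    return overlays

print(ar_overlays(None))
# [{'type': 'frame', 'at': (120, 80), 'label': 'screw'},
#  {'type': 'frame', 'at': (200, 150), 'label': 'screwdriver'}]
```

Each overlay record would then be handed to the superimposition unit, which renders it at the aligned screen position on the AR glasses.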
- As described above, the moving image manual presentation system according to Embodiment 4 includes the AR glasses 420, the video camera 421 photographs the work site from the worker's viewpoint, and the AR glasses 420 have a transparent (see-through) screen.
- the operator can see the real world in front of them (for example, see through a transparent screen), and can superimpose digital (CG) data such as text, images, or moving images on the real world.
- In addition, a work beginner can hear each "procedure" (the work content) of the work procedure manual from the speaker attached to the camera 421 of the AR glasses 420, while a display component for highlighting is superimposed on the corresponding real-world object (for example, the target or tool) so that its display position matches that object.
- the display component is, for example, a frame surrounding the area to be emphasized, the color of the area to be emphasized, blinking of the display part such as the frame, an arrow pointing to the area to be emphasized, and the like.
- objects visible in front of the operator's line of sight are highlighted in a form that matches the display position, so even if the operator is a beginner, the operator can easily perform the work.
- a person can visually identify parts, materials, tools, etc. easily and accurately. Therefore, work errors can be avoided, and confusion can be reduced, so work can be carried out efficiently.
- In other respects, Embodiment 4 is the same as any of Embodiments 1 to 3 described above.
- 100, 200, 300, 400 Video manual creation device, 101 Document analysis unit, 102, 202 Video analysis unit, 103 Object detection unit, 104 Motion detection unit, 105 Audio analysis unit, 106 Link information generation unit, 107 Video manual generation unit, 110, 410 Display control device, 120 Display, 150 First set, 160a, 160b Second set, 420 AR glasses, 421 Camera, 510 CPU, 520 Main memory, 530 Storage device.
Abstract
Description
FIG. 1 is a functional block diagram schematically showing the configuration of the moving image manual creation device 100 according to Embodiment 1. The moving image manual creation device 100 is a device capable of executing the moving image manual creation method according to Embodiment 1. The moving image manual data D107 created by the moving image manual creation device 100 is output to the display control device 110. The display control device 110 has a moving image playback control unit 112 that controls playback of moving images, and a moving image manual display control unit 111 that controls the display operation of the moving image manual. The display control device 110 causes a display 120, as a video display device, to display the moving image manual. The moving image manual creation device 100, the display control device 110, and the display 120 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker). The display control device 110 may also be a part of the moving image manual creation device 100.
FIG. 19 is a functional block diagram schematically showing the configuration of the moving image manual creation device 200 according to Embodiment 2. In FIG. 19, configurations that are the same as or correspond to those shown in FIG. 1 are given the same reference numerals as in FIG. 1. The moving image manual creation device 200 is a device capable of executing the moving image manual creation method according to Embodiment 2. The moving image manual creation device 200 differs from the moving image manual creation device 100 according to Embodiment 1 in that the moving image analysis unit 202 includes an audio analysis unit 105 that analyzes the audio of the moving image file, and in that the link information generation unit 106 further uses the audio information data D105. The moving image manual data D107 created by the moving image manual creation device 200 is output to the display control device 110. The moving image manual creation device 200, the display control device 110, and the display 120 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker). The display control device 110 may also be a part of the moving image manual creation device 200.
FIG. 24 is a functional block diagram schematically showing the configuration of the moving image manual creation device 300 according to Embodiment 3. In FIG. 24, configurations that are the same as or correspond to those shown in FIG. 19 are given the same reference numerals as in FIG. 19. The moving image manual creation device 300 is a device capable of executing the moving image manual creation method according to Embodiment 3. The moving image manual creation device 300 differs from the moving image manual creation device 200 according to Embodiment 2 in that it includes a moving image recording unit 308 that records moving images captured by a camera, and in that the recorded moving images are analyzed by the moving image analysis unit 202. The moving image manual data D107 created by the moving image manual creation device 300 is output to the display control device 110. The moving image manual creation device 300, the display control device 110, and the display 120 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker). The display control device 110 may also be a part of the moving image manual creation device 300.
FIG. 27 is a functional block diagram schematically showing the configuration of the moving image manual creation device 400 according to Embodiment 4. In FIG. 27, configurations that are the same as or correspond to those shown in FIG. 24 are given the same reference numerals as in FIG. 24. The moving image manual creation device 400 is a device capable of executing the moving image manual creation method according to Embodiment 4. The moving image manual creation device 400, the display control device 410, and the AR (augmented reality) glasses 420 constitute a moving image manual presentation system that presents a moving image manual to a person (for example, a worker). The moving image manual presentation system of Embodiment 4 differs from that of Embodiment 3 in that it uses the AR glasses 420 and a display control device 410 that displays images on the AR glasses 420. The display control device 110 may also be a part of the moving image manual creation device 400.
In the above embodiments, examples in which the data has a tree structure have been described, but the data may have a structure other than a tree structure.
Claims (9)
- A moving image manual creation device comprising: a document analysis unit that analyzes a work procedure manual file in which a work procedure is described, and generates text information data indicating the structure of sentences included in the work procedure manual file;
a moving image analysis unit that analyzes a moving image file of a moving image in which work performed according to the work procedure is captured, generates object information data indicating objects included in the moving image, and generates motion information data indicating motions of a person included in the moving image;
a link information generation unit that collects, from the text information data, first sets each being a pair of a noun and a verb included in the sentences, collects, from the object information data and the motion information data, second sets each being a pair of an object and a motion included in the moving image, searches the collected first sets and the collected second sets for a first set and a second set in which the noun corresponds to the object and the verb corresponds to the motion, and generates link information data indicating a correspondence between a position in the work procedure at which the first set obtained by the search is described and a scene in the moving image in which the second set obtained by the search is included; and
a moving image manual generation unit that generates, based on the link information data, moving image manual data for causing a display to display a moving image manual including the work procedure, the moving image, the noun, and the verb.
- The moving image manual creation device according to claim 1, wherein the noun is a word indicating a name of the object, and
the verb is a word indicating a movement of the person.
- The moving image manual creation device according to claim 1 or 2, wherein the object includes at least one of a target of the work, a tool used in the work, and a body part of the person.
- The moving image manual creation device according to any one of claims 1 to 3, wherein the moving image analysis unit includes an audio analysis unit that generates audio information data indicating audio included in the moving image file, and
the link information generation unit collects audio keywords included in the audio information data, searches the text information data for the noun and the verb corresponding to the audio keywords, and generates the link information data indicating a correspondence among the positions in the work procedure of the noun and the verb obtained by the search, the scene in the moving image, and the audio keywords.
- The moving image manual creation device according to any one of claims 1 to 4, further comprising a moving image recording unit that records, in a storage device, a captured file obtained by photographing the person with a camera, wherein the moving image file is the captured file recorded in the storage device.
- The moving image manual creation device according to any one of claims 1 to 4, further comprising a display control device, wherein the display control device causes a display to display an augmented reality image in which the moving image manual, or a part of the moving image manual, is superimposed as augmented reality information on a real object seen ahead of the person's line of sight.
- The moving image manual creation device according to claim 6, wherein the augmented reality image includes a display component that emphasizes the real object, and the display control device switches a display state of the display component.
- A moving image manual creation method executed by a moving image manual creation device that creates moving image manual data, the method comprising: a step of analyzing a work procedure manual file in which a work procedure is described, and generating text information data indicating the structure of sentences included in the work procedure manual file; a step of analyzing a moving image file of a moving image in which work performed according to the work procedure is captured, generating object information data indicating objects included in the moving image, and generating motion information data indicating motions of a person included in the moving image; a step of collecting, from the text information data, first sets each being a pair of a noun and a verb included in the sentences, collecting, from the object information data and the motion information data, second sets each being a pair of an object and a motion included in the moving image, searching the collected first sets and the collected second sets for a first set and a second set in which the noun corresponds to the object and the verb corresponds to the motion, and generating link information data indicating a correspondence between a position in the work procedure at which the first set obtained by the search is described and a scene in the moving image in which the second set obtained by the search is included; and a step of generating, based on the link information data, moving image manual data for causing a display to display a moving image manual including the work procedure, the moving image, the noun, and the verb.
- A moving image manual creation program for causing a computer to execute: a step of analyzing a work procedure manual file in which a work procedure is described, and generating text information data indicating the structure of sentences included in the work procedure manual file; a step of analyzing a moving image file of a moving image in which work performed according to the work procedure is captured, generating object information data indicating objects included in the moving image, and generating motion information data indicating motions of a person included in the moving image; a step of collecting, from the text information data, first sets each being a pair of a noun and a verb included in the sentences, collecting, from the object information data and the motion information data, second sets each being a pair of an object and a motion included in the moving image, searching the collected first sets and the collected second sets for a first set and a second set in which the noun corresponds to the object and the verb corresponds to the motion, and generating link information data indicating a correspondence between a position in the work procedure at which the first set obtained by the search is described and a scene in the moving image in which the second set obtained by the search is included; and a step of generating, based on the link information data, moving image manual data for causing a display to display a moving image manual including the work procedure, the moving image, the noun, and the verb.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021554980A JP7023427B1 (ja) | 2021-05-20 | 2021-05-20 | 動画マニュアル作成装置、動画マニュアル作成方法、及び動画マニュアル作成プログラム |
CN202180098158.5A CN117280339A (zh) | 2021-05-20 | 2021-05-20 | 动态图像手册制作装置、动态图像手册制作方法和动态图像手册制作程序 |
PCT/JP2021/019151 WO2022244180A1 (ja) | 2021-05-20 | 2021-05-20 | 動画マニュアル作成装置、動画マニュアル作成方法、及び動画マニュアル作成プログラム |
US18/381,540 US20240071113A1 (en) | 2021-05-20 | 2023-10-18 | Video manual generation device, video manual generation method, and storage medium storing video manual generation program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/019151 WO2022244180A1 (ja) | 2021-05-20 | 2021-05-20 | 動画マニュアル作成装置、動画マニュアル作成方法、及び動画マニュアル作成プログラム |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/381,540 Continuation US20240071113A1 (en) | 2021-05-20 | 2023-10-18 | Video manual generation device, video manual generation method, and storage medium storing video manual generation program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022244180A1 true WO2022244180A1 (ja) | 2022-11-24 |
Family
ID=81076736
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/019151 WO2022244180A1 (ja) | 2021-05-20 | 2021-05-20 | 動画マニュアル作成装置、動画マニュアル作成方法、及び動画マニュアル作成プログラム |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240071113A1 (ja) |
JP (1) | JP7023427B1 (ja) |
CN (1) | CN117280339A (ja) |
WO (1) | WO2022244180A1 (ja) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023223671A1 (ja) * | 2022-05-17 | 2023-11-23 | 株式会社Nttドコモ | 動画マニュアル生成装置 |
JP7406038B1 (ja) * | 2023-09-19 | 2023-12-26 | 株式会社日立パワーソリューションズ | 作業支援システム及び作業支援方法 |
JP7492092B1 (ja) | 2024-02-20 | 2024-05-28 | 株式会社スタディスト | 電子マニュアルの作成を支援するためのコンピュータシステムおよびプログラム |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008225883A (ja) * | 2007-03-13 | 2008-09-25 | Ricoh Co Ltd | データ処理装置及びそのためのプログラム |
WO2019181263A1 (ja) * | 2018-03-20 | 2019-09-26 | ソニー株式会社 | 情報処理装置、情報処理方法およびプログラム |
JP2020177534A (ja) * | 2019-04-19 | 2020-10-29 | 京セラドキュメントソリューションズ株式会社 | 透過型ウェアラブル端末 |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6391078B1 (ja) * | 2017-08-18 | 2018-09-19 | クックパッド株式会社 | 情報処理装置、端末装置、陳列棚、情報処理システム、情報処理方法、及びプログラム |
JP6837034B2 (ja) * | 2018-08-13 | 2021-03-03 | クックパッド株式会社 | 陳列棚 |
Non-Patent Citations (2)
Title |
---|
MIURA KOICHI, TAKANO MOTOMU, HAMADA REIKO, IDE ICHIRO, SAKAI SHUICHI, TANAKA HIDEHIKO: "Associating Semantically Structured Cooking Videos with their Preparation Steps", IEICE Transactions, vol. J86-D-II, no. 11, The Institute of Electronics, Information and Communication Engineers, 1 November 2003 (2003-11-01), pages 1647-1656, XP093011049 * |
YAMAKATA, YOKO: "A Method of Recipe to Cooking Video Mapping for Automated Cooking Content Construction", IEICE TRANSACTIONS, vol. J90-D, no. 10, 1 October 2007 (2007-10-01), JP , pages 2817 - 2829, XP009541687, ISSN: 1880-4535 * |
Also Published As
Publication number | Publication date |
---|---|
JPWO2022244180A1 (ja) | 2022-11-24 |
US20240071113A1 (en) | 2024-02-29 |
JP7023427B1 (ja) | 2022-02-21 |
CN117280339A (zh) | 2023-12-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022244180A1 (ja) | 動画マニュアル作成装置、動画マニュアル作成方法、及び動画マニュアル作成プログラム | |
CN110446063B (zh) | 视频封面的生成方法、装置及电子设备 | |
KR102238809B1 (ko) | 터치스크린 상에 표시되는 조치 가능한 콘텐츠 | |
US8793118B2 (en) | Adaptive multimodal communication assist system | |
JP2023083402A (ja) | 外科用ビデオをセグメント化するためのシステムおよび方法 | |
US20060197764A1 (en) | Document animation system | |
JP2019531538A (ja) | ワードフロー注釈 | |
GB2556174A (en) | Methods and systems for generating virtual reality environments from electronic documents | |
CN103984772B (zh) | 文本检索字幕库生成方法和装置、视频检索方法和装置 | |
Liu et al. | Towards mediating shared perceptual basis in situated dialogue | |
Zarrieß et al. | Easy things first: Installments improve referring expression generation for objects in photographs | |
Oliveira et al. | Automatic sign language translation to improve communication | |
Iida et al. | Multi-modal reference resolution in situated dialogue by integrating linguistic and extra-linguistic clues | |
EP3276484A1 (en) | Information processing system and information processing method | |
JP2008102758A (ja) | Fmeaシートの作成方法およびfmeaシート自動作成装置 | |
CN112465144A (zh) | 基于有限知识的多模态示范意图生成方法及装置 | |
Ryumin et al. | Towards automatic recognition of sign language gestures using kinect 2.0 | |
WO2015141523A1 (ja) | 情報処理装置、情報処理方法及びコンピュータプログラム | |
EP3824378A1 (en) | Machine interaction | |
Lei et al. | Assistsr: Task-oriented video segment retrieval for personal AI assistant | |
CN108874360B (zh) | 全景内容定位方法和装置 | |
GB2590741A (en) | Artificial intelligence and augmented reality system and method | |
CN112102502A (zh) | 用于飞机驾驶舱功能试验的增强现实辅助方法 | |
Pfeiffer et al. | Gesture semantics reconstruction based on motion capturing and complex event processing: a circular shape example | |
JP2020035023A (ja) | 学習方法、誤り判定方法、学習システム、誤り判定システム、およびプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2021554980 Country of ref document: JP Kind code of ref document: A |
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21940794 Country of ref document: EP Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase |
Ref document number: 202180098158.5 Country of ref document: CN |
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 21940794 Country of ref document: EP Kind code of ref document: A1 |