US20240045902A1 - Method and apparatus for generating segment search data of visual work instruction for performing artificial intelligence - Google Patents

Method and apparatus for generating segment search data of visual work instruction for performing artificial intelligence Download PDF

Info

Publication number
US20240045902A1
US20240045902A1
Authority
US
United States
Prior art keywords
work instruction
visual work
segment
task
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/230,220
Inventor
Sung Bum Park
Suehyun CHANG
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Livinai Inc
Academic Cooperation Foundation of Hoseo University
Original Assignee
Livinai Inc
Academic Cooperation Foundation of Hoseo University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Livinai Inc, Academic Cooperation Foundation of Hoseo University filed Critical Livinai Inc
Assigned to HOSEO UNIVERSITY ACADEMIC COOPERATION FOUNDATION, LivinAI Inc. reassignment HOSEO UNIVERSITY ACADEMIC COOPERATION FOUNDATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CHANG, Suehyun, PARK, SUNG BUM
Publication of US20240045902A1 publication Critical patent/US20240045902A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/178Techniques for file synchronisation in file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/86Arrangements for image or video recognition or understanding using pattern recognition or machine learning using syntactic or structural representations of the image or video pattern, e.g. symbolic string recognition; using graph matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images

Definitions

  • synchronization information for synchronizing the text file generated in step S 200 with the visual work instruction is generated, and the text file data along with the generated synchronization information is stored as segment search data (S 300 ).
  • FIG. 3 is a drawing illustrating synchronization information for a text file generated by the method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention, wherein a bar represents a visual work instruction. For purposes of explanation, the bar-shaped video of FIG. 3 is assumed to be a work instruction for a module or a unit of a product called an automobile.
  • the bar-shaped video is divided into segments ①, ②, ③, ④, and ⑤.
  • the left arrow is the start of segment ③
  • the right arrow is the end of segment ③.
  • the start time of segment ③ can be the end time of segment ②.
  • the start time and the end time mean the start and end times along the time axis, which constitute the synchronization information for the video segments.
  • the video of the work instruction shown in FIG. 3 is arbitrarily assumed to be a video of one specific "unit name"; the video of this specific "unit name" will have a "task name", and the video with this "task name" is divided into tasks 1, 2, and 3.
  • the bar-shaped video sections for task 1, task 2, and task 3 in FIG. 3 exist as segments.
  • task 1 is divided into segments ①, ②, and ③, each corresponding to a different part name.
  • FIGS. 4 to 6 are video screens showing the results of a segment search using the method for generating data for searching segments of the visual work instruction for performing artificial intelligence according to the present invention.
  • the video is retrieved by segment search data including textual information related to the video content of the visual work instruction, namely a text file consisting of a unit name 31, a task step 32 including a task name, a task description 33, and a part name 34, together with the synchronization information of this text file.
  • the unit names 31 of the visual work instructions corresponding to FIGS. 4 to 6 are the same, and each unit name 31 is divided into task steps 32 including task names.
  • the text file related to the video shown in FIG. 4 contains one (1) part name 34.
  • the text file information in FIG. 4 is the result of searching for the corresponding video based on the text information of the unit name 31, the task step 32 labeled "process ③", the task description 33, and the single part name 34.
  • FIG. 5 is the result of searching for a video with the same unit name 31 and the same task step 32, labeled "process ③", as in FIG. 4, and with the same task description 33 text, but with a part name 34 composed of three texts.
  • FIG. 6 shows the same unit name 31 as in FIGS. 4 and 5, but the task step 32 is labeled "process ④", and both the task description 33 and the part name 34 differ from those of FIGS. 4 and 5.
  • the desired segment can be searched for by using a combination of words in a text file.
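To make the search step above concrete, here is a simplified sketch (not part of the patent): plain word overlap stands in for the artificial intelligence-based text search model, and the data shapes are illustrative assumptions. Segment text files are ranked against a query and each match's synchronization times are returned, so a player can jump directly to that section of the visual work instruction.

```python
def search_segments(query, segment_texts, sync_info):
    """Rank segments by the number of query words their text contains and
    return (segment, start, end) for each match, best match first."""
    words = set(query.lower().split())
    hits = []
    for name, text in segment_texts.items():
        overlap = len(words & set(text.lower().split()))
        if overlap:
            hits.append((overlap, name, sync_info[name]['start'],
                         sync_info[name]['end']))
    hits.sort(key=lambda h: -h[0])  # most query-word overlap first
    return [(name, start, end) for _, name, start, end in hits]

# Hypothetical segment search data: per-segment text plus sync times.
texts = {'seg1': 'unit name engine mount task tighten bracket bolts',
         'seg2': 'unit name oil seal task press in seal ring'}
sync = {'seg1': {'start': 0, 'end': 30}, 'seg2': {'start': 30, 'end': 55}}
print(search_segments('engine bracket', texts, sync))  # → [('seg1', 0, 30)]
```

In a full system, the word-overlap scoring would be replaced by the AI-based text search model; the returned start/end pair is exactly the synchronization information stored in step S300.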

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Human Computer Interaction (AREA)

Abstract

The present invention relates to generating training data for performing artificial intelligence. A method and apparatus for generating segment search data of a visual work instruction for performing artificial intelligence are provided, which generate data that enables a search for a user's desired section in a visual work instruction using an artificial intelligence-based text search model.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates to generating data for performing artificial intelligence, and more particularly, to a method and apparatus for generating segment search data of a visual work instruction for performing artificial intelligence, by which method and apparatus data is generated that enables a user to perform a search for a desired segment in a visual work instruction using an artificial intelligence-based text search model.
  • 2. Description of the Related Art
  • Search technology has been evolving since Google introduced its PageRank technique based on graph theory. These early search technologies were based on unsupervised learning, meaning that they could perform a search when given only a set of documents. A typical example of a search model based on unsupervised learning is BM25, which shows significantly improved performance when used in conjunction with a query expansion technique called RM3. Anserini, an open-source implementation of these techniques, is widely used in academia and in industry.
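As a concrete illustration (not part of the patent), the BM25 ranking function mentioned above can be sketched in a few lines of Python. The parameter defaults k1=0.9 and b=0.4 follow Anserini's baseline configuration; the toy corpus and whitespace tokenization are assumptions for illustration.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=0.9, b=0.4):
    """Score each document (a list of tokens) against the query with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                       # document frequency of each term
    for d in docs:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in docs:
        tf = Counter(d)                  # term frequency in this document
        s = 0.0
        for q in query_terms:
            if tf[q] == 0:
                continue
            idf = math.log(1 + (N - df[q] + 0.5) / (df[q] + 0.5))
            s += idf * tf[q] * (k1 + 1) / (
                tf[q] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

# Toy corpus of work-instruction snippets (hypothetical).
docs = ["install the engine mount bracket".split(),
        "remove the transmission cover bolts".split(),
        "inspect the engine oil seal".split()]
scores = bm25_scores("engine bracket".split(), docs)
print(scores.index(max(scores)))  # → 0: the first document matches best
```

RM3 query expansion would then add top-weighted terms from the best-ranked documents to the query and re-run the scoring; it is omitted here for brevity.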
  • Meanwhile, in the field of natural language processing, various search models have been proposed by academic researchers who want to apply AI techniques. For example, deep learning-based search models such as DRMM, KNRM, PACRR, etc. have been proposed. Google's BERT, released in 2018, has shown good performance in various natural language processing fields, and research has continued to utilize transformer or language model-based search models.
  • In the Ad-Hoc Information Retrieval section of Papers with Code, a website that catalogs open-source AI models in each field, one can find the current state-of-the-art (SOTA) AI-based search models alongside Anserini, a search model based on unsupervised learning.
  • According to Jimmy Lin, a researcher at the University of Waterloo in Canada, pre-BERT deep learning-based retrieval models, such as DRMM, KNRM, and PACRR, performed similarly to or worse than Anserini, a retrieval model based on unsupervised learning methodologies, while post-BERT models outperformed Anserini (see Lin, Jimmy. “The Neural Hype, Justified! A Recantation.”). The same trend can be seen in the leaderboard of the Ad-Hoc Information Retrieval section of Papers with Code mentioned above. From these academic studies, we can see that AI-based search models can improve the accuracy of search results.
  • However, AI-based search models have some limitations.
  • In order to use AI-based search models for inference, they must first be trained, which requires a large amount of labeled data. Labeled data must, as a rule, be prepared by humans, which is uneconomical because the cost of labeling is very high given the amount of data required for training.
  • Another problem is that while search models based on unsupervised learning generally do not suffer from long document lengths, most AI-based search models are limited in the input length they can handle. For example, the maximum number of tokens that BERT can process is 512. This is not a problem when searching a corpus of short articles, but it makes such models difficult to apply to long documents such as academic papers.
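One common workaround for this length limit (an illustration, not something the patent prescribes) is to split a long document into overlapping fixed-size token windows, score each window independently with the fixed-length model, and aggregate, e.g. by taking the maximum window score per document. A minimal sketch:

```python
def split_into_windows(tokens, max_len=512, stride=256):
    """Split a token sequence into overlapping windows of at most
    `max_len` tokens, so each window fits a fixed-length model such as
    BERT; consecutive windows overlap by max_len - stride tokens."""
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += stride
    return windows

doc = [f"tok{i}" for i in range(1200)]  # a "long paper" of 1200 tokens
windows = split_into_windows(doc)
print(len(windows), len(windows[-1]))   # → 4 432 (last window is shorter)
```

The document's final score can then be, for example, the maximum of its window scores, keeping every part of a long document searchable despite the 512-token limit.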
  • On the other hand, videos and images do not contain textual information by default, so it is difficult to search for them using information retrieval techniques.
    PRIOR ART LITERATURE
  • Non-Patent Literature
    • (Non-Patent Literature 1) [1] https://paperswithcode.com/task/ad-hoc-information-retrieval
    • (Non-Patent Literature 2) [2] MacAvaney, Sean, et al. “CEDR: Contextualized embeddings for document ranking.” Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.
    • (Non-Patent Literature 3) [3] Dai, Zhuyun, and Jamie Callan. “Deeper text understanding for IR with contextual neural language modeling.” Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.
    SUMMARY OF THE INVENTION
  • The present invention was created to solve the above problems and relates to generating training data for performing artificial intelligence, and more specifically, it aims to provide a method and apparatus for generating segment search data of a visual work instruction for performing artificial intelligence, by which method and apparatus data is generated that enables a user to perform a search for a desired segment in a visual work instruction using an artificial intelligence-based text search model.
  • To accomplish this objective, there is provided a method for generating segment search data for segment search in a visual work instruction for performing artificial intelligence, comprising, (a) separating a video segment based on textual information associated with the visual work instruction; (b) generating and storing a text file corresponding to the video segment separated in step (a); and (c) generating and storing synchronization information for the text file generated in step (b).
  • Preferably, the textual information of step (a) includes a description of the visual work instruction as a whole, a task name, a task description, and module names, unit names, and part names associated with the task description.
  • Preferably, the task name is subdivided into task steps.
  • Preferably, the synchronization information of step (c) is a start time and an end time of the video content for the text file generated in step (b).
  • Another aspect of the present invention to accomplish this object is an apparatus for generating segment search data for segment search in a visual work instruction for performing artificial intelligence, comprising: at least one processor; and at least one memory for storing computer-executable instructions, wherein said computer-executable instructions stored in said at least one memory cause said at least one processor to perform the steps of: (a) separating the video segment based on the textual information associated with the visual work instruction; (b) generating a text file corresponding to the video segment delimited in said step (a); and (c) generating synchronization information for the text file generated in step (b) above, and storing said text file together with said synchronization information as segment search data.
  • According to the present invention, an artificial intelligence-based text search model is used to search for a user's desired section in a visual work instruction.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a diagram schematically illustrating an apparatus for performing data generation for searching segments of a visual work instruction for performing artificial intelligence according to one embodiment of the present invention.
  • FIG. 2 is a flow diagram illustrating a method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to an embodiment of the present invention.
  • FIG. 3 is a diagram illustrating synchronization information of a generated text file in the method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention.
  • FIGS. 4 through 6 are video screens showing results of segment search using the method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention.
  • DETAILED DESCRIPTION OF THE INVENTION
  • Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. Prior to the description of the present invention, it should be noted that the terms and wordings used in the specification and the claims should not be construed by their general and lexical meanings, but by the meanings and concepts that agree with the technical spirit of the present invention, based on the principle that the concepts of the terms may be properly defined by the inventor(s) to describe the invention in the best manner. Therefore, because the examples described in the specification and the configurations illustrated in the drawings are merely preferred embodiments of the present invention and cannot represent the entire technical spirit of the present invention, it should be understood that various equivalents and modifications may exist that can replace them.
  • FIG. 1 is a diagram illustrating an apparatus for performing data generation for segment searching of visual work instructions for performing artificial intelligence according to one embodiment of the present invention, and is a schematic illustration of a configuration of a computing device equipped with a segment search data generation application for visual work instructions.
  • Referring to FIG. 1, a computing electronic device 100 comprises a processor 110, a non-volatile storage unit 120 for storing programs and data, a volatile memory 130 for storing running programs, an input/output unit 140 for inputting and outputting information to and from a user, and a bus for internal communication between these components. Running programs may include an operating system and various applications. Although not shown, the device also includes a power supply.
  • FIG. 2 is a diagram illustrating the flow of a method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention, FIG. 3 is a diagram illustrating synchronization information of a text file generated in the method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention, and FIGS. 4 to 6 are video screens showing the results of segment searching using the method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention.
  • First, as shown in FIG. 2 , the method for generating data for retrieving segments of a visual work instruction for performing artificial intelligence of the present invention identifies video segments from textual information associated with the visual work instruction (S100).
  • In general, a video is a moving picture characterized by continuously showing multiple frames at a fast speed. The video may be accompanied by voice and music synchronized to a time base. In the present invention, the visual work instruction is accompanied by textual information that was present at the time the work instruction was created as a video. This textual information may be synchronized with the visual work instruction. Although a work instruction for assembling a product is described below as an example, the invention is not limited to such example.
  • Textual information associated with the visual work instruction includes, but is not limited to, the visual work instruction's “overall description”, “task name”, “task description”, and the “module name”, “unit name”, and “part name” associated with the task description. Each module may consist of multiple units, and each unit may consist of multiple parts. For example, for a product called an automobile, there will be a textual description of the entire automobile, and along with that text there will be text for each task name and task description. For each task description, there is textual information consisting of a module name, a unit name, and a part name. If we assume that a visual work instruction for an automobile is organized by function-oriented module names, there will be function-oriented work instruction text along with a description of the entire visual work instruction for the automobile, and the function-oriented module names will be divided into an engine function module name, a body function module name, a transmission function module name, a control function module name, and so on. In addition, each module name is subdivided into unit names, and each unit name is subdivided into part names. Also, a “task name” can be subdivided into “task steps”, each with its own text information.
  • In other words, video segments in a visual work instruction are distinguished based on the textual information present in the visual work instruction: the "overall description", "task name", "task description", and the "module name", "unit name", and "part name" associated with that task description. For example, a video segment may be separated at the point where the part name changes; alternatively, no new segment may be created when the part name changes but the unit name does not. In the latter case, the video segments are usually longer than in the former. Thus, depending on which of the synchronized textual information the segments are separated on, the length of the segments, the separation points, and so on will vary.
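As a sketch of the segment-separation step described above, the following Python fragment splits a video at the points where a chosen text field changes. The `Annotation` record and its field names (`module`, `unit`, `part`) are illustrative assumptions, not part of the specification:

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    """Hypothetical record of textual information tied to the video time axis."""
    time: float   # seconds on the video time axis
    module: str
    unit: str
    part: str

def split_segments(annotations, key):
    """Split the video wherever the chosen text field changes.

    `key` selects the separation basis: "part" yields fine-grained segments,
    while "unit" yields fewer, longer segments, as described in the text.
    """
    segments = []
    start = annotations[0].time
    current = getattr(annotations[0], key)
    for a in annotations[1:]:
        value = getattr(a, key)
        if value != current:                      # change point -> segment boundary
            segments.append((start, a.time, current))
            start, current = a.time, value
    segments.append((start, None, current))       # last segment runs to video end
    return segments
```

Splitting the same annotation stream by `"unit"` instead of `"part"` produces fewer, longer segments, matching the observation above that the separation basis changes segment length and separation points.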
  • Then, a text file corresponding to the contents of each video segment identified in step S100 is generated (S200). The text file is generated based on textual information in the visual work instruction such as the "overall description", "task name", "task description", and the "module name", "unit name", and "part name" related to the task description, as described for step S100. The generated text file corresponds to the content of the video segment and contains all of this textual information, and each video segment includes at least one piece of textual information that differs from the others. For example, if a video segment is separated at a point where the part name changes, the text file corresponding to that segment may include the "overall description of the video", "task name", "task description", "module name", "unit name", and "part name". If, instead, video segments are separated at points where the unit name changes, the text file corresponding to a segment may include the "overall description of the video", "task name", "task description", "module name", and "unit name"; in this case, the "part name" may be omitted or may include all part names associated with the unit.
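Continuing the sketch, step S200 can be pictured as building one searchable text record per segment. JSON serialization and the field names are assumptions made purely for illustration; the specification only requires "a text file" containing the textual information:

```python
import json

def segment_text_record(overall_desc, task_name, task_desc,
                        module_name, unit_name, part_names):
    """Serialize the textual information of one video segment as a text record.

    `part_names` holds a single part name when segments are split at
    part-name changes, or all part names of a unit when segments are
    split at unit-name changes, as described above.
    """
    record = {
        "overall_description": overall_desc,
        "task_name": task_name,
        "task_description": task_desc,
        "module_name": module_name,
        "unit_name": unit_name,
        "part_names": part_names,
    }
    # ensure_ascii=False keeps any non-ASCII text readable in the file
    return json.dumps(record, ensure_ascii=False)
```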
  • Next, synchronization information for synchronizing the text file generated in step S200 with the visual work instruction is generated, and the text file data along with the generated synchronization information is stored as segment search data (S300).
  • FIG. 3 illustrates synchronization information for a text file generated by the method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention, in which a bar represents the visual work instruction. For the purpose of explanation, the bar-shaped video of FIG. 3 is assumed to be a work instruction for a module or a unit of a product such as an automobile.
  • In FIG. 3, the bar-shaped video is divided into segments ①, ②, ③, ④, and ⑤. Of the arrows on both sides of segment ③, the left arrow marks the start of segment ③ and the right arrow marks its end. The start time of segment ③ can also be the end time of segment ②. In other words, the start time and end time are defined along the time axis, and together they constitute the synchronization information for the video segments.
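The start/end relationship described for FIG. 3, where the end time of one segment doubles as the start time of the next, can be expressed as a minimal sketch:

```python
def synchronization_info(boundaries):
    """Turn an ordered list of boundary timestamps into (start, end) pairs.

    Adjacent segments share a boundary: the end time of segment 2 is the
    start time of segment 3, as shown in FIG. 3.
    """
    return list(zip(boundaries[:-1], boundaries[1:]))
```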
  • If the video of the work instruction shown in FIG. 3 is assumed to be the video of one specific "unit name", the video of this "unit name" will have "task names", and the video is divided by these task names into tasks 1, 2, and 3. In other words, the bar-shaped video sections for task 1, task 2, and task 3 in FIG. 3 exist as segments. In this case, task 1 is further divided into segments ①, ②, and ③, which correspond to different part names.
  • FIGS. 4 to 6 are video screens showing the results of a segment search using the method for generating data for searching segments of a visual work instruction for performing artificial intelligence according to the present invention.
  • As shown in FIGS. 4 to 6, the video is retrieved using segment search data that includes textual information related to the video content of the visual work instruction, namely a text file consisting of a unit name 31, a task step 32 including a task name, a task description 33, and a part name 34, together with synchronization information for this text file.
  • The unit name 31 of the visual work instructions corresponding to FIGS. 4 to 6 is the same, and this unit name 31 is divided into task steps 32 including task names.
  • The text file information related to the video shown in FIG. 4 includes one part name 34. In other words, FIG. 4 is the result of searching for the corresponding video based on the textual information of the unit name 31, the task step 32 labeled "process ③", the task description 33, and the single part name 34. FIG. 5 is the result of searching for a video with the same unit name 31, the same task step 32 labeled "process ③", and the same task description 33 as in FIG. 4, but with a part name 34 composed of three texts. Furthermore, FIG. 6 shows the same unit name 31 as FIGS. 4 and 5, but its task step 32 is labeled "process ④", and both its task description 33 and its part name 34 differ from those of FIGS. 4 and 5. In other words, when a user wants to find a desired video, the desired segment can be searched for by using a combination of words from the text files.
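The word-combination search illustrated by FIGS. 4 to 6 can be sketched as a simple AND search over the stored text files. The record layout below is an assumption made for illustration, not the format prescribed by the specification:

```python
def search_segments(records, *terms):
    """Return the sync info of every segment whose text contains all query terms."""
    hits = []
    for rec in records:
        # flatten the segment's text file fields into one searchable string
        text = " ".join(str(v) for v in rec["text"].values()).lower()
        if all(term.lower() in text for term in terms):
            hits.append(rec["sync"])   # (start, end) of the matching segment
    return hits

# Example segment search data: one record per video segment (hypothetical values).
records = [
    {"sync": (0, 10),
     "text": {"unit_name": "door unit", "task_step": "process 3",
              "part_name": "hinge"}},
    {"sync": (10, 20),
     "text": {"unit_name": "door unit", "task_step": "process 4",
              "part_name": "handle"}},
]
```

Adding more terms narrows the result set, mirroring how FIGS. 4 to 6 distinguish segments that share a unit name by task step and part name.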
  • Although the present invention has been described above by means of limited embodiments and drawings, the invention is not limited thereto, and various modifications and variations can be made by a person having ordinary skill in the technical field to which the invention belongs, within the scope of the technical idea of the invention and the claims set forth below.

Claims (5)

What is claimed is:
1. A method for generating segment search data for segment search in a visual work instruction for performing artificial intelligence, comprising:
(a) separating a video segment based on textual information associated with the visual work instruction;
(b) generating and storing a text file corresponding to the video segment separated in step (a); and
(c) generating and storing synchronization information for the text file generated in step (b).
2. The method of claim 1, wherein the textual information of step (a) includes a description of the visual work instruction as a whole, a task name, a task description, and module names, unit names, and part names associated with the task description.
3. The method of claim 2, wherein the task name is subdivided into task steps.
4. The method of claim 1, wherein the synchronization information of step (c) is a start time and an end time of the video content for the text file generated in step (b).
5. An apparatus for generating segment search data for searching segments in a visual work instruction for performing artificial intelligence, comprising:
at least one processor; and
at least one memory for storing computer-executable instructions,
wherein said computer-executable instructions stored in said at least one memory cause said at least one processor to perform the steps of:
(a) separating the video segment from the textual information associated with the visual work instruction;
(b) generating a text file corresponding to the video segment delimited in said step (a); and
(c) generating synchronization information for the text file generated in step (b) above, and storing said text file together with said synchronization information as segment search data.
US18/230,220 2022-08-04 2023-08-04 Method and apparatus for generating segment search data of visual work instruction for performing artificial intelligence Pending US20240045902A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2022-0097120 2022-08-04
KR1020220097120A KR102463260B1 (en) 2022-08-04 2022-08-04 Method and apparatus for generating segment search data of video work order for performing artificial intelligence

Publications (1)

Publication Number Publication Date
US20240045902A1 true US20240045902A1 (en) 2024-02-08

Family

ID=84043704

Country Status (2)

Country Link
US (1) US20240045902A1 (en)
KR (1) KR102463260B1 (en)

Legal Events

Date Code Title Description
AS Assignment

Owner name: LIVINAI INC., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PARK, SUNG BUM;CHANG, SUEHYUN;REEL/FRAME:065254/0286

Effective date: 20230808

Owner name: HOSEO UNIVERSITY ACADEMIC COOPERATION FOUNDATION, KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PARK, SUNG BUM;CHANG, SUEHYUN;REEL/FRAME:065254/0286

Effective date: 20230808

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION