CN117648445A - Rich media training courseware knowledge graph construction method and device and computer equipment - Google Patents

Info

Publication number
CN117648445A
Authority
CN
China
Prior art keywords
courseware
rich media
timing
script
media training
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311342061.1A
Other languages
Chinese (zh)
Inventor
倪相生
顾天雄
章伟林
徐冲
汪凝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Zhejiang Electric Power Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the present invention disclose a method, a device, and computer equipment for constructing a knowledge graph of rich media training courseware. The method includes: obtaining rich media training courseware; preprocessing the rich media training courseware to obtain a text script; and constructing a knowledge graph based on the text script. Implementing the method makes it possible to build a knowledge graph for a rich media courseware library, meeting the needs of scenarios such as search and intelligent recommendation.

Description

Rich media training courseware knowledge graph construction method, device and computer equipment

Technical field

The present invention relates to knowledge graph construction methods, and more specifically to a method, device, and computer equipment for constructing a knowledge graph of rich media training courseware.

Background

Rich media training courseware combines text, pictures, audio, video, animation, and other media elements. Because training courseware produced with rich media is vivid, intuitive, and easy to understand, it is widely used in electric power safety education and training. Electric power training departments have generally accumulated a large amount of rich media courseware and built rich media courseware libraries composed mainly of video and audio courseware. Rich media presents its content as video and audio rather than as traditional text, so it cannot be indexed and searched directly. Search engines usually rely on text and markup to understand and organize content, and the content of rich media is not easy to interpret and analyze directly. The existing way to make rich media courseware more searchable is to add metadata to the videos, such as titles, descriptions, keywords, and tags, so that search engines can better understand the content and topic of the courseware. This approach has several shortcomings: the metadata covers only a small part of the information in the courseware, so the knowledge in the courseware cannot be searched completely; a search result cannot be located at a specific position in the video; and semantic search is not supported.

Therefore, it is necessary to design a new method that builds a knowledge graph for a rich media courseware library to meet the needs of scenarios such as search and intelligent recommendation.

Summary of the invention

The purpose of the present invention is to overcome the shortcomings of the prior art and to provide a method, device, and computer equipment for constructing a knowledge graph of rich media training courseware.

To achieve the above purpose, the present invention adopts the following technical solution: a rich media training courseware knowledge graph construction method, including:

obtaining rich media training courseware;

preprocessing the rich media training courseware to obtain a text script;

constructing a knowledge graph based on the text script.

A further technical solution is that preprocessing the rich media training courseware to obtain a text script includes:

preprocessing the audio courseware in the rich media training courseware to obtain a courseware timing script;

preprocessing the video courseware in the rich media training courseware to obtain a voice and courseware timing script and an image and courseware timing script;

combining the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

A further technical solution is that preprocessing the audio courseware in the rich media training courseware to obtain a courseware timing script includes:

applying speech recognition and natural language processing to the audio courseware in the rich media training courseware to recognize the human speech content, converting that content into text, and generating the courseware timing script.

A further technical solution is that preprocessing the video courseware in the rich media training courseware to obtain a voice and courseware timing script and an image and courseware timing script includes:

extracting the audio track from the video courseware in the rich media training courseware, and using speech recognition and natural language processing to recognize the human speech content, so as to generate a voice and courseware timing script containing timing information;

extracting the video courseware in the rich media training courseware into pictures frame by frame;

performing OCR on a designated area of the pictures to generate an image and courseware timing script containing timing information.

A further technical solution is that performing OCR on a designated area of the pictures to generate an image and courseware timing script containing timing information includes:

performing OCR on the designated area of the pictures to extract subtitle information;

removing repeated subtitles from the subtitle information to generate the image and courseware timing script containing timing information.

A further technical solution is that constructing a knowledge graph based on the text script includes:

extracting named entities from the text script;

extracting the relationships between the named entities from the text script;

using the named entities and their relationships to construct entity relationship triples with context;

constructing the knowledge graph from the entity relationship triples.

The present invention also provides a device for constructing a knowledge graph of rich media training courseware, including:

a courseware acquisition unit, used to obtain rich media training courseware;

a preprocessing unit, used to preprocess the rich media training courseware to obtain a text script;

a construction unit, used to construct a knowledge graph based on the text script.

A further technical solution is that the preprocessing unit includes:

a first preprocessing subunit, used to preprocess the audio courseware in the rich media training courseware to obtain a courseware timing script;

a second preprocessing subunit, used to preprocess the video courseware in the rich media training courseware to obtain a voice and courseware timing script and an image and courseware timing script;

a combination subunit, used to combine the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

The present invention also provides computer equipment including a memory and a processor. A computer program is stored in the memory, and the processor implements the above method when it executes the computer program.

The present invention also provides a storage medium that stores a computer program. When the computer program is executed by a processor, the above method is implemented.

Compared with the prior art, the beneficial effects of the present invention are as follows. Using speech recognition, image recognition, natural language processing, and similar methods, the invention extracts from rich media training courseware a text script that reflects its content, and builds on that script a knowledge graph carrying timing information. This solves the problem that a knowledge graph could not be built for rich media courseware because entities and entity relationships could not be extracted from it directly, and it enables semantic search over the rich media training library. Because the timing information is stored in the knowledge graph, search results can be located at the specific position in the rich media courseware. The resulting knowledge graph of the rich media courseware library meets the needs of scenarios such as search and intelligent recommendation.

The present invention is further described below with reference to the accompanying drawings and specific embodiments.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Figure 1 is a schematic diagram of an application scenario of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention;

Figure 2 is a schematic flow chart of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention;

Figure 3 is a schematic sub-flow chart of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention;

Figure 4 is a schematic sub-flow chart of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention;

Figure 5 is a schematic sub-flow chart of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention;

Figure 6 is a schematic sub-flow chart of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention;

Figure 7 is a schematic block diagram of the rich media training courseware knowledge graph construction device provided by an embodiment of the present invention;

Figure 8 is a schematic block diagram of the preprocessing unit of the rich media training courseware knowledge graph construction device provided by an embodiment of the present invention;

Figure 9 is a schematic block diagram of the second preprocessing subunit of the rich media training courseware knowledge graph construction device provided by an embodiment of the present invention;

Figure 10 is a schematic block diagram of the recognition module of the rich media training courseware knowledge graph construction device provided by an embodiment of the present invention;

Figure 11 is a schematic block diagram of the construction unit of the rich media training courseware knowledge graph construction device provided by an embodiment of the present invention;

Figure 12 is a schematic block diagram of the computer equipment provided by an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.

It should be understood that, when used in this specification and the appended claims, the terms "comprises" and "includes" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.

It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.

Please refer to Figures 1 and 2. Figure 1 is a schematic diagram of an application scenario of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention. Figure 2 is a schematic flow chart of that method. The method is applied to a server that exchanges data with a terminal. The server builds a knowledge graph from every courseware item in the rich media training courseware library. The knowledge graph contains most of the knowledge in the library and can effectively support application scenarios such as semantic search and intelligent recommendation over the library. In particular, the knowledge graph built with this method records the timing information of each knowledge entity relationship in the corresponding rich media courseware (that is, where it appears, for example the second of a video courseware at which the knowledge entity occurs). In a search scenario, this timing information pinpoints the position in the courseware that contains the search result, so users do not have to watch the whole courseware to find that position manually, which improves both user experience and search efficiency.

Figure 2 is a schematic flow chart of the rich media training courseware knowledge graph construction method provided by an embodiment of the present invention. As shown in Figure 2, the method includes the following steps S110 to S130.

S110. Obtain rich media training courseware.

In this embodiment, rich media training courseware refers to each courseware item in the rich media training courseware library.

S120. Preprocess the rich media training courseware to obtain a text script.

In this embodiment, the text script refers to the text extracted from audio courseware and video courseware.

In an embodiment, referring to Figure 3, step S120 may include steps S121 to S123.

S121. Preprocess the audio courseware in the rich media training courseware to obtain a courseware timing script.

In this embodiment, a courseware timing script means that, when a text script is generated from rich media courseware, each sentence is saved together with its timing information, that is, the time at which the sentence appears in the courseware. The resulting script is a collection of <timestamp, sentence> pairs. The physical storage format of the script is not limited; one implementation saves it to a txt file in the format "<timestamp 1, sentence 1><timestamp 2, sentence 2>...".
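The <timestamp, sentence> storage format described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `dump_script` and `load_script`, the use of seconds with three decimals for the timestamp, and the assumption that sentences never contain a `>` character are all hypothetical choices made for the example.

```python
import re
from typing import List, Tuple

# One script entry: (timestamp in seconds, recognized sentence).
Entry = Tuple[float, str]

# Matches "<12.500,some sentence>"; assumes sentences contain no '>' character.
_PAIR = re.compile(r"<(\d+(?:\.\d+)?),(.*?)>", re.S)


def dump_script(entries: List[Entry]) -> str:
    """Serialize entries as "<ts1,sentence1><ts2,sentence2>..."."""
    return "".join(f"<{ts:.3f},{sentence}>" for ts, sentence in entries)


def load_script(text: str) -> List[Entry]:
    """Parse the serialized form back into (timestamp, sentence) entries."""
    return [(float(ts), sentence) for ts, sentence in _PAIR.findall(text)]
```

The string returned by `dump_script` can be written to a txt file as-is; a production format would also need an escaping rule for delimiter characters inside sentences.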

The content of the rich media courseware is converted into a text script, which serves as the basis for creating the knowledge graph. The courseware timing script is built so that users viewing search results through the knowledge graph can jump directly to the video or audio position in the courseware that corresponds to a result.

Specifically, speech recognition and natural language processing are applied to the audio courseware in the rich media training courseware to recognize the human speech content, convert it into text, and generate the courseware timing script.

Here, human speech content includes narration, dialogue, and other spoken language.

For training courseware in audio formats such as mp3 and WAV, for example speeches by senior leaders or songs, the content lies almost entirely in the vocal part. Speech recognition is therefore used to convert the spoken part into text, which is saved as the courseware timing script of that courseware. A script obtained this way retains most of the information in the courseware. The specific speech recognition method is existing technology and can be implemented with, for example, the SAN-M model in Alibaba Cloud's "Intelligent Voice Interaction" service; it is not described further here.

S122. Preprocess the video courseware in the rich media training courseware to obtain a voice and courseware timing script and an image and courseware timing script.

In this embodiment, the voice and courseware timing script is a courseware timing script composed of text converted from speech.

The image and courseware timing script is a courseware timing script composed of text converted from video images.

Specifically, for training courseware in video formats such as mp4, the information is distributed across the sound and the picture, so it must be extracted along both dimensions, producing a voice and courseware timing script and an image and courseware timing script respectively. Taken together, these two scripts retain the main information in the courseware.

In an embodiment, referring to Figure 4, step S122 may include steps S1221 to S1223.

S1221. Extract the audio track from the video courseware in the rich media training courseware, and use speech recognition and natural language processing to recognize the human speech content, so as to generate a voice and courseware timing script containing timing information.

In this embodiment, rich media training courseware of the lecture type is usually accompanied by voice narration, so the speech contains essentially all of the courseware information; for film-type courseware, the spoken dialogue contains part of the courseware information. The audio track of the courseware can therefore be extracted and processed with the same speech recognition and natural language processing used for audio courseware, generating the voice and courseware timing script of that courseware. Many mature solutions exist for audio track extraction, for example the extract_audio function of the open-source tool FFmpeg; they are not described further here.
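As an illustration of the audio track extraction step, the sketch below builds an argument list for the `ffmpeg` command-line tool rather than relying on the extract_audio function named in the text. The function name `build_audio_extract_cmd`, the 16 kHz mono PCM output, and the file names are assumptions chosen for the example; the options used (`-y`, `-i`, `-vn`, `-acodec`, `-ar`, `-ac`) are standard FFmpeg options.

```python
from typing import List


def build_audio_extract_cmd(video_path: str, wav_path: str,
                            sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that writes the audio track of a video to WAV.

    -vn drops the video stream, -acodec pcm_s16le selects 16-bit PCM,
    -ar sets the sample rate, and -ac 1 downmixes to mono. The 16 kHz mono
    target is an assumption about what a speech recognizer might expect.
    """
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vn",
        "-acodec", "pcm_s16le",
        "-ar", str(sample_rate),
        "-ac", "1",
        wav_path,
    ]
```

With FFmpeg installed, the command could be run with `subprocess.run(build_audio_extract_cmd("lesson.mp4", "lesson.wav"), check=True)`; passing a list instead of a shell string avoids quoting problems with file names.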

S1222. Extract the video courseware in the rich media training courseware into pictures frame by frame.

In this embodiment, the information in video-type rich media training courseware lies mainly in the picture, but existing artificial intelligence technology cannot accurately recognize and understand picture content. Rich media courseware made for training purposes is usually subtitled, with explanatory subtitles at the bottom of the screen and transition captions where key sections change. These subtitles explain the image content of the courseware, so a script obtained by recognizing them reflects most of the image information and can be used to build the knowledge graph.

Specifically, the video courseware is first split frame by frame into pictures, one picture per frame, and OCR is then applied to each picture to extract the text in it. Mature solutions exist for both frame splitting and OCR; one implementation uses FFmpeg to split the video into per-frame pictures and Tesseract OCR to recognize the subtitle text in each picture. They are not described further here.
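For the image script to carry timing information, each extracted frame has to be mapped back to a time in the video. A minimal sketch of that mapping follows, assuming a constant frame rate and zero-based frame indices; the function names `frame_timestamp` and `format_hms` are hypothetical.

```python
def frame_timestamp(frame_index: int, fps: float) -> float:
    """Return the time in seconds of a zero-based frame at a constant frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index / fps


def format_hms(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm for display next to a search result."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"
```

Videos with a variable frame rate would need the presentation timestamp of each frame instead of this division.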

S1223. Perform OCR on a designated area of the pictures to generate an image and courseware timing script containing timing information.

In an embodiment, referring to Figure 5, step S1223 may include steps S12231 to S12232.

S12231. Perform OCR on the designated area of the pictures to extract subtitle information.

In this embodiment, when OCR is used to recognize text in a picture in order to extract subtitles, the picture usually contains a large amount of unrelated text besides the subtitles, such as text on billboards and road signs that appear in the video. Such text has little to do with the courseware content and would pollute the script if it were included. One feasible approach is to restrict recognition to a designated area. Within a given rich media training courseware, subtitles generally appear in a fixed display area, for example a narrow rectangle at the bottom of the screen, so only the text in this area needs to be recognized and added to the script as subtitles. Several mature implementations exist. One uses the image_to_string function in the Tesseract OCR SDK package: the recognition result is described as including the region of each piece of text in the picture, so, given the top-left and bottom-right coordinates of the rectangular subtitle area, the subtitle text inside that area can be extracted from the result.
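The region restriction can be sketched independently of any OCR engine. The example below assumes the OCR step has already produced per-word bounding boxes; the `WordBox` type, the `subtitle_text` function, and the single-line-subtitle assumption are all hypothetical. In the pytesseract Python wrapper, for instance, per-word boxes of this kind come from `image_to_data` rather than `image_to_string`, which returns plain text.

```python
from typing import List, NamedTuple, Tuple


class WordBox(NamedTuple):
    """One recognized word and its bounding box in pixels (assumed OCR output)."""
    text: str
    left: int
    top: int
    width: int
    height: int


# Subtitle region as (left, top, right, bottom) pixel coordinates.
Rect = Tuple[int, int, int, int]


def subtitle_text(boxes: List[WordBox], region: Rect, sep: str = "") -> str:
    """Join the words whose boxes lie entirely inside the subtitle region.

    Words are ordered left to right, which assumes a single-line subtitle.
    """
    x0, y0, x1, y1 = region
    inside = [
        b for b in boxes
        if b.text.strip()
        and b.left >= x0 and b.top >= y0
        and b.left + b.width <= x1 and b.top + b.height <= y1
    ]
    inside.sort(key=lambda b: b.left)
    return sep.join(b.text for b in inside)
```

An alternative with the same effect is to crop each frame to the subtitle rectangle before running OCR, which also reduces recognition time.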

S12232. Remove duplicate subtitles from the subtitle information to generate the image and courseware timing script containing timing information.

In this embodiment, because the same subtitle appears in many consecutive frames, the text sequence extracted from each frame must be compared with the text sequence extracted from the previous frame and deduplicated, so that large amounts of repeated text do not make the script excessively large.
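The frame-to-frame comparison can be sketched as follows. This is a minimal sketch under stated assumptions: the input is a sequence of `(frame_index, text)` pairs, the timing information is derived as `frame_index / fps`, and exact string equality is used as the comparison (OCR noise may call for a fuzzier match in practice). The names `dedupe_subtitles`, `frames` and `fps` are illustrative and do not come from the source.

```python
from typing import Iterable, List, Tuple


def dedupe_subtitles(frames: Iterable[Tuple[int, str]],
                     fps: float) -> List[Tuple[float, str]]:
    """Keep a subtitle only when it differs from the previous frame's text.

    Returns (seconds, text) pairs, where seconds marks the first frame
    on which the subtitle appeared.
    """
    script: List[Tuple[float, str]] = []
    previous = ""
    for index, text in frames:
        text = text.strip()
        if text and text != previous:
            script.append((index / fps, text))
        previous = text
    return script


# Hypothetical per-frame OCR results for a 25 fps video.
frames = [
    (0, ""),
    (25, "Open the switch"),
    (26, "Open the switch"),
    (50, "Check the voltage"),
    (51, "Check the voltage"),
    (75, ""),
]
script = dedupe_subtitles(frames, fps=25.0)
print(script)
```

Each subtitle is recorded once, at the time it first appears, which keeps the timing information needed later while discarding the repeated frames.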

S123. Combine the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

In this embodiment, the text script comprises three types of scripts: the courseware timing script, the voice and courseware timing script, and the image and courseware timing script.
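The source does not specify how the three scripts are combined. One possible reading, sketched below purely as an assumption, is to tag each script line with its courseware, time offset and origin, and then merge everything into a single list ordered by courseware and time. The `ScriptLine` record, its field names and the `source` labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScriptLine:
    courseware_id: str   # which courseware the line came from
    seconds: float       # offset of the line within that courseware
    source: str          # "audio", "video_speech" or "video_image"
    text: str


def combine_scripts(*scripts: List[ScriptLine]) -> List[ScriptLine]:
    """Merge several timing scripts into one list ordered by courseware and time."""
    merged = [line for script in scripts for line in script]
    return sorted(merged, key=lambda line: (line.courseware_id, line.seconds))


# Hypothetical lines from the three kinds of scripts.
audio = [ScriptLine("A01", 3.0, "audio", "Wear insulating gloves")]
speech = [ScriptLine("V01", 12.0, "video_speech", "Now open the switch")]
image = [ScriptLine("V01", 11.5, "video_image", "Open the switch")]

text_script = combine_scripts(audio, speech, image)
for line in text_script:
    print(line)
```

Keeping the courseware identifier and the time offset on every line preserves the timing index that the later graph-building step attaches to each relationship.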

S130. Construct a knowledge graph according to the text script.

Specifically, the text scripts corresponding to all courseware in the rich media training library are obtained; the text of these scripts largely reflects the specific content of the rich media courseware. The three types of scripts also contain courseware timing index information. All scripts are then analyzed as follows to build the knowledge graph of the rich media courseware library.

Using NLP techniques: extract the named entities and relationships in the three types of scripts; disambiguate the named entities; construct entity relationship triples (SPO) with context information; construct the knowledge graph.

In an embodiment, referring to FIG. 6, step S130 may include steps S131 to S134.

S131. Extract named entities from the text script.

S132. Extract the relationships between the named entities from the text script.

S133. Use the named entities and their relationships to construct entity relationship triples with context.

S134. Construct the knowledge graph from the entity relationship triples.

Specifically, the first stage of processing, applied to all courseware in the rich media training courseware library, yields a courseware timing script library. Each audio courseware in the courseware library corresponds to one script in the script library, and each video courseware corresponds to two scripts (a voice and courseware timing script and an image and courseware timing script). These scripts are all natural language text and reflect the knowledge contained in the corresponding rich media courseware, so the knowledge graph can be built directly on them.

An existing mature solution for building a knowledge graph from text can be used. One implementation is outlined as follows. Use the named entity recognition function of the natural language processing toolkit HanLP to extract candidate named entities from the text scripts, and use a BERT neural network model for entity disambiguation to obtain the named entities. Use HanLP's deep-learning semantic analysis function to extract entity relationships. Construct triples from the named entities and entity relationships obtained above, and store the timing information in the corresponding triple, namely the number of the courseware containing the relationship and the time at which it occurs. Import the generated entity relationship triples into the graph database Neo4J, storing entities as "nodes" of the graph, relationships as "relationships" of the graph, and the timing information of each triple as a property of the corresponding "relationship". If a relationship appears multiple times in the rich media courseware library, all of its timing information is stored as a set in the "relationship" property. These steps complete the construction of the knowledge graph for the rich media courseware library.
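The rule that repeated relationships share one edge carrying all of their timing information can be sketched as a grouping step performed before the graph import. This is a minimal sketch, assuming entity and relationship extraction has already produced `(subject, predicate, object)` triples paired with a courseware number and a time in seconds. The `courseware@seconds` string encoding is an assumption, chosen because Neo4j relationship properties can hold lists of primitive values. The HanLP, BERT and Neo4J calls themselves are not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]    # (subject, predicate, object)
Occurrence = Tuple[str, float]   # (courseware number, seconds)


def merge_triples(
    extracted: Iterable[Tuple[Triple, Occurrence]]
) -> Dict[Triple, List[str]]:
    """Group occurrences by triple so each relationship is stored once.

    Every occurrence becomes a "courseware@seconds" string; duplicates
    collapse because the occurrences are collected in a set.
    """
    grouped: Dict[Triple, set] = defaultdict(set)
    for triple, (courseware_id, seconds) in extracted:
        grouped[triple].add(f"{courseware_id}@{seconds:.1f}")
    return {triple: sorted(occ) for triple, occ in grouped.items()}


# Hypothetical extraction output.
extracted = [
    (("transformer", "contains", "winding"), ("C001", 12.0)),
    (("transformer", "contains", "winding"), ("C007", 95.5)),
    (("transformer", "contains", "winding"), ("C001", 12.0)),  # duplicate
    (("breaker", "protects", "line"), ("C001", 40.0)),
]
edges = merge_triples(extracted)
for (subj, pred, obj), occurrences in edges.items():
    print(subj, pred, obj, occurrences)
```

Each key of `edges` would then map to one relationship between two entity nodes, with the occurrence list stored as its timing property, so a search hit on the relationship can point back to every courseware position where it was stated.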

The rich media training courseware knowledge graph construction method described above uses speech recognition, image recognition, natural language processing and similar techniques to extract from the rich media training courseware text scripts that reflect its content, and builds a knowledge graph with timing information on that basis. This solves the problem that a knowledge graph cannot be built for rich media courseware because entities and entity relationships cannot be extracted from it directly, and it provides support for semantic search over the rich media training library. Because the timing information is stored in the knowledge graph, search results can point directly to the specific position within a rich media courseware. A knowledge graph is thus established for the rich media courseware library, meeting the needs of scenarios such as search and intelligent recommendation.

FIG. 7 is a schematic block diagram of a rich media training courseware knowledge graph construction device 300 provided by an embodiment of the present invention. As shown in FIG. 7, corresponding to the above rich media training courseware knowledge graph construction method, the present invention also provides a rich media training courseware knowledge graph construction device 300. The device 300 includes units for executing the above method and can be configured in a server. Specifically, referring to FIG. 7, the device 300 includes a courseware acquisition unit 301, a preprocessing unit 302 and a construction unit 303.

The courseware acquisition unit 301 is used to obtain rich media training courseware; the preprocessing unit 302 is used to preprocess the rich media training courseware to obtain a text script; the construction unit 303 is used to construct a knowledge graph according to the text script.
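The division into units 301 to 303 can be pictured as three pluggable stages called in sequence. The sketch below is only an illustration of that composition: the class name `KnowledgeGraphBuilder`, the callable signatures and the stub stages are hypothetical and stand in for the real acquisition, preprocessing and construction logic.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


class KnowledgeGraphBuilder:
    """Chains the three units: acquire courseware, preprocess, build graph."""

    def __init__(self,
                 acquire: Callable[[], List[str]],
                 preprocess: Callable[[List[str]], List[str]],
                 build: Callable[[List[str]], List[Triple]]) -> None:
        self.acquire = acquire        # role of courseware acquisition unit 301
        self.preprocess = preprocess  # role of preprocessing unit 302
        self.build = build            # role of construction unit 303

    def run(self) -> List[Triple]:
        courseware = self.acquire()
        text_script = self.preprocess(courseware)
        return self.build(text_script)


# Stub stages, for illustration only.
def fake_acquire() -> List[str]:
    return ["lesson1.mp4"]


def fake_preprocess(files: List[str]) -> List[str]:
    return [f"script of {name}" for name in files]


def fake_build(script: List[str]) -> List[Triple]:
    return [("script", "describes", line) for line in script]


graph = KnowledgeGraphBuilder(fake_acquire, fake_preprocess, fake_build).run()
print(graph)
```

Passing the stages in as callables mirrors the statement that the unit division is a logical one: each stage can be replaced or merged without changing the overall flow.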

In an embodiment, as shown in FIG. 8, the preprocessing unit 302 includes a first preprocessing subunit 3021, a second preprocessing subunit 3022 and a combining subunit 3023.

The first preprocessing subunit 3021 is used to preprocess the audio courseware in the rich media training courseware to obtain the courseware timing script; the second preprocessing subunit 3022 is used to preprocess the video courseware in the rich media training courseware to obtain the voice and courseware timing script and the image and courseware timing script; the combining subunit 3023 is used to combine the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

In an embodiment, the first preprocessing subunit 3021 is used to apply speech recognition and natural language processing to the audio courseware in the rich media training courseware to recognize the spoken content, convert the spoken content into text, and generate the courseware timing script.

In an embodiment, as shown in FIG. 9, the second preprocessing subunit 3022 includes an audio track processing module 30221, a picture extraction module 30222 and a recognition module 30223.

The audio track processing module 30221 is used to extract the audio track from the video courseware in the rich media training courseware and to recognize the spoken content using speech recognition and natural language processing, so as to generate the voice and courseware timing script containing timing information; the picture extraction module 30222 is used to extract the video courseware in the rich media training courseware into pictures frame by frame; the recognition module 30223 is used to perform OCR on a designated area of the pictures to generate the image and courseware timing script containing timing information.

In an embodiment, as shown in FIG. 10, the recognition module 30223 includes a subtitle extraction submodule 302231 and a deduplication submodule 302232.

The subtitle extraction submodule 302231 is used to perform OCR on the designated area of the picture to extract subtitle information; the deduplication submodule 302232 is used to remove duplicate subtitles from the subtitle information to generate the image and courseware timing script containing timing information.

In an embodiment, as shown in FIG. 11, the construction unit 303 includes an entity extraction subunit 3031, a relationship extraction subunit 3032, a triple construction subunit 3033 and a graph construction subunit 3034.

The entity extraction subunit 3031 is used to extract named entities from the text script; the relationship extraction subunit 3032 is used to extract the relationships between the named entities from the text script; the triple construction subunit 3033 is used to construct entity relationship triples with context from the named entities and their relationships; the graph construction subunit 3034 is used to construct the knowledge graph from the entity relationship triples.

It should be noted that those skilled in the art will clearly understand that, for the specific implementation of the rich media training courseware knowledge graph construction device 300 and its units, reference can be made to the corresponding descriptions in the foregoing method embodiments; for convenience and brevity, they are not repeated here.

The rich media training courseware knowledge graph construction device 300 can be implemented in the form of a computer program, which can run on a computer device as shown in FIG. 12.

Referring to FIG. 12, FIG. 12 is a schematic block diagram of a computer device provided by an embodiment of the present application. The computer device 500 may be a server, which may be an independent server or a server cluster composed of multiple servers.

Referring to FIG. 12, the computer device 500 includes a processor 502, a memory and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.

The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions which, when executed, cause the processor 502 to perform a rich media training courseware knowledge graph construction method.

The processor 502 provides computing and control capabilities to support the operation of the entire computer device 500.

The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, it causes the processor 502 to perform a rich media training courseware knowledge graph construction method.

The network interface 505 is used for network communication with other devices. Those skilled in the art will understand that the structure shown in FIG. 12 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device 500 to which the solution is applied; a specific computer device 500 may include more or fewer components than shown, combine certain components, or arrange the components differently.

The processor 502 is used to run the computer program 5032 stored in the memory to implement the following steps:

obtaining rich media training courseware; preprocessing the rich media training courseware to obtain a text script; constructing a knowledge graph according to the text script.

In an embodiment, when implementing the step of preprocessing the rich media training courseware to obtain a text script, the processor 502 specifically implements the following steps:

preprocessing the audio courseware in the rich media training courseware to obtain the courseware timing script; preprocessing the video courseware in the rich media training courseware to obtain the voice and courseware timing script and the image and courseware timing script; combining the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

In an embodiment, when implementing the step of preprocessing the audio courseware in the rich media training courseware to obtain the courseware timing script, the processor 502 specifically implements the following steps:

applying speech recognition and natural language processing to the audio courseware in the rich media training courseware to recognize the spoken content, converting the spoken content into text, and generating the courseware timing script.

In an embodiment, when implementing the step of preprocessing the video courseware in the rich media training courseware to obtain the voice and courseware timing script and the image and courseware timing script, the processor 502 specifically implements the following steps:

extracting the audio track from the video courseware in the rich media training courseware and recognizing the spoken content using speech recognition and natural language processing, so as to generate the voice and courseware timing script containing timing information; extracting the video courseware in the rich media training courseware into pictures frame by frame; performing OCR on a designated area of the pictures to generate the image and courseware timing script containing timing information.

In an embodiment, when implementing the step of performing OCR on a designated area of the pictures to generate the image and courseware timing script containing timing information, the processor 502 specifically implements the following steps:

performing OCR on the designated area of the picture to extract subtitle information; removing duplicate subtitles from the subtitle information to generate the image and courseware timing script containing timing information.

In an embodiment, when implementing the step of constructing a knowledge graph according to the text script, the processor 502 specifically implements the following steps:

extracting named entities from the text script; extracting the relationships between the named entities from the text script; constructing entity relationship triples with context from the named entities and their relationships; constructing the knowledge graph from the entity relationship triples.

It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program includes program instructions and can be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the process steps of the above method embodiments.

Therefore, the present invention also provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps:

obtaining rich media training courseware; preprocessing the rich media training courseware to obtain a text script; constructing a knowledge graph according to the text script.

In an embodiment, when executing the computer program to implement the step of preprocessing the rich media training courseware to obtain a text script, the processor specifically implements the following steps:

preprocessing the audio courseware in the rich media training courseware to obtain the courseware timing script; preprocessing the video courseware in the rich media training courseware to obtain the voice and courseware timing script and the image and courseware timing script; combining the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

In an embodiment, when executing the computer program to implement the step of preprocessing the audio courseware in the rich media training courseware to obtain the courseware timing script, the processor specifically implements the following steps:

applying speech recognition and natural language processing to the audio courseware in the rich media training courseware to recognize the spoken content, converting the spoken content into text, and generating the courseware timing script.

In an embodiment, when executing the computer program to implement the step of preprocessing the video courseware in the rich media training courseware to obtain the voice and courseware timing script and the image and courseware timing script, the processor specifically implements the following steps:

extracting the audio track from the video courseware in the rich media training courseware and recognizing the spoken content using speech recognition and natural language processing, so as to generate the voice and courseware timing script containing timing information; extracting the video courseware in the rich media training courseware into pictures frame by frame; performing OCR on a designated area of the pictures to generate the image and courseware timing script containing timing information.

In an embodiment, when executing the computer program to implement the step of performing OCR on a designated area of the pictures to generate the image and courseware timing script containing timing information, the processor specifically implements the following steps:

performing OCR on the designated area of the picture to extract subtitle information; removing duplicate subtitles from the subtitle information to generate the image and courseware timing script containing timing information.

In an embodiment, when executing the computer program to implement the step of constructing a knowledge graph according to the text script, the processor specifically implements the following steps:

extracting named entities from the text script; extracting the relationships between the named entities from the text script; constructing entity relationship triples with context from the named entities and their relationships; constructing the knowledge graph from the entity relationship triples.

The storage medium may be any computer-readable storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk or an optical disc.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.

In the several embodiments provided by the present invention, it should be understood that the disclosed devices and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.

The steps in the methods of the embodiments of the present invention can be reordered, combined and deleted according to actual needs. The units in the devices of the embodiments of the present invention can be merged, divided and deleted according to actual needs. In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a terminal, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention.

The above are only specific embodiments of the present invention, and the protection scope of the present invention is not limited to them. Any person familiar with the technical field can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and these modifications or substitutions shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for constructing a knowledge graph of rich media training courseware, characterized by comprising:
acquiring rich media training courseware;
preprocessing the rich media training courseware to obtain a text script;
constructing a knowledge graph from the text script.

2. The method for constructing a knowledge graph of rich media training courseware according to claim 1, characterized in that preprocessing the rich media training courseware to obtain a text script comprises:
preprocessing the audio courseware in the rich media training courseware to obtain a courseware timing script;
preprocessing the video courseware in the rich media training courseware to obtain a voice and courseware timing script and an image and courseware timing script;
combining the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

3. The method for constructing a knowledge graph of rich media training courseware according to claim 2, characterized in that preprocessing the audio courseware in the rich media training courseware to obtain a courseware timing script comprises:
applying speech recognition and natural language processing to the audio courseware in the rich media training courseware to recognize the spoken content, converting the spoken content into text, and generating the courseware timing script.

4. The method for constructing a knowledge graph of rich media training courseware according to claim 2, characterized in that preprocessing the video courseware in the rich media training courseware to obtain a voice and courseware timing script and an image and courseware timing script comprises:
extracting the audio track from the video courseware in the rich media training courseware, and applying speech recognition and natural language processing to recognize the spoken content, so as to generate a voice and courseware timing script containing timing information;
extracting the video courseware in the rich media training courseware frame by frame as pictures;
performing OCR on a designated area of the pictures to generate an image and courseware timing script containing timing information.

5. The method for constructing a knowledge graph of rich media training courseware according to claim 4, characterized in that performing OCR on a designated area of the pictures to generate an image and courseware timing script containing timing information comprises:
performing OCR on the designated area of the pictures to extract subtitle information;
removing repeated subtitles from the subtitle information to generate the image and courseware timing script containing timing information.

6. The method for constructing a knowledge graph of rich media training courseware according to claim 1, characterized in that constructing a knowledge graph from the text script comprises:
extracting named entities from the text script;
extracting relationships between the named entities from the text script;
constructing entity-relationship triples with context from the named entities and the relationships between them;
constructing the knowledge graph from the entity-relationship triples.

7. A device for constructing a knowledge graph of rich media training courseware, characterized by comprising:
a courseware acquisition unit, configured to acquire rich media training courseware;
a preprocessing unit, configured to preprocess the rich media training courseware to obtain a text script;
a construction unit, configured to construct a knowledge graph from the text script.

8. The device for constructing a knowledge graph of rich media training courseware according to claim 7, characterized in that the preprocessing unit comprises:
a first preprocessing subunit, configured to preprocess the audio courseware in the rich media training courseware to obtain a courseware timing script;
a second preprocessing subunit, configured to preprocess the video courseware in the rich media training courseware to obtain a voice and courseware timing script and an image and courseware timing script;
a combination subunit, configured to combine the courseware timing script, the voice and courseware timing script, and the image and courseware timing script to obtain the text script.

9. A computer device, characterized in that the computer device comprises a memory and a processor, a computer program is stored in the memory, and the processor, when executing the computer program, implements the method according to any one of claims 1 to 6.

10. A storage medium, characterized in that the storage medium stores a computer program, and the computer program, when executed by a processor, implements the method according to any one of claims 1 to 6.
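The data flow recited in claims 2, 5 and 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `Segment` and `Triple` types, the function names, the source labels, and the `co_occurs_with` relation are all hypothetical, and real speech recognition, OCR, named entity recognition and relation extraction are replaced here by pre-supplied text and simple substring matching.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    start: float  # seconds from the start of the courseware
    end: float
    text: str
    source: str   # hypothetical labels: "audio", "video_speech", "video_ocr"


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str
    context: str  # the script text the triple was taken from
    start: float  # timestamp of that text


def dedupe_subtitles(frames):
    """Claim 5: collapse repeated per-frame OCR results into timed segments.

    `frames` is an iterable of (timestamp, text) pairs in time order.
    Blank frames are skipped; a run of identical text becomes one segment.
    """
    segments = []
    for t, text in frames:
        text = text.strip()
        if not text:
            continue
        if segments and segments[-1].text == text:
            prev = segments[-1]
            segments[-1] = Segment(prev.start, t, text, prev.source)
        else:
            segments.append(Segment(t, t, text, "video_ocr"))
    return segments


def combine_scripts(*scripts):
    """Claim 2: merge several timing scripts into one, ordered by start time."""
    merged = [seg for script in scripts for seg in script]
    return sorted(merged, key=lambda s: (s.start, s.source))


def extract_triples(script, entities):
    """Claim 6, toy stand-in: link entities that appear in the same segment.

    Real NER and relation extraction are replaced by substring matching
    against a caller-supplied entity list and a fixed "co_occurs_with" label.
    """
    triples = []
    for seg in script:
        found = [e for e in entities if e.lower() in seg.text.lower()]
        for head, tail in combinations(found, 2):
            triples.append(Triple(head, "co_occurs_with", tail, seg.text, seg.start))
    return triples


def build_graph(triples):
    """Store triples as an adjacency map: head -> [(relation, tail, context)]."""
    graph = {}
    for t in triples:
        graph.setdefault(t.head, []).append((t.relation, t.tail, t.context))
    return graph


if __name__ == "__main__":
    ocr = dedupe_subtitles([(0.0, "Check the breaker"), (0.5, "Check the breaker"),
                            (1.0, ""), (1.5, "Open the switch")])
    speech = [Segment(0.2, 1.4, "Open the switch, then test the breaker", "video_speech")]
    script = combine_scripts(ocr, speech)
    print(build_graph(extract_triples(script, ["switch", "breaker"])))
```

Keeping the source sentence and its timestamp on every triple is one way to read "triples with context" in claim 6: each edge in the graph can then be traced back to the moment in the courseware where it was stated.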
CN202311342061.1A 2023-10-17 2023-10-17 Rich media training courseware knowledge graph construction method and device and computer equipment Pending CN117648445A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311342061.1A CN117648445A (en) 2023-10-17 2023-10-17 Rich media training courseware knowledge graph construction method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311342061.1A CN117648445A (en) 2023-10-17 2023-10-17 Rich media training courseware knowledge graph construction method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN117648445A true CN117648445A (en) 2024-03-05

Family ID=90046710

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202311342061.1A Pending CN117648445A (en) 2023-10-17 2023-10-17 Rich media training courseware knowledge graph construction method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN117648445A (en)

Similar Documents

Publication Publication Date Title
US20220214775A1 (en) Method for extracting salient dialog usage from live data
US11836183B2 (en) Digital image classification and annotation
US20230214423A1 (en) Video generation
JP6361351B2 (en) Method, program and computing system for ranking spoken words
US9652452B2 (en) Method and system for constructing a language model
US12198433B2 (en) Searching within segmented communication session content
US12106750B2 (en) Multi-modal interface in a voice-activated network
CN105956053A (en) Network information-based search method and apparatus
WO2024188277A1 (en) Text semantic matching method and refrigeration device system
WO2025060594A1 (en) Search processing method, electronic device, and storage medium
CN116521626A (en) Personal knowledge management method and system based on content retrieval
CN115033661A (en) A method and device for natural language semantic understanding based on vertical domain knowledge graph
US20230112385A1 (en) Method of obtaining event information, electronic device, and storage medium
CN116010545A (en) Data processing method, device and equipment
CN114925206A (en) Artificial intelligence body, voice information recognition method, storage medium and program product
CN119106098A (en) A video plot question-answering method and device based on RAG
CN110008314B (en) Intention analysis method and device
US20240244290A1 (en) Video processing method and apparatus, device and storage medium
CN111161737A (en) Data processing method and device, electronic equipment and storage medium
CN117648445A (en) Rich media training courseware knowledge graph construction method and device and computer equipment
CN116614669A (en) Audio and video data processing method and device, electronic equipment and storage medium
CN116978028A (en) Video processing method, device, electronic equipment and storage medium
CN114491153A (en) Method, medium, apparatus and computing device for determining cover image
CN117289869B (en) Data processing method, device, equipment and storage medium
US20250119625A1 (en) Generating video insights based on machine-generated text representations of videos

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination