WO2022092565A1 - Digital audiobook production system and method therefor - Google Patents

Digital audiobook production system and method therefor

Info

Publication number
WO2022092565A1
WO2022092565A1 PCT/KR2021/012649 KR2021012649W WO2022092565A1 WO 2022092565 A1 WO2022092565 A1 WO 2022092565A1 KR 2021012649 W KR2021012649 W KR 2021012649W WO 2022092565 A1 WO2022092565 A1 WO 2022092565A1
Authority
WO
WIPO (PCT)
Prior art keywords
digital
digital audio
data
audiobook
text data
Prior art date
Application number
PCT/KR2021/012649
Other languages
English (en)
Korean (ko)
Inventor
이장우
Original Assignee
이장우
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 이장우
Publication of WO2022092565A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Definitions

  • The present invention relates to a digital audiobook production system and a method therefor, and more particularly to a system and method that convert text data into digital audio data to produce an on-demand digital audiobook and provide it to users with a tone of a designated type.
  • In general, a digital audiobook refers to a digital book in which information such as text or images that has been, or could be, published as a book is recorded on an electronic recording medium or storage device as digital audio data converted from digital text data, without going through a narration process, so that its contents can be read, viewed, and listened to on a computer or mobile terminal through a wired/wireless information and communication network.
  • Such a digital audiobook is presented visually and aurally through a terminal, such as a PC equipped with a dedicated reader or a display means on which the text or images of the book are shown, so that the user can read it directly.
  • From the buyer's point of view, such a digital audiobook costs less than a paper book, saves time through online purchase (download from the e-book publisher's website), allows only the necessary parts to be purchased separately, and makes it possible to watch video materials or listen to background music while reading. From the publisher's point of view, it improves business profits by saving production and distribution costs such as printing and bookbinding, reducing the inventory burden, and making it easy to update book contents.
  • The present invention has been devised to solve the above problems. An object of the present invention is to provide a digital audiobook production system, and a method therefor, that can produce an audiobook easily, quickly, in real time, and at lower cost, so that books from around the world can be turned into digital audiobooks as desired.
  • A digital audiobook production system for achieving the above object comprises: a digital audiobook server that concludes a contract with a publisher server holding a copyright, collects the publisher's book data and converts it into digital text data, converts the digital text data into raw digital audio data in the form of a digital audio file, converts the raw digital audio data into an audio file having a tone selected by a user terminal, and provides the converted data to the user terminal; and
  • a plurality of user terminals that access the digital audiobook server through a wired/wireless network to select raw digital audio data or a desired tone, download and execute designated digital audio data obtained by converting the raw digital audio file into an audio file having the desired tone, and upload text data in order to download and execute raw digital audio data or designated digital audio data for that text data.
  • The user terminal is characterized in that, when there is an important passage while listening to the digital audio data, the necessary part of the digital audio data is searched for or extracted through a voice command and stored as audio and text data.
  • The digital audiobook server is characterized in that it translates the collected digital text data, or digital text data uploaded from the user terminal, into various languages and converts the translated digital text data into raw digital audio data or designated digital audio data.
  • the digital audiobook server is characterized in that the digital text data uploaded from the user terminal or collected digital text data is analyzed in full context with artificial intelligence to generate raw digital audio data with tones suitable for the context.
  • The digital audiobook server is characterized in that it divides the users of the user terminals into free general users and paid customer users, manages them accordingly, and provides differentiated digital audiobook services to general users and customer users.
  • the digital audio book server is characterized in that it analyzes the user's usage data through the user terminal with artificial intelligence and recommends digital audio data required or preferred by the user to the user terminal.
  • a user terminal accessing a digital audiobook server through a wired/wireless network and either selecting digital text data provided by the digital audiobook server or uploading digital text data stored on the terminal itself;
  • step (E): when the self-stored digital text data is uploaded in step (A), the desired language is selected and uploaded with it, and the digital audiobook server translates the uploaded digital text data into the selected language to generate translated digital text data, after which step (B) is performed.
  • the digital audiobook server analyzes the entire context with artificial intelligence and designates a tone matching the context.
  • step (F) searching (indexing) or extracting a necessary part through a voice command of a user terminal while executing the designated digital audio data of step (D);
  • FIG. 1 is a block diagram of a digital audiobook production system according to an embodiment of the present invention.
  • FIG. 2 is an internal configuration diagram of the digital audiobook server shown in FIG. 1 .
  • FIG. 3 is a flowchart illustrating a digital audiobook production method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart illustrating a method of indexing, extracting, and storing digital audio data through a voice command while executing digital audio data according to an embodiment of the present invention.
  • Referring to FIGS. 1 and 2, in the digital audiobook production system according to an embodiment of the present invention, a plurality of user terminals 100a, 100b, ..., 100n and a digital audiobook server 300 are connected through a wired/wireless network 200.
  • the user terminals 100a, 100b, ..., 100n are terminals possessed by a user who is provided with a digital audiobook service by accessing the digital audiobook server 300 through the wired/wireless network 200, for example, a PC or a smartphone.
  • The user terminals 100a, 100b, ..., 100n can download and execute raw digital audio data produced by the digital audiobook server 300 and stored in the raw digital audio DB 312, and thereby use an on-demand digital audiobook service that lets the user conveniently and freely listen to a variety of books (digital audio data).
  • The user terminals 100a, 100b, ..., 100n can select a desired tone and have the digital audiobook server 300 convert the raw digital audio data into that tone and store the result in the designated digital audio DB 314.
  • When the user wants to convert text data stored in the user terminal into digital audio data, rather than use a digital audiobook produced in advance, the user terminals 100a, 100b, ..., 100n can upload the text data to the digital audiobook server 300, have it converted into raw digital audio data by the digital audiobook server 300, and then select a desired tone and have it converted to produce customized digital audio data.
  • While executing and listening to digital audio data from the digital audiobook server 300, the user terminals 100a, 100b, ..., 100n can, when there is an important passage, search for or extract the necessary part of the digital audio data through a voice command and save it as audio and text data.
  • The user terminals 100a, 100b, ..., 100n can have text data translated into a desired language through the text data translation function of the translation module 304 of the digital audiobook server 300 and then have the translated text data converted into digital audio data.
  • For example, text data in Korean can be produced as digital audio data as is, or it can be translated into a selected (desired) language and the translated text data produced as digital audio data, so that any text data can be converted into digital audio data in a desired language. The user can also switch to a desired language while listening to digital audio data, which makes the service useful for language learning.
  • For digital audio data that is inaccurate or inappropriate, the user terminals 100a, 100b, ..., 100n can upload improved data to the digital audiobook server 300, where it is accumulated as big data, so that the digital audio data of the digital audiobook server 300 can be continuously upgraded.
  • The digital audiobook server 300 includes a licensing module 301, a TTS module 302, a conversion module 303, a translation module 304, and an analysis module 305. It is connected, by wire or wirelessly, to the publisher server 400, which holds the copyrights, and to the tone server 500, which holds various tones similar to a voice actor's narration.
  • the licensing module 301 concludes an intellectual property right contract or a copyright contract with various publisher servers 400 , collects book data of the publisher, converts it into digital text, and stores and manages it in the digital text DB 310 .
  • the TTS module 302 converts various digital text data in the form of a text file stored in the digital text DB 310 into raw digital audio data in the form of a digital voice file, and stores and manages the data in the raw digital audio DB 312 .
  • the TTS module 302 extracts an optimal prosody model through the TTS algorithm and converts it into a digital voice file form close to natural sound and natural tone.
  • The conversion module 303 converts raw digital audio data in the form of digital audio files stored in the raw digital audio DB 312 into a tone selected from the tone server 500 at the request of the user terminals 100a, 100b, ..., 100n, provides the converted data to the user terminals 100a, 100b, ..., 100n, and stores and manages it in the designated digital audio DB 314.
  • The human ear is very sensitive, so listeners easily tire of repetitive sounds and lose concentration. Because the tone can be changed, a digital audiobook always sounds fresh, and the listener does not easily tire or lose concentration.
  • The translation module 304 translates into various languages the digital text data in the form of text files collected through the licensing module 301 and stored in the digital text DB 310, or the digital text data uploaded from the user terminals 100a, 100b, ..., 100n, and stores and manages the translations in the translated digital text DB 316.
  • the translated digital text data is converted into designated digital audio data through the TTS module 302 and the conversion module 303 according to the request of the user terminal, and stored and managed in the designated digital audio DB 314 .
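The data flow between the modules and databases described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, and the translation, TTS, and tone-conversion steps are trivial placeholders (string tagging and byte encoding) standing in for the real modules 304, 302, and 303. Only the order of the steps and the database each result is stored in follow the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AudiobookServer:
    """Toy stand-in for the digital audiobook server 300 (all names hypothetical)."""
    digital_text_db: dict = field(default_factory=dict)      # digital text DB 310
    raw_audio_db: dict = field(default_factory=dict)         # raw digital audio DB 312
    designated_audio_db: dict = field(default_factory=dict)  # designated digital audio DB 314
    translated_text_db: dict = field(default_factory=dict)   # translated digital text DB 316

    def translate(self, book_id: str, language: str) -> str:
        # Placeholder for translation module 304: tags the text instead of translating it.
        translated = f"[{language}] {self.digital_text_db[book_id]}"
        self.translated_text_db[(book_id, language)] = translated
        return translated

    def text_to_speech(self, key, text: str) -> bytes:
        # Placeholder for TTS module 302: encodes the text instead of synthesizing speech.
        raw = text.encode("utf-8")
        self.raw_audio_db[key] = raw
        return raw

    def convert_tone(self, key, raw: bytes, tone: str) -> bytes:
        # Placeholder for conversion module 303: prefixes a tone marker to the raw audio.
        designated = tone.encode("utf-8") + b"|" + raw
        self.designated_audio_db[(key, tone)] = designated
        return designated

    def produce(self, book_id: str, tone: str, language: Optional[str] = None) -> bytes:
        """Optionally translate, then run TTS, then apply the selected tone."""
        if language is None:
            key, text = book_id, self.digital_text_db[book_id]
        else:
            key, text = (book_id, language), self.translate(book_id, language)
        raw = self.text_to_speech(key, text)
        return self.convert_tone(key, raw, tone)
```

In this sketch, `AudiobookServer(digital_text_db={"b1": "Hello"}).produce("b1", tone="calm", language="en")` stores one entry in each of the translated-text, raw-audio, and designated-audio dictionaries and returns the designated audio bytes.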
  • the analysis module 305 analyzes the user's usage data through the user terminals 100a, 100b, ..., 100n with artificial intelligence, and recommends and provides digital audio data required or preferred by the user.
  • The analysis module 305 also uses artificial intelligence to analyze the entire context of the collected digital text data, or of digital text data uploaded from the user terminals 100a, 100b, ..., 100n, and generates raw (basic) digital audio data with the tone that best suits the context. For conversational texts in particular, it generates digital audio data with the tones that best match the context, so that the result sounds as natural as possible.
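The source does not say how the analysis module 305 derives recommendations from usage data. As a purely illustrative stand-in for the artificial intelligence it mentions, the Python sketch below recommends unplayed titles from the genres a user has played most often. The function name, the `catalog` mapping of title id to genre, and the frequency heuristic are all assumptions, not the method of the invention.

```python
from collections import Counter
from typing import Dict, List


def recommend(history: List[str], catalog: Dict[str, str], limit: int = 3) -> List[str]:
    """Recommend unplayed titles, favoring genres the user has played most.

    history: ids of titles the user has played (hypothetical usage data).
    catalog: title id -> genre (hypothetical metadata).
    """
    genre_counts = Counter(catalog[i] for i in history if i in catalog)
    played = set(history)
    # Keep only unplayed titles whose genre the user has played at least once.
    candidates = [i for i in catalog if i not in played and genre_counts[catalog[i]] > 0]
    # Most-played genre first; ties broken by title id for a stable order.
    candidates.sort(key=lambda i: (-genre_counts[catalog[i]], i))
    return candidates[:limit]
```

A real system would presumably use richer signals (listening time, tone choices, language switches), but the interface of usage data in and a ranked list of digital audio data out is the part that corresponds to the description.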
  • The digital audiobook server 300 operates and manages a digital text DB 310, a raw digital audio DB 312, a designated digital audio DB 314, and a translated digital text DB 316.
  • The digital audiobook server 300 can classify the users of the user terminals 100a, 100b, ..., 100n into free general users, who do not pay digital audiobook service fees, and paid customer users, who do; store and manage them in a user DB (not shown); and provide different digital audiobook services to general users and customer users.
  • For example, the daily or per-session usage of a general user may be limited while that of a customer user is not, and the translation service of the translation module 304 or the recommendation service of the analysis module 305 may be withheld from general users while being provided to customer users.
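The differentiated service for general and customer users can be sketched as a simple entitlement check. This is a minimal Python illustration under stated assumptions: the names `Tier`, `User`, `may_play`, and `may_use_premium_service` are hypothetical, and the daily cap of 3 is an invented placeholder, since the source gives no concrete limit.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    GENERAL = "general"    # free user, pays no service fee
    CUSTOMER = "customer"  # paid user


# Hypothetical daily cap for general users; the source does not state a number.
GENERAL_DAILY_LIMIT = 3


@dataclass
class User:
    name: str
    tier: Tier
    used_today: int = 0


def may_play(user: User) -> bool:
    """Customer users are unlimited; general users are capped per day."""
    if user.tier is Tier.CUSTOMER:
        return True
    return user.used_today < GENERAL_DAILY_LIMIT


def may_use_premium_service(user: User) -> bool:
    """Translation (module 304) and recommendation (module 305) in this sketch."""
    return user.tier is Tier.CUSTOMER
```

Keeping the checks as small pure functions means the server can call them before each request without touching the user DB logic.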
  • FIG. 3 is a flowchart illustrating a digital audiobook production method according to an embodiment of the present invention.
  • The user accesses the digital audiobook server 300 through the wired/wireless network 200 using the user terminals 100a, 100b, ..., 100n (S302), logs in, and either selects the digital text data to be used from the digital text DB 310 or uploads a book (digital text data) stored in the user terminals 100a, 100b, ..., 100n (S304).
  • The TTS module 302 converts the digital text data in the form of a text file into raw digital audio data in the form of a digital voice file and stores it in the raw digital audio DB 312 (S306).
  • Artificial intelligence can analyze the entire context and designate the optimal tone that best matches the context.
  • The conversion module 303 converts the raw digital audio data into the selected tone (that is, into designated digital audio data), stores it in the designated digital audio DB 314, and provides it to the user terminals 100a, 100b, ..., 100n (S310).
  • the user terminals 100a, 100b, ..., 100n use the digital audio book service by executing the provided designated digital audio data.
  • the user may select and upload a desired language (S320).
  • The translation module 304 of the digital audiobook server 300 translates the uploaded digital text data into the selected language to generate translated digital text data (S322), and the subsequent steps, starting from step S306, are carried out on the translated digital text data.
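The order of the steps in FIG. 3, including the optional translation branch, can be summarized in a short Python sketch. The function name `plan_steps` and the step descriptions are illustrative assumptions; only the step numbers that appear in the text (S302, S304, S306, S310, S320, S322) are used, and the sketch assumes that the translation branch applies only when the user uploads their own text together with a target language.

```python
from typing import List, Optional


def plan_steps(uploads_own_text: bool, language: Optional[str] = None) -> List[str]:
    """Return the ordered step labels for one production request."""
    steps = [
        "S302 access the server and log in",
        "S304 select digital text from the DB or upload own text",
    ]
    # Translation branch: taken only for uploaded text with a chosen language.
    if uploads_own_text and language is not None:
        steps.append(f"S320 select and upload target language: {language}")
        steps.append("S322 translate the uploaded text")
    steps.append("S306 TTS: convert text to raw digital audio")
    steps.append("S310 convert to the selected tone and provide to the terminal")
    return steps
```

For an uploaded text with `language="en"`, the plan inserts S320 and S322 between S304 and S306; otherwise the flow goes straight from S304 to S306.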
  • FIG. 4 is a flowchart illustrating a method of indexing, extracting, and storing digital audio data through a voice command while executing digital audio data according to an embodiment of the present invention.
  • In step S312 of FIG. 3, that is, while the user is receiving the digital audiobook service through the user terminals 100a, 100b, ..., 100n, when the user wants to save part of the designated digital audio data (S402), the user issues a voice command (Search by Voice Command) to either search for (Indexing) or extract (Copy) the desired part.
  • The extracted digital audio/text data is stored in the user terminals 100a, 100b, ..., 100n (Paste) (S406), and the stored digital audio/text data is translated (S408).
  • the indexed digital audio/text data is repeatedly executed or stored in the user terminals 100a, 100b, ..., 100n (Paste) (S410), and the stored digital audio/text data is translated. (S412).
  • As described above, the present invention converts all books into digital audio data through a digital audiobook server to produce on-demand digital audiobooks, lets the user download a digital audiobook to which the user's desired tone has been selectively applied as a designated type, and lets the user access the digital audiobook server from a user terminal, upload text data, and have it converted into customized digital audio data for download.
  • the necessary part of the digital audio data can be extracted and recorded as audio and text data through the first voice command, and the data can be repeatedly reproduced by searching for a keyword through the second voice command.
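The two voice-command operations, extracting the passage being played and searching for a keyword for repeated playback, can be sketched in Python as follows. The sketch assumes that the audio is already aligned with its text as timed segments and that the voice command has already been recognized; the names `Segment`, `extract_at`, and `index_keyword` are hypothetical, and speech recognition itself is out of scope.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # text aligned with this stretch of audio


def extract_at(segments: List[Segment], position: float) -> Segment:
    """'Copy': return the segment playing at the given playback position."""
    for s in segments:
        if s.start <= position < s.end:
            return s
    raise ValueError("no segment at this position")


def index_keyword(segments: List[Segment], keyword: str) -> List[Tuple[float, float]]:
    """'Indexing': return the time ranges whose text contains the keyword."""
    kw = keyword.lower()
    return [(s.start, s.end) for s in segments if kw in s.text.lower()]
```

The extracted segment carries both the time range (audio) and the text, which matches the idea of saving the passage as audio and text data; the time ranges returned by the keyword search can be fed to a player for repeated playback.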

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Tourism & Hospitality (AREA)
  • Primary Health Care (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention relates to a digital audiobook production system and a method therefor. More specifically, the present invention relates to a system, and a method therefor, for converting text data into digital audio data so as to produce an on-demand digital audiobook, selectively applying a tone desired by a user as a designated type so as to produce a digital audiobook, and having the user upload text data so as to produce a customized digital audiobook. A digital audiobook production system according to an embodiment of the present invention comprises: a digital audiobook server that concludes a contract with a publisher server holding a copyright, collects book data from the corresponding publisher to convert the collected book data into digital text data, converts the digital text data into raw digital audio data in the form of a digital audio file, converts the raw digital audio data into an audio file having a tone selected by a user terminal, and provides it to the user terminal; and a plurality of user terminals that access the digital audiobook server through a wired/wireless network to select raw digital audio data or a desired tone, download and execute designated digital audio data obtained by converting the raw digital audio file into an audio file having the desired tone, and upload text data in order to download and execute raw digital audio data or designated digital audio data for the text data.
PCT/KR2021/012649 2020-10-27 2021-09-16 Digital audiobook production system and method therefor WO2022092565A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020200139958A KR102465504B1 (ko) 2020-10-27 2020-10-27 디지털 오디오북 제작시스템 및 그 방법
KR10-2020-0139958 2020-10-27

Publications (1)

Publication Number Publication Date
WO2022092565A1 2022-05-05

Family

ID=81384169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/012649 WO2022092565A1 (fr) 2020-10-27 2021-09-16 Digital audiobook production system and method therefor

Country Status (2)

Country Link
KR (1) KR102465504B1 (fr)
WO (1) WO2022092565A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070118408A (ko) * 2006-06-12 2007-12-17 에스케이 텔레콤주식회사 오디오북 서비스 제공 방법 및 시스템
US20100235729A1 (en) * 2009-03-16 2010-09-16 Kocienda Kenneth L Methods and Graphical User Interfaces for Editing on a Multifunction Device with a Touch Screen Display
KR20100132866A (ko) * 2009-06-10 2010-12-20 엘지전자 주식회사 이동 단말기 및 그 제어방법
KR20120108197A (ko) * 2011-03-23 2012-10-05 에스케이플래닛 주식회사 오디오 파일 동기화를 위한 번역 서비스 시스템 및 그 방법
KR20130117996A (ko) * 2012-04-19 2013-10-29 정지훈 스마트 러닝 서비스를 제공하기 위한 시스템 및 그 방법

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101789057B1 (ko) 2016-06-17 2017-10-23 한밭대학교 산학협력단 시각 장애인을 위한 자동 오디오 북 시스템 및 그 운영 방법

Also Published As

Publication number Publication date
KR20220055644A (ko) 2022-05-04
KR102465504B1 (ko) 2022-11-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21886559

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.09.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21886559

Country of ref document: EP

Kind code of ref document: A1