KR930010781A - Document reading system - Google Patents


Publication number
KR930010781A
Authority
KR
South Korea
Prior art keywords
character
image data
string
image
phoneme
Prior art date
Application number
KR1019910020922A
Other languages
Korean (ko)
Inventor
곽동후
Original Assignee
정용문
삼성전자 주식회사
Priority date
1991-11-22
Filing date
1991-11-22
Publication date
1993-06-23
Application filed by 정용문, 삼성전자 주식회사
Priority to KR1019910020922A
Publication of KR930010781A


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The invention provides a document reading system that reads a manuscript, recognizes its character information, and then synthesizes the recognized characters into speech. To this end, the image information of the manuscript is converted into binarized image data in units of pixels. The image data is then received and extracted one character string at a time, and the image state of each extracted string is analyzed to recognize the individual characters. The character recognition data is then received, the phoneme database is combined in real time with the received character codes, sound characteristics such as loudness, pitch, voiced/unvoiced information, and vocal tract coefficients are obtained, and speech is synthesized from them and output.
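The first two stages of the abstract (pixel-level binarization, then extraction of one character string at a time) can be sketched as follows. Claim 1 names a horizontal histogram as the means of extracting strings; everything else here is an assumption made for illustration: the fixed threshold of 128, the `min_ink` parameter, the list-of-lists image representation, and the function names are hypothetical and do not come from the application.

```python
from typing import List, Tuple

# Assumed representation: rows of grayscale values, 0 (black) to 255 (white).
Image = List[List[int]]


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (ink) if darker than the threshold, else 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def horizontal_histogram(binary: Image) -> List[int]:
    """Count the ink pixels in every pixel row."""
    return [sum(row) for row in binary]


def extract_strings(binary: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per text string.

    A string is taken to be a maximal run of consecutive rows whose
    histogram value is at least min_ink (an assumed criterion).
    """
    bands: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(horizontal_histogram(binary)):
        if count >= min_ink and start is None:
            start = y                    # first inked row of a new string
        elif count < min_ink and start is not None:
            bands.append((start, y))     # blank row closes the string
            start = None
    if start is not None:                # string that touches the bottom edge
        bands.append((start, len(binary)))
    return bands
```

Each returned band would then be cropped from the page and handed to the character recognition stage. A real scanner image would likely need an adaptive threshold and noise filtering, which this sketch leaves out.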

Description

Document reading system

Because this publication discloses only the essential parts of the application, the full text is not included.

FIG. 1 is a block diagram of the document reading system.

FIG. 2 is a flowchart of character recognition.

FIG. 5 is a block diagram of the speech synthesizer.

Claims (4)

1. A document reading method in a document reading system, comprising: a scanning step of converting image information of a manuscript into binarized image data in units of pixels; a step of receiving the image data, extracting it one character string at a time using a horizontal histogram, and analyzing the image state of each extracted string to recognize the individual characters; and a step of receiving the character recognition data, combining the phoneme database with the received character codes, obtaining sound characteristics such as loudness, pitch, voiced/unvoiced information, and vocal tract coefficients, and synthesizing speech on that basis.

2. The method of claim 1, wherein the character recognition step comprises: extracting one character string from the received image data according to the distribution of pixel counts; analyzing the pixel counts within the extracted string to separate it into character regions; classifying the character type of each character region after the separation; if the character is classified as Hangul, classifying its Hangul type according to its grapheme (jamo) composition; separating the graphemes of the character after the Hangul type classification; recognizing the vowels and consonants of the separated graphemes and then generating the code of the character; if the character is classified as English, a coarse classification step of determining whether it is uppercase or lowercase; and a fine classification step of recognizing the English letter after the coarse classification and generating its English character code.

3. The method of claim 1 or 2, wherein the speech synthesis step comprises: parsing the syntax of the received character codes to determine the classification and type of each sentence, searching for the phrase-break (spaced reading) state, and processing characters, symbols, and the like; after the parsing, processing the synthesis units for prosody processing and processing the connection intervals and pauses, the sentence intonation, and the accent loudness; after the prosody processing, performing silence processing and geminate-consonant (촉음) processing and then connection processing; and synthesizing a speech signal after the connection processing.

4. A document reading apparatus in a document reading system, comprising: an image scanner that converts image information of a manuscript into binarized image data in units of pixels; a character recognizer that receives the image data, extracts it one character string at a time, and analyzes the image state of each extracted string to recognize the individual characters; and a speech synthesizer that receives the character recognition data, combines the phoneme database with the received character codes, obtains sound characteristics such as loudness, pitch, voiced/unvoiced information, and vocal tract coefficients, and synthesizes speech on that basis.

※ Note: This publication is based on the application as originally filed.
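The claims describe recognizing the consonants and vowels of a Hangul character and then generating a character code, which the synthesizer later matches against a phoneme database. A minimal sketch of that code generation and its inverse is given below. It assumes Unicode Hangul syllable codes, which the application does not specify (a 1991 system may well have used a different Korean code set), and the function names `compose` and `decompose` are hypothetical.

```python
from typing import Tuple

# Jamo tables in Unicode Hangul syllable order (19 initials, 21 medials,
# 27 finals plus "no final"). Using Unicode here is an assumption.
INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
MEDIALS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

SYLLABLE_BASE = 0xAC00  # code point of the first precomposed syllable
SYLLABLE_LAST = 0xD7A3  # code point of the last precomposed syllable


def compose(initial: str, medial: str, final: str = "") -> str:
    """Build one syllable from recognized jamo (the 'generate code' step).

    Raises ValueError if a jamo is not valid in its position.
    """
    i = INITIALS.index(initial)
    m = MEDIALS.index(medial)
    f = FINALS.index(final)
    return chr(SYLLABLE_BASE + (i * len(MEDIALS) + m) * len(FINALS) + f)


def decompose(syllable: str) -> Tuple[str, str, str]:
    """Split one syllable back into jamo, e.g. as keys for a phoneme lookup."""
    code = ord(syllable)
    if not SYLLABLE_BASE <= code <= SYLLABLE_LAST:
        raise ValueError("not a precomposed Hangul syllable")
    offset = code - SYLLABLE_BASE
    i, rest = divmod(offset, len(MEDIALS) * len(FINALS))
    m, f = divmod(rest, len(FINALS))
    return INITIALS[i], MEDIALS[m], FINALS[f]
```

For example, `compose("ㅎ", "ㅏ", "ㄴ")` yields the syllable 한, and `decompose("글")` yields the three jamo ㄱ, ㅡ, ㄹ. The jamo returned by `decompose` are spelling units, not pronunciations; mapping them to phoneme data and applying pronunciation rules is a separate step that this sketch does not attempt.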

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019910020922A KR930010781A (en) 1991-11-22 1991-11-22 Document reading system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019910020922A KR930010781A (en) 1991-11-22 1991-11-22 Document reading system

Publications (1)

Publication Number Publication Date
KR930010781A 1993-06-23

Family

ID=67348378

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019910020922A KR930010781A (en) 1991-11-22 1991-11-22 Document reading system

Country Status (1)

Country Link
KR (1) KR930010781A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100387232B1 (en) * 1996-10-31 2003-07-22 삼성전자주식회사 Apparatus and method for generating korean prosody
KR100472215B1 (en) * 2002-01-22 2005-03-08 다인정보통신(주) A voice recorder having function of image scanner and processing data thereof


Similar Documents

Publication Publication Date Title
CN1260704C (en) Method for voice synthesizing
EP1377964B1 (en) Speech-to-speech generation system and method
CN112151005B (en) Chinese and English mixed speech synthesis method and device
KR970029143A (en) Text Recognition Translation System and Voice Recognition Translation System
KR20090040014A (en) Apparatus and method for synchronizing text analysis-based lip shape
De Zoysa et al. Project Bhashitha-Mobile based optical character recognition and text-to-speech system
CN111243597A (en) Chinese-English mixed speech recognition method
KR930010781A (en) Document reading system
CN114999447A (en) Speech synthesis model based on confrontation generation network and training method
Madre et al. OCR based image text to speech conversion using MATLAB
Brøndsted et al. A system for recognition of hummed tunes
JP2813209B2 (en) Large vocabulary speech recognition device
Colaco et al. Design and implementation of Konkani text to speech generation system using OCR technique
US20080249776A1 (en) Methods and Arrangements for Enhancing Machine Processable Text Information
Jose et al. Malayalam Text-to-Speech
Sherpa et al. Pioneering Dzongkha text-to-speech synthesis
Grassini Italianising English words with G2P techniques in TTS voices. An evaluation of different models
Gupta et al. Image to text to speech: a web-based application using optical character recognition and speech synthesis
Kamath et al. Kannada Text-to-Speech System using MATLAB
D’souza Kannada Text-to-Speech System using MATLAB
Talukder et al. An efficient speech generation method based on character and modifier of Bangla PDF Document
CN116580696A (en) Speech stream synthesis method and device based on emotion recognition
JPS61249099A (en) Voice recognition equipment
Mukherjee et al. Context based speech analysis of Bengali Language as a part of TTS conversion
Huda et al. Bangla Speech Synthesizer System for Bangladesh

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E601 Decision to refuse application