EP0218723A1 - Data card system for initializing spoken-word recognition units - Google Patents

Data card system for initializing spoken-word recognition units

Info

Publication number
EP0218723A1
Authority
EP
European Patent Office
Prior art keywords
data
card
spoken
words
spots
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP86903718A
Other languages
English (en)
French (fr)
Inventor
Jerome Drexler
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Drexler Technology Corp
Original Assignee
Drexler Technology Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Drexler Technology Corp
Publication of EP0218723A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G07 CHECKING-DEVICES
    • G07C TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C9/00 Individual registration on entry or exit
    • G07C9/20 Individual registration on entry or exit involving the use of a pass
    • G07C9/22 Individual registration on entry or exit involving the use of a pass in combination with an identity check of the pass holder
    • G07C9/25 Individual registration on entry or exit involving the use of a pass in combination with an identity check of the pass holder using biometric data, e.g. fingerprints, iris scans or voice recognition
    • G07C9/257 Individual registration on entry or exit involving the use of a pass in combination with an identity check of the pass holder using biometric data, e.g. fingerprints, iris scans or voice recognition electronically

Definitions

  • The invention relates to spoken-word recognition systems.
  • Suzuki, et al. (U.S. patents 4,060,694, 4,078,154 and 4,100,370) teach a voice recognition system in which the phonemes as spoken by different speakers and the voice of the person speaking can be recognized. A key phrase is spoken. Parallel filters derive a spectral characteristic parameter which contains weighting factors extracted and compared with the selected phoneme in memory. Improved specificity over other speakers can be obtained by varying the weighting factors through a number of different values, and storing in memory the set of parameters for each sound as spoken by a specific speaker. The system can be used thus for voice Feedback ⁇ tion.
  • Felix et al. disclose a method for identifying an individual using a combination of speech and face recognition.
  • The voice signature of a person uttering a key word into a microphone is compared in a pattern matcher with the previously stored voice signature of a known person uttering the same key word.
  • a momentary image of that person's mouth region is recorded and compared with that of the same known person.
  • the results of the comparison are analyzed to verify that the identity of the speaker is that of the known person.
  • Katayama U.S. patent 4,461,023 discloses a method of storing spoken words for use in a speech recog-
  • An object of the invention is to devise a spoken word recognition system which is of reduced complexity and which can be quickly and easily programmed to understand any individual's voice commands more readily.
  • Another object of the invention is to devise a system in which a voice command unit can be initialized easily by each individual user without needing a knowledge of programming or of the unit's operation.
  • a set of words is spoken by a user into a microphone.
  • the spoken words are analyzed and speech characteristics are extracted.
  • Such characteristics include pitch, intonation, speed of speaking, accent parameters, and other parameters.
  • A spoken-word recognition unit receives a user's voice message and identifies the words with the help of the speaker's voice characteristics in its memory, which was initialized by the spoken-word identification data on a card. In this manner, the spoken words can be recognized. For each new speaker, the unit must first be "taught" a particular speaker's characteristics so that the unit can more easily recognize the spoken words. The card provides the information to teach the unit. A record of an individual's speech characteristics is laser recorded on a card, which is later read into the unit by placing it in a card reader, and the characteristics are entered into the short-term memory of the spoken-word recognition unit.
  • the card has a strip of laser recording material, such as the reflective direct-read-after-write material described in U.S. patent 4,284,716 to Drexler et al.
  • a modulated laser beam records data on the strip, in situ, by ablation, melting, physical or chemical change or deformation, thereby forming spots having a detectable change in an optical characteristic relative to the strip.
  • The recording process on the above-mentioned direct-read-after-write material produces differences in reflectivity detectable by a light detector. No processing after laser recording is required when the recording strip is a direct-read-after-write material.
  • Laser recording materials also may be used that require heat processing after laser recording.
  • Each person has his own speech characteristics, in much the same way that each person has his own set of fingerprints.
  • the card with the recorded speech characteristics is read by shining a laser beam or light emitting diode onto the strip.
  • The beam typically has an intensity of ten percent of the recording intensity.
  • the beam is reflected from the strip to a photodetector.
  • The detector detects the contrast in optical characteristics between the strip and the recorded spots, and transmits corresponding signals to the speech recognition unit's short-term memory.
  • The system is now ready to listen to words spoken by the user and to identify the words with the help of the speaker's voice characteristics stored in the memory. By this procedure the speaker's words are more clearly identified.
  • The uniform surface reflectivity of this reflective strip before recording typically would range between 8% and 65%.
  • the average reflectivity over a laser recorded spot might be in the range of 5% to 25%.
  • the reflective contrast ratio of the recorded spots would range between 2:1 and 7:1.
  • Laser recording materials are known in the art that create either low reflectivity spots in a reflective field or high reflective spots in a low reflectivity field. An example of the latter type is described in U.S. patent 4,343,879. When the reflectivity of the field is in the range of 8% to 20% the reflective spots have a reflectivity of about 40%.
  • The reflective contrast ratio would range from 2:1 to 5:1. Photographic pre-formatting would create spots having a 10% reflectivity in a reflective field or 40% in a low reflectivity field.
  • The voice information on the card would typically be in digital form. It would inform the word recognition unit of macro aspects of speech such as accent parameters, speed of speaking, dropping of "th" beginnings or "g" endings, variations in intensity, as well as the micro aspects such as tone, pitch, intonation, etc. With this advance knowledge about the speech characteristics of the words about to be spoken, the words can more easily be recognized.
  • The card can store tens, hundreds or even thousands of deviation parameters from a "normal" voice. When a word is not understood the word interpreter unit would add in corrections to the unidentified word based
  • Fig. 1 is a schematic diagram of the spoken-word recognition system of the present invention.
  • Fig. 2 is a schematic diagram of the data card encoding of the present invention.
  • Fig. 3 is a plan view of one side of a data card in accord with the present invention.
  • Fig. 4 is a partial side sectional view taken along lines 4-4 in Fig. 3.
  • Fig. 5 is a detail of laser writing on a portion of the laser recording strip illustrated by dashed lines in Fig. 3.
  • Fig. 6 is a plan view of an apparatus for reading and writing on the optical recording media strip illustrated in Fig. 3.
  • A spoken-word recognition system 10 reads a person's voice characteristics from a wallet-size card 31 containing a strip of laser recordable material. Each person would have a card 31 with his own speech characteristics prerecorded thereon.
  • The system 10 is initialized with respect to the particular voice characteristics of the card owner by inserting the card 31 into system 10. A sufficient number of characteristics is recorded so that words spoken by a particular speaker may be identified.
  • a data card encoding system 110 is used to form a card 131.
  • a set of words 116 is spoken by a person into a microphone 117.
  • the resulting signal is analyzed by a speech analyzer 121 and speech characteristics 122 are extracted.
  • Such characteristics 122 include pitch, formants, ratio of voiced to unvoiced amplitudes, and other parameters used to help identify words and parts of words.
  • Speech analyzer 121 sends a digital signal 122 representing a person's speech characteristics to a data card writer/reader 129, which writes the data onto card 131 by shining a modulated laser beam 130 onto the card.
  • the card 131 has a strip of optical contrast laser recording material disposed thereon.
  • The beam 130 records data onto the card 131, in situ, by ablation, melting, physical or chemical change or deformation, thereby forming spots with contrasting reflectivity relative to the unrecorded strip.
  • Reflected beam 132 is read by the card reader/writer 129 to confirm laser writing.
  • the spoken-word recognition system 10 is initialized by placing a prerecorded card 31 in data card reader 29.
  • the card reader 29 shines a light beam 30 from a laser or a LED onto the prerecorded strip.
  • This read beam typically has an intensity of five to ten percent of the typical semiconductor laser recording intensity.
  • The light beam 32 is reflected from the strip to a photodetector, which detects the contrast in reflectivity between the strip and the recorded spots.
  • Card reader 29 transmits a signal 24 corresponding to the recorded data to the short-term memory of the spoken-word recognition unit 23.
  • the system 10 is now ready to listen to words
  • The words 16 spoken by the user are analyzed and interpreted by the speech recognition unit 23 with respect to the voice characteristics 24, now stored in its short-term memory.
  • the words 16 are recognized and the result is sent to an output device 27, such as a CRT terminal.
  • A data card 11 is illustrated having a size common to most credit cards.
  • The width dimension of such a card is approximately 54 mm and the length dimension is approximately 85 mm. These dimensions are not critical, but preferred because such a size easily fits into a wallet and has historically been adopted as a convenient size for automatic teller machines and the like.
  • the card's base 13 is a dielectric, usually a plastic material such as polyvinyl chloride or similar material. Polycarbonate plastic is preferred.
  • the surface finish of the base should have low specular reflectivity, preferably less than 10%.
  • Base 13 carries strip 15.
  • the strip is about 16 or 35 millimeters wide and extends the length of the card. Alternatively, the strip may have other sizes and orientations.
  • The strip is relatively thin, approximately 60-200 microns, although this is not critical.
  • the strip may be applied to the card by any convenient method which achieves flatness.
  • The strip is adhered to the card with an adhesive and covered by a transparent laminating sheet 19 which serves to keep strip 15 flat, as well as protecting the strip from dust and scratches.
  • Sheet 19 is a thin, transparent plastic sheet laminating material or a coating, such as a transparent lacquer.
  • the material is preferably made of polycarbonate plastic.
  • the opposite side of base 13 may have user identification indicia embossed on the surface of the card. Other indicia such as card number and the like may be optionally provided.
  • The high resolution laser recording material which forms strip 15 may be any of the reflective recording materials which have been developed for use as direct read-after-write (DRAW) optical disks, so long as the materials can be formed on thin substrates.
  • An advantage of reflective materials over transmissive materials is that the read/write equipment is all on one side of the card, the data storage capacity is doubled, and the automatic focus is easier.
  • Materials which are preferred are those having high reflectivity and low melting point, particularly Cd, Sn, Tl, In, Bi and amalgams. Suspensions of reflective metal particles in organic colloids also form low melting temperature laser recording media. Silver is one such metal. Typical recording media are described in U.S. patents Nos. 4,314,260, 4,298,684, 4,278,758, 4,278,756 and 4,269,917, all assigned to the assignee of the present invention.
  • The laser recording material which is selected should be compatible with the laser which is used for writing on it. Some materials are more sensitive than others at certain wavelengths. Good sensitivity to infrared light is preferred because infrared is affected least by scratches and dirt on the transparent laminating sheet.
  • The selected recording material should have a favorable signal-to-noise ratio and form high contrast data bits with the read/write system with which it is used.
  • The material should not lose data when subjected to temperatures of about 122°F (50°C) for long periods.
  • The material should also be capable of recording at speeds of at least several thousand bits/sec. This generally precludes the use of materials that require long heating times or that rely on slow chemical reactions in the presence of heat, which may permit recording of only a few bits/sec.
  • a large number of highly reflective laser recording materials have been used for optical data disk applications.
  • Data is recorded by forming spots in the surrounding field of the reflective layer itself, thereby altering the reflectivity in the data spot.
  • Data is read by detecting the optical reflective contrast between the surrounding reflective field of unrecorded areas and the recorded spots. Spot reflectivity of less than half the reflectivity of the surrounding field produces a contrast ratio of at least two to one, which is sufficient contrast for reading. Greater contrast is preferred.
  • Reflectivity of the strip field of about 50% is preferred with reflectivity of a spot in the reflective field being less than 10%, thus creating a contrast ratio of greater than five to one.
  • Data may also be recorded by increasing the reflectivity of the strip.
  • the recording laser can melt a field of dull microscopic spikes on the strip to create flat shiny spots. This method is described in SPIE, Vol. 329, Optical Disk Technology (1982), p. 202.
  • A spot reflectivity of more than twice the surrounding spiked field reflectivity produces a contrast ratio of at least two to one, which is sufficient contrast for reading.
  • the dashed line 33 corresponds to the dashed line 33 in Fig. 3.
  • the oblong spots 35 are aligned in a path and have generally similar dimensions.
  • the spots are generally circular or oval in shape with the axis of the oval perpendicular to the lengthwise dimension of the strip.
  • a second group of spots 37 is shown aligned in a second path.
  • the spots 37 have similar dimensions to the spots 35.
  • the spacing between paths is not critical, except that the optics of the readback system should be able to easily distinguish between paths.
  • tracks which are separated by only a few microns may be resolved. The spacing and pattern of the spots along each path is selected for easy decoding.
  • The spots illustrated in Fig. 5 have a recommended size of approximately 5 microns by 20 microns, or circular spots 5 microns or 10 microns in diameter.
  • the smallest dimension of a spot should be less than 50 microns. In the preferred embodiment the largest dimension would also be less than 50 microns.
  • the size of the strip 15 could be expanded to the point where it covers a large extent of the card.
  • the laser recording strip 15 could completely cover a single side of the card.
  • a minimum information capacity of 250,000 bits is indicated and a storage capacity of over one million bits is preferable.
  • A side view of the lengthwise dimension of a card 41 is shown inserted into card reader/writer 29.
  • the card is usually received in a movable holder 42 which brings the card into the beam trajectory.
  • A laser light source 43, preferably a pulsed semiconductor laser of near infrared wavelength, emits a beam 45 which passes through collimating and focussing optics 47.
  • the beam is sampled by a beam splitter 49 which transmits a portion of the beam through a focusing lens 51 to a photodetector 53.
  • the detector 53 confirms laser writing and is not essential.
  • the beam is then directed to a first servo controlled mirror 55 which is mounted for rotation along the axis 57 in the direction indicated by the arrows A.
  • The purpose of the mirror 55 is to find the lateral edges of the laser recording material in a coarse mode of operation and then, in a fine mode of operation, identify data paths which exist predetermined distances from the edges. From mirror 55, the beam is directed toward mirror 61. This mirror is mounted for rotation at pivot 63. The purpose of mirror 61 is for fine control of motion of the beam along the length of the card. Coarse control of the lengthwise position of the card relative to the beam is achieved by motion of movable holder 42. The position of the holder may be established by a linear motor adjusted by a closed loop position servo system of the type used in magnetic disk drives.
  • The card may be prerecorded with database information or a preinscribed pattern containing servo tracks, timing marks, program instructions, and related functions. These positioning marks can be used as a reference for the laser recording
  • U.S. patent No. 4,304,848 describes how formatting may be done photolithographically. Formatting may also be done using laser recording or surface molding of the servo tracks, timing marks, programming and related functions. Dil, in U.S. patent 4,209,804, teaches a type of surface molding. Reference position information may be prerecorded on the card so that position error signals may be generated and used as feedback in motor control. Upon reading one data path, the mirror 55 is slightly rotated. The motor moves holder 42 lengthwise so that the path can be read, and so on.
  • the beam should deliver sufficient laser pulse energy to the surface of the recording material to create spots. Typically, 5-20 milliwatts is required, depending on the recording material.
  • A 20 milliwatt semiconductor laser, focussed to a five micron beam size, records at temperatures of about 200°C and is capable of creating spots in about 75 microseconds.
  • the wavelength of the laser should be compatible with the recording material. In the read mode, power is lowered to about 5% to 10% of the record power.
  • Optical contrast between a spot and the surrounding field is detected by light detector 65, which may be a photodiode.
  • Light is focussed onto detector 65 by beam splitter 67 and focusing lens 69.
  • Servo motors, not shown, control the positions of the mirrors and drive the mirrors in accord with instructions received from control circuits, as well as from feedback devices.
  • The detector 65 produces electrical signals corresponding to spots. These signals are processed by the spoken-word recognition unit and used for identifying words spoken by a particular speaker.
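
The contrast ratios quoted in the description (for example a 50% field with 10% spots, or an 8% to 20% field with 40% spots) can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes the contrast ratio is simply the brighter reflectivity divided by the darker one, with the two-to-one figure from the description used as the minimum for reading.

```python
def contrast_ratio(field_reflectivity: float, spot_reflectivity: float) -> float:
    """Return brighter/darker reflectivity (both given as fractions, e.g. 0.50).

    Assumption: the ratio is symmetric, so it covers both dark spots in a
    reflective field and shiny spots in a dull field.
    """
    if field_reflectivity <= 0 or spot_reflectivity <= 0:
        raise ValueError("reflectivities must be positive fractions")
    brighter = max(field_reflectivity, spot_reflectivity)
    darker = min(field_reflectivity, spot_reflectivity)
    return brighter / darker


def is_readable(field_reflectivity: float, spot_reflectivity: float,
                minimum_ratio: float = 2.0) -> bool:
    """True when the contrast meets the two-to-one minimum stated in the text."""
    return contrast_ratio(field_reflectivity, spot_reflectivity) >= minimum_ratio


if __name__ == "__main__":
    # Preferred case from the description: 50% field, 10% spots.
    print(contrast_ratio(0.50, 0.10))
    # Shiny spots (about 40%) in a low reflectivity field (8% and 20%).
    print(contrast_ratio(0.08, 0.40), contrast_ratio(0.20, 0.40))
```

With the quoted values, the 50%/10% pair and the 8%/40% pair both come out at about five to one, and the 20%/40% pair at two to one, which matches the ranges given in the description.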
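
The laser figures in the description (a 20 milliwatt laser, spots formed in about 75 microseconds, and a read power of about 5% to 10% of the record power) imply a per-spot energy and a read power range. The following is a rough sketch, not a model of the actual apparatus: the function names are hypothetical, and it assumes the full laser power reaches the recording surface for the whole pulse.

```python
from typing import Tuple


def pulse_energy_joules(power_watts: float, pulse_seconds: float) -> float:
    """Energy delivered by a constant-power pulse (assumes no optical losses)."""
    return power_watts * pulse_seconds


def read_power_range_watts(record_power_watts: float,
                           low_fraction: float = 0.05,
                           high_fraction: float = 0.10) -> Tuple[float, float]:
    """Read power as a fraction of record power; 5% to 10% per the description."""
    return (record_power_watts * low_fraction,
            record_power_watts * high_fraction)


if __name__ == "__main__":
    energy = pulse_energy_joules(20e-3, 75e-6)   # 20 mW for 75 microseconds
    low, high = read_power_range_watts(20e-3)
    print(f"energy per spot: {energy * 1e6:.2f} microjoules")
    print(f"read power: {low * 1e3:.1f} to {high * 1e3:.1f} milliwatts")
```

Under these assumptions a spot takes about 1.5 microjoules to record, and the read beam runs at roughly 1 to 2 milliwatts.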
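
The read step, in which the detector senses the reflectivity contrast between the field and the recorded spots and produces corresponding electrical signals, can be sketched as a simple threshold decision. This is a minimal illustration under stated assumptions that are not in the source: one reflectivity sample per bit position, a threshold midway between the field and spot reflectivities, and a recorded spot standing for a 1 bit. The name `decode_track` is hypothetical.

```python
from typing import List, Sequence


def decode_track(samples: Sequence[float],
                 field_reflectivity: float,
                 spot_reflectivity: float) -> List[int]:
    """Turn reflectivity samples along one data path into bits.

    Assumptions (not from the source): one sample per bit position,
    a midpoint threshold, and spot = 1, unrecorded field = 0.
    """
    threshold = (field_reflectivity + spot_reflectivity) / 2.0
    spots_are_darker = spot_reflectivity < field_reflectivity
    bits = []
    for sample in samples:
        if spots_are_darker:
            is_spot = sample < threshold
        else:
            is_spot = sample > threshold
        bits.append(1 if is_spot else 0)
    return bits


if __name__ == "__main__":
    # Dark spots (about 10%) in a reflective field (about 50%).
    print(decode_track([0.48, 0.09, 0.11, 0.52], 0.50, 0.10))
    # Shiny spots (about 40%) in a dull field (about 10%).
    print(decode_track([0.10, 0.38, 0.12], 0.10, 0.40))
```

The same function handles both recording polarities described above, since it only compares each sample against the midpoint between the two reflectivity levels.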

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Credit Cards Or The Like (AREA)
  • Optical Recording Or Reproduction (AREA)
EP86903718A 1985-04-09 1986-03-10 Data card system for initializing spoken-word recognition units Withdrawn EP0218723A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US72138185A 1985-04-09 1985-04-09
US721381 1985-04-09

Publications (1)

Publication Number Publication Date
EP0218723A1 true EP0218723A1 (de) 1987-04-22

Family

ID=24897749

Family Applications (1)

Application Number Title Priority Date Filing Date
EP86903718A Withdrawn EP0218723A1 (de) 1985-04-09 1986-03-10 Data card system for initializing spoken-word recognition units

Country Status (3)

Country Link
EP (1) EP0218723A1 (de)
CA (1) CA1258317A (de)
WO (1) WO1986006197A1 (de)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0248593A1 (de) * 1986-06-06 1987-12-09 Speech Systems, Inc. Vorverarbeitungssystem zur Spracherkennung
JPH0795240B2 (ja) * 1986-12-19 1995-10-11 株式会社日立製作所 個人音声パタ−ン入りカ−ドシステム
US4827518A (en) * 1987-08-06 1989-05-02 Bell Communications Research, Inc. Speaker verification system using integrated circuit cards
FR2642882B1 (fr) * 1989-02-07 1991-08-02 Ripoll Jean Louis Appareil de traitement de la parole
ES2114493A1 (es) * 1996-05-22 1998-05-16 Univ Madrid Politecnica Sistema de verificacion de identidad de personas mediante soporte portatil de informacion basado en el reconocimiento de la voz.

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BE787377A (fr) * 1971-08-09 1973-02-09 Waterbury Nelson J Cartes de securite et systeme d'utilisation de telles cartes
US4284716A (en) * 1979-07-06 1981-08-18 Drexler Technology Corporation Broadband reflective laser recording and data storage medium with absorptive underlayer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO8606197A1 *

Also Published As

Publication number Publication date
CA1258317A (en) 1989-08-08
WO1986006197A1 (en) 1986-10-23

Similar Documents

Publication Publication Date Title
US4711996A (en) Redundant optical recording of information in different formats
US4500777A (en) High data capacity, scratch and dust resistant, infrared, read-write data card for automatic teller machines
US4360728A (en) Banking card for automatic teller machines and the like
US4609812A (en) Prerecorded dual strip data storage card
US4680459A (en) Updatable micrographic pocket data card
US4542288A (en) Method for making a laser recordable wallet-size plastic card
US4544835A (en) Data system containing a high capacity optical contrast laser recordable wallet-size plastic card
US5421619A (en) Laser imaged identification card
US4683371A (en) Dual stripe optical data card
US4692394A (en) Method of forming a personal information card
US4680460A (en) System and method for making recordable wallet-size optical card
US4745268A (en) Personal information card system
US4680458A (en) Laser recording and storage medium
US4835376A (en) Laser read/write system for personal information card
US4680456A (en) Data system employing wallet-size optical card
JPH02501241A (ja) 更新可能なマイクログラフィックポケットデータカード
EP0326576A1 (de) Optisches speichersystem für datenkarten
US4588665A (en) Micrographic film member with laser written data
US4656346A (en) System for optically reading and annotating text on a data card
AU549957B2 (en) Banking card for automatic teller machines and the like
CA1258317A (en) Data card system for initializing spoken-word recognition units
US4758485A (en) Slides and recording method for audiovisual slide show
JP2001126267A (ja) 光記録媒体記録装置及び光記録媒体
BE902603A (fr) Carte et appareil a memoire optique et procede de fabrication d'une telle carte.
JPS6082396A (ja) ホログラムによるカ−ド識別方法

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE FR GB IT LI LU NL SE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 19870110

RIN1 Information on inventor provided before grant (corrected)

Inventor name: DREXLER, JEROME