GB2422238A - Generation of data from speech or voiceless mouthed speech - Google Patents

Generation of data from speech or voiceless mouthed speech

Info

Publication number
GB2422238A
Authority
GB
United Kingdom
Prior art keywords
sensor
human
target
targets
sensors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0500926A
Other versions
GB0500926D0 (en)
Inventor
Michael John Fagan
Paul Chapman
James Michael Gilbert
Stephen Robert Ell
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Hull
Original Assignee
University of Hull
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Hull filed Critical University of Hull
Priority to GB0500926A priority Critical patent/GB2422238A/en
Publication of GB0500926D0 publication Critical patent/GB0500926D0/en
Priority to PCT/GB2006/000131 priority patent/WO2006075179A1/en
Publication of GB2422238A publication Critical patent/GB2422238A/en
Withdrawn legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Manipulator (AREA)

Abstract

A plurality of magnets (10) are attached to respective different parts of the mouth of a human. A plurality of magnetic field sensors (26A-26F) are mutually spaced from one another and mounted on a support worn by the human. As the human speaks or makes voiceless mouthed speech, the magnetic field experienced at each sensor (26A-F) varies because of the movement of the mouth. The sensors (26A-F) detect the changing intensities and pass them to a processor (32). The processor (32) can use a variety of methods to analyse the signals from the sensors (26A-F). The field intensity signals can be used for, for example, determining words spoken, control of equipment and identification of individuals. Where the human is incapable of voicing speech, mouthed speech can be analysed and converted to artificial speech.

Description

GENERATION OF DATA FROM SPEECH
OR VOICELESS MOUTHED SPEECH
The invention relates to a method and system for generating data which varies dependent on human speech or on voiceless, mouthed human speech. Voiceless, mouthed human speech refers to a person moving his or her mouth as if that person was speaking, but without any sound being made, or with only a very quiet sound being made.
It is known to generate data, in the form of an audio signal, by using a microphone to record human speech. The audio signal can subsequently be used for a large number of purposes. For example, it is now possible to use a processor to analyse the audio signal so as to identify the actual words spoken by the person. Alternatively, the audio signal can be used to control equipment in accordance with commands spoken by the person.
However, there are some circumstances in which the use of a microphone to produce an audio signal corresponding to human speech is inappropriate. For example, some people are unable to vocalise speech but can produce voiceless, mouthed speech. This is relevant to, for example, people who have undergone surgery to remove those parts of the anatomy required for speech because of disease or trauma, and those who have lost function as a result of disease or trauma; examples include laryngectomy, pharyngo-laryngectomy, glossectomy and hemi-glossectomy. It may also be applicable to people suffering from conditions such as ataxias, dysarthrias or motor neurone disease. People who are only able to produce voiceless, mouthed speech can be understood by persons who are able to lip read. However, a considerable amount of time and effort is required to become proficient in lip reading.
In accordance with a first aspect of the invention, there is provided a method of generating data which varies dependent on human speech or voiceless, mouthed human speech, comprising, providing a plurality of targets and a plurality of sensors, attaching either the targets or the sensors to respective different parts of the mouth of a human, positioning the other of the targets or the sensors in a mutually spaced relationship so that each sensor can magnetically sense at least one of the targets and each target can be magnetically sensed by at least one of the sensors, and using said sensors to magnetically sense said targets as the human speaks or makes voiceless, mouthed speech, and generating said data based on said magnetic sensing of said targets.
The word "target" is used to designate something which can be sensed by a sensor.
In accordance with a second aspect of the invention, there is provided a method of generating data which varies dependent on human speech or voiceless, mouthed human speech, comprising, attaching a plurality of magnets to respective different parts of the mouth of a human, positioning in a mutually spaced relationship a plurality of magnetic field sensors so that each sensor is positioned for sensing one or more of the magnets and so that each magnet can be sensed by at least one of the sensors, and generating said data wherein said generating includes using each sensor to sense a respective changing magnetic field intensity at said sensor as the human speaks or makes voiceless mouthed speech and the data includes a respective component for each sensor corresponding to the changing magnetic field intensity sensed by said each sensor.
In accordance with a third aspect of the invention, there is provided a system for generating data which varies dependent on human speech or voiceless, mouthed human speech, comprising, a plurality of targets and a plurality of sensors, either the targets or the sensors being attachable to respective different parts of the mouth of a human, the other of the targets or the sensors being positionable in a mutually spaced relationship so that each sensor can magnetically sense at least one of the targets and each target can be magnetically sensed by at least one of the sensors, and a processor for processing data, wherein the data is based on magnetic sensing of said targets by said sensors as the human speaks or makes voiceless, mouthed speech.
In accordance with a fourth aspect of the invention there is provided a system for generating data which varies dependent on human speech or voiceless, mouthed human speech, comprising a plurality of magnets attachable to respective different parts of the mouth of a human, a plurality of magnetic field sensors positionable in a mutually spaced relationship so that each sensor is positioned for sensing one or more of the magnets and so that each magnet can be sensed by at least one of the sensors, and a processor adapted for processing data, wherein said data includes a respective component for each sensor, and each component corresponds to a changing magnetic field intensity sensed by the corresponding sensor.
Accordingly, the current invention provides a method and a system for generating data which varies dependent on speech made by people capable of vocalising, and this is a useful alternative to the use of a microphone to produce an audio signal as discussed above. For example, the system may be used when normal speech is inaudible or indistinguishable over background noise. Additionally, the current invention provides a method and a system for generating data which varies dependent on voiceless, mouthed speech. This is of use both by people who are unable to produce vocalised speech but who can produce voiceless, mouthed speech, and also by other people in situations where vocalised speech is undesirable.
The data generated in accordance with the current method can be put to many different uses. For example, it is relatively simple to compare data produced in accordance with the current invention as a person speaks or makes voiceless mouthed speech, against pre-recorded data produced in a similar manner, also in accordance with the invention, while a person spoke or mouthed a particular word or phrase. Such a comparison is useful in the field of voice control of equipment as it is possible to determine whether a user speaks or voicelessly mouths a particular command word or phrase for controlling a piece of equipment.
Another use of the data generated in accordance with the current invention is in the field of security when it is desired to confirm the identity of an individual. When different people speak (or make voiceless mouthed speech), then the pattern in which the various parts of their mouths move will differ. Accordingly, by comparing data generated in accordance with the current invention on a first occasion, with data generated in a similar manner in accordance with the current invention on a second occasion, it may be possible to assess whether it is the same person who was speaking or mouthing on the two different occasions.
The data generated in accordance with the current invention can also be used to identify sounds and/or words spoken or mouthed. In the case that the generated data is used to identify words mouthed by a person incapable of voiced speech, the identification of the words mouthed may be used, in turn, to generate artificial, audible speech. Alternatively, identification of words or phrases spoken or mouthed can be used to control equipment, or words identified can be fed into a word processing application or a translation application.
Also, as the data generated in accordance with the current invention is derived from the movement of different parts of the mouth, the data may be used to generate a moving visual representation of the shape of the mouth. This could be useful to improve pronunciation, for example the pronunciation of a person learning a foreign language.
The following is a more detailed description of embodiments of the invention, by way of example, reference being made to the appended schematic drawings in which:
Figures 1A to 1D show components of a first system for generating data which varies depending on human speech or voiceless, mouthed human speech, and for using the generated data to identify words spoken or mouthed;
Figure 2 shows a sensor unit from the system of Figures 1A to 1D;
Figure 3 shows a support carrying six sensor units from the system shown in Figures 1A to 1D; and
Figure 4 shows a second system for generating data which varies dependent on human speech or voiceless mouthed speech, and for identifying words spoken or mouthed.
Referring first to Figures 1A to 1D, and in particular to Figure 1A, the first system includes ten very small, high-powered magnets 10. Each one of the magnets 10 is hermetically sealed within a coating (not shown) formed from an impermeable material with a high degree of biocompatibility. One suitable material is PTFE (polytetrafluoroethylene). As indicated in Figure 1A, two of the magnets 10 are attached to the upper lip 12 of a user 14 of the system. These two magnets 10 may be attached by being implanted in the upper lip 12. A second pair of the magnets 10 are attached to respective teeth 16 of the upper jaw of the user 14. A third pair of the magnets 10 are attached to the tongue 18 of the user 14. The third pair of magnets 10 may be attached to the tongue 18 by being implanted in the tongue 18. A fourth pair of the magnets 10 are attached to respective teeth 20 of the lower jaw of the user 14.
Finally, a fifth pair of the magnets 10 are attached to the lower lip 22 of the user 14.
As can be seen in Figure 1A, some of the magnets 10 are located at the right side of the mouth, and the other magnets 10 are located at the left side of the mouth.
Referring to Figures 1C, 2 and 3, the system also includes six sensor units 24A, 24B, 24C, 24D, 24E, 24F, each sensor unit 24A - F containing a respective magnetic field sensor 26A, 26B, 26C, 26D, 26E, 26F. The magnetic field sensors 26A - F are suitable for sensing respective magnetic field intensities at the respective locations of the sensors 26A - F. A suitable type of magnetic field sensor is a Hall effect sensor. As shown in Figures 1C and 3, the sensor units 24A - F are mounted on a rigid, generally circular support 28 which serves to hold the sensor units 24A - F (and the magnetic field sensors 26A - F contained within the sensor units) in a fixed spatial configuration relative to one another. The support 28 is provided with a hinge (not shown) and a clasp (not shown) so that the support 28 can be worn around the neck in the manner of a necklace, as shown in Figure 1B.
(As best seen in Figure 3, each sensor unit 24A - F also includes a respective tracker sensor 30A, 30B, 30C, 30D, 30E, 30F. The tracker sensors 30A - 30F are not used in this first system, but are used in the second system described below.) As shown in Figure 1D, the system also includes a processor 32 for processing magnetic field intensity data provided by the magnetic field sensors 26A - F. Although the processor 32 is shown as a pocket-sized processor, it will be appreciated that it may be necessary to use a larger sized processor 32 to accommodate the necessary hardware components of the processor. The processor 32 ideally contains many gigabytes of fast memory and multiple processor units running at very high speeds, for example in excess of five gigahertz. The processor also includes at least six analogue to digital converters. The processor 32 will also include hundreds of gigabytes of data storage.
The processor 32 will also include a high quality sound card and loud speaker. The processor 32 is connected to the magnetic field sensors 26A - F by six respective connection leads 34 so that the data from each magnetic field sensor 26A - 26F is fed to a respective analogue to digital converter (not shown).
The operation of the first system will now be described. The ten magnets 10 together generate a combined magnetic field (not shown). As the user 14 speaks or makes voiceless mouthed speech, the respective positions of the ten magnets 10 change. It will be appreciated that magnets 10 attached to different parts of the user's mouth will move in different ways. For example, the pair of magnets 10 attached to the teeth 20 of the lower jaw will tend to move generally up and down on an arc. The pair of magnets 10 attached to the lower lip 22 will also generally move up and down on an arc, but will also move in other directions as the lower lip 22 moves relative to the teeth of the lower jaw 20. The pair of magnets 10 attached to the tongue 18 will tend to move up and down and also backwards and forwards as the tongue moves. The pair of magnets 10 attached to the teeth 16 of the upper jaw will tend to move less. The magnets 10 attached to the upper lip 12 will move as the upper lip 12 moves.
As the user 14 speaks or makes voiceless mouthed speech, the combined magnetic field generated by the magnets 10 will change.
The magnetic field sensors 26A - F are held by the support 28 in the combined magnetic field. The arrangement is such that each magnetic field sensor 26A - 26F is able to sense each magnet 10. That is to say the magnetic field intensity that each magnetic field sensor 26A - 26F experiences is influenced by each of the magnets 10.
Each magnetic field sensor 26A - 26F detects a magnetic field intensity at its own location. The intensities will generally be different at each of the different magnetic field sensors 26A - F. As the user 14 speaks or makes voiceless mouthed speech, the magnetic field intensities sensed by the magnetic field sensors 26A - 26F change.
Each magnetic field sensor 26A - 26F passes magnetic field intensity data to the processor 32. The processor 32 uses the six signals from the six magnetic field sensors 26A - 26F to calculate, on a real-time basis, the changing positions of the ten magnets 10. In turn, the changing positions of the magnets 10 are used to identify specific sounds and/or words spoken or mouthed by the user 14. Calculation of the changing positions of the ten magnets 10 is performed as follows.
Given a magnet of known strength and orientation, it is possible to use finite element analysis to determine the intensity of the magnetic field at any spatial point relative to the magnet. Where there are two or more magnets of known strength and known orientation fixed adjacent to one another, it is possible to use finite element analysis to calculate the magnetic field intensity at any point in space adjacent to the magnets. Calculations of this type are well known.
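By way of illustration only, the forward calculation can be approximated with a point-dipole model instead of a full finite element analysis. The following Python sketch is an assumption-laden simplification (the function names, the use of point dipoles, and the treatment of the magnet moments as known and fixed are choices made for the example, not details taken from the patent):

```python
import numpy as np

MU0 = 4 * np.pi * 1e-7  # permeability of free space (T*m/A)

def dipole_field(magnet_pos, moment, sensor_pos):
    """Flux density vector (tesla) of one point dipole at a sensor location."""
    r = sensor_pos - magnet_pos
    r_norm = np.linalg.norm(r)
    r_hat = r / r_norm
    # Standard point-dipole field: B = mu0/(4*pi) * (3*r_hat*(m.r_hat) - m) / |r|^3
    return (MU0 / (4 * np.pi)) * (3 * r_hat * np.dot(moment, r_hat) - moment) / r_norm**3

def field_intensities(magnet_positions, moments, sensor_positions):
    """Combined field magnitude at each sensor from all magnets together."""
    intensities = []
    for s in sensor_positions:
        total = np.zeros(3)
        for p, m in zip(magnet_positions, moments):
            total += dipole_field(p, m, s)
        intensities.append(np.linalg.norm(total))
    return np.array(intensities)
```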
In the current case, the positions of the ten magnets 10 are not known and are changing. The changing positions of the magnets 10 change the combined magnetic field in the vicinity of the sensor units 24A - 24F. The six magnetic field intensities sensed at each of the six magnetic field sensors 26A - 26F are known. It is possible to use the known, changing magnetic field intensities to calculate the positions of the ten magnets 10. This is done by an inverse iterative finite element analysis approach.
Initially, the processor 32 assumes respective positions, relative to the sensor units 24A - F, for the ten magnets 10. The processor 32 then uses finite element analysis to calculate, based on the assumed positions, the magnetic field intensities that would be present at each of the positions of the magnetic field sensors 26A - F. The calculated intensities are compared to the measured intensities experienced at that point in time by the six magnetic field sensors 26A - F. The most likely outcome is that the assumed positions of the magnets 10 will be incorrect and that the calculated magnetic field intensities will not match the actual magnetic field intensities. The processor 32 then starts a second iteration assuming different positions of the magnets 10. This process is repeated until the calculated magnetic field intensities at the magnetic field sensors 26A - F correspond to the actual magnetic field intensities experienced by the magnetic field sensors 26A - F. At this stage, the assumed positions of the ten magnets (relative to the sensor units 24A - F) are considered to be correct, and the positions are stored. As the intensities of the magnetic field experienced at the magnetic field sensors 26A - F will vary depending on the distance of the magnets 10 from the sensors 26A - F, differences between the calculated and actual magnetic field intensities can be used to help choose assumed positions of magnets 10 which are more likely to be correct. In this way, the number of iterations can be reduced.
Once the iterative process calculates the correct positions for the ten magnets at a particular point in time, the iterative process is then repeated, on the basis of the actual magnetic field intensities experienced by the six magnetic field sensors 26A - F at a subsequent point in time. In this way, the changing positions of the ten magnets 10 are calculated over time.
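A minimal sketch of this inverse, iterative search, substituting a general nonlinear least-squares fit for the inverse finite element procedure described above, might look as follows. It reuses the `field_intensities` helper from the previous sketch, and the warm start from the previous frame plays the role of choosing assumed positions that are more likely to be correct. With six scalar readings and thirty position unknowns the fit is under-determined, so this is only an illustration of the iteration structure, not the patented method:

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_positions(measured, prev_positions, moments, sensor_positions):
    """Fit magnet positions so the modelled intensities match the measured ones."""
    n_magnets = prev_positions.shape[0]

    def residuals(x):
        positions = x.reshape(n_magnets, 3)
        return field_intensities(positions, moments, sensor_positions) - measured

    # Warm-starting from the previous frame keeps the search near physically
    # plausible positions and reduces the number of iterations needed.
    fit = least_squares(residuals, prev_positions.ravel())
    return fit.x.reshape(n_magnets, 3)

def track_magnets(intensity_frames, initial_positions, moments, sensor_positions):
    """Estimate the magnet positions for each successive frame of readings."""
    positions = initial_positions
    trajectory = []
    for frame in intensity_frames:
        positions = estimate_positions(frame, positions, moments, sensor_positions)
        trajectory.append(positions.copy())
    return trajectory
```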
It will be appreciated that if the positions of the ten magnets 10 are known relative to the positions of the sensor units 24A - F, then the positions of the magnets 10 relative to one another will also be known. From the positions of the magnets 10 relative to one another, and from the way in which these positions change over time, it is possible to identify sounds and words spoken or mouthed by the user 14, because the mouth assumes different shapes to make different sounds.
It will be appreciated that the user 14 may be moving his head up or down, or from side to side while speaking or making voiceless mouthed speech. This will clearly affect the magnetic field intensities experienced at the locations of the sensor units 24A - F. This does not, however, prevent the processor 32 from determining the positions of the magnets 10 relative to one another. This is because, regardless of the position of the head of the user 14 relative to the support 28, the iterative process described above calculates the positions of the ten magnets 10 relative to the support 28, and from this information the positions of the magnets 10 relative to one another can be calculated.
It is the positions of the magnets 10 relative to one another which reflect the shape of the mouth, and which can be used to determine sounds and words spoken or mouthed by the user 14.
The method described above to calculate the positions of the magnets 10 relies on the fact that the sensor units 24A - F are held immobile relative to one another by the support 28. However, the fact that the support 28 is rigid may result in discomfort when the support 28 is worn around the neck. In view of this, a second system using a flexible support is also envisaged and shown in Figure 4. Features of the system shown in Figure 4 which are the same as the corresponding features shown in Figures 1A - 1B, 2 and 3 are given the same reference numerals and are not described in detail.
As shown in Figure 4, the second system also includes six sensor units 24A - F which are identical to the sensor units 24A - F described in respect of the first system. The second system also includes ten magnets 10 (two of which are shown in Figure 4) which are identical to the magnets 10 described above and which are attached to various parts of the mouth as described above in respect of the first system. The second system also includes a processor 32 which is substantially the same as the processor 32 described above.
In the second system shown in Figure 4, the sensor units 24A - F are mounted on a flexible support 36 such that the sensor units 24A - 24F can move relative to one another. In the second system shown in Figure 4, each of the tracker sensors 30A - 30F in each of the sensor units 24A - 24F is operative. The tracker sensors 30A - 30F are used in conjunction with the field generator 38, in a known manner, to determine the positions of the sensor units 24A - 24F. The tracker sensors 30A - 30F detect a magnetic field generated by the field generator 38. Systems comprising suitable tracker sensors 30A - 30F in combination with a suitable field generator 38 are commercially available - such as the microBIRD (trade mark) system from Ascension Technology Corporation or the Aurora (trade mark) system from Northern Digital Inc. It will be noted that the field generator 38 should be switched off when the field sensors 26A - 26F are being used to detect the magnets 10, so as to avoid interference. It may also be necessary to correct the position output provided by the tracker sensors 30A - 30F in combination with the field generator, to compensate for any interference from the magnets 10.
The positions of the sensor units 24A - 24F (as determined by the tracker sensors 30A- 30F in combination with the field generator 38) are used by the processor 32, during the calculation of the positions of the magnets 10, to correct for changes in the positions of the sensor units 24A - 24F relative to one another.
It will be appreciated that the first and second systems described above can be used in a number of ways.
Firstly, the user 14 may be incapable of producing vocalised speech, while being capable of producing voiceless mouthed speech. Both of the systems described above would then identify sounds and words mouthed by the user 14, and the processor 32 would use the words identified to generate artificial audible speech via the processor sound card and loud speaker. In this way, the user 14 can communicate easily by mouthing words, and lip reading is not required. It may be possible to use earlier recordings of the user's voice, before the ability to vocalise was lost, to create artificial speech which sounds like the original speech of the user.
Alternatively, given that both systems described above determine words spoken or mouthed by the user 14, either system may be used to feed the words spoken or mouthed into another application, such as a word processor or a language translation application.
Alternatively, the two systems described above can be used in the control of equipment. Both systems identify words that are spoken or mouthed by the user 14.
The processor 32 can be programmed so that when certain command words are spoken or mouthed by the user 14, the processor 32 will pass on commands to equipment to perform corresponding tasks. For example, the systems could be used to control equipment for the physically disabled, or for controlling industrial or military equipment. When the systems are used to control equipment, a major advantage is that background noise (such as machinery noise) does not interfere with the determination of the words spoken.
Additionally, the systems could be used to allow two or more people to communicate with each other without making any sound audible to others. For example, two people could wear respective systems in accordance with the invention. They could then voicelessly mouth messages to one another. The words mouthed by each person would be identified by the processor worn by that person and then transmitted, e.g. by radio waves, to the other person, who would hear the message in an earpiece. Other people would hear nothing, and a clear line of sight between the users is not required.
It will be appreciated that both systems described above calculate the changing positions of the ten magnets 10. This positional information could be used to generate a visual representation of the user's mouth. The representation may show movement of the mouth in real time, or in slow motion, or movement may be frozen. Such a representation may be useful to teach the user 14 how to pronounce specific words, for example if the user 14 is learning a foreign language.
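As a sketch of how such a representation might be produced (the side-view projection, axis choices and frame rate below are illustrative assumptions), the magnet trajectory returned by a tracker such as the `track_magnets` sketch above could be animated with matplotlib:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

def animate_mouth(trajectory, interval_ms=40):
    """Animate magnet positions (a list of (n_magnets, 3) arrays) as a side view."""
    fig, ax = plt.subplots()
    scat = ax.scatter([], [])
    pts_all = np.concatenate(trajectory)
    ax.set_xlim(pts_all[:, 0].min(), pts_all[:, 0].max())
    ax.set_ylim(pts_all[:, 2].min(), pts_all[:, 2].max())
    ax.set_xlabel("front-back (m)")
    ax.set_ylabel("up-down (m)")

    def update(i):
        # Project the magnet positions for frame i onto the x-z plane
        scat.set_offsets(trajectory[i][:, [0, 2]])
        return (scat,)

    return FuncAnimation(fig, update, frames=len(trajectory), interval=interval_ms)
```

Increasing `interval_ms`, or stepping through frames one at a time, would give the slow-motion and frozen views mentioned above.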
It will be appreciated that the first and second systems described above may be modified. For example, there may be a different number of magnets 10. The magnets may be attached to different parts of the mouth. Suitable adhesives may be used for attachment.
There may be a different number of sensor units 24A - F. Instead of being mounted on the support 28 and worn around the neck, the sensor units 24A - F may be held in the region of the user's mouth in a different manner. For example, the sensor units 24A - F may be mounted on a support (e.g. a rigid support) which is worn on or fitted to the user's head. The processor 32 may include any hardware components which are suitable for performing the processing tasks described above.
A third system will now be described. The third system is suitable for generating data which varies dependent on human speech or voiceless mouthed speech. This third system comprises ten magnets, identical to the magnets 10 described above with reference to Figure 1A. These ten magnets are attached to different parts of the mouth of a user in the same manner as the magnets 10 described above with reference to Figure 1A.
The third system also includes six sensor units which are identical to the sensor units 24A - 24F.
The sensor units of the third system are connected to a processor which has the same hardware features as the processor 32 described above with reference to Figure 1D.
The third system works in a similar manner to the first system described above.
Specifically, when the user talks or makes voiceless mouthed speech, the sensor units are used to sense magnetic field intensities at the locations of the sensor units. The magnetic field intensities from the six sensor units are passed to the processor.
However, instead of using the iterative finite element analysis procedure to determine the changing positions of the magnets, the processor analyses the magnetic field intensity signals in a different manner.
In the third system, the processor has a very large amount of pre-recorded data stored thereon. The stored data is made up of magnetic field intensity data which has been previously recorded using a system of magnets and sensors identical to the third system currently being described. Thus, for example, the pre-recorded data may comprise magnetic field intensity data for a very large number of words and/or sounds. For example, for the sound of the letter "p", the pre-recorded data may include six signals corresponding to the changing magnetic field intensities at the six sensor units as a person moves his or her mouth to make the sound "p".

Accordingly, the third system identifies words spoken by a user of the system by comparing magnetic field intensity data generated by the sensor units with pre-recorded magnetic field intensity data recorded while certain sounds or words were spoken. By matching magnetic field intensity data being generated by the third system with the pre-recorded data, it is possible to determine the sounds and words being spoken by the user.
Of course, it will be appreciated that the pre-recorded data must be recorded using, as close as possible, the same number and attachment positions of the magnets, and the same number and positions of the sensor units. The processor may use fuzzy and neural network methodologies to match magnetic field intensity data from the system with the pre-recorded data.
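The description leaves the matching algorithm open (fuzzy and neural network methodologies are mentioned as possibilities); as one hypothetical alternative, dynamic time warping copes with the different durations of two utterances of the same word. The sketch below assumes each recording is a NumPy array of shape (time, channels), with one channel per sensor unit:

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two multi-channel recordings."""
    la, lb = len(seq_a), len(seq_b)
    cost = np.full((la + 1, lb + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, la + 1):
        for j in range(1, lb + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # frame-to-frame cost
            cost[i, j] = d + min(cost[i - 1, j],      # skip a frame in seq_a
                                 cost[i, j - 1],      # skip a frame in seq_b
                                 cost[i - 1, j - 1])  # advance both
    return cost[la, lb]

def identify_word(live_signal, templates):
    """Return the label of the pre-recorded template closest to the live signal."""
    return min(templates, key=lambda label: dtw_distance(live_signal, templates[label]))
```

Here `templates` would map each vocabulary item to a six-channel recording captured with an identical arrangement of magnets and sensor units, as the description requires.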
As will be appreciated, the third system does not analyse the magnetic field intensity data generated by the system to determine the relative positions of the different parts of the user's mouth. Instead, the magnetic field intensity data is compared to similar data to determine words and sounds spoken. For this system, it is clear that any movement of the head of the user relative to the sensor units could greatly influence the magnetic field intensity data, independently of the words spoken or mouthed by the user. To avoid this effect, it may be desirable for the sensor units to be mounted on a support (e.g. a support which holds the units immobile relative to one another) which can be fitted to the user's head, so that the sensor units are held generally immobile relative to the user's head.
A fourth system, now to be described, is considerably simpler and is intended primarily for the purposes of controlling equipment, or identifying an individual.
The fourth system consists of ten magnets identical to the magnets 10 described above with reference to Figure 1A, and attached to parts of a mouth of a user in an identical manner to that shown in Figure 1A. The fourth system also includes six sensor units which are identical to the sensor units described above and which are mounted on a support which is capable of being worn on the head of the user so that the sensor units remain immobile relative to the head and to one another. The fourth system also contains a processor. In this system, the processor contains pre-recorded data corresponding to a limited number of words or phrases. The pre-recorded data consists of magnetic field intensity data generated by a user speaking or mouthing the limited number of words and phrases, using an arrangement of magnets and sensors identical to that of the fourth system.
In use, a user may wish to control a piece of equipment, such as a wheelchair. The pre-recorded data may include magnetic field intensity signals corresponding to the word "forward". If the user of the fourth system speaks the word "forward", the generated magnetic field intensity data is passed to the processor and compared against the pre-recorded data. This comparison allows the processor to determine that the user has spoken the word "forward", and the processor then passes a command to the wheelchair so that the wheelchair moves forward. Clearly, the user can also control the wheelchair by making voiceless, mouthed speech.
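A control loop for this wheelchair example might be sketched as below; the threshold value, the reuse of the `dtw_distance` helper from the earlier sketch, and the `wheelchair.execute` interface are all illustrative assumptions rather than details from the patent:

```python
MATCH_THRESHOLD = 25.0  # assumed tuning value; would be calibrated per user

def control_loop(signal_windows, templates, wheelchair):
    """Match each window of sensor data against command templates and act on it."""
    for window in signal_windows:
        # Score every stored command against the current window of readings
        scores = {cmd: dtw_distance(window, tpl) for cmd, tpl in templates.items()}
        best_cmd, best_score = min(scores.items(), key=lambda kv: kv[1])
        # Only act when the window is close enough to a known command
        if best_score < MATCH_THRESHOLD:
            wheelchair.execute(best_cmd)  # e.g. "forward" drives the chair forward
```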
Given that different people will move the different parts of their mouths in slightly different ways as they say the same phrase, the fourth system can also be used as a method of identification. In order to do this, a person may speak a known phrase so as to generate magnetic field intensity data using the fourth system. At a later date, the same person may speak the same phrase using the same system. By comparing the magnetic field intensity signals produced on the two occasions, it may be possible to confirm that the same person has spoken the phrase on the two occasions.
The third and fourth systems can be varied, for example, in the number and positions of magnets and in the number and positions of sensor units.
In a fifth system, the magnets 10 described above are replaced with resonant coils and the magnetic field sensors 26A-26F are replaced with coils, referred to as sensor coils, which generate respective alternating magnetic fields. In use, the resonant coils interact with the alternating magnetic fields generated by the sensor coils, and the resulting changes in current and voltage in the sensor coils can be used to sense this interaction. The interactions give information allowing the positions of the resonant coils to be determined. Each of the resonant coils could optionally be tuned to a different resonant frequency and the sensor coils could then cycle through the frequencies so that the position of each resonant coil could be established in turn.
In a sixth system, the magnets 10 described above are replaced with coils, referred to as target coils, and the magnetic field sensors 26A-26F are replaced with coils, referred to as sensor coils, which generate respective alternating magnetic fields. In use, the target coils are connected to electronic devices which modulate the behaviour of the target coils in such a way that this modulation may be detected from the changing current and voltage in the sensor coils. The interaction between target coils and sensor coils gives information allowing the positions of the target coils to be determined. Each of the target coils could optionally be tuned to a different resonant frequency and the sensor coils could then cycle through the frequencies so that the position of each target coil could be established in turn. Alternatively, each target coil could modulate its behaviour in a different manner, for example using a different modulation frequency, to allow the positions of different target coils to be differentiated. The power transferred to the target coils from the sensor coils could be used to provide electrical power for the modulation electronics.
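As a rough illustration of the frequency-cycling idea used by the fifth and sixth systems (the coil driver interface below is entirely hypothetical; no real hardware API is implied), the sensor coils could be stepped through each target's assumed resonant frequency in turn:

```python
import time

def scan_targets(sensor_coils, target_frequencies, dwell_s=0.002):
    """Cycle the drive frequency so each tuned target coil is sensed in turn.

    Each sensor coil object is assumed to expose set_drive_frequency() and
    measure_response() methods; both are hypothetical placeholders.
    """
    readings = {}
    for freq in target_frequencies:
        for coil in sensor_coils:
            coil.set_drive_frequency(freq)  # excite near one target's resonance
        time.sleep(dwell_s)  # let the resonant interaction settle
        # The target tuned to `freq` dominates the change in current and
        # voltage seen by the sensor coils during this step.
        readings[freq] = [coil.measure_response() for coil in sensor_coils]
    return readings
```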

Claims (59)

CLAIMS
1. A method of generating data which varies dependent on human speech or voiceless, mouthed human speech, comprising, providing a plurality of targets and a plurality of sensors, attaching either the targets or the sensors to respective different parts of the mouth of a human, positioning the other of the targets or the sensors in a mutually spaced relationship so that each sensor can magnetically sense at least one of the targets and each target can be magnetically sensed by at least one of the sensors, and using said sensors to magnetically sense said targets as the human speaks or makes voiceless, mouthed speech, and generating said data based on said magnetic sensing of said targets.
2. A method according to claim 1, wherein said magnetic sensing of said targets comprises each sensor sensing a respective changing magnetic field intensity at said each sensor.
3. A method according to claim 1 or claim 2, wherein the targets are attached to the respective parts of the human mouth.
4. A method according to claim 3, when claim 3 is dependent on claim 2, wherein the targets are magnets, and the data includes a respective component for each sensor corresponding to the changing magnetic field intensity sensed by said each sensor.
5. A method according to any preceding claim, further comprising using a processor to compare said generated data with pre-recorded data.
6. A method according to claim 5, wherein the pre-recorded data corresponds to a word or plurality of words previously spoken or voicelessly mouthed, and said comparison comprises determining whether said generated data has a predetermined degree of similarity with said pre-recorded data to indicate whether the human has spoken or voicelessly mouthed said word or plurality of words during said sensing of said targets.
7. A method according to claim 6, wherein said comparison determines whether said human is the same human who previously spoke or voicelessly mouthed said word or plurality of words to which said pre-recorded data corresponds.
8. A method according to claim 6 or claim 7, wherein said comparison is used to operate equipment.
9. A method according to any one of claims 1 to 4, wherein said generated data is used to identify the human.
10. A method according to any one of claims 1 to 4, wherein said generated data is used to operate equipment in accordance with specific commands spoken or voicelessly mouthed by the human during said generation of said data.
11. A method according to any one of claims 1 to 4, wherein said generated data is used to identify words spoken or voicelessly mouthed by the human.
12. A method according to claim 11, wherein said identification of said words includes using a processor to compare said generated data with pre-recorded data.
13. A method according to claim 12, wherein said pre-recorded data corresponds to previously spoken or voicelessly mouthed sounds, words or phrases.
14. A method according to claim 11, wherein said generated data is analysed to determine the respective positions of said mouth parts at a plurality of moments in time, and said mouth part positions are used in said identification of said words spoken or voicelessly mouthed.
15. A method according to claim 14, wherein said determination of said mouth part positions involves using finite element analysis or other numerical analysis technique.
16. A method according to any one of claims 11 to 15, further including generating audible artificial speech corresponding to said words spoken or voicelessly mouthed by the human.
17. A method according to claim 14 or any claim dependent thereon, further including generating a visual representation of the mouth based on said mouth part positions.
18. A method according to any preceding claim, wherein said other of the targets or the sensors which are positioned in said mutually spaced relationship are fixed in position relative to one another.
19. A method according to claim 18, wherein said other of the targets or the sensors which are positioned in said mutually spaced relationship are fixed in position relative to the head of the human.
20. A method according to any one of claims 1 to 14, wherein the positions of said other of the targets or the sensors which are positioned in said mutually spaced relationship are variable relative to one another and a tracking system is used to determine their positions relative to one another, thereby to compensate for relative movement therebetween.
21. A method according to any preceding claim, wherein each sensor is positioned to sense all of the targets.
22. A method according to any preceding claim, wherein at least one of the targets or the sensors, as the case may be, is attached to the tongue.
23. A method according to claim 22, wherein said at least one target or sensor attached to the tongue is implanted in the tongue.
24. A method according to any preceding claim, wherein at least one of the targets or the sensors, as the case may be, is attached to a tooth.
25. A method according to any preceding claim, wherein at least one of the targets or the sensors, as the case may be, is attached to a lip.
26. A method according to claim 25, wherein said at least one target or sensor attached to a lip is implanted in the lip.
27. A method according to any preceding claim, wherein at least one target or sensor, as the case may be, is attached to the upper lip, at least one target or sensor, as the case may be, is attached to a tooth on the upper jaw, at least one target or sensor, as the case may be, is attached to the tongue, at least one target or sensor, as the case may be, is attached to a tooth on the lower jaw, and at least one target or sensor, as the case may be, is attached to the lower lip.
28. A method according to claim 1, wherein the targets are attached to the respective parts of the human mouth, each sensor comprising a respective coil which generates an alternating magnetic field and each target comprising a respective resonant coil, each sensor sensing at least one target by sensing interaction between the sensor's alternating magnetic field and the resonant coil of said at least one target.
29. A method according to claim 28, wherein each target resonant coil is tuned to a different resonant frequency and the sensor coils generate magnetic fields with frequencies which change over time to sense different target coils at different times.
30. A method according to claim 1, wherein the targets are attached to the respective parts of the human mouth, each sensor comprising a respective coil which generates an alternating magnetic field and each target comprising a respective coil, each sensor sensing at least one target by sensing interaction between the sensor's alternating magnetic field and the coil of said at least one target.
31. A method according to claim 28 or claim 30, wherein each target coil is connected to electronic devices capable of modulating the behaviour of that target coil.
32. A method according to claim 31, wherein each target coil modulates the behaviour of the target coil in a different manner.
33. A method according to claim 31 or claim 32, wherein electrical power is supplied to the modulation electronics from the target coil.
34. A system for generating data which varies dependent on human speech or voiceless, mouthed human speech, comprising, a plurality of targets and a plurality of sensors, either the targets or the sensors being attachable to respective different parts of the mouth of a human, the other of the targets or the sensors being positionable in a mutually spaced relationship so that each sensor can magnetically sense at least one of the targets and each target can be magnetically sensed by at least one of the sensors, and a processor for processing data, wherein the data is based on magnetic sensing of said targets by said sensors as the human speaks or makes voiceless, mouthed speech.
35. A system according to claim 34, wherein each sensor is adapted for sensing a respective changing magnetic field intensity at said each sensor.
36. A system according to claim 35, wherein the targets are magnets adapted for attachment to respective parts of the mouth of a human, and wherein the data includes a respective component for each sensor corresponding to the changing magnetic field intensity sensed by said each sensor.
37. A system according to any one of claims 34 to 36, wherein the processor is adapted to compare said data with pre-recorded data.
38. A system according to claim 37, wherein the pre-recorded data corresponds to a word or plurality of words previously spoken or voicelessly mouthed, and said comparison comprises determining whether said data has a predetermined degree of similarity with said pre-recorded data to indicate whether said human has spoken or voicelessly mouthed said word or plurality of words.
39. A system according to claim 38, wherein said processor is adapted so that said comparison determines whether said human is the same human who previously spoke or voicelessly mouthed said word or plurality of words to which said pre-recorded data corresponds.
40. A system according to claim 38 or claim 39, wherein said processor is connected to equipment to operate said equipment on the basis of said comparison.
41. A system according to any one of claims 34 to 36, wherein said processor is adapted such that said data is used to identify the human.
42. A system according to any one of claims 34 to 36, wherein said processor is adapted so that said data is used to operate equipment in accordance with specific commands spoken or voicelessly mouthed by the human.
43. A system according to any one of claims 34 to 36, wherein said processor is adapted such that said data is used to identify words spoken or voicelessly mouthed by the human.
44. A system according to claim 43, wherein said identification of said words includes using said processor to compare said data with pre-recorded data.
45. A system according to claim 44, wherein said pre-recorded data corresponds to previously spoken or voicelessly mouthed sounds, words or phrases.
46. A system according to claim 43, wherein said processor is adapted for analysing said data to determine the respective positions of said mouth parts at a plurality of moments in time, and said mouth part positions are used in said identification of said words spoken or voicelessly mouthed.
47. A system according to claim 46, wherein said processor is adapted such that said determination of said mouth part positions involves using iterative finite element analysis.
48. A system according to any one of claims 43 to 47, further including means connected to said processor for generating audible artificial speech corresponding to said words spoken or voicelessly mouthed by the human.
49. A system according to claim 46 or any claim dependent thereon, including means connected to said processor for generating a visual representation of the mouth based on said mouth part positions.
50. A system according to any one of claims 34 to 49, wherein the sensors are mounted on a support.
51. A system according to claim 50, wherein the support is adapted to be fixedly mounted to the head of a human.
52. A system according to any one of claims 34 to 49, wherein the sensors are movable relative to one another, and further including a tracking system to determine the relative positions of the sensors, the tracking system being connected to the processor to compensate for relative movement between the sensors.
53. A system according to any one of claims 34 to 52, wherein the targets are attached to respective different mouth parts of a human, and the sensors are positioned in a mutually spaced relationship so that each sensor is positioned for sensing one or more of the targets and so that each target can be sensed by at least one of the sensors.
54. A system according to claim 34, wherein the targets are attachable to the respective parts of the human mouth, each sensor comprising a respective coil which generates an alternating magnetic field and each target comprising a respective resonant coil, each sensor sensing at least one target by sensing interaction between the sensor's alternating magnetic field and the resonant coil of said at least one target.
55. A system according to claim 54, wherein each target resonant coil is tuned to a different resonant frequency and the sensor coils generate magnetic fields with frequencies which change over time to sense different target coils at different times.
56. A system according to claim 34, wherein the targets are attached to the respective parts of the human mouth, each sensor comprising a respective coil which generates an alternating magnetic field and each target comprising a respective coil, each sensor sensing at least one target by sensing interaction between the sensor's alternating magnetic field and the coil of said at least one target.
57. A system according to claim 54 or claim 56, wherein each target coil is connected to electronic devices capable of modulating the behaviour of that target coil.
58. A method substantially as hereinbefore described with reference to the accompanying drawings.
59. A system substantially as hereinbefore described with reference to the accompanying drawings.
GB0500926A 2005-01-17 2005-01-17 Generation of data from speech or voiceless mouthed speech Withdrawn GB2422238A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB0500926A GB2422238A (en) 2005-01-17 2005-01-17 Generation of data from speech or voiceless mouthed speech
PCT/GB2006/000131 WO2006075179A1 (en) 2005-01-17 2006-01-16 Generation of data from speech or voiceless mouthed speech

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0500926A GB2422238A (en) 2005-01-17 2005-01-17 Generation of data from speech or voiceless mouthed speech

Publications (2)

Publication Number Publication Date
GB0500926D0 GB0500926D0 (en) 2005-02-23
GB2422238A true GB2422238A (en) 2006-07-19

Family

ID=34224720

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0500926A Withdrawn GB2422238A (en) 2005-01-17 2005-01-17 Generation of data from speech or voiceless mouthed speech

Country Status (2)

Country Link
GB (1) GB2422238A (en)
WO (1) WO2006075179A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9679575B2 (en) 2011-12-22 2017-06-13 Intel Corporation Reproduce a voice for a speaker based on vocal tract sensing using ultra wide band radar


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6006175A (en) * 1996-02-06 1999-12-21 The Regents Of The University Of California Methods and apparatus for non-acoustic speech characterization and recognition

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2617299A1 (en) * 1987-06-24 1988-12-30 Pierre Williamson Self-contained and portable voice recognition system, for management of computer files
US5453687A (en) * 1993-01-12 1995-09-26 Zierdt; Andreas Method and a device to determine the spatial arrangement of a directionally sensitive magnetic field sensor
JPH0713582A (en) * 1993-06-21 1995-01-17 Toshiba Corp Portable speech recognition output assisting device
JPH0869297A (en) * 1994-08-30 1996-03-12 Aqueous Res:Kk Speech recognition device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104517608A (en) * 2013-09-30 2015-04-15 韦伯斯特生物官能(以色列)有限公司 Controlling a system using voiceless alaryngeal speech
CN104517608B (en) * 2013-09-30 2020-08-18 韦伯斯特生物官能(以色列)有限公司 Throat-free voice control system using unvoiced sound
CN105321519A (en) * 2014-07-28 2016-02-10 刘璟锋 Speech recognition system and unit
US10283120B2 (en) 2014-09-16 2019-05-07 The University Of Hull Method and apparatus for producing output indicative of the content of speech or mouthed speech from movement of speech articulators

Also Published As

Publication number Publication date
GB0500926D0 (en) 2005-02-23
WO2006075179A1 (en) 2006-07-20


Legal Events

REG: Reference to a national code (country: HK; legal event code: DE; document number: 1091924)
WAP: Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)
REG: Reference to a national code (country: HK; legal event code: WD; document number: 1091924)