USRE41602E1 - Digital camera with voice recognition annotation - Google Patents
- Publication number
- USRE41602E1 (application US11/392,923)
- Authority
- US
- United States
- Prior art keywords
- data
- voice
- digital camera
- text
- routines
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/82—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only
- H04N9/8205—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal
- H04N9/8233—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback the individual colour picture signal components being recorded simultaneously only involving the multiplexing of an additional signal and the colour video signal the additional signal being a character code signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/4223—Cameras
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4334—Recording operations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440236—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8146—Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics
- H04N21/8153—Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics comprising still images, e.g. texture, background image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/765—Interface circuits between an apparatus for recording and another apparatus
- H04N5/77—Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera
- H04N5/772—Interface circuits between an apparatus for recording and another apparatus between a recording apparatus and a television camera the recording apparatus and the television camera being placed in the same enclosure
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/804—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components
- H04N9/8042—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components involving data reduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/79—Processing of colour television signals in connection with recording
- H04N9/80—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback
- H04N9/804—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components
- H04N9/806—Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback involving pulse code modulation of the colour picture signal components with processing of the sound signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/32—Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
- H04N1/32101—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
- H04N1/32128—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title attached to the image data, e.g. file header, transmitted message header, information on the same page or in the same computer file as the image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2101/00—Still video cameras
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2201/00—Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
- H04N2201/0077—Types of the still picture apparatus
- H04N2201/0084—Digital still camera
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2201/00—Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
- H04N2201/32—Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
- H04N2201/3201—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
- H04N2201/3261—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of multimedia information, e.g. a sound signal
- H04N2201/3264—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of multimedia information, e.g. a sound signal of sound signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2201/00—Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
- H04N2201/32—Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
- H04N2201/3201—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
- H04N2201/3261—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of multimedia information, e.g. a sound signal
- H04N2201/3266—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of multimedia information, e.g. a sound signal of text or character information, e.g. text accompanying an image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N2201/00—Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
- H04N2201/32—Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
- H04N2201/3201—Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
- H04N2201/3274—Storage or retrieval of prestored additional information
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/78—Television signal recording using magnetic recording
- H04N5/781—Television signal recording using magnetic recording on disks or drums
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/76—Television signal recording
- H04N5/907—Television signal recording using static stores, e.g. storage tubes or semiconductor memories
Definitions
- the present invention relates to electronic photography, and in particular to a digital camera that translates recorded voice annotations to text annotations when external power is provided.
- Digital cameras have become popular for both professional and amateur photography. As digital cameras have become more popular, their sophistication has increased, allowing additional features. For example, some digital cameras allow the user to record voice annotations. However, when the pictures are printed, the voice annotations are lost, since recorded voice cannot be usefully displayed on a printed picture. A need arises for a way in which a voice annotation may be recorded when a picture is taken, but a text annotation is included with the picture when it is printed or transmitted.
- the present invention is a digital camera which allows voice annotations to be recorded for each picture, but which includes text annotations with each such picture when the picture is transmitted from the camera.
- the digital camera of the present invention includes an image sensing apparatus operable to receive light comprising an image and output image data representing the image, a first memory operable to store the image data, a sound sensing apparatus operable to receive a sound and output sound data representing the sound, wherein the sound is speech and the sound data is voice data, a second memory operable to store the voice data, a third memory operable to store text data; and a voice recognition apparatus operable to access the second memory, translate the stored voice data into text data and store the text data in the third memory, when the digital camera is provided with external power. Because the voice to text translation process is compute-intensive, and thus, power-consuming, the translation is deferred until external power is provided.
- the present invention may further include an I/O adapter operable to access the first memory and the third memory and transmit the stored image data and the stored text data, when the digital camera is communicatively connected to an external device.
- the image data represent a picture
- the recorded voice data represent a voice annotation associated with the picture
- the text data is a text annotation associated with the picture.
- the voice recognition apparatus includes a microprocessor operable to execute image capture routines, voice recording routines and voice recognition routines.
- the microprocessor may be further operable to execute data transfer routines.
- external power and communications connections are provided by a cradle assembly.
- FIG. 1 shows a digital camera system 100 , according to the present invention.
- FIG. 2 is an exemplary block diagram of a digital camera shown in FIG. 1 .
- FIG. 3 is a flow diagram of a process of operation of the system shown in FIG. 1 .
- FIG. 4 is an exemplary format of data stored in a memory shown in FIG. 2 .
- FIG. 5 is another exemplary format of data stored in a memory shown in FIG. 2 .
- A digital camera system 100, according to the present invention, is shown in FIG. 1.
- System 100 includes digital camera 102 and cradle assembly 104 .
- Cradle assembly 104 includes cradle 106 , which receives camera 102 , allowing attachment of the cradle to the camera.
- Cradle assembly 104 includes power connector 108 and data connector 110 , which provide power and data connections to camera 102 during the recharging, data transfer and voice recognition processes.
- Power is supplied to power connector 108 by power supply 112 via power cable 114 .
- Power supply 112 may be a wall-mounted device, an automotive power adapter, or a battery-powered device.
- Data may be transferred via data cable 116 , which connects to data connector 110 , and which provides communicative connection to an external device, such as a personal computer 119 , or to a communication device, such as wireless system 120 , cable modem 122 , asymmetric digital subscriber line (ADSL) modem 124 , local area network interface device 126 , integrated services digital network (ISDN) interface device 128 , or voice line modem 130 .
- Wireless system 120 includes a modem and wireless transceiver communicatively connected to a wireless network. The recharging, data transfer and voice recognition processes are performed when the camera is returned to the cradle after pictures are taken and voice annotations are recorded.
- communication devices 120 - 130 provide direct access to destination computer system or server 132 over the Internet 134 .
- communication devices 120 - 130 provide access to an intermediate system 136 .
- the intermediate system may be a server or other computer system and is used to improve the convenience and speed of data transfers from camera 102 .
- cradle 106 may not be used. Rather, power connector 108 and data connector 110 may be directly attached to camera 102 .
- the connectors may be attached separately or combined in a single assembly.
- a digital camera 102 is shown in FIG. 2 .
- Digital camera 102 includes an image sensing apparatus 201 , which receives light comprising an image and outputs digital image data representing the image.
- Image sensing apparatus 201 typically includes a lens 202 , which focuses the image onto image sensor 204 .
- Image sensor 204, which is typically a charge-coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) device, outputs a signal representing the image to A/D converter 206, which digitizes the signal and outputs the digital image data to microprocessor 208.
- Digital camera 102 also includes sound sensing apparatus 209, which receives sounds, such as speech, and outputs digital sound data representing the sound.
- Microphone 210 senses sounds, typically spoken words, and outputs a signal representing the sensed sounds to A/D converter 212 , which digitizes the signal and outputs the digital sound data to microprocessor 208 .
- Microprocessor 208 stores the digital image and sound data in memory 214 .
- Memory 214 is typically semiconductor memory, such as RAM or flash memory. Memory 214 may be built into camera 102, or it may be removable and non-volatile, such as a flash memory card. It may also be disk storage, such as a floppy disk or other removable media drive, or a hard drive in or attached to digital camera 102.
- Digital camera 102 includes I/O adapter 216 , which includes connector 217 , for transferring data into or out of the camera via data connector 110 and data cable 116 .
- Digital camera 102 also includes power supply 218 , which includes a battery, regulating and recharging circuitry and connector 219 . This allows digital camera 102 to be powered by power supply 112 via power cable 114 and power connector 108 .
- Other well-known components, such as viewfinder, shutter switch, etc., are not shown.
- Microprocessor 208 stores image data for each picture taken in image data block 220 in memory 214 .
- the image data in block 220 is typically compressed to save memory space.
- Microprocessor 208 stores the recorded voice (speech) data associated with each stored image in recorded voice data block 222 .
- the recorded voice data is also compressed.
- Text data associated with each stored image is also stored in memory 214 in recognized text annotation data block 223 .
- the stored text data is generated by performing voice recognition on the recorded voice data, as described below.
- any sound may be recorded and stored by digital camera 102 , not just speech.
- the recorded sound will be stored in memory 214 in recorded voice data block 222 .
- the recorded sound will be treated as recorded voice data and voice recognition will be attempted on the recorded sound. In this situation, voice recognition will fail, causing digital camera 102 to recognize that the recorded sound is not voice data.
- the recorded sound will then be treated not as voice data, but simply as recorded sound data.
- the voice recognition is performed by voice recognition unit 224 using voice recognition data 225 .
- voice recognition is performed using a digital signal processor (DSP).
- voice recognition unit 224 is not used and voice recognition is performed by microprocessor 208 executing voice recognition routines 226 , using voice recognition data 225 . This embodiment does not provide real-time recognition, but saves the expense of voice recognition unit 224 .
- the output of the voice recognition process is text data, which is stored in recognized text annotation data block 223 .
- Digital camera 102 also includes software routines which are executed by microprocessor 208 .
- Image/voice capture routines 228 control the process of taking digital photographs, recording voice annotations, and compressing and storing the data in image data block 220 and recorded voice data block 222.
- Voice recognition routines 226 control the process of recognizing the voice annotations stored in recorded voice data block 222, generating text annotations and storing them in recognized text annotation data block 223.
- Data transfer routines 230 control the process of transferring data from digital camera 102 .
- Voice recognition data 225 is typically stored in RAM built-in to digital camera 102 . However, voice recognition data 225 may be stored in removable memory, so that the camera may be customized to recognize particular voices or languages.
- Software routines 226 - 230 are typically stored in nonvolatile memory, such as ROM or flash memory.
- Digital camera system 100 is operated as shown in FIG. 3 .
- the camera is removed from cradle 106 .
- the camera is used to take one or more pictures and to record one or more voice annotations.
- Microprocessor 208 executes image/voice capture routines 228 in order to take each picture, compress the image data, and store the image data in image data block 220 in memory 214 .
- microprocessor 208 executes image/voice capture routines 228 in order to record each voice annotation, compress the voice data, and store the voice data in recorded voice data block 222 in memory 214 .
- Camera 102 may be used to take pictures and record voice annotations until the completion of a picture-taking session.
- a picture-taking session may be completed because memory 214 has become full, because the battery charge has become low, or because the user has taken the desired pictures.
- camera 102 is placed in cradle 106 , which causes attachment of both power connector 108 and data connector 110 to camera 102 . If cradle 106 is not used, then, at a minimum, power connector 108 must be attached to camera 102 . Typically, data connector 110 is also connected at this time, but that is not required.
- Microprocessor 208 detects that camera 102 has been provided with external power. The detection may be accomplished by any well-known technique. For example, power supply circuitry 218 may detect the presence of external power on power connector 219 and signal microprocessor 208 . Other well-known techniques may also be used.
- Upon detecting that camera 102 has been provided with external power, in step 308, microprocessor 208 executes voice recognition routines 226 in order to translate the stored voice annotations to text.
- the details of the voice recognition routines depend upon the embodiment of digital camera.
- microprocessor 208 signals unit 224 to begin voice recognition.
- Voice recognition unit 224 then translates the stored voice annotations to text using voice recognition data 225 and stores the recognized text in block 223 .
- voice recognition unit 224 signals completion to microprocessor 208 .
- voice recognition routines 226 include code that causes microprocessor 208 to itself perform the translation of the stored voice annotations to text using voice recognition data 225.
- Microprocessor 208 also stores the recognized text in block 223.
- microprocessor 208 transfers the stored image and text data to an attached device via data cable 116, if data connector 110 is attached to camera 102. If data connector 110 is not attached, camera 102 can store the image and text data for later transfer. Alternatively, if memory 214 is removable, the image and text data may be transferred by removing memory 214.
- the attached device is typically a personal computer or workstation, but may be a local or wide-area network, a server, a mainframe or mini-computer, a communication device, etc.
- Voice recognition annotation may be further enhanced by combination with information that modifies the associated annotation.
- the modifying information may be specified by the user of the camera by manipulating a menu displayed by the camera or by speaking keywords that are recognized as such by the camera.
- an annotation may be specified as being a description of the picture associated with the annotation, the name of the place depicted, the time the picture was taken, the names of persons depicted, etc.
- the user may enter information specifying the name, address, e-mail address, etc. of a recipient for each picture or group of pictures.
- the user may likewise enter different description, place, name, etc. information for each recipient of each picture or group of pictures.
- An exemplary format of data stored in memory 214 is shown in FIG. 4.
- the image data from each picture taken is stored as a block of image data.
- the image data from picture 1 is stored in block 402
- the image data from picture N is stored in block 404 .
- All blocks of image data 402 - 404 are stored contiguously.
- the recorded voice data associated with each picture taken is stored as a block of recorded voice data.
- the recorded voice data from the voice annotation associated with picture 1 is stored in block 406
- the recorded voice data from the voice annotation associated with picture N is stored in block 408 . All blocks of recorded voice data 406 - 408 are stored contiguously.
- the translated text annotation data associated with each picture taken is stored as a block of text data.
- the translated text annotation data associated with picture 1 is stored in block 410
- the translated text annotation data associated with picture N is stored in block 412 . All blocks of translated text annotation data 410 - 412 are stored contiguously.
- Another exemplary format of data stored in memory 214 is shown in FIG. 5.
- the image data from each picture, the recorded voice data associated with each picture and the translated text annotation data associated with each picture are each stored as blocks of data.
- the image data from picture 1 is stored as block 502
- the recorded voice data associated with picture 1 is stored as block 504
- the translated text data associated with picture 1 is stored as block 506 .
- the image data from a picture is stored contiguously with the recorded voice data and the translated text data associated with the picture.
- blocks 502, 504 and 506, which are all associated with picture 1, are stored contiguously.
- blocks 508, 510 and 512, which are all associated with picture N, are likewise stored contiguously.
- FIGS. 4 and 5 are only two examples of data storage formats that may be used. Any other format that maintains the association among the image data, the recorded voice data and the translated text data may be used as well. For example, a well-known file system may be used.
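The two storage formats of FIGS. 4 and 5 can be contrasted with a small sketch. This is an illustration only, not the patent's implementation: the names `PictureRecord`, `layout_grouped`, `layout_interleaved` and `blocks_for_picture` are hypothetical, and each block is modeled as a (kind, picture number, payload) tuple rather than as raw memory.

```python
from typing import Dict, List, NamedTuple, Tuple

# Hypothetical block model: (kind, picture number, payload).
Block = Tuple[str, int, bytes]


class PictureRecord(NamedTuple):
    image: bytes   # image data for one picture
    voice: bytes   # recorded voice annotation for that picture
    text: bytes    # translated text annotation for that picture


def layout_grouped(records: List[PictureRecord]) -> List[Block]:
    """FIG. 4 style: all image blocks, then all voice blocks, then all text blocks."""
    blocks: List[Block] = []
    for kind in ("image", "voice", "text"):
        for number, record in enumerate(records, start=1):
            blocks.append((kind, number, getattr(record, kind)))
    return blocks


def layout_interleaved(records: List[PictureRecord]) -> List[Block]:
    """FIG. 5 style: the image, voice and text blocks of each picture are adjacent."""
    blocks: List[Block] = []
    for number, record in enumerate(records, start=1):
        for kind in ("image", "voice", "text"):
            blocks.append((kind, number, getattr(record, kind)))
    return blocks


def blocks_for_picture(blocks: List[Block], number: int) -> Dict[str, bytes]:
    """Recover one picture's blocks; works for either layout."""
    return {kind: payload for kind, n, payload in blocks if n == number}


records = [
    PictureRecord(b"img1", b"voice1", b"text1"),
    PictureRecord(b"img2", b"voice2", b"text2"),
]
grouped = layout_grouped(records)
interleaved = layout_interleaved(records)
```

Because every block carries its picture number, `blocks_for_picture` returns the same image, voice and text data from either layout, which is the association the text says any alternative format must preserve.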
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Graphics (AREA)
- Studio Devices (AREA)
Abstract
A digital camera which allows voice annotations to be recorded for each picture, but which includes text annotations with each such picture when the picture is transmitted from the camera. The digital camera includes an image sensing apparatus operable to receive light comprising an image and output image data representing the image, a first memory operable to store the image data, a sound sensing apparatus operable to receive a sound and output sound data representing the sound, wherein the sound is speech and the sound data is voice data, a second memory operable to store the voice data, a third memory operable to store text data; and a voice recognition apparatus operable to access the second memory, translate the stored voice data into text data and store the text data in the third memory, when the digital camera is provided with external power. In one embodiment, the voice recognition apparatus includes a microprocessor operable to execute image capture routines, voice recording routines and voice recognition routines. The microprocessor may be further operable to execute data transfer routines.
Description
The present invention relates to electronic photography, and in particular to a digital camera that translates recorded voice annotations to text annotations when external power is provided.
Digital cameras have become popular for both professional and amateur photography. As digital cameras have become more popular, their sophistication has increased, allowing additional features. For example, some digital cameras allow the user to record voice annotations. However, when the pictures are printed, the voice annotations are lost, since recorded voice cannot be usefully displayed on a printed picture. A need arises for a way in which a voice annotation may be recorded when a picture is taken, but a text annotation is included with the picture when it is printed or transmitted.
The present invention is a digital camera which allows voice annotations to be recorded for each picture, but which includes text annotations with each such picture when the picture is transmitted from the camera. The digital camera of the present invention includes an image sensing apparatus operable to receive light comprising an image and output image data representing the image, a first memory operable to store the image data, a sound sensing apparatus operable to receive a sound and output sound data representing the sound, wherein the sound is speech and the sound data is voice data, a second memory operable to store the voice data, a third memory operable to store text data; and a voice recognition apparatus operable to access the second memory, translate the stored voice data into text data and store the text data in the third memory, when the digital camera is provided with external power. Because the voice to text translation process is compute-intensive, and thus, power-consuming, the translation is deferred until external power is provided.
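The deferral described above, recording voice on battery and translating only once external power is present, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Camera` and `Picture` classes, the `on_external_power` method and the `fake_recognizer` placeholder are all hypothetical names, and the placeholder merely decodes bytes instead of performing speech recognition.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Picture:
    image_data: bytes                   # held in the "first memory"
    voice_data: Optional[bytes] = None  # held in the "second memory"
    text: Optional[str] = None          # held in the "third memory" once translated


@dataclass
class Camera:
    # recognize is a hypothetical stand-in for the voice recognition apparatus.
    recognize: Callable[[bytes], str]
    pictures: List[Picture] = field(default_factory=list)

    def capture(self, image_data: bytes, voice_data: Optional[bytes] = None) -> Picture:
        """Store a picture and its voice annotation; no recognition runs on battery."""
        picture = Picture(image_data, voice_data)
        self.pictures.append(picture)
        return picture

    def on_external_power(self) -> int:
        """Translate every pending voice annotation; return how many were translated."""
        translated = 0
        for picture in self.pictures:
            if picture.voice_data is not None and picture.text is None:
                picture.text = self.recognize(picture.voice_data)
                translated += 1
        return translated


def fake_recognizer(voice_data: bytes) -> str:
    # Placeholder only: decodes bytes as text rather than recognizing speech.
    return voice_data.decode("utf-8")


camera = Camera(recognize=fake_recognizer)
camera.capture(b"<image 1>", b"Sunset at the pier")
camera.capture(b"<image 2>")                 # picture with no voice annotation
text_before = [p.text for p in camera.pictures]
count = camera.on_external_power()           # e.g. camera placed in the cradle
```

In this sketch the costly step lives entirely in `on_external_power`, so nothing compute-intensive happens during `capture`; the voice data simply waits in storage until power is available.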
The present invention may further include an I/O adapter operable to access the first memory and the third memory and transmit the stored image data and the stored text data, when the digital camera is communicatively connected to an external device.
It is preferred that the image data represent a picture, the recorded voice data represent a voice annotation associated with the picture, and the text data be a text annotation associated with the picture.
In one embodiment, the voice recognition apparatus includes a microprocessor operable to execute image capture routines, voice recording routines and voice recognition routines. The microprocessor may be further operable to execute data transfer routines.
In one embodiment, external power and communications connections are provided by a cradle assembly.
The details of the present invention, both as to its structure and operation, can best be understood by referring to the accompanying drawings, in which like reference numbers and designations refer to like elements.
A digital camera system 100, according to the present invention, is shown in FIG. 1. System 100 includes digital camera 102 and cradle assembly 104. Cradle assembly 104 includes cradle 106, which receives camera 102, allowing attachment of the cradle to the camera. Cradle assembly 104 includes power connector 108 and data connector 110, which provide power and data connections to camera 102 during the recharging, data transfer and voice recognition processes. Power is supplied to power connector 108 by power supply 112 via power cable 114. Power supply 112 may be a wall-mounted device, an automotive power adapter, or a battery-powered device. Data may be transferred via data cable 116, which connects to data connector 110, and which provides communicative connection to an external device, such as a personal computer 119, or to a communication device, such as wireless system 120, cable modem 122, asymmetric digital subscriber line (ADSL) modem 124, local area network interface device 126, integrated services digital network (ISDN) interface device 128, or voice line modem 130. Wireless system 120 includes a modem and wireless transceiver communicatively connected to a wireless network. The recharging, data transfer and voice recognition processes are performed when the camera is returned to the cradle after pictures are taken and voice annotations are recorded.
In one embodiment, communication devices 120-130 provide direct access to destination computer system or server 132 over the Internet 134. In another embodiment, communication devices 120-130 provide access to an intermediate system 136. The intermediate system may be a server or other computer system and is used to improve the convenience and speed of data transfers from camera 102.
Alternatively, cradle 106 may not be used. Rather, power connector 108 and data connector 110 may be directly attached to camera 102. The connectors may be attached separately or combined in a single assembly.
A digital camera 102, according to the present invention, is shown in FIG. 2. Digital camera 102 includes an image sensing apparatus 201, which receives light comprising an image and outputs digital image data representing the image. Image sensing apparatus 201 typically includes a lens 202, which focuses the image onto image sensor 204. Image sensor 204, which is typically a charge-coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) device, outputs a signal representing the image to A/D converter 206, which converts it to digital image data by digitizing the signal, and outputs the digital image data to microprocessor 208. Digital camera 102 also includes sound sensing apparatus 209, which receives sounds, such as speech, and outputs digital sound data representing the sound. Microphone 210 senses sounds, typically spoken words, and outputs a signal representing the sensed sounds to A/D converter 212, which digitizes the signal and outputs the digital sound data to microprocessor 208. Microprocessor 208 stores the digital image and sound data in memory 214. Memory 214 is typically semiconductor memory, such as RAM or flash memory. Memory 214 may be built into camera 102, or it may be removable and non-volatile, such as a flash memory card. Memory 214 may also be disk storage, such as a floppy disk or other removable media drive, or a hard drive in or attached to digital camera 102.
It will be seen that any sound may be recorded and stored by digital camera 102, not just speech. The recorded sound will be stored in memory 214 in recorded voice data block 222. The recorded sound will initially be treated as recorded voice data, and voice recognition will be attempted on it. If the recorded sound is not speech, voice recognition will fail, causing digital camera 102 to recognize that the recorded sound is not voice data. The recorded sound will then be treated not as voice data, but simply as recorded sound data.
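The fallback described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the `Recording` type, the `classify` function and the `toy_recognizer` stand-in are all hypothetical names, and the toy recognizer simply treats ASCII-decodable bytes as "speech" so the example is runnable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recording:
    """Hypothetical record for one stored sound (compare block 222)."""
    samples: bytes
    kind: str = "voice"          # every recording starts out treated as voice data
    text: Optional[str] = None   # filled in only if recognition succeeds


def classify(rec: Recording,
             recognize: Callable[[bytes], Optional[str]]) -> Recording:
    """Attempt recognition; on failure, reclassify as plain sound data."""
    result = recognize(rec.samples)
    if result is None:
        rec.kind = "sound"       # recognition failed: not voice data
    else:
        rec.text = result        # recognition succeeded: keep the text
    return rec


def toy_recognizer(samples: bytes) -> Optional[str]:
    """Stand-in recognizer (assumption): succeeds only on ASCII bytes."""
    try:
        decoded = samples.decode("ascii")
    except UnicodeDecodeError:
        return None
    return decoded if decoded.strip() else None
```

A recording whose recognition fails keeps its samples but loses its voice classification, so it can still be stored or transferred as ordinary sound data.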
In one embodiment, the voice recognition is performed by voice recognition unit 224 using voice recognition data 225. Typically, voice recognition is performed using a digital signal processor (DSP). Use of a DSP allows real-time or near-real-time recognition, at significant expense. However, real-time voice recognition is not necessary in the present invention, since recognition is not performed until the camera has been returned to the cradle. Thus, in another embodiment of the present invention, voice recognition unit 224 is not used and voice recognition is performed by microprocessor 208 executing voice recognition routines 226, using voice recognition data 225. This embodiment does not provide real-time recognition, but saves the expense of voice recognition unit 224.
The output of the voice recognition process is text data, which is stored in recognized text annotation data block 223.
Upon detecting that camera 102 has been provided with external power, in step 308, microprocessor 208 executes voice recognition routines 226 in order to translate the stored voice annotations to text. The details of the voice recognition routines depend upon the embodiment of the digital camera. In an embodiment that includes voice recognition unit 224, microprocessor 208 signals unit 224 to begin voice recognition. Voice recognition unit 224 then translates the stored voice annotations to text using voice recognition data 225 and stores the recognized text in block 223. When voice recognition is completed, voice recognition unit 224 signals completion to microprocessor 208.
In an embodiment that does not include voice recognition unit 224, voice recognition routines 226 include code that causes microprocessor 208 to itself perform the translation of the stored voice annotations to text using voice recognition data 225. Microprocessor 208 also stores the recognized text in block 223.
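The deferral of voice recognition until external power is available can be illustrated with a short sketch. The class and method names below are hypothetical, and the `recognize` callable is an assumed stand-in for either voice recognition unit 224 or voice recognition routines 226; the sketch shows only the control flow, not any actual recognition algorithm.

```python
from typing import Callable, Dict


class CameraSketch:
    """Hypothetical model: voice is stored on battery, translated on external power."""

    def __init__(self, recognize: Callable[[bytes], str]) -> None:
        self.recognize = recognize                 # stand-in for unit 224 or routines 226
        self.voice_blocks: Dict[int, bytes] = {}   # recorded voice data (compare block 222)
        self.text_blocks: Dict[int, str] = {}      # recognized text (compare block 223)
        self.external_power = False

    def record_annotation(self, picture_id: int, voice: bytes) -> None:
        # On battery power the camera only stores the voice data.
        self.voice_blocks[picture_id] = voice

    def set_external_power(self, present: bool) -> int:
        """Update the power state; run the deferred translation when power arrives."""
        self.external_power = present
        return self.translate_pending() if present else 0

    def translate_pending(self) -> int:
        """Translate every not-yet-translated annotation; return how many were done."""
        if not self.external_power:
            return 0                               # compute-intensive work is deferred
        done = 0
        for picture_id, voice in self.voice_blocks.items():
            if picture_id not in self.text_blocks:
                self.text_blocks[picture_id] = self.recognize(voice)
                done += 1
        return done
```

In this sketch, calling `translate_pending()` on battery power does nothing, while `set_external_power(True)` triggers translation of all stored annotations, mirroring the behavior described for step 308.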
When voice recognition is completed, in step 310, microprocessor 208 transfers the stored image and text data to an attached device via data cable 116, if data connector 110 is attached to camera 102. If data connector 110 is not attached, camera 102 can store the image and text data for later transfer. Alternatively, if memory 214 is removable, the image and text data may be transferred by removing memory 214. The attached device is typically a personal computer or workstation, but may be a local or wide-area network, a server, a mainframe or mini-computer, a communication device, etc.
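The transfer decision in step 310 can be sketched as a small function. The names `transfer_step`, `Item` and `send` are hypothetical and introduced only for illustration; the `send` callable is an assumed stand-in for whatever data transfer routines move data over the data connector.

```python
from typing import Callable, List, Tuple

Item = Tuple[bytes, str]  # (image data, text annotation) for one picture


def transfer_step(pending: List[Item],
                  connector_attached: bool,
                  send: Callable[[Item], None]) -> List[Item]:
    """Send everything if the data connector is attached; otherwise keep it.

    Returns the items that remain stored in the camera for later transfer.
    """
    if not connector_attached:
        return list(pending)      # nothing sent; data stays in memory
    for item in pending:
        send(item)                # hand each image/text pair to the attached device
    return []                     # everything was transferred
```

The function returns the items still awaiting transfer, so a caller can retry the same step the next time a connection is detected.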
Voice recognition annotation may be further enhanced by combination with information that modifies the associated annotation. The modifying information may be specified by the user of the camera by manipulating a menu displayed by the camera or by speaking keywords that are recognized as such by the camera. For example, an annotation may be specified as being a description of the picture associated with the annotation, the name of the place depicted, the time the picture was taken, the names of persons depicted, etc. The user may enter information specifying the name, address, e-mail address, etc. of a recipient for each picture or group of pictures. The user may likewise enter different description, place, name, etc. information for each recipient of each picture or group of pictures.
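One way spoken keywords might mark the kind of annotation is sketched below. The keyword vocabulary, the convention that the keyword is the first recognized word, and the function name `tag_annotation` are all assumptions made for illustration; the text above does not specify how keywords are encoded.

```python
from typing import Tuple

# Hypothetical keyword vocabulary, loosely based on the examples in the text.
KEYWORDS = ("description", "place", "time", "names", "recipient")


def tag_annotation(text: str, default: str = "description") -> Tuple[str, str]:
    """Split recognized text into (annotation kind, annotation body).

    If the first word is a known keyword, it selects the kind;
    otherwise the whole text is kept under the default kind.
    """
    stripped = text.strip()
    first, _, rest = stripped.partition(" ")
    if first.lower() in KEYWORDS and rest.strip():
        return first.lower(), rest.strip()
    return default, stripped
```

For example, the recognized text "place Grand Canyon" would be tagged as a place annotation with the body "Grand Canyon", while text with no leading keyword falls back to a plain description.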
An exemplary format of data stored in memory 214 is shown in FIG. 4. In this example, the image data from each picture taken is stored as a block of image data. For example, the image data from picture 1 is stored in block 402, and the image data from picture N is stored in block 404. All blocks of image data 402-404 are stored contiguously. The recorded voice data associated with each picture taken is stored as a block of recorded voice data. For example, the recorded voice data from the voice annotation associated with picture 1 is stored in block 406, and the recorded voice data from the voice annotation associated with picture N is stored in block 408. All blocks of recorded voice data 406-408 are stored contiguously. The translated text annotation data associated with each picture taken is stored as a block of text data. For example, the translated text annotation data associated with picture 1 is stored in block 410, and the translated text annotation data associated with picture N is stored in block 412. All blocks of translated text annotation data 410-412 are stored contiguously.
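The grouped arrangement of FIG. 4 can be sketched as a packing routine that writes all image blocks, then all voice blocks, then all text blocks. The function name `pack_grouped` and the offset index it returns are hypothetical additions for illustration; the figure describes only the ordering of the blocks.

```python
from typing import Dict, List, Sequence, Tuple

Picture = Tuple[bytes, bytes, bytes]  # (image data, voice data, text data)


def pack_grouped(pictures: Sequence[Picture]) -> Tuple[bytes, Dict[str, List[Tuple[int, int]]]]:
    """Lay out memory by data type: images 1..N, then voice 1..N, then text 1..N.

    Returns the packed bytes and, per data type, a list of (offset, length)
    entries, one entry per picture.
    """
    buf = bytearray()
    index: Dict[str, List[Tuple[int, int]]] = {"image": [], "voice": [], "text": []}
    for field, kind in enumerate(("image", "voice", "text")):
        for picture in pictures:
            block = picture[field]
            index[kind].append((len(buf), len(block)))  # where this block starts
            buf += block
    return bytes(buf), index
```

With two pictures, the packed result places both image blocks first, which keeps each data type contiguous at the cost of separating the data that belongs to one picture.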
Another exemplary format of data stored in memory 214 is shown in FIG. 5. As in FIG. 4, the image data from each picture, the recorded voice data associated with each picture and the translated text annotation data associated with each picture are each stored as blocks of data. For example, the image data from picture 1 is stored as block 502, the recorded voice data associated with picture 1 is stored as block 504 and the translated text data associated with picture 1 is stored as block 506. However, in this example, the image data from a picture is stored contiguously with the recorded voice data and the translated text data associated with the picture. Thus, blocks 502, 504 and 506, which are all associated with picture 1, are stored contiguously. Likewise, blocks 508, 510 and 512, which are all associated with picture N, are stored contiguously.
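The per-picture arrangement of FIG. 5 can be sketched in the same way, with each picture's image, voice and text blocks written back to back. The function name `pack_per_picture` and the offset index are hypothetical and serve only to make the ordering concrete.

```python
from typing import Dict, List, Sequence, Tuple

Picture = Tuple[bytes, bytes, bytes]  # (image data, voice data, text data)


def pack_per_picture(pictures: Sequence[Picture]) -> Tuple[bytes, List[Dict[str, Tuple[int, int]]]]:
    """Lay out memory by picture: image, voice and text of each picture are contiguous.

    Returns the packed bytes and, per picture, a mapping from data type
    to its (offset, length) in the packed bytes.
    """
    buf = bytearray()
    index: List[Dict[str, Tuple[int, int]]] = []
    for picture in pictures:
        entry: Dict[str, Tuple[int, int]] = {}
        for kind, block in zip(("image", "voice", "text"), picture):
            entry[kind] = (len(buf), len(block))  # where this block starts
            buf += block
        index.append(entry)
    return bytes(buf), index
```

This ordering keeps everything belonging to one picture in a single contiguous region, which may simplify transferring or deleting a picture together with its annotations.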
Although specific embodiments of the present invention have been described, it will be understood by those of skill in the art that there are other embodiments that are equivalent to the described embodiments. Accordingly, it is to be understood that the invention is not to be limited by the specific illustrated embodiments, but only by the scope of the appended claims.
Claims (27)
1. A digital camera comprising:
an image sensing apparatus operable to receive light comprising an image and output digital image data representing the image as a picture;
a digital memory including first, second, third, and fourth storage areas within the memory;
digital image data stored in the first storage area of the digital memory;
a sound sensing apparatus operable to receive a sound and output sound data representing the sound, wherein the sound is speech and the sound data is voice data;
voice data stored in the second storage area of the digital memory;
text data stored in the third storage area of the digital memory;
a voice recognition apparatus operable to access the second storage area, translate the stored voice data into text data and store the text data in the third storage area, when the digital camera is provided with external power; and
image, voice and text data of a picture stored in contiguous locations in the fourth storage area of the digital memory.
2. The digital camera of claim 1, further comprising
an I/O adapter operable to access the first memory and the third memory and transmit the stored image data and the stored text data, when the digital camera is communicatively connected to an external device.
3. The digital camera of claim 1, wherein the image data represents a picture, the voice data represents a voice annotation associated with the picture, and the text data is a text annotation associated with the picture.
4. The digital camera of claim 3, further comprising information that modifies the text annotation.
5. The digital camera of claim 1, further comprising:
a microprocessor within the camera programmed to perform image capture routines, voice recording routines, voice recognition routines and text routines within the microprocessor.
6. The digital camera of claim 5, wherein the microprocessor is further operable to execute data transfer routines.
7. The digital camera of claim 1, wherein external power and communications connections are provided by a cradle assembly for recharging, initiating voice recognition processes and connections to external networks and systems.
8. A method of operating a digital camera comprising the steps of:
receiving light comprising an image and outputting digital image data representing the image;
storing the image data as a picture in a first storage area of a digital memory;
receiving a sound and outputting sound data representing the sound, wherein the sound is speech and the sound data is voice data;
storing the voice data in a second storage area of the digital memory;
translating the stored voice data into text data, when the digital camera is supplied with external power;
storing the text data in a third storage area of the digital memory; and
storing the image, voice and text data of each picture in contiguous locations in a fourth storage area of the digital memory.
9. The method of claim 8, further comprising the step of:
transmitting the stored image data and the stored text data, when the digital camera is communicatively connected to an external device.
10. The method of claim 8, wherein the image data represents a picture, the voice data represents a voice annotation associated with the picture, and the text data is a text annotation associated with the picture.
11. The method of claim 10, further comprising information that modifies the text annotation.
12. The method of claim 8 further comprising:
performing in a microprocessor within the camera image capture routines, voice recording routines, voice recognition routines and text routines programmed within the microprocessor.
13. The method of claim 12, wherein the microprocessor is further operable to execute data transfer routines.
14. The method of claim 8, further comprising the step of:
providing external power and communications connections with a cradle assembly for recharging, initiating voice recognition processes and connections to external networks and systems.
15. A digital camera comprising:
means for receiving light comprising an image and outputting digital image data representing the image as a picture;
a digital memory having first, second, third and fourth storage areas within the digital memory
means for storing the image data in the first storage area of the digital memory;
means for receiving a sound and outputting sound data representing the sound, wherein the sound is speech and the sound data is voice data;
means for storing the voice data in the second storage area of the digital memory;
means for translating the stored recorded voice data into text data, when the digital camera is supplied with external power;
means for storing text data in the third storage area of the digital memory; and
means for storing image, voice and text data of each picture in contiguous locations in the fourth storage area of the digital memory.
16. The digital camera of claim 15, further comprising:
means for transmitting the stored image data and the stored text data, when the digital camera is communicatively connected to an external device.
17. The digital camera of claim 15, wherein the image data represents a picture, the voice data represents a voice annotation associated with the picture, and the text data is a text annotation associated with the picture.
18. The digital camera of claim 17, further comprising information that modifies the text annotation.
19. The digital camera of claim 15 comprising:
a microprocessor within the camera programmed to perform image capture routines, voice recording routines, voice recognition routines and text routines within the microprocessor.
20. The digital camera of claim 19, wherein the microprocessor is further operable to execute data transfer routines.
21. The digital camera of claim 15, further comprising:
means for providing external power and communications for recharging, initiating voice recognition processes and connections to external networks and systems.
22. The digital camera of claim 1, wherein the voice recognition apparatus is operable to access the second storage area, translate the stored voice data into text data and store the text data in the third storage area when the digital camera is provided with external power.
23. The digital camera of claim 5, further comprising a ROM or flash memory for storing the image capture routines, voice recording routines, and text routines.
24. The method of claim 8, wherein the stored voice data is translated into text data when the digital camera is supplied with external power.
25. The method of claim 12, further comprising storing the image capture routines, voice recording routines, and text routines in a ROM or flash memory.
26. The digital camera of claim 15, wherein the means for translating translates the stored voice data into text data when the digital camera is provided with external power.
27. The digital camera of claim 19, further comprising a means for storing the image capture routines, voice recording routines, and text routines.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/392,923 USRE41602E1 (en) | 1998-12-16 | 2006-03-28 | Digital camera with voice recognition annotation |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/213,313 US6721001B1 (en) | 1998-12-16 | 1998-12-16 | Digital camera with voice recognition annotation |
US11/392,923 USRE41602E1 (en) | 1998-12-16 | 2006-03-28 | Digital camera with voice recognition annotation |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/213,313 Reissue US6721001B1 (en) | 1998-12-16 | 1998-12-16 | Digital camera with voice recognition annotation |
Publications (1)
Publication Number | Publication Date |
---|---|
USRE41602E1 (en) | 2010-08-31 |
Family
ID=22794600
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/213,313 Ceased US6721001B1 (en) | 1998-12-16 | 1998-12-16 | Digital camera with voice recognition annotation |
US11/392,923 Expired - Lifetime USRE41602E1 (en) | 1998-12-16 | 2006-03-28 | Digital camera with voice recognition annotation |
Country Status (2)
Country | Link |
---|---|
US (2) | US6721001B1 (en) |
JP (1) | JP3272336B2 (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5546145A (en) * | 1994-08-30 | 1996-08-13 | Eastman Kodak Company | Camera on-board voice recognition |
US5602458A (en) * | 1993-08-16 | 1997-02-11 | Eastman Kodak Company | Rechargeable camera having operational inhibit of a flash unit power storage circuit during recharging |
US5692225A (en) * | 1994-08-30 | 1997-11-25 | Eastman Kodak Company | Voice recognition of recorded messages for photographic printers |
US5737491A (en) * | 1996-06-28 | 1998-04-07 | Eastman Kodak Company | Electronic imaging system capable of image capture, local wireless transmission and voice recognition |
US5940121A (en) * | 1997-02-20 | 1999-08-17 | Eastman Kodak Company | Hybrid camera system with electronic album control |
US6031526A (en) * | 1996-08-08 | 2000-02-29 | Apollo Camera, Llc | Voice controlled medical text and image reporting system |
US6084630A (en) * | 1991-03-13 | 2000-07-04 | Canon Kabushiki Kaisha | Multimode and audio data compression |
US6181883B1 (en) * | 1997-06-20 | 2001-01-30 | Picostar, Llc | Dual purpose camera for VSC with conventional film and digital image capture modules |
US6469738B1 (en) * | 1997-02-26 | 2002-10-22 | Sanyo Electric Co., Ltd. | Frames allowable to be shot in a digital still camera |
Applications by Filing Date (3)
Filing Date | Application Number | Publication | Status |
---|---|---|---|
1998-12-16 | US09/213,313 | US6721001B1 (en) | Ceased |
1999-12-09 | JP34971299A | JP3272336B2 (en) | Expired - Fee Related |
2006-03-28 | US11/392,923 | USRE41602E1 (en) | Expired - Lifetime |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8462231B2 (en) | 2011-03-14 | 2013-06-11 | Mark E. Nusbaum | Digital camera with real-time picture identification functionality |
Also Published As
Publication number | Publication date |
---|---|
JP3272336B2 (en) | 2002-04-08 |
JP2000184258A (en) | 2000-06-30 |
US6721001B1 (en) | 2004-04-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
USRE41602E1 (en) | | Digital camera with voice recognition annotation |
US7831598B2 (en) | | Data recording and reproducing apparatus and method of generating metadata |
US20070236583A1 (en) | | Automated creation of filenames for digital image files using speech-to-text conversion |
JP2000232599A (en) | | Digital camera, operating method, recording medium read by computer, computer system and automatic digital photograph |
US8462231B2 (en) | | Digital camera with real-time picture identification functionality |
US20040119837A1 (en) | | Image pickup apparatus |
JP2005202651A (en) | | Information processing apparatus, information processing method, recording medium with program recorded thereon, and control program |
JP2006270263A (en) | | Photographing system |
JP2004147325A (en) | | System and method for associating information with captured image |
US6804652B1 (en) | | Method and apparatus for adding captions to photographs |
JP2005346259A (en) | | Information processing device and information processing method |
US6950128B1 (en) | | Information storage medium with a rotatably mounted camera |
JP2007199908A (en) | | Emoticon input apparatus |
JP5246592B2 (en) | | Information processing terminal, information processing method, and information processing program |
US7460738B1 (en) | | Systems, methods and devices for determining and assigning descriptive filenames to digital images |
JP2006166434A (en) | | Portable data storage device and image recording device directly connectable to computer usb port |
JP2003348524A (en) | | Digital camera with communication function |
JPH09200668A (en) | | Image pickup device |
JP2001045178A (en) | | Image transmission method, image transmission system, electronic camera and image transmitter |
JP2003169243A (en) | | Cradle, and cradle system |
JP2010193207A (en) | | Mobile information terminal, image information management method, and image information management program |
JP2005184469A (en) | | Digital still camera |
JP2004301894A (en) | | Method and device for voice recording, digital camera, and method and device for image reproduction |
JPH11355627A (en) | | Digital still camera |
JP2006180089A (en) | | Digital camera and searching system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | FPAY | Fee payment | Year of fee payment: 8 |
 | FPAY | Fee payment | Year of fee payment: 12 |