WO2001050665A1 - Watermark-based personal audio appliance - Google Patents

Watermark-based personal audio appliance

Info

Publication number
WO2001050665A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
user
data
watermark
payload
Prior art date
Application number
PCT/US2000/035630
Other languages
French (fr)
Inventor
Geoffrey B. Rhoads
William Y. Conwell
Original Assignee
Digimarc Corporation
Priority date
Filing date
Publication date
Priority claimed from US09/476,686 (US7562392B1)
Application filed by Digimarc Corporation
Priority to AU22957/01A (AU2295701A)
Publication of WO2001050665A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H20/00 Arrangements for broadcast or for distribution combined with broadcast
    • H04H20/28 Arrangements for simultaneous broadcast of plural pieces of information
    • H04H20/30 Arrangements for simultaneous broadcast of plural pieces of information by a single channel
    • H04H20/31 Arrangements for simultaneous broadcast of plural pieces of information by a single channel using in-band signals, e.g. subsonic or cue signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/00007 Time or data compression or expansion
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/00086 Circuits for prevention of unauthorised reproduction or copying, e.g. piracy
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/00086 Circuits for prevention of unauthorised reproduction or copying, e.g. piracy
    • G11B20/00884 Circuits for prevention of unauthorised reproduction or copying, e.g. piracy involving a watermark, i.e. a barely perceptible transformation of the original data which can nevertheless be recognised by an algorithm
    • G11B20/00891 Circuits for prevention of unauthorised reproduction or copying, e.g. piracy involving a watermark, i.e. a barely perceptible transformation of the original data which can nevertheless be recognised by an algorithm embedded in audio data

Definitions

  • Fig. 3 is a block diagram of a device according to one embodiment of the present invention.
  • Fig. 4 is a block diagram of a system in which the device of Fig. 3 may be utilized.
  • a device 10 includes a microphone 12, an A/D converter 13, a processor 14, one or more indicators 16, one or more buttons 18, a wireless interface 20, and a power source 22.
  • the device can be packaged in a small plastic housing, preferably as small as is practical (e.g., sized and configured to serve as a key chain ornament, perhaps akin to the Tamagotchi toys that were recently popular).
  • the housing has one or more small holes to permit audio penetration through the housing to the microphone 12.
  • the processor 14 can take various forms, including a dedicated hardware device (e.g., an ASIC), a general purpose processor programmed in accordance with instructions stored in non-volatile RAM memory, etc.
  • the indicators 16 can be as simple as a single LED lamp, or as complex as an alphanumeric LCD or other multi-element display. In one embodiment, the indicator simply indicates when the processor has decoded a watermark in audio sensed by the microphone. More elaborate signaling techniques can of course be used, including two- or three-color LEDs that can be used to signal different states with different colors, indicators with flashing patterns or changing displays, etc.
  • buttons 18 are used by the user to indicate an interest in the audio just-heard.
  • the power source 22 can be a battery, solar cell, storage capacitor, or other source of energy suitable for powering the components of the device 10.
  • the wireless interface 20 serves to exchange data with a relay station 24 (Fig. 4).
  • the interface is radio-based, and provides a one-way communications channel.
  • two-way communication can be provided.
  • the relay station can be a cellular repeater (if the interface transmits using cellular frequencies and protocols), or a local receiver, e.g., associated with the user's computer.
  • the relay station can also be a paging system relay station (e.g., as are used for two-way pagers), or may be a low earth orbit satellite-based repeater.
  • the processor monitors the ambient audio for the presence of encoded data, e.g., a digital watermark, and decodes same. If power considerations permit, the device is "always-on." In other embodiments, one of the buttons 18 can be used to awaken the device. In such other embodiments, another button-press can serve to turn-off the device, or the device can power-down after a predetermined period, e.g., of not sensing any watermarked audio.
  • the data payload encoded by the watermark may take various forms.
  • One is a Digital Object Identifier - an ID corresponding to the standardized digital object numbering system promulgated by the International DOI Foundation (www.doi.org).
  • Another is to include plural data fields variously representing, e.g., the name of the publisher, the name of the artist, the title of the work, the date of publication, etc., etc.
  • Another is to encode a unique identifier (UID), e.g., of 16 - 64 bits.
  • the UID serves as an index to a remote database where additional information (e.g., publisher, artist, title, date of publication, etc.) is stored.
  • the data transmitted from the device 10 to the relay station 24 typically includes some or all of the watermark payload data, and also includes data identifying the device 10, or its user (user-ID data). Again, this data can include several data fields (e.g. user name, audio delivery information such as email address or URL, age, gender, model of device 10, etc.). Alternatively, a serial number or other unique identifier can be used, which serves as an index to a database having a corresponding record of information relating to the user and/or device.
  • the audio-ID and user-ID data are typically formatted and encoded by the device 10 according to a protocol that provides error correcting, framing, and other data useful in assuring reliable transmission to the relay station, and/or for further transport.
  • Some embodiments of device 10 recognize just a single form of watermarking, and can understand only payload data presented in a single format.
  • the device may be capable of recognizing watermarking according to several different techniques, and with several different payload formats. This latter functionality can be achieved, e.g., by cyclically trying different decoding techniques until one that produces valid output data (e.g., by reference to a checksum or other indicia) is obtained. That decoding technique and payload interpretation can thereafter be used until valid output data is no longer obtained.
  • the device 10 transmits data to the relay station at the moment the user presses the button 18.
  • a store-and-forward mode is used. That is, when the user presses the button 18, the decoded watermark data is stored in memory within the device. Thereafter, e.g., when the device is coupled with a "nest" or "holster" at the user's computer (or when download capability is otherwise activated), the stored data is downloaded - either through that device or otherwise.
  • the infrastructure between the device 10 and delivery of the audio to its ultimate destination can take myriad forms.
  • a server 28 can be a "MediaBridge" server of the type described, e.g., in the assignee's applications 60/164,619, filed November 10, 1999, and 09/343,104, filed June 29, 1999.
  • Server 28 parses the data and routes some or all of it to a data repository 30 at which the audio requested by the user is stored.
  • This repository dispatches the audio to the user (e.g., to a computer, media player, storage device, etc.), again through the internet.
  • Address information detailing the destination 32 of the audio may be included in the data sent from the device 10, or can be retrieved from a database at the server 28 based on a user-ID sent from the device 10.
  • the repository 30 (which may be co-located with server 28, or not) includes various data beyond the audio itself.
  • the repository can store a collection of metadata (e.g., XML tags) corresponding with each stored item of audio.
  • This metadata can be transmitted to the user's destination 32, or can be used, e.g., for rights management purposes (to limit the user's reproduction or re-distribution rights for the audio, etc.), to establish a fee for the audio, etc.
  • One suitable metatag standard is that under development by <indecs> (Interoperability of Data in E-Commerce Systems, www.indecs.org).
  • the audio data can be delivered in streaming form, such as using technology available from RealNetworks (RealAudio), Microsoft (Windows Media Player), MP3, Audiobase, Beatnik, Bluestreak.com, etc.
  • the former three systems require large (e.g., megabytes) player software on the receiving (client) computer; the latter do not but instead rely, e.g., on small Java applets that can be downloaded with the music.
  • the audio can be delivered in a file format.
  • the file itself is delivered to the user's destination 32 (e.g., as an email attachment).
  • the user is provided a URL to permit access to, or downloading of, the audio.
  • the URL may be a web site that provides an interface through which the user can pay for the requested music, if pre-payment hasn't been arranged.
  • the user's destination 32 is typically the user's own computer. If a "live" IP address is known for that computer (e.g., by reference to a user profile database record stored on the server 28), the music can be transferred immediately. If the user's computer is only occasionally connected to the internet, the music can be stored at a web site (e.g. protected with a user-set password), and can be downloaded to the user's computer whenever it is convenient.
  • the destination 32 is a personal music library associated with the user.
  • the library can take the form, e.g., of a hard-disk or semiconductor memory array in which the user customarily stores music.
  • This storage device is adapted to provide music data to one or more playback units employed by the user (e.g. a personal MP3 player, a home stereo system, a car stereo system, etc.).
  • the library is physically located at the user's residence, but could be remotely sited, e.g. consolidated with the music libraries of many other users at a central location.
  • the personal music library can have its own internet connection. Or it can be equipped with wireless capabilities, permitting it to receive digital music from wireless broadcasts (e.g. from a transmitter associated with the server 28). In either case, the library can provide music to the user's playback devices by short-range wireless broadcast.
  • technology such as that available from Sonicbox, permits audio data delivered to the computer to be short range FM-broadcast by the user's computer to nearby FM radios using otherwise-unused radio spectrum.
  • Some implementations of the present invention support several different delivery technologies (e.g., streaming, file, URL), and select among them in accordance with the profiles of different users.
  • Payment for the audio can be accomplished by numerous means. One is by charging of a credit card account associated with the user (e.g., in a database record corresponding to the user-ID).
  • Some implementations of the invention make use of secure delivery mechanisms, such as those provided by InterTrust, Preview Systems, etc.
  • such systems also include their own secure payment facilities.
  • buttons that are activated by the user to initiate capture of an audio selection
  • other interfaces can be used.
  • it can be a voice-recognition system that responds to spoken commands, such as "capture" or "record."
  • it can be a form of gesture interface.
  • the same functionality can be built-into radios (including internet-based radios that receive wireless IP broadcasts), computer audio systems, and other appliances.
  • the microphone can be omitted and, in some cases, the wireless interface as well.
  • the data output from the device can be conveyed, e.g., through the network connection of an associated computer, etc.
  • the server 28 can provide to the user several internet links associated with the sensed audio. Some of these links can provide commerce opportunities (e.g., to purchase a CD on which the sensed audio is recorded). Others can direct the user to news sites, concert schedules, fan-club info, etc. In some such embodiments, the ancillary information is provided to the user without the audio itself.
  • the data provided to the user's destination typically includes information about the context in which the data was requested. In a simple case this can be the time and date on which the user pressed the Capture button. Other context information can be the identification of other Birddawg devices 10 that were nearby when the Capture button was pressed. (Such information can be gleaned, e.g., by each device transmitting a brief WhoAmI message periodically, receiving such messages from other nearby devices, and logging the data thus received.)
  • Still other context information might be the location from which the Capture operation was initiated. This can be achieved by decoding of a second watermark signal, e.g., on a low level white-noise broadcast.
  • the public address system in public places can broadcast a generally-indiscernible noise signal that encodes a watermark signal.
  • Devices 10 can be arranged to detect two (or more) watermarks from the same audio stream, e.g., by reference to two pseudo-random sequences with which the different watermarks are encoded. One identifies the audible audio, the other identifies the location. By such an arrangement, for example, the device 10 can indicate to the server 28 (and thence to the user destination 32) the location at which the user encountered the audio. (This notion of providing location context information by subliminal audio that identifies the location has powerful applications beyond the particular scenario contemplated herein.)
  • the device 10 can buffer watermark information from several previous audio events, permitting the user to scroll back and select (e.g., in conjunction with a screen display 16) the ID of the desired audio.
  • An arrangement like the foregoing may require that the decoded watermark information be interpreted for the user, so that the user is not presented simply a raw binary watermark payload.
  • the interpreted information presented to the user can comprise, e.g., the source (CNN Airport News, WABC Radio, CD-ROM, MTV), the artist (Celine Dion), the title (That's the Way It Is), and/or the time decoded (3:38:02 p.m.), etc.
  • One way to achieve the foregoing functionality is to convey both the binary UID payload and abbreviated text (e.g., 5- or 6-bit encoded) through the watermark "channel" on the audio.
  • the watermark channel conveys a UID, four characters of text, and associated error-correcting bits, every ten seconds. In the following ten seconds the same UID is conveyed, together with the next four characters of text.
  • Another way to achieve such functionality is to provide a memory in the device 10 that associates the watermark payload (whether UID or field-based) with corresponding textual data (e.g., the source/artist/title referenced above).
  • a 1 megabyte semiconductor non-volatile RAM memory can serve as a look-up table, matching code numbers to artist names and song titles.
  • the memory is indexed in accordance with one or more fields from the decoded watermark, and the resulting textual data from the memory (e.g. source/artist/title) is presented to the user.
  • Such a memory will commonly require periodic updating.
  • the wireless interface 20 in device 10 can include reception capabilities, providing a ready mechanism for providing such updated data.
  • the device "awakens" briefly at otherwise idle moments and tunes to a predetermined frequency at which updated data for the memory is broadcast, either in a baseband broadcast channel, or in an ancillary (e.g. SCA) channel.
  • internet delivery of update data for the memory can be substituted for wireless delivery.
  • a source/artist/title memory in the device 10 can be updated by placing the device in a "nest" every evening.
  • the nest (which may be integrated with a battery charger for the appliance) can have an internet connection, and can exchange data with the device by infrared, inductive, or other proximity-coupling technologies, or through metal contacts. Each evening, the nest can receive an updated collection of source/artist/title data, and can re-write the memory in the device accordingly.
  • the watermark data can always be properly interpreted for presentation to the user.
  • the "Capture" concepts noted above can be extended to other functions as well.
  • One is akin to forwarding of email. If a consumer hears a song that another friend would enjoy, the listener may send a copy of the song to the friend.
  • This instruction can be issued by pressing a "Send" button, or by invoking a similar function on a graphical (or voice- or gesture-responsive) user interface.
  • the device so-instructed can query the person as to the recipient. The person can designate the desired recipient(s) by scrolling through a pre-stored list of recipients to select the desired one.
  • the list can be entered through a computer to which the device is coupled.
  • the user can type-in a name (if the device provides a keypad), or a portion thereof sufficient to uniquely identify the recipient. Or the person may speak the recipient's name.
  • a voice recognition unit can listen to the spoken instructions and identify the desired recipient.
  • An "address book"-like feature has the requisite information for the recipient (e.g., the web site, IP address, or other data identifying the location to which music for that recipient should be stored or queued, the format in which the music should be delivered, etc.) stored therein.
  • the appliance dispatches instructions to the server 28, including an authorization to incur any necessary charges (e.g., by debiting the sender's credit card). Again, the server 28 attends to delivery of the music in a desired manner to the specified recipient.
  • a listener may query the device (by voice, GUI or physical button, textual, gesture, or other input) to identify CDs on which the ambient audio is recorded. Or the listener may query the device for the then-playing artist's concert schedule.
  • the appliance can contact a remote database and relay the query, together with the user ID and audio ID data.
  • the database locates the requested data, and presents same to the user - either through a UI on device 10, or to the destination 32. If desired, the user can continue the dialog with a further instruction, e.g., to buy one of the CDs on which the then-playing song is included.
  • this instruction may be entered by voice, GUI, etc., and dispatched from the device to the server, which can then complete the transaction in accordance with pre-stored information (e.g. credit card account number, mailing address, etc.).
  • a confirming message can be relayed to the device 10 or destination 32 for presentation to the user.
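The arrangement noted above, in which the watermark channel conveys a UID together with four characters of 6-bit-encoded text in each ten-second interval, can be sketched as follows. This is a minimal illustration only: the 46-symbol alphabet, the function names, and the omission of the error-correcting bits are all assumptions, not details given in the text.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical 6-bit alphabet (at most 64 symbols); not specified by the source.
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,'-!?&/:"
CHARS_PER_FRAME = 4  # four characters of text per ten-second frame


def pack_chunk(chunk: str) -> int:
    """Pack up to four characters into a 24-bit integer, 6 bits per character."""
    chunk = chunk.upper().ljust(CHARS_PER_FRAME)
    if len(chunk) != CHARS_PER_FRAME:
        raise ValueError("a frame carries at most four characters")
    value = 0
    for ch in chunk:
        value = (value << 6) | ALPHABET.index(ch)  # ValueError if unsupported
    return value


def unpack_chunk(value: int) -> str:
    """Recover the four characters from a 24-bit integer."""
    return "".join(ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0))


def split_text(uid: int, text: str) -> List[Tuple[int, int]]:
    """Produce one (UID, packed text) frame per ten-second interval."""
    return [
        (uid, pack_chunk(text[i:i + CHARS_PER_FRAME]))
        for i in range(0, len(text), CHARS_PER_FRAME)
    ]


def assemble(frames: Iterable[Tuple[int, int]]) -> Dict[int, str]:
    """Concatenate the text of successive frames that share the same UID."""
    text: Dict[int, str] = {}
    for uid, value in frames:
        text[uid] = text.get(uid, "") + unpack_chunk(value)
    return {uid: s.rstrip() for uid, s in text.items()}
```

Under these assumptions an eleven-character artist name occupies three frames, so roughly thirty seconds of audio would need to be sensed before the full text is available.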

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A portable device (10) uses a microphone to listen to ambient audio, decodes a watermark signal (18) therein, and uses the decoded data to request delivery of the audio or related information to the user's home or other location. The device is desirably pocket-sized, or suitable for carrying on a key-ring. The device may also detect a second watermark signal that is present in the user's environment (e.g., played through a public address speaker system) to aid the user in recalling the context (16) from which the audio was requested.

Description

WATERMARK-BASED PERSONAL AUDIO APPLIANCE
Related Application Data
The technology detailed in the present application is also related to that detailed in applications 09/343,104, filed June 29, 1999; 60/134,782, filed May 19, 1999; 09/292,569, filed April 15, 1999; 09/314,648, filed May 19, 1999; 60/141,763, filed June 30, 1999; 60/158,015, filed October 6, 1999; 60/163,332, filed November 3, 1999; 60/164,619, filed November 10, 1999; 09/452,023, filed November 30, 1999; 09/452,021, filed November 30, 1999; and in patent 5,862,260.
Introduction
16 year old Bob struts into the coffee shop down from high school with a couple of buddies; a subtle deep pound in the ambient sound track lets them know they're in the right place. The three of them instinctually pull out of their pockets their audio Birddawgs (a small hand held unit about the size and style of an auto-door-alarm device, or "fob"), and when they see the tiny green light, they smile, high five, and push the big "GoFetch" button in synchrony. That tune will now be waiting for them at home, safely part of their preferred collection and ever-so-thankfully not lost to their collective bad memory (if they even knew the name of the artist and tune title in the first place!).
33 year old Mary is at home listening to the latest batch of holiday tunes being offered up over her 2-decade-long favorite radio station. She's spent many days now half-consciously culling the tunes for that perfect arrangement for the new year's bash that she regrettably agreed to host. 10:40 AM rolls around and some new tune catches her ear, a tune she knows can work well following the jingle-cats rendition of Strawberry Fields. She half jogs over to the stereo and hits the "GoFetch" button. In a few days, she'll sit down at the computer and put together the final sound track for the gala evening ahead, her play list dutifully waiting for her shuffling instructions and desired start time.
49 year old Jack (the financial analyst) is thoroughly bored sitting in the crowded gate D23 at Dulles. Droning 20 feet up and over his head is the airport network station, currently broadcasting the national weather report. As the segue to the business segment approaches, the teaser review mentions that they'll be having a report on today's rally in the bond market and the driving forces behind it. Jack pulls out his Birddawg-enabled Palm Pilot on the off-chance they actually will have a little depth in the reporting. Indeed, as the segment plays and starts discussing the convoluted effects of Greenspan's speech to the Internet-B-Free society, he taps the "GoFetch" button, knowing that once he gets back to his main browsing environment he will be able to follow dozens of links that the airport network has pre-assigned to the segment.
The foregoing and other features and advantages of the present invention will be more readily apparent from the following detailed description, which proceeds with reference to the accompanying figures.
Brief Description of the Drawings
Fig. 3 is a block diagram of a device according to one embodiment of the present invention.
Fig. 4 is a block diagram of a system in which the device of Fig. 3 may be utilized.
Detailed Description
Referring to Fig. 3, a device 10 according to one embodiment of the present invention includes a microphone 12, an A/D converter 13, a processor 14, one or more indicators 16, one or more buttons 18, a wireless interface 20, and a power source 22. The device can be packaged in a small plastic housing, preferably as small as is practical (e.g., sized and configured to serve as a key chain ornament, perhaps akin to the Tamagotchi toys that were recently popular). The housing has one or more small holes to permit audio penetration through the housing to the microphone 12.
The processor 14 can take various forms, including a dedicated hardware device (e.g., an ASIC), a general purpose processor programmed in accordance with instructions stored in non-volatile RAM memory, etc. The indicators 16 can be as simple as a single LED lamp, or as complex as an alphanumeric LCD or other multi-element display. In one embodiment, the indicator simply indicates when the processor has decoded a watermark in audio sensed by the microphone. More elaborate signaling techniques can of course be used, including two- or three-color LEDs that can be used to signal different states with different colors, indicators with flashing patterns or changing displays, etc.
The buttons 18 are used by the user to indicate an interest in the audio just-heard. In one embodiment, there is a single button 18, and it is emblazoned with a stylized legend that can serve as a trademark or service mark, e.g., GetIt!, GoFetch, Birddawg, something Batman-esque ("Wham," "Zappp," "Pow!!," etc.), or something more mundane (e.g., Capture).
The power source 22 can be a battery, solar cell, storage capacitor, or other source of energy suitable for powering the components of the device 10.
The wireless interface 20 serves to exchange data with a relay station 24 (Fig. 4). In one embodiment, the interface is radio-based, and provides a one-way communications channel. In other embodiments other wireless technologies can be used (e.g., IR), and/or two-way communication can be provided.
The relay station can be a cellular repeater (if the interface transmits using cellular frequencies and protocols), or a local receiver, e.g., associated with the user's computer. The relay station can also be a paging system relay station (e.g., as are used for two-way pagers), or may be a low earth orbit satellite-based repeater.
In operation, the processor monitors the ambient audio for the presence of encoded data, e.g., a digital watermark, and decodes same. If power considerations permit, the device is "always-on." In other embodiments, one of the buttons 18 can be used to awaken the device. In such other embodiments, another button-press can serve to turn-off the device, or the device can power-down after a predetermined period, e.g., of not sensing any watermarked audio.
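The wake and power-down behavior described above can be sketched as a small state machine. This is an illustrative sketch only: the class name, the method names, and the 60-second default timeout are assumptions, since the text specifies only "a predetermined period."

```python
from dataclasses import dataclass


@dataclass
class PowerController:
    """Tracks whether the device is awake (hypothetical helper, not from the source)."""
    idle_timeout_s: float = 60.0   # assumed value for the predetermined period
    awake: bool = False
    last_watermark_t: float = 0.0

    def press_wake_button(self, now: float) -> None:
        """A button press awakens the device and restarts the idle timer."""
        self.awake = True
        self.last_watermark_t = now

    def press_off_button(self) -> None:
        """Another button press turns the device off."""
        self.awake = False

    def tick(self, now: float, watermark_detected: bool) -> bool:
        """Call periodically; returns True while the device should stay awake."""
        if not self.awake:
            return False
        if watermark_detected:
            self.last_watermark_t = now
        elif now - self.last_watermark_t >= self.idle_timeout_s:
            # No watermarked audio sensed for the whole period: power down.
            self.awake = False
        return self.awake
```

An "always-on" embodiment would simply never call the power-down path, or would use an effectively infinite timeout.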
A number of techniques for watermarking audio (and decoding same) are known, as illustrated by patents 5,862,260, 5,963,909, 5,940,429, 5,940,135, 5,937,000, 5,889,868, 5,833,432, 5,945,932, WO9939344 (corresponding to US application 09/017,145), and WO9853565 (corresponding to US applications 08/858,562 and 08/974,920). Commercially-available audio watermarking software includes that available from AudioTrack, Verance (formerly Aris/Solana), Cognicity, Liquid Audio, and others.
The data payload encoded by the watermark (the audio-ID) may take various forms. One is a Digital Object Identifier - an ID corresponding to the standardized digital object numbering system promulgated by the International DOI Foundation (www.doi.org). Another is to include plural data fields variously representing, e.g., the name of the publisher, the name of the artist, the title of the work, the date of publication, etc., etc. Another is to encode a unique identifier (UID), e.g., of 16 - 64 bits. The UID serves as an index to a remote database where additional information (e.g., publisher, artist, title, date of publication, etc.) is stored. The data transmitted from the device 10 to the relay station 24 typically includes some or all of the watermark payload data, and also includes data identifying the device 10, or its user (user-ID data). Again, this data can include several data fields (e.g. user name, audio delivery information such as email address or URL, age, gender, model of device 10, etc.). Alternatively, a serial number or other unique identifier can be used, which serves as an index to a database having a corresponding record of information relating to the user and/or device.
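The UID form of payload, and the pairing of audio-ID with user-ID data in the transmission, can be sketched as follows. The record fields mirror the examples given in the text; the in-memory dictionary standing in for the remote database, the sample values, and all function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AudioRecord:
    publisher: str
    artist: str
    title: str
    published: str


# Hypothetical stand-in for the remote database indexed by UID.
AUDIO_DB: Dict[int, AudioRecord] = {
    0x2A5F: AudioRecord("Example Records", "Example Artist", "Example Title", "1999"),
}


def check_uid(uid: int, bits: int = 32) -> int:
    """Validate that a UID fits the chosen width (the text suggests 16 to 64 bits)."""
    if not 16 <= bits <= 64:
        raise ValueError("UID width outside the 16-64 bit range")
    if not 0 <= uid < (1 << bits):
        raise ValueError("UID does not fit in the chosen width")
    return uid


def lookup(uid: int) -> Optional[AudioRecord]:
    """Use the UID as an index into the database; None if unknown."""
    return AUDIO_DB.get(uid)


def build_report(uid: int, device_serial: str) -> Dict[str, object]:
    """Combine the audio-ID with a serial number serving as the user-ID."""
    return {"audio_id": check_uid(uid), "user_id": device_serial}
```

Sending only a serial number keeps the transmission short; the fuller user record (delivery address, device model, and so on) is then resolved on the server side.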
The audio-ID and user-ID data are typically formatted and encoded by the device 10 according to a protocol that provides error correcting, framing, and other data useful in assuring reliable transmission to the relay station, and/or for further transport. Some embodiments of device 10 recognize just a single form of watermarking, and can understand only payload data presented in a single format. In other embodiments, the device may be capable of recognizing watermarking according to several different techniques, and with several different payload formats. This latter functionality can be achieved, e.g., by cyclically trying different decoding techniques until one that produces valid output data (e.g., by reference to a checksum or other indicia) is obtained. That decoding technique and payload interpretation can thereafter be used until valid output data is no longer obtained.
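The cyclic trial of decoding techniques can be sketched as below. The sketch assumes each technique is a callable returning a candidate frame, and that validity is judged by a trailing CRC-32; the text says only "a checksum or other indicia," so the CRC choice, the frame layout, and the class name are assumptions.

```python
import zlib
from typing import Callable, List, Optional, Sequence

# A decoder takes a block of sensed audio and returns a candidate frame or None.
Decoder = Callable[[bytes], Optional[bytes]]


def checksum_ok(frame: bytes) -> bool:
    """Assumed layout: payload followed by a 4-byte big-endian CRC-32."""
    if len(frame) < 5:
        return False
    payload, crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc


class MultiFormatDecoder:
    """Tries decoders in rotation and keeps using the one that last succeeded."""

    def __init__(self, decoders: Sequence[Decoder]):
        if not decoders:
            raise ValueError("at least one decoder is required")
        self.decoders: List[Decoder] = list(decoders)
        self.current = 0  # index of the technique currently in use

    def decode(self, audio_block: bytes) -> Optional[bytes]:
        n = len(self.decoders)
        # Start with the technique that last worked, then cycle through the rest.
        for offset in range(n):
            idx = (self.current + offset) % n
            frame = self.decoders[idx](audio_block)
            if frame is not None and checksum_ok(frame):
                self.current = idx
                return frame[:-4]  # strip the checksum, return the payload
        return None
```

Starting each attempt from the last successful technique means the other decoders are tried only when valid output stops, which matches the behavior described above.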
In some embodiments, the device 10 transmits data to the relay station at the moment the user presses the button 18. In other embodiments, a store-and-forward mode is used. That is, when the user presses the button 18, the decoded watermark data is stored in memory within the device. Thereafter, e.g., when the device is coupled with a "nest" or "holster" at the user's computer (or when download capability is otherwise activated), the stored data is downloaded - either through that device or otherwise.
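The two transmission modes can be sketched with a small buffer. This is a sketch under assumptions: the `send` callback, the capacity of 32 entries, and the policy of discarding the oldest entry when memory is full are illustrative choices not stated in the text.

```python
from collections import deque
from typing import Callable, Deque, Optional


class CaptureBuffer:
    """Sends captures immediately when possible, otherwise stores them for later."""

    def __init__(self, send: Callable[[bytes], bool], capacity: int = 32):
        self._send = send  # returns True when the relay or nest accepted the data
        # With maxlen set, the oldest entry is dropped once the buffer is full.
        self._pending: Deque[bytes] = deque(maxlen=capacity)

    def on_button_press(self, payload: Optional[bytes], link_up: bool) -> None:
        """Handle a press of button 18 with the most recently decoded payload."""
        if payload is None:
            return  # nothing decoded, nothing to capture
        if link_up and self._send(payload):
            return  # transmitted at the moment of the press
        self._pending.append(payload)  # store-and-forward mode

    def on_docked(self) -> int:
        """Download stored captures, oldest first; returns how many were sent."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # keep the remainder for the next opportunity
            self._pending.popleft()
            sent += 1
        return sent
```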
The infrastructure between the device 10 and delivery of the audio to its ultimate destination can take myriad forms. One is shown in Fig. 4. In this arrangement, some or all of the data received by the relay station 24 is routed through the internet 26 to a server 28. (The server 28 can be a "MediaBridge" server of the type described, e.g., in the assignee's applications 60/164,619, filed November 10, 1999, and 09/343,104, filed June 29, 1999.) Server 28 parses the data and routes some or all of it to a data repository 30 at which the audio requested by the user is stored. This repository, in turn, dispatches the audio to the user (e.g., to a computer, media player, storage device, etc.), again through the internet. (Address information detailing the destination 32 of the audio may be included in the data sent from the device 10, or can be retrieved from a database at the server 28 based on a user-ID sent from the device 10.)
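The routing step at the server 28 can be illustrated with the sketch below: the destination is taken from the request if present, and otherwise retrieved from a user database keyed by user-ID. The dictionaries, identifiers and host name are hypothetical placeholders, not data from the specification.

```python
from typing import Optional

# Hypothetical stand-ins for the server's user database and the repository 30.
USER_PROFILES = {"user-42": {"destination": "library.example.net"}}
REPOSITORY = {0xBEEF: "track-beef.mp3"}


def route_request(audio_id: int, user_id: str,
                  destination: Optional[str] = None) -> dict:
    """Resolve which stored item goes to which destination."""
    if destination is None:
        profile = USER_PROFILES.get(user_id)
        if profile is None:
            raise LookupError(f"no profile for user {user_id!r}")
        destination = profile["destination"]
    if audio_id not in REPOSITORY:
        raise LookupError(f"no audio stored for ID {audio_id:#x}")
    return {"item": REPOSITORY[audio_id], "destination": destination}
```

For example, route_request(0xBEEF, "user-42") resolves the destination from the stored profile, while passing destination explicitly models a device that includes address information in its transmission.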
In some embodiments, the repository 30 (which may be co-located with server 28, or not) includes various data beyond the audio itself. For example, the repository can store a collection of metadata (e.g., XML tags) corresponding with each stored item of audio. This metadata can be transmitted to the user's destination 32, or can be used, e.g., for rights management purposes (to limit the user's reproduction or re-distribution rights for the audio, etc.), to establish a fee for the audio, etc. One suitable metatag standard is that under development by <indecs> (Interoperability of Data in E-Commerce Systems, www.indecs.org).
The audio data can be delivered in streaming form, such as using technology available from RealNetworks (RealAudio), Microsoft (Windows Media Player), MP3, Audiobase, Beatnik, Bluestreak.com, etc. The former three systems require large (e.g., megabytes) player software on the receiving (client) computer; the latter do not but instead rely, e.g., on small Java applets that can be downloaded with the music.
Alternatively, the audio can be delivered in a file format. In some embodiments the file itself is delivered to the user's destination 32 (e.g., as an email attachment). In others, the user is provided a URL to permit access to, or downloading of, the audio. (The URL may be a web site that provides an interface through which the user can pay for the requested music, if pre-payment hasn't been arranged.)
The user's destination 32 is typically the user's own computer. If a "live" IP address is known for that computer (e.g., by reference to a user profile database record stored on the server 28), the music can be transferred immediately. If the user's computer is only occasionally connected to the internet, the music can be stored at a web site (e.g. protected with a user-set password), and can be downloaded to the user's computer whenever it is convenient.
In other embodiments, the destination 32 is a personal music library associated with the user. The library can take the form, e.g., of a hard-disk or semiconductor memory array in which the user customarily stores music. This storage device is adapted to provide music data to one or more playback units employed by the user (e.g. a personal MP3 player, a home stereo system, a car stereo system, etc.). In most installations, the library is physically located at the user's residence, but could be remotely sited, e.g. consolidated with the music libraries of many other users at a central location.
The personal music library can have its own internet connection. Or it can be equipped with wireless capabilities, permitting it to receive digital music from wireless broadcasts (e.g. from a transmitter associated with the server 28). In either case, the library can provide music to the user's playback devices by short-range wireless broadcast.
In many embodiments, technology such as that available from Sonicbox permits audio data delivered to the computer to be FM-broadcast over short range by the user's computer to nearby FM radios, using otherwise-unused radio spectrum.
Some implementations of the present invention support several different delivery technologies (e.g., streaming, file, URL), and select among them in accordance with the profiles of different users.
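Selection among delivery technologies according to a user profile might look like the following sketch. The profile keys ("delivery", "live_ip") and the fallback rule are assumptions chosen for illustration: an explicit preference wins, otherwise a file is sent when a live address is known and a URL is provided when it is not.

```python
SUPPORTED = ("streaming", "file", "url")


def choose_delivery(profile: dict) -> str:
    """Pick a delivery technology from a (hypothetical) user profile record."""
    preferred = profile.get("delivery")
    if preferred in SUPPORTED:
        return preferred
    # No stated preference: transfer a file immediately if the user's
    # computer is reachable, otherwise leave a URL for later download.
    return "file" if profile.get("live_ip") else "url"
```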
Payment for the audio (if needed) can be accomplished by numerous means. One is by charging of a credit card account associated with the user (e.g., in a database record corresponding to the user-ID).
Some implementations of the invention make use of secure delivery mechanisms, such as those provided by InterTrust, Preview Systems, etc. In addition to providing secure containers by which the audio is distributed, such systems also include their own secure payment facilities.
By such arrangements, a user can conveniently compile an archive of favorite music - even while away from home. To provide a comprehensive disclosure without unduly lengthening this specification, the disclosures of the applications and patents cited above are incorporated herein by reference.
Having described and illustrated the principles of my invention with reference to a preferred embodiment and several variations thereof, it should be apparent that the detailed embodiment is illustrative only and should not be taken as limiting the scope of my invention.
For example, while the invention is illustrated with reference to a button that is activated by the user to initiate capture of an audio selection, other interfaces can be used. For example, in some embodiments it can be a voice-recognition system that responds to spoken commands, such as "capture" or "record." Or it can be a form of gesture interface.
Likewise, while the invention is illustrated with reference to a stand-alone device, the same functionality can be built into radios (including internet-based radios that receive wireless IP broadcasts), computer audio systems, and other appliances. In such case the microphone can be omitted and, in some cases, the wireless interface as well. (The data output from the device can be conveyed, e.g., through the network connection of an associated computer, etc.)
Moreover, while the invention is illustrated with reference to an embodiment in which audio, alone, is provided to the user, this need not be the case. As in the Dulles airport scenario in the introduction, the server 28 can provide to the user several internet links associated with the sensed audio. Some of these links can provide commerce opportunities (e.g., to purchase a CD on which the sensed audio is recorded). Others can direct the user to news sites, concert schedules, fan-club info, etc. In some such embodiments, the ancillary information is provided to the user without the audio itself.
Although not particularly detailed, the data provided to the user's destination typically includes information about the context in which the data was requested. In a simple case this can be the time and date on which the user pressed the Capture button. Other context information can be the identification of other Birddawg devices 10 that were nearby when the Capture button was pressed. (Such information can be gleaned, e.g., by each device transmitting a brief WhoAmI message periodically, receiving such messages from other nearby devices, and logging the data thus received.)
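The logging of periodic identification messages from nearby devices can be sketched as below. The 60-second window, the string device IDs and the class name are assumptions; the specification says only that such messages are received and logged.

```python
class NeighborLog:
    """Remember when each nearby device was last heard."""

    def __init__(self, window_s: float = 60.0):
        self.window_s = window_s
        self._last_heard = {}   # device ID -> time of most recent message

    def on_whoami(self, device_id: str, now: float) -> None:
        self._last_heard[device_id] = now

    def nearby(self, now: float) -> list:
        """Device IDs heard within the window ending at `now`, sorted."""
        return sorted(
            device_id
            for device_id, heard in self._last_heard.items()
            if 0 <= now - heard <= self.window_s
        )
```

When the Capture button is pressed, the result of nearby(now) could be attached to the outgoing data as context, alongside the time and date.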
Still other context information might be the location from which the Capture operation was initiated. This can be achieved by decoding of a second watermark signal, e.g., on a low level white-noise broadcast. The public address system in public places, for example, can broadcast a generally indiscernible noise signal that encodes a watermark signal. Devices 10 can be arranged to detect two (or more) watermarks from the same audio stream, e.g., by reference to two pseudo-random sequences with which the different watermarks are encoded. One identifies the audible audio, the other identifies the location. By such an arrangement, for example, the device 10 can indicate to the server 28 (and thence to the user destination 32) the location at which the user encountered the audio. (This notion of providing location context information by subliminal audio that identifies the location has powerful applications beyond the particular scenario contemplated herein.)
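Detection of two watermarks by reference to two pseudo-random sequences can be illustrated with a simple spread-spectrum sketch. This is a generic correlation scheme swapped in for illustration, not the encoding of any cited application: each payload bit flips the sign of one pass through a pseudo-random chip sequence, the two watermarks use differently seeded sequences, and the detector is assumed to be already synchronized to the bit boundaries. The seeds, chip count, strength and test tone are all assumptions.

```python
import math
import random

CHIPS_PER_BIT = 4096  # assumed spreading length


def pn_sequence(seed: int, length: int = CHIPS_PER_BIT) -> list:
    """Pseudo-random sequence of +1/-1 chips, reproducible from its seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(host: list, bits: list, pn: list, strength: float = 0.25) -> list:
    """Add a low-level copy of the chips, sign-flipped per payload bit."""
    n = len(pn)
    if len(host) < len(bits) * n:
        raise ValueError("host audio too short for this payload")
    out = list(host)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(n):
            out[i * n + j] += strength * sign * pn[j]
    return out


def detect(signal: list, pn: list, nbits: int) -> list:
    """Correlate each bit interval with the chips; the sign gives the bit."""
    n = len(pn)
    bits = []
    for i in range(nbits):
        corr = sum(signal[i * n + j] * pn[j] for j in range(n))
        bits.append(1 if corr > 0 else 0)
    return bits


AUDIO_PN = pn_sequence(seed=1)      # identifies the audible audio
LOCATION_PN = pn_sequence(seed=2)   # identifies the location

audio_bits = [1, 0, 1, 1, 0, 0, 1, 0]
location_bits = [0, 1, 1, 0, 1, 0, 0, 1]
# Host audio: a 440 Hz tone sampled at 8 kHz, standing in for the program.
host = [math.sin(2 * math.pi * 440 * t / 8000)
        for t in range(len(audio_bits) * CHIPS_PER_BIT)]
marked = embed(embed(host, audio_bits, AUDIO_PN), location_bits, LOCATION_PN)
```

Calling detect(marked, AUDIO_PN, 8) and detect(marked, LOCATION_PN, 8) reads the two payloads separately from the same samples: correlation against one sequence accumulates that watermark coherently, while the host audio and the other watermark contribute only a small noise-like term.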
In some embodiments, the device 10 can buffer watermark information from several previous audio events, permitting the user to scroll back and select (e.g., in conjunction with a screen display 16) the ID of the desired audio.
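A scroll-back buffer of recent audio events can be sketched as a short fixed-depth history. The depth of eight entries and the (audio ID, label) tuple are assumptions for illustration.

```python
from collections import deque


class RecentEvents:
    """Keep the most recent distinct watermark IDs for scroll-back."""

    def __init__(self, depth: int = 8):
        self._events = deque(maxlen=depth)

    def record(self, audio_id: int, label: str) -> None:
        # A watermark repeats throughout a song; store each event once.
        if self._events and self._events[-1][0] == audio_id:
            return
        self._events.append((audio_id, label))

    def scroll_back(self, steps: int) -> tuple:
        """steps=0 is the most recent event, 1 the one before it, etc."""
        if not 0 <= steps < len(self._events):
            raise IndexError("no event that far back")
        return self._events[-1 - steps]
```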
An arrangement like the foregoing may require that the decoded watermark information be interpreted for the user, so that the user is not presented simply a raw binary watermark payload. The interpreted information presented to the user can comprise, e.g., the source (CNN Airport News, WABC Radio, CD-ROM, MTV), the artist (Celine Dion), the title (That's the Way It Is), and/or the time decoded (3:38:02 p.m.), etc.
One way to achieve the foregoing functionality is to convey both the binary UID payload and abbreviated text (e.g., 5- or 6-bit encoded) through the watermark "channel" on the audio. In one such arrangement, the watermark channel conveys a UID, four characters of text, and associated error-correcting bits, every ten seconds. In the following ten seconds the same UID is conveyed, together with the next four characters of text.
Another way to achieve such functionality is to provide a memory in the device 10 that associates the watermark payload (whether UID or field-based) with corresponding textual data (e.g., the source/artist/title referenced above). A 1 megabyte semiconductor non-volatile RAM memory, for example, can serve as a look-up table, matching code numbers to artist names and song titles. When the user queries the device to learn the identity of a song (e.g., by operating a button 18), the memory is indexed in accordance with one or more fields from the decoded watermark, and the resulting textual data from the memory (e.g. source/artist/title) is presented to the user.
Such a memory will commonly require periodic updating. The wireless interface 20 in device 10 can include reception capabilities, providing a ready mechanism for providing such updated data. In one embodiment, the device "awakens" briefly at otherwise idle moments and tunes to a predetermined frequency at which updated data for the memory is broadcast, either in a baseband broadcast channel, or in an ancillary (e.g. SCA) channel.
In variants of the foregoing, internet delivery of update data for the memory can be substituted for wireless delivery. For example, a source/artist/title memory in the device 10 can be updated by placing the device in a "nest" every evening. The nest (which may be integrated with a battery charger for the appliance) can have an internet connection, and can exchange data with the device by infrared, inductive, or other proximity-coupling technologies, or through metal contacts. Each evening, the nest can receive an updated collection of source/artist/title data, and can re-write the memory in the device accordingly. By such arrangement, the watermark data can always be properly interpreted for presentation to the user.
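The arrangement in which the watermark channel carries a UID plus four characters of 6-bit text per ten-second interval can be worked through with the sketch below. The 6-bit character table is a hypothetical one (any table of at most 64 symbols would do), error-correcting bits are omitted, and the function names are assumptions.

```python
# Hypothetical 6-bit character set: 43 symbols, so every code is below 64.
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'-.,&"
CHARS_PER_FRAME = 4
SECONDS_PER_FRAME = 10


def text_frames(uid: int, text: str) -> list:
    """Split text into (uid, four 6-bit codes) frames, space-padding the last."""
    text = text.upper()
    padded_len = -(-len(text) // CHARS_PER_FRAME) * CHARS_PER_FRAME
    text = text.ljust(padded_len)
    frames = []
    for start in range(0, len(text), CHARS_PER_FRAME):
        chunk = text[start:start + CHARS_PER_FRAME]
        # ALPHABET.index raises ValueError for characters outside the table.
        frames.append((uid, [ALPHABET.index(ch) for ch in chunk]))
    return frames


def reassemble(frames: list) -> str:
    """Rebuild the text on the device from frames received in order."""
    return "".join(ALPHABET[code] for _, codes in frames for code in codes).rstrip()


frames = text_frames(0xBEEF, "Celine Dion")
airtime_seconds = len(frames) * SECONDS_PER_FRAME
```

Under these assumptions the 11-character artist name needs three frames, i.e. thirty seconds of audio before the full text is available, which suggests why a look-up memory in the device is offered as the alternative.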
The "Capture" concepts noted above can be extended to other functions as well. One is akin to forwarding of email. If a consumer hears a song that another friend would enjoy, the listener may send a copy of the song to the friend. This instruction can be issued by pressing a "Send" button, or by invoking a similar function on a graphical (or voice- or gesture-responsive) user interface. In response, the device so-instructed can query the person as to the recipient. The person can designate the desired recipient(s) by scrolling through a pre-stored list of recipients to select the desired one. (The list can be entered through a computer to which the device is coupled.) Alternatively, the user can type-in a name (if the device provides a keypad), or a portion thereof sufficient to uniquely identify the recipient. Or the person may speak the recipient's name. As is conventional with hands-free vehicle cell phones, a voice recognition unit can listen to the spoken instructions and identify the desired recipient. An "address book"-like feature has the requisite information for the recipient (e.g., the web site, IP address, or other data identifying the location to which music for that recipient should be stored or queued, the format in which the music should be delivered, etc.) stored therein. In response to such command, the appliance dispatches instructions to the server 28, including an authorization to incur any necessary charges (e.g., by debiting the sender's credit card). Again, the server 28 attends to delivery of the music in a desired manner to the specified recipient.
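Identifying a recipient from a typed portion of a name "sufficient to uniquely identify the recipient" amounts to a unique-prefix match against the address book. The sketch below assumes a dictionary keyed by recipient name; the entries and field names are hypothetical.

```python
from typing import Optional

# Hypothetical address book: name -> delivery details for that recipient.
ADDRESS_BOOK = {
    "Alice": {"destination": "alice.example.net", "format": "file"},
    "Alan": {"destination": "alan.example.net", "format": "streaming"},
    "Bob": {"destination": "bob.example.net", "format": "url"},
}


def resolve_recipient(typed: str, book: dict = ADDRESS_BOOK) -> Optional[str]:
    """Return the one name starting with `typed`, or None if not unique."""
    prefix = typed.lower()
    matches = [name for name in book if name.lower().startswith(prefix)]
    return matches[0] if len(matches) == 1 else None
```

Here "al" is ambiguous between two entries and yields None, so the device would keep prompting, whereas "ali" or "b" is enough to select a single recipient.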
Still further, a listener may query the device (by voice, GUI or physical button, textual, gesture, or other input) to identify CDs on which the ambient audio is recorded. Or the listener may query the device for the then-playing artist's concert schedule. Again, the appliance can contact a remote database and relay the query, together with the user ID and audio ID data. The database locates the requested data, and presents same to the user - either through a UI on device 10, or to the destination 32. If desired, the user can continue the dialog with a further instruction, e.g., to buy one of the CDs on which the then-playing song is included. Again, this instruction may be entered by voice, GUI, etc., and dispatched from the device to the server, which can then complete the transaction in accordance with pre-stored information (e.g. credit card account number, mailing address, etc.). A confirming message can be relayed to the device 10 or destination 32 for presentation to the user.
While the invention particularly contemplates audio, the principles detailed above find applications in many other media, and in many other applications of the MediaBridge server 28.
Moreover, while the invention particularly contemplates watermarks as the channel by which audio is identified, in other embodiments different techniques can be used. For example, digital radio protocols provide ID fields by which audio can be identified. Similarly, IP protocols for internet delivery of radio include identification fields within their packet formats. Accordingly, audio distributed according to formats that include audio IDs therein can likewise be employed according to the present invention.
In view of the many embodiments to which the principles of my invention may be applied, it should be apparent that the detailed embodiment is illustrative only and should not be taken as limiting the scope of the invention. Rather, I claim as my invention all such modifications as may fall within the scope and spirit of the following claims, and equivalents thereto.

Claims

I CLAIM:
1. A device comprising a housing sized for carrying in a user's pocket and including: a transducer to receive ambient audio and to output electrical signals corresponding thereto; a watermark detector coupled to the transducer for producing payload information; a memory storing user identification information; and an interface that receives at least some of both the payload information and the user identification information for transmission to a relay station.
2. The device of claim 1 in which the interface is a wireless interface.
3. The device of claim 1 including an alphanumeric display.
4. The device of claim 1 including a keypad.
6. A method comprising: receiving audio at a device; discerning from the audio a plural-bit audio ID; obtaining a user ID from a memory in the device; transmitting at least portions of both the audio ID and the user ID to a location remote from said device.
7. The method of claim 6 in which the audio ID comprises a Digital Object Identifier.
8. The method of claim 6 that further comprises receiving the audio by a microphone.
9. The method of claim 8 that further comprises discerning at least two IDs from the audio, one being said audio ID, another being an ID corresponding to an environment in which the device is located.
10. In a method of steganographically encoding audio with a plural-bit binary watermark payload, an improvement wherein the watermark payload comprises a Digital Object Identifier.
11. A method comprising generating a noise-like signal having a plural-bit location identifier encoded therein, and airing said signal through at least one loudspeaker in an environment, said aired signal being generally indiscernible by human listeners present in said environment.
PCT/US2000/035630 1999-12-30 2000-12-28 Watermark-based personal audio appliance WO2001050665A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU22957/01A AU2295701A (en) 1999-12-30 2000-12-28 Watermark-based personal audio appliance

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/476,686 1999-12-30
US09/476,686 US7562392B1 (en) 1999-05-19 1999-12-30 Methods of interacting with audio and ambient music

Publications (1)

Publication Number Publication Date
WO2001050665A1 true WO2001050665A1 (en) 2001-07-12

Family

ID=23892845

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/035630 WO2001050665A1 (en) 1999-12-30 2000-12-28 Watermark-based personal audio appliance

Country Status (2)

Country Link
AU (1) AU2295701A (en)
WO (1) WO2001050665A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5740244A (en) * 1993-04-09 1998-04-14 Washington University Method and apparatus for improved fingerprinting and authenticating various magnetic media
US5825871A (en) * 1994-08-05 1998-10-20 Smart Tone Authentication, Inc. Information storage device for storing personal identification information
US6035177A (en) * 1996-02-26 2000-03-07 Donald W. Moses Simultaneous transmission of ancillary and audio signals by means of perceptual coding

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9117270B2 (en) 1998-05-28 2015-08-25 Verance Corporation Pre-processed information embedding system
US7450734B2 (en) 2000-01-13 2008-11-11 Digimarc Corporation Digital asset management, targeted searching and desktop searching using digital watermarks
US9189955B2 (en) 2000-02-16 2015-11-17 Verance Corporation Remote control signaling using audio watermarks
EP1278183A1 (en) * 2001-07-19 2003-01-22 Samsung Electronics Co., Ltd. Voice operated electronic appliance
US9153006B2 (en) 2005-04-26 2015-10-06 Verance Corporation Circumvention of watermark analysis in a host content
US8570586B2 (en) 2005-05-02 2013-10-29 Digimarc Corporation Active images through digital watermarking
US9009482B2 (en) 2005-07-01 2015-04-14 Verance Corporation Forensic marking using a common customization function
GB2484140A (en) * 2010-10-01 2012-04-04 Ucl Business Plc Communicating data between devices
US11157582B2 (en) 2010-10-01 2021-10-26 Sonos Experience Limited Data communication system
US10025870B2 (en) 2010-10-01 2018-07-17 Asio Ltd Data communication system
GB2484140B (en) * 2010-10-01 2017-07-12 Asio Ltd Data communication system
US9323902B2 (en) 2011-12-13 2016-04-26 Verance Corporation Conditional access using embedded watermarks
US9106964B2 (en) 2012-09-13 2015-08-11 Verance Corporation Enhanced content distribution using advertisements
US9262794B2 (en) 2013-03-14 2016-02-16 Verance Corporation Transactional video marking system
US9251549B2 (en) 2013-07-23 2016-02-02 Verance Corporation Watermark extractor enhancements based on payload ranking
US9208334B2 (en) 2013-10-25 2015-12-08 Verance Corporation Content management using multiple abstraction layers
US9596521B2 (en) 2014-03-13 2017-03-14 Verance Corporation Interactive content acquisition using embedded codes
US11410670B2 (en) 2016-10-13 2022-08-09 Sonos Experience Limited Method and system for acoustic communication of data
US11683103B2 (en) 2016-10-13 2023-06-20 Sonos Experience Limited Method and system for acoustic communication of data
US11854569B2 (en) 2016-10-13 2023-12-26 Sonos Experience Limited Data communication system
US11671825B2 (en) 2017-03-23 2023-06-06 Sonos Experience Limited Method and system for authenticating a device
US11682405B2 (en) 2017-06-15 2023-06-20 Sonos Experience Limited Method and system for triggering events
US11870501B2 (en) 2017-12-20 2024-01-09 Sonos Experience Limited Method and system for improved acoustic transmission of data
US11988784B2 (en) 2020-08-31 2024-05-21 Sonos, Inc. Detecting an audio signal with a microphone to determine presence of a playback device

Also Published As

Publication number Publication date
AU2295701A (en) 2001-07-16

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP