CN101517550A - Social and interactive applications for mass media - Google Patents

Info

Publication number
CN101517550A
Authority
CN
China
Prior art keywords
descriptor
audience ratings
media broadcast
information
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA200680044650XA
Other languages
Chinese (zh)
Other versions
CN101517550B (en
Inventor
迈克尔·芬克
舒梅特·巴卢哈
米歇尔·科维尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Priority claimed from PCT/US2006/045551 (WO2007064641A2)
Publication of CN101517550A
Application granted
Publication of CN101517550B
Legal status: Active
Anticipated expiration

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses systems, methods, apparatuses, user interfaces and computer program products for providing social and interactive applications for mass media based on real-time ambient-audio and/or video identification. In some implementations, a method includes: receiving descriptors identifying ambient audio associated with a media broadcast; comparing the descriptors to one or more reference descriptors; and determining a rating for the media broadcast based at least in part on the results of the comparison.

Description

Social and Interactive Applications for Mass Media
Related Applications
This application claims priority to U.S. Provisional Patent Application No. 60/740,760, filed November 29, 2005, entitled "Environment-Based Referrals", the contents of which are incorporated herein by reference.
This application claims priority to U.S. Provisional Patent Application No. 60/823,881, filed August 29, 2006, entitled "Audio Identification Based on Signatures", the contents of which are incorporated herein by reference.
Technical Field
The present invention relates to social and interactive applications for mass media.
Background
Mass media channels (e.g., television and radio broadcasts) typically provide limited content to a large audience. By contrast, the World Wide Web provides vast amounts of information that may interest only a few individuals. Conventional interactive television attempts to bridge these two communication media by giving viewers a means to interact with their televisions and to receive content and/or services related to television broadcasts.
Conventional interactive television is typically available only to viewers who pay a subscription fee through a cable or satellite network. To receive interactive television services, the viewer must rent or buy a set-top box and have it installed by a technician. The viewer's television is connected to the set-top box, which lets the viewer interact with the television using a remote control or other input device and receive information, entertainment and services (e.g., advertisements, online shopping, surveys, games, etc.).
Although conventional interactive television can improve the viewer's television experience, there remains a need for social and interactive applications for mass media that do not rely on significant additional hardware or on physical connections between the television or radio and a set-top box or computer.
One social and interactive television application lacking in conventional interactive television systems is the ability to provide supplemental information to a mass media channel effortlessly. With conventional systems, a user would have to log on to a computer and query for such information, which diminishes the passive experience offered by mass media. Moreover, conventional television systems cannot provide supplemental information in real time while the user is watching a broadcast.
Another social and interactive television application lacking in conventional interactive television systems is the ability to dynamically link a viewer with an ad hoc social peer community (e.g., a discussion group, chat room, etc.) in real time. Imagine that you are watching the latest episode of "Friends" on television and discover that the character "Monica" is pregnant. You want to chat, comment, or read other viewers' responses to the scene in real time. One option is to log on to your computer, type the name "Friends" or other related terms into a search engine, and perform a search to find a discussion group about "Friends". Such action by the viewer, however, diminishes the passive experience offered by mass media and does not let the viewer interact dynamically (e.g., comment, chat, etc.) with other viewers who are watching the program at the same time.
Another deficiency of conventional television systems and interactive television systems is the lack of a simple method for evaluating the popularity of broadcast events. Popularity ratings of broadcast events are of high interest to users, broadcasters and advertisers. These needs are partly addressed by measurement systems such as the Nielsen ratings. Those ratings, however, require dedicated hardware to be installed and require cooperation from participating viewers.
Summary
The deficiencies described above are addressed by systems, methods, apparatuses, user interfaces and computer program products that provide social and interactive applications based on real-time ambient audio and/or video identification.
In some embodiments, a method includes: receiving a descriptor identifying ambient audio associated with a media broadcast; comparing the descriptor to reference descriptors associated with the media broadcast; and aggregating personalized information related to the media broadcast based on the result of the comparison.
In some embodiments, a method includes: receiving a first descriptor identifying ambient audio associated with a first media broadcast; receiving a second descriptor identifying ambient audio associated with a second media broadcast; comparing the first and second descriptors to determine whether the first and second media broadcasts are the same; and aggregating personalized information based on the result of the comparison.
In some embodiments, a method includes: detecting ambient audio associated with a media broadcast; generating a descriptor identifying the media broadcast; transmitting the descriptor to a network resource; and receiving from the network resource personalized information aggregated based on the descriptor.
In some embodiments, a system includes a database of reference descriptors. A database server is operatively coupled to the database and to a client. The database server is configured to receive from the client a descriptor identifying ambient audio associated with a media broadcast, to compare the received descriptor with one or more reference descriptors, and to aggregate personalized information related to the media broadcast based on the result of the comparison.
In some embodiments, a system includes an audio detector configured to sample ambient audio. A client-side interface is operatively coupled to the audio detector and is configurable to generate a descriptor identifying a media broadcast. The client-side interface is configurable to transmit the descriptor to a network resource and to receive from the network resource personalized information aggregated based on the descriptor.
In some embodiments, a method includes: receiving a descriptor identifying ambient audio associated with a media broadcast; comparing the descriptor to one or more reference descriptors; and determining a rating for the media broadcast based at least in part on the result of the comparison.
In some embodiments, a method includes: generating a descriptor identifying ambient audio associated with a media broadcast; providing the descriptor to a ratings provider for determining a rating for the media broadcast based on the descriptor; receiving the rating from the ratings provider; and displaying the rating on a display device.
In some embodiments, a method includes: recording an ambient audio snippet from a media broadcast; generating a descriptor from the ambient audio snippet; and providing the descriptor to a ratings provider.
In some embodiments, a system includes a database of reference descriptors. A server is operatively coupled to the database and to a client. The server is configurable to receive from the client a descriptor identifying ambient audio associated with a media broadcast, to compare the received descriptor with one or more reference descriptors, and to determine a rating for the media broadcast based at least in part on the result of the comparison.
In some embodiments, a system includes an audio detector configured to sample ambient audio. A client-side interface is operatively coupled to the audio detector and is configurable to generate a descriptor identifying ambient audio associated with a media broadcast. The client-side interface is configurable to transmit the descriptor to a network resource and to receive ratings information from the network resource based on the descriptor.
Other embodiments are directed to systems, methods, apparatuses, user interfaces and computer program products.
Brief Description of the Drawings
Fig. 1 is a block diagram of one embodiment of a mass personalization system;
Fig. 2 illustrates one embodiment of an ambient audio identification system that includes the client-side interface shown in Fig. 1;
Fig. 3 is a flow diagram of one embodiment of a process for providing mass personalization applications;
Fig. 4 is a flow diagram of one embodiment of an audio fingerprinting process;
Fig. 5 is a flow diagram of one embodiment of a user interface for interacting with mass personalization applications;
Fig. 6 is a block diagram of one embodiment of a hardware architecture for a client that implements the client-side interface shown in Fig. 1; and
Fig. 7 is a flow diagram of one embodiment of a repetition detection process.
Detailed Description
Mass Personalization Applications
Mass personalization applications provide personalized and interactive information related to mass media broadcasts (e.g., television, radio, movies, Internet broadcasts, etc.). Such applications include, but are not limited to: personalized information layers, ad hoc social peer communities, real-time popularity ratings and video (or audio) bookmarks. Although some of the mass media examples disclosed herein are in the context of television broadcasts, the disclosed embodiments are equally applicable to radio and/or music broadcasts.
Personalized information layers provide supplemental information to a mass media channel. Examples of personalized information layers include, but are not limited to: fashion, politics, business, health, travel, etc. For example, while watching a news segment about a celebrity, a viewer can be presented, on the television screen or a computer display device, with a fashion layer that provides information and/or images about the clothes and accessories the celebrity is wearing in the news segment. Personalized layers can also include advertisements promoting products or services related to the news segment, such as a link to a clothing store that sells the clothes the celebrity is wearing.
Ad hoc social peer communities provide a venue for commentary among users who are watching the same show on television or listening to the same radio station. For example, a user watching the latest CNN headlines can be provided with a commenting medium (e.g., a chat room, message board, wiki page, video link, etc.) that lets the user chat about, comment on, or read other viewers' responses to the ongoing mass media broadcast.
Real-time popularity ratings provide content providers and users with ratings information (similar to Nielsen ratings). For example, a user can instantly be provided with real-time popularity ratings for the television channels or radio stations being watched or listened to by the user's social network and/or by people with similar demographics.
Video or audio bookmarks give users a low-effort way to create personalized libraries of their favorite broadcast content. For example, a user can simply press a button on a computer or remote control device, and a snippet of the ambient audio and/or video of the broadcast content is recorded, processed and saved. The snippet can be used as a bookmark pointing to the program, or to a portion of the program, for later viewing. The bookmark can be shared with friends or saved for future personal reference.
Mass Personalization Network
Fig. 1 is a block diagram of a mass personalization system 100 for providing mass personalization applications. The system 100 includes one or more client-side interfaces 102, an audio database server 104 and a social application server 106, all of which communicate over a network 108 (e.g., the Internet, an intranet, a LAN, a wireless network, etc.).
The client-side interface 102 can be any device that allows a user to enter and receive information and that can present a user interface on a display device, including but not limited to: a desktop or portable computer, an electronic device, a telephone, a mobile phone, a display system, a television, a computer monitor, a navigation system, a portable media player/recorder, a personal digital assistant (PDA), a game console, a handheld electronic device, and an embedded electronic device or appliance. The client-side interface 102 is described in more detail with reference to Fig. 2.
In some embodiments, the client-side interface 102 includes an ambient audio detector (e.g., a microphone) for monitoring and recording the ambient audio of a mass media broadcast in a broadcast environment (e.g., a user's living room). One or more ambient audio segments, or "snippets", are converted into distinctive and robust statistical summaries, referred to as "audio fingerprints" or "descriptors". In some embodiments, the descriptors are compressed files containing one or more audio signature components that can be compared with previously generated reference descriptors or statistics associated with the mass media broadcast.
A technique for generating audio fingerprints for music identification is described in Ke, Y., Hoiem, D., Sukthankar, R. (2005), "Computer Vision for Music Identification", published in "Computer Vision and Pattern Recognition", which is incorporated herein by reference in its entirety. In some embodiments, the music identification approach proposed by Ke et al. is adapted to generate descriptors for television audio data and queries, as described with reference to Fig. 4.
U.S. Provisional Patent Application No. 60/823,881, "Audio Identification Based on Signatures", describes a technique for generating audio descriptors using wavelets. That application describes a technique that combines computer vision techniques with large-scale data stream processing algorithms to create compact descriptors/fingerprints of audio snippets that can be matched efficiently. The technique uses wavelets, a known mathematical tool for hierarchically decomposing functions.
In the "Audio Identification Based on Signatures" application, an embodiment of a retrieval process includes the following steps: 1) given the audio spectra of an audio snippet, extract spectral images, for example of duration 11.6*w ms, at random intervals averaging d ms apart. For each spectral image: 2) compute wavelets on the spectral image; 3) extract the top-t wavelets; 4) create a binary representation of the top-t wavelets; 5) use min-hash to create a sub-fingerprint of the top-t wavelets; 6) use LSH with b bins and l hash tables to find the sub-fingerprint segments that are the closest matches; 7) discard sub-fingerprints with fewer than v matches; 8) compute the Hamming distance from the remaining candidate sub-fingerprints to the query sub-fingerprint; and 9) use dynamic programming to combine the matches across time.
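Steps 3 to 5 and step 8 of the retrieval process can be illustrated with a minimal Python sketch. It is not the implementation from the referenced application: the wavelet coefficients are assumed to be already computed and passed in as a flat list, the two-bits-per-coefficient sign encoding is an assumption, and the function names (top_t_binary, min_hash, hamming) and parameters are hypothetical.

```python
import random

def top_t_binary(coeffs, t):
    """Keep the t largest-magnitude coefficients and encode every position
    as two bits: 10 for a kept positive value, 01 for a kept negative value,
    00 for everything else (assumed encoding)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:t])
    bits = []
    for i, c in enumerate(coeffs):
        if i in keep and c > 0:
            bits += [1, 0]
        elif i in keep and c < 0:
            bits += [0, 1]
        else:
            bits += [0, 0]
    return bits

def min_hash(bits, num_perms, seed=0):
    """For each pseudo-random permutation of the bit positions, record the
    rank of the first position that holds a 1. Both sides of a comparison
    must use the same seed so that they share the same permutations."""
    rng = random.Random(seed)
    signature = []
    for _ in range(num_perms):
        perm = list(range(len(bits)))
        rng.shuffle(perm)
        first = next((rank for rank, idx in enumerate(perm) if bits[idx]), len(bits))
        signature.append(first)
    return signature

def hamming(sig_a, sig_b):
    """Number of positions at which two equal-length signatures differ."""
    return sum(a != b for a, b in zip(sig_a, sig_b))
```

Two spectral images whose top-t wavelet patterns largely agree tend to produce signatures that agree in most positions, which is what makes the sub-fingerprints suitable as keys for the LSH lookup in step 6.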
In some embodiments, the descriptors and an associated user identifier ("user ID") identifying the client-side interface 102 are sent to the audio database server 104 via the network 108. The audio database server 104 compares the descriptors with a plurality of reference descriptors that were previously determined and stored in an audio database 110 coupled to the audio database server 104. In some embodiments, the audio database server 104 continually updates the reference descriptors stored in the audio database 110 from recent mass media broadcasts.
The audio database server 104 determines the best matches between the received descriptors and the reference descriptors and sends the best-match information to the social application server 106. The matching process is described in more detail with reference to Fig. 4.
In some embodiments, the social application server 106 accepts web browser connections associated with the client-side interface 102. Using the match information, the social application server 106 aggregates personalized information for the user and sends it to the client-side interface 102. The personalized information can include, but is not limited to: advertisements, personalized information layers, popularity ratings, and information associated with a commenting medium (e.g., ad hoc social peer communities, forums, discussion groups, video conferences, etc.).
In some embodiments, the personalized information can be used to create a chat room for viewers without knowing, in real time, which show the viewers are watching. The chat room can be created by directly comparing descriptors in the data streams transmitted by the clients to determine matches. That is, chat rooms can be created around viewers having matching descriptors. In such an embodiment, there is no need to compare the descriptors received from viewers against reference descriptors.
In some embodiments, the social application server 106 serves a web page to the client-side interface 102, where it is received and displayed by a web browser (e.g., Microsoft Internet Explorer™) running on the client-side interface 102. The social application server 106 also receives the user ID from the client-side interface 102 and/or the audio database server 104 to help aggregate personalized content and serve web pages to the client-side interface 102.
It should be apparent that other embodiments of the system 100 are possible. For example, the system 100 can include multiple audio databases 110, audio database servers 104 and/or social application servers 106. Alternatively, the audio database server 104 and/or the social application server 106 can be a single server or system, or part of a network resource and/or service. Also, the network 108 can include multiple networks and links operatively coupled together in various topologies and arrangements using a variety of network devices (e.g., hubs, routers, etc.) and media (e.g., copper, optical fiber, radio frequencies, etc.). The client-server architecture described herein is only an example. Other computer architectures are possible.
Ambient Audio Identification System
Fig. 2 illustrates an ambient audio identification system 200 that includes the client-side interface 102 shown in Fig. 1. The system 200 includes a mass media system 202 (e.g., a television set, radio, computer, electronic device, mobile phone, game console, network appliance, etc.), an ambient audio detector 204, the client-side interface 102 (e.g., a desktop or portable computer, etc.) and a network access device 206. In some embodiments, the client-side interface 102 includes a display device 210 for presenting a user interface (UI) 208 that lets the user interact with a mass personalization application, as described with reference to Fig. 5.
In operation, the mass media system 202 generates ambient audio of a mass media broadcast (e.g., television audio), which is detected by the ambient audio detector 204. The ambient audio detector 204 can be any device capable of detecting ambient audio, including a freestanding microphone or a microphone integrated with the client-side interface 102. The detected ambient audio is encoded by the client-side interface 102 to provide descriptors identifying the ambient audio. The descriptors are transmitted to the audio database server 104 by way of the network access device 206 and the network 108.
In some embodiments, client software running on the client-side interface 102 continually monitors and records n-second (e.g., 5-second) audio files ("snippets") of ambient audio. The snippets are then converted into m frames (e.g., 415 frames) of k-bit encoded descriptors (e.g., 32 bits), according to the process described with reference to Fig. 4. In some embodiments, the monitoring and recording is event based. For example, the monitoring and recording can be started automatically on a specified date and at a specified time (e.g., Monday, 8:00 PM) and continue for a specified duration (e.g., from 8:00 PM to 9:00 PM). Alternatively, the monitoring and recording can be started in response to user input (e.g., a mouse click, function key or key combination) or input from a control device (e.g., a remote control, etc.). In some embodiments, the ambient audio is encoded using a streaming variation of the 32-bit/frame discriminative features described by Ke et al.
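The example figures above (a 5-second snippet, 415 frames, 32 bits per frame) imply a small payload per query. The following Python sketch works through that arithmetic; the function names are hypothetical, and the even spacing of frames and the big-endian byte order are assumptions made only for illustration.

```python
def descriptor_layout(segment_seconds=5.0, frames=415, bits_per_frame=32):
    """Return (average spacing between frame starts in ms, payload size in bytes),
    assuming the frames are evenly spaced across the snippet."""
    frame_step_ms = segment_seconds * 1000.0 / frames
    total_bytes = frames * bits_per_frame // 8
    return frame_step_ms, total_bytes

def pack_frames(frame_values):
    """Pack a sequence of 32-bit frame descriptors into bytes
    (big-endian byte order is an arbitrary choice for this sketch)."""
    return b"".join(value.to_bytes(4, "big") for value in frame_values)

step_ms, size_bytes = descriptor_layout()
# step_ms is roughly 12 ms per frame; size_bytes is 1660 bytes per 5-second snippet
```

Under these assumptions each 5-second snippet is summarized in well under 2 KB, which is small enough to send to a server continuously.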
In some embodiments, the client software runs as a "sidebar" or other user interface element. In this way, when the client-side interface 102 is started, ambient audio sampling can begin immediately and run in the background, with results (optionally) displayed in the sidebar without invoking a full web browser session.
In some embodiments, ambient audio sampling can begin when the client-side interface 102 is started or when the viewer logs in to a service or application (e.g., email, etc.).
The descriptors are sent to the audio database server 104. In some embodiments, the descriptors are compressed statistical summaries of the ambient audio, as described by Ke et al. Sending statistical summaries preserves the acoustic privacy of the user because the summaries are not reversible; that is, the original audio cannot be recovered from the descriptors. Thus, conversations of the user or other people that are monitored and recorded in the broadcast environment cannot be reproduced from the descriptors. In some embodiments, the descriptors can be encrypted for additional privacy and security using one or more known encryption techniques (e.g., asymmetric or symmetric key encryption, elliptic curve encryption, etc.).
In some embodiments, the descriptors are sent to the audio database server 104 as a query submission (also referred to as query descriptors) in response to a trigger event detected by the monitoring process at the client-side interface 102. For example, a trigger event could be the opening theme of a television program (e.g., the opening tune of "Seinfeld") or dialogue spoken by the actors. In some embodiments, the query descriptors can be sent to the audio database server 104 as part of a continuous streaming process. In some embodiments, the query descriptors can be sent to the audio database server 104 in response to user input (e.g., via remote control, mouse clicks, etc.).
Mass Personalization Process
Fig. 3 is a flow diagram of a mass personalization process 300. The steps of the process 300 need not be completed in any particular order, and at least some steps can be performed at the same time in a multi-threading or parallel processing environment.
The process 300 begins when a client-side interface (e.g., the client-side interface 102) monitors and records snippets of ambient audio of a mass media broadcast in a broadcast environment (302). The recorded ambient audio snippets are encoded into descriptors (e.g., compressed statistical summaries), which can be sent to an audio database server as queries (304). The audio database server compares the queries against a database of reference descriptors computed from mass media broadcast statistics to determine candidate descriptors that best match the query. The candidate descriptors are sent to a social application server or other network resource, which uses them to aggregate personalized information for the user (310). For example, if the user is watching the television show "Seinfeld", then query descriptors generated from the show's ambient audio will be matched with reference descriptors derived from previous "Seinfeld" broadcasts. Thus, the best matching candidate descriptors are used to aggregate personalized information relating to "Seinfeld" (e.g., news stories, discussion groups, links to ad hoc social peer communities or chat rooms, advertisements, etc.). In some embodiments, the matching procedure is performed efficiently using hashing techniques (e.g., direct hashing or locality-sensitive hashing (LSH)) to obtain a short list of candidate descriptors, as described with reference to Fig. 4. The candidate descriptors are then processed in a validation procedure, such as the one described by Ke et al.
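The two-stage matching described above (a hash lookup that yields a short list of candidates, followed by a validation step) can be sketched in Python as follows. This is a simplified illustration using direct hashing on exact 32-bit frame values, not the procedure of Fig. 4 or of Ke et al.; the function names, the vote-based shortlist, the bit-error-rate check and the 0.25 threshold are all assumptions.

```python
from collections import Counter, defaultdict

def build_index(references):
    """Map each 32-bit frame value to the (program, offset) pairs where it occurs."""
    index = defaultdict(list)
    for program, frames in references.items():
        for offset, value in enumerate(frames):
            index[value].append((program, offset))
    return index

def shortlist(index, query, max_candidates=3):
    """Direct hashing: vote for (program, alignment) pairs that share
    exact frame values with the query, and keep the most-voted ones."""
    votes = Counter()
    for q_offset, value in enumerate(query):
        for program, r_offset in index.get(value, ()):
            votes[(program, r_offset - q_offset)] += 1
    return [candidate for candidate, _ in votes.most_common(max_candidates)]

def bit_error_rate(query, reference, start):
    """Validation: fraction of differing bits when the query is aligned at `start`."""
    if start < 0 or start + len(query) > len(reference):
        return 1.0
    window = reference[start:start + len(query)]
    diff = sum(bin(q ^ r).count("1") for q, r in zip(query, window))
    return diff / (32 * len(query))

def best_match(references, index, query, max_ber=0.25):
    """Return (program, bit error rate) for the best validated candidate, or None."""
    best = None
    for program, start in shortlist(index, query):
        ber = bit_error_rate(query, references[program], start)
        if ber <= max_ber and (best is None or ber < best[1]):
            best = (program, ber)
    return best
```

The shortlist keeps the expensive bitwise comparison confined to a handful of alignments, so the cost of a query does not grow with the total number of reference frames in the same way a full scan would.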
In some embodiments, query descriptors from different viewers are matched directly, rather than matching each query against a database of reference descriptors. Such an embodiment allows ad hoc social peer communities to be created around subject matter for which a database of reference descriptors is not available. Such an embodiment can match, in real time, viewers who are in the same public place (e.g., a stadium, bar, etc.) and are using portable electronic devices (e.g., mobile phones, PDAs, etc.).
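Direct viewer-to-viewer matching can be sketched as a grouping problem over the descriptors that clients submit for the same time window. The Python sketch below is only an illustration: it assumes the descriptors are already time-aligned lists of 32-bit frames of equal length, and the greedy grouping rule, the function names and the 0.2 threshold are hypothetical.

```python
def bit_distance(desc_a, desc_b):
    """Fraction of differing bits between two equal-length lists of 32-bit frames."""
    diff = sum(bin(a ^ b).count("1") for a, b in zip(desc_a, desc_b))
    return diff / (32 * len(desc_a))

def group_viewers(descriptors, threshold=0.2):
    """Greedy grouping: a viewer joins the first group whose first member
    is within the distance threshold; otherwise the viewer starts a new group."""
    groups = []
    for user_id, desc in descriptors.items():
        for group in groups:
            if bit_distance(desc, descriptors[group[0]]) <= threshold:
                group.append(user_id)
                break
        else:
            groups.append([user_id])
    return groups

viewers = {
    "u1": [0xFFFF0000, 0x12345678],
    "u2": [0xFFFF0001, 0x12345678],  # one bit away from u1
    "u3": [0x0000FFFF, 0xEDCBA987],  # bitwise complement of u1
}
rooms = group_viewers(viewers)
# rooms == [["u1", "u2"], ["u3"]]
```

Each resulting group could back one chat room, without the server ever needing to know which program the group is watching.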
Popularity Ratings
In some embodiments, real-time and aggregate statistics are inferred from a list of viewers currently watching a broadcast (e.g., a show, an advertisement, etc.). These statistics can be gathered in the background while viewers are using other applications. Statistics can include, but are not limited to: 1) the average number of viewers watching the broadcast; 2) the average number of times viewers watched the broadcast; 3) other shows the viewers watched; 4) the minimum and peak number of viewers; 5) what viewers most often switched to when they left a broadcast; 6) how long viewers watch a broadcast; 7) how many times viewers flip channels; 8) which advertisements viewers watch; and 9) what viewers most often switched from when they entered a broadcast, etc. From these statistics, one or more popularity ratings can be determined.
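Two of the listed statistics, the average number of viewers and the most common destination after leaving a broadcast, can be derived from a simple log of matches. The Python sketch below assumes a hypothetical log format of (minute, user, channel) samples, one per viewer per sampled minute; neither the format nor the function names come from the source.

```python
from collections import Counter, defaultdict

def average_viewers(samples, channel):
    """Mean number of distinct viewers per sampled minute for one channel,
    averaged over the minutes in which the channel had at least one viewer."""
    per_minute = defaultdict(set)
    for minute, user, ch in samples:
        if ch == channel:
            per_minute[minute].add(user)
    if not per_minute:
        return 0.0
    return sum(len(users) for users in per_minute.values()) / len(per_minute)

def switch_destinations(samples, channel):
    """Count which channels viewers tune to immediately after leaving `channel`."""
    by_user = defaultdict(list)
    for minute, user, ch in sorted(samples):
        by_user[user].append(ch)
    destinations = Counter()
    for channels in by_user.values():
        for prev, nxt in zip(channels, channels[1:]):
            if prev == channel and nxt != channel:
                destinations[nxt] += 1
    return destinations

log = [
    (0, "a", "news"), (0, "b", "news"),
    (1, "a", "news"), (1, "b", "sports"),
    (2, "a", "sports"), (2, "b", "sports"),
]
# average_viewers(log, "news") == 1.5
# switch_destinations(log, "news") == Counter({"sports": 2})
```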
The statistics used to generate popularity ratings can be produced using a counter for each broadcast channel being monitored. In some embodiments, the counters can be broken down by demographic group data or geographic group data. Viewers can use popularity ratings to see which ongoing broadcasts are popular (e.g., by noticing an increased rating during the 2004 Super Bowl halftime show). Advertisers and content providers can also use popularity ratings to dynamically adjust the material shown in response to ratings. This is especially true for advertisements, because advertisements are short and advertising campaigns produce many versions of them, so one version can easily be exchanged for another to adjust to viewer rating levels. Other examples of statistics include, but are not limited to: the popularity of a television broadcast versus a radio broadcast by demographics or by time; the popularity of times of day, i.e., peak watching/listening times; the number of households in a given area; the amount of channel surfing during particular shows (genre of show, particular times of day); the volume of the broadcast, etc.
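A per-channel counter with optional demographic and geographic breakdowns can be sketched in Python as shown below. The class name, method names and grouping keys (region, age band) are hypothetical; the sketch only shows how one counter per monitored channel can yield a simple popularity share.

```python
from collections import Counter

class RatingCounters:
    """One counter per monitored channel, with optional group breakdowns."""

    def __init__(self):
        self.total = Counter()
        self.by_group = Counter()

    def record_match(self, channel, region=None, age_band=None):
        """Increment the channel counter each time a viewer's descriptor matches it."""
        self.total[channel] += 1
        if region is not None:
            self.by_group[(channel, "region", region)] += 1
        if age_band is not None:
            self.by_group[(channel, "age", age_band)] += 1

    def share(self, channel):
        """Fraction of all counted viewers who are on `channel`."""
        viewers = sum(self.total.values())
        return self.total[channel] / viewers if viewers else 0.0

    def most_popular(self):
        """Channel with the highest count, or None if nothing was recorded."""
        return self.total.most_common(1)[0][0] if self.total else None

counters = RatingCounters()
counters.record_match("ch1", region="US", age_band="20-30")
counters.record_match("ch1", region="US")
counters.record_match("ch1", region="UK")
counters.record_match("ch2", region="US")
# counters.share("ch1") == 0.75 and counters.most_popular() == "ch1"
```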
Customized information is sent to client-side interface (312).Popular audience ratings also can be stored in the database uses (318), for example dynamic adjustment of above-mentioned advertisement for other process.This customized information is received (314) at client-side interface, and formatted and be presented on user interface (316) at this.This customized information can with the comment media of presenting to the user in user interface (for example: the text message in the chatroom) be associated.In certain embodiments, the chatroom can comprise one or more son groups.For example, may comprise the child group that is called " Song Fei expert " about the discussion group of " Song Feichuan ", perhaps child group can with specific crowd, for example the age watches the women of " Song Feichuan " to be associated at 20-30 between year, or the like.
In certain embodiments, the raw information (e.g., counter values) used to generate statistics for popularity ratings is collected and stored at the client-side interface rather than at the social application server. The raw information can be transferred to the broadcaster whenever the user is online and/or invokes a mass personalization application.
In certain embodiments, a broadcast measurement box (BMB) is installed at the client-side interface. The BMB is a simple hardware device that is similar to a set-top box but is not connected to the broadcast equipment. Unlike the Nielsen rating system, which requires hardware to be installed in the television, the BMB can be installed near the mass media system or within range of the television signal. In certain embodiments, the BMB automatically records audio snippets and generates descriptors, which are stored in memory (e.g., flash media). In certain embodiments, the BMB optionally includes one or more hardware buttons that a user can press to indicate which broadcast program is being watched (similar to Nielsen ratings). The ratings provider can pick up the BMB device from time to time to collect the stored descriptors, or the BMB can from time to time transmit the stored descriptors to one or more interested parties over a network connection (e.g., telephone, Internet, wireless radio such as SMS/carrier radio, etc.).
In certain embodiments, advertisements can be monitored to determine their effectiveness, which can be reported back to advertisers. Examples include which advertisements were watched, which were skipped, the volume level of the advertisements, and so on.
In certain embodiments, an image capture device (e.g., a digital camera, video recorder, etc.) can be used to measure how many viewers are watching or listening to a broadcast. For example, various known image matching algorithms can be applied to an image or a sequence of images to determine the number of viewers present in the broadcast environment during a particular broadcast. The images, and/or data derived from them, can be combined with audio descriptors to gather personalized information for a user, to compute popularity ratings, or for any other purpose.
Audio fingerprinting process
Fig. 4 is a flow diagram of audio fingerprinting process 400. The steps of process 400 do not have to be completed in any particular order, and at least some steps can be performed at the same time in a multithreading or parallel processing environment. Process 400 matches, in real time and with low latency, query descriptors generated at a client-side interface (e.g., client-side interface 102) against reference descriptors stored in one or more databases. Process 400 adapts a technique proposed by Ke et al. to handle ambient audio data (e.g., from a television broadcast) and queries.
Process 400 begins at the client-side interface, where ambient audio snippets of a mass media broadcast (e.g., 5-6 seconds of audio) captured by an ambient audio detector (e.g., a microphone) are decomposed into overlapping frames (402). In certain embodiments, the frames are spaced several milliseconds apart (e.g., 12 milliseconds apart). Each frame is converted into a descriptor (e.g., a 32-bit descriptor) that has been trained to overcome audio noise and distortion (404), as described in Ke et al. In certain embodiments, each descriptor represents an identifying statistical summary of the audio snippet.
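The decomposition step (402) can be sketched as follows. This is a minimal Python illustration, not the disclosed implementation: the function name `split_into_frames` is hypothetical, and the 32 ms frame length is an assumption chosen only for this example (the text specifies the spacing between frames, not their length).

```python
def split_into_frames(samples, sample_rate, frame_ms=32.0, hop_ms=12.0):
    """Split a snippet into overlapping frames whose starts are hop_ms apart.

    frame_ms is an assumed frame length; hop_ms follows the 12 ms spacing
    given as an example in the text.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame length and hop must be at least one sample")
    # Frames overlap whenever hop < frame_len; only full frames are kept.
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]
```

With these assumed values, a 5-second snippet sampled at 8 kHz yields 415 frames of 256 samples each.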
In certain embodiments, the descriptors can be sent as query snippets (also referred to as query descriptors) to an audio database server, where they are matched against a database of reference descriptors identifying statistical summaries of previously recorded ambient audio snippets of the mass media broadcast (406). A list of the best-matching candidate descriptors can be determined (408). The candidate descriptors can be scored so that candidates that are temporally consistent with the query descriptors receive higher scores than candidates that are less temporally consistent with the query descriptors (410). The candidate descriptors with the highest scores (e.g., scores exceeding a sufficiently high threshold) are transmitted or otherwise provided to the social application server (412), where they can be used to aggregate personalized information related to the media broadcast. Using a threshold ensures that the descriptors are sufficiently well matched before the candidate descriptors are transmitted or otherwise provided to the social application server (412).
In certain embodiments, the database of reference descriptors can be generated from broadcasts provided by various media companies, which can be indexed and used to generate the descriptors. In other embodiments, reference descriptors can also be generated using television guides or other metadata and/or information embedded in the broadcast signal.
In certain embodiments, speech recognition technology can be used to help identify which program is being watched. Such technology can help users discuss news events and not just television shows. For example, a user might be watching a space shuttle launch on a different channel than other viewers and might therefore receive a different audio signal (e.g., because of a different announcer). Speech recognition technology can be used to recognize keywords (e.g., shuttle, launch, etc.), and these keywords can be used to link the user with a commenting medium.
Hashing descriptors
Ke et al. use computer vision techniques to find highly discriminative, compact statistics for audio. Their procedure is trained on labeled pairs of positive examples (where x and x' are noisy versions of the same audio) and negative examples (where x and x' come from different audio). During this training phase, a machine learning technique based on boosting uses the labeled pairs to select a combination of 32 filters and thresholds that jointly create a highly discriminative statistic. The filters localize changes in the spectrogram magnitude using first-order and second-order differences across time and frequency. One benefit of using these simple difference filters is that they can be computed efficiently with the integral image technique described in Viola, P. and Jones, M., "Robust Real-Time Object Detection," International Journal of Computer Vision (2002), which is incorporated herein by reference in its entirety.
In certain embodiments, the outputs of these 32 filters are thresholded, giving one bit per filter for each audio frame. These 32 threshold results form the only description of that audio frame that is transmitted. The sparsity of this encoding protects the user's privacy against unauthorized eavesdropping. In addition, these 32-bit descriptors are robust to the audio distortions present in the training data, so that positive examples (e.g., matching frames) have small Hamming distances (i.e., distances measured as the number of differing bits) and negative examples (e.g., mismatched frames) have large Hamming distances. Note that more or fewer filters can be used, and that more than one bit per filter can be used for each audio frame (e.g., multiple bits obtained from multiple threshold tests).
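The thresholding and the Hamming distance described above can be illustrated with a short Python sketch. The filter outputs and thresholds are treated as given inputs (in the described system they would come from the trained filters); the function names and the bit ordering are assumptions for this example.

```python
def descriptor_from_filters(filter_outputs, thresholds):
    """Pack 32 thresholded filter outputs into one 32-bit integer.

    Bit i is set when filter i exceeds its threshold (assumed bit order).
    """
    if len(filter_outputs) != 32 or len(thresholds) != 32:
        raise ValueError("expected exactly 32 filter outputs and thresholds")
    bits = 0
    for i, (value, threshold) in enumerate(zip(filter_outputs, thresholds)):
        if value > threshold:
            bits |= 1 << i
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two descriptors differ."""
    return bin(a ^ b).count("1")
```

Matching frames would be expected to give small values of `hamming_distance`, and mismatched frames large ones.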
In certain embodiments, the 32-bit descriptor itself is used as the hash key for direct hashing. The descriptor acts as a well-balanced hash function. Retrieval rates are further improved by querying not only the query descriptor but also a small set of similar descriptors (those at a Hamming distance of at most 2 from the original query descriptor).
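A minimal Python sketch of direct hashing with a Hamming radius of 2 is given here for illustration. The index layout (a dictionary from descriptor to the list of reference frame positions) and all function names are assumptions for this example.

```python
from collections import defaultdict
from itertools import combinations


def neighbors_within_2(descriptor, n_bits=32):
    """Yield the descriptor and every value at Hamming distance 1 or 2."""
    yield descriptor
    for i in range(n_bits):
        yield descriptor ^ (1 << i)
    for i, j in combinations(range(n_bits), 2):
        yield descriptor ^ (1 << i) ^ (1 << j)


def build_index(reference_descriptors):
    """Direct hashing: the descriptor value itself is the hash key."""
    index = defaultdict(list)
    for position, descriptor in enumerate(reference_descriptors):
        index[descriptor].append(position)
    return index


def lookup(index, query_descriptor):
    """Return reference positions whose descriptor is within distance 2."""
    hits = []
    for key in neighbors_within_2(query_descriptor):
        hits.extend(index.get(key, ()))
    return hits
```

For a 32-bit descriptor this probes 1 + 32 + 496 = 529 keys per query descriptor, which keeps the lookup cheap while tolerating up to two flipped bits.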
Within-query temporal consistency
Once the query descriptors have been matched against the audio database using the hashing procedure described above, the matches are validated to determine which of the hits returned by the database are accurate matches. Without this validation, a candidate descriptor might have many frames that match the query descriptors but with the wrong temporal structure.
In certain embodiments, validation is achieved by treating each database hit as support for a match at a specific query-database offset. For example, if the eighth descriptor (q_8) of a 5-second, 415-frame-long "Seinfeld" query snippet q hits the 1008th database descriptor (x_1008), this supports a candidate match between the 5-second query and frames 1001 through 1415 of the audio database. Other matches between q_n and x_(1000+n) (1 ≤ n ≤ 415) would support the same candidate match.
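The offset-support idea can be sketched in Python as a simple vote over query-database offsets. The function names and the 1-based `(query_frame, database_frame)` hit format are assumptions for this illustration.

```python
from collections import Counter


def vote_offsets(hits):
    """Count how many hits support each query-database offset.

    hits: iterable of (n, m) pairs meaning query frame n hit database
    frame m (1-based, as in the example in the text).
    """
    return Counter(m - n for n, m in hits)


def candidate_span(offset, query_length):
    """Database frames covered by a query aligned at the given offset."""
    return offset + 1, offset + query_length
```

In the example from the text, the hit of q_8 on x_1008 votes for offset 1000, and `candidate_span(1000, 415)` gives frames 1001 through 1415.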
In addition to temporal consistency, we need to account for frames in which conversation temporarily drowns out the ambient audio. This can be modeled as an exclusive switch between ambient audio and interfering sounds. For each query frame i there is a hidden variable y_i: if y_i = 0, the i-th frame of the query is modeled as interference only; if y_i = 1, the i-th frame is modeled as coming from clean ambient audio. Taking this extreme view (pure ambient audio or pure interference) is justified by the extremely low precision with which each audio frame is represented (32 bits), and it is softened by providing additional bit-flip probabilities for each of the 32 positions of the frame vector under each of the two hypotheses (y_i = 0 and y_i = 1). Finally, we model the frame-to-frame transitions between the ambient-only and interference-only states as a hidden first-order Markov process, with transition probabilities derived from training data. For example, we can reuse the 66-parameter probability model given by Ke et al. at CVPR 2005.
The final model of the match probability between a query vector q and an ambient database vector x_N at an offset of N frames is:
$$P(q \mid x_N) = \prod_{n=1}^{415} P(\langle q_n, x_{N+n} \rangle \mid y_n)\, P(y_n \mid y_{n-1}) \qquad (1)$$
where ⟨q_n, x_m⟩ denotes the bit differences between the 32-bit frame vectors q_n and x_m. This model combines the temporal consistency constraint with the ambient/interference hidden Markov model.
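One way such a model could be evaluated is sketched below in Python. This is an illustration under stated assumptions, not the procedure of Ke et al.: the hidden states are summed out with the standard forward algorithm, the bit-flip and transition probabilities are hypothetical placeholders rather than trained values, and indices are 0-based so that `query[n]` is compared with `reference[offset + n]`.

```python
import math


def log_emission(q_n, x_m, flip_prob):
    """Log-probability of the observed bit differences for one frame pair.

    flip_prob holds 32 per-position flip probabilities for one hidden state.
    """
    diff = q_n ^ x_m
    return sum(
        math.log(p if (diff >> k) & 1 else 1.0 - p)
        for k, p in enumerate(flip_prob)
    )


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def match_log_likelihood(query, reference, offset, flip_probs, stay_probs,
                         initial=(0.5, 0.5)):
    """Forward algorithm over y_n in {0: interference, 1: ambient}."""
    def log_trans(prev, cur):
        p = stay_probs[prev] if prev == cur else 1.0 - stay_probs[prev]
        return math.log(p)

    alpha = [
        math.log(initial[y])
        + log_emission(query[0], reference[offset], flip_probs[y])
        for y in (0, 1)
    ]
    for n in range(1, len(query)):
        alpha = [
            log_add(alpha[0] + log_trans(0, y), alpha[1] + log_trans(1, y))
            + log_emission(query[n], reference[offset + n], flip_probs[y])
            for y in (0, 1)
        ]
    return log_add(alpha[0], alpha[1])
```

Candidate offsets can then be ranked by this log-likelihood, with the correctly aligned offset expected to score highest.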
Post-match consistency filtering
People often talk with others while watching television, which produces sporadic but strong acoustic interference, especially when laptop-based microphones are used to sample the ambient audio. Given that most conversational utterances last two or three seconds, a simple exchange between viewers can render a 5-second query unrecognizable.
In certain embodiments, post-match filtering is used to handle these intermittent low-confidence mismatches. For example, we can use a continuous-time hidden Markov model of channel switching with an expected dwell time (i.e., the time between channel changes) of L seconds. The social application server 106 keeps the highest-confidence match from the recent past (together with its "discounted" confidence) as part of the state information associated with each client session. Using this information, the server 106 selects either the content-index match from the recent past or the current index match, whichever has the higher confidence.
We use M_h and C_h to denote the best match from the previous time step (5 seconds earlier) and its log-likelihood confidence score. If we simply apply the Markov model to this previous best match, without taking another observation, then our expectation is that the best match for the current time is the same program sequence, just 5 seconds further along, and our confidence in this expectation is C_h - l/L, where l = 5 seconds is the query time step. This discount of l/L in the log-likelihood corresponds to the Markov model probability e^(-l/L) of not switching channels during a time step of length l.
An alternative hypothesis is generated by the audio match for the current query. We use M_0 to denote the best match for the current audio snippet, that is, the match produced by the audio fingerprinting process 400. C_0 is the log-likelihood confidence score given by the audio fingerprinting process 400.
If these two hypotheses (the updated historical expectation and the current snippet observation) give different matches, we select the hypothesis with the higher confidence score:
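$$\{M_0, C_0\} = \begin{cases} \{M_h,\; C_h - l/L\} & \text{if } C_h - l/L > C_0 \\ \{M_0,\; C_0\} & \text{otherwise} \end{cases} \qquad (2)$$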
where M_0 is the match used by the social application server 106 to select related content, and M_0 and C_0 are carried forward to the next time step as M_h and C_h.
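The selection between the discounted historical expectation and the current observation can be sketched in Python as follows. The representation of a match as a `(program_id, position_seconds)` pair, the function name, and the default dwell time of 120 seconds are assumptions made only for this illustration; the text does not give a value for L.

```python
def update_match(m_h, c_h, m_0, c_0, step_s=5.0, dwell_s=120.0):
    """Pick the hypothesis with the higher log-likelihood confidence.

    m_h, m_0: matches as (program_id, position_seconds) pairs (assumed form).
    c_h, c_0: log-likelihood confidence scores.
    dwell_s:  expected dwell time L in seconds (hypothetical value).
    Returns the (match, confidence) pair to carry into the next time step.
    """
    # Historical expectation: same program, one query step further along,
    # with confidence discounted by l / L.
    expected = (m_h[0], m_h[1] + step_s)
    discounted = c_h - step_s / dwell_s
    if discounted > c_0:
        return expected, discounted
    return m_0, c_0
```

The returned pair plays the role of M_h and C_h at the next time step, so a single noisy query does not immediately displace a confident earlier match.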
User interface
Fig. 5 is a flow diagram of one embodiment of a user interface 208 for interacting with mass personalization applications. The user interface 208 includes a personalization layer display area 502, a commenting medium display area 504, a sponsored links display area 506 and a content display area 508. The personalization layer display area 502 provides complementary information and/or images related to the video content shown in the content display area 508. The personalization layers can be navigated using a navigation bar 510 and an input device (e.g., a mouse or remote control). Each layer has an associated label in the navigation bar 510. For example, if the user selects the "Fashion" label, the fashion layer, which contains fashion-related content associated with "Seinfeld," is presented in the display area 502.
In certain embodiments, the client-side interface 102 includes a display device 210 capable of presenting the user interface 208. In certain embodiments, the user interface 208 is an interactive web page served by the social application server 106 and presented in a browser window on the screen of the display device 210. In certain embodiments, the user interface 208 is persistent and remains available for interaction after the broadcast audio used in the content matching process has changed. In certain embodiments, the user interface 208 is dynamically updated over time or in response to a trigger event (e.g., a new person entering the chat room, the start of a commercial, etc.). For example, each time a commercial is broadcast, the sponsored links display area 506 can be updated with new links 518 related to the subject matter of the commercial.
In certain embodiments, the personalized information and sponsored links can be e-mailed to the viewer or shown in a sidebar at a later time.
In certain embodiments, the client-side interface 102 receives personalized information from the social application server 106. This information can include a web page, e-mail, a message board, links, instant messages, a chat room, or an invitation to join an ongoing discussion group, an electronic room, a video conference or web conference, a voice call, and so on. In certain embodiments, the user interface 208 provides access to comments, and/or links to comments, from previously seen broadcasts or movies. For example, a user who is currently watching a DVD of "Shrek" may want to see what people said about the movie in the past.
In certain embodiments, the display area 502 includes a ratings region 512, which is used to display popularity ratings related to a broadcast. For example, the region 512 can show how many viewers are currently watching "Seinfeld" compared with other television shows being broadcast at the same time.
In certain embodiments, the commenting medium display area 504 presents a chat-room type of environment in which multiple users can comment on broadcasts. In certain embodiments, the display area 504 includes a text box 514 for entering comments, which are sent to the chat room using an input mechanism 516 (e.g., a button).
The sponsored links display area 506 includes information, images and/or links related to advertising associated with the broadcast. For example, one of the links 518 can take the user to a website that sells "Seinfeld" merchandise.
The content display area 508 is where the broadcast content is displayed. For example, a scene from the current broadcast can be displayed together with other relevant information (e.g., episode number, title, schedule, etc.). In certain embodiments, the display area 508 includes controls 520 (e.g., scroll buttons) for navigating through the displayed content.
Video bookmarks
In certain embodiments, a button 522 for bookmarking video can be included in the content display area. For example, when the user clicks the button 522, the "Seinfeld" episode shown in the display area 508 is added to the user's library of favorite videos, and the episode can then be viewed on demand through web-based streaming or another access method. Depending on the policy set by the content owner, the streaming service can provide free single-viewing playback, collect payment as an agent for the content owner, or insert advertisements that provide payment to the content owner.
Client-side interface hardware architecture
Fig. 6 is a block diagram of a hardware architecture 600 for the client-side interface 102 shown in Fig. 1. Although the hardware architecture 600 is typical of a computing device (e.g., a personal computer), the disclosed embodiments can be realized in any device capable of presenting a user interface on a display device, including but not limited to: desktop or portable computers; electronic devices; telephones; mobile phones; display systems; televisions; monitors; navigation systems; portable media players/recorders; personal digital assistants; game systems; handheld electronic devices; and embedded electronic devices or appliances.
In certain embodiments, the system 600 includes one or more processors 602 (e.g., a CPU), optionally one or more display devices 604 (e.g., a cathode ray tube (CRT), a liquid crystal display (LCD), etc.), a microphone interface 606, one or more network interfaces 608 (e.g., Universal Serial Bus (USB), Ethernet, FireWire ports, etc.), optionally one or more input devices 610 (e.g., a mouse, keyboard, etc.), and one or more computer-readable media 612. Each of these components is operatively coupled to one or more buses 614 (e.g., Extended Industry Standard Architecture (EISA), Peripheral Component Interconnect (PCI), USB, FireWire, NuBus, PDS, etc.).
In certain embodiments, there are no display devices or input devices, and the system 600 simply performs sampling and encoding (e.g., generating descriptors) in the background without user input.
The term "computer-readable medium" refers to any medium that participates in providing instructions to the processor 602 for execution, including but not limited to: non-volatile media (e.g., optical or magnetic disks), volatile media (e.g., memory) and transmission media. Transmission media include, but are not limited to, coaxial cables, copper wire and optical fiber. Transmission media can also take the form of acoustic, light or radio-frequency waves.
The computer-readable medium 612 further includes an operating system 616 (e.g., Mac OS, Unix, Linux, etc.), a network communication module 618, client software 620 and one or more applications 622. The operating system 616 can be multi-user, multiprocessing, multitasking, multithreading, real-time and the like. The operating system 616 performs basic tasks, including but not limited to: recognizing input from the input devices 610; sending output to the display devices 604; keeping track of files and directories on the storage devices 612; controlling peripheral devices (e.g., disk drives, printers, image capture devices, etc.); and managing traffic on the one or more buses 614.
The network communication module 618 includes various components for establishing and maintaining network connections (e.g., software for implementing communication protocols such as Transmission Control Protocol/Internet Protocol (TCP/IP), Hypertext Transfer Protocol (HTTP), Ethernet, Universal Serial Bus (USB), FireWire, etc.).
The client software 620 provides various software components for implementing the client side of the mass personalization applications and for performing the various client-side functions described with respect to Figs. 1 to 5 (e.g., ambient audio identification). In certain embodiments, some or all of the processes performed by the client software 620 can be integrated into the operating system 616. In certain embodiments, the processes can be implemented at least partly in digital electronic circuitry, or in computer hardware, firmware, software, or any combination of these.
Other applications 624 can include any other software applications, including but not limited to: word processors, browsers, e-mail, instant messaging, media players, telephony software, and so on.
Detecting advertisements and rebroadcasts
Repetition detection
When preparing a database for search, it helps to be able to flag repeated material in advance using the descriptors described above. Repeated material can include, but is not limited to: repeated programs, advertisements, sub-segments (e.g., stock footage in news programs), and so on. Using these flags, repeated material can be presented in a way that does not push all other material beyond the attention span of the user conducting the search (e.g., beyond the first 10-20 hits). The process 700 described below provides a way to detect these duplicates before any search queries are run against the database.
Video advertisement removal
One of the complaints that broadcasters have about allowing their material to be searched and played back is the rebroadcast of embedded advertising. From the broadcaster's point of view, this rebroadcast is counterproductive: because it gives the advertiser free advertising, it lowers the value of the broadcasts that the advertiser pays for directly. Unless old advertisements are removed and new advertisements are put in their place in a way that returns some revenue to the original broadcaster, broadcasters do not profit from the replay of previously broadcast material. The process 700 described below provides a way of detecting embedded advertisements by looking for repetitions, possibly in conjunction with other criteria (e.g., duration, volume, visual activity, bracketing blank frames, etc.).
Video summarization
If a "summary" (i.e., a shorter version) of non-repeated program material is needed, one way to obtain it is to remove the advertisements (detected as repeated material) and take segments from the material immediately before and immediately after the advertisement locations. On broadcast television, these positions in a program typically contain "teasers" (before the advertisements) and "recaps" (just after the advertisements). If a summary is to be made of a news program that contains a mix of non-repeated material and repeated non-advertisement material, the repeated non-advertisement material typically corresponds to sound bites. These segments generally convey less information than the anchor's narration of the news story and are good candidates for removal. If a summary is to be made of a narrative program (e.g., a movie or a serial installment), repeated audio tracks typically correspond to theme music, mood music or silence. Again, these are typically good segments to remove from a summary video. The process 700 described below provides a way of detecting these repeated audio tracks so that they can be removed from the summary video.
Repetition detection process
Fig. 7 is a flow diagram of one embodiment of a repetition detection process 700. The steps of process 700 do not have to be completed in any particular order, and at least some steps can be performed at the same time in a multithreading or parallel processing environment.
Process 700 begins by creating a database of audio statistics from a set of content such as television feeds, video uploads, etc. (702). For example, the database can contain 32-bit-per-frame descriptors, as described in Ke et al. Queries are taken from the database and run against the database to find where repetitions occur (704). In certain embodiments, a short segment of audio statistics is taken as a query and checked for non-identity matches (matches that are not the query itself) using hashing techniques (e.g., direct hashing or locality-sensitive hashing (LSH)) to obtain a short list of possible auditory matches. These candidate matches are then processed in a validation procedure, for example as described in Ke et al. Content corresponding to a validated candidate match can be identified as repeated content (706).
The strongest non-identity matches are "grown" forward and backward in time to find the beginning and end points of the repeated material (708). In certain embodiments, this can be done using known dynamic programming techniques (e.g., Viterbi decoding). When extending the match forward in time, the last time slice of the strong "seed" match is set to "matching," and the last time slice of the first below-believable-strength match at the same database offset between the query and the match is set to "not matching." In certain embodiments, the match scores of the individual frames between these two fixed points are used as observations, and a first-order Markov model is used that allows within-state transitions plus a single transition from the "matching" state to the "not matching" state. The transition probability from matching to not matching can be set, somewhat arbitrarily, to 1/L, where L is the number of frames between the two fixed points; this corresponds to the least knowledge of the transition location within the allowed range. Another possibility for selecting the transition probabilities would be to use the match strength profile to bias this estimate toward an earlier or later transition. However, this would increase the complexity of the dynamic programming model and is unlikely to improve the results, since the match strengths are already used as observations within this period. The same process is used to grow the segment matches backward in time (e.g., simply swap past and future and run the same algorithm).
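For the two-state model with a single permitted transition, every state path is determined by where the transition occurs, so the most likely path can be found by scoring each possible transition point. The Python sketch below illustrates this under assumptions: per-frame log-likelihoods under the "matching" and "not matching" states are supplied by the caller (how match scores map to likelihoods is not specified in the text), and the function name is hypothetical.

```python
import math


def grow_forward(ll_match, ll_nomatch):
    """Return how many in-between frames are labeled "matching".

    ll_match[i], ll_nomatch[i]: log-likelihood of frame i's match score
    under each state (assumed inputs), for the L frames between the seed
    ("matching") and the first clearly bad frame ("not matching").
    """
    L = len(ll_match)
    if L < 2 or len(ll_nomatch) != L:
        raise ValueError("need two equal-length lists with at least 2 frames")
    p = 1.0 / L                               # matching -> not matching
    log_stay, log_switch = math.log(1.0 - p), math.log(p)
    head = 0.0                                # frames so far, as matching
    tail = sum(ll_nomatch)                    # remaining frames, not matching
    best_t, best_score = 0, log_switch + tail
    for t in range(1, L + 1):
        head += ll_match[t - 1]
        tail -= ll_nomatch[t - 1]
        score = t * log_stay + log_switch + head + tail
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

Growing backward in time can reuse the same function by passing both lists in reversed order.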
In certain embodiments, the audio cues are combined with non-auditory information (e.g., visual cues) to obtain higher matching accuracy. For example, the matches found by audio matching can subsequently be verified (or checked a second time) using simple visual similarity metrics (710). These metrics can include, but are not limited to: color histograms (e.g., the frequencies of similar colors in two images), statistics on the number and distribution of edges, and so on. These metrics need not be computed only over the entire image; they can also be computed over sub-regions of the image and compared with the corresponding sub-regions of the target image.
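As one example of a simple visual similarity metric, the Python sketch below compares coarse color histograms using histogram intersection. The choice of histogram intersection, the bin count, and the pixel format (a list of `(r, g, b)` tuples with 8-bit channels) are assumptions for this illustration; the text does not prescribe a particular metric.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized coarse RGB histogram of a list of (r, g, b) pixels.

    bins_per_channel is assumed to divide 256 evenly.
    """
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + (g // step)) \
            * bins_per_channel + (b // step)
        hist[index] += 1.0
    count = len(pixels)
    return [h / count for h in hist] if count else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

The same two functions can be applied to a sub-region by passing only the pixels of that sub-region for both images.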
For applications that are looking for advertisements (as opposed to all types of repeated material), the results of repeated-material detection can be combined with metrics aimed at distinguishing advertisements from non-advertisements (712). These distinguishing characteristics can rely on advertising conventions: on duration (e.g., 10/15/30-second spots are common); on volume (e.g., advertisements tend to be louder than the surrounding program material, so repeated material that is louder than the material on either side is more likely to be an advertisement); on visual activity (e.g., advertisements tend to have more rapid transitions between shots and more motion within shots, so repeated material with larger frame-to-frame differences than the program material on either side is more likely to be an advertisement); and on bracketing blank frames (locally inserted advertisements typically do not completely fill the slot left for them by the national feed, resulting in blank frames and silence at spacings that are multiples of 30 seconds).
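The four conventions listed above could be combined as a simple cue count, as in the following Python sketch. The equal weighting of the cues, the one-second duration tolerance, the function name and the input format are all assumptions for this illustration and are not taken from the disclosure.

```python
def ad_cue_count(duration_s, loudness_db, neighbor_loudness_db,
                 frame_diff, neighbor_frame_diff, bracketed_by_blank_frames,
                 tolerance_s=1.0):
    """Count how many advertising conventions a repeated segment satisfies.

    neighbor_* arguments are (before, after) measurements of the program
    material on either side of the repeated segment (assumed inputs).
    """
    cues = 0
    # Duration: 10/15/30-second spots are common.
    if any(abs(duration_s - d) <= tolerance_s for d in (10, 15, 30)):
        cues += 1
    # Volume: louder than the material on both sides.
    if loudness_db > max(neighbor_loudness_db):
        cues += 1
    # Visual activity: larger frame differences than on both sides.
    if frame_diff > max(neighbor_frame_diff):
        cues += 1
    # Bracketing blank frames around the segment.
    if bracketed_by_blank_frames:
        cues += 1
    return cues
```

A repeated segment that satisfies more cues would be treated as more likely to be an advertisement; any decision threshold would have to be chosen empirically.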
Once advertisements have been identified, the material surrounding them can be analyzed and statistics can be generated. For example, statistics can be generated about how many times a particular product is advertised using a particular creative (e.g., images, text), or how many times a particular segment is aired, and so on. In certain embodiments, one or more old advertisements are removed or replaced with new advertisements. Additional techniques for advertisement detection and replacement are described in Covell, M., Baluja, S., and Fink, M., "Advertisement Detection and Replacement Using Acoustic and Visual Repetition," IEEE Signal Processing Society, MMSP 2006 International Workshop on Multimedia Signal Processing, October 3-6, 2006, Canada, which is incorporated herein by reference in its entirety.
In certain embodiments, information from content owners about the detailed structure of the content (e.g., where advertising material was inserted, where programs were repeated) can be used to augment process 700 and increase matching accuracy. In certain embodiments, video statistics can be used instead of audio statistics to determine repetition. In other embodiments, a combination of video and audio statistics can be used.
Audio snippet auctions
In certain embodiments, advertisers can participate in auctions related to ambient audio that is relevant to the products and services the advertiser wants to sell. For example, multiple advertisers can bid in an auction for the right to associate their products or services with an audio snippet or descriptor related to "Seinfeld." The winner of the auction can then put related information (e.g., sponsored links) in front of viewers whenever the subject ambient audio is present. In certain embodiments, advertisers can bid on ambient audio snippets that have a meta-level description. For example, advertisers can bid on audio associated with a television advertisement (e.g., the audio associated with a Ford Explorer television advertisement), on closed captioning (e.g., the captions say "Yankees baseball"), on program segment location (e.g., the audio occurs 15 minutes into "Seinfeld," 3 minutes after the previous commercial break and 1 minute before the next commercial break), or on low-level acoustic or visual properties (e.g., "background music," "conversational voices," "explosion-like sounds," etc.).
In certain embodiments, one or more popular personalized application can be carried out other task the user, such as moving under the background of browsing another website (as sponsored link).The material relevant with media broadcast (as television content) can join in the sponsored link auction identical with another content resource (as web site contents) related materials.For example, the advertisement that TV is relevant can mix with the advertisement corresponding to the current web page content.
Various modifications may be made to the disclosed implementations and still be within the scope of the appended claims.

Claims (60)

1. A method comprising:
receiving descriptors identifying ambient audio associated with a media broadcast;
comparing the descriptors to reference descriptors associated with the media broadcast; and
aggregating personalized information related to the media broadcast based on a result of the comparison.
2. The method of claim 1, wherein comparing the descriptors to the reference descriptors further comprises:
querying a database of reference descriptors using the received descriptors; and
determining, based on a matching criterion, one or more reference descriptors that match the received descriptors.
3. The method of claim 2, wherein determining the one or more reference descriptors further comprises:
determining a set of candidate reference descriptors based on the matching criterion; and
validating the set of candidate reference descriptors using a validation procedure.
4. The method of claim 3, wherein determining the set of candidate reference descriptors further comprises:
scoring the reference descriptors based on their temporal consistency with the received descriptors; and
determining the set of candidate reference descriptors from the scores.
5. The method of claim 1, wherein the reference descriptors are received from a database of reference descriptors.
6. The method of claim 1, wherein aggregating personalized information further comprises:
providing a communication link to a commenting medium.
7. A method comprising:
receiving first descriptors identifying ambient audio associated with a first media broadcast;
receiving second descriptors identifying ambient audio associated with a second media broadcast;
comparing the first descriptors to the second descriptors to determine whether the first media broadcast and the second media broadcast are the same; and
aggregating personalized information based on a result of the comparison.
8. A method comprising:
detecting ambient audio associated with a media broadcast;
generating descriptors identifying the media broadcast;
transmitting the descriptors to a network resource; and
receiving, from the network resource, personalized information aggregated based on the descriptors.
9. The method of claim 8, wherein generating descriptors further comprises:
recording snippets of the ambient audio;
decomposing the snippets of ambient audio into overlapping frames; and
converting the frames into descriptors identifying statistical summaries of the snippets of ambient audio.
10. The method of claim 8, further comprising: training the descriptors to overcome noise.
11. A system comprising:
a database of reference descriptors; and
a database server operatively coupled to the database and to a client, the database server configured to receive, from the client, descriptors identifying ambient audio associated with a media broadcast, to compare the received descriptors with one or more reference descriptors, and to aggregate personalized information related to the media broadcast based on a result of the comparison.
12. The system of claim 11, wherein the received descriptors are generated from samples of the ambient audio.
13. The system of claim 11, wherein the received descriptors are a compressed file including one or more audio signature components.
14. The system of claim 11, wherein the database server receives a client identifier from the client for identifying the client.
15. The system of claim 11, wherein the personalized information includes information associated with a commenting medium.
16. The system of claim 11, wherein the personalized information includes information associated with advertisements.
17. The system of claim 11, wherein the reference descriptors are periodically updated from recent media broadcasts.
18. The system of claim 11, wherein the personalized information is provided to the client in a web page.
19. The system of claim 11, wherein the client includes a display device for allowing a user to interact with a mass personalization application.
20. The system of claim 11, wherein the client monitors and records ambient audio during selectable time periods.
21. The system of claim 11, wherein the received descriptors are encoded such that the ambient audio signal cannot be recovered.
22. The system of claim 11, wherein the received descriptors are encrypted.
23. The system of claim 11, wherein the received descriptors are sent to the database server as a query submission in response to a trigger event at the client.
24. The system of claim 11, wherein the received descriptors are sent to the audio database server as part of a streaming process.
25. The system of claim 11, wherein the reference descriptors are generated by one or more clients.
26. The system of claim 11, wherein the personalized information includes information for establishing a communication link between users in the same geographic location.
27. The system of claim 11, wherein the match is determined using locality-sensitive hashing.
28. The system of claim 11, wherein the received descriptors represent identifying statistical summaries of audio samples.
29. The system of claim 11, wherein the match is determined using reference descriptors that are temporally consistent with the descriptors generated by the client.
30. A method comprising:
receiving descriptors from a plurality of client systems, the descriptors identifying ambient audio associated with a real-time media broadcast;
comparing the descriptors with reference descriptors to determine positive matches, wherein the positive matches are determined based at least in part on temporal consistency between the received descriptors and the reference descriptors;
creating a social group based on the positive matches; and
sending information about the social group to the client systems.
31. The method of claim 30, wherein the sent information includes a communication link for providing a communication channel between the client systems and the social group.
32. The method of claim 30, wherein the descriptors are generated using wavelets.
33. A method comprising:
receiving descriptors identifying ambient audio associated with a media broadcast;
comparing the descriptors to one or more reference descriptors; and
determining a rating for the media broadcast based at least in part on a result of the comparison.
34. The method of claim 33, wherein determining the rating further comprises:
determining a count of matches obtained by comparing the received descriptors with the reference descriptors; and
determining the rating based at least in part on the count.
35. The method of claim 33, further comprising: providing information to a device based on the rating.
36. The method of claim 35, further comprising:
determining a change in the rating;
modifying the information in response to the change in the rating; and
providing the modified information to the device.
37. The method of claim 33, further comprising: providing the rating to a device.
38. The method of claim 33, wherein determining the rating further comprises:
receiving information from devices associated with the media broadcast;
using the information to determine statistics; and
determining the rating from the statistics.
39. The method of claim 33, wherein determining the rating further comprises: determining an average number of users receiving the media broadcast.
40. The method of claim 33, wherein determining the rating further comprises: determining an average time that users receive the media broadcast.
41. The method of claim 33, wherein determining the rating further comprises: determining a maximum number of users receiving the media broadcast.
42. The method of claim 33, wherein determining the rating further comprises: determining a minimum number of users receiving the media broadcast.
43. The method of claim 33, further comprising:
associating the rating with demographic group data; and
providing the rating to devices associated with the demographic group.
44. The method of claim 33, further comprising:
associating the rating with geographic group data; and
providing the rating to devices associated with the geographic group.
45. A method comprising:
generating descriptors identifying ambient audio associated with a media broadcast;
providing the descriptors to a ratings provider for determining a rating for the media broadcast based on the descriptors;
receiving the rating from the ratings provider; and
displaying the rating on a display.
46. The method of claim 45, further comprising:
receiving information from the ratings provider, wherein the information is associated with the media broadcast; and
displaying the information on a display.
47. The method of claim 45, wherein determining the rating further comprises:
collecting information associated with the media broadcast;
providing the information to a rating system;
receiving a rating from the rating system, wherein the rating is based at least in part on the collected information; and
displaying the rating on a display.
48. The method of claim 45, wherein determining the rating further comprises: determining an average number of users receiving the media broadcast.
49. The method of claim 45, wherein determining the rating further comprises: determining an average time that users receive the media broadcast.
50. The method of claim 45, wherein determining the rating further comprises: determining a maximum number of users receiving the media broadcast.
51. The method of claim 45, wherein determining the rating further comprises: determining a minimum number of users receiving the media broadcast.
52. The method of claim 45, further comprising: receiving information associated with a demographic group from the ratings provider.
53. The method of claim 45, further comprising: receiving information associated with geographic group data from the ratings provider.
54. A method comprising:
recording snippets of ambient audio from a media broadcast;
generating descriptors from the snippets of ambient audio; and
providing the descriptors to a ratings provider.
55. The method of claim 54, further comprising:
associating a product or service with the descriptors; and
providing information related to the product or service to one or more users.
56. The method of claim 54, further comprising:
receiving user input confirming receipt of the media broadcast; and
determining the rating based on the user input.
57. The method of claim 54, further comprising:
capturing a digital image of a user receiving the media broadcast; and
determining the rating based at least in part on the digital image.
58. A system comprising:
a database of reference descriptors; and
a database server operatively coupled to the database and to a client, the database server configured to receive, from the client, descriptors identifying ambient audio associated with a media broadcast, to compare the received descriptors with one or more reference descriptors, and to determine a rating for the media broadcast based at least in part on a result of the comparison.
59. A method comprising:
receiving descriptors from a plurality of client systems, the descriptors identifying ambient audio associated with a real-time media broadcast;
comparing the descriptors with reference descriptors to determine positive matches, wherein the positive matches are determined based at least in part on temporal consistency between the received descriptors and the reference descriptors;
generating a rating based on the positive matches; and
sending the rating to the client systems.
60. The method of claim 59, further comprising:
using the descriptors to detect advertisements in the media broadcast;
generating statistics for the advertisements; and
sending the statistics to an advertiser associated with the advertisements.
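The rating method recited in the claims (positive matches decided by temporal consistency, then counted per broadcast) could be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the claimed implementation: descriptors are modeled as hashable integers, the match criterion is exact equality, temporal consistency is measured by voting on the offset between query and reference positions, and the names `best_alignment`, `rate_broadcasts`, `summarize`, and the `min_votes` threshold are hypothetical.

```python
from collections import Counter
from statistics import mean


def best_alignment(query, reference):
    """Vote on the offset between matching query and reference positions.

    Returns the vote count of the most common offset (0 if nothing matches).
    """
    positions = {}
    for ref_idx, descriptor in enumerate(reference):
        positions.setdefault(descriptor, []).append(ref_idx)
    votes = Counter()
    for query_idx, descriptor in enumerate(query):
        for ref_idx in positions.get(descriptor, ()):
            votes[ref_idx - query_idx] += 1
    return max(votes.values(), default=0)


def rate_broadcasts(client_queries, references, min_votes=3):
    """Count the clients with a positive match per broadcast (the rating)."""
    if not references:
        return {}
    ratings = {broadcast: 0 for broadcast in references}
    for query in client_queries.values():
        scores = {b: best_alignment(query, ref) for b, ref in references.items()}
        broadcast = max(scores, key=scores.get)
        if scores[broadcast] >= min_votes:  # assumed positive-match threshold
            ratings[broadcast] += 1
    return ratings


def summarize(counts):
    """Average, maximum, and minimum viewer counts over successive intervals."""
    return {"average": mean(counts), "maximum": max(counts), "minimum": min(counts)}


references = {"show_a": [1, 2, 3, 4, 5, 6], "show_b": [10, 11, 12, 13, 14, 15]}
client_queries = {
    "c1": [3, 4, 5],         # temporally consistent with show_a
    "c2": [11, 12, 13, 14],  # temporally consistent with show_b
    "c3": [5, 3, 4],         # right values, wrong order: rejected
    "c4": [2, 3, 4, 5],      # temporally consistent with show_a
}
ratings = rate_broadcasts(client_queries, references)
print(ratings, summarize([2, 4, 3]))
```

Offset voting rejects a client whose descriptors match individually but in the wrong order, which is the role temporal consistency plays in the claims. A practical system would more likely use approximate matching (for example the locality-sensitive hashing mentioned in claim 27) in place of exact equality.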
CN200680044650.XA 2005-11-29 2006-11-27 Social and interactive applications for mass media Active CN101517550B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US74076005P 2005-11-29 2005-11-29
US60/740,760 2005-11-29
US82388106P 2006-08-29 2006-08-29
US60/823,881 2006-08-29
PCT/US2006/045551 WO2007064641A2 (en) 2005-11-29 2006-11-27 Social and interactive applications for mass media

Publications (2)

Publication Number Publication Date
CN101517550A true CN101517550A (en) 2009-08-26
CN101517550B CN101517550B (en) 2013-01-02

Family

ID=40332842

Family Applications (2)

Application Number Title Priority Date Filing Date
CN200680044650.XA Active CN101517550B (en) 2005-11-29 2006-11-27 Social and interactive applications for mass media
CNA2006800515590A Pending CN101361301A (en) 2005-11-29 2006-11-27 Detecting repeating content in broadcast media

Family Applications After (1)

Application Number Title Priority Date Filing Date
CNA2006800515590A Pending CN101361301A (en) 2005-11-29 2006-11-27 Detecting repeating content in broadcast media

Country Status (1)

Country Link
CN (2) CN101517550B (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101720048B (en) * 2009-12-04 2011-06-01 山东大学 Audience rating information searching method for audience rating survey system based on audio frequency characteristics
CN102523497A (en) * 2011-12-29 2012-06-27 北京衡准科技有限公司 Method for predicting television program hot spot information based on monitoring of mass television program forenotices
CN102577421A (en) * 2009-09-29 2012-07-11 通用仪表公司 Digital rights management protection for content identified using a social TV service
WO2012126406A2 (en) * 2012-04-24 2012-09-27 华为技术有限公司 Method and system for researching viewership
CN102783057A (en) * 2010-02-24 2012-11-14 阿尔卡特朗讯公司 Method and server for detecting a video program received by a user
CN102918591A (en) * 2010-04-14 2013-02-06 谷歌公司 Geotagged environmental audio for enhanced speech recognition accuracy
CN103123787A (en) * 2011-11-21 2013-05-29 金峰 Method for synchronizing and exchanging mobile terminal with media
CN103370920A (en) * 2011-03-04 2013-10-23 高通股份有限公司 Method and apparatus for grouping client devices based on context similarity
CN103688253A (en) * 2011-11-25 2014-03-26 株式会社攀登 Review method, computer-program product, and review system
CN104349182A (en) * 2014-04-10 2015-02-11 江苏优因特智能科技有限公司 Intelligent terminal media playing content feedback method realized through sound channel
CN104349183A (en) * 2014-04-10 2015-02-11 江苏优因特智能科技有限公司 Media television reception effect feedback collecting method realized through sound channel
CN104813673A (en) * 2012-09-21 2015-07-29 谷歌公司 Sharing content-synchronized ratings
US9237225B2 (en) 2013-03-12 2016-01-12 Google Technology Holdings LLC Apparatus with dynamic audio signal pre-conditioning and methods therefor
CN106847258A (en) * 2013-02-20 2017-06-13 谷歌公司 Method and apparatus for sharing adjustment speech profiles
CN107111789A (en) * 2014-10-24 2017-08-29 索尼公司 Context sensitive media categories
CN111123290A (en) * 2014-08-18 2020-05-08 谷歌有限责任公司 Matching conversions from an application to selected content items
US10896685B2 (en) 2013-03-12 2021-01-19 Google Technology Holdings LLC Method and apparatus for estimating variability of background noise for noise suppression
CN112699787A (en) * 2020-12-30 2021-04-23 湖南快乐阳光互动娱乐传媒有限公司 Method and device for detecting advertisement insertion time point
US11735175B2 (en) 2013-03-12 2023-08-22 Google Llc Apparatus and method for power efficient signal conditioning for a voice recognition system

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8359205B2 (en) 2008-10-24 2013-01-22 The Nielsen Company (Us), Llc Methods and apparatus to perform audio watermarking and watermark detection and extraction
US9667365B2 (en) 2008-10-24 2017-05-30 The Nielsen Company (Us), Llc Methods and apparatus to perform audio watermarking and watermark detection and extraction
CA2760677C (en) * 2009-05-01 2018-07-24 David Henry Harkness Methods, apparatus and articles of manufacture to provide secondary content in association with primary broadcast media content
US9344759B2 (en) * 2013-03-05 2016-05-17 Google Inc. Associating audio tracks of an album with video content
CN115883873A (en) * 2021-09-28 2023-03-31 山东云缦智能科技有限公司 Video comparison method based on video genes

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102577421A (en) * 2009-09-29 2012-07-11 通用仪表公司 Digital rights management protection for content identified using a social TV service
CN102577421B (en) * 2009-09-29 2016-01-20 摩托罗拉移动有限责任公司 Digital rights management protection for content identified using a social TV service
CN101720048B (en) * 2009-12-04 2011-06-01 山东大学 Audience rating information searching method for audience rating survey system based on audio frequency characteristics
CN102783057A (en) * 2010-02-24 2012-11-14 阿尔卡特朗讯公司 Method and server for detecting a video program received by a user
US8682659B2 (en) 2010-04-14 2014-03-25 Google Inc. Geotagged environmental audio for enhanced speech recognition accuracy
CN102918591B (en) * 2010-04-14 2016-06-08 谷歌公司 Geotagged environmental audio for enhanced speech recognition accuracy
CN102918591A (en) * 2010-04-14 2013-02-06 谷歌公司 Geotagged environmental audio for enhanced speech recognition accuracy
CN103370920A (en) * 2011-03-04 2013-10-23 高通股份有限公司 Method and apparatus for grouping client devices based on context similarity
CN103123787A (en) * 2011-11-21 2013-05-29 金峰 Method for synchronizing and exchanging mobile terminal with media
CN103123787B (en) * 2011-11-21 2015-11-18 金峰 A kind of mobile terminal and media sync and mutual method
CN103688253A (en) * 2011-11-25 2014-03-26 株式会社攀登 Review method, computer-program product, and review system
CN102523497A (en) * 2011-12-29 2012-06-27 北京衡准科技有限公司 Method for predicting television program hot spot information based on monitoring of mass television program forenotices
WO2012126406A3 (en) * 2012-04-24 2013-03-28 华为技术有限公司 Method and system for researching viewership
CN102763427A (en) * 2012-04-24 2012-10-31 华为技术有限公司 Method and system for researching viewership
WO2012126406A2 (en) * 2012-04-24 2012-09-27 华为技术有限公司 Method and system for researching viewership
CN104813673A (en) * 2012-09-21 2015-07-29 谷歌公司 Sharing content-synchronized ratings
CN106847258A (en) * 2013-02-20 2017-06-13 谷歌公司 Method and apparatus for sharing adjustment speech profiles
US11557308B2 (en) 2013-03-12 2023-01-17 Google Llc Method and apparatus for estimating variability of background noise for noise suppression
US10896685B2 (en) 2013-03-12 2021-01-19 Google Technology Holdings LLC Method and apparatus for estimating variability of background noise for noise suppression
US9237225B2 (en) 2013-03-12 2016-01-12 Google Technology Holdings LLC Apparatus with dynamic audio signal pre-conditioning and methods therefor
US11735175B2 (en) 2013-03-12 2023-08-22 Google Llc Apparatus and method for power efficient signal conditioning for a voice recognition system
CN104349182A (en) * 2014-04-10 2015-02-11 江苏优因特智能科技有限公司 Intelligent terminal media playing content feedback method realized through sound channel
CN104349183A (en) * 2014-04-10 2015-02-11 江苏优因特智能科技有限公司 Media television reception effect feedback collecting method realized through sound channel
CN111123290A (en) * 2014-08-18 2020-05-08 谷歌有限责任公司 Matching conversions from an application to selected content items
CN111123290B (en) * 2014-08-18 2024-03-19 谷歌有限责任公司 Matching conversions from an application to selected content items
CN107111789A (en) * 2014-10-24 2017-08-29 索尼公司 Context sensitive media categories
CN107111789B (en) * 2014-10-24 2020-12-25 索尼公司 Method for determining identification data by user equipment and user equipment
CN112699787A (en) * 2020-12-30 2021-04-23 湖南快乐阳光互动娱乐传媒有限公司 Method and device for detecting advertisement insertion time point
CN112699787B (en) * 2020-12-30 2024-02-20 湖南快乐阳光互动娱乐传媒有限公司 Advertisement insertion time point detection method and device

Also Published As

Publication number Publication date
CN101361301A (en) 2009-02-04
CN101517550B (en) 2013-01-02

Similar Documents

Publication Publication Date Title
CN101517550B (en) Social and interactive applications for mass media
CA2631151C (en) Social and interactive applications for mass media
CN1607832B (en) Method and system for inferring information about media stream objects
US11223433B1 (en) Identification of concurrently broadcast time-based media
CN102084358A (en) Associating information with media content
CN103797482A (en) Methods and systems for performing comparisons of received data and providing follow-on service based on the comparisons
CN105230035A (en) For the process of the social media of time shift content of multimedia selected
Fink et al. Social-and interactive-television applications based on real-time ambient-audio identification
JP2012039550A (en) Information processing device, information processing system, information processing method and program
Fink et al. Mass personalization: social and interactive applications using sound-track identification
KR102297362B1 (en) Apparatus and method for providing advertisement based on user characteristic using content playing apparatus
KR20100111907A (en) Apparatus and method for providing advertisement using user's participating information
JP2006165658A (en) Program metadata creating/management method, program metadata creating/management system, program meta-data creating apparatus, program metadata evaluation apparatus, computer program and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: California, USA

Patentee after: Google LLC

Address before: California, USA

Patentee before: Google Inc.
