This application is a divisional application of invention patent application No. 200810043654.7, entitled "Multimedia Player", filed on July 21, 2008 by the applicant Shanghai Tiantong Electronic Technology Co., Ltd.
Background Art
With the development of network technology and microelectronics, more and more multimedia terminals can both play multimedia content and access the Internet. These devices include PC-based digital TV terminals, Internet-capable mobile phones, palmtop computers (PDAs), digital TV set-top boxes, integrated digital TV sets, and so on.
While watching audio-visual content on a multimedia terminal, a user may wish to search for related content locally and on the Internet. The usual practice is to open a browser on the terminal, go to a search service page, summarize keywords from the multimedia information currently being watched, enter the keywords, confirm to obtain the search results, and browse the results to find the needed information. Because the range of results provided by a search engine is very broad, the user often has to restrict the type of search further, such as video, pictures, news, blogs, or BBS, in the search keywords or in the search engine settings, and often obtains results of interest only after several searches.
In another situation, each user has personal preferences and habits when watching audio-visual content, for example liking the programs of a certain channel, liking football matches, or liking concerts. Such a user is likely to be very interested in information related to that channel, to football matches, or to concerts. This related information also has to be obtained through manual searches by the user.
Most of the interface of a multimedia terminal is taken up by the audio/video playback interface. If the results returned by a search engine are shown on the same screen as the playback interface, as is the case for a computer-based digital TV terminal, the display space left for the search results is very limited. The terminal therefore cannot present all the results as a search in an ordinary browser does, and the results returned by the search engine need to be filtered further so that only a small number of URL links and summaries that best match the user's needs are provided.
In these situations, a user of a multimedia terminal faces the following problems:
Search terms are not generated automatically from the multimedia content the user is currently watching. Every search term submitted to the search engine has to be summarized by the user; the system neither generates search terms automatically nor offers any prompt.
All search terms have to be entered manually using the input methods available on the terminal. Entering text and browsing results on mobile phones, palmtop computers, and set-top boxes differs greatly from doing so on a PC, and cannot match the comfort of searching in a browser on a PC.
The system does not automatically provide "recommended" information based on the user's usage habits and preferences. Such "recommended" information can come from several sources, including information stored locally on the multimedia terminal, information contained in the input source (such as the EPG information of digital TV), and information on the Internet.
Search results are not filtered further. Display space on some multimedia terminals is very limited. If a search returns as much content as an ordinary PC browser would, including thousands of links and summaries, that content will appear very crowded on the multimedia terminal and will hinder the user in finding what is needed.
Detailed Description of the Embodiments
The invention discloses a multimedia player comprising a multimedia playing module, a keyword selection module, and a data search module. The keyword selection module comprises at least one of a first keyword selection module and a second keyword selection module. The first keyword selection module extracts keywords from the data being played by the multimedia playing module. The second keyword selection module records the content played by the multimedia playing module to create a history record, and extracts keywords from that history record. According to the extracted keywords, the data search module searches for related information in one or more of three places (the network, the local terminal, and the content being played) and displays the related information.
The multimedia playing module, the keyword selection module, and the data search module may be integrated in the same hardware device. This gives the device a higher level of integration and a faster processing speed.
Alternatively, the multimedia playing module, the keyword selection module, and the data search module are placed on different devices that are connected through a data network. A digital TV set-top box, for example, has a comparatively simple internal structure. If it had to perform keyword extraction itself, the set-top box would be slow, and since it generally has no storage, it could not store the history record. However, if the set-top box transmits information about the data being played over the network to a device dedicated to extracting keywords, such as the intelligent search agent server shown in Fig. 1, the speed at which keywords are found can be greatly improved. The intelligent search agent server shown in Fig. 1 is also provided with storage, so the set-top box can likewise send information about the data being played to the server over the network and have the server store it and extract the keywords.
The keyword selection module displays the extracted keywords, and the user selects among the listed keywords. According to the selection, the data search module searches for related information in one or more of the network, the local terminal, and the content being played, and displays the related information. The keywords extracted by the keyword selection module may not fully meet the user's requirements, so listing the keywords and letting the user choose makes the search more targeted and more efficient.
According to how frequently speech items occur in the audio of the multimedia being played, the first keyword selection module recognizes that speech information as text and uses it as keywords.
According to how frequently speech items occur in the audio of the multimedia being played, the second keyword selection module recognizes that speech information as text and stores it in the history record.
According to how frequently text occurs in the video of the multimedia being played, the first keyword selection module uses that text as keywords.
According to how frequently text occurs in the video of the multimedia being played, the second keyword selection module stores that text in the history record.
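The frequency-based selection described above can be sketched as follows. This is a minimal illustration, assuming the speech or on-screen text has already been recognized into a plain string; the function name `top_keywords`, the stop-word list, and the thresholds are hypothetical and do not come from the application.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on"}

def top_keywords(recognized_text, limit=3, min_count=2):
    """Return the most frequent non-stop-words in already-recognized text."""
    words = re.findall(r"[a-z0-9']+", recognized_text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # Keep only words that occur at least min_count times.
    return [w for w, c in counts.most_common(limit) if c >= min_count]
```

The same function could serve either module: the first module would pass the result to the search, while the second would append it to the history record.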
The user may preset keywords. The first keyword selection module then selects keywords from the data being played according to the correlation between that data and the preset keywords. This also improves the accuracy with which keywords are found.
When there are multiple keywords, they are combined with the logical relations "AND", "OR", and "NOT". According to the combined result, the data search module searches for related information in one or more of the network, the local terminal, and the content being played, and displays the related information.
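One way to picture the "AND" / "OR" / "NOT" combination is the sketch below. It assumes a common search-engine convention (quoted terms, `OR`, and a leading minus sign for exclusion), which the application does not specify; `build_query` and `matches` are hypothetical names, the latter showing how the same combination might be evaluated against locally stored text.

```python
def build_query(all_of=(), any_of=(), none_of=()):
    """Combine keywords into one query string (assumed engine syntax)."""
    parts = [f'"{k}"' for k in all_of]                      # AND terms
    if any_of:
        parts.append("(" + " OR ".join(f'"{k}"' for k in any_of) + ")")
    parts.extend(f'-"{k}"' for k in none_of)                # NOT terms
    return " ".join(parts)

def matches(text, all_of=(), any_of=(), none_of=()):
    """Evaluate the same combination against a piece of local text."""
    lowered = text.lower()
    has = lambda k: k.lower() in lowered
    return (all(has(k) for k in all_of)
            and (not any_of or any(has(k) for k in any_of))
            and not any(has(k) for k in none_of))
```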
After extracting a keyword from the played content or from the history record, the keyword selection module also uses words related to the extracted keyword as keywords. The second keyword selection module also stores words related to the played content in the history record. This broadens the range of the keyword search.
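The expansion with related words might look like the sketch below. The application does not say where related words come from, so the lookup table `RELATED` and the function `expand_keywords` are purely illustrative assumptions; a thesaurus or a co-occurrence model could fill the same role.

```python
# Hypothetical related-word table; its contents are made up for illustration.
RELATED = {
    "football": ["soccer", "world cup"],
    "concert": ["live music"],
}

def expand_keywords(keywords, related=RELATED):
    """Return the keywords followed by their related words, without duplicates."""
    expanded = []
    for keyword in keywords:
        for term in [keyword, *related.get(keyword, [])]:
            if term not in expanded:
                expanded.append(term)
    return expanded
```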
The data search module returns search results when triggered by the user, when triggered at scheduled times, or when triggered at irregular times.
The hardware and network equipment that implements the multimedia player of the invention is shown in Fig. 1. The digital TV terminal plays the multimedia data; the intelligent search agent server extracts the keywords and connects over the network to a search engine server to obtain the related search information.
The user interface in an embodiment of the multimedia player of the invention is shown in Fig. 2. The interface contains several "windows". A "window" is a display area on the terminal interface that can be opened and closed by shortcut keys, key combinations, mouse clicks or double-clicks, menu options, remote-control keys, and similar means. A "window" can be split into several independent smaller "windows", each used to display and provide the content and functions that it contains. "Windows" can be overlaid on one another by means such as alpha blending. The functions of a "window" can likewise be triggered in several ways, such as shortcut keys, key combinations, mouse clicks or double-clicks, and menu options; the keys here include those of a PC keyboard, keys on the multimedia terminal, and remote-control keys. The content of a "window" can also be displayed in several ways, such as dragging a vertical scroll bar, dragging a horizontal scroll bar, paging up and down, automatic scrolling, timed automatic page turning, and mouse dragging.
The "window" 101 in Fig. 2 is the display area needed for playing multimedia. If the multimedia content includes video, it is the video display area. If the multimedia content has audio only, the "window" 101 can show what existing multimedia terminals display, such as lyrics, automatically synthesized animation, advertisements, or news.
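For readers unfamiliar with alpha blending, the overlay of one "window" on another can be illustrated per pixel as below. This is the standard blending formula, shown only as a sketch; the function name `alpha_blend` and the use of RGB tuples are assumptions, not details from the application.

```python
def alpha_blend(top, bottom, alpha):
    """Blend an RGB pixel of the top window over the bottom one.

    alpha = 1.0 shows only the top window, alpha = 0.0 only the bottom.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return tuple(round(alpha * t + (1.0 - alpha) * b) for t, b in zip(top, bottom))
```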
The "window" 102 provides the interface through which the user generates search query terms. In the technical solution that uses the first keyword selection module, the invention can generate search query terms in the following modes:
Keyword prompt mode: the terminal automatically extracts "keywords" from the multimedia content currently being played and displays them in the "window" 102. The user manually selects among these "keywords" and chooses the relations between them, such as "AND" and "OR", and the resulting search query term is submitted to the search engine.
Automatic generation mode: the terminal automatically extracts "keywords" from the multimedia content currently being played, while the "window" 102 offers a series of rule options. The system automatically generates the search query term from the default settings, the rules preset by the user, and the current keywords.
Keyword prompt combined with automatic generation: the user manually selects "keywords" as one component of the search term, and the rest of the search term is generated automatically from the rules preset by the user and the current keywords.
The rules in the "window" 102 include "keyword" generation rules and rules for generating search options. Examples of "keyword" generation rules are as follows:
Enable the speech recognition function and retrieve the most frequently occurring words in the speech as "keywords";
Enable the subtitle image recognition function and retrieve the most frequently occurring words in the subtitles as "keywords";
If the EPG (electronic program guide) text information and the text recognized in the video image contain the same vocabulary, use that vocabulary as "keywords".
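The third example rule, which keeps vocabulary that appears both in the EPG text and in the text recognized from the video image, can be sketched as follows. Both inputs are assumed to be plain strings already produced by the EPG parser and the image text recognizer; `tokenize`, `shared_vocabulary`, and the minimum word length are hypothetical.

```python
import re

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def shared_vocabulary(epg_text, ocr_text, min_length=3):
    """Return words found in both texts, in the order they appear in the EPG."""
    ocr_words = set(tokenize(ocr_text))
    shared = []
    for word in tokenize(epg_text):
        if len(word) >= min_length and word in ocr_words and word not in shared:
            shared.append(word)
    return shared
```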
In addition, users usually have personal preferences and habits regarding multimedia content, and often want the system to learn from their everyday viewing and usage habits, use search tools to collect related information online and locally, and display it to them at scheduled or irregular times. The invention provides the interface display method, the basic functions, and the operating modes for realizing this function. In the technical solution that uses the second keyword selection module, the "keywords" are derived from the history record of the user's past viewing and usage habits. For example, if analysis of the history record shows that the user watches a certain sports channel frequently, the name of that sports channel becomes a "keyword". Unlike the first keyword selection module, the second keyword selection module does not extract "keywords" from the content currently being played but derives them by analyzing the history record of a period of time. The history record has to be stored, and not every multimedia terminal has storage (a set-top box without a hard disk, for example). In that case, an intelligent search agent server on the network can perform this function: the server stores and statistically analyzes the user's history record, and the resulting "keywords" are pushed to the multimedia terminal at scheduled or irregular times and displayed in the "window" 102.
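The sports-channel example above amounts to a simple statistic over the history record. The sketch below assumes the record is a list of (channel name, minutes watched) pairs and treats a channel as a "keyword" once it accounts for a chosen share of total viewing time; the record layout, the function name `favorite_channels`, and the 40% threshold are all illustrative assumptions.

```python
from collections import Counter

def favorite_channels(history, min_share=0.4):
    """Return channel names whose share of total viewing time is high enough.

    history: iterable of (channel_name, minutes_watched) pairs.
    """
    totals = Counter()
    for channel, minutes in history:
        totals[channel] += minutes
    total_minutes = sum(totals.values())
    if total_minutes == 0:
        return []
    return [channel for channel, minutes in totals.most_common()
            if minutes / total_minutes >= min_share]
```

On a terminal without storage, the same analysis could run on the intelligent search agent server, which would then push the resulting names to the terminal.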
After the "keywords" have been obtained with either of the two technical solutions above, the search is carried out according to them. Examples of rules for automatically generating search options are as follows:
The digital TV channel name is used as the first keyword of the search option and the program title as the second keyword; the remaining keywords are combined at random;
The text recognized in the image is used as the first keyword of the search option and the EPG information as the second keyword; the remaining keywords are combined at random;
The EPG information is used as the first keyword of the search option and the most frequent word in the speech recognition result as the second keyword; the remaining keywords are combined at random.
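Each of the example rules above fixes the first two keywords and combines the rest at random. A minimal sketch of that pattern follows; the function name `build_search_terms`, the number of extra keywords, and the optional `rng` parameter (added so the random choice can be made repeatable) are assumptions for illustration.

```python
import random

def build_search_terms(first, second, remaining, extra=2, rng=None):
    """Fix the first two keywords, then add a random pick from the rest."""
    rng = rng or random.Random()
    # Drop duplicates and anything already used as first or second keyword.
    pool = [k for k in dict.fromkeys(remaining) if k not in (first, second)]
    picked = rng.sample(pool, min(extra, len(pool)))
    return [first, second, *picked]
```

For the first rule, `first` would be the channel name and `second` the program title; the other two rules differ only in which sources supply those two arguments.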
The "window" 103 displays the search results, which include a list of web page hyperlinks, web page summaries, similar pages, and page snapshots. If the user is interested in a search result, the page the link points to can be entered through the window-function trigger methods described above. This "window" provides a "back" function: when the user wishes to return to the search results, the same trigger methods can be used to go back to the search results page. The "window" 103 can also display pages that are not reached through a search, including BBS, blog, and QQ forum pages. The hyperlinks of these pages are set in the rules in the "window" 102, either entered manually or offered by the system for the user to select.
The sources of the automatically extracted "keywords" include the digital TV channel name, the program title, the EPG, subtitles, teletext, information in a content broadcast format (such as BML), speech recognition results, descriptions derived from audio feature extraction, text recognized in images, information extracted from audio and video watermarks, and so on.
The information played on a multimedia terminal contains a large amount of text content, such as channel names, EPG information, and subtitle information provided as text. Much other content exists in other media forms but can be converted into text: the title of a TV series or subtitles that exist as images can be converted into text by image recognition, and speech can be converted into text by speech recognition. The black regions in Fig. 3 show the areas of a digital TV image where fixed text tends to appear (not including the channel logo area in the upper left corner). Text that stays on screen for a long time (10 minutes or more) tends to appear in these areas, for example TV series titles, variety show titles, advertisements, and news topics. Although this text exists as part of the image, it stays unchanged throughout the program's broadcast slot and often reveals the theme of the current program. By periodically computing the temporal correlation of each pixel in these areas, the unchanging text in the image can easily be extracted and converted into text, so that the subject of the content currently being played can be understood.
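The idea of finding pixels that do not change over time can be sketched as below. The application speaks of a per-pixel temporal correlation without giving a formula, so this example substitutes a simpler stand-in, the variance of each pixel's grayscale value across periodically sampled frames; `static_text_mask`, the frame representation (nested lists of 0 to 255 values), and the threshold are assumptions.

```python
def static_text_mask(frames, max_variance=25.0):
    """Mark pixels whose brightness barely changes across sampled frames.

    frames: list of equally sized 2-D lists of grayscale values (0-255).
    Returns a 2-D list of booleans; True means the pixel is stable over time.
    """
    count = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [frame[r][c] for frame in frames]
            mean = sum(values) / count
            variance = sum((v - mean) ** 2 for v in values) / count
            mask[r][c] = variance <= max_variance
    return mask
```

In practice only the candidate regions of Fig. 3 would be scanned, and the stable pixels would then be handed to a text recognizer.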
Video is the main medium carried by digital TV and has rich content characteristics. Besides the visual and spatial characteristics that images have, it also has temporal characteristics, video object characteristics, motion characteristics, and so on. With video processing techniques, video can be segmented according to various attributes (such as scene, video object, or motion characteristics) and then classified and clustered to obtain the structural patterns of the video. Video objects can also be extracted from the video and their motion tracked; analyzing their patterns over time and their associations with other objects reveals high-level event summaries, concepts, or patterns.
Audio is an auditory medium whose main features include pitch, tone, rhythm, and melody. Audio mining usually takes one of two approaches. (1) Speech recognition is used to convert the audio into text, which turns audio mining into text mining. Because of the broadcast nature of digital TV, the speech used is fairly standardized (for example, mostly Mandarin and little dialect), so natural speech recognition algorithms achieve high accuracy and the training samples needed for recognition are easy to collect. (2) Sound features such as tone and rhythm are extracted directly from the audio, and the sound patterns are analyzed by clustering. Machine learning techniques, including rough sets, artificial neural networks, and decision trees, can be used to analyze the fundamental frequency, energy distribution, and other features of the audio in order to obtain the structure of audio events and objects and to mine the clues, rules, and patterns hidden in the audio stream. For example, by extracting and learning speech features from a massive speech database, patterns of tone and rhythm variation can be obtained, making speech synthesis more natural and intelligent.
After the "keywords" have been extracted, the query results returned by the search engine need further processing. Usually only a small part of the returned results is of real interest to the user, and whether the multimedia terminal is a PC-based application, a handheld terminal, or a set-top box application, video occupies most of the display interface and little space is left for search results. Post-processing of the query results includes rule-based post-processing, such as returning only the first 5 hyperlinks; simplifying summaries, such as showing only the sentences that contain the query words; and removing hyperlinks whose content is essentially identical. Finally, according to the rules and related settings, a few hyperlinks and summaries that come closest to the user's needs are provided.
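The three post-processing steps named above (limiting the number of hyperlinks, trimming summaries to sentences containing a query word, and dropping near-identical entries) can be sketched together. The result format (dictionaries with `url` and `summary` keys) and the function name `post_process` are assumptions, and "essentially identical content" is approximated here by comparing whitespace-normalized, lowercased summaries, which is far cruder than a real similarity measure.

```python
import re

def post_process(results, query_words, limit=5):
    """Deduplicate results, shorten summaries, and keep at most `limit` items."""
    seen = set()
    kept = []
    for item in results:
        # Crude duplicate test: identical summary after normalization.
        fingerprint = " ".join(item["summary"].lower().split())
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        # Keep only sentences that mention a query word, if there are any.
        sentences = re.split(r"(?<=[.!?])\s+", item["summary"])
        relevant = [s for s in sentences
                    if any(q.lower() in s.lower() for q in query_words)]
        kept.append({"url": item["url"],
                     "summary": " ".join(relevant) or item["summary"]})
        if len(kept) == limit:
            break
    return kept
```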
If the invention is applied to digital TV, the logical architecture of the digital TV terminal is as shown in Fig. 4. After receiving a satellite, terrestrial, or cable digital TV signal, the digital TV terminal performs channel tuning, analog-to-digital conversion, channel demodulation, and similar processing. The processed signal is demultiplexed to obtain an EPG data signal, a video data signal, a subtitle data signal, and an audio data signal. These signals are decoded or parsed respectively, then preprocessed, and metadata is synthesized according to the user rules. Further processing yields the "keywords", which are sent to the network for searching, and the returned search results are displayed to the user.
During "keyword" extraction, the intelligent search agent server also has to do part of the work. The intelligent search agent server is shown in Fig. 5. After receiving the metadata processed by the digital TV terminal, it performs digital TV multimedia data mining on the metadata, which may specifically include image recognition, speech recognition, text recognition, and recording of user operating habits. These data are stored in a database that holds the user's operation history, the user's content history, and the user rules. The data in this database are combined and statistically analyzed to generate data for automatic search queries, which are sent to the search engine. The search engine returns the query results, and the intelligent search agent server processes them using the contents of the database of user operation history, user content history, and user rules. This processing includes advertisement insertion, summary filtering, link relevance filtering, and user rule filtering. The processed query results can also be stored in a dedicated database and can be sent to the digital TV terminal for display to the user.
In summary, the invention uses the keyword selection module to extract keywords from the data being played or from the history record, so that keyword input no longer depends heavily on the keyboard, the accuracy of the obtained keywords is greatly improved, and the player is convenient to use.