CN1254136A - Method for inquiring about index multi-media header data and its device - Google Patents

Publication number
CN1254136A
Authority
CN
China
Prior art keywords
word
key word
key
index
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 98124160
Other languages
Chinese (zh)
Inventor
林光信
陈玄同
穆立源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inventec Corp
Original Assignee
Inventec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inventec Corp
Priority to CN 98124160
Publication of CN1254136A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention mainly adopts multi-level indexing. An index library is built from the multimedia data; the user supplies the keyword to be queried; the features of the keyword are obtained by an encoding operation; multi-level query and indexing are then carried out according to these features; and the multimedia file data found are output by a browser. The multimedia data can thus be reused, achieving the goal of fully utilizing existing data.

Description

Method and device for querying and indexing multimedia title data
The present invention relates to a method and a device for querying and indexing multimedia title (TITLE) data, and in particular to a method and a device that can query and index multimedia data so that the data can be reused and fully utilized.
With the popularity of multimedia computers, books, presentation files and the like are increasingly displayed on computers, and the audio and video playback functions of multimedia computers have made image and audio optical discs popular. However, the data of such a multimedia title can be used only by the title itself. Other products cannot read the data of the title, or cannot recognize its file format, and therefore cannot browse any of the data it contains. This limits the uses of the data and prevents their further utilization, which wastes resources and increases the difficulty and time of data searches.
The object of the present invention is to provide a method and a device for querying and indexing multimedia title data. It mainly adopts multi-level indexing: an index library is built for the multimedia data, the user supplies the keyword to be queried, the various features of the keyword are obtained by an encoding operation, multi-level query and indexing are carried out according to these features, and the multimedia file data found are output by a browser. The multimedia data can thus be reused, achieving the purpose of fully utilizing ready-made data.
Accordingly, the present invention uses a dedicated tool to place several index libraries on the multimedia optical disc in order to index the multimedia data. One index library is a table of all available data titles in the multimedia data; the remaining index files serve as index files for those titles, that is, further indexes built on top of the aforesaid index library. The user can thereby find the required multimedia data efficiently, and the search speed is improved.
Accordingly, the encoding operation that the present invention performs on the features of the keyword uses a reverse elimination algorithm in order to achieve fast searching.
Another object of the present invention is, in cooperation with the aforesaid index function, to write two browsers for the multimedia data of a title when the title is developed. The first browser, called the e-book, can browse the entire multimedia data. The second browser can receive parameters: the user submits the content to be queried to this browser as a parameter, and the browser returns the relevant multimedia data to the user. Through this process, the multimedia data on one title can be used by later titles, and data resources are not wasted.
The object of the present invention is achieved by providing a method and a device for querying and indexing multimedia title data, in which several index libraries, namely a title keyword table, a primary title index table and a secondary title index table, are set on the multimedia medium. The user supplies the required keyword, which is input to the browser. Using multi-level indexing, the features of the keyword are encoded and the search proceeds by comparing the various features of the keyword, so that all data in the multimedia data that contain the keyword are obtained and returned to the browser for display. The multimedia data can thus be reused, achieving the purpose of fully utilizing ready-made data.
The structural design and operating principle of the present invention are explained in detail below with reference to the accompanying drawings, so that the features of the invention may be further understood, in which:
Fig. 1 is a schematic diagram of the structure and functions of the present invention;
Fig. 2 is a flow chart of the steps of the reverse elimination algorithm of the present invention.
The present invention mainly provides a method by which the multimedia data on an optical disc can be reused, achieving the purpose of fully utilizing ready-made data. It mainly adopts multi-level indexing. As shown in Fig. 1, several index libraries are first set on the multimedia optical disc, including a title keyword table 30 (CDINDEX.DAT), a primary title index table 50 (CDINDEX.ID1) and a secondary title index table 40 (CDINDEX.ID2). The user supplies the required keyword 10, which is input to the browser 20. The data of the three tables 30, 40 and 50 are generated according to the features of the keywords 10 (detailed below). After multi-level query and indexing, the query result is returned to the browser 20 and displayed.
The keyword 10 supplied by the user may be an irregular combination of phrases. Five features of the keyword are therefore extracted and encoded, and the search uses the reverse elimination method: the initial of the first word of the keyword 10, the length of the first word, the initial of the second word, the last character of the keyword 10 and the keyword length are compared in turn. Working from the whole down to the details, confirming each detail in turn confirms the whole, and all data on the optical disc that contain the keyword 10 are obtained.
Of the five feature codes extracted from the keyword 10, the initial of the first word and the length of the first word form the first feature of the keyword 10; the initial of the second word serves as the second feature of the keyword 10. If the keyword 10 is Chinese, the first feature is the low 12 bits of the internal code of the first character, and the second feature is the high 8 bits of the internal code of the second character.
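The five features described above can be sketched in Python as follows. This is a minimal illustration, not the patent's own code: the names `KeywordFeatures`, `extract_features` and `double_byte_features` are hypothetical, keywords are assumed to be non-empty and separated by single spaces, and the double-byte helper takes already-decoded 16-bit internal codes because the text does not name a specific Chinese encoding.

```python
from typing import NamedTuple, Optional


class KeywordFeatures(NamedTuple):
    first_initial: str             # initial of the first word
    first_length: int              # length of the first word
    second_initial: Optional[str]  # initial of the second word, if there is one
    last_char: str                 # last character of the whole keyword
    length: int                    # keyword length, spaces included


def extract_features(keyword: str) -> KeywordFeatures:
    """Extract the five features of a single-byte (ASCII) keyword.

    Assumes a non-empty keyword whose words are separated by single spaces.
    """
    words = keyword.split(" ")
    second = words[1][0] if len(words) > 1 and words[1] else None
    return KeywordFeatures(words[0][0], len(words[0]), second,
                           keyword[-1], len(keyword))


def double_byte_features(first_code: int, second_code: int) -> tuple:
    """First and second feature of a double-byte keyword.

    first_code and second_code are the 16-bit internal codes of the first and
    second characters (hypothetical inputs; the encoding is not specified).
    """
    return first_code & 0x0FFF, (second_code >> 8) & 0xFF


print(extract_features("Wang dong sheng"))
```

The first two fields together correspond to the first feature, and `second_initial` corresponds to the second feature.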
The title keyword table 30 contains three items for each keyword: the keyword 10 itself, the last character of the keyword and the keyword length. As Table 1 illustrates, position 0 of the title keyword table 30 holds the keyword "Li jian" and records the three items Li jian, n and 7 (the length includes the space). In the same way, the three items of every keyword 10 are recorded in the title keyword table 30.
Table 1: Keyword table of an embodiment of the present invention
Position  Keyword             Last character (byte) of keyword  Length of keyword
0         Li jian             n (ASCII 110)                     7
1         Li ming             g (ASCII 103)                     7
2         Wang dong           g (ASCII 103)                     9
3         Wang dong sheng     g (ASCII 103)                     15
4         穆立源 (Mu Liyuan)   (ASCII 183)                       6
Table 2: Secondary index table of an embodiment of the present invention
Position  Second feature of keyword  FROM  TO
0         j                          0     0
1         m                          1     1
2         d                          2     3
3         223 (立)                    4     4
Table 3: Primary index table of an embodiment of the present invention
Position  FROM  TO
...       ...   ...
236       0     1
...       ...   ...
471       2     2
...       ...   ...
4287      3     3
...       ...   ...
The secondary title index table 40 records the positions in the title keyword table 30 of the keywords 10 that share the same first and second features. Each entry contains three items: the second feature of the keyword 10, the start position (FROM) in the title keyword table 30 of all keywords 10 having those first and second features, and the end position (TO) in the title keyword table 30 of all keywords 10 having those first and second features. Table 2 illustrates this:
The second feature of the keyword "Li jian" is j, and only position 0 has this feature, so its start and end positions are 0 and 0. The second feature of the keywords "Wang dong" and "Wang dong sheng" is d, so the start position is 2 and the end position is 3. Proceeding in the same way builds the complete secondary title index table 40.
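How the keyword table 30 and the secondary index table 40 relate can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the tool described in the patent: the function names are hypothetical, the tables are held as in-memory lists of tuples rather than the CDINDEX files, only the single-byte keywords of the example are used, and sorting by features is assumed so that keywords sharing the same first and second features occupy consecutive positions.

```python
from itertools import groupby


def features(keyword):
    """Return (first feature, second feature) of a space-separated keyword."""
    words = keyword.split(" ")
    first = (words[0][0], len(words[0]))          # initial + length of first word
    second = words[1][0] if len(words) > 1 else ""
    return first, second


def build_tables(keywords):
    """Build the keyword table and the secondary index as lists of tuples."""
    # Sort so that keywords with equal features are adjacent (an assumption).
    ordered = sorted(keywords, key=lambda k: (features(k), k))
    # Keyword table rows: (keyword, last character, length).
    keyword_table = [(k, k[-1], len(k)) for k in ordered]
    # Secondary index rows: (second feature, FROM, TO) into the keyword table.
    secondary_index = []
    start = 0
    for (_, second), group in groupby(ordered, key=features):
        count = len(list(group))
        secondary_index.append((second, start, start + count - 1))
        start += count
    return keyword_table, secondary_index


keyword_table, secondary_index = build_tables(
    ["Wang dong sheng", "Li ming", "Wang dong", "Li jian"])
for row in secondary_index:
    print(row)
```

With these four keywords the secondary index comes out as (j, 0, 0), (m, 1, 1) and (d, 2, 3), which agrees with the single-byte rows of the example tables.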
The size of the primary title index table 50 is fixed and is related to the maximum length of the keywords 10. The position in the primary title index table 50 of the first feature of every keyword 10 can be calculated by the following formula, and the content at that position points to the second feature of that keyword 10 in the secondary title index table 40. Each entry of the primary title index table 50 holds two items: the start position (FROM) in the secondary title index table 40 of all keywords 10 containing the first feature, and the end position (TO) in the secondary title index table 40 of all keywords 10 containing the second feature.
Position = (length of the word - 1) × 128 + ASCII code value of the word's initial
If the word is Chinese, then
Position = (low 12 bits of the Chinese internal code) + (128 × 32)
where 32 is the defined maximum keyword length and 128 is the number of ASCII code values.
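The two position formulas can be written out as a short Python sketch. The function and constant names are hypothetical, the character's ASCII value is used exactly as it appears (no case folding is assumed), and the double-byte input `0xA0BF` is an arbitrary illustrative code chosen only because its low 12 bits are 0x0BF.

```python
MAX_KEYWORD_LENGTH = 32   # defined maximum keyword length
ASCII_SIZE = 128          # number of ASCII code values


def primary_position(first_word: str) -> int:
    """Position for a single-byte word: (length - 1) * 128 + ASCII of initial."""
    return (len(first_word) - 1) * ASCII_SIZE + ord(first_word[0])


def primary_position_double_byte(internal_code: int) -> int:
    """Position for a Chinese word: low 12 bits of the code + 128 * 32."""
    return (internal_code & 0x0FFF) + ASCII_SIZE * MAX_KEYWORD_LENGTH


print(primary_position("Wang"))              # (4 - 1) * 128 + 87 = 471
print(primary_position_double_byte(0xA0BF))  # 191 + 4096 = 4287
```

Under these formulas single-byte words fall below 128 × 32 = 4096 and double-byte words fall at 4096 or above, so the two ranges do not collide.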
When a word is scanned, it is first treated as a prefix and the content of the primary title index table 50 is queried according to its first feature. If that position is empty, no keyword 10 begins with the word and scanning moves on to the next word; otherwise the query continues in the secondary title index table.
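The scanning step just described can be sketched as a simple filter over the words of a text. This is only an illustration under assumptions: `candidate_prefixes` is a hypothetical name, the primary index is modelled as a dictionary from position to (FROM, TO) in which a missing key plays the role of an empty entry, and words are assumed to be separated by single spaces.

```python
def primary_position(word):
    """(length of word - 1) * 128 + ASCII value of its initial."""
    return (len(word) - 1) * 128 + ord(word[0])


def candidate_prefixes(text, primary_index):
    """Yield (offset, word) for every word that may begin a keyword.

    primary_index maps a position to (FROM, TO); a missing key stands for an
    empty entry, in which case the word is skipped immediately.
    """
    offset = 0
    for word in text.split(" "):
        if word and primary_position(word) in primary_index:
            yield offset, word
        offset += len(word) + 1   # skip the word and the following space


# Hypothetical primary index holding only the entry for "Wang" (position 471).
primary_index = {471: (2, 2)}
print(list(candidate_prefixes("Mr Wang dong sheng sang", primary_index)))
```

Only the words that survive this filter need the more expensive lookups in the secondary index and the keyword table.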
Once the aforesaid index libraries 30, 40 and 50 have been built, the reverse elimination algorithm can be used to achieve fast searching. Fig. 2 is a flow chart of the steps of the reverse elimination algorithm of the present invention, which comprises the following steps:
Step a. Obtain the position in the primary title index table 50 from the first feature of the word under test. If the entry is empty, the tables contain no keyword 10 beginning with this word, and step e is executed; otherwise use the end and start positions in the primary title index table 50 to find the corresponding content of the secondary title index table 40.
Step b. Compare the second feature, taken from the word that follows the word under test, with the first item of the secondary title index table 40 content found in step a. If they differ, the tables contain no keyword 10 made up of these two words, and step e is executed; otherwise use the end and start positions of that content to find the corresponding content in the title keyword table 30.
Step c. Take the length and last character of the corresponding keyword 10 from the title keyword table 30, and check whether the last character of the word under test equals the last character of the keyword 10. If not, no keyword begins with the word under test, and step e is executed; otherwise compare the keyword 10 and the word under test character by character. If they differ, step e is executed; if they are identical, step d is executed.
Step d. Confirm that the word under test is a keyword 10, and then carry out further operations on it, such as marking the word.
Step e. End the comparison.
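Steps a to e can be sketched in Python as follows. This is a hedged illustration rather than the patented implementation: the names `match_at`, `KEYWORD_TABLE`, `SECONDARY_INDEX` and `PRIMARY_INDEX` are hypothetical, only the single-byte keywords of the example are included, the primary index is a dictionary instead of a fixed-size table, candidates are tried from TO down to FROM (an assumption about ordering), and keywords consisting of a single word are not handled.

```python
def primary_position(word):
    """(length of word - 1) * 128 + ASCII value of its initial."""
    return (len(word) - 1) * 128 + ord(word[0])


# (keyword, last character, length)
KEYWORD_TABLE = [
    ("Li jian", "n", 7),
    ("Li ming", "g", 7),
    ("Wang dong", "g", 9),
    ("Wang dong sheng", "g", 15),
]
# (second feature, FROM, TO) into KEYWORD_TABLE
SECONDARY_INDEX = [("j", 0, 0), ("m", 1, 1), ("d", 2, 3)]
# position -> (FROM, TO) into SECONDARY_INDEX; a missing key means "empty"
PRIMARY_INDEX = {
    primary_position("Li"): (0, 1),
    primary_position("Wang"): (2, 2),
}


def match_at(text, start):
    """Return the keyword that begins at text[start], or None."""
    words = text[start:].split(" ")
    # Step a: first feature -> entry of the primary index.
    entry = PRIMARY_INDEX.get(primary_position(words[0])) if words[0] else None
    if entry is None or len(words) < 2 or not words[1]:
        return None                                   # step e
    # Step b: compare the second feature with the secondary index entries.
    second = words[1][0]
    for feature, kw_from, kw_to in SECONDARY_INDEX[entry[0]:entry[1] + 1]:
        if feature != second:
            continue
        # Step c: check length and last character before the full comparison.
        for pos in range(kw_to, kw_from - 1, -1):
            keyword, last_char, length = KEYWORD_TABLE[pos]
            end = start + length
            if end > len(text) or text[end - 1] != last_char:
                continue                              # eliminated cheaply
            if text[start:end] == keyword:
                return keyword                        # step d: confirmed
    return None                                       # step e


print(match_at("Wang dong sheng", 0))
print(match_at("Mr Wang dong came", 3))
```

The point of the ordering is that the last-character test rejects most non-matching candidates with a single comparison, so the character-by-character comparison runs only on the few candidates that survive.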
The reverse elimination algorithm is further illustrated with Tables 1, 2 and 3. Suppose the word under test is "Wang dong sheng". Its first feature is W4 (initial W, first-word length 4). By the aforesaid formula, its position in the primary title index table 50 is 471, where the start and end positions are found to be "2" and "2"; position "2" of the secondary title index table 40 is therefore queried.
The second feature of the word under test is "d". The first item at position "2" of the secondary title index table is "d", and its start and end positions are "2" and "3", so the contents at positions "3" and "2" of the title keyword table 30 are to be examined.
The content at position "3" of the title keyword table 30 is taken first. Its keyword 10 has length 15 and last character "g". The word under test is then checked against this length and last character, which match in this example, so the word under test and the keyword 10 are compared character by character. The comparison result is identical, so the word is confirmed as a keyword and the comparison ends at step e. In this way the browser 20 can display on screen all data on the disc that contain the keyword 10.
Furthermore, when a title is developed, two browsers 20 can be written for its multimedia data. The first browser, called the e-book, can browse the entire multimedia data. The second browser can receive parameters: the user submits the content to be queried to this browser 20 as a parameter, and the browser returns the relevant multimedia data to the user. Through this process, the multimedia data on one title can be used by later titles, and data resources are not wasted.
As described above, the device and method for querying and indexing multimedia title data provided by the present invention need only a keyword query to make further use of the data of existing or earlier multimedia titles. The data are therefore not wasted through disuse, and the time users spend querying data and program designers spend entering data is shortened.
The present invention has been described in detail above, but what is described is only a preferred embodiment and does not limit the scope of implementation of the invention. All variations and modifications made according to the claims of the present invention therefore still fall within the scope of the claims.

Claims (9)

1. A method and device for querying and indexing multimedia title data, characterized in that several index libraries, namely a title keyword table, a primary title index table and a secondary title index table, are set on the multimedia medium; the user supplies the required keyword, which is input to a browser; using multi-level indexing, the features of the keyword are encoded and a search is performed by comparing the various features of the keyword, so that all data in the multimedia data that contain the keyword are obtained and returned to the browser for display, whereby the multimedia data can be reused and ready-made data can be fully utilized.
2. The method and device for querying and indexing multimedia title data as claimed in claim 1, characterized in that the first feature of the keyword is formed from the initial of its first word and the length of its first word.
3. The method and device for querying and indexing multimedia title data as claimed in claim 1, characterized in that the second feature of the keyword is the initial of its second word.
4. The method and device for querying and indexing multimedia title data as claimed in claim 1, characterized in that the keyword may be in a multibyte internal code, in which case its first feature is the low 12 bits of the internal code of the first character and its second feature is the high 8 bits of the internal code of the second character.
5. The method and device for querying and indexing multimedia title data as claimed in claim 1, characterized in that the title keyword table further comprises three items: the keyword, the last character of the keyword and the keyword length.
6. The method and device for querying and indexing multimedia title data as claimed in claim 1, characterized in that the secondary title index table records the positions in the keyword table of the keywords having the first and second features, and comprises the second feature of the keyword, the start position in the keyword table of all keywords having the first and second features, and the end position in the keyword table of all keywords having the first and second features.
7. The method and device for querying and indexing multimedia title data as claimed in claim 1, characterized in that the size of the primary title index table is fixed and is related to the maximum keyword length, and the table comprises the start position in the secondary title index table of all keywords containing the first feature and the end position in the secondary title index table of all keywords containing the second feature.
8. The method and device for querying and indexing multimedia title data as claimed in claim 1, characterized in that a reverse elimination algorithm can further be used for the query, working from the whole down to the details, so that confirming the details one by one confirms the whole.
9. The method and device for querying and indexing multimedia title data as claimed in claim 8, characterized in that the reverse elimination algorithm comprises:
a. a step of obtaining the position in the primary title index table from the first feature of the word under test;
b. a step of comparing the second feature, taken from the word following the word under test, with the first item of the secondary title index table content found in step a;
c. a step of comparing the length and last character of the corresponding keyword in the keyword table with the last character of the word under test;
d. a step of confirming that the word under test is a keyword;
e. a step of performing operations such as marking the word.
CN 98124160 1998-11-12 1998-11-12 Method for inquiring about index multi-media header data and its device Pending CN1254136A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 98124160 CN1254136A (en) 1998-11-12 1998-11-12 Method for inquiring about index multi-media header data and its device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 98124160 CN1254136A (en) 1998-11-12 1998-11-12 Method for inquiring about index multi-media header data and its device

Publications (1)

Publication Number Publication Date
CN1254136A true CN1254136A (en) 2000-05-24

Family

ID=5228517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 98124160 Pending CN1254136A (en) 1998-11-12 1998-11-12 Method for inquiring about index multi-media header data and its device

Country Status (1)

Country Link
CN (1) CN1254136A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100430921C (en) * 2001-12-29 2008-11-05 Lg电子株式会社 Multimedia data searching and browsing system
WO2007095834A1 (en) * 2006-02-22 2007-08-30 Dong Wang Composite display method and system for search engine of same resource information based on degree of attention
CN101025737B (en) * 2006-02-22 2011-08-17 王东 Attention degree based same source information search engine aggregation display method
CN102298621B (en) * 2006-02-22 2013-11-06 王东 System for obtaining page user focus degree PageFocus by method for aggregating and displaying same source information search engine based on focus degree
CN101089853B (en) * 2006-06-15 2013-06-19 三星电子株式会社 Apparatus and method for browsing contents
CN101295312B (en) * 2008-06-18 2011-12-28 中兴通讯股份有限公司 Method for presenting data by table

Similar Documents

Publication Publication Date Title
US6408270B1 (en) Phonetic sorting and searching
CN100557606C (en) Be used to search the method and apparatus of string
US8204921B2 (en) Efficient storage and search of word lists and other text
EP2172853B1 (en) Database index and database for indexing text documents
CN1325513A (en) Document semantic analysis/selection with knowledge creativity capability
CN102915299A (en) Word segmentation method and device
WO2009005961A1 (en) Phonetic search using normalized string
CN1345426A (en) System and method for extracting index key data fields
US20110113052A1 (en) Query result iteration for multiple queries
CN1148657C (en) File processing method, data processing device, and storage medium
CN1254136A (en) Method for inquiring about index multi-media header data and its device
CN110019637B (en) Sorting algorithm for standard document retrieval
CN102799661A (en) Method and system for implementing semantic retrieval on electronic files
CN1287316C (en) Method and system for compressing column becoming longer in period of indexing high key code generation
US20080306912A1 (en) Query result iteration
CN1648829A (en) Method and system for inputting chinese characters
CN101063984A (en) system and method for automatically arranging order of picture frame
CN1147811C (en) Chinese character identifying method and system with correcting function
CN1144144C (en) High-speed text search method
CN1822001A (en) Single word searching method for hand held data processor
JP2001052024A (en) Method and device for retrieving similar feature amount and storage medium storing retrieval program for similar feature amount
CN1121655C (en) Fast non-regular phrase searching method
Zeng et al. A Chinese Document Retrieval Method Considering Text Order Information
JPH10177582A (en) Method and device for retrieving longest match
Singh Search algorithms

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication