Summary of the invention
The technical problem to be solved by this invention is to provide a method for aggregated search on a handheld learning terminal that can search information across the compressed databases of multiple functional areas, has a wide scope of application, and can retrieve learning resources that lie outside the examination question package.
To solve the above technical problem, the invention is implemented as follows. The method for aggregated search implemented on a handheld learning terminal is characterized in that:
1. A. The handheld learning terminal is provided with an input device for examination question search; matched results include the question text, answer, analysis, knowledge point, question type, difficulty, source and so on. For a search by question text, the user enters, for example, "chickens and rabbits in the same cage" or "greenhouse effect". For a search by senior high school entrance examination source, the user enters, for example, "2007 Suzhou" or "2007 Hangzhou". For a search by knowledge point, the user enters, for example, "linear function" or "Ohm's law". An input device is provided for knowledge explanation search, whose matched results include exam-point Flash courseware, knowledge point explanations and so on. An input device is also provided for extracurricular knowledge search. Each input device is connected to a data input terminal of the handheld learning terminal;
B. The memory of the handheld learning terminal stores a unified, compressed learning resource information database (comprising multiple databases that can be updated over a network or from other storage media, such as the knowledge concept metadata database, the learning assessment resource database, the knowledge explanation resource database, the question association database and the extracurricular knowledge database);
C. The databases in the learning resource information database are not isolated, independent units but an organic combination: the knowledge concept metadata database underpins the subdivision and management of knowledge, the explanation resource database presents knowledge from multiple angles, the assessment resource database manages and subdivides the degree of knowledge mastery, and the question association database records the associations between knowledge points and assessment questions. They complement one another, are interrelated, and none can be omitted. Together they are not only the basic data supporting the search function but also the content basis for learning diagnosis, intelligent test paper assembly and personalized learning.
D. Select the database to be searched as required and set the search scope;
E. Enter the content to be searched in the input field;
F. The microprocessor recognizes the input and performs keyword word segmentation on all of it;
G. The microprocessor searches the selected database for content that fully or partially matches the keywords in content and order;
H. If the search finds content that matches the entered keywords, the microprocessor drives the display device to show the relevant content and highlights the parts of the search results that match the keywords;
I. Select and view the successfully matched content.
2. The technical solution adopted by the invention further comprises: text content in the database is compressed with the general Huffman compression algorithm or with a <prefix length, suffix> compression algorithm and then stored in the memory of the handheld learning terminal, where the prefix length is expressed as a number.
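The <prefix length, suffix> scheme can be illustrated with a minimal sketch. It assumes that the prefix length counts the characters an entry shares with the previous entry of a sorted list (often called front coding); the text does not say what the prefix is measured against, so this reading and the function names `front_encode` and `front_decode` are assumptions made for illustration.

```python
def front_encode(sorted_entries):
    """Encode each entry as (length of prefix shared with the previous entry, suffix)."""
    encoded, previous = [], ""
    for entry in sorted_entries:
        k = 0
        while k < min(len(entry), len(previous)) and entry[k] == previous[k]:
            k += 1
        encoded.append((k, entry[k:]))
        previous = entry
    return encoded


def front_decode(encoded):
    """Rebuild the original entries from (prefix length, suffix) pairs."""
    entries, previous = [], ""
    for k, suffix in encoded:
        entry = previous[:k] + suffix
        entries.append(entry)
        previous = entry
    return entries


# Sample entries are illustrative only.
entries = ["green", "greenhouse", "greenhouse effect", "greenhouse gas"]
pairs = front_encode(entries)
restored = front_decode(pairs)
```

Because sorted entries tend to share long prefixes, most pairs store only a short suffix, which is where the saving comes from under this assumption.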
3. Numeric content in the database is compressed with general Huffman coding or with a <value, value flag> compression algorithm and then stored in the memory of the handheld learning terminal. A value is expressed either as an initial value or as a difference, and the value flag distinguishes the two, that is, it indicates whether the value is a difference or an initial value.
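A minimal sketch of the <value, value flag> idea follows. It assumes that the first number of a sequence is stored as an initial value and each later number as the difference from its predecessor; the flag constants `INITIAL` and `DIFFERENCE` and the function names are hypothetical.

```python
INITIAL, DIFFERENCE = 0, 1  # hypothetical flag values


def delta_encode(values):
    """Store the first number as an initial value and the rest as differences."""
    encoded, previous = [], None
    for value in values:
        if previous is None:
            encoded.append((value, INITIAL))
        else:
            encoded.append((value - previous, DIFFERENCE))
        previous = value
    return encoded


def delta_decode(encoded):
    """Rebuild the numbers by resetting on INITIAL and accumulating on DIFFERENCE."""
    values, current = [], 0
    for number, flag in encoded:
        current = number if flag == INITIAL else current + number
        values.append(current)
    return values


# Illustrative byte positions of one keyword inside an article.
positions = [1032, 1040, 1047, 1102]
encoded = delta_encode(positions)
decoded = delta_decode(encoded)
```

Small differences need fewer bits than full values, so a subsequent Huffman or variable-length step would have less to store.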
4. Each database compressed package corresponds to a unique, identifiable package number (the package ID). The examination question package contains the question content, the knowledge points associated with the questions, the learning courseware of each kind corresponding to those knowledge points, and an index file built from the question content; each of the other database packages contains the searchable content and an index file built from that content.
5. An index file for search can be built for each type of database:
A. The index files of the knowledge concept metadata database, the learning assessment resource database (assessment question bank), the knowledge explanation resource database and the extracurricular knowledge database each comprise <frequency file, position file, Field store>. The Field store expresses the association between an article and its attributes, which include one or more of: article number, question QID, sentence number, article title, address pointing to the article content, cognitive classification, difficulty, answer, analysis and similar questions;
6. Keyword word segmentation is performed on the input content as follows:
A. If the input keywords are English, spaces are used as the segmentation marks;
B. If the input keywords are Chinese, the first two characters (the first character combined with the second) are taken as a first candidate keyword, and the "Chinese standard reference lexicon" is searched for a matching entry. If an entry matching the candidate exactly in content and order is found, the third character is appended and the lexicon is matched again; while matching succeeds, the following character is appended and matching repeats until the longest match forms a keyword group. If the extended match fails, a separator is placed after the first two characters, which become one keyword, and matching restarts from the third character plus the character after it, again until the longest match forms a keyword group. If no entry matches the first two characters exactly in content and order, a separator is placed after the first character, which becomes a keyword by itself, and the second character plus the character after it is looked up in the lexicon; if a matching entry is found, the next character is appended and matching repeats until the longest match forms a keyword group; if the extended string no longer forms an entry, the last successfully matched entry is taken as the keyword and the next character becomes the start of the following entry. All Chinese input is segmented in this way;
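The Chinese segmentation rule in B resembles forward longest-match segmentation. The sketch below is one possible reading, not the patented implementation: it assumes that a candidate keeps growing while it is still a prefix of some lexicon entry, and that the longest exact entry seen so far becomes the keyword. The sample lexicon, the Chinese sample strings and the name `segment_chinese` are hypothetical.

```python
def segment_chinese(text, lexicon):
    """Forward longest-match segmentation against a set of lexicon entries."""
    # Every prefix (two characters or longer) of every lexicon entry.
    prefixes = {e[:k] for e in lexicon for k in range(2, len(e) + 1)}
    keywords, i = [], 0
    while i < len(text):
        best = 1  # fall back to a single character when nothing matches
        j = i + 2
        while j <= len(text) and text[i:j] in prefixes:
            if text[i:j] in lexicon:
                best = j - i  # remember the longest exact match so far
            j += 1
        keywords.append(text[i:i + best])
        i += best
    return keywords


# Hypothetical lexicon; the strings mean "greenhouse", "greenhouse effect", "meaning".
lexicon = {"温室", "温室效应", "含义"}
tokens = segment_chinese("温室效应的含义", lexicon)


def segment_english(text):
    """English input is split on spaces, as in item A."""
    return text.split()


english_tokens = segment_english("Ohm law")
```

The single character left over between two matched entries becomes a keyword of its own here; the later filtering step can discard it if it carries no meaning.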
7. After the input content has been segmented, the handheld learning terminal then:
A. converts all capital letters in the segmented keywords to lowercase and marks them accordingly;
B. filters out keywords that carry no practical meaning, together with punctuation marks.
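The two post-processing steps can be sketched as follows. The stop-word list, the punctuation set and the choice to mark a keyword with a boolean "was converted" flag are all assumptions; the text does not specify how the mark is stored or which words count as meaningless.

```python
import string

STOP_WORDS = {"the", "a", "of", "的"}  # hypothetical list of meaningless words
PUNCTUATION = set(string.punctuation) | set("，。、？！；：")


def normalize(keywords):
    """Lowercase each keyword, flag converted ones, drop stop words and punctuation."""
    result = []
    for word in keywords:
        lowered = word.lower()
        if lowered in STOP_WORDS or all(ch in PUNCTUATION for ch in lowered):
            continue
        # The second element marks whether capital letters were converted.
        result.append((lowered, lowered != word))
    return result


cleaned = normalize(["Ohm", "law", ",", "the", "温室效应", "的"])
```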
8. The microprocessor ranks the search results by priority before displaying them:
A. Question search: an article in which keywords that are adjacent after segmentation are also adjacent in the database article is displayed first; otherwise articles with a higher keyword frequency are displayed first; otherwise articles in which the first keyword occurs earlier are displayed first;
B. Extracurricular knowledge search: an article whose title matches the keywords exactly in content and order is displayed first; otherwise an article whose title contains all the keywords in order is displayed first; otherwise results are ranked by the method described in A.
9. When the result to be displayed consists of text and numbers, the matched content is shown directly on the display device with the keywords highlighted.
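The three-tier priority of item A maps naturally onto a tuple sort key. The sketch below covers only the question-search rule; articles are assumed to be already tokenized, and the article numbers, sample tokens and function names are illustrative.

```python
def rank_key(keywords, tokens):
    """Sort key: adjacency first, then higher frequency, then earlier first hit."""
    keyword_pairs = list(zip(keywords, keywords[1:]))
    token_pairs = set(zip(tokens, tokens[1:]))
    adjacent = any(pair in token_pairs for pair in keyword_pairs)
    frequency = sum(tokens.count(k) for k in keywords)
    first = tokens.index(keywords[0]) if keywords[0] in tokens else len(tokens)
    return (0 if adjacent else 1, -frequency, first)


def rank(keywords, articles):
    """Return article numbers ordered by display priority."""
    return sorted(articles, key=lambda number: rank_key(keywords, articles[number]))


articles = {
    "Q1": ["ohm", "law", "relates", "voltage", "and", "current"],
    "Q2": ["the", "law", "of", "ohm", "ohm", "ohm"],
    "Q3": ["a", "law", "named", "after", "ohm"],
}
order = rank(["ohm", "law"], articles)
```

The title-based rules of item B could be added as further leading elements of the tuple, falling back to this key when no title matches.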
10. The learning resource information database is not only the basic data supporting the search function but also the content basis for all the other functions of the learning terminal, such as learning diagnosis, intelligent test paper assembly and personalized learning. For example, on the learning terminal of the invention, a "wrong-question set database file" is created for the unified learning resource database compressed package to record the questions the user answered incorrectly; a "favorites database file" is created to record content the user considers worth collecting; and a "learning archive information database file" is created to record the test questions the user has done together with the associated knowledge points, diagnostic reports, knowledge points studied and learning status. The "wrong-question set database file", the "favorites database file", the "learning archive information database file" and the like share the same data storage structure on all learning terminals, each comprising: package ID, question QID, question package name and collection date, where the question package name includes its storage path.
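The shared record layout can be pictured with a small sketch. The four fields come from the text; the Python types, the class name `CollectionRecord`, the helper `add_record` and the sample path are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CollectionRecord:
    package_id: int    # package ID of the compressed database package
    question_qid: int  # question QID inside that package
    package_name: str  # question package name, including its storage path
    collected_on: date


def add_record(store, package_id, question_qid, package_name, collected_on):
    """Append one record; the same layout serves every personal database file."""
    record = CollectionRecord(package_id, question_qid, package_name, collected_on)
    store.append(record)
    return record


wrong_questions, favorites = [], []
# Hypothetical path and identifiers.
add_record(wrong_questions, 7, 1024, "/packages/physics/ohm_law.pkg", date(2007, 6, 1))
add_record(favorites, 7, 1024, "/packages/physics/ohm_law.pkg", date(2007, 6, 2))
```

Since each record stores only identifiers and a path, the question content itself stays in the shared compressed package and is not copied.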
Beneficial effects of the invention: the method for aggregated search implemented on a handheld learning terminal combines a new, efficient learning resource database with a multi-level mesh structure with PC search engine technology and applies the combination to the embedded field of personal handheld devices, giving learners scientific and effective support for searching, evaluating, obtaining and using learning objects. In addition:
1. By building a new, efficient learning resource database, the invention effectively addresses a major technical problem in basic education: sharing and making reasonable use of learning resources. In learning resource information, the basic attributes of learning objects are the core of resource sharing. These attributes are usually organized by chapter, section and subsection, and different teachers, regions and organizations organize the content differently. In conventional content construction the various versions are therefore discrete from one another. The search function of the prior art is confined to the question package precisely because the question bank being searched is built as a discrete body of content, forming a system independent of the other learning resource databases. This not only duplicates resource construction work and seriously wastes resources, but also leaves functional defects that cannot be remedied.
2. Applying the learning resource information database to the embedded field of personal handheld devices provides the content on which the functions of the invention depend. The learning resource information database consists of three interrelated core databases: the knowledge point concept database, the assessment resource (question) database and the explanation resource database. In the learning process the content of knowledge is not scattered, jumbled or unordered, so these three databases are not separate, isolated units but an organic combination. The knowledge concept database underpins the subdivision and management of knowledge, the explanation resource database presents knowledge from multiple angles, and the assessment resource database manages and subdivides the degree of knowledge mastery; the three complement one another and none can be omitted. On this basis, learning resources are classified through interactive concept relations according to a predetermined multi-level mesh structure, and the attributes of learning objects are described with standardized, unified identifiers, so that students can conveniently obtain the explicit and implicit information related to a learning object. Based on these attributes, the user can classify and manage, search, and browse learning objects, and a search is no longer limited to the isolated questions and courseware inside a question package but covers all data in the whole learning resource information database. Referring to Fig. 
3. In the isolated databases and search programs of the prior art, each database must use its own search program and there are multiple data control centers; by contrast, the invention uses a shared database with a single search program and a single data control center.
3. Combining the learning resource information database with PC search engine technology makes the search function form an indivisible organic whole with the other related functions of the learning terminal, such as learning diagnosis (basic diagnosis, special topic training, comprehensive test) and intelligent test paper assembly (two-way special training, wrong-question papers, chapter papers, synchronous simulated papers, senior high school entrance examination past papers).
4. The invention collects learning resource material, classifies and processes it according to the essential attributes of learning objects, such as the internal links of the knowledge system and ways of thinking, and edits it into a unified database compressed package. This compressed database serves not only the search function; it is a single resource database fully shared by all functions of the handheld learning terminal. In other words, in the invention the compressed data packages of the functional areas are not independent of and unrelated to one another. They are collected, organized, classified and assembled according to unified attributes, which fundamentally achieves sharing and reasonable use of learning resource information and lets learners use the search function to study effectively. All data in the learning resource information database is edited, organized and compressed by expert-level and special-grade teachers and is preset in the database of the handheld device, so the user need not download it separately and can search the database compressed package for matching content at any time and place. Interaction between databases also keeps the data up to date. In addition, combining the search function with learning diagnosis and personalized test paper assembly opens a new way of learning for one-to-one digital education.
Embodiment
To make the purpose, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the embodiments and the accompanying drawings. The exemplary embodiments and their descriptions serve to explain the invention and do not limit it.
As shown in Figs. 1 to 4, the method for aggregated search implemented on a handheld learning terminal according to the invention comprises steps 1 to 10 as set out in the Summary of the invention above.
The invention is a newly developed search technique for the education field that implements aggregated search on a personal handheld learning terminal. It comprises three search components: question search, extracurricular knowledge search and knowledge explanation search. It is aimed mainly at student users and fully covers the current junior middle school mathematics, physics and chemistry courses. The complete compressed data resource packages for junior middle school mathematics, physics and chemistry are preset on the terminal and need not be downloaded separately. The user only has to enter search conditions on the terminal and then start the corresponding search engine through the input device to obtain the information that meets the conditions.
The internal circuit of the terminal is housed in a casing, on which the display device and the input device are fixed. The internal circuit includes a central processing unit; the power circuit is connected to the power terminal of the central processing unit; the liquid crystal driver module is connected to the central processing unit through the LCD control bus and the LCD data bus; the input unit is connected to an I/O port of the central processing unit; and a data memory storing the various learning resources is provided in the central processing unit and connected to the microprocessor unit. The content to be searched is entered through the input device, the microprocessor matches it against the multiple resources in the memory at the same time, and all successfully matched results are shown on the display device. The invention can thus improve the user's search experience on a handheld learning terminal, maximize the search scope and make more reasonable use of learning resources. It thereby fundamentally addresses a series of problems that arise when search resources cannot be shared: duplicated resources occupying limited storage space, and manpower and material wasted on building those duplicates.
The handheld learning terminal with the resource-sharing aggregation function uses a prior-art internal control circuit. The outside of the terminal carries the input devices shown in Fig. 1, namely a keyboard and a touch screen; the keys include up and down direction keys, a confirm key, a return key and volume adjustment keys, alongside the touch-screen LCD. Pressing the confirm key or tapping the touch screen starts the search engine of the corresponding database.
The memory of the handheld learning terminal in the invention stores the interrelated databases as one integrated whole: the updatable compressed question bank, the extracurricular knowledge base, the knowledge explanation database and so on. Each database is processed on a PC and then stored in the memory of the handheld learning terminal.
Implementation process of the invention:
The process is divided into the following two parts, which are separate but related:
1. Building the raw databases and the standard reference lexicons for each language on the PC, and processing all of them, mainly as follows:
(1) Referring to Fig. 5, raw database files are built for the various texts, pictures, images, animations, sounds and music, and each raw database file is processed as follows to form an index file: word segmentation; filtering of meaningless words and punctuation marks; conversion of all capital letters to lowercase. The raw database content and the resulting index file are then compressed into a target database compressed package containing the "raw database file content" and the "index file built from the raw database file content"; the target question bank compressed package additionally contains "the learning courseware of each kind, for user learning, corresponding to the knowledge points associated with the question content".
2. On the handheld learning terminal, starting the search engine to parse, call and display the target database compressed package.
(1) The PC's processing of the standard reference lexicons and the raw databases is explained below with examples:
1. Sorting the keywords in the standard reference lexicons and in the index files built from the raw databases:
Sorting of Chinese: entries are sorted by GB internal code, and each entry corresponds to an internal code address. Entries whose leading characters are identical in content and order are therefore placed in sequence at adjacent addresses, as shown in the table below:
Keyword lexicon after sorting |
Temperature |
Greenhouse |
Greenhouse effect |
Meaning of the greenhouse effect |
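Sorting by GB internal code can be sketched by comparing the encoded bytes of each entry. The Chinese strings below are assumed back-translations of the four table entries, and using Python's `gb2312` codec as the "GB internal code" is likewise an assumption.

```python
# Assumed originals of: temperature, greenhouse, greenhouse effect,
# meaning of the greenhouse effect.
entries = ["温室效应", "温度", "温室效应的含义", "温室"]

# Compare entries by their GB2312 byte sequences.
ordered = sorted(entries, key=lambda entry: entry.encode("gb2312"))
```

Entries that share a leading character sequence end up next to each other, which is what lets the segmentation step extend a candidate keyword one character at a time.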
Sorting of keywords in other languages: entries are sorted in ascending order of the international standard Unicode internal code.
2. Inversion of the raw databases and building of the index files on the PC, and the related parsing of the databases on the handheld learning terminal
(1) Inversion of the raw databases of the question bank, the knowledge explanation database and the extracurricular knowledge base:
A. Inversion process: the one-to-many relation between an "article number" and "all the keywords it contains" is inverted into a many-to-one relation between "all keywords contained in the articles" and "the article number to which a keyword belongs". Here one examination question is one article, so the question number (QID) in the question bank is the "article number";
B. Inverted structure: as follows from A above, the inverted structure consists of the "keyword" and the "article numbers corresponding to the keyword".
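The inversion described in A and B can be sketched as follows. The input maps each article number (QID) to its keywords, and the output maps each keyword to the article numbers that contain it; the sample QIDs, keywords and the name `invert` are illustrative.

```python
from collections import defaultdict


def invert(articles):
    """Turn {article number: keywords} into {keyword: sorted article numbers}."""
    index = defaultdict(set)
    for number, keywords in articles.items():
        for keyword in keywords:
            index[keyword].add(number)
    # Sort keywords and article numbers so the structure is deterministic.
    return {keyword: sorted(numbers) for keyword, numbers in sorted(index.items())}


forward = {
    101: ["linear", "function", "slope"],
    102: ["ohm", "law", "current"],
    103: ["linear", "circuit", "current"],
}
inverted = invert(forward)
```

A lookup such as `inverted["current"]` then returns every question containing that keyword without scanning the articles themselves.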
(2) Building the index files for all raw databases:
For the raw content of the question bank, the knowledge explanation database and the extracurricular knowledge base, all keywords are obtained by word segmentation, meaningless words and punctuation marks are filtered out, all capital letters are converted to lowercase, and the keywords are then sorted and inverted to build the index file.
The structure of A, index file is as follows:
Wherein the indexed file structure in test item bank, extracurricular knowledge base, knowledge explanation storehouse is as shown in the table:
Frequency file |
Position file |
Field library |
A. Frequency file: the number of times the keyword occurs in each article it belongs to in the raw database;
B. Position file: the byte positions at which the keyword occurs in each article it belongs to in the raw database;
C. Field library: expresses the association between an article and its attributes. The attributes include one or more of the following: article number, article title, the address pointing to the article content, the cognitive classification of the knowledge point corresponding to the question, the difficulty of the question, script, question analysis, and similar questions. One keyword therefore corresponds to one or more Field libraries. The correspondence between the article number and the title Field library is shown in the table below:
The associations derived from the Field library, and their benefits, are as follows:
One examination question is one article, so the question number (QID number) is the article number. The title of a question, i.e. the article title, is the title of its corresponding knowledge point. Because one question corresponds to one or more knowledge points, one question has one or more article titles; that is, in the test item bank the relation between the article number and the knowledge-point title (the article title field) is one-to-one or one-to-many. Because one question corresponds to one or more knowledge points, and one knowledge point corresponds to one or more study coursewares, one question also corresponds to one or more study coursewares.
C. Once the above index structure has been built, the structure of each database is as follows:
A. The "test item bank" consists of three parts: "the index file built from the question content", "the question content" and "the study courseware". The relationship between these parts is shown in the table below:
A) The index file consists of the "frequency file, position file, Field library". The Field library expresses the association between a question and its attributes, which include the question number (QID number), the article title, the address pointing to the question content, cognitive classification, difficulty, answer, analysis, similar questions, etc.;
B) The question content comprises text, numbers, pictures and images;
C) The study courseware comprises multimedia study courseware composed of text in any language, numbers, pictures, images and sound.
D. The hand-held learning terminal starts the search engine and matches keywords in each database as follows:
A. The input keyword is matched in content and order against the keywords in the index file;
B. For each successfully matched keyword, the "test item bank, extracurricular knowledge library, knowledge explanation library" follows the keyword's pointer to the frequency file and its pointer to the position file, both held in the index file, to find the corresponding frequency file and position file;
C. Using the frequency file and position file, the "test item bank, extracurricular knowledge library, knowledge explanation library" finds, in the original database content, all articles the keyword belongs to and all of its positions in those articles;
E. The index structure of the above "test item bank, extracurricular knowledge library, knowledge explanation library" is illustrated by the following two-dimensional table:
                     | Article A      | Article B      | Article C      | Article D      |
Keyword 1 (China)    |                | 3 (P1, P2, P3) |                | 1 (P1)         |
Keyword 2 (people)   | 3 (P1, P2, P3) | 2 (P1, P2)     |                | 2 (P1, P2)     |
Keyword 3 (republic) | 1 (P1)         | 2 (P1, P2)     | 3 (P1, P2, P3) | 2 (P1, P2)     |
A. The first column, containing "keywords 1-3", represents the lexicon file; the columns "Article A", "Article B", "Article C" and "Article D" represent the frequency file and the position file. The frequency file is expressed as a number (3, 1 and 2 in the table above), namely the number of times the keyword of that row occurs in the article of that column. The position file is expressed as P* (where * is 1-3 in the table above), namely the position at which the keyword of that row occurs in the article of that column, that is, the position of the keyword's characters within the question it belongs to, counted in bytes;
B. From the frequency file and position file in the table above, the correspondence between each keyword and all the article numbers it belongs to is as follows:
                     | Corresponding article numbers |
Keyword 1 (China)    | B, D                          |
Keyword 2 (people)   | A, B, D                       |
Keyword 3 (republic) | A, B, C, D                    |
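The relationship between the lexicon, the frequency file and the position file can be sketched with a small in-memory index. This is an illustration under stated assumptions: the article texts are hypothetical, words are separated by single spaces, positions are UTF-8 byte offsets, and the frequency is simply the length of each position list rather than a separately stored file.

```python
def build_index(articles):
    """Map keyword -> {article: [byte offsets]}; the frequency is len(offsets)."""
    index = {}
    for name, text in articles.items():
        offset = 0
        for word in text.split(" "):
            index.setdefault(word.lower(), {}).setdefault(name, []).append(offset)
            offset += len(word.encode("utf-8")) + 1  # +1 for the separating space
    return index

def articles_for(index, keyword):
    """Return the articles a keyword belongs to, as in the second table."""
    return sorted(index.get(keyword, {}))

# Hypothetical articles, not the contents of the table above.
articles = {"A": "people republic people", "B": "China people"}
index = build_index(articles)
print(index["people"])                # {'A': [0, 16], 'B': [6]}
print(articles_for(index, "people"))  # ['A', 'B']
```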
F. The benefits of establishing the index structure and building the index file are:
A. Building an index file in each separate database compressed package costs storage space on the hand-held learning terminal, but an input keyword then only needs to be matched against the keywords in the index file. The search engine avoids sequential string matching of the input keyword against the mass of information in the whole database compressed package, which saves processing time on the hand-held learning terminal, shortens the user's wait and improves efficiency;
3. Compression of each standard reference library and of each raw database (including its index file and original database content), and the corresponding decompression on the hand-held learning terminal:
(1) There are at least the following three compression methods:
A. The first is general Huffman compression, applied to text in any language or to numbers;
B. The second is <prefix length, suffix> compression, applied to text in any language:
A. The prefix length is a number that links the current entry to the adjacent entry above it; the suffix is text such as letters or Chinese characters;
B. When decompressing, the microprocessor of the hand-held learning machine first finds the entry at the adjacent address above the current entry, then takes from that entry, from left to right, as many letters or Chinese (or other-language) characters as the prefix length indicates, tracing back in reverse order through earlier entries until all the letters or characters indicated by the prefix length have been recovered correctly. Finally it concatenates the recovered characters with the suffix, which completes the decompression of the entry.
C. An example of Chinese compression and decompression:
For example, "the harm of greenhouse effect" is expressed with this method as <4, suffix>, where the prefix length 4 covers the four Chinese characters of "greenhouse effect" taken from the adjacent entry above and the suffix holds the remaining characters. The process of restoring "the harm of greenhouse effect" (identical to the English decompression method) is shown in the table below:
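The <prefix length, suffix> scheme can be sketched as below. The function names and the English entries are assumptions for illustration. Note one simplification: this sketch decodes the whole sorted list in order, whereas the text describes tracing back from a single entry through the adjacent entries above it.

```python
import os

def front_compress(sorted_entries):
    """Encode each entry as (length of prefix shared with the previous entry, suffix)."""
    pairs, previous = [], ""
    for entry in sorted_entries:
        shared = len(os.path.commonprefix([previous, entry]))
        pairs.append((shared, entry[shared:]))
        previous = entry
    return pairs

def front_decompress(pairs):
    """Rebuild each entry from the previous entry's prefix plus the stored suffix."""
    entries, previous = [], ""
    for prefix_length, suffix in pairs:
        previous = previous[:prefix_length] + suffix
        entries.append(previous)
    return entries

# Hypothetical sorted lexicon.
lexicon = ["green", "greenhouse", "greenhouse effect"]
packed = front_compress(lexicon)
print(packed)  # [(0, 'green'), (5, 'house'), (10, ' effect')]
```

The scheme pays off only because the lexicon is sorted, so that neighbouring entries share long prefixes.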
C. The third is <value, value flag> compression, applied to numbers:
A. The value is either an original value or the difference between the current value and the previous value. The value flag indicates which kind of value it is, for example 0 for an original value and 1 for a difference.
B. Storing a value as an original value keeps the decompression time on the hand-held learning terminal, and hence the user's wait, from becoming too long. At certain positions the number is therefore stored directly as an original value instead of as the difference from the previous number. A value stored as an original value needs no decompression, which saves time and strikes a reasonable balance between time and space.
C. Representing a number as a difference shortens it and so reduces the number of bytes needed to store it. For example, the current article number 16390 needs 3 bytes when stored uncompressed. The previous question number is 16383, so after compression the difference 7 is stored instead, which needs only one byte. If the third article number is 16391, it is stored after compression as 1 (the difference between 16391 and 16390), which saves space;
D. Decompressing a difference is similar to the <prefix length, suffix> method for English and Chinese above: the entries at the adjacent addresses above are traced back in reverse order and summed in turn. The storage scheme of difference compression and its decompression on the hand-held learning terminal are illustrated below:
A) The table below compares a group of original values before and after <value, value flag> compression:
B) In the table above, the "after compression" column holds <value, value flag> pairs. The first number of each pair is an original value or a difference (such as 1, 2, 60, 70, 80, 81); the second number, 0 or 1, is the value flag, where 0 denotes an original value and 1 denotes a difference.
C) The value "70" is restored to its original value in one restoration step, as shown in the table below:
D) The value "80" is restored to its original value in two restoration steps, as shown in the table below:
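The <value, value flag> scheme can be sketched as follows, using the article numbers 16383, 16390 and 16391 from the example above. The function names and the `restart_every` parameter (how often an original value is stored) are assumptions for illustration; the text only says that original values are kept at certain positions.

```python
def delta_compress(values, restart_every=4):
    """Return (value, flag) pairs: flag 0 = original value, flag 1 = difference."""
    pairs = []
    for i, value in enumerate(values):
        if i % restart_every == 0:
            pairs.append((value, 0))                   # stored as-is, no decoding needed
        else:
            pairs.append((value - values[i - 1], 1))   # difference from previous value
    return pairs

def delta_restore(pairs, i):
    """Recover entry i by walking back to the nearest original value and summing."""
    total = 0
    while pairs[i][1] == 1:
        total += pairs[i][0]
        i -= 1
    return total + pairs[i][0]

numbers = [16383, 16390, 16391]
pairs = delta_compress(numbers)
print(pairs)                    # [(16383, 0), (7, 1), (1, 1)]
print(delta_restore(pairs, 2))  # 16391
```

A smaller `restart_every` shortens the walk back at the cost of storing more full-size values, which is the time and space trade-off described in B above.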
A. Compression reduces the size of each raw database and so saves storage space on the hand-held learning terminal;
B. It lets the hand-held learning terminal strike a balance between CPU processing speed and storage space.
4. At this point, the processing of each raw database on the PC is complete.
(2) The following describes, with examples and with reference to the accompanying drawings, how the hand-held learning terminal starts the search engine and parses, calls and displays each target database compressed package:
1. Referring to accompanying drawings 4 and 5, all keywords, including Chinese text content, are input on the hand-held learning terminal and the search engine is started:
(1) Plain-text input: plain text is entered using the relevant input method.
2. Word segmentation is performed on all keywords:
Segmenting Chinese:
A. Method: referring to accompanying drawing 2, when the input key words are Chinese, the first two characters (the first character combined with the second) are taken as the first candidate keyword, and the "Chinese standard reference library" is searched for a matching entry. If an entry is found that matches in both content and order, the third character is appended and the library is matched again on content and order. If that match succeeds, the following characters continue to be appended one at a time until the match is maximal, which forms a keyword. If the match fails, a separator is marked after the first two characters, which become one keyword, and matching restarts from the third character combined with the character after it, again until the match is maximal. If no entry matches the first two characters in both content and order, a separator is marked after the first character, which becomes a keyword by itself, and the second character combined with the character after it is searched for in the "Chinese standard reference library". If a match in content and order is found, the next character is appended and matching continues until the match is maximal. If appending a character no longer yields an entry, the last successfully matched entry is taken as the keyword and the next character becomes the start of the following entry. All Chinese input is segmented in this way.
B. Example: the input key words are "Fu Laier Limited Company was established in 2006". First the two characters "Fu Lai" are compared with the keywords in the "Chinese standard reference library", and an entry containing "Fu Lai" is found. The character "er" is then added, giving the three characters "Fu Laier", which are compared with the library; the word "Fu Laier" is also found. Next the first character of "limited" is added, giving four characters, which are compared with the library. No such word is found, so these four characters cannot be treated as one segment, and "Fu Laier", the string without the last added character, is a keyword. The first character of "limited" is then combined with the character after it into a new entry and compared with the library, where the word "limited" is found. Continuing in the same way, the segmentation result is "Fu Laier / limited / company / established / in / 2006", i.e. the keywords "Fu Laier", "limited", "company", "established", "in" and "2006" have been found.
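The segmentation method can be sketched as a greedy forward match. This is a simplified reading of the method, not a definitive implementation: the function name, the lexicon and the input string are assumptions, and a candidate is extended while it is still a prefix of some lexicon entry, keeping the longest exact match seen.

```python
def segment(text, lexicon):
    """Greedy forward segmentation: grow a candidate from two characters while it matches."""
    prefixes = {word[:i] for word in lexicon for i in range(1, len(word) + 1)}
    words, start = [], 0
    while start < len(text):
        best = 1                      # fall back to a single character
        end = start + 2               # begin with a two-character candidate
        while end <= len(text) and text[start:end] in prefixes:
            if text[start:end] in lexicon:
                best = end - start    # remember the longest exact match so far
            end += 1
        words.append(text[start:start + best])
        start += best
    return words

# Hypothetical "Chinese standard reference library" and input.
lexicon = {"温度", "温室", "温室效应", "含义"}
print(segment("温室效应的含义", lexicon))  # ['温室效应', '的', '含义']
```

The single character left over in the middle is the kind of function word that the later filtering step removes.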
3. Words with no practical meaning and punctuation marks are filtered out of the keywords (for example, Chinese function words such as "is");
4. The letter case in the key words is unified by converting all uppercase letters to lowercase;
5. The microprocessor matches the input keywords against the data in each database compressed package:
(1) All input keywords are matched in content and order against the keywords in the index file, including each plain text in the index file;
(2) For each successfully matched keyword, the "test item bank, extracurricular knowledge library, knowledge explanation library" follows the keyword's pointers in the "index file" to find the corresponding frequency file and position file;
(3) Using the frequency file and position file, the "test item bank, extracurricular knowledge library, knowledge explanation library" finds, in the corresponding database compressed package, all articles the keyword belongs to and all of its positions in those articles;
7. All search results are sorted for display:
(1) Display ordering for the test item bank: all successfully matched results are ordered for display by priority, according to whether adjacent keywords are adjacent in the article, how many times the keywords occur in the article, and the position of the first keyword in the article:
A. First, articles in which the adjacent keywords are also adjacent in the matched article are displayed first.
B. Second, if the adjacent keywords are not adjacent in the matched article, the number of times the keywords occur in the article is compared, and articles with the higher frequency are displayed first.
C. Last, if the keywords occur the same number of times in the matched articles, the articles are ordered by the character position at which the first keyword first occurs in the matched article, earlier positions first.
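The three ordering rules above can be sketched as a single sort key. The data layout, the names and the adjacency test (the second keyword starts one separator byte after the first ends) are assumptions for illustration only.

```python
def rank(hits, keywords):
    """hits: {article: {keyword: [byte offsets]}}; returns articles in display order."""
    def adjacent(postings):
        # Two consecutive query keywords count as adjacent when the second
        # starts right after the first (one separator byte is assumed).
        for first, second in zip(keywords, keywords[1:]):
            starts = set(postings.get(second, []))
            width = len(first.encode("utf-8")) + 1
            if any(p + width in starts for p in postings.get(first, [])):
                return True
        return False

    def key(article):
        postings = hits[article]
        frequency = sum(len(postings.get(k, [])) for k in keywords)
        first_position = min(postings.get(keywords[0], [float("inf")]))
        # Adjacent first, then higher frequency, then earlier first occurrence.
        return (not adjacent(postings), -frequency, first_position)

    return sorted(hits, key=key)

# Hypothetical postings for two matched articles.
hits = {
    "article 1": {"greenhouse": [0], "effect": [30]},
    "article 2": {"greenhouse": [12], "effect": [23]},
}
print(rank(hits, ["greenhouse", "effect"]))  # ['article 2', 'article 1']
```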
(3) Display ordering for the extracurricular knowledge library:
The keyword is first treated as an article title and matched against the article titles in the database compressed package. With the input as a title keyword, all articles in the database compressed package whose title contains the keyword are displayed: articles whose title matches the keyword exactly in content and order come first, followed by articles whose title contains the keyword content in order. Otherwise the keyword is treated as article content and all content containing the keyword is displayed, ordered as above by whether adjacent keywords are adjacent in the database compressed package, by how many times the keywords occur and, when the frequencies are equal, by the position at which the first keyword first occurs in the matched article.
8. The display driver of the hand-held learning terminal is called to show all search results in the above display order:
The displayed content is text: the successfully matched article content is shown on the display device with the keywords highlighted;
(3) The following illustrates, by example, the operating flow of question search in the present invention.
1. Building, sorting, inverting, indexing and compressing the original test item bank:
(1) Let article 1 (question 1) and article 2 (question 2) constitute an original test item bank file:
A. The content of article 1 is:
B. The content of article 2 is:
(2) Keyword segmentation is performed on the content of article 1 and article 2:
A. Purpose of segmentation: because the search engine indexes and queries on the keywords in the index file, the keywords of the two articles must be obtained first, i.e. extracted in order to build the index file;
B. Segmentation method: the article content is treated as a character string and all words in it are found first, i.e. the string is segmented with the space as separator. The keywords of article 1 and article 2 are then:
A. All keywords of article 1 are:
B. All keywords of article 2 are:
(2) With reference to the "meaningless word standard reference library" and the "punctuation mark reference library", meaningless keywords and punctuation marks are filtered out. The keywords of article 1 and article 2 are then:
A. All keywords of article 1 are:
B. All keywords of article 2 are:
(3) Inversion:
A. The relation from "article number" to "all keywords in the question" in article 1 and article 2 is inverted into the relation from "all keywords in the questions" to "the numbers of all articles containing that keyword";
B. All keywords in the questions are sorted by the rule that "entries sharing the same leading characters in the same order are placed at adjacent addresses";
(4) Building the index file:
A. After "frequency of occurrence" and "position of occurrence" information is added to the inverted structure of article 1 and article 2, the index structure becomes "keyword + question number + [frequency of occurrence] + position of occurrence", where "question number + [frequency of occurrence]" is the frequency file and "position of occurrence" is the position file.
(5) The index file and the question content of the original test item bank are compressed;
(6) The study courseware for user learning that corresponds to the knowledge points associated with the questions is then added, which forms the target test item bank compressed package. As described above, the target test item bank compressed package comprises three parts: "the question content", "the study courseware for user learning corresponding to the knowledge points associated with the question content" and "the index file built from the question content";
(7) At this point the PC has finished processing the original test item bank into the target test item bank compressed package.
2. The hand-held learning terminal starts the search engine, parses the target test item bank compressed package and displays the search results:
(1) All key words are input.
(2) After word segmentation of the key words, five keywords are obtained:
(3) After meaningless words and punctuation marks are filtered out on the hand-held learning terminal with reference to the "meaningless word standard reference library" and the "punctuation mark reference library", three keywords are obtained.
(4) Binary search is used to match the keywords against the keywords in the lexicon file of the index file in the test item bank compressed package.
(5) The frequency file and position file of the index file are used to find the corresponding articles and content in the test item bank compressed package, i.e. article 1 and article 2 are found for the keywords in the table above;
(6) The successfully matched article content is ordered for display:
By the principle that articles in which adjacent keywords are adjacent in the test item bank compressed package are displayed first: the input keywords are adjacent and the two keywords are also adjacent in article 2, so the content of article 2 is displayed ahead of the content of article 1;
(7) According to the display order, the hand-held learning terminal drives the display driver device to show the search results;
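The binary search of step (4) can be sketched with Python's standard `bisect` module. The function name and the lexicon contents are illustrative assumptions; the lexicon must already be sorted, which the sorting step on the PC is meant to guarantee.

```python
from bisect import bisect_left

def find_keyword(lexicon, keyword):
    """Return the index of keyword in the sorted lexicon, or -1 if it is absent."""
    i = bisect_left(lexicon, keyword)
    return i if i < len(lexicon) and lexicon[i] == keyword else -1

# Hypothetical sorted lexicon file of an index.
lexicon = ["effect", "greenhouse", "temperature"]
print(find_keyword(lexicon, "greenhouse"))  # 1
print(find_keyword(lexicon, "ozone"))       # -1
```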
The present invention can be widely applied in various hand-held learning terminals, such as electronic dictionaries and learning machines.