DISPLAYING APPARATUS AND METHOD FOR PROCESSING TEXT INFORMATION THEREOF
FIELD OF THE INVENTION
The present invention relates to a displaying apparatus and a method for processing text information thereof, and more particularly, to a displaying apparatus and a method for processing text information thereof, being capable of readily searching for and displaying text information desired by a user by reflecting a preference of the user using the text information.
BACKGROUND ART
A variety of displaying apparatuses and displaying systems for generally processing and displaying text information have been developed. Of the text information, especially teletext information has been supplied free of charge from broadcasting stations along with broadcast signals, which is usually known as teletext broadcasting.
This kind of teletext information service has been used to provide information about stock market prices, prices, sports and weather as well as news reports, and various other kinds of information such as detailed explanations of broadcast programs or guides to programs supplied on other channels. Usually, TV broadcasting stations transmit image signals under different standards such as National Television System Committee (NTSC), Phase Alternating Line (PAL), Séquentiel Couleur à Mémoire (SECAM) and so on, depending upon national standards. Teletext information is transmitted in an otherwise unused portion of the short interval between frames (vertical blanking interval: VBI) of the broadcast signal.
Users have been required to have predetermined decoders that convert text information such as teletext so that it can be displayed through a displaying apparatus.
Generally, teletext information transmitted by a broadcasting station has an index item to indicate the kind of information, which is linked to each teletext page, and may also include a keyword corresponding to the index item. In addition, each teletext page may include a plurality of subteletext pages.
In this respect, where a user wishes to search for any information desired, if he/she selects the number of any corresponding page based on a first search by a keyword, without considering all teletext pages, he/she can view information contained in the specific teletext page. However, since the amount of information provided is very vast, it is not easy to locate appropriate information relevant to the user's interest. The current service of providing teletext
information has a limitation in promptly providing a user with exact information as desired, appropriately reflecting the user's desire or need.
For example, the keywords corresponding to the respective pages constituting the teletext information are few in number, yet each covers a broad range of the teletext service. Accordingly, the keywords are insufficient to indicate information in the specific area corresponding to the user's preference, and thus searching for a relevant and appropriate teletext page is ineffective.
For this reason, it takes a considerably long time to search for any information desired by the user from the vast amount of teletext information provided, and it is also not easy to locate the desired information.
DISCLOSURE OF INVENTION
Accordingly it is an object of the present invention to provide a displaying apparatus capable of more easily searching for and processing teletext information desired by a user by reflecting the user's preference, in consideration of the user's feedback on the teletext information provided, and a method of processing the text information thereof.
The foregoing and/or other aspects of the present invention are also achieved by providing a displaying apparatus displaying text information, comprising: a data
converting unit converting the text information received so as to allow it to be displayed to a user; and a preference processing unit analyzing and then processing a user's preference relative to the text information based on a characteristic value established by the user relative to the converted text information.
According to the embodiment of the present invention, the preference processing unit comprises: a preference extracting unit extracting the user's preference relative to desired text information based on the text information having the characteristic value established by the user; and a preference searching unit searching for the desired text information corresponding to the user's preference from new text information.
According to the embodiment of the present invention, the preference extracting unit comprises a Rough set based calculation unit discerning and determining a significance relative to a semantic element appropriate for analyzing the user's preference based on the text information and extracting the user's preference.
According to the embodiment of the present invention, the preference extracting unit further comprises a filtering unit filtering the semantic element inappropriate for analyzing the user's preference from the text information.
According to the embodiment of the present invention, the preference extracting unit further comprises a user profile generating unit generating a user profile having the significance and a core element selected among the semantic elements included in the text information having the characteristic values.
According to the embodiment of the present invention, the Rough set based calculation unit comprises: a discernibility table calculating unit calculating a discernibility table relative to the text information and the semantic elements contained therein based on the established characteristic values associated with the text information; and a reduction calculating unit calculating a reduct set representing the significance to the semantic elements by applying an MD-Heuristic algorithm to the discernibility table.
According to the embodiment of the present invention, the preference searching unit comprises a similarity calculating unit determining whether the new text information is the desired text information, based on the determination as to whether the new text information is similar to the user profile based on the lower similarity and the upper similarity.
According to the embodiment of the present invention, the preference searching unit comprises a fuzzy
approximation calculating unit calculating a fuzzy approximation, considering the user profile and synonym elements semantically equivalent to the semantic elements included in the new text information, to reflect it in determining the similarities.
According to the embodiment of the present invention, the displaying apparatus further comprises a user interface unit allowing the user to establish the characteristic values associated with the converted text information.
The foregoing and/or other aspects of the present invention are also achieved by providing a method of processing text information of a displaying apparatus having a data converting unit converting received text information so as to be displayed to a user and displaying the converted text information thereon, comprising: establishing characteristic values associated with the converted text information; extracting a user's preference relative to desired text information based on the text information having the characteristic values; and searching desired text information corresponding to the user's preference from new text information.
According to the embodiment of the present invention, the extracting the user preference comprises : filtering
semantic element inappropriate for analyzing the user's preference from the text information; and discerning and determining a significance by a fuzzy set based discernibility relative to the semantic element appropriate for analyzing the user's preference based on the filtered text information and extracting the user's preference.
According to the embodiment of the present invention, extraction of the user's preference is performed by generating a user profile including the significance and a core element selected among the semantic elements included in text information having the characteristic values.
According to the embodiment of the present invention, extraction of the user's preference comprises: calculating a discernibility table associated with the text information and the semantic elements included therein based on the characteristic values established in the text information; and calculating a reduction set representing the significance relative to the semantic elements by applying an MD-Heuristic algorithm to the discernibility table.
According to the embodiment of the present invention, searching the desired text information is performed by determining whether the new text information is similar
to the user profile based on the lower similarity and the upper similarity, to determine whether the new text information is the desired text information.
According to the embodiment of the present invention, determination as to whether the new text information is the desired text information is performed by calculating a fuzzy approximation, considering the user profile and synonym elements semantically equivalent to semantic elements included in the new text information, to reflect it in determining the similarity.
BRIEF DESCRIPTION OF DRAWINGS
The above and/or other aspects and advantages of the present invention will become apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic diagram illustrating a construction of a displaying apparatus according to an exemplary embodiment of the present invention;
FIG. 2 is a block diagram illustrating an operation of the displaying apparatus according to an exemplary embodiment of the present invention; and
FIG. 3 is a flow chart illustrating a text information processing method according to an exemplary embodiment of the present invention.
MODES FOR CARRYING OUT THE INVENTION
Reference will now be made in detail to exemplary embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
FIG. 1 is a schematic diagram illustrating a construction of a displaying apparatus capable of displaying text information thereon, according to an exemplary embodiment of the present invention. As illustrated therein, the displaying apparatus 10 comprises a microcomputer 20 for processing broadcast signals, image signals and the like which are externally received with text information, and a decoder 30 as a data converting unit converting the text information included in the image signals and the like processed from the microcomputer 20 so that they can be displayed to the user. The text information converted through the decoder 30 is displayed through a user interface unit 40. The user interface unit 40 supplies feedback information according to the user's determination on the displayed text information to the displaying apparatus 10, which is supplied as an intermediary enabling the user and the displaying apparatus 10 to operate interactively.
The user interface unit 40 displays the converted text information supplied from the decoder 30 so that the user can directly view it, and also allows the user to establish a determination result such as a rate, as a predetermined characteristic value, based on respective determinations according to the user's interest in the displayed text information. Further, the user may select a keyword or the page number displayed in an index item of the text information through the user interface unit 40, to thereby search for any desired information.
In other words, the user may establish characteristic values of the rates, Good or Bad, with respect to the respective pages indicating text information through the user interface unit 40 so that feedback information which will be used as a basis for analyzing the user's preference can be provided to the displaying apparatus 10. Owing to this, the displaying apparatus 10 according to the present invention may extend the coverage of searching for text information, based on the rates established by the user as well as the keywords. Accordingly, to view the text information desired by the user, there is no need to search for all the pages indicating text information provided in the vast amount as in the conventional method. Also the displaying apparatus 10 according to an
exemplary embodiment of the present invention comprises a preference processing unit 50 analyzing and processing the user's preference relative to the text information provided through the decoder 30, based on the feedback information including rates supplied through the user interface unit 40.
FIG. 2 is a block diagram schematically illustrating an operation of the displaying apparatus according to an exemplary embodiment of the present invention. Referring to FIG. 2, the preference processing unit 50 comprises a preference extracting unit 51 extracting the user's preference from predetermined text information, and a preference searching unit 55 determining whether the text information received is appropriate for the user's preference, based on the user's preference extracted by the preference extracting unit 51. A profile generating unit 54 generating a user profile reflecting the user's preference analyzed in the preference extracting unit 51 may be provided within the preference processing unit 50. Further, the preference processing unit 50 may be constructed with modules, so that the processing may be performed by a software program.
The preference extracting unit 51 according to the present invention adopts a Rough set based discernibility approach suggested by Skowron et al. to analyze the user's preference, under which the preference extracting unit 51 first analyzes feedback information indicating the user's interest relative to several text pages including text information and identifies the user's preference for significant semantic elements, that is, significant words, included in the corresponding text page.
Namely, the user's preference is understood based on the rates of Good or Bad allocated to each page of the text information supplied by the user as the feedback information through the user interface unit 40. At this time, the significant words may be used in discerning respective text pages having the rates of Good or Bad. In this respect, the user may also establish more rate values, in addition to the two rates, Good and Bad, as characteristic values reflecting his/her own interest.
The preference extracting unit 51 refers to a set of significant semantic elements, namely, significant words, as a reduct. As will be described later, the reduct will be used in searching for and discerning text pages relevant or irrelevant to the user's preference.
The preference extracting unit 51 may further comprise a storage unit (not shown) to store therein text information converted and supplied by the decoder 30 for preference analysis. At this time, the text information constituting a text page is stored on a word basis. The storage unit may store words ranked in a predetermined sequence, which are extracted depending upon the frequency of the words appearing in a predetermined number of pages having the text information.
The preference extracting unit 51 may further comprise a filtering unit to filter semantic elements inappropriate for reflecting the user's interest. For example, semantic elements such as 'the,' 'where,' 'about,' 'along,' etc., having no significant information about text, are referred to as "filter words," which may be excluded from the objects subject to preference analysis. At this time, a predetermined database may be constructed to eliminate these filter words.
In this exemplary embodiment of the present invention, it may be considered that words belonging to predetermined rankings, for example, 50 rankings according to the frequency of the words represented in a predetermined page of the text information are appropriate semantic elements to reflect the user's interest. At this time, the frequency of corresponding words relative to each page of text information can be normalized; in this case, the frequency of each word may exist in the section of [0, 1] .
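The filtering, ranking and normalization described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the filter-word list, the function name `page_features`, and the choice to normalize by the count of the most frequent word are hypothetical, since the description states only that the normalized frequency lies in [0, 1].

```python
import re
from collections import Counter

# Hypothetical filter-word list; the description only says that a
# database of such words "may be constructed".
FILTER_WORDS = {"the", "where", "about", "along", "a", "of", "and"}


def page_features(text, top_n=50):
    """Return {word: normalized frequency} for the top_n non-filter words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in FILTER_WORDS]
    ranked = Counter(words).most_common(top_n)
    if not ranked:
        return {}
    # Assumption: normalize by the count of the most frequent word so
    # that every value falls within [0, 1].
    peak = ranked[0][1]
    return {word: count / peak for word, count in ranked}


features = page_features(
    "the stock market and the stock price about the weather")
```

Here `features` maps each retained word of one page to a value in [0, 1], which is the form of entry used in the decision system described next.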
The frequency of each word may be used in determining the significance of that word within the page of the corresponding text information. Accordingly, to locate any significant word indicating the user's preference, the preference extracting unit 51 may comprise a Rough set based operation unit performing an operation based on the Rough set based discernibility approach.
Based on this approach, the Rough set based operation unit constructs a decision system composed of rows and columns, each row indicating each page of text information and each column indicating words as semantic elements. This decision system has characteristic values decided corresponding to pages representing text information. These characteristic values may be Good or Bad, according to the rank established to reflect the user's interest, which is supplied as feedback information by the user.
This system may be indicated as a decision table as described below, wherein each column may be indicated with the frequency of each word included in the corresponding page and a rank indicating the user's interest in the corresponding page with respect to the text page of each row. Here, text information refers to teletext information by way of example.
In the above decision system, the value of each entry may indicate the normalized frequency of the word relative to the page indicating each piece of teletext information. The frequency is a continuous value, and the exact value carries no important information in itself. Accordingly, these continuous values are discretized and converted into interval values. In other words, for the pages including a given word, the frequencies of that word are arranged in increasing order and divided into intervals such as {[lowest value, next higher value], [next higher value, second next higher value], ...}. According to an exemplary embodiment of the present invention, the mean values of the intervals defined as above are defined as 'cut values.' A 'cut value' serves to locate the significance of a predetermined word, that is, a semantic element appropriate for discerning the pages indicating text information. Since the number of 'cut values' increases for each word, the preference analysis is performed by locating the smallest set of 'cut values' that can discern the maximum number of pages indicating text information in the present information.
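As a minimal sketch of the discretization just described, the cut values of one word can be computed as the midpoints of consecutive distinct frequency values. The function name `cut_values` is hypothetical, and treating repeated frequencies as a single value is an assumption.

```python
def cut_values(frequencies):
    """Midpoints between consecutive distinct frequencies of one word."""
    values = sorted(set(frequencies))
    return [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]


# Frequencies of one word over four pages (illustrative values).
cuts = cut_values([12, 19, 10, 12])
# cuts == [11.0, 15.5]
```

With frequencies 10, 12 and 19, the intervals are [10, 12] and [12, 19], so the cut values are 11 and 15.5.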
For this purpose, the Rough set based operation unit may be provided with a discernibility table calculating unit 52 and a reduct calculating unit 53 so as to make a discernibility table and then apply an MD-Heuristic algorithm to be described later. As an example, the discernibility table calculating unit 52 can make the following discernibility table from the above decision table.
In the discernibility table, each row is indicated with a page number indicating a pair of teletext information having different rankings and each column is
indicated with a word and a cut value corresponding to the word. Here, the determination value relative to each word and each pair of pages is determined depending upon whether the cut value exists within the range of frequency value that the pair of pages have.
For example, in the above discernibility table, a first word (Word 1) in the first column has the cut value of 11. According to this, for the page pair (100, 101) having different rates, the determination value is 0 because the cut value falls outside the range of frequency values (12 ~ 19) of the corresponding page pair. For the page pair (100, 102), the determination value is 1 because the cut value is within the range of frequency values (10 ~ 12) of the corresponding page pair.
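The construction of the discernibility table can be sketched as below. The frequencies of Word 1 follow the example in the text (12, 19 and 10 for pages 100, 101 and 102); the Good/Bad rates assigned to the pages, the data layout and the function name `discernibility_table` are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical decision table: per page, the frequency of each word and
# the rate given by the user. The Good/Bad labels are assumed.
pages = {
    100: ({"word1": 12}, "Good"),
    101: ({"word1": 19}, "Bad"),
    102: ({"word1": 10}, "Bad"),
}
cuts = [("word1", 11)]  # columns: (word, cut value)


def discernibility_table(pages, cuts):
    """Map each differently rated page pair to {column: 0 or 1}."""
    table = {}
    for p, q in combinations(sorted(pages), 2):
        (freq_p, rate_p), (freq_q, rate_q) = pages[p], pages[q]
        if rate_p == rate_q:
            continue  # only pairs with different rates need discerning
        row = {}
        for word, cut in cuts:
            lo, hi = sorted((freq_p[word], freq_q[word]))
            # 1 when the cut lies between the two pages' frequencies.
            row[(word, cut)] = 1 if lo < cut < hi else 0
        table[(p, q)] = row
    return table


table = discernibility_table(pages, cuts)
```

For the pair (100, 101) the cut value 11 lies outside 12 ~ 19, giving 0; for (100, 102) it lies within 10 ~ 12, giving 1.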
The reduct calculating unit 53 applies the MD-Heuristic algorithm according to each of the following operations to find a reduct by use of the discernibility table calculated in the discernibility table calculating unit 52.
Step 1: Set T = r, where T represents the maximum deviation of the determination values and r represents the available maximum deviation. For example, in the above discernibility table, the maximum deviation (T) of the determination values 0 and 1 is 1. Define W as the reduct set and set W = Null.
Step 2: Set T = T-1 where there is no column including T.
Step 3: Select a column in which T has the maximum frequency. This column includes a word discerning the maximum number of documents or pages corresponding to the current discernibility level.
Step 4: Select the word and the cut value corresponding to the concerned column and delete the concerned column from the discernibility table. Since the discernibility of the concerned column has now been considered, all the rows having the value T in the concerned column are deleted and the word of the concerned column is added to the reduct set.
Step 5: If any row remains, return to step 1 and repeat the steps described above. Otherwise, terminate.
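The steps above can be sketched as a greedy loop. This is a simplified illustration for a binary discernibility table, where T is always 1 and the decrement of step 2 never applies; the function name `md_heuristic`, the table layout and the alphabetical tie-breaking are assumptions.

```python
def md_heuristic(table):
    """Greedy reduct search over a binary discernibility table.

    table maps a page pair to {column: 0 or 1}, where a column is a
    (word, cut value) tuple.
    """
    rows = {pair: dict(cols) for pair, cols in table.items()}
    reduct = []
    while rows:
        columns = {c for cols in rows.values() for c in cols}
        # Count how many remaining page pairs each column discerns.
        score = {c: sum(cols.get(c, 0) for cols in rows.values())
                 for c in columns}
        best = max(sorted(score), key=score.get) if score else None
        if best is None or score[best] == 0:
            break  # the remaining pairs cannot be discerned
        reduct.append(best)
        # Delete the chosen column and every row it discerns.
        rows = {pair: {c: v for c, v in cols.items() if c != best}
                for pair, cols in rows.items() if cols.get(best, 0) == 0}
    return reduct


example = {
    (100, 101): {("stock", 11): 1, ("rain", 0.3): 0, ("goal", 0.4): 0},
    (100, 102): {("stock", 11): 1, ("rain", 0.3): 1, ("goal", 0.4): 0},
    (103, 101): {("stock", 11): 1, ("rain", 0.3): 0, ("goal", 0.4): 1},
    (103, 102): {("stock", 11): 0, ("rain", 0.3): 0, ("goal", 0.4): 1},
}
reduct = md_heuristic(example)
```

In this example the "stock" column discerns three pairs and is chosen first; the one remaining pair is discerned only by "goal", so the reduct holds those two columns.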
A reduct set generated according to the algorithm described above refers to the user's preference, which is used in generating a profile for analyzing the user's preference in the profile generating unit 54. At this time, the user profile may include keywords contained in the head part of the text page rated as, e.g., "Good" by the user. Meanwhile, the profile generating unit 54 may
be provided as an auxiliary module of the preference extracting unit 51.
Where the displaying apparatus 10 according to the present invention analyzes teletext information, it analyzes the user's preference by analyzing feedback information supplied from the user, searches for appropriate teletext pages which are relevant to the user's interest and filters inappropriate teletext pages which are irrelevant to the user's interest. In the present invention, since relevant teletext pages are searched based on the user's preference analyzed in the preference extracting unit 51 as described above, there is no need to search for all text pages to view information desired by the user. In the displaying apparatus 10 according to an exemplary embodiment of the present invention, the preference searching unit 55 employs a Rough-Fuzzy similarity based approach, to search for appropriate pages relevant to the user's interest based on the user profile supplied from the profile generating unit 54. According to this, pages representing inappropriate text information irrelevant to the user's interest can be filtered. As a result, the text page having the highest similarity to the user profile can be searched for and displayed.
In the displaying apparatus 10 according to an exemplary embodiment of the present invention, a fuzzy approximation calculating unit 56 and a similarity calculating unit 57 may be provided within the preference searching unit 55 so as to determine the similarity between the user profile and a new text page.
The fuzzy approximation calculating unit 56 inspects the reduct set from the user profile supplied by the profile generating unit 54. Here, the user profile includes the frequency of a set of core semantic elements, namely, keywords, in addition to the reduct having the normalized frequency. At this time, the frequency of a keyword set represents the frequency of a keyword included in a predetermined text information page rated as Good by the user. For example, in case of teletext service, a keyword is provided in the head part of each page representing teletext information and selected by the user through the user interface unit 40. According to this, a complete set of the keyword and the reduct may constitute a user profile.
Meanwhile, the displaying apparatus 10 according to an exemplary embodiment of the present invention is more efficient in the sense that it can also take into account synonym elements semantically equivalent to semantic elements included in text information, that is, synonyms of a predetermined word. Here, the synonyms may be provided in advance in the form of a database. Also, in the case of a teletext service, the synonym database may be constructed considering words that commonly appear in a predetermined page.
According to this, the fuzzy approximation calculating unit 56 seeks values of the fuzzy lower approximation and the fuzzy upper approximation by use of the user profile and the synonym database. The fuzzy lower approximation includes words certainly defining a desired text page and the fuzzy upper approximation includes words possibly defining a desired teletext page. The fuzzy lower approximation and the fuzzy upper approximation may be defined as in the following formulas (1) and (2) .
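$$\mu_{\underline{\mathrm{apr}}_R(F)}(x) = \inf\{\max[\mu_F(y),\ 1 - \mu_R(x, y)] \mid y \in U\} \qquad (1)$$

$$\mu_{\overline{\mathrm{apr}}_R(F)}(x) = \sup\{\min[\mu_F(y),\ \mu_R(x, y)] \mid y \in U\} \qquad (2)$$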
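A minimal sketch of the fuzzy lower and upper approximations follows, using the standard fuzzy-rough definitions (infimum of a maximum, supremum of a minimum). Taking formulas (1) and (2) to have this form is an assumption. The synonym table, the relation function and all names are hypothetical: `F` is a fuzzy word set such as the user profile, and `relation(x, y)` returns the degree to which two words are synonyms.

```python
# Hypothetical synonym database: unordered word pair -> similarity degree.
SYNONYMS = {frozenset(("stock", "share")): 0.8}


def relation(x, y):
    """Fuzzy similarity of two words (1.0 for identical words)."""
    if x == y:
        return 1.0
    return SYNONYMS.get(frozenset((x, y)), 0.0)


def fuzzy_lower(F, R, universe):
    """Membership of each word in the fuzzy lower approximation of F."""
    return {x: min(max(F.get(y, 0.0), 1.0 - R(x, y)) for y in universe)
            for x in universe}


def fuzzy_upper(F, R, universe):
    """Membership of each word in the fuzzy upper approximation of F."""
    return {x: max(min(F.get(y, 0.0), R(x, y)) for y in universe)
            for x in universe}


universe = ["stock", "share", "weather"]
profile = {"stock": 1.0}
lower = fuzzy_lower(profile, relation, universe)
upper = fuzzy_upper(profile, relation, universe)
```

In this example the upper approximation gives "share" a weight of 0.8 through its synonym "stock", so a page containing "share" can still match the profile, while "weather" receives 0.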
A fuzzy weight may be allocated to the fuzzy lower approximation and the fuzzy upper approximation, considering each synonym of each word. In the present invention, the fuzzy approximations are calculated relative to all words corresponding to the user profile and new text pages. In this calculation process, a weight of each word is determined, considering synonym words. This weight is used in representing words belonging to the user profile and the page of new text information.
In an exemplary embodiment of the present invention, the similarity calculating unit 57 of the displaying apparatus 10 determines whether the new text page has a similarity to the user profile. In the present invention, relevance of the new text page to the user profile is determined based on a value of the similarity to the user profile, and the similarity value can be calculated by the following formulas. In the following formulas (3) and (4), B_l and B_u respectively refer to the lower approximation and the upper approximation relative to the set of the words of the user profile (S_2) with the new text page (S_1).
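$$B_l = \underline{\mathrm{apr}}_R(S_2) \;|-|\; \left(\underline{\mathrm{apr}}_R(S_1) \cap \underline{\mathrm{apr}}_R(S_2)\right) \qquad (3)$$

$$B_u = \overline{\mathrm{apr}}_R(S_2) \;|-|\; \left(\overline{\mathrm{apr}}_R(S_1) \cap \overline{\mathrm{apr}}_R(S_2)\right) \qquad (4)$$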
where |-| represents a bounded difference and apr represents a fuzzy approximation.
Each approximation calculated from the above formulas (3) and (4) includes the words given by the difference between the word set of the user profile reflecting the user's preference and the intersection of the word set of the user profile with the word set of the new teletext page. Employing the results of the above formulas (3) and (4), values of the lower similarity and the upper similarity of the user profile (S_2) with the new text page (S_1) can be calculated by the following formulas (5) and (6).
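$$\mathrm{Similarity}_{\underline{R}}(S_1, S_2) = 1 - \frac{\mathrm{card}(B_l)}{\mathrm{card}\left(\underline{\mathrm{apr}}_R(S_2)\right)} \qquad (5)$$

$$\mathrm{Similarity}_{\overline{R}}(S_1, S_2) = 1 - \frac{\mathrm{card}(B_u)}{\mathrm{card}\left(\overline{\mathrm{apr}}_R(S_2)\right)} \qquad (6)$$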
(In the above formulas, the 'card' function represents the number of words contained in the concerned set.)
In the similarity values calculated above, "0" indicates that two sets, that is, the user profile and the page representing the new text information, are not identical, and "1" indicates that the two sets are perfectly identical.
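The similarity calculation can be sketched as follows, with fuzzy word sets held as dictionaries of membership values. The helper names are hypothetical, and counting the words with a positive membership as the cardinality is an assumption based on the statement that 'card' gives the number of words in a set. Applying `similarity` to the lower approximations yields the lower similarity, and applying it to the upper approximations yields the upper similarity.

```python
def bounded_difference(a, b):
    """Fuzzy bounded difference: max(0, a(w) - b(w)) for every word."""
    return {w: max(0.0, a.get(w, 0.0) - b.get(w, 0.0))
            for w in set(a) | set(b)}


def intersection(a, b):
    """Fuzzy intersection: minimum membership over the shared words."""
    return {w: min(a[w], b[w]) for w in set(a) & set(b)}


def card(s):
    """Number of words with a positive membership (an assumption)."""
    return sum(1 for m in s.values() if m > 0)


def similarity(page, profile):
    """1 - card(profile |-| (page ∩ profile)) / card(profile)."""
    size = card(profile)
    if size == 0:
        return 0.0  # assumption: an empty profile matches nothing
    b = bounded_difference(profile, intersection(page, profile))
    return 1.0 - card(b) / size


profile = {"stock": 1.0, "share": 0.5}
page = {"stock": 1.0, "weather": 0.3}
score = similarity(page, profile)
# score == 0.5
```

Here one of the two profile words is fully covered by the page, so the similarity is 0.5; a page identical to the profile gives 1, and a page sharing no word with the profile gives 0.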
The similarity value represents the degree to which the new text page is relevant to the user profile. At this time, the lower similarity value is calculated considering all the words certainly defining the user's interest, and the upper similarity value is calculated considering all the words possibly defining the user's interest. Here, to obtain a similarity, the fuzzy lower approximation and the fuzzy upper approximation obtained by the formulas (1) and (2) can be applied to predetermined terms of the above formulas (3) through (6).
According to this, whether or not a new text page corresponds to appropriate text information relevant to the user's interest is determined based on the following conditions.
A new text page is determined to be relevant to the user's interest where 1) both the lower similarity and the upper similarity are maximum, 2) the lower similarity is maximum and the upper similarity is medium, or 3) the upper similarity is maximum and the lower similarity is medium. When none of these conditions is satisfied, the new text page is determined to be irrelevant to the user's interest.
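The three conditions can be sketched as a small decision function. The description does not define numerically what "maximum" and "medium" mean, so the thresholds 0.8 and 0.4 and the names `level` and `is_relevant` below are purely hypothetical.

```python
def level(value, high=0.8, low=0.4):
    """Grade a similarity value; the thresholds are hypothetical."""
    if value >= high:
        return "maximum"
    if value >= low:
        return "medium"
    return "low"


def is_relevant(lower_similarity, upper_similarity):
    """Apply the three relevance conditions to a new text page."""
    grades = (level(lower_similarity), level(upper_similarity))
    return grades in {
        ("maximum", "maximum"),
        ("maximum", "medium"),
        ("medium", "maximum"),
    }


relevant = is_relevant(0.9, 0.5)
```

With these assumed thresholds, a page whose lower similarity is 0.9 and upper similarity is 0.5 is treated as relevant, while a page with two medium values is not.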
A method of processing text information according to an exemplary embodiment of the present invention will be described with reference to the flow chart illustrated in FIG. 3. Here, the text information refers to teletext information supplied from a broadcasting station and so on, by way of example. When data signals containing teletext information included in broadcast signals or image signals, etc. are received by the displaying apparatus 10 at operation 100, the teletext information is extracted through the decoder 30 by way of the microcomputer 20, which processes the signals, and is converted so as to be displayed to a user at operation 102. The teletext information converted in this way is displayed through the user interface unit 40 at operation 104, and the user individually evaluates a predetermined number of pages of the teletext information and establishes characteristic values as a predetermined rate at operation 106.
According to this, the user's preference is calculated through the preference extracting unit 51 and the preference searching unit 55, based on the rate
established relative to the teletext information fed back from the user. By allowing the extracted preference to be reflected and displayed after searching whether the new teletext information is relevant to the preference, the user's preference can be analyzed and processed.
To extract the user's preference, a discernibility table is obtained from the rate established by the user and the predetermined number of teletext information pages supplied from the decoder 30, corresponding to the established rate, according to the Rough set based discernibility approach as described above, and a reduct set is calculated by extracting significant semantic elements, namely, significant words, through the MD- Heuristic algorithm at operation 108. At this time, a filtering process to filter irrelevant semantic elements not appropriate for reflecting the user's preference from the words constituting semantic elements included in the teletext information may be performed as described above.
Based on the calculated reduct set, a user profile reflecting the user's interest is generated at operation 110. The user profile may include therein a core element, namely, a keyword, selected among semantic elements included in the page of teletext information having a predetermined characteristic value, namely a rate (e.g., the rate of Good) .
The semantic element included in the user profile reflects the user's preference relative to teletext information based on the Rough set based discernibility approach. According to this, a similarity to the user profile relative to new teletext information supplied from the decoder 30 is calculated at operation 112. As described above, it is determined whether the upper similarity and the lower similarity calculated according to Rough-Fuzzy similarity based approach meet a predetermined condition at operation 114.
When a predetermined similarity condition is satisfied, it is determined that the new teletext information supplied through the decoder 30 is appropriate information relevant to the user's preference, that is, teletext information desired by the user at operation 116. Subsequently, the desired information can be displayed through the user interface unit 40. When the predetermined similarity condition is not satisfied, it is determined that the new teletext information is inappropriate information irrelevant to the user's preference at operation 118. In this way, a process of searching for new teletext information by reflecting the user's preference is performed. Meanwhile, in the process of determining whether the new teletext information is teletext information desired by the user, the fuzzy lower approximation and the fuzzy upper approximation are calculated considering synonym elements, namely, synonym words, having equivalent meanings to the semantic elements included in the teletext information, and they can be reflected in determining the similarity to the corresponding semantic element.
If the methods described in detail according to exemplary embodiments of the present invention are applied to various displaying apparatuses, they are effective in searching for and processing relevant teletext information based on the user's interest.
In the exemplary embodiments of the present invention, a displaying apparatus and a method of processing text information thereof have been described with respect to teletext information transmitted through air waves as text information, by way of example. However, the present invention is applicable to processing text information in various displaying apparatuses capable of visually displaying the text information. For example, where text information is transmitted through a variety of communication lines for telephones, cable TVs, etc., or in a wireless manner, and is displayed through a variety of displaying apparatuses such as computer monitors, portable computers, cable TVs, etc. connected in a wired or wireless manner, it is possible to search for and process the desired text information based on the analysis of the user's preference according to exemplary embodiments of the present invention.
As described above, the present invention provides a displaying apparatus capable of more easily searching for new text information relevant to a user's preference and processing text information desired by the user by analyzing the user's preference relative to text information based on the user's interest. In addition, text information irrelevant to the user's preference may be filtered in the searching process. Although the present invention has been described in connection with the exemplary embodiments illustrated in the accompanying drawings, it should be understood that the present invention is not limited thereto and those skilled in the art can make various modifications and changes without departing from the scope of the invention.