WO2006134682A1 - Named entity extraction apparatus, method, and program - Google Patents
Named entity extraction apparatus, method, and program
- Publication number
- WO2006134682A1 (PCT/JP2005/023768)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- extraction
- specific expression
- order
- specific
- expression
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Definitions
- Named entity extraction apparatus, method, and program
- The present invention relates to a specific expression extraction apparatus capable of extracting a specific expression adapted to a user.
- A specific expression (named entity) is a specific language item that a task treats as one unit, such as a proper noun, company name, e-mail address, country name, city name, product name, organization name, time, date and time, monetary expression, or percentage expression.
- Patent Document 1 Japanese Patent Laid-Open No. 2003-248680
- A known specific expression extraction method can deal with different tasks by operating a plurality of conventional specific expression extraction modules. However, because it cannot store the type and unit of the extracted specific expression in association with the user or the display terminal, it cannot extract a specific expression adapted to the user or the display terminal.
- As a result, when the extracted specific expression is displayed, the user has had to read either a specific expression that includes a character string redundant to that user, or one too short for the user to understand its meaning.
- For example, for a program whose name consists of a main title and a subtitle, the subtitle is redundant information for a user who can identify the program from the main title alone. Conversely, for a user who cannot identify the program from the main title alone, the subtitle is necessary information, and the main title and the subtitle must be presented together.
- The present invention has been made in view of the above circumstances. Its object is to provide a specific expression extraction apparatus that can extract a specific expression adapted to an extraction condition, typified by the user's input history and the display capability of the display terminal.
- The named entity extraction apparatus extracts specific expressions from one or more input texts by sequentially using one or more specific expression patterns, each indicating a criterion for identifying a specific expression portion included in text. It comprises extraction order setting means for determining, depending on an extraction condition, an extraction order indicating the order in which the specific expression patterns are used, and specific expression extraction means for extracting specific expressions from the one or more input texts using the specific expression patterns in the order indicated by the determined extraction order.
- For specific expressions that have a nested structure, extraction can proceed in order starting from the specific expression with the shorter (or the longer) character string length. The extraction process can therefore be stopped once a specific expression of the character string length optimal for the user has been extracted, so that the optimal specific expression can be extracted for the user and the display terminal.
- FIG. 1 is a configuration diagram of a named entity extraction apparatus according to a first embodiment.
- FIG. 2 is a diagram illustrating an example of an extraction order stored in an extraction order storage unit used in the named entity extraction apparatus according to the first embodiment.
- FIG. 3 is a diagram showing an example of rules used for extraction.
- FIG. 4 is a diagram showing another example of rules used for extraction.
- FIG. 5 is a diagram showing still another example of rules used for extraction.
- FIG. 6 is a diagram showing a specific example of extraction performed using a rule for extracting names.
- FIG. 7 is a flowchart showing an operation in the first embodiment.
- FIG. 8 is a diagram showing a specific example of the extraction result obtained by the named entity extraction apparatus of the first embodiment.
- FIG. 9 is a diagram illustrating an example of the extraction order stored in the extraction order storage unit used in the named entity extraction apparatus of the first embodiment.
- FIG. 10 is a diagram illustrating an example of an extraction order stored in an extraction order storage unit used in the named entity extraction apparatus according to the first embodiment.
- FIG. 11 is a configuration diagram showing an example of an extraction order reading unit used in the named entity extraction apparatus of the first embodiment.
- FIG. 12 is a flowchart showing an operation example in the first embodiment.
- FIG. 13 is a diagram showing an example of the contents of the usage pattern database used in the named entity extraction apparatus of the first embodiment.
- FIG. 14 is a diagram illustrating an example of the contents of an extraction order database used in the named entity extraction apparatus according to the first embodiment.
- FIG. 15 is a diagram showing an example of the contents of the usage pattern database used in the named entity extraction apparatus of the first embodiment.
- FIG. 16 is a diagram showing an example of the contents of a usage pattern database used in the named entity extraction apparatus according to the first embodiment.
- FIG. 17 is a diagram showing an example of the extraction order stored in the extraction order storage unit used in the named entity extraction apparatus of the first embodiment.
- FIG. 18 is a diagram illustrating an example of an extraction order stored in an extraction order storage unit used in the named entity extraction apparatus according to the first embodiment.
- FIG. 19 is a block diagram showing an example of the extraction end determination unit used in the named entity extraction apparatus of the first embodiment.
- FIG. 20 is a flowchart showing an operation example in the first embodiment.
- FIG. 21 is a diagram illustrating an example of the contents stored in the extraction count storage unit used in the named entity extraction apparatus according to the first embodiment.
- FIG. 22 is a diagram illustrating an example of contents stored in an extraction number storage unit used in the named entity extraction apparatus according to the first embodiment.
- FIG. 23 is a diagram showing an example of the contents stored in the extraction order storage unit used in the named entity extraction apparatus of the first embodiment.
- FIG. 24 is a diagram illustrating an example of contents stored in an extraction order storage unit used in the named entity extraction apparatus according to the first embodiment.
- FIG. 25 is a diagram showing an example of the extraction order stored in the extraction order storage unit used in the named entity extraction apparatus of the first embodiment.
- FIG. 26 is a configuration diagram of a named entity extraction apparatus according to a modification of the first embodiment.
- FIG. 27 is a block diagram showing a configuration of the named entity extraction apparatus according to the second embodiment of the present invention.
- FIG. 28 is a diagram showing an example of the extraction order stored in the extraction order storage unit used in the named entity extraction apparatus of the second embodiment.
- FIG. 29 is a flowchart showing an operation example in the second embodiment.
- FIGS. 30(A) and 30(B) are diagrams showing examples of displaying specific expressions in the second embodiment.
- FIG. 31 is a block diagram showing the configuration of the named entity extraction apparatus according to Embodiment 3 of the present invention.
- FIG. 32 is a diagram illustrating an example of contents stored in a specific expression storage unit used in the specific expression extraction apparatus of the third embodiment.
- FIG. 33 is a flowchart showing an operation example in the third embodiment.
- FIG. 34 is a diagram showing a display example of the specific expression in the third embodiment.
- FIG. 35 is a diagram showing a display example of the specific expression in the third embodiment.
- FIG. 36 is a diagram showing a display example of the specific expression in the third embodiment.
- FIG. 37 is a diagram showing an example of Chinese input text in the modified example.
- The named entity extraction apparatus of the present invention is a specific expression extraction device that extracts specific expressions from one or more input texts by sequentially using one or more specific expression patterns, each indicating a criterion for identifying a specific expression portion included in text. The order in which the specific expression patterns are used for extraction is set according to extraction conditions.
- The extraction condition may be expressed using at least one of: the user who uses the extracted specific expression, the terminal device that displays the extracted specific expression, an attribute of the input text, the number of input texts, and the number of times a specific expression has been extracted in the past.
- The input text may represent program information constituting an electronic program guide.
- With this configuration, the specific expression extraction device sets the order of the specific expression patterns used for extraction according to the extraction conditions. Different extraction results can therefore be obtained depending on extraction conditions such as the user who uses the extracted specific expressions, the terminal device that displays them, the attributes of the input text, the number of input texts, and the number of times specific expressions have been extracted in the past.
- This configuration is suitable, for example, when the input text represents program information constituting an electronic program guide.
- For example, when a program title is extracted from program information as a specific expression, a relatively short specific expression consisting only of the main title can be extracted and presented to a user who is familiar with the program, and a relatively long specific expression consisting of the main title and the subtitle can be presented to a user who is not. A program title whose length reflects the optimum content for each user can thus be presented.
- Similarly, when the terminal device that displays the extracted specific expression is a portable information terminal device, only the main title is extracted and displayed, and when the terminal device is a home-use television broadcast receiving device, the main title and the subtitle are extracted and displayed. This reduces the inconvenience that a long program title displayed on the portable information terminal device impairs listability and is difficult for the user to see.
- The specific expression extraction device may further include specific expression pattern storage means that stores a plurality of specific expression patterns, and extraction order storage means that stores, for each of a plurality of extraction conditions, the order in which one or more specific expression patterns stored in the specific expression pattern storage means are to be used for extraction. When given one of the plurality of extraction conditions, the extraction order setting means may determine, as the extraction order, the order of the specific expression patterns stored in the extraction order storage means for the given extraction condition.
- Alternatively, the specific expression extraction device may further include specific expression pattern storage means that stores a plurality of specific expression patterns, extraction order storage means that stores the order in which one or more of the stored specific expression patterns are to be used for extraction, and extraction order changing means that changes the order of the specific expression patterns stored in the extraction order storage means in accordance with the extraction conditions. The extraction order setting means may then determine the order of the specific expression patterns after the change as the extraction order.
- In other words, the feature of the present invention, that different extraction results are obtained by using different specific expression patterns depending on the extraction conditions, can be realized specifically either by using one of a plurality of extraction orders selected according to the extraction conditions, or by changing the extraction order according to the extraction conditions.
- The specific expression extraction device may use, as the extraction condition, a user identifier for identifying a user, and may further include user identification means for acquiring the user identifier. In that case, the extraction order storage means stores, for each of a plurality of user identifiers, the order of one or more specific expression patterns stored in the specific expression pattern storage means, and the extraction order setting means may determine, as the extraction order, the order of the specific expression patterns stored in the extraction order storage means for the acquired user identifier.
- The specific expression extraction device may also use, as the extraction condition, the terminal identifier of the terminal device that displays the extracted specific expression, and may further include terminal identifier acquisition means for acquiring the terminal identifier. In that case, the extraction order storage means stores, for each of a plurality of terminal identifiers, the order of one or more specific expression patterns stored in the specific expression pattern storage means, and the extraction order setting means may determine, as the extraction order, the order of the specific expression patterns stored in the extraction order storage means for the acquired terminal identifier.
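- The per-identifier lookup described above can be sketched as a simple table. This is a minimal illustration, not the apparatus itself: the identifiers (`portable_terminal`, `home_tv`), the pattern names, and the fallback order are all hypothetical.

```python
# Hypothetical extraction-order table: one ordered list of pattern names per
# terminal identifier. Identifiers and pattern names are illustrative only.
EXTRACTION_ORDER_STORE = {
    "portable_terminal": ["main_title"],
    "home_tv": ["main_title", "main_title_and_subtitle"],
}

# Order used when no entry exists for the identifier (an assumption of this sketch).
DEFAULT_ORDER = ["main_title"]


def set_extraction_order(terminal_id):
    """Return the pattern order stored for the acquired terminal identifier."""
    return list(EXTRACTION_ORDER_STORE.get(terminal_id, DEFAULT_ORDER))


print(set_extraction_order("portable_terminal"))
print(set_extraction_order("home_tv"))
```

The same table shape would work with user identifiers as keys.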
- The specific expression extraction device may use an attribute of the input text as the extraction condition, and may further include attribute acquisition means for acquiring the attribute of the input text. In that case, the extraction order storage means stores, for each of a plurality of attributes, the order of one or more specific expression patterns stored in the specific expression pattern storage means, and the extraction order setting means may determine, as the extraction order, the order of the specific expression patterns stored in the extraction order storage means for the acquired attribute.
- For example, when the input text represents program information constituting an electronic program guide, a program category included in the program information is acquired as the attribute of the input text. If a specific expression pattern that can appropriately extract specific expressions from program information of the acquired program category is used, a good extraction result can be obtained.
- The specific expression extraction apparatus may use the number of input texts as the extraction condition, and may further include an information database storing a plurality of texts and means for retrieving one or more texts serving as input texts from the information database. The extraction order setting means determines, as the extraction order, the order of the specific expression patterns stored in the extraction order storage means for the number of retrieved texts, and the specific expression extraction means may extract specific expressions from the retrieved texts using the specific expression patterns in the order indicated by the determined extraction order.
- For example, when the input text represents program information constituting an electronic program guide and the program title is extracted from the program information as the specific expression, a specific expression consisting only of the main title is extracted if the number of input texts is less than a predetermined threshold value, and a specific expression consisting of the main title and the subtitle is extracted if it is greater than the threshold value. This reduces the inconvenience that the same specific expression is extracted from several texts and the user cannot distinguish them.
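- As a rough sketch of this count-based switch, the example below picks a pattern from the number of input texts. The threshold of 3, the "Main: Subtitle" title format, and both regular expressions are assumptions made for illustration. The source does not say what happens when the count equals the threshold, so this sketch treats that case like "greater than".

```python
import re

# Hypothetical patterns for titles written as "Main title: Subtitle".
MAIN_TITLE_ONLY = re.compile(r"^[^:]+")      # text before the first colon
MAIN_AND_SUBTITLE = re.compile(r"^.+$")      # the whole title


def extract_titles(texts, threshold=3):
    """Use the short pattern for few texts and the long pattern otherwise."""
    pattern = MAIN_TITLE_ONLY if len(texts) < threshold else MAIN_AND_SUBTITLE
    # Assumes every text is non-empty, so the pattern always matches.
    return [pattern.search(text).group(0).strip() for text in texts]


print(extract_titles(["News: Morning", "Drama: Part 1"]))
print(extract_titles(["Drama: Part 1", "Drama: Part 2", "Drama: Part 3"]))
```

With three episodes of the same program, the longer form keeps the entries distinguishable, which is the inconvenience the passage describes.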
- The specific expression extraction device may use the number of input texts as the extraction condition, and may further include an information database storing a plurality of texts, text acquisition means for acquiring a plurality of texts from the information database, display means for displaying text, and similar text acquisition means for acquiring, as the input texts, from the plurality of texts acquired by the text acquisition means, a plurality of texts that are similar to each other when displayed on the display means.
- In that case, the extraction order storage means stores, for each of a plurality of values indicating the number of texts, the order of one or more specific expression patterns stored in the specific expression pattern storage means.
- The extraction order setting means determines, as the extraction order, the order of the specific expression patterns stored in the extraction order storage means for the number of texts acquired by the similar text acquisition means, and the specific expression extraction means may extract specific expressions from the texts acquired by the similar text acquisition means, using the specific expression patterns in the order indicated by the determined extraction order.
- The specific expression extraction apparatus may use, as the extraction condition, the number of times specific expressions have been extracted in the past, and may further include extraction number counting means for counting the number of times specific expressions have been extracted in the past using the specific expression pattern.
- The order changing means may change the order of the specific expression patterns stored in the extraction order storage means according to the counted number.
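- One possible reading of this count-based reordering is sketched below. The source does not state how the counted number changes the order, so the policy here (patterns that extracted more often in the past are tried first, ties keep their stored order) is purely an assumption, as are the pattern names.

```python
from collections import Counter


def reorder_by_extraction_count(order, counts):
    """Move patterns with higher past extraction counts to the front.

    sorted() is stable, so patterns with equal counts keep their stored order.
    """
    return sorted(order, key=lambda name: counts.get(name, 0), reverse=True)


# Hypothetical counts kept by the extraction number counting means.
past_counts = Counter({"pattern_B": 3, "pattern_C": 1})
stored_order = ["pattern_A", "pattern_B", "pattern_C"]
print(reorder_by_extraction_count(stored_order, past_counts))
```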
- The extraction order may indicate a plurality of specific expression patterns in an order in which a longer specific expression is expected to be extracted with each successive use. The specific expression extraction apparatus may further include an extraction truncation unit that, when the length of an extracted specific expression exceeds a predetermined threshold value, terminates the extraction that would otherwise be performed using the subsequent specific expression patterns.
- If the threshold value is set to the longest length needed for the user, the terminal device, and the like, specific expressions longer than necessary are not extracted, so the necessary specific expressions can be extracted while reducing the amount of computation required for extraction.
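- The truncation idea can be illustrated with the sketch below. The three regular expressions and the `max_len` parameter are hypothetical, and the sketch assumes that a candidate longer than the threshold is discarded and the last candidate within the limit is kept; the source leaves that detail open.

```python
import re

# Hypothetical patterns, ordered so that each is expected to extract a longer string.
ORDERED_PATTERNS = [
    ("small_title", re.compile(r"^[^(]+")),          # text before the episode number
    ("medium_title", re.compile(r"^[^(]+\(\d+\)")),  # plus the episode number
    ("large_title", re.compile(r"^.+$")),            # the whole title
]


def extract_with_truncation(text, max_len):
    """Apply patterns in order and stop once a candidate exceeds max_len."""
    best = None
    for name, pattern in ORDERED_PATTERNS:
        match = pattern.search(text)
        if match is None:
            continue
        candidate = match.group(0).strip()
        if len(candidate) > max_len:
            break  # later patterns would only be longer, so skip them
        best = (name, candidate)
    return best


print(extract_with_truncation("Matsugami Electric Founding (1) -Birth-", 32))
```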
- The named entity extraction apparatus of the present invention may also be a specific expression extraction device that extracts specific expressions from one or more input texts by sequentially using one or more specific expression patterns indicating the criterion for identifying the specific expression portion included in text, comprising: an information database storing a plurality of texts; text acquisition means for acquiring a plurality of texts from the information database; specific expression pattern storage means storing a plurality of specific expression patterns; extraction order storage means for storing a plurality of orders in which one or more of the stored specific expression patterns are to be used for extraction; specific expression extraction means for extracting, for each order stored in the extraction order storage means, specific expressions from the plurality of acquired texts using the specific expression patterns in that order, and for forming a specific expression set from the specific expressions extracted for each order; and specific expression determination means for calculating, for each specific expression set obtained by the specific expression extraction means, the number of similar specific expressions, which is the number of mutually similar specific expressions included in the set, and for selecting the specific expression set with the smallest number of similar specific expressions.
- The specific expression extraction device may further include display means for displaying text. When calculating the number of similar specific expressions for each specific expression set, the specific expression determination means may extract from each specific expression a partial specific expression corresponding to the number of characters that can be displayed on the display means and, if the extracted partial specific expressions are similar, use the number of similar partial specific expressions as the number of similar specific expressions.
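- A minimal sketch of this selection step follows. It assumes that two expressions count as "similar" when their displayable prefixes are identical, which is only one possible similarity measure; the set names and sample titles are hypothetical.

```python
from collections import Counter


def count_similar(expressions, display_chars):
    """Count expressions whose displayable prefix is shared with another one."""
    partial = [expression[:display_chars] for expression in expressions]
    occurrences = Counter(partial)
    return sum(n for n in occurrences.values() if n > 1)


def choose_expression_set(sets_by_order, display_chars):
    """Return (order name, expression set) with the fewest similar expressions.

    On a tie, the set listed first is kept.
    """
    return min(sets_by_order.items(),
               key=lambda item: count_similar(item[1], display_chars))


# Hypothetical results of running two different extraction orders.
sets_by_order = {
    "short_order": ["Drama", "Drama", "News"],
    "long_order": ["Drama Part 1", "Drama Part 2", "News Morning"],
}
print(choose_expression_set(sets_by_order, display_chars=12))
```

On a display that shows only 5 characters, both sets look equally ambiguous, so the longer expressions no longer help.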
- The named entity extraction apparatus of the present invention may also be a specific expression extraction device that extracts specific expressions from one or more input texts by sequentially using one or more specific expression patterns indicating a criterion for identifying the specific expression portion included in text, comprising the following means.
- Specific expression pattern storage means stores a plurality of specific expression patterns, and extraction order storage means stores the order in which one or more of the stored specific expression patterns are to be used for extraction.
- Specific expression extraction means extracts specific expressions from the one or more input texts using the one or more specific expression patterns in the order stored in the extraction order storage means.
- Specific expression storage means stores, in association with one another, the input text, the specific expression extracted from the input text, and the stage in the order at which the extraction was performed.
- Display condition designating means designates, according to a user operation, either a predetermined stage or one or more specific expressions extracted at a common stage.
- Specific expression acquisition means acquires from the specific expression storage means, when the predetermined stage is designated, all specific expressions stored for the designated stage, and, when one or more specific expressions are designated, the specific expressions stored for the stage following the common stage for the input texts corresponding to each designated specific expression.
- Duplication deletion means deletes duplicates from the specific expressions acquired by the specific expression acquisition means, and display means displays the specific expressions remaining after the duplicates are deleted.
- With this configuration, the extracted specific expressions can be displayed stage by stage, for example moving from simple specific expressions toward complex ones, which is convenient for checking them in stages.
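- The staged display can be sketched with a small in-memory store. The record layout `(text_id, stage, expression)` and the sample data are assumptions for illustration; the source specifies only that text, expression, and stage are stored in association.

```python
# Hypothetical contents of the specific expression storage means:
# (input text id, stage in the extraction order, extracted expression).
STORE = [
    (1, 1, "Drama"), (1, 2, "Drama (1)"),
    (2, 1, "Drama"), (2, 2, "Drama (2)"),
    (3, 1, "News"),  (3, 2, "News (1)"),
]


def delete_duplicates(expressions):
    """Drop repeated expressions while keeping first-seen order."""
    return list(dict.fromkeys(expressions))


def expressions_at_stage(store, stage):
    """All expressions stored for a designated stage, without duplicates."""
    return delete_duplicates(e for _, s, e in store if s == stage)


def drill_down(store, selected, stage):
    """Next-stage expressions for the texts behind the selected expressions."""
    text_ids = {t for t, s, e in store if s == stage and e in selected}
    return delete_duplicates(
        e for t, s, e in store if s == stage + 1 and t in text_ids
    )


print(expressions_at_stage(STORE, 1))
print(drill_down(STORE, {"Drama"}, 1))
```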
- The present invention can be realized not only as such a specific expression extraction apparatus but also as a specific expression extraction method whose steps are the processing executed by the characteristic means of the apparatus, and as a program that causes a computer to execute these steps. Needless to say, such a program can be distributed via a recording medium such as a CD-ROM or a transmission medium such as the Internet.
- FIG. 1 is a configuration diagram of a named entity extraction apparatus according to Embodiment 1 of the present invention.
- This specific expression extraction device sets the usage order of one or more specific expression patterns according to the extraction conditions, and extracts specific expressions from the input text using the specific expression patterns in the set order. It includes an input unit 101, an extraction order storage unit 102, an extraction order reading unit 103, a specific expression pattern storage unit 104, a specific expression extraction unit 105, and an extraction end determination unit 106.
- The extraction order reading unit 103 is an example of an extraction order setting unit.
- The input unit 101 includes input devices such as a keyboard, a mouse, and a remote controller. When the user inputs text including a specific expression, the input unit 101 outputs the input text and the value 1 as the initial value of the extraction order to be processed.
- Alternatively, the input unit 101 may acquire text to be presented to the user from a database that stores information on TV broadcast programs, information about content stored in a hard disk recorder or the like, or content existing on the Internet, and may output the acquired text and the value 1 as the initial value of the extraction order to be processed.
- The extraction order storage unit 102 stores, in association with each other, the extraction order, which is the order in which the specific expression patterns stored in the specific expression pattern storage unit 104 are used, and the specific expression pattern name corresponding to each extraction order. It also stores the extraction order total number, which is the number of extraction orders.
- Fig. 2 shows an example of the extraction order stored in the extraction order storage unit 102. In the format (extraction order total number, (extraction order, specific expression pattern name to be used), ...), the values (3, (1, specific expression A pattern), (2, specific expression B pattern), (3, specific expression C pattern)) are stored.
- The extraction order reading unit 103 reads the specific expression pattern name corresponding to the input extraction order and the extraction order total number from the extraction order storage unit 102, and outputs the input text, the extraction order, the extraction order total number read from the extraction order storage unit 102, and the specific expression pattern name.
- The specific expression pattern storage unit 104 stores a specific expression A pattern 104A used to extract the specific expression A, a specific expression B pattern 104B used to extract the specific expression B, and a specific expression C pattern 104C used to extract the specific expression C.
- For example, suppose the small title is the text corresponding to the main title, "Matsugami Electric Founding"; the medium title is the small title plus the episode number, "Matsugami Electric Founding (1)"; and the large title is the entire text, "Matsugami Electric Founding (1) -Birth-". In this case, the specific expression A is the small title, the specific expression B is the medium title, and the specific expression C is the large title.
- The specific expression A pattern 104A is a rule for extracting a small title, the specific expression B pattern 104B is a rule for extracting a medium title, and the specific expression C pattern 104C is a rule for extracting a large title.
- A rule is, for example, the character string to be extracted itself, the character string to be extracted stored together with the character strings before and after it, or the concatenation probability between the character string to be extracted and the character strings before and after it.
- For example, one method extracts from the text, as a personal name, a character string that matches a pattern contained in a personal name regular expression table.
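- A minimal sketch of the regular-expression-table method is shown below. The table contents are hypothetical (the figures with the actual table are not reproduced here): a single pattern that treats two capitalized words after an honorific as a personal name.

```python
import re

# Hypothetical personal name regular expression table (one entry).
PERSON_NAME_PATTERNS = [
    re.compile(r"(?:Mr\.|Ms\.|Dr\.)\s+([A-Z][a-z]+ [A-Z][a-z]+)"),
]


def extract_person_names(text):
    """Return every substring that matches a pattern in the table."""
    names = []
    for pattern in PERSON_NAME_PATTERNS:
        names.extend(match.group(1) for match in pattern.finditer(text))
    return names


print(extract_person_names("Tonight's guest is Mr. Takashi Saki, the actor."))
```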
- Alternatively, a name probability table that stores the concatenation probability between a person name and the character strings appearing before and after it, as shown in Fig. 5, can be used.
- In that method, the probability values of the character strings before and after a candidate are added to obtain a likelihood (in the example, the likelihood is "1.1" after the probability value "0.2" is added), and when the likelihood exceeds a specific threshold, the candidate is extracted from the text as a person name.
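- The probability-table method can be sketched as follows. All table entries and the threshold of 1.0 are hypothetical stand-ins for the contents of Fig. 5; only the idea of adding the before and after probabilities and comparing the sum with a threshold comes from the text.

```python
# Hypothetical concatenation probabilities for the single character
# immediately before and immediately after a person name.
BEFORE_PROBABILITY = {":": 0.9, " ": 0.1}
AFTER_PROBABILITY = {",": 0.2, " ": 0.1}


def likelihood(text, start, end):
    """Add the probabilities of the characters around text[start:end]."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    return BEFORE_PROBABILITY.get(before, 0.0) + AFTER_PROBABILITY.get(after, 0.0)


def is_person_name(text, candidate, threshold=1.0):
    """Accept the candidate when its likelihood exceeds the threshold."""
    start = text.find(candidate)
    if start < 0:
        return False
    return likelihood(text, start, start + len(candidate)) > threshold


print(is_person_name("Cast:Takashi Saki, others", "Takashi Saki"))
print(is_person_name("with Takashi Saki today", "Takashi Saki"))
```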
- The named entity extraction unit 105 uses a rule for extracting a person name (a person name pattern), as illustrated in the figures, in accordance with the method described above to extract the personal name "Takashi Saki", which is an example of a specific expression.
- Although the rules here cover only one character before and after, rules may also be constructed for multiple characters.
- The specific expression A pattern 104A, the specific expression B pattern 104B, and the specific expression C pattern 104C are collectively referred to as specific expression patterns.
- The specific expression extraction unit 105 reads the specific expression pattern corresponding to the input specific expression pattern name from the specific expression pattern storage unit 104, and extracts specific expressions from the input text using the read specific expression pattern. It then outputs the text including the extracted specific expressions, together with the extraction order total number and the extraction order input from the extraction order reading unit 103.
- If the extraction order is smaller than the extraction order total number, the extraction end determination unit 106 adds the numerical value 1 to the extraction order value, and outputs the extraction order after the addition and the text input from the specific expression extraction unit 105 to the extraction order reading unit 103.
- If the extraction order is equal to or greater than the extraction order total number, the text input from the specific expression extraction unit 105 is output as the result text, which is the specific expression extraction result.
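- The interplay of units 101 to 106 can be sketched as a short loop. This is a simplified illustration under stated assumptions: the regular expressions standing in for patterns A, B, and C are hypothetical, and instead of embedding tags in the text as the apparatus does, the sketch collects (pattern name, expression) pairs.

```python
import re

# Extraction order storage (cf. Fig. 2): total number and order -> pattern name.
ORDER_STORE = (3, {1: "A", 2: "B", 3: "C"})

# Hypothetical stand-ins for the specific expression A, B, and C patterns.
PATTERN_STORE = {
    "A": re.compile(r"^[^(]+"),          # small title
    "B": re.compile(r"^[^(]+\(\d+\)"),   # medium title
    "C": re.compile(r"^.+$"),            # large title
}


def read_extraction_order(order):
    """Extraction order reading unit 103: look up pattern name and total."""
    total, names = ORDER_STORE
    return names[order], total


def extract(text, pattern_name):
    """Specific expression extraction unit 105: apply one pattern."""
    match = PATTERN_STORE[pattern_name].search(text)
    return match.group(0).strip() if match else None


def run(text):
    order = 1  # initial extraction order output by input unit 101
    results = []
    while True:
        name, total = read_extraction_order(order)
        results.append((name, extract(text, name)))
        if order < total:   # extraction end determination unit 106
            order += 1
        else:
            return results


for name, expression in run("Matsugami Electric Founding (1) -Birth-"):
    print(name, expression)
```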
- the user inputs text including a specific expression from the input unit 101 (step S101).
- the input unit 101 outputs the input text and the value 1 as the initial value of the extraction order to be processed to the extraction order reading unit 103 (step S102).
- For example, when the text "Matsugami Electric Founding (1) -Birth-" is input, the input unit 101 outputs the text and the value 1 as the initial value of the extraction order to the extraction order reading unit 103.
- The extraction order reading unit 103 reads the specific expression pattern name corresponding to the input extraction order and the extraction order total number from the extraction order storage unit 102 (step S103), and outputs the input text, the extraction order, the extraction order total number read from the extraction order storage unit 102, and the specific expression pattern name.
- For example, when the extraction order reading unit 103 receives the text "Matsugami Electric Founding (1) -Birth-" and the extraction order value 1 from the input unit 101, it reads the specific expression pattern name "specific expression A pattern" corresponding to the extraction order value 1 and the extraction order total value 3 from the extraction order storage unit 102. It then outputs the input text "Matsugami Electric Founding (1) -Birth-", the extraction order value 1, the extraction order total value 3, and the specific expression pattern name "specific expression A pattern" to the specific expression extraction unit 105.
- the specific expression extraction unit 105 responds to the input specific expression pattern name.
- the specific expression pattern to be read is read from the specific expression pattern storage unit 104 (step S104), and the specific expression is extracted from the input text using the read specific expression pattern (step S105). Then, the text including the extracted unique expressions, the extraction order total number input from the extraction order reading unit 103, and the extraction order are output.
- For example, when the specific expression extraction unit 105 receives from the extraction order reading unit 103 the text “Matsugami Electric Founding (1) -Birth-”, the extraction order value 1, the extraction order total value 3, and the specific expression pattern name “specific expression A pattern”, it reads the specific expression pattern “specific expression A pattern” corresponding to that name from the specific expression pattern storage unit 104 and uses it to extract the specific expression from the input text “Matsugami Electric Founding (1) -Birth-”.
- If the specific expression pattern “specific expression A pattern” is a pattern for extracting a small title, “Matsugami Electric Founding” is extracted from the text “Matsugami Electric Founding (1) -Birth-” as a specific expression of the type “small title”.
- The extracted specific expression is enclosed by the tags “<small title>” and “</small title>”.
- The specific expression extraction unit 105 then outputs the text “<small title>Matsugami Electric Founding</small title>(1) -Birth-”, which includes the extracted specific expression, together with the extraction order total value 3 and the extraction order value 1, to the extraction end determination unit 106 (extraction result in Fig. 8 (first time)).
- If the extraction order is smaller than the total number of extraction orders (step S106), the extraction end determination unit 106 adds 1 to the value of the extraction order (step S107) and outputs the incremented extraction order and the text input from the specific expression extraction unit 105 to the extraction order reading unit 103. If the extraction order is equal to or greater than the total number of extraction orders (step S106), it outputs the text input from the specific expression extraction unit 105 as the result text, that is, the specific expression extraction result.
- For example, when the extraction end determination unit 106 receives the extraction order total value 3, the extraction order value 1, and the text “<small title>Matsugami Electric Founding</small title>(1) -Birth-”, the extraction order value 1 is smaller than the extraction order total value 3. It therefore adds 1 to the extraction order value 1 to make the value 2, and outputs the extraction order value 2 and the text “<small title>Matsugami Electric Founding</small title>(1) -Birth-” to the extraction order reading unit 103.
- The extraction order reading unit 103 and the specific expression extraction unit 105 then perform the same processing as described above for the following extraction orders.
- When the extraction end determination unit 106 receives from the specific expression extraction unit 105 the extraction order total value 3, the extraction order value 3, and the text “<large title><medium title><small title>Matsugami Electric Founding</small title>(1)</medium title> -Birth-</large title>”, the extraction order is equal to the total number of extraction orders, so the input text is output as the result text.
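- The loop of steps S102 to S107 can be sketched as follows. This is only an illustration: the regular expressions, the dictionary names, and the function `extract_all` are assumptions made for this sketch and do not come from the described apparatus, which does not specify how a specific expression pattern is represented.

```python
import re

# Hypothetical contents of the extraction order storage unit 102.
EXTRACTION_ORDER_TOTAL = 3
PATTERN_NAME_BY_ORDER = {
    1: "specific expression A pattern",
    2: "specific expression B pattern",
    3: "specific expression C pattern",
}

# Hypothetical contents of the specific expression pattern storage unit 104:
# pattern name -> (regular expression, tag written around the match).
PATTERN_STORE = {
    "specific expression A pattern": (re.compile(r"^(.*?)(?=\s*\(\d+\))"), "small title"),
    "specific expression B pattern": (re.compile(r"^(.*?\(\d+\))"), "medium title"),
    "specific expression C pattern": (re.compile(r"^(.+)$"), "large title"),
}


def apply_pattern(text: str, pattern_name: str) -> str:
    """Steps S104 and S105: read the pattern and tag the first match."""
    regex, tag = PATTERN_STORE[pattern_name]
    return regex.sub(lambda m: f"<{tag}>{m.group(1)}</{tag}>", text, count=1)


def extract_all(text: str) -> str:
    """Repeat extraction until the extraction order reaches the total."""
    order = 1  # step S102: initial value of the extraction order
    while True:
        name = PATTERN_NAME_BY_ORDER[order]   # step S103
        text = apply_pattern(text, name)      # steps S104, S105
        if order < EXTRACTION_ORDER_TOTAL:    # step S106
            order += 1                        # step S107
        else:
            return text                       # result text
```

Calling `extract_all("Matsugami Electric Founding (1) -Birth-")` nests the small, medium, and large title tags in the same way as the third-pass result described above.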
- In the above embodiment, the extraction order storage unit 102 stores the total number of extraction orders, the extraction orders, and the specific expression pattern name associated with each extraction order.
- Alternatively, sets of the extraction order total number, the extraction orders, and the specific expression pattern names may be stored in association with a user identifier for identifying the user. In that case, the extraction order reading unit 103 receives the text, the extraction order, and the user identifier from the input unit 101, and reads from the extraction order storage unit 102 the total number of extraction orders and the specific expression pattern name corresponding to the input extraction order within the set that corresponds to the input user identifier. It then outputs the input text, the extraction order, the user identifier, the extraction order total number read from the extraction order storage unit 102, and the specific expression pattern name.
- the input unit 101 is an example of a user identification unit.
- the specific expression extraction unit 105 and the extraction end determination unit 106 output the user identifier output from the extraction order reading unit 103 as it is in addition to the operation in the above embodiment.
- For example, assume that the extraction order storage unit 102 stores, as sets of (user identifier, total number of extraction orders, (extraction order, specific expression pattern name)), the entries (01, 3, (1, specific expression A pattern), (2, specific expression B pattern), (3, specific expression C pattern), ...), (02, 2, (1, specific expression I pattern), (2, specific expression J pattern), (3, specific expression K pattern), ...), ....
- the contents of the extraction order storage unit 102 in this case are as shown in FIG.
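- When the extraction order reading unit 103 receives the user identifier “01”, the text “Matsugami Electric Founding (1) -Birth-”, and the extraction order value 1 from the input unit 101, it reads the entry for the input user identifier from the extraction order storage unit 102 and outputs the user identifier, the text, the extraction order value 1, the extraction order total value 3, and the specific expression pattern name “specific expression A pattern” to the specific expression extraction unit 105.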
- the subsequent unique expression extraction unit 105 and the extraction end determination unit 106 further output the user identifier “01” in addition to the operation of the above embodiment. In this way, the extraction order of specific expressions and the specific expressions to be extracted can be changed for each user, and specific expressions adapted to the user can be extracted.
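- A per-user lookup of this kind might look like the sketch below. The class `OrderSet`, the dictionary `STORE`, and the two functions are hypothetical names chosen for illustration; the name of the second pattern for user “02” is assumed to be “specific expression J pattern”.

```python
from typing import Dict, List, NamedTuple, Tuple


class OrderSet(NamedTuple):
    total: int            # total number of extraction orders
    patterns: List[str]   # patterns[0] belongs to extraction order 1


# Hypothetical contents of the extraction order storage unit 102,
# keyed by user identifier.
STORE: Dict[str, OrderSet] = {
    "01": OrderSet(3, ["specific expression A pattern",
                       "specific expression B pattern",
                       "specific expression C pattern"]),
    "02": OrderSet(2, ["specific expression I pattern",
                       "specific expression J pattern",   # assumed name
                       "specific expression K pattern"]),
}


def read_order(user_id: str, order: int) -> Tuple[int, str]:
    """Return (total number, pattern name) for one user and extraction order."""
    entry = STORE[user_id]
    return entry.total, entry.patterns[order - 1]


def patterns_used(user_id: str) -> List[str]:
    """Patterns actually applied: those with extraction order <= total."""
    entry = STORE[user_id]
    return entry.patterns[:entry.total]
```

With this table, user “01” runs three patterns while user “02” stops after two, even though a third pattern is stored for that user.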
- The extraction order storage unit 102 may also hold a plurality of sets, each associating the total number of extraction orders, the extraction orders, and the specific expression pattern names, with the types of specific expressions to be extracted defined for each set and a corresponding set ID assigned to each set for management.
- In that case, the extraction order reading unit 103 may read from the extraction order storage unit 102 the total number of extraction orders and the specific expression pattern name corresponding to the input extraction order within the set that corresponds to the input set ID, and output the input text, the extraction order, the set ID, the extraction order total number read from the extraction order storage unit 102, and the specific expression pattern name.
- Here the set ID functions as information indicating the extraction condition. Through the reading operation described above, the extraction order reading unit 103 sets the specific expression patterns stored in order in the extraction order storage unit 102 for the set ID as the one or more specific expression patterns used for extraction and their order of use.
- the unique expression extraction unit 105 and the extraction end determination unit 106 output the set ID output from the extraction order reading unit 103 as it is in addition to the operation in the above embodiment.
- For example, assume that the extraction order storage unit 102 stores, as sets of (set ID, total number of extraction orders, (extraction order, specific expression pattern name)), the entries (01, 3, (1, specific expression A pattern), (2, specific expression B pattern), (3, specific expression C pattern), ...), (02, 2, (1, specific expression I pattern), (2, specific expression J pattern), ...).
- The contents of the extraction order storage unit 102 are as shown in Fig. 10.
- When the set ID “01” is input from the input unit 101, the extraction order reading unit 103 reads from the set (01, 3, (1, specific expression A pattern), (2, specific expression B pattern), (3, specific expression C pattern), ...).
- In the above embodiment, the user can extract the text related to the program name from the input text by specifying the set ID “01”.
- As another example, if the specific expression I pattern is a rule for extracting the surname of a person name and the specific expression K pattern is a rule for extracting the first name and last name of a person name, the user can extract text related to a person's name from the input text by specifying the set ID “02”. That is, the user can specify the specific expression to be extracted.
- In the above description, the set ID is an identifier corresponding to the type of specific expression to be extracted, but the set ID may instead be a terminal identifier for identifying the terminal that displays the extracted specific expression. Further, by allowing the input unit 101 to obtain the terminal identifier of the terminal that displays the specific expression, a specific expression suited to that terminal can be extracted.
- the input unit 101 is an example of a terminal identifier acquisition unit.
- For example, the specific expression for a program name is useful on a television but unnecessary on a CD player. Even in such a case, the specific expression to be extracted can be set for each display terminal, so redundant information need not be displayed on the display terminal.
- The named entity extraction apparatus may also be configured as a device that uses the text input by the user from the input unit 101 as a search keyword, searches an information database 306, and performs specific expression extraction on the retrieved text. The information database 306 stores information related to TV broadcast programs, content stored in a hard disk recorder or the like, or text information related to content existing on the Internet.
- In this configuration, the extraction order storage unit 102 includes a usage pattern database 301 and an extraction order database 302 that stores the number of texts and the total number of extraction orders in association with each other.
- The extraction order reading unit 103 further includes a text search unit 303, an order total number acquisition unit 304, and a usage pattern acquisition unit 305.
- The text search unit 303 receives the text and the initial value of the extraction order from the input unit 101, acquires from the information database 306 the texts that include a part of the input text, and outputs the search result text and the extraction order to the order total number acquisition unit 304.
- The order total number acquisition unit 304 receives the search result text and the extraction order from the text search unit 303, obtains the total number of extraction orders corresponding to the number of texts in the input search result text from the extraction order database 302 in the extraction order storage unit 102, and outputs the obtained total number of extraction orders, the search result text, and the extraction order to the usage pattern acquisition unit 305.
- The usage pattern acquisition unit 305 may acquire the specific expression pattern name corresponding to the input extraction order from the usage pattern database 301 of the extraction order storage unit 102, and output the acquired specific expression pattern name, the search result text, the total number of extraction orders, and the extraction order to the specific expression extraction unit 105.
- FIG. 11 shows an example of the contents of the usage pattern database 301.
- As sets of (extraction order, specific expression pattern name), ((1, specific expression A pattern), (2, specific expression B pattern), (3, specific expression C pattern), ...) are stored.
- Fig. 14 shows an example of the contents of the extraction order database 302. As sets of (number of texts, total number of extraction orders), ((1 or less, 1), (2 or more and 5 or less, 2), (6 or more, 3)) are stored.
- For example, the text search unit 303 retrieves the texts “Matsugami Electric Founding (1) -Birth-” and “Matsugami Electric Founding (2) -Development-” from the information database 306 (step S202), and outputs these search result texts and the extraction order value 1 to the order total number acquisition unit 304.
- The order total number acquisition unit 304 receives the search result texts “Matsugami Electric Founding (1) -Birth-” and “Matsugami Electric Founding (2) -Development-” and the extraction order value 1 from the text search unit 303, refers to the extraction order database 302 of the extraction order storage unit 102, and obtains the extraction order total number 2 corresponding to the text count 2 of the input search result texts (step S203). It then outputs the extraction order total number 2, the search result texts, and the extraction order value 1 to the usage pattern acquisition unit 305.
- The usage pattern acquisition unit 305 acquires the specific expression pattern name “specific expression A pattern” corresponding to the input extraction order value 1 from the usage pattern database 301 of the extraction order storage unit 102 (step S204), and outputs the specific expression pattern name “specific expression A pattern”, the search result texts, the extraction order total number 2, and the extraction order value 1 to the specific expression extraction unit 105.
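- The lookup from the number of search result texts to the total number of extraction orders can be sketched as below, using the example values given for the extraction order database 302 and the usage pattern database 301. The table layout and the function names `total_for` and `plan` are assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout of the extraction order database 302:
# (minimum number of texts, maximum number of texts or None, total number).
EXTRACTION_ORDER_DB: List[Tuple[int, Optional[int], int]] = [
    (0, 1, 1),     # 1 or less        -> total 1
    (2, 5, 2),     # 2 or more, 5 or less -> total 2
    (6, None, 3),  # 6 or more        -> total 3
]

# Hypothetical layout of the usage pattern database 301.
USAGE_PATTERN_DB: Dict[int, str] = {
    1: "specific expression A pattern",
    2: "specific expression B pattern",
    3: "specific expression C pattern",
}


def total_for(n_texts: int) -> int:
    """Step S203: total number of extraction orders for a text count."""
    for low, high, total in EXTRACTION_ORDER_DB:
        if n_texts >= low and (high is None or n_texts <= high):
            return total
    raise ValueError(f"no entry for {n_texts} texts")


def plan(search_results: List[str]) -> List[str]:
    """Pattern names to apply, in order, for a list of search result texts."""
    total = total_for(len(search_results))
    return [USAGE_PATTERN_DB[order] for order in range(1, total + 1)]
```

For the two search result texts in the example, `plan` returns the A pattern followed by the B pattern, matching the extraction order total number 2.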
- In the above description, the text “Matsugami Electric Founding” to be searched for is input from the input unit 101. However, when the information stored in the information database 306 includes electronic program guide information or music information, a genre or the like may be input from the input unit 101, and the text search unit 303 may retrieve the titles corresponding to the input genre from the information database 306 and use them as the search result text.
- FIG. 15 to FIG. 15 show the usage pattern database and the extraction order database that are associated with a common set ID.
- In this case, the set ID is used as an extraction condition together with the number of input texts, and the set ID is further input from the input unit 101.
- The extraction order reading unit 103 obtains the total number of extraction orders and the specific expression pattern name by referring to the usage pattern database and the extraction order database corresponding to the set ID input from the input unit 101.
- In this way, the extraction order reading unit 103 sets the specific expression patterns stored in order in the extraction order storage unit 102 for the set ID as the one or more specific expression patterns used for extraction and their order of use.
- In the above description the set ID is input from the input unit 101, but the set ID may instead be stored in the extraction order database 302 in association with the number of texts. In that case, the extraction order reading unit 103 may acquire the total number of extraction orders and the set ID corresponding to the number of search result texts from the extraction order database 302, and acquire the specific expression pattern name by referring to the usage pattern database corresponding to that set ID.
- In this way, the extraction order reading unit 103 sets the one or more specific expression patterns used for extraction and their order of use from the specific expression patterns stored in the extraction order storage unit 102, according to the number of search result texts.
- An example of the extraction order database 302 in this case is shown in FIG.
- the extraction order reading unit 103 sets the specific expression pattern and the usage order used for extraction based on the number of search result texts searched by the text search unit 303.
- Further, from each retrieved search result text, the text for the number of characters that can be displayed on the display unit may be extracted, and the search result texts whose extracted texts are similar to one another may be treated as a similar text group.
- The order total number acquisition unit 304 can then set the usage order of the specific expression patterns used for extraction based on the number of texts that are similar when displayed.
- FIG. 26 is a configuration diagram of the named entity extraction apparatus according to such a modification. Compared with the specific expression extraction apparatus shown in FIG. 11, this specific expression extraction apparatus includes a similar text acquisition unit 308 and a display unit 309.
- For example, assume that the genre “documentary” is input from the input unit 101 to the text search unit 303, that the texts “Documentary: History of Matsugami Denki”, “Human document Matsushita's footprint (1)”, and “Human document Matsushita's footprint (2)” are retrieved from the information database 306, and that the number of characters that can be displayed per specific expression on the display unit 309 is eight.
- The similar text acquisition unit 308 extracts the first eight characters of each retrieved text, namely “Documentary”, “Human document”, and “Human document”, performs similarity determination, and treats the identical texts as a similar text group.
- The texts “Human document Matsushita's footprint (1)” and “Human document Matsushita's footprint (2)”, which correspond to “Human document” and were determined to belong to the same text group, are output to the order total number acquisition unit 304.
- The text “Documentary: History of Matsugami Denki”, which corresponds to “Documentary” and was determined not to be similar, is output to the display unit 309 as the result text.
- The order total number acquisition unit 304 refers to the extraction order database 302 in FIG. 25 and acquires the extraction order total number 2 and the set ID value 02.
- The usage pattern acquisition unit 305 refers to the usage pattern database 301 and obtains the specific expression I pattern when the extraction order is 1 and the specific expression J pattern when the extraction order is 2.
- When the specific expression I pattern is used, the part corresponding to the subtitle is extracted from the program name text as a small title. When the specific expression J pattern is used, the part that combines the subtitle and its serial number is extracted from the program name text as a medium title.
- In the above description, the similar text acquisition unit 308 regards only identical texts as similar texts, but texts may instead be determined to be similar when they match at a specific ratio or more. For example, if the number of displayed characters is 10 and the specific ratio is 80%, two texts are determined to be similar when character strings of 8 or more characters are the same.
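- The ratio-based similarity test can be sketched as follows. The description does not say which characters must match, so this sketch assumes that the matching characters are counted from the start of the displayed portion; the function names and default values are likewise illustrative.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that two strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def is_similar(a: str, b: str, display_chars: int = 10, percent: int = 80) -> bool:
    """Compare only the displayable heads of two texts.

    Assumption: "similar" means the heads share a common prefix of at
    least percent% of display_chars characters (rounded up).
    """
    head_a, head_b = a[:display_chars], b[:display_chars]
    needed = (display_chars * percent + 99) // 100  # integer ceiling
    return common_prefix_len(head_a, head_b) >= needed
```

With 10 displayed characters and a ratio of 80%, `needed` evaluates to 8, so two texts whose first 8 characters agree are grouped together, as in the example above.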
- Search result texts that the similar text acquisition unit 308 determines not to belong to a similar text group are displayed as they are on the display unit 309, while for the similar text group the specific expressions that the user needs in order to identify each text are extracted.
- Thus, when search result texts are displayed on the display unit, the specific expressions necessary for the user to identify the texts can be extracted in consideration of the number of characters that can be displayed on the display unit.
- The extraction condition need not be specified only by the user: when a text attribute is attached to the text, the extraction order reading unit 103 may use that text attribute.
- In that case, the input unit 101 functions as an attribute acquisition unit that acquires the text attribute assigned to the text.
- The specific expression pattern name and the extraction order total number corresponding to the text attribute acquired by the input unit 101 may then be read from the extraction order storage unit 102.
- the input unit 101 in this case is an example of an attribute acquisition unit, and the contents of the extraction order storage unit 102 are as shown in FIG.
- This text attribute may indicate a classification such as “IT document” or “TV program information”, or a category of a TV program such as “drama”, “news program”, or “variety”. Since the category of a television program is included in the program information constituting the electronic program guide, the input unit 101 can acquire the category from that program information.
- The text attribute may also be estimated by calculating the distance between a word vector generated from the words included in the text and a word vector expressing each text attribute, after which the specific expression pattern name and the extraction order total number corresponding to the estimated text attribute are read from the extraction order storage unit 102. In this way, the performance of specific expression extraction can be improved, and the user need not specify the text attribute for the text targeted by the extraction.
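- One possible way to estimate a text attribute from word vectors is sketched below. The description does not name a distance measure, so cosine distance over simple word counts is used here as an assumption; the attribute vectors and their words are invented for illustration.

```python
import math
from collections import Counter
from typing import Dict

# Hypothetical word vectors expressing each text attribute.
ATTRIBUTE_VECTORS: Dict[str, Counter] = {
    "TV program information": Counter({"episode": 2, "broadcast": 2, "drama": 1}),
    "IT document": Counter({"server": 2, "network": 2, "software": 1}),
}


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity; 1.0 if either vector is empty."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def estimate_attribute(text: str) -> str:
    """Return the attribute whose word vector is closest to the text."""
    words = Counter(text.lower().split())
    return min(ATTRIBUTE_VECTORS,
               key=lambda name: cosine_distance(words, ATTRIBUTE_VECTORS[name]))
```

The estimated attribute would then serve as the key for reading the specific expression pattern names and the extraction order total number.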
- Instead of the text attribute, the name of the terminal that displays the result text, that is, the text from which specific expressions have been extracted, or a terminal identifier that can identify that terminal may be used.
- the contents of the extraction order storage unit 102 in this case are as shown in FIG. By doing this, it is possible to set a specific expression to be extracted for each terminal that displays the result text.
- The named entity extraction apparatus may also be configured such that the extraction order change unit 204 included in the extraction end determination unit 106 changes the extraction order according to the extraction condition. In this configuration, the extraction order reading unit 103 further outputs an extraction end flag: the value 1 if the specific expression pattern name corresponding to the extraction order input from the extraction end determination unit 106 cannot be read, and the value 0 if it can be read. The specific expression extraction unit 105 additionally outputs an extraction flag, whose value is 1 when a specific expression was extracted using the specific expression pattern read from the specific expression pattern storage unit 104 and 0 when it was not, together with the extraction end flag.
- The extraction end determination unit 106 includes a determination unit 201, an extraction number storage unit 202, an extraction number update unit 203, and an extraction order change unit 204.
- If the extraction end flag has the value 0, the determination unit 201 adds 1 to the value of the extraction order and outputs the incremented extraction order and the text input from the specific expression extraction unit 105 to the extraction order reading unit 103. If, at this time, the extraction order is equal to the total number of extraction orders, the text input from the specific expression extraction unit 105 is output as the result text of the specific expression extraction. If the extraction end flag has the value 1, the extraction end flag value 1 is output to the extraction order change unit 204.
- the extraction number storage unit 202 stores an extraction order that is the order in which the unique expressions are extracted, and an extraction number that is the number of times that the specific expressions are extracted in this extraction order.
- When the total of the extraction counts in the extraction number storage unit 202 is equal to or greater than a certain value, the extraction order changing unit 204 may change the extraction order of the extraction order storage unit 102 based on the number of extractions corresponding to each extraction order stored in the extraction number storage unit 202.
- the extraction number updating unit 203 and the extraction number storage unit 202 are an example of an extraction number counting unit that counts the number of times that a unique expression has been extracted in the past using individual unique expression patterns.
- For example, when the determination unit 201 receives from the specific expression extraction unit 105 the extraction order total value 3, the extraction order value 1, the text “<small title>Matsugami Electric Founding</small title>(1) -Birth-”, and the extraction end flag value 0 (step S301), the extraction end flag is 0 (step S309), so it adds 1 to the extraction order value 1 to make the value 2 (step S303) and outputs the extraction order value 2 and the text “<small title>Matsugami Electric Founding</small title>(1) -Birth-” to the extraction order reading unit 103 (step S304).
- When the determination unit 201 receives from the specific expression extraction unit 105 the extraction order total value 3, the extraction order value 3, the text “<large title><medium title><small title>Matsugami Electric Founding</small title>(1)</medium title> -Birth-</large title>”, and the extraction end flag value 0, the extraction order value 3 is equal to the extraction order total value 3 (step S302), so the input text is output as the result text (step S310).
- Since the extraction end flag is 0 (step S309), 1 is then added to the extraction order value 3 to make the value 4 (step S303).
- If the extraction order reading unit 103 receives the extraction order value 5 from the extraction end determination unit 106 and cannot read a specific expression pattern name corresponding to the extraction order value 5, it outputs the extraction end flag value 1, the extraction order total value 3, the extraction order value 5, and the text “<large title><medium title><small title>Matsugami Electric Founding</small title>(1)</medium title> -Birth-</large title>” to the specific expression extraction unit 105.
- When the determination unit 201 receives from the specific expression extraction unit 105 the extraction order total value 3, the extraction order value 5, the text “<large title><medium title><small title>Matsugami Electric Founding</small title>(1)</medium title> -Birth-</large title>”, and the extraction end flag value 1 (step S301), the extraction end flag is 1 (step S309), so it outputs the extraction end flag value 1 to the extraction order changing unit 204 (step S311).
- FIG. 21 shows an example of the extraction order stored in the extraction number storage unit 202 and the number of extractions corresponding to the extraction order.
- As sets of (extraction order, number of extractions), ((1, 9), (2, 6), (3, 3), (4, 1)) are stored.
- it means that small titles were extracted 9 times, medium titles 6 times, large titles 3 times, and all titles 1 time.
- When the extraction number update unit 203 receives the text, the extraction flag value 1, the extraction order total value 3, the extraction order value 1, and the extraction end flag value 0 from the specific expression extraction unit 105 (step S301), the input extraction flag is 1 (step S305), so it adds 1 to the value 9 of the number of extractions stored in the extraction number storage unit 202 for the extraction order value 1, making the value 10 (step S306). In the subsequent processing, specific expressions are likewise extracted for the medium title and the large title, so the extraction counts 6 and 3 stored in the extraction number storage unit 202 for the extraction order values 2 and 3 are updated to 7 and 4, respectively.
- FIG. 22 shows the contents of the extraction number storage unit 202 after being updated by the extraction number updating unit 203.
- When the total of the extraction counts in the extraction number storage unit 202 is equal to or greater than a specific value (for example, the value 20), the extraction order changing unit 204 sets, as the total number of extraction orders, the extraction order value 2, which corresponds to a number of extractions stored in the extraction number storage unit 202 that is equal to or greater than a specific value (for example, the value 5) (step S308).
- FIG. 23 shows the contents of the extraction order storage unit 102 after being changed by the extraction order changing unit 204.
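- The count update and the change of the total number of extraction orders can be sketched as below, using the example values (9, 6, 3, 1) and the thresholds 20 and 5. How the new total is chosen when several extraction orders qualify is not stated explicitly; this sketch assumes that the largest extraction order whose count reaches the threshold is used, which reproduces the value 2 in the example. All names are hypothetical.

```python
from typing import Dict, Iterable, Optional


def update_counts(counts: Dict[int, int], extracted_orders: Iterable[int]) -> None:
    """Steps S305 and S306: add 1 for each extraction order that produced a match."""
    for order in extracted_orders:
        counts[order] = counts.get(order, 0) + 1


def new_total(counts: Dict[int, int],
              min_sum: int = 20,
              min_count: int = 5) -> Optional[int]:
    """Step S308: derive a new total number of extraction orders, if any.

    Assumption: the new total is the largest extraction order whose
    extraction count is at least min_count.
    """
    if sum(counts.values()) < min_sum:
        return None  # not enough history yet; keep the current total
    qualifying = [order for order, n in counts.items() if n >= min_count]
    return max(qualifying) if qualifying else None
```

Starting from `{1: 9, 2: 6, 3: 3, 4: 1}` and updating orders 1 to 3 gives `{1: 10, 2: 7, 3: 4, 4: 1}`, whose sum is 22, so `new_total` yields 2.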
- In this way, the total number of extraction orders is changed using the history of specific expressions extracted from the user's input text, and specific expressions can be extracted from the search result text retrieved from the information database using the changed total number of extraction orders. As a result, the specific expressions extracted from the search result text can be matched to the same form as the specific expressions that the user inputs most frequently, or only the specific expressions that the user needs in order to identify the text can be extracted automatically.
- The extraction order storage unit 102 may hold a plurality of sets, each associating the extraction order total number, the extraction orders, and the specific expression pattern names.
- In that case, the extraction number storage unit 202 can manage the extraction order and the number of extractions as a set for each user identifier or set ID.
- Figure 24 shows the stored contents when the extraction order and the number of extractions are managed as a set for each user identifier.
- the extraction end determination unit 106 determines whether to continue the specific expression extraction process based on the total number of extraction orders and the extraction order, but may determine based on the number of characters of the extracted specific expressions.
- In this case, the specific expression extraction unit 105 outputs the number of characters of the extracted specific expression to the extraction end determination unit 106 in addition to its operation in the above embodiment.
- The extraction end determination unit 106 receives the extraction order, the number of characters of the extracted specific expression, and the text from the specific expression extraction unit 105. If the number of characters of the specific expression is smaller than a specific number of characters, it adds 1 to the value of the extraction order and outputs the incremented extraction order and the text input from the specific expression extraction unit 105 to the extraction order reading unit 103. If the number of characters of the specific expression is equal to or greater than the specific number of characters, it outputs the text input from the specific expression extraction unit 105 as the result text of the specific expression extraction.
- Here the extraction end determination unit 106 is an example of a unit that terminates the extraction that would otherwise be performed using the subsequent specific expression patterns.
- For example, when the extraction end determination unit 106 receives from the specific expression extraction unit 105 the extraction order value 1, the text “<small title>Matsugami Electric Founding</small title>(1) -Birth-”, and the character count 7 of the extracted specific expression “Matsugami Electric Founding”, the character count 7 is smaller than the specific number of characters (8 in this example). The extraction order value is therefore set to 2, and the extraction order value 2 and the text “<small title>Matsugami Electric Founding</small title>(1) -Birth-” are output to the extraction order reading unit 103.
- When the extraction end determination unit 106 then receives from the specific expression extraction unit 105 the extraction order value 2, the text “<medium title><small title>Matsugami Electric Founding</small title>(1)</medium title> -Birth-”, and the character count 9 of the extracted specific expression “Matsugami Electric Founding (1)”, the character count 9 is equal to or greater than the specific number of characters 8, so the text “<medium title><small title>Matsugami Electric Founding</small title>(1)</medium title> -Birth-” is output as the result text.
- By setting the number of characters that can be displayed as the threshold number of characters of the extraction end determination unit 106, specific expressions that cannot be displayed are not extracted, which reduces the processing amount of specific expression extraction.
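- Termination based on the character count can be sketched as follows. Each extractor stands in for one pass of the specific expression extraction unit 105 and returns the tagged text together with the character count of the expression it extracted; the type alias, the function name, and the behavior when the patterns run out before the threshold is reached are assumptions of this sketch.

```python
from typing import Callable, List, Tuple

# One extraction pass: text in, (tagged text, character count of the
# extracted specific expression) out. Hypothetical interface.
Extractor = Callable[[str], Tuple[str, int]]


def extract_until_threshold(text: str,
                            extractors: List[Extractor],
                            threshold: int) -> Tuple[str, int]:
    """Apply extractors in order; stop once an expression is long enough.

    Returns the result text and the number of extractors actually used.
    If no expression reaches the threshold, all extractors are applied.
    """
    used = 0
    for extractor in extractors:
        text, n_chars = extractor(text)
        used += 1
        if n_chars >= threshold:
            break  # later patterns would yield expressions too long to display
    return text, used
```

With character counts of 7 and 9 for the first two passes and a threshold of 8, the loop stops after the second pass, so the third pattern is never applied.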
- the specific expression extraction apparatus of the above embodiment may further include a changing unit that allows the user to change the total number of extraction orders, the extraction orders, and the specific expression pattern names stored in the extraction order storage unit 102. In this way, the user can change which specific expressions are extracted.
- according to the present embodiment, setting the order in which specific expressions are extracted makes it possible to extract only the specific expressions needed by the user, the application, and the terminal.
- the number of specific expressions targeted for speech recognition can be reduced, so recognition accuracy can be improved.
- when the specific expressions extracted by the specific expression extraction device are stored as search target keywords in the search target database together with the search target data, the number of search target keywords can be reduced, so search accuracy can be improved.
- FIG. 27 is a configuration diagram showing the configuration of the named entity extraction apparatus according to the second embodiment of the present invention.
- the specific expression extraction apparatus of the present embodiment is an apparatus for extracting the minimum specific expression necessary for the user to identify the text when the search result text includes the same character string.
- the usage pattern database 401 and the extraction order database 402 are associated with each other by a common set ID. The usage pattern database 401 stores, for each set ID, the extraction orders and the specific expression pattern name corresponding to each extraction order, and the extraction order database 402 stores the total number of extraction orders for each set ID.
- FIG. 15 is an example of the usage pattern database 401
- FIG. 28 is an example of the contents of the extraction order database 402.
- the order total number acquisition unit 403 acquires, from the extraction order database 402, the smallest set ID, the total number of extraction orders corresponding to that set ID, and the maximum set ID value, and outputs the search result text, extraction order, set ID, total number of extraction orders, and maximum set ID value to the usage pattern acquisition unit 305.
- the extraction order is reset to 1, and 1 is added to the input set ID.
- the total number of extraction orders corresponding to the set ID after the addition is obtained from the extraction order database, and the search result text, extraction order, set ID, total number of extraction orders, and maximum set ID value are output to the usage pattern acquisition unit 305.
- the extraction end determination unit 404 adds 1 to the extraction order. When the incremented extraction order is larger than the total number of extraction orders, the set ID, the maximum set ID value, and the text are output to the specific expression determination unit 405, and, if the set ID is less than the maximum set ID value, the extraction order, the total number of extraction orders, the set ID, and the maximum set ID value are output to the order total number acquisition unit 403. When the incremented extraction order is less than or equal to the total number of extraction orders, the text, extraction order, set ID, total number of extraction orders, and maximum set ID value are output to the usage pattern acquisition unit 305.
- the specific expression determination unit 405 extracts the specific expressions from the plurality of texts that are input simultaneously and stores them in association with the set ID. If the set ID is equal to the maximum set ID value, it calculates the number of similar specific expressions among the specific expressions stored for each set ID, and outputs the specific expressions corresponding to the set ID with the smallest number of similar specific expressions as the result text.
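The interplay of set IDs and extraction orders described above can be sketched as a simple loop. This is an illustrative sketch only: the dictionary layouts stand in for the usage pattern database 401 and the extraction order database 402 (whose actual contents appear in figures not reproduced here), set IDs are assumed to be consecutive integers, and the function name `pattern_schedule` is hypothetical. The pattern names A, B, I, and J follow the operation example in this description.

```python
# Assumed stand-in for the usage pattern database 401:
# set ID -> {extraction order -> specific expression pattern name}
USAGE_PATTERNS = {
    1: {1: "specific expression A pattern", 2: "specific expression B pattern"},
    2: {1: "specific expression I pattern", 2: "specific expression J pattern"},
}

# Assumed stand-in for the extraction order database 402:
# set ID -> total number of extraction orders
ORDER_TOTALS = {1: 2, 2: 2}


def pattern_schedule(usage_patterns, order_totals):
    """Return the sequence of (set ID, extraction order, pattern name)
    in which the patterns would be applied."""
    set_id = min(order_totals)       # smallest set ID
    max_set_id = max(order_totals)   # maximum set ID value
    order = 1
    schedule = []
    while True:
        schedule.append((set_id, order, usage_patterns[set_id][order]))
        order += 1                   # extraction end determination: add 1
        if order <= order_totals[set_id]:
            continue                 # more patterns remain in this set
        if set_id >= max_set_id:
            break                    # every set has been processed
        set_id += 1                  # move to the next set ...
        order = 1                    # ... and reset the extraction order
    return schedule


for step in pattern_schedule(USAGE_PATTERNS, ORDER_TOTALS):
    print(step)
```

Each set is exhausted before the next one starts, and the extraction order is reset to 1 whenever the set ID is incremented, which mirrors the hand-off between units 403, 404, and 305.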
- FIG. 29 is a flowchart showing a flow of an operation example when extracting a specific expression.
- the information included in the information database 306 is electronic program information
- the text search unit 303 receives the genre "documentary” and the initial value 1 of the extraction order from the input unit 101 (step S401)
- the text search unit 303 retrieves from the information database 306 the texts of the program names corresponding to the genre “Documentary”, namely “Documentary (1) —Birth of Matsugami Electric—”, “Documentary (2) —Development of Matsugami Electric—”, “Human Document Taro Matsushita's Footprint (1)”, and “Human Document Taro Matsushita's Footprint (2)” (step S402), and outputs these search result texts and the extraction order 1 to the order total number acquisition unit 403.
- the order total number acquisition unit 403 receives the search result text “documentary” from the text search unit 303.
- the usage pattern acquisition unit 305 receives the maximum value of the set ID as shown in FIG.
- the specific expression pattern name “specific expression A pattern” corresponding to the input set ID value 1 and extraction order value 1 is acquired from the usage pattern database 401 of FIG. 15 (step S404).
- the specific expression extraction unit 105 uses the input specific expression pattern “specific expression A pattern” to extract the specific expressions “Documentary”, “Documentary”, “Human Document”, and “Human Document” from the input texts “Documentary (1) —Birth of Matsugami Electric—”, “Documentary (2) —Development of Matsugami Electric—”, “Human Document Taro Matsushita's Footprint (1)”, and “Human Document Taro Matsushita's Footprint (2)” (step S406), and outputs the texts from which the specific expressions have been extracted, “<small title>Documentary</small title>(1) —Birth of Matsugami Electric—”, “<small title>Documentary</small title>(2) —Development of Matsugami Electric—”, “<small title>Human Document</small title>Taro Matsushita's Footprint (1)”, and “<small title>Human Document</small title>Taro Matsushita's Footprint (2)”, the total number of extraction orders, the extraction
- the extraction end determination unit 404 adds 1 to the input extraction order value to make it 2 (step S407). Since the incremented extraction order value 2 is less than or equal to the total number of extraction orders, 2 (step S408), the input texts “<small title>Documentary</small title>(1) —Birth of Matsugami Electric—”, “<small title>Documentary</small title>(2) —Development of Matsugami Electric—”, “<small title>Human Document</small title>Taro Matsushita's Footprint (1)”, and “<small title>Human Document</small title>Taro Matsushita's Footprint (2)”, the total number of extraction orders 2, the extraction order value 2, the set ID value 1, and the maximum set ID value 2 are output to the usage pattern acquisition unit 305.
- the usage pattern acquisition unit 305 acquires the specific expression pattern name "specific expression B pattern", and the specific expression extraction unit 105 uses the specific expression pattern "specific expression B pattern”.
- the specific expression extraction unit 105 outputs the texts “<middle title><small title>Documentary</small title>(1)</middle title>—Birth of Matsugami Electric—”, “<middle title><small title>Documentary</small title>(2)</middle title>—Development of Matsugami Electric—”, “<small title>Human Document</small title>Taro Matsushita's Footprint (1)”, and “<small title>Human Document</small title>Taro Matsushita's Footprint (2)”, the total number of extraction orders 2, the extraction order value 2, the set ID value 1, and the maximum set ID value 2 to the extraction end determination unit 404.
- the extraction end determination unit 404 adds 1 to the input extraction order value to set the value to 3 (step S407), and the extracted extraction order value 3 is larger than the extraction order total number 2 (
- the set ID value 1, the maximum set ID value 2, and the texts “<middle title><small title>Documentary</small title>(1)</middle title>—Birth of Matsugami Electric—”, “<middle title><small title>Documentary</small title>(2)</middle title>—Development of Matsugami Electric—”, “<small title>Human Document</small title>Taro Matsushita's Footprint (1)”, and “<small title>Human Document</small title>Taro Matsushita's Footprint (2)” are output to the specific expression determination unit 405 (step S409), and since the set ID value 1 is less than the maximum set ID value 2 (step S410), the order total number
- the total order acquisition unit 403 extracts the extraction order value 3 and the set ID value from the extraction end determination unit 404.
- the usage pattern acquisition unit 305 acquires the specific expression pattern name “specific expression I pattern”, and the specific expression extraction unit 105 uses the specific expression pattern “specific expression I pattern”.
- the specific expression extraction unit 105 outputs the texts “Documentary (1) —<small title>Birth of Matsugami Electric</small title>—”, “Documentary (2) —<small title>Development of Matsugami Electric</small title>—”, “Human Document <small title>Taro Matsushita's Footprint</small title>(1)”, and “Human Document <small title>Taro Matsushita's Footprint</small title>(2)”, together with the total number of extraction orders 2, the extraction order value 2, the set ID value 1, and the maximum set ID value, to the extraction end determination unit 404.
- the usage pattern acquisition unit 305 acquires the specific expression pattern name “specific expression J pattern”, and the specific expression extraction unit 105 uses the specific expression pattern “specific expression J pattern”.
- the specific expression extraction unit 105 outputs the texts “Documentary (1) —<small title>Birth of Matsugami Electric</small title>—”, “Documentary (2) —<small title>Development of Matsugami Electric</small title>—”, “Human Document <middle title><small title>Taro Matsushita's Footprint</small title>(1)</middle title>”, and “Human Document <middle title><small title>Taro Matsushita's Footprint</small title>(2)</middle title>”, the total number of extraction orders 2, the extraction order value 2, the set ID value 2, and
- the extraction end determination unit 404 determines that the extraction order value 3 is larger than the total number of extraction orders (step S408).
- the maximum set ID value 2 and the texts “Documentary (1) —<small title>Birth of Matsugami Electric</small title>—”, “Documentary (2) —<small title>Development of Matsugami Electric</small title>—”, “Human Document <small title>Taro Matsushita's Footprint</small title>(1)”, and “Human Document <small title>Taro Matsushita's Footprint</small title>(2)” are output to the specific expression determination unit 405.
- when the specific expression determination unit 405 receives from the extraction end determination unit 404 the set ID value 1, the maximum set ID value 2, and the texts “<middle title><small title>Documentary</small title>(1)</middle title>—Birth of Matsugami Electric—”, “<middle title><small title>Documentary</small title>(2)</middle title>—Development of Matsugami Electric—”, “<small title>Human Document</small title>Taro Matsushita's Footprint (1)”, and “<small title>Human Document</small title>Taro Matsushita's Footprint (2)”, it extracts, in association with the set ID value 1, the specific expressions “Documentary (1)”, “Documentary (2)”, “Human Document”, and “Human Document” from the simultaneously input texts.
- for the set ID value 1, the specific expression determination unit 405 further sets the number of similar specific expressions to 2, because the specific expressions “Human Document” and “Human Document” are identical (here only identical texts are regarded as similar, but texts that match over at least a certain number of characters may also be regarded as similar). For the set ID value 2, all the specific expressions are different, so the number of similar specific expressions is 0.
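The counting and selection step above can be sketched in a few lines of Python. This is a sketch under stated assumptions rather than the patent's implementation: "similar" is taken to mean exactly identical, as in the example, the function names are hypothetical, and ties are broken by taking the smaller set ID, which the text does not specify. The set 2 expressions are those tagged by the I and J patterns in the walkthrough.

```python
from collections import Counter


def similar_count(expressions):
    """Number of expressions that have at least one identical twin."""
    counts = Counter(expressions)
    return sum(n for n in counts.values() if n > 1)


def choose_set(expressions_by_set):
    """Pick the set ID whose expressions contain the fewest similar ones.

    Ties go to the smaller set ID (an assumption).
    """
    best = min(
        expressions_by_set,
        key=lambda sid: (similar_count(expressions_by_set[sid]), sid),
    )
    return best, expressions_by_set[best]


EXPRESSIONS_BY_SET = {
    1: ["Documentary (1)", "Documentary (2)", "Human Document", "Human Document"],
    2: [
        "Birth of Matsugami Electric",
        "Development of Matsugami Electric",
        "Taro Matsushita's Footprint (1)",
        "Taro Matsushita's Footprint (2)",
    ],
}

best_set, result_text = choose_set(EXPRESSIONS_BY_SET)
```

For this data the counts are 2 for set ID 1 and 0 for set ID 2, so the expressions of set ID 2 would become the result text.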
- in the above, the specific expression determination unit 405 calculates the number of similar specific expressions using the extracted specific expressions as they are, but text is generally displayed on a display unit of finite size. If the number of characters that the display unit can show per specific expression is known, only that many characters may be taken from the beginning of each specific expression, and the number of similar specific expressions may be obtained from the truncated text, ...
- consider the case where the specific expression determination unit 405 receives, as (set ID, specific expression group) pairs, (1, (Birth of Matsushita Electric Industrial, Development of Matsushita Electric Industrial, Stock Price Transition of Matsushita Electric Industrial, Introduction of New Products of Matsushita Electric Industrial)) and (2, (Matsushita Documentary, Matsushita Documentary, Economic News, Introduction of Trendy Products)).
- This is an example of specific expressions extracted from the program names included in program information. These specific expressions are assumed to be displayed on the same display unit, which switches between a detailed display format and a display format with improved listability that use different numbers of characters per specific expression.
- the specific expression determination unit 405 extracts at most the first 12 characters of each specific expression, giving (1, (Birth of Matsushita Electric Industrial, Development of Matsushita Electric Industrial, Stock Price Transition of Matsushita Electric Industrial, Introduction of New Products of Matsushita Electric Industrial)), (2, (Matsushita Documentary, Matsushita Documentary, Economic News, Introduction of Trendy Products)). In this case, all characters of each specific expression are extracted. The numbers of similar specific expressions for the set ID values 1 and 2 are then obtained as 0 and 2, respectively.
- the specific expressions corresponding to the set ID value 1, which has the smallest number of similar specific expressions, namely “Birth of Matsushita Electric Industrial”, “Development of Matsushita Electric Industrial”, “Stock Price Transition of Matsushita Electric Industrial”, and “Introduction of New Products of Matsushita Electric Industrial”, are output to the display unit as the result text.
- Fig. 30 (A) is an example of a detailed display format.
- in this example, program information for three channels is displayed on one screen using specific expressions of up to 12 characters extracted from the program names. This format is suitable for users who want to view program information in more detail.
- the specific expression determination unit 405 extracts at most the first six characters of each specific expression, giving (1, (Matsushita Electric Industrial, Matsushita Electric Industrial, Matsushita Electric Industrial, Matsushita Electric Industrial)), (2, (Matsushita Docume, Matsushita Docume, Economic News, Introducing Trendy Products)). The numbers of similar specific expressions for the set ID values 1 and 2 are then 4 and 2, respectively.
- the specific expressions corresponding to the set ID value 2, which has the smallest number of similar specific expressions, namely “Matsushita Docume”, “Matsushita Docume”, “Economic News”, and “Introducing Trendy Products”, are output as the result text.
- Figure 30 (B) shows an example of a display format with improved listability. In this example, program information for six channels is displayed on one screen using specific expressions of up to 6 characters extracted from the program names. This format is suitable for users who want a broader overview of program information.
- for a specific expression that is originally 7 characters or longer, the sixth character may be replaced with a predetermined character (for example, “ ⁇ ”) to indicate that the following characters are omitted.
- by making the specific expressions of the set whose specific expressions differ from one another the most the final result, the number of specific expressions that the user can tell apart can be increased.
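The effect of the display width on which set is chosen can be reproduced with a short sketch. Because the character counts in the example refer to Japanese text, the strings below are hypothetical back-translations of the program-name expressions, chosen only so that the 12-character and 6-character cases behave as described; they are not quoted from the source. The omission marker "…" is likewise an assumption, and all function names are illustrative.

```python
from collections import Counter

# Hypothetical Japanese back-translations of the example expressions.
SETS = {
    1: ["松下電器産業の誕生", "松下電器産業の発展",
        "松下電器産業の株価推移", "松下電器産業の新製品紹介"],
    2: ["松下ドキュメンタリー", "松下ドキュメンタリー",
        "経済ニュース", "流行の商品紹介"],
}


def similar_count(expressions):
    """Number of expressions that have at least one identical twin."""
    counts = Counter(expressions)
    return sum(n for n in counts.values() if n > 1)


def choose_set_for_width(expressions_by_set, width):
    """Truncate every expression to `width` characters, then pick the set
    with the fewest similar expressions (ties: smaller set ID, an assumption)."""
    scores = {
        sid: similar_count([e[:width] for e in exprs])
        for sid, exprs in expressions_by_set.items()
    }
    best = min(scores, key=lambda sid: (scores[sid], sid))
    return best, scores


def truncate_with_marker(expression, width, marker="…"):
    """Replace the last visible character with a marker when text is cut."""
    if len(expression) <= width:
        return expression
    return expression[: width - 1] + marker


detailed = choose_set_for_width(SETS, 12)  # detailed display format
compact = choose_set_for_width(SETS, 6)    # display format with improved listability
```

At 12 characters nothing is cut and set ID 1 wins with 0 similar expressions; at 6 characters every expression in set ID 1 collapses to the same prefix, so set ID 2 wins with 2.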
- FIG. 31 is a configuration diagram showing a configuration of the named entity extraction device according to the third embodiment of the present invention.
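- the specific expression extraction apparatus of the present embodiment displays the specific expressions with duplicates deleted and, when one of the displayed specific expressions is specified, goes on to display the specific expressions in which the specified specific expression is nested.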
- an input unit 101, an extraction order storage unit 102, an extraction order reading unit 103, a specific expression pattern storage unit 104, and a specific expression extraction unit 105 are provided.
- the specific expression storage unit 501 stores the input text, the specific expression extracted from the input text, and the extraction order in association with each other.
- An example of the contents of the named entity storage unit 501 is shown in FIG.
- the extraction end determination unit 502 stores the text, the specific expressions extracted from the text, and the extraction order in association with one another in the specific expression storage unit 501. If the extraction order is smaller than the total number of extraction orders, it adds 1 to the extraction order and outputs the incremented extraction order and the text input from the specific expression extraction unit 105 to the extraction order reading unit 103.
- Display unit 506 displays a specific expression.
- the display condition acquisition unit 503 accepts as input the extraction order of the specific expressions displayed on the display unit 506 and the specific expression that the user specifies from among the plurality of displayed specific expressions.
- the specific expression acquisition unit 504 acquires, from the specific expression storage unit 501, the specific expressions corresponding to the input extraction order. When the extraction order and the specific expression specified by the user are input from the display condition acquisition unit 503, it searches the texts stored in the specific expression storage unit 501 for the texts corresponding to the input extraction order and specific expression, and acquires, from among the specific expressions of the retrieved texts, those corresponding to the extraction order following the input extraction order.
- Duplicate deletion unit 505 eliminates duplication of the unique expression acquired by specific expression acquisition unit 504 and displays it on display unit 506.
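The two lookups performed by the specific expression acquisition unit 504 can be sketched as follows. The flat list of (text, extraction order, specific expression) records is an assumed stand-in for the specific expression storage unit 501, whose actual layout is shown in a figure not reproduced here; the function names are hypothetical, and only records that appear in the operation example are included.

```python
# Assumed layout of the specific expression storage unit 501:
# one (text, extraction order, specific expression) record per extraction.
RECORDS = [
    ("Documentary (1) Birth of Matsugami Electric", 1, "Documentary"),
    ("Documentary (2) Development of Matsugami Electric", 1, "Documentary"),
    ("Human Document Taro Matsushita's Footprint (1)", 1, "Human Document"),
    ("Human Document Taro Matsushita's Footprint (2)", 1, "Human Document"),
    ("Documentary (1) Birth of Matsugami Electric", 2, "Documentary (1)"),
    ("Documentary (2) Development of Matsugami Electric", 2, "Documentary (2)"),
]


def expressions_at(records, order):
    """All specific expressions stored for the given extraction order."""
    return [expr for _, rec_order, expr in records if rec_order == order]


def drill_down(records, order, chosen):
    """Expressions at the next extraction order, limited to the texts whose
    expression at `order` is the one the user chose."""
    texts = {
        text
        for text, rec_order, expr in records
        if rec_order == order and expr == chosen
    }
    return [
        expr
        for text, rec_order, expr in records
        if rec_order == order + 1 and text in texts
    ]


first_level = expressions_at(RECORDS, 1)
second_level = drill_down(RECORDS, 1, "Documentary")
```

The first call returns the four order-1 expressions (duplicates included, since removing them is the job of the duplicate deletion unit 505), and the second returns the order-2 expressions of the two texts that matched "Documentary".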
- FIG. 33 is a flowchart showing a flow of an operation example when extracting and displaying a specific expression.
- the operation of extracting specific expressions from the input text using the input unit 101, the extraction order storage unit 102, the extraction order reading unit 103, the specific expression pattern storage unit 104, and the specific expression extraction unit 105 (step S101 to step S105) is the same as in Embodiment 1, so its description is omitted.
- when the extraction end determination unit 502 receives from the specific expression extraction unit 105 the total number of extraction orders 3, the extraction order 1, and the texts “<small title>Documentary</small title>(1) —Birth of Matsugami Electric—”, “<small title>Documentary</small title>(2) —Development of Matsugami Electric—”, “<small title>Human Document</small title>Taro Matsushita's Footprint (1)”, and “<small title>Human Document</small title>Taro Matsushita's Footprint (2)”, the extraction order 1, the texts “Documentary (1) —Birth of Matsugami Electric—”, “Documentary (2) —Development of Matsugami Electric—”, “Human Document Taro Matsushita's Footprint (1)”, and “Human Document Taro Matsushita's Footprint (2)”, and the specific expressions “Documentary”, “Documentary”, “Human Document”, and “Human Document”,
- the value is calculated to be 2 (step S107), the value 2 of the extraction order after addition and the text input from the named entity extraction unit 105
- the display condition acquisition unit 503 inputs the value 1, which is the initial value of the extraction order, to the specific expression acquisition unit 504 without accepting a designation from the user.
- the specific expression acquisition unit 504 acquires the specific expressions “Documentary”, “Documentary”, “Human Document”, and “Human Document” corresponding to the extraction order value 1 from the specific expression storage unit 501 (step S503).
- the duplicate deletion unit 505 removes duplicates from these specific expressions, and the specific expressions “Documentary” and “Human Document” are displayed on the display unit 506 (step S507).
- An example of the display contents displayed on the display unit 506 at this time is shown in FIG.
- the duplicate deletion unit 505 may simultaneously display the number of duplicates in each unique expression when displaying each unique expression. An example of the display contents displayed on the display unit 506 at this time is shown in FIG.
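Duplicate deletion with an optional duplicate count could look like the sketch below. It is only an illustration: the function names and the bracketed "[n]" display format are assumptions, since the figure showing the actual display is not reproduced here.

```python
from collections import Counter


def dedupe_with_counts(expressions):
    """Remove duplicates while keeping first-seen order, and report how many
    times each specific expression occurred."""
    counts = Counter(expressions)
    # dict.fromkeys keeps one key per distinct expression, in insertion order.
    return [(expr, counts[expr]) for expr in dict.fromkeys(expressions)]


def render(expressions):
    """Format each distinct expression with its duplicate count."""
    return [f"{expr} [{n}]" for expr, n in dedupe_with_counts(expressions)]


lines = render(["Documentary", "Documentary", "Human Document", "Human Document"])
```

Brackets are used for the count here so that it is not mistaken for the episode number that appears in titles such as "Documentary (2)".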
- Each unique expression displayed here is added with a user interface function for accepting a predetermined operation by the user, for example, a mouse click operation.
- when the display condition acquisition unit 503 accepts a predetermined operation by the user on one of the displayed specific expressions, it inputs the specific expression on which the operation was performed and the extraction order of that specific expression to the specific expression acquisition unit 504.
- the display condition acquisition unit 503 inputs the extraction order value 1 and the specific expression “documentary” specified by the user.
- the specific expression acquisition unit 504 searches the texts stored in the specific expression storage unit 501 for the texts “Documentary (1) —Birth of Matsugami Electric—” and “Documentary (2) —Development of Matsugami Electric—”, which correspond to the input extraction order value 1 and the specific expression “Documentary” (step S504).
- the specific expressions “documentary (1)” and “documentary (2)” corresponding to the extraction order value 2 next to the input extraction order are acquired (step S505).
- since the input text is displayed according to the nested structure of the extracted specific expressions, the input text can be displayed as a menu hierarchy.
- since the menu hierarchy is generated according to the nesting of the specific expressions, the user does not need to search for the target title in a list of titles containing duplicates and can find the desired title simply by navigating the menu hierarchy.
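One way to picture a menu hierarchy generated from nested specific expressions is a tree built from per-title paths. The sketch below is illustrative only: the nested-dictionary representation, the function names, and the exact path contents (outer expression first, full title last) are assumptions that follow the documentary example rather than anything specified in the text.

```python
def build_menu(paths):
    """Build a nested dict in which each level is one extraction order."""
    root = {}
    for path in paths:
        node = root
        for label in path:
            # Reuse an existing entry so duplicates collapse into one item.
            node = node.setdefault(label, {})
    return root


def render_menu(menu, depth=0):
    """Flatten the tree into indented lines, one per menu item."""
    lines = []
    for label, children in menu.items():
        lines.append("  " * depth + label)
        lines.extend(render_menu(children, depth + 1))
    return lines


# Hypothetical paths: innermost expression, enclosing expression, full title.
PATHS = [
    ["Documentary", "Documentary (1)", "Documentary (1) Birth of Matsugami Electric"],
    ["Documentary", "Documentary (2)", "Documentary (2) Development of Matsugami Electric"],
    ["Human Document", "Human Document Taro Matsushita's Footprint (1)"],
    ["Human Document", "Human Document Taro Matsushita's Footprint (2)"],
]

menu = build_menu(PATHS)
for line in render_menu(menu):
    print(line)
```

The top level then contains only "Documentary" and "Human Document", and each title is reached by descending through the expressions that contain it.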
- FIGS. 37 (A) and 37 (B) are examples of program names included in Chinese program information used as input text. Specific expressions are extracted from these program names in the same manner as described above and presented to the user. As a result, a specific expression extraction device is obtained that can extract from the input text specific expressions adapted to extraction conditions such as the user's input history and the display capability of the display terminal.
- when extracting specific expressions from a text, the present invention can adapt the extracted specific expressions to the user's application and to the terminal the user uses, and is useful for DVD recorders, TVs, audio components, terminals that can access the Internet, and information retrieval servers.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
- Television Signal Processing For Recording (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2007521081A JP4129048B2 (ja) | 2005-06-15 | 2005-12-26 | 固有表現抽出装置、方法、及びプログラム |
CN2005800496646A CN101167075B (zh) | 2005-06-15 | 2005-12-26 | 专有表现抽取装置、方法以及程序 |
US11/916,222 US7761437B2 (en) | 2005-06-15 | 2005-12-26 | Named entity extracting apparatus, method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2005-175678 | 2005-06-15 | ||
JP2005175678 | 2005-06-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006134682A1 (ja) | 2006-12-21 |
Family
ID=37532053
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2005/023768 WO2006134682A1 (ja) | 2005-06-15 | 2005-12-26 | 固有表現抽出装置、方法、及びプログラム |
Country Status (4)
Country | Link |
---|---|
US (1) | US7761437B2 (ja) |
JP (2) | JP4129048B2 (ja) |
CN (1) | CN101167075B (ja) |
WO (1) | WO2006134682A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1965312A3 (en) * | 2007-03-01 | 2010-02-10 | Sony Corporation | Information processing apparatus and method, program, and storage medium |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101075228B (zh) * | 2006-05-15 | 2012-05-23 | 松下电器产业株式会社 | 识别自然语言中的命名实体的方法和装置 |
US7917489B2 (en) * | 2007-03-14 | 2011-03-29 | Yahoo! Inc. | Implicit name searching |
EP2025523B1 (en) | 2007-07-26 | 2014-10-22 | Brother Kogyo Kabushiki Kaisha | Sheet processing apparatus |
JP2009094658A (ja) * | 2007-10-05 | 2009-04-30 | Hitachi Ltd | 関連情報提供装置、及び関連情報提供方法 |
US7987416B2 (en) * | 2007-11-14 | 2011-07-26 | Sap Ag | Systems and methods for modular information extraction |
US8185509B2 (en) * | 2008-10-15 | 2012-05-22 | Sap France | Association of semantic objects with linguistic entity categories |
US20100138402A1 (en) * | 2008-12-02 | 2010-06-03 | Chacha Search, Inc. | Method and system for improving utilization of human searchers |
JP4645731B2 (ja) * | 2008-12-10 | 2011-03-09 | コニカミノルタビジネステクノロジーズ株式会社 | 画像処理装置、画像データ管理方法、およびコンピュータプログラム |
JP2010149537A (ja) * | 2008-12-23 | 2010-07-08 | Autonetworks Technologies Ltd | 制御装置、制御方法及びコンピュータプログラム |
JP5540537B2 (ja) * | 2009-03-24 | 2014-07-02 | 株式会社オートネットワーク技術研究所 | 制御装置、制御方法及びコンピュータプログラム |
US8290968B2 (en) | 2010-06-28 | 2012-10-16 | International Business Machines Corporation | Hint services for feature/entity extraction and classification |
CN102737030A (zh) * | 2011-04-06 | 2012-10-17 | 上海量明科技发展有限公司 | 专利文档的数据输出方法、终端及系统 |
JP2016133861A (ja) * | 2015-01-16 | 2016-07-25 | 株式会社ぐるなび | 情報多言語変換システム |
US10776424B2 (en) * | 2016-07-29 | 2020-09-15 | Newswhip Media Limited | System and method for identifying and ranking trending named entities in digital content objects |
US10803057B1 (en) | 2019-08-23 | 2020-10-13 | Capital One Services, Llc | Utilizing regular expression embeddings for named entity recognition systems |
US11586812B2 (en) | 2019-10-31 | 2023-02-21 | International Business Machines Corporation | Unsupervised generation of rules for an adapter grammar |
US10904027B1 (en) | 2020-03-31 | 2021-01-26 | Amazon Technologies, Inc. | Usage-based device naming and grouping |
CN116737924B (zh) * | 2023-04-27 | 2024-06-25 | 百洋智能科技集团股份有限公司 | 一种医疗文本数据处理方法及装置 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001134600A (ja) * | 1999-11-08 | 2001-05-18 | Nec Corp | 情報抽出システム、情報抽出方法および情報抽出用プログラムを記録した記録媒体 |
JP2002334076A (ja) * | 2001-05-10 | 2002-11-22 | Communication Research Laboratory | テキスト処理方法 |
JP2004046775A (ja) * | 2002-05-15 | 2004-02-12 | Nippon Telegr & Teleph Corp <Ntt> | 固有表現抽出装置及び方法並びに固有表現抽出プログラム |
JP2004086534A (ja) * | 2002-08-27 | 2004-03-18 | Nippon Telegr & Teleph Corp <Ntt> | 時系列情報からの固有情報抽出方法および装置,並びに時系列情報からの固有情報抽出プログラムおよびそのプログラムを記録した記録媒体 |
JP2004312627A (ja) * | 2003-04-10 | 2004-11-04 | Matsushita Electric Ind Co Ltd | テレビジョン受像装置およびその番組情報検索方法 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0652221A (ja) | 1992-05-08 | 1994-02-25 | Fujitsu Ltd | 固有名詞の自動抽出方式 |
JPH10283355A (ja) | 1997-04-02 | 1998-10-23 | Nippon Telegr & Teleph Corp <Ntt> | 企業名解析方法及び装置 |
JP3575242B2 (ja) * | 1997-09-10 | 2004-10-13 | 日本電信電話株式会社 | キーワード抽出装置 |
JP2000099501A (ja) * | 1998-09-17 | 2000-04-07 | Internatl Business Mach Corp <Ibm> | 文書データへの情報の埋め込み方法およびシステム |
JP2001318792A (ja) * | 2000-05-10 | 2001-11-16 | Nippon Telegr & Teleph Corp <Ntt> | 固有表現抽出規則生成システムと方法およびその処理プログラムを記録した記録媒体ならびに固有表現抽出装置 |
US7490092B2 (en) * | 2000-07-06 | 2009-02-10 | Streamsage, Inc. | Method and system for indexing and searching timed media information based upon relevance intervals |
JP4106889B2 (ja) | 2001-09-25 | 2008-06-25 | 沖電気工業株式会社 | 情報検索システム |
US7315810B2 (en) | 2002-01-07 | 2008-01-01 | Microsoft Corporation | Named entity (NE) interface for multiple client application programs |
EP1485825A4 (en) * | 2002-02-04 | 2008-03-19 | Cataphora Inc | DETAILED EXPLORATION TECHNIQUE OF SOCIOLOGICAL DATA AND CORRESPONDING APPARATUS |
-
2005
- 2005-12-26 JP JP2007521081A patent/JP4129048B2/ja not_active Expired - Fee Related
- 2005-12-26 CN CN2005800496646A patent/CN101167075B/zh not_active Expired - Fee Related
- 2005-12-26 WO PCT/JP2005/023768 patent/WO2006134682A1/ja active Application Filing
- 2005-12-26 US US11/916,222 patent/US7761437B2/en active Active
-
2007
- 2007-12-10 JP JP2007318956A patent/JP4977589B2/ja not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
JP4129048B2 (ja) | 2008-07-30 |
JPWO2006134682A1 (ja) | 2009-01-08 |
US7761437B2 (en) | 2010-07-20 |
JP2008152774A (ja) | 2008-07-03 |
CN101167075A (zh) | 2008-04-23 |
JP4977589B2 (ja) | 2012-07-18 |
US20090119274A1 (en) | 2009-05-07 |
CN101167075B (zh) | 2010-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2006134682A1 (ja) | 固有表現抽出装置、方法、及びプログラム | |
US11048882B2 (en) | Automatic semantic rating and abstraction of literature | |
JP6505421B2 (ja) | 情報抽出支援装置、方法およびプログラム | |
JP2012027845A (ja) | 情報処理装置、関連文提供方法、及びプログラム | |
JP2011134334A (ja) | ショートテキスト通信のトピックを識別するためのシステムおよび方法 | |
JP2006251866A (ja) | 情報処理装置および方法、プログラム、並びに記録媒体 | |
CN101526938B (zh) | 文档处理装置 | |
KR101607468B1 (ko) | 콘텐츠에 대한 키워드 태깅 방법 및 시스템 | |
JP7395377B2 (ja) | コンテンツ検索方法、装置、機器、および記憶媒体 | |
CN110866408B (zh) | 数据库制作装置以及检索系统 | |
JP2002175330A (ja) | 情報検索装置,スコア決定装置,情報検索方法,スコア決定方法及びプログラム記録媒体 | |
US20130054578A1 (en) | Text search apparatus and text search method | |
JP4959603B2 (ja) | ドキュメントを解析するためのプログラム,装置および方法 | |
JP5224532B2 (ja) | 評判情報分類装置及びプログラム | |
US20200005169A1 (en) | System for predicting mood of user by using web content, and method therefor | |
JP2007279978A (ja) | 文書検索装置及び文書検索方法 | |
JP2012141681A (ja) | クエリセグメント位置決定装置 | |
JP2000259653A (ja) | 音声認識装置及び音声認識方法 | |
JP2005173999A (ja) | 電子ファイル検索装置、電子ファイル検索システム、電子ファイル検索方法、プログラムおよび記録媒体 | |
JP2007172179A (ja) | 意見抽出装置、意見抽出方法、および意見抽出プログラム | |
JP7180767B2 (ja) | 応答処理プログラム、応答処理方法および情報処理装置 | |
JP2004220226A (ja) | 検索文書のための文書分類方法及び装置 | |
JP4980604B2 (ja) | 文書検索装置、文書検索方法、文書検索プログラム及び記録媒体 | |
JP2016035688A (ja) | テキスト分析装置、テキスト分析方法、テキスト分析プログラムおよび記録媒体 | |
JP2008293070A (ja) | 文書解析システム、および文書解析方法、並びにコンピュータ・プログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | DPE2 | Request for preliminary examination filed before expiration of 19th month from priority date (PCT application filed from 20040101) | |
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
 | WWE | WIPO information: entry into national phase | Ref document number: 2007521081; Country of ref document: JP |
 | WWE | WIPO information: entry into national phase | Ref document number: 200580049664.6; Country of ref document: CN |
 | WWE | WIPO information: entry into national phase | Ref document number: 11916222; Country of ref document: US |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | WWW | WIPO information: withdrawn in national office | Country of ref document: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 05820180; Country of ref document: EP; Kind code of ref document: A1 |