WO2016132558A1 - Information processing device and method, and program - Google Patents

Information processing device and method, and program

Info

Publication number
WO2016132558A1
Authority
WO
Grant status
Application
Patent type
Prior art keywords
concept
data
processing
mail
information
Prior art date
Application number
PCT/JP2015/054890
Other languages
French (fr)
Japanese (ja)
Inventor
ヤコブ ハルスコウ
秀樹 武田
Original Assignee
株式会社Ubic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor

Abstract

[Problem] To provide an information processing device, method, and program which improve usability. [Solution] An information processing device creates a database which maps selected subject concepts with data elements which are subordinate concepts of the subject concepts; extracts, from among data, data including the data elements which are registered with the database, and creates a summary which expresses the content of the extracted data with superordinate concepts of the data elements; and, on the basis of the created summary, classifies the data which includes the data elements which are registered with the database, and displays the result of the classification.

Description

Information processing apparatus and method, and program

The present invention relates to an information processing apparatus, an information processing method, and a program, and is suitable for application to, for example, an information processing apparatus that monitors e-mail.

Conventionally, systems that notify the user upon detecting a change in the environment or a specific state have been studied extensively. For example, Patent Document 1 discloses an abnormality detection system that effectively detects abnormalities occurring in a control system and isolates the control system in which an abnormality is observed.

Patent Document 1: JP 2012-168755 A

In such a system, however, when the system does not detect a "change" or "specific state", the user cannot tell whether the system is functioning normally and no "change" or "specific state" has truly occurred, or whether the system is not functioning correctly and has failed to detect one.

Accordingly, if such a system, while detecting no "change" or "specific state", presented the user with, for example, an overall picture of the contents of the e-mail within a predetermined period, the user could easily recognize that the system is functioning normally and that no "change" or "specific state" has truly occurred. This is expected to improve the user's sense of security in the system and the system's reliability. The user could also grasp an overview of the contents of the e-mail within the predetermined period without looking through each e-mail, which is expected to improve the convenience of the system from the user's point of view.

In recent years, websites on the Internet such as sites that sell goods and sites that introduce restaurants increasingly carry user reviews of those goods and restaurants. Such reviews are useful to users who are considering purchasing the goods or visiting the restaurant, but looking through all of them takes considerable time and effort.

Accordingly, if such a website could provide the user with an overview of the reviews, the user would be spared the time and effort of looking through each review, which is expected to improve the convenience of the Internet system as a whole from the user's point of view.

The present invention has been made in view of the above, and aims to provide an information processing apparatus, method, and program that can improve convenience for the user by presenting an overview of data to the user.

To solve the above problem, the present invention provides an information processing apparatus comprising: a database creation unit that creates a database associating each selected target concept with data elements that are subordinate concepts of that target concept; a summary creation unit that extracts, from target data, data containing data elements registered in the database, and creates a summary expressing the content of the extracted data in superordinate concepts of those data elements; and a display unit that classifies, based on the summary, the data containing data elements registered in the database, and displays the classification result.

The information processing method of the present invention comprises: a first step in which the information processing apparatus creates a database associating each selected target concept with data elements that are subordinate concepts of that target concept; a second step in which the information processing apparatus extracts, from target data, data containing data elements registered in the database, and creates a summary expressing the content of the extracted data in superordinate concepts of those data elements; and a third step in which the information processing apparatus classifies, based on the summary, the data containing data elements registered in the database, and displays the classification result.

Further, the program of the present invention causes the information processing apparatus to execute processing comprising: a first step of creating a database associating each selected target concept with data elements that are subordinate concepts of that target concept; a second step of extracting, from target data, data containing data elements registered in the database, and creating a summary expressing the content of the extracted data in superordinate concepts of those data elements; and a third step of classifying, based on the summary, the data containing data elements registered in the database, and displaying the classification result.

According to this information processing apparatus, information processing method, and program, the user can grasp the overall picture of the data from what the information processing apparatus displays, and is spared the trouble of looking through individual items of data.

According to the present invention, an information processing apparatus, method, and program that can improve convenience for the user can be realized.

FIG. 1 is a block diagram showing the schematic configuration of an information processing apparatus according to this embodiment.
FIG. 2 is a graph for explaining the electronic dictionary.
FIG. 3(A) is a conceptual diagram giving an overview of the present invention, and FIG. 3(B) is a schematic diagram illustrating an example of a display format of the classification result.
FIG. 4 is a conceptual diagram for explaining target concepts.
FIG. 5 is a conceptual diagram showing the schematic configuration of the extracted e-mail management table.
FIG. 6 is a flowchart showing the processing procedure of the database creation process.
FIG. 7 is a flowchart showing the processing procedure of the summary creation process.
FIG. 8 is a graph for explaining the abstraction level filtering process.
FIG. 9 is a flowchart showing the processing procedure of the display process.

An embodiment of the present invention is described below with reference to the accompanying drawings.

(1) First embodiment
(1-1) Configuration of the information processing apparatus according to this embodiment
In FIG. 1, reference numeral 1 denotes the information processing apparatus according to this embodiment as a whole. The information processing apparatus 1 is a computer equipped with an e-mail monitoring function, which monitors the e-mail flowing through a network 2 such as an in-house LAN (Local Area Network) and notifies an administrator when it detects a preset specific keyword in the e-mail data (subject, body, and attachments), and with a topic detection function described later. The information processing apparatus 1 comprises a CPU 10, a memory 11, a hard disk device 12, an interface 13, an input device 14, and a display device 15.

The CPU 10 is a processor (controller) that controls the operation of the entire information processing apparatus 1. The memory 11 is composed of, for example, a non-volatile semiconductor memory and is used as the work memory of the CPU 10. The memory 11 stores an e-mail monitoring program 20, a topic detection program 21, and an extracted e-mail management table 22. The e-mail monitoring program 20 is a program for executing the various processes that realize the e-mail monitoring function described above. The topic detection program 21 and the extracted e-mail management table 22 are described in detail later.

The hard disk device 12 is used for long-term storage of various programs and data. The hard disk device 12 stores an electronic dictionary 23 and a target concept extraction database 24. The electronic dictionary 23 is a dictionary in which Japanese words and concepts are classified hierarchically and recorded in a systematic form. Using this electronic dictionary 23, a graph representing the hierarchical relations between concepts can be constructed, as shown for example in FIG. 2. The target concept extraction database 24 is described in detail later.

The input device 14 is composed of, for example, a keyboard and a mouse, and is used by the user to perform input operations and settings. The display device 15 is composed of a liquid crystal display and is used to display various information.

(1-2) Topic detection function
Next, the topic detection function mounted on the information processing apparatus 1 is described. As shown in FIG. 3(A), the information processing apparatus 1 is equipped with a topic detection function that extracts, from the e-mail distributed over the network 2 within a predetermined period, e-mails whose text contains keywords that are subordinate concepts of preselected concepts (hereinafter referred to as target concepts); creates a summary of the content of each extracted e-mail at a moderate level of abstraction; classifies (clusters) the e-mails based on the created summaries; and presents the classification result for the e-mail within the predetermined period to the user in a format such as that of FIG. 3(B).

This topic detection function is realized in two phases: a preparation phase and an application phase. In the preparation phase, keywords that are subordinate concepts of each target concept set in advance by the user are extracted from the electronic dictionary 23 (FIG. 1), and the target concept extraction database 24 (FIG. 1) is created by associating each extracted keyword with its corresponding target concept. In the application phase, the target concept extraction database 24 created in the preparation phase is used to create, for each relevant e-mail, a summary that expresses its content in superordinate concepts; the relevant e-mails are classified based on the created summaries; and the classification result is displayed upon request from the user. Here, a "relevant e-mail" is an e-mail whose text contains a keyword registered in the target concept extraction database 24. The same applies hereinafter.

In the preparation phase, the user first selects several target concepts according to the topic to be detected from the e-mail text, and registers the selected target concepts in the information processing apparatus 1 in advance. For example, if the topics to be detected are "illegal" and "dissatisfaction", then, as shown in FIG. 4, the concepts are divided into the five categories "action", "emotion", "nature and state", "risk", and "money", and target concepts are set for each category: for example, "despise" and "revenge" for "action"; "suffer" and "be angry" for "emotion"; "sluggish" and "mind or attitude is bad" for "nature and state"; "threatening" and "fool" for "risk"; and "money paid to people for labor" for "money".

Once the target concepts are set in this way, the information processing apparatus 1 searches the electronic dictionary 23 for keywords representing the subordinate concepts of each registered target concept, and creates the target concept extraction database 24 described above, which associates each keyword found by the search with its corresponding target concept.

In the application phase, on the other hand, the information processing apparatus 1 uses the target concept extraction database 24 created as described above to extract, from the e-mail flowing through the network 2, e-mails whose text contains a keyword registered in the target concept extraction database 24. For each e-mail extracted in this way, the information processing apparatus 1 creates a summary that expresses the content of the text using the superordinate concepts of the keywords detected at that time.

For example, in the case of FIG. 3(A), the target concepts "system" and "sales" are extracted for "e-mail_1" from the phrase "monitoring system orders", and the superordinate concepts "system" and "sales" are likewise extracted for "e-mail_2" from the phrase "accounting system implementation". The summary "system sales" is therefore created for both "e-mail_1" and "e-mail_2".

Thereafter, upon a request from the user, the information processing apparatus 1 classifies the relevant e-mails within the predetermined period according to how well their created summaries match, and presents the classification result to the user.

For example, in the case of FIG. 3, the same summary "system sales" is created for "e-mail_1" and "e-mail_2" as described above, so these two e-mails are classified into the same group. The classification result is then displayed, for example as shown in FIG. 3(B), with the summary shown as the "content".

As means for realizing the topic detection function described above, the memory 11 (FIG. 1) of the information processing apparatus 1 stores the topic detection program 21 and the extracted e-mail management table 22, as described above with reference to FIG. 1.

The topic detection program 21 is a program for executing the various processes of the topic detection function described above and, as shown in FIG. 1, comprises a database creation unit 30, a summary creation unit 31, and a display unit 32.

The database creation unit 30 is a module that creates the target concept extraction database 24 described above based on the target concepts set by the user. The summary creation unit 31 is a module that extracts e-mails whose text contains a keyword registered in the target concept extraction database 24 and creates their summaries. The display unit 32 is a module that, in response to a request from the user, classifies the relevant e-mails using the summaries and displays the overall picture of the relevant e-mails within a predetermined period.

The extracted e-mail management table 22 is a table used to manage the e-mails extracted in the application phase, that is, e-mails whose text contains a keyword registered in the target concept extraction database 24.

As shown in FIG. 5, the extracted e-mail management table 22 comprises a transmission date and time column 22A, a content column 22B, a source address column 22C, a destination address column 22D, and the like. The transmission date and time column 22A stores the date and time at which the e-mail was transmitted from its source, and the content column 22B stores the summary created for the e-mail as described above. The source address column 22C stores the mail address of the sender of the e-mail, and the destination address column 22D stores the mail address of the recipient of the e-mail.

Thus, the example of FIG. 5 shows that an e-mail whose content is "system sales" was transmitted at "2014/12/15 09:31:15" from the mail address "a_okamoto@aaa.co.jp" (source) to the mail address "m_higasi@aaa.co.jp" (destination).
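As an illustration only, one row of the extracted e-mail management table 22 could be modeled as the record below. The class name and field names are hypothetical and do not appear in the source; the sketch also assumes the reading of FIG. 5 in which "a_okamoto@aaa.co.jp" is the source and "m_higasi@aaa.co.jp" is the destination.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExtractedEmailRecord:
    """One row of the table; field names are illustrative assumptions."""
    sent_at: datetime   # transmission date and time column 22A
    content: str        # content column 22B (the created summary)
    source: str         # source address column 22C
    destination: str    # destination address column 22D


# The FIG. 5 example row, under the assumed source/destination reading.
record = ExtractedEmailRecord(
    sent_at=datetime(2014, 12, 15, 9, 31, 15),
    content="system sales",
    source="a_okamoto@aaa.co.jp",
    destination="m_higasi@aaa.co.jp",
)
```

Keeping the summary as a plain string in the content field mirrors the description of the content column 22B; a real implementation might instead store the list of concepts.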

FIGS. 6, 7, and 9 show the specific contents of the various processes executed in the information processing apparatus 1 in connection with the topic detection function described above. In the following, the subject of each process is described as a "module (... unit)", but it goes without saying that in practice the CPU 10 executes the process based on that module.

FIG. 6 shows the flow of the series of processes in the preparation phase. This process (hereinafter referred to as the database creation process) is executed by the database creation unit 30.

In practice, the database creation unit 30 starts the database creation process shown in FIG. 6 when the input device 14 (FIG. 1) is operated and an instruction to create the target concept extraction database 24 is input, and first waits for the user to select one or more target concepts (SP1).

When one or more target concepts are eventually selected, the database creation unit 30 searches the electronic dictionary 23 for the subordinate concepts of each selected target concept and extracts all of them (SP2).

Subsequently, for every subordinate concept of each target concept extracted in step SP2, the database creation unit 30 extracts from the electronic dictionary 23 all keywords associated with that subordinate concept (SP3).

The database creation unit 30 then creates the target concept extraction database 24, which associates every keyword extracted in step SP3 with its corresponding target concept (SP4), and ends the database creation process.
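The flow of steps SP2 to SP4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the database creation unit 30: the electronic dictionary 23 is stood in for by two small hypothetical dictionaries (`HYPONYMS` and `KEYWORDS`), and the function names are invented for this example.

```python
from collections import defaultdict

# Hypothetical toy dictionary: concept -> direct subordinate concepts.
HYPONYMS = {
    "emotion": ["anger", "suffering"],
    "anger": ["rage"],
}
# Hypothetical mapping: concept -> keywords (surface words) expressing it.
KEYWORDS = {
    "anger": ["angry", "furious"],
    "rage": ["enraged"],
    "suffering": ["suffer"],
}


def subordinate_concepts(concept, hyponyms):
    """Return every concept below `concept` in the hierarchy (SP2)."""
    found, stack = [], list(hyponyms.get(concept, []))
    while stack:
        node = stack.pop()
        if node not in found:
            found.append(node)
            stack.extend(hyponyms.get(node, []))
    return found


def build_database(target_concepts, hyponyms, keywords):
    """Map each keyword to the target concepts it falls under (SP3, SP4)."""
    database = defaultdict(set)
    for target in target_concepts:
        for concept in subordinate_concepts(target, hyponyms):
            for word in keywords.get(concept, []):
                database[word].add(target)
    return dict(database)


db = build_database(["emotion"], HYPONYMS, KEYWORDS)
```

Indexing the result by keyword rather than by target concept is a design choice of this sketch; it makes the later lookup of a morpheme found in an e-mail a single dictionary access.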

FIG. 7, on the other hand, shows the flow, within the series of processes of the application phase, from extracting e-mails whose text contains a keyword registered in the target concept extraction database 24 to creating their summaries. This process (hereinafter referred to as the summary creation process) is executed by the summary creation unit 31.

In practice, the summary creation unit 31 starts the summary creation process shown in FIG. 7 after the database creation process described above with reference to FIG. 6 has ended, and first selects one e-mail to be analyzed from among the e-mails captured from the network 2 for the e-mail monitoring function described above (SP10).

Subsequently, the summary creation unit 31 performs morphological analysis on the text of the selected e-mail to divide the text into individual morphemes (the smallest units of language that carry meaning) (SP11). It then searches the target concept extraction database 24 for each resulting morpheme to determine whether any of the morphemes obtained by the morphological analysis is registered as a keyword in the target concept extraction database 24 (SP12).

If the result of this determination is negative, the summary creation unit 31 returns to step SP10 and moves on to the next unprocessed e-mail. If, on the contrary, the result of the determination in step SP12 is affirmative, the summary creation unit 31 refers to the target concept extraction database 24 and detects, for each morpheme obtained in step SP11 that is registered as a keyword in the target concept extraction database 24, the target concept that is the superordinate concept of that morpheme (keyword) (SP13).
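Steps SP11 to SP13 can be sketched as below. The sketch makes two loud assumptions: a lowercase word split stands in for real morphological analysis (which the source performs on Japanese text), and `DATABASE` is a hypothetical miniature of the target concept extraction database 24 keyed by keyword. All names are illustrative.

```python
import re

# Hypothetical target concept extraction database: keyword -> target concepts.
DATABASE = {
    "orders": {"sales"},
    "implementation": {"sales"},
    "system": {"system"},
}


def tokenize(text):
    """Stand-in for morphological analysis (SP11): lowercase word split."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def detect_target_concepts(text, database):
    """Return {keyword: target concepts} for registered tokens (SP12, SP13).

    An empty result corresponds to the negative determination at SP12.
    """
    hits = {}
    for token in tokenize(text):
        if token in database:
            hits[token] = database[token]
    return hits


hits = detect_target_concepts("Monitoring system orders", DATABASE)
```

An e-mail for which `detect_target_concepts` returns an empty dictionary would simply be skipped, matching the return to step SP10.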

Subsequently, for each target concept detected in step SP13, the summary creation unit 31 executes an abstraction level filtering process that extracts, from among its subordinate concepts, a concept having a predetermined degree of abstraction (SP14). This is because a summary created from concepts that are too high-level would not allow the user to grasp the content of the e-mail from the summary; the aim is to create the summary using superordinate concepts whose degree of abstraction still lets the user recognize the content of the e-mail.

In this embodiment, as this abstraction level filtering process, the summary creation unit 31 works, for each target concept, on the graph of superordinate and subordinate relations between concepts constructed using the electronic dictionary 23 as described above with reference to FIG. 2. As shown in FIG. 8, among the superordinate concepts of the keywords registered in the target concept extraction database 24, it detects as the superordinate concept to be used in the summary the concept whose average distance to the leaf-level keywords (keywords that have no subordinate concept; "leaf_1" to "leaf_3" in FIG. 8) is less than a preset threshold and is the largest among such concepts.

Here, in FIG. 8, the average distance from the node "C" to the three leaf nodes "leaf_1" to "leaf_3" can be calculated by summing the distances from the node "C" to each of "leaf_1" to "leaf_3" and dividing the total distance by the number of leaf nodes.

Specifically, in the example of FIG. 8, the distance from the node "leaf_1" to the node "C" and the distance from the node "leaf_2" to the node "C" are both "2", and the distance from the node "leaf_3" to the node "C" is "1", so the total distance, which is the sum of these distances, is "5". Dividing this "5" by the number of leaf nodes, "3", gives "5/3" (approximately 1.67), which is the average distance from the node "C" to the three leaf nodes "leaf_1" to "leaf_3".

In step SP14, for each target concept detected in step SP13, the summary creation unit 31 performs this calculation for every concept that is higher (superordinate) than the morpheme (keyword) detected in step SP12 among the keywords registered in the target concept extraction database 24, thereby obtaining the average distance from each of these superordinate concepts to the leaf nodes. It then extracts the one superordinate concept whose average distance is less than the preset threshold and closest to that threshold.
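The average-distance calculation and the threshold selection can be sketched as follows. The tree is a hypothetical reconstruction that merely reproduces the distances stated for FIG. 8 (2, 2, and 1); the intermediate node "B", the threshold values, and the function names are assumptions made for illustration.

```python
# Hypothetical tree shaped like the FIG. 8 example: node -> child nodes.
# "B" is an assumed intermediate node; the source does not name it.
CHILDREN = {
    "C": ["B", "leaf_3"],
    "B": ["leaf_1", "leaf_2"],
}


def leaf_distances(node, children, depth=0):
    """Return the distance from the starting node to every leaf below it."""
    kids = children.get(node, [])
    if not kids:
        return [depth]
    distances = []
    for kid in kids:
        distances.extend(leaf_distances(kid, children, depth + 1))
    return distances


def average_leaf_distance(node, children):
    """Total distance to the leaves divided by the number of leaves."""
    distances = leaf_distances(node, children)
    return sum(distances) / len(distances)


def pick_superordinate(candidates, children, threshold):
    """Among candidates whose average leaf distance is below the threshold,
    return the one with the largest average distance, or None."""
    scored = [(average_leaf_distance(c, children), c) for c in candidates]
    eligible = [pair for pair in scored if pair[0] < threshold]
    return max(eligible)[1] if eligible else None
```

With this tree, `average_leaf_distance("C", CHILDREN)` evaluates to 5/3, matching the worked example, and raising or lowering the threshold moves the selected concept up or down the hierarchy.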

The summary creation unit 31 then creates the summary of the e-mail by arranging the superordinate concepts extracted in this way for each target concept (SP15), stores the necessary information about the e-mail in the extracted e-mail management table 22 described above with reference to FIG. 5 (SP16), and returns to step SP10.

FIG. 9, on the other hand, shows the flow, within the series of processes of the application phase, of the process executed in the information processing apparatus 1 when the user gives an instruction to display an overview of the relevant e-mails within a predetermined period (hereinafter referred to as the overall picture display instruction). This process (hereinafter referred to as the display process) is executed by the display unit 32 (FIG. 1).

In practice, the display unit 32 starts the display process shown in FIG. 9 when the overall picture display instruction is given by operating the input device 14, and first classifies, from among the e-mails registered in the extracted e-mail management table 22, all e-mails transmitted within the predetermined period according to the content of their summaries (SP20).

As the classification method, it is possible to apply, for example, a method that places e-mails whose summaries match completely in the same group, or a method that places e-mails in the same group, even when their summaries do not match completely, if the superordinate concepts making up the summaries match completely or partially.
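The two classification methods mentioned here can be sketched as below. The sketch assumes, purely for illustration, that a summary is a space-separated string of concepts; the third e-mail and the function names are hypothetical.

```python
from collections import defaultdict


def group_by_exact_summary(summaries):
    """Group e-mail ids whose summaries are identical."""
    groups = defaultdict(list)
    for email_id, summary in summaries.items():
        groups[summary].append(email_id)
    return dict(groups)


def shares_concept(summary_a, summary_b):
    """Looser test: the two summaries have at least one concept in common."""
    return bool(set(summary_a.split()) & set(summary_b.split()))


summaries = {
    "e-mail_1": "system sales",
    "e-mail_2": "system sales",
    "e-mail_3": "system anger",  # hypothetical third e-mail
}
groups = group_by_exact_summary(summaries)
```

Exact matching yields stable, disjoint groups; the partial-match test is only a pairwise predicate, so turning it into groups would need an additional clustering rule that the source does not specify.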

Subsequently, the display unit 32 displays the classification result of step SP20 on the display device 15 (FIG. 1) in a predetermined format, for example the format described above with reference to FIG. 3(B) (SP21), and then ends the display process.

(1-3) Effects of this embodiment
As described above, the information processing apparatus 1 of this embodiment creates the target concept extraction database 24, which associates each selected target concept with keywords representing subordinate concepts of that target concept; extracts e-mails whose text contains a keyword registered in the target concept extraction database 24; creates summaries expressing the content of those e-mails in superordinate concepts; and, upon request from the user, classifies the relevant e-mails based on the summaries and displays the classification result.

Therefore, with the information processing apparatus 1, even when the monitoring process of the e-mail monitoring function detects no e-mail containing the preset keywords, the user can recognize from the classification result the overall picture of the e-mails containing keywords registered in the target concept extraction database 24, and can thereby recognize that the information processing apparatus 1 is functioning correctly. That is, with the information processing apparatus 1, the user can recognize the overall picture of the contents of the e-mail within a predetermined period without having to look through the text of each e-mail. The information processing apparatus 1 can thus improve convenience for the user.

(2) Second embodiment
In the first embodiment, the user registers target concepts relating to a particular desired topic, e-mails containing keywords that are subordinate concepts of those target concepts are extracted, and an overview of those e-mails is displayed. Alternatively, the information processing apparatus 1 may create a summary for every e-mail, classify the e-mails based on the created summaries, and display an overall picture of the classification result.

In this case, the preparation phase described above is not required. It suffices to perform morphological analysis on the text of each e-mail, extract characteristic morphemes from the result (characteristic morpheme extraction process), detect the superordinate concepts of the extracted morphemes (superordinate concept detection process), extract from the detected superordinate concepts those at a moderate level of abstraction (abstraction level filtering and superordinate concept ranking process), and, based on the result, classify the e-mails and display an overall picture of the classification result in the same manner as in the first embodiment.

Specifically, the characteristic morpheme extraction process proceeds as follows.
(A) A reference corpus is prepared. Here, a reference corpus is a large-scale, structured collection of natural-language text, from which the frequencies of morphemes can be easily extracted.

(B) For a given morpheme, let O11 be its frequency of occurrence in the unknown data to be analyzed, O12 its frequency of occurrence in the reference corpus, O21 the frequency with which all other morphemes appear in the unknown data, and O22 the frequency with which all other morphemes appear in the reference corpus.

(C) With R1 and R2 given respectively by the following equations,

[Equation 1 (JPOXMLDOC01-appb-M000001), not reproduced in this text]
[Equation 2 (JPOXMLDOC01-appb-M000002), not reproduced in this text]

and C1, C2, and N given respectively by the following equations,

[Equation 3 (JPOXMLDOC01-appb-M000003), not reproduced in this text]
[Equation 4 (JPOXMLDOC01-appb-M000004), not reproduced in this text]
[Equation 5 (JPOXMLDOC01-appb-M000005), not reproduced in this text]

the expected frequencies E11 to E22 are calculated respectively by the following equations.

[Equation 6 (JPOXMLDOC01-appb-M000006), not reproduced in this text]
[Equation 7 (JPOXMLDOC01-appb-M000007), not reproduced in this text]
[Equation 8 (JPOXMLDOC01-appb-M000008), not reproduced in this text]
[Equation 9 (JPOXMLDOC01-appb-M000009), not reproduced in this text]

(D) The log-likelihood ratio is calculated by the following equation.

[Equation 10 (JPOXMLDOC01-appb-M000010), not reproduced in this text]

The higher the value of the log-likelihood ratio, the higher the probability that the morpheme characterizes the unknown data. Thus, for example, morphemes whose log-likelihood ratio is at or above a preset value are extracted as characteristic morphemes.
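Because the equation images are not available in this text, the following sketch does not reproduce the patent's formulas. It assumes the conventional 2x2 contingency-table form of the log-likelihood ratio, in which R1 = O11 + O12 and R2 = O21 + O22 are row totals, C1 = O11 + O21 and C2 = O12 + O22 are column totals, N is the grand total, each expected frequency is Eij = Ri * Cj / N, and the score is 2 * sum of Oij * ln(Oij / Eij). The function name is hypothetical.

```python
import math


def log_likelihood_ratio(o11, o12, o21, o22):
    """Log-likelihood ratio for a 2x2 table; o11..o22 follow O11..O22.

    Assumes the conventional form; the total count must be non-zero.
    """
    observed = [[o11, o12], [o21, o22]]
    rows = [o11 + o12, o21 + o22]   # assumed R1, R2
    cols = [o11 + o21, o12 + o22]   # assumed C1, C2
    total = rows[0] + rows[1]       # assumed N
    score = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total   # assumed E_ij
            if observed[i][j] > 0:                 # 0 * ln(0) is taken as 0
                score += observed[i][j] * math.log(observed[i][j] / expected)
    return 2.0 * score
```

Under these assumptions, a morpheme that is exactly as frequent in the unknown data as in the reference corpus scores zero, and the score grows as its relative frequency in the unknown data departs from that in the reference corpus.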

In the superordinate concept detection process, the superordinate concepts of the morphemes extracted by the characteristic morpheme extraction process described above are detected by searching the electronic dictionary 23 described above with reference to FIG. 1.

In the abstraction level filtering and superordinate concept ranking process, superordinate concepts having a suitable degree of abstraction are first extracted from those detected by the superordinate concept detection process, using the abstraction level filtering described above for step SP14 of FIG. 7. If a plurality of superordinate concepts are extracted by this extraction, the concept frequency (CF) of each is determined by the following equation.

[Equation 11 (JPOXMLDOC01-appb-M000011), not reproduced in this text]

The concepts are ranked by this frequency of appearance, and either a predetermined number of the highest-ranked superordinate concepts or the superordinate concepts whose frequency is at or above a preset threshold are extracted and arranged to form the summary of the e-mail. As the method of ranking the superordinate concepts, besides simply ordering them by frequency, a method that ranks them using a value calculated as, for example, CF/DF (Document Frequency) or CF/TF-IDF (calculated from the word frequency and the document frequency) may be used, as may other methods. Thereafter, all e-mails within a given period are classified using these summaries, and the classification result is displayed.
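The ranking step can be sketched as follows. Since the CF equation itself is not reproduced in this text, the sketch assumes the simplest reading, a raw count of how often each superordinate concept is detected in one e-mail; the function name, parameters, and sample data are hypothetical.

```python
from collections import Counter


def rank_concepts(concepts, top_n=None, min_count=None):
    """Rank concepts by raw count (an assumed stand-in for CF).

    Keeps either the top_n most frequent concepts, those occurring at
    least min_count times, or both filters when both are given.
    """
    ranked = Counter(concepts).most_common()
    if min_count is not None:
        ranked = [(c, n) for c, n in ranked if n >= min_count]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [c for c, _ in ranked]


# Hypothetical superordinate concepts detected in one e-mail.
concepts = ["system", "sales", "system", "meeting", "sales", "system"]
summary = " ".join(rank_concepts(concepts, top_n=2))
```

Replacing the raw count with a CF/DF or TF-IDF style weight would only change how `ranked` is scored, not the selection logic.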

With the information processing apparatus according to this embodiment, all e-mails can be classified according to their content, so the user can recognize the overall picture of the content of all e-mail within a predetermined period, which further improves convenience for the user.

(3) Other embodiments
The first and second embodiments described above deal with the case where the information processing apparatus 1 holds the electronic dictionary, but the present invention is not limited to this. A system (apparatus) may be built in which the information processing apparatus 1 does not hold the electronic dictionary and instead requests the various searches of the electronic dictionary from an external device that holds it, and receives the results.

The first and second embodiments described above also deal with the case where, in step SP14 of the summary creation process described above with reference to FIG. 7, a concept whose average distance to the leaf level is less than a preset threshold is taken as the superordinate concept of a keyword extracted from the e-mail text, and the summary of the e-mail is created using that superordinate concept. The present invention is not limited to this; for example, the superordinate concept of a keyword extracted from the e-mail text may be obtained from what is registered in the target concept extraction database 24, and the summary of the e-mail may be created using that superordinate concept.

The first and second embodiments described above deal with the case where the overall picture of the relevant e-mails is displayed in a format such as that of FIG. 3(B), but the present invention is not limited to this. Various other display formats can be widely applied; for example, the summary and classification results may be displayed as a chart, such as a pie chart or a line chart, that makes clear each topic's share of the overall picture (for example, topic A accounts for 20% of the total, topic B for 10%, topic C for 5%, and other topics for 65%).

Further, in the first and second embodiments described above, the case has been described in which the e-mail monitoring function and the topic detection function are installed in one and the same information processing apparatus 1 (i.e., the e-mail monitoring program 20 and the topic detection program 21 are installed in a single information processing apparatus 1). However, the present invention is not limited to this. These two functions may be installed in separate information processing apparatuses (e.g., the e-mail monitoring program 20 and the topic detection program 21 may each be installed in a different information processing apparatus). Alternatively, a distributed system may be constructed in which a plurality of information processing apparatuses perform the e-mail monitoring and the topic detection.

Further, in the first and second embodiments described above, the case has been described in which a summary of each e-mail is created, the e-mail is classified based on the summary, and an overview of the classification result is presented to the user. However, the present invention is not limited to this. For example, the information processing apparatus 1 may be able to analyze the data by taking into account the correlation (co-occurrence) between a certain concept (first concept) and a different concept (second concept). For example, when the first concept "system" (the evaluation target) and the second concept "angry" (the value judgment) frequently appear together in the same data, the information processing apparatus 1 may be able to present to the user the value judgment that the evaluation target "system" is rated poorly.
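
One simple way to quantify such co-occurrence is to count, for each pair of an evaluation target and a value judgment, how many pieces of data contain both. The sketch below assumes each piece of data has already been reduced to a set of concepts; the function name and sample data are hypothetical.

```python
from collections import Counter
from itertools import product

def cooccurrence_counts(documents, targets, judgments):
    """Count, per (target, judgment) pair, the documents containing both concepts."""
    counts = Counter()
    for concepts in documents:
        present = set(concepts)
        for pair in product(targets & present, judgments & present):
            counts[pair] += 1
    return counts

# Hypothetical concept sets extracted from four pieces of data.
documents = [
    {"system", "angry"},
    {"system", "angry", "price"},
    {"system", "happy"},
    {"price", "happy"},
]
counts = cooccurrence_counts(documents, {"system", "price"}, {"angry", "happy"})
```

A pair with a comparatively high count, here ("system", "angry"), is a candidate for the value judgment presented to the user.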

Further, in the first embodiment described above, the case has been described in which, in the preparation phase, the target concept extraction database 24 is created by only associating keywords with target concepts. However, the present invention is not limited to this. For example, in the preparation phase the information processing apparatus 1 may not only associate the keywords with the target concepts but also associate with each keyword, as a concept emotion score, a score for that keyword (an index, quantified as a value from 0 to 1, of whether the keyword indicates a positive or a negative emotion). Then, based on the concept emotion scores corresponding to the concepts extracted from the data in the application phase (e.g., by summing or accumulating the concept emotion scores), the emotion (evaluation or value judgment) toward a concept may be presented to the user.
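
The following sketch illustrates one possible aggregation of concept emotion scores. The database contents, the convention that 0 is fully negative and 1 is fully positive, and the choice to return both the sum and the number of hits are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical database rows: keyword -> (target concept, concept emotion score).
# Assumed convention: 0 is fully negative, 1 is fully positive.
DATABASE = {
    "crash": ("system", 0.1),
    "slow": ("system", 0.2),
    "reliable": ("system", 0.9),
    "delicious": ("cuisine", 0.9),
}

def concept_emotion(words, database):
    """Return {concept: (summed score, hit count)} for the keywords found in `words`."""
    totals = defaultdict(lambda: [0.0, 0])
    for word in words:
        if word in database:
            concept, score = database[word]
            totals[concept][0] += score
            totals[concept][1] += 1
    return {concept: (s, n) for concept, (s, n) in totals.items()}

text = "the system is slow and may crash but the cuisine was delicious"
result = concept_emotion(text.split(), DATABASE)
```

Dividing the summed score by the hit count gives an average on the same 0 to 1 scale, which may be easier to present than a raw sum.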

Further, in the first embodiment described above, the case has been described in which e-mail containing keywords that belong to the subordinate concepts of a preselected target concept is extracted and a summary of the e-mail is created using a superordinate concept of the keywords. However, the present invention is not limited to this. For example, the information processing apparatus 1 may extract a verb phrase contained in a sentence as the superordinate concept and use the extracted verb phrase to create a summary of the data containing that sentence. For example, the information processing apparatus 1 may extract the verb phrase "looking forward to" from the sentence "I have been looking forward to the cuisine" and present that verb phrase to the user as the summary.

Further, in the first and second embodiments described above, the present invention has been described as applied to the information processing apparatus 1 that monitors e-mail. However, the present invention is not limited to this and can also be applied to the following purposes or forms of implementation.

For example, the present invention can also be applied to Internet application systems. For example, data such as messages that users have posted to an SNS, recommendation information and reviews posted on websites, and profiles of users or organizations can be summarized by the information processing apparatus of the present invention and provided to the user. That is, because the information processing apparatus can present the evaluation target (e.g., when a user has posted a product review on a website, the product) and the value judgment (a summary of how the product was evaluated), it can improve the user's convenience on the Internet.

The present invention can also be applied to medical application systems (e.g., systems that use electronic medical records, nursing records, patient diaries, and the like as data to predict a patient's prognosis or to verify efficacy). In this case, for example, presenting summaries of the electronic medical records, nursing records, and patient diaries created by the information processing apparatus of the present invention makes it easier to predict, for example, that a patient will fall into a dangerous situation (e.g., a fall).

The present invention can also be applied to discovery support systems. For example, by summarizing data such as documents, e-mail, and spreadsheet data with the information processing apparatus of the present invention, the user can efficiently extract only the documents related to a lawsuit that are to be submitted to the court.

The present invention can also be applied to forensic systems. In this case, for example, summarizing data such as documents, e-mail, and spreadsheet data with the information processing apparatus of the present invention facilitates the extraction of evidence of a criminal act and thereby improves the efficiency of such work.

The present invention can also be applied, for example, to a data analysis system equipped with a predictive coding function (a function that, based on a small amount of training data, calculates a score for each item of a large amount of unknown data and ranks the unknown data, the score being an index of how strongly the unknown data is related to a predetermined case). A data analysis system equipped with the predictive coding function comprises a client device (e.g., a user terminal such as a personal computer or a smartphone) that executes some or all of a data analysis program for performing the data analysis, and a server device that executes some or all of the data analysis program and returns the result of the execution to the client device. The client device and the server device may share the processing included in the data analysis program in any manner.

When the present invention is applied to a data analysis system equipped with the predictive coding function, the score calculated for the data by the predictive coding function may be adjusted based on the value judgment indicated by the summary of the data. For example, when the predictive coding function has assigned a high score to data regarded as matching the user's preferences but the summary of that data indicates a value judgment of "no interest" (i.e., when the summary and the score conflict), the information processing apparatus of the present invention may adjust the score, for example by lowering the calculated score.
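
A minimal sketch of such an adjustment is shown below. The cut-off for what counts as a "high" score, the multiplicative penalty, and the string used to represent the value judgment are all assumptions chosen for illustration; the description above only states that the score may be lowered.

```python
def adjust_score(score, judgment, penalty=0.5, high=0.7):
    """Lower a high predictive-coding score when the summary signals "no interest".

    `penalty` and `high` are hypothetical tuning parameters.
    """
    if score >= high and judgment == "no interest":
        return score * penalty  # summary and score conflict, so reduce the score
    return score  # no conflict, so leave the score unchanged
```

For instance, a score of 0.8 paired with a "no interest" summary is reduced to 0.4, while scores below the cut-off or without a conflicting summary pass through unchanged.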

The present invention can also be applied to patent search systems. For example, by summarizing data such as patent publications and documents outlining inventions with the information processing apparatus, the user can efficiently carry out the task of extracting invalidation material from a large number of patent documents.

Thus, the information processing apparatus of the present invention can be widely applied not only to the information processing apparatus 1 that monitors e-mail but also to various systems such as forensic systems, discovery support systems, medical application systems, Internet application systems, and patent search systems. Furthermore, the information processing apparatus of the present invention can be widely applied to any system, such as portal site operator systems, project evaluation systems, transaction management systems, call center escalation systems, and marketing systems. That is, the present invention can be widely applied to systems that present the overall picture of data to the user by extracting superordinate concepts from the data, creating a summary expressed in those superordinate concepts, and presenting the summary to the user.

The present invention can be broadly applied to a variety of information processing apparatuses, such as information processing apparatuses that detect a change in, or a particular state of, an environment and server devices that provide web pages on the Internet.

1 ...... information processing apparatus, 10 ...... CPU, 15 ...... display unit, 21 ...... topic detection program, 22 ...... extracted e-mail management table, 23 ...... electronic dictionary, 24 ...... target concept extraction database, 30 ...... database creation unit, 31 ...... summarization unit, 32 ...... display unit.

Claims (6)

  1. An information processing apparatus comprising: a database creation unit that creates a database associating a selected target concept with data elements that are subordinate concepts of the target concept;
    a summary creation unit that extracts, from target data, data including the data elements registered in the database and creates a summary expressing the content of the extracted data with superordinate concepts of the data elements; and
    a display unit that classifies, based on the summary, the data including the data elements registered in the database and displays the classification result.
  2. The information processing apparatus according to claim 1, wherein a dictionary in which the data elements and concepts are hierarchically classified and recorded is given in advance, and
    the database creation unit
    searches the dictionary for all subordinate concepts of the selected target concept,
    extracts all the data elements corresponding to all the subordinate concepts found by the search, and
    creates the database by associating all the extracted data elements with the corresponding target concept.
  3. The information processing apparatus according to claim 1, wherein the summary creation unit
    detects the target concept that is a superordinate concept of the data elements registered in the database and included in the data, and
    detects, from among the subordinate concepts of the detected target concept, a concept that is a superordinate concept of the data elements extracted from the data and that has a predetermined degree of abstraction, and creates the summary using the detected concept.
  4. The information processing apparatus according to claim 2 or 3, wherein the concept having the predetermined degree of abstraction is
    a concept whose average distance to the leaf level, in a graph representing the hierarchical relation of concepts, is less than a preset threshold value.
  5. An information processing method comprising: a first step in which an information processing apparatus creates a database associating a selected target concept with data elements that are subordinate concepts of the target concept;
    a second step in which the information processing apparatus extracts, from data, data including the data elements registered in the database and creates a summary expressing the content of the extracted data with superordinate concepts of the data elements; and
    a third step in which the information processing apparatus classifies, based on the summary, the data including the data elements registered in the database and displays the classification result.
  6. A program that causes an information processing apparatus to execute processing comprising: a first step of creating a database associating a selected target concept with data elements that are subordinate concepts of the target concept;
    a second step of extracting, from data, data including the data elements registered in the database and creating a summary expressing the content of the extracted data with superordinate concepts of the data elements; and
    a third step of classifying, based on the summary, the data including the data elements registered in the database and displaying the classification result.
PCT/JP2015/054890 2015-02-20 2015-02-20 Information processing device and method, and program WO2016132558A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/054890 WO2016132558A1 (en) 2015-02-20 2015-02-20 Information processing device and method, and program

Publications (1)

Publication Number Publication Date
WO2016132558A1 (en) 2016-08-25

Family

ID=56692068

Country Status (1)

Country Link
WO (1) WO2016132558A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006318398A (en) * 2005-05-16 2006-11-24 Nippon Telegr & Teleph Corp <Ntt> Vector generation method and device, information classifying method and device, and program, and computer readable storage medium with program stored therein
US20120066210A1 (en) * 2010-09-14 2012-03-15 Microsoft Corporation Interface to navigate and search a concept hierarchy
JP2015001834A (en) * 2013-06-14 2015-01-05 日本電信電話株式会社 Content summarization device, content summarization method and content summarization program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15882659; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase in: (Ref country code: DE)
NENP Non-entry into the national phase in: (Ref country code: JP)
122 EP: PCT application non-entry in European phase (Ref document number: 15882659; Country of ref document: EP; Kind code of ref document: A1)