WO2024166155A1 - Information processing device, information processing method, and program - Google Patents
Information processing device, information processing method, and program
- Publication number
- WO2024166155A1 (application PCT/JP2023/003734)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- graph
- information processing
- processing device
- nodes
- text data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
Definitions
- This disclosure relates to an information processing device, an information processing method, and a program.
- Patent Document 1 describes extracting subjects, predicates, and objects from text data, and generating graph information that shows these in a graph structure.
- Patent Document 1 only displays an overall picture of multiple pieces of text data in a graph structure. This creates the problem that analyzing a huge amount of text data is difficult and time-consuming.
- The objective of this disclosure is to provide an information processing device that can solve the problem described above, namely that analyzing a huge amount of text data is difficult and time-consuming.
- An information processing device includes: a generation unit that generates a graph in which a plurality of types of preset sentence elements generated from text data are represented, according to the types of the elements, by nodes and edges connecting the nodes, and that generates a connected graph in which a plurality of the graphs are connected according to the contents of the nodes; and an extraction unit that extracts, based on the connected graph, a graph having a preset relationship between the nodes.
- An information processing method includes: generating a graph in which a plurality of types of preset sentence elements generated from text data are represented, according to the types of the elements, by nodes and edges connecting the nodes, and generating a connected graph in which a plurality of the graphs are connected according to the contents of the nodes; and extracting, based on the connected graph, a graph having a preset relationship between the nodes.
- A program causes a computer to execute processes of: generating a graph in which a plurality of types of preset sentence elements generated from text data are represented, according to the types of the elements, by nodes and edges connecting the nodes, and generating a connected graph in which a plurality of the graphs are connected according to the contents of the nodes; and extracting, based on the connected graph, a graph having a preset relationship between the nodes.
- FIG. 1 is a block diagram showing a configuration of an information processing device according to a first embodiment of the present disclosure.
- FIGS. 2 to 7 are diagrams showing processes performed by the information processing device disclosed in FIG. 1.
- FIG. 8 is a flowchart showing an operation of the information processing device disclosed in FIG. 1.
- FIG. 9 is a block diagram showing a hardware configuration of an information processing device according to a second embodiment of the present disclosure.
- FIG. 10 is a block diagram showing a configuration of the information processing device according to the second embodiment of the present disclosure.
- FIG. 1 is a diagram for explaining the configuration of an information processing device.
- FIGS. 2 to 7 are diagrams for explaining the processing operation of the information processing device.
- The information processing device 10 in this embodiment displays text data in a graph structure.
- For example, the information processing device 10 targets text data contained in the body or attachments of e-mails, messages, SNS (Social Networking Service) posts, etc., and displays and analyzes the text data in a graph structure.
- As one example, e-mails that may be related to cases under police investigation are displayed in a graph structure and analyzed.
- However, the text data to be displayed and analyzed in a graph structure is not limited to the above-mentioned e-mails, etc., and may be any text data.
- The information processing device 10 is composed of one or more information processing devices each having a calculation device and a storage device. As shown in FIG. 1, the information processing device 10 includes an input unit 11, a preprocessing unit 12, a graph generation unit 13, a graph analysis unit 14, and a display control unit 15. The functions of these units can be realized by the calculation device executing programs, stored in the storage device, for realizing each function.
- The information processing device 10 also includes a text data storage unit 16 and a graph storage unit 17, each of which is implemented by the storage device.
- The information processing device 10 is also connected to a display device 30 such as a display. Each component will be described in detail below.
- The input unit 11 accepts input of text data to be processed and stores it in the text data storage unit 16.
- For example, the input unit 11 obtains emails stored in other information processing devices or storage media as text data, and stores each email in the text data storage unit 16.
- Each email contains data such as a "header" including the "subject," "sender (From)," "destination (To)," and "time of sending (Time)," and a "body."
- The email may contain other information, and may also contain an "attachment" that contains text data.
- The preprocessing unit 12 reads out the e-mails one by one from the storage device and performs preprocessing on the text data contained in each e-mail.
- Specifically, the preprocessing unit 12 divides the text data of the "body" and "attachment" contained in the e-mail into sentences, and extracts and generates a subject, predicate, and object for each sentence.
- In doing so, the preprocessing unit 12 divides the text data into sentences according to a preset criterion, and extracts and generates a subject, predicate, and object for each sentence using other text data before and after the sentence and other information in the e-mail (for example, the subject, sender, and destination included in the header).
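- The sentence-splitting step can be sketched as follows. This is only an illustration: the disclosure does not state the preset criterion, so the sketch assumes that a sentence ends at sentence-final punctuation, and `split_sentences` is a hypothetical name.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text after sentence-final punctuation (an assumed criterion)."""
    # Split right after ., !, ? or their full-width forms, dropping any
    # whitespace that follows the punctuation mark.
    parts = re.split(r"(?<=[.!?。！？])\s*", text)
    return [p.strip() for p in parts if p.strip()]


sentences = split_sentences("I visited AA store. Will call you tomorrow! OK?")
```

A real criterion would also need to handle abbreviations and decimal numbers, which this sketch deliberately ignores.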
- For example, the preprocessing unit 12 extracts the subject, predicate, and object from the sentence using preset text analysis rules and analysis models.
- As one example, the subject, predicate, and object can be extracted from natural language by using a technology called Open Information Extraction (OpenIE).
- Extraction of sentence elements such as the subject, predicate, and object described above may also be achieved using a machine learning model that extracts sentence elements from a sentence. Such a model takes a sentence as input, and extracts and outputs the sentence elements of the subject, predicate, and object.
- Such a machine learning model is generated by supervised learning. In the supervised learning, the text data is divided into sentences, and the model can be generated by using training data consisting of pairs of each sentence and the set of subject, predicate, and object of that sentence.
- When the subject or object is omitted from a sentence, the preprocessing unit 12 uses other preceding and following sentence data or other information in the email (e.g., the header) to supplement the subject and object anew, and then extracts and generates a subject, predicate, and object from the sentence.
- For example, the preprocessing unit 12 uses preset sentence analysis rules and analysis models, or utilizes a technology called OpenIE, to supplement missing parts of the subject, predicate, and object for the sentence, and extracts and generates the subject, predicate, and object.
- The above-mentioned subject and object completion may also be realized by using a machine learning model that generates sentences from sentences.
- Such a model takes as input the sentence to be completed, the sentences before and after it, and header information including sender and recipient information associated with emails and SNS posts, and outputs a sentence in which the omitted sentence elements such as the subject and object are completed.
- This machine learning model is created by supervised learning. In the supervised learning, text data is divided into sentences, and the training data consists of a sentence in which sentence elements such as the subject and object are omitted, the sentences before and after it, the header information, and the completed sentence.
- Note that a sentence element extracted from the previous sentence using a machine learning model that extracts sentence elements may be used instead of the sentences before and after. Furthermore, completion of elements such as the subject and object may be performed not only by using a machine learning model that generates sentences from sentences, but also by adding sentence elements such as the subject and object, extracted from the previous sentence using a machine learning model that extracts sentence elements, to the beginning or end of the target sentence.
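- As a rough illustration of completion from the header and the preceding sentence, the following sketch uses two fixed fallback rules in place of the analysis models or machine learning models described above. Both rules, the name `complete_triples`, and the sample predicates are assumptions made for this example and are not taken from the disclosure.

```python
from typing import Optional

# (subject, predicate, object); None marks an omitted element.
Triple = tuple[Optional[str], str, Optional[str]]


def complete_triples(triples: list[Triple], sender: str) -> list[tuple[str, str, str]]:
    """Fill omitted subjects and objects with simple fallback rules."""
    completed = []
    prev_obj: Optional[str] = None
    for subj, pred, obj in triples:
        # Assumed rule 1: an omitted subject is the sender in the header.
        if subj is None:
            subj = sender
        # Assumed rule 2: an omitted object repeats the previous sentence's object.
        if obj is None:
            obj = prev_obj
        if obj is None:
            continue  # nothing to complete from, so skip this sentence
        completed.append((subj, pred, obj))
        prev_obj = obj
    return completed


result = complete_triples(
    [("Minami", "set up", "AA store"), (None, "will visit", None)],
    sender="Okumura",
)
```

In this sample the second sentence has neither subject nor object, so it is completed to the sender "Okumura" and the preceding object "AA store".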
- Note that a sentence may be a character string of any length, and the text data may be divided into sentences based on any criteria.
- In the above, an example was given of generating one of each of three types of sentence elements, namely subject, predicate, and object, for each sentence of the text data, but there may be any number of types of sentence elements, and multiple elements of each type may be generated.
- For example, the preprocessing unit 12 may generate elements such as subject, predicate, object, complement, and modifier.
- The graph generation unit 13 (generation unit) generates a graph in which the subject, predicate, and object generated for each sentence of the text data as described above are represented in a graph structure consisting of nodes and edges connecting the nodes. Specifically, as shown in FIG. 2 (2-1), the graph generation unit 13 generates a graph g in which the "subject" and "object" are each represented as a node, and the "predicate" is represented as an edge connecting the nodes.
- The graph is not limited to the structure described above, and the node and edge structures may differ depending on the number and types of elements in the sentence. In other words, the graph is not limited to one represented by two nodes and one edge connecting them as shown in FIG. 2 (2-1), and further nodes and edges may be added.
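- The basic graph g of FIG. 2 (2-1) can be sketched as two nodes and one labeled edge. The data layout below (a set of node labels plus a list of edge tuples) and the names `Triple` and `triple_to_graph` are assumptions chosen for illustration; the disclosure does not prescribe a representation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def triple_to_graph(t: Triple) -> dict:
    """Represent subject and object as nodes and the predicate as an edge."""
    return {
        "nodes": {t.subject, t.obj},
        # Each edge is (source node, edge label, target node).
        "edges": [(t.subject, t.predicate, t.obj)],
    }


g = triple_to_graph(Triple("Minami", "Setting", "AA store"))
```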
- The graph generation unit 13 also connects multiple graphs according to the contents of the nodes of each graph, generates a connected graph, and stores it in the graph storage unit 17. Specifically, the graph generation unit 13 connects multiple graphs at the locations of nodes with the same contents contained in each of the multiple graphs to generate a connected graph.
- An example in which the graph generation unit 13 connects graphs to generate a connected graph will be described with reference to FIG. 2 (2-2). First, as shown in the left diagram of FIG. 2 (2-2), it is assumed that two graphs g1 and g2 have been generated. At this time, the "object" nodes of the two graphs g1 and g2 have the same content, "object 1".
- In this case, the graph generation unit 13 connects the two graphs g1 and g2 at the location of the "object 1" node so that they share that node. That is, as shown in the right diagram of FIG. 2 (2-2), the graph generation unit 13 generates a connected graph G by connecting the edge "predicate 1" connected to the node "subject 1" of the graph g1 and the edge "predicate 2" connected to the node "subject 2" of the graph g2 to the single node "object 1".
- The connected graph G is not limited to the above-described connection of two graphs g1 and g2, and more graphs may be connected.
- The graph generation unit 13 may also generate a connected graph by connecting graphs by other methods, not limited to the above-described method. For example, if the contents of nodes in multiple graphs are not completely identical, but are determined to be the same according to a preset criterion, or are determined to be related according to a preset criterion, the graph generation unit 13 may connect the multiple different graphs at those nodes. The graph generation unit 13 may also connect multiple graphs at the same node in the same manner as described above in response to a user operation, for example, an operation of specifying the same node for different graphs.
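- The connection step can be sketched by keying every node on its content, so that graphs sharing a node merge automatically. The `normalize` function stands in for the "preset criterion" and is purely an assumption here (it ignores case and extra whitespace); `connect_graphs` is likewise a hypothetical name.

```python
def normalize(label: str) -> str:
    """Assumed sameness criterion: ignore case and repeated whitespace."""
    return " ".join(label.split()).casefold()


def connect_graphs(triples):
    """Merge (subject, predicate, object) graphs at nodes with the same content."""
    nodes = {}   # normalized key -> label as first seen
    edges = []   # (subject key, predicate label, object key)
    for subj, pred, obj in triples:
        s_key, o_key = normalize(subj), normalize(obj)
        nodes.setdefault(s_key, subj)
        nodes.setdefault(o_key, obj)
        edges.append((s_key, pred, o_key))
    return nodes, edges


G_nodes, G_edges = connect_graphs([
    ("Minami", "Setting", "AA store"),
    ("Okumura", "I will visit you", "aa  store"),
])
```

Both edges end at the single shared node for "AA store", mirroring how g1 and g2 share the "object 1" node in FIG. 2 (2-2).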
- FIG. 3 shows an example of the connected graph G generated by the graph generating unit 13.
- FIG. 3 shows the connected graph G displayed on the display device 30 by the display control unit 15.
- The display control unit 15 displays, on the screen of the display device 30, a display screen divided into a plurality of areas, for example, a graph display area D1, an email list display area D2, an email body display area D3, and a slide bar display area D4.
- The display control unit 15 then displays the generated connected graph G in the graph display area D1.
- The connected graph G in this example includes, for example, a graph g11 surrounded by a dotted line and consisting of the subject node "Minami", the predicate edge "Setting", and the object node "AA store", and a graph g12 surrounded by a dotted line and consisting of the subject node "Okumura", the predicate edge "I will visit you", and the object node "AA store". These graphs are connected at the object node "AA store".
- In addition to displaying the connected graph G in the graph display area D1 as described above, the display control unit 15 controls the other display areas on the display screen to display information as follows. As shown in FIG. 3, the display control unit 15 displays in the email list display area D2 a list of header information of the emails that contain a sentence from which a graph included in the connected graph G was generated, that is, the "Subject" (title), "Sender (From)", "Destination (To)", and "Sent Time (Time)" of each email. Note that the display control unit 15 may display a list of only the "Subject" of each email in the email list display area D2.
- The display control unit 15 also displays the data of an email in the email body display area D3 as shown in FIG. 3. For example, among the emails whose "Subjects" are listed in the email list display area D2, the display control unit 15 displays in the email body display area D3 the "Subject" (title), "From", "To", "Time", and "Body" of the email selected by the user, shown in gray in FIG. 3. Note that the "Time" may be the time the email was sent or received.
- The display control unit 15 may also display text data contained in the "Attachment" of the email in the email body display area D3, or may display only the text data of the "Body" and "Attachment" without displaying the "Header" information.
- The display control unit 15 also displays a histogram showing the number of emails over time in the slide bar display area D4, as shown in FIG. 3. Specifically, the display control unit 15 displays a histogram with time (days) on the horizontal axis and the number of emails sent on the vertical axis. At this time, a slide bar B is set for time on the horizontal axis, and the user can specify a time range by changing the position and length of the slide bar B, as shown by the thick line in FIG. 3. The display control unit 15 then displays a list of the emails located within the range of the slide bar B in the email list display area D2.
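- The histogram and the slide bar selection can be sketched as a per-day count and a date-range filter. The email records, their dates, and the function names are hypothetical sample values for this illustration only.

```python
from collections import Counter
from datetime import date, datetime


def emails_per_day(emails):
    """Count emails by sending day (the histogram's vertical axis)."""
    return Counter(e["time"].date() for e in emails)


def emails_in_range(emails, start: date, end: date):
    """Return emails whose sending day lies inside the slide bar range."""
    return [e for e in emails if start <= e["time"].date() <= end]


emails = [
    {"subject": "Setting", "time": datetime(2023, 1, 10, 9, 0)},
    {"subject": "Visit", "time": datetime(2023, 1, 10, 18, 30)},
    {"subject": "Report", "time": datetime(2023, 1, 12, 8, 15)},
]
histogram = emails_per_day(emails)
selected = emails_in_range(emails, date(2023, 1, 11), date(2023, 1, 12))
```

Changing the position or length of slide bar B corresponds to changing `start` and `end`.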
- The graph analysis unit 14 (extraction unit) extracts a graph having a preset relationship between nodes from the connected graph G generated as described above. For example, the graph analysis unit 14 extracts a graph consisting of a specific node, the edges connected to the specific node, and the other nodes connected to the specific node by those edges.
- Here, the specific node is a node searched for or specified by the user on the display screen of the display device 30.
- For example, the user can search for a node corresponding to a keyword by inputting the keyword in a search field displayed on the display screen of the display device 30, or can specify a node by selecting it with the pointer on the connected graph G displayed in the graph display area D1.
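- A keyword search over node contents might look like the following sketch. Matching by case-insensitive substring is an assumption; the disclosure does not say how a keyword is matched to a node, and `find_nodes` is a hypothetical name.

```python
def find_nodes(node_labels, keyword: str):
    """Return node labels that contain the keyword, ignoring case."""
    key = keyword.casefold()
    return [label for label in node_labels if key in label.casefold()]


matches = find_nodes(["Minami", "Okumura", "AA store", "tampering"], "store")
```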
- The display control unit 15 described above controls the display so that the graph extracted by the graph analysis unit 14 is displayed on the display screen of the display device 30 in a manner distinguished from other graphs.
- For example, suppose that the user specifies the node "AA store" in the connected graph G as the specific node, as shown in FIG. 4. In this case, the graph analysis unit 14 extracts a graph including all edges and other nodes connected to the specific node "AA store". Then, the display control unit 15 displays the extracted graph distinguished from other graphs, as shown by the thick lines in FIG. 4.
- That is, in the example of FIG. 4, a graph including the node "AA store", the edge "setting", and the node "Minami", a graph including the node "AA store", the edge "I will visit you", and the node "Okumura", a graph including the node "AA store", the edge "like", and the node "Kawashima", etc. are displayed with thick lines to distinguish them from other graphs. Note that in the example of FIG. 4, the display control unit 15 displays the graphs other than the extracted graph in gray to make them less noticeable, so that the extracted graphs are emphasized relative to the other graphs.
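- The extraction around a specific node amounts to collecting every edge that touches the node together with the nodes at the other end. The sketch below assumes edges stored as (subject, predicate, object) tuples; `extract_neighborhood` is a hypothetical name, and the last sample edge is invented to show an edge that is not extracted.

```python
def extract_neighborhood(edges, target: str):
    """Return the target node, its incident edges, and the adjacent nodes."""
    hit = [e for e in edges if target in (e[0], e[2])]
    nodes = {target} | {n for s, _, o in hit for n in (s, o)}
    return nodes, hit


edges = [
    ("Minami", "setting", "AA store"),
    ("Okumura", "I will visit you", "AA store"),
    ("Kawashima", "like", "AA store"),
    ("Minami", "calls", "Sawada"),  # hypothetical edge, not connected to "AA store"
]
nodes, extracted = extract_neighborhood(edges, "AA store")
```

The edges in `extracted` correspond to the thick lines of FIG. 4, and the remaining edges to the grayed-out graphs.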
- FIG. 5 shows another example. In the example of FIG. 5, suppose that the user searches for or specifies, as the specific node, the node "tampering", a term that may be related to a crime.
- The graph analysis unit 14 then extracts a graph including all edges and other nodes connected to the specific node "tampering".
- The display control unit 15 then displays the extracted graph as shown by the thick lines in FIG. 5, distinguishing it from other graphs. That is, in the example of FIG. 5, a graph including a node "Minami" connected to the node "tampering" by an edge, a graph including a node "Okumura" connected to the node "tampering" by an edge, a graph including a node "Sawada" connected to the node "tampering" by an edge, etc. are displayed with thick lines to distinguish them from other graphs.
- The display control unit 15 also displays the email from which the extracted graph was generated, as described above, in a manner that distinguishes it from other emails. Specifically, as shown in FIG. 4 and FIG. 5, the display control unit 15 displays a black circle next to the "subject" of each email from which the extracted graph was generated in the email list display area D2. This allows the user to easily recognize the emails corresponding to the extracted graph. Then, when the user selects the "subject" of an email in the email list display area D2, the display control unit 15 displays the "body" of the selected email in the email body display area D3.
- At this time, the display control unit 15 displays the text data that is the sentence, in that email, from which the extracted graph was generated in the email body display area D3 in a manner that distinguishes it from other text data, for example by underlining it.
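- Marking the source emails presupposes that each extracted set of subject, predicate, and object stays associated with the email it came from. The following sketch shows one way to keep and query that association; the email identifiers such as "mail-1" and the function names are hypothetical.

```python
from collections import defaultdict


def index_sources(extracted):
    """Map each (subject, predicate, object) triple to its source email ids."""
    by_triple = defaultdict(set)
    for email_id, triple in extracted:
        by_triple[triple].add(email_id)
    return by_triple


def source_emails(by_triple, edges):
    """Collect the ids of all emails that produced any of the given edges."""
    ids = set()
    for edge in edges:
        ids |= by_triple.get(edge, set())
    return ids


by_triple = index_sources([
    ("mail-1", ("Minami", "setting", "AA store")),
    ("mail-2", ("Okumura", "I will visit you", "AA store")),
    ("mail-3", ("Minami", "calls", "Sawada")),
])
marked = source_emails(by_triple, [
    ("Minami", "setting", "AA store"),
    ("Okumura", "I will visit you", "AA store"),
])
```

The ids in `marked` are the emails that would receive the black circle in the email list display area D2.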
- As described above, the display control unit 15 displays a histogram of emails over time in the slide bar display area D4, and the user can specify a time range with the slide bar B. Specifically, when the length of the slide bar B is set by the user's operation as shown in FIG. 6, the display control unit 15 displays in bold, based on the time associated with each email, the "subject" of each email whose time falls within the range of the slide bar B, distinguishing it from the "subject" of other emails.
- Note that the display control unit 15 displays the "subject" of emails outside the range of the slide bar B in gray to make them less noticeable, thereby emphasizing the "subject" of emails located within the range of the slide bar B relative to the other emails.
- At this time, if there are graphs extracted as described above, the display control unit 15 displays in bold, as shown in FIG. 7, only the "subject" of each email that corresponds to an extracted graph and is located within the range of the slide bar B, to distinguish it from the other subjects.
- The display control unit 15 also displays the graphs generated from the emails located within the range of the slide bar B in the graph display area D1. Specifically, when the length of the slide bar B is set by the user's operation as shown in FIG. 6, the display control unit 15 displays in bold, based on the time associated with each email, the graphs generated from the emails whose time falls within the range of the slide bar B, distinguishing them from the graphs generated from the other emails. Note that in the example of FIG. 6, the display control unit 15 displays the graphs other than those generated from the emails within the range of the slide bar B in gray to make them less noticeable, so that the graphs generated from the emails within the range are emphasized relative to the other graphs. At this time, if there are graphs extracted as described above, the display control unit 15 displays in bold, as shown in FIG. 7, only those extracted graphs that were generated from the emails located within the range of the slide bar B, to distinguish them from the other graphs.
- In this way, the display control unit 15 changes the list of e-mails highlighted in the e-mail list display area D2 and the graphs highlighted in the graph display area D1 according to the specified time. This allows the user to easily recognize the chronological changes in the exchange of e-mails between each person.
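- The combination of time range and extraction in FIG. 6 and FIG. 7 can be sketched as a set intersection. The mapping from edges to source email ids, the ids themselves, and the name `edges_to_highlight` are assumptions for illustration.

```python
def edges_to_highlight(edge_sources, in_range_ids, extracted=None):
    """Pick edges generated from in-range emails, limited to extracted ones if given."""
    # An edge qualifies when at least one of its source emails is in range.
    in_range = {e for e, ids in edge_sources.items() if ids & in_range_ids}
    if extracted is None:
        return in_range              # FIG. 6 case: time range only
    return in_range & set(extracted)  # FIG. 7 case: time range and extraction


edge_sources = {
    ("Minami", "setting", "AA store"): {"mail-1"},
    ("Okumura", "I will visit you", "AA store"): {"mail-2"},
    ("Minami", "calls", "Sawada"): {"mail-2"},  # hypothetical edge
}
by_time = edges_to_highlight(edge_sources, {"mail-2"})
both = edges_to_highlight(
    edge_sources,
    {"mail-2"},
    extracted=[
        ("Minami", "setting", "AA store"),
        ("Okumura", "I will visit you", "AA store"),
    ],
)
```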
- First, the information processing device 10 accepts input of text data to be processed and stores it in the text data storage unit 16 (step S1).
- For example, the information processing device 10 accepts, as text data, input of e-mails that may be related to a case under police investigation.
- Each e-mail contains data such as a "header" including a "subject," "sender (From)," "destination (To)," and "time of sending (Time)," and a "body."
- Next, the information processing device 10 reads out the e-mails one by one from the storage device and performs preprocessing on the text data contained in the e-mails.
- In the preprocessing, the text data of the "body" contained in the e-mail is divided into sentences, and the subject, predicate, and object are extracted for each sentence.
- At this time, for a sentence that requires completion, the information processing device 10 uses other text data before and after the sentence or other information in the e-mail (e.g., the header) to newly complete the subject and object (step S2).
- The information processing device 10 then extracts the subject, predicate, and object for each sentence in which the subject and object have been completed and for each sentence in which completion is not required (step S3).
- The information processing device 10 then associates the extracted set of subject, predicate, and object with the e-mail from which they were extracted (step S4).
- Next, the information processing device 10 generates a graph in which the subject, predicate, and object generated for each sentence of the text data are represented in a graph structure consisting of nodes and edges connecting the nodes. Furthermore, the information processing device 10 connects multiple graphs according to the contents of the nodes of each graph to generate a connected graph. Then, the information processing device 10 displays the connected graph G on the display screen of the display device 30 (step S5). For example, the information processing device 10 displays the generated connected graph G in the graph display area D1 on the display screen of the display device 30, as shown in FIG. 3.
- The information processing device 10 also displays, in the email list display area D2 on the display screen, a list of header information of the emails that contain a sentence from which a graph included in the connected graph G was generated.
- The information processing device 10 also displays the body of the selected email in the email body display area D3 on the display screen.
- The information processing device 10 also displays a histogram showing the number of emails over time in the slide bar display area D4 on the display screen.
- The information processing device 10 then accepts user operations on the displayed connected graph and analyzes the graph. For example, when the information processing device 10 accepts the specification of a node by a user operation, it extracts a graph related to the specified node. The information processing device 10 then highlights the extracted graph to distinguish it from other graphs (step S6). For example, as shown in FIG. 4, if the user specifies the node "AA store" in the connected graph G as the specific node, the information processing device 10 extracts a graph including all edges and other nodes connected to the specific node "AA store", and highlights the extracted graph to distinguish it from other graphs, as shown by the thick lines in FIG. 4.
- The information processing device 10 also displays the original email from which the extracted graph was generated as described above, distinguishing it from other emails. For example, as shown in FIGS. 4 and 5, the information processing device 10 adds a black circle to the "subject" of the original email from which the extracted graph was generated in the email list display area D2. Furthermore, when the user selects the "subject" of an email in the email list display area D2, the information processing device 10 displays the "body" of the selected email in the email body display area D3. At this time, the information processing device 10 displays, in the email body display area D3, the text data that is the sentence from which the extracted graph was generated, distinguishing it from other text data.
- In this way, in this embodiment, the extracted graph is highlighted, allowing the user to easily recognize the relationships between people and the relationships between people and places. Furthermore, in this embodiment, the user can easily recognize the e-mail that corresponds to the extracted graph, and can also easily recognize the relevant text of the e-mail.
- The information processing device 10 also accepts user operations on the slide bar B displayed in the slide bar display area D4 and controls the display of graphs and email lists. For example, as shown by the bold lines in FIGS. 6 and 7, when the length of the slide bar B is changed by a user operation, the information processing device 10 highlights the list of emails located within the range of the slide bar B, or highlights the graphs generated from the emails located within the range of the slide bar B.
- The highlighted list of e-mails and the highlighted graphs change each time the length or position of the slide bar B is changed. This allows the user to easily recognize the chronological changes in the exchange of e-mails between each person.
- Fig. 9 and Fig. 10 are block diagrams showing the configuration of an information processing device in embodiment 2. Note that this embodiment shows an outline of the configuration of the information processing device described in the above embodiment.
- The information processing device 100 is configured as a general information processing device and, as an example, has the following hardware configuration.
- CPU (Central Processing Unit) 101
- ROM (Read Only Memory) 102
- RAM (Random Access Memory) 103
- Program group 104 loaded into the RAM 103
- Storage device 105 for storing the program group 104
- Drive device 106 that reads and writes data from and to a storage medium 110 outside the information processing device
- Communication interface 107 that connects to a communication network 111 outside the information processing device
- Input/output interface 108 for inputting and outputting data
- Bus 109 that connects each component
- Note that FIG. 9 shows an example of the hardware configuration of the information processing device 100, and the hardware configuration of the information processing device is not limited to the above-mentioned case.
- For example, the information processing device may be configured with only a part of the above-mentioned configuration, such as not having the drive device 106.
- The information processing device may also use a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), an MPU (Micro Processing Unit), an FPU (Floating point number Processing Unit), a PPU (Physics Processing Unit), a TPU (Tensor Processing Unit), a quantum processor, a microcontroller, or a combination of these.
- The information processing device 100 can be equipped with the generation unit 121 and the extraction unit 122 shown in FIG. 10 by having the CPU 101 acquire and execute the program group 104.
- The program group 104 is stored in advance in the storage device 105 or the ROM 102, for example, and is loaded into the RAM 103 and executed by the CPU 101 as necessary.
- The program group 104 may also be supplied to the CPU 101 via the communication network 111, or may be stored in advance in the storage medium 110, from which the drive device 106 reads out the programs and supplies them to the CPU 101.
- However, the generation unit 121 and the extraction unit 122 described above may instead be constructed of dedicated electronic circuits for realizing such means.
- The generation unit 121 generates a graph in which a plurality of types of preset sentence elements generated from text data are represented, according to the type of element, by nodes and edges connecting the nodes, and also generates a connected graph that connects a plurality of the graphs according to the contents of the nodes. For example, the generation unit 121 generates a connected graph by connecting the plurality of graphs at the locations of nodes with the same contents contained in each of the plurality of graphs.
- The extraction unit 122 extracts a graph having a preset relationship between nodes based on the connected graph. For example, the extraction unit 122 extracts a graph consisting of a specific node designated by the user, the edges connected to the specific node, and the other nodes connected to the specific node by those edges.
- With the above configuration, a graph having relationships between nodes is extracted from a connected graph generated from text. This makes it easy to recognize the relationships between people included in the extracted graph, and the relationships between people and places. As a result, even a huge amount of text data can be easily analyzed in a short time.
- Non-transitory computer-readable media include various types of tangible storage media.
- Examples of non-transitory computer-readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, hard disk drives), magneto-optical recording media (e.g., magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (Random Access Memory)).
- The program may also be supplied to a computer by various types of transitory computer-readable media. Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves.
- A transitory computer-readable medium can supply the program to the computer via a wired communication path, such as an electric wire or optical fiber, or via a wireless communication path.
- The present disclosure has been described above with reference to the embodiments, but the present disclosure is not limited to those embodiments.
- Various modifications that a person skilled in the art can understand may be made to the configuration and details of the present disclosure within its scope.
- At least one of the functions of the generation unit 121 and the extraction unit 122 described above may be executed by an information processing device installed and connected anywhere on a network, that is, by so-called cloud computing.
- (Appendix 1) An information processing device comprising: a generation unit that generates a graph in which a plurality of preset types of sentence elements generated from text data are represented by nodes corresponding to the types of the elements and by edges connecting the nodes, and that generates a connected graph in which a plurality of the graphs are connected according to the contents of the nodes; and an extraction unit that extracts the graph having a preset relationship between the nodes based on the connected graph.
- (Appendix 2) 2. The information processing device, wherein the generation unit generates the connected graph by connecting the plurality of graphs at the nodes having the same content that are included in each of the plurality of graphs, and the extraction unit extracts the graph including a specific node, an edge connected to the specific node, and another node connected to the specific node by the edge.
- (Appendix 3) 3.
- The information processing device, wherein the display control unit displays the extracted graph in a manner distinguished from other graphs based on time information associated with the text data from which the extracted graph was generated.
- (Appendix 5) 4. The information processing device, wherein the display control unit displays the graph generated from the text data associated with time information corresponding to a specified time in a manner distinguished from other graphs.
- (Appendix 6) 4. The display control unit displays, on the display device, the text data from which the extracted graph was generated.
- The information processing device, wherein the display control unit displays, on the display device, a list of titles associated with the text data, and displays the title associated with the text data from which the extracted graph was generated in a manner distinguished from other titles.
- (Appendix 8) 8. The information processing device according to claim 7, wherein the display control unit displays the title associated with the text data from which the extracted graph was generated in a manner distinguished from other titles based on time information associated with the text data from which the extracted graph was generated.
- The information processing method, comprising: generating the connected graph by connecting the plurality of graphs at the nodes having the same content that are included in each of the plurality of graphs; and extracting the graph consisting of a specific node, the edges connected to the specific node, and other nodes connected to the specific node by the edges.
- (Appendix 12) 12.
- (Appendix 13) 13. The information processing method according to claim 12, comprising: displaying the extracted graph in a manner distinguished from other graphs based on time information associated with the text data from which the extracted graph was generated.
- The information processing method according to claim 16, further comprising: displaying the title associated with the document data from which the extracted graph was generated in a manner distinguished from other titles, based on time information associated with the document data from which the extracted graph was generated.
- (Appendix 18) 18.
- (Appendix 19) A computer-readable storage medium that stores a program for causing a computer to execute a process comprising: generating a graph in which a plurality of preset types of sentence elements generated from text data are represented by nodes corresponding to the types of the elements and by edges connecting the nodes, and generating a connected graph in which a plurality of the graphs are connected according to the contents of the nodes; and extracting the graph having a preset relationship between the nodes based on the connected graph.
- 10 Information processing device
- 12 Preprocessing unit
- 13 Graph generating unit
- 14 Graph analyzing unit
- 15 Display control unit
- 16 Text data storage unit
- 17 Graph storage unit
- 100 Information processing device
- 101 CPU
- 102 ROM
- 103 RAM
- 104 Program group
- 105 Storage device
- 106 Drive device
- 107 Communication interface
- 108 Input/output interface
- 109 Bus
- 110 Storage medium
- 111 Communication network
- 121 Generation unit
- 122 Extraction unit
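
The roles described for the generation unit 121 and the extraction unit 122 can be illustrated with a minimal Python sketch. It is not the implementation of this disclosure. The function names, the element types (`"who"`, `"where"`), the sample sentences, and the rule that consecutive elements of a sentence are joined by an edge are all assumptions made for illustration. The sketch builds one small graph per sentence, connects the graphs at nodes with the same content, and then extracts the graph made of a specified node, its edges, and the nodes directly connected to it.

```python
def build_sentence_graph(elements):
    """Build one graph from a sentence's (element_type, content) pairs.

    Assumption for this sketch: consecutive elements are joined by an
    undirected edge. Nodes are keyed by their content.
    """
    nodes = {content: etype for etype, content in elements}
    contents = [content for _, content in elements]
    edges = {frozenset((a, b)) for a, b in zip(contents, contents[1:]) if a != b}
    return nodes, edges


def connect_graphs(graphs):
    """Connect several graphs at nodes that have the same content."""
    nodes, edges = {}, set()
    for g_nodes, g_edges in graphs:
        nodes.update(g_nodes)  # same content -> same key -> one shared node
        edges |= g_edges
    return nodes, edges


def extract_around(connected, target):
    """Extract the target node, its edges, and its directly connected nodes."""
    nodes, edges = connected
    kept_edges = {edge for edge in edges if target in edge}
    kept = {node for edge in kept_edges for node in edge}
    if target in nodes:
        kept.add(target)
    return {node: nodes[node] for node in kept}, kept_edges


# Hypothetical sentence elements, already split by type.
sentences = [
    [("who", "Alice"), ("where", "Tokyo")],
    [("who", "Bob"), ("where", "Tokyo")],
    [("who", "Bob"), ("where", "Osaka")],
]
connected = connect_graphs(build_sentence_graph(s) for s in sentences)
sub_nodes, sub_edges = extract_around(connected, "Tokyo")
print(sorted(sub_nodes))  # prints ['Alice', 'Bob', 'Tokyo']
```

Keying nodes by content makes the connection step a plain dictionary merge. A real system would probably also need to normalize node contents, for example spelling variants of the same name, before merging.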
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2023/003734 WO2024166155A1 (ja) | 2023-02-06 | 2023-02-06 | Information processing device, information processing method, and program |
| JP2024575868A JPWO2024166155A1 | 2023-02-06 | 2023-02-06 | |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2023/003734 WO2024166155A1 (ja) | 2023-02-06 | 2023-02-06 | Information processing device, information processing method, and program |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024166155A1 (ja) | 2024-08-15 |
Family
ID=92262664
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2023/003734 (Ceased) WO2024166155A1 (ja) | Information processing device, information processing method, and program | 2023-02-06 | 2023-02-06 |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JPWO2024166155A1 |
| WO (1) | WO2024166155A1 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2015125594A (ja) * | 2013-12-26 | 2015-07-06 | キヤノンマーケティングジャパン株式会社 | Information processing device, information processing method, and program |
| JP2015219901A (ja) * | 2014-05-19 | 2015-12-07 | ムジグマ・ビジネス・ソリューションズ・ピーブイティー・リミテッド | Business problem networking system and tool |
| JP2015225371A (ja) * | 2014-05-26 | 2015-12-14 | インターナショナル・ビジネス・マシーンズ・コーポレーション (International Business Machines Corporation) | Method for searching for related nodes, and computer and computer program therefor |
| US20160275180A1 (en) * | 2015-03-19 | 2016-09-22 | Abbyy Infopoisk Llc | System and method for storing and searching data extracted from text documents |
| JP2020098387A (ja) * | 2018-12-17 | 2020-06-25 | 株式会社日立製作所 | Causal relationship display system and method |
| JP2021196785A (ja) * | 2020-06-12 | 2021-12-27 | 株式会社日立社会情報サービス | Text mining device and text mining method |
2023
- 2023-02-06 JP JP2024575868A (JPWO2024166155A1): Pending
- 2023-02-06 WO PCT/JP2023/003734 (WO2024166155A1): Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2024166155A1 | 2024-08-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12314317B2 (en) | | Video generation |
| TWI737006B (zh) | | Cross-modal information retrieval method, device, and storage medium |
| US11086861B2 (en) | | Translating a natural language query into a formal data query |
| AU2017290063B2 (en) | | Apparatuses, methods and systems for relevance scoring in a graph database using multiple pathways |
| US10262080B2 (en) | | Enhanced search suggestion for personal information services |
| CN108196920B (zh) | | Display processing method and device for a UI interface |
| CN113656587B (zh) | | Text classification method, device, electronic equipment, and storage medium |
| US20080306899A1 (en) | | Methods, apparatus, and computer-readable media for analyzing conversational-type data |
| US11080070B2 (en) | | Automated user interface analysis |
| US11943181B2 (en) | | Personality reply for digital content |
| US20150121200A1 (en) | | Text processing apparatus, text processing method, and computer program product |
| CN114267375B (zh) | | Phoneme detection method and device, training method and device, equipment, and medium |
| US12361296B2 (en) | | Environment augmentation based on individualized knowledge graphs |
| CN118626717A (zh) | | Fine-tuning method for a large language model, resource recommendation method, device, and equipment |
| WO2024166155A1 (ja) | | Information processing device, information processing method, and program |
| Patel et al. | | Fake review detection using opinion mining |
| CN109299443A (zh) | | News text deduplication method based on minimum vertex cover |
| US20250315435A1 (en) | | System and method for responding to queries |
| JP4671440B2 (ja) | | Reputation relationship extraction device, method therefor, and program |
| CN119474877A (zh) | | Metaverse intelligent interaction training system |
| CN112446214A (zh) | | Method, device, equipment, and storage medium for generating advertisement keywords |
| JP2020071737A (ja) | | Learning method, learning program, and learning device |
| JP6379742B2 (ja) | | Information display control device and program |
| KR102282328B1 (ko) | | System and method for predicting country-specific preferences using LSTM |
| CN114880451A (zh) | | Retrieval-based dialogue generation method, device, and electronic equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23920993; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2024575868; Country of ref document: JP; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024575868; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 23920993; Country of ref document: EP; Kind code of ref document: A1 |