WO2021244657A1 - Information stream extraction method, apparatus and device - Google Patents

Information stream extraction method, apparatus and device

Info

Publication number
WO2021244657A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
entity
entity information
association
client
Prior art date
Application number
PCT/CN2021/098541
Other languages
French (fr)
Chinese (zh)
Inventor
德斯潘德S
库玛庞卡
乌玛尚卡尔希夫尚卡尔
汉斯马库斯
Original Assignee
智慧芽信息科技(苏州)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 智慧芽信息科技(苏州)有限公司
Publication of WO2021244657A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services
    • G06Q50/184 Intellectual property management

Definitions

  • This specification relates to the technical field of computer data processing, in particular, to an information flow extraction method, device and equipment.
  • The purpose of the embodiments of this specification is to provide an information flow extraction method, device, and equipment that can greatly improve the accuracy and comprehensiveness of information queries.
  • This specification provides an information flow extraction method, device, and equipment that are implemented in the following ways:
  • An information flow display method, including:
  • the client sends an information flow acquisition request to the server, where the information flow acquisition request includes the input information acquired by the client;
  • the server receives the information flow acquisition request, and extracts one or more pieces of entity information from the input information as target entity information;
  • the server extracts, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • the server uses the association relationship information to associate the target entity information with the associated entity information, obtains the information flow corresponding to the target entity information, and sends the information flow to the client for display on the client.
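  • As an illustrative sketch only (not the implementation disclosed in this specification), the server-side steps above could be modeled as follows. The `Association` record, the dictionary-style entity matching in `handle_request`, and labels such as "used_in" are all hypothetical assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One extracted relationship, with a direction (source -> target) and a type."""
    source: str       # entity the association points from
    target: str       # entity the association points to
    kind: str         # association type label (hypothetical, e.g. "used_in")
    data_source: str  # identifier of the document the relation came from


def build_information_flow(target_entity, associations):
    """Keep only the associations that touch the target entity, preserving direction."""
    return [a for a in associations if target_entity in (a.source, a.target)]


def handle_request(input_text, known_entities, associations):
    """Find target entities in the client's input, then build one flow per target.

    Entity recognition is reduced to a case-insensitive substring match against
    a known vocabulary; this is an assumption, not the specification's method.
    """
    text = input_text.lower()
    targets = [e for e in known_entities if e.lower() in text]
    return {t: build_information_flow(t, associations) for t in targets}
```

  • In this sketch the returned dictionary stands in for the information flow that would be sent to the client; a real system would serialize it and would use a trained entity recognizer rather than substring matching.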
  • The embodiments of this specification also provide an information flow extraction method, including:
  • extracting, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • using the association relationship information to associate the target entity information with the associated entity information to obtain the information flow corresponding to the target entity information.
  • Using the association relationship information to associate the target entity information with the associated entity information includes:
  • associating the target entity information with the corresponding associated entity information to obtain the sub-information stream of the target entity information;
  • linking the sub-information streams obtained from the same or different data sources to obtain the information stream corresponding to the target entity information.
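  • A minimal sketch of the linking step, assuming each sub-information stream is a list of (source entity, association type, target entity) triples keyed by a data-source identifier. The function name `link_sub_streams` and the triple layout are hypothetical; the specification does not prescribe a data structure.

```python
from collections import defaultdict


def link_sub_streams(target_entity, sub_streams):
    """Merge per-source sub-streams into one stream, using the target entity as the anchor.

    sub_streams maps a data-source id to a list of
    (source_entity, association_type, target_entity) triples.
    The result maps each triple to the sorted list of data sources that support it,
    so every relation stays associated with where it was found.
    """
    merged = defaultdict(set)
    for source_id, triples in sub_streams.items():
        for triple in triples:
            # Only relations that involve the target entity join the stream.
            if target_entity in (triple[0], triple[2]):
                merged[triple].add(source_id)
    return {triple: sorted(sources) for triple, sources in merged.items()}
```

  • Keeping the supporting data sources next to each relation is one way to preserve the link between a sub-information stream and the document it came from.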
  • When the association relationship information includes co-located entity description information, the associated entity information corresponding to the co-located entity description information is added to the target entity information, and the target entity information is updated; the co-located entity description information is description information indicating that the associated entity information is another form of expression of the target entity information;
  • at least one piece of associated entity information corresponding to the updated target entity information and the association relationship information between the target entity information and the associated entity information are extracted from the data source.
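  • The update described above can be sketched as an alias expansion followed by a second extraction pass. The "same_as" label for co-located entity descriptions, the triple layout, and the choice to repeat the expansion until no new alias appears are assumptions made for this example.

```python
def expand_with_aliases(target_names, associations, alias_type="same_as"):
    """Add co-located (alias) entities to the target set until nothing new is found.

    associations is an iterable of (source, association_type, target) triples.
    """
    names = set(target_names)
    changed = True
    while changed:
        changed = False
        for src, kind, dst in associations:
            if kind != alias_type:
                continue
            if src in names and dst not in names:
                names.add(dst)
                changed = True
            elif dst in names and src not in names:
                names.add(src)
                changed = True
    return names


def extract_for_targets(names, associations, alias_type="same_as"):
    """Re-extract the non-alias relations that involve any name of the updated target."""
    return [
        (s, k, d)
        for s, k, d in associations
        if k != alias_type and (s in names or d in names)
    ]
```

  • For example, if "PET" is recorded as another expression of "polyethylene terephthalate", relations stated only about "PET" are picked up for the full-name target as well.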
  • the method further includes:
  • Extracting at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information from the data source includes:
  • when the context information includes a material process entity, a material name entity, and/or a material application entity,
  • extracting, according to the material process information association method, the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information.
  • the method further includes:
  • the manufacturer or supplier corresponding to the target entity information or associated entity information is extracted according to the product name, and the manufacturer or supplier is associated with the target entity information or associated entity information.
  • the method further includes:
  • the parameter entity information includes entity information corresponding to at least one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measurement entity;
  • the extracted parameter entity information is associated with the corresponding target entity information and associated entity information.
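  • To illustrate how attribute, measurement, and unit entities might be pulled from text and attached to a material, here is a small sketch. The `KNOWN_ATTRIBUTES` vocabulary and the regular expression are hypothetical stand-ins; the specification does not state how parameter entities are recognized.

```python
import re
from dataclasses import dataclass


@dataclass
class Parameter:
    attribute: str  # material attribute entity, e.g. "density"
    value: float    # measurement entity
    unit: str       # unit entity


# Hypothetical attribute vocabulary used only for this example.
KNOWN_ATTRIBUTES = ("tensile strength", "density")


def extract_parameters(sentence):
    """Find '<attribute> ... <number> <unit>' mentions for each known attribute."""
    found = []
    for attribute in KNOWN_ATTRIBUTES:
        pattern = (
            re.escape(attribute)
            + r"\D*?(\d+(?:\.\d+)?)\s*([A-Za-z%°µ][A-Za-z0-9/%°µ]*)"
        )
        match = re.search(pattern, sentence)
        if match:
            found.append(Parameter(attribute, float(match.group(1)), match.group(2)))
    return found


def attach_parameters(material, sentence):
    """Associate the extracted parameter entities with the material they describe."""
    return {material: extract_parameters(sentence)}
```

  • Calling `attach_parameters("PET", "PET film with a tensile strength of 55 MPa and a density of 1.38 g/cm3")` links two parameter records to the material, each carrying its attribute, measured value, and unit.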
  • the method further includes: associating the sub-information stream of the target entity information with the corresponding data source.
  • the method further includes:
  • an embodiment of this specification also provides an information stream extraction device, including:
  • the first obtaining module is used to obtain target entity information
  • the first extraction module is configured to extract, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • the first association module is configured to use the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information.
  • the first association module includes:
  • the first associating unit is used to obtain at least one piece of associated entity information extracted from the data source and the association relationship information between the target entity information and the corresponding associated entity information, and to use that association relationship information to associate the target entity information with the corresponding associated entity information, obtaining the sub-information stream of the target entity information;
  • the second associating unit is used to link the sub-information streams obtained from the same or different data sources using the target entity information as the reference information to obtain the information stream corresponding to the target entity information.
  • When the association relationship information includes co-located entity description information, the device further includes:
  • the update module is used to add the associated entity information corresponding to the co-located entity description information to the target entity information, and update the target entity information; the co-located entity description information is description information indicating that the associated entity information is another form of expression of the target entity information;
  • the first extraction module is further configured to extract at least one associated entity information corresponding to the updated target entity information and the association relationship information between the target entity information and the associated entity information from the data source.
  • the device further includes:
  • the second extraction module is used to take each piece of associated entity information extracted from the data source as target entity information and extract its information flow, obtaining the information flows corresponding to multiple pieces of target entity information;
  • the second association module is used to link the information streams corresponding to multiple target entity information with the corresponding target entity information as the reference information to obtain the information graph.
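  • One way to picture the resulting information graph is as a breadth-first expansion: each associated entity becomes a new target, and its relations are added to the graph. The sketch below assumes relations are (source, type, target) triples; the `max_depth` cut-off is a hypothetical parameter added so the expansion terminates, not something the specification defines.

```python
from collections import deque


def build_information_graph(seed, associations, max_depth=2):
    """Expand from the seed entity, treating each newly found entity as a new target.

    associations is a list of (source, association_type, target) triples.
    Returns the set of triples reached within max_depth expansion steps.
    """
    edges = set()
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand entities at the depth limit
        for s, k, t in associations:
            if entity not in (s, t):
                continue
            edges.add((s, k, t))  # direction and type are kept as extracted
            neighbor = t if s == entity else s
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return edges
```

  • Because every edge keeps its original direction and type, the graph can be rendered with the same labels that the individual information flows use.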
  • the first extraction module includes:
  • the retrieval unit is used to retrieve the data source where the target entity information is located;
  • the positioning unit is used to locate the context information where the target entity information is located in the data source;
  • the extraction unit is configured to, when the context information includes material process entities, material name entities, and/or material application entities, extract according to the material process information association method the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information.
  • the device further includes:
  • the third extraction module is used to extract the commodity name of the target entity information and of the associated entity information from the data source;
  • the third association module is used to extract the manufacturer or supplier corresponding to the target entity information or the associated entity information according to the commodity name, and associate the manufacturer or supplier with the target entity information or the associated entity information.
  • the device further includes:
  • the fourth extraction module is used to extract the parameter entity information of the target entity information and of the associated entity information from the data source;
  • the parameter entity information includes entity information corresponding to at least one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measurement entity;
  • the fourth association module is used to associate the extracted parameter entity information with corresponding target entity information and associated entity information.
  • the first association module further includes:
  • the third associating unit is used for associating the sub-information stream of the target entity information with the corresponding data source.
  • the device further includes:
  • the visualization processing module is used to treat each entity information in the information flow as an information node for interacting with the user, and use a visualization method to visualize the information flow;
  • the first sending module is configured to send the processed information stream to the client, so that the user can view the information stream and trigger the information node through the client;
  • the second sending module is configured to feed back other information associated with the information node to the client based on the user's triggering operation on the information node.
  • The embodiments of this specification also provide an information flow extraction device.
  • The device includes a processor and a memory for storing processor-executable instructions. When the instructions are executed by the processor, the steps of any one or more of the described methods are implemented.
  • the embodiment of this specification also provides an information flow display method, which is applied to a server, and includes:
  • extracting, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • using the association relationship information to associate the target entity information with the associated entity information to obtain the information flow corresponding to the target entity information;
  • sending the information stream to the client for display on the client.
  • the embodiment of this specification also provides a server, including:
  • the first receiving module is configured to receive an information flow acquisition request sent by the client, where the information flow acquisition request includes input information acquired by the client;
  • the fifth extraction module is used to extract one or more entity information from the input information as target entity information
  • the sixth extraction module is configured to extract, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • a fifth association module configured to use the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information
  • the third sending module is configured to send the information stream to the client for display on the client.
  • the embodiment of this specification also provides an information flow display method, which is applied to the client, including:
  • sending an information flow acquisition request to the server, where the information flow acquisition request includes the input information obtained by the client, so that the server receives the information flow acquisition request and extracts one or more pieces of entity information from the input information as the target entity information; extracts, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information; and uses the association relationship information to associate the target entity information with the associated entity information, obtains the information flow corresponding to the target entity information, and sends the information flow to the client;
  • the embodiment of this specification also provides a client, including:
  • the fourth sending module is configured to send an information flow acquisition request to the server, where the information flow acquisition request includes the input information obtained by the client, so that the server receives the information flow acquisition request and extracts one or more pieces of entity information from the input information as target entity information; extracts, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information; and uses the association relationship information to associate the target entity information with the associated entity information to obtain the information flow corresponding to the target entity information, and sends the information stream to the client;
  • the second receiving module is used to receive the information stream sent by the server
  • the first display module is used to display the information stream.
  • the embodiment of this specification also provides an information retrieval method applied to a server, including:
  • the data source where the entity information is located is determined according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • the search result is sent to the client, so that the client can display it.
  • the embodiment of this specification also provides a server, including:
  • the third receiving module is configured to receive a search request sent by the client, where the search request includes input information obtained by the client;
  • the seventh extraction module is used to extract entity information from the input information, where the entity information includes entity type and entity value;
  • the data source determining module is configured to determine the data source where the entity information is located according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • the fifth sending module is configured to send the search result to the client, so that the client can display it.
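  • As a rough sketch of relation-aware retrieval, the function below matches a data source only if it contains every queried entity, and ranks sources by how many extracted associations directly connect two queried entities. Entities are modeled as (entity type, entity value) pairs as described above; the index layout and the scoring rule are assumptions made for illustration.

```python
def search_data_sources(query_entities, index):
    """Return (data_source_id, score) pairs, best match first.

    query_entities: list of (entity_type, entity_value) pairs from the input.
    index: data-source id -> list of (source_entity, association_type, target_entity),
           where each entity is itself an (entity_type, entity_value) pair.
    """
    wanted = set(query_entities)
    results = []
    for source_id, triples in index.items():
        present = {entity for s, _, t in triples for entity in (s, t)}
        if not wanted <= present:
            continue  # the source must mention every queried entity
        # Score: number of relations that directly link two queried entities.
        score = sum(1 for s, _, t in triples if s in wanted and t in wanted)
        results.append((source_id, score))
    return sorted(results, key=lambda r: (-r[1], r[0]))
```

  • Ranking by connecting relations is meant to separate documents in which the queried entities are actually related from documents in which they merely co-occur.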
  • the embodiment of this specification also provides an information retrieval method, which is applied to the client, including:
  • sending a search request to the server, where the search request includes the input information obtained by the client, so that the server extracts the entity information from the input information, the entity information including the entity type and the entity value; determines the data source where the entity information is located according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information; and sends the search result to the client;
  • the information input interface displayed by the client includes an information input area, a first selection list, and/or a second selection list; the information input area is used for information input; the first selection list and the second selection list are used for information selection; the first selection list includes category information of material application or material structure; the second selection list includes category information of entity type;
  • the client terminal obtains input information based on the information input area, the first selection list, and/or the second selection list.
  • the first selection list and/or the second selection list adopts an interactive visualization format to display the information to be selected; the interactive visualization format is a list selection format in which the category information of the entity type, material structure, or material application is displayed visually, and the corresponding category and subcategory are determined by receiving a trigger operation on each piece of category information.
  • the embodiment of this specification also provides a client, including:
  • the sixth sending module is configured to send a search request to the server, where the search request includes input information obtained by the client, so that the server extracts entity information from the input information, the entity information including entity type and entity value; determines the data source where the entity information is located according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information; and sends the search result to the client;
  • the fourth receiving module is configured to receive the retrieval result sent by the server
  • the second display module is used to display the retrieval result.
  • the client further includes:
  • the input interface display module is used to display an information input interface.
  • the information input interface includes an information input area, a first selection list and/or a second selection list; the information input area is used for information input; the first selection list and the second selection list are used for information selection; the first selection list includes category information of material application or material structure; the second selection list includes category information of entity type;
  • the input information obtaining module is configured to obtain input information based on the information input area, the first selection list and/or the second selection list.
  • the embodiment of this specification also provides a method for generating summary information, which is applied to a server, and includes:
  • the entity information and the association relationship information between the entity information are extracted from the data source, and the corresponding entity information is associated according to the extracted association relationship information; the association relationship information includes information describing the association direction and the association type between the entity information;
  • the summary information is generated according to the associated entity information, and the generated summary information is sent to the client, so that the client can display it.
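  • The specification does not say how the summary text is produced from the associated entity information. Purely as an illustration, a template-based approach could turn each (source, type, target) relation into one sentence; the `TEMPLATES` table and its association labels are hypothetical.

```python
# Hypothetical sentence templates keyed by association type.
TEMPLATES = {
    "produces": "{s} produces {t}",
    "used_in": "{s} is used in {t}",
}


def generate_summary(associations):
    """Render each directed, typed relation as a sentence and join the sentences.

    associations is a list of (source, association_type, target) triples.
    Unknown association types fall back to a generic phrasing.
    """
    sentences = []
    for s, kind, t in associations:
        template = TEMPLATES.get(kind, "{s} is related to {t}")
        sentences.append(template.format(s=s, t=t))
    if not sentences:
        return ""
    return ". ".join(sentences) + "."
```

  • Because the association direction is preserved in each triple, the template can place the source and target entities in the correct grammatical roles.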
  • the embodiment of this specification also provides a server, including:
  • the fifth receiving module is configured to receive a summary information generation request sent by the client, where the summary information generation request includes the data source for which the summary is to be generated;
  • the eighth extraction module is used to extract entity information and the association relationship information between the entity information from the data source; wherein the association relationship information includes information describing the association direction and the association type between the entity information;
  • the sixth association module is used to associate corresponding entity information according to the extracted association relationship information
  • the generation module is used to generate summary information based on the associated entity information
  • the seventh sending module is configured to send the generated summary information to the client, so that the client can display it.
  • the embodiment of this specification also provides a method for generating summary information, which is applied to the client, and includes:
  • sending a summary information generation request to the server, where the summary information generation request includes the data source for which the summary is to be generated, so that the server extracts the entity information and the association relationship information between the entity information from the data source, and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and the association type between the entity information; and generates summary information according to the associated entity information and sends the generated summary information to the client;
  • the embodiment of this specification also provides a client, including:
  • the eighth sending module is configured to send a summary information generation request to the server, where the summary information generation request includes the data source for which the summary is to be generated, so that the server extracts entity information and the association relationship information between the entity information from the data source, and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and the association type between the entity information; and generates summary information based on the associated entity information and sends the generated summary information to the client;
  • the fifth receiving module is configured to receive the summary information sent by the server
  • the third display module is used to display the summary information.
  • The information flow extraction method, device, and equipment provided by one or more embodiments of this specification can extract entity information and the association relationship information between entity information at the same time, so that the association relationship information can be used to further verify and screen the extracted entity information and determine whether it is related to the target entity information, thereby improving the accuracy of extracting entity information associated with the target entity information and effectively filtering noise.
  • The association relationship information can also be used to associate the extracted entity information with the target entity information and to display the association relationship between them, which makes the results easier for users to view and sort out, so that users can accurately and efficiently obtain the useful information they need, find needed or new solutions, and have a better user experience.
  • Figure 1 is a schematic flow diagram of an information flow display method provided in this specification.
  • FIG. 2 is a schematic diagram of the information flow in an embodiment provided in this specification.
  • FIG. 3 is a schematic diagram of the information flow in another embodiment provided in this specification.
  • Figure 4 is a schematic diagram of the information map in another embodiment provided in this specification.
  • Fig. 5 is a schematic diagram of the information map in another embodiment provided in this specification.
  • Fig. 6 is a schematic diagram of an information map in another embodiment provided in this specification.
  • Fig. 7 is a schematic diagram of an information map in another embodiment provided in this specification.
  • Fig. 8 is a schematic diagram of an information map in another embodiment provided in this specification.
  • Fig. 9 is a schematic diagram of an information map in another embodiment provided in this specification.
  • FIG. 10 is a schematic diagram of the information map in another embodiment provided in this specification.
  • FIG. 11 is a schematic diagram of the information map in another embodiment provided in this specification.
  • FIG. 12 is a schematic diagram of the information map in another embodiment provided in this specification.
  • FIG. 13 is a schematic diagram of the information map in another embodiment provided in this specification.
  • FIG. 14 is a schematic diagram of the visual display of the information graph in another embodiment provided in this specification.
  • Fig. 15 is a schematic diagram of material attribute classification in another embodiment provided in this specification.
  • Figure 16 is a schematic diagram of using units and metrics to link materials to attribute lists from different attribute classes in another embodiment provided in this specification.
  • FIG. 17 is a schematic diagram of extracting attributes, units, and metrics in another embodiment provided in this specification.
  • FIG. 18 is a schematic diagram of linking the extracted attributes, units, and metrics to corresponding materials in another embodiment provided in this specification.
  • Figure 19 is a schematic diagram of extracting attributes, units, and metrics in another embodiment provided in this specification.
  • FIG. 20 is a schematic diagram of extracting attributes, units, and metrics in another embodiment provided in this specification.
  • FIG. 21 is a schematic diagram of extracting attributes, units, and metrics in another embodiment provided in this specification.
  • Figure 22 is a schematic diagram of linking extracted attributes, units, and metrics to corresponding materials in another embodiment provided in this specification.
  • Figure 23 is a schematic flow chart of an information flow extraction method provided in this specification.
  • Figure 24 is a schematic diagram of a model structure of an information flow extraction device provided in this specification.
  • Figure 25 is a schematic flow chart of an information retrieval method provided in this specification.
  • FIG. 26 is a schematic diagram of a search interface and search results in another embodiment provided in this specification.
  • Figure 27 is a schematic diagram of a retrieval interface in another embodiment provided in this specification.
  • FIG. 28 is a schematic diagram of retrieval results in another embodiment provided in this specification.
  • FIG. 29 is a schematic diagram of application classification in another embodiment provided in this specification.
  • FIG. 30 is a schematic diagram of the material structure classification interactive interface in another embodiment provided in this specification.
  • FIG. 31 is a schematic diagram of the classification of polymers in another embodiment provided in this specification according to the material structure.
  • FIG. 32 is a schematic diagram of an application classification interaction interface in another embodiment provided in this specification.
  • FIG. 33 is a further detailed classification diagram of smart materials in another embodiment provided in this specification.
  • FIG. 34 is a schematic diagram of the retrieval result in another embodiment provided in this specification.
  • FIG. 35 is a schematic diagram of a retrieval interface in another embodiment provided in this specification.
  • FIG. 36 is a schematic diagram of retrieval results in another embodiment provided in this specification.
  • Fig. 37 is a schematic diagram of an image of the protein structure, DNA plasmid, and material microstructure in another embodiment provided in this specification.
  • FIG. 38 is an image schematic diagram of a circuit diagram and a flowchart in another embodiment provided in this specification.
  • Fig. 39 is a schematic diagram of material attribute classification in another embodiment provided in this specification.
  • FIG. 40 is a schematic flowchart of a method for generating an abstract provided in this specification.
  • the user can search for information through the client.
  • the client can be a mobile device.
  • the client can be a smart phone, a tablet electronic device, a portable computer, a personal digital assistant (PDA), a vehicle-mounted device, or a smart wearable device, etc.
  • the client can also be a desktop device.
  • the client can be a server, an industrial computer (industrial control computer), a personal computer (PC), an all-in-one machine, or an intelligent self-service terminal (kiosk), etc.
  • the user can enter information in the page displayed on the client.
  • the input information is the reference information that the user wants to search for. For example, if the user wants to look up information related to alumina ceramics, the user can enter "alumina ceramics" in the page displayed on the client, and the client can send an information search request to the server; the search request can carry the input information "alumina ceramics".
  • the server may extract the input information "alumina ceramics" from the information search request, and use the input information "alumina ceramics" as reference information to extract entity information or data sources associated with the input information "alumina ceramics".
  • the server can be provided with its own storage or can be linked to a database. After the server receives the information search request sent by the client, it can search for information using the data source in its own storage or the data source in the linked database. After the server completes the information search, it can feed the results back to the client, so that the client can display them.
  • the associated entity information of the input information, and the association type and direction between the input information and the associated entity information, can be extracted from the data source. The extracted associated entity information can then be associated with the input information by using the association relationship information, forming an information flow corresponding to the input information. The information flow can then be shown to the user, so that the required information and the relationships within it can be viewed visually.
  • A specific embodiment is shown in Fig. 1.
  • the method may include:
  • S20: The client sends an information flow acquisition request to the server, where the information flow acquisition request includes input information acquired by the client.
  • the client can send an information flow acquisition request to the server.
  • the information flow acquisition request may include user input information acquired by the client. For example, the client may provide an information input box, and the user may enter information manually or select it from a drop-down selection box displayed by the client. The client may then respond to the user's operation by generating an information flow acquisition request that carries the input information, and send the request to the server.
  • the display interface of the client can be provided with a search button or an information flow generation button.
  • the user can trigger the search instruction by clicking the button or pressing the Enter key on the keyboard, and the client can then generate the information flow acquisition request in response to the user's operation and send it to the server.
  • the specific implementation manners of the foregoing information input, user operation, and information flow acquisition request are only examples, and the specific implementation is not limited to the foregoing manners.
  • the client can show the user an information input box and an information flow generation button.
  • for example, the user can enter the information "polyvinyl alcohol" in the input box and then click the information flow generation button; the client can respond to the click by generating an information flow acquisition request to obtain all or part of the information about the material.
  • alternatively, the user can enter the information "organic electroluminescent display coating" in the input box to obtain all or part of the information about this material application, from the raw material to the material application.
  • other types of information can also be input to obtain information related to the input information.
  • the user can also define the extraction direction of the required information flow or the start node and end node of the information flow.
  • for example, the information flow of a certain material can be limited to its process, so as to obtain information related to the material process of that material.
  • the user can enter "polyvinyl alcohol material technology" or "polyvinyl alcohol and processing technology” on the page displayed on the client.
  • the page displayed by the client is also provided with a drop-down selection box, and the drop-down selection box is set with one or more information extraction direction selection items, such as material technology, material application and other options.
  • the user can further select the information extraction direction from the drop-down selection box to obtain the corresponding information.
  • the "polyvinyl alcohol” and the extraction direction "processing technology” can be attached to the retrieval request together and sent to the server.
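  • As an illustration of the request described above, the following minimal sketch shows how a client might package the input information together with an optional extraction direction. The class name `InfoFlowRequest`, the field names, and the JSON encoding are assumptions made for this example; the specification does not prescribe a request format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class InfoFlowRequest:
    """Hypothetical payload for an information flow acquisition request."""
    input_information: str
    # Optional restriction such as "processing technology" or "material application".
    extraction_direction: Optional[str] = None

    def to_json(self) -> str:
        # Omit fields the user did not set, so the payload carries only provided data.
        payload = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(payload, ensure_ascii=False)


request = InfoFlowRequest("polyvinyl alcohol", "processing technology")
body = request.to_json()
print(body)
```

  • Keeping the extraction direction as a separate optional field corresponds to the drop-down selection described above; typing "polyvinyl alcohol and processing technology" into a single input box would instead leave the separation to the server.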
  • S22: The server receives the information flow acquisition request, and extracts one or more entity information from the input information as target entity information.
  • After the server receives the information flow acquisition request, it may extract one or more entity information from the input information in the request as target entity information, to be used for extracting the information flow.
  • the entity information may include information such as a proper name or a meaningful quantitative phrase recognized from the data source.
  • the proprietary name may include, for example, material name, specific material application, manufacturing method, material attribute, trade name, and so on.
  • meaningful quantitative phrases may include, for example, measured values of material properties, dates, and so on.
  • the entity information may include entity type and entity value.
  • the entity type may refer to information that describes the category to which the information with common characteristics belongs after categorizing the information.
  • the entity value may include specific information corresponding to each entity type in the data source.
  • the data source may include relatively closed text information such as patent documents, dissertation documents, and text information in open source databases. The information in the data source is huge and the type of information is complex, and the workload of information analysis and extraction is usually large. By combining entity information to extract information, the accuracy and efficiency of information extraction can be greatly improved.
  • the entity type may include, for example, material name entity, material structure type entity, material application entity, material process entity, material attribute entity, material trade name entity, material supplier/manufacturer entity, unit entity, measurement entity, and so on.
  • the material process entity may also include a process method entity between materials, a manufacturing method entity between a material and a material application, an intermediate material entity in a processing or manufacturing process, and the like.
  • the intermediate material entity may include, for example, an additive entity, a catalyst entity, and the like.
  • the material name entity may include the entity type corresponding to the description information of the material name to determine what kind of material.
  • the entity value corresponding to the material name entity can be, for example, chemical, physical or biological description information with the name of the material, such as polyvinyl alcohol, zirconia, martensitic stainless steel, etc.
  • the material structure type entity may include an entity type corresponding to the description information of the type to which each material belongs, determined according to the material structure.
  • the entity value of the material structure type entity can be metal, ceramic, biology and so on.
  • the material application entity may include an entity type corresponding to the description information of the application corresponding to the material.
  • the entity value of the material application entity may be, for example, the application field of each material or specific application information.
  • the application fields may include building materials, energy materials, 3D printing materials, optoelectronic materials, and so on.
  • Specific applications may include organic electroluminescent display coatings, packaging bags, and so on.
  • the material process entity may include the entity type corresponding to the description information of the process/manufacturing method of obtaining another one or more materials or material application from one or more materials.
  • the entity value of the material process entity may be, for example, description information of the process/manufacturing method, such as polymerization, hydrolysis, alloying, metal refining, stretching, finishing, etc., as well as additives, catalysts, etc. used in the material processing or manufacturing process.
  • the material property entity may include the entity type corresponding to the description information that characterizes the internal characteristics such as the structure and performance of the material.
  • the entity value of the material property entity may include, for example, melting point, thermal conductivity, specific heat capacity, yield strength, elastic modulus, and so on.
  • the material commodity name entity may include the entity type corresponding to the commodity name description information used in the actual production and sales of the material.
  • the material supplier/manufacturer entity may include the entity type corresponding to the supplier/manufacturer description information of each material.
  • the unit entity may include the entity type corresponding to the measurement unit description information of the specific value of the attribute information of each material.
  • the measurement entity may include the entity type corresponding to the specific value description information of the attribute information of each material.
  • the unit of melting point can include degrees Celsius, Kelvin, etc.
  • the corresponding degrees Celsius and Kelvin are the entity values corresponding to the unit entity.
  • for example, in "the melting point of tungsten is 3410 degrees Celsius", melting point is the entity value of the material property entity, 3410 is the entity value of the measurement entity, and degrees Celsius is the entity value of the unit entity.
  • for example, the entity information that can be extracted is "material name entity-polyvinyl alcohol". In the entity information expression form "A-B", A is the entity type and B is the entity value; that is, the material name entity is the entity type and polyvinyl alcohol is the entity value.
  • the foregoing entity type and corresponding entity value expression methods are only examples for illustration, and methods such as association or table storage may also be used for specific data processing.
  • the entity information in the embodiments of this specification uniformly adopts the form of "A-B”.
  • the entity information that can be extracted for the aforementioned "melting point of tungsten is 3410 degrees Celsius” is "material name entity-tungsten", “material attribute entity-melting point”, “unit entity-degrees Celsius”, and "metric entity-3410".
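  • The "A-B" form above can be sketched as a small data structure. The names `EntityInfo` and `parse_entity` are hypothetical and chosen only for illustration; as noted above, association or table storage could be used instead.

```python
from typing import NamedTuple


class EntityInfo(NamedTuple):
    """Entity information as an (entity type, entity value) pair."""
    entity_type: str
    entity_value: str

    def to_text(self) -> str:
        # Render in the "A-B" form used throughout this specification.
        return f"{self.entity_type}-{self.entity_value}"


def parse_entity(text: str) -> EntityInfo:
    # Split on the first hyphen only, so entity values that contain hyphens stay whole.
    entity_type, entity_value = text.split("-", 1)
    return EntityInfo(entity_type.strip(), entity_value.strip())


# The four pieces of entity information from the tungsten example.
tungsten_example = [
    EntityInfo("material name entity", "tungsten"),
    EntityInfo("material attribute entity", "melting point"),
    EntityInfo("unit entity", "degrees Celsius"),
    EntityInfo("metric entity", "3410"),
]
for entity in tungsten_example:
    print(entity.to_text())
```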
  • the method for extracting entity information can be obtained by learning and training a large amount of sample data.
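  • a natural language processing (Natural Language Processing, NLP) algorithm may be used to process and tokenize the input information. A named entity recognition (Named Entity Recognition, NER) algorithm may then be used to identify the entity information in the input information, for example based on an entity information dictionary or an entity information database.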
  • the entity information dictionary may include common professional word information in a designated field constructed in advance based on a large amount of sample data, and the entity information dictionary may for example be a database including professional word information such as material names, material attributes, and material applications in the material field.
  • the entity information database may include a database pre-built by the platform that includes common professional word information in a specified field, such as an entity information database that includes professional word information such as material names, material attributes, and material applications in the material field.
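  • A minimal sketch of dictionary-based entity recognition is given below. It assumes a tiny hand-written dictionary and uses plain string matching in place of a trained NER model, so it only illustrates the role of the entity information dictionary; `ENTITY_DICTIONARY` and `recognize_entities` are hypothetical names.

```python
import re

# Hypothetical entity information dictionary: term -> entity type.
ENTITY_DICTIONARY = {
    "polyvinyl alcohol": "material name entity",
    "tungsten": "material name entity",
    "melting point": "material attribute entity",
    "degrees celsius": "unit entity",
}


def recognize_entities(text):
    """Return (entity type, entity value) pairs in order of appearance."""
    lowered = text.lower()
    found = []
    for term, entity_type in ENTITY_DICTIONARY.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            found.append((match.start(), entity_type, term))
    # Assumption of this sketch: every standalone number is a metric entity.
    for match in re.finditer(r"\b\d+(?:\.\d+)?\b", lowered):
        found.append((match.start(), "metric entity", match.group()))
    found.sort()
    return [(entity_type, value) for _, entity_type, value in found]


entities = recognize_entities("The melting point of tungsten is 3410 degrees Celsius")
print(entities)
```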
  • the server can use the extracted entity information as the target entity information to extract the information flow.
  • the target entity information is the reference information for extracting the information flow. The server may use the target entity information as a reference to extract the entity information associated with it and the association relationship information between the target entity information and the extracted entity information, and then construct the information flow corresponding to the target entity information.
  • for example, if the extracted entity information is "material name entity-polyvinyl alcohol", it can be used as the target entity information. The entity information associated with "material name entity-polyvinyl alcohol", and the association relationship information between "material name entity-polyvinyl alcohol" and the extracted entity information, can be extracted from the data source, and the information flow of "material name entity-polyvinyl alcohol" is then constructed.
  • the server can further analyze the information flow extraction requirements corresponding to the entity information extracted from the input information. If the extraction requirement is to separately extract the information flow corresponding to each entity information, the server may use each entity information as the target entity information, and extract the information flow corresponding to each entity information.
  • for example, the input information is "polyvinyl alcohol OR polycarbonate", written as a Boolean logic expression. If the logical operator "OR" has been defined in advance to mean "or", the server can use the operator "OR" to determine that the extraction requirement is to extract the information flows of the two entity information separately.
  • the server can use "material name entity-polyvinyl alcohol" and "material name entity-polycarbonate" as target entity information respectively; that is, there are two pieces of target entity information, whose contents are "material name entity-polyvinyl alcohol" and "material name entity-polycarbonate".
  • the server can extract the information flows of the two target entity information to obtain the information flow of "material name entity-polyvinyl alcohol" and the information flow of "material name entity-polycarbonate".
  • the server may extract the information flow of each target entity information in sequence, or may extract the information flows of multiple target entity information simultaneously through parallel processing.
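  • The separate-extraction case can be sketched as follows. The sketch assumes that "OR" is the only operator and that `extract_information_flow` is a placeholder for the real extraction against the data source; both function names are hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def split_targets(input_information):
    # "OR" is assumed to be predefined as the operator for separate extraction.
    parts = re.split(r"\s+OR\s+", input_information.strip())
    return [part.strip() for part in parts if part.strip()]


def extract_information_flow(target):
    # Placeholder: a real implementation would query the data source here.
    return {"target": target, "flow": []}


def extract_all(input_information, parallel=True):
    targets = split_targets(input_information)
    if parallel:
        # Executor.map returns results in the same order as the inputs.
        with ThreadPoolExecutor() as pool:
            return list(pool.map(extract_information_flow, targets))
    return [extract_information_flow(target) for target in targets]


flows = extract_all("polyvinyl alcohol OR polycarbonate")
print([flow["target"] for flow in flows])
```

  • Sequential and parallel extraction return the same results here; the choice only affects how long the server takes when each extraction is slow.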
  • the input information mentioned in step S20 may also include an extraction direction restriction.
  • for example, the user enters "polyvinyl alcohol" in the input box, and selects or enters the extraction direction restriction "organic electroluminescent display coating" in the extraction direction selection box or input box.
  • the server extracts the two entity information "material name entity-polyvinyl alcohol" and "material application entity-organic electroluminescent display coating" from the input information, and extracts the information flow from the material polyvinyl alcohol to the material application organic electroluminescent display coating, constituted by the material processing technology, the other materials involved, and the material suppliers in between.
  • the server extracts, from the data source, at least one associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between entity information.
  • the server may extract at least one associated entity information corresponding to the target entity information from the data source, and extract the association relationship information between the target entity information and the associated entity information.
  • the associated entity information may refer to entity information that has a certain association relationship with the target entity information.
  • the association relationship information may include information describing the direction and type of association between entity information.
  • the association direction may indicate the flow direction of the entity information in the information flow; it intuitively identifies the flow direction of the information flow, which is convenient for sorting out and viewing the entity information. For example, if a material polymer is obtained from a material monomer through a material process, the association direction among the three runs from the material monomer, through the material process, to the material polymer. In some embodiments, when the information flow is displayed, arrows may also be used to indicate the association direction. As shown in FIG. 2, the arrows identify the association direction between entity information in the information flow.
  • the association type may include a feature description of the association relationship between entity information with different association relationship characteristics, so as to better sort out and verify whether there is an association between the entity information and what kind of association there is. For example, when materials are related by processing technology, the corresponding type of association can be the process between materials and materials.
  • the type of association between material and material type information is material and material type.
  • the direction and type of association between entity information can be determined by analyzing the text information where the entity information is located. For example, a large amount of sample data can be trained, and the algorithm model obtained by training can be used to extract the association direction and type of entity information from the text information.
  • the association direction and association type between the entity information can also be determined by comprehensively considering the entity types between the entity information, so as to determine the association direction and association type more accurately and conveniently.
  • the association relationship information may include information describing an association direction and an association type.
  • the information description method can be a plain text description; for example, the association type is "material and material type", and the association direction is that "synthetic resin" is the material type of "polyvinyl alcohol".
  • information linking can also be adopted; for example, the server establishes a link between each pair of entity information through symbols, graphics, etc., and then marks the association type and association direction on the link.
  • the association type and the association direction can also be represented by numbers, symbols, graphics, etc.
  • other information description methods can also be used.
  • the association relationship information may include the description information of the association direction and the association type, or may only include the description information of the association direction or the association type.
  • the association type can be marked when the information flow is displayed, or the entity value and entity type can be marked in the display entity information, and the entity type is used to characterize the association relationship between the entity information.
  • entity information can be associated with brackets or solid lines without arrows.
  • for association relationship information that includes the association direction, a straight line with an arrow may be used to represent the direction of information flow between entity information according to the association direction, so as to facilitate visual display.
  • the above ways of displaying the association direction and the association relationship are only examples, and other methods can also be used in specific implementation.
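  • To make the description above concrete, the following sketch stores association relationship information with an optional type and an optional direction, and renders directed associations with an arrow and undirected ones with a plain line. The `Association` class and the text rendering are assumptions for illustration only; the second example deliberately records only the association type to show the undirected case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Association:
    """Hypothetical record of association relationship information."""
    source: str
    target: str
    association_type: Optional[str] = None  # may be absent
    directed: bool = True                   # False when no direction is recorded


def render(assoc: Association) -> str:
    # Arrow for a known association direction, plain line otherwise.
    connector = "-->" if assoc.directed else "---"
    label = f" [{assoc.association_type}]" if assoc.association_type else ""
    return f"{assoc.source} {connector} {assoc.target}{label}"


flow = [
    Association("polyvinyl acetate", "polyvinyl alcohol", "process between materials"),
    Association("polyvinyl alcohol", "synthetic resin", "material and material type", directed=False),
]
for assoc in flow:
    print(render(assoc))
```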
  • the associated entity information and the associated relationship information corresponding to the target entity information can be determined by analyzing the context information of the target entity information in the data source.
  • the context information may include the sentence where the target entity information is located and one or more sentences before and after the sentence where the target entity information is located. Alternatively, if there is a strong correlation between the paragraph where the target entity information is located and the paragraph before and after the paragraph through analysis, the context information may also include the paragraph where the target entity information is located and one or more paragraphs before and after the paragraph.
  • the sentence in which the target entity information is located may initially be used as the context information for extracting the associated entity information and the association relationship information. Then, one or more sentences before and after that sentence can be added to the context information and used to correct the extracted associated entity information and association relationship information. Alternatively, one or more paragraphs before and after the paragraph in which the target entity information is located can be added to the context information and used to further extract or correct the associated entity information of the target entity information and the association relationship information between the target entity information and the associated entity information.
  • usually, entity information with a strong association relationship is described within one sentence. By first analyzing the sentence in which the target entity information is located, the interference of other information can be avoided, and the associated entity information corresponding to the target entity information and the association relationship information can be extracted more accurately and efficiently. Then, by combining the preceding and following sentences or paragraphs, the extracted information can be corrected, or further extraction can be performed when entity information or association relationship information was not extracted, improving the accuracy and comprehensiveness of the extraction.
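  • The sentence-first, then-widen strategy can be sketched as below. The sentence splitter is deliberately naive and the sample text is made up for the example; `split_sentences` and `context_for` are hypothetical helper names.

```python
import re


def split_sentences(text):
    # Naive split after ., ! or ?; real documents need a proper sentence splitter.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def context_for(text, target, window=0):
    """Return the sentence containing `target` plus `window` sentences on each side."""
    sentences = split_sentences(text)
    for i, sentence in enumerate(sentences):
        if target.lower() in sentence.lower():
            start = max(0, i - window)
            return sentences[start:i + window + 1]
    return []


sample = (
    "Polymers are widely used. "
    "Polyvinyl alcohol is a synthetic resin. "
    "It is usually prepared by alcoholysis of polyvinyl acetate."
)
narrow = context_for(sample, "polyvinyl alcohol")          # the sentence itself
wide = context_for(sample, "polyvinyl alcohol", window=1)  # plus neighbouring sentences
print(narrow)
print(wide)
```

  • Starting with `window=0` and widening only when nothing is extracted, or when a correction is needed, mirrors the two-stage procedure described above.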
  • the data source where the target entity information is located can be retrieved first, and then the context information of the target entity information in the data source can be located. Then, natural language processing algorithms and the like can be used to analyze the context information where the target entity information is located, and extract the associated entity information corresponding to the target entity information and the association relationship information with the associated entity information.
  • the data source where the target entity information is located can be preliminarily retrieved through methods such as information matching. Then, the method provided in the foregoing embodiment can be used to locate the context information of the target entity information in the data source.
  • the extracted description information can be analyzed on its own, or the extracted description information, the extracted entity information, and the target entity information can be analyzed together. If there is a definite association direction and/or association type between the extracted entity information and the target entity information, the extracted entity information can be used as the associated entity information, and the extracted description information can be used as the association relationship information.
  • for example, for the sentence "polyvinyl alcohol is a synthetic resin, usually prepared by alcoholysis (usually called hydrolysis or saponification) of polyvinyl acetate", the named entity extraction algorithm can be used to extract the entity information "material name entity-polyvinyl alcohol", "material type entity-synthetic resin", "material name entity-polyvinyl acetate", "material process entity-alcoholysis", "material process entity-hydrolysis", and "material process entity-saponification".
  • the entity information in the above sentence and the words between the entity information can be further processed by natural language processing methods such as semantic role labeling, dependency semantic analysis, and part-of-speech tagging, to extract the description information of the association relationship between "polyvinyl alcohol" and "synthetic resin", namely "X is a kind of Y". By further combining the entity types of "polyvinyl alcohol" and "synthetic resin", the association type between the two can be determined as "material and material type", and the association direction is that "synthetic resin" is the material type of "polyvinyl alcohol".
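  • As a rough illustration of the "X is a kind of Y" step, the sketch below replaces semantic role labeling and dependency analysis with a single regular expression and then uses the entity types to decide the association type. The pattern, the `extract_is_a` function, and the returned field names are assumptions of this example, not the method of the specification.

```python
import re

# Matches "X is a Y", "X is an Y" and "X is a kind of Y" at the start of a sentence.
IS_A_PATTERN = re.compile(
    r"^(?P<x>.+?) is an? (?:kind of )?(?P<y>[^,.]+)", re.IGNORECASE
)


def extract_is_a(sentence, entity_types):
    """Return association info if the sentence states that a material belongs to a type."""
    match = IS_A_PATTERN.match(sentence.strip())
    if not match:
        return None
    x, y = match.group("x").strip(), match.group("y").strip()
    # Combine the entity types with the description information to fix the association type.
    if (entity_types.get(x) == "material name entity"
            and entity_types.get(y) == "material type entity"):
        return {"type": "material and material type", "material": x, "material_type": y}
    return None


sentence = ("polyvinyl alcohol is a synthetic resin, usually prepared by alcoholysis "
            "(usually called hydrolysis or saponification) of polyvinyl acetate")
entity_types = {
    "polyvinyl alcohol": "material name entity",
    "synthetic resin": "material type entity",
}
relation = extract_is_a(sentence, entity_types)
print(relation)
```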
  • the natural language processing methods used in the process of extracting entity information and association relationship information, such as the named entity extraction algorithm, semantic role labeling, dependency semantic analysis, and part-of-speech tagging, can be obtained through deep learning on a large amount of sample data according to the characteristics of the entity information and association relationship information to be extracted for the information flow, which can improve the accuracy of entity information and association relationship information extraction.
  • the associated entity information and association relationship information corresponding to the target entity information may be determined in the following manner:
  • the context information can also be processed using the NLP algorithm, for example, the data source can be processed and tokenized using the NLP algorithm. Then, based on the entity information dictionary or entity information database, NER can be used to identify the entity information in the context information.
  • for example, to construct a material information flow, the entity information dictionary or entity information database corresponding to materials, or the one corresponding to the material structure type of the target entity information, can be obtained, and the NER algorithm can be used to extract the entity information contained in the context information.
  • the target entity information is "polyvinyl alcohol” and the material structure type corresponding to "polyvinyl alcohol” is polymer
  • the entity information dictionary or entity information database corresponding to the polymer is obtained.
  • the NER algorithm is used to extract the entity information from the context information of the target entity information, and the extracted information is compared with the information in the entity information dictionary or entity information database, so that the entity information in the context information is identified more accurately and efficiently.
  • the corresponding description information of monomer, polymerization reaction, polymerization medium, polymer, polymer type, polymer type, manufacturing process, manufacturer, supplier, chemical modification, physical modification, application can be identified.
  • the NLP algorithm filtering method can also be used to remove the interference information in the extracted entity information.
  • the interference information may include stop words, common words, non-scientific words, words unrelated to the material context, and the like.
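  • The dictionary-based recognition and interference filtering described above can be sketched as follows. This is an illustration under stated assumptions: the dictionary entries, the interference word list, the sample sentence, and the function names are all hypothetical, and a greedy longest-match lookup stands in for a trained NER algorithm.

```python
import re

# Hypothetical entity dictionary for the "polymer" material structure type.
POLYMER_DICTIONARY = {
    "polyvinyl alcohol": "material name",
    "vinyl acetate": "monomer",
    "polymerization": "material process",
    "method": "material process",  # deliberately noisy entry
}
# Hypothetical interference words (common words unrelated to the material context).
INTERFERENCE = {"method", "known"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9\-]+", text.lower())

def recognize_entities(context, dictionary, max_len=3):
    """Greedy longest-match lookup of token n-grams in the dictionary."""
    tokens = tokenize(context)
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in dictionary:
                found.append((phrase, dictionary[phrase]))
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return found

def remove_interference(entities, interference):
    """Drop extracted entities that are listed as interference information."""
    return [(name, kind) for name, kind in entities if name not in interference]

context = ("Polyvinyl alcohol can be produced by a known method, "
           "the polymerization of vinyl acetate.")
raw = recognize_entities(context, POLYMER_DICTIONARY)
entities = remove_interference(raw, INTERFERENCE)
```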
  • the extracted entity information and target entity information are used as objects, and for each object, the NLP algorithm is further used to analyze the context information to determine the metadata corresponding to each object.
  • the metadata may refer to information describing each object.
  • the metadata corresponding to the target entity information and the metadata corresponding to the extracted entity information can be compared with each other to determine whether there is an association relationship between the target entity information and the extracted entity information, and what the specific association relationship information is. If there is an association relationship, the entity information with the association relationship can be used as the associated entity information of the target entity information, and the association relationship information between the two can be determined based on the comparison result, so as to associate the two.
  • S26 The server uses the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information.
  • the server can use the extracted association type and association direction to associate the corresponding target entity information and the associated entity information, and obtain the information flow corresponding to the target entity information.
  • the implementation of the association can be carried out with reference to the content described in step S24, which will not be repeated here.
  • S28 The server sends the information stream to the client for display on the client.
  • the server may send the information stream to the client for display on the client.
  • a visualization tool can be used to visualize the information flow, and then the visualized information flow can be sent to the client for visual display on the client.
  • Figure 2 shows the information flow established in the above example.
  • the association relationship information can be effectively used to further verify and filter the extracted entity information to determine whether the extracted entity information is entity information related to the target entity information. Thereby, the accuracy of extracting entity information associated with the target entity information is further improved, and noise is effectively filtered.
  • the association relationship information can also be used to associate the extracted entity information with the target entity information, effectively displaying the association relationship between them. This is convenient for users to view and sort out, so that users can accurately and efficiently obtain the useful information they need and find the solutions they need or new solutions, which improves the user experience.
  • the user is trying to find some or all of the processing technology or manufacturing method, raw materials, additives and other information of a specific material or material application, and the scheme provided by the above embodiment extracts the entity information associated with the specific material or material application from the same or different data sources, associates it to form an information flow, and displays it to the user.
  • users can easily and conveniently find the information they need, avoiding the process of selecting and sorting out multiple different data sources by themselves, saving time.
  • the information of each entity can also be displayed to users in an associated manner, which can avoid information being omitted, or the relevance between pieces of information being neglected, due to the large amount of data or a lack of understanding of the industry when users screen information on their own. This enables users to more accurately obtain the solutions they need and the raw materials and processing technology needed in the solutions.
  • noise information in the data source that interferes with the extraction of effective information can also be effectively removed, and the accuracy of the information obtained by the user can be improved.
  • the information in the data source is complex and changeable, and entity information and the association relationship information between entity information are usually difficult to accurately extract.
  • the algorithm model for extracting the associated entity information corresponding to the target entity information, and the association relationship information with the associated entity information, may be trained based on the entity type of the target entity information, so as to further improve the accuracy of extracting the associated entity information and the association relationship information.
  • For the entity information in the field of material technology, the conversion between materials, or between materials and material applications, is usually achieved through a certain processing technology or manufacturing method. Which material a material or material application is processed from, and what kind of processing technology or manufacturing method is used, is usually what users want to know.
  • the extracting at least one associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information from the data source can include:
  • when the context information includes a material process entity, a material name entity, and/or a material application entity, extracting the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information according to the material process information association manner.
  • the material process information association manner may be determined according to the information association characteristics between materials or between materials and material applications.
  • when the context information of the target entity information includes material process entities, material name entities, and/or material application entities, the context information usually contains information about which processing technology converts one or more materials, or which manufacturing method is used to manufacture a material application.
  • the corresponding information association manner can usually be that one material name entity yields another material name entity through a material process entity, or that a material name entity yields a material application entity through a material process entity.
  • the material process entity can be used as the associated entity information of the material name entity or the material application entity, or part of the association relationship information between multiple material name entities or between the material name entity and the material application entity.
  • algorithms such as deep learning can be used to learn from a large number of data sources including material process entities, material name entities, and/or material application entities, so as to accurately associate a material name entity with another material name entity, or a material name entity with a material application entity, through a material process entity. In this way, the entity information and the association relationship between entity information can be accurately extracted.
  • the user enters "polyvinyl alcohol" in the input box of the client and clicks the information flow extraction button.
  • the client can generate an information flow acquisition request based on polyvinyl alcohol and send it to the server.
  • after the server receives the information flow acquisition request, it can extract the input information from the request, and then further perform information extraction on the input information.
  • the server performs entity information extraction on the input information, and the extracted entity information may include "material name entity-polyvinyl alcohol".
  • the server can extract information from the data sources in the database, and extract one or more data sources containing polyvinyl alcohol. Then, the context information in which polyvinyl alcohol appears can be located.
  • 1,2-ethylene glycol polyvinyl alcohol can be produced according to known or conventional production methods.
  • 1,2-ethylene glycol polyvinyl alcohol can be produced using the following production method: Polymerization of vinyl acetate under pressure (load) higher than under conventional conditions to prepare polyvinyl alcohol; and copolymerization of vinyl acetate and ethylene carbonate to obtain 1,2-ethylene glycol polyvinyl alcohol with the aforementioned content.
  • the "polyvinyl alcohol" and "vinyl acetate" can be associated according to the extracted association type, direction and other association information, and the "polyvinyl alcohol" and "vinyl acetate and vinyl carbonate" can be associated through "copolymerization" according to the extracted association information.
  • the "arrow” can be used to indicate the direction of association, and the information display algorithm can be used to display the information flow formed after the association, as shown in Figure 3 (a) and (b) respectively.
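  • A minimal sketch of how such directed associations might be stored and rendered with arrows is given below. The `Association` tuple, the function names, and the text rendering are assumptions made for illustration; the entity names follow the production-method sentence in the example above.

```python
from collections import namedtuple

# A directed association: source --[type]--> target.
Association = namedtuple("Association", ["source", "type", "target"])

def build_information_flow(associations):
    """Group directed associations by source entity (adjacency list)."""
    flow = {}
    for assoc in associations:
        flow.setdefault(assoc.source, []).append((assoc.type, assoc.target))
    return flow

def render(flow):
    """Render each association as 'source --[type]--> target'."""
    return [
        f"{source} --[{kind}]--> {target}"
        for source, edges in flow.items()
        for kind, target in edges
    ]

flow = build_information_flow([
    Association("vinyl acetate", "polymerization", "polyvinyl alcohol"),
    Association("vinyl acetate and ethylene carbonate", "copolymerization",
                "1,2-ethylene glycol polyvinyl alcohol"),
])
lines = render(flow)
```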
  • the extracted information including the material name entity, the material application entity, and the material process entity can be associated with information, which can further improve the accuracy of the extraction of the material entity information and the association relationship information between the material entity information.
  • the association relationship information may further include colocation entity description information
  • the colocation entity description information may include description information in other expression forms that describe the associated entity information as target entity information.
  • the server may add the associated entity information corresponding to the co-located entity description information to the target entity information, and update the target entity information.
  • the server may extract at least one associated entity information corresponding to the updated target entity information and the association relationship information between the target entity information and the associated entity information from the data source. And, using the association relationship information to associate the updated target entity information and the associated entity information to obtain the information flow corresponding to the updated target entity information.
  • taking polyvinyl alcohol as an example, if the description information "polyvinyl alcohol can also be expressed as poly(vinyl alcohol) or PVA" is extracted when analyzing the data source, the description information can be used as the association relationship information to associate poly(vinyl alcohol), polyvinyl alcohol and PVA.
  • the updated target entity information can include polyvinyl alcohol, poly(vinyl alcohol), and PVA. Then, poly(vinyl alcohol), polyvinyl alcohol, and PVA can be used simultaneously to retrieve data sources, and to extract the associated entity information and association relationship information from the retrieved data sources.
  • after obtaining the sub-information streams corresponding to poly(vinyl alcohol), polyvinyl alcohol, and PVA, the three names can be used together as the reference information to associate the sub-information streams extracted based on each of them, so as to obtain the information stream corresponding to the updated target entity information.
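  • The update of the target entity information with co-located names, followed by retrieval with all names at once, can be sketched as follows. The function names and the three sample documents are hypothetical and only illustrate the idea; substring matching stands in for real data source retrieval.

```python
def update_target_entities(target, colocation_descriptions):
    """Add every name described as co-located with the target entity."""
    names = [target]
    for entity, alias in colocation_descriptions:
        if entity in names and alias not in names:
            names.append(alias)
        elif alias in names and entity not in names:
            names.append(entity)
    return names

def retrieve(documents, names):
    """Return the documents that mention any of the target entity's names."""
    lowered = [name.lower() for name in names]
    return [doc for doc in documents if any(n in doc.lower() for n in lowered)]

names = update_target_entities(
    "polyvinyl alcohol",
    [("polyvinyl alcohol", "poly(vinyl alcohol)"), ("polyvinyl alcohol", "PVA")],
)
# Hypothetical sample documents.
documents = [
    "PVA is used in coatings.",
    "Poly(vinyl alcohol) is made from vinyl acetate.",
    "Polyethylene is a polyolefin.",
]
hits = retrieve(documents, names)
```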
  • the above method can also be used to associate different expressions of the same entity information before further entity information extraction. For example, "alcoholysis", "hydrolysis", and "saponification" in the above examples belong to the co-location information corresponding to the extracted material process entity.
  • the metadata of each entity information may be extracted based on the solution provided in the above-mentioned embodiment, and the metadata corresponding to each entity information may be compared with each other. If the metadata corresponding to two pieces of entity information is determined to have a certain similarity, or the metadata comparison shows that the two pieces of entity information describe each other, then the two pieces of entity information can be regarded as co-located entities of each other.
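  • One possible way to compare metadata is a set-overlap score. The sketch below assumes metadata is a set of descriptive terms and uses Jaccard similarity with a threshold of 0.6; the metric, the threshold, and the sample metadata are all assumptions, since the embodiment does not specify how similarity is measured.

```python
def jaccard(meta_a, meta_b):
    """Jaccard similarity between two metadata term sets."""
    a, b = set(meta_a), set(meta_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def are_colocated(meta_a, meta_b, threshold=0.6):
    """Treat two entities as co-located if their metadata overlaps enough."""
    return jaccard(meta_a, meta_b) >= threshold

# Hypothetical metadata for three material process entities.
alcoholysis = {"process", "polyvinyl acetate", "polyvinyl alcohol"}
saponification = {"process", "polyvinyl acetate", "polyvinyl alcohol", "alkali"}
polymerization = {"process", "vinyl acetate", "polyvinyl acetate"}

same_process = are_colocated(alcoholysis, saponification)
different_process = are_colocated(alcoholysis, polymerization)
```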
  • the same material, material application, process, production method, etc. may have many different description forms in the data source. For example, polyvinyl alcohol can also be described in English or other language forms. Complex and large data sources usually lack specific and standardized descriptions of material names, applications, or processes. When users search for information, they usually enter only one or more forms that are familiar to users.
  • the method may further include: the server extracts the product name of the target entity information or the associated entity information from the data source; extracts the manufacturer or supplier corresponding to the target entity information or the associated entity information according to the product name; and associates the manufacturer or supplier with the target entity information or associated entity information.
  • Multiple associated entity information of the target entity information, and the association relationship information between the target entity information and the different associated entity information, may be extracted from the same data source or from multiple data sources.
  • the server may also use the information flow formed by associating the extracted one or more associated entity information and the corresponding association relationship information as a sub-information flow. Then, using the target entity information as the reference information, the different sub-information flows extracted from the same data source or from different data sources are linked to obtain the information flow corresponding to the target entity information, and the final information flow is displayed to the user, which further improves the correlation between the information and facilitates comprehensive analysis and review by the user.
  • the user needs to search for the manufacturing process of a new material, but cannot find it because the manufacturing process patent has not been published or submitted. However, there may already be journal articles, news articles, or other types of data sources describing the material's process.
  • the information streams shown in (a) and (b) in FIG. 3 can be used as sub-information streams, and (c) in FIG. 3 is an information stream obtained after combining two sub-information streams.
  • the use of the association relationship information to associate the target entity information with the associated entity information may include: acquiring at least one associated entity information extracted from a data source and the association relationship information between the target entity information and the corresponding associated entity information; using that association relationship information to associate the target entity information with the corresponding associated entity information to obtain a sub-information stream of the target entity information; and using the target entity information as the reference information to link the sub-information streams obtained from the same or different data sources to obtain the information stream corresponding to the target entity information.
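  • Linking sub-information streams with the target entity as the reference can be sketched as a merge of edge lists. The representation of a sub-stream as a list of `(source, type, target)` tuples and the function name `link_sub_streams` are assumptions for illustration; the two sample sub-streams reuse associations from the examples above.

```python
def link_sub_streams(target, sub_streams):
    """Merge sub-streams that contain the target entity; drop duplicate edges."""
    merged = []
    for stream in sub_streams:
        if not any(target in (src, dst) for src, _, dst in stream):
            continue  # this sub-stream does not mention the reference entity
        for edge in stream:
            if edge not in merged:
                merged.append(edge)
    return merged

sub_stream_a = [("vinyl acetate", "polymerization", "polyvinyl alcohol")]
sub_stream_b = [("polyvinyl alcohol", "material and material type", "synthetic resin")]
unrelated = [("ethylene", "polymerization", "polyethylene")]

information_stream = link_sub_streams(
    "polyvinyl alcohol", [sub_stream_a, sub_stream_b, unrelated, sub_stream_a]
)
```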
  • the method further includes: the server may also use each associated entity information extracted from the data source as target entity information and extract its information flow, to obtain the information flows corresponding to multiple target entity information; and link the information flows corresponding to the multiple target entity information, with the corresponding target entity information as the reference information, to obtain the information map.
  • the server may further use the extracted associated entity information as target entity information to extract the corresponding information flow, and then further extract the information flow corresponding to each associated entity information in that flow, so as to obtain a series of information flows. Then, each associated entity information is used as the reference information to link the information flows, so that the information map can be obtained.
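  • The repeated extraction described above, where each associated entity becomes a new target entity, can be sketched as a breadth-first expansion. The in-memory table `EXTRACTED` is a hypothetical stand-in for extraction from real data sources, and the function name is an assumption.

```python
from collections import deque

# Hypothetical stand-in for extraction from data sources:
# entity -> list of (association type, associated entity).
EXTRACTED = {
    "vinyl acetate": [("polymerization", "polyvinyl alcohol")],
    "polyvinyl alcohol": [("material and material type", "synthetic resin")],
}

def build_information_map(target, extracted):
    """Breadth-first expansion: each associated entity becomes a new target."""
    edges, seen, queue = [], {target}, deque([target])
    while queue:
        entity = queue.popleft()
        for kind, associated in extracted.get(entity, []):
            edges.append((entity, kind, associated))
            if associated not in seen:  # avoid re-expanding an entity
                seen.add(associated)
                queue.append(associated)
    return edges

information_map = build_information_map("vinyl acetate", EXTRACTED)
```

  • Tracking the entities already expanded keeps the expansion finite even when associations form a cycle.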
  • Figure 4 is an information map related to polyvinyl alcohol, obtained when a user searches for polyvinyl alcohol, formed by the entire workflow from vinyl acetate (monomer) to organic electroluminescent display coating (application). The content in each box in Figure 4 is entity information, and the arrows indicate the direction of association.
  • with the help of the information flow, users can easily identify the entire process information flow from monomer to chemically modified polymer (including catalyst, polymerization reaction, chemical modification, physical modification and additives) in the data source, which can also help users discover alternative polymerization processes, additives and catalysts.
  • a user who searches for polyethylene in the usual way may not get documents that label polyethylene as PE, HDPE, or an industry trade name. The user may also get incomplete information about different aspects of the material. For example, some polyethylene patents may describe the monomers but contain no information about the polymerization process, the processing of the polymer, or the different applications for which the polyethylene can be used. Users may need to search different patents to find different details of polyethylene. Through the above solution, users can get all the information of a specific material through a single search, including name variants, suppliers, material applications, manufacturing processes, etc.
  • the user can also easily and conveniently find suitable materials for a specific application.
  • users can discover new materials, suppliers, and material manufacturing technologies for specific applications.
  • a user searching for materials suitable for medical stents may find materials such as Bioflow-V and Orsio (trade names).
  • the user can learn that Orsio uses cobalt-chromium alloy as the base metal and has an active poly-L-lactide (PLLA) polymer coating.
  • Users can further explore other materials used to make stents, including stainless steel with a poly(ethylene-vinyl acetate copolymer) or poly(n-butyl methacrylate) coating (CYPHER stent), cobalt-chromium alloy with a carbon coating (CRE8 inner stent), and so on. Then, the user can also use the information map to obtain the entire manufacturing process of each material used in the stent.
  • FIG. 4 and FIG. 5 are schematic diagrams of information maps of polymer materials.
  • Figures 6 and 7 are schematic diagrams of information maps of metal materials.
  • Figures 8 and 9 are schematic diagrams of information maps of ceramic materials.
  • Figures 10 and 11 are schematic diagrams of information maps of biological materials.
  • Figure 12 and Figure 13 are schematic diagrams of material information maps.
  • Figure 4, Figure 6, Figure 8, Figure 10, Figure 12 are information maps formed by associating the entity information extracted from the data source.
  • Figure 5, Figure 7, Figure 9, Figure 11, Figure 13 are schematic diagrams of the general information flow of the above five materials.
  • the user can input the entity information corresponding to any one or more nodes in the information map, such as a single material, material application, material technology, etc., and then part or all of the information related to the input information and the flow relationship between the information can be obtained, and finally the information flow can be obtained by association.
  • This allows users to accurately obtain part or the entire information flow of each material from the material monomer to the material application.
  • it also effectively associates information such as name variants, processing techniques, manufacturing methods, manufacturers, and suppliers. It effectively realizes that the user can accurately and comprehensively obtain part or all of the information related to the input information and the circulation relationship between the information through less input information.
  • the sub-information stream of the target entity information may also be associated with the corresponding data source.
  • the user can conveniently view the source of the corresponding sub-information stream, and effectively realize the tracking of various information in the information stream.
  • the page displayed by the client may also include the start node and end node options of the information flow.
  • the user can click the corresponding option and enter the corresponding start node information and end node information in the input box. After obtaining the user's click and input information, the client can generate the input information of "start node information + end node information", attach it to the search request, and send it to the server.
  • the server can generate and display the information flow based on the input information.
  • the entity information can be extracted from the start node information and the end node information separately. The entity information extracted from either the start node information or the end node information is initially used as the target entity information, and the entity information extracted from the other is used as the terminating entity information. Then, the information flow of the target entity information can be extracted, the associated entity information in the extracted information flow can in turn be used as target entity information for further information flow extraction, and so on. If a certain associated entity information is the terminating entity information, no further information extraction is performed on that associated entity information. In this way, the information map between the start node information and the end node information can be obtained.
  • the ending node information is a material application.
  • the information flow can be gradually extracted from the material monomer to the material application, and the information atlas from the material monomer to the material application can be obtained.
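  • The expansion from a start node that stops at the terminating entity can be sketched as follows. The extraction table, the association type names "application" and "used in", and the "display panel" entity are hypothetical; only the start and end entities come from the polyvinyl alcohol example.

```python
# Hypothetical extraction results: entity -> [(association type, associated entity)].
EXTRACTED = {
    "vinyl acetate": [("polymerization", "polyvinyl alcohol")],
    "polyvinyl alcohol": [("application", "organic electroluminescent display coating")],
    "organic electroluminescent display coating": [("used in", "display panel")],
}

def extract_between(start, end, extracted):
    """Expand from the start entity; never expand the terminating entity."""
    edges, seen, pending = [], {start}, [start]
    while pending:
        entity = pending.pop(0)
        if entity == end:
            continue  # terminating entity information: stop extracting here
        for kind, associated in extracted.get(entity, []):
            edges.append((entity, kind, associated))
            if associated not in seen:
                seen.add(associated)
                pending.append(associated)
    return edges

edges = extract_between(
    "vinyl acetate", "organic electroluminescent display coating", EXTRACTED
)
```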
  • the method may further include: the server uses the entity information in the information stream as an information node for interacting with the user, and uses a visualization method to visualize the information stream; sends the processed information stream to the client, so that the user can view the information stream and trigger the information node through the client; and, based on the user's triggering operation on the information node, feeds back other information associated with the information node to the client.
  • the information node is the information corresponding to each entity information in the information flow.
  • the trigger operation may include operation modes such as tap, long press, and slide.
  • the other information associated with the information node may include, for example, data sources or other information streams that are not shown.
  • the visualization method may adopt the Neo4j graphical method.
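  • If Neo4j is used, one way to load an information flow is to turn each association into a Cypher `MERGE` statement. The sketch below only builds the statement text and does not connect to a database; the node label `Entity`, the property `name`, and the helper names are assumptions. In practice, query parameters are generally preferable to string interpolation.

```python
import re

def relationship_type(text):
    """Turn an association type into an upper-case Cypher relationship type."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_").upper()

def quote(value):
    """Quote a string literal for Cypher, escaping backslashes and quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

def to_cypher(edges):
    """Build one MERGE statement per (source, type, target) association."""
    statements = []
    for source, kind, target in edges:
        statements.append(
            f"MERGE (a:Entity {{name: {quote(source)}}}) "
            f"MERGE (b:Entity {{name: {quote(target)}}}) "
            f"MERGE (a)-[:{relationship_type(kind)}]->(b)"
        )
    return statements

statements = to_cypher([("vinyl acetate", "polymerization", "polyvinyl alcohol")])
```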
  • Figure 14 shows a schematic diagram of an interactive visualized information map.
  • the circles represent different information nodes, the lines between the circles indicate that the information nodes are related, the text on a line can indicate the description information of the association type between the information nodes, and arrows indicate the direction of association between the information. In some embodiments, different colors may also be used to indicate the data source from which the information comes.
  • the user can view the data source of the information node by clicking on the information node, or slide left or right on the information node to view the undisplayed lower-level or higher-level information corresponding to the information node. The lower-level information may refer to the information from the material corresponding to the current information node to the material application, and the higher-level information may refer to the information from the material corresponding to the current information node to the material monomer.
  • the parameter entity information can be, for example, information such as material type, process type, material attribute, unit, or measurement.
  • the method may further include: the server extracts the parameter entity information of the target entity information and the associated entity information from the data source, where the parameter entity information includes at least the entity information corresponding to one of the material structure type entity, the process method entity, the material attribute entity, the unit entity or the measurement entity; and associates the extracted parameter entity information with the corresponding target entity information and associated entity information.
  • deep learning technology can be used to extract different parameter entity information and associate it with corresponding entity information.
  • the data source usually also contains further specific description information of the entity information.
  • for the entity information, there may also be description information such as process method type, process parameters, and manufacturer information.
  • the data source may also contain more specific description information of the polymerization process, such as solution polymerization, bulk polymerization, suspension polymerization, and emulsion polymerization. At the same time, it may also provide a large amount of parameter condition information used during polymerization, such as temperature, pressure, surrounding environment, production environment and other information.
  • the above-mentioned information can be further extracted, and parameter entity information such as the material attribute entity, unit entity or measurement entity of each material involved in the above process method, or the unit entity and measurement entity of the surrounding environment, can be associated with the corresponding entity information, which enables users to obtain specific parameter information more accurately and comprehensively.
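  • Extracting an attribute, a measurement value, and a unit, and linking them to an entity, can be sketched with a regular expression. The attribute names, the unit list, the sample sentence, and the function names are hypothetical; the embodiments mention deep learning for this step, so the pattern below is only an illustration of the resulting data.

```python
import re

# Hypothetical attribute names and units; a real system would learn these.
PARAMETER = re.compile(
    r"(?P<attribute>temperature|pressure) of "
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>°C|MPa|kPa|K)"
)

def extract_parameters(text):
    """Return attribute, measurement value, and unit for each match."""
    return [
        {
            "attribute": m.group("attribute"),
            "value": float(m.group("value")),
            "unit": m.group("unit"),
        }
        for m in PARAMETER.finditer(text)
    ]

def link_parameters(entity, text):
    """Associate the extracted parameter entity information with an entity."""
    return {"entity": entity, "parameters": extract_parameters(text)}

# Hypothetical sample sentence.
linked = link_parameters(
    "polymerization",
    "The polymerization is carried out at a temperature of 60 °C "
    "and a pressure of 2 MPa.",
)
```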
  • Each attribute class has different attribute subcategories.
  • the spectroscopy category can be divided into subcategories including Raman spectroscopy, X-ray diffraction, and the like.
  • Figure 16 is a schematic diagram of using units and measures to link materials to attribute lists from different attribute classes.
  • Figure 17 is a schematic diagram of extracting attributes, units, and metrics from a data source.
  • Figure 18 is a schematic diagram of linking the extracted attributes, units, and metrics to the corresponding materials.
  • Figure 19, Figure 20, and Figure 21 show different representations of attributes, units, and metrics in the data source and schematic diagrams of information extraction.
  • Figure 22 is a schematic diagram of linking the extracted attributes, units, and metrics to the corresponding materials.
  • the patent claims usually include a description of the material. If the patent claims a new material manufacturing process, the applicant usually needs to add all the details of the manufacturing process to the claims. However, if the patent involves a modification of an existing manufacturing process, the applicant does not have to add all the details of the manufacturing process. If an application-material link is newly discovered, the applicant can also add the application of the material in the patent claims. However, the applicant generally prefers to describe a broader application field in the patent claims, and more specific applications are usually not described in the claims. In addition, material properties, attributes, etc. are usually described in general terms in the claims to obtain a better scope of protection.
  • some other embodiments of this specification also provide an information flow extraction method, which is applied to a server. As shown in FIG. 23, the method may include:
  • S42 Extract at least one associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information from the data source, where the association relationship information includes the association direction between the entity information and the description information of the association type;
  • S44 Use the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information.
  • the using the association relationship information to associate the target entity information with the associated entity information may include:
  • the target entity information is associated with the corresponding associated entity information to obtain the sub-information stream of the target entity information;
  • link sub-information streams obtained from the same or different data sources to obtain the information stream corresponding to the target entity information.
  • When the association relationship information includes co-located entity description information, the associated entity information corresponding to the co-located entity description information is added to the target entity information, and the target entity information is updated; here, the co-located entity description information includes description information that describes the associated entity information as another expression form of the target entity information;
  • At least one piece of associated entity information corresponding to the updated target entity information, and the association relationship information between the target entity information and the associated entity information, are then extracted from the data source.
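  • The update step above can be illustrated with a short sketch: names linked to the target by a co-located (alias) relation are folded into the target, and extraction is then repeated with the enlarged name set. The `same_as` label, the triple layout `(head, relation, tail)`, and the symmetric treatment of aliases are assumptions of this example.

```python
def expand_target(names, triples, alias_relation="same_as"):
    """Add names reachable through alias links until no new name appears."""
    known = set(names)
    changed = True
    while changed:
        changed = False
        for head, relation, tail in triples:
            if relation != alias_relation:
                continue
            if head in known and tail not in known:
                known.add(tail)
                changed = True
            elif tail in known and head not in known:
                known.add(head)
                changed = True
    return known


def associated_entities(names, triples, alias_relation="same_as"):
    """Return the non-alias links whose head or tail is one of the known names."""
    return [t for t in triples
            if t[1] != alias_relation and (t[0] in names or t[2] in names)]


# Illustrative data only.
triples = [
    ("fullerene", "same_as", "C60"),
    ("C60", "same_as", "buckminsterfullerene"),
    ("C60", "used_in", "solar cell"),
    ("graphene", "used_in", "coating"),
]
target = expand_target({"fullerene"}, triples)
links = associated_entities(target, triples)
```

  • Without the expansion, a search for "fullerene" alone would miss the link that the sample data records under the name "C60".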
  • the method may further include:
  • Extracting, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information may include:
  • when the context information includes a material process entity, a material name entity, and/or a material application entity,
  • extracting the associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, according to the material process information association method.
  • the method may further include:
  • the manufacturer or supplier corresponding to the target entity information or associated entity information is extracted according to the product name, and the manufacturer or supplier is associated with the target entity information or associated entity information.
  • the method may further include:
  • the parameter entity information includes entity information corresponding to at least one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measurement entity;
  • the extracted parameter entity information is associated with the corresponding target entity information and associated entity information.
  • the method may further include: associating the sub-information stream of the target entity information with a corresponding data source.
  • the method may further include:
  • The information flow extraction device provided by one or more embodiments of this specification extracts entity information and the association relationship information between the entity information at the same time. The association relationship information can then be used to verify and filter the extracted entity information, that is, to determine whether the extracted entity information is actually related to the target entity information, which further improves the accuracy of extracting entity information associated with the target entity information and effectively filters noise.
  • The association relationship information can also be used to associate the extracted entity information with the target entity information and to display the association relationship between them. This makes the results convenient for users to view and organize, so that users can accurately and efficiently obtain the useful information they need, find needed or new solutions, and enjoy an improved user experience.
  • FIG. 24 shows a schematic diagram of the module structure of an embodiment of an information flow extraction device provided in the specification. As shown in FIG. 24, the device may include:
  • the first obtaining module 402 may be used to obtain target entity information;
  • the first extraction module 404 may be used to extract, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • the first association module 406 may be configured to use the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information.
  • the first association module 406 may include:
  • the first associating unit may be used to obtain at least one piece of associated entity information extracted from the data source and the association relationship information between the target entity information and the corresponding associated entity information, and to use that association relationship information to associate the target entity information with the corresponding associated entity information, obtaining the sub-information stream of the target entity information;
  • the second associating unit may be used to link the sub-information streams obtained from the same or different data sources with the target entity information as the reference information, to obtain the information stream corresponding to the target entity information.
  • the apparatus may further include:
  • the update module may be used to add the associated entity information corresponding to the co-located entity description information to the target entity information, and update the target entity information; wherein the co-located entity description information includes description information that describes the associated entity information as another expression form of the target entity information;
  • the first extraction module 404 may also be used to extract at least one associated entity information corresponding to the updated target entity information and the association relationship information between the target entity information and the associated entity information from the data source.
  • the device may further include:
  • the second extraction module can be used to take each piece of associated entity information extracted from the data source as target entity information in turn and extract its information flow, obtaining the information flows corresponding to multiple pieces of target entity information;
  • the second association module can be used to link the information flows corresponding to the multiple pieces of target entity information, with the corresponding target entity information as the reference information, to obtain the information graph.
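  • How the second extraction module and the second association module could cooperate is sketched below: every associated entity found for a target is itself treated as a new target, and the resulting flows are keyed by their target entity to form a graph. The breadth-first traversal, the `max_depth` limit, and the triple layout `(head, relation, tail)` are assumptions of this example rather than details taken from the specification.

```python
from collections import deque


def extract_flow(target, triples):
    """Stand-in for extraction: the links in which the target takes part."""
    return [t for t in triples if target in (t[0], t[2])]


def build_information_graph(seed, triples, max_depth=2):
    """Map each visited target entity to its information flow."""
    graph = {}
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if entity in graph:
            continue  # this entity has already served as a target
        flow = extract_flow(entity, triples)
        graph[entity] = flow
        if depth < max_depth:
            for head, _, tail in flow:
                neighbour = tail if head == entity else head
                if neighbour not in graph:
                    queue.append((neighbour, depth + 1))
    return graph


# Illustrative data only.
triples = [
    ("arc discharge", "produces", "fullerene"),
    ("fullerene", "used_in", "polymer additive"),
    ("polymer additive", "improves", "tensile strength"),
    ("tensile strength", "measured_in", "MPa"),
]
graph = build_information_graph("fullerene", triples, max_depth=1)
```

  • The depth limit keeps the graph bounded; with `max_depth=1` only the direct neighbours of the seed entity become targets of their own.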
  • the first extraction module 404 may include:
  • the retrieval unit can be used to retrieve the data source where the target entity information is located;
  • the locating unit may be used to locate the context information where the target entity information is located in the data source;
  • the extracting unit may be used to extract, when the context information includes a material process entity, a material name entity, and/or a material application entity, the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information according to the material process information association method.
  • the device may further include:
  • the third extraction module can be used to extract the product names of the target entity information and the associated entity information from the data source;
  • the third association module may be used to extract the manufacturer or supplier corresponding to the target entity information or associated entity information according to the product name, and to associate the manufacturer or supplier with the target entity information or associated entity information.
  • the device may further include:
  • the fourth extraction module may be used to extract the parameter entity information of the target entity information and the associated entity information from the data source, where the parameter entity information includes entity information corresponding to at least one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measurement entity;
  • the fourth association module can be used to associate the extracted parameter entity information with corresponding target entity information and associated entity information.
  • the first association module 406 may further include:
  • the third association unit may be used to associate the sub-information stream of the target entity information with the corresponding data source.
  • the device may further include:
  • the visualization processing module can be used to treat each entity information in the information flow as an information node for interacting with the user, and use a visualization method to visualize the information flow;
  • the first sending module may be used to send the processed information stream to the client, so that the user can view the information stream and trigger the information node through the client;
  • the second sending module may be configured to feed back other information associated with the information node to the client based on a user's triggering operation on the information node.
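  • The visualization and sending modules above can be pictured with the following sketch, in which each entity in the information flow becomes an interactive node and a trigger on a node returns further information about it. The node-link payload layout, the field names, and the `details` lookup table are assumptions made for illustration.

```python
import json


def to_node_link(flow):
    """Turn (head, relation, tail) links into a node-link structure."""
    nodes, edges = {}, []
    for head, relation, tail in flow:
        for name in (head, tail):
            # Every entity becomes one node the user can interact with.
            nodes.setdefault(name, {"id": name, "interactive": True})
        edges.append({"from": head, "to": tail, "type": relation})
    return {"nodes": list(nodes.values()), "edges": edges}


def on_node_triggered(node_id, details):
    """Return the extra information fed back when a user triggers a node."""
    return details.get(node_id, {})


# Illustrative data only.
flow = [
    ("arc discharge", "produces", "fullerene"),
    ("fullerene", "used_in", "polymer additive"),
]
details = {"fullerene": {"sources": ["doc-1"], "aliases": ["C60"]}}

payload = json.dumps(to_node_link(flow))      # what the server would send
decoded = json.loads(payload)                 # what the client would receive
feedback = on_node_triggered("fullerene", details)
```

  • Serializing the flow as nodes and directed, typed edges lets the client draw the association direction and type without repeating the extraction.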
  • the above-mentioned device may also include other implementation manners according to the description of the method embodiment.
  • For specific implementation manners, reference may be made to the description of the related method embodiments, which will not be repeated here.
  • The information flow extraction device provided by one or more embodiments of this specification extracts entity information and the association relationship information between the entity information at the same time. The association relationship information can then be used to verify and filter the extracted entity information, that is, to determine whether the extracted entity information is actually related to the target entity information, which further improves the accuracy of extracting entity information associated with the target entity information and effectively filters noise.
  • The association relationship information can also be used to associate the extracted entity information with the target entity information and to display the association relationship between them. This makes the results convenient for users to view and organize, so that users can accurately and efficiently obtain the useful information they need, find needed or new solutions, and enjoy an improved user experience.
  • This specification also provides an information flow extraction device that includes a processor and a memory storing processor-executable instructions. When the instructions are executed by the processor, the steps of the method described in any one of the foregoing embodiments are implemented.
  • the storage medium may include a physical device for storing information, and usually the information is stored in an electric, magnetic, or optical medium after digitizing the information.
  • The storage medium may include: devices that use electric energy to store information, such as various types of memory (RAM, ROM, etc.); devices that use magnetic energy to store information, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and U disks; and devices that use optical means to store information, such as CDs or DVDs.
  • Other types of readable storage media are also possible, such as quantum memory, graphene memory, and so on.
  • the above-mentioned device may also include other implementation manners according to the description of the method embodiment.
  • For specific implementation manners, reference may be made to the description of the related method embodiments, which will not be repeated here.
  • The information flow extraction device described in the above embodiment extracts entity information and the association relationship information between the entity information at the same time. The association relationship information can then be used to verify and filter the extracted entity information, that is, to determine whether the extracted entity information is actually related to the target entity information, which further improves the accuracy of extracting entity information related to the target entity information and effectively filters noise.
  • The association relationship information can also be used to associate the extracted entity information with the target entity information and to display the association relationship between them. This makes the results convenient for users to view and organize, so that users can accurately and efficiently obtain the useful information they need, find needed or new solutions, and enjoy an improved user experience.
  • the embodiment of this specification also provides an information flow display method, which is applied to a server, and the method may include:
  • At least one piece of associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, are extracted from the data source, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • the association relationship information is used to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information;
  • the information stream is sent to the client for display on the client.
  • an embodiment of this specification also provides a server, which may include:
  • the first receiving module may be used to receive an information flow acquisition request sent by the client, where the information flow acquisition request includes input information acquired by the client;
  • the fifth extraction module may be used to extract one or more pieces of entity information from the input information as target entity information;
  • the sixth extraction module may be used to extract, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and the association type between the entity information;
  • the fifth association module may be configured to use the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information;
  • the third sending module may be used to send the information stream to the client for display on the client.
  • the embodiment of this specification also provides an information flow display method, which is applied to the client and may include:
  • An information flow acquisition request is sent to the server, where the information flow acquisition request includes the input information obtained by the client, so that the server receives the information flow acquisition request; extracts one or more pieces of entity information from the input information as the target entity information; extracts, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between the entity information; uses the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information; and sends the information flow to the client;
  • an embodiment of this specification also provides a client, which may include:
  • the fourth sending module may be used to send an information flow acquisition request to the server, where the information flow acquisition request includes the input information obtained by the client, so that the server receives the information flow acquisition request; extracts one or more pieces of entity information from the input information as target entity information; extracts, from the data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between the entity information; uses the association relationship information to associate the target entity information and the associated entity information to obtain the information flow corresponding to the target entity information; and sends the information flow to the client;
  • the second receiving module can be used to receive the information stream sent by the server;
  • the first display module can be used to display the information stream.
  • This specification also provides an information flow display device that includes a processor and a memory storing processor-executable instructions. When the instructions are executed by the processor, the steps of the method described in any one of the foregoing embodiments are implemented.
  • the storage medium may include a physical device for storing information, and usually the information is stored in an electric, magnetic, or optical medium after digitizing the information.
  • The storage medium may include: devices that use electric energy to store information, such as various types of memory (RAM, ROM, etc.); devices that use magnetic energy to store information, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and U disks; and devices that use optical means to store information, such as CDs or DVDs.
  • Other types of readable storage media are also possible, such as quantum memory, graphene memory, and so on.
  • the above-mentioned device may also include other implementation manners according to the description of the method embodiment.
  • For specific implementation manners, reference may be made to the description of the related method embodiments, which will not be repeated here.
  • an information retrieval method may also be provided, and the method may include:
  • S60 The client sends a search request, where the search request includes input information obtained by the client;
  • The server receives the search request sent by the client, and extracts entity information from the input information, where the entity information includes an entity type and an entity value;
  • The server determines the data source where the entity information is located according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between the entity information;
  • S66 The server sends the search result to the client, so that the client can display it.
  • the input information may be reference information for information retrieval, and the server uses the input information as a reference to retrieve a data source that directly describes or indirectly describes the input information.
  • the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information can be implemented with reference to the embodiments in the above-mentioned information flow extraction method, and will not be repeated here.
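  • The difference from plain keyword matching can be sketched as follows: a data source is returned only if the queried entity (type plus value) takes part in at least one extracted association in that source. The index layout, the entity type names, and the relation labels are hypothetical and chosen only for this example.

```python
def find_data_sources(entity_type, entity_value, index):
    """Return the sources in which the entity appears in an extracted link.

    index maps a source id to a list of links of the form
    (head_type, head_value, relation, tail_type, tail_value).
    """
    query = (entity_type, entity_value)
    hits = {}
    for source_id, links in index.items():
        matched = [link for link in links
                   if (link[0], link[1]) == query or (link[3], link[4]) == query]
        if matched:
            hits[source_id] = matched
    return hits


# Illustrative data only; "doc-3" mentions the keyword but has no extracted link.
index = {
    "doc-1": [("material", "fullerene", "used_in", "application", "polymer additive")],
    "doc-2": [("material", "graphene", "used_in", "application", "coating")],
    "doc-3": [],
}
hits = find_data_sources("material", "fullerene", index)
```

  • Because the entity type is part of the query, a source that uses the same word as a different kind of entity is not matched either.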
  • Fig. 26 shows a search interface for searching for "Fullerene as polymer additives" using keywords and a schematic diagram of the search results.
  • In FIG. 26, the results shown in darker gray are unrelated search results, and those shown in lighter gray are relevant search results.
  • Fig. 27 is a schematic diagram of a retrieval interface using the solution of the embodiment of the present specification to perform information retrieval based on material information flow.
  • FIG. 28 is a schematic diagram of the retrieval results obtained with the solution of the embodiment of this specification; with this solution, all relevant results are accurately extracted.
  • Name variants of "Fullerene" can be further extracted, that is, its co-located (parity) information, such as the Chinese description of "Fullerene".
  • the material structure may refer to the way in which the atoms (or ions, molecules) constituting the material are combined with each other or the form of formation (ie, structural elements), and the combination, arrangement, and various connections between the structural elements in a certain order.
  • Different materials have different structural elements.
  • various phases, structures, defects, monomers, and macromolecular chains of materials are all structural elements of materials.
  • Materials are classified according to material structure, such as polymer, metal, ceramic, biological, composite and other material classifications.
  • Figure 31 is a schematic diagram showing the classification of polymers further refined according to the material structure.
  • the material application may include the application field or specific application corresponding to the material.
  • The application field classification can include building materials, energy materials, printing materials, optoelectronic materials, and so on. Different application fields can be further subdivided down to specific applications, such as organic electric display coatings in optoelectronic materials. Figure 33 shows a further detailed classification diagram of smart materials.
  • For different material applications, the raw materials, processing technology, manufacturing methods, etc. involved may also be quite different.
  • Extracting entity information and the association relationship information between entity information on the basis of material applications can therefore further improve the accuracy of information extraction, reduce noise interference, and make the retrieved data sources better match the needs of users.
  • many retrieval systems can also be classified according to application fields, as shown in Figure 29.
  • The left part of Fig. 29 shows broader application categories, and the right part shows application categories that further refine one of the categories on the left. Therefore, by taking the material application into account when searching, the search can be targeted at the data sources corresponding to the relevant application field, which greatly improves retrieval efficiency and retrieval accuracy.
  • Classification and retrieval according to material structure and material application can reduce the difficulty of detecting material graphics at a fine-grained level, which helps to detect different materials and material patterns in images and improves retrieval accuracy.
  • Figure 34 shows a schematic diagram of the search results retrieved using keywords;
  • Figure 35 shows a schematic diagram of the retrieval interface provided by an embodiment of this specification;
  • Figure 36 shows a schematic diagram of the corresponding search results. It can be seen from Figure 36 that the embodiments of this specification further limit the application field when searching, and the server can further take into account the patents in which the material is applied before extracting entity information and the association relationship information between entity information, which improves the accuracy of image search, image extraction, image object detection, and image classifier results.
  • FIG. 37 shows a schematic diagram of a protein structure, a DNA plasmid and a microstructure of a material
  • FIG. 38 shows a schematic diagram of a circuit diagram and a flow chart.
  • the information input interface displayed by the client may further include an information input area, a first selection list, and/or a second selection list.
  • the information input area may be used for the user to input information
  • the first selection list and the second selection list may be used for the user to select information.
  • the first selection list may include material application or material structure category information.
  • the second selection list may include category information of the entity type.
  • the client terminal obtains input information based on the information input area, the first selection list, and/or the second selection list.
  • The user can perform information retrieval by entering information in the input box and limiting the entity type of the entered information.
  • Information retrieval in this way facilitates accurate extraction of the entity type and entity value from the entered retrieval information, which improves the accuracy of information retrieval.
  • The application field or structure category can also be limited.
  • Limiting the application field or structure category makes it convenient to determine the extraction features of the entity information, and of the association relationship information between the entity information, that correspond to that application field or structure category, which improves the accuracy of information extraction.
  • the preset material structure and material application classification selection method can also make the input information more standardized and improve the accuracy of information retrieval.
  • the first selection list or the second selection list may also adopt an interactive visualization format to display the information to be selected.
  • The interactive visualization format means that the category information of the entity type, material structure, or material application is displayed in a visual form, and the list selection format for the corresponding category and subcategory is determined by receiving the trigger operation on each piece of category information, as shown in Figures 30 to 33.
  • Figure 30 shows several larger categories of material structure display diagrams.
  • the user can click on the material structure, and the client can determine that the user has selected the material structure as part of the input information based on the user's click operation.
  • the client can further display several larger categories.
  • the user can further click on any of these categories, such as clicking on polymers.
  • the client can determine that the user has further selected the polymer as part of the input information based on the user's click operation.
  • the client can further display the detailed classification of polymers, as shown in Figure 31.
  • the user can further click on any of the categories in FIG. 31, and the client can determine the detailed category further selected by the user as part of the input information based on the user's click operation.
  • the user's operation method can also adopt other methods.
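  • The click-through selection described above can be modeled as a walk down a category tree, where each click appends one category to the input information. The sketch below uses the top-level material structure categories named in this specification; the nested-dictionary layout and the function names are assumptions, and the polymer subcategories of Figure 31 are left empty because the text does not list them.

```python
CATEGORY_TREE = {
    "material structure": {
        "polymer": {},      # subcategories (Figure 31) are not listed in the text
        "metal": {},
        "ceramic": {},
        "biological": {},
        "composite": {},
    },
}


def children(tree, path):
    """List the categories the client would display under the selected path."""
    node = tree
    for name in path:
        node = node[name]
    return sorted(node)


def click(tree, path, name):
    """Record a click on one displayed category and return the new path."""
    if name not in children(tree, path):
        raise KeyError(name)
    return path + [name]


# Simulated user interaction.
path = click(CATEGORY_TREE, [], "material structure")
options = children(CATEGORY_TREE, path)
path = click(CATEGORY_TREE, path, "polymer")
```

  • The final `path` is the part of the input information contributed by the selection list; a click on a category that is not currently displayed is rejected.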
  • Figure 32 shows the display diagrams of several larger categories of material applications
  • Figure 33 shows the detailed classification diagram of smart materials.
  • the triggering of material application and the display implementation can be the same as the material structure, which will not be repeated here.
  • Displaying the detailed classifications of material structure and material application in this way is more convenient, and also improves interactivity and the user experience.
  • FIG 39 shows the 10 properties of smart materials.
  • the internal structure of the material can be changed with changes in chemical composition and external conditions, thereby changing the properties of the material.
  • For example, low carbon steel with a carbon mass fraction below 0.25% usually has good plasticity and toughness but low strength and hardness, whereas high carbon steel with a carbon mass fraction in the range of 0.6% to 1.4% has higher strength and hardness but poor plasticity and toughness.
  • Different properties of materials have a greater impact on the actual application of materials. For example, materials with different yield strengths have different application fields.
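  • The carbon steel example above amounts to a simple threshold rule, sketched below. Only the two ranges stated in the text are encoded; values between them are deliberately reported as not classified, and the function name and return strings are assumptions.

```python
def classify_carbon_steel(carbon_mass_percent):
    """Classify steel by carbon mass fraction, given in percent."""
    if carbon_mass_percent < 0:
        raise ValueError("carbon mass fraction cannot be negative")
    if carbon_mass_percent < 0.25:
        # Good plasticity and toughness, low strength and hardness.
        return "low carbon steel"
    if 0.6 <= carbon_mass_percent <= 1.4:
        # Higher strength and hardness, poor plasticity and toughness.
        return "high carbon steel"
    return "not classified in the text"


low = classify_carbon_steel(0.10)
high = classify_carbon_steel(0.80)
other = classify_carbon_steel(0.40)
```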
  • the embodiment of this specification also provides an information retrieval method, which is applied to a server and may include:
  • the data source where the entity information is located is determined according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between the entity information;
  • the search result is sent to the client, so that the client can display it.
  • an embodiment of this specification also provides a server, which may include:
  • the third receiving module may be used to receive a search request sent by the client, where the search request includes input information obtained by the client;
  • the seventh extraction module can be used to extract entity information from the input information, where the entity information includes an entity type and an entity value;
  • the data source determining module may be used to determine the data source where the entity information is located according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, and the association relationship information includes Information describing the direction and type of association between entity information;
  • the fifth sending module may be used to send the search result to the client, so that the client can display it.
  • the embodiment of this specification also provides an information retrieval method, which is applied to the client and may include:
  • A search request is sent to the server, where the search request includes the input information obtained by the client, so that the server extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source where the entity information is located according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between the entity information; and sends the retrieval result to the client;
  • the retrieval result sent by the server is received and displayed.
  • the information input interface displayed by the client may include an information input area, a first selection list, and/or a second selection list; the information input area may be used for information input; the first selection list and the second selection list may be used for information selection; the first selection list may include category information of material applications or material structures; the second selection list may include category information of entity types;
  • the client can obtain input information based on the information input area, the first selection list, and/or the second selection list.
  • the first selection list and/or the second selection list may display the information to be selected in an interactive visualization format; the interactive visualization format may be a list-selection format in which the category information of the entity type, the material structure, or the material application is displayed visually, and the corresponding category and subcategory are determined by receiving a trigger operation on each piece of category information.
  • an embodiment of this specification also provides a client, which may include:
  • the sixth sending module can be used to send a search request to the server, where the search request includes input information obtained by the client, so that the server extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source where the entity information is located according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information; and sends the search result to the client;
  • the fourth receiving module may be used to receive the retrieval result sent by the server;
  • the second display module can be used to display the search results.
  • the client may further include:
  • the input interface display module may be used to display an information input interface.
  • the information input interface may include an information input area, a first selection list, and/or a second selection list; the information input area is used for information input; the first selection list and the second selection list can be used for information selection; the first selection list can include category information of material applications or material structures; the second selection list can include category information of entity types;
  • the input information obtaining module may be used to obtain input information based on the information input area, the first selection list and/or the second selection list.
  • this specification also provides an information retrieval device that includes a processor and a memory storing processor-executable instructions; when the instructions are executed by the processor, the steps of the method described in any one of the foregoing embodiments are implemented.
  • the storage medium may include a physical device for storing information, and usually the information is stored in an electric, magnetic, or optical medium after digitizing the information.
  • the storage medium may include: devices that use electric energy to store information, such as various types of memory (for example, RAM and ROM); devices that use magnetic energy to store information, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and U disks; and devices that use optical means to store information, such as CDs or DVDs.
  • other types of storage media may also be used, such as quantum memory and graphene memory.
  • the above-mentioned device may also include other implementation manners according to the description of the method embodiment.
  • for specific implementation manners, reference may be made to the description of the related method embodiments, which will not be repeated here.
  • a patent or thesis document may describe a large amount of information, and the abstract information given by the author or applicant cannot effectively reflect the main information of the entire patent or thesis document.
  • users are usually accustomed to preliminarily determining whether the required information exists in the current literature only through abstracts, which leads to missing information.
  • the user would otherwise need to spend considerable effort reading and analyzing the entire content of the document, which is time-consuming and labor-intensive.
  • a method for generating summary information may also be provided, and the method may include:
  • S80 The client sends a summary information generation request, where the summary information generation request includes the data source for which the summary is to be generated;
  • S82 The server receives the summary information generation request sent by the client;
  • the server extracts the entity information and the association relationship information between the entity information from the data source, and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between pieces of entity information;
  • S86 The server generates summary information according to the associated entity information, and sends the generated summary information to the client, so that the client can display it.
  • the data source may include the patent application text, thesis or other types of information text for which abstract information is to be generated.
  • after the server obtains the corresponding information text, it can use NLP to process and tokenize the original information text.
  • after tokenization, NER can be used, based on a pre-built entity information dictionary or entity information database, to extract entity information from the information text, such as material name entities or material application entities; the extracted entity information is taken as the object to be analyzed.
  • the server can use the NLP algorithm to obtain the metadata of the object to be analyzed.
  • the entity information in the context information of the object to be analyzed can be extracted, and a filtering algorithm can be used to delete noise words.
  • the noise words include stop words, common words, non-scientific words and words that have nothing to do with the material context.
  • the filtered entity information can be attached to these objects as metadata.
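The tokenization, dictionary-based entity extraction, and noise-word filtering described above can be sketched as follows. This is a simplified stand-in for a real NLP/NER pipeline: `ENTITY_DICTIONARY`, `NOISE_WORDS`, the function names, and the window size are assumptions made only for illustration.

```python
import re

# Hypothetical pre-built entity dictionary: surface form -> entity type.
ENTITY_DICTIONARY = {
    "graphene oxide": "material_name",
    "graphene": "material_name",
    "lithium battery": "material_application",
}
# Hypothetical noise list (stop words, common and non-scientific words).
NOISE_WORDS = {"the", "a", "an", "of", "is", "in", "for", "very",
               "widely", "used", "as"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_entities(tokens, max_len=3):
    """Greedy longest-match lookup of token n-grams in the dictionary."""
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in ENTITY_DICTIONARY:
                found.append((phrase, ENTITY_DICTIONARY[phrase], i, i + n))
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return found


def context_metadata(tokens, start, end, window=8):
    """Collect non-noise context tokens around tokens[start:end]."""
    context = tokens[max(0, start - window):start] + tokens[end:end + window]
    return [t for t in context if t not in NOISE_WORDS]


text = "Graphene oxide is widely used as an anode coating in the lithium battery."
tokens = tokenize(text)
for phrase, etype, start, end in extract_entities(tokens):
    # The filtered context words are attached to the entity as metadata.
    print(phrase, etype, context_metadata(tokens, start, end))
```

The longest-match rule makes "graphene oxide" win over the shorter dictionary entry "graphene"; a production system would more likely use a trained NER model than a plain lookup.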
  • the presentation form of the information summary may be a piece of text information generated based on the entity information and the association relationship information between the entity information, or may be an information flow formed based on the entity information and the association relationship information between the entity information.
  • Entity similarity can be considered by comparing multiple attributes related to different contexts and metadata.
  • Each entity may have a unique threshold for entity matching. For example, it is possible to find similarities/matches between entities based on entity thresholds, and combine entities and relationships to form important workflow links between two documents.
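One way to picture threshold-based entity matching is the sketch below. It assumes that similarity is the average of a name similarity and a context-overlap score and that thresholds are kept per entity type; the specification does not fix these choices, and `THRESHOLDS`, `similarity`, and `is_match` are hypothetical names.

```python
from difflib import SequenceMatcher

# Hypothetical per-entity-type matching thresholds.
THRESHOLDS = {"material_name": 0.8, "material_application": 0.6}


def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def similarity(e1, e2):
    """Average of name similarity and context (metadata) overlap."""
    name_score = SequenceMatcher(None, e1["value"], e2["value"]).ratio()
    context_score = jaccard(e1["metadata"], e2["metadata"])
    return (name_score + context_score) / 2


def is_match(e1, e2):
    """Entities match if they share a type and clear that type's threshold."""
    if e1["type"] != e2["type"]:
        return False
    return similarity(e1, e2) >= THRESHOLDS.get(e1["type"], 0.75)


# Entities found in two different documents, with their context metadata.
a = {"type": "material_name", "value": "graphene oxide",
     "metadata": ["anode", "coating"]}
b = {"type": "material_name", "value": "graphene oxide",
     "metadata": ["anode", "coating", "electrode"]}
c = {"type": "material_name", "value": "silicon carbide",
     "metadata": ["abrasive"]}

print(is_match(a, b), is_match(a, c))
```

Matched entities such as `a` and `b` could then serve as the link between the relations extracted from the two documents.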
  • the extraction of the entity information and the association relationship information between the entity information can refer to the implementation in the above-mentioned information flow extraction method, which will not be repeated here.
  • the above method accurately and comprehensively extracts the entity information in the data source and the association relationships between the entity information when generating the summary information, which spares the user from screening and sorting and lets users find the information they need more quickly and accurately.
  • the generated summary information may be displayed in the form of text or tables, or may be displayed in the form of information flow.
  • the extraction method of the corresponding information flow can be performed with reference to the above-mentioned embodiments.
  • the display of summary information in the manner of information flow can make it easier for users to view and improve the user experience.
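The two presentation forms mentioned above, text and information flow, can be illustrated with a small sketch. It assumes the associated entity information is available as (source, association type, target) triples; the sentence templates and function names are hypothetical and only show the idea.

```python
# Hypothetical sentence templates, one per association type.
TEMPLATES = {
    "converted_to": "{source} is converted to {target}.",
    "used_in": "{source} is used in {target}.",
}


def render_text_summary(relations):
    """Turn (source, association_type, target) triples into one sentence each."""
    sentences = []
    for source, rel_type, target in relations:
        template = TEMPLATES.get(rel_type, "{source} is related to {target}.")
        sentences.append(template.format(source=source, target=target))
    return " ".join(sentences)


def render_flow_summary(relations):
    """Render the same triples as a chain, e.g. 'a -[t]-> b -[u]-> c'.

    Consecutive triples are chained when a triple's source equals the
    previous triple's target; otherwise a new segment starts after '; '.
    """
    parts, previous_target = [], None
    for source, rel_type, target in relations:
        if source == previous_target:
            parts.append(f" -[{rel_type}]-> {target}")
        else:
            prefix = "; " if parts else ""
            parts.append(f"{prefix}{source} -[{rel_type}]-> {target}")
        previous_target = target
    return "".join(parts)


relations = [
    ("graphite", "converted_to", "graphene oxide"),
    ("graphene oxide", "used_in", "lithium battery"),
]
print(render_text_summary(relations))
print(render_flow_summary(relations))
```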
  • the embodiment of this specification also provides a method for generating summary information, which is applied to a server and may include:
  • the entity information and the association relationship information between the entity information are extracted from the data source, and the corresponding entity information is associated according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between pieces of entity information;
  • the summary information is generated according to the associated entity information, and the generated summary information is sent to the client, so that the client can display it.
  • an embodiment of this specification also provides a server, which may include:
  • the fifth receiving module may be used to receive a summary information generation request sent by the client, where the summary information generation request includes the data source for which the summary is to be generated;
  • the eighth extraction module can be used to extract entity information and association relationship information between entity information from the data source, where the association relationship information includes information describing the association direction and association type between pieces of entity information;
  • the sixth association module can be used to associate corresponding entity information according to the extracted association relationship information;
  • the generating module is used to generate summary information based on the associated entity information;
  • the seventh sending module may be used to send the generated summary information to the client, so that the client can display it.
  • the embodiment of this specification also provides a method for generating summary information, which is applied to the client and may include:
  • a summary information generation request is sent to the server, where the summary information generation request includes the data source for which the summary is to be generated, so that the server extracts the entity information and the association relationship information between the entity information from the data source and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between pieces of entity information; and so that the server generates summary information based on the associated entity information and sends the generated summary information to the client;
  • an embodiment of this specification also provides a client, which may include:
  • the eighth sending module may be used to send a summary information generation request to the server, where the summary information generation request includes the data source for which the summary is to be generated, so that the server extracts the entity information and the association relationship information between the entity information from the data source and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between pieces of entity information; and so that the server generates summary information based on the associated entity information and sends the generated summary information to the client;
  • the fifth receiving module may be used to receive the summary information sent by the server;
  • the third display module can be used to display the summary information.
  • this specification also provides a summary information generating device, including a processor and a memory storing processor-executable instructions; when the instructions are executed by the processor, the steps of the method described in any one of the foregoing embodiments are implemented.
  • the storage medium may include a physical device for storing information, and usually the information is stored in an electric, magnetic, or optical medium after digitizing the information.
  • the storage medium may include: devices that use electrical energy to store information, such as various types of memory (for example, RAM and ROM); devices that use magnetic energy to store information, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and U disks; and devices that use optical means to store information, such as CDs or DVDs.
  • other types of storage media may also be used, such as quantum memory and graphene memory.
  • the above-mentioned device may also include other implementation manners according to the description of the method embodiment.
  • for specific implementation manners, reference may be made to the description of the related method embodiments, which will not be repeated here.
  • This specification also provides a system, which can be a separate information flow extraction system, or an information flow display system, or an information retrieval system, or an abstract generation system, and can also be applied to a variety of information extraction systems.
  • the system can be a single server, or it can include server clusters, systems (including distributed systems), software (applications), actual operating devices, logic gate circuit devices, quantum computers, and the like, combined with the terminal devices needed to implement the hardware.
  • the information retrieval system may include at least one processor and a memory storing computer-executable instructions, and the processor implements the steps of the method in any one or more of the foregoing embodiments when the processor executes the instructions.

Abstract

An information stream extraction method, comprising: acquiring target entity information; extracting, from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information comprising information describing an association direction and an association type between the entity information; and associating the target entity information with the associated entity information by using the association relationship information, so as to obtain an information stream corresponding to the target entity information. The various embodiments of the present description can greatly improve the accuracy and comprehensiveness of information extraction.

Description

Information flow extraction method, apparatus and device
Technical field
This specification relates to the technical field of computer data processing, and in particular to an information flow extraction method, apparatus, and device.
Background of the invention
At present, there are methods for information extraction and search, such as querying information by keyword. However, in many data sources the descriptions of specific information are complex and variable, and are often inaccurate or incomplete, which makes it difficult to extract the information a user needs accurately and comprehensively.
Summary of the invention
The purpose of the embodiments of this specification is to provide an information flow extraction method, apparatus, and device that can greatly improve the accuracy and comprehensiveness of information queries.
The information flow extraction method, apparatus, and device provided in this specification are implemented as follows:
An information flow display method, including:
The client sends an information flow acquisition request to the server, where the information flow acquisition request includes input information obtained by the client;
The server receives the information flow acquisition request and extracts one or more pieces of entity information from the input information as target entity information;
The server extracts, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between pieces of entity information;
The server uses the association relationship information to associate the target entity information with the associated entity information, obtains the information flow corresponding to the target entity information, and sends the information flow to the client for display on the client.
In another aspect, the embodiments of this specification also provide an information flow extraction method, including:
Obtaining target entity information;
Extracting, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between pieces of entity information;
Using the association relationship information to associate the target entity information with the associated entity information, to obtain the information flow corresponding to the target entity information.
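The three steps of the extraction method can be sketched as follows, assuming the relations in a data source have already been extracted into directed, typed triples. The `Relation` tuple, the function names, and the sample relations are hypothetical and only illustrate the idea.

```python
from collections import namedtuple

# Direction is encoded as source -> target; type is the association type.
Relation = namedtuple("Relation", "source type target")


def extract_information_flow(target_entity, relations):
    """Return the relations that involve the target entity, upstream first."""
    incoming = [r for r in relations if r.target == target_entity]
    outgoing = [r for r in relations if r.source == target_entity]
    return incoming + outgoing


def associated_entities(target_entity, flow):
    """List the entities linked to the target entity by the flow."""
    return [r.source if r.target == target_entity else r.target for r in flow]


# Hypothetical relations extracted from one data source.
relations = [
    Relation("graphite", "raw_material_of", "graphene oxide"),
    Relation("graphene oxide", "used_in", "lithium battery"),
    Relation("silicon", "used_in", "solar cell"),
]

flow = extract_information_flow("graphene oxide", relations)
print(flow)
print(associated_entities("graphene oxide", flow))
```

Keeping the direction and type on every relation is what lets the result read as a flow (raw material, then product, then application) rather than an unordered set of co-occurring terms.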
In another embodiment of the method provided in this specification, using the association relationship information to associate the target entity information with the associated entity information includes:
Obtaining at least one piece of associated entity information extracted from a data source and the association relationship information between the target entity information and the corresponding associated entity information, and using that association relationship information to associate the target entity information with the corresponding associated entity information, to obtain a sub-information flow of the target entity information;
Linking the sub-information flows obtained from the same or different data sources, using the target entity information as the reference information, to obtain the information flow corresponding to the target entity information.
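Linking sub-information flows from several data sources around a shared target entity might look like the sketch below. The function name `link_sub_flows`, the triple layout, and the data source ids are assumptions made for illustration.

```python
def link_sub_flows(target_entity, sub_flows):
    """Merge per-data-source sub-flows that share the same target entity.

    sub_flows maps a data source id to a list of (source, type, target)
    triples. Identical triples are merged, and each merged triple keeps
    the list of data sources it came from.
    """
    linked = {}
    for source_id, triples in sub_flows.items():
        for triple in triples:
            if target_entity in (triple[0], triple[2]):
                linked.setdefault(triple, []).append(source_id)
    return linked


# Hypothetical sub-flows extracted from two different documents.
sub_flows = {
    "patent-A": [("graphite", "raw_material_of", "graphene oxide")],
    "paper-B": [
        ("graphene oxide", "used_in", "lithium battery"),
        ("graphite", "raw_material_of", "graphene oxide"),
        ("silicon", "used_in", "solar cell"),
    ],
}

for triple, sources in link_sub_flows("graphene oxide", sub_flows).items():
    print(triple, sources)
```

Recording the originating data sources alongside each relation also keeps every part of the linked flow traceable to the documents it came from.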
In another embodiment of the method provided in this specification, when the association relationship information includes co-located entity description information, the associated entity information corresponding to the co-located entity description information is added to the target entity information to update the target entity information, where the co-located entity description information is description information indicating that the associated entity information is another form of expression of the target entity information;
Correspondingly, at least one piece of associated entity information corresponding to the updated target entity information, and the association relationship information between the target entity information and the associated entity information, are extracted from the data source.
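The update-then-extract behaviour can be sketched as follows, assuming that a co-located (alternative expression) relationship is stored as a relation of a dedicated type. The type label `alias_of` and the function name are hypothetical.

```python
def expand_target_with_aliases(target_names, relations, alias_type="alias_of"):
    """Add entities linked by an alias relation, then re-select relations."""
    names = set(target_names)
    changed = True
    while changed:  # repeat until no new alias is discovered
        changed = False
        for source, rel_type, target in relations:
            if rel_type != alias_type:
                continue
            if source in names and target not in names:
                names.add(target)
                changed = True
            elif target in names and source not in names:
                names.add(source)
                changed = True
    # Extract again using the updated (expanded) target entity information.
    related = [r for r in relations
               if r[1] != alias_type and (r[0] in names or r[2] in names)]
    return names, related


relations = [
    ("GO", "alias_of", "graphene oxide"),
    ("GO", "used_in", "lithium battery"),
    ("graphene oxide", "prepared_by", "Hummers method"),
    ("silicon", "used_in", "solar cell"),
]

names, related = expand_target_with_aliases({"graphene oxide"}, relations)
print(names)
print(related)
```

Without the expansion step, the relation stated for the abbreviation "GO" would be missed when searching for "graphene oxide".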
In another embodiment of the method provided in this specification, the method further includes:
Taking each piece of associated entity information extracted from the data source as target entity information and extracting its information flow, to obtain the information flows corresponding to multiple pieces of target entity information;
Linking the information flows corresponding to the multiple pieces of target entity information, using the respective target entity information as the reference information, to obtain an information graph.
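Treating each associated entity as a new target amounts to a graph expansion. The sketch below uses a breadth-first traversal with a depth limit; the traversal order, the `max_depth` parameter, and the function name are assumptions, since the specification does not prescribe them.

```python
from collections import deque


def build_information_graph(seed, relations, max_depth=2):
    """Breadth-first expansion: each associated entity becomes a new target."""
    edges, seen, queue = [], {seed}, deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for triple in relations:
            source, _, target = triple
            if entity not in (source, target):
                continue
            if triple not in edges:
                edges.append(triple)
            neighbour = target if source == entity else source
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return seen, edges


relations = [
    ("graphite", "raw_material_of", "graphene oxide"),
    ("graphene oxide", "used_in", "lithium battery"),
    ("lithium battery", "used_in", "electric vehicle"),
    ("electric vehicle", "sold_by", "dealer"),
]

nodes, edges = build_information_graph("graphene oxide", relations)
print(sorted(nodes))
print(edges)
```

With `max_depth=2`, the graph reaches "electric vehicle" but stops before "dealer"; a depth limit of this kind is one simple way to keep the graph bounded.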
In another embodiment of the method provided in this specification, extracting from the data source at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information includes:
Retrieving the data source in which the target entity information is located;
Locating the context information of the target entity information in the data source;
When the context information contains a material process entity, a material name entity, and/or a material application entity, extracting the associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, according to a material process information association manner.
In another embodiment of the method provided in this specification, the method further includes:
Extracting the trade names of the target entity information and the associated entity information from the data source;
Extracting, according to the trade name, the manufacturer or supplier corresponding to the target entity information or the associated entity information, and associating the manufacturer or supplier with the target entity information or the associated entity information.
In another embodiment of the method provided in this specification, the method further includes:
Extracting parameter entity information of the target entity information and the associated entity information from the data source, where the parameter entity information includes at least the entity information corresponding to one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measurement entity;
Associating the extracted parameter entity information with the corresponding target entity information and associated entity information.
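As a rough illustration of parameter entities, the sketch below pulls material attribute, measurement, and unit mentions out of a sentence with a regular expression and attaches them to an entity. The vocabularies, the "property of value unit" pattern, and the function name are assumptions; real text would need far more robust extraction.

```python
import re

# Hypothetical vocabularies for material-attribute and unit entities.
PROPERTIES = ["tensile strength", "thickness", "density"]
UNITS = ["GPa", "MPa", "nm", "g/cm3"]

PATTERN = re.compile(
    r"(?P<prop>" + "|".join(map(re.escape, PROPERTIES)) + r")"
    r"\s+of\s+(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>" + "|".join(map(re.escape, UNITS)) + r")"
)


def attach_parameters(entity, sentence):
    """Return the entity with any 'property of value unit' mentions attached."""
    params = [
        {"property": m["prop"], "value": float(m["value"]), "unit": m["unit"]}
        for m in PATTERN.finditer(sentence)
    ]
    return {"entity": entity, "parameters": params}


sentence = ("The graphene film has a thickness of 0.34 nm "
            "and a tensile strength of 130 GPa.")
print(attach_parameters("graphene film", sentence))
```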
In another embodiment of the method provided in this specification, the method further includes: associating the sub-information flow of the target entity information with the corresponding data source.
In another embodiment of the method provided in this specification, the method further includes:
Taking each piece of entity information in the information flow as an information node for interacting with the user, and visualizing the information flow using a visualization method;
Sending the processed information flow to the client, so that the user can view the information flow and trigger the information nodes through the client;
Based on the user's trigger operation on an information node, feeding back to the client other information associated with that information node.
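A minimal sketch of the node-based interaction follows. It assumes the information flow is sent to the client as a JSON payload of nodes and directed edges, and that the extra information for each node is kept in a lookup table; the payload shape, field names, and sample data are all hypothetical.

```python
import json


def to_visual_payload(flow):
    """Convert (source, type, target) triples into nodes and directed edges."""
    names = []
    for source, _, target in flow:
        for name in (source, target):
            if name not in names:
                names.append(name)
    nodes = [{"id": i, "label": name} for i, name in enumerate(names)]
    edges = [{"from": names.index(s), "to": names.index(t), "label": r}
             for s, r, t in flow]
    return {"nodes": nodes, "edges": edges}


def on_node_triggered(node_label, extra_info):
    """Return the other information associated with a triggered node."""
    return extra_info.get(node_label, {})


flow = [
    ("graphite", "raw_material_of", "graphene oxide"),
    ("graphene oxide", "used_in", "lithium battery"),
]
# Hypothetical extra information kept on the server for each node.
extra_info = {"graphene oxide": {"data_sources": ["patent-A", "paper-B"]}}

payload = to_visual_payload(flow)
print(json.dumps(payload, indent=2))
print(on_node_triggered("graphene oxide", extra_info))
```

A client-side graph library could render such a payload directly, with each node click mapped to a call like `on_node_triggered`.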
In another aspect, the embodiments of this specification also provide an information flow extraction apparatus, including:
A first obtaining module, configured to obtain target entity information;
A first extraction module, configured to extract, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the association direction and association type between pieces of entity information;
A first association module, configured to use the association relationship information to associate the target entity information with the associated entity information, to obtain the information flow corresponding to the target entity information.
In another embodiment of the apparatus provided in this specification, the first association module includes:
A first association unit, configured to obtain at least one piece of associated entity information extracted from a data source and the association relationship information between the target entity information and the corresponding associated entity information, and to use that association relationship information to associate the target entity information with the corresponding associated entity information, to obtain a sub-information flow of the target entity information;
A second association unit, configured to link the sub-information flows obtained from the same or different data sources, using the target entity information as the reference information, to obtain the information flow corresponding to the target entity information.
In another embodiment of the apparatus provided in this specification, when the association relationship information includes co-located entity description information, the apparatus further includes:
An update module, configured to add the associated entity information corresponding to the co-located entity description information to the target entity information to update the target entity information, where the co-located entity description information is description information indicating that the associated entity information is another form of expression of the target entity information;
The first extraction module is further configured to extract, from the data source, at least one piece of associated entity information corresponding to the updated target entity information and the association relationship information between the target entity information and the associated entity information.
In another embodiment of the apparatus provided in this specification, the apparatus further includes:
A second extraction module, configured to take each piece of associated entity information extracted from the data source as target entity information and extract its information flow, to obtain the information flows corresponding to multiple pieces of target entity information;
A second association module, configured to link the information flows corresponding to the multiple pieces of target entity information, using the respective target entity information as the reference information, to obtain an information graph.
In another embodiment of the apparatus provided in this specification, the first extraction module includes:
A retrieval unit, configured to retrieve the data source in which the target entity information is located;
A locating unit, configured to locate the context information surrounding the target entity information in the data source;
An extraction unit, configured to, when the context information contains a material process entity, a material name entity and/or a material application entity, extract the associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, according to a material process information association mode.
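The retrieve, locate, and extract sequence above might look roughly like the following toy example. It is a sketch under strong assumptions: the `LEXICON` table, its entity-type and association-type labels, and the sentence-level notion of "context" are all hypothetical stand-ins for whatever entity recognition the real system uses.

```python
import re

# Hypothetical mini-lexicon: surface string -> (entity type, association type).
LEXICON = {
    "sintering": ("material_process", "prepared_by"),
    "alumina ceramic": ("material_name", "related_material"),
    "circuit substrate": ("material_application", "used_in"),
}


def locate_context(text, target):
    """Return the sentences of `text` that mention `target` (case-insensitive)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if target.lower() in s.lower()]


def extract_associations(text, target):
    """Return (target, association type, associated entity, entity type) tuples.

    Only sentences that contain the target are inspected, so entities that
    appear elsewhere in the data source are not treated as associated.
    """
    found = []
    for sentence in locate_context(text, target):
        lowered = sentence.lower()
        for surface, (entity_type, relation) in LEXICON.items():
            if surface != target.lower() and surface in lowered:
                found.append((target, relation, surface, entity_type))
    return found
```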
In another embodiment of the apparatus provided in this specification, the apparatus further includes:
A third extraction module, configured to extract, from the data source, the trade names of the target entity information and the associated entity information;
A third association module, configured to extract, according to the trade name, the manufacturer or supplier corresponding to the target entity information or the associated entity information, and to associate the manufacturer or supplier with the target entity information or the associated entity information.
In another embodiment of the apparatus provided in this specification, the apparatus further includes:
A fourth extraction module, configured to extract, from the data source, parameter entity information of the target entity information and the associated entity information, the parameter entity information including entity information corresponding to at least one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measure entity;
A fourth association module, configured to associate the extracted parameter entity information with the corresponding target entity information and associated entity information.
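As a rough illustration of extracting attribute, measure, and unit entities and attaching them to a material, consider the regular-expression sketch below. The pattern, the two attribute names, and the two units are assumptions chosen for the example; the specification does not prescribe a pattern-based extractor.

```python
import re

# Hypothetical pattern covering phrases such as "hardness of 15 GPa".
PARAMETER_PATTERN = re.compile(
    r"(?P<attribute>hardness|density)\s+of\s+"
    r"(?P<measure>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>GPa|g/cm3)"
)


def link_parameters(material, text):
    """Extract (attribute, measure, unit) groups and link them to `material`."""
    return [
        {
            "material": material,
            "attribute": match["attribute"],
            "measure": float(match["measure"]),
            "unit": match["unit"],
        }
        for match in PARAMETER_PATTERN.finditer(text)
    ]
```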
In another embodiment of the apparatus provided in this specification, the first association module further includes:
A third association unit, configured to associate each sub-information stream of the target entity information with its corresponding data source.
In another embodiment of the apparatus provided in this specification, the apparatus further includes:
A visualization processing module, configured to treat each piece of entity information in the information stream as an information node for interacting with the user, and to visualize the information stream using a visualization method;
A first sending module, configured to send the processed information stream to the client, so that the user can view the information stream and trigger the information nodes through the client;
A second sending module, configured to return to the client, based on the user's trigger operation on an information node, other information associated with that information node.
In another aspect, the embodiments of this specification further provide an information stream extraction device. The device includes a processor and a memory for storing processor-executable instructions; when the instructions are executed by the processor, the steps of any one or more of the methods described above are implemented.
In another aspect, the embodiments of this specification further provide an information stream display method, applied to a server, including:
Receiving an information stream acquisition request sent by a client, the information stream acquisition request including input information acquired by the client;
Extracting one or more pieces of entity information from the input information as target entity information;
Extracting, from a data source, at least one piece of associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information;
Associating the target entity information with the associated entity information using the association relationship information, to obtain the information stream corresponding to the target entity information;
Sending the information stream to the client for display on the client.
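The server-side steps above can be sketched end to end as follows. Everything concrete here is an assumption made for illustration: the JSON request shape with an `"input"` field, the `Association` record, and the in-memory `DATA_SOURCE` and `KNOWN_ENTITIES` that stand in for real data sources and entity recognition.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Association:
    head: str  # entity the association starts from (association direction)
    tail: str  # entity the association points to
    kind: str  # association type, e.g. "prepared_by"


# Hypothetical pre-extracted associations standing in for the data sources.
DATA_SOURCE = [
    Association("alumina ceramic", "sintering", "prepared_by"),
    Association("alumina ceramic", "circuit substrate", "used_in"),
    Association("zirconia", "sintering", "prepared_by"),
]
KNOWN_ENTITIES = {"alumina ceramic", "zirconia", "sintering", "circuit substrate"}


def extract_targets(input_text):
    """Pick out known entities mentioned in the client's input information."""
    lowered = input_text.lower()
    return [entity for entity in sorted(KNOWN_ENTITIES) if entity in lowered]


def build_stream(target):
    """Collect every association touching `target` into one information stream."""
    edges = [a for a in DATA_SOURCE if target in (a.head, a.tail)]
    return {"target": target, "associations": [asdict(a) for a in edges]}


def handle_request(request_json):
    """Receive a request, build one stream per target entity, return JSON."""
    request = json.loads(request_json)
    streams = [build_stream(t) for t in extract_targets(request["input"])]
    return json.dumps({"streams": streams})
```

The client would parse the returned JSON and render each stream; direction is kept explicit through the `head` and `tail` fields so the display can draw arrows.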
In another aspect, the embodiments of this specification further provide a server, including:
A first receiving module, configured to receive an information stream acquisition request sent by a client, the information stream acquisition request including input information acquired by the client;
A fifth extraction module, configured to extract one or more pieces of entity information from the input information as target entity information;
A sixth extraction module, configured to extract, from a data source, at least one piece of associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information;
A fifth association module, configured to associate the target entity information with the associated entity information using the association relationship information, to obtain the information stream corresponding to the target entity information;
A third sending module, configured to send the information stream to the client for display on the client.
In another aspect, the embodiments of this specification further provide an information stream display method, applied to a client, including:
Sending an information stream acquisition request to a server, the information stream acquisition request including input information acquired by the client, so that the server: receives the information stream acquisition request and extracts one or more pieces of entity information from the input information as target entity information; extracts, from a data source, at least one piece of associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information; and associates the target entity information with the associated entity information using the association relationship information, obtains the information stream corresponding to the target entity information, and sends the information stream to the client;
Receiving the information stream sent by the server and displaying it.
In another aspect, the embodiments of this specification further provide a client, including:
A fourth sending module, configured to send an information stream acquisition request to a server, the information stream acquisition request including input information acquired by the client, so that the server: receives the information stream acquisition request and extracts one or more pieces of entity information from the input information as target entity information; extracts, from a data source, at least one piece of associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information; and associates the target entity information with the associated entity information using the association relationship information, obtains the information stream corresponding to the target entity information, and sends the information stream to the client;
A second receiving module, configured to receive the information stream sent by the server;
A first display module, configured to display the information stream.
In another aspect, the embodiments of this specification further provide an information retrieval method, applied to a server, including:
Receiving a retrieval request sent by a client, the retrieval request including input information acquired by the client;
Extracting entity information from the input information, the entity information including an entity type and an entity value;
Determining the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information;
Sending the retrieval result to the client, so that the client displays it.
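A minimal sketch of this retrieval step, assuming each data source has already been reduced to directed triples of typed entities, is shown below. The `Entity` record, the `SOURCES` table, and the document identifiers are hypothetical; note that matching on both type and value is what lets the same string resolve differently depending on the entity type the user selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    type: str   # entity type, e.g. "material" or "process"
    value: str  # entity value, e.g. "sintering"


# Hypothetical index: data source id -> (head, association type, tail) triples.
SOURCES = {
    "doc-1": [(Entity("material", "alumina ceramic"), "prepared_by",
               Entity("process", "sintering"))],
    "doc-2": [(Entity("material", "zirconia"), "prepared_by",
               Entity("process", "sintering"))],
}


def find_sources(query):
    """Return ids of data sources in which `query` takes part in an association."""
    hits = []
    for source_id, triples in SOURCES.items():
        if any(query in (head, tail) for head, _, tail in triples):
            hits.append(source_id)
    return hits
```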
In another aspect, the embodiments of this specification further provide a server, including:
A third receiving module, configured to receive a retrieval request sent by a client, the retrieval request including input information acquired by the client;
A seventh extraction module, configured to extract entity information from the input information, the entity information including an entity type and an entity value;
A data source determining module, configured to determine the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information;
A fifth sending module, configured to send the retrieval result to the client, so that the client displays it.
In another aspect, the embodiments of this specification further provide an information retrieval method, applied to a client, including:
Sending a retrieval request to a server, the retrieval request including input information acquired by the client, so that the server: extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information; and sends the retrieval result to the client;
Receiving the retrieval result sent by the server and displaying it.
In another embodiment of the method provided in this specification, the information input interface displayed by the client includes an information input area, a first selection list and/or a second selection list. The information input area is used for entering information; the first selection list and the second selection list are used for selecting information. The first selection list includes category information of material applications or material structures; the second selection list includes category information of entity types;
Correspondingly, the client acquires the input information based on the information input area, the first selection list and/or the second selection list.
In another embodiment of the method provided in this specification, the first selection list and/or the second selection list present the information to be selected in an interactive visualization format; wherein the interactive visualization format denotes a list selection format that displays the category information of entity types, material structures, or material applications in visual form, and determines the corresponding category and subcategory by receiving trigger operations on each piece of category information.
In another aspect, the embodiments of this specification further provide a client, including:
A sixth sending module, configured to send a retrieval request to a server, the retrieval request including input information acquired by the client, so that the server: extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information; and sends the retrieval result to the client;
A fourth receiving module, configured to receive the retrieval result sent by the server;
A second display module, configured to display the retrieval result.
In another embodiment of the client provided in this specification, the client further includes:
An input interface display module, configured to display an information input interface, the information input interface including an information input area, a first selection list and/or a second selection list; the information input area is used for entering information; the first selection list and the second selection list are used for selecting information; the first selection list includes category information of material applications or material structures; the second selection list includes category information of entity types;
An input information acquisition module, configured to acquire the input information based on the information input area, the first selection list and/or the second selection list.
In another aspect, the embodiments of this specification further provide a summary information generation method, applied to a server, including:
Receiving a summary information generation request sent by a client, the summary information generation request including the data source for which a summary is to be generated;
Extracting, from the data source, entity information and the association relationship information between pieces of entity information, and associating the corresponding entity information according to the extracted association relationship information; wherein the association relationship information includes information describing the association direction and association type between pieces of entity information;
Generating summary information from the associated entity information, and sending the generated summary information to the client, so that the client displays it.
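The specification does not say how summary text is produced from the associated entities. One simple possibility, shown purely as an assumption, is to render each directed association through a sentence template keyed by its association type; the `TEMPLATES` table and the fallback wording below are hypothetical.

```python
# Hypothetical sentence templates keyed by association type.
TEMPLATES = {
    "prepared_by": "{head} is prepared by {tail}",
    "used_in": "{head} is used in {tail}",
}
FALLBACK = "{head} is related to {tail}"


def summarize(triples):
    """Render (head, association type, tail) triples as a short summary.

    The association direction decides which entity is the subject of each
    sentence; the association type selects the wording.
    """
    sentences = []
    for head, kind, tail in triples:
        template = TEMPLATES.get(kind, FALLBACK)
        sentences.append(template.format(head=head, tail=tail) + ".")
    return " ".join(sentences)
```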
In another aspect, the embodiments of this specification further provide a server, including:
A fifth receiving module, configured to receive a summary information generation request sent by a client, the summary information generation request including the data source for which a summary is to be generated;
An eighth extraction module, configured to extract, from the data source, entity information and the association relationship information between pieces of entity information; wherein the association relationship information includes information describing the association direction and association type between pieces of entity information;
A sixth association module, configured to associate the corresponding entity information according to the extracted association relationship information;
A generation module, configured to generate summary information from the associated entity information;
A seventh sending module, configured to send the generated summary information to the client, so that the client displays it.
In another aspect, the embodiments of this specification further provide a summary information generation method, applied to a client, including:
Sending a summary information generation request to a server, the summary information generation request including the data source for which a summary is to be generated, so that the server: extracts, from the data source, entity information and the association relationship information between pieces of entity information, and associates the corresponding entity information according to the extracted association relationship information, wherein the association relationship information includes information describing the association direction and association type between pieces of entity information; and generates summary information from the associated entity information and sends the generated summary information to the client;
Receiving the summary information sent by the server and displaying it.
In another aspect, the embodiments of this specification further provide a client, including:
An eighth sending module, configured to send a summary information generation request to a server, the summary information generation request including the data source for which a summary is to be generated, so that the server: extracts, from the data source, entity information and the association relationship information between pieces of entity information, and associates the corresponding entity information according to the extracted association relationship information, wherein the association relationship information includes information describing the association direction and association type between pieces of entity information; and generates summary information from the associated entity information and sends the generated summary information to the client;
A fifth receiving module, configured to receive the summary information sent by the server;
A third display module, configured to display the summary information.
The information stream extraction method, apparatus, and device provided by one or more embodiments of this specification extract entity information together with the association relationship information between pieces of entity information. The association relationship information can then be used to further verify and filter the extracted entity information, determining whether each extracted entity is actually related to the target entity information. This improves the accuracy of extracting entity information associated with the target entity information and effectively filters out noise. The association relationship information can also be used to link the extracted entity information to the target entity information and to show how the two are related, which makes the results easier for users to review and organize. Users can thus obtain the information they need accurately and efficiently, find the solutions they are looking for or discover new ones, and enjoy a better user experience.
Brief description of the drawings
FIG. 1 is a schematic flowchart of an information stream display method provided in this specification.
FIG. 2 is a schematic diagram of an information stream in one embodiment provided in this specification.
FIG. 3 is a schematic diagram of an information stream in another embodiment provided in this specification.
FIG. 4 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 5 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 6 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 7 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 8 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 9 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 10 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 11 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 12 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 13 is a schematic diagram of an information graph in another embodiment provided in this specification.
FIG. 14 is a schematic diagram of the visual display of an information graph in another embodiment provided in this specification.
FIG. 15 is a schematic diagram of material attribute classification in another embodiment provided in this specification.
FIG. 16 is a schematic diagram of using units and measures to link a material to attribute lists from different attribute classes in another embodiment provided in this specification.
FIG. 17 is a schematic diagram of the extraction of attributes, units, and measures in another embodiment provided in this specification.
FIG. 18 is a schematic diagram of linking the extracted attributes, units, and measures to the corresponding materials in another embodiment provided in this specification.
FIG. 19 is a schematic diagram of the extraction of attributes, units, and measures in another embodiment provided in this specification.
FIG. 20 is a schematic diagram of the extraction of attributes, units, and measures in another embodiment provided in this specification.
FIG. 21 is a schematic diagram of the extraction of attributes, units, and measures in another embodiment provided in this specification.
FIG. 22 is a schematic diagram of linking the extracted attributes, units, and measures to the corresponding materials in another embodiment provided in this specification.
FIG. 23 is a schematic flowchart of an information stream extraction method provided in this specification.
FIG. 24 is a schematic diagram of the model structure of an information stream extraction apparatus provided in this specification.
FIG. 25 is a schematic flowchart of an information retrieval method provided in this specification.
FIG. 26 is a schematic diagram of a retrieval interface and search results in another embodiment provided in this specification.
FIG. 27 is a schematic diagram of a retrieval interface in another embodiment provided in this specification.
FIG. 28 is a schematic diagram of retrieval results in another embodiment provided in this specification.
FIG. 29 is a schematic diagram of application classification in another embodiment provided in this specification.
FIG. 30 is a schematic diagram of an interactive interface for material structure classification in another embodiment provided in this specification.
FIG. 31 is a schematic diagram of the further subdivision of polymers by material structure in another embodiment provided in this specification.
FIG. 32 is a schematic diagram of an interactive interface for application classification in another embodiment provided in this specification.
FIG. 33 is a schematic diagram of the further subdivision of smart materials in another embodiment provided in this specification.
FIG. 34 is a schematic diagram of retrieval results in another embodiment provided in this specification.
FIG. 35 is a schematic diagram of a retrieval interface in another embodiment provided in this specification.
FIG. 36 is a schematic diagram of retrieval results in another embodiment provided in this specification.
FIG. 37 is a schematic diagram of images of a protein structure, a DNA plasmid, and a material microstructure in another embodiment provided in this specification.
FIG. 38 is a schematic diagram of images of a circuit diagram and a flowchart in another embodiment provided in this specification.
FIG. 39 is a schematic diagram of material attribute classification in another embodiment provided in this specification.
FIG. 40 is a schematic flowchart of a summary generation method provided in this specification.
Modes for carrying out the invention
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in one or more embodiments of this specification are described clearly and completely below with reference to the drawings of those embodiments. The described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of one or more embodiments of this specification, without creative effort, shall fall within the scope of protection of the embodiments of this specification.
在一个场景示例中,用户可以通过客户端进行信息搜索。所述客户端可以为可移动设备。例如,所述客户端可以为智能手机、平板电子设备、便携式计算机、个人数字助理(PDA)、车载设备、或智能穿戴设备等。所述客户端还可以为桌面设备。例如,所述客户端可以为服务器、工控机(工业控制计算机)、个人计算机(PC机)、一体机、或智能自助终端(kiosk)等。In an example scenario, the user can search for information through the client. The client can be a mobile device. For example, the client can be a smart phone, a tablet electronic device, a portable computer, a personal digital assistant (PDA), a vehicle-mounted device, or a smart wearable device, etc. The client can also be a desktop device. For example, the client can be a server, an industrial computer (industrial control computer), a personal computer (PC), an all-in-one machine, or an intelligent self-service terminal (kiosk), etc.
用户可以在客户端展示的页面中输入信息。所述输入信息为用户希望进行信息搜索的基准信息。如用户希望查询氧化铝陶瓷相关的信息,则用户可以在客户端展示的页面中输入信息“氧化铝陶瓷”,客户端可以向服务器发出信息搜索请求,该搜索请求中可以附带有上述输入信息“氧化铝陶瓷”。服务器可以从信息搜索请求中提取输入信息“氧化铝陶瓷”,以所述输入信息“氧化铝陶瓷”为基准信息,提取与所述输入信息“氧化铝陶瓷”存在关联的实体信息或者数据源。The user can enter information in the page displayed on the client. The input information is reference information that the user wants to search for. If the user wants to inquire about alumina ceramics-related information, the user can enter the information "alumina ceramics" in the page displayed on the client, and the client can send an information search request to the server, and the search request can be accompanied by the above input information " Alumina ceramics". The server may extract the input information "alumina ceramics" from the information search request, and use the input information "alumina ceramics" as reference information to extract entity information or data sources associated with the input information "alumina ceramics".
The server may be provided with a memory, or with information linking it to a database. After receiving the information search request sent by the client, the server may search the data sources in its memory or in the linked database. After completing the search, the server may return the retrieved information to the client for display.
However, the same information may be described in many different forms within one data source or across different data sources, and a large amount of incomplete, non-descriptive information may also be present, so the retrieved information may be inaccurate or incomplete. In addition, the volume of retrieved information may be large and contain much irrelevant or noisy information, and it usually cannot be presented to the user concisely. The user therefore has to spend considerable effort filtering and organizing the results, can easily miss the desired information, and has a poor experience.
In some embodiments, associated entity information of the input information, together with association relationship information such as the association type and association direction between the input information and the associated entity information, may be extracted from the data source. The association relationship information is then used to associate the extracted associated entity information with the input information, forming an information flow corresponding to the input information. The information flow may then be displayed to the user, so that the user can intuitively view the required information and the associations between pieces of information. A specific embodiment is shown in FIG. 1. In one embodiment of the information flow display method provided in this specification, the method may include:
S20: The client sends an information flow acquisition request to the server, the information flow acquisition request including input information acquired by the client.
The client may send an information flow acquisition request to the server. The request may include the user's input information acquired by the client. For example, the client may provide an information input box, and the user may enter information into the client by typing it manually or by choosing from a drop-down selection box displayed by the client. The client may then respond to the user's operation by generating an information flow acquisition request carrying the input information, and send the request to the server. For example, the client's display interface may include a search button or an information flow generation button; the user may trigger a search instruction by clicking the button or pressing the Enter key, and the client may then, in response to the user's operation, generate the information flow acquisition request and send it to the server. Of course, the specific ways of entering information, performing user operations, and generating the request described above are only examples, and implementations are not limited to them.
In one example scenario, the client may display an information input box and an information flow generation button. The user may enter "polyvinyl alcohol" in the input box and click the information flow generation button; in response to the click, the client may generate an information flow acquisition request in order to obtain all or part of the information about that material. Alternatively, the user may enter "organic electroluminescent display coating" in the input box to obtain all or part of the information, from raw materials to material application, relating to that material application. Other types of information may also be entered to obtain information related to the input information.
In another example scenario, the user may also restrict the extraction direction of the desired information flow, or its start node and end node. For example, the user may restrict extraction to the process information flow of a material in order to obtain information related to that material's processing. The user may enter "polyvinyl alcohol material process" or "polyvinyl alcohol and processing technology" on the page displayed by the client. Alternatively, the page displayed by the client may also provide a drop-down selection box containing one or more extraction direction options, such as material process and material application. After entering "polyvinyl alcohol", the user may select an extraction direction from the drop-down box to obtain the corresponding information. Accordingly, "polyvinyl alcohol" and the extraction direction "processing technology" may be carried together in the retrieval request sent to the server.
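As a rough illustration of step S20, the request carrying the input information and an optional extraction direction might be serialized as below. The field names `input_info` and `extraction_direction`, and the use of JSON, are assumptions made for this sketch; the specification does not prescribe a wire format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class FlowRequest:
    """Hypothetical information flow acquisition request."""
    input_info: str                              # text typed by the user
    extraction_direction: Optional[str] = None   # e.g. chosen from a drop-down


def build_request(input_info: str, extraction_direction: Optional[str] = None) -> str:
    """Serialize the request; the direction is omitted when not given."""
    request = FlowRequest(input_info.strip(), extraction_direction)
    payload = {k: v for k, v in asdict(request).items() if v is not None}
    return json.dumps(payload, ensure_ascii=False)
```

Calling `build_request("polyvinyl alcohol", "processing technology")` yields a payload that carries both the input information and the direction restriction, matching the scenario in which the two travel together in one request.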
S22: The server receives the information flow acquisition request and extracts one or more pieces of entity information from the input information as target entity information.
After receiving the information flow acquisition request, the server may extract one or more pieces of entity information from the input information in the request and use them as target entity information for extracting the information flow.
The entity information may include information such as proper names or meaningful quantitative phrases identified in a data source. Proper names may include, for example, material names, specific material applications, manufacturing methods, material properties, and trade names. Meaningful quantitative phrases include, for example, measured values of material properties and dates. The entity information may include an entity type and an entity value. The entity type is information describing the category to which pieces of information with common characteristics belong. The entity value may include the specific information corresponding to each entity type in the data source. The data source may include text from relatively closed sources such as patent documents and papers, as well as text in open-source databases. The information in a data source is voluminous and its types are complex, so analyzing and extracting it usually involves a heavy workload. Extracting information on the basis of entity information can substantially improve the accuracy and efficiency of extraction.
For the field of materials technology, the entity types may include, for example, a material name entity, a material structure type entity, a material application entity, a material process entity, a material property entity, a material trade name entity, a material supplier/manufacturer entity, a unit entity, and a measurement entity. The material process entity may further include a process method entity between materials, a manufacturing method entity between a material and a material application, and an intermediate material entity arising during processing or manufacturing. The intermediate material entity may include, for example, an additive entity and a catalyst entity.
The material name entity may include the entity type corresponding to descriptive information about a material's name, used to determine which material is meant. Entity values of the material name entity may be, for example, names of chemical, physical, or biological materials, such as polyvinyl alcohol, zirconia, or martensitic stainless steel.
The material structure type entity may include the entity type corresponding to descriptive information about the type to which each material belongs, as determined by the material's structure. Entity values of the material structure type entity may be, for example, metal, ceramic, or biological. The material application entity may include the entity type corresponding to descriptive information about the applications of a material. Entity values of the material application entity may be, for example, the application field of each material or specific application information. Application fields may include, for example, building materials, energy materials, 3D printing materials, and optoelectronic materials. Specific applications may include, for example, organic electroluminescent display coatings and packaging bags.
The material process entity may include the entity type corresponding to descriptive information about the process or manufacturing method by which one or more materials are turned into one or more other materials or material applications. Entity values of the material process entity may be, for example, descriptions of the process or manufacturing method, such as polymerization, hydrolysis, alloying, metal refining, stretching, and finishing, as well as additives and catalysts used during material processing or manufacturing.
The material property entity may include the entity type corresponding to descriptive information that characterizes intrinsic features of a material, such as its structure and performance. Entity values of the material property entity may include, for example, melting point, thermal conductivity, specific heat capacity, yield strength, and elastic modulus.
The material trade name entity may include the entity type corresponding to descriptive information about the trade names under which a material is actually produced and sold. The material supplier/manufacturer entity may include the entity type corresponding to descriptive information about each material's suppliers or manufacturers.
The unit entity may include the entity type corresponding to descriptive information about the unit of measurement of a specific value of a material property. The measurement entity may include the entity type corresponding to descriptive information about the specific value of a material property. For example, the unit of melting point may be degrees Celsius or kelvin, so "degrees Celsius" and "kelvin" are entity values of the unit entity. In the statement "the melting point of tungsten is 3410 degrees Celsius", "melting point" is an entity value of the material property entity, "3410" is an entity value of the measurement entity, and "degrees Celsius" is an entity value of the unit entity.
For the input information "polyvinyl alcohol" above, the entity information that can be extracted is "material name entity-polyvinyl alcohol". In the notation "A-B", A is the entity type and B is the entity value; that is, "material name entity" is the entity type and "polyvinyl alcohol" is the entity value. This way of expressing an entity type with its entity value is only an example; in actual data processing, other forms such as association or table storage may also be used. For uniformity, entity information in the embodiments of this specification is written in the form "A-B". For example, from the statement "the melting point of tungsten is 3410 degrees Celsius", the entity information that can be extracted is "material name entity-tungsten", "material property entity-melting point", "unit entity-degrees Celsius", and "measurement entity-3410".
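The "A-B" notation can be mirrored by a small record type holding an entity type and an entity value. The class name `EntityInfo` and its fields are illustrative assumptions; as noted above, association or table storage would serve equally well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityInfo:
    """One piece of entity information: an entity type plus an entity value."""
    entity_type: str
    entity_value: str

    def label(self) -> str:
        # Render in the "A-B" form used throughout the description.
        return f"{self.entity_type}-{self.entity_value}"


# Entity information from "the melting point of tungsten is 3410 degrees Celsius".
tungsten_entities = [
    EntityInfo("material name entity", "tungsten"),
    EntityInfo("material property entity", "melting point"),
    EntityInfo("unit entity", "degrees Celsius"),
    EntityInfo("measurement entity", "3410"),
]
labels = [e.label() for e in tungsten_entities]
```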
In some embodiments, the entity information may be extracted using methods such as named entity extraction, which can be obtained by training on a large amount of sample data. In some embodiments, for example, the input information may first be processed and tokenized using a natural language processing (NLP) algorithm. A named entity recognition (NER) algorithm may then be used, based on a pre-built entity information dictionary or entity information library, to extract the entity information in the input information. The entity information dictionary may include commonly used technical terms of a specified field, built in advance from a large amount of sample data; it may be, for example, a database of technical terms in the materials field such as material names, material properties, and material applications. The entity information library may include a database pre-built by the platform containing commonly used technical terms of a specified field, such as a library of materials-field terms including material names, material properties, and material applications. Extracting entity information on the basis of a pre-built entity information dictionary or entity information library can substantially improve the efficiency and accuracy of entity extraction. The server may use the extracted entity information as target entity information for extracting the information flow. The target entity information is the reference information for information flow extraction: taking it as the reference, the server may extract the entity information associated with the target entity information, together with the association relationship information between the target entity information and the extracted entity information, and then construct the information flow of the target entity information. For example, for the input information in the example above, the entity information extracted from it is "material name entity-polyvinyl alcohol". This entity information may be used as the target entity information; the entity information associated with "material name entity-polyvinyl alcohol", and the association relationship information between "material name entity-polyvinyl alcohol" and the extracted entity information, are extracted from the data source, and the information flow of "material name entity-polyvinyl alcohol" is then constructed.
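A minimal sketch of dictionary-based entity extraction follows. It substitutes a plain longest-match lookup for the trained NER model mentioned above, and the dictionary contents and the function name `extract_entities` are assumptions made for illustration only.

```python
import re

# Hypothetical entity information dictionary: term -> entity type.
ENTITY_DICTIONARY = {
    "polyvinyl alcohol": "material name entity",
    "polycarbonate": "material name entity",
    "tungsten": "material name entity",
    "melting point": "material property entity",
    "degrees celsius": "unit entity",
}


def extract_entities(text):
    """Return (entity type, entity value) pairs in order of appearance."""
    lowered = text.lower()
    claimed = []   # character spans already matched by a longer term
    found = []
    # Try longer terms first so they win over any shorter overlapping term.
    for term in sorted(ENTITY_DICTIONARY, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            if any(match.start() < end and start < match.end() for start, end in claimed):
                continue
            claimed.append(match.span())
            found.append((match.start(), ENTITY_DICTIONARY[term], term))
    return [(etype, value) for _, etype, value in sorted(found)]
```

A production system would more likely combine such a dictionary with a statistical tagger; the lookup alone cannot recognize terms that are missing from the dictionary.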
In some embodiments, more than one piece of entity information may be extracted from the input information. In that case the server may further analyze the information flow extraction requirement that applies to the extracted pieces of entity information. If the requirement is to extract a separate information flow for each piece of entity information, the server may treat each piece as a target entity information and extract the information flow corresponding to each. For example, suppose the input information is "polyvinyl alcohol OR polycarbonate", which takes the form of a Boolean expression. If the logical operator "OR" has been defined in advance to mean "or", the server may determine from the operator "OR" that the information flows of the two pieces of entity information are to be extracted separately. The server may then use "material name entity-polyvinyl alcohol" and "material name entity-polycarbonate" as target entity information; that is, there are two pieces of target entity information, namely "material name entity-polyvinyl alcohol" and "material name entity-polycarbonate".
The server may then extract the information flows of the two pieces of target entity information separately, obtaining the information flow of "material name entity-polyvinyl alcohol" and the information flow of "material name entity-polycarbonate". In a specific implementation, the server may extract the information flow of each target entity information in turn, or may extract the information flows of multiple pieces of target entity information concurrently through parallel processing.
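The separate-extraction case can be sketched as splitting the input on the "OR" operator and processing the resulting targets concurrently. Here `extract_flow` is a hypothetical placeholder for the server's real extraction routine, and the whitespace-delimited upper-case "OR" is an assumed convention.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def split_targets(input_info):
    """Split a Boolean-style input such as 'A OR B' into separate targets."""
    parts = re.split(r"\s+OR\s+", input_info.strip())
    return [part.strip() for part in parts if part.strip()]


def extract_flow(target):
    """Hypothetical placeholder: a real server would query its data sources."""
    return {"target": target, "flow": []}


def extract_flows(input_info):
    """Extract one information flow per target, running targets in parallel."""
    targets = split_targets(input_info)
    with ThreadPoolExecutor() as pool:
        # map() returns results in the same order as the targets.
        return list(pool.map(extract_flow, targets))
```

Replacing the executor with a plain loop gives the sequential variant that the description also allows.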
The extraction requirement may instead be direction-restricted extraction, as when the input information mentioned in step S20 includes an extraction direction restriction. Suppose the user enters "polyvinyl alcohol" in the input box and selects or enters the extraction direction restriction "organic electroluminescent display coating" in the corresponding selection box or input box. After extracting the two pieces of entity information "material name entity-polyvinyl alcohol" and "material application entity-organic electroluminescent display coating" from the input information, the server follows the direction-restricted extraction requirement and extracts the information flow made up of the material processing techniques, other materials involved, material suppliers, and similar information that lie between the material polyvinyl alcohol and the material application organic electroluminescent display coating.
Of course, the information flow extraction requirements, the ways of selecting an extraction requirement, the input methods, and the interface presentation described above are preferred examples. Other approaches may also be used in specific implementations, and no limitation is imposed here.
S24: The server extracts, from the data source, at least one piece of associated entity information corresponding to the target entity information, and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and association type between pieces of entity information.
For any piece of target entity information, the server may extract from the data source at least one piece of associated entity information corresponding to it, as well as the association relationship information between the target entity information and the associated entity information. The associated entity information is entity information that has some association with the target entity information. The association relationship information may include information describing the association direction and association type between pieces of entity information.
The association direction may indicate the direction in which entity information flows within the information flow, marking the flow direction intuitively and making the entity information easier to organize and review. For example, if a material monomer yields a material polymer through a material process, the association direction among the material monomer, the material process, and the material polymer is that the material monomer yields the material polymer through the material process. In some embodiments, arrows may be used to indicate the association direction when the information flow is displayed. As shown in FIG. 2, the arrows mark the association direction between the pieces of entity information in the information flow.
The association type may include a description of the characteristics of the association between pieces of entity information that are related in different ways, making it easier to organize the entity information and to verify whether an association exists between pieces of entity information and what kind it is. For example, when two materials are associated through a processing technique, the association type may be "process between materials". The association type between a material and its material type information is "material and material type".
The association direction and association type between pieces of entity information may be determined by analyzing the text in which the entity information appears. For example, a model may be trained on a large amount of sample data, and the trained model may be used to extract the association direction and association type between pieces of entity information from the text. In some embodiments, the entity types of the pieces of entity information may also be taken into account when determining the association direction and association type, so that these can be determined more accurately and conveniently.
The association relationship information may include information describing the association direction and the association type. The description may be plain text; for example, the association type is "material and material type" and the association direction is that "synthetic resin" is the material type of "polyvinyl alcohol". Alternatively, information links may be used: for example, the server establishes a link between two or more pieces of entity information using symbols, graphics, or the like, and then annotates the link with the association type and association direction. The association type and association direction may also be represented by numbers, symbols, graphics, and so on. Other ways of describing the information may also be used.
The association relationship information may include descriptions of both the association direction and the association type, or of only one of them. For example, for entity information that consists of name variants of the same material name, there may be no information describing an association direction, only an association type indicating that the two are equivalent. For entity information relating a material to its material type, there may likewise be only a description of the association type. For entity information corresponding to a process between materials, there may be only a description of the association direction. For entity information that has only an association type, the association type may be annotated when the information flow is displayed; alternatively, the entity value and entity type may be annotated on the displayed entity information so that the entity types convey the association between the pieces of entity information, and the pieces may be connected using, for example, brackets or solid lines without arrows. For association relationship information that includes an association direction, lines with arrows may be used, according to that association direction, to indicate the direction in which information flows between pieces of entity information, for intuitive display. Of course, the ways of displaying the association direction and association relationship described above are only examples, and other approaches may be used in specific implementations.
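One way to hold association relationship information in which the direction, the type, or both may be present is sketched below. The `Association` class and the text connectors `-->` and `---` are illustrative assumptions standing in for the arrowed and arrowless lines of a graphical display.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Association:
    """Association between two entity values; type and direction are optional."""
    source: str
    target: str
    assoc_type: Optional[str] = None   # e.g. "material and material type"
    directed: bool = False             # True when an association direction exists

    def render(self) -> str:
        # Arrow for directed associations, plain line otherwise.
        connector = "-->" if self.directed else "---"
        label = f" [{self.assoc_type}]" if self.assoc_type else ""
        return f"{self.source} {connector} {self.target}{label}"


type_only = Association("polyvinyl alcohol", "synthetic resin",
                        assoc_type="material and material type")
direction_only = Association("material monomer", "material polymer", directed=True)
```

Here `type_only.render()` produces an arrowless link annotated with its type, while `direction_only.render()` produces an arrow with no type label, covering the two partial cases described above.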
In some embodiments, the associated entity information and association relationship information corresponding to the target entity information may be determined by analyzing the context information of the target entity information in the data source.
The context information may include the sentence containing the target entity information and one or more sentences before and after it. Alternatively, if analysis shows that the paragraph containing the target entity information is strongly related to the paragraphs before and after it, the context information may also include that paragraph and one or more paragraphs before and after it.
In some embodiments, the sentence containing the target entity information may first be taken as the context information from which the associated entity information, and the association relationship information with it, are extracted. One or more sentences before and after that sentence may then be added to the context information and used to correct the extracted associated entity information and association relationship information. Alternatively, one or more paragraphs before and after the paragraph containing the target entity information may be added to the context information and used to correct the extracted associated entity information and association relationship information.
Alternatively, the sentence containing the target entity information may first be taken as the context information from which the associated entity information, and the association relationship information with it, are extracted. If no associated entity information or association relationship information can be extracted, one or more sentences before and after that sentence may be added to the context information and used to further extract the associated entity information of the target entity information and the corresponding association relationship information. One or more paragraphs before and after the paragraph containing the target entity information may likewise be added to the context information and used to further extract or correct the associated entity information and the association relationship information.
Entities that are strongly related are usually described within a single sentence. Using the sentence containing the target entity information as the initial context avoids interference from unrelated information, so the associated entity information and the association relationship information can be extracted more accurately and efficiently. Subsequently correcting the extracted relationships against the neighboring sentences or paragraphs, or performing a further extraction when no entity or relationship was found, can substantially improve both the accuracy and the completeness of the extraction of entity information and of the relationships between entities.
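The fallback variant described above (start with the target's own sentence and widen the context only when nothing is found) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the `max_window` parameter, the naive sentence splitter, and the `toy_extract` stand-in for a real relation extractor are all assumptions made for the example.

```python
import re
from typing import Callable, List, Tuple

Relation = Tuple[str, str, str]  # (source entity, relation label, target entity)


def split_sentences(text: str) -> List[str]:
    """Naive splitter on '.', '!' and '?'; a real system would use an NLP tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_with_expanding_context(
    sentences: List[str],
    target: str,
    extract: Callable[[str, str], List[Relation]],
    max_window: int = 2,  # hypothetical limit on how far the context may grow
) -> List[Relation]:
    """Try the target's sentence alone, then add neighbors until something is found."""
    results: List[Relation] = []
    for i, sentence in enumerate(sentences):
        if target.lower() not in sentence.lower():
            continue
        for window in range(max_window + 1):
            lo = max(0, i - window)
            hi = min(len(sentences), i + window + 1)
            found = extract(" ".join(sentences[lo:hi]), target)
            if found:
                results.extend(r for r in found if r not in results)
                break  # stop widening once the narrower context succeeds
    return results


def toy_extract(context: str, target: str) -> List[Relation]:
    """Hypothetical stand-in for a real relation extractor."""
    text = context.lower()
    if "polyvinyl acetate" in text and "hydrolysed" in text:
        return [("polyvinyl acetate", "hydrolysis", target)]
    return []


text = "Polyvinyl acetate is a common polymer. It is hydrolysed to give PVA."
sentences = split_sentences(text)
relations = extract_with_expanding_context(sentences, "PVA", toy_extract)
```

Here the sentence containing "PVA" yields nothing on its own, so the preceding sentence is added and the relation is found on the second attempt. Keeping the first attempt narrow reflects the observation that strongly related entities tend to share a sentence.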
In some implementations, the data source containing the target entity information may first be retrieved, and the context information of the target entity information within that data source may then be located. A natural language processing algorithm or the like may then be used to analyze this context information and extract the associated entity information corresponding to the target entity information, together with the association relationship information between the two.
The data source containing the target entity information may first be retrieved by a method such as information matching. The context information of the target entity information within the data source may then be located using the method provided in the foregoing embodiments. A named entity extraction method may first be used to extract entity information from that context information. Methods such as semantic role labeling, semantic dependency analysis, and part-of-speech tagging may then be applied to the sentence in which the extracted entity information and the target entity information appear, or to the sentences lying between them, to extract description information that characterizes the association direction and association type between the extracted entity information and the target entity information. The extracted description information may be analyzed on its own, or together with the extracted entity information and the target entity information. If a definite association direction and/or association type exists between the extracted entity information and the target entity information, the extracted entity information may be taken as the associated entity information and the extracted description information as the association relationship information.
For example, suppose the target entity information "polyvinyl alcohol" appears in the following sentence of a data source: "Polyvinyl alcohol is a synthetic resin, usually prepared from polyvinyl acetate by alcoholysis (commonly called hydrolysis or saponification)". A named entity extraction algorithm can extract the entity information "material name entity - polyvinyl alcohol", "material type entity - synthetic resin", "material name entity - polyvinyl acetate", "material process entity - alcoholysis", "material process entity - hydrolysis", and "material process entity - saponification".
Natural language processing methods such as semantic role labeling, semantic dependency analysis, and part-of-speech tagging may then be applied to the entities in the above sentence and to the words between them. This yields the relationship description "X is a kind of Y" between "polyvinyl alcohol" and "synthetic resin". Combined with the entity types of the two, the association type can be determined to be "material and material type", and the association direction is that "synthetic resin" is the material type of "polyvinyl alcohol".
The relationship description for "polyvinyl alcohol", "polyvinyl acetate", and "alcoholysis" is "X is prepared from Y by Z". Analyzing the entity types of the three entities shows that "alcoholysis" is a material process entity, while "polyvinyl alcohol" and "polyvinyl acetate" are material name entities. The association type among the three can therefore be determined to be "process between materials", and the association direction is that "polyvinyl acetate" yields "polyvinyl alcohol" through "alcoholysis".
At the same time, the relationship description "alcoholysis (commonly called hydrolysis or saponification)" can be extracted for "alcoholysis", "hydrolysis", and "saponification". From this description and the entity types of the three, the association type can be determined to be "appositive entity" and the association direction to be appositive information. An appositive entity or appositive information means that, among two or more pieces of entity information, each is an alternative way of expressing the others.
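The step of combining a relationship description with entity types to obtain an association type and direction can be illustrated with a small lookup. The sketch below is only one possible reading of the worked example: the `Entity` and `Association` classes, the `RULES` table, and the string labels are hypothetical names chosen for illustration, and a real system would derive the templates from semantic role labeling or dependency analysis rather than receive them as strings.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Entity:
    text: str
    etype: str  # e.g. "material_name", "material_type", "material_process"


@dataclass(frozen=True)
class Association:
    source: Entity
    target: Entity
    atype: str
    via: Optional[Entity] = None  # the process entity, when one links the materials


# Hypothetical rule table: (template, type of X, type of Y, type of Z) -> association type
RULES: Dict[Tuple[str, str, str, Optional[str]], str] = {
    ("X is a kind of Y", "material_name", "material_type", None):
        "material and material type",
    ("X is prepared from Y by Z", "material_name", "material_name", "material_process"):
        "process between materials",
}


def classify(template: str, x: Entity, y: Entity,
             z: Optional[Entity] = None) -> Optional[Association]:
    """Return a typed, directed association, or None if no rule fits the entity types."""
    atype = RULES.get((template, x.etype, y.etype, z.etype if z else None))
    if atype is None:
        return None
    if template == "X is prepared from Y by Z":
        # Direction: Y yields X through process Z.
        return Association(source=y, target=x, atype=atype, via=z)
    # Direction: Y is the material type of X.
    return Association(source=x, target=y, atype=atype)


pva = Entity("polyvinyl alcohol", "material_name")
resin = Entity("synthetic resin", "material_type")
pvac = Entity("polyvinyl acetate", "material_name")
alcoholysis = Entity("alcoholysis", "material_process")

type_link = classify("X is a kind of Y", pva, resin)
process_link = classify("X is prepared from Y by Z", pva, pvac, alcoholysis)
```

Keying the rule on entity types as well as on the template means that the same surface pattern between two material names produces no "material type" association, which mirrors how the example uses entity types to confirm the relationship.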
The natural language processing methods used when extracting entity information and association relationship information, such as the named entity extraction algorithm, semantic role labeling, semantic dependency analysis, and part-of-speech tagging, may be obtained by deep learning on a large amount of sample data, according to the entity and relationship extraction characteristics that information stream extraction requires. This improves the accuracy with which entity information and association relationship information are extracted.
In other embodiments, the associated entity information and association relationship information corresponding to the target entity information may be determined as follows:
For the context information of the target entity information determined above, an NLP algorithm may first be applied, for example to process and tokenize the data source. Named entity recognition (NER) may then be used, based on an entity information dictionary or entity information database, to identify the entity information in the context information. When building a material information stream, for example, the NER algorithm may extract the entity information contained in the context information based on the entity information dictionary or database for materials, or on the entity information dictionary or database for the material structure type to which the target entity information belongs.
For example, suppose the target entity information is "polyvinyl alcohol", whose material structure type is polymer. The entity information dictionary or entity information database for polymers is then obtained. The NER algorithm extracts entity information from the context information of the target entity information and compares it against the entries in the dictionary or database, so that the entity information in the context is identified more accurately and efficiently. Descriptions corresponding to, for example, monomer, polymerization reaction, polymerization medium, polymer, polymer class, polymer type, manufacturing process, manufacturer, supplier, chemical modification, physical modification, and application can be identified.
An NLP filtering method may then be used to remove noise from the extracted entity information. The noise may include stop words, common words, non-scientific words, and words unrelated to the material context. Each extracted piece of entity information, together with the target entity information, is then treated as an object, and for each object the NLP algorithm further analyzes the context information to determine the metadata corresponding to that object. Metadata here means information that describes the object. The metadata of the target entity information can then be compared with the metadata of each extracted piece of entity information to determine whether an association exists between them and, if so, the specific association relationship information. Where an association exists, the associated entity information can be taken as the associated entity information of the target entity information, and the association relationship information between the two is determined from the comparison result so that the two can be associated.
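Dictionary-based recognition followed by noise filtering can be sketched as below. This is a simplified assumption-laden example: the dictionary entries, the noise word list, and the function names are invented for illustration, and plain whole-word matching stands in for a trained NER model.

```python
import re
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical excerpt of an entity information dictionary for polymers.
POLYMER_DICTIONARY: Dict[str, str] = {
    "polyvinyl alcohol": "polymer",
    "vinyl acetate": "monomer",
    "polymerization": "polymerization reaction",
    "methanol": "polymerization medium",
    "alcoholysis": "manufacturing process",
}

# Hypothetical noise list (stop words, common and non-scientific words).
NOISE_WORDS: Set[str] = {"the", "method", "usually", "obtained"}


def dictionary_ner(context: str, dictionary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (term, category) pairs for dictionary terms found as whole words,
    ordered by first occurrence in the context."""
    lowered = context.lower()
    hits = []
    for term, category in dictionary.items():
        match = re.search(r"\b" + re.escape(term) + r"\b", lowered)
        if match:
            hits.append((match.start(), term, category))
    return [(term, category) for _, term, category in sorted(hits)]


def drop_noise(candidates: Iterable[str], noise: Set[str]) -> List[str]:
    """Remove candidates that are noise words rather than domain entities."""
    return [c for c in candidates if c.lower() not in noise]


context = ("Polyvinyl alcohol is usually obtained by polymerization of "
           "vinyl acetate in methanol followed by alcoholysis.")
entities = dictionary_ner(context, POLYMER_DICTIONARY)
cleaned = drop_noise(["polyvinyl alcohol", "the", "method"], NOISE_WORDS)
```

The word-boundary check keeps "vinyl acetate" from matching inside a longer name such as "polyvinyl acetate", which is the kind of false hit that comparing against a type-specific dictionary is meant to avoid.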
S26: The server associates the target entity information with the associated entity information using the association relationship information, obtaining the information stream corresponding to the target entity information.
The server may use the extracted association type and association direction to associate the target entity information with the corresponding associated entity information, thereby obtaining the information stream corresponding to the target entity information. The association may be implemented as described in step S24 and is not repeated here.
S28: The server sends the information stream to the client for display on the client.
The server may send the information stream to the client for display on the client. For example, a visualization tool may be used to render the information stream, and the rendered information stream is then sent to the client for visual display. Figure 2 shows the information stream built in the above example.
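Steps S26 and S28 can be pictured as building a small directed graph on the server and serializing it for the client. The sketch below assumes a JSON payload with `target`, `nodes`, and `edges` fields; that payload shape and the function name are illustrative choices, not something the specification prescribes.

```python
import json
from typing import Dict, List, Tuple

Association = Tuple[str, str, str]  # (source entity, association label, destination entity)


def build_information_stream(target: str, associations: List[Association]) -> Dict:
    """Link the target and its associated entities into a node/edge structure."""
    nodes = [target]
    edges = []
    for source, label, dest in associations:
        for name in (source, dest):
            if name not in nodes:
                nodes.append(name)
        # The edge direction carries the association direction.
        edges.append({"from": source, "label": label, "to": dest})
    return {"target": target, "nodes": nodes, "edges": edges}


stream = build_information_stream(
    "polyvinyl alcohol",
    [
        ("polyvinyl alcohol", "is a kind of", "synthetic resin"),
        ("polyvinyl acetate", "alcoholysis", "polyvinyl alcohol"),
    ],
)

# Serialized form that a server could send to the client for rendering.
payload = json.dumps(stream, ensure_ascii=False)
```

A client-side visualization tool could then draw each edge as a labeled arrow between the two nodes.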
Extracting entity information together with the association relationship information between entities allows the relationship information to be used to verify and screen the extracted entity information, that is, to determine whether an extracted entity is actually related to the target entity information. This further improves the accuracy with which entities associated with the target entity information are extracted and filters out noise effectively. The association relationship information also links the extracted entity information to the target entity information and makes the relationship between them visible, which is convenient for users to review and organize. Users can thus obtain the information they need accurately and efficiently and find the required or new solutions, which improves the user experience.
For example, a user may be trying to find some or all of the processing techniques or manufacturing methods, raw materials, additives, and similar information for a particular material or material application. With the solution provided by the above embodiments, the associated entity information about that material or material application is extracted from the same or different data sources, associated into an information stream, and presented to the user. Using the information stream for that material, the user can easily find the required information without personally screening and organizing multiple data sources, which saves time. Because the information stream presents the entities in associated form, it also avoids the omissions and overlooked relationships that occur when users screen information themselves with large data volumes or limited knowledge of the industry, so users can more accurately obtain the solution they need along with the raw materials, processing techniques, and other information that the solution requires. In addition, extracting entity information and association relationship information together effectively removes the interference of noise in the data source with the extraction of useful information and improves the accuracy of the information the user obtains.
Information in data sources is complex and variable, and entity information and the association relationship information between entities are usually hard to extract accurately. In some implementations, an algorithm model for extracting the associated entity information of the target entity information and the corresponding association relationship information may be trained based on the entity type of the target entity information, to further improve extraction accuracy. For entity information in the field of materials technology, materials are usually converted into other materials, or into material applications, through a particular processing technique or manufacturing method. Which material a given material or material application is made from, and by which processing technique or manufacturing method, is usually what users want to know. Based on this application scenario, in other embodiments of this specification, extracting from the data source at least one piece of associated entity information corresponding to the target entity information, and the association relationship information between the target entity information and the associated entity information, may include:
retrieving the data source containing the target entity information;
locating the context information of the target entity information within the data source;
when the context information contains a material process entity, a material name entity, and/or a material application entity, extracting, according to a material process information association mode, the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information.
The material process information association mode may be determined from the way information about materials, or about materials and material applications, is typically related. When the context information of the target entity information contains a material process entity, a material name entity, and/or a material application entity, that context is likely to describe which processing technique converts one or more materials into another material, or which manufacturing method produces a material application. When the material name entities, material application entities, and material process entities are extracted from it, the corresponding association mode is therefore usually that a material name entity yields another material name entity through a material process entity, or that a material name entity yields a material application entity through a material process entity. In this case the material process entity can serve both as associated entity information of the material name entity or material application entity, and as part of the association relationship information between several material name entities or between a material name entity and a material application entity.
In some implementations, an algorithm such as deep learning may be trained on a large number of data sources containing material process entities, material name entities, and/or material application entities, so that material name entities, or material name entities and material application entities, are accurately linked through material process entities, and entity information and the relationships between entities are extracted accurately.
For example, a user enters "polyvinyl alcohol" in the client's input box and clicks the information stream extraction button. The client may generate an information stream acquisition request based on "polyvinyl alcohol" and send it to the server. After receiving the request, the server may extract the input information from it and then perform entity information extraction on the input information. The extracted entity information may include "material name entity - polyvinyl alcohol". The server may then search the data sources in the database and retrieve one or more data sources containing polyvinyl alcohol, after which the context information in which polyvinyl alcohol appears can be located.
Suppose the located context information is as follows: "1,2-ethylene glycol polyvinyl alcohol can be produced by known or conventional production methods. For example, 1,2-ethylene glycol polyvinyl alcohol can be produced by the following methods: polymerizing vinyl acetate under a pressure (load) higher than that of conventional conditions to prepare polyvinyl alcohol; and copolymerizing vinyl acetate with ethylene carbonate to obtain 1,2-ethylene glycol polyvinyl alcohol with the above content".
Using the named entity extraction method, the entity information "material name entity - vinyl acetate" and "material process entity - polymerization" can be extracted, as well as "material name entity - vinyl acetate", "material name entity - ethylene carbonate", and "material process entity - copolymerization". The entity information of the material name entities can then be taken provisionally as the associated entity information, and the entity information of the material process entities as the association type between the material name entities. The sentences containing this information are also analyzed to determine the association direction. Analyzing the descriptions "polymerizing vinyl acetate to prepare polyvinyl alcohol" and "copolymerizing vinyl acetate with ethylene carbonate to obtain 1,2-ethylene glycol polyvinyl alcohol with the above content" gives the association directions: "vinyl acetate" yields "polyvinyl alcohol" through "polymerization", and "vinyl acetate" together with "ethylene carbonate" yields "polyvinyl alcohol" through "copolymerization".
"Polyvinyl alcohol" and "vinyl acetate" can then be associated according to the extracted association relationship information, such as the association type and association direction, and "polyvinyl alcohol" can be associated with "vinyl acetate" and "ethylene carbonate" according to the extracted association relationship information "copolymerization". For example, an arrow may be used to indicate the association direction, and an information display algorithm may be used to display the information streams formed after association, as shown in parts (a) and (b) of Figure 3 respectively.
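The material process association mode in this example, in which input materials yield an output material through a process, can be sketched as a small record type plus a text rendering with arrows. The `ProcessStep` class, the helper names, and the arrow notation are assumptions made for illustration; the unrelated polyethylene entry is included only to show that steps not producing the target are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ProcessStep:
    inputs: Tuple[str, ...]  # material name entities consumed
    process: str             # material process entity (also the association type)
    output: str              # material name or material application entity produced


def steps_producing(target: str, steps: List[ProcessStep]) -> List[ProcessStep]:
    """Keep only the steps whose association direction points at the target."""
    return [s for s in steps if s.output == target]


def render(step: ProcessStep) -> str:
    """Draw one step as text, using an arrow for the association direction."""
    return f"{' + '.join(step.inputs)} --[{step.process}]--> {step.output}"


STEPS = [
    ProcessStep(("vinyl acetate",), "polymerization", "polyvinyl alcohol"),
    ProcessStep(("vinyl acetate", "ethylene carbonate"), "copolymerization", "polyvinyl alcohol"),
    ProcessStep(("ethylene",), "polymerization", "polyethylene"),
]

lines = [render(s) for s in steps_producing("polyvinyl alcohol", STEPS)]
for line in lines:
    print(line)
```

Storing the process on the step itself reflects the dual role described earlier: the process entity is both an associated entity and part of the relationship between the materials.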
Associating the extracted material name entities, material application entities, and material process entities according to the way material process information is typically related can further improve the accuracy with which material entity information and the relationships between material entities are extracted.
In other embodiments of this specification, the association relationship information may further include appositive entity description information, that is, description information stating that the associated entity information is another way of expressing the target entity information. When the association relationship information includes appositive entity description information, the server may add the corresponding associated entity information to the target entity information, thereby updating the target entity information. Accordingly, the server may extract from the data source at least one piece of associated entity information corresponding to the updated target entity information, together with the association relationship information between the target entity information and the associated entity information. The server then uses the association relationship information to associate the updated target entity information with the associated entity information, obtaining the information stream corresponding to the updated target entity information.
For example, for the polyvinyl alcohol mentioned in the above example, suppose that analysis of a data source yields description information stating that polyvinyl alcohol can also be written as "poly(vinyl alcohol)" or "PVA". This description information can be used as association relationship information to associate poly(vinyl alcohol), polyvinyl alcohol, and PVA. Poly(vinyl alcohol) and PVA can also be added to the target entity information, so that the updated target entity information includes polyvinyl alcohol, poly(vinyl alcohol), and PVA. Data sources can then be searched with each of the three names, and associated entity information and association relationship information can be extracted from the retrieved data sources using each name. After the sub-information streams for poly(vinyl alcohol), polyvinyl alcohol, and PVA have been obtained, the three names can together serve as reference information for associating those sub-information streams, yielding the information stream corresponding to the updated target entity information.
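Updating the target entity information with newly discovered alternative names can be sketched as a simple worklist loop. The alias table and the function names below are hypothetical; in practice the aliases would come from appositive descriptions extracted from the data sources rather than from a fixed dictionary.

```python
from typing import Callable, Dict, Iterable, List


def expand_target(seed: str, find_aliases: Callable[[str], Iterable[str]]) -> List[str]:
    """Keep adding alternative names until no new one is discovered."""
    known = [seed]
    queue = [seed]
    while queue:
        name = queue.pop(0)
        for alias in find_aliases(name):
            if alias not in known:
                known.append(alias)
                queue.append(alias)  # search again using the new name
    return known


# Hypothetical aliases as they might be reported by appositive descriptions.
ALIASES_SEEN_IN_SOURCES: Dict[str, List[str]] = {
    "polyvinyl alcohol": ["poly(vinyl alcohol)", "PVA"],
    "PVA": ["polyvinyl alcohol", "PVOH"],
}


def lookup(name: str) -> List[str]:
    return ALIASES_SEEN_IN_SOURCES.get(name, [])


targets = expand_target("polyvinyl alcohol", lookup)
```

Each name in `targets` would then be used for its own search and extraction pass, and the resulting sub-information streams associated with one another.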
Of course, the same approach can be applied to other entity types: different expressions of the same entity information are associated, and entity information is then extracted further. In the example above, "alcoholysis", "hydrolysis", and "saponification" are appositive information for the extracted material process entity.
Alternatively, in other implementations, the metadata of each piece of entity information may be extracted based on the solution provided in the above embodiments, and the metadata of the different entities compared with one another. If the metadata of two pieces of entity information are sufficiently similar, or the comparison shows that each explains the other, the two can be treated as appositive entities of each other. The same material, material application, process, or production method may be described in many different forms across data sources; polyvinyl alcohol, for instance, may also be written out in full in English or expressed in another language. Large and complex collections of data sources usually lack specific, standardized descriptions of material names, applications, or processes, whereas a user searching for information typically enters only the one or few forms that the user knows, so the results retrieved are neither accurate nor complete. For example, because patents refer to materials in different ways, a user searching for a material such as polyvinyl alcohol may miss patents that use PVOH, PVA, poly(vinyl alcohol), hydrolyzed polyvinyl acetate, and so on. The user's infringement analysis is then incomplete, which creates a risk of infringement.
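The text does not say how metadata similarity is measured. One simple possibility, used here purely as an assumption, is Jaccard similarity over sets of descriptive terms with a fixed threshold. The metadata sets, the 0.5 threshold, and the function names are all hypothetical.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def are_appositive(meta_a: Set[str], meta_b: Set[str], threshold: float = 0.5) -> bool:
    """Treat two entities as alternative expressions when their metadata overlap enough."""
    return jaccard(meta_a, meta_b) >= threshold


# Hypothetical metadata gathered from the contexts of each name.
meta_polyvinyl_alcohol = {"water-soluble", "synthetic resin", "made from polyvinyl acetate"}
meta_pvoh = {"water-soluble", "synthetic resin", "film former"}
meta_polyethylene = {"thermoplastic", "made from ethylene"}

same = are_appositive(meta_polyvinyl_alcohol, meta_pvoh)
different = are_appositive(meta_polyvinyl_alcohol, meta_polyethylene)
```

A set-overlap measure is easy to inspect but ignores wording differences between descriptions, so a production system would more plausibly compare learned representations of the metadata.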
With the solution of the above embodiments, whenever another expression of an entity is found during entity information extraction, it is promptly added to the reference information on which the information search relies, and the updated reference information is used for searching and associating information. This can substantially improve the accuracy and completeness of information extraction. For display, a common name may also be provided or selected for the various variants of a material name and linked to each variant to identify the material, which makes it easier for users to organize and consult the information.
In other embodiments, the method may further include: the server extracting, from the data source, the trade name of the target entity information or of the associated entity information; extracting, according to the trade name, the manufacturer or supplier corresponding to the target entity information or associated entity information; and associating that manufacturer or supplier with the target entity information or associated entity information. By extracting the trade name of each material, using the trade name to obtain the corresponding supplier or manufacturer information from resources such as open source databases, and associating that information with the material, users can obtain the manufacturing source of a material directly from the information stream, which improves the user experience.
Multiple pieces of associated entity information for the target entity information, together with the association relationship information between the target entity information and each of them, may be extracted from a single data source or from several data sources. In some embodiments, the server may treat the information stream formed by associating one or more pieces of extracted associated entity information with their association relationship information as a sub-information stream. The server then uses the target entity information as the reference information to link the sub-information streams extracted from the same data source or from different data sources, obtains the information stream corresponding to the target entity information, and presents that final stream to the user. This further strengthens the connections between pieces of information and makes comprehensive analysis easier for the user.
Suppose the user is looking for an end-to-end solution for a particular material, for example from monomer to application. With the information stream described above, the user can stitch together entities obtained from different sources (databases, patents, and so on) and obtain the complete solution in one place, without analyzing several data sources separately, which saves time.
Alternatively, the user may need to find the manufacturing process of a new material but cannot, because the patent on that process has not yet been filed or published. Journal articles, news articles, or other types of data sources describing the process may nevertheless already exist. By extracting information from multiple data sources and associating and merging it, the user can obtain the existing manufacturing process more accurately and intuitively.
For example, the information streams shown in panels (a) and (b) of FIG. 3 may be taken as sub-information streams, and panel (c) of FIG. 3 shows the information stream obtained by combining the two. Based on this solution, in some embodiments, associating the target entity information with the associated entity information by using the association relationship information may include: obtaining at least one piece of associated entity information extracted from a data source and the association relationship information between the target entity information and that associated entity information; associating the two by using that association relationship information to obtain a sub-information stream of the target entity information; and linking the sub-information streams obtained from the same or different data sources, with the target entity information as the reference information, to obtain the information stream corresponding to the target entity information.
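The linking of sub-information streams can be sketched as follows. This is a minimal illustration, assuming that each sub-information stream is a list of directed relations and that two relations are duplicates when head, association type, and tail coincide; the `Relation` type, its field names, and the sample relation labels are assumptions, not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical record of one association between two entities."""
    head: str    # entity the association starts from
    kind: str    # association type, e.g. "hydrolyzed_to"
    tail: str    # entity the association points to
    origin: str  # data source the relation was extracted from


def link_sub_streams(target_entity, sub_streams):
    """Merge sub-streams that mention the target entity into one stream."""
    merged, seen = [], set()
    for stream in sub_streams:
        # The target entity is the reference information: a sub-stream that
        # never mentions it is not linked.
        if not any(target_entity in (r.head, r.tail) for r in stream):
            continue
        for r in stream:
            key = (r.head, r.kind, r.tail)
            if key not in seen:  # keep the first occurrence of a duplicate
                seen.add(key)
                merged.append(r)
    return merged
```

In this sketch a duplicate relation keeps only the data source it was first seen in; an implementation that tracks provenance per node would instead accumulate all origins.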
In other embodiments, the method further includes: the server may also treat each piece of associated entity information extracted from the data source as target entity information and extract an information stream for it, obtaining information streams corresponding to multiple pieces of target entity information; and link those information streams, each with its own target entity information as the reference information, to obtain an information graph.
The server may further treat the extracted associated entity information as target entity information and extract the information stream corresponding to each piece of associated entity information, thereby obtaining a series of information streams. These streams are then linked, each with its associated entity information as the reference information, to obtain the information graph. FIG. 4 shows the information graph obtained when a user searches for polyvinyl alcohol: it covers the entire workflow associated with that material, from vinyl acetate (the monomer) to an organic electroluminescent display coating (the application). In FIG. 4, the content of each box is entity information, and the arrows indicate the direction of association.
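The repeated extraction described above, in which each associated entity becomes a new target, amounts to a graph traversal. The sketch below uses a breadth-first traversal as one possible reading; the callable `extract_relations`, assumed to return `(head, kind, tail)` tuples for a given entity, and the safety limit `max_entities` are hypothetical and stand in for the extraction step, which the specification leaves to the embodiments.

```python
from collections import deque


def build_information_graph(seed, extract_relations, max_entities=1000):
    """Expand from the seed entity, treating each associated entity as a new target.

    extract_relations(entity) is a hypothetical callable returning an iterable
    of (head, kind, tail) tuples found for that entity in the data sources.
    """
    edges = set()
    visited = {seed}
    queue = deque([seed])
    while queue and len(visited) <= max_entities:
        entity = queue.popleft()
        for head, kind, tail in extract_relations(entity):
            edges.add((head, kind, tail))
            for neighbour in (head, tail):
                if neighbour not in visited:  # each entity is expanded once
                    visited.add(neighbour)
                    queue.append(neighbour)
    return edges
```

Storing edges in a set links the individual streams automatically, because a relation reached from two different targets is kept only once.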
Using the associated entity information to extract further information streams, and then linking them into an information graph, yields all the information from the monomer through to the final application, so that the user can gain a comprehensive, accurate, and intuitive understanding of the materials, applications, processes, and manufacturing involved in the entire life cycle of a material.
For example, with the solutions of the foregoing embodiments, the user can rely on the information stream to readily identify, in the data sources, the entire process flow from monomer to chemically modified polymer (including catalysts, polymerization reactions, chemical modification, physical modification, and additives), which can also help the user discover alternative polymerization routes, additives, catalysts, and so on.
For example, a user who searches for polyethylene in the usual way may not retrieve documents that refer to polyethylene as PE, polyethylene HDPE, or an industry trade name for polyethylene. The user may also obtain only incomplete information about the various aspects of the material: some polyethylene patents may, for instance, mention the monomer but say nothing about the polymerization process, the treatment applied to the polymer, or even the different applications for which polyethylene can be used. The user may then have to search several patents to find the different details of polyethylene. With the solution above, a single search returns all the information on a specific material, including name variants, suppliers, material applications, and manufacturing processes.
At the same time, the information stream of the foregoing embodiments lets the user find suitable materials for a specific application simply and conveniently. Using the information stream, the user can discover new materials, suppliers, and material manufacturing technologies for that application. For example, a user looking for materials suitable for medical stents may find materials such as Bioflow-V and Orsio (trade names). The material behind a trade name, however, cannot be obtained from the same source. Using the information graph formed by linking the above information, the user can learn that Orsio uses a cobalt-chromium alloy as the base metal and has an active poly-L-lactide (PLLA) polymer coating. The user can go on to explore other materials used to make stents, including stainless steel with a poly(ethylene-co-vinyl acetate) or poly(n-butyl methacrylate) coating (the CYPHER stent) and a cobalt-chromium alloy with a carbon coating (the CRE8 stent). The user can then use the information graph to obtain the entire manufacturing process of each material used in the stent.
FIG. 4 to FIG. 13 are schematic diagrams of information graphs for several common material types given in this specification. FIG. 4 and FIG. 5 relate to polymer materials, FIG. 6 and FIG. 7 to metal materials, FIG. 8 and FIG. 9 to ceramic materials, and FIG. 10 and FIG. 11 to biological materials; FIG. 12 and FIG. 13 are schematic diagrams of information graphs of materials. FIG. 4, FIG. 6, FIG. 8, FIG. 10, and FIG. 12 are information graphs formed from entity information extracted from data sources and the associations between that entity information. FIG. 5, FIG. 7, FIG. 9, FIG. 11, and FIG. 13 are schematic diagrams of the general information flow for the above five material types.
Comparing, for the five material types, the information graphs obtained by actual extraction from data sources with the corresponding general information-flow diagrams shows that, with the solutions of the embodiments of this specification, a user can enter the entity information corresponding to any one or more nodes of the information graph, such as a single material, a material application, or a material process, and obtain some or all of the information related to that input together with the flow relationships between the pieces of information, which are then associated to form the information stream. The user can thus accurately obtain part or all of the information stream of each material from monomer to application, with information such as name variants, processing techniques, manufacturing methods, manufacturers, and suppliers associated as well. In this way, a small amount of input is enough for the user to obtain, accurately and comprehensively, some or all of the related information and the flow relationships within it.
The data source of each information node in the information stream may of course also be associated with that node. As shown in FIG. 4, FIG. 6, FIG. 8, FIG. 10, and FIG. 12, different gray levels in the figures represent entity information from different data sources. In practice, different colors may be used for different data sources, or the user may click an information node to see which data sources it comes from. Associating and displaying the data sources allows information from different data sources to be distinguished at a single node, which helps the user keep track of the technologies owned or used by different patentees and makes infringement and patent-right analysis simpler. For this application scenario, in other embodiments, the sub-information stream of the target entity information may also be associated with its data source. With each extracted sub-information stream associated with its data source, the user can readily view where the sub-information stream came from and trace each piece of information in the information stream.
In other embodiments, based on the solutions of the foregoing embodiments, the page displayed by the client may further include options for a start node and an end node of the information stream. The user can click the corresponding option and enter the start node information and the end node information in the corresponding input boxes. After obtaining the user's clicks and input, the client can generate input information of the form "start node information + end node information", attach it to the search request, and send it to the server. The server can then generate the information stream from that input information for display.
For example, entity information may be extracted from the start node information and from the end node information. The entity information extracted from either one is provisionally taken as the target entity information, and the entity information extracted from the other is taken as the terminating entity information. An information stream is then extracted for the target entity information; each piece of associated entity information in that stream is in turn taken as target entity information for further information stream extraction, and so on. If a piece of associated entity information is the terminating entity information, no further extraction is performed for it. The information graph between the start node information and the end node information is obtained in this way.
For example, suppose the start node information is a monomer and the end node information is a material application. In the manner described above, information streams can be extracted step by step starting from the monomer until the material application is reached, giving the information graph between the monomer and the material application. Alternatively, extraction can start from the material application and proceed step by step until the monomer is reached, again giving the information graph between the monomer and the material application.
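The start-node and end-node variant differs from unrestricted expansion only in its stopping rule: the terminating entity is kept as a node but is not expanded further. A minimal sketch under the same assumption as before, namely a hypothetical callable `extract_relations` that returns `(head, kind, tail)` tuples for an entity, could look like this:

```python
from collections import deque


def graph_between(start, end, extract_relations):
    """Expand step by step from `start`; never extract further from `end`."""
    edges = set()
    visited = {start}
    queue = deque([start])
    while queue:
        entity = queue.popleft()
        if entity == end:
            continue  # terminating entity: no further extraction for it
        for head, kind, tail in extract_relations(entity):
            edges.add((head, kind, tail))
            for neighbour in (head, tail):
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append(neighbour)
    return edges
```

Swapping the two arguments gives the reverse procedure, starting from the material application and stopping at the monomer. What the traversal returns in either direction depends on which relations the extractor reports for an entity, which this sketch does not constrain.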
In other embodiments, the method may further include: the server treats each piece of entity information in the information stream as an information node with which the user can interact, and visualizes the information stream by using a visualization method; sends the processed information stream to the client, so that the user can view the information stream and trigger the information nodes through the client; and, based on the user's trigger operation on an information node, returns to the client other information associated with that information node. An information node is the information corresponding to a piece of entity information in the information stream. The trigger operation may include, for example, clicking, long-pressing, or swiping. The other information associated with an information node may include, for example, data sources or other information streams that are not currently displayed. The visualization method may, for example, be the graphical approach of Neo4j. FIG. 14 is a schematic diagram of an interactive, visualized information graph, in which the circles represent different information nodes, a line between two circles indicates that the nodes are associated, the text on a line can give the description of the association type between the nodes, and arrows can indicate the direction of association. In some implementations, different colors may also be used to indicate the data source from which the information comes.
For example, the user can click an information node to view its data source. The user can also swipe left or right from an information node to view the lower-level or upper-level information of that node that is not currently displayed, where the lower-level information may be, for example, the information between the material corresponding to the current node and the material application, and the upper-level information may be the information between the material corresponding to the current node and the monomer. Or the user can swipe up or down from an information node to view the parameter entity information and the supplier/manufacturer entity information of that node that is not currently displayed. The parameter entity information may be, for example, the material type, process type, material attributes, units, or measures. Presenting the information in a form the user can interact with can further improve the user experience.
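One way to picture the interactive nodes is as a node-and-link payload sent to the client, with a handler that answers a click by returning the data sources of the clicked node. The sketch below uses plain JSON from the Python standard library rather than Neo4j, which the text names only as one possible visualization method; the payload layout and both function names are assumptions made for illustration.

```python
import json


def to_graph_payload(relations):
    """Serialize (head, kind, tail, origin) tuples as nodes and labelled links."""
    nodes, links = {}, []
    for head, kind, tail, origin in relations:
        for name in (head, tail):
            node = nodes.setdefault(name, {"id": name, "sources": []})
            if origin not in node["sources"]:
                node["sources"].append(origin)  # data sources shown on click
        # The link keeps direction (source -> target) and the association type.
        links.append({"source": head, "target": tail, "label": kind})
    return json.dumps({"nodes": list(nodes.values()), "links": links},
                      ensure_ascii=False)


def on_node_click(payload, node_id):
    """Return the data sources associated with the clicked information node."""
    for node in json.loads(payload)["nodes"]:
        if node["id"] == node_id:
            return node["sources"]
    return []
```

Swipe gestures could be served the same way, by returning the not yet displayed upstream or downstream links of the node instead of its sources.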
For the application scenario above, in other embodiments, the method may further include: the server extracts, from the data source, parameter entity information of the target entity information and of the associated entity information, where the parameter entity information includes at least the entity information corresponding to one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measure entity; and associates the extracted parameter entity information with the corresponding target entity information and associated entity information. In some implementations, deep learning techniques may be used, for example, to extract the different kinds of parameter entity information and associate them with the corresponding entity information.
Data sources usually also contain more specific descriptions of the entity information. For process entity information, for example, there may be descriptions of the process method type, process parameters, and manufacturer. For a polymerization process, the data source may describe the specific process method, such as solution polymerization, bulk polymerization, suspension polymerization, or emulsion polymerization, and may also give a large amount of information on the conditions used during polymerization, such as temperature, pressure, the surrounding environment, and the production environment. This information can be extracted as well: the process method, the material attribute entities, unit entities, and measure entities of each material involved in the process, and the environmental parameters of the surroundings with their unit and measure entities, are associated as parameter entity information with the corresponding entity information, so that the user obtains the specific parameter information more accurately and completely.
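To make the attribute, measure, and unit triple concrete, the following sketch pulls such triples out of a sentence and attaches them to a process entity. It deliberately swaps in a small regular expression for the deep learning extraction mentioned in the text, so it only illustrates the shape of the output; the attribute list, the unit list, and the function name are assumptions.

```python
import re

# Illustrative vocabulary only; a real system would learn or curate these.
PATTERN = re.compile(
    r"(?P<attribute>temperature|pressure|time)\s+(?:of|at|is)\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>MPa|kPa|bar|min|wt%|°C|K|h)(?!\w)",
    re.IGNORECASE,
)


def extract_parameters(entity, sentence):
    """Return attribute/measure/unit records linked to the given entity."""
    return [
        {
            "entity": entity,                      # entity the parameters belong to
            "attribute": m["attribute"].lower(),   # material or process attribute
            "value": float(m["value"]),            # measure
            "unit": m["unit"],                     # unit as written in the source
        }
        for m in PATTERN.finditer(sentence)
    ]


records = extract_parameters(
    "emulsion polymerization",
    "The reaction ran at a temperature of 60 °C and a pressure of 0.5 MPa.",
)
```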
FIG. 15 shows the main attribute classes. Each attribute class has subcategories of different attributes. For example, the spectroscopy class can be divided into subcategories including Raman spectroscopy, X-ray diffraction, and so on. FIG. 16 is a schematic diagram of linking a material, through units and measures, to attribute lists from different attribute classes. FIG. 17 is a schematic diagram of extracting attributes, units, and measures from a data source. FIG. 18 is a schematic diagram of linking the extracted attributes, units, and measures to the corresponding material. FIG. 19, FIG. 20, and FIG. 21 show the different forms in which attributes, units, and measures appear in data sources and how the information is extracted. FIG. 22 is a schematic diagram of linking the extracted attributes, units, and measures to the corresponding material.
Extracting units, measures, and attributes in this way makes it possible to visualize the different attributes associated with a material and how those attributes change under different treatments, so that the user can choose the material and treatment that best meet the requirements. In addition, extracting information such as material attributes and units/measures and associating it with the information stream allows material applications to be associated with material attributes, which in turn makes it possible to recommend applications for a material based on its attributes.
By extracting information streams with the solutions of one or more of the foregoing embodiments, the user can accurately and comprehensively learn the detailed solutions that already exist for a material, including its specific applications and process methods, and can also learn from the associated data sources which patentees have disclosed those applications and process methods, which helps the user avoid the risk of infringement.
Consider a user conducting a patent infringement analysis. Patent claims usually include a description of the material. If a patent claims a new material manufacturing process, the applicant usually needs to include all the details of the manufacturing process in the claims; if the patent concerns a modification of an existing manufacturing process, the applicant need not include all those details. If a new application-material link has been discovered, the applicant may also add the application of the material to the claims. Applicants, however, generally prefer to describe a broader field of application in the claims, and more specific applications are usually not described there. Material properties, attributes, and the like are also usually described in general terms in the claims in order to obtain a better scope of protection.
Because many patent claims lack specific descriptions of materials, manufacturing processes, and applications, accurate and comprehensive retrieval of much of this information is difficult. A third party may manufacture or use a material after making some modification to a base material that is already protected by a patent; or may modify a manufacturing process slightly, where the modification is nevertheless already covered by the broad claims of another manufacturing patent; or may seek patent protection for an application that falls within the field of application of a previously published patent. If, in such a case, the third party has not carried out a comprehensive and accurate search, it may face a high risk of infringement.
Based on the methods provided in the foregoing embodiments, other embodiments of this specification further provide an information stream extraction method applied to a server. As shown in FIG. 23, the method may include:
S40: obtaining target entity information;
S42: extracting, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, where the association relationship information includes information describing the direction and the type of the association between pieces of entity information;
S44: associating the target entity information with the associated entity information by using the association relationship information, to obtain the information stream corresponding to the target entity information.
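Steps S40 to S44 can be sketched as follows, assuming the extraction step has already produced candidate associations, each carrying a direction (head to tail) and a type. The `Association` type, the field names, and the split into upstream and downstream lists are illustrative assumptions; the sketch only shows how the direction and type information lets the server discard unrelated candidates and order the rest around the target entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """Hypothetical association record: direction is head -> tail."""
    head: str
    kind: str  # association type, e.g. "hydrolyzed_to" or "used_in"
    tail: str


def assemble_stream(target, candidates):
    """S44 sketch: associate the target with its related entities.

    Candidates that do not involve the target are dropped, which mirrors the
    use of association relationship information to filter noise.
    """
    upstream = [a for a in candidates if a.tail == target]    # lead to target
    downstream = [a for a in candidates if a.head == target]  # start at target
    return {"target": target, "upstream": upstream, "downstream": downstream}


candidates = [
    Association("polyvinyl acetate", "hydrolyzed_to", "polyvinyl alcohol"),
    Association("polyvinyl alcohol", "used_in", "OLED display coating"),
    Association("titanium", "used_in", "implant"),  # unrelated, filtered out
]
stream = assemble_stream("polyvinyl alcohol", candidates)
```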
In other embodiments, associating the target entity information with the associated entity information by using the association relationship information may include:
obtaining at least one piece of associated entity information extracted from a data source and the association relationship information between the target entity information and the corresponding associated entity information, and associating the target entity information with the corresponding associated entity information by using that association relationship information, to obtain a sub-information stream of the target entity information;
linking the sub-information streams obtained from the same or different data sources, with the target entity information as the reference information, to obtain the information stream corresponding to the target entity information.
In other embodiments, when the association relationship information includes appositive entity description information, the associated entity information corresponding to the appositive entity description information is added to the target entity information to update the target entity information, where the appositive entity description information includes description information indicating that the associated entity information is another form of expression of the target entity information;
correspondingly, at least one piece of associated entity information corresponding to the updated target entity information, and the association relationship information between the target entity information and the associated entity information, are extracted from the data source.
In other embodiments, the method may further include:
treating each piece of associated entity information extracted from the data source as target entity information and extracting an information stream for it, to obtain information streams corresponding to multiple pieces of target entity information;
linking the information streams corresponding to the multiple pieces of target entity information, each with its own target entity information as the reference information, to obtain an information graph.
In other embodiments, extracting from the data source at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information may include:
retrieving the data source in which the target entity information is located;
locating the context information in which the target entity information appears in the data source;
when the context information contains a material process entity, a material name entity, and/or a material application entity, extracting, according to the material process information association manner, the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information.
In other embodiments, the method may further include:
extracting, from the data source, the trade names of the target entity information and of the associated entity information;
extracting, according to the trade names, the manufacturer or supplier corresponding to the target entity information or the associated entity information, and associating the manufacturer or supplier with that target entity information or associated entity information.
In other embodiments, the method may further include:
extracting, from the data source, parameter entity information of the target entity information and of the associated entity information, where the parameter entity information includes at least the entity information corresponding to one of a material structure type entity, a process method entity, a material attribute entity, a unit entity, or a measure entity;
associating the extracted parameter entity information with the corresponding target entity information and associated entity information.
In other embodiments, the method may further include: associating the sub-information flow of the target entity information with the corresponding data source.
In other embodiments, the method may further include:
using each piece of entity information in the information flow as an information node for interacting with the user, and visualizing the information flow using a visualization method;
sending the processed information flow to the client, so that the user can view the information flow and trigger the information nodes through the client;
based on the user's triggering operation on an information node, feeding back to the client other information associated with that information node.
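By way of illustration only, the node-based interaction above might be sketched as a node/edge payload sent to the client plus a handler invoked when a node is triggered. The payload layout, the function names and the `details` mapping are hypothetical; the specification does not prescribe a particular visualization format.

```python
import json

def build_graph_payload(relations):
    """relations: list of (source, relation_type, target) triples of one information flow."""
    nodes = sorted({entity for s, _, t in relations for entity in (s, t)})
    return {
        "nodes": [{"id": n, "label": n} for n in nodes],          # interactive information nodes
        "edges": [{"from": s, "to": t, "type": r} for s, r, t in relations],
    }

def on_node_triggered(node_id, details):
    """Return the other information associated with the triggered node."""
    return {"node": node_id, "details": details.get(node_id, [])}

if __name__ == "__main__":
    flow = [
        ("ethylene", "polymerized_to", "polyethylene"),
        ("polyethylene", "used_in", "packaging film"),
    ]
    # The serialized payload is what would be sent to the client for rendering.
    print(json.dumps(build_graph_payload(flow), indent=2))
```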
For the specific implementation of the information flow extraction method provided in the foregoing embodiments, reference may be made to the implementation of the foregoing method; repeated details are not described again.
The information flow extraction apparatus provided by one or more embodiments of this specification extracts entity information and the association relationship information between pieces of entity information at the same time, so that the association relationship information can be used to further verify and filter the extracted entity information and determine whether it is related to the target entity information. This further improves the accuracy of extracting entity information associated with the target entity information and effectively filters out noise. In addition, the association relationship information can be used to associate the extracted entity information with the target entity information and to display the association between them, which makes the results easier for users to view and organize, so that users can accurately and efficiently obtain the useful information they need, find required or new solutions, and enjoy a better user experience.
Based on the information flow extraction method described above, one or more embodiments of this specification further provide an information flow extraction apparatus. The apparatus may include systems, software (applications), modules, components, servers and the like that use the methods described in the embodiments of this specification, combined with the necessary implementing hardware. Because the apparatus solves the problem in a manner similar to the method, the implementation of the specific apparatus in the embodiments of this specification may refer to the implementation of the foregoing method; repeated details are not described again. Specifically, FIG. 24 is a schematic diagram of the module structure of an embodiment of an information flow extraction apparatus provided in this specification. As shown in FIG. 24, the apparatus may include:
a first acquisition module 402, which may be used to acquire target entity information;
a first extraction module 404, which may be used to extract, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information;
a first association module 406, which may be used to associate the target entity information with the associated entity information using the association relationship information, to obtain the information flow corresponding to the target entity information.
In other embodiments, the first association module 406 may include:
a first association unit, which may be used to acquire at least one piece of associated entity information extracted from a data source and the association relationship information between the target entity information and the corresponding associated entity information, and to associate the target entity information with the corresponding associated entity information using that association relationship information, to obtain a sub-information flow of the target entity information;
a second association unit, which may be used to link the sub-information flows obtained from the same or different data sources, with the target entity information as the reference information, to obtain the information flow corresponding to the target entity information.
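The division of work between the two association units might be sketched as follows. This is an illustrative sketch under assumed data shapes: the `"out"`/`"in"` direction flags, the edge tuples and the source identifiers are hypothetical, and the merge simply keys edges on the shared target entity.

```python
from collections import defaultdict

def build_sub_flow(target, extracted, source_id):
    """First unit: turn the extractions from one data source into directed edges.

    extracted: list of (associated_entity, direction, rel_type), where direction
    is "out" (target -> associated) or "in" (associated -> target).
    """
    edges = []
    for entity, direction, rel_type in extracted:
        head, tail = (target, entity) if direction == "out" else (entity, target)
        edges.append((head, rel_type, tail, source_id))
    return edges

def link_sub_flows(sub_flows):
    """Second unit: merge sub-flows around the shared target entity.

    Duplicate edges collapse into one entry that remembers every data source
    in which the edge was found.
    """
    merged = defaultdict(set)
    for flow in sub_flows:
        for head, rel_type, tail, source_id in flow:
            merged[(head, rel_type, tail)].add(source_id)
    return dict(merged)
```

Keeping the set of source identifiers on each merged edge is one simple way to retain the link between a sub-information flow and the data source it came from.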
In other embodiments, when the association relationship information includes co-located entity description information, the apparatus may further include:
an update module, which may be used to add the associated entity information corresponding to the co-located entity description information to the target entity information, thereby updating the target entity information; wherein the co-located entity description information includes description information indicating that the associated entity information is another form of expression of the target entity information;
the first extraction module 404 may also be used to extract, from the data source, at least one piece of associated entity information corresponding to the updated target entity information and the association relationship information between the target entity information and the associated entity information.
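The update-then-re-extract behaviour described above might be sketched as a loop that runs until no new alternative expressions are found. The `extract` callable and the `"same_as"` relation label are hypothetical names introduced for this illustration.

```python
def expand_target(target, extract):
    """Grow the set of names for the target and collect its other relations.

    extract: hypothetical callable mapping a name to a list of
    (entity, rel_type) pairs found in the data sources; the rel_type
    "same_as" marks a co-located entity (another expression of the same thing).
    """
    names = {target}
    relations = set()
    pending = [target]
    while pending:
        name = pending.pop()
        for entity, rel_type in extract(name):
            if rel_type == "same_as":
                if entity not in names:      # new alias: update the target and re-extract
                    names.add(entity)
                    pending.append(entity)
            else:
                relations.add((entity, rel_type))
    return names, relations
```

Because each alias is queued only once, the loop terminates even when the data sources describe aliases of each other circularly.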
In other embodiments, the apparatus may further include:
a second extraction module, which may be used to take each piece of associated entity information extracted from the data source as target entity information and extract its information flow, to obtain the information flows corresponding to multiple pieces of target entity information;
a second association module, which may be used to link the information flows corresponding to the multiple pieces of target entity information, with the corresponding target entity information as the reference information, to obtain an information graph.
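One way to picture the construction of the information graph is a breadth-first expansion in which every associated entity becomes a new target. The sketch below assumes a hypothetical `extract` callable and adds a `max_depth` bound, which is not part of the specification and is included only to keep the example finite.

```python
from collections import deque

def build_information_graph(seed, extract, max_depth=2):
    """Expand from the seed entity, treating each associated entity as a new target.

    extract: hypothetical callable, entity -> list of (rel_type, associated_entity).
    Returns a dict mapping each visited entity to its outgoing relations.
    """
    graph = {}
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if entity in graph or depth > max_depth:
            continue
        graph[entity] = list(extract(entity))       # information flow of this target
        for _, neighbour in graph[entity]:
            queue.append((neighbour, depth + 1))    # associated entity becomes a target
    return graph
```

Linking happens implicitly here: flows that mention the same entity meet at that entity's key in the dictionary.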
In other embodiments, the first extraction module 404 may include:
a retrieval unit, which may be used to retrieve the data source in which the target entity information is located;
a locating unit, which may be used to locate the context information in which the target entity information appears in the data source;
an extraction unit, which may be used, when the context information contains a material process entity, a material name entity and/or a material application entity, to extract, according to a material process information association manner, the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information.
In other embodiments, the apparatus may further include:
a third extraction module, which may be used to extract, from the data source, the trade names of the target entity information and of the associated entity information;
a third association module, which may be used to extract, according to the trade name, the manufacturer or supplier corresponding to the target entity information or the associated entity information, and to associate the manufacturer or supplier with the target entity information or the associated entity information.
In other embodiments, the apparatus may further include:
a fourth extraction module, which may be used to extract, from the data source, parameter entity information of the target entity information and of the associated entity information, the parameter entity information including at least the entity information corresponding to one of a material structure type entity, a process method entity, a material attribute entity, a unit entity or a measurement entity;
a fourth association module, which may be used to associate the extracted parameter entity information with the corresponding target entity information and associated entity information.
In other embodiments, the first association module 406 may further include:
a third association unit, which may be used to associate the sub-information flow of the target entity information with the corresponding data source.
In other embodiments, the apparatus may further include:
a visualization processing module, which may be used to use each piece of entity information in the information flow as an information node for interacting with the user, and to visualize the information flow using a visualization method;
a first sending module, which may be used to send the processed information flow to the client, so that the user can view the information flow and trigger the information nodes through the client;
a second sending module, which may be used to feed back to the client, based on the user's triggering operation on an information node, other information associated with that information node.
It should be noted that the apparatus described above may also include other implementations according to the description of the method embodiments. For specific implementations, reference may be made to the description of the related method embodiments, which is not repeated here.
The information flow extraction apparatus provided by one or more embodiments of this specification extracts entity information and the association relationship information between pieces of entity information at the same time, so that the association relationship information can be used to further verify and filter the extracted entity information and determine whether it is related to the target entity information. This further improves the accuracy of extracting entity information associated with the target entity information and effectively filters out noise. In addition, the association relationship information can be used to associate the extracted entity information with the target entity information and to display the association between them, which makes the results easier for users to view and organize, so that users can accurately and efficiently obtain the useful information they need, find required or new solutions, and enjoy a better user experience.
The method or apparatus described in the foregoing embodiments of this specification may implement its business logic through a computer program recorded on a storage medium; the storage medium can be read and executed by a computer to achieve the effects of the solutions described in the embodiments of this specification. Accordingly, this specification further provides an information flow extraction device, including a processor and a memory storing processor-executable instructions, wherein the instructions, when executed by the processor, implement the steps of the method described in any one of the foregoing embodiments.
The storage medium may include a physical device for storing information, in which the information is usually digitized and then stored in a medium using electrical, magnetic or optical means. The storage medium may include: devices that store information electrically, such as various kinds of memory, for example RAM, ROM and USB flash drives; devices that store information magnetically, such as hard disks, floppy disks, magnetic tapes, magnetic core memories and magnetic bubble memories; and devices that store information optically, such as CDs or DVDs. Of course, there are also other kinds of readable storage media, such as quantum memories and graphene memories.
It should be noted that the device described above may also include other implementations according to the description of the method embodiments. For specific implementations, reference may be made to the description of the related method embodiments, which is not repeated here.
The information flow extraction device described in the foregoing embodiments extracts entity information and the association relationship information between pieces of entity information at the same time, so that the association relationship information can be used to further verify and filter the extracted entity information and determine whether it is related to the target entity information. This further improves the accuracy of extracting entity information associated with the target entity information and effectively filters out noise. In addition, the association relationship information can be used to associate the extracted entity information with the target entity information and to display the association between them, which makes the results easier for users to view and organize, so that users can accurately and efficiently obtain the useful information they need, find required or new solutions, and enjoy a better user experience.
Based on the method provided in the foregoing embodiments, the embodiments of this specification further provide an information flow display method applied to a server, and the method may include:
receiving an information flow acquisition request sent by a client, the information flow acquisition request including input information acquired by the client;
extracting one or more pieces of entity information from the input information as target entity information;
extracting, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information;
associating the target entity information with the associated entity information using the association relationship information, to obtain the information flow corresponding to the target entity information;
sending the information flow to the client for display on the client.
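The server-side steps above might be sketched as a single request handler. Everything concrete here is assumed for the illustration: the request and response dictionaries, the pre-built `relation_index`, and the substring-based entity spotting that stands in for real entity extraction.

```python
def extract_entities(input_text, known_entities):
    """Toy entity extraction: keep the known entities mentioned in the input."""
    lowered = input_text.lower()
    return [e for e in known_entities if e.lower() in lowered]

def handle_flow_request(request, relation_index):
    """Build the information flow for every target entity found in the request.

    request: {"input": str}, the input information acquired by the client.
    relation_index: hypothetical mapping, entity -> list of
    (direction, rel_type, associated_entity), with direction "out" or "in".
    """
    targets = extract_entities(request["input"], relation_index.keys())
    flows = {}
    for target in targets:
        edges = []
        for direction, rel_type, other in relation_index[target]:
            head, tail = (target, other) if direction == "out" else (other, target)
            edges.append({"from": head, "type": rel_type, "to": tail})
        flows[target] = edges
    return {"flows": flows}   # response body to be sent back to the client
```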
Based on the method provided in the foregoing embodiments, the embodiments of this specification further provide a server, which may include:
a first receiving module, which may be used to receive an information flow acquisition request sent by a client, the information flow acquisition request including input information acquired by the client;
a fifth extraction module, which may be used to extract one or more pieces of entity information from the input information as target entity information;
a sixth extraction module, which may be used to extract, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information;
a fifth association module, which may be used to associate the target entity information with the associated entity information using the association relationship information, to obtain the information flow corresponding to the target entity information;
a third sending module, which may be used to send the information flow to the client for display on the client.
Based on the method provided in the foregoing embodiments, the embodiments of this specification further provide an information flow display method applied to a client, which may include:
sending an information flow acquisition request to a server, the information flow acquisition request including input information acquired by the client, so that the server receives the information flow acquisition request and extracts one or more pieces of entity information from the input information as target entity information; extracts, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information; associates the target entity information with the associated entity information using the association relationship information, to obtain the information flow corresponding to the target entity information; and sends the information flow to the client;
receiving the information flow sent by the server and displaying it.
Based on the method provided in the foregoing embodiments, the embodiments of this specification further provide a client, which may include:
a fourth sending module, which may be used to send an information flow acquisition request to a server, the information flow acquisition request including input information acquired by the client, so that the server receives the information flow acquisition request and extracts one or more pieces of entity information from the input information as target entity information; extracts, from a data source, at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information; associates the target entity information with the associated entity information using the association relationship information, to obtain the information flow corresponding to the target entity information; and sends the information flow to the client;
a second receiving module, which may be used to receive the information flow sent by the server;
a first display module, which may be used to display the information flow.
The method or apparatus described in the foregoing embodiments of this specification may implement its business logic through a computer program recorded on a storage medium; the storage medium can be read and executed by a computer to achieve the effects of the solutions described in the embodiments of this specification. Accordingly, this specification further provides an information flow display device, including a processor and a memory storing processor-executable instructions, wherein the instructions, when executed by the processor, implement the steps of the method described in any one of the foregoing embodiments.
The storage medium may include a physical device for storing information, in which the information is usually digitized and then stored in a medium using electrical, magnetic or optical means. The storage medium may include: devices that store information electrically, such as various kinds of memory, for example RAM, ROM and USB flash drives; devices that store information magnetically, such as hard disks, floppy disks, magnetic tapes, magnetic core memories and magnetic bubble memories; and devices that store information optically, such as CDs or DVDs. Of course, there are also other kinds of readable storage media, such as quantum memories and graphene memories.
It should be noted that the device described above may also include other implementations according to the description of the method embodiments. For specific implementations, reference may be made to the description of the related method embodiments, which is not repeated here.
In other implementation scenarios, users who perform technical searches, infringement analysis and similar analyses also commonly use information retrieval to find data sources or the patentees behind them. However, the same information may be described in many different forms within one data source or across data sources, and there may even be a large amount of incomplete or non-descriptive information, so the retrieved information may be inaccurate and incomplete. Users also have to spend considerable effort filtering and organizing the results, which makes for a poor experience. Accordingly, as shown in FIG. 25, other embodiments of this specification may further provide an information retrieval method, which may include:
S60: The client sends a retrieval request, the retrieval request including input information acquired by the client;
S62: The server receives the retrieval request sent by the client and extracts entity information from the input information, the entity information including an entity type and an entity value;
S64: The server determines the data source in which the entity information is located, according to the associated entity information of the entity information in the data sources and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information;
S66: The server sends the determined data source to the client as the retrieval result, so that the client can display it.
The input information may serve as the reference information for retrieval: taking the input information as the reference, the server retrieves the data sources that directly or indirectly describe it. For the associated entity information of the entity information in the data sources and the association relationship information between the entity information and the associated entity information, reference may be made to the embodiments of the information flow extraction method above, which are not repeated here.
FIG. 26 is a schematic diagram of the search interface and search results for a keyword search for "Fullerene as polymer additive". In FIG. 26, the entries shaded darker are irrelevant results and those shaded lighter are relevant results. As shown in FIG. 26, the keyword search returned only 3 relevant patents. FIG. 27 is a schematic diagram of the search interface of the solution of the embodiments of this specification, in which information retrieval is based on the material information flow. FIG. 28 is a schematic diagram of the search results obtained with the solution of the embodiments of this specification, which accurately retrieved all of the relevant results.
Many data sources may contain no explicit statement of "Fullerene as polymer additive"; or, even where such a statement exists, its wording is not necessarily the same as what the user entered. As a result, many data sources cannot be effectively retrieved by keyword matching. Alternatively, the user may separate different keywords with "or" when entering them, but this may cause more noise to be retrieved.
With the solution of the embodiments of this specification, the associated entity information of the entity information in the input information, and the association relationship information between them, are searched for in the data sources. Even if a data source does not explicitly or specifically describe "Fullerene" as an "additive", the entity information (additive entity: Fullerene) can still be used, through the associated entity information and the association relationship information, to accurately locate the data sources that describe this entity information, and thus to effectively extract all data sources that describe "Fullerene" in the role of an "additive". The signal (relevant results) can therefore be effectively separated from the noise (irrelevant results), greatly improving search accuracy. Further adding the condition "polymer" excludes data sources that describe "Fullerene" as an "additive" in information flows corresponding to other material types, further improving retrieval accuracy.
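The contrast between keyword matching and retrieval by entity type and entity value might be illustrated as follows. This is only a sketch: the three toy documents and the pre-computed `role_index` (which would in practice come from the entity and relation extraction described earlier) are invented for the example.

```python
def keyword_search(documents, phrase):
    """Return the ids of documents that contain the literal phrase."""
    return sorted(d for d, text in documents.items() if phrase.lower() in text.lower())

def relation_search(role_index, entity_type, entity_value):
    """Return the ids of documents in which the entity value plays the given role.

    role_index: hypothetical mapping, doc_id -> set of (entity_value, entity_type)
    pairs derived beforehand from associated entities and their relations.
    """
    wanted = (entity_value.lower(), entity_type)
    return sorted(d for d, roles in role_index.items() if wanted in roles)

# Toy data (invented): doc-2 describes the additive role without the literal phrase.
DOCUMENTS = {
    "doc-1": "Fullerene as polymer additive improves strength.",
    "doc-2": "The fullerene was blended into the resin to raise its conductivity.",
    "doc-3": "Fullerene cages were imaged by microscopy.",
}
ROLE_INDEX = {
    "doc-1": {("fullerene", "additive")},
    "doc-2": {("fullerene", "additive")},
    "doc-3": {("fullerene", "material_name")},
}
```

With these toy inputs, `keyword_search(DOCUMENTS, "Fullerene as polymer additive")` finds only `doc-1`, whereas `relation_search(ROLE_INDEX, "additive", "Fullerene")` also finds `doc-2` and still leaves out `doc-3`.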
In some embodiments, name variants of "Fullerene", that is, co-located information such as the Chinese term "富勒烯", may also be extracted. Adding the co-located information to the reference information used for the retrieval further improves the accuracy and completeness of the retrieval results.
In some embodiments, retrieval may also be classified by material structure or material application; for example, a restriction by material structure or material application may be added to the input information during retrieval.
The material structure may refer to the manner in which the atoms (or ions, or molecules) that make up a material are bonded to one another or the forms they constitute (i.e., structural elements), as well as the combination and arrangement of the structural elements in a certain order and the various relations among them. Different materials have different structural elements; for example, the various phases, microstructures, defects, monomers and macromolecular chains of a material are all structural elements. Classified by material structure, materials may include, for example, polymers, metals, ceramics, biomaterials and composites, and each of these classes may be further subdivided, as shown in FIG. 31. FIG. 31 is a schematic diagram of a further subdivision of polymers by material structure.
The material application may include the application field or the specific application of the material. Application fields may include, for example, building materials, energy materials, printing materials, optoelectronic materials, and so on. Each application field may be further divided, down to specific applications, such as organic electric display coatings among optoelectronic materials. FIG. 33 is a schematic diagram of a further subdivision of smart materials.
不同材料结构的材料所对应的实体类型或者各实体信息之间的关联关系信息描述形式、材料单体至材料应用的信息流特征可能会存在较大的差异,通过预先对材料结构进行分类,基于待检索的材料、材料应用或者材料工艺等所属的材料结构类别进行检索,可以进一步提高实体信息以及实体信息之间的关联关系信息提取的准确性,使得检索到的数据源更符合用户所需。There may be big differences in the entity types corresponding to materials of different material structures or the association relationship between each entity information, and the information flow characteristics from the material monomer to the material application. By pre-classifying the material structure, based on Retrieving the material structure category to which the material to be retrieved, material application, or material technology belongs to can further improve the accuracy of entity information and the association relationship information between entity information, so that the retrieved data source is more in line with the needs of users.
对应不同的应用领域或者对应于不同的具体的应用,所涉及的原材料、加工工艺、制造方法等也可能存在较大的差异,通过进一步对材料应用进行细化,基于材料应用进行实体信息以及实体信息之间的关联关系信息提取,也可以进一步提高信息提取的准确性,降低噪声干扰,使得检索到的数据源更符合用户所需。同时,对于部分数据源,如专利、论文等,很多检索系统也可以按应用领域进行分类,如图29所示。图29左侧图表示较大的应用分类,右侧图表示左侧图中的某一应用分类进一步具体化的应用分类。因此,通过在检索时考虑材料应用,也可以有针对性的从相应的应用领域所对应的数据源进行检索,大幅提高检索效率以及检索准确性。Corresponding to different application fields or corresponding to different specific applications, the involved raw materials, processing technology, manufacturing methods, etc. may also be quite different. Through further refinement of material applications, entity information and entity information are based on material applications. Information extraction of the association relationship between information can also further improve the accuracy of information extraction, reduce noise interference, and make the retrieved data source more in line with the needs of users. At the same time, for some data sources, such as patents, papers, etc., many retrieval systems can also be classified according to application fields, as shown in Figure 29. The left figure of Fig. 29 shows a larger application category, and the right figure shows an application category that further embodies a certain application category in the left figure. Therefore, by considering the material application when searching, it is possible to search from the data source corresponding to the corresponding application field in a targeted manner, which greatly improves the retrieval efficiency and retrieval accuracy.
特别对于图像信息的检索,通过根据材料结构以及材料应用进行分类检索,可以降低细粒度级别上对材料图形的检测难度,有助于从图像中检测出不同的材料和材料图案,提高检索准确性。Especially for the retrieval of image information, classification and retrieval according to the material structure and material application can reduce the difficulty of detecting material graphics at the fine-grained level, which helps to detect different materials and material patterns from the image, and improve the retrieval accuracy .
例如,如图34至图36所示,图34表述利用关键字检索到的检索结果示意图;图35表示本说明书实施例提供的检索界面示意图;图36表示利用本说明书实施例提供的上述方案的检索结果示意图。由图36可知,本说明书实施例通过在检索时进一步限定应用领域,服务器通过先进一步考虑材料应用所对于的专利,然后,再从中进行实体信息提取以及实体信息间的关联关系信息提取,可以进一步提高图像搜索、图像提取、图像对象检测和图像分类器结果的准确性。For example, as shown in Figure 34 to Figure 36, Figure 34 shows a schematic diagram of the search results retrieved using keywords; Figure 35 shows a schematic diagram of the retrieval interface provided by an embodiment of this specification; Schematic diagram of search results. It can be seen from Figure 36 that the embodiments of this specification further limit the application field when searching, and the server can further consider the patents to which the material is applied, and then extract entity information and association relationship information between entity information. Improve the accuracy of image search, image extraction, image object detection and image classifier results.
例如,如图37以及图38所示,图37表示蛋白质结构、DNA质粒和材料微结构的图像示意图,图38表示电路图、流程图的图像示意图。当对蛋白质结构、DNA质粒和材料微结构进行检索时,通过考虑材料结构分类,可以有效排除具有特定图像类型(例如电路图、流程图等)等噪声图像所对应的专利,准确提取出蛋白质结构、DNA质粒和材料微结构的图像。For example, as shown in FIG. 37 and FIG. 38, FIG. 37 shows a schematic diagram of a protein structure, a DNA plasmid and a microstructure of a material, and FIG. 38 shows a schematic diagram of a circuit diagram and a flow chart. When searching for protein structure, DNA plasmid and material microstructure, by considering the classification of material structure, patents corresponding to noise images with specific image types (such as circuit diagrams, flowcharts, etc.) can be effectively eliminated, and the protein structure, Image of DNA plasmid and material microstructure.
Based on the application scenario examples above, in some embodiments the information input interface displayed by the client may further include an information input area, a first selection list and/or a second selection list. The information input area may be used by the user to enter information, and the first and second selection lists may be used by the user to select information. The first selection list may include category information for material applications or material structures. The second selection list may include category information for entity types. Correspondingly, the client obtains the input information based on the information input area, the first selection list and/or the second selection list.
As shown in Figure 27, the user can enter information in the input box and restrict the entity type of the entered information. Searching in this way makes it easier to accurately extract the entity type and entity value from the entered search information, improving retrieval accuracy.
As shown in Figures 27 and 35, the application field or structure category can also be restricted. Doing so makes it easier to determine the extraction features for the entity information, and for the association relationship information between entities, that correspond to that application field or structure category, improving extraction accuracy. In addition, selecting from preset classifications of material structure and material application makes the input information more standardized, which also improves retrieval accuracy.
In some embodiments, the first selection list or the second selection list may present the information to be selected in an interactive visualization format. The interactive visualization format is a list selection format that displays the entity types, or the category information of material structures and material applications, in visual form and determines the corresponding category and subcategory from the trigger operations received on the category information, as shown in Figures 30 to 33.
Figure 30 shows several major categories of material structure. When the user clicks "material structure", the client can determine from the click that the user has selected material structure as part of the input information, and can then display the major categories. The user can click any of them, for example polymers, and the client can determine from the click that the user has further selected polymers as part of the input information. The client can then display the finer classification of polymers, as shown in Figure 31. The user can click any category in Figure 31, and the client can determine from the click the detailed category that the user has selected as part of the input information. Other modes of user operation are of course also possible. Figure 32 shows several major categories of material application, and Figure 33 is a schematic diagram of the finer classification of smart materials. Triggering and display for material applications can be implemented in the same way as for material structure and are not described again here.
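The click-driven drill-down described above can be modelled as walking a nested category tree. The sketch below is illustrative only: the `TAXONOMY` contents (in particular the polymer subcategories) and the function names `children` and `click` are assumptions, since the actual categories are defined by Figures 30 to 33.

```python
from typing import Dict, List

# Hypothetical category tree; the real categories come from the figures.
TAXONOMY: Dict[str, dict] = {
    "material structure": {
        "polymer": {"thermoplastic": {}, "elastomer": {}},  # assumed subcategories
        "metal": {},
        "ceramic": {},
    },
    "material application": {
        "smart materials": {},
        "energy materials": {},
    },
}


def children(path: List[str], taxonomy: Dict[str, dict] = TAXONOMY) -> List[str]:
    """Return the categories the client would display under the current selection path."""
    node = taxonomy
    for name in path:
        node = node[name]
    return sorted(node)


def click(path: List[str], choice: str,
          taxonomy: Dict[str, dict] = TAXONOMY) -> List[str]:
    """Extend the selection path with a clicked category, rejecting unknown choices."""
    if choice not in children(path, taxonomy):
        raise ValueError(f"{choice!r} is not available under {path}")
    return path + [choice]
```

The final path, for example `["material structure", "polymer"]`, is what the client would attach to the input information.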
Using an interactive visualization format for the first or second selection list makes the finer classifications of material structure and material application easier to display, and also improves interactivity and the user experience.
In other embodiments, material applications can also be linked to material properties, and the properties relevant to a specific application can be displayed, as in Figure 39, which shows ten major properties of smart materials. The internal structure of a material can change with its chemical composition and with external conditions, which in turn changes its properties. For example, low-carbon steel with a carbon mass fraction below 0.25% usually has good plasticity and toughness but lower strength and hardness, whereas high-carbon steel with a carbon mass fraction in the range 0.6% to 1.4% has higher strength and hardness but poorer plasticity and toughness. Different material properties strongly affect how a material is actually used; materials with different yield strengths, for instance, are used in quite different fields. By establishing associations between material applications and material properties, the descriptions in the data sources of material properties in their various units and measured values can be used to determine and verify the extracted material application information, improving the accuracy of extraction and retrieval.
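The carbon steel example can be written as a small lookup that maps a measured property value to a class and its expected properties. This is a sketch under stated assumptions: the function and table names are hypothetical, the treatment of the exact 0.25% boundary is an assumption, and compositions outside the two ranges given in the text are left unclassified rather than guessed.

```python
from typing import Dict, Optional

# Expected qualitative properties, taken from the two ranges described in the text.
EXPECTED_PROPERTIES: Dict[str, Dict[str, str]] = {
    "low-carbon steel": {"plasticity": "good", "toughness": "good",
                         "strength": "low", "hardness": "low"},
    "high-carbon steel": {"plasticity": "poor", "toughness": "poor",
                          "strength": "high", "hardness": "high"},
}


def classify_carbon_steel(carbon_mass_pct: float) -> Optional[str]:
    """Classify steel by carbon mass fraction (in percent); None if the text gives no class."""
    if carbon_mass_pct < 0:
        raise ValueError("carbon mass fraction cannot be negative")
    if carbon_mass_pct < 0.25:  # boundary handling is an assumption
        return "low-carbon steel"
    if 0.6 <= carbon_mass_pct <= 1.4:
        return "high-carbon steel"
    return None  # range not covered by the description


def expected_properties(carbon_mass_pct: float) -> Optional[Dict[str, str]]:
    """Look up the qualitative properties expected for a given carbon mass fraction."""
    steel_class = classify_carbon_steel(carbon_mass_pct)
    return EXPECTED_PROPERTIES.get(steel_class) if steel_class else None
```

A check of this kind is one way a property value found in a data source could be used to cross-check an extracted application.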
Based on the method provided in the above embodiments, the embodiments of this specification also provide an information retrieval method, applied to a server, which may include:
receiving a retrieval request sent by a client, the retrieval request including input information obtained by the client;
extracting entity information from the input information, the entity information including an entity type and an entity value;
determining the data source in which the entity information is located according to the associated entity information of the entity information in the data sources and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between entities;
sending the retrieval result to the client for display by the client.
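The server-side steps above can be sketched as a lookup over directed, typed relations. This is a minimal illustration, not the implementation described in the specification: the class names, the `find_sources` function and the relation label `"added_to"` are all hypothetical, and the relations are assumed to have been extracted from each data source beforehand.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    type: str   # entity type, e.g. "additive" or "polymer"
    value: str  # entity value, e.g. "Fullerene"


@dataclass(frozen=True)
class Relation:
    source: Entity  # association direction runs from source to target
    target: Entity
    kind: str       # association type (hypothetical label such as "added_to")


@dataclass
class DataSource:
    doc_id: str
    relations: List[Relation]


def find_sources(query: Entity, sources: List[DataSource], relation_kind: str,
                 related_type: Optional[str] = None) -> List[str]:
    """Return ids of data sources in which the query entity is the source of a
    relation of the given kind, optionally restricted by the related entity's type."""
    hits: List[str] = []
    for src in sources:
        for rel in src.relations:
            if rel.kind != relation_kind:
                continue
            if rel.source.type != query.type:
                continue
            if rel.source.value.lower() != query.value.lower():
                continue
            if related_type is not None and rel.target.type != related_type:
                continue
            hits.append(src.doc_id)
            break  # one matching relation is enough for this data source
    return hits
```

Passing `related_type="polymer"` corresponds to the further "polymer" condition in the Fullerene example: data sources where the additive is associated with another material type are dropped.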
Based on the method provided in the above embodiments, the embodiments of this specification also provide a server, which may include:
a third receiving module, which may be used to receive a retrieval request sent by a client, the retrieval request including input information obtained by the client;
a seventh extraction module, which may be used to extract entity information from the input information, the entity information including an entity type and an entity value;
a data source determination module, which may be used to determine the data source in which the entity information is located according to the associated entity information of the entity information in the data sources and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between entities;
a fifth sending module, which may be used to send the retrieval result to the client for display by the client.
Based on the method provided in the above embodiments, the embodiments of this specification also provide an information retrieval method, applied to a client, which may include:
sending a retrieval request to a server, the retrieval request including input information obtained by the client, so that the server extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source in which the entity information is located according to the associated entity information of the entity information in the data sources and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between entities; and sends the retrieval result to the client;
receiving the retrieval result sent by the server and displaying it.
In other embodiments, the information input interface displayed by the client may include an information input area, a first selection list and/or a second selection list; the information input area may be used for entering information; the first and second selection lists may be used for selecting information; the first selection list may include category information for material applications or material structures; and the second selection list may include category information for entity types;
correspondingly, the client may obtain the input information based on the information input area, the first selection list and/or the second selection list.
In other embodiments, the first selection list and/or the second selection list may present the information to be selected in an interactive visualization format, where the interactive visualization format is a list selection format that displays the entity types, or the category information of material structures and material applications, in visual form and determines the corresponding category and subcategory from the trigger operations received on the category information.
Based on the method provided in the above embodiments, the embodiments of this specification also provide a client, which may include:
a sixth sending module, which may be used to send a retrieval request to a server, the retrieval request including input information obtained by the client, so that the server extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source in which the entity information is located according to the associated entity information of the entity information in the data sources and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and association type between entities; and sends the retrieval result to the client;
a fourth receiving module, which may be used to receive the retrieval result sent by the server;
a second display module, which may be used to display the retrieval result.
In other embodiments, the client may further include:
an input interface display module, which may be used to display an information input interface, where the information input interface may include an information input area, a first selection list and/or a second selection list; the information input area is used for entering information; the first and second selection lists may be used for selecting information; the first selection list may include category information for material applications or material structures; and the second selection list may include category information for entity types;
an input information acquisition module, which may be used to obtain the input information based on the information input area, the first selection list and/or the second selection list.
The methods or apparatuses described in the above embodiments of this specification can implement the business logic through a computer program recorded on a storage medium; the storage medium can be read and executed by a computer to achieve the effects of the solutions described in the embodiments of this specification. Accordingly, this specification also provides an information retrieval device comprising a processor and a memory storing processor-executable instructions, where the instructions, when executed by the processor, implement the steps of the method of any of the above embodiments.
The storage medium may include a physical device for storing information, in which the information is usually digitized and then stored on an electrical, magnetic or optical medium. The storage medium may include: devices that store information electrically, such as the various kinds of memory (RAM, ROM and the like) and USB flash drives; devices that store information magnetically, such as hard disks, floppy disks, magnetic tape, magnetic core memory and bubble memory; and devices that store information optically, such as CDs or DVDs. There are of course other kinds of readable storage media as well, such as quantum memory and graphene memory.
It should be noted that the device described above may also include other implementations in accordance with the description of the method embodiments. For specific implementations, refer to the description of the related method embodiments, which is not repeated here.
A patent or paper typically describes a large amount of information, and the abstract provided by the author or applicant often fails to reflect the main information of the whole document. When performing a technical search, because of the large volume of information, users tend to decide from the abstract alone whether a document contains the information they need, which leads to missed results. To avoid missing results, however, users would have to spend a great deal of effort reading and analyzing the full text of each document, which is time-consuming and laborious. Accordingly, as shown in Figure 40, other embodiments of this specification may further provide a summary information generation method, which may include:
S80: the client sends a summary information generation request, the request including the data source for which a summary is to be generated;
S82: the server receives the summary information generation request sent by the client;
S84: the server extracts entity information and the association relationship information between entities from the data source, and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between entities;
S86: the server generates summary information from the associated entity information and sends the generated summary information to the client for display by the client.
The data source may include the text of a patent application, a paper or another type of information text for which summary information is to be generated. After obtaining the information text, the server can use NLP to process and tokenize the raw text. After tokenization, NER can be used, based on a pre-built entity information dictionary or entity information base, to extract the entity information in the text, such as material name entities or material application entities. The extracted entity information is taken as the objects to be analyzed.
The server can then use an NLP algorithm to obtain metadata for the objects to be analyzed. For example, entity information can be extracted from the context of each object, and a filtering algorithm can be used to delete noise words. Noise words include stop words, common words, non-scientific words and words unrelated to the materials context. The filtered entity information can then be attached to these objects as metadata.
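The metadata step can be illustrated with a simple context window and a noise-word filter. The sketch below is a plain-Python stand-in for the NLP and NER components mentioned above, not the specification's algorithm: the noise-word list, the window size and the function names are assumptions, and only single-token entity names are handled.

```python
import re
from typing import List, Set

# Hypothetical noise-word list; a real system would also cover common,
# non-scientific and off-topic words.
NOISE_WORDS: Set[str] = {"the", "a", "an", "of", "in", "is", "was", "and", "with", "to"}


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def context_metadata(text: str, entity: str, window: int = 5,
                     noise: Set[str] = NOISE_WORDS) -> Set[str]:
    """Collect non-noise tokens within `window` tokens of each occurrence of the entity."""
    tokens = tokenize(text)
    target = entity.lower()
    metadata: Set[str] = set()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        start = max(0, i - window)
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        metadata.update(t for t in neighbours if t not in noise and t != target)
    return metadata
```

The resulting set is what would be attached to the entity as metadata for the comparison step that follows.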
Next, the metadata of the objects to be analyzed can be compared to establish an association relationship between two objects, and that relationship can be used to associate them. In this way the association relationships between the important entity information in the information text are established, forming an information summary that is fed back to the client for display. The information summary may take the form of a passage of text generated from the entity information and the association relationship information between entities, or of an information flow formed from them.
For multiple information texts, the entity information and association relationship information extracted from two information texts can also be compared with each other to find similarities between the two texts. Entity similarity can be assessed by comparing multiple attributes related to different contexts and metadata. Each entity may have its own threshold for entity matching. For example, similar or matching entities can be found based on the entity thresholds, and entities and relationships can be combined to form important workflow links between the two documents.
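Per-entity thresholds can be illustrated with a set-overlap score. The specification does not name a similarity measure, so Jaccard similarity over metadata sets is used here purely as an assumed stand-in; the function names and the default threshold are likewise hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_entities(doc_a: Dict[str, Set[str]], doc_b: Dict[str, Set[str]],
                   thresholds: Dict[str, float],
                   default_threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Pair entities from two documents whose metadata similarity reaches
    the threshold assigned to the entity from the first document."""
    matches: List[Tuple[str, str, float]] = []
    for name_a, meta_a in doc_a.items():
        limit = thresholds.get(name_a, default_threshold)
        for name_b, meta_b in doc_b.items():
            score = jaccard(meta_a, meta_b)
            if score >= limit:
                matches.append((name_a, name_b, score))
    return matches
```

Each returned pair is a candidate link between the two documents; a stricter threshold for a given entity yields fewer but more reliable links.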
The extraction of entity information and of the association relationship information between entities can follow the information flow extraction method described above and is not repeated here. Generating summary information in this way extracts the entity information in the data source, and the association relationships between entities, accurately and comprehensively, sparing the user from filtering and organizing the content and allowing the user to find the required information more quickly and accurately.
The generated summary information can be displayed as text or tables, or as an information flow. The information flow can be extracted as described in the above embodiments. Displaying the summary information as an information flow makes it easier for the user to review and improves the user experience.
Based on the method provided in the above embodiments, the embodiments of this specification also provide a summary information generation method, applied to a server, which may include:
receiving a summary information generation request sent by a client, the request including the data source for which a summary is to be generated;
extracting entity information and the association relationship information between entities from the data source, and associating the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between entities;
generating summary information from the associated entity information, and sending the generated summary information to the client for display by the client.
Based on the method provided in the above embodiments, the embodiments of this specification also provide a server, which may include:
a fifth receiving module, which may be used to receive a summary information generation request sent by a client, the request including the data source for which a summary is to be generated;
an eighth extraction module, which may be used to extract entity information and the association relationship information between entities from the data source, where the association relationship information includes information describing the association direction and association type between entities;
a sixth association module, which may be used to associate the corresponding entity information according to the extracted association relationship information;
a generation module, used to generate summary information from the associated entity information;
a seventh sending module, which may be used to send the generated summary information to the client for display by the client.
Based on the method provided in the above embodiments, the embodiments of this specification also provide a summary information generation method, applied to a client, which may include:
sending a summary information generation request to a server, the request including the data source for which a summary is to be generated, so that the server extracts entity information and the association relationship information between entities from the data source and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between entities; and generates summary information from the associated entity information and sends the generated summary information to the client;
receiving the summary information sent by the server and displaying it.
Based on the method provided in the above embodiments, the embodiments of this specification also provide a client, which may include:
an eighth sending module, which may be used to send a summary information generation request to a server, the request including the data source for which a summary is to be generated, so that the server extracts entity information and the association relationship information between entities from the data source and associates the corresponding entity information according to the extracted association relationship information, where the association relationship information includes information describing the association direction and association type between entities; and generates summary information from the associated entity information and sends the generated summary information to the client;
a fifth receiving module, which may be used to receive the summary information sent by the server;
a third display module, which may be used to display the summary information.
The methods or apparatuses described in the foregoing embodiments of this specification may implement their service logic as a computer program recorded on a storage medium, and the storage medium can be read and executed by a computer to achieve the effects of the solutions described in the embodiments of this specification. Accordingly, this specification further provides a summary information generation device, including a processor and a memory storing processor-executable instructions, wherein the instructions, when executed by the processor, implement the steps of the method described in any one of the foregoing embodiments.
The storage medium may include a physical device for storing information, in which the information is typically digitized and then stored on media that use electrical, magnetic, or optical means. The storage medium may include: devices that store information electrically, such as various memories, e.g. RAM and ROM; devices that store information magnetically, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and USB flash drives; and devices that store information optically, such as CDs or DVDs. Of course, there are other types of readable storage media, such as quantum memory and graphene memory.
It should be noted that the device described above may also include other implementations according to the description of the method embodiments. For specific implementations, refer to the description of the related method embodiments; details are not repeated here.
This specification further provides a system, which may be a standalone information stream extraction system, information stream display system, information retrieval system, or summary generation system, and may also be applied in a variety of information extraction systems. The system may be a single server, or may include a server cluster, a system (including a distributed system), software (an application), an actual operating apparatus, a logic gate circuit apparatus, a quantum computer, or the like that uses one or more of the methods or one or more of the embodiment apparatuses of this specification, combined with a terminal apparatus having the necessary implementation hardware. The information retrieval system may include at least one processor and a memory storing computer-executable instructions, wherein the processor, when executing the instructions, implements the steps of the method in any one or more of the foregoing embodiments.
It should be noted that the system described above may also include other implementations according to the description of the method or apparatus embodiments. For specific implementations, refer to the description of the related method embodiments; details are not repeated here.
The embodiments of this specification are not limited to cases that must conform to a standard data model/template or to what is described in the embodiments of this specification. Implementations slightly modified from certain industry standards, or from implementations described in a custom manner or in the embodiments, can also achieve implementation effects that are the same as, equivalent or similar to, or predictable variations of those of the foregoing embodiments. Embodiments obtained by applying such modified or varied data acquisition, storage, judgment, and processing methods still fall within the scope of the optional implementations of this specification.
This specification is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, or device that includes the element.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments. In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this specification. In this specification, illustrative uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of those embodiments or examples.
The above are merely embodiments of this specification and are not intended to limit it. Those skilled in the art may make various modifications and changes to this specification. Any modification, equivalent replacement, or improvement made within the spirit and principles of this specification shall fall within the scope of the claims of this specification.

Claims (35)

  1. An information stream display method, comprising:
    sending, by a client, an information stream acquisition request to a server, the information stream acquisition request including input information acquired by the client;
    receiving, by the server, the information stream acquisition request, and extracting one or more pieces of entity information from the input information as target entity information;
    extracting, by the server from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between entity information;
    associating, by the server, the target entity information with the associated entity information by using the association relationship information to obtain an information stream corresponding to the target entity information, and sending the information stream to the client for display on the client.
  2. An information stream extraction method, comprising:
    acquiring target entity information;
    extracting, from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between entity information;
    associating the target entity information with the associated entity information by using the association relationship information to obtain an information stream corresponding to the target entity information.
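By way of non-limiting illustration, the steps of the method above can be sketched as follows. The sketch assumes that the relations have already been extracted from a data source by some upstream component; the `Relation` record, the sample entities, and the relation type names (`made_by`, `used_in`, `substrate_for`) are hypothetical and do not appear in this specification. It only shows how a direction and a type can be carried on each association and how the associations touching a target entity can be collected into a stream.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Relation:
    source: str    # entity the association starts from (direction: source -> target)
    target: str    # entity the association points to
    rel_type: str  # association type; the names used here are hypothetical


# Hypothetical relations assumed to be already extracted from a data source.
EXTRACTED = [
    Relation("graphene", "chemical vapor deposition", "made_by"),
    Relation("graphene", "flexible display", "used_in"),
    Relation("copper foil", "graphene", "substrate_for"),
    Relation("silicon", "solar cell", "used_in"),
]


def extract_stream(target: str, relations: List[Relation]) -> List[Relation]:
    """Collect every relation that touches the target entity, keeping its direction."""
    return [r for r in relations if target in (r.source, r.target)]


def render(stream: List[Relation]) -> List[str]:
    """Render each association as 'source -[type]-> target'."""
    return [f"{r.source} -[{r.rel_type}]-> {r.target}" for r in stream]


stream = extract_stream("graphene", EXTRACTED)
for line in render(stream):
    print(line)
```

Keeping source and target as separate fields is one simple way to preserve the association direction; an incoming association such as the copper foil example stays distinguishable from an outgoing one.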
  3. The method according to claim 2, wherein associating the target entity information with the associated entity information by using the association relationship information comprises:
    acquiring at least one piece of associated entity information extracted from a data source and the association relationship information between the target entity information and the corresponding associated entity information, and associating the target entity information with the corresponding associated entity information by using that association relationship information to obtain a sub-information stream of the target entity information;
    linking, with the target entity information as reference information, the sub-information streams obtained from the same or different data sources to obtain the information stream corresponding to the target entity information.
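By way of non-limiting illustration, linking sub-information streams from several data sources around a shared target entity might look like the sketch below. It assumes each sub-stream is a list of `(source, rel_type, target)` triples keyed by a data source identifier; the identifiers, entity names, and the choice to record which sources support each association are illustrative assumptions only.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (source entity, association type, target entity)


def link_substreams(target: str,
                    substreams: Dict[str, List[Triple]]) -> Dict[Triple, List[str]]:
    """Merge per-source sub-streams into one stream anchored on the target entity.

    Each distinct association is kept once and mapped to the identifiers of
    the data sources in which it was found.
    """
    linked: Dict[Triple, List[str]] = {}
    for source_id, triples in substreams.items():
        for triple in triples:
            # Only associations that involve the reference (target) entity are linked.
            if target not in (triple[0], triple[2]):
                continue
            linked.setdefault(triple, []).append(source_id)
    return linked


# Hypothetical sub-streams obtained from two different data sources.
substreams = {
    "patent-A": [("graphene", "made_by", "CVD")],
    "paper-B": [("graphene", "made_by", "CVD"),
                ("graphene", "used_in", "sensor")],
}
stream = link_substreams("graphene", substreams)
for triple, sources in stream.items():
    print(triple, sources)
```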
  4. The method according to claim 2 or 3, wherein, when the association relationship information includes appositive entity description information, the associated entity information corresponding to the appositive entity description information is added to the target entity information to update the target entity information, wherein the appositive entity description information includes description information indicating that the associated entity information is another form of expression of the target entity information;
    correspondingly, at least one piece of associated entity information corresponding to the updated target entity information and association relationship information between the target entity information and the associated entity information are extracted from the data source.
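By way of non-limiting illustration, the update of the target entity information with other forms of expression of the same entity can be sketched as follows. The relation type name `also_known_as`, the sample entities, and the decision to treat such relations as symmetric and transitive are assumptions made for this sketch, not requirements stated in this specification.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (source entity, association type, target entity)


def expand_aliases(targets: Set[str], relations: List[Triple],
                   alias_type: str = "also_known_as") -> Set[str]:
    """Add every entity linked to a target by an alias-type relation.

    Repeats until no new alias is found, so chains of aliases are followed.
    """
    expanded = set(targets)
    changed = True
    while changed:
        changed = False
        for src, rel, dst in relations:
            if rel != alias_type:
                continue
            if src in expanded and dst not in expanded:
                expanded.add(dst)
                changed = True
            elif dst in expanded and src not in expanded:
                expanded.add(src)
                changed = True
    return expanded


def extract_for_targets(targets: Set[str], relations: List[Triple],
                        alias_type: str = "also_known_as") -> List[Triple]:
    """Re-run extraction against the updated target set, skipping alias relations."""
    return [r for r in relations
            if r[1] != alias_type and (r[0] in targets or r[2] in targets)]


# Hypothetical relations assumed to be extracted from a data source.
relations = [
    ("PTFE", "also_known_as", "polytetrafluoroethylene"),
    ("polytetrafluoroethylene", "also_known_as", "Teflon"),
    ("polytetrafluoroethylene", "used_in", "non-stick coating"),
    ("Teflon", "used_in", "cable insulation"),
]
updated = expand_aliases({"PTFE"}, relations)
found = extract_for_targets(updated, relations)
print(sorted(updated))
print(found)
```

Without the expansion step, a query for "PTFE" alone would match none of the two `used_in` associations in this sample, which is the gap the update is meant to close.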
  5. The method according to any one of claims 2 to 4, wherein the method further comprises:
    performing information stream extraction with each piece of associated entity information extracted from the data source as target entity information, to obtain information streams corresponding to a plurality of pieces of target entity information;
    linking the information streams corresponding to the plurality of pieces of target entity information, with the respective target entity information as reference information, to obtain an information graph.
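By way of non-limiting illustration, treating each associated entity as a new target and linking the resulting streams into a graph can be sketched as a breadth-first expansion. The `max_depth` limit, the triple layout, and the sample entities are assumptions introduced for this sketch; this specification does not prescribe a traversal order or a depth bound.

```python
from collections import deque
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (source entity, association type, target entity)


def build_graph(seed: str, relations: List[Triple],
                max_depth: int = 2) -> Dict[str, List[Triple]]:
    """Map each visited entity to its own stream (the relations that touch it).

    Entities reached through a stream become new targets until max_depth
    (a hypothetical bound) is reached.
    """
    graph: Dict[str, List[Triple]] = {}
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if entity in graph:
            continue
        stream = [r for r in relations if entity in (r[0], r[2])]
        graph[entity] = stream
        if depth < max_depth:
            for src, _, dst in stream:
                neighbor = dst if src == entity else src
                if neighbor not in graph:
                    queue.append((neighbor, depth + 1))
    return graph


# Hypothetical relations assumed to be extracted from a data source.
relations = [
    ("graphene", "made_by", "CVD"),
    ("graphene", "used_in", "flexible display"),
    ("flexible display", "used_in", "smartphone"),
    ("smartphone", "sold_by", "retailer"),
]
graph = build_graph("graphene", relations, max_depth=1)
for entity, stream in graph.items():
    print(entity, stream)
```

Because the streams are keyed by entity, two streams are linked wherever they share an entity; "flexible display" here connects the graphene stream to the smartphone association.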
  6. The method according to any one of claims 2 to 5, wherein extracting, from the data source, the at least one piece of associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information comprises:
    retrieving the data source in which the target entity information is located;
    locating the context information in which the target entity information appears in the data source;
    when the context information contains a material process entity, a material name entity, and/or a material application entity, extracting, according to a material process information association manner, the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information.
  7. The method according to any one of claims 2 to 6, wherein the method further comprises:
    extracting, from the data source, the trade names of the target entity information and the associated entity information;
    extracting, according to the trade name, the manufacturer or supplier corresponding to the target entity information or the associated entity information, and associating the manufacturer or supplier with the target entity information or the associated entity information.
  8. The method according to any one of claims 2 to 7, wherein the method further comprises:
    extracting, from the data source, parameter entity information of the target entity information and the associated entity information, the parameter entity information including entity information corresponding to at least one of a material structure type entity, a process method entity, a material property entity, a unit entity, or a measurement entity;
    associating the extracted parameter entity information with the corresponding target entity information and associated entity information.
  9. The method according to any one of claims 3 to 8, wherein the method further comprises: associating the sub-information stream of the target entity information with the corresponding data source.
  10. The method according to claim 2, wherein the method further comprises:
    using each piece of entity information in the information stream as an information node for interaction with a user, and visualizing the information stream by using a visualization method;
    sending the processed information stream to a client, so that the user can view the information stream and trigger the information nodes through the client;
    feeding back to the client, based on the user's trigger operation on an information node, other information associated with that information node.
  11. An information stream extraction apparatus, comprising:
    a first acquisition module, configured to acquire target entity information;
    a first extraction module, configured to extract, from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between entity information;
    a first association module, configured to associate the target entity information with the associated entity information by using the association relationship information to obtain an information stream corresponding to the target entity information.
  12. The apparatus according to claim 11, wherein the first association module comprises:
    a first association unit, configured to acquire at least one piece of associated entity information extracted from a data source and the association relationship information between the target entity information and the corresponding associated entity information, and to associate the target entity information with the corresponding associated entity information by using that association relationship information to obtain a sub-information stream of the target entity information;
    a second association unit, configured to link, with the target entity information as reference information, the sub-information streams obtained from the same or different data sources to obtain the information stream corresponding to the target entity information.
  13. The apparatus according to claim 11 or 12, wherein, when the association relationship information includes appositive entity description information, the apparatus further comprises:
    an update module, configured to add the associated entity information corresponding to the appositive entity description information to the target entity information to update the target entity information, wherein the appositive entity description information includes description information indicating that the associated entity information is another form of expression of the target entity information;
    the first extraction module being further configured to extract, from the data source, at least one piece of associated entity information corresponding to the updated target entity information and association relationship information between the target entity information and the associated entity information.
  14. The apparatus according to any one of claims 11 to 13, wherein the apparatus further comprises:
    a second extraction module, configured to perform information stream extraction with each piece of associated entity information extracted from the data source as target entity information, to obtain information streams corresponding to a plurality of pieces of target entity information;
    a second association module, configured to link the information streams corresponding to the plurality of pieces of target entity information, with the respective target entity information as reference information, to obtain an information graph.
  15. The apparatus according to any one of claims 11 to 14, wherein the first extraction module comprises:
    a retrieval unit, configured to retrieve the data source in which the target entity information is located;
    a locating unit, configured to locate the context information in which the target entity information appears in the data source;
    an extraction unit, configured to, when the context information contains a material process entity, a material name entity, and/or a material application entity, extract, according to a material process information association manner, the associated entity information corresponding to the target entity information and the association relationship information between the target entity information and the associated entity information.
  16. The apparatus according to any one of claims 11 to 15, wherein the apparatus further comprises:
    a third extraction module, configured to extract, from the data source, the trade names of the target entity information and the associated entity information;
    a third association module, configured to extract, according to the trade name, the manufacturer or supplier corresponding to the target entity information or the associated entity information, and to associate the manufacturer or supplier with the target entity information or the associated entity information.
  17. The apparatus according to any one of claims 11 to 16, wherein the apparatus further comprises:
    a fourth extraction module, configured to extract, from the data source, parameter entity information of the target entity information and the associated entity information, the parameter entity information including entity information corresponding to at least one of a material structure type entity, a process method entity, a material property entity, a unit entity, or a measurement entity;
    a fourth association module, configured to associate the extracted parameter entity information with the corresponding target entity information and associated entity information.
  18. The apparatus according to any one of claims 12 to 17, wherein the first association module further comprises:
    a third association unit, configured to associate the sub-information stream of the target entity information with the corresponding data source.
  19. The apparatus according to any one of claims 11 to 18, wherein the apparatus further comprises:
    a visualization processing module, configured to use each piece of entity information in the information stream as an information node for interaction with a user, and to visualize the information stream by using a visualization method;
    a first sending module, configured to send the processed information stream to a client, so that the user can view the information stream and trigger the information nodes through the client;
    a second sending module, configured to feed back to the client, based on the user's trigger operation on an information node, other information associated with that information node.
  20. An information stream extraction device, comprising a processor and a memory for storing processor-executable instructions, wherein the instructions, when executed by the processor, implement the steps of the method according to any one of claims 2 to 10.
  21. An information stream display method, applied to a server, comprising:
    receiving an information stream acquisition request sent by a client, the information stream acquisition request including input information acquired by the client;
    extracting one or more pieces of entity information from the input information as target entity information;
    extracting, from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between entity information;
    associating the target entity information with the associated entity information by using the association relationship information to obtain an information stream corresponding to the target entity information;
    sending the information stream to the client for display on the client.
  22. A server, comprising:
    a first receiving module, configured to receive an information stream acquisition request sent by a client, the information stream acquisition request including input information acquired by the client;
    a fifth extraction module, configured to extract one or more pieces of entity information from the input information as target entity information;
    a sixth extraction module, configured to extract, from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between entity information;
    a fifth association module, configured to associate the target entity information with the associated entity information by using the association relationship information to obtain an information stream corresponding to the target entity information;
    a third sending module, configured to send the information stream to the client for display on the client.
  23. An information stream display method, applied to a client, comprising:
    sending an information stream acquisition request to a server, the information stream acquisition request including input information acquired by the client, so that the server: receives the information stream acquisition request and extracts one or more pieces of entity information from the input information as target entity information; extracts, from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between entity information; and associates the target entity information with the associated entity information by using the association relationship information, obtains the information stream corresponding to the target entity information, and sends the information stream to the client;
    receiving the information stream sent by the server and displaying it.
  24. A client, comprising:
    a fourth sending module, configured to send an information stream acquisition request to a server, the information stream acquisition request including input information acquired by the client, so that the server: receives the information stream acquisition request and extracts one or more pieces of entity information from the input information as target entity information; extracts, from a data source, at least one piece of associated entity information corresponding to the target entity information and association relationship information between the target entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between entity information; and associates the target entity information with the associated entity information by using the association relationship information, obtains the information stream corresponding to the target entity information, and sends the information stream to the client;
    a second receiving module, configured to receive the information stream sent by the server;
    a first display module, configured to display the information stream.
  25. An information retrieval method, applied to a server, comprising:
    receiving a retrieval request sent by a client, the retrieval request including input information obtained by the client;
    extracting entity information from the input information, the entity information including an entity type and an entity value;
    determining the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information; and
    sending the retrieval result to the client, so that the client displays it.
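The steps of the method above can be sketched in a few lines. This is a simplified illustration, not the applicant's implementation: it assumes that every data source has already been indexed as a list of (head, association type, tail) triples, and that entity extraction is a plain dictionary lookup. `Entity`, `Triple`, `KNOWN_ENTITIES`, `extract_entities`, and `find_data_sources` are hypothetical names.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    type: str   # entity type, e.g. "material"
    value: str  # entity value, e.g. "graphene"


class Triple(NamedTuple):
    head: Entity    # entity the association points from
    rel_type: str   # association type
    tail: Entity    # entity the association points to


# Hypothetical gazetteer mapping entity values to entity types.
KNOWN_ENTITIES = {"graphene": "material", "anode": "application"}


def extract_entities(text):
    """Return every known entity whose value occurs in the input text."""
    lowered = text.lower()
    return [Entity(t, v) for v, t in KNOWN_ENTITIES.items() if v in lowered]


def find_data_sources(entities, index):
    """Return the data sources in which any of the entities takes part in an association.

    `index` maps a data source identifier to the triples extracted from it.
    The matching triples are returned so the associated entities, the
    direction, and the type of each association can be shown to the client.
    """
    results = {}
    for source_id, triples in index.items():
        matches = [t for t in triples if t.head in entities or t.tail in entities]
        if matches:
            results[source_id] = matches
    return results
```

Returning the matching triples rather than only the source identifiers keeps the association direction and type available for display alongside each hit.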
  26. A server, comprising:
    a third receiving module, configured to receive a retrieval request sent by a client, the retrieval request including input information obtained by the client;
    a seventh extraction module, configured to extract entity information from the input information, the entity information including an entity type and an entity value;
    a data source determining module, configured to determine the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information; and
    a fifth sending module, configured to send the retrieval result to the client, so that the client displays it.
  27. An information retrieval method, applied to a client, comprising:
    sending a retrieval request to a server, the retrieval request including input information obtained by the client, so that the server: extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information; and sends the retrieval result to the client; and
    receiving the retrieval result sent by the server, and displaying it.
  28. The method according to claim 27, wherein the information input interface displayed by the client includes an information input area, a first selection list and/or a second selection list; the information input area is used for entering information; the first selection list and the second selection list are used for selecting information, the first selection list including category information of material applications or material structures, and the second selection list including category information of entity types;
    correspondingly, the client obtains the input information based on the information input area, the first selection list and/or the second selection list.
  29. The method according to claim 28, wherein the first selection list and/or the second selection list presents the information to be selected in a format that includes an interactive visualization format; wherein the interactive visualization format is a list selection format that presents the category information of entity types, material structures, or material applications in visual form and determines the corresponding category and subcategory by receiving a trigger operation on each piece of category information.
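The category and subcategory selection described in claims 28 and 29 can be modelled as a walk down a nested category tree, where each trigger operation (for example a click) descends one level. The sketch below is an assumption-laden illustration: `CATEGORY_TREE`, its labels, and the `expand` function are hypothetical and only show the data structure, not any rendering.

```python
# Hypothetical category tree for the first selection list
# (material applications and material structures).
CATEGORY_TREE = {
    "material application": {
        "energy storage": {"battery": {}, "supercapacitor": {}},
        "coatings": {},
    },
    "material structure": {"layered": {}, "porous": {}},
}


def expand(tree, path):
    """Return the subcategories reachable after triggering each label in `path`.

    An empty path yields the top-level categories; a path ending at a leaf
    yields an empty list, meaning the selection is complete.
    """
    node = tree
    for label in path:
        node = node[label]  # raises KeyError for a label that is not offered
    return sorted(node)
```

A client would call `expand` after every trigger operation and redraw the visualization with the returned subcategories; the accumulated path then becomes part of the input information sent to the server.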
  30. A client, comprising:
    a sixth sending module, configured to send a retrieval request to a server, the retrieval request including input information obtained by the client, so that the server: extracts entity information from the input information, the entity information including an entity type and an entity value; determines the data source in which the entity information is located, according to the associated entity information of the entity information in the data source and the association relationship information between the entity information and the associated entity information, the association relationship information including information describing the association direction and the association type between pieces of entity information; and sends the retrieval result to the client;
    a fourth receiving module, configured to receive the retrieval result sent by the server; and
    a second display module, configured to display the retrieval result.
  31. The client according to claim 30, wherein the client further comprises:
    an input interface display module, configured to display an information input interface, the information input interface including an information input area, a first selection list and/or a second selection list; the information input area is used for entering information; the first selection list and the second selection list are used for selecting information, the first selection list including category information of material applications or material structures, and the second selection list including category information of entity types; and
    an input information obtaining module, configured to obtain the input information based on the information input area, the first selection list and/or the second selection list.
  32. A summary information generation method, applied to a server, comprising:
    receiving a summary information generation request sent by a client, the summary information generation request including the data source for which a summary is to be generated;
    extracting, from the data source, entity information and association relationship information between pieces of entity information, and associating the corresponding entity information according to the extracted association relationship information; wherein the association relationship information includes information describing the association direction and the association type between pieces of entity information; and
    generating summary information according to the associated entity information, and sending the generated summary information to the client, so that the client displays it.
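One simple way to turn associated entity information into summary text is to render each directed, typed association as a short sentence. The claim does not specify how the summary is produced, so the template approach below is purely an assumption made for illustration; `generate_summary` and the sample association types are hypothetical.

```python
def generate_summary(triples):
    """Render (head, association type, tail) triples as one sentence each.

    The association direction is preserved by word order (head first),
    and the association type supplies the verb phrase.
    """
    sentences = []
    for head, rel_type, tail in triples:
        verb_phrase = rel_type.replace("_", " ")
        sentences.append(f"{head} {verb_phrase} {tail}.")
    return " ".join(sentences)
```

For instance, the triples `("Graphene", "is_used_in", "anodes")` and `("Anodes", "are_part_of", "lithium-ion batteries")` yield two sentences joined by a space. A production system would likely rank or filter the associations before rendering them.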
  33. A server, comprising:
    a fifth receiving module, configured to receive a summary information generation request sent by a client, the summary information generation request including the data source for which a summary is to be generated;
    an eighth extraction module, configured to extract, from the data source, entity information and association relationship information between pieces of entity information; wherein the association relationship information includes information describing the association direction and the association type between pieces of entity information;
    a sixth association module, configured to associate the corresponding entity information according to the extracted association relationship information;
    a generation module, configured to generate summary information according to the associated entity information; and
    a seventh sending module, configured to send the generated summary information to the client, so that the client displays it.
  34. A summary information generation method, applied to a client, comprising:
    sending a summary information generation request to a server, the summary information generation request including the data source for which a summary is to be generated, so that the server: extracts, from the data source, entity information and association relationship information between pieces of entity information, and associates the corresponding entity information according to the extracted association relationship information, wherein the association relationship information includes information describing the association direction and the association type between pieces of entity information; and generates summary information according to the associated entity information and sends the generated summary information to the client; and
    receiving the summary information sent by the server, and displaying it.
  35. A client, comprising:
    an eighth sending module, configured to send a summary information generation request to a server, the summary information generation request including the data source for which a summary is to be generated, so that the server: extracts, from the data source, entity information and association relationship information between pieces of entity information, and associates the corresponding entity information according to the extracted association relationship information, wherein the association relationship information includes information describing the association direction and the association type between pieces of entity information; and generates summary information according to the associated entity information and sends the generated summary information to the client;
    a fifth receiving module, configured to receive the summary information sent by the server; and
    a third display module, configured to display the summary information.
PCT/CN2021/098541 2020-06-05 2021-06-07 Information stream extraction method, apparatus and device WO2021244657A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010503615.1 2020-06-05
CN202010503615.1A CN113761214A (en) 2020-06-05 2020-06-05 Information flow extraction method, device and equipment

Publications (1)

Publication Number Publication Date
WO2021244657A1 (en) 2021-12-09

Family

ID=78783908

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/098541 WO2021244657A1 (en) 2020-06-05 2021-06-07 Information stream extraction method, apparatus and device

Country Status (2)

Country Link
CN (1) CN113761214A (en)
WO (1) WO2021244657A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170098013A1 (en) * 2015-10-05 2017-04-06 Yahoo! Inc. Method and system for entity extraction and disambiguation
CN107665252A (en) * 2017-09-27 2018-02-06 深圳证券信息有限公司 A kind of method and device of creation of knowledge collection of illustrative plates
CN110019540A (en) * 2017-07-20 2019-07-16 阿里巴巴集团控股有限公司 Implementation method, methods of exhibiting and the device of enterprise's map, equipment
CN110083284A (en) * 2019-05-06 2019-08-02 三角兽(北京)科技有限公司 Candidate information processing unit, candidate information display methods, storage medium and electronic equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150293947A1 (en) * 2014-04-10 2015-10-15 Raghuvira Bhagavan Validating relationships between entities in a data model
CN106095762A (en) * 2016-02-05 2016-11-09 中科鼎富(北京)科技发展有限公司 A kind of news based on ontology model storehouse recommends method and device
US10796697B2 (en) * 2017-01-31 2020-10-06 Microsoft Technology Licensing, Llc Associating meetings with projects using characteristic keywords
CN107562884A (en) * 2017-09-04 2018-01-09 百度在线网络技术(北京)有限公司 A kind of information flow shows method, apparatus, server and storage medium
CN109918669B (en) * 2019-03-08 2023-08-08 腾讯科技(深圳)有限公司 Entity determining method, device and storage medium
CN110909176B (en) * 2019-11-20 2021-03-02 腾讯科技(深圳)有限公司 Data recommendation method and device, computer equipment and storage medium
CN111078727A (en) * 2019-12-17 2020-04-28 Oppo广东移动通信有限公司 Brief description generation method and device and computer readable storage medium

Also Published As

Publication number Publication date
CN113761214A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
US11409777B2 (en) Entity-centric knowledge discovery
US11645317B2 (en) Recommending topic clusters for unstructured text documents
CN105493075B (en) Attribute value retrieval based on identified entities
JP5332477B2 (en) Automatic generation of term hierarchy
JP5423030B2 (en) Determining words related to a word set
US8498984B1 (en) Categorization of search results
US8117198B2 (en) Methods for generating search engine index enhanced with task-related metadata
US8126888B2 (en) Methods for enhancing digital search results based on task-oriented user activity
KR101502671B1 (en) Online analysis and display of correlated information
US11989239B2 (en) Visual mapping of aggregate causal frameworks for constructs, relationships, and meta-analyses
JP2009104630A (en) Machine learning approach to determining document relevance for searching over large electronic collections of documents
WO2011097307A2 (en) Intuitive, contextual information search and presentation systems and methods
CN111061828B (en) Digital library knowledge retrieval method and device
US20160210355A1 (en) Searching and classifying unstructured documents based on visual navigation
WO2020074017A1 (en) Deep learning-based method and device for screening for keywords in medical document
KR101441219B1 (en) Automatic association of informational entities
Wang et al. The role of user reviews in app updates: A preliminary investigation on app release notes
WO2021244657A1 (en) Information stream extraction method, apparatus and device
Sarkar et al. LigerCat: using “MeSH Clouds” from journal, article, or gene citations to facilitate the identification of relevant biomedical literature
WO2016176310A1 (en) Conceptual document analysis and characterization
US20240086448A1 (en) Detecting cited with connections in legal documents and generating records of same
Kumaar et al. Effective Online Discussion Data for Teachers Reflective Thinking Using Feature base Model
TW202328946A (en) Patent search system and method thereof
Miller et al. LigerCat: Using" MeSH Clouds" from Journal, Article, or Gene Citations to Facilitate the Identification of Relevant Biomedical...

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21818861

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 21818861

Country of ref document: EP

Kind code of ref document: A1