CN118103834A - Information acquisition method and device - Google Patents

Information acquisition method and device

Info

Publication number
CN118103834A
Authority
CN
China
Prior art keywords
knowledge graph
information
event
node
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180103399.4A
Other languages
Chinese (zh)
Inventor
张小莲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an information acquisition method and device in the field of artificial intelligence, which store information related to a user in a personal knowledge graph and thereby enable more efficient data retrieval. The method may include: first, acquiring an input text of a target user, wherein at least one word included in the input text forms at least one event; then, acquiring an output sequence based on the input text, wherein the output sequence comprises the type and elements of the at least one event; and obtaining a personal knowledge graph according to the output sequence, wherein the personal knowledge graph comprises a plurality of nodes, which may specifically include type nodes and element nodes, the type nodes represent the types of the at least one event, the element nodes represent the elements of the at least one event, the type nodes and the element nodes of the same event are associated with each other, and the personal knowledge graph is used for making recommendations to the target user.

Description

Information acquisition method and device
Technical Field
The application relates to the field of artificial intelligence, in particular to an information acquisition method and device.
Background
With the rapid development and wide application of technologies such as big data, enterprises increasingly focus on how to use big data for precision marketing and other services, and the concept of the "user portrait" has emerged: by means of big data, valuable information is mined from user behavior and applied across the various stages of a user's activity to improve the user experience. However, user portraits typically build labels in units of the items that the user has used, which may result in inaccurate information being subsequently recommended to the user. Therefore, how to obtain information that characterizes the user more accurately is an urgent problem to be solved.
Disclosure of Invention
The embodiment of the application provides an information acquisition method and device, which are used for extracting more accurate information from texts input by users by combining a neural network and syntactic analysis, and storing related information of the users through personal knowledge graphs, so that more efficient data retrieval can be realized.
In view of this, in a first aspect, the present application provides an information acquisition method, including: acquiring an input text of a target user, wherein the input text comprises at least one word, and the at least one word forms at least one event; acquiring an output sequence based on the input text, wherein the output sequence comprises the type and elements of the at least one event; and obtaining a personal knowledge graph according to the output sequence, wherein the personal knowledge graph comprises a plurality of nodes, the plurality of nodes comprise type nodes and element nodes, the type nodes represent the type of the at least one event, the element nodes represent the elements of the at least one event, the type node corresponding to the type of an event is associated with the element nodes corresponding to the elements of the same event, and the personal knowledge graph is used for making recommendations to the target user.
In the embodiment of the application, the types and elements of the events generated by the target user are extracted with the event as the unit, and the knowledge graph is constructed from them, so that each event of the target user can be stored more conveniently and accurately and the knowledge related to the target user can be recorded more accurately. Therefore, when recommendations are later made to the target user, information can be queried accurately with the event as the unit, and a complete event can be retrieved through the association relations among the nodes, which improves the accuracy of data queries and the effectiveness of recommendation. In addition, in the embodiment of the application, the personal knowledge graph is constructed for the user and can be constructed or updated based on the entities extracted from the input text. Searching by means of the nodes is also more efficient, so that recommendations can be made to the user more efficiently.
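The structure described above can be illustrated with a minimal sketch. It assumes a simple in-memory representation; the class names `Node` and `PersonalKnowledgeGraph`, the method names, and the sample event are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    kind: str   # "type" for a type node, "element" for an element node
    value: str  # e.g. an event type such as "travel", or an element such as "Beijing"


@dataclass
class PersonalKnowledgeGraph:
    nodes: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)  # node -> set of associated nodes

    def add_event(self, event_type, elements):
        """Add one event: a type node associated with each of its element nodes."""
        type_node = Node("type", event_type)
        self.nodes.add(type_node)
        for value in elements:
            element_node = Node("element", value)
            self.nodes.add(element_node)
            # Associations are stored in both directions so either end can be queried.
            self.edges.setdefault(type_node, set()).add(element_node)
            self.edges.setdefault(element_node, set()).add(type_node)
        return type_node

    def event_elements(self, event_type):
        """Return the element values associated with the given event type."""
        type_node = Node("type", event_type)
        return sorted(n.value for n in self.edges.get(type_node, ()) if n.kind == "element")


pkg = PersonalKnowledgeGraph()
pkg.add_event("travel", ["Beijing", "next Friday"])
print(pkg.event_elements("travel"))  # ['Beijing', 'next Friday']
```

Because the type node and its element nodes are linked, a whole event can be recovered from any one of its nodes, which is the property the paragraph above relies on for event-level queries.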
In a possible implementation manner, if the output sequence further includes an association relation between elements of the at least one event, the element nodes in the personal knowledge graph that correspond to elements of the same event having the association relation are associated with each other. For example, the type, the elements, and the association relation of an event can be extracted from the input text, where the association relation includes an association relation between the type and/or the elements; after the type node and the element nodes are constructed, they can be connected according to the association relation, so that the complete event can be identified in the personal knowledge graph through the association relation and the event is recorded more completely. Alternatively, if the output sequence further includes an emotion type of the at least one event, the element nodes corresponding to the same event in the personal knowledge graph are associated through the emotion type. For example, the emotion type of an event can be extracted from the input text, and the nodes of the same event are connected according to the emotion type, which completes the recording of the emotion event.
Therefore, in the embodiment of the application, element nodes can be recorded completely for different types of events: for example, for attention events the element nodes can be connected according to the association relations among the elements, and for emotion events the element nodes can be connected according to the emotion type. The generalization capability is high, more types of events can be recorded through the corresponding connection modes, and the method is suitable for more application scenarios.
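The two connection modes can be sketched as labelled edges. This is only an illustration under assumed conventions: edges are stored as `(source, label, target)` triples, and the label `has_element`, the function names, and the sample values are all hypothetical.

```python
# Each edge is a (source, label, target) triple; a set avoids duplicate edges.
edges = set()


def record_attention_event(event_type, relations):
    """Connect element nodes by the association relations found between them.

    relations: list of (element_a, relation, element_b) triples.
    """
    for a, relation, b in relations:
        edges.add((event_type, "has_element", a))
        edges.add((event_type, "has_element", b))
        edges.add((a, relation, b))


def record_emotion_event(event_type, emotion, subject, target):
    """Connect the element nodes of one event through the emotion type."""
    edges.add((event_type, "has_element", subject))
    edges.add((event_type, "has_element", target))
    edges.add((subject, emotion, target))


record_attention_event("attention", [("user", "follows", "badminton league")])
record_emotion_event("emotion", "likes", "user", "jazz")
```

In both cases the edge between the element nodes carries the information that distinguishes the event, so the same graph structure can hold either kind.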
In one possible implementation manner, the output sequence may include the type and elements of a first event, the first event being any one of the at least one event, and acquiring the personal knowledge graph according to the output sequence may include: if an initial knowledge graph includes information of the first event, updating the association relations between the element nodes corresponding to the first event in the initial knowledge graph to obtain the personal knowledge graph; if the initial knowledge graph does not include information of the first event, adding the type node and the element nodes of the first event to the initial knowledge graph and associating the type node with the element nodes of the first event to obtain the personal knowledge graph.
Therefore, in the embodiment of the application, the events in the personal knowledge graph can be updated or added, so that the information included in the personal knowledge graph is enriched.
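The update-or-add branch can be sketched as follows. The sketch assumes the graph is a plain dictionary keyed by event type and that "includes information of the first event" means the event type is already present; the application does not specify the matching rule, and the name `upsert_event` is hypothetical.

```python
def upsert_event(graph, event_type, elements):
    """Update an existing event or add a new one.

    graph maps an event type (type node) to the set of its associated elements
    (element nodes). Returns "updated" or "added" to show which branch ran.
    """
    if event_type in graph:
        # The event is already recorded: update its associations.
        graph[event_type] |= set(elements)
        return "updated"
    # The event is new: add the type node and associate its element nodes.
    graph[event_type] = set(elements)
    return "added"


initial_graph = {}
print(upsert_event(initial_graph, "travel", ["Beijing"]))      # added
print(upsert_event(initial_graph, "travel", ["next Friday"]))  # updated
```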
In one possible implementation manner, an initial sequence corresponding to the input text is obtained through a text processing model, wherein the initial sequence comprises vector representations of at least one word in the input text and first class labels corresponding to the at least one word; carrying out syntactic analysis on the input text to obtain a feature sequence, wherein the feature sequence comprises a second class label corresponding to at least one word; and combining the initial sequence and the characteristic sequence to obtain an output sequence, wherein the output sequence comprises elements and types of the at least one event.
Therefore, in the embodiment of the application, the neural network and the syntactic analysis are combined to extract more accurate information from the input text, and then the more accurate information extracted from the input text is used to generate or update the personal knowledge graph of the target user, so that the personal knowledge graph can more accurately reflect the characteristics of the user, and the follow-up recommendation can be performed for the target user by using the personal knowledge graph.
In one possible implementation manner, combining the initial sequence and the feature sequence to obtain the output sequence, and then obtaining the personal knowledge graph, may include: correcting the initial sequence according to the feature sequence to obtain the output sequence; and acquiring the personal knowledge graph according to the output sequence.
Therefore, in the embodiment of the application, the initial sequence extracted through the neural network can be corrected by using the characteristic sequence, so that more accurate information can be obtained by combining information extracted from the input text in various modes, and the personal knowledge graph can be obtained by using the more accurate information, thereby obtaining the personal knowledge graph capable of describing the target user more accurately.
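A minimal sketch of the correction step is given below. It assumes one first class label (from the model) and at most one second class label (from syntactic analysis) per word, and it assumes the rule that a syntactic label, when present, overrides the model label. The rule, the function name `correct_sequence`, and the label names are illustrative assumptions, not the rule used in the application.

```python
def correct_sequence(words, model_labels, syntax_labels):
    """Correct model labels with syntactic labels, word by word.

    model_labels:  first class labels produced by the text processing model
    syntax_labels: second class labels from syntactic analysis (None if absent)
    """
    output = []
    for word, first, second in zip(words, model_labels, syntax_labels):
        # Assumed rule: trust the syntactic label whenever one is available.
        label = first if second is None else second
        output.append((word, label))
    return output


words = ["I", "visit", "Beijing"]
model_labels = ["subject", "event:travel", "O"]   # the model missed the location
syntax_labels = [None, None, "location"]          # the parser found it
print(correct_sequence(words, model_labels, syntax_labels))
# [('I', 'subject'), ('visit', 'event:travel'), ('Beijing', 'location')]
```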
In one possible embodiment, the foregoing method may further include: acquiring a first knowledge graph, wherein the first knowledge graph comprises a plurality of nodes, the plurality of nodes comprise information of at least one entity, and a node in the first knowledge graph may represent an entity or may represent an element or type of an event; acquiring, from the first knowledge graph, association information associated with nodes in the personal knowledge graph; and expanding the personal knowledge graph by using the association information to obtain an expanded personal knowledge graph.
Therefore, in the embodiment of the application, the personal knowledge graph can be expanded by using the first knowledge graph, and the data in the first knowledge graph is independent of the input data of the user, so that the personal knowledge graph contains more information, and more information can be conveniently queried in the personal knowledge graph later.
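The expansion step can be sketched with two adjacency dictionaries. The sketch assumes that "association information" means the direct neighbours, in the first knowledge graph, of nodes that also appear in the personal knowledge graph; the function name `expand` and the sample data are hypothetical.

```python
def expand(pkg, first_kg):
    """Return a copy of pkg extended with neighbours taken from first_kg.

    Both graphs map a node name to the set of node names associated with it.
    """
    expanded = {node: set(neighbours) for node, neighbours in pkg.items()}
    for node in pkg:
        for related in first_kg.get(node, ()):
            # Add the association in both directions.
            expanded[node].add(related)
            expanded.setdefault(related, set()).add(node)
    return expanded


pkg = {"travel": {"Beijing"}, "Beijing": {"travel"}}
first_kg = {"Beijing": {"Great Wall", "Forbidden City"}}
expanded_pkg = expand(pkg, first_kg)
print(sorted(expanded_pkg["Beijing"]))  # ['Forbidden City', 'Great Wall', 'travel']
```

The original `pkg` is left unchanged, so the expanded graph can be discarded or rebuilt when the first knowledge graph changes.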
In one possible implementation manner, obtaining the initial sequence corresponding to the input text through the text processing model may include: taking the input text as the input of the text processing model and outputting the initial sequence, wherein the text processing model is used for executing the following steps: performing natural language processing on the input text to obtain a feature vector sequence and an entity sequence, wherein the entity sequence comprises a vector representation corresponding to each word in the at least one word, and the feature vector sequence comprises a feature vector corresponding to the input text; acquiring position information corresponding to the vectors in the entity sequence; fusing the position information and the feature vector sequence to obtain a fused sequence; and classifying the entities corresponding to the fused sequence to obtain a tag sequence, wherein the initial sequence comprises the vector representation corresponding to each word and the tag sequence.
Therefore, in the embodiment of the application, the text can be converted into the vector representation by the neural network, and the context information of each word and the association relation among the words in the input text are extracted, so that the accurate information can be extracted from the input text.
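The fusion and classification steps can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the position information is a sinusoidal encoding of the word index, fusion is element-wise addition, and classification picks the label whose weight vector has the largest dot product. The application does not state that the model works this way.

```python
import math


def position_encoding(position, dim):
    """Sinusoidal position vector (an assumed form of position information)."""
    return [
        math.sin(position / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def fuse(features):
    """Fuse each feature vector with its position vector by element-wise addition."""
    return [
        [f + p for f, p in zip(vec, position_encoding(pos, len(vec)))]
        for pos, vec in enumerate(features)
    ]


def classify(fused, label_weights):
    """Pick, for each fused vector, the label whose weight vector scores highest."""
    tags = []
    for vec in fused:
        scores = {
            label: sum(w * x for w, x in zip(weights, vec))
            for label, weights in label_weights.items()
        }
        tags.append(max(scores, key=scores.get))
    return tags


features = [[0.1, 0.0], [2.0, -3.0]]                           # toy feature vectors
label_weights = {"element": [1.0, 0.0], "type": [0.0, 1.0]}    # toy classifier
tag_sequence = classify(fuse(features), label_weights)
print(tag_sequence)  # ['type', 'element']
```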
In one possible embodiment, the foregoing method may further include: acquiring information of at least one node matched with the output sequence from the personal knowledge graph; and generating recommendation information for the target user according to the information of at least one node, wherein the recommendation information is used for recommending to the target user.
The embodiment of the application can be applied to a recommendation scene, so that more accurate information related to the text input by the user can be efficiently retrieved by combining with a personal knowledge graph with finer granularity, thereby realizing more efficient and accurate recommendation for the user and improving user experience.
In one possible implementation manner, the acquiring information of at least one node matched with the output sequence from the personal knowledge graph may include: screening information of at least one first node corresponding to the output sequence from the personal knowledge graph; searching information of at least one second node associated with at least one first node from the personal knowledge graph, wherein the information of the at least one node comprises information of the at least one first node and information of the at least one second node. The embodiment of the application provides a specific mode for inquiring data from a personal knowledge graph.
In one possible implementation, the information of the first node and the information of the second node are information of different domains. Therefore, the embodiment of the application can realize cross-domain recommendation for the user and improve the user experience.
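The two-step query can be sketched as follows. The sketch assumes the graph is an adjacency dictionary and that each node carries a domain label; the function name `query_nodes`, the domain labels, and the sample nodes are hypothetical, and keeping only second nodes from a different domain is one way to illustrate the cross-domain case.

```python
def query_nodes(graph, domains, output_elements):
    """Return (first_nodes, second_nodes) for the elements of an output sequence.

    graph:   dict mapping a node name to the set of nodes associated with it
    domains: dict mapping a node name to a domain label
    """
    # Step 1: first nodes are the graph nodes that match the output sequence.
    first_nodes = [name for name in output_elements if name in graph]
    first_domains = {domains.get(name) for name in first_nodes}
    # Step 2: second nodes are associated with a first node; only nodes from
    # a different domain are kept, to illustrate cross-domain retrieval.
    second_nodes = sorted({
        neighbour
        for name in first_nodes
        for neighbour in graph[name]
        if neighbour not in first_nodes and domains.get(neighbour) not in first_domains
    })
    return first_nodes, second_nodes


graph = {
    "badminton": {"racket store", "sports news"},
    "racket store": {"badminton"},
    "sports news": {"badminton"},
}
domains = {"badminton": "sports", "racket store": "shopping", "sports news": "sports"}
print(query_nodes(graph, domains, ["badminton", "tennis"]))
# (['badminton'], ['racket store'])
```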
In one possible implementation manner, each node in the personal knowledge graph has a corresponding weight, and the weight of each node is negatively correlated with its storage duration or update duration, where each node is any node in the personal knowledge graph, the storage duration is the length of time for which the information of the node has been stored, and the update duration is the length of time since the information included in the node was last updated. Therefore, in the embodiment of the application, the information of the user can be recorded with weights that decay over time, which simulates how knowledge about the user is remembered.
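One way to obtain a weight that is negatively correlated with elapsed time is exponential decay. The sketch below is an assumption for illustration only: the application states the negative correlation but not the decay function, and the function name `node_weight` and the default `decay_rate` are hypothetical.

```python
import math


def node_weight(initial_weight, elapsed_days, decay_rate=0.1):
    """Weight after elapsed_days since the node was stored or last updated.

    Uses w = w0 * exp(-decay_rate * t), so the weight shrinks as t grows.
    """
    return initial_weight * math.exp(-decay_rate * elapsed_days)


fresh = node_weight(1.0, elapsed_days=0)
stale = node_weight(1.0, elapsed_days=30)
print(fresh > stale)  # True
```

Updating a node would reset its elapsed time to zero, which restores its weight and keeps recently confirmed information ranked above stale information.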
In one possible implementation manner, the generating recommendation information for the target user according to the information of the at least one node includes: sequencing at least one node according to the weight corresponding to the at least one node; and generating recommendation information according to the information of the at least one node and the ordering of the at least one node.
Therefore, in the embodiment of the application, the recommendation sequence can be arranged based on the weight, so that more effective information is recommended for the user, and the user experience is improved.
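Ordering by weight can be sketched in a few lines. The node representation (a dictionary with `info` and `weight` keys), the function name `recommend`, and the `top_k` cut-off are hypothetical choices made for this illustration.

```python
def recommend(nodes, top_k=3):
    """Order nodes by descending weight and return the info of the top_k nodes."""
    ranked = sorted(nodes, key=lambda node: node["weight"], reverse=True)
    return [node["info"] for node in ranked[:top_k]]


nodes = [
    {"info": "hiking boots", "weight": 0.4},
    {"info": "train tickets", "weight": 0.9},
    {"info": "hotel", "weight": 0.7},
]
print(recommend(nodes, top_k=2))  # ['train tickets', 'hotel']
```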
In a possible implementation manner, the acquiring the input text of the target user may include: acquiring user input data, wherein the input data comprises at least one of image, text or voice data; input text is extracted from the input data.
Therefore, in the embodiment of the application, the method and the device can be suitable for various input scenes, have strong generalization capability and improve user experience.
In one possible embodiment, the foregoing method may further include: obtaining structured data of a target user, wherein the structured data is data in a preset format; extracting information of at least one event from the structured data according to a preset rule; updating the personal knowledge graph according to the information of at least one event to obtain an updated personal knowledge graph.
Therefore, in the embodiment of the application, besides extracting information from the input text by means of a neural network and syntactic analysis, the information can be extracted from the structured data of the target user and the personal knowledge graph can be updated, so that the personal knowledge graph can be updated in more ways, and more information can be included in the personal knowledge graph.
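Rule-based extraction from structured data can be sketched as a lookup table. The sketch assumes that the preset format is a dictionary with a `source` field and that the preset rule maps each source to an event type and a list of fields to keep; the table `RULES`, the function name `extract_event`, and the sample record are all hypothetical.

```python
# Hypothetical preset rules: for each data source, the event type it yields
# and the fields that become the elements of the event.
RULES = {
    "calendar": {"type": "schedule", "fields": ["title", "start", "location"]},
}


def extract_event(record):
    """Extract an event from one structured record, or None if no rule applies."""
    rule = RULES.get(record.get("source"))
    if rule is None:
        return None
    elements = {name: record[name] for name in rule["fields"] if name in record}
    return {"type": rule["type"], "elements": elements}


record = {"source": "calendar", "title": "Team meeting", "start": "09:00", "owner": "n/a"}
print(extract_event(record))
# {'type': 'schedule', 'elements': {'title': 'Team meeting', 'start': '09:00'}}
```

The extracted type and elements could then be written into the personal knowledge graph in the same way as events extracted from input text.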
In a second aspect, the application also provides a graphical user interface (GUI), wherein the graphical user interface is stored in an electronic device, the electronic device comprises a display screen, a memory, and one or more processors, the one or more processors are used for executing one or more computer programs stored in the memory, and the graphical user interface comprises:
generating a personal knowledge graph in response to an input operation of a target user, and displaying the personal knowledge graph, wherein an input text of the target user comprises at least one word, the at least one word forms at least one event, the personal knowledge graph comprises a plurality of nodes, the plurality of nodes comprise type nodes and element nodes, the type nodes represent the types of the at least one event, the element nodes represent the elements of the at least one event, the type nodes and the element nodes of the same event are associated with each other, and the personal knowledge graph is used for making recommendations to the target user.
In one possible embodiment, the GUI may further include: displaying a permission request, wherein the permission request is used for asking whether the input text of the target user may be used to obtain the personal knowledge graph. For example, the input information of a user can be collected through an application (APP) installed on the user's intelligent terminal, and the display interface can show an option for whether to allow the input data in each APP to be collected as a knowledge source for the personal knowledge graph, which improves the privacy and security of the user's data.
In one possible embodiment, the GUI may further include: in response to acquiring, from a first knowledge graph, association information associated with the nodes in the personal knowledge graph and using the association information to expand the personal knowledge graph into an expanded personal knowledge graph, displaying the expanded personal knowledge graph, wherein the first knowledge graph comprises a plurality of nodes, the plurality of nodes comprise information of at least one entity, and a node in the first knowledge graph may represent an entity or may represent an element or type of an event.
In one possible embodiment, the GUI may further include: and displaying the first knowledge graph.
In one possible embodiment, the GUI may further include: and generating recommendation information for the target user in response to the information of at least one node acquired from the personal knowledge graph, and displaying the recommendation information, wherein the recommendation information is used for recommending the target user.
In one possible embodiment, each node in the personal knowledge graph includes a corresponding weight, the at least one node is ordered according to the corresponding weight, and the GUI may further include: the recommendation information is displayed in response to generating the recommendation information according to the information of the at least one node and the ordering of the at least one node.
In one possible embodiment, the GUI may further include: and responding to the input operation of the target user on the first input interface, displaying input text, wherein the input text is extracted from input data of the target user, and the input data comprises at least one data of images, texts or voices.
In one possible embodiment, the GUI may further include: responding to the input operation of the user for the second input interface, updating the personal knowledge graph according to the obtained structured data, and displaying the updated personal knowledge graph, wherein the structured data is in a preset format.
In a third aspect, the present application provides an information acquisition apparatus comprising:
the input module is used for acquiring an input text of a target user, wherein the input text comprises at least one word, and the at least one word forms at least one event;
the text processing module is used for acquiring an output sequence based on the input text, wherein the output sequence comprises the type and elements of the at least one event;
the acquisition module is used for acquiring a personal knowledge graph according to the output sequence, wherein the personal knowledge graph comprises a plurality of nodes, the plurality of nodes comprise type nodes and element nodes, the type nodes represent the types of the at least one event, the element nodes represent the elements of the at least one event, the type nodes and the element nodes of the same event are associated with each other, and the personal knowledge graph is used for making recommendations to the target user.
In one possible implementation manner, if the output sequence further includes an association relationship between elements of at least one event, element nodes corresponding to elements of the same event with the association relationship in the personal knowledge graph are associated with each other; if the output sequence also comprises emotion types, element nodes corresponding to the same event in the personal knowledge graph are related through the emotion types.
In a possible implementation manner, the output sequence may include the type and elements of a first event, the first event being any one of the at least one event, and the acquisition module is specifically configured to: if an initial knowledge graph includes information of the first event, update the element nodes corresponding to the first event in the initial knowledge graph and the association relations between the element nodes to obtain the personal knowledge graph; if the initial knowledge graph does not include information of the first event, add the type node and the element nodes of the first event to the initial knowledge graph and associate the type node with the element nodes of the first event to obtain the personal knowledge graph.
In one possible implementation, the text processing module is specifically configured to: obtaining an initial sequence corresponding to the input text through a text processing model, wherein the initial sequence comprises vector representations of at least one word in the input text and first class labels corresponding to the at least one word; carrying out syntactic analysis on the input text to obtain a feature sequence, wherein the feature sequence comprises a second class label corresponding to at least one word; and combining the initial sequence and the characteristic sequence to obtain an output sequence, wherein the output sequence comprises elements and types of at least one event.
In one possible implementation, the text processing module is specifically configured to: and correcting the part which is not matched with the characteristic sequence in the initial sequence to obtain an output sequence.
In one possible implementation, the text processing module is further configured to: if each word in the feature sequence corresponds to a plurality of second class labels, determining a unique second class label for each word, and obtaining an updated feature sequence.
In one possible implementation, the text processing module is specifically configured to: according to the input text, an initial sequence is obtained through a text processing model, wherein the text processing model is used for executing the following steps: performing natural language processing on the input text to obtain a feature vector sequence and an entity sequence, wherein the entity sequence comprises a vector representation corresponding to each word in at least one word, and the feature vector sequence comprises a feature vector corresponding to the input text; acquiring position information corresponding to vectors in an entity sequence; fusing the position information and the feature vector sequence to obtain a fused sequence; and classifying the entities corresponding to the fusion sequence to obtain a tag sequence, wherein the initial sequence comprises vector representations corresponding to each word and the tag sequence.
In one possible implementation manner, the device further comprises an expansion module, configured to: acquire a first knowledge graph, wherein the first knowledge graph comprises a plurality of nodes, the plurality of nodes comprise information of at least one entity, and a node in the first knowledge graph may represent an entity or may represent an element or type of an event; acquire, from the first knowledge graph, association information associated with the nodes in the personal knowledge graph; and expand the personal knowledge graph by using the association information to obtain an expanded personal knowledge graph.
In a possible implementation manner, the apparatus further includes a recommendation module, configured to: acquiring information of at least one node matched with the output sequence from the personal knowledge graph; and generating recommendation information for the target user according to the information of at least one node, wherein the recommendation information is used for recommending to the target user.
In one possible implementation, the recommendation module is specifically configured to: screening information of at least one first node corresponding to the output sequence from the personal knowledge graph; searching information of at least one second node associated with at least one first node from the personal knowledge graph, wherein the information of the at least one node comprises information of the at least one first node and information of the at least one second node.
In one possible implementation, the information of the first node and the information of the second node are information of different domains.
In one possible implementation manner, each node in the personal knowledge graph includes a corresponding weight, and the weight of each node is in a negative correlation with a storage duration or an update duration, where the storage duration is a duration of storing information of each node, and the update duration is a duration of last updating information included in each node.
In one possible implementation, the recommendation module is specifically configured to: sequencing at least one node according to the weight corresponding to the at least one node; and generating recommendation information according to the information of the at least one node and the ordering of the at least one node.
In one possible implementation, the input module is specifically configured to: acquiring user input data, wherein the input data comprises at least one of image, text or voice data; input text is extracted from the input data.
In one possible implementation manner:
the input module is further used for acquiring structured data of the target user, wherein the structured data is data in a preset format;
the acquisition module is further used for extracting information of at least one event from the structured data according to a preset rule;
the acquisition module is further used for updating the personal knowledge graph according to the information of the at least one event to obtain an updated personal knowledge graph.
In a fourth aspect, an embodiment of the present application provides an information acquisition apparatus, including: a processor and a memory, wherein the processor and the memory are interconnected through a line, and the processor invokes program code in the memory to perform the processing-related functions in the information acquisition method according to any implementation of the first aspect.
In a fifth aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory, wherein the processor and the memory are interconnected through a line, and the processor invokes program code in the memory to perform the processing-related functions in the information acquisition method according to any implementation of the first aspect.
In a sixth aspect, an embodiment of the present application provides an information acquisition apparatus, which may also be referred to as a digital processing chip or chip, the chip including a processing unit and a communication interface, the processing unit acquiring program instructions via the communication interface, the program instructions being executed by the processing unit, the processing unit being configured to perform a processing-related function as in the first aspect or any of the alternative embodiments of the first aspect.
In a seventh aspect, embodiments of the present application provide a computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect and any optional implementation of the first aspect.
In an eighth aspect, embodiments of the present application provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect and any of the alternative embodiments of the first aspect.
Drawings
FIG. 1 is a schematic diagram of an artificial intelligence subject framework for use with the present application;
FIG. 2 is a schematic diagram of a system architecture according to the present application;
fig. 3 is a schematic structural diagram of a convolutional neural network according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of an information acquisition method according to the present application;
FIG. 5 is a flowchart of another information acquisition method according to the present application;
FIG. 6 is a flowchart of another information acquisition method according to the present application;
FIG. 7 is a schematic flow chart of a neural network according to the present application;
FIG. 8 is a flowchart of another information acquisition method according to the present application;
FIG. 9 is a schematic diagram of an event record according to the present application;
FIG. 10 is a schematic flow chart of updating PKG according to the present application;
FIG. 11 is a schematic flow chart of setting weights for nodes according to the present application;
FIG. 12 is a schematic flow chart of PKG expansion according to the present application;
fig. 13 is a schematic view of an application scenario of the information acquisition method provided by the present application;
FIG. 14 is a flowchart of a recommendation rule of the information acquisition method according to the present application;
FIG. 15 is a schematic view of another application scenario of the information acquisition method provided by the present application;
FIG. 16 is a schematic diagram of another application scenario of the information acquisition method provided by the present application;
FIG. 17 is a schematic diagram of another application scenario of the information acquisition method provided by the present application;
FIG. 18 is a schematic diagram of another application scenario of the information acquisition method provided by the present application;
FIG. 19 is a schematic view of another application scenario of the information acquisition method provided by the present application;
FIG. 20 is a schematic diagram of another application scenario of the information acquisition method provided by the present application;
FIG. 21 is a schematic diagram of another application scenario of the information acquisition method provided by the present application;
FIG. 22 is a schematic diagram of another application scenario of the information acquisition method provided by the present application;
FIG. 23 is a schematic diagram of an architecture in which the information acquisition method provided by the present application is deployed in a terminal;
FIG. 24 is a flowchart of another information acquisition method according to the present application;
FIG. 25 is a schematic diagram of a PKG according to the present application;
FIG. 26 is a schematic structural diagram of an information acquisition device according to the present application;
FIG. 27 is a schematic diagram showing another information acquiring apparatus according to the present application;
FIG. 28 is a schematic structural diagram of an electronic device according to the present application;
FIG. 29 is a schematic diagram of a chip structure according to the present application.
Detailed Description
The following description of the technical solutions according to the embodiments of the present application will be given with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
The information acquisition method provided by the present application can be applied to artificial intelligence (artificial intelligence, AI) scenarios. AI is the theory, methods, techniques, and application systems that use a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines have the functions of perception, reasoning, and decision making. Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision making and reasoning, human-computer interaction, recommendation and search, basic AI theory, and the like.
Referring to FIG. 1, which is a schematic structural diagram of an artificial intelligence main framework, the framework is described below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The "intelligent information chain" reflects a series of processes from data acquisition to data processing. For example, it may be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision making, and intelligent execution and output. In this process, the data undergoes a "data-information-knowledge-wisdom" refinement process. The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (provision and processing technology implementation) to the industrial ecology of the system.
(1) Infrastructure
The infrastructure provides computing power support for the artificial intelligence system, enables communication with the outside world, and is supported by a base platform. Communication with the outside world is carried out through sensors; computing power is provided by smart chips (hardware acceleration chips such as CPU, NPU, GPU, ASIC, and FPGA); the base platform includes platform assurance and support such as a distributed computing framework and networks, and may include cloud storage and computing, interconnection networks, and the like. For example, a sensor communicates with the outside to obtain data, and the data is provided to smart chips in the distributed computing system provided by the base platform for computation.
(2) Data
Data at the layer above the infrastructure represents the data sources in the field of artificial intelligence. The data involves graphics, images, voice, and text, as well as Internet of Things data from conventional devices, including service data of existing systems and sensing data such as force, displacement, liquid level, temperature, and humidity.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
Wherein machine learning and deep learning can perform symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like on data.
Reasoning refers to the process in which a computer or an intelligent system simulates the human mode of intelligent reasoning and uses formalized information to carry out machine thinking and problem solving according to a reasoning control strategy; typical functions are searching and matching.
Decision making refers to the process of making decisions after reasoning over intelligent information, and generally provides functions such as classification, ranking, and prediction.
(4) General capability
After the data has been processed, some general-purpose capabilities can be formed based on the result of the data processing, such as algorithms or a general-purpose system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, etc.
(5) Intelligent product and industry application
Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They encapsulate the overall artificial intelligence solution, turn intelligent information decision making into products, and put it into practical use. The main application fields include intelligent terminals, intelligent transportation, intelligent healthcare, autonomous driving, smart cities, and the like.
Embodiments of the present application relate to neural networks and related applications of natural language processing (natural language processing, NLP), and in order to better understand the schemes of the embodiments of the present application, related terms and concepts of the neural networks to which the embodiments of the present application may relate are first described below.
Corpus (Corpus): also referred to as free text, which may be words, sentences, fragments, articles, and any combination thereof. For example, "weather today is good" is a section of corpus.
Entity: an object that exists in a corpus. For example, the corpus "Xiaoming walks the dog" may include the entities "Xiaoming" and "dog". Each entity has one or more corresponding categories; for example, "Xiaoming" corresponds to "person" and "dog" corresponds to "animal".
Self-attention model (self-attention model): a model that effectively encodes sequence data (such as the natural-language corpus "your mobile phone is very good") into several multidimensional vectors on which numerical operations can conveniently be performed. The multidimensional vectors fuse the similarity information between the elements in the sequence, and this similarity is called self-attention.
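The idea of fusing similarity information across a sequence can be illustrated with a minimal sketch of scaled dot-product self-attention. This is an assumption-laden illustration rather than the model used in the application: the query, key, and value projections are taken to be the identity, and the two-dimensional token vectors are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections (assumption).

    Each output vector is a similarity-weighted average of all input vectors.
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Similarity of this element to every element in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # Fuse the whole sequence into one vector using the attention weights.
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors))
                        for i in range(dim)])
    return outputs

# Hypothetical 2-dimensional embeddings for a three-token sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoded = self_attention(tokens)
```

Practical self-attention layers learn separate projection matrices for queries, keys, and values; the identity projection here only keeps the sketch short.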
Loss function (loss function): also referred to as a cost function (cost function), a measure of the difference between the predicted output of a machine learning model on a sample and the true value of the sample (also referred to as the supervision value). Common loss functions include the squared error, cross-entropy, logarithmic, and exponential loss functions. For example, the mean square error can be used as a loss function, defined as $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$, where $y_i$ is the true value of the $i$-th sample, $\hat{y}_i$ is the predicted output for that sample, and $N$ is the number of samples. The specific loss function can be selected according to the actual application scenario.
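Two of the loss functions named above, the mean square error and the cross-entropy, can be sketched as follows. The function names and the clipping constant `eps` are illustrative assumptions, and the cross-entropy shown is the binary form.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])
bce = binary_cross_entropy([1.0], [0.5])
```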
Gradient: the derivative vector of the loss function with respect to the parameter.
Stochastic gradient: the number of samples in machine learning is large, so each time the loss function is computed from a randomly sampled subset of the data, and the corresponding gradient is called a stochastic gradient.
Back propagation (back propagation, BP): an algorithm that computes the gradients of the model parameters from the loss function and updates the model parameters accordingly. During training, a neural network can use the back propagation algorithm to correct the parameters of the initial neural network model so that the reconstruction error loss of the model becomes smaller and smaller. Specifically, the input signal is propagated forward until the output is produced, which yields an error loss, and the parameters of the initial neural network model are updated by propagating the error loss information backward so that the error loss converges. The back propagation algorithm is a backward pass driven by the error loss and aims to obtain the parameters of the optimal neural network model, such as the weight matrix.
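The forward pass, the backward pass, and the parameter update can be sketched on a deliberately tiny network. The two-parameter model y = w2 * tanh(w1 * x), the squared-error loss, the learning rate, and the initial weights are all assumptions chosen to keep the chain rule visible; they are not taken from the application.

```python
import math

def forward(w1, w2, x):
    """Forward pass: hidden activation h, then output y."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss_and_grads(w1, w2, x, target):
    """Squared-error loss and its gradients, obtained by the chain rule."""
    h, y = forward(w1, w2, x)
    loss = (y - target) ** 2
    dy = 2.0 * (y - target)      # dLoss/dy
    dw2 = dy * h                 # dLoss/dw2
    dh = dy * w2                 # error propagated back to the hidden unit
    dz = dh * (1.0 - h * h)      # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    dw1 = dz * x                 # dLoss/dw1
    return loss, dw1, dw2

def train(x, target, w1=0.5, w2=0.5, lr=0.1, steps=200):
    """Gradient descent: repeat forward pass, backward pass, and update."""
    losses = []
    for _ in range(steps):
        loss, dw1, dw2 = loss_and_grads(w1, w2, x, target)
        losses.append(loss)
        w1 -= lr * dw1
        w2 -= lr * dw2
    return w1, w2, losses

w1, w2, losses = train(x=1.0, target=0.8)
```

Comparing `dw1` against a finite-difference estimate of the loss is a common way to check a hand-written backward pass.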
Neural machine translation (neural machine translation): neural machine translation is a typical task of natural language processing. The task is a technique of giving a sentence in a source language and outputting a sentence in a target language corresponding to the sentence. In a common neural machine translation model, words in sentences in a source language and a target language are encoded into vector representations, and association between words and between sentences is calculated in a vector space, so that translation tasks are performed.
Pre-trained language model (pre-trained language model, PLM): a natural language sequence encoder that encodes each word in a natural language sequence into a vector representation for use in prediction tasks. Training a PLM involves two stages: a pre-training stage and a fine-tuning (fine-tuning) stage. In the pre-training stage, the model is trained on language model tasks over large-scale unsupervised text and thereby learns word representations. In the fine-tuning stage, the model is initialized with the parameters learned in the pre-training stage and trained for a small number of steps on downstream tasks (downstream task) such as text classification (text classification) or sequence labeling (sequence labeling), so that the semantic information obtained in pre-training is transferred to the downstream tasks.
Embedding: refers to a characteristic representation of a sample.
BiLSTM+CRF: a neural-network-based named entity recognition model that operates on character embeddings and word embeddings. BiLSTM and CRF are two different layers in the named entity recognition model.
Sigmoid multi-label classification model: a model in which the label of a sample is not limited to a single category; a sample may belong to multiple categories, and the categories may be associated with one another. For example, a garment may have both the long-sleeved attribute and the lace attribute; the two attribute labels are not mutually exclusive but associated.
Schema: a data model that defines the format of data to be added to a knowledge graph. It corresponds to a particular domain and contains the meaningful concept types in that domain and the attributes of those types. It is mainly used to standardize the expression of structured data: a piece of data is allowed to be updated into the knowledge graph only if it conforms to the entity objects and entity types predefined by the Schema.
Elasticsearch: a distributed, highly scalable, highly real-time search and data analysis engine. It makes it convenient to search, analyze, and explore large amounts of data, and its horizontal scalability makes data more valuable in a production environment. Its operation can be divided into the following steps: first, a user submits data to the Elasticsearch database; a word segmentation controller then segments the corresponding sentences, and the weights are stored together with the segmentation results; when the user searches the data, the results are scored and ranked according to the weights, and the ranked results are returned to the user.
Transformers library: provides models for natural language understanding (natural language understanding, NLU) or natural language generation (natural language generation, NLG), such as BERT (bidirectional encoder representations from transformers), GPT-2, RoBERTa, XLM, DistilBERT, XLNet, and CTRL. It includes many pre-trained models and supports multiple languages.
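A minimal sketch of sigmoid multi-label prediction follows. Each label gets its own independent probability, so several labels can be selected for one sample. The logits, the 0.5 threshold, and the third label "hooded" are hypothetical values introduced only for illustration.

```python
import math

def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(logits, label_names, threshold=0.5):
    """Keep every label whose independent probability reaches the threshold."""
    return [name for name, z in zip(label_names, logits)
            if sigmoid(z) >= threshold]

labels = ["long-sleeved", "lace", "hooded"]   # "hooded" is a made-up extra label
logits = [2.1, 0.7, -1.5]                     # hypothetical model outputs
predicted = predict_labels(logits, labels)
```

Unlike a softmax classifier, the per-label probabilities here do not need to sum to one, which is what allows non-exclusive labels.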
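The gatekeeping role of a Schema can be sketched as a check that runs before any record is written to the knowledge graph. The schema contents, the record layout (`entity_type` plus `attributes`), and the list-based graph are assumptions made for this illustration.

```python
# Hypothetical schema: entity type -> required attributes and their Python types.
SCHEMA = {
    "Person": {"name": str, "age": int},
    "Event": {"category": str, "time": str},
}

def conforms(record, schema=SCHEMA):
    """Return True only if the record matches a predefined entity type exactly."""
    expected = schema.get(record.get("entity_type"))
    if expected is None:
        return False
    attrs = record.get("attributes", {})
    if set(attrs) != set(expected):
        return False
    return all(isinstance(attrs[key], typ) for key, typ in expected.items())

def update_graph(graph, record):
    """Append the record to the graph only when it satisfies the schema."""
    if not conforms(record):
        return False
    graph.append(record)
    return True

graph = []
accepted = update_graph(graph, {"entity_type": "Person",
                                "attributes": {"name": "Xiaoming", "age": 30}})
rejected = update_graph(graph, {"entity_type": "Person",
                                "attributes": {"name": "Xiaoming"}})
```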
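The index-then-rank flow described above can be illustrated with a toy inverted index. This is not the Elasticsearch API: the class `TinyIndex`, whitespace word segmentation, and term frequency as the stored weight are all simplifying assumptions.

```python
from collections import Counter, defaultdict

class TinyIndex:
    """Toy inverted index: term -> {document id: weight}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        # Segment the text into words and store each term with its weight
        # (here simply the term frequency in the document).
        for term, count in Counter(text.lower().split()).items():
            self.postings[term][doc_id] = count

    def search(self, query):
        # Score each document by the summed weights of the matching terms,
        # then return document ids ranked by score (ties broken by id).
        scores = Counter()
        for term in query.lower().split():
            for doc_id, weight in self.postings.get(term, {}).items():
                scores[doc_id] += weight
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [doc_id for doc_id, _ in ranked]

index = TinyIndex()
index.add("a", "apple apple pie")
index.add("b", "apple tart")
results = index.search("apple")
```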
The natural language processing method provided by the embodiments of the present application may be executed on a server or on a terminal device. The terminal device may be a mobile phone with an image processing function, a tablet personal computer (tablet personal computer, TPC), a media player, a smart television, a laptop computer (laptop computer, LC), a personal digital assistant (personal digital assistant, PDA), a personal computer (personal computer, PC), a camera, a video camera, a smart watch, a wearable device (wearable device, WD), an autonomous vehicle, or the like, which is not limited in the embodiments of the present application.
Referring to fig. 2, an embodiment of the present application provides a system architecture 200. The system architecture includes a database 230 and a client device 240. The data collection device 260 is configured to collect data and store the data in the database 230, and the training module 202 generates the target model/rule 201 based on the data maintained in the database 230. How the training module 202 obtains the target model/rule 201 based on the data will be described in more detail below, and the target model/rule 201 is the neural network mentioned in the following embodiments of the present application, and specific reference is made to the relevant descriptions in fig. 4A-12 below.
The computing module may include a training module 202, and the target models/rules obtained by the training module 202 may be applied in different systems or devices. In fig. 2, the executing device 210 is configured with a transceiver 212, where the transceiver 212 may be a wireless transceiver, an optical transceiver, or a wired interface (e.g., I/O interface), etc., to interact with external devices, and a "user" may input data to the transceiver 212 through the client device 240, e.g., the client device 240 may send a target task to the executing device 210, request the executing device to train a neural network, and send a database for training to the executing device 210.
The execution device 210 may call data, code, etc. in the data storage system 250, or may store data, instructions, etc. in the data storage system 250.
The calculation module 211 processes the input data using the target model/rule 201. Specifically, the calculation module 211 is configured to: acquire an input text of a target user, where the input text includes at least one word and the at least one word forms at least one event; obtain an output sequence based on the input text, where the output sequence includes the type and elements of the at least one event; and obtain a personal knowledge graph according to the output sequence. The personal knowledge graph includes a plurality of nodes, including type nodes and element nodes. A type node represents the type of an event, and an element node represents an element of an event; within the same event, the type node corresponding to the type is associated with the element nodes corresponding to the elements. The personal knowledge graph is used to make recommendations for the target user.
Finally, transceiver 212 returns the constructed neural network to client device 240 to deploy the neural network in client device 240 or other devices.
Further, the training module 202 may obtain corresponding target models/rules 201 based on different data for different tasks to provide better results to the user.
In the case shown in FIG. 2, the data input to the execution device 210 may be determined from the user's input; for example, the user may operate in an interface provided by the transceiver 212. In another case, the client device 240 may automatically input data to the transceiver 212 and obtain the result; if automatic input of data by the client device 240 requires the user's authorization, the user may set the corresponding permission in the client device 240. The user may view the result output by the execution device 210 on the client device 240, and the result may be presented as a display, a sound, an action, or the like. The client device 240 may also serve as a data collection terminal and store the collected data associated with the target task in the database 230.
The training or updating process referred to in the present application may be performed by the training module 202. It will be appreciated that the training process of the neural network learns the manner in which the spatial transformations are controlled, and more particularly the weight matrix. The objective of training the neural network is to make the output of the neural network as close as possible to the expected value, so the weight vector of each layer of the neural network can be updated by comparing the predicted value and the expected value of the current network and according to the difference between the predicted value and the expected value (of course, the weight vector can be initialized before the first update, that is, the pre-configured parameters of each layer in the deep neural network). For example, if the predicted value of the network is too high, the values of the weights in the weight matrix are adjusted to decrease the predicted value, and the adjustment is continued until the value output by the neural network approaches or equals the desired value. Specifically, the difference between the predicted and expected values of the neural network may be measured by a loss function (loss function) or an objective function (objective function). By way of example of a loss function, training of a neural network can be understood as a process that reduces loss as much as possible, with higher output values (loss) of the loss function indicating greater differences.
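The adjustment described above, where a prediction that is too high causes the weight to be lowered until the output approaches the expected value, can be sketched with a single weight. The model `predicted = w * x`, the squared-error loss, and the learning rate are assumptions for illustration only.

```python
def train_step(w, x, expected, lr=0.1):
    """One update: compare prediction with the expected value and adjust w."""
    predicted = w * x
    loss = (predicted - expected) ** 2
    grad = 2.0 * (predicted - expected) * x   # positive when prediction is too high
    return w - lr * grad, loss

w = 3.0            # initial weight; predicts 3.0 where 2.0 is expected
history = []
for _ in range(50):
    w, loss = train_step(w, x=1.0, expected=2.0)
    history.append(loss)
```

Because the first prediction is too high, the gradient is positive and each step lowers `w`; the recorded loss shrinks as the output approaches the expected value.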
As shown in FIG. 2, the target model/rule 201 is obtained through training by the training module 202. In the embodiments of the present application, the target model/rule 201 may be the self-attention model of the present application, which may include a deep convolutional neural network (deep convolutional neural networks, DCNN), a recurrent neural network (recurrent neural network, RNN), and so on. The neural networks mentioned in the present application may be of various types, such as a deep neural network (deep neural network, DNN), a convolutional neural network (convolutional neural network, CNN), a recurrent neural network (recurrent neural network, RNN), a residual network, or other neural networks.
Wherein during the training phase, database 230 may be used to store a sample set for training. The execution device 210 generates a target model/rule 201 for processing the samples and iteratively trains the target model/rule 201 using the sample set in the database to obtain a mature target model/rule 201, the target model/rule 201 being embodied as a neural network. The neural network obtained by the execution device 210 may be applied to different systems or devices.
In the inference phase, the execution device 210 may call data, code, etc. in the data storage system 250, or may store data, instructions, etc. in the data storage system 250. The data storage system 250 may be disposed in the execution device 210, or the data storage system 250 may be an external memory with respect to the execution device 210. The calculation module 211 may process the samples acquired by the execution device 210 through the neural network to obtain a prediction result, where a specific expression form of the prediction result is related to a function of the neural network.
It should be noted that fig. 2 is only an exemplary schematic diagram of a system architecture according to an embodiment of the present application, and the positional relationship between devices, apparatuses, modules, etc. shown in the drawings is not limited in any way. For example, in FIG. 2, data storage system 250 is external memory to execution device 210, and in other scenarios, data storage system 250 may be located within execution device 210.
The target model/rule 201 obtained by training according to the training module 202 may be applied to different systems or devices, such as a mobile phone, a tablet computer, a notebook computer, an augmented reality (augmented reality, AR)/Virtual Reality (VR), a vehicle-mounted terminal, etc., and may also be a server or cloud device, etc.
In the embodiments of the present application, the target model/rule 201 may be the self-attention model of the present application. Specifically, the self-attention model provided in the embodiments may include a CNN, a deep convolutional neural network (deep convolutional neural networks, DCNN), a recurrent neural network (recurrent neural network, RNN), and so on.
Referring to FIG. 3, an embodiment of the present application also provides a system architecture 300. The execution device 210 is implemented by one or more servers, optionally in cooperation with other computing devices such as data storage devices, routers, and load balancers. The execution device 210 may be disposed at one physical site or distributed across multiple physical sites. The execution device 210 may use the data in the data storage system 250 or invoke the program code in the data storage system 250 to implement the steps of the information acquisition method of the present application corresponding to FIG. 4 to FIG. 25 below.
The user may operate respective user devices (e.g., local device 301 and local device 302) to interact with the execution device 210. Each local device may represent any computing device, such as a personal computer, a computer workstation, a smartphone, a tablet, a smart camera, a smart car, a cellular phone of another type, a media consumption device, a wearable device, a set-top box, or a game console.
The local device of each user may interact with the execution device 210 through a communication network of any communication mechanism/communication standard, which may be a wide area network, a local area network, a point-to-point connection, or any combination thereof. Specifically, the communication network may include a wireless network, a wired network, or a combination of the two. The wireless network includes, but is not limited to, any one or a combination of: a fifth-generation mobile communication technology (5th-Generation, 5G) system, a long term evolution (long term evolution, LTE) system, a global system for mobile communication (global system for mobile communication, GSM) network, a code division multiple access (code division multiple access, CDMA) network, a wideband code division multiple access (wideband code division multiple access, WCDMA) network, wireless fidelity (wireless fidelity, WiFi), Bluetooth (bluetooth), Zigbee, radio frequency identification (radio frequency identification, RFID), long range (Lora) wireless communication, and near field communication (near field communication, NFC). The wired network may include an optical fiber communication network, a network of coaxial cables, or the like.
In another implementation, one or more aspects of the execution device 210 may be implemented by each local device, e.g., the local device 301 may provide local data or feedback calculations to the execution device 210.
It should be noted that all functions of the execution device 210 may also be implemented by the local device. For example, the local device 301 implements the functions of the execution device 210 and provides services to its own users, or to the users of the local devices 302.
In general, the features of a user may be represented by a user profile, which may be divided into a basic profile and a preference profile. The basic profile may include labels based on objective facts, such as registration time, channel source, and the region where the user is located, as well as labels generated by predicting attributes of the user, such as gender, age, and whether the user owns a car (a relatively accurate model may be trained on a labeled data set (user features and labels), and the trained model may then be used to score and predict users whose gender and age are unknown).
Therefore, the present application provides an information acquisition method that combines a neural network with symbolic analysis to extract information about a user and constructs a personal knowledge graph of the user, so that more accurate and detailed user information is stored at a finer granularity. The method specifically includes: acquiring an input text of a target user, where the input text includes at least one word and the at least one word forms at least one event; obtaining an output sequence based on the input text, where the output sequence includes the type and elements of each of the at least one event; and obtaining a personal knowledge graph according to the output sequence. The output sequence may be obtained in several ways; for example, the types and elements of the events contained in the input text may be determined by syntactic analysis or output by a neural network. The personal knowledge graph includes a plurality of nodes, including type nodes and element nodes. A type node represents the type of an event, and an element node represents an element of an event; within the same event, the type node corresponding to the type is associated with the element nodes corresponding to the elements. The personal knowledge graph is used to make recommendations for the target user.
Therefore, in the embodiments of the present application, the types and elements of the events generated by the target user are extracted in units of events and a knowledge graph is constructed, so that each event of the target user can be stored more conveniently and accurately and the knowledge related to the target user can be recorded more accurately. When recommendations are later made for the target user, information can be queried in units of events, and a complete event can be retrieved through the associations between nodes, which improves the accuracy of data queries and the effectiveness of recommendations.
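The relationship between type nodes and element nodes can be sketched as a small in-memory graph. The class name, the node identifier scheme, the element roles ("buyer", "item"), and the shape of the output sequence are hypothetical; the application does not prescribe this data layout.

```python
from collections import defaultdict

class PersonalKnowledgeGraph:
    """Toy graph in which each event has one type node linked to its element nodes."""

    def __init__(self):
        self.nodes = {}                 # node id -> node attributes
        self.edges = defaultdict(set)   # node id -> ids of associated nodes

    def add_event(self, event_id, event_type, elements):
        type_id = f"{event_id}:type"
        self.nodes[type_id] = {"kind": "type", "value": event_type}
        for role, value in elements.items():
            elem_id = f"{event_id}:{role}"
            self.nodes[elem_id] = {"kind": "element", "role": role, "value": value}
            # Associate the type node and the element node of the same event.
            self.edges[type_id].add(elem_id)
            self.edges[elem_id].add(type_id)
        return type_id

    def events_of_type(self, event_type):
        """Recover complete events by following type-to-element associations."""
        found = []
        for node_id, node in self.nodes.items():
            if node["kind"] == "type" and node["value"] == event_type:
                found.append({self.nodes[e]["role"]: self.nodes[e]["value"]
                              for e in self.edges[node_id]})
        return found

# Hypothetical output sequence: one event with its type and elements.
output_sequence = [
    {"type": "purchase", "elements": {"buyer": "Xiaohong", "item": "apple"}},
]
pkg = PersonalKnowledgeGraph()
for i, event in enumerate(output_sequence):
    pkg.add_event(f"event{i}", event["type"], event["elements"])
```

Querying by event type returns the whole event at once, which is the event-level lookup the paragraph above describes.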
The information acquisition method provided by the application is described in detail below.
Referring to FIG. 4, a schematic flowchart of an information acquisition method provided by the present application is described as follows.
401. Acquire the input text of the target user.
Wherein the input text may be derived from data entered by the target user.
Specifically, the input data of the target user may be acquired, and the input text is then extracted from the input data. The input data may be acquired in several ways: for example, data entered by the user through a terminal interface may be acquired, data entered by the user may be received from another device, or the user's historical input data may be queried from historical data.
For example, one or more kinds of data input by the user, such as an image, voice, or text, may be received and recognized so that the input text is extracted from the input data. If the data input by the user is an image, the image may be recognized and text extracted from it; if the data is voice, speech recognition may be performed to extract text from the voice data; if the data is text, the text may be used directly as the input text, or it may be translated and the translated text used as the input text. The method provided by the present application can therefore be applied to multiple input modes and more scenarios and has strong generalization capability.
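The handling of different input modes in step 401 can be sketched as a simple dispatch. The function name and the `ocr`, `asr`, and `translate` callables are hypothetical placeholders for whatever image recognition, speech recognition, and translation components an implementation actually uses.

```python
def extract_input_text(data, kind, ocr=None, asr=None, translate=None):
    """Turn image, voice, or text input into the input text.

    ocr, asr, and translate are caller-supplied callables (assumptions);
    no specific recognition library is implied.
    """
    if kind == "image":
        text = ocr(data)          # recognize text in the image
    elif kind == "voice":
        text = asr(data)          # speech recognition
    elif kind == "text":
        text = data               # text can be used directly
    else:
        raise ValueError(f"unsupported input kind: {kind}")
    # Optionally translate before using the result as the input text.
    return translate(text) if translate else text

plain = extract_input_text("hello", "text")
from_image = extract_input_text(b"\x00", "image", ocr=lambda d: "from image")
```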
402. Obtain an initial sequence of the input text through a text processing model.
The text processing model is used for extracting information from an input text and outputting the extracted information in a vector form to obtain an initial sequence.
Specifically, the text processing model may be used to extract entities and the classification labels corresponding to the entities from the input text to obtain the initial sequence. That is, the initial sequence may include information about the entities extracted from the input text, the classification labels corresponding to the entities, the associations between the entities, and so on. If there are multiple entities, they may form one or more events, and the text processing model may extract the vector representation of each entity in the input text, the contextual meaning of each entity, the associations between the entities, and so on.
In one possible implementation, the initial sequence may include an entity sequence and a tag sequence, and the steps specifically executed by the text processing model may include: performing natural language processing on the input text to obtain a feature vector sequence and an entity sequence, wherein the entity sequence comprises a vector representation corresponding to each word in at least one word, and the feature vector sequence comprises a feature vector corresponding to the input text; acquiring position information corresponding to vectors in an entity sequence; fusing the position information and the feature vector sequence to obtain a fused sequence; and classifying the entities corresponding to the fusion sequences to obtain tag sequences. Therefore, in the embodiment of the application, the entity in the input text and the meaning represented by the entity can be extracted through the neural network, so that the information can be extracted from the input text efficiently and quickly.
The text processing model may include one or more models for extracting information from text. For example, the text processing model may include a pre-trained language model such as BERT or a self-attention model for converting text into a vector representation, and a BiLSTM+CRF model, a sigmoid model, or the like for further processing the vector representation, so that usable information can be extracted from the text.
403. Perform syntactic analysis on the input text to obtain a feature sequence.
In addition to extracting information included in the input text through the neural network, the input text may be parsed to extract a feature sequence from the input text, where the feature sequence may include entities included in the input text, association relationships between the entities, and the like.
For example, the input text may be "Reddish buys apples". Through syntactic analysis, the entities "Reddish" and "apples" can be extracted from the input text, the relationship between the entities is "buys", and the time is "now". The actual meaning (or category) represented by each entity may be further determined, for example that "Reddish" is a person and that "apple" is a fruit or a mobile phone.
It is understood that, in addition to extracting information included in the input text through the neural network, information of entities included in the input text may be analyzed by performing a syntactic analysis on the input text. Therefore, the information obtained in the two modes can be combined later to finally obtain more accurate information, and more accurate information can be extracted from the input text.
It should be noted that the application does not limit the execution order of step 402 and step 403: step 402 may be executed first, step 403 may be executed first, or the two steps may be executed simultaneously, as adjusted for the actual application scenario.
In addition, after the features corresponding to each word in the input text are obtained through syntactic analysis, each entity may correspond to one or more features. Additional information may be added to the features corresponding to each word according to a preset format, so as to identify the unique meaning represented by each word or entity and obtain an updated feature sequence. For example, if the entities include "apple", additional information may be added to specify whether the entity is a fruit or a mobile phone, such as adding "mobile phone" to the feature sequence to indicate that this "apple" is a mobile phone, so that the unique meaning represented by each entity can be determined more accurately.
In addition, if a personal knowledge graph already exists, the preset format and the initial personal knowledge graph can be combined to query the limiting features of an entity. For example, if the input text is "Reddish eats apples", the specific type indicated by "apples" can be queried in the personal knowledge graph in combination with the preset grammar format; if that type is fruit rather than device, an additional feature classifying the entity "apples" as "fruit" can be added.
404. Acquire the personal knowledge graph according to the feature sequence and the output sequence.
After the feature sequence and the output sequence are obtained, the initial knowledge graph can be updated or the personal knowledge graph can be generated according to the feature sequence and the output sequence. The personal knowledge graph can comprise one or more nodes, each node can comprise information extracted from data input by a target user, for example, each node can comprise information such as event types or event elements extracted from input texts, and the nodes with association relations are connected with each other. The personal knowledge graph may be used to represent characteristics of the target user, or may be used to record information related to the target user, such as information of the target user or information input by the target user, and the like.
Specifically, the personal knowledge graph may include a plurality of nodes, which may be divided into type nodes and element nodes. The type nodes represent the types of events, the element nodes represent the elements of events, and the type node and element nodes of the same event are associated. For example, if the input is "Reddish plans to watch a movie tomorrow", the entities "Reddish" and "movie" can be extracted and the time is "tomorrow". The time and the entities are event elements, and the type of the event is "entertainment", so the type node "entertainment" and the element nodes "Reddish", "movie", and "tomorrow" can be established, with the type node and the element nodes of this event associated with each other.
Therefore, in the embodiment of the application, the types and elements of the events generated by the target user are extracted with the event as the unit and the knowledge graph is constructed from them, so that each event of the target user can be stored more conveniently and accurately, and knowledge related to the target user can be recorded more accurately. When recommendations are later made for the target user, information can be queried with the event as the unit, and the complete event can be retrieved through the association relationships among the nodes, which improves the accuracy of data queries and the effectiveness of recommendation. In addition, the neural network and the syntactic analysis are combined to extract more accurate information from the input text, and this information is used to generate or update the personal knowledge graph of the target user, so that the personal knowledge graph reflects the characteristics of the user more accurately and can be used for subsequent recommendation for the target user. Furthermore, the personal knowledge graph is constructed for the user and can be constructed or updated based on the entities extracted from the input text, and searching by node is more efficient, so that recommendation for the user can be performed more efficiently.
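Illustratively, the type-node and element-node structure described above can be sketched as follows. This is a minimal sketch only: the class name `PersonalKnowledgeGraph`, the method `add_event`, the role names, and the node-id format are hypothetical and are not defined by the application.

```python
from collections import defaultdict


class PersonalKnowledgeGraph:
    """Toy graph: nodes keyed by id, undirected edges kept as adjacency sets."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"kind": ..., "value": ...}
        self.edges = defaultdict(set)   # node id -> ids of associated nodes

    def add_node(self, node_id, kind, value):
        self.nodes[node_id] = {"kind": kind, "value": value}

    def connect(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)

    def add_event(self, event_id, event_type, elements):
        """Create one type node for the event and one element node per element."""
        type_id = f"{event_id}:type"
        self.add_node(type_id, "type", event_type)
        for role, value in elements:
            elem_id = f"{event_id}:{role}:{value}"
            self.add_node(elem_id, "element", value)
            self.connect(type_id, elem_id)   # same event: type node <-> element node
        return type_id


pkg = PersonalKnowledgeGraph()
type_id = pkg.add_event(
    "e1", "entertainment",
    [("person", "Reddish"), ("object", "movie"), ("time", "tomorrow")],
)
neighbours = sorted(pkg.nodes[n]["value"] for n in pkg.edges[type_id])
```

Here the "entertainment" type node ends up associated with the three element nodes of the same event, so the whole event can be recovered by following the edges from its type node.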
In one possible implementation, the specific manner of obtaining the personal knowledge graph may include: correcting the initial sequence according to the feature sequence to obtain an output sequence; and acquiring the personal knowledge graph according to the output sequence. Specifically, the information included in the feature sequence and in the initial sequence may be matched. If they do not match, the unmatched portion of the initial sequence may be corrected, for example by replacing it with the corresponding portion of the feature sequence, or by fusing it with the corresponding portion of the feature sequence and replacing it with the fused result, so as to obtain the output sequence.
Therefore, in the embodiment of the application, the initial sequence can be corrected using the feature sequence, so that more accurate information is obtained by combining the information extracted from the input text in several ways, and this more accurate information is used to obtain a personal knowledge graph that describes the target user more accurately.
In one possible embodiment, the output sequence includes an association between at least one word forming at least one event, the at least one word including an element of the at least one event. Further, the personal knowledge graph may be constructed in units of events. Specifically, the type of at least one event, such as a calendar event class, a focus event class, can be obtained from the output sequence; the information of each event can be acquired from the corrected entity sequence according to the type of each event in at least one event; and then updating the initial knowledge graph by using the information of each event to obtain the personal knowledge graph.
In the embodiment of the application, the personal knowledge graph can be generated or updated by taking the event as a unit, so that when the information is queried in the subsequent personal knowledge graph, the required information can be queried rapidly by taking the event as a unit, and the query efficiency is improved.
Specific ways of obtaining the personal knowledge graph may include the following, taking a first event as an example. If the initial knowledge graph includes information on the first event, the information on the first event in the initial knowledge graph is updated using the output sequence, for example by adding element nodes for the first event and connecting the element nodes that have an association relationship, to obtain the personal knowledge graph. If the initial knowledge graph does not include information on the first event, the information on the first event included in the output sequence is added to the initial knowledge graph, for example by adding a type node and element nodes for the first event, connecting the element nodes that have an association relationship, and connecting the element nodes with the type node, to obtain the personal knowledge graph.
Specifically, the elements of each event and the association relationships between those elements may be obtained from the entity sequence, and the element nodes connected accordingly; alternatively, the features of each event and the corresponding emotion category may be obtained from the entity sequence. It can be understood that if the output sequence includes the association relationships between the elements of each event, the element nodes in the personal knowledge graph that correspond to associated elements of the same event are connected to each other; if the output sequence also includes emotion categories, the element nodes corresponding to the same event in the personal knowledge graph are associated through the emotion category.
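Illustratively, the two branches described above (the event is already present, or the event is new) can be sketched as a single update-or-add operation. In this sketch the graph is reduced to a dictionary keyed by a hypothetical event key; how an event is actually identified in the knowledge graph is not specified by the application and is an assumption here, as is the function name `upsert_event`.

```python
def upsert_event(graph, event):
    """Update an existing event or add a new one.

    `graph` maps a (hypothetical) event key to a record holding the event type
    and the set of element nodes of that event.
    """
    key = event["key"]
    record = graph.get(key)
    if record is None:
        # Event not yet in the graph: add its type node and all element nodes.
        graph[key] = {"type": event["type"], "elements": set(event["elements"])}
        return "added"
    # Event already present: only add the element nodes that are missing.
    record["elements"] |= set(event["elements"])
    return "updated"


initial_graph = {}
first = upsert_event(initial_graph, {
    "key": "movie-night", "type": "entertainment",
    "elements": [("person", "Reddish"), ("object", "movie")],
})
second = upsert_event(initial_graph, {
    "key": "movie-night", "type": "entertainment",
    "elements": [("time", "tomorrow")],
})
```

The first call adds the event, and the second call only attaches the new element ("time", "tomorrow") to the existing event.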
Therefore, different event related information can be acquired according to different event types, more scenes can be adapted, and generalization capability is strong.
Furthermore, in one possible implementation, a first knowledge graph may also be used to expand the personal knowledge graph of the target user. Specifically, a first knowledge graph is obtained, where the first knowledge graph includes a plurality of nodes, each node has at least one associated node, a node in the first knowledge graph may represent an entity or may represent an element or type of an event, and entities having association relationships are connected. Association information associated with the nodes in the personal knowledge graph can be acquired from the first knowledge graph, and the personal knowledge graph is expanded using the association information to obtain an expanded personal knowledge graph. For example, the first knowledge graph is searched for a node that represents the same entity as a node in the personal knowledge graph, the information of the nodes associated with that node is then looked up in the first knowledge graph, and the personal knowledge graph is expanded using this information.
Optionally, the first knowledge graph may be a general knowledge graph or the knowledge graph of another user, so that the content of the personal knowledge graph of the target user can be expanded through multiple graphs. For example, when the first knowledge graph is a general knowledge graph, each node in it may represent an entity; when the first knowledge graph includes the personal knowledge graphs of other users, each node in it may represent an element or type of an event, or the like.
Therefore, in the embodiment of the application, the first knowledge graph can be used for expanding the personal knowledge graph, so that the personal knowledge graph contains more information, and more information can be conveniently queried in the personal knowledge graph later.
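Illustratively, the expansion step can be sketched as below, assuming both graphs are represented as adjacency dictionaries that map an entity name to the set of associated entity names. The representation, the function name `expand_graph`, and the sample entities are assumptions made for illustration only.

```python
def expand_graph(personal, general):
    """Return a copy of `personal` extended with neighbours found in `general`."""
    expanded = {node: set(neigh) for node, neigh in personal.items()}
    for node in personal:
        # The same entity found in the first knowledge graph: copy its neighbours.
        for related in general.get(node, ()):
            expanded[node].add(related)
            expanded.setdefault(related, set()).add(node)
    return expanded


personal_graph = {"Reddish": {"movie"}, "movie": {"Reddish"}}
general_graph = {"movie": {"cinema", "director"}, "cinema": {"movie"}}
expanded_graph = expand_graph(personal_graph, general_graph)
```

Only one hop of association information is copied in this sketch, and the original personal graph is left unmodified; a deeper expansion would be a design choice for the actual application scenario.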
In one possible implementation manner, after the output sequence is obtained, information of at least one node matched with the output sequence can be queried from the personal knowledge graph, recommendation information is generated for the target user according to the information of the at least one node, and recommendation is then performed based on the recommendation information.
Specifically, information of at least one first node corresponding to the output sequence can be screened out from the personal knowledge graph, and information of at least one second node associated with the at least one first node can be searched for in the personal knowledge graph, where the information of the at least one node includes the information of the at least one first node and the information of the at least one second node. In addition, information of a third node associated with the second node, or of a fourth node associated with the third node, can also be searched for; the specific query depth can be adjusted according to the actual application scenario, which is not limited by the application.
The first node and the second node may include information of different domains, where the different domains indicate that entities included in the first node and the second node belong to different domains, for example, the first node includes information related to music, and the second node may include information related to a television play related to the music.
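Illustratively, querying first nodes, second nodes, and optionally further nodes with an adjustable depth can be sketched as a breadth-first search. The adjacency-dictionary representation, the function name `query_related`, the parameter `max_hops`, and the sample nodes are assumptions for illustration.

```python
from collections import deque


def query_related(graph, first_nodes, max_hops=1):
    """Breadth-first search returning each reached node with its hop distance."""
    distance = {node: 0 for node in first_nodes if node in graph}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue                      # do not expand beyond the query depth
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


graph = {
    "song A": {"TV series B"},
    "TV series B": {"song A", "actor C"},
    "actor C": {"TV series B", "film D"},
    "film D": {"actor C"},
}
one_hop = query_related(graph, ["song A"], max_hops=1)
two_hop = query_related(graph, ["song A"], max_hops=2)
```

With `max_hops=1` only the second node (a television series related to the music) is returned; with `max_hops=2` the third node is included as well.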
Therefore, in the embodiment of the application, the user can be characterized in a map mode, so that when the nodes related to the input text of the user are queried, the information related to the input text of the user can be efficiently queried through the association relation among the nodes.
In one possible implementation, each node in the personal knowledge graph has a corresponding weight. The weight of any node (referred to as the fifth node for ease of distinction) is negatively correlated with its storage duration or update duration, where the storage duration is the length of time for which the information of the fifth node has been stored, and the update duration is the time elapsed since the information in the fifth node was last updated. That is, the longer the storage duration or update duration of the fifth node, the smaller its weight. Therefore, in the embodiment of the application, the user's information can be recorded with decaying weights, thereby realizing memory of the user's knowledge.
In generating the recommendation information, the recommendation information may be generated with reference to the weight of each node. Specifically, the at least one node may be ranked according to the weight corresponding to the at least one node, and recommendation information may be generated according to the information of the at least one node and the ranking of the at least one node.
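Illustratively, a weight that decreases with the update duration, and a ranking based on it, can be sketched as follows. The application only states that the weight is negatively correlated with the duration; the exponential form, the 30-day half-life, and the names `node_weight` and `rank_nodes` are assumptions chosen for illustration.

```python
import math


def node_weight(base_weight, elapsed_days, half_life_days=30.0):
    """Assumed exponential decay: the weight halves every `half_life_days`."""
    return base_weight * math.pow(0.5, elapsed_days / half_life_days)


def rank_nodes(nodes, now_day):
    """Sort nodes by decayed weight, highest first."""
    scored = [
        (node_weight(n["base"], now_day - n["updated_day"]), n["name"])
        for n in nodes
    ]
    return [name for _, name in sorted(scored, reverse=True)]


nodes = [
    {"name": "hiking", "base": 1.0, "updated_day": 0},
    {"name": "movie", "base": 1.0, "updated_day": 55},
    {"name": "coffee", "base": 1.0, "updated_day": 30},
]
ranking = rank_nodes(nodes, now_day=60)
```

The most recently updated node ("movie") is ranked first and the node that has gone longest without an update ("hiking") is ranked last.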
In addition, structured data of the target user can be obtained, where the structured data is data in a preset format; information of at least one event is extracted from the structured data according to a preset rule; and the personal knowledge graph is updated according to the information of the at least one event to obtain an updated personal knowledge graph.
Therefore, in the embodiment of the application, besides extracting information from the input text by means of a neural network and syntactic analysis, the information can be extracted from the structured data of the target user and the personal knowledge graph can be updated, so that the personal knowledge graph can be updated in more ways, and more information can be included in the personal knowledge graph.
In addition, in a possible implementation, the method provided by the application can be deployed in a terminal or a cloud server. When deployed in a cloud server, the user may be served through a cloud platform. Therefore, in the embodiment of the application, events serve as the organizing structure, and different behaviors and information of the user are represented and stored as entities of different types, so that a personal knowledge graph matching the usage characteristics of the user is constructed. Recommendation is then performed by combining the obtained recommendation type, the intention type, and the node weights. The personal knowledge graph (Personal Knowledge Graph, PKG) uses events as bridges connecting entities of different types, so that paths can be designed more flexibly and recommendation is not affected even if a large amount of user behavior data or user logs is not available. This approach addresses the cold-start problem encountered when user profiles are used.
The foregoing describes the flow of the information acquisition method provided by the present application, and the information acquisition method provided by the present application is further described below with reference to a specific application scenario.
First, as shown in fig. 5, the information acquisition method provided by the present application may be divided into a plurality of parts, which may specifically include: information extraction 501, PKG construction 502, output of the PKG 503, and recommendation 504 based on the PKG.
It will be appreciated that in the information extraction 501 step, accurate information may be extracted from the user's input data, which may then be used to construct a PKG and recommend appropriate entities for the user based on the PKG.
In addition, because the following detailed embodiments involve interface displays, the graphical user interface (GUI) provided by the present application is described first. The graphical user interface is stored in an electronic device that includes a display screen, a memory, and one or more processors configured to execute one or more computer programs stored in the memory. The graphical user interface may include:
Generating a personal knowledge graph in response to an input operation of a target user, and displaying the personal knowledge graph, where the input text of the target user includes at least one word, the at least one word forms at least one event, the personal knowledge graph includes a plurality of nodes, the plurality of nodes include type nodes and element nodes, the type nodes represent the types of the at least one event, the element nodes represent the elements of the at least one event, the type node corresponding to the type and the element nodes corresponding to the elements of the same event are associated, and the personal knowledge graph is used for making recommendations for the target user.
In one possible embodiment, the GUI may further include: displaying a permission request, where the permission request is used to ask whether the input text of the target user may be used to acquire the personal knowledge graph. For example, the input information of a user can be collected through the applications (APPs) installed in the user's intelligent terminal, and a display interface can show, for each APP, whether collection of its input data as a knowledge source for the personal knowledge graph is allowed, which improves the data privacy and security of the user.
In one possible embodiment, the GUI may further include: displaying a first knowledge graph, where the first knowledge graph includes a plurality of nodes, the nodes include information of at least one entity, and a node in the first knowledge graph may represent an entity or may represent an element or type of an event; and in response to association information associated with the nodes in the personal knowledge graph being obtained from the first knowledge graph, expanding the personal knowledge graph using the association information to obtain an expanded personal knowledge graph, and displaying the expanded personal knowledge graph.
In one possible embodiment, the GUI may further include: in response to information of at least one node being acquired from the personal knowledge graph, generating recommendation information for the target user and displaying the recommendation information, where the recommendation information is used for making recommendations for the target user.
In one possible embodiment, each node in the personal knowledge graph has a corresponding weight, the at least one node is ordered according to the corresponding weights, and the GUI may further include: displaying the recommendation information in response to the recommendation information being generated according to the information of the at least one node and the ordering of the at least one node.
In one possible embodiment, the GUI may further include: in response to an input operation of the target user on a first input interface, displaying the input text, where the input text is extracted from input data of the target user, and the input data includes at least one of image, text, or voice data.
In one possible embodiment, the GUI may further include: in response to an input operation of the user on a second input interface, updating the personal knowledge graph according to the obtained structured data and displaying the updated personal knowledge graph, where the structured data is data in a preset format.
The steps shown in fig. 5 described above are described below in conjunction with the GUI provided by the present application.
1. Information extraction
The flow of information extraction may be illustrated in fig. 6, for example.
The flow of information extraction may include various ways, such as extracting information through a neural network and extracting information through syntactic analysis as shown in fig. 6.
Input text is first obtained, which may include user chat input, search input, comment input data, etc., or text identified from image, voice, or video data, etc.
After the input text is obtained, information may be extracted from the input text by neural networks and syntactic analysis, respectively, as described in exemplary fashion below.
1. Neural network
The neural network may be trained to extract entity information, association relationships between entities, and the like from the input text. For example, the neural network may be trained using prior data, such as the user's daily chat data or annotated data, after which the neural network can identify the sentence category of the input text, the contextual information of individual words in the text, and the like.
Illustratively, as shown in fig. 7, features are first extracted from the input text using the pre-trained language model BERT, and the output of BERT is divided into tokens (that is, the sequence of feature vectors obtained by extracting features from the text word by word) and CLS (a vector containing the features of the whole sentence of the input text). The tokens sequence is then fed into a BiLSTM+CRF model for the sequence labeling task; the entity position information extracted by the sequence labeling task is converted into a feature vector, added to the CLS feature vector, and input into a sigmoid model for multi-label classification. The output sequence is finally obtained, where the output sequence includes an entity sequence and a sequence of classification labels corresponding to the entities, the entity sequence includes entity position information, and the label sequence may include the category corresponding to each entity.
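Illustratively, the last stage of this pipeline (fusing the position feature vector with the CLS vector and applying a sigmoid for multi-label classification) can be sketched in plain Python as below. The BERT and BiLSTM+CRF stages are not reproduced; the vectors, weights, biases, and label names are made-up toy values, and interpreting "added" as element-wise addition is an assumption.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multi_label_classify(cls_vector, position_vector, weights, bias, threshold=0.5):
    """Fuse the two vectors and return every label whose probability passes the threshold."""
    # Assumed fusion: element-wise addition of the position features and CLS features.
    fused = [c + p for c, p in zip(cls_vector, position_vector)]
    labels = []
    for label, row in weights.items():
        logit = sum(w * x for w, x in zip(row, fused)) + bias[label]
        if sigmoid(logit) >= threshold:   # each label is decided independently
            labels.append(label)
    return labels


cls_vector = [0.2, -0.1, 0.4]        # toy whole-sentence features
position_vector = [0.3, 0.5, -0.4]   # toy entity-position features
weights = {
    "schedule": [2.0, 1.0, 0.0],
    "focus": [-1.0, 0.5, 1.0],
    "communication": [1.0, 1.0, 1.0],
}
bias = {"schedule": -0.5, "focus": 0.0, "communication": -0.5}
labels = multi_label_classify(cls_vector, position_vector, weights, bias)
```

Because each label has its own sigmoid rather than sharing a softmax, a single input can receive several labels at once, which is what distinguishes multi-label from multi-class classification.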
2. Syntactic analysis
As shown in fig. 6, the input text is first parsed, that is, the grammatical function of each word in the input text is analyzed, to obtain the feature sequence corresponding to the input text. For example, in the text "I like you", "I" is the subject, "like" is the predicate, and "you" is the object.
It will be appreciated that by syntactic analysis of the input text, semantic features as well as part-of-speech features of individual words in the input text may be identified.
Generally, the entities and the corresponding parts of speech included in the corpus of different types may be different, so that symbol features such as part of speech tags (pos tags), semantic features, entity categories of different fields and the like of the input text can be obtained through syntactic analysis.
In addition, the PKG and a preset schema, that is, a preset grammar format, can be combined to determine the limiting features of each field and add additional information to the corresponding field, to obtain the feature sequence. For example, the entity field "light rain" in the text "light rain today" could be a person name, a weather condition, or the name of an article. In this case, the content of the PKG and the preset schema can be combined to determine that "light rain" is of the weather type, and a part-of-speech feature marking the field as the weather type is added so that the field has a unique part-of-speech feature.
The feature sequence obtained through syntactic analysis may then be matched against the output sequence of the neural network. If the output sequence matches the feature sequence, the output sequence may be used as the final information extraction result.
If the output sequence does not match the feature sequence, the feature sequence can be used to correct the output sequence, and the corrected output sequence is used as the final information extraction result.
Specifically, the information of each entity in the output sequence and the information of each field in the feature sequence can be matched one by one, for example by matching parts of speech, semantics, and relationships between entities or fields. If part of the information in the output sequence does not match the corresponding information in the feature sequence, the unmatched information in the output sequence can be replaced with the corresponding information in the feature sequence. For example, if the input text includes the word "apple", the apple in the output sequence is classified as fruit, and the part of speech assigned to the field "apple" in the feature sequence is device, then the classification label "fruit" in the output sequence may be replaced with "device", thereby correcting the output sequence.
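Illustratively, determining a unique category for an ambiguous field by combining the PKG with a preset schema can be sketched as follows. The lookup order (PKG first, then schema cue words), the cue-word representation of the schema, and the name `disambiguate` are assumptions made for illustration; the application does not specify how the two sources are combined.

```python
def disambiguate(entity, context_words, pkg_categories, schema):
    """Return a single category for `entity`, or None if it stays ambiguous."""
    # 1. A category already recorded for this entity in the PKG takes priority.
    if entity in pkg_categories:
        return pkg_categories[entity]
    # 2. Otherwise fall back to the schema: a category whose cue words occur
    #    in the surrounding text is selected.
    for category, cues in schema.items():
        if cues & set(context_words):
            return category
    return None


schema = {
    "weather": {"today", "tomorrow", "forecast"},
    "person": {"said", "called"},
}
weather_case = disambiguate("light rain", ["today"], {}, schema)
pkg_case = disambiguate("apple", ["eats"], {"apple": "fruit"}, schema)
unknown_case = disambiguate("apple", ["eats"], {}, schema)
```

In the first call the schema resolves "light rain" to the weather type; in the second call the category stored in the PKG is used directly.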
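Illustratively, the replacement-based correction described above can be sketched as follows, with both sequences reduced to mappings from an entity to its category label. This flat representation and the name `correct_sequence` are assumptions for illustration; the fusion-based variant of correction is not shown.

```python
def correct_sequence(output_seq, feature_seq):
    """Replace labels in the neural output that disagree with the parsed features.

    Returns the corrected mapping and the list of entities that were changed.
    """
    corrected = dict(output_seq)
    changed = []
    for entity, label in feature_seq.items():
        if entity in corrected and corrected[entity] != label:
            corrected[entity] = label      # use the label from syntactic analysis
            changed.append(entity)
    return corrected, changed


neural_output = {"Reddish": "person", "apple": "fruit"}
parsed_features = {"Reddish": "person", "apple": "device"}
final_result, changed_entities = correct_sequence(neural_output, parsed_features)
```

Only the mismatched entry ("apple") is replaced; entries on which the two sequences agree are kept unchanged.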
Therefore, in the embodiment of the application, information can be extracted from the input text separately by the neural network and by syntactic analysis, and the information extracted in the two ways is combined to obtain more accurate final information, which improves the accuracy of information extraction and mitigates the long-tail distribution problem. For example, the 20% of entities that are used most frequently account for 80% of the entity occurrences in the user's daily chat, and most of these can be identified by the trained neural network; the long-tail entities with lower frequency are supplemented and corrected by adding a symbolic method (syntactic analysis) on top of the neural method, which improves the extraction accuracy for long-tail entities.
It can be understood that the application constructs the personal knowledge graph of the user by combining syntactic analysis with a neural network. Storing user data in the form of a knowledge graph integrates the information from all APPs and user behavior operations into one personal knowledge graph, and the structure of the personal knowledge graph is organized in units of event nodes, attention nodes, and communication nodes, which makes it convenient to extract user information efficiently. Meanwhile, a neural network is used to analyze the text content, and several techniques such as multi-label classification, Bi-LSTM, and CRF are combined to extract the multi-angle content required by the graph, which provides a more efficient way of acquiring the knowledge in the graph.
In addition, for structured data, that is, data in a preset format, information may be extracted from the structured data according to preset rules. The structured data may be data entered by a user in an application in which a data format is preset, such as a calendar, address book, or album.
For example, the extraction flow may be as shown in fig. 8, taking a newly created contact as an example. First, since the information source is known to be the address book application, the intention (that is, the event type) can be understood as communication. The address book information follows a specific template, such as name, contact information, and position, and entity recognition and relation extraction are carried out under this template, that is, the entities in the structured data and the association relationships between the entities are identified, finally yielding an entity list and a relation list. Application scenarios such as creating a new contact, creating a calendar activity, and browsing an information feed are currently implemented. For other structured scenarios, information can be extracted according to the corresponding format, so as to extract the information of the entities, the association relationships among the entities, and the like.
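Illustratively, template-based extraction for a newly created contact can be sketched as follows. The field names in `CONTACT_TEMPLATE`, the `has_<field>` relation naming, and the function name `extract_contact` are hypothetical; the sketch also assumes that the record always contains a name.

```python
CONTACT_TEMPLATE = ("name", "phone", "position")   # hypothetical template fields


def extract_contact(record):
    """Turn one address-book record into an entity list and a relation list."""
    # Entity recognition under the template: keep every non-empty template field.
    entities = [
        (field, record[field]) for field in CONTACT_TEMPLATE if record.get(field)
    ]
    # Relation extraction: link the contact name to each of its other fields.
    relations = [
        (record["name"], f"has_{field}", value)
        for field, value in entities
        if field != "name"
    ]
    return {"event_type": "communication", "entities": entities, "relations": relations}


result = extract_contact({"name": "Min", "phone": "555-0100", "position": ""})
```

Because the source application is known to be the address book, the event type is fixed to "communication" and no classifier is needed for this path.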
Of course, for the structured data, the output sequence may be extracted by combining a neural network and a syntax analysis, and specifically may be adjusted according to an actual application scenario, which is not limited by the present application.
2. PKG construction
After obtaining the extraction result of the information extraction, i.e., the output sequence, PKG construction may be performed based on the output sequence, such as adding content included in the output sequence in the PKG, or updating a portion of the PKG corresponding to the output sequence.
The PKG construction may be divided into a plurality of parts, including knowledge analysis, knowledge generation, map construction, and map expansion, which are described below.
Knowledge analysis
Knowledge analysis is described below in terms of relationship connection between entities, event element analysis, emotion analysis, time processing, and the like.
1. Relational links
Through the foregoing information extraction step, the entities included in the input text and the relationship types between the entities can be obtained. The connection relation between the entities can be constructed in the PKG according to the relation type between the entities, so that the connection between the nodes is realized.
For example, the entities and relationship categories in the output sequence may be combined with preset category definition rules and converted into triples of the form <entity field 1, relationship category, entity field 2>. For instance, for the relationship category "family", the corresponding entity 1 and entity 2 should each be a person name or a personal pronoun; for the relationship category "director/author/screenwriter/producer/composer/lyricist", entity 1 should be a person name or pronoun and entity 2 should be a film or television work, a book, a song, or the like. Specific examples are <Reddish, family, Min> and <Min, director, Red Sorghum>.
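Illustratively, converting extracted relations into triples while checking them against category definition rules can be sketched as follows. The rule table `RELATION_RULES`, the category names, and the function name `build_triples` are assumptions for illustration and cover only two relationship categories.

```python
RELATION_RULES = {
    # relation category: (allowed categories of entity 1, allowed categories of entity 2)
    "family": ({"person"}, {"person"}),
    "director": ({"person"}, {"film"}),
}


def build_triples(entity_categories, relations):
    """Keep only the (entity 1, relation, entity 2) triples that satisfy the rules."""
    triples = []
    for head, relation, tail in relations:
        rule = RELATION_RULES.get(relation)
        if rule is None:
            continue                      # unknown relation category: skip it
        head_ok = entity_categories.get(head) in rule[0]
        tail_ok = entity_categories.get(tail) in rule[1]
        if head_ok and tail_ok:
            triples.append((head, relation, tail))
    return triples


entity_categories = {"Reddish": "person", "Min": "person", "Red Sorghum": "film"}
candidate_relations = [
    ("Reddish", "family", "Min"),
    ("Min", "director", "Red Sorghum"),
    ("Red Sorghum", "family", "Min"),     # violates the rule: entity 1 is not a person
]
triples = build_triples(entity_categories, candidate_relations)
```

The third candidate is rejected because a film cannot be entity 1 of a "family" relation under the assumed rules.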
In addition, after the output sequence is obtained through the information extraction step, the event type can be identified. If the event type is a schedule event type, event element analysis can be performed; if the event type is identified as a focus (attention) event type, emotion analysis can be performed.
2. Event element analysis
Event element analysis determines the role that each entity in the input text plays in the event. According to the rules of each event type and the entity categories that the event type admits, the entities of an event can be stored as tuples. For example, the event elements may be output in the form [(entity field 1, companion), (entity field 2, destination), (entity field 3, view)]. As another example, an event of the restaurant class admits entity categories such as companion, destination, start time, end time, and food (these categories need not all occur in a single event).
Then, if the event obtained by analysis does not exist in the PKG, the event elements can be stored in the PKG as nodes; if the event already exists in the PKG, the information of that event in the PKG can be updated. In this way, the PKG is updated in real time and stores knowledge about the user as it arises, which realizes lifelong learning for the user.
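The add-or-update behavior described above can be sketched as follows. The dictionary layout of the PKG, the `event_id` key used to decide whether an event already exists, and the `upsert_event` helper are assumptions made only for illustration.

```python
def upsert_event(pkg, event_id, event_type, elements):
    """Add a new event to the PKG, or merge elements into an existing one.

    pkg maps event_id -> {"type": str, "elements": {role: entity}}.
    elements is a list of (entity, role) tuples, as produced by
    event element analysis. Returns True if the event was newly created.
    """
    created = event_id not in pkg
    event = pkg.setdefault(event_id, {"type": event_type, "elements": {}})
    for entity, role in elements:
        # A later mention of the same role overwrites the stored value.
        event["elements"][role] = entity
    return created
```

How two mentions are recognized as the same event is left open here; a real system would need its own matching rule.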
3. Emotion analysis
If the event type corresponding to the input text is identified as an attention event type, emotion analysis can be performed to determine whether the emotion expressed in the input text is positive, negative, or neutral.
For example, this part may be processed using regular-expression (regex) rules in combination with a naive Bayes classifier. For simple texts with an obvious emotional tendency, such as "I like XXX" or "I dislike XXX", regular expressions can be used for discrimination. For texts whose description is more complex, a naive Bayes classifier can be used, that is, a classification model obtained by training on a data set for the classification task. For training, data for each classification category is first collected and divided, keeping the lengths of the data items similar. Because the text appears as sentences whose content is rich and varied, it needs to be segmented into finer-grained words, and some feature processing is applied (such as removing punctuation and stop words, keyword selection, and smoothing). The occurrence frequency of each word in each emotion category is then counted to compute the conditional probabilities, and, combined with the conditional independence assumption, a bag-of-words model, namely the naive Bayes model, is obtained and used to determine the emotion category.
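The two-stage emotion analysis can be sketched as follows: regex rules handle simple texts, and a small naive Bayes model with Laplace smoothing handles the rest. The patterns in `RULES`, the `STOP_WORDS` list, the English whitespace-style tokenization, and all names are assumptions for illustration; the application does not specify them.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical rules for texts with an obvious emotional tendency.
RULES = [
    (re.compile(r"\bI (?:dislike|hate)\b", re.I), "negative"),
    (re.compile(r"\bI (?:like|love)\b", re.I), "positive"),
]
STOP_WORDS = {"the", "a", "an", "and", "was", "is"}  # assumed stop-word list


def tokenize(text):
    """Lowercase, drop punctuation, and remove stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]


class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of samples
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                # Laplace (add-one) smoothing of the conditional probability.
                count = self.word_counts[label][tok] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify(text, model):
    """Try the regex rules first, then fall back to the classifier."""
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return model.predict(text)
```

The negative rule is listed first only for readability; "I dislike" does not match the positive pattern because the word after "I " must be exactly "like" or "love".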
4. Time processing
In general, time is important information for measuring when a user's behavior or attention arises and fades, and it is one of the event elements. Recording the time at which an entity is created and the start and end times of an event is very helpful for later providing suggestions to the user. This module unifies and standardizes the natural-language time expressions encountered during processing and stores them in a single format for convenient subsequent use.
For example, common expressions may be unified, so that time expressions such as "next Monday", "tomorrow", and "yesterday afternoon" are standardized into the form "xxxx-xx-xx xx:xx". At the same time, the time at which the user submitted the request is obtained and stored together with the user's knowledge.
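Time standardization of this kind can be sketched as follows. The sketch covers only a few English expressions and makes several assumptions that are not in the application: the target format is taken to be "YYYY-MM-DD HH:MM", "afternoon" is mapped to 15:00, and "next Monday" is read as the Monday of the following calendar week. The `normalize_time` helper is hypothetical.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
AFTERNOON_HOUR = 15  # assumed representative hour for "afternoon"


def normalize_time(expression, now):
    """Map a relative time expression to 'YYYY-MM-DD HH:MM', or None."""
    text = expression.lower().strip()
    hour = 0
    if text.endswith(" afternoon"):
        hour = AFTERNOON_HOUR
        text = text[: -len(" afternoon")]
    if text == "today":
        day = now
    elif text == "tomorrow":
        day = now + timedelta(days=1)
    elif text == "yesterday":
        day = now - timedelta(days=1)
    elif text.startswith("next ") and text[5:] in WEEKDAYS:
        # Monday of the following calendar week, then offset to the target day.
        monday_next_week = now + timedelta(days=7 - now.weekday())
        day = monday_next_week + timedelta(days=WEEKDAYS.index(text[5:]))
    else:
        return None  # unsupported expression
    day = day.replace(hour=hour, minute=0, second=0, microsecond=0)
    return day.strftime("%Y-%m-%d %H:%M")
```

The `now` argument plays the role of the time at which the user submitted the request, which is why it is passed in rather than read from the system clock.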
(II) Knowledge generation
After the information such as the relation category, the event element, the emotion category or the time among the entities is obtained in the above manner, the knowledge obtained by analysis can be integrated so as to be conveniently stored in the PKG.
Specifically, the obtained time, relationship links, event elements, and the like may be integrated in units of events. For example, as shown in fig. 9, in calendar event 1, the user's input text, such as "next Tuesday, see 'Letter' with Dream", is obtained, and the event type, entity categories, and the like are then obtained through the foregoing information extraction and knowledge analysis steps. In search event 2, the text "sweet" input by the user is obtained, and the event type, entity categories, and the like of the event are likewise determined through the foregoing information extraction and knowledge analysis.
Further, knowledge in the PKG may be updated, or knowledge not yet present may be added. Relationship linking is performed on the extracted entity list, and search matching is performed using the inverted index in Elasticsearch. The inverted index segments and processes every field to be matched and stores the information table in inverted form, that is, indexed from terms to the records that contain them. Combined with constraints such as entity type and information source, matching an entity field yields a list of candidates sorted in descending order of matching score, and the entity with the highest score can be taken as the entity corresponding to the field and linked. The specific processing is shown in fig. 10: after an entity is obtained through information extraction and knowledge analysis, entity search matching is performed in the PKG. If existing knowledge corresponding to the extracted entity is matched exactly in the PKG, entity linking is performed, that is, the entities having an association relation are associated. If no exact match is found, the other entity fields mentioned in the text can be taken into account to distinguish which entity the user means and to disambiguate by reasoning, which improves linking accuracy. In other words, fuzzy matching may be performed against the user's personal knowledge: fields with similar meanings in the PKG and in the extracted knowledge are matched, reasoning determines whether they actually denote the same entity, and if so, knowledge linking continues, that is, the relations between the entities are linked. If no entity identical or similar to the extracted information is matched in the PKG, a new entity can be added based on the new knowledge.
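The search-and-link step can be illustrated with a small in-memory inverted index. This is a pure-Python stand-in and does not use the Elasticsearch API; the Jaccard token overlap used as the matching score, the `threshold` value, and the `build_index` and `link` helpers are assumptions for illustration.

```python
from collections import defaultdict


def build_index(entities):
    """Build token -> set of entity ids.

    entities maps entity id -> {"name": str, "type": str}.
    """
    index = defaultdict(set)
    for entity_id, entity in entities.items():
        for token in entity["name"].lower().split():
            index[token].add(entity_id)
    return index


def link(mention, mention_type, entities, index, threshold=0.5):
    """Return the id of the best-matching entity, or None if nothing fits."""
    tokens = set(mention.lower().split())
    candidates = set()
    for token in tokens:
        candidates |= index.get(token, set())
    scored = []
    for entity_id in candidates:
        entity = entities[entity_id]
        if entity["type"] != mention_type:
            continue  # entity type constraint
        name_tokens = set(entity["name"].lower().split())
        score = len(tokens & name_tokens) / len(tokens | name_tokens)
        scored.append((score, entity_id))
    # Descending by score; ties broken by id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    if scored and scored[0][0] >= threshold:
        return scored[0][1]
    return None
```

When `link` returns None, the caller would either attempt disambiguation using the other entities mentioned in the text or add the mention as a new entity.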
(III) Graph construction
After knowledge generation, the PKG is built according to a predefined schema.
Specifically, the PKG is constructed with the current user at the center and extends into several branch categories, such as "calendar event", "attention event", and "contact event". Each extension records the current system time to mark the order in which the data was generated. A "calendar event" indicates that the constructed content is a schedule item, whose text involves information such as the event time, people, and places. An "attention event" indicates that the constructed content is information the user is interested in, which can be classified into three interest tendencies: like (positive), dislike (negative), and attention.
For example, the user Zhang San says, "On Friday I want to see 'Younger You' with Lisi Tuo". The construction module obtains event time = xxxx-xx-xx xx:xx (Friday after time normalization), event type = "calendar event class: entertainment", an entity list containing information related to "Lisi Tuo" and "Younger You", entity = [companion, movie name], and user = "Zhang San".
As another example, the user Zhang San says, "Zhou Xiaoyu's 'Younger You' is really good". The construction module obtains event type = "attention class: entertainment", an entity list containing information related to "Zhou Xiaoyu" and "Younger You", entity = [actor, movie name], the associated triple <Zhou Xiaoyu, relationship class: actor, Younger You>, and user = "Zhang San".
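The output of the construction module in the two examples above can be pictured as one record per event. The `EventRecord` class and its field names are hypothetical; the application only states which pieces of information are obtained, not how they are laid out.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class EventRecord:
    user: str                              # the user at the center of the PKG
    event_type: str                        # e.g. "attention class: entertainment"
    entities: List[Tuple[str, str]]        # (entity, category) pairs
    triples: List[Tuple[str, str, str]] = field(default_factory=list)
    event_time: Optional[str] = None       # normalized time, if the text has one
    # System time at construction, marking the order of data generation.
    recorded_at: datetime = field(default_factory=datetime.now)


record = EventRecord(
    user="Zhang San",
    event_type="attention class: entertainment",
    entities=[("Zhou Xiaoyu", "actor"), ("Younger You", "movie name")],
    triples=[("Zhou Xiaoyu", "actor", "Younger You")],
)
```

Keeping `recorded_at` separate from `event_time` mirrors the distinction in the text between when an event takes place and when the data about it was generated.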
Therefore, in the embodiment of the application, the personal knowledge graph is constructed in units of events. Compared with a user portrait, the personal knowledge graph describes the user and stores the user's knowledge at a finer granularity, so the user can be described more accurately, knowledge can later be traced back more accurately, and more accurate user information can be queried. It can be understood that the personal knowledge graph provided by the application records and stores the user's operation behaviors in units of the events the user performs. The operation behaviors are divided into different intention types, the information under each intention is analyzed to obtain the event elements of the behavior, and the event elements are added to the graph. During use, content related to the user's operation habits, and content related to a given element, can be obtained quickly from certain elements of an operation behavior, which better fits the user's habits. In addition, the time at which each behavior occurred is stored, which supports subsequent iterative updating and time-ordered search. A knowledge graph organized in units of events provides a new way of structuring information and a new channel for search and analysis under different requirements.
In addition, a user's knowledge is generally subject to preference and forgetting over time. The application realizes the memorization of user knowledge by setting a weight for each node and updating the weight periodically or in real time.
For example, as shown in fig. 11, when the information of a node in the PKG is updated, if the extracted entity already exists in the PKG, the weight of the corresponding node may be updated through a memory decay mechanism. For example, the weight may be calculated as a weighted combination of an occurrence-count term, a time factor, and an in-degree term,
where α, β, and γ are weighting coefficients after normalization; N is the number of occurrences of the event determined from the input text, and N_max is the maximum number of occurrences for the user; D is the in-degree of the node, which represents its association relations with other nodes in the PKG and reflects the importance of the node, and D_max is the maximum in-degree. The second term is a time factor, which decreases as the time since the creation or last update of the node increases, that is, it is negatively correlated with the time elapsed since the node was created or updated.
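The weight update can be sketched as follows. The text above does not fix the exact form of the time factor or which coefficient multiplies which term, so this sketch assumes W = α·N/N_max + β·exp(−λ·Δt) + γ·D/D_max, with an exponential decay chosen only because it decreases with elapsed time. The decay rate λ, the default coefficient values, and the `node_weight` helper are all hypothetical.

```python
import math


def node_weight(n, n_max, d, d_max, elapsed_days,
                alpha=0.4, beta=0.3, gamma=0.3, decay_rate=0.1):
    """Combine occurrence count, elapsed time, and in-degree into one weight.

    alpha, beta, and gamma are assumed to be normalized (they sum to 1).
    elapsed_days is the time since the node was created or last updated.
    """
    if n_max <= 0 or d_max <= 0:
        raise ValueError("n_max and d_max must be positive")
    frequency_term = n / n_max
    time_factor = math.exp(-decay_rate * elapsed_days)  # assumed decay form
    in_degree_term = d / d_max
    return alpha * frequency_term + beta * time_factor + gamma * in_degree_term
```

With normalized coefficients the weight stays in [0, 1]; a node that is never mentioned again loses at most the β share of its weight as time passes.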
(IV) Graph expansion
Specifically, a first knowledge graph may be used to expand the target personal knowledge graph. For example, the first knowledge graph may be a common knowledge graph (CKG), which includes various entities and the association relations between them. The entities included in the common knowledge graph may belong to the same domain or to different domains.
It can be understood that, by updating and complementing the knowledge in the PKG with information from vertical-domain knowledge graphs, the application can mine the implicit intentions of the user.
Specifically, an entity in the PKG may be searched in the CKG, after matching to the same entity as the entity in the PKG, information of the associated node is continuously searched in the CKG, and the information of the node in the CKG and the information of the associated node are extended into the PKG, thereby expanding the PKG through richer information included in the CKG.
For example, as shown in fig. 12, after information extraction yields an entity list and the graph is constructed, domain knowledge matching is performed in the CKG for each node in the PKG (referred to as a PKG node for ease of distinction). If a CKG node matching the PKG node exists, the nodes associated with that CKG node are queried from the CKG, information is extracted from the CKG node and its associated nodes as domain knowledge, and the PKG is updated or supplemented with this knowledge to obtain a PKG with richer information.
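The expansion step can be sketched as a one-hop copy of matching CKG edges into the PKG. Both graphs are modeled here as adjacency dictionaries, and nodes are matched by exact name; these choices, and the `expand_pkg` helper, are assumptions for illustration.

```python
def expand_pkg(pkg, ckg):
    """Copy CKG edges of matching nodes into the PKG (one hop).

    pkg and ckg map node name -> set of (relation, neighbor) pairs.
    pkg is modified in place. Returns the number of edges added.
    """
    added = 0
    # Iterate over a snapshot so nodes added during expansion
    # are not themselves expanded in this pass.
    for node in list(pkg):
        for relation, neighbor in ckg.get(node, ()):
            if (relation, neighbor) not in pkg[node]:
                pkg[node].add((relation, neighbor))
                pkg.setdefault(neighbor, set())
                added += 1
    return added
```

Limiting each pass to one hop keeps the PKG from absorbing large parts of the CKG; running the function again would extend the reach by one more hop.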
Taking a concrete scenario as an example, as shown in fig. 13, the information of each node in the PKG is first acquired, such as "Younger You" and "Send You a Small Safflower" in the film domain and "Thinking About" and "Trapezium Good" in the music domain. Search matching in the CKG then infers that the user follows Yi Yangqianxi in the celebrity domain. Similarly, if "Iron Man", "Letter", and "The Wandering Earth" are searched and matched in the film-domain CKG, a new concept not present in the PKG, namely "science fiction film", is learned. The information acquired from the domain knowledge graph is then fed into the PKG. In this way, knowledge completion can be performed for entities that do not exist in the PKG, and relation completion can be performed for knowledge that exists in the PKG but whose relations have not been discovered, so that the PKG is expanded.
Therefore, in the embodiment of the application, the personal knowledge graph is combined with the common knowledge graph, and functions such as relation completion and reasoning are used to mine deeper relations among the nodes in the personal knowledge graph. For example, if the user follows Wang Xiaofei and the song "Red Bean", the singing relation between Wang Xiaofei and "Red Bean" can be mined through the common knowledge graph, so that deeper information is uncovered.
3. PKG-based recommendation
After extracting information from the input text to obtain an output sequence, information of a node corresponding to an entity in the output sequence may be queried from the PKG, and recommendation information for the user may be generated using the information of the node.
Specifically, the PKG may be applied to various recommendation scenarios for users, such as input method recommendation, search recommendation, trip notification, commodity recommendation, or the like.
In a possible implementation, the method can be applied to entity prediction, such as after an entity is extracted from input text, an associated node can be queried from the PKG based on the entity, the entity to be input by a user is predicted from the associated node, and recommendation is performed in a display interface of the user.
Entity type prediction mitigates the problem that, when recommending from the PKG, recommending every kind of information related to the user's current input clutters the results and can even reduce prediction accuracy. For example, when the user inputs "i am going to the day", the recommendation type should mainly be a name (other types are of course possible), so the PKG may preferentially recommend entities of the name type, which improves prediction accuracy and user experience.
Specifically, regular expressions may be used in combination with a Bayesian probability model. For common simple expressions, the recommendation types can be obtained directly through regular-expression matching and arranged by occurrence probability. Complex expressions are processed using a neural network: the probability that a given text is followed by each entity type is calculated using the Bayesian probability model, and the recommendation types are given according to these probabilities. After the list of predicted recommendation types is obtained, factors such as the entities currently involved and the weights of the different entities are combined to make better recommendations to the user.
In general, knowledge-graph-based recommendation approaches include embedding-based approaches, path-based approaches, and approaches combining embeddings with paths. Illustratively, in embodiments of the present application, the recommendation ranking may be based on paths in the PKG. Specifically, the recommendation may combine the obtained recommendation type, intention type, and node weights. In the PKG, events serve as the organizing structure and act as bridges connecting entities of different types, so paths can be designed more flexibly, which handles well the recommendation problem in scenarios where the entities do not belong to the same domain. As shown in fig. 14, the recommendation rule may include: searching the PKG for the intention nodes related to the entity list according to the recommendation type and the event type, acquiring the other nodes connected to those intention nodes, sorting them by weight, and selecting the entry with the highest weight as the recommended word. The recommendation may be based on the existing nodes of the PKG. Taking films as an example: if the user has mentioned science fiction films such as "Letter" several times, then "Letter" and science fiction films are recommended in combination with the user's intention during recommendation. In addition, the PKG and the CKG can be combined, and the inferred extended entries of a node can be added to the recommendation system as a feature vector of the user.
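The path-based rule can be sketched as follows: events of the predicted intention type that contain the current entity act as bridges, and the other elements of those events that match the recommendation type are ranked by weight. The list-of-dictionaries layout of events, the `weights` mapping, and the `recommend` helper are assumptions for illustration.

```python
def recommend(events, weights, entity, event_type, rec_type, top_k=3):
    """Rank entities reachable from `entity` through matching events.

    events: list of {"type": str, "elements": [(name, entity_type), ...]}.
    weights: node name -> weight; missing nodes count as 0.0.
    """
    candidates = set()
    for event in events:
        if event["type"] != event_type:
            continue  # event does not match the predicted intention
        names = {name for name, _ in event["elements"]}
        if entity not in names:
            continue  # event is not connected to the current entity
        for name, entity_type in event["elements"]:
            if name != entity and entity_type == rec_type:
                candidates.add(name)
    # Highest weight first; ties broken alphabetically.
    ranked = sorted(candidates, key=lambda n: (-weights.get(n, 0.0), n))
    return ranked[:top_k]
```

Because the path always passes through an event node, the current entity and the recommended entity may belong to different domains, for example a person and a film.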
Therefore, in the embodiment of the application, a personal knowledge graph that reflects the characteristics of the user can be constructed from user information, and knowledge whose relations have not been discovered can be supplemented and expanded through reasoning. This completes and extends the user's existing knowledge, and the inferred content is included in the recommendation range, so that more information can be recommended to the user, cross-domain recommendation can be realized, and user experience is improved. For example, if the user follows several science fiction films, such as "The Avengers", "Letter", and "Iron Man", the user's potential point of interest, namely science fiction films, is mined by reasoning, added to the personal knowledge graph, and recommended to the user in the corresponding context. Cross-domain reasoning can combine content from different vertical domains; for example, if the user follows content such as the song "Red Bean" and the film "Heavenly Stems", cross-domain reasoning can infer that the user's potential interests include Wang Xiaofei.
In addition, the personal knowledge graph provided by the application can include finer-grained information, which enables finer-grained personalized recommendation: fine-grained recommendations are made to the user by combining the user data, recommendation type, intention type, and weights. The recommendation type (for example, name) and the intention type (for example, entertainment) obtained by the model provide important information for the recommendation, and the weights further represent the user's differing degrees of attention to things.
In addition, a knowledge graph is used to store the user's behavior and operation data, building a personal knowledge graph unique to the user. A conventional recommendation system organizes user data in tables, which falls somewhat short of graph storage in clarity of storage and in lookup efficiency. Graph storage can quickly retrieve content directly related to the current content, or related within n hops, whereas table storage takes longer to query and access.
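Retrieving everything within n hops of a node, as mentioned above, amounts to a bounded breadth-first search over the graph. The adjacency-dictionary representation and the `within_hops` helper are assumptions for illustration.

```python
from collections import deque


def within_hops(graph, start, n):
    """Return {node: hop distance} for nodes reachable in 1..n hops.

    graph maps node -> iterable of neighbor nodes.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == n:
            continue  # do not expand beyond n hops
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return distance
```

Each node and edge inside the n-hop neighborhood is visited at most once, which is the access pattern that graph storage serves directly and that a table layout would typically emulate with repeated joins.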
Some possible application scenarios are described below by way of example.
Scene one
The method provided by the application can be deployed on a terminal, where a user can receive or send information in communication software. As shown in the GUI of fig. 15, when the user sends information in the communication software, the sent information can be used as the input text, the entities and the relations among the entities in the input text are extracted, and a PKG is constructed.
Then, as shown in fig. 16, when a user inputs a text in the input interface, a matching text may be screened out from the PKG, and a text to be input by the user is predicted and displayed in the display interface, so that the user can quickly realize the input.
Scene two
As shown in the GUI of fig. 17, the method provided by the application can be deployed on a terminal. The terminal can acquire the text that the user inputs in a search program, extract entity information from it, such as "Wang Fei" and "Red Bean" and the emotion categories corresponding to the entities, and add the information to the PKG.
Then, as shown in fig. 18, when a user inputs a text in the input interface, a matching text may be screened out from the PKG, and a text to be input by the user is predicted and displayed in the display interface, thereby enabling the user to quickly implement the input.
Scene three
As shown in the GUI of fig. 19, the method provided by the application can be deployed on a terminal. The user can enter a schedule item in a calendar APP; the terminal acquires the structured calendar event text to obtain structured data, extracts information such as the entities and times corresponding to the event elements from the structured data, and adds the extracted information to the PKG, so that the user's schedule is recorded and the user is reminded in time.
Scene four
For a voice assistant deployed on the terminal, when a user inputs text in the voice assistant interface, the nodes corresponding to the input text can be screened out of the PKG, and their associated nodes can be screened out further. The information of the associated nodes is displayed in the display interface of the voice assistant. When it is displayed, the information of the associated nodes can be ranked by node weight in descending order; for example, the information of nodes with larger weights is placed at positions that are more convenient for the user to input.
For example, as shown in fig. 20, a user may query a voice assistant for a contact of "Wang Meng", and a terminal may query a PKG for information related to the entity "Wang Meng", and then screen out information classified as a contact, and display the information in a display interface of the terminal.
For another example, in the GUI shown in fig. 21, the user may request that the voice assistant play music; the terminal may search the PKG for information related to music, such as the song "Red Bean", and then play "Red Bean".
As a further example, in the GUI shown in fig. 22, user preference information may be learned from the user's daily input data and used as recommendation information for a search or recommendation engine.
The steps of the information acquisition method provided by the application have been described in detail above in combination with application scenarios. The architecture and the complete application flow of the method are described below by way of example. It should be noted that only a general description of the architecture is given below; for the specific steps executed by each module under the architecture, refer to the descriptions of fig. 4 to fig. 22, which are not repeated here.
For example, the information acquisition method provided by the application can be deployed at a terminal, and the architecture deployed at the terminal can be shown in fig. 23.
The application scenario layer may include the service applications (APPs) installed on the terminal and is connected to the algorithm layer through an algorithm interface. A service APP may receive a user data set and receive the data suitable for a search/recommendation engine that the algorithm layer feeds back. The core of the architecture is the algorithm layer, which may comprise several modules: 1) construction of the PKG (knowledge system), mainly involving: a. learning from user behavior data (text data), namely knowledge extraction; b. generation and storage of user knowledge; c. construction of the user knowledge graph; 2) expansion of the PKG: knowledge reasoning and knowledge updating and completion; 3) use of the PKG: for example, intention prediction and knowledge ranking.
The data management layer is used for storing and managing user data; for example, the PKG, the CKG, and the schema can be stored in the data management layer. It provides data storage and management functions for the algorithm layer and is the basic platform for the query engine and the reasoning process.
The complete flow of the information acquisition method provided by the application can be shown in fig. 24.
User data, such as data input by the user, may be obtained; it may include structured data and unstructured data.
Information is extracted from the input data: the entities and the association relations among them are extracted, the type (namely the intention type) of the event formed by the entities or event elements is analyzed, and emotion analysis is performed on the entities to obtain emotion types.
Event information may also be extracted from the input data and stored in a predetermined format.
Therefore, in the user knowledge extraction part, user knowledge such as event information, emotion types, intention types (namely event types), association relations among entities, and event elements can be extracted and stored.
In addition, information related to the user knowledge is queried from a common knowledge graph (namely the CKG), and the user knowledge is updated or complemented based on this information, so that more complete user knowledge is obtained.
The user knowledge is updated into the PKG based on a predefined schema, that is, the graph is constructed. For example, the constructed PKG may be as shown in fig. 25: centered on the target user "me", it stores the various types of events related to the target user, and nodes having an association relation are connected.
Meanwhile, a weight is set for each node through a memory decay mechanism, so that the memory of user knowledge is realized through weights and recommendations can be made to the target user more effectively.
In an application scenario, event type prediction (namely intention prediction) can be performed based on the information extracted from the input data, prediction information is queried from the PKG, and the prediction information is ranked by node weight and recommended to the user, which improves user experience.
The foregoing describes in detail the flow of the method provided by the present application, and the following describes the apparatus for performing the method.
Referring to fig. 26, a schematic structural diagram of an information acquisition device according to the present application is provided. The information acquisition device may include:
An input module 2601, configured to obtain an input text of a target user, where the input text includes at least one word, and the at least one word forms at least one event;
A text processing module 2602, configured to obtain an output sequence based on the input text, where the output sequence includes at least one event type and element;
An obtaining module 2603, configured to obtain a personal knowledge graph according to the output sequence, where the personal knowledge graph includes a plurality of nodes, the plurality of nodes include type nodes and element nodes, the type nodes are used to represent types of the at least one event, the element nodes are used to represent elements of the at least one event, the type node and the element nodes corresponding to the same event are associated with each other, and the personal knowledge graph is used to make recommendations to the target user.
In one possible implementation manner, if the output sequence includes the association relationship between the elements of each event, the element nodes corresponding to the elements having the association relationship in the same event in the personal knowledge graph are associated with each other; if the output sequence also comprises emotion types, element nodes corresponding to the same event in the personal knowledge graph are related through the emotion types.
In one possible implementation, the obtaining module 2603 is specifically configured to: if the initial knowledge graph comprises information of a first event, updating association relation between element nodes corresponding to the first event and the element nodes included in the initial knowledge graph to obtain a personal knowledge graph, wherein the first event is any event in at least one event; if the initial knowledge graph does not contain the information of the first event, adding the type of the first event and the node corresponding to the element in the initial knowledge graph, and associating the type node of the first event with the element node to obtain the personal knowledge graph.
In one possible implementation, the obtaining module 2603 is specifically configured to: obtaining an initial sequence corresponding to the input text through a text processing model, wherein the initial sequence comprises at least one word vector representation in the input text and a first class label corresponding to at least one word; carrying out syntactic analysis on the input text to obtain a feature sequence, wherein the feature sequence comprises a second class label corresponding to at least one word; and combining the initial sequence and the characteristic sequence to obtain an output sequence, wherein the output sequence comprises elements and types of at least one event.
In one possible implementation, the text processing module 2602 is specifically configured to: and correcting the part which is not matched with the characteristic sequence in the initial sequence to obtain an output sequence.
In one possible implementation, the text processing module 2602 is further configured to: if each word in the feature sequence corresponds to a plurality of second class labels, determining a unique second class label for each word, and obtaining an updated feature sequence.
In one possible implementation, the text processing module 2602 is specifically configured to: according to the input text, an initial sequence is obtained through a text processing model, wherein the text processing model is used for executing the following steps: performing natural language processing on the input text to obtain a feature sequence and an entity sequence, wherein the entity sequence comprises vector representations corresponding to each word in at least one word, and the feature sequence comprises feature vectors corresponding to the input text; acquiring position information corresponding to vectors in an entity sequence; fusing the position information and the feature sequence to obtain a fused sequence; and classifying the entities corresponding to the fusion sequence to obtain a tag sequence, wherein the initial sequence comprises vector representations corresponding to each word and the tag sequence.
In one possible implementation, the apparatus further includes an expansion module 2604 configured to: acquire a first knowledge graph, wherein the first knowledge graph comprises a plurality of nodes, the nodes comprise information of at least one entity, and a node in the first knowledge graph may represent an entity or may represent an element or type of an event; acquire, from the first knowledge graph, association information associated with nodes in the personal knowledge graph; and expand the personal knowledge graph by using the association information to obtain an expanded personal knowledge graph.
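A minimal sketch of this expansion step follows, assuming both graphs are stored as adjacency sets keyed by node name. That storage format, and the decision to copy only the direct neighbours of nodes already present in the personal knowledge graph, are assumptions of the example rather than details given by the application.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node name -> names of associated nodes


def expand(personal: Graph, first: Graph) -> Graph:
    """Return a copy of the personal graph enriched with neighbours from the first graph."""
    expanded = {node: set(neighbours) for node, neighbours in personal.items()}
    for node in personal:
        # Association information: neighbours of the same node in the first graph.
        for related in first.get(node, ()):
            expanded[node].add(related)
            expanded.setdefault(related, set()).add(node)
    return expanded
```

Entries of the first knowledge graph that touch no node of the personal knowledge graph are left out, so the personal graph grows only around what the user has already mentioned.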
In one possible implementation, the apparatus further includes a recommendation module 2605 configured to: acquire, from the personal knowledge graph, information of at least one node that matches the output sequence; and generate recommendation information for the target user according to the information of the at least one node, wherein the recommendation information is used for making recommendations to the target user.
In one possible implementation, the recommendation module 2605 is specifically configured to: screen, from the personal knowledge graph, information of at least one first node corresponding to the output sequence; and search the personal knowledge graph for information of at least one second node associated with the at least one first node, wherein the information of the at least one node comprises the information of the at least one first node and the information of the at least one second node.
In one possible implementation, the information of the first node and the information of the second node are information of different domains.
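The two-stage lookup of first nodes and second nodes can be sketched as follows. The adjacency-set representation and exact string matching between output-sequence items and node names are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def find_nodes(
    graph: Dict[str, Set[str]], output_sequence: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Return (first nodes matching the output sequence, second nodes associated with them)."""
    # First nodes: items of the output sequence that exist in the graph, without duplicates.
    first_nodes = list(dict.fromkeys(item for item in output_sequence if item in graph))
    # Second nodes: neighbours of the first nodes that are not first nodes themselves.
    neighbours = {n for node in first_nodes for n in graph[node]}
    second_nodes = sorted(neighbours - set(first_nodes))
    return first_nodes, second_nodes
```

Because second nodes are reached through associations rather than direct matches, they can belong to a different domain from the first nodes, which is what allows cross-domain recommendations.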
In one possible implementation, each node in the personal knowledge graph has a corresponding weight, and the weight of each node is negatively correlated with a storage duration or an update duration, where the storage duration is the length of time for which the information of the node has been stored, and the update duration is the time elapsed since the information included in the node was last updated.
In one possible implementation, the recommendation module 2605 is specifically configured to: sort the at least one node according to the weight corresponding to the at least one node; and generate the recommendation information according to the information of the at least one node and the ordering of the at least one node.
In one possible implementation, the input module 2601 is specifically configured to: acquire user input data, wherein the input data comprises at least one of image, text, or voice data; and extract the input text from the input data.
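One way to realize a weight that decreases with elapsed time, and then to sort nodes by it, is sketched below. The exponential half-life form and the 30-day default are assumptions; the application only requires a negative correlation between weight and duration.

```python
from typing import Dict, List


def node_weight(elapsed_days: float, half_life_days: float = 30.0) -> float:
    """Weight that halves every half_life_days, so older information counts less."""
    return 0.5 ** (elapsed_days / half_life_days)


def rank_nodes(elapsed_by_node: Dict[str, float]) -> List[str]:
    """Order node names from highest to lowest weight."""
    return sorted(
        elapsed_by_node,
        key=lambda name: node_weight(elapsed_by_node[name]),
        reverse=True,
    )
```

With this choice, a node updated yesterday outranks one last updated three months ago, so recommendations lean toward the user's recent interests.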
In one possible implementation, the input module 2601 is further configured to obtain structured data of the target user, where the structured data is data in a preset format; the obtaining module 2603 is further configured to extract information of at least one event from the structured data according to a preset rule; and the obtaining module 2603 is further configured to update the personal knowledge graph according to the information of the at least one event, to obtain an updated personal knowledge graph.
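The handling of structured data can be illustrated with a short sketch. The record layout, the `RULES` table, and the field names are hypothetical; they stand in for whatever preset format and preset rule an implementation actually uses.

```python
from typing import Dict, Optional, Set

# Hypothetical preset rule: record kind -> event type and the fields that are event elements.
RULES = {
    "flight_booking": {"type": "travel", "elements": ["date", "origin", "destination"]},
}


def extract_event(record: Dict[str, str]) -> Optional[dict]:
    """Apply the preset rule for the record's kind; return None if no rule applies."""
    rule = RULES.get(record.get("kind", ""))
    if rule is None:
        return None
    elements = {key: record[key] for key in rule["elements"] if key in record}
    return {"type": rule["type"], "elements": elements}


def update_graph(graph: Dict[str, Set[str]], event: dict) -> Dict[str, Set[str]]:
    """Associate the event's type node with each of its element nodes (in place)."""
    type_node = event["type"]
    graph.setdefault(type_node, set())
    for value in event["elements"].values():
        graph[type_node].add(value)
        graph.setdefault(value, set()).add(type_node)
    return graph
```

Because the data is already in a known format, no text processing model is needed on this path; a field-mapping rule is enough to produce the type and elements of an event.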
Referring to fig. 27, a schematic structural diagram of another information acquisition device provided by the present application is described below.
The information acquisition device may include a processor 2701 and a memory 2702. The processor 2701 and the memory 2702 are interconnected by wiring. The memory 2702 stores program instructions and data.
The memory 2702 stores therein program instructions and data corresponding to the steps in fig. 4 to 25.
The processor 2701 is configured to perform the method steps performed by the information acquisition apparatus shown in any of the foregoing embodiments of fig. 4-25.
Optionally, the information acquisition device may further include a transceiver 2703 for receiving or transmitting data.
Also provided in embodiments of the present application is a computer-readable storage medium having a program stored therein that, when executed on a computer, causes the computer to perform the steps of the method described in the embodiments of fig. 6-25 described above.
Optionally, the information acquisition device shown in fig. 27 is a chip.
Referring to fig. 28, a schematic structural diagram of another electronic device provided by the present application is described below.
The electronic device may include a processor 2801 and a memory 2802. The processor 2801 and the memory 2802 are interconnected by wiring. The memory 2802 stores program instructions and data.
The memory 2802 stores therein program instructions and data corresponding to the steps in fig. 4 to 25.
The processor 2801 is configured to perform the method steps performed by the electronic device in any of the foregoing embodiments shown in fig. 4 to 25.
Optionally, the electronic device may also include a transceiver 2803 for receiving or transmitting data.
Embodiments of the present application also provide a computer-readable storage medium having a program stored therein, which when run on a computer causes the computer to perform the steps of the method described in the embodiments shown in fig. 4-25.
Optionally, the electronic device shown in fig. 28 is a chip.
The embodiment of the application also provides an information acquisition device, which may also be called a digital processing chip or a chip. The chip comprises a processing unit and a communication interface; the processing unit obtains program instructions through the communication interface and executes them, and the processing unit is used to perform the method steps shown in fig. 4 to 25.
The embodiment of the application also provides a digital processing chip. The digital processing chip integrates circuitry and one or more interfaces for implementing the processor 2701 or the processor 2801 described above, or the functions of the processor 2701 or the processor 2801. When a memory is integrated into the digital processing chip, the digital processing chip may perform the method steps of any one or more of the preceding embodiments. When no memory is integrated into the digital processing chip, the chip can be connected to an external memory through a communication interface. The digital processing chip then implements the actions performed by the information acquisition device or the electronic device in the above embodiments according to the program code stored in the external memory.
Embodiments of the present application also provide a computer program product which, when run on a computer, causes the computer to perform the steps of the method described in the embodiments of figures 4 to 25 described above.
The information acquisition device provided by the embodiment of the application may be a chip, and the chip comprises a processing unit and a communication unit. The processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit may execute computer-executable instructions stored in a storage unit, so that the chip in the server performs the information acquisition method described in the embodiments shown in fig. 6 to 25. Optionally, the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip on the wireless access device side, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
In particular, the aforementioned processing unit or processor may be a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
Referring to fig. 29, fig. 29 is a schematic structural diagram of a chip according to an embodiment of the present application. The chip may be embodied as a neural-network processing unit NPU 290. The NPU 290 is mounted as a coprocessor on a host CPU (Host CPU), and the Host CPU allocates tasks to it. The core part of the NPU is the arithmetic circuit 2903; the controller 2904 controls the arithmetic circuit 2903 to fetch matrix data from memory and perform multiplication.
In some implementations, the arithmetic circuit 2903 internally includes a plurality of processing units (PEs). In some implementations, the arithmetic circuit 2903 is a two-dimensional systolic array. The arithmetic circuit 2903 may also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 2903 is a general-purpose matrix processor.
For example, assume that there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to matrix B from the weight memory 2902 and buffers it on each PE in the arithmetic circuit. The arithmetic circuit then fetches the matrix A data from the input memory 2901, performs a matrix operation on it with matrix B, and stores the partial or final result of the resulting matrix in the accumulator 2908.
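The role of the accumulator in this example can be illustrated in software. The sketch below is plain Python, not a description of the hardware: it computes C = A x B by adding one partial product per step into an accumulator, which mirrors how partial results build up toward the final matrix. The loop order and the list-of-lists matrix format are assumptions of the example.

```python
from typing import List

Matrix = List[List[float]]


def matmul_accumulate(a: Matrix, b: Matrix) -> Matrix:
    """Multiply a (rows x inner) by b (inner x cols), accumulating partial results."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of the two matrices do not match")
    accumulator = [[0] * cols for _ in range(rows)]
    for k in range(inner):
        # Each pass over k adds one partial product to every output element.
        for i in range(rows):
            for j in range(cols):
                accumulator[i][j] += a[i][k] * b[k][j]
    return accumulator
```

After the last pass over `k`, the accumulator holds the final output matrix C; before that it holds only partial results.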
The unified memory 2906 is used to store input data and output data. The weight data is transferred directly into the weight memory 2902 through the direct memory access controller (DMAC) 2905. The input data is also transferred into the unified memory 2906 through the DMAC.
The bus interface unit (BIU) 2910 is used for interaction between the AXI bus and both the DMAC and the instruction fetch buffer (IFB) 2909.
The bus interface unit 2910 is used by the instruction fetch buffer 2909 to obtain instructions from an external memory, and is further used by the direct memory access controller 2905 to obtain the raw data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory (DDR) to the unified memory 2906, to transfer weight data to the weight memory 2902, or to transfer input data to the input memory 2901.
The vector calculation unit 2907 includes a plurality of operation processing units that, as necessary, further process the output of the arithmetic circuit, for example by vector multiplication, vector addition, exponential operation, logarithmic operation, or magnitude comparison. It is mainly used for network computation in the non-convolutional/non-fully-connected layers of a neural network, such as batch normalization, pixel-level summation, and up-sampling of a feature plane.
In some implementations, the vector calculation unit 2907 can store the processed output vector to the unified memory 2906. For example, the vector calculation unit 2907 may apply a linear function and/or a non-linear function to the output of the arithmetic circuit 2903, for example linear interpolation of the feature planes extracted by a convolutional layer or, as another example, a vector of accumulated values, to generate activation values. In some implementations, the vector calculation unit 2907 generates normalized values, pixel-level summed values, or both. In some implementations, the processed output vector can be used as an activation input to the arithmetic circuit 2903, for example for use in a subsequent layer of the neural network.
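Two of the post-processing operations mentioned above, a non-linear activation and batch normalization, can be sketched in plain Python as follows. ReLU is used as the non-linear function purely as an example, and the epsilon value is an assumed default; neither is specified by the application.

```python
import math
from typing import List, Sequence


def relu(values: Sequence[float]) -> List[float]:
    """Non-linear activation: negative accumulated values become zero."""
    return [max(0.0, x) for x in values]


def batch_normalize(values: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Shift and scale the values to zero mean and roughly unit variance."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return [(x - mean) / math.sqrt(variance + eps) for x in values]
```

The result of either function could then serve as the activation input to the next layer, in the same way the processed output vector is fed back to the arithmetic circuit.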
The instruction fetch buffer 2909 is connected to the controller 2904 and is used to store instructions used by the controller 2904.
The unified memory 2906, the input memory 2901, the weight memory 2902, and the instruction fetch buffer 2909 are all on-chip memories. The external memory is private to the NPU hardware architecture.
The operations of the respective layers in the recurrent neural network may be performed by the arithmetic circuit 2903 or the vector calculation unit 2907.
The processor referred to in any of the foregoing may be a general purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the programs of the methods of fig. 4-25 described above.
It should be further noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by the application, a connection between modules indicates that the modules are communicatively connected, which may be specifically implemented as one or more communication buses or signal lines.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by software plus the necessary general-purpose hardware, or of course by special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components, and the like. Generally, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structures used to implement the same function can vary, for example analog circuits, digital circuits, or dedicated circuits. For the present application, however, a software implementation is the preferred embodiment in most cases. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods according to the embodiments of the present application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state drive (SSD)).
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Claims (26)

  1. An information acquisition method, characterized by comprising:
    acquiring input text of a target user, wherein the input text comprises at least one word, and the at least one word forms at least one event;
    obtaining an output sequence based on the input text, wherein the output sequence comprises the type and the elements of the at least one event;
    obtaining a personal knowledge graph according to the output sequence, wherein the personal knowledge graph comprises a plurality of nodes, the plurality of nodes comprise type nodes and element nodes, the type nodes are used for representing the type of the at least one event, the element nodes are used for representing the elements of the at least one event, the type node and the element nodes of the same event are associated, and the personal knowledge graph is used for making recommendations to the target user.
  2. The method of claim 1, characterized in that:
    if the output sequence further comprises an association relation between elements of the at least one event, the element nodes in the personal knowledge graph that correspond to the associated elements of the same event are associated with each other; or
    if the output sequence further comprises an emotion type of the at least one event, the element nodes corresponding to the same event in the personal knowledge graph are associated through the emotion type.
  3. A method according to claim 1 or 2, wherein the output sequence comprises a type and an element of a first event, the first event being any one of the at least one event; the step of obtaining the personal knowledge graph according to the output sequence comprises the following steps:
    if an initial knowledge graph comprises the information of the first event, updating the element nodes corresponding to the first event, or the association relation between the element nodes, in the initial knowledge graph to obtain the personal knowledge graph; or
    if the initial knowledge graph does not comprise the information of the first event, adding the type node and the element nodes of the first event to the initial knowledge graph, and associating the type node and the element nodes of the first event to obtain the personal knowledge graph.
  4. A method according to any of claims 1-3, wherein said obtaining an output sequence based on said input text comprises:
    obtaining an initial sequence corresponding to the input text through a text processing model, wherein the initial sequence comprises a vector representation of at least one word in the input text and a first class label corresponding to the at least one word;
    performing syntactic analysis on the input text to obtain a feature sequence, wherein the feature sequence comprises a second class label corresponding to the at least one word;
    and combining the initial sequence and the feature sequence to obtain the output sequence, wherein the output sequence comprises the elements and types of the at least one event.
  5. The method of claim 4, wherein said combining said initial sequence and said feature sequence to obtain said output sequence comprises:
    correcting the part of the initial sequence that does not match the feature sequence to obtain the output sequence.
  6. The method according to claim 4 or 5, characterized in that the method further comprises:
    if each word in the feature sequence corresponds to a plurality of second class labels, determining a unique second class label for each word to obtain an updated feature sequence.
  7. The method according to any one of claims 1-6, further comprising:
    acquiring a first knowledge graph, wherein the first knowledge graph comprises a plurality of nodes, and the nodes comprise information of at least one entity;
    acquiring, from the first knowledge graph, association information associated with nodes in the personal knowledge graph;
    and expanding the personal knowledge graph by using the association information to obtain an expanded personal knowledge graph.
  8. The method according to any one of claims 1-7, further comprising:
    acquiring, from the personal knowledge graph, information of at least one node that matches the output sequence;
    and generating recommendation information for the target user according to the information of the at least one node, wherein the recommendation information is used for making recommendations to the target user.
  9. The method of claim 8, wherein each node in the personal knowledge graph has a corresponding weight, and the weight of each node is negatively correlated with a storage duration or an update duration, where the storage duration is the length of time for which the information of the node has been stored, and the update duration is the time elapsed since the information included in the node was last updated.
  10. The method of claim 9, wherein the generating recommendation information for the target user based on the information of the at least one node comprises:
    sorting the at least one node according to the weight corresponding to the at least one node;
    and generating the recommendation information according to the information of the at least one node and the ordering of the at least one node.
  11. The method according to any one of claims 1-10, further comprising:
    obtaining structured data of the target user, wherein the structured data is data in a preset format;
    extracting information of at least one event from the structured data according to a preset rule;
    and updating the personal knowledge graph according to the information of the at least one event to obtain an updated personal knowledge graph.
  12. A graphical user interface (GUI), wherein the graphical user interface is stored in an electronic device, the electronic device comprises a display screen, a memory, and one or more processors, the one or more processors are configured to execute one or more computer programs stored in the memory, and the graphical user interface comprises:
    generating a personal knowledge graph in response to an input operation of a target user, and displaying the personal knowledge graph, wherein input text of the target user comprises at least one word, the at least one word forms at least one event, the personal knowledge graph comprises a plurality of nodes, the nodes comprise type nodes and element nodes, the type nodes are used for representing the type of the at least one event, the element nodes are used for representing the elements of the at least one event, the type node and the element nodes of the same event are associated, and the personal knowledge graph is used for making recommendations to the target user.
  13. The GUI of claim 12, further comprising:
    in response to acquiring, from a first knowledge graph, association information associated with nodes in the personal knowledge graph, expanding the personal knowledge graph by using the association information to obtain an expanded personal knowledge graph, and displaying the expanded personal knowledge graph, wherein the first knowledge graph comprises a plurality of nodes, and each node comprises information of at least one entity.
  14. A GUI according to claim 12 or 13, further comprising:
    in response to acquiring information of at least one node from the personal knowledge graph, generating recommendation information for the target user, and displaying the recommendation information, wherein the recommendation information is used for making recommendations to the target user.
  15. An information acquisition apparatus, characterized by comprising:
    an input module, configured to acquire input text of a target user, wherein the input text comprises at least one word, and the at least one word forms at least one event;
    a text processing module, configured to obtain an output sequence based on the input text, wherein the output sequence comprises the type and the elements of the at least one event;
    an acquisition module, configured to obtain a personal knowledge graph according to the output sequence, wherein the personal knowledge graph comprises a plurality of nodes, the plurality of nodes comprise type nodes and element nodes, the type nodes are used for representing the type of the at least one event, the element nodes are used for representing the elements of the at least one event, the type node and the element nodes of the same event are associated, and the personal knowledge graph is used for making recommendations to the target user.
  16. The apparatus of claim 15, characterized in that:
    if the output sequence further comprises an association relation between elements of the at least one event, the element nodes in the personal knowledge graph that correspond to the associated elements of the same event are associated with each other; or
    if the output sequence further comprises an emotion type of the at least one event, the element nodes corresponding to the same event in the personal knowledge graph are associated through the emotion type.
  17. The apparatus according to claim 15 or 16, wherein the output sequence comprises a type and an element of a first event, the first event being any one of the at least one event; the acquisition module is specifically configured to:
    if an initial knowledge graph comprises the information of the first event, update the element nodes corresponding to the first event, or the association relation between the element nodes, in the initial knowledge graph to obtain the personal knowledge graph; or
    if the initial knowledge graph does not comprise the information of the first event, add the type node and the element nodes of the first event to the initial knowledge graph, and associate the type node and the element nodes of the first event to obtain the personal knowledge graph.
  18. The apparatus according to any one of claims 15-17, wherein the text processing module is specifically configured to:
    obtain an initial sequence corresponding to the input text through a text processing model, wherein the initial sequence comprises a vector representation of at least one word in the input text and a first class label corresponding to the at least one word;
    perform syntactic analysis on the input text to obtain a feature sequence, wherein the feature sequence comprises a second class label corresponding to the at least one word;
    and combine the initial sequence and the feature sequence to obtain the output sequence, wherein the output sequence comprises the elements and types of the at least one event.
  19. The apparatus of any one of claims 15-18, further comprising an expansion module to:
    acquire a first knowledge graph, wherein the first knowledge graph comprises a plurality of nodes, and each node comprises information of at least one entity;
    acquire, from the first knowledge graph, association information associated with nodes in the personal knowledge graph;
    and expand the personal knowledge graph by using the association information to obtain an expanded personal knowledge graph.
  20. The apparatus according to any one of claims 15-19, further comprising a recommendation module configured to:
    acquire, from the personal knowledge graph, information of at least one node that matches the output sequence;
    and generate recommendation information for the target user according to the information of the at least one node, wherein the recommendation information is used for making recommendations to the target user.
  21. The apparatus of claim 20, wherein each node in the personal knowledge graph has a corresponding weight, and the weight of each node is negatively correlated with a storage duration or an update duration, where the storage duration is the length of time for which the information of the node has been stored, and the update duration is the time elapsed since the information included in the node was last updated.
  22. The apparatus of claim 21, wherein the recommendation module is specifically configured to:
    sort the at least one node according to the weight corresponding to the at least one node;
    and generate the recommendation information according to the information of the at least one node and the ordering of the at least one node.
  23. An information acquisition device comprising at least one processor and a memory, the at least one processor being coupled to the memory for reading and executing instructions in the memory to perform the method of any of claims 1-11.
  24. An electronic device, comprising: one or more processors; and a memory; wherein the memory stores one or more computer programs, the one or more computer programs comprise instructions, and the instructions, when executed by the one or more processors, cause the electronic device to perform the method of any of claims 1-11.
  25. A computer readable storage medium comprising a program which, when executed by a processing unit, performs the method of any of claims 1 to 11.
  26. A computer program product comprising a computer program which, when executed by a processor, implements the method of any one of claims 1 to 11.
CN202180103399.4A 2021-10-21 2021-10-21 Information acquisition method and device Pending CN118103834A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/125260 WO2023065211A1 (en) 2021-10-21 2021-10-21 Information acquisition method and apparatus

Publications (1)

Publication Number Publication Date
CN118103834A true CN118103834A (en) 2024-05-28

Family

ID=86058655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180103399.4A Pending CN118103834A (en) 2021-10-21 2021-10-21 Information acquisition method and device

Country Status (2)

Country Link
CN (1) CN118103834A (en)
WO (1) WO2023065211A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116523268B (en) * 2023-06-30 2023-09-26 广东中大管理咨询集团股份有限公司 Person post matching analysis method and device based on big data portrait
CN116821712B (en) * 2023-08-25 2023-12-19 中电科大数据研究院有限公司 Semantic matching method and device for unstructured text and knowledge graph
CN116932780B (en) * 2023-09-13 2024-01-09 之江实验室 Astronomical knowledge graph construction method, resource searching method, device and medium
CN116955836B (en) * 2023-09-21 2024-01-02 腾讯科技(深圳)有限公司 Recommendation method, recommendation device, recommendation apparatus, recommendation computer readable storage medium, and recommendation program product
CN117076660B (en) * 2023-10-16 2024-01-26 浙江同花顺智能科技有限公司 Information recommendation method, device, equipment and storage medium
CN117436457B (en) * 2023-11-01 2024-05-03 人民网股份有限公司 Irony identification method, irony identification device, computing equipment and storage medium
CN117540062B (en) * 2024-01-10 2024-04-12 广东省电信规划设计院有限公司 Retrieval model recommendation method and device based on knowledge graph
CN117633254B (en) * 2024-01-26 2024-04-05 武汉大学 Knowledge-graph-based map retrieval user portrait construction method and system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10628775B2 (en) * 2015-08-07 2020-04-21 Sap Se Sankey diagram graphical user interface customization
CN106503172A (en) * 2016-10-25 2017-03-15 天闻数媒科技(湖南)有限公司 Method and apparatus for recommending a learning path based on a knowledge graph
CN109800300A (en) * 2019-01-08 2019-05-24 广东小天才科技有限公司 Learning content recommendation method and system
CN110008349B (en) * 2019-02-01 2020-11-10 创新先进技术有限公司 Computer-implemented method and apparatus for event risk assessment
CN110334159A (en) * 2019-05-29 2019-10-15 苏宁金融服务(上海)有限公司 Information query method and device based on relation map
CN111191046A (en) * 2019-12-31 2020-05-22 北京明略软件系统有限公司 Method, device, computer storage medium and terminal for realizing information search
CN113095346A (en) * 2020-01-08 2021-07-09 华为技术有限公司 Data labeling method and data labeling device

Also Published As

Publication number Publication date
WO2023065211A1 (en) 2023-04-27

Similar Documents

Publication Publication Date Title
US11314806B2 (en) Method for making music recommendations and related computing device, and medium thereof
CN110717017B (en) Method for processing corpus
CN118103834A (en) Information acquisition method and device
CN112507715B (en) Method, device, equipment and storage medium for determining association relation between entities
CN107066464B (en) Semantic natural language vector space
US11194842B2 (en) Methods and systems for interacting with mobile device
CN112131350B (en) Text label determining method, device, terminal and readable storage medium
US20200125967A1 (en) Electronic device and method for controlling the electronic device
CN105094315A (en) Method and apparatus for smart man-machine chat based on artificial intelligence
AU2016256753A1 (en) Image captioning using weak supervision and semantic natural language vector space
KR20200010131A (en) Electronic apparatus and control method thereof
JP7488871B2 (en) Dialogue recommendation method, device, electronic device, storage medium, and computer program
CN113806588A (en) Method and device for searching video
Ahmed et al. Sentiment analysis for smart cities: state of the art and opportunities
CN117009650A (en) Recommendation method and device
CN115114395A (en) Content retrieval and model training method and device, electronic equipment and storage medium
CN116432019A (en) Data processing method and related equipment
CN117216535A (en) Training method, device, equipment and medium for recommended text generation model
Huang et al. Sentiment analysis algorithm using contrastive learning and adversarial training for POI recommendation
CN112784156A (en) Search feedback method, system, device and storage medium based on intention recognition
CN116910201A (en) Dialogue data generation method and related equipment thereof
CN116910357A (en) Data processing method and related device
CN116628345A (en) Content recommendation method and device, electronic equipment and storage medium
CN116955591A (en) Recommendation language generation method, related device and medium for content recommendation
CN116977701A (en) Video classification model training method, video classification method and device

Legal Events

Date Code Title Description
PB01 Publication