WO2022078102A1 - Entity recognition method, apparatus, device, and storage medium - Google Patents

Entity recognition method, apparatus, device, and storage medium

Info

Publication number
WO2022078102A1
WO2022078102A1 (PCT/CN2021/116228; CN2021116228W)
Authority
WO
WIPO (PCT)
Prior art keywords
target
entity
semantic representation
sequence
information
Prior art date
Application number
PCT/CN2021/116228
Other languages
English (en)
French (fr)
Inventor
刘刚
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2022078102A1
Priority to US17/947,548 (published as US20230015606A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/10 Text processing
    • G06F 40/12 Use of codes for handling textual entities
    • G06F 40/126 Character encoding
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G06F 40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08 Learning methods

Definitions

  • the present application relates to the field of computer technology, and in particular, to entity recognition.
  • Entity recognition is a fundamental task in natural language processing with a wide range of applications. An entity generally refers to a text fragment with a specific meaning or strong referentiality, which usually includes person names, place names, organization names, dates and times, proper nouns, and so on. By extracting such entities from unstructured input text, more categories of entities can be identified according to business requirements, such as product name, model, and price. The concept of an entity is therefore very broad: any special text fragment required by the business can be called an entity. Through entity recognition, the desired data or objects can be extracted, and entity recognition is the basis for subsequent content mining analysis, relation extraction, and event analysis.
  • the process of entity recognition can use the Aho–Corasick (AC) multi-pattern matching algorithm, which exploits the internal structure of the pattern strings, such as shared prefixes between patterns, to jump efficiently on each mismatch.
  • AC multi-pattern matching
  • the present application provides an entity identification method, which can effectively improve the efficiency and accuracy of entity identification.
  • an embodiment of the present application provides an entity identification method, which can be applied to a system or program including an entity identification function in a terminal device, and specifically includes:
  • the target text information is input into the input representation layer in the target recognition model to generate a target vector sequence, where the target vector sequence includes a plurality of sub-vectors obtained by representing the target text information based on at least two text dimensions;
  • the target vector sequence is input into the semantic representation layer in the target recognition model to obtain a label prediction sequence, where the label prediction sequence is a set of attribution probabilities of the plurality of sub-vectors with respect to a plurality of entity labels;
  • the semantic representation layer includes a plurality of parallel identification nodes that are related to each other, each identification node is used to identify the attribution probabilities of its corresponding sub-vector with respect to the plurality of entity labels, and the plurality of entity labels are set based on entities of different categories;
  • the label prediction sequence is input into a conditional discrimination layer in the target recognition model to determine a target item in the attribution probability set, where the target item is used to indicate the entity in the target text information.
  • an embodiment of the present application provides an entity identification device, including: an acquisition unit configured to acquire target text information;
  • the input unit is used to input the target text information into the input representation layer in the target recognition model to generate a target vector sequence, where the target vector sequence includes a plurality of sub-vectors obtained by representing the target text information based on at least two text dimensions;
  • a prediction unit configured to input the target vector sequence into the semantic representation layer in the target recognition model to obtain a label prediction sequence, where the label prediction sequence is a set of attribution probabilities of the sub-vectors with respect to multiple entity labels; the semantic representation layer includes a plurality of parallel identification nodes that are related to each other, each identification node is used to identify the attribution probabilities of its corresponding sub-vector with respect to the plurality of entity labels, and the plurality of entity labels are set based on entities of different categories;
  • the recognition unit is used to input the label prediction sequence into the conditional discrimination layer in the target recognition model to determine the target item in the attribution probability set, where the target item is used to indicate the entity in the target text information.
  • an embodiment of the present application provides a computer device, including a memory, a processor, and a bus system; the memory is used for storing program code, and the processor is used for executing the entity recognition method described in the above aspects according to instructions in the program code.
  • an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program is used to execute the entity identification method described in the foregoing aspect.
  • an embodiment of the present application provides a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the entity identification method described in the above aspects.
  • the embodiments of the present application have the following advantages:
  • after the target text information of the entity to be recognized is obtained, the target text information is input into the input representation layer in the target recognition model to generate a target vector sequence; the multiple sub-vectors included in the target vector sequence are obtained by representing the target text information based on at least two text dimensions.
  • the target vector sequence is then input into the semantic representation layer in the target recognition model to obtain the label prediction sequence, that is, the set of attribution probabilities of the sub-vectors with respect to multiple entity labels. Because the parallel identification nodes in the semantic representation layer are related to each other, each node can obtain its own context information, which enhances the integrity of the semantic representation and improves the recognition accuracy of subsequent entity labels.
  • the target text information can be associated with more entity tags during the identification process, and the important features of different categories of entities can be screened out to enhance the identification of entity categories.
  • the label prediction sequence is input into the conditional discrimination layer in the target recognition model to determine the target item in the attribution probability set that is used to indicate the entity in the target text information.
  • FIG. 1 is a diagram of the network architecture in which the entity recognition system operates.
  • FIG. 2 is a process architecture diagram of entity recognition provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of an entity identification method provided by an embodiment of the present application.
  • FIG. 4 is a model architecture diagram of an entity recognition method provided by an embodiment of the present application.
  • FIG. 5 is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • FIG. 6 is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • FIG. 7 is a flowchart of another entity recognition method provided by an embodiment of the present application.
  • FIG. 8 is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • FIG. 9 is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • FIG. 10 is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • FIG. 11 is a system architecture diagram of an entity identification method provided by an embodiment of the present application.
  • FIG. 12 is a flowchart of another entity recognition method provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of an entity recognition device provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a terminal device provided by an embodiment of the application.
  • FIG. 15 is a schematic structural diagram of a server according to an embodiment of the present application.
  • LSTM Long Short-Term Memory
  • RNN Recurrent Neural Network
  • Natural Language Processing It is an important direction in the field of computer science and artificial intelligence. It studies various theories and methods that can realize effective communication between humans and computers using natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics.
  • Entity recognition is an important basic tool in application fields such as information extraction, question answering systems, syntactic analysis, and machine translation, and it occupies an important position in making natural language processing technology practical. For example, entity recognition can identify three major categories of entities (entity, time, and number) and seven subcategories (person, institution, place, time, date, currency, and percentage) in the text to be processed.
  • Conditional random field It is a discriminative probability model, a type of random field, and is often used to label or analyze sequence data, such as natural language or biological sequences.
  • Short video: short-form video is a method of Internet content dissemination, generally video content disseminated on new Internet media with a duration of less than 5 minutes; it has gradually gained the favor of major platforms, fans, and capital.
  • Multi-Channel Network (MCN): a product form of the multi-channel network that aggregates PGC content and, with the strong support of capital, ensures continuous content output, so as to finally achieve stable monetization of the business.
  • PGC Professional Generated Content
  • UGC User Generated Content
  • Web 2.0: an Internet model that advocates personalization as its main feature. It is not a specific business but a new way for users to use the Internet, shifting from the original focus on downloading to both downloading and uploading.
  • Feeds: also translated as sources, information feeds, summaries, news feeds, or web feeds; a data format through which a website disseminates its latest information to users, usually arranged in a timeline. The timeline is the most primitive, intuitive, and basic display form of a feed.
  • a prerequisite for a user to be able to subscribe to a website is that the website provides a source of news. Converging feeds in one place is called aggregation, and the software used for aggregation is called an aggregator.
  • an aggregator is a software specially used to subscribe to a website, and is also commonly known as an RSS reader, feed reader, news reader, etc.
  • the entity identification method provided in this application can be applied to computer equipment, and the computer equipment includes, for example, a terminal device or a server.
  • the entity identification system can run in the network architecture shown in Figure 1.
  • Figure 1 shows the network architecture in which the entity recognition system operates.
  • the entity recognition system can provide the entity recognition process with multiple information sources; that is, the server can accept the text content sent by multiple terminals, identify the entities in it, and return the corresponding recognition results to the terminals;
  • FIG. 1 shows one server, but in an actual scenario, multiple servers may also participate, and the specific number of servers depends on the actual scenario.
  • the server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, big data, and artificial intelligence platforms.
  • the terminal may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • the terminal and the server can be connected directly or indirectly through wired or wireless communication, and the terminal and the server can be connected to form a blockchain network, which is not limited in this application.
  • the above entity recognition system can run on a personal mobile terminal, for example as an application such as an interactive drama; it can also run on a server or on a third-party device to provide entity recognition and obtain the entity recognition processing results for the information source; the specific entity recognition system can run in the above-mentioned devices in the form of a program, run as a system component in the above-mentioned devices, or be used as a kind of cloud service program; the specific operation mode depends on the actual scenario and is not limited here.
  • Natural language processing is an important direction in the field of computer science and artificial intelligence. It studies various theories and methods that can realize effective communication between humans and computers using natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Therefore, research in this field will involve natural language, the language that people use on a daily basis, so it is closely related to the study of linguistics. Natural language processing technology usually includes entity recognition, text processing, semantic understanding, machine translation, robot question answering, knowledge graph and other technologies.
  • entity recognition also known as proper name recognition
  • An entity generally refers to an entity with a specific meaning or strong denotation in the text, which usually includes a person's name, a place name, an organization name, a date and time, a proper noun, etc.
  • NER system extracts the above entities from unstructured input text, and can identify more categories of entities according to business requirements, such as product name, model, price, etc. Therefore, the concept of entity can be very broad, as long as it is a special text fragment required by the business, it can be called an entity.
  • through entity recognition, the desired data can be extracted, which is the basis for subsequent content mining analysis, relation extraction, and event analysis.
  • the multi-pattern matching Aho–Corasick (AC) algorithm can be used in the process of entity recognition; it exploits the internal structure of the pattern strings, such as shared prefixes between patterns, to jump efficiently on each mismatch.
  • however, entity recognition based on AC matching relies on a single matching process, which easily introduces errors; its accuracy is difficult to improve further, and it requires manual review, which affects the efficiency of entity recognition.
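  • a minimal sketch of dictionary-based multi-pattern matching with the Aho–Corasick algorithm, assuming the third-party pyahocorasick package and an illustrative entity dictionary; it matches surface strings efficiently but cannot use context, which is the limitation described above:

```python
# Dictionary-based entity matching with Aho-Corasick (sketch).
# Assumes the third-party "pyahocorasick" package; the entity dictionary
# and the input text are illustrative only.
import ahocorasick

entity_dict = {"华为": "ORG", "mate30": "PRODUCT", "深圳": "LOC"}

automaton = ahocorasick.Automaton()
for surface, label in entity_dict.items():
    automaton.add_word(surface, (surface, label))
automaton.make_automaton()  # builds failure links for efficient jumps on mismatch

text = "华为在深圳发布了mate30"
for end_index, (surface, label) in automaton.iter(text):
    start_index = end_index - len(surface) + 1
    print(start_index, end_index, surface, label)
```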
  • an entity recognition method which is applied to the process framework of entity recognition shown in FIG. 2 .
  • FIG. 2 is an architecture diagram of an entity recognition process provided by an embodiment of the present application: text information is converted into vectors through a multi-dimensional representation (for example, using the three feature dimensions of character, word, and part of speech) to obtain word vectors, which are then matched against different entity tags (such as entity tags 1-n) so that context-related information is captured in multiple different subspaces; important features of different categories of entities are screened, the ability of the features to distinguish entity categories is enhanced, the recognition effect is improved, and the target entity corresponding to the text information is determined.
  • the entity recognition device obtains target text information; the target text information is then input into the input representation layer in the target recognition model to generate a target vector sequence, where the target vector sequence includes a plurality of sub-vectors obtained by representing the target text information based on at least two text dimensions; the target vector sequence is further input into the semantic representation layer in the target recognition model to obtain a label prediction sequence, where the label prediction sequence is a set of attribution probabilities of the sub-vectors with respect to multiple entity labels, the semantic representation layer includes multiple parallel identification nodes that are related to each other, each identification node is used to identify the attribution probabilities of its corresponding sub-vector with respect to the multiple entity labels, and the entity labels are set based on entities of different categories; the label prediction sequence is then input into the conditional discrimination layer in the target recognition model to determine the target item in the attribution probability set, where the target item is used to indicate the entity in the target text information.
  • the scheme provided by the embodiment of the present application relates to the natural language processing technology of artificial intelligence, and is specifically described by the following embodiments:
  • FIG. 3 is a flowchart of an entity identification method provided by an embodiment of the present application.
  • the entity recognition method may be executed by a terminal device, executed by a server, or executed jointly by a terminal device and a server; execution by a terminal device is used as an example in the following description.
  • the embodiment of the present application includes at least the following steps:
  • the target text information may come from various information sources, such as a web page, an application program, an interface, and the like.
  • target recognition data is first obtained in response to the target operation, where the target recognition data includes at least one form of media content, such as a short video; text interpretation is then performed on the target recognition data based on the media content form to determine the target text information. For example, the summary part of the short video is interpreted, or the sound signal of the short video is recognized and converted into corresponding text information; the specific content form depends on the actual scenario and is not limited here.
  • the target vector sequence includes a plurality of sub-vectors, and the sub-vectors are represented based on at least two text dimensions; in order to ensure the accuracy of the description of the target text information, features of different dimensions can be used for the description, for example the character dimension and the word dimension are each represented by a vector.
  • the at least two text dimensions include a character dimension and a word dimension.
  • the feature dimension of the target text information is improved, and the comprehensiveness of the feature description is ensured.
  • the word embedding vector can be obtained with word2vec or a bag-of-words model, and the character embedding can be obtained by encoding a random 0/1 vector string; the character embedding and the word embedding vector are then combined together as the target vector sequence, that is, the character-granularity vector keeps a separate representation position during recognition, such as the one-hot encoding of the character used, which is placed directly behind the sub-vector. For example, the target vector sequence of the word "视频" (video) is a sequence of the word "视频" and its characters "视" and "频".
  • the word embedding is usually pre-trained, while the character embedding is randomly initialized; the embeddings are adjusted during the iterative process of training.
  • each unit in the target text information is processed with the above vector conversion method to obtain a target vector sequence containing multiple sub-vectors; for example, for a sentence, each word in it is used as one unit of vector transformation.
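  • a minimal sketch, under illustrative assumptions about vocabularies and dimensions, of combining a pre-trained word embedding with randomly initialized character vectors into the target vector sequence:

```python
# Building the target vector sequence from word and character embeddings (sketch).
# Vocabularies, dimensions, and lookup tables are illustrative assumptions,
# not the exact implementation of the embodiment.
import numpy as np

rng = np.random.default_rng(0)

word_vocab = {"视频": 0, "内容": 1}
char_vocab = {"视": 0, "频": 1, "内": 2, "容": 3}

WORD_DIM, CHAR_DIM = 100, 16
word_table = rng.normal(size=(len(word_vocab), WORD_DIM))   # pre-trained in practice
char_table = rng.integers(0, 2, size=(len(char_vocab), CHAR_DIM)).astype(float)  # random 0/1 strings

def unit_vector(word: str) -> np.ndarray:
    """Concatenate the word embedding with the embeddings of its characters."""
    parts = [word_table[word_vocab[word]]]
    parts += [char_table[char_vocab[c]] for c in word]  # character granularity keeps its own positions
    return np.concatenate(parts)

sentence = ["视频", "内容"]                      # each word is one unit of vector transformation
target_vector_sequence = [unit_vector(w) for w in sentence]
print([v.shape for v in target_vector_sequence])  # [(132,), (132,)]
```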
  • the label prediction sequence is a set of attribution probabilities of sub-vectors and multiple entity labels, that is, the matching process based on entity labels of different categories is a processing process of a Multi-Head Attention mechanism.
  • the Multi-Head Attention mechanism increases the ability of the target recognition model to capture information at different positions in the target vector sequence. If the parameters before the mapping are used directly for the calculation, only a fixed attention weight distribution can be obtained, and this distribution tends to focus on one position or a few positions; with the matching process based on the Multi-Head Attention mechanism, the model can be associated with entity tags at more positions. Moreover, because the weights are not shared across the mappings, the mapped subspaces are different, that is, the information covered by different subspaces differs, so the final spliced vector covers a wider range of information.
  • the semantic representation layer includes multiple parallel identification nodes, and the identification nodes are related to each other.
  • the identification nodes are used to identify the corresponding sub-vectors and the attribution probability of multiple entity labels.
  • each identification node can use a Bi-directional Long Short-Term Memory (BiLSTM) model; due to the interrelationship between the identification nodes, contextual information can be obtained in the process of semantic representation, which ensures the integrity of the semantic representation and thus the accuracy of label prediction.
  • BiLSTM Bi-directional Long Short-Term Memory
  • FIG. 4 is a model architecture diagram of an entity recognition method provided by an embodiment of the present application; the figure shows that the target recognition model includes an input representation layer, a semantic representation layer, and a conditional discrimination layer.
  • the input representation layer takes the target vector sequence as input to obtain X1-Xt; multi-feature-dimension association representation is then performed through LSTM1-LSTM3 in the semantic representation layer, followed by feature splicing and a multi-head matching operation based on the self-attention mechanism, so that more entity labels are associated, which ensures the accuracy of the entity label predictions Y1-Yt in the conditional discrimination layer.
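  • a compact PyTorch sketch of the semantic representation layer described above (BiLSTM, feature splicing, and multi-head self-attention producing per-token entity-label scores); the dimensions and layer sizes are assumptions:

```python
# Semantic representation layer (sketch): BiLSTM over the target vector
# sequence, multi-head self-attention over the BiLSTM features, feature
# splicing, and a projection to per-token entity-label scores.
import torch
import torch.nn as nn

class SemanticRepresentationLayer(nn.Module):
    def __init__(self, input_dim=132, hidden_dim=128, num_labels=9, num_heads=4):
        super().__init__()
        self.bilstm = nn.LSTM(input_dim, hidden_dim, batch_first=True,
                              bidirectional=True)                      # forward + backward context
        self.attention = nn.MultiheadAttention(2 * hidden_dim, num_heads,
                                               batch_first=True)       # multiple subspaces
        self.emission = nn.Linear(4 * hidden_dim, num_labels)          # scores per entity label

    def forward(self, x):                        # x: (batch, seq_len, input_dim)
        h, _ = self.bilstm(x)                    # (batch, seq_len, 2*hidden_dim)
        a, _ = self.attention(h, h, h)           # context captured in different subspaces
        features = torch.cat([h, a], dim=-1)     # feature splicing
        return self.emission(features)           # (batch, seq_len, num_labels)

scores = SemanticRepresentationLayer()(torch.randn(2, 10, 132))
print(scores.shape)  # torch.Size([2, 10, 9])
```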
  • the corresponding prediction label can adopt the two-segment labeling specification of BIESO-Type.
  • BIESO-Type: the first segment includes Begin, Inside, and End, which represent the start (B), middle (I), and end (E) positions of an entity, respectively; in addition, Single (S) represents a single-word entity and Other (O) represents a non-entity. The second segment, Type, includes PER, LOC, ORG, GAME, BOOK, etc., corresponding to person names, place names, organization names, games, books, and so on.
  • B-Person the beginning of the name
  • I-Person the middle part of the name
  • E-Person the end of the name
  • B-Organization the beginning part of the organization
  • I-Organization the middle part of the organization
  • E-Organization the end part of the organization
  • O non-entity information
  • S- single-word entity
  • the settings of the above tags may also include S-ORG for an organization, S-GAME for a game, S-Book for a book, S-Music for music, S-Food for food, S-Health for health, S-Tourist for tourism, S-Military for military, S-Antiques for antiques, and so on.
  • the specific types of entities to be mined are distributed according to business needs and content categories and depend on the actual scenario.
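  • a small illustrative sketch (the sentence and spans are assumptions) of converting labeled entity spans into BIESO-Type tags:

```python
# BIESO-Type two-segment labeling (sketch): convert entity spans into
# per-token tags. The example sentence and spans are illustrative only.
def bieso_tags(tokens, spans):
    """spans: list of (start, end_exclusive, entity_type)."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        if end - start == 1:
            tags[start] = f"S-{etype}"            # single-token entity
        else:
            tags[start] = f"B-{etype}"            # entity start
            tags[end - 1] = f"E-{etype}"          # entity end
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{etype}"            # entity middle
    return tags

tokens = ["张", "三", "在", "腾", "讯", "工", "作"]
spans = [(0, 2, "Person"), (3, 5, "Organization")]
print(list(zip(tokens, bieso_tags(tokens, spans))))
# [('张', 'B-Person'), ('三', 'E-Person'), ('在', 'O'),
#  ('腾', 'B-Organization'), ('讯', 'E-Organization'), ('工', 'O'), ('作', 'O')]
```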
  • the operation process of the target recognition model is described below by taking the recognition of a person name (Person) and an organization (Organization) as an example, as shown in the figure.
  • the figure shows that a piece of text information x added to the target recognition model is a sentence containing 5 words (w0, w1, w2, w3, w4).
  • in sentence x, [w0, w1] is a person name, [w3] is an organization name, and the others are "O".
  • the output of the BiLSTM layer (semantic representation layer) represents the score (attribution probability) of the word corresponding to each category.
  • the output of the BiLSTM node is 1.5(B-Person), 0.9(I-Person), 0.1(B-Organization), 0.08(I-Organization) and 0.05(O) , the above scores will be used as the input to the CRF layer.
  • the target item is used to indicate the entity in the target text information, that is, the recognition result.
  • the CRF layer will determine the target item, such as the entity tag with the highest attribution probability (the highest score) as the target item , that is, the entity recognition results can be obtained as w0(B-Person), w1(I-Person), w2(B-Organization), w3(I-Organization), w4(O).
  • the CRF layer can be used as a step that optimizes the recognition process based on part-of-speech features; therefore, combined with the above-mentioned division and combination of character and word granularity, the three features of character, word, and part of speech can be used for entity recognition.
  • the process of determining the above target item is equivalent to setting constraints on the label prediction sequence in the conditional discrimination layer; that is, the label prediction sequence is first input into the conditional discrimination layer in the target recognition model to obtain the constraints in the conditional discrimination layer, and then, based on the constraints, the attribution probabilities corresponding to each sub-vector are screened to determine the target item in the attribution probability set.
  • the constraints are set based on preset global information.
  • the CRF layer network can also learn the constraints of the sentence, and the CRF layer can add global constraint information to ensure that the final prediction result is valid.
  • these constraints can be learned automatically by the CRF layer from the training data. For example, a possible constraint is that the beginning of a sentence should be "B-" or "O", not "I-" or "E-".
  • the candidate label corresponding to each sub-vector in the label prediction sequence can be determined first, where the candidate label includes a location identifier and a label identifier; the correspondence between the location identifier and the label identifier is then screened based on the constraints to determine the target item in the attribution probability set, for example by screening out tags that cannot appear at the beginning of a sentence, so as to ensure the accuracy of entity recognition.
  • FIG. 6 is a model architecture diagram of another entity recognition method provided by an embodiment of the present application; for example, TransScore is used to represent the transition score.
  • the scores of the transition matrix can be randomly initialized before the training of the CRF layer. These scores will be updated with the iterative process of training, that is, the CRF layer can learn these constraints by itself, and the transition scores between different categories constitute the transition matrix, that is, the labels that satisfy the transition matrix can be used as the recognition result.
  • the main function of the transition score is to help the CRF calculate the loss function.
  • the loss consists of two parts: the score of the real path and the total score of all paths. The scores in the transition matrix directly affect the final loss value, which is how the constraints are enforced and the accuracy of named entity recognition is ensured.
  • in a sequence such as "B-label1 I-label2 I-label3", labels 1, 2, and 3 should belong to the same entity class; for example, "B-Person I-Person" is valid, but "B-Person I-Organization" is not. "O I-label" is also invalid, because an entity should start with "B-" rather than "I-".
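  • a sketch of how a transition matrix with forbidden transitions constrains decoding, using Viterbi over illustrative emission scores; the tag set and scores are assumptions rather than the embodiment's exact CRF implementation:

```python
# CRF-style decoding under transition constraints (sketch): forbidden
# transitions (e.g. "O" -> "I-Person", or starting a sentence with "I-")
# receive a prohibitively negative score, so Viterbi never selects an
# invalid path. Tags, scores, and sequence length are illustrative.
import numpy as np

tags = ["B-Person", "I-Person", "B-Organization", "I-Organization", "O"]
NEG = -1e9

transition = np.zeros((len(tags), len(tags)))
for i, prev in enumerate(tags):
    for j, curr in enumerate(tags):
        # "I-X" may only follow "B-X" or "I-X" of the same entity class.
        if curr.startswith("I-") and not (prev[0] in "BI" and prev.endswith(curr[2:])):
            transition[i, j] = NEG

start = np.array([0.0, NEG, 0.0, NEG, 0.0])  # a sentence cannot start with "I-"

def viterbi(emissions):
    """emissions: (seq_len, num_tags) scores from the semantic representation layer."""
    score = start + emissions[0]
    backpointers = []
    for e in emissions[1:]:
        total = score[:, None] + transition + e[None, :]
        backpointers.append(total.argmax(axis=0))
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for ptr in reversed(backpointers):
        path.append(int(ptr[path[-1]]))
    return [tags[i] for i in reversed(path)]

emissions = np.random.default_rng(1).normal(size=(5, len(tags)))
print(viterbi(emissions))  # a tag path for w0..w4 that satisfies the constraints
```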
  • the target text information is input into the input representation layer in the target recognition model to generate a target vector sequence.
  • the multiple sub-vectors included in the target vector sequence are obtained by representing the target text information based on at least two text dimensions.
  • the target vector sequence is input into the semantic representation layer in the target recognition model to obtain the label prediction sequence, that is, the set of attribution probabilities of the sub-vectors with respect to multiple entity labels. Because the parallel identification nodes in the semantic representation layer are related to each other, each node can obtain its own context information, which enhances the integrity of the semantic representation and improves the recognition accuracy of subsequent entity labels.
  • the target text information can be associated with more entity tags during the identification process, and the important features of different categories of entities can be screened out to enhance the identification of entity categories.
  • the label prediction sequence is input into the conditional discrimination layer in the target recognition model to determine the target item in the attribution probability set that is used to indicate the entity in the target text information.
  • FIG. 7 is a flowchart of another entity identification method provided by the embodiment of the present application.
  • the embodiment of the present application includes at least the following steps:
  • step 701 is similar to step 301 in the embodiment shown in FIG. 3 , and details are not described here.
  • in order to improve the adaptability of the bidirectional encoder to the target text information, the bidirectional encoder can be trained by calling the entity category set corresponding to the target text information.
  • the structure of the bidirectional encoder is shown in FIG. 8 , which is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • Token is the smallest granularity unit after each word segmentation
  • CLS represents the beginning of an independent sentence
  • SEP represents the end of an independent sentence
  • C is the abbreviation of CLS, which is also the beginning of the sentence
  • T1...TN is the shorthand for the word segments Tok1 to TokN of each independent sentence.
  • the preset entity set corresponding to the target text information is obtained first; then the target category in the preset entity set is determined; and the bidirectional encoder is trained based on the target category, so that the bidirectional encoder is adapted to the target text information.
  • the BERT layer of the multi-entity recognition model is obtained by fine-tuning with multi-type entity annotation data, thereby improving the recognition accuracy.
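  • a minimal sketch, assuming a HuggingFace BERT checkpoint stands in for the bidirectional encoder, of tokenizing with [CLS]/[SEP] and keeping the encoder trainable so it can be fine-tuned on entity-annotated data:

```python
# Bidirectional-encoder input (sketch): a HuggingFace BERT checkpoint is
# assumed to stand in for the encoder; the checkpoint name and sentence
# are illustrative. The encoder's parameters remain trainable so it can
# be fine-tuned on multi-type entity annotation data.
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

batch = tokenizer(["华为发布了mate30"], return_tensors="pt")
# input_ids begin with [CLS] and end with [SEP], matching C / Tok1..TokN / SEP.
print(tokenizer.convert_ids_to_tokens(batch["input_ids"][0].tolist()))

outputs = encoder(**batch)
first_semantic_representation = outputs.last_hidden_state  # (1, seq_len, 768)
print(first_semantic_representation.shape)
```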
  • the bidirectional encoder (BERT model) and the bidirectional long short-term memory network (BiLSTM model) can be connected in sequence, that is, the BERT model is used as the preceding layer of the BiLSTM model, as shown in FIG. 9, which is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • the label prediction sequence generation process for this scenario can be as follows: the target vector sequence is first input into the bidirectional encoder to obtain the first semantic representation; the first semantic representation is then input into the bidirectional memory network model to obtain the target semantic representation; and matching against multiple entity labels is then performed based on the target semantic representation to obtain the label prediction sequence.
  • E represents the position encoding of the Token in this sentence.
  • the position of the first word of the sentence is 0, and subsequent positions are numbered in turn; T represents the initial encoding result of the output, and P represents the splicing that serves as the input to the next level, that is, it enters the CRF layer for screening.
  • the process of semantic representation involves a bidirectional description: the first semantic representation is first input into the bidirectional memory network model for a calculation based on the first order to obtain the preceding-context information; the first semantic representation is then input into the bidirectional memory network model for a calculation based on the second order to obtain the following-context information; the two are then spliced to obtain the target semantic representation.
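  • a small sketch of the sequential connection, using a stand-in tensor for the first semantic representation; the BiLSTM's forward pass supplies the preceding-context information, its backward pass the following-context information, and the two are spliced:

```python
# Sequential connection (sketch): the BERT output (first semantic
# representation) is fed to a BiLSTM, whose forward and backward passes
# are spliced into the target semantic representation.
import torch
import torch.nn as nn

first_semantic_representation = torch.randn(1, 12, 768)  # stand-in for the BERT output

bilstm = nn.LSTM(input_size=768, hidden_size=256,
                 batch_first=True, bidirectional=True)
target_semantic_representation, _ = bilstm(first_semantic_representation)
# last dim = 2 * 256: [forward (preceding context) | backward (following context)]
print(target_semantic_representation.shape)  # torch.Size([1, 12, 512])
```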
  • FIG. 10 is a model architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • in this architecture, the result of BERT encoding (the first semantic representation) and the result of BiLSTM encoding (the second semantic representation) are directly concatenated in the P layer. This can be understood as a late-fusion method in which the original input is encoded by two parallel paths; after the encoding is completed, the results are spliced, which can yield better semantic representation extraction for the original text.
  • the parallel method results in a shallower network computation depth and higher computational efficiency than the sequential connection; it is suitable for scenarios with higher requirements on computation and prediction speed, while the sequential connection is suitable for scenarios with higher precision requirements.
  • the fusion method can be determined according to the data processing level and business scenario, that is, to obtain the text size corresponding to the target text information; and then determine the method of inputting the target vector sequence into the bidirectional encoder and bidirectional memory network model based on the text size. For example, if the text size is larger than 1G, the parallel mode is used to represent the text, thereby improving the adaptability of the recognition process to the scene.
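  • a sketch of the parallel (late-fusion) mode and of choosing between the two modes by text size; the 1 GB threshold follows the example above, and the tensors and dimensions are stand-ins:

```python
# Parallel (late-fusion) mode (sketch): BERT and a BiLSTM encode the input
# independently and their outputs are concatenated at the P layer.
# The size-based mode selection follows the 1 GB example given above.
import torch
import torch.nn as nn

bert_output = torch.randn(1, 12, 768)      # first semantic representation (stand-in)
raw_embeddings = torch.randn(1, 12, 132)   # target vector sequence (stand-in)

bilstm_over_embeddings = nn.LSTM(132, 256, batch_first=True, bidirectional=True)
second_semantic_representation, _ = bilstm_over_embeddings(raw_embeddings)

target_semantic_representation = torch.cat(
    [bert_output, second_semantic_representation], dim=-1)  # P-layer splicing
print(target_semantic_representation.shape)  # torch.Size([1, 12, 1280])

def choose_fusion_mode(text_size_bytes: int) -> str:
    """Parallel favors speed on large inputs; sequential favors precision."""
    return "parallel" if text_size_bytes > 1 << 30 else "sequential"

print(choose_fusion_mode(2 << 30))  # 'parallel'
```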
  • in this way, a more comprehensive semantic expression can be output; if the association relationship is set during the recognition process, the original semantic expression can be updated, which improves the comprehensiveness of the semantic expression.
  • this embodiment upgrades the entity recognition algorithm used for recommendation in information-flow content distribution from matching to a multi-category joint entity recognition model based on the BERT+BiLSTM-CRF+Self-Attention target recognition model architecture.
  • the whole model mainly adopts the three features of character, word, and part of speech.
  • a multi-head Self-Attention layer is introduced between the BiLSTM layer and the CRF layer to capture context-related information in multiple different subspaces and screen the importance of different types of entities.
  • a BERT semantic extraction layer is added to the input representation layer, and two different types of combination with the BiLSTM are used.
  • CRF is used for tag sequence modeling, and global information constraints are used to avoid unreasonable tag sequences in the result, which greatly improves the accuracy and efficiency of the entire content entity mining process.
  • in information-flow recommendation, the text content can be efficiently structured, and on the basis of entity recognition, a good foundation and effective auxiliary input are provided for many subsequent tasks (such as relation extraction and event extraction).
  • structured data such as keywords, categories, topics, entity nouns, etc.
  • FIG. 11 is a system architecture diagram of another entity recognition method provided by an embodiment of the present application.
  • the content entity mining method and system flow chart based on the deep learning model.
  • the content distributed in the information flow includes the title of the content, the description text of the content, and the text of graphic content; video contains very little text information (only the title, plus some text obtained through subtitle OCR recognition or video audio-to-text conversion). These are the raw inputs for mining.
  • Structured data is obtained by performing content entity mining on some text information.
  • Structured text can have many purposes, such as keywords, categories, topics, entity nouns, etc., used to build knowledge graphs, and build portraits (including user portraits and item portraits), which are the basis for many subsequent tasks and processing.
  • entity extraction, that is, entity recognition, includes entity detection (find) and classification (classify); entity extraction or named entity recognition (NER) plays an important role in information extraction, mainly extracting the atomic information elements in the text, such as person names, organization/institution names, geographic locations, events/dates, character values, and amount values.
  • Subsequent relation extraction (Relation Extraction) and content analysis of event extraction must be based on content entity mining.
  • the entity mining service is called through the dispatch center service to conduct content entity mining, and the mining results are stored in the entity database to serve the recommendation system.
  • PGC, UGC (user generated content), or MCN content producers provide graphic or video content through the mobile terminal or the back-end interface API system; these are the main content sources for recommended distribution content;
  • the source of graphic content is usually a lightweight publishing terminal and an entry for editing content.
  • video content is usually published from a shooting terminal; during the shooting process, the local video content can be matched with music, filter templates, video beautification functions, and so on;
  • the content core database: the meta information of all content published by producers is stored in this business database, with an emphasis on the meta information of the content itself, such as file size, cover image link, bit rate, file format, title, release time, author, video file size, video format, and whether it is marked as original or as a first release; it also includes the classification of content made during the manual review process (including first-, second-, and third-level classification and label information; for example, for an article about Huawei mobile phones, the first-level classification is technology, the second-level classification is smartphones, the third-level classification is domestic mobile phones, and the label information is Huawei, mate30);
  • the content processing by the scheduling center mainly includes machine processing and manual review processing.
  • machine processing covers various quality judgments such as low-quality filtering, content labeling such as classification and label information (obtaining classification and label information requires content entity information mining first, which is also completed by the dispatch center service, with the final entity mining result stored in the entity database), and content sorting; these results are written into the content database, and the same content is not sent for repeated secondary manual processing;
  • content is enabled through the manual review system, and the content index information obtained by the consumer end is then provided to the content consumers of the terminal through the content export distribution service (usually a recommendation engine, a search engine, or operations) and displayed directly on the display page;
  • the content storage service stores content entity information other than the meta information of the content, such as video source files and the picture source files of graphic content;
  • after obtaining the index information of the content, the content consumer also directly accesses the content storage service to consume the actual content;
  • the manual review system is the carrier of human service capabilities. It is mainly used to review and filter content that is politically sensitive, pornographic, and not allowed by the law, which cannot be determined by the machine. At the same time, it also labels and re-confirms the video content;
  • (1) Obtain the original data from the metadata database and the content storage data.
  • the text data is preprocessed by sequence labeling as the sample data for the pre-training of the entity mining model;
  • for video content, the text information in the video can be extracted through subtitle extraction or audio-to-text conversion; this uses related technologies and mainly serves here as a channel and source of text information;
  • the target recognition model comprising the BERT+BiLSTM-CRF+Self-Attention architecture based on the above-mentioned embodiment is constructed;
  • the entity recognition method provided by the present application can be applied to the interaction process of a social network, that is, by recognizing the content sent by users, relevant tags are set or associated.
  • Social networking originates from social networking, and the starting point of social networking is email.
  • the Internet is essentially the networking between computers.
  • the early E-mail solved the problem of remote mail transmission. It is also the most popular application on the Internet, and it is also the starting point of social networking.
  • BBS went a step further and normalized "group posting" and "forwarding", theoretically realizing the function of publishing information to everyone and discussing topics (bounded by the number of BBS visitors), and it became the platform for the spontaneous generation of early Internet content.
  • Short video refers to the video content that is played on various new media platforms, suitable for watching in mobile state and short-term leisure state, and frequently pushed, ranging from a few seconds to a few minutes.
  • the content combines skills sharing, humor, fashion trends, social hotspots, street interviews, public welfare education, advertising creativity, business customization and other topics. Due to the short content, it can be a separate piece or a series of columns.
  • unlike micro-movies and live broadcasts, short video production does not have specific expression and team configuration requirements; it has the characteristics of a simple production process, a low production threshold, and strong participation, and it has more communication value than live broadcast.
  • short videos have been on the rise from UGC, PGC, user uploads at the beginning, to agencies specializing in short video production, to MCN, to professional short video apps and many other head traffic platforms.
  • short videos have become an important vehicle for content entrepreneurship and social media.
  • FIG. 12 is a flowchart of another entity identification method provided by the embodiment of the present application.
  • the embodiment of the present application includes at least the following steps:
  • the target short video may be acquired during the process of the user uploading it, that is, content review and label setting are performed on the target short video.
  • the target operation may be the user selecting whether a quick review is required, or the release date set by the user being close to the current time, so as to determine whether the launch timing is loose or urgent.
  • if the launch timing is loose, a target recognition model in which the BERT model and the BiLSTM model are connected in sequence can be selected to ensure recognition accuracy; if the launch timing is urgent, a target recognition model in which the BERT model and the BiLSTM model are connected in parallel can be selected to ensure recognition efficiency.
  • the specific timing setting can also be based on a duration threshold, that is, the timing is urgent if the interval between the release date and the current time is less than the duration threshold, and loose if the interval is greater than the duration threshold.
  • the target short video can then be tagged, for example by setting keywords; the result can also be used to determine associated videos, for example, after the user finishes watching the target short video, an associated video is determined according to the tag information of the target short video, so as to increase the interaction frequency between users and short videos and improve user activity.
  • FIG. 13 is a schematic structural diagram of an entity identification device provided by an embodiment of the present application.
  • the entity identification device 1300 includes:
  • the input unit 1302 is used to input the target text information into the input representation layer in the target recognition model to generate a target vector sequence, where the target vector sequence includes a plurality of sub-vectors obtained by representing the target text information based on at least two text dimensions;
  • the prediction unit 1303 is configured to input the target vector sequence into the semantic representation layer in the target recognition model to obtain a label prediction sequence, where the label prediction sequence is a set of attribution probabilities of the plurality of sub-vectors with respect to multiple entity labels;
  • the semantic representation layer includes a plurality of parallel identification nodes that are associated with each other, each identification node is used to identify the attribution probabilities of its corresponding sub-vector with respect to the plurality of entity labels, and the plurality of entity labels are set based on entities of different categories;
  • the identification unit 1304 is configured to input the label prediction sequence into the conditional discrimination layer in the target recognition model to determine the target item in the attribution probability set, where the target item is used to indicate the entity in the target text information.
  • the at least two text dimensions include a character dimension and a word dimension
  • the input unit 1302 is specifically configured to input the target text information into the input representation layer in the target recognition model to perform word embedding processing to obtain a word embedding vector;
  • the input unit 1302 is specifically configured to perform character embedding processing on the target text information to obtain a character embedding vector;
  • the input unit 1302 is specifically configured to use the word embedding vector and the character embedding vector as the sub-vectors to generate the target vector sequence.
  • the semantic representation layer includes a bidirectional encoder and a bidirectional memory network model
  • the prediction unit 1303 is specifically configured to input the target vector sequence into the bidirectional encoder to obtain the first semantic representation;
  • the predicting unit 1303 is specifically configured to input the first semantic representation into the bidirectional memory network model to obtain the target semantic representation;
  • the predicting unit 1303 is specifically configured to perform matching with a plurality of the entity labels based on the target semantic representation to obtain the label prediction sequence.
  • the predicting unit 1303 is specifically configured to input the first semantic representation into the two-way memory network model for calculation based on the first order, so as to obtain the above information;
  • the predicting unit 1303 is specifically configured to input the first semantic representation into the two-way memory network model for calculation based on the second order to obtain the following information;
  • the prediction unit 1303 is specifically configured to perform splicing based on the above information and the below information to obtain the target semantic representation.
  • the semantic representation layer includes the bidirectional encoder and the bidirectional memory network model, and the prediction unit 1303 is specifically configured to input the target vector sequence into the bidirectional encoder to obtain the first semantic representation;
  • the predicting unit 1303 is specifically configured to input the target vector sequence into the bidirectional memory network model to obtain a second semantic representation
  • the predicting unit 1303 is specifically configured to splicing the first semantic representation and the second semantic representation to obtain the target semantic representation;
  • the predicting unit 1303 is specifically configured to perform matching with a plurality of the entity labels based on the target semantic representation to obtain the label prediction sequence.
  • the predicting unit 1303 is specifically configured to obtain the text size corresponding to the target text information
  • the prediction unit 1303 is specifically configured to determine, based on the text size, a manner in which the target vector sequence is input to the bidirectional encoder and the bidirectional memory network model.
  • the predicting unit 1303 is specifically configured to acquire a preset entity set corresponding to the target text information
  • the predicting unit 1303 is specifically configured to determine the target category in the preset entity set
  • the prediction unit 1303 is specifically configured to train the bidirectional encoder based on the target category.
  • the identifying unit 1304 is specifically configured to input the label prediction sequence into the conditional discrimination layer in the target recognition model to obtain the constraints in the conditional discrimination layer, where the constraints are set based on preset global information;
  • the identifying unit 1304 is specifically configured to screen the attribution probability corresponding to each of the sub-vectors based on the constraint condition to determine the target item in the attribution probability set.
  • the identifying unit 1304 is specifically configured to determine a candidate label corresponding to the sub-vector, where the candidate label includes a location identifier and a label identifier;
  • the identifying unit 1304 is specifically configured to screen the corresponding relationship between the location identifier and the label identifier based on the constraint condition, so as to determine the target item in the attribution probability set.
  • the identifying unit 1304 is specifically configured to acquire an initialization transition matrix
  • the identifying unit 1304 is specifically configured to train the initialization transition matrix based on the global information corresponding to the target text information to obtain a target transition matrix;
  • the identifying unit 1304 is specifically configured to determine the constraint condition according to the distribution of transition scores in the target transition matrix.
  • the obtaining unit 1301 is specifically configured to obtain target identification data in response to a target operation, where the target identification data includes at least one form of media content;
  • the acquiring unit 1301 is specifically configured to perform text interpretation on the target identification data to determine the target text information.
  • the target text information is input into the input representation layer in the target recognition model to generate a target vector sequence.
  • the multiple sub-vectors included in the target vector sequence are obtained by representing the target text information based on at least two text dimensions.
  • the target vector sequence is input into the semantic representation layer in the target recognition model to obtain the label prediction sequence, that is, the set of attribution probabilities of the sub-vectors with respect to multiple entity labels. Because the parallel identification nodes in the semantic representation layer are related to each other, each node can obtain its own context information, which enhances the integrity of the semantic representation and improves the recognition accuracy of subsequent entity labels.
  • the target text information can be associated with more entity tags during the identification process, and the important features of different categories of entities can be screened out to enhance the identification of entity categories.
  • the label prediction sequence is input into the conditional discrimination layer in the target recognition model to determine the target item in the attribution probability set that is used to indicate the entity in the target text information.
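The sketch below strings these stages together in PyTorch: word and character embeddings are concatenated as the input representation, a BiLSTM supplies the mutually associated recognition nodes, multi-head self-attention relates each position to context captured in different subspaces, and a linear layer emits per-label scores that would then be passed to the CRF-style conditional discrimination layer. All sizes are illustrative, the character channel is simplified to one character id per token, and the CRF decoding itself is omitted; this is a hedged sketch of the architecture described above, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class EntityTagger(nn.Module):
    """Input representation -> semantic representation -> per-label emission scores."""

    def __init__(self, word_vocab: int, char_vocab: int, num_tags: int, dim: int = 128):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, dim)   # word-dimension representation
        self.char_emb = nn.Embedding(char_vocab, dim)   # character-dimension representation
        self.bilstm = nn.LSTM(2 * dim, dim, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * dim, num_heads=4, batch_first=True)
        self.emit = nn.Linear(2 * dim, num_tags)        # attribution scores per entity label

    def forward(self, word_ids: torch.Tensor, char_ids: torch.Tensor) -> torch.Tensor:
        x = torch.cat([self.word_emb(word_ids), self.char_emb(char_ids)], dim=-1)
        h, _ = self.bilstm(x)      # forward + backward context for every position
        a, _ = self.attn(h, h, h)  # multi-head self-attention across positions
        return self.emit(a)        # label prediction scores, to be constrained by the CRF layer


model = EntityTagger(word_vocab=5000, char_vocab=3000, num_tags=9)
words = torch.randint(0, 5000, (1, 6))
chars = torch.randint(0, 3000, (1, 6))
print(model(words, chars).shape)   # torch.Size([1, 6, 9]) -- one score per token and per label
```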
  • Embodiments of the present application further provide a terminal device, which may be the terminal device implementing the entity identification method mentioned in the foregoing embodiments, and the entity identification apparatus provided by the embodiments of the present application may be configured in the terminal device.
  • As shown in FIG. 14, a schematic structural diagram of another terminal device provided by an embodiment of the present application, only the parts related to the embodiment of the present application are shown for convenience of description; for specific technical details that are not disclosed, refer to the method embodiments of the present application.
  • the terminal can be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS), a vehicle-mounted computer, etc.
  • The following takes the terminal being a mobile phone as an example:
  • FIG. 14 is a block diagram showing a partial structure of a mobile phone related to a terminal provided by an embodiment of the present application.
  • the mobile phone includes: a radio frequency (RF) circuit 1410, a memory 1420, an input unit 1430, a display unit 1440, a sensor 1450, an audio circuit 1460, a wireless fidelity (WiFi) module 1470, a processor 1480, a power supply 1490, and other components.
  • the RF circuit 1410 can be used to receive and send signals during the sending and receiving of information or during a call; in particular, after downlink information from a base station is received, it is passed to the processor 1480 for processing, and uplink data is sent to the base station.
  • RF circuitry 1410 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like.
  • RF circuitry 1410 may also communicate with networks and other devices via wireless communications.
  • the above-mentioned wireless communication can use any communication standard or protocol, including but not limited to the global system of mobile communication (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, short message service (SMS), and so on.
  • the memory 1420 can be used to store software programs and modules, and the processor 1480 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1420 .
  • the memory 1420 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playback function or an image playback function), and the like, and the data storage area may store data created according to the use of the mobile phone (such as audio data and a phone book), and the like.
  • the memory 1420 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • the input unit 1430 may be used for receiving inputted numerical or character information, and generating key signal input related to user setting and function control of the mobile phone.
  • the input unit 1430 may include a touch panel 1431 and other input devices 1432 .
  • the touch panel 1431, also known as a touch screen, can collect the user's touch operations on or near it (such as operations performed by the user on or near the touch panel 1431 with a finger, a stylus, or any other suitable object or accessory, as well as hovering touch operations within a certain range of the touch panel 1431), and drive the corresponding connection apparatus according to a preset program.
  • the touch panel 1431 may include two parts, a touch detection device and a touch controller.
  • the touch detection device detects the user's touch orientation, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends the coordinates to the processor 1480, and can receive and execute commands sent by the processor 1480.
  • the touch panel 1431 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave types.
  • the input unit 1430 may also include other input devices 1432 .
  • other input devices 1432 may include, but are not limited to, one or more of physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 1440 may be used to display information input by the user or information provided to the user and various menus of the mobile phone.
  • the display unit 1440 may include a display panel 1441.
  • the display panel 1441 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • the touch panel 1431 may cover the display panel 1441; when the touch panel 1431 detects a touch operation on or near it, it transmits the operation to the processor 1480 to determine the type of the touch event, and the processor 1480 then provides a corresponding visual output on the display panel 1441 according to the type of the touch event.
  • although the touch panel 1431 and the display panel 1441 are shown as two independent components to realize the input and output functions of the mobile phone, in some embodiments the touch panel 1431 and the display panel 1441 may be integrated to realize the input and output functions of the mobile phone.
  • the cell phone may also include at least one sensor 1450, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 1441 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 1441 and/or the backlight when the mobile phone is moved close to the ear.
  • as a kind of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in all directions (generally on three axes), and can detect the magnitude and direction of gravity when stationary; it can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait modes, related games, and magnetometer attitude calibration) and in vibration-recognition related functions (such as a pedometer and tapping); other sensors that may also be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described here again.
  • the audio circuit 1460, the speaker 1461, and the microphone 1462 can provide an audio interface between the user and the mobile phone.
  • the audio circuit 1460 can convert received audio data into an electrical signal and transmit it to the speaker 1461, which converts it into a sound signal for output; on the other hand, the microphone 1462 converts a collected sound signal into an electrical signal, which is received by the audio circuit 1460 and converted into audio data; the audio data is then output to the processor 1480 for processing and sent, for example, to another mobile phone through the RF circuit 1410, or output to the memory 1420 for further processing.
  • WiFi is a short-distance wireless transmission technology.
  • the mobile phone can help users to send and receive emails, browse web pages, and access streaming media through the WiFi module 1470. It provides users with wireless broadband Internet access.
  • although FIG. 14 shows the WiFi module 1470, it can be understood that it is not an essential component of the mobile phone and may be omitted as required without changing the essence of the invention.
  • the processor 1480 is the control center of the mobile phone; it uses various interfaces and lines to connect the parts of the entire mobile phone, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 1420 and invoking the data stored in the memory 1420, thereby monitoring the mobile phone as a whole.
  • the processor 1480 may include one or more processing units; optionally, the processor 1480 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1480.
  • the mobile phone also includes a power supply 1490 (such as a battery) for supplying power to various components.
  • the power supply can be logically connected to the processor 1480 through a power management system, so that functions such as charging, discharging, and power consumption management are handled through the power management system.
  • the mobile phone may also include a camera, a Bluetooth module, and the like, which will not be repeated here.
  • the processor 1480 included in the terminal device also has the function of executing various steps of the above entity identification method.
  • FIG. 15 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server 1500 may vary greatly due to different configurations or performance, and may include one or more central processing units (CPU) 1522 (for example, one or more processors), a memory 1532, and one or more storage media 1530 (for example, one or more mass storage devices) storing application programs 1542 or data 1544.
  • the memory 1532 and the storage medium 1530 may be short-term storage or persistent storage.
  • the program stored in the storage medium 1530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Furthermore, the central processing unit 1522 may be configured to communicate with the storage medium 1530 to execute a series of instruction operations in the storage medium 1530 on the server 1500 .
  • the server 1500 may also include one or more power supplies 1526, one or more wired or wireless network interfaces 1550, one or more input/output interfaces 1558, and/or one or more operating systems 1541, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
  • the steps performed by the computer device in the above embodiment may be based on the server structure shown in FIG. 15 .
  • An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program is used to execute the method provided by the foregoing embodiment.
  • Embodiments of the present application also provide a computer program product including entity identification instructions, which, when run on a computer, enables the computer to execute the method provided by the above embodiments.
  • This embodiment of the present application also provides an entity identification system, and the entity identification system may include the entity identification apparatus in the embodiment described in FIG. 13, the terminal device in the embodiment described in FIG. 14, or the server described in FIG. 15.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be implemented through some interfaces as an indirect coupling or communication connection between apparatuses or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • In essence, the technical solution of the present application, or the part that contributes to the related technology, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an entity identification apparatus, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

An entity recognition method, apparatus, device, and storage medium, relating to natural language processing technology in artificial intelligence. Target text information is obtained (301); the target text information is then input into an input representation layer in a target recognition model to generate a target vector sequence (302); the target vector sequence is input into a semantic representation layer to obtain a label prediction sequence (303); and the label prediction sequence is input into a conditional discrimination layer to determine a target item in a set of attribution probabilities (304). An efficient entity recognition process is thereby realized: because multiple entity labels are used for matching, important features of entities of different categories can be screened out and the ability to distinguish entity categories is enhanced, no manual review is required, and the efficiency and accuracy of entity recognition are improved.

Description

一种实体识别方法、装置、设备以及存储介质
本申请要求于2020年10月14日提交中国专利局、申请号为202011096598.0、申请名称为“一种实体识别方法、装置、设备以及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及计算机技术领域,尤其涉及实体识别。
背景技术
实体识别是自然语言处理中的一项基础任务,应用范围非常广泛。以实体为例,实体一般指的是文本中具有特定意义或者指代性强的实体,通常包括人名、地名、组织机构名、日期时间、专有名词等。通过从非结构化的输入文本中抽取出上述实体,并且可以按照业务需求识别出更多类别的实体,比如产品名称、型号、价格等。因此实体这个概念可以很广,只要是业务需要的特殊文本片段都可以称为实体。通过实体识别,可以提炼出想要的数据或对象。实体识别是后续进行内容挖掘分析,关系抽取和事件分析的基础。
实体识别的过程可以采用多模式匹配(Aho–Corasick,AC)算法,即寻找模式串内部规律,达到在每次失配时的高效跳转,例如识别模式串之间的相同前缀关系进行实体识别。
发明内容
有鉴于此,本申请提供一种实体识别方法,可以有效提高实体识别的效率以及准确性。
一方面,本申请实施例提供一种实体识别方法,可以应用于终端设备中包含实体识别功能的系统或程序中,具体包括:
获取目标文本信息;
将所述目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,所述目标向量序列包括多个子向量,所述多个子向量是基于至少两个文本维度对所述目标文本信息表示所得;
将所述目标向量序列输入所述目标识别模型中的语义表示层,以得到标签预测序列,其中,所述标签预测序列为所述多个子向量分别与多个实体标签的归属概率集合,所述语义表示层包括多个并列的识别节点,所述识别节点之间相互关联,所述识别节点用于识别对应的子向量与多个所述实体标签的归属概率,多个所述实体标签基于不同类别的实体设定;
将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以确定所述归属概率集合中的目标项,所述目标项用于指示所述目标文本信息中的所述实体。
另一方面,本申请实施例提供一种实体识别装置,包括:获取单元,用于获取目标文本信息;
输入单元,用于将所述目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,所述目标向量序列包括多个子向量,所述子向量是基于至少两个文本维度对所述目标文本信息表示所得;
预测单元,用于将所述目标向量序列输入所述目标识别模型中的语义表示层,以得到标签预测序列,其中,所述标签预测序列为所述子向量分别与多个实体标签的归属概率集 合,所述语义表示层包括多个并列的识别节点,所述识别节点之间相互关联,所述识别节点用于识别对应的子向量与多个所述实体标签的归属概率,多个所述实体标签基于不同类别的实体设定;
识别单元,用于将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以确定所述归属概率集合中的目标项,所述目标项用于指示所述目标文本信息中的所述实体。
另一方面,本申请实施例提供一种计算机设备,包括:存储器、处理器以及总线系统;所述存储器用于存储程序代码;所述处理器用于根据所述程序代码中的指令执行上述方面所述的实体识别方法。
另一方面,本申请实施例提供一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,所述计算机程序用于执行上述方面所述的实体识别方法。
又一方面,本申请实施例提供了一种计算机程序产品或计算机程序,该计算机程序产品或计算机程序包括计算机指令,该计算机指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机指令,处理器执行该计算机指令,使得该计算机设备执行上述方面所述的实体识别方法。
从以上技术方案可以看出,本申请实施例具有以下优点:
针对待识别实体的目标文本信息,将目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,为了增强对目标文本信息的特征描述全面性,通过至少两个文本维度对目标文本信息进行表示,确定目标向量序列包括的多个子向量是基于至少两个文本维度对目标文本信息表示所得。将该目标向量序列输入目标识别模型中的语义表示层,得到标识子向量分别与多个实体标签的归属概率集合的标签预测序列,语义表示层包括多个并列且相互关联的识别节点,使得识别节点间能够得到各自的上下文信息,增强语义表示的完整性,进而提高后续实体标签的识别准确性。而且,由于前述多个实体标签基于不同类别的实体设定,在识别过程中能够将目标文本信息与更多的实体标签进行关联,可以筛选出不同类别实体的重要特征,增强对于实体类别的分辨能力,进而将标签预测序列输入目标识别模型中的条件鉴别层,以确定归属概率集合中用于指示目标文本信息中的实体的目标项。从而实现高效的实体识别过程,提高了实体识别的效率及准确性。
附图说明
图1为实体识别系统运行的网络架构图;
图2为本申请实施例提供的一种实体识别的流程架构图;
图3为本申请实施例提供的一种实体识别方法的流程图;
图4为本申请实施例提供的一种实体识别方法的模型架构图;
图5为本申请实施例提供的另一种实体识别方法的模型架构图;
图6为本申请实施例提供的另一种实体识别方法的模型架构图;
图7为本申请实施例提供的另一种实体识别方法的流程图;
图8为本申请实施例提供的另一种实体识别方法的模型架构图;
图9为本申请实施例提供的另一种实体识别方法的模型架构图;
图10为本申请实施例提供的另一种实体识别方法的模型架构图;
图11为本申请实施例提供的一种实体识别方法的系统架构图;
图12为本申请实施例提供的另一种实体识别方法的流程图;
图13为本申请实施例提供的一种实体识别装置的结构示意图;
图14为本申请实施例提供的一种终端设备的结构示意图;
图15为本申请实施例提供的一种服务器的结构示意图。
具体实施方式
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”、“第四”等(如果存在)是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本申请的实施例例如能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“对应于”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
首先,对本申请实施例中可能出现的一些名词进行解释。
长短期记忆网络(Long Short-Term Memory,LSTM):一种时间循环神经网络,是为了解决一般的循环神经网络(RNN)存在的长期依赖问题而专门设计出来的,LSTM适合于处理和预测时间序列中间隔和延迟非常长的重要事件。
自然语言处理(NLP):是计算机科学领域与人工智能领域中的一个重要方向。它研究能实现人与计算机之间用自然语言进行有效通信的各种理论和方法。自然语言处理是一门融语言学、计算机科学、数学于一体的科学。
实体识别:是信息提取、问答系统、句法分析、机器翻译等应用领域的重要基础工具,在自然语言处理技术走向实用化的过程中占有重要地位。例如,实体识别可以识别出待处理文本中三大类(实体类、时间类和数字类)、七小类(人名、机构、地名、时间、日期、货币和百分比)实体,等等。
条件随机场(conditional random field CRF):是一种鉴别式机率模型,是随机场的一种,常用于标注或分析序列资料,如自然语言文字或是生物序列。
短视频:即短片视频,是一种互联网内容传播方式,一般是在互联网新媒体上传播的时长在5分钟以内的视频传播内容;随着移动终端普及和网络的提速,短平快的大流量传播内容逐渐获得各大平台、粉丝和资本的青睐。
多频道网络(Multi-Channel Network,MCN):是一种多频道网络的产品形态,将PGC内容联合起来,在资本的有力支持下,保障内容的持续输出,从而最终实现商业的稳定变现。
内容生产中心(Professional Generated Content,PGC):指专业生产内容(视频网站)、专家生产内容(微博)。用来泛指内容个性化、视角多元化、传播民主化、社会关系虚拟化。也称为PPC(Professionally-produced Content)。
用户原创内容(User Generated Content,UGC):指用户原创内容,是伴随着以提倡个性化为主要特点的Web2.0概念而兴起的。它并不是某一种具体的业务,而是一种用户使用 互联网的新方式,即由原来的以下载为主变成下载和上传并重。
消息来源(Feeds),又译为源料、馈送、资讯提供、供稿、摘要、源、新闻订阅、网源是一种资料格式,网站透过它将最新资讯传播给用户,通常以时间轴方式排列,Timeline是Feed最原始最直觉也最基本的展示形式。用户能够订阅网站的先决条件是,网站提供了消息来源。将feed汇流于一处称为聚合(aggregation),而用于聚合的软体称为聚合器(aggregator)。对最终用户而言,聚合器是专门用来订阅网站的软件,一般亦称为RSS阅读器、feed阅读器、新闻阅读器等。
应理解,本申请提供的实体识别方法可以应用于计算机设备,该计算机设备例如包括终端设备或服务器。终端设备中包含实体识别功能的系统或程序中,例如互动剧,具体的,实体识别系统可以运行于如图1所示的网络架构中,如图1所示,是实体识别系统运行的网络架构图,如图可知,实体识别系统可以提供与多个信息源的实体识别过程,即服务器可以接受多个终端发送的文本内容,并对其中的实体进行识别,并回复终端对应的识别结果;可以理解的是,图1中示出了多种终端设备,在实际场景中可以有更多或更少种类的终端设备参与到实体识别的过程中,具体数量和种类因实际场景而定,此处不做限定,另外,图1中示出了一个服务器,但在实际场景中,也可以有多个服务器的参与,具体服务器数量因实际场景而定。
本实施例中,服务器可以是独立的物理服务器,也可以是多个物理服务器构成的服务器集群或者分布式系统,还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、CDN、以及大数据和人工智能平台等基础云计算服务的云服务器。终端可以是智能手机、平板电脑、笔记本电脑、台式计算机、智能音箱、智能手表等,但并不局限于此。终端以及服务器可以通过有线或无线通信方式进行直接或间接地连接,终端以及服务器可以连接组成区块链网络,本申请在此不做限制。
可以理解的是,上述实体识别系统可以运行于个人移动终端,例如:作为互动剧这样的应用,也可以运行于服务器,还可以作为运行于第三方设备以提供实体识别,以得到信息源的实体识别处理结果;具体的实体识别系统可以是以一种程序的形式在上述设备中运行,也可以作为上述设备中的系统部件进行运行,还可以作为云端服务程序的一种,具体运作模式因实际场景而定,此处不做限定。
本申请应用于自然语言处理技术,自然语言处理(Nature Language processing,NLP)是计算机科学领域与人工智能领域中的一个重要方向。它研究能实现人与计算机之间用自然语言进行有效通信的各种理论和方法。自然语言处理是一门融语言学、计算机科学、数学于一体的科学。因此,这一领域的研究将涉及自然语言,即人们日常使用的语言,所以它与语言学的研究有着密切的联系。自然语言处理技术通常包括实体识别、文本处理、语义理解、机器翻译、机器人问答、知识图谱等技术。
其中,实体识别(Named Entity Recognition,NER)又称作专名识别,是自然语言处理中的一项基础任务,应用范围非常广泛。实体一般指的是文本中具有特定意义或者指代性强的实体,通常包括人名、地名、组织机构名、日期时间、专有名词等。NER系统就是 从非结构化的输入文本中抽取出上述实体,并且可以按照业务需求识别出更多类别的实体,比如产品名称、型号、价格等。因此实体这个概念可以很广,只要是业务需要的特殊文本片段都可以称为实体。通过实体识别,可以提炼出想要的数据,他是后续进行内容挖掘分析,关系抽取和事件分析的基础。
一般,实体识别的过程可以采用多模式匹配(AC)算法,即寻找模式串内部规律,达到在每次失配时的高效跳转,例如识别模式串之间的相同前缀关系进行实体识别。
但是,基于AC算法匹配方式进行实体识别,匹配过程单一,容易引入错误,准确率难以继续提升,且需要进行人工审核,影响了实体识别的效率。
为了解决上述问题,本申请提出了一种实体识别方法,该方法应用于图2所示的实体识别的流程框架中,如图2所示,为本申请实施例提供的一种实体识别的流程架构图,通过对于文本信息进行多维度表示的向量转化(例如采用字、词、词性三种维度特征)得到词向量,然后对不同的实体标签(例如实体标签1-n)进行匹配,以在多个不同子空间捕获上下文相关信息,筛选不同类别实体的重要特征,增强特征对实体类别分辨能力,进而提升识别效果,确定出文本信息对应的目标实体。
可以理解的是,本申请所提供的方法可以为一种程序的写入,以作为硬件系统中的一种处理逻辑,也可以作为一种实体识别装置,采用集成或外接的方式实现上述处理逻辑。作为一种实现方式,该实体识别装置通过获取目标文本信息;然后将目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,目标向量序列包括多个子向量,子向量基于至少两个文本维度表示所得;进一步的将目标向量序列输入目标识别模型中的语义表示层,以得到标签预测序列,其中,标签预测序列为子向量与多个实体标签的归属概率集合,语义表示层包括多个并列的识别节点,识别节点之间相互关联,识别节点用于识别对应的子向量与多个实体标签的归属概率,实体标签基于不同类别的实体设定;进而将标签预测序列输入目标识别模型中的条件鉴别层,以确定归属概率集合中的目标项,目标项用于指示目标文本信息中的实体。从而实现高效的实体识别过程,由于采用多个实体标签进行匹配,可以筛选不同类别实体的重要特征,增强对于实体类别的分辨能力,且无需人工审核的过程,提高了实体识别的效率及准确性。
本申请实施例提供的方案涉及人工智能的自然语言处理技术,具体通过如下实施例进行说明:
结合上述流程架构,下面将对本申请中实体识别方法进行介绍,请参阅图3,图3为本申请实施例提供的一种实体识别方法的流程图,该管理方法可以是由终端设备执行的,也可以是由服务器执行的,还可以是由终端设备与服务器共同执行的,下面以终端设备执行为例进行说明。本申请实施例至少包括以下步骤:
301、获取目标文本信息。
本实施例中,目标文本信息可以是来源于多种信息来源,例如网页、应用程序、界面等。
在一种可能的实现方式中,在从信息来源获取的数据中有多媒体内容时,需要对文本信息进行提取,即首先响应于目标操作获取目标识别数据,目标识别数据包含至少一种媒 体内容形式,例如短视频;然后基于媒体内容形式对目标识别数据进行文本解译,以确定目标文本信息。例如对短视频中的摘要部分进行解译,或对短视频进行声信号识别,并转换为对应的文本信息,具体的内容形式因实际场景而定,此处不做限定。
302、将目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列。
本实施例中,目标向量序列包括多个子向量,子向量基于至少两个文本维度表示所得;其中,为了保证对于目标文本信息描述的准确性,可以采用不同维度的特征进行描述,例如采用词和字的维度进行向量表示。
在一种可能的实现方式中,所述至少两个文本维度包括词维度和字维度。首先将目标文本信息输入目标识别模型中的输入表示层进行词嵌入处理,以得到词嵌入向量;然后对目标文本信息进行字嵌入处理,以得到字嵌入向量;进而将词嵌入向量和字嵌入向量作为子向量,生成目标向量序列。从而提高了目标文本信息的特征维度,保证了特征描述的全面性。
在一种可能的场景中,词嵌入向量的处理过程可以是采用word2vec方式或者词袋模型方式,而字嵌入处理过程可以是采用随机的01向量字符串进行编码所得,然后字嵌入和词嵌入向量联合到一起作为目标向量序列,即在标识的时候给字粒度的向量单独留下了表示位置,比如采用的字的OneHot编码,直接放在子向量后面,例如“视频”的目标向量序列是“视频”“视”“频”这样的一条序列。其中,词嵌入通常是事先训练好的,字嵌入则是随机初始化的,且嵌入过程会随着训练的迭代过程被调整。
可以理解的是,基于上述向量转换的处理方式对目标文本信息中的每个单元进行处理,以得到包含多个向量序列的目标向量序列。例如对于一句话,将其中的每个词作为一个单元进行向量转换的处理。
303、将目标向量序列输入目标识别模型中的语义表示层,以得到标签预测序列。
本实施例中,标签预测序列为子向量与多个实体标签的归属概率集合,即基于不同类别的实体标签进行匹配过程,为一种多头注意力(Multi-Head Attention)机制的处理过程,Multi-Head Attention机制增加了目标识别模型捕获目标向量序列中不同位置信息的能力,如果直接用映射前的参数计算,只能得到一个固定的权重概率分布,而这个概率分布会重点关注一个位置或个几个位置的信息,但是基于Multi-Head Attention机制的匹配过程,可以和更多的位置上的实体标签关联起来。且由于在进行映射时不共享权值,因此映射后的子空间是不同的,或者说不同的子空间涵盖的信息是不一样的,故最后拼接的向量涵盖的信息会更广。
可以理解的是,语义表示层包括多个并列的识别节点,识别节点之间相互关联,识别节点用于识别对应的子向量与多个实体标签的归属概率,具体的,每个识别节点可以采用双向长短期记忆(Bi-directional Long Short-Term Memory,BiLSTM)模型,由于识别节点之间相互关联可以使得语义表示过程中得到上下文信息,保证语义表示的完整性,进而保证标签预测的准确性。
具体的,如图4所示,为本申请实施例提供的一种实体识别方法的模型架构图;图中示出了目标识别模型包括输入表示层、语义表示层和条件鉴别层,通过输入表示层输入目标 向量序列得到X1-Xt,然后通过语义表示层中的LSTM1-LSTM3进行多特征维度的关联表示,进而进行特征的拼接,并基于自注意力机制进行多头的匹配操作,与更多的实体标签进行关联,从而保证输入条件鉴别层中的Y1-Yt预测实体标签的准确性。
在一种可能的场景中,由于基于Multi-Head Attention机制可以得到子向量的位置信息,故对应的预测标签可以采用的BIESO-Type的两段式标注规范。对于第一段,包括begin、Inside、End,分别代表实体的开始(B),中间(I),结束位置(E);另外,Single代表单字实体,Other代表非实体,对于第二段,Type包括PER、LOC、ORG、GAME、BOOK等的分别对应人名,地名,组织机构名,游戏,书籍等等。比如人名Person和组织机构Organization挖掘的数据集中总共有8类标签:B-Person(人名的开始部分),I-Person(人名的中间部分),E-Person(人名的结束地方),B-Organization(组织机构的开始部分),I-Organization(组织机构的中间部分),E-Organization(组织机构的结束部分),O(非实体信息),S-(单字实体),具体的标签类型饮食及场景而定,此处不做限定。
可选的,上述标签的设定还可以包括S-ORG表示的是组织机构,S-GAME表示的是游戏,S-Book表示的是书籍,S-Music表示的是音乐,S-Food表示的是食品,S-Health表示的是健康,S-Tourist表示的是旅游,S-Military表示的是军事,S-antiques古董等等,挖掘实体的具体种类和类型以业务需要和内容类目的分布因实际场景而定。
下面以人名为Person和组织机构为Organization的识别过程为例对目标识别模型的运行过程进行说明,如图5所示,为本申请实施例提供的另一种实体识别方法的模型架构图。图中示出了加入目标识别模型的一段文本信息x是包含了5个单词的一句话(w0,w1,w2,w3,w4)。在句子x中[w0,w1]是人名,[w3]是组织机构名称,其他都是“O”。BiLSTM层(语义表示层)的输出表示该单词对应各个类别的分数(归属概率)。结合图6进行说明,如w0,BiLSTM节点(识别节点)的输出是1.5(B-Person),0.9(I-Person),0.1(B-Organization),0.08(I-Organization)and 0.05(O),上述分数将会作为CRF层的输入。
304、将标签预测序列输入目标识别模型中的条件鉴别层,以确定归属概率集合中的目标项。
本实施例中,目标项用于指示目标文本信息中的实体,即识别结果。
具体的,结合步骤303中图5所示的实施例,将上述分数将会作为CRF层的输入后,CRF层会确定其中的目标项,例如归属概率最高(分数最高)的实体标签作为目标项,即可以得到实体识别结果为w0(B-Person),w1(I-Person),w2(B-Organization),w3(I-Organization),w4(O)。其中,CRF层可以作为基于词性特征对于识别过程进行优化的步骤,故结合上述字、词粒度的划分与结合,本身其中可以采用字、词、词性三种特征进行实体识别。
可以理解的是,上述目标项的过程相当于在条件鉴别层中设定了对于标签预测序列的约束条件,即首先将标签预测序列输入目标识别模型中的条件鉴别层,以获取条件鉴别层中的约束条件;然后基于约束条件对每个子向量对应的归属概率进行筛选,以确定归属概率集合中的目标项。其中,约束条件基于预设的全局信息设定。
可选的,CRF层网络还可以学习到句子的约束条件,通过CRF层可以加入全局的约束信息来保证最终预测结果是有效的。该约束条件可以在训练数据时被CRF层自动学习得到。 比如可能的约束条件有:句子的开头应该是“B-”或“O”,而不是“I-”或者“E-”等。
对于全局信息设定的场景,在识别过程中可以首先确定标签预测序列中子向量对应的候选标签,该候选标签包括位置标识和标签标识;然后基于约束条件对位置标识和标签标识的对应关系进行筛选,以确定归属概率集合中的目标项,例如筛除不能出现在句首的标签,从而保证实体识别的准确性。
可选的,在上述CRF层的训练过程中,还可以是基于转移矩阵进行的,即首先获取初始化转移矩阵;然后基于目标文本信息对应的全局信息对初始化转移矩阵进行训练,以得到目标转移矩阵;进而根据目标转移矩阵中转移分数的分布确定约束条件。即在考虑当前标签的归属概率的同时,还需要考虑相邻标签的影响。具体的,在如图6所示的场景中,图6为本申请实施例提供的另一种实体识别方法的模型架构图例如用TransScore来表示转移分数。例如,tB-Person,I-Person=0.9表示从类别B-Person→I-Person的分数是0.9,可以确定为w2的归属分数。
可以理解的是,在CRF层训练之前,可以随机初始化转移矩阵的分数。这些分数将随着训练的迭代过程被更新,即CRF层可以自己学到这些约束条件,不同类别之间的转移分数就构成了转移矩阵,即满足转移矩阵的标签可以作为识别结果。对于转移分数主要作用即帮助CRF计算损失函数,该损失由两部分组成:真实路径的分数和所有路径的总分数,其中,转移矩阵的分数大小直接影响最终损失函数大小,从而实现约束,保证命名识别的准确性。
在一种可能的场景中,对于-“B-label1 I-label2 I-label3…”预测序列,类别1,2,3应该是同一种实体类别。比如,“B-Person I-Person”是正确的,而“B-Person I-Organization”则是错误的。“O I-label”是错误的,实体的开头应该是“B-”而不是“I-”。通过这些有用的约束规则,最终模型错误的预测序列将会大大减少。利用全局信息约束,避免结果中不合理的标签序列,使得整个内容实体挖掘效率的准确度有了很大的提升。
结合上述实施例可知,针对待识别实体的目标文本信息,将目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,为了增强对目标文本信息的特征描述全面性,通过至少两个文本维度对目标文本信息进行表示,确定目标向量序列包括的多个子向量是基于至少两个文本维度对目标文本信息表示所得。将该目标向量序列输入目标识别模型中的语义表示层,得到标识子向量分别与多个实体标签的归属概率集合的标签预测序列,语义表示层包括多个并列且相互关联的识别节点,使得识别节点间能够得到各自的上下文信息,增强语义表示的完整性,进而提高后续实体标签的识别准确性。而且,由于前述多个实体标签基于不同类别的实体设定,在识别过程中能够将目标文本信息与更多的实体标签进行关联,可以筛选出不同类别实体的重要特征,增强对于实体类别的分辨能力,进而将标签预测序列输入目标识别模型中的条件鉴别层,以确定归属概率集合中用于指示目标文本信息中的实体的目标项。从而实现高效的实体识别过程,提高了实体识别的效率及准确性。
在一种可能的场景中,在语义表示的过程中,需要针对性的准确训练数据进行训练才能达到良好的识别效果,但是准备精确的训练数据耗时耗力;为了更好挖掘和获取原始文 本的语义,可以在原有的语义表示层(BiLSTM模型)的基础上添加双向编码器(Bidirectional Encoder Representation from Transformers,BERT),以使得语义表示层可以学习到上下文信息,提高实体识别效果,下面对该场景进行介绍。具体的,可以采用图7所示的流程,图7为本申请实施例提供的另一种实体识别方法的流程图,本申请实施例至少包括以下步骤:
701、获取目标文本信息。
本实施例中,步骤701与图3所示实施例的步骤301相似,此处不做赘述。
702、基于目标文本信息对应的实体类别集合对双向编码器进行训练。
本实施例中,为了提高双向编码器对于目标文本信息的适应性,可以调用与目标文本信息对应的实体类别集合对双向编码器进行训练。其中,双向编码器的结构如图8所示,图8为本申请实施例提供的另一种实体识别方法的模型架构图。其中,Token为是每一个分词后最小粒度单元,CLS表示的一个独立句子的开头,SEP表示一个独立句子的结尾,另外,C是CLS的简写,也是句子的开头;T1…TN是每个独立分词Tok1到TokN的简写。
具体的,即首先获取目标文本信息对应的预设实体集合;然后确定预设实体集合中的目标类别;进而基于目标类别对双向编码器进行训练,以使得双向编码器适配于目标文本信息。例如在PreTrain Model(BERT-Base,Chinese)的基础上,通过多类型实体标注数据进行finetune,得到支持多实体的实体识别模型的BERT层,从而提高识别的准确性。
703、调整双向编码器与双向记忆网络模型的关联关系。
本实施例中,双向编码器BERT模型与双向长短期记忆网络模型BiLSTM模型的关联关系可以是依次连接的,即BERT模型作为BiLSTM模型的前置层,如图9所示,图9为本申请实施例提供的另一种实体识别方法的模型架构图。对于该场景的标签预测序列生成过程可以是首先将目标向量序列输入双向编码器,以得到第一语义表示;然后将第一语义表示输入双向记忆网络模型,以得到目标语义表示;进而基于目标语义表示与多个实体标签进行匹配,以得到标签预测序列。其中E表示Token在这个句子当中的位置编码,比如开头句子的第一个词位置为0,然后依次编码序号;T表示输出的初步编码结果,P表示的是拼接在一起作为下一级的输入,即进入CRF层进行筛选。
对于该场景中,语义表示的过程涉及双向的描述,即首先将第一语义表示输入双向记忆网络模型进行基于第一次序的计算,以得到上位信息;然后将第一语义表示输入双向记忆网络模型进行基于第二次序的计算,以得到下位信息;进而基于上位信息和下位信息进行拼接,以得到目标语义表示。从而保证了上下文信息的获取,提高了语义表示的准确性。
另外,BERT模型与BiLSTM模型的关联关系还可以是并联的,如图10所示,图10为本申请实施例提供的另一种实体识别方法的模型架构图。对于图中架构生成标签预测序列的过程,即首先将目标向量序列输入双向编码器,以得到第一语义表示;然后将目标向量序列输入双向记忆网络模型,以得到第二语义表示;进而对第一语义表示和第二语义表示进行拼接,以得到目标语义表示;从而基于目标语义表示与多个实体标签进行匹配,以得到标签预测序列。即将BERT编码的结果(第一语义表示)和BiLSTM编码的结果(第二语义表示)在P层做了直接的拼接。可以理解为原始的输入有2路编码采用的延迟融合的方式,在编码完成后将各个结果进行拼接,可以对原始深入文本有更好的语义表征抽取。
可以理解的是,并联的方式相对网络计算的深度会浅一些,计算的效率比依次连接更高,适合对计算速度和预测速度有更高要求的地方,而依次连接适合有更高精度的地方。
可选的,可以依照数据处理量级和业务场景来进行融合方式的确定,即获取目标文本信息对应的文本大小;然后基于文本大小确定目标向量序列输入双向编码器和双向记忆网络模型的方式。例如文本大小大于1G,则选用并联的方式进行文本表示,从而提高了识别过程对于场景的适应性。
704、对目标向量序列的语义表达进行更新。
本实施例中,基于上述BERT模型与BiLSTM模型的关联关系设定,可以输出更为全面的语义表达,若是在识别过程中进行的关联关系设定,则可以是对原有的语义表达进行更新,以提高语义表达的全面性。
通过上述实施例可知,本实施例将信息流内容分发当中的推荐实体识别算法从匹配升级到包含BERT+BiLSTM-CRF+Self-Attention的目标识别模型架构的多类别实体联合识别模型。整个模型主要采用字、词、词性三种特征,在BiLSTM层与CRF层间引入多头注意力机制multi-head Self-Attention层,在多个不同子空间捕获上下文相关信息,筛选不同类别实体的重要特征,增强特征对实体类别分辨能力,进而提升模型识别效果;同时为了解决训练数据难以获取导致的精度不高的问题,在输入表示层增加Bert语义抽取层,并且采用了2种不同的与BiLSTM的融合方式来获取最佳效果;最后使用CRF进行标签序列建模,利用全局信息约束,避免结果中不合理的标签序列,使得整个内容实体挖掘效率的准确度有了很大的提升。
进一步的,通过本申请能够在信息流推荐当中对文本内容进行高效的结构化处理,后续很多任务(比如关系抽取,事件抽取)提供良好的基础和供有效的辅助输入,在实体识别的基础上确定无结构文本中实体对间的关系类别,并形成结构化的数据以便存储和取用;同时生成结构化数据,比如,关键词、分类、主题、实体名词等,能够有效帮助建立知识图谱,建立用户画像,这种内容分析越深入,越能提供有效信息供推荐系统使用,越能抓住的用户群体就越细致,推荐的转化率就越高;算法抽取到的实体信息,可以作为人工标记分类,主题和事件信息的补充,辅助人工标注,节省人力成本。
在另一种可能的场景中,上述实施例可以应用于图11所示的内容识别系统中,图11为本申请实施例提供的另一种实体识别方法的系统架构图,
如上图所示基于深度学习模型的内容实体挖掘方法和系统流程图。信息流当中分发的内容,包括内容的对应的标题,内容的描述文本,图文内容正文,视频包含的文本信息过少(只有标题或者通过字幕OCR识别文本抽取,视频音转文会有一部分文本),这些都是挖掘输入的原始信息。通过对些文本信息进行内容实体挖掘来获取结构化数据。结构化文本的目的可以有很多,比如,关键词、分类、主题、实体名词等,用来建立知识图谱,建立画像(包括用户画像和物品画像,),是后续很多任务和处理的基础。信息抽取三个最重要的子任务:实体抽取,也就是实体识别,包括实体的检测(find)和分类(classify),实体抽取或者说实体识别(NER)在信息抽取中扮演着重要角色,主要抽取的是文本中的原子信息元素,如人名、组织/机构名、地理位置、事件/日期、字符值、金额值等。后续的关系 抽取(Relation Extraction)和事件抽取的内容分析都必须以内容实体挖掘为基础。如上图所示,在内容处理全流程当中,内容生产者发布的内容在发布入口后,通过调度中心服务调用实体挖掘服务来进行内容实体挖掘,挖掘的结果保存在实体数据库当中,为推荐系统提供服务。
下面对该系统中各个服务模块的主要功能进行说明。
一.内容生产和消费端
(1)PGC或者UGC(user generate content),MCN内容生产者,通过移动端或者后端接口API系统,提供图文或者视频内容,这些都是推荐分发内容的主要内容来源;
(2)通过和上下行内容接口服务的通讯,上传图文内容,图文内容来源通常是一个轻量级发布端和编辑内容入口,视频内容发布通常是一个拍摄摄影端,拍摄过程当中本地视频内容可以选择搭配的音乐,滤镜模板和视频的美化功能等等;
(3)作为消费者,和上下行内容接口服务器通讯,推过推荐获取访问内容的索引信息,然后和内容存储服务器通讯,获取对应的内容包括推荐得到内容,专题订阅的内容,内容存储服务器存储的是内容实体比如视频源文件,图片源文件,而内容的元信息比如标题,作者,封面图,分类,Tag信息等等存储在内容数据库;
(4)同时将上传和下载过程当中用户播放的行为数据,卡顿,加载时间,播放点击等上报给后端用于统计分析;
(5)消费端通常通过Feeds流方式浏览内容数据;
二.上下行内容接口服务器
(1)和内容生产端直接通讯,从前端提交的内容,通常是内容的标题,发布者,摘要,封面图,发布时间,内容正文,把文件存入内容数据库;
(2)将图文内容的元信息,比如文件大小,封面图链接,标题,发布时间,作者,内容正文等信息写入内容数据库,如果是视频内容,视频文件保存在视频存储服务当中,还有封面图文件,视频的元信息和图文内容一样保存在内容数据库当中;
(3)将发布的提交的内容同步给调度中心服务器,进行后续的内容处理和流转;
三.内容数据库
(1)内容的核心数据库,所有生产者发布内容的元信息都保存在这个业务数据库当中,重点是内容本身的元信息比如文件大小,封面图链接,码率,文件格式,标题,发布时间,作者,视频文件大小,视频格式,是否原创的标记或者首发还包括人工审核过程中对内容的分类(包括一,二,三级别分类和标签信息,比如一篇讲解华为手机的文章,一级分科是科技,二级分类是智能手机,三级分类是国内手机,标签信息是华为,mate30);
(2)人工审核过程当中会读取内容数据库当中的信息,同时人工审核的结果和状态也会回传进入内容数据库;
(3)调度中心对内容处理主要包括机器处理和人工审核处理,这里机器处理核心各种质量判断比如低质过滤,内容标签比如分类,标签信息(要获取分类和标签信息,前提就是进行内容的实体信息挖掘,这个也是由调度中心服务完成的,不过最终实体挖掘的结果保存在实体数据库当中),还有就是内容排重,他们的结果会写入内容数据库,完全重复 一样的内容不会给人工进行重复的二次处理;
(4)后续抽取标签和分类的时候会从内容数据库读取内容的元信息,数据预处理服务也是需要从元数据库当中读取元信息;
四.调度中心服务
(1)负责内容流转的整个调度过程,通过上下行内容接口服务器接收入库的内容,然后从内容数据库中获取内容的元信息;
(2)调度人工审核系统和机器处理系统,控制调度的顺序和优先级;
(3)通过人工审核系统内容被启用,然后通过内容出口分发服务(通常是推荐引擎或者搜索引擎或者运营)直接的展示页面提供给终端的内容消费者,也就是消费端获得的内容索引信息;
(4)对于和实体挖掘服务通讯,输入内容本身所包含的文本信息,挖掘文本当中包含的各种内容实体信息,对文本进行结构化抽取和存储,挖掘实体结果信息保存在实体数据当中;
五.内容存储服务
(1)存储内容的元信息之外的内容实体信息,比如视频源文件和图文内容的图片源文件;
(2)在视频内容标签抽取的时候,为标签服务提供视频源文件包括源文件中间的抽帧内容;
(3)内容消费端在获取到内容的索引信息后,也是直接访问内容存储服务来消费实际内容的;
六.人工审核系统
(1)人工审核系统是人工服务能力的载体,主要用于审核过滤政治敏感,色情,法律不允许等机器无法确定判断的内容,同时还对进行视频内容的标签标注和二次确认;
七.数据预处理服务
(1)从元数据库和内容存储数据当中获取原始数据,在实体挖掘当中,对文本数据进行序列标注的预处理,作为实体挖掘模型预训练的样本数据;
(2)对文本进行分词处理,视频内容可以提取视频当中的字幕或者音转文方式来提取视频当中的文本信息,这个利用相关技术,这里主要是作为文本信息一个渠道和来源;
八.实体挖掘模型
(1)按照上面描述的实体挖掘建模方法,构建的基于上述实施例的包含BERT+BiLSTM-CRF+Self-Attention架构的目标识别模型;
(2)模型训练的样本和数据来自于数据预处理服务;
九.实体数据库
(1)保存实体挖掘服务挖掘的实体结果,为后续进行内容的标签分类,关系及事件抽取等后续任务提供数据基础;
十.实体挖掘服务
(1)接受调度中心的调度,对于链路上新发布的图文内容,通过调度中心服务调用实 体挖掘服务来进行内容实体挖掘,挖掘的结果保存在实体数据库当中,为推荐系统提供服务;
(2)将上面描述的实体挖掘模型服务化,接受链路上核心调度服务调度中心的调度处理;
十一.内容分发出口服务
(1)机器和人工处理链路内容输出的出口,调度中心处理最后生成的推荐内容池通过出口服务分发;
(2)分发的主要方式的推荐算法分发和人工运营;
(3)和内容消费端用户直接通讯,提供推荐分发内容的索引信息,也是信息流Feed的出口。
在另一种可能的场景中,本申请提供的实体识别方法可以应用于社交网络的交互过程中,即通过对于用户发出的内容的识别,进行相关标识的设定或联想等。社交网络源自网络社交,网络社交的起点是电子邮件。互联网本质上就是计算机之间的联网,早期的E-mail解决了远程的邮件传输的问题,至今它也是互联网上最普及的应用,同时它也是网络社交的起点。BBS则更进了一步,把“群发”和“转发”常态化,理论上实现了向所有人发布信息并讨论话题的功能(疆界是BBS的访问者数量)。成为早期的互联网内容自发产生的平台。最近这2年由于智能手机的全面普及,wi-fi设施的无处不在,4G资费的普遍降低,5G时代的即将来临,在当下移动互联网时代的强语境下,用户接受信息的需求,正在从图文时代向视频化时代过渡。因此,短视频将逐渐成为移动互联网的主导内容形态之一,在一定程度上替代图文内容消费,并在新闻、社交平台等图文媒体中逐渐取得主导地位。这些内容通常以Feeds流形式展示出来供用户快速刷新,因此如何快速完成内容的审核成为难题。
下面结合短视频应用中视频上线审核的场景进行说明。短视频是指在各种新媒体平台上播放的、适合在移动状态和短时休闲状态下观看的、高频推送的视频内容,几秒到几分钟不等。内容融合了技能分享、幽默搞怪、时尚潮流、社会热点、街头采访、公益教育、广告创意、商业定制等主题。由于内容较短,可以单独成片,也可以成为系列栏目。不同于微电影和直播,短视频制作并没有像微电影一样具有特定的表达形式和团队配置要求,具有生产流程简单、制作门槛低、参与性强等特点,又比直播更具有传播价值,超短的制作周期和趣味化的内容对短视频制作团队的文案以及策划功底有着一定的挑战,优秀的短视频制作团队通常依托于成熟运营的自媒体或IP,除了高频稳定的内容输出外,也有强大的粉丝渠道;短视频的出现丰富了新媒体原生广告的形式。目前短视频从一开始的UGC、PGC、用户上传,到专门制造短视频的机构,到MCN,再到专业的短视频App等众多头部流量平台不断崛起,短视频已经成为内容创业和社交媒体平台的重要传播方式之一。短视频在引发了内容创业者的狂欢,冲击着视频媒体平台的同时,其影响力进一步升级,各大资讯平台也展开了一场围绕短视频的争夺战。所以各种各样的短视频内容原来越多也越来越丰富。无论是短视频内容的生产者还是消费者都成为一个巨大的群体。
具体的,请参阅图12,图12为本申请实施例提供的另一种实体识别方法的流程图,本申请实施例至少包括以下步骤:
1201、获取目标短视频。
本实施例中,目标短视频的获取可以是在用户上传的过程中获取的,即对目标短视频进行内容审核以及标签设定。
1202、对目标短视频关联的文本内容进行识别。
本实施例中,对目标短视频关联的文本内容进行识别的过程可以参考图3中步骤301-304的过程,此处不做赘述。
1203、响应于目标操作获取目标短视频上线时机。
本实施例中,目标操作可以是用户选择是否需要快速审查,或用户设定的发布日期距离当前时间较近,从而确定上限时机是宽松还是紧迫。
1204、基于上线时机配置目标识别模型。
本实施例中,若上限时机指示为宽松,则可以选用BERT模型与BiLSTM模型依次连接的目标识别模型,以保证识别准确率;若上限时机指示为紧迫,则可以选用BERT模型与BiLSTM模型并列连接的目标识别模型,以保证识别效率。
具体的时机设定还可以是基于时长阈值进行的,即发布日期距离当前时间小于时长阈值则为紧迫;发布日期距离当前时间大于时长阈值则为宽松。
1205、将文本内容输入目标识别模型进行识别,并设置标签信息。
本实施例中,通过获取识别得到的实体,可以对目标短视频进行标签的设定,例如设定关键词;也可以用于关联视频的确定,例如用户在观看完该目标短视频后,根据目标短视频的标签信息进行关联视频的确定,从而提高用户与短视频的交互频率,提高用户活跃度。
为了更好的实施本申请实施例的上述方案,下面还提供用于实施上述方案的相关装置。请参阅图13,图13为本申请实施例提供的一种实体识别装置的结构示意图,实体识别装置1300包括:
获取单元1301,用于获取目标文本信息;
输入单元1302,用于将所述目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,所述目标向量序列包括多个子向量,所述多个子向量是基于至少两个文本维度对所述目标文本信息表示所得;
预测单元1303,用于将所述目标向量序列输入所述目标识别模型中的语义表示层,以得到标签预测序列,其中,所述标签预测序列为所述多个子向量分别与多个实体标签的归属概率集合,所述语义表示层包括多个并列的识别节点,所述识别节点之间相互关联,所述识别节点用于识别对应的子向量与多个所述实体标签的归属概率,多个所述实体标签基于不同类别的实体设定;
识别单元1304,用于将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以确定所述归属概率集合中的目标项,所述目标项用于指示所述目标文本信息中的所述实体。
可选的,在本申请一些可能的实现方式中,所述至少两个文本维度包括词维度和字维度,所述输入单元1302,具体用于将所述目标文本信息输入目标识别模型中的输入表示层进行词嵌入处理,以得到词嵌入向量;
所述输入单元1302,具体用于对所述目标文本信息进行字嵌入处理,以得到字嵌入向量;
所述输入单元1302,具体用于将所述词嵌入向量和所述字嵌入向量作为所述子向量,生成所述目标向量序列。
可选的,在本申请一些可能的实现方式中,所述语义表示层包括双向编码器和双向记忆网络模型,所述预测单元1303,具体用于将所述目标向量序列输入所述双向编码器,以得到第一语义表示;
所述预测单元1303,具体用于将所述第一语义表示输入所述双向记忆网络模型,以得到目标语义表示;
所述预测单元1303,具体用于基于所述目标语义表示与多个所述实体标签进行匹配,以得到所述标签预测序列。
可选的,在本申请一些可能的实现方式中,所述预测单元1303,具体用于将所述第一语义表示输入所述双向记忆网络模型进行基于第一次序的计算,以得到上文信息;
所述预测单元1303,具体用于将所述第一语义表示输入所述双向记忆网络模型进行基于第二次序的计算,以得到下文信息;
所述预测单元1303,具体用于基于所述上文信息和所述下文信息进行拼接,以得到所述目标语义表示。
可选的,在本申请一些可能的实现方式中,所述语义表示层包括所述双向编码器和所述双向记忆网络模型,所述预测单元1303,具体用于将所述目标向量序列输入所述双向编码器,以得到所述第一语义表示;
所述预测单元1303,具体用于将所述目标向量序列输入所述双向记忆网络模型,以得到第二语义表示;
所述预测单元1303,具体用于对所述第一语义表示和所述第二语义表示进行拼接,以得到目标语义表示;
所述预测单元1303,具体用于基于所述目标语义表示与多个所述实体标签进行匹配,以得到所述标签预测序列。
可选的,在本申请一些可能的实现方式中,所述预测单元1303,具体用于获取所述目标文本信息对应的文本大小;
所述预测单元1303,具体用于基于所述文本大小确定所述目标向量序列输入所述双向编码器和所述双向记忆网络模型的方式。
可选的,在本申请一些可能的实现方式中,所述预测单元1303,具体用于获取所述目标文本信息对应的预设实体集合;
所述预测单元1303,具体用于确定所述预设实体集合中的目标类别;
所述预测单元1303,具体用于基于所述目标类别对所述双向编码器进行训练。
可选的,在本申请一些可能的实现方式中,所述识别单元1304,具体用于将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以获取所述条件鉴别层中的约束条件,所述约束条件基于预设的全局信息设定;
所述识别单元1304,具体用于基于所述约束条件对每个所述子向量对应的所述归属概率进行筛选,以确定所述归属概率集合中的所述目标项。
可选的,在本申请一些可能的实现方式中,所述识别单元1304,具体用于确定所述子向量对应的候选标签,所述候选标签包括位置标识和标签标识;
所述识别单元1304,具体用于基于所述约束条件对所述位置标识和所述标签标识的对应关系进行筛选,以确定所述归属概率集合中的所述目标项。
可选的,在本申请一些可能的实现方式中,所述识别单元1304,具体用于获取初始化转移矩阵;
所述识别单元1304,具体用于基于所述目标文本信息对应的全局信息对所述初始化转移矩阵进行训练,以得到目标转移矩阵;
所述识别单元1304,具体用于根据所述目标转移矩阵中转移分数的分布确定所述约束条件。
可选的,在本申请一些可能的实现方式中,所述获取单元1301,具体用于响应于目标操作获取目标识别数据,所述目标识别数据包含至少一种媒体内容形式;
所述获取单元1301,具体用于对所述目标识别数据进行文本解译,以确定所述目标文本信息。
结合上述实施例可知,针对待识别实体的目标文本信息,将目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,为了增强对目标文本信息的特征描述全面性,通过至少两个文本维度对目标文本信息进行表示,确定目标向量序列包括的多个子向量是基于至少两个文本维度对目标文本信息表示所得。将该目标向量序列输入目标识别模型中的语义表示层,得到标识子向量分别与多个实体标签的归属概率集合的标签预测序列,语义表示层包括多个并列且相互关联的识别节点,使得识别节点间能够得到各自的上下文信息,增强语义表示的完整性,进而提高后续实体标签的识别准确性。而且,由于前述多个实体标签基于不同类别的实体设定,在识别过程中能够将目标文本信息与更多的实体标签进行关联,可以筛选出不同类别实体的重要特征,增强对于实体类别的分辨能力,进而将标签预测序列输入目标识别模型中的条件鉴别层,以确定归属概率集合中用于指示目标文本信息中的实体的目标项。从而实现高效的实体识别过程,提高了实体识别的效率及准确性。
本申请实施例还提供了一种终端设备,该终端设备可以为前述实施例中提及的实施实体识别方法的终端设备,本申请实施例提供的实体识别装置可以配置在该终端设备中。如图14所示,是本申请实施例提供的另一种终端设备的结构示意图,为了便于说明,仅示出了与本申请实施例相关的部分,具体技术细节未揭示的,请参照本申请实施例方法部分。该终端可以为包括手机、平板电脑、个人数字助理(personal digital assistant,PDA)、销售终端(point of sales,POS)、车载电脑等任意终端设备,以终端为手机为例:
图14示出的是与本申请实施例提供的终端相关的手机的部分结构的框图。参考图14,手机包括:射频(radio frequency,RF)电路1410、存储器1420、输入单元1430、显示单元1440、传感器1450、音频电路1460、无线保真(wireless fidelity,WiFi)模块1470、处理器 1480、以及电源1490等部件。本领域技术人员可以理解,图14中示出的手机结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图14对手机的各个构成部件进行具体的介绍:
RF电路1410可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给处理器1480处理;另外,将设计上行的数据发送给基站。通常,RF电路1410包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(low noise amplifier,LNA)、双工器等。此外,RF电路1410还可以通过无线通信与网络和其他设备通信。上述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(global system of mobile communication,GSM)、通用分组无线服务(general packet radio service,GPRS)、码分多址(code division multiple access,CDMA)、宽带码分多址(wideband code division multiple access,WCDMA)、长期演进(long term evolution,LTE)、电子邮件、短消息服务(short messaging service,SMS)等。
存储器1420可用于存储软件程序以及模块,处理器1480通过运行存储在存储器1420的软件程序以及模块,从而执行手机的各种功能应用以及数据处理。存储器1420可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器1420可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
输入单元1430可用于接收输入的数字或字符信息,以及产生与手机的用户设置以及功能控制有关的键信号输入。具体地,输入单元1430可包括触控面板1431以及其他输入设备1432。触控面板1431,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板1431上或在触控面板1431附近的操作,以及在触控面板1431上一定范围内的隔空触控操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板1431可包括触摸检测装置和触摸控制器两个部分。
其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器1480,并能接收处理器1480发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板1431。除了触控面板1431,输入单元1430还可以包括其他输入设备1432。具体地,其他输入设备1432可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
显示单元1440可用于显示由用户输入的信息或提供给用户的信息以及手机的各种菜单。显示单元1440可包括显示面板1441,可选的,可以采用液晶显示器(liquid crystal display,LCD)、有机发光二极管(organic light-emitting diode,OLED)等形式来配置显示面板1441。进一步的,触控面板1431可覆盖显示面板1441,当触控面板1431检测到在其上或附近的触 摸操作后,传送给处理器1480以确定触摸事件的类型,随后处理器1480根据触摸事件的类型在显示面板1441上提供相应的视觉输出。虽然在图14中,触控面板1431与显示面板1441是作为两个独立的部件来实现手机的输入和输入功能,但是在某些实施例中,可以将触控面板1431与显示面板1441集成而实现手机的输入和输出功能。
手机还可包括至少一种传感器1450,比如光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板1441的亮度,接近传感器可在手机移动到耳边时,关闭显示面板1441和/或背光。作为运动传感器的一种,加速计传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可用于识别手机姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于手机还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路1460、扬声器1461,传声器1462可提供用户与手机之间的音频接口。音频电路1460可将接收到的音频数据转换后的电信号,传输到扬声器1461,由扬声器1461转换为声音信号输出;另一方面,传声器1462将收集的声音信号转换为电信号,由音频电路1460接收后转换为音频数据,再将音频数据输出处理器1480处理后,经RF电路1410以发送给比如另一手机,或者将音频数据输出至存储器1420以便进一步处理。
WiFi属于短距离无线传输技术,手机通过WiFi模块1470可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图14示出了WiFi模块1470,但是可以理解的是,其并不属于手机的必须构成,完全可以根据需要在不改变发明的本质的范围内而省略。
处理器1480是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器1420内的软件程序和/或模块,以及调用存储在存储器1420内的数据,执行手机的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器1480可包括一个或多个处理单元;可选的,处理器1480可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器1480中。
手机还包括给各个部件供电的电源1490(比如电池),可选的,电源可以通过电源管理系统与处理器1480逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
尽管未示出,手机还可以包括摄像头、蓝牙模块等,在此不再赘述。
在本申请实施例中,该终端设备所包括的处理器1480还具有执行如上述实体识别方法的各个步骤的功能。
本申请实施例还提供了一种服务器,该服务器可以为前述实施例中提及的实施实体识别方法的服务器,本申请实施例提供的实体识别装置可以配置在该服务器中。请参阅图15,图15是本申请实施例提供的一种服务器的结构示意图,该服务器1500可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU) 1522(例如,一个或一个以上处理器)和存储器1532,一个或一个以上存储应用程序1542或数据1544的存储介质1530(例如一个或一个以上海量存储设备)。其中,存储器1532和存储介质1530可以是短暂存储或持久存储。存储在存储介质1530的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对服务器中的一系列指令操作。更进一步地,中央处理器1522可以设置为与存储介质1530通信,在服务器1500上执行存储介质1530中的一系列指令操作。
服务器1500还可以包括一个或一个以上电源1526,一个或一个以上有线或无线网络接口1550,一个或一个以上输入输出接口1558,和/或,一个或一个以上操作系统1541,例如Windows Server TM,Mac OS X TM,Unix TM,Linux TM,FreeBSD TM等等。
上述实施例中由计算机设备所执行的步骤可以基于该图15所示的服务器结构。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有用于存储计算机程序,所述计算机程序用于执行上述实施例提供的方法。
本申请实施例中还提供一种包括实体识别指令的计算机程序产品,当其在计算机上运行时,使得计算机执行如上述实施例提供的方法。
本申请实施例还提供了一种实体识别系统,所述实体识别系统可以包含图13所描述实施例中的实体识别装置,或图14所描述实施例中的终端设备,或者图15所描述的服务器。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对相关技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,实体识别装置,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储 程序代码的介质。
以上所述,以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (15)

  1. 一种实体识别方法,所述方法由计算机设备执行,所述方法包括:
    获取目标文本信息;
    将所述目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,所述目标向量序列包括多个子向量,所述多个子向量是基于至少两个文本维度对所述目标文本信息表示所得;
    将所述目标向量序列输入所述目标识别模型中的语义表示层,以得到标签预测序列,其中,所述标签预测序列为所述多个子向量分别与多个实体标签的归属概率集合,所述语义表示层包括多个并列的识别节点,所述识别节点之间相互关联,所述识别节点用于识别对应的子向量与多个所述实体标签的归属概率,多个所述实体标签基于不同类别的实体设定;
    将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以确定所述归属概率集合中的目标项,所述目标项用于指示所述目标文本信息中的所述实体。
  2. 根据权利要求1所述的方法,所述至少两个文本维度包括词维度和字维度,所述将所述目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,包括:
    将所述目标文本信息输入目标识别模型中的输入表示层进行词嵌入处理,以得到词嵌入向量;
    对所述目标文本信息进行字嵌入处理,以得到字嵌入向量;
    将所述词嵌入向量和所述字嵌入向量作为所述子向量,生成所述目标向量序列。
  3. 根据权利要求1所述的方法,所述语义表示层包括双向编码器和双向记忆网络模型,所述将所述目标向量序列输入所述目标识别模型中的语义表示层,以得到标签预测序列,包括:
    将所述目标向量序列输入所述双向编码器,以得到第一语义表示;
    将所述第一语义表示输入所述双向记忆网络模型,以得到目标语义表示;
    基于所述目标语义表示与多个所述实体标签进行匹配,以得到所述标签预测序列。
  4. 根据权利要求3所述的方法,所述将所述第一语义表示输入所述双向记忆网络模型,以得到目标语义表示,包括:
    将所述第一语义表示输入所述双向记忆网络模型进行基于第一次序的计算,以得到上文信息;
    将所述第一语义表示输入所述双向记忆网络模型进行基于第二次序的计算,以得到下文信息;
    基于所述上文信息和所述下文信息进行拼接,以得到所述目标语义表示。
  5. 根据权利要求1所述的方法,所述语义表示层包括所述双向编码器和所述双向记忆网络模型,所述将所述目标向量序列输入所述目标识别模型中的语义表示层,以得到标签预测序列,包括:
    将所述目标向量序列输入所述双向编码器,以得到所述第一语义表示;
    将所述目标向量序列输入所述双向记忆网络模型,以得到第二语义表示;
    对所述第一语义表示和所述第二语义表示进行拼接,以得到目标语义表示;
    基于所述目标语义表示与多个所述实体标签进行匹配,以得到所述标签预测序列。
  6. 根据权利要求3-5任一项所述的方法,所述方法还包括:
    获取所述目标文本信息对应的文本大小;
    基于所述文本大小确定所述目标向量序列输入所述双向编码器和所述双向记忆网络模型的方式。
  7. 根据权利要求3-5任一项所述的方法,所述方法还包括:
    获取所述目标文本信息对应的预设实体集合;
    确定所述预设实体集合中的目标类别;
    基于所述目标类别对所述双向编码器进行训练。
  8. 根据权利要求1所述的方法,所述将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以确定所述归属概率集合中的目标项,包括:
    将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以获取所述条件鉴别层中的约束条件,所述约束条件基于预设的全局信息设定;
    基于所述约束条件对每个所述子向量对应的所述归属概率进行筛选,以确定所述归属概率集合中的所述目标项。
  9. 根据权利要求8所述的方法,所述基于所述约束条件对每个所述子向量对应的所述归属概率进行筛选,以确定所述归属概率集合中的所述目标项,包括:
    确定所述子向量对应的候选标签,所述候选标签包括位置标识和标签标识;
    基于所述约束条件对所述位置标识和所述标签标识的对应关系进行筛选,以确定所述归属概率集合中的所述目标项。
  10. 根据权利要求8所述的方法,所述方法还包括:
    获取初始化转移矩阵;
    基于所述目标文本信息对应的全局信息对所述初始化转移矩阵进行训练,以得到目标转移矩阵;
    根据所述目标转移矩阵中转移分数的分布确定所述约束条件。
  11. 根据权利要求1所述的方法,所述获取目标文本信息包括:
    响应于目标操作获取目标识别数据,所述目标识别数据包含至少一种媒体内容形式;
    基于所述媒体内容形式对所述目标识别数据进行文本解译,以确定所述目标文本信息。
  12. 一种实体识别装置,包括:
    获取单元,用于获取目标文本信息;
    输入单元,用于将所述目标文本信息输入目标识别模型中的输入表示层,以生成目标向量序列,所述目标向量序列包括多个子向量,所述子向量是基于至少两个文本维度对所述目标文本信息表示所得;
    预测单元,用于将所述目标向量序列输入所述目标识别模型中的语义表示层,以得到标签预测序列,其中,所述标签预测序列为所述子向量分别与多个实体标签的归属概率集合,所述语义表示层包括多个并列的识别节点,所述识别节点之间相互关联,所述识别节 点用于识别对应的子向量与多个所述实体标签的归属概率,多个所述实体标签基于不同类别的实体设定;
    识别单元,用于将所述标签预测序列输入所述目标识别模型中的条件鉴别层,以确定所述归属概率集合中的目标项,所述目标项用于指示所述目标文本信息中的所述实体。
  13. 一种计算机设备,所述计算机设备包括处理器以及存储器:
    所述存储器用于存储程序代码;所述处理器用于根据所述程序代码中的指令执行权利要求1至11任一项所述的实体识别方法。
  14. 一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,所述计算机程序用于执行上述权利要求1至11任一项所述的实体识别方法。
  15. 一种包括指令的计算机程序产品,当其在计算机上运行时,使得所述计算机执行权利要求1-11任一项所述的实体识别方法。



