WO2023137918A1 - Text data analysis method, model training method, apparatus, and computer device - Google Patents


Info

Publication number
WO2023137918A1
WO2023137918A1 · PCT/CN2022/090738 · CN2022090738W
Authority
WO
WIPO (PCT)
Prior art keywords
word
text
probability
label
emotional feature
Prior art date
Application number
PCT/CN2022/090738
Other languages
English (en)
French (fr)
Inventor
姜鹏
高鹏
谯轶轩
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2023137918A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/40 - Processing or translation of natural language
    • G06F 40/42 - Data-driven translation
    • G06F 40/44 - Statistical methods, e.g. probability models
    • G06F 40/20 - Natural language analysis
    • G06F 40/205 - Parsing
    • G06F 40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to the technical field of artificial intelligence, in particular to a text data analysis method, model training method, device and computer equipment.
  • the machine learning model can analyze the emotional tendency contained in the given text data.
  • the embodiment of the present application provides a text data analysis method, including:
  • the text data to be processed and the first emotion tag corresponding to the text data;
  • the text data includes a plurality of words;
  • the text data and the first emotional label are input to a preset text analysis model, and the text analysis model extracts the emotional feature sentence in the text data to obtain a first output probability and a second output probability; wherein, the first output probability is used to characterize, for each word in the text data, the predicted probability that the word is the initial word of the emotional feature sentence, and the second output probability is used to characterize, for each word in the text data, the predicted probability that the word is the termination word of the emotional feature sentence;
  • the emotional feature sentence is determined from the text data according to the first output probability and the second output probability.
  • the embodiment of the present application provides a method for training a text analysis model, including:
  • the text sample includes a plurality of words
  • the text sample and the second emotional label are input to a text analysis model, and the text analysis model is used to extract the emotional feature sentence in the text sample to obtain a third output probability and a fourth output probability; wherein, the third output probability is used to characterize, for each word in the text sample, the predicted probability that the word is the initial word of the emotional feature sentence, and the fourth output probability is used to characterize, for each word in the text sample, the predicted probability that the word is the termination word of the emotional feature sentence;
  • the text analysis model is trained according to the loss value to obtain a trained text analysis model.
  • an embodiment of the present application provides a device for analyzing text data, including:
  • An acquisition module configured to acquire the text data to be processed and the first emotion tag corresponding to the text data; the text data includes a plurality of words;
  • a prediction module for inputting the text data and the first emotional label to a preset text analysis model, extracting the emotional feature sentence in the text data through the text analysis model, to obtain a first output probability and a second output probability; wherein, the first output probability is used to represent the predicted probability that each word in the text data is the start word of the emotional feature sentence, and the second output probability is used to represent the predicted probability that each word in the text data is the termination word of the emotional feature sentence;
  • a processing module configured to determine the emotional feature sentence from the text data according to the first output probability and the second output probability.
  • the embodiment of the present application provides a computer device, including:
  • at least one processor;
  • at least one memory for storing at least one program;
  • when the at least one program is executed by the at least one processor, the at least one processor implements a text data analysis method or a text analysis model training method;
  • wherein the text data analysis method includes:
  • the text data to be processed and the first emotion tag corresponding to the text data;
  • the text data includes a plurality of words;
  • the text data and the first emotional label are input to a preset text analysis model, and the text analysis model extracts the emotional feature sentence in the text data to obtain a first output probability and a second output probability; wherein, the first output probability is used to characterize, for each word in the text data, the predicted probability that the word is the initial word of the emotional feature sentence, and the second output probability is used to characterize, for each word in the text data, the predicted probability that the word is the termination word of the emotional feature sentence;
  • and the training method of the text analysis model includes:
  • the text sample includes a plurality of words
  • the text sample and the second emotional label are input to a text analysis model, and the text analysis model is used to extract the emotional feature sentence in the text sample to obtain a third output probability and a fourth output probability; wherein, the third output probability is used to characterize, for each word in the text sample, the predicted probability that the word is the initial word of the emotional feature sentence, and the fourth output probability is used to characterize, for each word in the text sample, the predicted probability that the word is the termination word of the emotional feature sentence;
  • the text analysis model is trained according to the loss value to obtain a trained text analysis model.
  • the embodiment of the present application also provides a computer-readable storage medium, which stores a processor-executable program, and the processor-executable program is used to implement a text data analysis method or a text analysis model training method when executed by the processor;
  • wherein the text data analysis method includes:
  • the text data to be processed and the first emotion tag corresponding to the text data;
  • the text data includes a plurality of words;
  • the text data and the first emotional label are input to a preset text analysis model, and the text analysis model extracts the emotional feature sentence in the text data to obtain a first output probability and a second output probability; wherein, the first output probability is used to characterize, for each word in the text data, the predicted probability that the word is the initial word of the emotional feature sentence, and the second output probability is used to characterize, for each word in the text data, the predicted probability that the word is the termination word of the emotional feature sentence;
  • and the training method of the text analysis model includes:
  • the text sample includes a plurality of words
  • the text sample and the second emotional label are input to a text analysis model, and the text analysis model is used to extract the emotional feature sentence in the text sample to obtain a third output probability and a fourth output probability; wherein, the third output probability is used to characterize, for each word in the text sample, the predicted probability that the word is the initial word of the emotional feature sentence, and the fourth output probability is used to characterize, for each word in the text sample, the predicted probability that the word is the termination word of the emotional feature sentence;
  • the text analysis model is trained according to the loss value to obtain a trained text analysis model.
  • the text data analysis method, model training method, device, and computer equipment disclosed in the embodiments of the present application can effectively extract, from the text data, the emotional feature sentence corresponding to the emotion tag of the text data; used in the field of sentiment analysis technology, this can help to understand the text content and judge its tendency in more detail. Moreover, determining the emotional feature sentence from the text data based on the probability that each word is the starting word of the emotional feature sentence and the probability that it is the ending word can simplify the complexity of the output data, improve the efficiency of data processing, and save the consumption of computing resources.
  • Fig. 1 is a schematic diagram of the implementation environment of a text data analysis method provided in an embodiment of the present application.
  • Fig. 2 is a schematic flow chart of a text data analysis method provided in an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a training method for a text analysis model provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a random discarding algorithm in the related art.
  • FIG. 5 is a schematic structural diagram of a text data analysis device provided in an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
  • Artificial Intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • artificial intelligence is a comprehensive technique of computer science that attempts to understand the nature of intelligence and produce a new kind of intelligent machine that can respond in a similar way to human intelligence.
  • Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • Artificial intelligence technology is a comprehensive subject that involves a wide range of fields, including both hardware-level technology and software-level technology.
  • Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes several major directions such as computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • Natural Language Processing is an important direction in the field of computer science and artificial intelligence. It studies various theories and methods that can realize effective communication between humans and computers using natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. The natural language involved in this field is the language that people use every day, so it is also closely related to the study of linguistics. Natural language processing technologies usually include text processing, semantic understanding, machine translation, robot question answering, knowledge graph and other technologies.
  • Machine learning (Machine Learning, ML) is a multi-field interdisciplinary subject, involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in studying how computers simulate or implement human learning behaviors to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve their performance.
  • Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent. Its application covers all fields of artificial intelligence.
  • Machine learning and deep learning usually include technologies such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
  • Blockchain is a new application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • Blockchain is essentially a decentralized database, which is a series of data blocks associated with each other using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the underlying blockchain platform can include processing modules such as user management, basic services, smart contracts, and operational monitoring.
  • the user management module is responsible for identity information management of all blockchain participants, including maintenance of public and private key generation (account management), key management, and maintenance of the correspondence between users' real identities and blockchain addresses (authority management); where authorized, it also supervises and audits transactions of certain real identities and provides risk control rule configuration (risk control audit). The basic service module is deployed on all blockchain node devices to verify the validity of business requests.
  • the smart contract module is responsible for the registration and issuance of the contract, contract triggering and contract execution, developers can define the contract logic through a programming language, publish it to the blockchain (contract registration), according to the logic of the contract terms, call the key or other events to trigger execution, complete the contract logic, and also provide the function of contract upgrade and cancellation;
  • the operation monitoring module is mainly responsible for deployment and configuration modification during the product release process, contract settings, cloud adaptation, and visual output of real-time status during product operation, such as alarms, monitoring network conditions, and monitoring node equipment health status.
  • the platform product service layer provides the basic capabilities and implementation framework of typical applications. Based on these basic capabilities, developers can superimpose the characteristics of the business and complete the blockchain implementation of business logic.
  • the application service layer provides application services based on blockchain solutions for business participants to use.
  • the machine learning model can analyze the emotional tendency contained in the given text data.
  • the positive or negative reviews posted by users belong to emotional tendencies, and there is a need to analyze the content of user reviews and extract text content corresponding to emotional tendencies (recorded as emotional feature sentences in this application) to determine the factors why users give positive reviews (or negative reviews), so as to help other users better identify merchants and promote merchants to make corresponding service improvements and upgrades.
  • In the related art, machine learning models generally cannot perform the above-mentioned type of task, or can only give vague prediction results that are too coarse or insufficiently accurate.
  • the embodiments of the present application provide a text data analysis method, model training method, device, and computer equipment, which can effectively extract the emotional feature sentence corresponding to the emotion tag from the text data, helping to understand the text content and judge its tendency in more detail; moreover, determining the emotional feature sentence from the text data based on the output probability of each word being the starting word of the emotional feature sentence and the probability of it being the ending word can simplify the complexity of the output data, improve the efficiency of data processing, and save the consumption of computing resources.
  • FIG. 1 is a schematic diagram of an implementation environment of a text data analysis method provided by an embodiment of the present application.
  • the software and hardware main body of the implementation environment mainly includes an operation terminal 101 and a server 102 , and the operation terminal 101 is connected to the server 102 in communication.
  • the analysis method of the text data may be separately configured and executed on the operation terminal 101, or may be separately configured and executed on the server 102, or may be executed based on the interaction between the operation terminal 101 and the server 102.
  • an appropriate selection may be made according to actual application conditions, which is not specifically limited in this embodiment.
  • the operation terminal 101 and the server 102 may be nodes in the block chain, which is not specifically limited in this embodiment.
  • the operation terminal 101 in this application may include, but is not limited to, any one or more of smart watches, smart phones, computers, personal digital assistants (Personal Digital Assistant, PDA), smart voice interaction devices, smart home appliances, or vehicle-mounted terminals.
  • the server 102 can be an independent physical server, or a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network, content distribution network) and big data and artificial intelligence platforms.
  • a communication connection can be established between the operation terminal 101 and the server 102 through a wireless network or a wired network.
  • the wireless network or the wired network uses standard communication technologies and/or protocols, and may be any combination of networks or a virtual private network.
  • Fig. 2 is a flow chart of a method for analyzing text data provided by an embodiment of the present application.
  • the subject of execution of the method may be at least one of an operation terminal or a server.
  • As an example for illustration, the method for analyzing text data is configured and executed on an operation terminal.
  • the text data analysis method includes but is not limited to step 110 to step 130.
  • Step 110 Obtain the text data to be processed and the first emotion label corresponding to the text data; the text data includes multiple words.
  • The text data and its corresponding emotion label are obtained first; the label is recorded as the first emotion label.
  • the first emotion tag is used to represent the emotional tendency contained in the content of the text data.
  • the first emotion tag can be a tag indicating "happy", "sad", "good", "bad", "support", "against", etc.
  • the data format of the first emotion tag can be arbitrary, for example, it can be any one of numerical value, vector, matrix, or tensor, and the corresponding relationship between data and specific tags can be flexibly set according to needs, and this application does not limit this.
  • the source channel for obtaining the text data to be processed is not limited.
  • the text data to be processed can be downloaded from a relevant resource server, or transmitted through a hardware port, or obtained from the environment by a voice collection and recognition device and then recognized.
  • a text is composed of multiple sentences, and each sentence includes multiple words. Therefore, the text data can be divided into multiple words, that is, the text data includes multiple words.
  • There is no specific limitation on the format and language type of the words.
  • Step 120 input the text data and the first emotion label to the preset text analysis model, extract the emotional characteristic sentence in the text data by the text analysis model, and obtain the first output probability and the second output probability; wherein, the first output probability is used to represent the probability that each word in the text data is the initial word of the emotional characteristic sentence, and the second output probability is used to represent the probability that each word in the text data is the termination word of the emotional characteristic sentence.
  • when the text data and its corresponding first emotion label are input into the text analysis model, they can be preprocessed; the specific processing method can be either data splicing or data fusion.
  • the text data and its corresponding first emotion label are input into the text analysis model, and the emotional characteristic sentences in the text data are extracted through the text analysis model.
  • the emotional feature sentence is a related sentence in the text data that can reflect or embody the emotion corresponding to the first emotion tag.
  • the emotional feature sentence may include one or more words, and the specific number is not limited in this application.
  • Text data itself is unstructured data.
  • the data processed by machine learning models is generally structured data. Therefore, in the embodiment of the present application, before inputting the text data into the model, encoding conversion can be performed on it, and the unstructured text data can be converted into structured data that is easy to be processed by the model.
  • word segmentation processing can be performed on text data to obtain the phrases that make up the text data.
  • There are various word segmentation algorithms that can be used. For example, in some embodiments, a dictionary-based word segmentation algorithm can be used to first divide each sentence in the text data into words according to the dictionary, and then find the best combination of words; in some embodiments, a word-based word segmentation algorithm can also be used.
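  • As an illustration of the dictionary-based approach described above, the following is a minimal sketch of forward maximum matching, one common dictionary-based segmentation strategy (the toy text and dictionary are illustrative, not from the application):

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy dictionary-based segmentation: at each position, take the
    longest dictionary entry; fall back to a single character when no
    entry matches."""
    words, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary or length == 1:
                words.append(candidate)
                i += length
                break
    return words

segments = forward_max_match("thecatsat", {"the", "cat", "cats", "sat", "at"})
```

Forward maximum matching is only one instance of "finding the best combination of words"; practical systems may combine it with backward matching or statistical scoring.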
  • the word embedding vector corresponding to each word in the phrase can be determined through a pre-established dictionary.
  • the word embedding vector can be obtained by mapping words to a vector space with a unified lower dimension.
  • the strategy for generating this mapping includes neural networks, dimensionality reduction of word co-occurrence matrices, probabilistic models, and interpretable knowledge base methods.
  • these word embedding vectors can be accumulated, and the accumulated vector recorded as a phrase vector; the phrase vector can then be normalized to obtain the vector corresponding to the text data. For example, during normalization, the sum of the elements in the vector can be set to 1.
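  • A minimal sketch of the accumulation and normalization step above, assuming a toy pre-established dictionary of word embedding vectors (the words, values, and 4-dimensional space are all hypothetical):

```python
import numpy as np

# Hypothetical pre-established dictionary mapping words to embeddings.
dictionary = {
    "food":  np.array([0.2, 0.1, 0.0, 0.3]),
    "was":   np.array([0.0, 0.1, 0.1, 0.0]),
    "great": np.array([0.4, 0.0, 0.2, 0.1]),
}

def text_vector(words):
    """Accumulate the word embedding vectors into a phrase vector, then
    normalize so the elements sum to 1 (assumes a nonzero, nonnegative
    accumulated vector)."""
    phrase = np.sum([dictionary[w] for w in words], axis=0)
    return phrase / phrase.sum()

vec = text_vector(["food", "was", "great"])
```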
  • When the text analysis model extracts the emotional feature sentence in the text data, the task can be converted into the problem of determining the starting word and the ending word of the emotional feature sentence from the text data. In this way, the model can predict the probability that each word in the text data is the starting word of the emotional feature sentence and the probability that it is the ending word.
  • the predicted probability that each word in the text data output by the text analysis model is the start word of the emotional feature sentence is recorded as the first output probability
  • the predicted probability that each word in the text data output by the text analysis model is the termination word of the emotional feature sentence is recorded as the second output probability.
  • The higher a word's first output probability, the more likely the text analysis model predicts it to be the starting word of the emotional feature sentence, and likewise for the second output probability and the ending word. In this way, the text analysis model can be used to predict emotional feature sentences in text data.
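  • The start-word / end-word formulation can be sketched as two scoring heads over per-word representations, each normalized with a softmax; the head shapes and random toy representations below are assumptions for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
seq_len, hidden = 6, 8                            # toy sizes, illustrative only
token_repr = rng.normal(size=(seq_len, hidden))   # stand-in per-word encoder outputs
w_start = rng.normal(size=hidden)                 # start-word scoring head (assumed)
w_end = rng.normal(size=hidden)                   # end-word scoring head (assumed)

# First output probability: P(word i is the starting word of the sentence).
first_output_prob = softmax(token_repr @ w_start)
# Second output probability: P(word i is the ending word of the sentence).
second_output_prob = softmax(token_repr @ w_end)
```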
  • Step 130 Determine the emotional feature sentence from the text data according to the first output probability and the second output probability.
  • the emotional feature sentence can be determined from the text data.
  • the purpose of analyzing the text data is to extract the emotional feature sentence corresponding to the first emotion tag. Specifically, for example, the first output probabilities and second output probabilities can be compared first: the word with the highest first output probability is determined as the target start word of the emotional feature sentence, and the word with the highest second output probability is determined as the target end word.
  • the text content between the target start word and the target end word is extracted from the text data to obtain the emotional feature sentence.
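  • A sketch of the comparison step above: take the word with the highest first output probability as the target start word, the word with the highest second output probability as the target end word, and extract the text between them (the guard for an inverted span is an added assumption):

```python
def extract_emotional_feature_sentence(words, first_output_prob, second_output_prob):
    """Argmax span extraction: the word with the highest first output
    probability is the target start word; the word with the highest
    second output probability is the target end word."""
    start = max(range(len(words)), key=lambda i: first_output_prob[i])
    end = max(range(len(words)), key=lambda i: second_output_prob[i])
    if end < start:          # guard against an inverted span (assumption)
        return []
    return words[start:end + 1]

words = ["the", "food", "was", "really", "great", "today"]
sentence = extract_emotional_feature_sentence(
    words,
    [0.05, 0.60, 0.10, 0.10, 0.10, 0.05],   # first output probability
    [0.05, 0.05, 0.10, 0.10, 0.60, 0.10],   # second output probability
)
```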
  • the relevant threshold probability can also be set in advance.
  • when a word's first output probability (or second output probability) exceeds the probability threshold, it is first determined as a potential start word (or potential end word), and then, according to the positions of each potential start word and potential end word in the text data, multiple emotional feature sentences are intercepted in sequence.
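  • The threshold-based variant can be sketched as follows; the specific pairing rule (each potential start word paired with the nearest potential end word at or after it) is an assumption, since the text above only says that spans are intercepted in sequence by position:

```python
def extract_candidate_sentences(words, first_prob, second_prob, threshold=0.3):
    """Collect potential start/end words whose probabilities exceed the
    threshold, then intercept spans by pairing each potential start word
    with the nearest potential end word at or after it."""
    starts = [i for i, p in enumerate(first_prob) if p > threshold]
    ends = [i for i, p in enumerate(second_prob) if p > threshold]
    sentences = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            sentences.append(words[s:following[0] + 1])
    return sentences

candidates = extract_candidate_sentences(
    ["a", "b", "c", "d", "e", "f"],
    [0.40, 0.00, 0.35, 0.00, 0.00, 0.00],
    [0.00, 0.40, 0.00, 0.00, 0.35, 0.00],
)
```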
  • a text data analysis method, which can effectively extract the emotional feature sentence corresponding to the emotion tag from the text data according to the emotion tag of the text data; used in the field of sentiment analysis technology, this can help to understand the text content and judge its tendency in more detail. Moreover, in the embodiment of the present application, the emotional feature sentence is determined from the text data based on the probability that each word is the initial word of the emotional feature sentence and the probability that it is the termination word, which can simplify the complexity of the output data, improve the efficiency of data processing, and save the consumption of computing resources.
  • a text analysis model training method is also provided.
  • the text data analysis method in FIG. 2 can use the text analysis model obtained by the text analysis model training method to perform processing tasks.
  • the implementation environment of the training method is similar to the aforementioned text data analysis method, and will not be repeated here.
  • Fig. 3 is a flow chart of a method for training a text analysis model provided by an embodiment of the present application. The subject of execution of the method may be at least one of an operation terminal or a server.
  • the training method of the text analysis model is configured and executed on an operation terminal as an example for illustration. Referring to FIG. 3, the training method of the text analysis model includes but is not limited to steps 210 to 240.
  • Step 210 Obtain a plurality of text samples and the second emotion labels and emotion feature sentence labels corresponding to the text samples; the text samples include a plurality of words.
  • Step 220 input the text sample and the second emotion label to the text analysis model, extract the emotional feature sentence in the text sample through the text analysis model, and obtain the third output probability and the fourth output probability; wherein, the third output probability is used to represent the predicted probability that each word in the text sample is the start word of the emotional feature sentence, and the fourth output probability is used to represent the predicted probability that each word in the text sample is the end word of the emotional feature sentence.
  • Step 230 Determine a training loss value according to the third output probability, the fourth output probability, and the emotional feature sentence label.
  • Step 240 Train the text analysis model according to the loss value to obtain a trained text analysis model.
  • the text analysis model can be built using any machine learning algorithm, which is not limited here.
  • the model of the present application can be a model under the Transformer architecture system, such as BERT, RoBERTa, GPT-2, T5 and other models.
  • the framework of the model can also be modified in this application. For example, the output of each intermediate layer of the Transformer (excluding the Embedding layer) can be average pooled and maximum pooled.
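  • The intermediate-layer pooling modification can be sketched as follows, with toy arrays standing in for the outputs of a real Transformer encoder (the layer count, sequence length, and hidden size are illustrative):

```python
import numpy as np

# Toy stand-ins for the outputs of each intermediate Transformer layer
# (excluding the Embedding layer); all sizes here are illustrative.
num_layers, seq_len, hidden = 4, 6, 8
rng = np.random.default_rng(0)
layer_outputs = [rng.normal(size=(seq_len, hidden)) for _ in range(num_layers)]

pooled = []
for h in layer_outputs:
    avg = h.mean(axis=0)    # average pooling over the word positions
    mx = h.max(axis=0)      # maximum pooling over the word positions
    pooled.append(np.concatenate([avg, mx]))
layer_features = np.stack(pooled)   # one pooled feature row per layer
```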
  • The text analysis model can be trained by acquiring a training data set composed of multiple text samples. These text samples carry corresponding emotion labels, recorded as second emotion labels, and also carry emotional feature sentence labels.
  • The emotional feature sentence label of a text sample is used to characterize the emotional feature sentence in that text sample.
  • The emotional feature sentence label may represent the position information of the emotional feature sentence in the text sample.
  • Each text sample and its corresponding second emotion label can be input into the initialized text analysis model to obtain the prediction result output by the text analysis model.
  • The predicted probability, output by the text analysis model, that each word in the text sample is the start word of the emotional feature sentence is recorded as the third output probability, and the predicted probability that each word in the text sample is the end word of the emotional feature sentence is recorded as the fourth output probability.
  • The accuracy of the model's prediction can be evaluated according to this result and the aforementioned emotional feature sentence label, so as to perform backpropagation training on the model and update its related parameters.
  • The accuracy of the prediction results can be measured by a loss function (Loss Function). A loss function is defined on a single piece of training data and is used to measure the prediction error on that piece of data; specifically, the loss value is determined by the label of the single piece of training data and the model's prediction result for it.
  • The cost function (Cost Function) is generally used to measure the overall error on the training data set.
  • The cost function is defined on the entire training data set and is used to calculate the average of the prediction errors of all training data, which better measures the prediction effect of the model.
  • In this way, the loss value over the entire training data set can be calculated.
  • Commonly used loss functions, such as the 0-1 loss function, square loss function, absolute loss function, logarithmic loss function, and cross-entropy loss function, can all serve as the loss function of a machine learning model and will not be elaborated here.
  • One of these loss functions can be selected to determine the training loss value, that is, the loss value between the third output probability and the fourth output probability on the one hand and the emotional feature sentence label on the other.
  • The backpropagation algorithm is then used to update the parameters of the model, and the trained machine learning model can be obtained after iterating for a preset number of rounds.
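The loss computation described above can be sketched as follows; picking the cross-entropy loss over the start-word and end-word distributions is only one of the listed options, and the function names are illustrative.

```python
import math

def cross_entropy(pred_probs, true_index):
    """Cross-entropy of one predicted distribution against a one-hot label."""
    return -math.log(pred_probs[true_index])

def span_loss(start_probs, end_probs, start_idx, end_idx):
    """Training loss for one sample: sum of the start-word and end-word
    cross-entropy losses against the labeled start/end positions."""
    return cross_entropy(start_probs, start_idx) + cross_entropy(end_probs, end_idx)
```

Averaging `span_loss` over the whole training set gives the cost-function value described above.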
  • Below, step 220 and step 230 of the training process of the text analysis model are further described.
  • Step 220 may include, but is not limited to, steps 221 to 222:
  • Step 221: Randomly discard the neural network units of the text analysis model multiple times to obtain multiple different text analysis sub-models; the text analysis sub-models share the same weight parameters.
  • Step 222: Input the text sample and the second emotion label into each text analysis sub-model, and extract the emotional feature sentence in the text data through each text analysis sub-model.
  • In order to improve the efficiency of model training, the model may be trained based on a random dropout algorithm (Dropout).
  • Dropout is a technology used to optimize the overfitting phenomenon that may occur in machine learning models.
  • Figure 4 shows a schematic diagram of a neural network model trained using this technology.
  • The output of each neuron (or the neuron's weights and biases) in the original neural network is discarded with a certain probability, thereby forming a relatively sparse network structure.
  • This training method is very effective for regularizing dense neural networks and can greatly improve the efficiency of model training.
  • In the embodiments of the present application, the original Dropout is improved and utilized.
  • Each text analysis sub-model is constrained to share the same weight parameters; that is, text analysis sub-models with different structures have consistent weight parameters in the same neural network unit, and each text analysis sub-model is trained through the training data set.
  • Step 230 may include, but is not limited to, steps 231 to 232:
  • Step 231 Determine the sub-loss values corresponding to each text analysis sub-model.
  • Step 232 Calculate the mean value of each sub-loss value to obtain the training loss value.
  • The sub-loss value corresponding to each text analysis sub-model can be obtained, the mean of the sub-loss values can be calculated, and this mean can be used as the total loss value of model training to update the model parameters.
  • In this way, the convergence of training can be greatly accelerated, and the generalization ability of the model can be effectively improved, which is beneficial to improving the accuracy of the obtained prediction results.
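Steps 221 to 232 can be sketched as follows. The toy one-layer "model", the squared-error sub-loss, and the inverted-dropout rescaling are simplifying assumptions; the point is that every stochastic forward pass reuses the same shared weights while only the dropout mask changes, and the sub-losses are averaged into the training loss.

```python
import random

def dropout_forward(weights, inputs, drop_prob, rng):
    """One stochastic forward pass: each unit is dropped with drop_prob.

    Every sub-model reuses the same shared `weights`; only the random
    dropout mask differs between sub-models.
    """
    total = 0.0
    for w, x in zip(weights, inputs):
        if rng.random() >= drop_prob:           # keep this unit
            total += w * x / (1.0 - drop_prob)  # inverted-dropout rescaling
    return total

def mean_sub_loss(weights, inputs, target, drop_prob, n_submodels, seed=0):
    """Average the sub-losses of several dropout sub-models (squared error here)."""
    rng = random.Random(seed)
    losses = [(dropout_forward(weights, inputs, drop_prob, rng) - target) ** 2
              for _ in range(n_submodels)]
    return sum(losses) / len(losses)
```

With `drop_prob = 0` every sub-model degenerates to the same deterministic forward pass, which makes the behavior easy to verify.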
  • In some embodiments, the emotional feature sentence label of the present application is obtained through the following steps:
  • determining a first label probability for each word; the first label probability is used to characterize the label probability that each word in the text sample is the start word of the emotional feature sentence, and the first label probability corresponding to each word is negatively correlated with the distance between that word and the start word;
  • determining a second label probability for each word; the second label probability is used to characterize the label probability that each word in the text sample is the end word of the emotional feature sentence, and the second label probability corresponding to each word is negatively correlated with the distance between that word and the end word;
  • constructing the emotional feature sentence label according to the first label probability and the second label probability.
  • The emotional feature sentence label can follow the form of the prediction result output by the model and is set to include two values: one is recorded as the first label probability, used to characterize the label probability that each word in the text sample is the start word of the emotional feature sentence; the other is recorded as the second label probability, used to characterize the label probability that each word in the text sample is the end word of the emotional feature sentence.
  • The first label probability can be determined according to the distance of each word from the true start word: the closer the word is to the true start word, the greater its first label probability; conversely, the farther the word is from the true start word, the smaller its first label probability.
  • The second label probability can be determined according to the distance of each word from the true end word: the closer the word is to the true end word, the greater its second label probability; conversely, the farther the word is from the true end word, the smaller its second label probability.
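A minimal sketch of such distance-based label probabilities follows; the 1/(1 + distance) decay and the final normalization are illustrative choices, since the scheme only requires the probability to fall as the distance to the true word grows.

```python
def distance_labels(num_words, true_index):
    """Label probabilities that decay with distance from the true word.

    Uses 1 / (1 + distance) as an illustrative decay, then normalizes so
    the probabilities sum to 1. The same routine works for the start word
    (first label probability) and the end word (second label probability).
    """
    raw = [1.0 / (1 + abs(i - true_index)) for i in range(num_words)]
    total = sum(raw)
    return [r / total for r in raw]
```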
  • In some embodiments, the emotional feature sentence label of the present application can also be obtained through the following steps:
  • each word in the text sample is used in turn as a candidate start word of the emotional feature sentence, the end word of the text sample is used as the candidate end word of the emotional feature sentence, and the first candidate emotional feature sentence corresponding to each word in the text sample is constructed;
  • according to the word intersection-over-union ratio between each first candidate emotional feature sentence and the emotional feature sentence, the first label probability of the word corresponding to each first candidate emotional feature sentence is determined; the first label probability is used to characterize the label probability that each word in the text sample is the start word of the emotional feature sentence;
  • each word in the text sample is used in turn as a candidate end word of the emotional feature sentence, the start word of the text sample is used as the candidate start word, and the second candidate emotional feature sentence corresponding to each word is constructed; according to the word intersection-over-union ratio between each second candidate emotional feature sentence and the emotional feature sentence, the second label probability of the word corresponding to each second candidate emotional feature sentence is determined; the second label probability is used to characterize the label probability that each word in the text sample is the end word of the emotional feature sentence;
  • the emotional feature sentence label is constructed according to the first label probability and the second label probability.
  • Each word in the text sample can be used as a candidate start word of the emotional feature sentence, the end word of the text sample can be used as the candidate end word, and the first candidate emotional feature sentence corresponding to each word can be constructed.
  • According to the degree of overlap between each first candidate emotional feature sentence and the real emotional feature sentence, the first label probability of the word corresponding to that first candidate emotional feature sentence can be determined.
  • The second label probability of each word can be determined in the same way: each word in the text sample is used as a candidate end word of the emotional feature sentence, the start word of the text sample is used as the candidate start word, and the second candidate emotional feature sentence corresponding to each word is constructed. According to the degree of overlap between the second candidate emotional feature sentence and the real emotional feature sentence, the second label probability of the word corresponding to that second candidate emotional feature sentence can be determined.
  • For example, suppose the text sample contains 29 words labeled 0 to 28, and the sentence from the 23rd word to the last word is the emotional feature sentence; correspondingly, the labels of the words in the emotional feature sentence are 22 to 28.
  • The label of the start word of the emotional feature sentence is 22, and the label of the end word of the emotional feature sentence is 28.
  • Each word is used in turn as a candidate start word of the emotional feature sentence, the end word of the text sample is used as the candidate end word, and the first candidate emotional feature sentence corresponding to each word in the text sample is constructed.
  • For the word labeled 0, the corresponding first candidate emotional feature sentence includes the text content of all words labeled 0 to 28.
  • For the word labeled 8, the corresponding first candidate emotional feature sentence includes the text content of all words labeled 8 to 28.
  • The word intersection-over-union ratio between each first candidate emotional feature sentence and the real emotional feature sentence can then be calculated.
  • The number of words in the intersection of the word set of the first candidate emotional feature sentence and the word set of the real emotional feature sentence can be divided by the number of words in the union of the two word sets, and the resulting ratio is taken as the word intersection-over-union ratio.
  • The word intersection-over-union ratio can be directly used as the first label probability of the word corresponding to the first candidate emotional feature sentence.
  • The word intersection-over-union ratio can also be processed by a certain function, and the obtained result used as the first label probability; in principle, it is only necessary that the word intersection-over-union ratio be positively correlated with the first label probability.
  • If the word intersection-over-union ratio is directly determined as the label probability, it is likely to cause drastic changes in the values and introduce large errors; introducing a square term for smoothing can effectively avoid this situation, improve the effect of model training, and help improve the accuracy of prediction.
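The word intersection-over-union computation in the 29-word example above can be sketched as follows; the set-based implementation assumes each word position is distinct, and the helper names are illustrative.

```python
def word_iou(candidate, truth):
    """Intersection-over-union of two collections of word indices."""
    cand, true = set(candidate), set(truth)
    return len(cand & true) / len(cand | true)

def first_label_ious(num_words, start, end):
    """Word IoU for each first candidate sentence, i.e. the sentence running
    from word i to the last word, against the true sentence [start, end]."""
    truth = range(start, end + 1)
    return [word_iou(range(i, num_words), truth) for i in range(num_words)]
```

For the example with 29 words and a true sentence spanning labels 22 to 28, the candidate starting at word 22 coincides with the true sentence (IoU 1), while the candidate starting at word 0 shares 7 of 29 words.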
  • In some embodiments, the first label probability of each word can also be determined by a formula of the following form:
  • y_i = α · ŷ_i + (1 − α) · S_i, i = 1, 2, …, k
  • where i represents the label of the word in the text sample, k represents the total number of words in the text sample, y_i represents the first label probability corresponding to the i-th word, α is a numerical parameter (for example, it can be 0.6), ŷ_i represents the true label probability (in the example above, a 29-dimensional vector in which the element corresponding to the start word is 1 and the other elements are 0), and S_i represents the reference label probability corresponding to the i-th word.
  • The reference label probability is determined by a normalization of the following form:
  • S_i = j_i / (j_1 + j_2 + … + j_k)
  • where S_i represents the reference label probability corresponding to the i-th word, i represents the label of the word in the text sample, k represents the total number of words in the text sample, and j_i represents the word intersection-over-union ratio corresponding to the i-th word (or that ratio plus its own square term).
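A sketch of the smoothed labels follows. The mixing form y = α·ŷ + (1 − α)·S and the normalization of the (IoU + IoU²) terms are our reading of the variable definitions above, not a verbatim reproduction of the patent's formulas.

```python
def reference_label_probs(ious):
    """Normalize the smoothed terms (IoU plus its square) into a distribution S."""
    smoothed = [j + j * j for j in ious]
    total = sum(smoothed)
    return [s / total for s in smoothed]

def first_label_probs(ious, start_index, alpha=0.6):
    """Mix the one-hot true label with the reference distribution:
    y_i = alpha * yhat_i + (1 - alpha) * S_i."""
    s = reference_label_probs(ious)
    return [alpha * (1.0 if i == start_index else 0.0) + (1 - alpha) * s[i]
            for i in range(len(ious))]
```

Because the one-hot vector and S each sum to 1, the mixed labels y also form a probability distribution.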
  • In some embodiments, determining the training loss value includes: calculating a first divergence value between the third output probability and the first label probability, and calculating a second divergence value between the fourth output probability and the second label probability;
  • a training loss value is determined according to the sum of the first divergence value and the second divergence value.
  • A conventional loss function cannot well measure the difference between the predicted probabilities and labels calculated from word intersection-over-union ratios. Therefore, the embodiments of the present application propose calculating the loss value through a divergence to optimize the model parameters. Specifically, the divergence value between the third output probability predicted during model training and the first label probability can be calculated and recorded as the first divergence value, and the divergence value between the fourth output probability predicted during training and the second label probability can be calculated and recorded as the second divergence value. The first divergence value and the second divergence value are then summed to obtain the final loss value, which is used to update the parameters of the model through backpropagation.
  • the corresponding divergence value can be calculated through the KL divergence formula, and the specific calculation process will not be repeated here.
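The divergence-based loss can be sketched as follows; the small epsilon guard against zero probabilities is an implementation assumption.

```python
import math

def kl_divergence(label_probs, pred_probs, eps=1e-12):
    """KL(label || prediction), summed over the words of one sample."""
    return sum(p * math.log((p + eps) / (q + eps))
               for p, q in zip(label_probs, pred_probs))

def span_kl_loss(start_labels, start_preds, end_labels, end_preds):
    """Training loss: sum of the start-side (first) and end-side (second)
    divergence values."""
    return (kl_divergence(start_labels, start_preds)
            + kl_divergence(end_labels, end_preds))
```

The loss is zero when the predicted distributions match the label distributions exactly and grows as they diverge.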
  • An embodiment of the present application also provides a text data analysis device, which includes:
  • an acquisition module 510, configured to acquire the text data to be processed and the first emotion label corresponding to the text data; the text data includes a plurality of words;
  • a prediction module 520, configured to input the text data and the first emotion label into a preset text analysis model, and extract the emotional feature sentence in the text data through the text analysis model to obtain a first output probability and a second output probability; wherein the first output probability is used to represent the predicted probability that each word in the text data is the start word of the emotional feature sentence, and the second output probability is used to represent the predicted probability that each word in the text data is the end word of the emotional feature sentence;
  • a processing module 530, configured to determine the emotional feature sentence from the text data according to the first output probability and the second output probability.
  • The content in the embodiment of the text data analysis method shown in FIG. 2 is applicable to this embodiment of the text data analysis device.
  • The functions implemented by this device embodiment are the same as those of the text data analysis method embodiment shown in FIG. 2, and the beneficial effects achieved are also the same.
  • An embodiment of the present application also discloses a computer device, including:
  • at least one processor 610;
  • at least one memory 620 for storing at least one program;
  • when the at least one program is executed by the at least one processor 610, the at least one processor 610 is caused to implement a text data analysis method or a text analysis model training method;
  • the text data analysis method mentioned therein includes:
  • acquiring the text data to be processed and the first emotion label corresponding to the text data;
  • the text data includes a plurality of words;
  • the text data and the first emotional label are input to a preset text analysis model, and the text analysis model extracts the emotional feature sentence in the text data to obtain a first output probability and a second output probability; wherein, the first output probability is used to characterize each word in the text data as the predicted probability of the initial word of the emotional feature sentence, and the second output probability is used to characterize each word in the text data as the predicted probability of the termination word of the emotional feature sentence;
  • the emotional feature sentence is determined from the text data according to the first output probability and the second output probability;
  • the training method of the text analysis model includes:
  • acquiring multiple text samples and the second emotion labels and emotional feature sentence labels corresponding to the text samples; the text samples include a plurality of words;
  • the text sample and the second emotional label are input to a text analysis model, and the text analysis model is used to extract the emotional feature sentence in the text sample to obtain a third output probability and a fourth output probability; wherein, the third output probability is used to characterize each word in the text sample as the predicted probability of the initial word of the emotional feature sentence, and the fourth output probability is used to characterize each word in the text sample as the predicted probability of the termination word of the emotional feature sentence;
  • a training loss value is determined according to the third output probability, the fourth output probability, and the emotional feature sentence label;
  • the text analysis model is trained according to the loss value to obtain a trained text analysis model.
  • The content in the embodiment of the text data analysis method shown in FIG. 2 or the embodiment of the text analysis model training method shown in FIG. 3 is applicable to this computer device embodiment, and the functions implemented and beneficial effects achieved are the same.
  • An embodiment of the present application also discloses a computer-readable storage medium in which a processor-executable program is stored; when executed by a processor, the processor-executable program is used to implement a text data analysis method or a text analysis model training method;
  • the text data analysis method mentioned therein includes:
  • acquiring the text data to be processed and the first emotion label corresponding to the text data;
  • the text data includes a plurality of words;
  • the text data and the first emotional label are input to a preset text analysis model, and the text analysis model extracts the emotional feature sentence in the text data to obtain a first output probability and a second output probability; wherein, the first output probability is used to characterize each word in the text data as the predicted probability of the initial word of the emotional feature sentence, and the second output probability is used to characterize each word in the text data as the predicted probability of the termination word of the emotional feature sentence;
  • the emotional feature sentence is determined from the text data according to the first output probability and the second output probability;
  • the training method of the text analysis model includes:
  • acquiring multiple text samples and the second emotion labels and emotional feature sentence labels corresponding to the text samples; the text samples include a plurality of words;
  • the text sample and the second emotional label are input to a text analysis model, and the text analysis model is used to extract the emotional feature sentence in the text sample to obtain a third output probability and a fourth output probability; wherein, the third output probability is used to characterize each word in the text sample as the predicted probability of the initial word of the emotional feature sentence, and the fourth output probability is used to characterize each word in the text sample as the predicted probability of the termination word of the emotional feature sentence;
  • a training loss value is determined according to the third output probability, the fourth output probability, and the emotional feature sentence label;
  • the text analysis model is trained according to the loss value to obtain a trained text analysis model.
  • The computer-readable storage medium used to implement the text data analysis method embodiment shown in FIG. 2 or the text analysis model training method embodiment shown in FIG. 3 may be non-volatile or volatile.
  • The content in the embodiment of the text data analysis method shown in FIG. 2 or the text analysis model training method embodiment shown in FIG. 3 is applicable to this embodiment of the computer-readable storage medium.
  • The functions implemented by the embodiment of the computer-readable storage medium are the same as those of the text data analysis method embodiment shown in FIG. 2 or the text analysis model training method embodiment shown in FIG. 3, and the beneficial effects achieved are also the same.
  • the functions/operations noted in the block diagrams may occur out of the order noted in the operational diagrams.
  • two blocks shown in succession may, in fact, be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/operations involved.
  • the embodiments presented and described in the flowcharts of this application are provided by way of example for the purpose of providing a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logical flow presented in this application. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
  • If the functions are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions to make a computer device (which can be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods in each embodiment of the application.
  • The aforementioned storage medium includes: a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disk, or other media that can store program code.
  • a "computer-readable medium” may be any device that can contain, store, communicate, propagate or transmit a program for use in or in conjunction with an instruction execution system, device or device.
  • computer-readable media include the following: an electrical connection with one or more wires (electronic device), a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CDROM).
  • the computer-readable medium may even be paper or other suitable medium on which the program can be printed, since the program can be obtained electronically, for example, by optical scanning of the paper or other medium, followed by editing, interpreting, or processing in other suitable ways if necessary, and then storing it in the computer memory.
  • each part of the present application may be realized by hardware, software, firmware or a combination thereof.
  • various steps or methods may be implemented by software or firmware stored in memory and executed by a suitable instruction execution system.
  • For example, if implemented in hardware, as in another embodiment, it can be implemented by any one of the following technologies known in the art or a combination thereof: a discrete logic circuit with logic gates for implementing logic functions on data signals, an application-specific integrated circuit with suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A text data analysis method, a model training method, an apparatus, and a computer device. The analysis method includes: acquiring text data to be processed and a first emotion label corresponding to the text data (110), the text data including a plurality of words; inputting the text data and the first emotion label into a text analysis model, and extracting an emotional feature sentence in the text data by means of the text analysis model to obtain a first output probability and a second output probability (120), the first output probability being used to represent the predicted probability that each word in the text data is the start word of the emotional feature sentence, and the second output probability being used to represent the predicted probability that each word in the text data is the end word of the emotional feature sentence; and determining the emotional feature sentence from the text data according to the first output probability and the second output probability (130). The analysis method can extract emotional feature sentences from text data with high efficiency and accuracy, and can be widely applied in the technical field of artificial intelligence.

Description

Text data analysis method, model training method, apparatus, and computer device
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on January 21, 2022, with application number 202210074604.5 and invention title "Text data analysis method, model training method, apparatus, and computer device", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the technical field of artificial intelligence, and in particular to a text data analysis method, a model training method, an apparatus, and a computer device.
Background
In recent years, with the rapid development of artificial intelligence technology, various types of machine learning models have achieved fairly good application results in fields such as image classification, face recognition, and autonomous driving.
Among them, in text analysis application scenarios, a machine learning model can analyze the emotional tendency contained in given text data.
Technical Problem
The following are technical problems in the prior art recognized by the inventors:
In practical applications, there may be a need to further identify and extract the content related to an emotional tendency when the emotional tendency of the text data is already known. Faced with this task, the prediction results output by current machine learning models are often too brief or insufficiently accurate.
In summary, the problems existing in the related art urgently need to be solved.
Technical Solution
In a first aspect, an embodiment of the present application provides a text data analysis method, including:
acquiring text data to be processed and a first emotion label corresponding to the text data, the text data including a plurality of words;
inputting the text data and the first emotion label into a preset text analysis model, and extracting an emotional feature sentence in the text data by means of the text analysis model to obtain a first output probability and a second output probability, wherein the first output probability is used to represent the predicted probability that each word in the text data is the start word of the emotional feature sentence, and the second output probability is used to represent the predicted probability that each word in the text data is the end word of the emotional feature sentence; and
determining the emotional feature sentence from the text data according to the first output probability and the second output probability.
In a second aspect, an embodiment of the present application provides a training method for a text analysis model, including:
acquiring a plurality of text samples and second emotion labels and emotional feature sentence labels corresponding to the text samples, the text samples including a plurality of words;
inputting the text samples and the second emotion labels into a text analysis model, and extracting the emotional feature sentences in the text samples by means of the text analysis model to obtain a third output probability and a fourth output probability, wherein the third output probability is used to represent the predicted probability that each word in a text sample is the start word of the emotional feature sentence, and the fourth output probability is used to represent the predicted probability that each word in the text sample is the end word of the emotional feature sentence;
determining a training loss value according to the third output probability, the fourth output probability, and the emotional feature sentence labels; and
training the text analysis model according to the loss value to obtain a trained text analysis model.
In a third aspect, an embodiment of the present application provides a text data analysis apparatus, including:
an acquisition module, configured to acquire text data to be processed and a first emotion label corresponding to the text data, the text data including a plurality of words;
a prediction module, configured to input the text data and the first emotion label into a preset text analysis model, and extract an emotional feature sentence in the text data by means of the text analysis model to obtain a first output probability and a second output probability, wherein the first output probability is used to represent the predicted probability that each word in the text data is the start word of the emotional feature sentence, and the second output probability is used to represent the predicted probability that each word in the text data is the end word of the emotional feature sentence; and
a processing module, configured to determine the emotional feature sentence from the text data according to the first output probability and the second output probability.
In a fourth aspect, an embodiment of the present application provides a computer device, including:
at least one processor; and
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor is caused to implement a text data analysis method or a training method for a text analysis model;
wherein the text data analysis method includes:
acquiring text data to be processed and a first emotion label corresponding to the text data, the text data including a plurality of words;
inputting the text data and the first emotion label into a preset text analysis model, and extracting an emotional feature sentence in the text data by means of the text analysis model to obtain a first output probability and a second output probability, wherein the first output probability is used to represent the predicted probability that each word in the text data is the start word of the emotional feature sentence, and the second output probability is used to represent the predicted probability that each word in the text data is the end word of the emotional feature sentence; and
determining the emotional feature sentence from the text data according to the first output probability and the second output probability;
and wherein the training method for the text analysis model includes:
acquiring a plurality of text samples and second emotion labels and emotional feature sentence labels corresponding to the text samples, the text samples including a plurality of words;
inputting the text samples and the second emotion labels into the text analysis model, and extracting the emotional feature sentences in the text samples by means of the text analysis model to obtain a third output probability and a fourth output probability, wherein the third output probability is used to represent the predicted probability that each word in a text sample is the start word of the emotional feature sentence, and the fourth output probability is used to represent the predicted probability that each word in the text sample is the end word of the emotional feature sentence;
determining a training loss value according to the third output probability, the fourth output probability, and the emotional feature sentence labels; and
training the text analysis model according to the loss value to obtain a trained text analysis model.
In a fifth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a processor-executable program, and the processor-executable program, when executed by a processor, is used to implement a text data analysis method or a training method for a text analysis model;
wherein the text data analysis method includes:
acquiring text data to be processed and a first emotion label corresponding to the text data, the text data including a plurality of words;
inputting the text data and the first emotion label into a preset text analysis model, and extracting an emotional feature sentence in the text data by means of the text analysis model to obtain a first output probability and a second output probability, wherein the first output probability is used to represent the predicted probability that each word in the text data is the start word of the emotional feature sentence, and the second output probability is used to represent the predicted probability that each word in the text data is the end word of the emotional feature sentence; and
determining the emotional feature sentence from the text data according to the first output probability and the second output probability;
and wherein the training method for the text analysis model includes:
acquiring a plurality of text samples and second emotion labels and emotional feature sentence labels corresponding to the text samples, the text samples including a plurality of words;
inputting the text samples and the second emotion labels into the text analysis model, and extracting the emotional feature sentences in the text samples by means of the text analysis model to obtain a third output probability and a fourth output probability, wherein the third output probability is used to represent the predicted probability that each word in a text sample is the start word of the emotional feature sentence, and the fourth output probability is used to represent the predicted probability that each word in the text sample is the end word of the emotional feature sentence;
determining a training loss value according to the third output probability, the fourth output probability, and the emotional feature sentence labels; and
training the text analysis model according to the loss value to obtain a trained text analysis model.
Beneficial Effects
The text data analysis method, model training method, apparatus, and computer device disclosed in the embodiments of the present application can effectively extract, from text data and according to the emotion label of the text data, the emotional feature sentence corresponding to that emotion label. Applied in the technical field of sentiment analysis, this helps to assist in understanding text content and to judge the tendency of the text content in greater detail. Moreover, determining the emotional feature sentence from the text data based on outputting, for each word, the probability of being the start word and the probability of being the end word of the emotional feature sentence simplifies the complexity of the output data, improves the efficiency of data processing, and reduces the consumption of computing resources.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings of the relevant technical solutions are introduced below. It should be understood that the drawings described below are only intended to conveniently and clearly describe some embodiments of the technical solutions of the present application; those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of an implementation environment of a text data analysis method provided in an embodiment of the present application;
FIG. 2 is a schematic flowchart of a text data analysis method provided in an embodiment of the present application;
FIG. 3 is a schematic flowchart of a training method for a text analysis model provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a random dropout algorithm in the related art;
FIG. 5 is a schematic structural diagram of a text data analysis apparatus provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
Embodiments of the Invention
The present application is further described below with reference to the drawings and specific embodiments. The described embodiments should not be regarded as limiting the present application; all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.
In the following description, reference to "some embodiments" describes a subset of all possible embodiments; it can be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and they may be combined with each other where no conflict arises.
Unless otherwise defined, all technical and scientific terms used in this application have the same meanings as commonly understood by those skilled in the technical field to which this application belongs. The terms used in this application are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
Before describing the embodiments of the present application in further detail, the nouns and terms involved in the embodiments are explained; they are subject to the following interpretations.
1) Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science; it attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. Artificial intelligence technology is a comprehensive discipline involving a wide range of fields, including both hardware-level technologies and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include several major directions such as computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
2) Natural language processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers in natural language; natural language processing is a science that integrates linguistics, computer science, and mathematics. The natural language involved in this field is the language people use daily, so it is also closely related to the study of linguistics. Natural language processing technology usually includes technologies such as text processing, semantic understanding, machine translation, robot question answering, and knowledge graphs.
3) Machine learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; its applications span all fields of artificial intelligence. Machine learning (deep learning) usually includes technologies such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstrations.
4) Blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated and linked using cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity of its information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer. The blockchain underlying platform may include processing modules such as user management, basic services, smart contracts, and operation monitoring. The user management module is responsible for the identity information management of all blockchain participants, including maintaining public/private key generation (account management), key management, and maintaining the correspondence between a user's real identity and blockchain address (authority management), and, where authorized, supervising and auditing the transactions of certain real identities and providing rule configuration for risk control (risk-control auditing). The basic services module is deployed on all blockchain node devices and is used to verify the validity of service requests and to record valid requests to storage after completing consensus on them; for a new service request, the basic services module first performs interface adaptation parsing and authentication processing (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger after encryption (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance as well as contract triggering and contract execution; developers can define contract logic through a programming language and publish it to the blockchain (contract registration), and, according to the logic of the contract terms, invoke keys or other events to trigger execution and complete the contract logic; it also provides functions for contract upgrading and cancellation. The operation monitoring module is mainly responsible for deployment during product release, modification of configurations, contract settings, cloud adaptation, and visualized output of real-time status during product operation, for example: alarms, monitoring network conditions, and monitoring the health status of node devices. The platform product service layer provides the basic capabilities and implementation framework of typical applications; based on these basic capabilities, developers can superimpose the characteristics of their services and complete the blockchain implementation of service logic. The application service layer provides application services based on the blockchain solution for service participants to use.
In recent years, with the rapid development of artificial intelligence technology, various types of machine learning models have achieved fairly good application results in fields such as image classification, face recognition, and autonomous driving.
In text analysis application scenarios, a machine learning model can analyze the emotional tendency contained in given text data. In practical applications, however, there may be a need to further identify and extract the content related to an emotional tendency when that tendency is already known. For example, in restaurant and entertainment review applications, users often give a positive or negative rating and then upload a corresponding review. In this scenario, the positive or negative rating given by the user constitutes the emotional tendency, and there is a need to analyze the user's review content and extract the text content corresponding to that emotional tendency (referred to in this application as the emotional feature sentence), so as to determine the factors behind the user's positive (or negative) rating, thereby helping other users better evaluate merchants and prompting merchants to make corresponding service improvements and upgrades. In the current related art, however, machine learning models are generally unable to perform this type of task, or can only give rough and vague prediction results, which are often too brief or insufficiently accurate.
In order to solve the problem in the related art that there is a need to extract the corresponding emotional feature sentence according to an emotional tendency, while existing machine learning models are generally unable to perform this type of task, or can only give vague prediction results that are often too brief or insufficiently accurate, the embodiments of the present application provide a text data analysis method, a model training method, an apparatus, and a computer device. The analysis method can effectively extract, from text data and according to the emotion label of the text data, the emotional feature sentence corresponding to that emotion label. Applied in the technical field of sentiment analysis, this helps to assist in understanding text content and to judge the tendency of the text content in greater detail. Moreover, determining the emotional feature sentence from the text data based on outputting, for each word, the probability of being the start word and the probability of being the end word of the emotional feature sentence simplifies the complexity of the output data, improves the efficiency of data processing, and reduces the consumption of computing resources.
FIG. 1 is a schematic diagram of an implementation environment of a text data analysis method provided in an embodiment of the present application. Referring to FIG. 1, the software and hardware bodies of the implementation environment mainly include an operation terminal 101 and a server 102, and the operation terminal 101 is communicatively connected to the server 102. The text data analysis method may be configured to be executed on the operation terminal 101 alone, on the server 102 alone, or based on interaction between the operation terminal 101 and the server 102, which may be appropriately selected according to the actual application situation and is not specifically limited in this embodiment. In addition, the operation terminal 101 and the server 102 may be nodes in a blockchain, which is not specifically limited in this embodiment either.
Specifically, the operation terminal 101 in this application may include, but is not limited to, any one or more of a smart watch, a smartphone, a computer, a personal digital assistant (PDA), an intelligent voice interaction device, a smart home appliance, or an in-vehicle terminal. The server 102 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. A communication connection may be established between the operation terminal 101 and the server 102 through a wireless or wired network using standard communication technologies and/or protocols. The network may be the Internet or any other network, including but not limited to any combination of a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile, wired, or wireless network, a private network, or a virtual private network.
FIG. 2 is a flowchart of a text data analysis method provided in an embodiment of the present application. The method may be executed by at least one of the operation terminal or the server; in FIG. 2, the method is described as being configured to be executed on the operation terminal as an example. Referring to FIG. 2, the text data analysis method includes, but is not limited to, steps 110 to 130.
Step 110: Acquire text data to be processed and a first emotion label corresponding to the text data; the text data includes a plurality of words.
In this step, when processing the text data, the text data and its corresponding emotion label, recorded as the first emotion label, are first acquired. Here, the first emotion label is used to characterize the emotional tendency contained in the content of the text data; for example, the first emotion label may be a label representing "happy", "sad", "positive review", "negative review", "support", "oppose", and so on. Specifically, the data format of the first emotion label may be arbitrary, for example any one of a numerical value, a vector, a matrix, or a tensor, and the correspondence between the data and the specific label may be flexibly set as needed, which is not limited in this application.
In this step, the source channel for acquiring the text data to be processed is not limited. For example, in some embodiments, the text data to be processed may be downloaded from a relevant resource server, transmitted through a hardware port, or captured from the environment by a speech acquisition and recognition device and then recognized.
It should be noted that, in natural language, a text is composed of multiple sentences, and each sentence includes multiple words. Therefore, text data can be divided into multiple words, that is, the text data includes multiple words. In this application, the format and language type of the words are not specifically limited.
步骤120:将文本数据和第一情感标签输入至预设的文本分析模型,通过文本分析模型提取文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,第一输出概率用于表征文本数据中的各个单词为情感特征语句的起始单词的概率,第二输出概率用于表征文本数据中的各个单词为情感特征语句的终止单词的概率。
本步骤中,将文本数据和它对应的第一情感标签输入至文本分析模型时,可以对文本数据和第一情感标签进行预处理,具体的处理方式可以是数据拼接、数据融合中的任一种。
本步骤中,将文本数据和它对应的第一情感标签输入至文本分析模型,通过文本分析模型来提取文本数据中的情感特征语句。此处,情感特征语句即文本数据中能够反映或者体现和第一情感标签对应的情感的相关语句,情感特征语句中可以包括一个或者多个单词,具体的数量本申请不作限制。
需要说明的是,由于文本数据本身是非结构化的数据,而机器学习模型一般处理的数据为结构化数据。因此,本申请实施例中,在将文本数据输入到模型前可以对其进行编码转换,将非结构化的文本数据转换为模型易于处理的结构化数据。例如,可以对文本数据进行分词处理,得到组成该文本数据的词组。此处,可以采用的分词算法有多种,例如在一些实施例中,可以采用基于词典的分词算法,先把文本数据中的各个语句按照词典切分成词,再寻找词的最佳组合方式;在一些实施例中,也可以采用基于字的分词算法,先把文本数据中的各个语句分成一个个字,再将字组合成词,寻找最优的组合方式。对文本数据进行分词处理后,可以通过预先建立的词典来确定词组中每个词对应的词嵌入向量,当然,在一些实施例中,词嵌入向量可以通过将词映射到一个具有统一的较低维度的向量空间中得到,生成这种映射的策略包括神经网络、单词共生矩阵的降维、概率模型以及可解释的知识库方法等。以词嵌入向量作为对词编码得到的结构化数据为例,在得到文本数据中的每个词对应的词嵌入向量后,可以对这些词嵌入向量进行累加,累加后的向量可以记为词组向量,对词组向量进行归一化处理,即可得到文本数据对应的向量,比如说归一化处理时,可以设定对应的向量中元素和为1。当然,以上仅用于举例说明一种对文本数据进行结构化处理的方式,并不意味着对本申请的具体实施形成限制。
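作为示意,上述“分词后查询词嵌入向量、累加得到词组向量并归一化”的处理流程可以用如下Python代码表示(其中的词典与向量均为虚构的演示数据,并非本申请的具体实现):

```python
import numpy as np

def encode_text(words, embedding_table):
    """将分词后的词组映射为词嵌入向量并累加、归一化。

    words: 分词结果列表; embedding_table: 词 -> 向量 的映射(假设已预先建立)。
    """
    vectors = [embedding_table[w] for w in words]
    phrase_vector = np.sum(vectors, axis=0)     # 词嵌入向量累加, 得到词组向量
    return phrase_vector / phrase_vector.sum()  # 归一化, 使向量中元素和为 1

# 虚构的小词典, 仅用于演示
table = {"味道": np.array([1.0, 2.0]), "很好": np.array([3.0, 2.0])}
vec = encode_text(["味道", "很好"], table)
print(vec)  # [0.5 0.5], 元素和为 1
```

实际应用中,词嵌入向量一般来自预先训练的词典或模型的嵌入层,此处仅用于演示归一化的原理。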
本步骤中,文本分析模型在提取文本数据中的情感特征语句时,可以将其转换为从文本数据中确定情感特征语句的起始单词以及终止单词的问题。如此,模型可以预测文本数据中每一个单词是情感特征语句的起始单词的概率以及每一个单词是情感特征语句的终止单词的概率。本申请实施例中,将文本分析模型输出的文本数据中的各个单词为情感特征语句的起始单词的预测概率记为第一输出概率,将文本分析模型输出的文本数据中的各个单词为情感特征语句的终止单词的预测概率记为第二输出概率。可以理解的是,当某个单词对应的第一输出概率越高时,说明文本分析模型预测其越可能是情感特征语句中的第一个单词,当某个单词对应的第二输出概率越高时,说明文本分析模型预测其越可能是情感特征语句中的最后一个单词。如此,可以将文本分析模型用作预测文本数据中的情感特征语句。
可以理解的是,对于本申请实施例中的文本分析模型来说,其预测的文本数据中真实的情感特征语句的起始单词对应的第一输出概率越高,或者预测的文本数据中真实的情感特征语句的终止单词对应的第二输出概率越高,说明文本分析模型的预测效果越好,得到的预测结果越准确。
步骤130:根据第一输出概率和第二输出概率,从文本数据中确定情感特征语句。
本步骤中,在得到文本分析模型输出的第一输出概率和第二输出概率以后,可以从文本数据中确定情感特征语句。本申请实施例中,对文本数据进行分析的目的,即从中提取得到和第一情感标签对应的情感特征语句。具体地,例如,可以先比较第一输出概率和第二输出概率的大小,将对应的第一输出概率最高的单词确定为情感特征语句的目标起始单词,将对应的第二输出概率最高的单词确定为情感特征语句的目标终止单词。在确定到情感特征语句的目标起始单词和目标终止单词以后,从文本数据中提取目标起始单词和目标终止单词之间的文本内容(包括目标起始单词和目标终止单词),即可得到情感特征语句。
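上述“取第一/第二输出概率最高的单词作为目标起始/终止单词并截取文本内容”的步骤,可以用如下示意代码表示(示例中的单词与概率均为虚构):

```python
def extract_sentiment_span(words, start_probs, end_probs):
    """根据第一/第二输出概率, 取最高概率位置作为目标起始/终止单词并截取语句。"""
    start = max(range(len(words)), key=lambda i: start_probs[i])
    end = max(range(len(words)), key=lambda i: end_probs[i])
    return words[start:end + 1]  # 截取结果包含目标起始单词与目标终止单词

words = ["菜品", "一般", "但", "服务", "非常", "周到"]
s = [0.1, 0.05, 0.05, 0.6, 0.1, 0.1]   # 各单词为起始单词的概率
e = [0.05, 0.05, 0.1, 0.1, 0.1, 0.6]   # 各单词为终止单词的概率
print(extract_sentiment_span(words, s, e))  # ['服务', '非常', '周到']
```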
当然,在一些实施例中,还可能存在一个文本数据中包括多个情感特征语句,且这些情感特征语句不完全相邻的情况。故而,本申请实施例中,在根据第一输出概率和第二输出概率确定情感特征语句时,还可以预先设置相关的概率阈值,当第一输出概率(或者第二输出概率)超过概率阈值时,先将其确定为潜在起始单词(潜在终止单词),然后根据各个潜在起始单词和潜在终止单词在文本数据中的位置,依次截取得到多个情感特征语句。
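基于概率阈值截取多个情感特征语句的思路可以粗略示意如下(其中“每个潜在起始单词配对其后最近的潜在终止单词”只是一种假设性的配对策略,实际实施可以采用其他方式):

```python
def extract_spans_by_threshold(words, start_probs, end_probs, threshold=0.5):
    """超过概率阈值的单词先记为潜在起始/终止单词, 再按位置依次配对截取。"""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans, used = [], -1
    for s in starts:
        # 为当前潜在起始单词寻找其后最近且尚未使用的潜在终止单词
        candidates = [e for e in ends if e >= s and e > used]
        if candidates:
            used = candidates[0]
            spans.append(words[s:used + 1])
    return spans

words = ["环境", "很好", "不过", "上菜", "太慢"]
print(extract_spans_by_threshold(words,
                                 [0.9, 0.1, 0.1, 0.8, 0.1],
                                 [0.1, 0.9, 0.1, 0.1, 0.8]))
# [['环境', '很好'], ['上菜', '太慢']]
```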
可以理解的是,本申请实施例中,提供一种文本数据的分析方法,该方法能够有效根据文本数据的情感标签,从文本数据中提取出和该情感标签对应的情感特征语句,用于情感分析技术领域,可以有利于辅助理解文本内容,更深入细节地判断文本内容的倾向性;而且,本申请实施例中,基于输出每个单词为情感特征语句的起始单词的概率以及终止单词的概率,从文本数据中确定情感特征语句,能够简化输出数据的复杂度,提高数据处理的效率,且节省计算资源的消耗。
本申请实施例中,还提供一种文本分析模型的训练方法,图2中的文本数据的分析方法可以采用该文本分析模型的训练方法得到的文本分析模型执行处理任务。本申请实施例中,该训练方法的实施环境和前述的文本数据的分析方法类似,在此不再赘述。图3是本申请实施例提供的一种文本分析模型的训练方法的流程图,该方法的执行主体可以是操作终端或者服务器中的至少一者,图3中以该文本分析模型的训练方法配置于操作终端执行为例进行说明。参照图3,该文本分析模型的训练方法包括但不限于步骤210至步骤240。
步骤210:获取多个文本样本和文本样本对应的第二情感标签、情感特征语句标签;文本样本中包括多个单词。
步骤220:将文本样本和第二情感标签输入至文本分析模型,通过文本分析模型提取文本样本中的情感特征语句,得到第三输出概率和第四输出概率;其中,第三输出概率用于表征文本样本中的各个单词为情感特征语句的起始单词的预测概率,第四输出概率用于表征文本样本中的各个单词为情感特征语句的终止单词的预测概率。
步骤230:根据第三输出概率、第四输出概率和情感特征语句标签,确定训练的损失值。
步骤240:根据损失值对文本分析模型进行训练,得到训练好的文本分析模型。
本申请实施例中,文本分析模型可以采用任一种机器学习算法搭建,在此不作限制。机器学习(Machine Learning,ML)是一门多领域交叉学科,涉及概率论、统计学、逼近论、凸分析、算法复杂度理论等多门学科,它专门研究计算机怎样模拟或实现人类的学习行为,以获取新的知识或技能,重新组织已有的知识结构使之不断改善自身的性能。机器学习是人工智能的核心,是使计算机具有智能的根本途径,其应用遍及人工智能的各个领域,机器学习(深度学习)通常包括人工神经网络、置信网络、强化学习、迁移学习、归纳学习、示教学习等技术。
具体地,在一些实施例中,本申请的模型可以选用Transformer架构体系下的模型,如BERT、RoBERTa、GPT-2、T5等模型。而且,在原有模型的基础上,为了能够充分利用Transformer各层提取的特征信息,本申请中还可以对模型的框架进行改造,例如可以将Transformer的各中间层(不含Embedding层)的输出分别作平均池化和最大池化操作,然后进行拼接输出给模型的线性层,从而提高模型的预测精度。
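对各中间层输出分别作平均池化与最大池化再拼接的改造思路,可以用如下基于NumPy的示意代码表示(层数、序列长度与隐层维度均为假设值,仅用于演示张量形状的变化,并非实际的模型实现):

```python
import numpy as np

def pool_and_concat(layer_outputs):
    """对各中间层输出分别作平均池化与最大池化, 再拼接作为线性层的输入。

    layer_outputs: 形状为 (层数, 序列长度, 隐层维度) 的数组, 此处用随机数模拟。
    """
    features = []
    for h in layer_outputs:          # 逐层处理(不含 Embedding 层)
        avg = h.mean(axis=0)         # 平均池化: (隐层维度,)
        mx = h.max(axis=0)           # 最大池化: (隐层维度,)
        features.append(np.concatenate([avg, mx]))
    return np.concatenate(features)  # 拼接后输出给模型的线性层

layers = np.random.rand(12, 30, 768)  # 模拟 12 层、序列长 30、隐层维度 768
print(pool_and_concat(layers).shape)  # (18432,), 即 12 * 768 * 2
```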
需要说明的是,在使用上述的机器学习模型前,需要对其进行基于监督学习的训练。本申请实施例中,可以通过获取多个文本样本组成的训练数据集对文本分析模型进行训练,这些文本样本携带有对应的情感标签,记为第二情感标签,还携带有情感特征语句标签。此处,文本样本的情感特征语句标签用于表征文本样本中的情感特征语句,例如在一些实施例中,该情感特征语句标签可以是表征情感特征语句在文本样本中的位置信息。
在得到训练数据集后,对于训练数据集中的文本样本,可以将其和对应的第二情感标签输入到初始化的文本分析模型中,得到文本分析模型输出的预测结果。类似地,此时文本分析模型将输出文本样本中的各个单词为情感特征语句的起始单词的预测概率,记为第三输出概率,以及输出文本样本中的各个单词为情感特征语句的终止单词的预测概率,记为第四输出概率。在得到文本分析模型输出的预测结果后,可以根据该结果和前述的情感特征语句标签评估模型预测的准确性,以对模型进行反向传播训练,更新其相关参数。
具体地,对于机器学习模型来说,它的预测结果的准确性可以通过损失函数(Loss Function)来衡量,损失函数是定义在单个训练数据上的,用于衡量一个训练数据的预测误差,具体是通过单个训练数据的标签和模型对该训练数据的预测结果确定该训练数据的损失值。而实际训练时,一个训练数据集有很多训练数据,因此一般采用代价函数(Cost Function)来衡量训练数据集的整体误差,代价函数是定义在整个训练数据集上的,用于计算所有训练数据的预测误差的平均值,能够更好地衡量出模型的预测效果。对于一般的机器学习模型来说,基于前述的代价函数,再加上衡量模型复杂度的正则项即可作为训练的目标函数,基于该目标函数便能求出整个训练数据集的损失值。常用的损失函数种类有很多,例如0-1损失函数、平方损失函数、绝对损失函数、对数损失函数、交叉熵损失函数等均可以作为机器学习模型的损失函数,在此不再一一阐述。在实际应用中,可以从中任选一种损失函数来确定训练的损失值,也即第三输出概率、第四输出概率和情感特征语句标签之间的损失值。基于训练的损失值,采用反向传播算法对模型的参数进行更新,迭代预设的轮次即可得到训练好的机器学习模型。通过以上的训练方式,即可得到训练好的文本分析模型。
本申请的一个实施例中,对文本分析模型的训练过程的步骤220以及步骤230进行进一步的说明。
其中,步骤220可以包括但不限于步骤221至步骤222:
步骤221:对文本分析模型的神经网络单元进行多次的随机丢弃,得到多个不同的文本分析子模型;各个文本分析子模型具有共享的权重参数。
步骤222:将文本样本和第二情感标签输入到各个文本分析子模型中,通过各个文本分析子模型提取文本数据中的情感特征语句。
本申请实施例中,为了提高模型训练的效率,可以基于随机丢弃算法(Dropout)对模型进行训练。Dropout是一种用于优化机器学习模型中可能出现的过拟合现象的技术,参照图4,图4示出了一种神经网络模型采用该技术训练时的示意图,在模型训练过程的其中某一轮迭代时,原始的神经网络中每个神经元的输出(或者神经元的权重、偏置)以一定的概率被丢弃,从而形成了较为稀疏的网络结构,这种训练方式对于正则化密集的神经网络十分有效,可以大大提高模型训练的效率。而本申请实施例中,对原始的Dropout进行了改进利用,在模型的训练过程中,并行地对文本分析模型的神经网络单元进行多次的随机丢弃。如此,可以得到多个不同结构的文本分析子模型,本申请实施例中,约束各个文本分析子模型具有共享的权重参数,也即不同结构的文本分析子模型在相同的神经网络单元的权重参数一致,并通过训练数据集对各个文本分析子模型进行训练。
上述的步骤230可以包括但不限于步骤231至步骤232:
步骤231:确定各个文本分析子模型对应的子损失值。
步骤232:计算各个子损失值的均值,得到训练的损失值。
本申请实施例中,通过训练数据集对各个文本分析子模型训练以后,可以得到各个文本分析子模型对应的子损失值,计算各个子损失值的均值,可以将该均值作为模型训练的总损失值对模型参数进行更新。本申请实施例中,通过上述的训练方式,可以大大加快训练的收敛速度,并且能够有效提高模型的泛化能力,有利于提高得到的预测结果的准确性。
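上述“并行多次随机丢弃得到多个共享权重的子模型、再对各子损失值取均值”的训练方式,可以用如下简化示例示意(其中以均方误差代替实际的损失函数,网络结构亦为虚构的线性变换,仅用于说明流程):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_with_dropout(x, weights, drop_prob=0.1):
    """以一定概率随机丢弃神经元的输出; 各次前向传播共享同一份权重参数。"""
    h = x @ weights
    mask = rng.random(h.shape) >= drop_prob  # 随机丢弃掩码
    return h * mask

def multi_dropout_loss(x, weights, target, num_submodels=4):
    """并行进行多次随机丢弃, 得到各子模型的子损失值并取均值作为总损失值。"""
    sub_losses = []
    for _ in range(num_submodels):
        pred = forward_with_dropout(x, weights)
        sub_losses.append(float(np.mean((pred - target) ** 2)))  # 以均方误差示意子损失
    return sum(sub_losses) / len(sub_losses)

x = rng.random((8, 16)); w = rng.random((16, 4)); y = rng.random((8, 4))
print(multi_dropout_loss(x, w, y) > 0)  # True
```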
在一些实施例中,本申请的情感特征语句标签通过以下步骤得到:
根据文本样本中的情感特征语句的起始单词的位置,确定第一标签概率;第一标签概率用于表征文本样本中的各个单词为情感特征语句的起始单词的标签概率,各个单词对应的第一标签概率和单词与起始单词之间的距离负相关;
根据文本样本中的情感特征语句的终止单词的位置,确定第二标签概率;第二标签概率用于表征文本样本中的各个单词为情感特征语句的终止单词的标签概率,各个单词对应的第二标签概率和单词与终止单词之间的距离负相关;
根据第一标签概率和第二标签概率构造情感特征语句标签。
本申请实施例中,情感特征语句标签可以参照模型输出的预测结果的形式,设置为包括两个值,一个记为第一标签概率,用于表征文本样本中的各个单词为情感特征语句的起始单词的标签概率;另一个记为第二标签概率,用于表征文本样本中的各个单词为情感特征语句的终止单词的标签概率。
可以理解的是,模型预测的起始单词距离真实的起始单词的位置越近,以及预测的终止单词距离真实的终止单词的位置越近,则最后提取得到的情感特征语句就越准确。因此,本申请实施例中,在构造情感特征语句标签时,可以按照各个单词距离真实的起始单词的距离来确定其对应的第一标签概率,也即单词距离真实的起始单词的距离越近,其对应的第一标签概率越大;反之,单词距离真实的起始单词的距离越远,其对应的第一标签概率越小。类似地,可以按照各个单词距离真实的终止单词的距离来确定其对应的第二标签概率,也即单词距离真实的终止单词的距离越近,其对应的第二标签概率越大;反之,单词距离真实的终止单词的距离越远,其对应的第二标签概率越小。
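一种满足“标签概率与距离负相关”约束的假设性构造方式如下所示(指数衰减仅为示例,具体的衰减函数可以根据需要灵活设定):

```python
def distance_label_probs(num_words, true_index, decay=0.5):
    """按各单词与真实起始(或终止)单词的距离构造标签概率: 距离越近概率越大。"""
    raw = [decay ** abs(i - true_index) for i in range(num_words)]
    total = sum(raw)
    return [r / total for r in raw]  # 归一化, 使概率和为 1

probs = distance_label_probs(5, true_index=2)
print(probs.index(max(probs)))  # 2, 真实单词处的标签概率最大
```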
在一些实施例中,本申请的情感特征语句标签也可以通过以下步骤得到:
分别将文本样本中的各个单词作为情感特征语句的候选起始单词,将文本样本的终止单词作为情感特征语句的候选终止单词,构造得到文本样本中的各个单词对应的第一候选情感特征语句;
根据各个第一候选情感特征语句和情感特征语句的单词交并比,确定各个第一候选情感特征语句对应的单词的第一标签概率;第一标签概率用于表征文本样本中的各个单词为情感特征语句的起始单词的标签概率;
将文本样本的起始单词作为情感特征语句的候选起始单词,分别将文本样本中的各个单词作为情感特征语句的候选终止单词,构造得到文本样本中的各个单词对应的第二候选情感特征语句;
根据各个第二候选情感特征语句和情感特征语句的单词交并比,确定各个第二候选情感特征语句对应的单词的第二标签概率;第二标签概率用于表征文本样本中的各个单词为情感特征语句的终止单词的标签概率;
根据第一标签概率和第二标签概率构造情感特征语句标签。
本申请实施例中,在构造情感特征语句标签时,还可以分别将文本样本中的每个单词作为情感特征语句的候选起始单词,将文本样本的终止单词作为情感特征语句的候选终止单词,构造得到各个单词对应的第一候选情感特征语句。根据第一候选情感特征语句和真实的情感特征语句的重合度,可以确定该第一候选情感特征语句对应的单词的第一标签概率。类似地,可以以同样的方式确定各个单词的第二标签概率,即分别将文本样本中的每个单词作为情感特征语句的候选终止单词,将文本样本的起始单词作为情感特征语句的候选起始单词,构造得到各个单词对应的第二候选情感特征语句。根据第二候选情感特征语句和真实的情感特征语句的重合度,可以确定该第二候选情感特征语句对应的单词的第二标签概率。
下面,结合一个具体的实施例来说明本申请中构造情感特征语句标签的实施过程。
假设当前存在一个单词总个数为29的文本样本,从0开始依次为各个单词进行标号,文本样本的起始单词对应的标号为0,文本样本的终止单词对应的标号为28。其中,该文本样本中的第23个单词到最后一个单词之间的语句是其情感特征语句,则相应地,情感特征语句中的单词对应的标号包括22至28。其中,情感特征语句的起始单词的标号为22,情感特征语句的终止单词的标号为28。在构造情感特征语句标签时,以第一标签概率为例,先初始化每个位置的单词为起始单词的概率为0,得到一个维度为29,各个元素为0的向量。然后,从文本样本的起始单词开始,依次将各个单词作为情感特征语句的候选起始单词,将文本样本的终止单词作为情感特征语句的候选终止单词,构造得到文本样本中的各个单词对应的第一候选情感特征语句。比如说,对于文本样本的起始单词来说,其对应的第一候选情感特征语句就包括标号为0至28的全部单词的文本内容。类似地,对于文本样本中标号为8的单词来说,其对应的第一候选情感特征语句包括标号为8至28的全部单词的文本内容。
当构造得到各个单词对应的第一候选情感特征语句后,可以计算第一候选情感特征语句和真实的情感特征语句的单词交并比。此处,计算单词交并比时,可以通过第一候选情感特征语句中的单词集合和真实的情感特征语句的单词集合的交集中单词的个数,除以两个单词集合的并集中单词的个数得到的比值作为单词交并比。比如说,对于文本样本的起始单词来说,其对应的第一候选情感特征语句包括标号为0至28的全部单词,共有29个单词,而真实的情感特征语句中包括标号为22至28的单词,两者单词的交集中有7个单词,并集中有29个单词,则此时单词交并比为7/29=0.241。
本申请实施例中,可以将单词交并比直接作为第一候选情感特征语句对应的单词的第一标签概率,当然,在一些实施例中,也可以对单词交并比进行一定的函数处理后,将得到的结果作为第一标签概率,原理上只需使得单词交并比和第一标签概率正相关即可,例如可以将单词交并比加上自身的平方项作为第一标签概率,则前述的单词交并比为0.241的单词对应的第一标签概率可以计算为0.2996。
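结合前述的具体数值,单词交并比以及“加上自身平方项”的计算可以用如下代码验证:

```python
def word_iou(candidate, target):
    """计算候选情感特征语句与真实情感特征语句的单词交并比(按单词标号集合)。"""
    inter = len(candidate & target)  # 两个单词集合的交集中单词的个数
    union = len(candidate | target)  # 两个单词集合的并集中单词的个数
    return inter / union

target = set(range(22, 29))    # 真实情感特征语句: 标号 22 至 28, 共 7 个单词
candidate = set(range(0, 29))  # 以文本起始单词为候选起始单词的第一候选语句, 共 29 个单词
iou = word_iou(candidate, target)
print(round(iou, 3))             # 0.241, 即 7/29
print(round(iou + iou ** 2, 4))  # 0.2996, 加上自身平方项后的第一标签概率
```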
本申请实施例中,直接将单词交并比确定为标签概率,容易导致数值的变化剧烈,会引入较大误差,而引入平方项进行平滑,则可以有效避免这种情况。能够提高模型训练的效果,有利于提高预测的准确性。
在一些实施例中,在确定到第一候选情感特征语句和情感特征语句的单词交并比后,对应的各个单词的第一标签概率还可以通过以下公式确定:
y_i = α·ŷ_i + (1 - α)·S_i , i = 0, 1, …, k-1
式中,i表示文本样本中单词的标号,k表示文本样本中单词的总个数,y_i表示第i个单词对应的第一标签概率,α为数值参数,例如可以取0.6;ŷ_i表示真实标签概率(即一个29维的向量,该向量起始单词对应位置的元素为1,其他元素为0)中的第i个元素;S_i表示第i个单词对应的参考标签概率。
上式中,参考标签概率通过以下公式确定:
S_i = j_i / (j_0 + j_1 + … + j_{k-1})
式中,S_i表示第i个单词对应的参考标签概率,i表示文本样本中单词的标号,k表示文本样本中单词的总个数;j_i表示第i个单词对应的单词交并比(或者第i个单词对应的单词交并比加上自身的平方项)。
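作为示意,下面的Python示例按上文描述的思路,先将各单词的交并比归一化得到参考标签概率,再与真实标签概率(one-hot形式)按数值参数α加权融合得到第一标签概率(其中的加权融合形式与交并比数值均为用于说明原理的假设性示例):

```python
def label_probs(ious, true_index, alpha=0.6):
    """由各单词的交并比构造第一标签概率: 先归一化得到参考标签概率 S_i,
    再与真实标签概率(one-hot 向量)按数值参数 alpha 加权融合。"""
    total = sum(ious)
    S = [j / total for j in ious]  # 参考标签概率, 各元素和为 1
    y_hat = [1.0 if i == true_index else 0.0 for i in range(len(ious))]
    return [alpha * y_hat[i] + (1 - alpha) * S[i] for i in range(len(ious))]

probs = label_probs([0.2, 0.5, 0.3], true_index=1)
print(round(sum(probs), 6))     # 1.0, 融合后概率和保持为 1
print(probs.index(max(probs)))  # 1, 真实起始单词处的标签概率最大
```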
需要说明的是,本申请上述实施例中,仅用于对标签概率的设置原理进行介绍和说明,其中选定或者计算得到的概率数值并不对实际实施形成限制。本领域人员在了解到本申请实施例中的原理后,具体的标签概率的数值可以根据需要灵活设定,在此不再赘述。
在一些实施例中,根据第三输出概率、第四输出概率和情感特征语句标签,确定训练的损失值,包括:
确定第三输出概率和第一标签概率之间的第一散度值;
确定第四输出概率和第二标签概率之间的第二散度值;
根据第一散度值和第二散度值的和,确定训练的损失值。
本申请实施例中,在计算损失值时,由于上述构造的标签属于概率的分布形式,常规的损失函数并不能较好地衡量通过单词交并比计算得到的预测概率和标签的差异情况。因此,本申请实施例中,提出通过散度来计算损失值,用于优化模型参数。具体地,可以计算模型训练时预测得到的第三输出概率和第一标签概率之间的散度值,记为第一散度值,以及计算模型训练时预测得到的第四输出概率和第二标签概率之间的散度值,记为第二散度值。然后,对第一散度值和第二散度值求和,从而得到最终的损失值,用于反向更新模型的参数。此处,可以通过KL散度公式计算对应的散度值,具体的计算过程不再赘述。
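第一散度值与第二散度值求和作为训练损失的计算可以示意如下(此处按KL散度的定义直接实现,eps仅为避免对数运算出现零值的数值保护项):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL 散度 D(p || q): p 为标签概率分布, q 为模型输出的概率分布。"""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def train_loss(start_label, start_pred, end_label, end_pred):
    """第一散度值与第二散度值之和, 作为训练的损失值。"""
    return (kl_divergence(start_label, start_pred)   # 第一散度值
            + kl_divergence(end_label, end_pred))    # 第二散度值

loss = train_loss([0.1, 0.8, 0.1], [0.2, 0.6, 0.2],
                  [0.1, 0.1, 0.8], [0.1, 0.2, 0.7])
print(loss > 0)  # True, 预测分布与标签分布不一致时损失为正
```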
参照图5,本申请实施例还提供了一种文本数据的分析装置,该装置包括:
获取模块510,用于获取待处理的文本数据和文本数据对应的第一情感标签;文本数据中包括多个单词;
预测模块520,用于将文本数据和第一情感标签输入至预设的文本分析模型,通过文本分析模型提取文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,第一输出概率用于表征文本数据中的各个单词为情感特征语句的起始单词的预测概率,第二输出概率用于表征文本数据中的各个单词为情感特征语句的终止单词的预测概率;
处理模块530,用于根据第一输出概率和第二输出概率,从文本数据中确定情感特征语句。
可以理解的是,图2所示的文本数据的分析方法实施例中的内容均适用于本文本数据的分析装置实施例中,本文本数据的分析装置实施例所具体实现的功能与图2所示的文本数据的分析方法实施例相同,并且达到的有益效果与图2所示的文本数据的分析方法实施例所达到的有益效果也相同。
参照图6,本申请实施例还公开了一种计算机设备,包括:
至少一个处理器610;
至少一个存储器620,用于存储至少一个程序;
当至少一个程序被至少一个处理器610执行,使得至少一个处理器610实现一种文本数据的分析方法或一种文本分析模型的训练方法;
其中所述文本数据的分析方法,包括:
获取待处理的文本数据和所述文本数据对应的第一情感标签;所述文本数据中包括多个单词;
将所述文本数据和所述第一情感标签输入至预设的文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,所述第一输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的起始单词的预测概率,所述第二输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的终止单词的预测概率;
根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句;
其中所述文本分析模型的训练方法,包括:
获取多个文本样本和所述文本样本对应的第二情感标签、情感特征语句标签;所述文本样本中包括多个单词;
将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本样本中的情感特征语句,得到第三输出概率和第四输出概率;其中,所述第三输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的预测概率,所述第四输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的预测概率;
根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值;
根据所述损失值对所述文本分析模型进行训练,得到训练好的文本分析模型。例如,图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例。
可以理解的是,如图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例中的内容均适用于本计算机设备实施例中,本计算机设备实施例所具体实现的功能与如图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例相同,并且达到的有益效果与如图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例所达到的有益效果也相同。
本申请实施例还公开了一种计算机可读存储介质,其中存储有处理器可执行的程序,处理器可执行的程序在由处理器执行时用于实现一种文本数据的分析方法或一种文本分析模型的训练方法;
其中所述文本数据的分析方法,包括:
获取待处理的文本数据和所述文本数据对应的第一情感标签;所述文本数据中包括多个单词;
将所述文本数据和所述第一情感标签输入至预设的文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,所述第一输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的起始单词的预测概率,所述第二输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的终止单词的预测概率;
根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句;
其中所述文本分析模型的训练方法,包括:
获取多个文本样本和所述文本样本对应的第二情感标签、情感特征语句标签;所述文本样本中包括多个单词;
将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本样本中的情感特征语句,得到第三输出概率和第四输出概率;其中,所述第三输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的预测概率,所述第四输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的预测概率;
根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值;
根据所述损失值对所述文本分析模型进行训练,得到训练好的文本分析模型。例如,图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例。另外,所述计算机可读存储介质可以是非易失性的,也可以是易失性的。
可以理解的是,如图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例中的内容均适用于本计算机可读存储介质实施例中,本计算机可读存储介质实施例所具体实现的功能与如图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例相同,并且达到的有益效果与如图2所示的文本数据的分析方法实施例或者图3所示的文本分析模型训练方法实施例所达到的有益效果也相同。
在一些可选择的实施例中,在方框图中提到的功能/操作可以不按照操作示图提到的顺序发生。例如,取决于所涉及的功能/操作,连续示出的两个方框实际上可以被大体上同时地执行或方框有时能以相反顺序被执行。此外,在本申请的流程图中所呈现和描述的实施例以示例的方式被提供,目的在于提供对技术更全面的理解。所公开的方法不限于本申请所呈现的操作和逻辑流程。可选择的实施例是可预期的,其中各种操作的顺序被改变以及其中被描述为较大操作的一部分的子操作被独立地执行。
此外,虽然在功能性模块的背景下描述了本申请,但应当理解的是,除非另有相反说明,功能和/或特征中的一个或多个可以被集成在单个物理装置和/或软件模块中,或者一个或多个功能和/或特征可以在单独的物理装置或软件模块中被实现。还可以理解的是,有关每个模块的实际实现的详细讨论对于理解本申请是不必要的。更确切地说,考虑到在本申请中公开的装置中各种功能模块的属性、功能和内部关系的情况下,在工程师的常规技术内将会了解该模块的实际实现。因此,本领域技术人员运用普通技术就能够在无需过度试验的情况下实现在权利要求书中所阐明的本申请。还可以理解的是,所公开的特定概念仅仅是说明性的,并不意在限制本申请的范围,本申请的范围由所附权利要求书及其等同方案的全部范围来决定。
功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例方法的全部或部分步骤。而前述的存储介质包括: U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
在流程图中表示或在此以其他方式描述的逻辑和/或步骤,例如,可以被认为是用于实现逻辑功能的可执行指令的定序列表,可以具体实现在任何计算机可读介质中,以供指令执行系统、装置或设备(如基于计算机的系统、包括处理器的系统或其他可以从指令执行系统、装置或设备取指令并执行指令的系统)使用,或结合这些指令执行系统、装置或设备而使用。就本说明书而言,“计算机可读介质”可以是任何可以包含、存储、通信、传播或传输程序以供指令执行系统、装置或设备或结合这些指令执行系统、装置或设备而使用的装置。
计算机可读介质的更具体的示例(非穷尽性列表)包括以下:具有一个或多个布线的电连接部(电子装置),便携式计算机盘盒(磁装置),随机存取存储器(RAM),只读存储器(ROM),可擦除可编辑只读存储器(EPROM或闪速存储器),光纤装置,以及便携式光盘只读存储器(CDROM)。另外,计算机可读介质甚至可以是可在其上打印程序的纸或其他合适的介质,因为可以例如通过对纸或其他介质进行光学扫描,接着进行编辑、解译或必要时以其他合适方式进行处理来以电子方式获得程序,然后将其存储在计算机存储器中。
应当理解,本申请的各部分可以用硬件、软件、固件或它们的组合来实现。在上述实施方式中,多个步骤或方法可以用存储在存储器中且由合适的指令执行系统执行的软件或固件来实现。例如,如果用硬件来实现,和在另一实施方式中一样,可用本领域公知的下列技术中的任一项或他们的组合来实现:具有用于对数据信号实现逻辑功能的逻辑门电路的离散逻辑电路,具有合适的组合逻辑门电路的专用集成电路,可编程门阵列(PGA),现场可编程门阵列(FPGA)等。
在本说明书的上述描述中,参考术语“一个实施方式/实施例”、“另一实施方式/实施例”或“某些实施方式/实施例”等的描述意指结合实施方式或示例描述的具体特征、结构、材料或者特点包含于本申请的至少一个实施方式或示例中。在本说明书中,对上述术语的示意性表述不一定指的是相同的实施方式或示例。而且,描述的具体特征、结构、材料或者特点可以在任何的一个或多个实施方式或示例中以合适的方式结合。
尽管已经示出和描述了本申请的实施方式,本领域的普通技术人员可以理解:在不脱离本申请的原理和宗旨的情况下可以对这些实施方式进行多种变化、修改、替换和变型,本申请的范围由权利要求及其等同物限定。
以上是对本申请的较佳实施方式进行了具体说明,但本申请并不限于上述实施例,熟悉本领域的技术人员在不违背本申请精神的前提下可作出种种的等同变形或替换,这些等同的变形或替换均包含在本申请权利要求所限定的范围内。

Claims (20)

  1. 一种文本数据的分析方法,其中,包括:
    获取待处理的文本数据和所述文本数据对应的第一情感标签;所述文本数据中包括多个单词;
    将所述文本数据和所述第一情感标签输入至预设的文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,所述第一输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的起始单词的预测概率,所述第二输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的终止单词的预测概率;
    根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句。
  2. 根据权利要求1所述的文本数据的分析方法,其中,所述根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句,包括:
    将所述第一输出概率最高值对应的单词确定为情感特征语句的目标起始单词,将所述第二输出概率最高值对应的单词确定为情感特征语句的目标终止单词;
    从所述文本数据中提取所述目标起始单词和所述目标终止单词之间的文本内容,得到所述情感特征语句。
  3. 一种文本分析模型的训练方法,其中,包括:
    获取多个文本样本和所述文本样本对应的第二情感标签、情感特征语句标签;所述文本样本中包括多个单词;
    将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本样本中的情感特征语句,得到第三输出概率和第四输出概率;其中,所述第三输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的预测概率,所述第四输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的预测概率;
    根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值;
    根据所述损失值对所述文本分析模型进行训练,得到训练好的文本分析模型。
  4. 根据权利要求3所述的文本分析模型的训练方法,其中:
    所述将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,包括:
    对所述文本分析模型的神经网络单元进行多次的随机丢弃,得到多个不同的文本分析子模型;各个所述文本分析子模型具有共享的权重参数;
    将所述文本样本和所述第二情感标签输入到各个所述文本分析子模型中,提取所述文本数据中的情感特征语句;
    所述确定训练的损失值,包括:
    确定各个所述文本分析子模型对应的子损失值;
    计算各个所述子损失值的均值,得到训练的损失值。
  5. 根据权利要求3所述的文本分析模型的训练方法,其中,所述情感特征语句标签通过以下步骤得到:
    根据所述文本样本中的情感特征语句的起始单词的位置,确定第一标签概率;所述第一标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的标签概率,各个单词对应的所述第一标签概率和所述单词与所述起始单词之间的距离负相关;
    根据所述文本样本中的情感特征语句的终止单词的位置,确定第二标签概率;所述第二标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的标签概率,各个单词对应的所述第二标签概率和所述单词与所述终止单词之间的距离负相关;
    根据所述第一标签概率和所述第二标签概率构造所述情感特征语句标签。
  6. 根据权利要求3所述的文本分析模型的训练方法,其中,所述情感特征语句标签通过以下步骤得到:
    分别将所述文本样本中的各个单词作为情感特征语句的候选起始单词,将所述文本样本的终止单词作为情感特征语句的候选终止单词,构造得到所述文本样本中的各个单词对应的第一候选情感特征语句;
    根据各个所述第一候选情感特征语句和所述情感特征语句的单词交并比,确定各个所述第一候选情感特征语句对应的单词的第一标签概率;所述第一标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的标签概率;
    将所述文本样本的起始单词作为情感特征语句的候选起始单词,分别将所述文本样本中的各个单词作为情感特征语句的候选终止单词,构造得到所述文本样本中的各个单词对应的第二候选情感特征语句;
    根据各个所述第二候选情感特征语句和所述情感特征语句的单词交并比,确定各个所述第二候选情感特征语句对应的单词的第二标签概率;所述第二标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的标签概率;
    根据所述第一标签概率和所述第二标签概率构造所述情感特征语句标签。
  7. 根据权利要求5或者6所述的文本分析模型的训练方法,其中,所述根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值,包括:
    确定所述第三输出概率和所述第一标签概率之间的第一散度值;
    确定所述第四输出概率和所述第二标签概率之间的第二散度值;
    根据所述第一散度值和所述第二散度值的和,确定训练的损失值。
  8. 一种文本数据的分析装置,其中,包括:
    获取模块,用于获取待处理的文本数据和所述文本数据对应的第一情感标签;所述文本数据中包括多个单词;
    预测模块,用于将所述文本数据和所述第一情感标签输入至预设的文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,所述第一输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的起始单词的预测概率,所述第二输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的终止单词的预测概率;
    处理模块,用于根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句。
  9. 一种计算机设备,其中,包括:
    至少一个处理器;
    至少一个存储器,用于存储至少一个程序;
    当所述至少一个程序被所述至少一个处理器执行,使得所述至少一个处理器实现一种文本数据的分析方法或一种文本分析模型的训练方法;
    其中所述文本数据的分析方法,包括:
    获取待处理的文本数据和所述文本数据对应的第一情感标签;所述文本数据中包括多个单词;
    将所述文本数据和所述第一情感标签输入至预设的文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,所述第一输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的起始单词的预测概率,所述第二输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的终止单词的预测概率;
    根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句;
    其中所述文本分析模型的训练方法,包括:
    获取多个文本样本和所述文本样本对应的第二情感标签、情感特征语句标签;所述文本样本中包括多个单词;
    将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本样本中的情感特征语句,得到第三输出概率和第四输出概率;其中,所述第三输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的预测概率,所述第四输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的预测概率;
    根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值;
    根据所述损失值对所述文本分析模型进行训练,得到训练好的文本分析模型。
  10. 根据权利要求9所述的计算机设备,其中,所述根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句,包括:
    将所述第一输出概率最高值对应的单词确定为情感特征语句的目标起始单词,将所述第二输出概率最高值对应的单词确定为情感特征语句的目标终止单词;
    从所述文本数据中提取所述目标起始单词和所述目标终止单词之间的文本内容,得到所述情感特征语句。
  11. 根据权利要求9所述的计算机设备,其中:
    所述将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,包括:
    对所述文本分析模型的神经网络单元进行多次的随机丢弃,得到多个不同的文本分析子模型;各个所述文本分析子模型具有共享的权重参数;
    将所述文本样本和所述第二情感标签输入到各个所述文本分析子模型中,提取所述文本数据中的情感特征语句;
    所述确定训练的损失值,包括:
    确定各个所述文本分析子模型对应的子损失值;
    计算各个所述子损失值的均值,得到训练的损失值。
  12. 根据权利要求9所述的计算机设备,其中,所述情感特征语句标签通过以下步骤得到:
    根据所述文本样本中的情感特征语句的起始单词的位置,确定第一标签概率;所述第一标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的标签概率,各个单词对应的所述第一标签概率和所述单词与所述起始单词之间的距离负相关;
    根据所述文本样本中的情感特征语句的终止单词的位置,确定第二标签概率;所述第二标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的标签概率,各个单词对应的所述第二标签概率和所述单词与所述终止单词之间的距离负相关;
    根据所述第一标签概率和所述第二标签概率构造所述情感特征语句标签。
  13. 根据权利要求9所述的计算机设备,其中,所述情感特征语句标签通过以下步骤得到:
    分别将所述文本样本中的各个单词作为情感特征语句的候选起始单词,将所述文本样本的终止单词作为情感特征语句的候选终止单词,构造得到所述文本样本中的各个单词对应的第一候选情感特征语句;
    根据各个所述第一候选情感特征语句和所述情感特征语句的单词交并比,确定各个所述第一候选情感特征语句对应的单词的第一标签概率;所述第一标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的标签概率;
    将所述文本样本的起始单词作为情感特征语句的候选起始单词,分别将所述文本样本中的各个单词作为情感特征语句的候选终止单词,构造得到所述文本样本中的各个单词对应的第二候选情感特征语句;
    根据各个所述第二候选情感特征语句和所述情感特征语句的单词交并比,确定各个所述第二候选情感特征语句对应的单词的第二标签概率;所述第二标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的标签概率;
    根据所述第一标签概率和所述第二标签概率构造所述情感特征语句标签。
  14. 根据权利要求12或者13所述的计算机设备,其中,所述根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值,包括:
    确定所述第三输出概率和所述第一标签概率之间的第一散度值;
    确定所述第四输出概率和所述第二标签概率之间的第二散度值;
    根据所述第一散度值和所述第二散度值的和,确定训练的损失值。
  15. 一种计算机可读存储介质,其中存储有处理器可执行的程序,其中:所述处理器可执行的程序在由处理器执行时用于实现一种文本数据的分析方法或一种文本分析模型的训练方法;
    其中所述文本数据的分析方法,包括:
    获取待处理的文本数据和所述文本数据对应的第一情感标签;所述文本数据中包括多个单词;
    将所述文本数据和所述第一情感标签输入至预设的文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,得到第一输出概率和第二输出概率;其中,所述第一输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的起始单词的预测概率,所述第二输出概率用于表征所述文本数据中的各个单词为所述情感特征语句的终止单词的预测概率;
    根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句;
    其中所述文本分析模型的训练方法,包括:
    获取多个文本样本和所述文本样本对应的第二情感标签、情感特征语句标签;所述文本样本中包括多个单词;
    将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本样本中的情感特征语句,得到第三输出概率和第四输出概率;其中,所述第三输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的预测概率,所述第四输出概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的预测概率;
    根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值;
    根据所述损失值对所述文本分析模型进行训练,得到训练好的文本分析模型。
  16. 根据权利要求15所述的计算机可读存储介质,其中,所述根据所述第一输出概率和所述第二输出概率,从所述文本数据中确定所述情感特征语句,包括:
    将所述第一输出概率最高值对应的单词确定为情感特征语句的目标起始单词,将所述第二输出概率最高值对应的单词确定为情感特征语句的目标终止单词;
    从所述文本数据中提取所述目标起始单词和所述目标终止单词之间的文本内容,得到所述情感特征语句。
  17. 根据权利要求15所述的计算机可读存储介质,其中:
    所述将所述文本样本和所述第二情感标签输入至文本分析模型,通过所述文本分析模型提取所述文本数据中的情感特征语句,包括:
    对所述文本分析模型的神经网络单元进行多次的随机丢弃,得到多个不同的文本分析子模型;各个所述文本分析子模型具有共享的权重参数;
    将所述文本样本和所述第二情感标签输入到各个所述文本分析子模型中,提取所述文本数据中的情感特征语句;
    所述确定训练的损失值,包括:
    确定各个所述文本分析子模型对应的子损失值;
    计算各个所述子损失值的均值,得到训练的损失值。
  18. 根据权利要求15所述的计算机可读存储介质,其中,所述情感特征语句标签通过以下步骤得到:
    根据所述文本样本中的情感特征语句的起始单词的位置,确定第一标签概率;所述第一标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的标签概率,各个单词对应的所述第一标签概率和所述单词与所述起始单词之间的距离负相关;
    根据所述文本样本中的情感特征语句的终止单词的位置,确定第二标签概率;所述第二标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的标签概率,各个单词对应的所述第二标签概率和所述单词与所述终止单词之间的距离负相关;
    根据所述第一标签概率和所述第二标签概率构造所述情感特征语句标签。
  19. 根据权利要求15所述的计算机可读存储介质,其中,所述情感特征语句标签通过以下步骤得到:
    分别将所述文本样本中的各个单词作为情感特征语句的候选起始单词,将所述文本样本的终止单词作为情感特征语句的候选终止单词,构造得到所述文本样本中的各个单词对应的第一候选情感特征语句;
    根据各个所述第一候选情感特征语句和所述情感特征语句的单词交并比,确定各个所述第一候选情感特征语句对应的单词的第一标签概率;所述第一标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的起始单词的标签概率;
    将所述文本样本的起始单词作为情感特征语句的候选起始单词,分别将所述文本样本中的各个单词作为情感特征语句的候选终止单词,构造得到所述文本样本中的各个单词对应的第二候选情感特征语句;
    根据各个所述第二候选情感特征语句和所述情感特征语句的单词交并比,确定各个所述第二候选情感特征语句对应的单词的第二标签概率;所述第二标签概率用于表征所述文本样本中的各个单词为所述情感特征语句的终止单词的标签概率;
    根据所述第一标签概率和所述第二标签概率构造所述情感特征语句标签。
  20. 根据权利要求18或者19所述的计算机可读存储介质,其中,所述根据所述第三输出概率、所述第四输出概率和所述情感特征语句标签,确定训练的损失值,包括:
    确定所述第三输出概率和所述第一标签概率之间的第一散度值;
    确定所述第四输出概率和所述第二标签概率之间的第二散度值;
    根据所述第一散度值和所述第二散度值的和,确定训练的损失值。
PCT/CN2022/090738 2022-01-21 2022-04-29 文本数据的分析方法、模型训练方法、装置及计算机设备 WO2023137918A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210074604.5 2022-01-21
CN202210074604.5A CN114386436B (zh) 2022-01-21 2022-01-21 文本数据的分析方法、模型训练方法、装置及计算机设备

Publications (1)

Publication Number Publication Date
WO2023137918A1 true WO2023137918A1 (zh) 2023-07-27

Family

ID=81204292

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/090738 WO2023137918A1 (zh) 2022-01-21 2022-04-29 文本数据的分析方法、模型训练方法、装置及计算机设备

Country Status (2)

Country Link
CN (1) CN114386436B (zh)
WO (1) WO2023137918A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114386436B (zh) * 2022-01-21 2023-07-18 平安科技(深圳)有限公司 文本数据的分析方法、模型训练方法、装置及计算机设备

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210089936A1 (en) * 2019-09-24 2021-03-25 International Business Machines Corporation Opinion snippet detection for aspect-based sentiment analysis
CN113255327A (zh) * 2021-06-10 2021-08-13 腾讯科技(深圳)有限公司 文本处理方法、装置、电子设备及计算机可读存储介质
CN113515948A (zh) * 2021-01-11 2021-10-19 腾讯科技(深圳)有限公司 语言模型训练方法、装置、设备及存储介质
CN113535889A (zh) * 2020-04-20 2021-10-22 阿里巴巴集团控股有限公司 一种评论分析方法及装置
CN113836297A (zh) * 2021-07-23 2021-12-24 北京三快在线科技有限公司 文本情感分析模型的训练方法及装置
CN113850072A (zh) * 2021-09-27 2021-12-28 北京百度网讯科技有限公司 文本情感分析方法、情感分析模型训练方法、装置、设备及介质
CN114386436A (zh) * 2022-01-21 2022-04-22 平安科技(深圳)有限公司 文本数据的分析方法、模型训练方法、装置及计算机设备

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829672A (zh) * 2018-06-05 2018-11-16 平安科技(深圳)有限公司 文本的情感分析方法、装置、计算机设备和存储介质
CN109271493B (zh) * 2018-11-26 2021-10-08 腾讯科技(深圳)有限公司 一种语言文本处理方法、装置和存储介质
CN110442857B (zh) * 2019-06-18 2024-05-10 平安科技(深圳)有限公司 情感智能判断方法、装置及计算机可读存储介质
CN111339305B (zh) * 2020-03-20 2023-04-14 北京中科模识科技有限公司 文本分类方法、装置、电子设备及存储介质
CN112860841B (zh) * 2021-01-21 2023-10-24 平安科技(深圳)有限公司 一种文本情感分析方法、装置、设备及存储介质
CN112988979B (zh) * 2021-04-29 2021-10-08 腾讯科技(深圳)有限公司 实体识别方法、装置、计算机可读介质及电子设备

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210089936A1 (en) * 2019-09-24 2021-03-25 International Business Machines Corporation Opinion snippet detection for aspect-based sentiment analysis
CN113535889A (zh) * 2020-04-20 2021-10-22 阿里巴巴集团控股有限公司 一种评论分析方法及装置
CN113515948A (zh) * 2021-01-11 2021-10-19 腾讯科技(深圳)有限公司 语言模型训练方法、装置、设备及存储介质
CN113255327A (zh) * 2021-06-10 2021-08-13 腾讯科技(深圳)有限公司 文本处理方法、装置、电子设备及计算机可读存储介质
CN113836297A (zh) * 2021-07-23 2021-12-24 北京三快在线科技有限公司 文本情感分析模型的训练方法及装置
CN113850072A (zh) * 2021-09-27 2021-12-28 北京百度网讯科技有限公司 文本情感分析方法、情感分析模型训练方法、装置、设备及介质
CN114386436A (zh) * 2022-01-21 2022-04-22 平安科技(深圳)有限公司 文本数据的分析方法、模型训练方法、装置及计算机设备

Also Published As

Publication number Publication date
CN114386436B (zh) 2023-07-18
CN114386436A (zh) 2022-04-22

Similar Documents

Publication Publication Date Title
Ren et al. A sentiment-aware deep learning approach for personality detection from text
CN110263324B (zh) 文本处理方法、模型训练方法和装置
US20230100376A1 (en) Text sentence processing method and apparatus, computer device, and storage medium
US11481416B2 (en) Question Answering using trained generative adversarial network based modeling of text
US11281976B2 (en) Generative adversarial network based modeling of text for natural language processing
CN111783474B (zh) 一种评论文本观点信息处理方法、装置及存储介质
CN109992773B (zh) 基于多任务学习的词向量训练方法、系统、设备及介质
CN113051916B (zh) 一种社交网络中基于情感偏移感知的交互式微博文本情感挖掘方法
CN110110318B (zh) 基于循环神经网络的文本隐写检测方法及系统
CN110598070B (zh) 应用类型识别方法及装置、服务器及存储介质
CN113704460B (zh) 一种文本分类方法、装置、电子设备和存储介质
WO2021169364A1 (zh) 分析语义情感的方法、装置、设备及存储介质
US20220100967A1 (en) Lifecycle management for customized natural language processing
Guo et al. Who is answering whom? Finding “Reply-To” relations in group chats with deep bidirectional LSTM networks
Dangi et al. An efficient model for sentiment analysis using artificial rabbits optimized vector functional link network
WO2023137918A1 (zh) 文本数据的分析方法、模型训练方法、装置及计算机设备
CN113362852A (zh) 一种用户属性识别方法和装置
CN110889505A (zh) 一种图文序列匹配的跨媒体综合推理方法和系统
US20240028828A1 (en) Machine learning model architecture and user interface to indicate impact of text ngrams
CN113536784A (zh) 文本处理方法、装置、计算机设备和存储介质
US20240086731A1 (en) Knowledge-graph extrapolating method and system based on multi-layer perception
CN111859979A (zh) 讽刺文本协同识别方法、装置、设备及计算机可读介质
Sung et al. A Study of BERT-Based Classification Performance of Text-Based Health Counseling Data.
CN114925681A (zh) 知识图谱问答问句实体链接方法、装置、设备及介质
CN115293249A (zh) 一种基于动态时序预测的电力系统典型场景概率预测方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22921349

Country of ref document: EP

Kind code of ref document: A1