CN114662452A - Privacy-removing text label analysis method and device - Google Patents

Info

Publication number
CN114662452A
CN114662452A CN202210247141.8A CN202210247141A CN114662452A CN 114662452 A CN114662452 A CN 114662452A CN 202210247141 A CN202210247141 A CN 202210247141A CN 114662452 A CN114662452 A CN 114662452A
Authority
CN
China
Prior art keywords
text data
text
label
privacy
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210247141.8A
Other languages
Chinese (zh)
Inventor
魏从猛
瞿伟
李渊苑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202210247141.8A
Publication of CN114662452A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application provide a privacy-removing text label analysis method and device, which can be used in the financial field. The method comprises: acquiring text data of a user, and performing privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data; and, after converting the text data into a corresponding word vector matrix according to a preset word vector model, performing label type analysis on the word vector matrix according to a neural network text classification model and an attention model to determine the label type to which the text data belongs. With this method and device, label type analysis can be performed accurately on a user's text data, so that the label type of the text data can be determined and a targeted coping strategy adopted, which improves the classification accuracy of the user's text data.

Description

Privacy-removing text label analysis method and device
Technical Field
The application relates to the field of natural language processing, and can also be used in the field of finance; in particular, it relates to a privacy-removing text label analysis method and device.
Background
In the prior art, staff who help users solve problems in their work and daily life mainly rely on online questionnaires, on-site question-and-answer sessions, interviews and the like. These approaches still have the following problems: (1) the items listed in an online questionnaire are limited in variety and generic in content, so they cannot truly reflect the needs of users; (2) on-site question-and-answer sessions cannot give every user a chance to raise an issue, and some users, out of concern for saving face, find it difficult to describe what they really think.
Therefore, a technical solution capable of accurately analyzing and classifying text data of a user is needed to solve the problems in the prior art.
Disclosure of Invention
In view of the problems in the prior art, the application provides a privacy-removing text label analysis method and device, which can accurately perform label type analysis on a user's text data so as to determine the label type to which the text data belongs and adopt a targeted coping strategy, thereby improving the classification accuracy of the user's text data.
In order to solve at least one of the above problems, the present application provides the following technical solutions:
In a first aspect, the present application provides a privacy-removing text label analysis method, including:
acquiring text data of a user, and performing privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data;
after converting the text data into a corresponding word vector matrix according to a preset word vector model, performing label type analysis on the word vector matrix according to a neural network text classification model and an attention model, and determining the label type to which the text data belongs.
Further, after performing privacy-removing data cleaning on the text data according to the preset privacy data rule to obtain the cleaned text data, the method further includes:
performing word segmentation on the cleaned text data according to a preset custom dictionary;
performing specific part-of-speech filtering on the segmented text data according to preset stop words, to obtain text data that has undergone word segmentation and specific part-of-speech filtering.
Further, after the determining the tag type of the text data, the method further includes:
if the label type of the text data is a negative label type, determining negative words according to a high-frequency word cloud obtained from the word vector matrix;
calling a coping strategy corresponding to the negative words and sending the coping strategy to a corresponding administrator terminal.
Further, the word vector model is obtained in advance as follows:
setting the text data that has undergone emotion label classification as a model training set, and training a preset word vector model on it to obtain the trained word vector model.
Further, the emotion label classification includes:
performing manual emotion label classification on the acquired historical text data of the user, wherein, if different emotion labels are added to the same text data, the label type with the higher proportion is defined as the emotion label of that text data.
Further, the performing label type analysis on the word vector matrix according to a neural network text classification model and an attention model to determine the label type of the text data includes:
adding a dimension to the word vector matrix, inputting the expanded matrix into a convolution layer of a preset neural network text classification model for a convolution operation, and stacking the convolved word vector matrices along a dimension;
normalizing the stacked word vector matrix according to a preset attention model and a preset logistic regression model, and determining the label type to which the text data belongs.
In a second aspect, the present application provides a privacy-removing text label analysis apparatus, comprising:
a privacy-removing processing module, configured to acquire text data of a user, and perform privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data;
a text emotion analysis module, configured to convert the text data into a corresponding word vector matrix according to a preset word vector model, and then perform label type analysis on the word vector matrix according to a neural network text classification model and an attention model to determine the label type to which the text data belongs.
Further, the privacy-removing processing module further comprises:
a word segmentation unit, configured to perform word segmentation on the cleaned text data according to a preset custom dictionary;
a part-of-speech filtering unit, configured to perform specific part-of-speech filtering on the segmented text data according to preset stop words, to obtain text data that has undergone word segmentation and specific part-of-speech filtering.
In a third aspect, the present application provides an electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for de-privacy text label analysis when executing the program.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of de-privacy text label analysis.
In a fifth aspect, the present application provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the de-privacy text label analysis method.
As can be seen from the above technical solutions, the application provides a privacy-removing text label analysis method and device that perform label type analysis on the user's privacy-removed text data using a deep neural network text classification model combined with the attention mechanism of an attention model, so that the label type to which the user's text data belongs can be accurately determined and effectively responded to.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first schematic flowchart of a privacy-removing text label analysis method in an embodiment of the present application;
Fig. 2 is a second schematic flowchart of a privacy-removing text label analysis method in an embodiment of the present application;
Fig. 3 is a third schematic flowchart of a privacy-removing text label analysis method in an embodiment of the present application;
Fig. 4 is a fourth schematic flowchart of a privacy-removing text label analysis method in an embodiment of the present application;
Fig. 5 is a fifth schematic flowchart of a privacy-removing text label analysis method in an embodiment of the present application;
Fig. 6 is a first structural diagram of a privacy-removing text label analysis apparatus in an embodiment of the present application;
Fig. 7 is a second structural diagram of a privacy-removing text label analysis apparatus in an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
In the technical solution of the application, the acquisition, storage, use and processing of data all comply with the relevant national laws and regulations.
The user information in the embodiments of the application is obtained in a lawful and compliant manner, and its acquisition, storage, use and processing are authorized and approved by the client.
The application provides a privacy-removing text label analysis method and device that perform label type analysis on the user's privacy-removed text data using a deep neural network text classification model combined with the attention mechanism of an attention model, so that the label type to which the user's text data belongs can be accurately determined and effectively responded to.
In order to accurately perform label type analysis on a user's text data, determine the label type to which the text data belongs, adopt a targeted coping strategy, and improve the classification accuracy of the user's text data, the present application provides an embodiment of a privacy-removing text label analysis method. Referring to Fig. 1, the method specifically includes the following:
Step S101: acquire text data of a user, and perform privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data.
It is to be understood that text label type analysis, also known as opinion extraction or opinion mining, analyzes and processes emotionally colored text data in order to mine the emotion the user wants to express. It is essentially a text classification problem: for example, positive (active) emotions may be placed in one class and negative (passive) emotions in another.
It is understood that the present application may obtain the text data of the user after sufficiently obtaining the user's permission and authorization, for example, retrieving the text data saved by the user in the specific software (the user has sufficiently known in advance that the specific software will save the text data).
Optionally, after the text data of the user is obtained, the text data may be subjected to privacy-removing data cleaning according to a preset privacy data rule to obtain the cleaned text data.
For example, a dictionary of user names may be predefined and used during data cleaning to remove names from the text data.
As another example, a dictionary of project information or service information may be predefined and used to clean the text data, yielding the cleaned text data.
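The dictionary-based cleaning described above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the dictionary contents, the phone-number rule and the choice to substitute placeholders (rather than delete the matches) are all assumptions made for the example.

```python
import re

# Hypothetical dictionaries; a real system would load maintained lists.
NAME_DICT = {"Zhang San", "Li Si"}
PROJECT_DICT = {"Project Falcon", "Payment Gateway V2"}

# Hypothetical pattern rule (an assumption): 11-digit mobile numbers.
PHONE_RULE = re.compile(r"\b\d{11}\b")


def scrub(text, entries, placeholder):
    """Replace every dictionary entry found in text, trying longer entries first."""
    if not entries:
        return text
    ordered = sorted(entries, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(e) for e in ordered))
    return pattern.sub(placeholder, text)


def remove_privacy(text):
    """Apply the name dictionary, the project dictionary, then the pattern rule."""
    text = scrub(text, NAME_DICT, "<NAME>")
    text = scrub(text, PROJECT_DICT, "<PROJECT>")
    return PHONE_RULE.sub("<PHONE>", text)


cleaned = remove_privacy("Zhang San said Project Falcon is late, call 13800000000.")
print(cleaned)
```

Keeping each rule as a separate dictionary or pattern makes it possible to extend the "preset privacy data rule" without touching the cleaning logic.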
Step S102: after converting the text data into a corresponding word vector matrix according to a preset word vector model, perform label type analysis on the word vector matrix according to a neural network text classification model and an attention model, and determine the label type to which the text data belongs.
Optionally, the text data after being cleaned may be stored in a local database, and then the text data is converted into a corresponding word vector matrix by using a trained preset word vector model.
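As a rough illustration of the conversion step, the sketch below maps a token list to a fixed-size word vector matrix by table lookup. The embedding values, the 4-dimensional vector size, the padding length and the zero vector for unknown words are assumptions for the example; a trained word vector model would supply the actual vectors.

```python
# Hypothetical 4-dimensional embeddings standing in for a trained word vector model.
EMBEDDINGS = {
    "commute": [0.2, -0.1, 0.7, 0.0],
    "canteen": [0.5, 0.3, -0.2, 0.1],
    "far": [-0.4, 0.6, 0.1, 0.3],
}
EMBED_DIM = 4


def to_word_vector_matrix(tokens, max_len=5):
    """Map tokens to a max_len x EMBED_DIM matrix; unknown words and padding are zeros."""
    zero = [0.0] * EMBED_DIM
    rows = [list(EMBEDDINGS.get(tok, zero)) for tok in tokens[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


matrix = to_word_vector_matrix(["commute", "far", "unknown"])
```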
Optionally, a training set needs to be constructed before the word vector model is trained. For example, emotion label classification is performed on the acquired text data authorized by the user, wherein, if different emotion labels are added to the same text data, the label type with the higher proportion is defined as the emotion label of that text data.
For example, the text data may be classified manually in advance by adding an "active" label or a "passive" label to each sentence or each paragraph of chat text in the text data. Several people may add labels independently, so when different emotion labels are added to the same text data, the label type with the higher share (for example, 80% of the labels are active labels) may be defined as the emotion label of that text data.
The text data that has undergone emotion label classification is then set as a model training set, and a preset word vector model is trained on it to obtain the trained word vector model.
Optionally, in the present application, after the word vector model converts the text data into a corresponding word vector matrix, the word vector matrix may be subjected to tag type analysis according to the neural network text classification model and the Attention model, for example, the existing TextCNN + Attention model is used to perform tag type analysis on text data that has been subjected to text preprocessing.
Finally, once the label type analysis result is obtained, it can be combined with the high-frequency word cloud corresponding to the word vector matrix to understand the user's emotional tendency and its causes accurately and intuitively, so that a corresponding working approach can be adopted to respond effectively.
For example, if the label type analysis result indicates that a relatively large proportion of the text content in the user's text data belongs to the passive label type, the specific passive words (that is, the reasons for the user's negativity), such as the distance after the company's relocation, the working environment and the canteen, can be determined from the high-frequency word cloud, so that follow-up work can focus on addressing the concerns of the passive users.
As can be seen from the above description, the privacy-removing text label analysis method provided in the embodiments of the present application performs label type analysis on the user's privacy-removed text data using a deep neural network text classification model combined with the attention mechanism of an attention model, so that the label type to which the user's text data belongs can be accurately determined and effectively responded to.
In order to preprocess the text data, in an embodiment of the privacy-removing text label analysis method of the present application, referring to Fig. 2, the following may further be included after step S101:
Step S201: perform word segmentation on the cleaned text data according to a preset custom dictionary.
Step S202: perform specific part-of-speech filtering on the segmented text data according to preset stop words, to obtain text data that has undergone word segmentation and specific part-of-speech filtering.
Optionally, after the original text data is subjected to privacy-removing data cleaning, word segmentation and filtering processing of specific parts of speech can be performed on the text data.
For example, word segmentation in the present application may use jieba, a Chinese word segmentation library for Python: in full mode, with a custom dictionary added and a stop word list consulted, the text data is segmented into words.
The custom dictionary is introduced mainly for specific terms in the text data that are easily segmented incorrectly, such as place names and working methods. For example, before the custom dictionary is introduced, "a certain business building" is split into the fragments "certain city", "business" and "building"; after the whole name "a certain business building" is added as a dictionary entry, it is kept as the single token "a certain business building".
The stop words mainly filter out specific parts of speech in the sentence to be segmented, such as modal particles and personal pronouns. For example, before "I" is treated as a stop word, the sentence "I want to go to a certain market" is segmented into "I", "want to go" and "a certain market"; since personal pronouns such as "I" are of no interest for emotion analysis, after "I" is introduced as a stop word the segmentation result is "want to go" and "a certain market".
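The effect of a custom dictionary and a stop word list can be shown with a small, dependency-free sketch. It uses forward maximum matching as a simplified stand-in for jieba (it is not jieba's algorithm and does not reproduce full mode), and the Chinese strings and dictionary contents are illustrative assumptions rather than the applicant's data.

```python
def segment(text, dictionary, max_word_len=6):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical base vocabulary, custom dictionary entry and stop word list.
BASE_DICT = {"想去", "某某", "商务", "大厦"}
CUSTOM_DICT = {"某某商务大厦"}   # "a certain business building", kept whole
STOP_WORDS = {"我"}              # personal pronoun "I"

sentence = "我想去某某商务大厦"   # "I want to go to a certain business building"
before = segment(sentence, BASE_DICT)
after = segment(sentence, BASE_DICT | CUSTOM_DICT)
filtered = [tok for tok in after if tok not in STOP_WORDS]
print(before, after, filtered)
```

With only the base vocabulary the building name is broken into three fragments; adding it to the custom dictionary keeps it as one token, and the stop word filter then drops the pronoun.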
In order to respond effectively according to the label type, in an embodiment of the privacy-removing text label analysis method of the present application, referring to Fig. 3, the following may further be included after step S102:
Step S301: if the label type of the text data is a negative label type, determine negative words according to the high-frequency word cloud obtained from the word vector matrix.
Step S302: call a coping strategy corresponding to the negative words and send the coping strategy to a corresponding administrator terminal.
Optionally, once the label type analysis result is obtained, it can be combined with the high-frequency word cloud corresponding to the word vector matrix to understand the user's emotional tendency and its causes accurately and intuitively, so that a corresponding working approach can be adopted to respond effectively.
For example, if the label type analysis result shows that a relatively large proportion of the chat content in the user's text data belongs to the negative label type, and the high-frequency word cloud shows that the negative words (that is, the reasons for the user's negativity) are mainly concentrated on the distance after the company's relocation to a certain city, the working environment, the canteen and the like, then follow-up work can focus on addressing the concerns of the negative users.
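Steps S301 and S302 can be approximated as a frequency count followed by a table lookup. In this hedged sketch, the high-frequency word cloud is stood in for by token counts over negatively labelled documents, and the strategy table, its wording and the sample documents are hypothetical; delivery to an administrator terminal is left out.

```python
from collections import Counter

# Hypothetical mapping from negative words to coping strategies.
STRATEGIES = {
    "commute": "Survey shuttle-bus demand for the new site.",
    "canteen": "Collect menu feedback and review canteen suppliers.",
}


def top_negative_words(labelled_docs, top_n=3):
    """Count tokens over documents labelled 'negative'; return the most frequent ones."""
    counts = Counter()
    for tokens, label in labelled_docs:
        if label == "negative":
            counts.update(tokens)
    return [word for word, _ in counts.most_common(top_n)]


def strategies_for(words):
    """Look up a coping strategy for each negative word that has one."""
    return {w: STRATEGIES[w] for w in words if w in STRATEGIES}


docs = [
    (["commute", "far"], "negative"),
    (["canteen", "commute"], "negative"),
    (["team", "friendly"], "positive"),
]
words = top_negative_words(docs, top_n=2)
to_send = strategies_for(words)
```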
In order to train an accurate word vector model, in an embodiment of the privacy-removing text label analysis method of the present application, referring to Fig. 4, the word vector model is obtained in advance as follows:
Step S401: perform manual emotion label classification on the acquired historical text data of the user, wherein, if different emotion labels are added to the same text data, the label type with the higher proportion is defined as the emotion label of that text data.
Step S402: set the text data that has undergone emotion label classification as a model training set, and train a preset word vector model on it to obtain the trained word vector model.
Optionally, a training set needs to be constructed before the word vector model is trained. For example, emotion label classification is performed on the acquired text data authorized by the user, wherein, if different emotion labels are added to the same text data, the label type with the higher proportion is defined as the emotion label of that text data.
For example, the text data may be classified manually, that is, a "positive emotion" label or a "negative emotion" label is added to each sentence or each paragraph of chat text in the text data. Several people may add labels independently, so when different emotion labels are added to the same text data, the label type with the higher share (for example, 80% of the labels are positive emotion labels) may be defined as the emotion label of that text data.
The text data that has undergone emotion label classification is then set as a model training set, and a preset word vector model is trained on it to obtain the trained word vector model.
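The rule for reconciling labels from several annotators amounts to a majority vote. A minimal sketch follows, assuming each annotator supplies one label string per text; the function name is hypothetical, and the source does not specify how ties are broken, so the sketch does nothing special for them.

```python
from collections import Counter


def resolve_label(annotations):
    """Return the label with the largest share of votes, together with that share."""
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(annotations)


# Five annotators: four mark the text as positive, one as negative.
label, share = resolve_label(["positive"] * 4 + ["negative"])
print(label, share)
```

Returning the vote share alongside the label makes it easy to set aside low-agreement samples before training, should that be desired.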
In order to perform the label type analysis accurately, in an embodiment of the privacy-removing text label analysis method of the present application, referring to Fig. 5, step S102 may further include the following:
Step S501: add a dimension to the word vector matrix, input the expanded matrix into a convolution layer of a preset neural network text classification model for a convolution operation, and stack the convolved word vector matrices along a dimension.
Step S502: normalize the stacked word vector matrix according to a preset attention model and a preset logistic regression model, and determine the label type to which the text data belongs.
In a specific embodiment of the present application, convolution over text differs from convolution over images: convolving over only part of a word vector's dimensions would lose text information, so the present application applies the text convolution across the full word vector dimension.
Specifically, a batch of data is first read in and mapped into word vectors. Because the convolution operation requires four dimensions, one dimension is appended to the mapped word vectors before they enter the convolution layer for the convolution operation. The vectors obtained by convolution are then stacked along the last dimension and passed through the attention mechanism and a softmax normalization; the input sentence is assigned to the category with the maximum output probability, which determines the label type to which the text data belongs.
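The forward pass described above can be traced on a single sentence with plain Python lists. This is an illustrative sketch under several assumptions, not the model in the application: the weights are random and untrained, the kernel heights (2, 3, 4), the zero padding that keeps every feature map at sentence length, the ReLU, and the tanh-based attention score are all choices made for the example, and the batch dimension is omitted.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def add_channel_dim(matrix):
    """Turn a (seq_len, dim) matrix into (seq_len, dim, 1) by wrapping each scalar."""
    return [[[x] for x in row] for row in matrix]


def conv_full_width(sentence, kernel):
    """Slide a (k, dim) kernel down the sentence; it spans the full embedding width."""
    k, dim = len(kernel), len(kernel[0])
    padded = sentence + [[[0.0]] * dim for _ in range(k - 1)]  # keep seq_len outputs
    feature_map = []
    for t in range(len(sentence)):
        total = sum(kernel[i][j] * padded[t + i][j][0] for i in range(k) for j in range(dim))
        feature_map.append(max(0.0, total))  # ReLU
    return feature_map


def classify(matrix, kernels, attn_vec, out_weights):
    """Convolve, stack feature maps on the last axis, attend over positions, softmax."""
    sentence = add_channel_dim(matrix)
    maps = [conv_full_width(sentence, ker) for ker in kernels]
    feats = [[m[t] for m in maps] for t in range(len(matrix))]  # (seq_len, n_kernels)
    scores = [sum(a * math.tanh(h) for a, h in zip(attn_vec, f)) for f in feats]
    weights = softmax(scores)  # attention weights over sentence positions
    context = [sum(w * f[c] for w, f in zip(weights, feats)) for c in range(len(kernels))]
    logits = [sum(w * x for w, x in zip(row, context)) for row in out_weights]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


rng = random.Random(0)  # fixed seed; the weights below are untrained placeholders
DIM, SEQ_LEN, N_CLASSES = 4, 6, 2
matrix = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(SEQ_LEN)]
kernels = [[[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(k)] for k in (2, 3, 4)]
attn_vec = [rng.uniform(-1, 1) for _ in kernels]
out_weights = [[rng.uniform(-1, 1) for _ in kernels] for _ in range(N_CLASSES)]
label, probs = classify(matrix, kernels, attn_vec, out_weights)
```

Because every kernel covers the whole embedding width, each window of words yields one scalar per kernel, which is the "full-dimension" convolution the text contrasts with image-style convolution. In practice a deep learning framework would handle batching and training.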
In order to accurately perform label type analysis on a user's text data, determine the label type to which the text data belongs, adopt a targeted coping strategy, and improve the classification accuracy of the user's text data, the present application provides an embodiment of a privacy-removing text label analysis apparatus for implementing all or part of the privacy-removing text label analysis method. Referring to Fig. 6, the apparatus specifically includes the following:
The privacy-removing processing module 10 is configured to acquire text data of a user, and perform privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data.
The text emotion analysis module 20 is configured to, after the text data is converted into a corresponding word vector matrix according to a preset word vector model, perform label type analysis on the word vector matrix according to a neural network text classification model and an attention model, and determine the label type to which the text data belongs.
As can be seen from the above description, the privacy-removing text label analysis apparatus provided in the embodiments of the present application performs label type analysis on the user's privacy-removed text data using a deep neural network text classification model combined with the attention mechanism of an attention model, so that the label type to which the user's text data belongs can be accurately determined and effectively responded to.
In order to preprocess the text data, in an embodiment of the privacy-removing text label analysis apparatus of the present application, referring to Fig. 7, the privacy-removing processing module 10 further includes:
The word segmentation unit 11 is configured to perform word segmentation on the cleaned text data according to a preset custom dictionary.
The part-of-speech filtering unit 12 is configured to perform specific part-of-speech filtering on the segmented text data according to preset stop words, to obtain text data that has undergone word segmentation and specific part-of-speech filtering.
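One way to picture how modules 10 and 20 and units 11 and 12 fit together is as two small classes wired into a pipeline. The class names, the callable-based design and the toy stand-in functions below are assumptions for illustration only; the real units would wrap the cleaning, segmentation, filtering and classification logic described earlier.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PrivacyRemovalModule:  # corresponds to module 10
    clean: Callable[[str], str]                   # privacy-removing data cleaning
    segment: Callable[[str], List[str]]           # word segmentation unit 11
    filter_pos: Callable[[List[str]], List[str]]  # part-of-speech filtering unit 12

    def run(self, text: str) -> List[str]:
        return self.filter_pos(self.segment(self.clean(text)))


@dataclass
class TextEmotionAnalysisModule:  # corresponds to module 20
    classify: Callable[[List[str]], str]  # word vectors plus classifier, abstracted

    def run(self, tokens: List[str]) -> str:
        return self.classify(tokens)


# Toy stand-ins so the wiring can be exercised end to end.
module_10 = PrivacyRemovalModule(
    clean=lambda t: t.replace("Zhang San", "<NAME>"),
    segment=str.split,
    filter_pos=lambda toks: [t for t in toks if t.lower() not in {"i", "the"}],
)
module_20 = TextEmotionAnalysisModule(
    classify=lambda toks: "negative" if "far" in toks else "positive",
)
result = module_20.run(module_10.run("I think the commute is far"))
```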
In order to accurately perform, at the hardware level, label type analysis on a user's text data, determine the label type to which the text data belongs, adopt a targeted coping strategy, and improve the classification accuracy of the user's text data, the present application provides an embodiment of an electronic device for implementing all or part of the privacy-removing text label analysis method. The electronic device specifically includes the following:
a processor, a memory, a communications interface and a bus. The processor, the memory and the communications interface communicate with one another through the bus. The communications interface is used for information transmission between the privacy-removing text label analysis apparatus and related equipment such as a core service system, a user terminal and a related database. The logic controller may be a desktop computer, a tablet computer, a mobile terminal or the like, and this embodiment is not limited thereto. In this embodiment, the logic controller may be implemented with reference to the embodiments of the privacy-removing text label analysis method and the embodiments of the privacy-removing text label analysis apparatus, the contents of which are incorporated herein and are not repeated.
It is understood that the user terminal may include a smart phone, a tablet electronic device, a network set-top box, a portable computer, a desktop computer, a Personal Digital Assistant (PDA), an in-vehicle device, a smart wearable device, and the like. The smart wearable device may include smart glasses, a smart watch, a smart bracelet and the like.
In practical applications, part of the text label analysis method for privacy elimination may be performed on the electronic device side as described above, or all operations may be performed in the client device. The selection may be specifically performed according to the processing capability of the client device, the limitation of the user usage scenario, and the like. This is not a limitation of the present application. The client device may further include a processor if all operations are performed in the client device.
The client device may have a communication module (i.e., a communication unit), and may be communicatively connected to a remote server to implement data transmission with the server. The server may include a server on the task scheduling center side, and in other implementation scenarios, the server may also include a server on an intermediate platform, for example, a server on a third-party server platform that is communicatively linked to the task scheduling center server. The server may include a single computer device, or may include a server cluster formed by a plurality of servers, or a server structure of a distributed apparatus.
Fig. 8 is a schematic block diagram of the system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 8, the electronic device 9600 may include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, fig. 8 is exemplary; other types of structures may also be used, in addition to or in place of this structure, to implement telecommunications or other functions.
In one embodiment, the functions of the privacy-removing text label analysis method may be integrated into the central processor 9100. The central processor 9100 may be configured to perform the following control:
Step S101: acquire text data of a user, and perform privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data.
Step S102: after converting the text data into a corresponding word vector matrix according to a preset word vector model, perform label type analysis on the word vector matrix according to a neural network text classification model and an attention model, and determine the label type of the text data.
As can be seen from the above description, the electronic device provided in the embodiment of the present application performs label type analysis on the user's text data after privacy removal, using a deep neural network text classification model combined with the attention mechanism of an attention model, so that the label type of the user's text data can be determined accurately and responded to effectively.
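As a minimal sketch of step S101, the cleaning could be driven by a table of regular expressions. The rule set below (e-mail address, 18-character ID number, bank card number, 11-digit mobile number), the placeholder tokens, and the function name `clean_privacy` are illustrative assumptions; the application only states that a preset privacy data rule is used and does not list the rules.

```python
import re

# Hypothetical "preset privacy data rules": (pattern, placeholder) pairs.
# Longer, more specific patterns run first so that a shorter pattern does not
# consume part of a longer number. An all-digit 18-character number is
# ambiguous between an ID and a card number; this sketch labels it as an ID.
PRIVACY_RULES = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"), "<EMAIL>"),
    (re.compile(r"(?<![0-9])[0-9]{17}[0-9Xx](?![0-9])"), "<ID>"),
    (re.compile(r"(?<![0-9])[0-9]{16,19}(?![0-9])"), "<CARD>"),
    (re.compile(r"(?<![0-9])1[3-9][0-9]{9}(?![0-9])"), "<PHONE>"),
]


def clean_privacy(text: str) -> str:
    """Replace every match of every rule with its placeholder token."""
    for pattern, placeholder in PRIVACY_RULES:
        text = pattern.sub(placeholder, text)
    return text
```

Replacing matches with typed placeholders rather than deleting them keeps the sentence structure intact for the later word segmentation and classification steps; whether the application masks or deletes the matched spans is not stated.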
In another embodiment, the privacy-removing text label analysis apparatus may be configured separately from the central processor 9100. For example, the apparatus may be configured as a chip connected to the central processor 9100, and the functions of the privacy-removing text label analysis method are realized under the control of the central processor.
As shown in fig. 8, the electronic device 9600 may further include a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. Note that the electronic device 9600 need not include all of the components shown in fig. 8; it may also include components not shown in fig. 8, for which reference may be made to the prior art.
As shown in fig. 8, a central processor 9100, sometimes referred to as a controller or operational control, can include a microprocessor or other processor device and/or logic device, which central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.
The memory 9140 may be, for example, one or more of a buffer, a flash memory, a hard drive, removable media, a volatile memory, a non-volatile memory, or another suitable device. It may store information relating to the failure, together with a program for processing that information. The central processor 9100 may execute the program stored in the memory 9140 to implement information storage, processing, and the like.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. Power supply 9170 is used to provide power to electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.
The memory 9140 may be a solid-state memory, e.g., a read-only memory (ROM), a random access memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off, can be selectively erased, and can be provided with further data; an example of such a memory is sometimes called an EPROM. The memory 9140 may also be some other type of device. The memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer). The memory 9140 may include an application/function storage part 9142, which stores application programs and function programs, or a flow by which the central processor 9100 executes the operation of the electronic device 9600.
The memory 9140 can also include a data store 9143, the data store 9143 being used to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers for the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, contact book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling locally stored sounds to be played through the speaker 9131.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps of the privacy-removing text label analysis method of the foregoing embodiments whose execution subject is a server or a client. The computer-readable storage medium stores a computer program which, when executed by a processor, implements all the steps of that method; for example, the processor implements the following steps when executing the computer program:
Step S101: acquire text data of a user, and perform privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data.
Step S102: after converting the text data into a corresponding word vector matrix according to a preset word vector model, perform label type analysis on the word vector matrix according to a neural network text classification model and an attention model, and determine the label type of the text data.
As can be seen from the above description, the computer-readable storage medium provided in the embodiment of the present application performs label type analysis on the user's text data after privacy removal, using a deep neural network text classification model combined with the attention mechanism of an attention model, so that the label type of the user's text data can be determined accurately and responded to effectively.
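The label type analysis of step S102 can be sketched as a forward pass: convolve the word vector matrix along the token axis, stack the convolution outputs, weight them with attention, and normalize with a softmax. The sketch below is written in plain Python so that it runs without third-party packages; the kernel widths, filter count, random untrained weights, and all function names are assumptions, and a real system would use a trained model in a deep learning framework. Because it convolves in one dimension directly, it does not need the extra dimension that a 2-D convolution layer would require.

```python
import math
import random


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conv_over_tokens(matrix, filters, width):
    """Valid 1-D convolution along the token axis, followed by ReLU.

    matrix:  seq_len rows of emb_dim floats (the word vector matrix)
    filters: list of flat weight vectors, each of length width * emb_dim
    returns: one feature row (len(filters) floats) per window position
    """
    rows = []
    for start in range(len(matrix) - width + 1):
        window = [v for token in matrix[start:start + width] for v in token]
        rows.append([max(0.0, dot(window, f)) for f in filters])
    return rows


def classify(matrix, conv_layers, attn_vector, out_weights):
    # Convolve with every kernel width and stack the resulting feature rows.
    feats = []
    for width, filters in conv_layers:
        feats.extend(conv_over_tokens(matrix, filters, width))
    # Attention: score each feature row, then normalize the scores to weights.
    weights = softmax([dot(row, attn_vector) for row in feats])
    context = [
        sum(w * row[j] for w, row in zip(weights, feats))
        for j in range(len(attn_vector))
    ]
    # Softmax (multinomial logistic regression) over the label types.
    return softmax([dot(context, w) for w in out_weights])


# Toy run with random weights; the shapes are the point, not the prediction.
rng = random.Random(0)
seq_len, emb_dim, n_filters, n_labels = 12, 8, 4, 3
matrix = [[rng.gauss(0, 1) for _ in range(emb_dim)] for _ in range(seq_len)]
conv_layers = [
    (w, [[rng.gauss(0, 0.1) for _ in range(w * emb_dim)] for _ in range(n_filters)])
    for w in (2, 3, 4)
]
attn_vector = [rng.gauss(0, 1) for _ in range(n_filters)]
out_weights = [[rng.gauss(0, 1) for _ in range(n_filters)] for _ in range(n_labels)]
probs = classify(matrix, conv_layers, attn_vector, out_weights)
label = probs.index(max(probs))  # index of the most probable label type
```

The attention step lets the classifier weight window positions unequally instead of max-pooling them away, which is one plausible reading of why the application combines the two models.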
An embodiment of the present application further provides a computer program product capable of implementing all the steps of the privacy-removing text label analysis method of the foregoing embodiments whose execution subject is a server or a client. The computer program/instructions, when executed by a processor, implement the steps of that method; for example, the computer program/instructions implement the following steps:
Step S101: acquire text data of a user, and perform privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data.
Step S102: after converting the text data into a corresponding word vector matrix according to a preset word vector model, perform label type analysis on the word vector matrix according to a neural network text classification model and an attention model, and determine the label type of the text data.
As can be seen from the above description, the computer program product provided in the embodiment of the present application performs label type analysis on the user's text data after privacy removal, using a deep neural network text classification model combined with the attention mechanism of an attention model, so that the label type of the user's text data can be determined accurately and responded to effectively.
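How a negative label might be responded to, as set out in claim 3, can be sketched as a frequency count followed by a table lookup. The word lists, the strategy texts, the label string `"negative"`, and the `send` callback standing in for delivery to an administrator terminal are all hypothetical; the sketch also counts tokens directly rather than deriving a word cloud from the word vector matrix, which is a simplification.

```python
from collections import Counter

# Hypothetical lookup tables; the application does not list negative words
# or coping strategies.
NEGATIVE_WORDS = {"slow", "fee", "queue"}
COPING_STRATEGIES = {
    "slow": "escalate to the channel performance team",
    "fee": "send the fee explanation template",
    "queue": "open additional service windows",
}


def top_negative_word(tokens, top_n=10):
    """Return the most frequent token that is a known negative word, or None."""
    for word, _count in Counter(tokens).most_common(top_n):
        if word in NEGATIVE_WORDS:
            return word
    return None


def handle(tokens, label, send):
    """For a negative label, look up a coping strategy and pass it to `send`."""
    if label != "negative":
        return None
    word = top_negative_word(tokens)
    strategy = COPING_STRATEGIES.get(word)
    if strategy is not None:
        send(word, strategy)  # stand-in for sending to the administrator terminal
    return strategy
```

Passing the delivery step in as a callback keeps the lookup logic testable without a real terminal connection.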
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Specific embodiments have been used herein to explain the principle and implementation of the invention; the description of the embodiments is intended only to aid understanding of the method and its core idea. A person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (10)

1. A method of de-privatized text label analysis, the method comprising:
acquiring text data of a user, and performing privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data;
and after the text data are converted into corresponding word vector matrixes according to a preset word vector model, performing label type analysis on the word vector matrixes according to a neural network text classification model and an attention model, and determining the label types of the text data.
2. The de-privatized text label analysis method of claim 1, wherein after the privacy-removing data cleaning is performed on the text data according to the preset privacy data rule to obtain the cleaned text data, the method further comprises:
performing word segmentation on the text data subjected to privacy-removing data cleaning according to a preset custom dictionary;
and performing specific part-of-speech filtering on the word-segmented text data according to preset stop words, to obtain text data that has undergone word segmentation and specific part-of-speech filtering.
3. The method of de-privacy text tag analysis of claim 1, further comprising, after the determining the tag type to which the text data belongs:
if the type of the tag of the text data is a negative tag type, determining a negative word according to a high-frequency word cloud obtained from the word vector matrix;
and calling a coping strategy corresponding to the negative word and sending the coping strategy to a corresponding administrator terminal.
4. The method of de-privatized text label analysis according to claim 1, wherein the word vector model is pre-derived by a method comprising:
and setting the text data after the emotion label classification as a model training set to perform model training on a preset word vector model to obtain the word vector model after the model training.
5. The de-privatized text label analysis method of claim 4, wherein the emotion label classification comprises:
and carrying out manual emotion label classification on the acquired historical text data of the user, wherein if different emotion labels are respectively added to the same text data, a label with a higher type is defined as the emotion label of the text data.
6. The method of claim 1, wherein the performing label type analysis on the word vector matrix according to a neural network text classification model and an attention model to determine the label type of the text data comprises:
adding a new dimension to the word vector matrix, inputting the word vector matrix into a convolution layer of a preset neural network text classification model for a convolution operation, and performing dimension superposition on the word vector matrix subjected to the convolution operation;
and normalizing the word vector matrix subjected to the dimension superposition according to a preset attention model and a preset logistic regression model, and determining the type of the label of the text data.
7. A de-privatized text label analysis apparatus, comprising:
the privacy-removing processing module is used for acquiring text data of a user, and performing privacy-removing data cleaning on the text data according to a preset privacy data rule to obtain cleaned text data;
and the text sentiment analysis module is used for converting the text data into a corresponding word vector matrix according to a preset word vector model, and then performing label type analysis on the word vector matrix according to the neural network text classification model and the attention model to determine the label type of the text data.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method of de-privacy text label analysis of any one of claims 1 to 6 are implemented when the program is executed by the processor.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for de-privacy text label analysis of any one of claims 1 to 6.
10. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the de-privacy text label analysis method of any one of claims 1 to 6.
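Two of the preprocessing steps named in claims 2 and 5 can be illustrated with a short sketch. It assumes the text has already been segmented and part-of-speech tagged by some external tool, and it reads the conflicting-label rule of claim 5 as "keep the label assigned most often", which is an interpretation rather than something the claim states outright. The stop-word list, the kept parts of speech, and both function names are hypothetical.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "of", "is"}  # hypothetical preset stop words
KEPT_POS = {"n", "v", "adj"}           # hypothetical parts of speech to keep


def filter_tokens(tagged_tokens):
    """Keep words that are not stop words and whose part of speech is kept.

    tagged_tokens: iterable of (word, pos) pairs from a segmenter/tagger.
    """
    return [w for w, pos in tagged_tokens if w not in STOP_WORDS and pos in KEPT_POS]


def resolve_label(labels):
    """Pick the label assigned most often by the annotators.

    Ties go to the label encountered first, following Counter.most_common.
    """
    return Counter(labels).most_common(1)[0][0]
```

For example, `resolve_label(["negative", "positive", "negative"])` returns `"negative"`.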
CN202210247141.8A 2022-03-14 2022-03-14 Privacy-removing text label analysis method and device Pending CN114662452A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210247141.8A CN114662452A (en) 2022-03-14 2022-03-14 Privacy-removing text label analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210247141.8A CN114662452A (en) 2022-03-14 2022-03-14 Privacy-removing text label analysis method and device

Publications (1)

Publication Number Publication Date
CN114662452A true CN114662452A (en) 2022-06-24

Family

ID=82029741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210247141.8A Pending CN114662452A (en) 2022-03-14 2022-03-14 Privacy-removing text label analysis method and device

Country Status (1)

Country Link
CN (1) CN114662452A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116486981A (en) * 2023-06-15 2023-07-25 北京中科江南信息技术股份有限公司 Method for storing health data and method and device for reading health data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116486981A (en) * 2023-06-15 2023-07-25 北京中科江南信息技术股份有限公司 Method for storing health data and method and device for reading health data
CN116486981B (en) * 2023-06-15 2023-10-03 北京中科江南信息技术股份有限公司 Method for storing health data and method and device for reading health data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination