CN116628168A - User personality analysis processing method and system based on big data and cloud platform - Google Patents

User personality analysis processing method and system based on big data and cloud platform

Info

Publication number
CN116628168A
CN116628168A CN202310690368.4A CN202310690368A CN116628168A CN 116628168 A CN116628168 A CN 116628168A CN 202310690368 A CN202310690368 A CN 202310690368A CN 116628168 A CN116628168 A CN 116628168A
Authority
CN
China
Prior art keywords
comment viewpoint
vector
local
text
topic
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310690368.4A
Other languages
Chinese (zh)
Other versions
CN116628168B (en)
Inventor
杨德兵
邹傲臣
龚丹球
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Douyu Technology Co ltd
Original Assignee
Shenzhen Douyu Technology Co ltd
Application filed by Shenzhen Douyu Technology Co ltd
Priority to CN202310690368.4A
Publication of CN116628168A
Application granted
Publication of CN116628168B
Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0463 Neocognitrons
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioethics (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Human Computer Interaction (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a user personality analysis processing method, system and cloud platform based on big data. After the personalized comment viewpoint vectors of the chat room interaction text to be analyzed are obtained, comment viewpoint vectors of target online chat topics with different text keywords are extracted by different task threads, and a comment viewpoint collision operation is performed on the comment viewpoint vectors of the different task threads. This uncovers implicit details shared among the target online chat topics with different text keywords and makes it easier to find the online chat topics in the chat room interaction text that are worth pushing and analyzing. The mining precision and completeness of online chat topics are thereby improved.

Description

User personality analysis processing method and system based on big data and cloud platform
Technical Field
The invention relates to the technical field of big data analysis, in particular to a user personality analysis processing method and system based on big data and a cloud platform.
Background
A social chat website is an internet-based service that allows users to create virtual chat rooms on the internet and thereby connect with other users for multi-user chat. Personalized analysis of users, so that topics of interest can be pushed in real time, is currently a main direction for upgrading social chat websites. However, when conventional techniques mine the topics in a chat room, the precision and completeness of the mined topics are difficult to guarantee, so accurate pushing of topics of interest is difficult to achieve.
Disclosure of Invention
To address the technical problems in the related art, the invention provides a user personality analysis processing method and system based on big data, and a cloud platform.
In a first aspect, an embodiment of the present invention provides a user personality analysis processing method based on big data, applied to a big data analysis cloud platform, the method including: acquiring a chat room interaction text to be analyzed, wherein the chat room interaction text to be analyzed comprises at least one target online chat topic;
performing comment viewpoint vector mining on the chat room interaction text to be analyzed to obtain personalized comment viewpoint vectors of different dimensions;
performing local comment viewpoint vector mining on the personalized comment viewpoint vectors to obtain local comment viewpoint vectors corresponding to a plurality of task threads, wherein the local comment viewpoint vectors are comment viewpoint vectors corresponding to target online chat topics of different text keywords in the chat room interaction text to be analyzed;
performing a comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vectors corresponding to the task threads;
and mining the target online chat topics in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vectors corresponding to the task threads.
Optionally, performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vector corresponding to the task thread includes: acquiring a text feature collision model; and performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vector corresponding to the task thread;
the text feature collision model comprises a text feature collision sub-model and a comment viewpoint vector mining sub-model, and the method for performing comment viewpoint collision operation on local comment viewpoint vectors corresponding to a plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vectors corresponding to the task threads comprises the following steps:
performing comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the task threads to obtain local comment viewpoint aggregation vectors;
performing comment viewpoint vector mining on the local comment viewpoint aggregate vector by using the comment viewpoint vector mining sub-model to obtain a plurality of local comment viewpoint derivative vectors;
carrying out bias coefficient modification on the local comment viewpoint derivative vector by utilizing the text feature collision sub-model to obtain a modified local comment viewpoint derivative vector;
and processing the modified local comment viewpoint derivative vector together with the corresponding local comment viewpoint vector to obtain the local comment viewpoint collision vector corresponding to the task thread.
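The four steps above (aggregate, mine derivative vectors, modify by a bias coefficient, combine with the original local vector) can be sketched as follows. This is only an illustrative reading, not the patented implementation: the text does not specify the sub-models, so the sketch assumes that aggregation is concatenation, that the mining sub-model is a plain linear map (`mining_weights`, hypothetical), that the bias coefficient is a single scalar per task thread (`bias_coeffs`, hypothetical), and that the final "processing" step is an element-wise sum.

```python
def aggregate(local_vectors):
    """Concatenate the per-thread local vectors into one aggregate vector (assumed)."""
    return [value for vec in local_vectors for value in vec]


def derive(aggregate_vec, weights):
    """Map the aggregate vector to one derivative vector with a linear map (assumed)."""
    return [sum(w * a for w, a in zip(row, aggregate_vec)) for row in weights]


def collide(local_vectors, mining_weights, bias_coeffs):
    """Return one collision vector per task thread.

    mining_weights[t] is a hypothetical matrix standing in for the mining
    sub-model; bias_coeffs[t] is a hypothetical scalar standing in for the
    bias coefficient applied by the collision sub-model.
    """
    agg = aggregate(local_vectors)
    collision_vectors = []
    for local, weights, coeff in zip(local_vectors, mining_weights, bias_coeffs):
        derivative = derive(agg, weights)           # derivative vector for this thread
        modified = [coeff * d for d in derivative]  # bias coefficient modification
        # Combine with the thread's own local vector (element-wise sum, assumed).
        collision_vectors.append([l + m for l, m in zip(local, modified)])
    return collision_vectors


local = [[1.0, 2.0], [3.0, 4.0]]
weights = [
    [[1, 0, 0, 0], [0, 0, 0, 1]],  # thread 0 reads positions 0 and 3 of the aggregate
    [[0, 1, 0, 0], [0, 0, 1, 0]],  # thread 1 reads positions 1 and 2 of the aggregate
]
print(collide(local, weights, bias_coeffs=[0.5, 1.0]))
```

Because every derivative vector is computed from the aggregate of all threads, each thread's collision vector carries information from the other threads, which is the cross-thread exchange the description attributes to the collision operation.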
Optionally, the local comment viewpoint vector includes a first comment viewpoint word vector and a second comment viewpoint word vector, the local comment viewpoint aggregate vector includes a first local comment viewpoint aggregate word vector and a second local comment viewpoint aggregate word vector, and performing the comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint aggregate vector includes:
performing the comment viewpoint vector aggregation operation on the first comment viewpoint word vectors corresponding to the plurality of task threads to obtain the first local comment viewpoint aggregate word vector;
and performing the comment viewpoint vector aggregation operation on the second comment viewpoint word vectors corresponding to the plurality of task threads to obtain the second local comment viewpoint aggregate word vector.
Optionally, the method further comprises:
acquiring a first local word vector generation model and a second local word vector generation model corresponding to the task thread;
the local comment viewpoint collision vector corresponding to the task thread includes a local comment viewpoint collision vector corresponding to the first comment viewpoint word vector and a local comment viewpoint collision vector corresponding to the second comment viewpoint word vector, and mining the target online chat topic in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vector corresponding to the task thread includes:
performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the first comment viewpoint word vector by using the first local word vector generation model to obtain text keywords and distribution quantization features of text units in the chat room interaction text to be analyzed;
performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the second comment viewpoint word vector by using the second local word vector generation model to obtain range quantization features of the text units in the chat room interaction text to be analyzed;
and mining the target online chat topics in the chat room interaction text to be analyzed by combining the text keywords, the distribution quantization features and the range quantization features of the text units in the chat room interaction text to be analyzed.
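One possible reading of this two-model step is a pair of prediction heads over the sequence of text units: the first head scores each unit per keyword (the keyword and its distribution value), and the second head predicts how many neighbouring units the topic covers (its range). The sketch below follows that reading with a hand-written 1D convolution. The scalar-per-unit features, the kernels, the threshold and the interpretation of "range" as a half-width in text units are all assumptions made for illustration and are not stated in the source.

```python
def conv1d(sequence, kernel):
    """Same-length 1D convolution (cross-correlation) with zero padding."""
    half = len(kernel) // 2
    out = []
    for i in range(len(sequence)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(sequence):
                acc += weight * sequence[j]
        out.append(acc)
    return out


def mine_topics(first_features, second_features, keyword_kernels, range_kernel, threshold):
    """Return (keyword, start_unit, end_unit, score) tuples for confident text units."""
    # First head: one score sequence per keyword (keyword + distribution value).
    scores = {kw: conv1d(first_features, kernel) for kw, kernel in keyword_kernels.items()}
    # Second head: predicted half-width of the topic around each text unit (assumed).
    ranges = conv1d(second_features, range_kernel)

    topics = []
    last = len(first_features) - 1
    for i in range(len(first_features)):
        keyword = max(scores, key=lambda kw: scores[kw][i])
        score = scores[keyword][i]
        if score >= threshold:
            half_width = max(0, round(ranges[i]))
            topics.append((keyword, max(0, i - half_width), min(last, i + half_width), score))
    return topics


topics = mine_topics(
    first_features=[0.0, 1.0, 0.0, 0.0],
    second_features=[0.0, 1.0, 0.0, 0.0],
    keyword_kernels={"sports": [0.0, 1.0, 0.0], "music": [0.0, 0.2, 0.0]},
    range_kernel=[0.0, 1.0, 0.0],
    threshold=0.5,
)
print(topics)
```

In this toy input only the second text unit passes the threshold, so a single "sports" topic spanning units 0 to 2 is reported.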
Optionally, before the obtaining the interactive text of the chat room to be analyzed, the method further includes:
acquiring a plurality of chat room interaction text sample sets and a universal online chat topic mining network, wherein the chat room interaction text sample sets correspond one-to-one to a plurality of task threads in the universal online chat topic mining network, the authentication annotations of the chat room interaction text samples differ between the chat room interaction text sample sets, and the authentication annotations are used for distinguishing target online chat topics of different text keywords in the chat room interaction text samples;
and debugging the universal online chat topic mining network by using the chat room interaction text samples to obtain an online chat topic mining network.
Optionally, each task thread in the universal online chat topic mining network includes a set first local word vector generation model and a second local word vector generation model, and the debugging is performed on the universal online chat topic mining network by using the chat room interaction text sample to obtain an online chat topic mining network, including:
performing target online chat topic mining on the chat room interactive text sample by using the universal online chat topic mining network to obtain first generated data of the set first local word vector generation model and second generated data of the set second local word vector generation model corresponding to the task thread;
determining a first debugging cost index and a second debugging cost index corresponding to the task thread based on the first generated data, the second generated data and the authentication annotation of the chat room interactive text sample corresponding to the task thread;
determining a debugging cost index corresponding to the task thread by combining a first debugging cost index and a second debugging cost index corresponding to the task thread;
determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads;
and optimizing network variables of the universal online chat topic mining network based on the debugging cost index of the universal online chat topic mining network, returning to the step of performing target online chat topic mining on the chat room interactive text sample by using the universal online chat topic mining network, and repeating the loop until the debugging cost index of the universal online chat topic mining network meets the set requirement, thereby obtaining the online chat topic mining network.
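The debugging procedure above is an iterative optimisation loop: compute two cost indexes per task thread, combine them per thread and then across threads, update the network variables, and repeat until the overall cost meets the set requirement. A minimal sketch of that control flow is given below. Everything concrete in it is an assumption for illustration: the "network" is a single shared weight, both generation models predict `weight * x`, the cost indexes are mean squared errors, the combination is a plain sum, and the update is ordinary gradient descent.

```python
def thread_costs(weight, samples):
    """Return (first_cost, second_cost) for one task thread.

    Each sample is (x, first_annotation, second_annotation); both toy
    generation models predict weight * x, which is an assumption.
    """
    n = len(samples)
    first_cost = sum((weight * x - y1) ** 2 for x, y1, _ in samples) / n
    second_cost = sum((weight * x - y2) ** 2 for x, _, y2 in samples) / n
    return first_cost, second_cost


def network_cost(weight, sample_sets):
    """Sum the per-thread costs (first + second) over all task threads."""
    return sum(sum(thread_costs(weight, samples)) for samples in sample_sets)


def cost_gradient(weight, sample_sets):
    """Analytic derivative of network_cost with respect to the shared weight."""
    grad = 0.0
    for samples in sample_sets:
        n = len(samples)
        for x, y1, y2 in samples:
            grad += 2 * x * (weight * x - y1) / n + 2 * x * (weight * x - y2) / n
    return grad


def debug_network(weight, sample_sets, threshold=1e-6, step=0.01, max_loops=10_000):
    """Repeat mining, costing and updating until the cost meets the set requirement."""
    for _ in range(max_loops):
        if network_cost(weight, sample_sets) <= threshold:
            break
        weight -= step * cost_gradient(weight, sample_sets)
    return weight, network_cost(weight, sample_sets)


# One sample set per task thread, as in the one-to-one correspondence above.
sample_sets = [[(1.0, 3.0, 3.0), (2.0, 6.0, 6.0)], [(1.0, 3.0, 3.0)]]
weight, cost = debug_network(0.0, sample_sets)
print(weight, cost)
```

The `max_loops` cap is an added safeguard so the sketch terminates even if the threshold is never reached; the source only describes looping until the set requirement is met.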
Optionally, the determining, in combination with the debug cost indicator corresponding to the task thread, the debug cost indicator of the universal online chat topic mining network includes:
determining a previous debugging cost index mean value corresponding to the task thread for a current dynamic loop operator, wherein the current dynamic loop operator represents the x-th loop through the y-th loop;
determining a previous debugging cost index mean value corresponding to the task thread for a previous dynamic loop operator, wherein the previous dynamic loop operator represents the (x-z)-th loop through the (y-z)-th loop, and z, x and y are positive integers;
determining a debugging cost gradient corresponding to the task thread by combining the previous debugging cost index mean value corresponding to the task thread for the previous dynamic loop operator and the previous debugging cost index mean value corresponding to the task thread for the current dynamic loop operator;
determining the debug cost confidence corresponding to the task thread based on the debug cost gradient corresponding to the task thread;
and determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost confidence corresponding to the task thread and the debugging cost index corresponding to the task thread.
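The weighting scheme above compares each task thread's mean cost over a current window of loops (x to y) with its mean over an earlier window (x-z to y-z), turns that comparison into a per-thread confidence, and uses the confidences to weight the per-thread costs. The source does not give the formulas, so the sketch below makes two labeled assumptions: the "gradient" is the ratio of the current mean to the previous mean, and the confidences are a softmax over those ratios, so threads whose cost is falling more slowly receive more weight.

```python
import math


def window_mean(history, start, end):
    """Mean of the recorded costs for loops start..end (1-based, inclusive)."""
    window = history[start - 1:end]
    return sum(window) / len(window)


def weighted_network_cost(thread_histories, thread_costs, x, y, z):
    """Return (network_cost, confidences) for the current loop.

    thread_histories[t] holds the past costs of thread t, one entry per loop.
    Costs are assumed to be strictly positive so the ratio is well defined.
    """
    if x - z < 1 or y < x:
        raise ValueError("windows x..y and x-z..y-z must lie inside the history")

    gradients = []
    for history in thread_histories:
        current = window_mean(history, x, y)
        previous = window_mean(history, x - z, y - z)
        gradients.append(current / previous)  # assumed definition of the cost gradient

    # Assumed mapping from gradients to confidences: softmax.
    exps = [math.exp(g) for g in gradients]
    total = sum(exps)
    confidences = [e / total for e in exps]

    cost = sum(c * tc for c, tc in zip(confidences, thread_costs))
    return cost, confidences


histories = [[4.0, 3.0, 2.0, 1.0], [4.0, 4.0, 4.0, 4.0]]
cost, confidences = weighted_network_cost(histories, thread_costs=[1.0, 4.0], x=3, y=4, z=2)
print(confidences, cost)
```

Here the second thread's cost has stalled while the first thread's cost keeps falling, so under these assumptions the second thread receives the larger confidence and contributes more to the network cost.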
Optionally, the method further comprises:
determining topic preference labels of the target user terminals according to the target online chat topics;
and carrying out topic pushing processing on the target user terminal by utilizing the topic preference label.
Optionally, the performing topic pushing processing on the target user terminal by using the topic preference tag includes:
selecting an online chat topic to be pushed from a preset topic pool by utilizing the topic preference label;
collecting topic text information corresponding to the online chat topic to be pushed;
if the topic text information carries a data desensitization instruction, performing data desensitization processing on the topic text information to obtain desensitized text information;
and pushing the online chat topic to be pushed, in association with the desensitized text information, to the target user terminal.
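The pushing steps above (select a topic by preference label, collect its text, desensitize it when instructed, and push topic and text together) can be sketched as follows. The record layout of the topic pool, the `desensitize` flag and the masking rules are hypothetical, since the source does not say what desensitization consists of; masking 11-digit phone numbers and email addresses is used here only as a familiar example.

```python
import re

# Hypothetical masking rules; the source does not define what counts as sensitive.
PHONE = re.compile(r"\b\d{11}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def desensitize(text):
    """Mask the middle digits of phone numbers and replace email addresses."""
    text = PHONE.sub(lambda m: m.group()[:3] + "****" + m.group()[-4:], text)
    return EMAIL.sub("[email]", text)


def build_push(preference_label, topic_pool):
    """Pick the first topic matching the label and return the payload to push."""
    for topic in topic_pool:
        if preference_label in topic["labels"]:
            text = topic["text"]
            # Desensitize only when the topic text carries the instruction.
            if topic.get("desensitize", False):
                text = desensitize(text)
            return {"topic": topic["name"], "text": text}
    return None  # no topic in the preset pool matches the preference label


pool = [
    {
        "name": "Weekend hiking",
        "labels": {"outdoors"},
        "text": "Call 13812345678 or mail a.b@example.com",
        "desensitize": True,
    }
]
print(build_push("outdoors", pool))
```

Returning the topic name and the desensitized text in one payload mirrors the "associated" push described above; how the payload reaches the user terminal is outside the scope of this sketch.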
In a second aspect, the invention also provides a user personality analysis processing system based on big data, which comprises a big data analysis cloud platform and a user terminal that communicate with each other;
the big data analysis cloud platform is used for:
acquiring a chat room interaction text to be analyzed, wherein the chat room interaction text to be analyzed comprises at least one target online chat topic;
performing comment viewpoint vector mining on the chat room interaction text to be analyzed to obtain personalized comment viewpoint vectors of different dimensions;
performing local comment viewpoint vector mining on the personalized comment viewpoint vectors to obtain local comment viewpoint vectors corresponding to a plurality of task threads, wherein the local comment viewpoint vectors are comment viewpoint vectors corresponding to target online chat topics of different text keywords in the chat room interaction text to be analyzed;
performing a comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vectors corresponding to the task threads;
and mining the target online chat topics in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vectors corresponding to the task threads.
Optionally, the big data analysis cloud platform is further configured to: determining topic preference labels of the target user terminals according to the target online chat topics; performing topic pushing processing on the target user terminal by utilizing the topic preference tag;
performing topic pushing processing on the target user terminal by using the topic preference label includes: selecting an online chat topic to be pushed from a preset topic pool by using the topic preference label; collecting topic text information corresponding to the online chat topic to be pushed; if the topic text information carries a data desensitization instruction, performing data desensitization processing on the topic text information to obtain desensitized text information; and pushing the online chat topic to be pushed, in association with the desensitized text information, to the target user terminal.
Optionally, performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vector corresponding to the task thread includes: acquiring a text feature collision model; and performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vector corresponding to the task thread; the text feature collision model comprises a text feature collision sub-model and a comment viewpoint vector mining sub-model, and performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vector corresponding to the task thread includes: performing a comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain a local comment viewpoint aggregate vector; performing comment viewpoint vector mining on the local comment viewpoint aggregate vector by using the comment viewpoint vector mining sub-model to obtain a plurality of local comment viewpoint derivative vectors; performing bias coefficient modification on the local comment viewpoint derivative vector by using the text feature collision sub-model to obtain a modified local comment viewpoint derivative vector; and processing the modified local comment viewpoint derivative vector together with the corresponding local comment viewpoint vector to obtain the local comment viewpoint collision vector corresponding to the task thread;
the local comment viewpoint vector includes a first comment viewpoint word vector and a second comment viewpoint word vector, the local comment viewpoint aggregate vector includes a first local comment viewpoint aggregate word vector and a second local comment viewpoint aggregate word vector, and performing the comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint aggregate vector includes: performing the comment viewpoint vector aggregation operation on the first comment viewpoint word vectors corresponding to the plurality of task threads to obtain the first local comment viewpoint aggregate word vector; and performing the comment viewpoint vector aggregation operation on the second comment viewpoint word vectors corresponding to the plurality of task threads to obtain the second local comment viewpoint aggregate word vector;
wherein the big data analysis cloud platform is further configured to: acquire a first local word vector generation model and a second local word vector generation model corresponding to the task thread; the local comment viewpoint collision vector corresponding to the task thread includes a local comment viewpoint collision vector corresponding to the first comment viewpoint word vector and a local comment viewpoint collision vector corresponding to the second comment viewpoint word vector, and mining the target online chat topic in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vector corresponding to the task thread includes: performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the first comment viewpoint word vector by using the first local word vector generation model to obtain text keywords and distribution quantization features of text units in the chat room interaction text to be analyzed; performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the second comment viewpoint word vector by using the second local word vector generation model to obtain range quantization features of the text units in the chat room interaction text to be analyzed; and mining the target online chat topics in the chat room interaction text to be analyzed by combining the text keywords, the distribution quantization features and the range quantization features of the text units in the chat room interaction text to be analyzed.
Optionally, before acquiring the chat room interaction text to be analyzed, the big data analysis cloud platform is further configured to: acquire a plurality of chat room interaction text sample sets and a universal online chat topic mining network, wherein the chat room interaction text sample sets correspond one-to-one to a plurality of task threads in the universal online chat topic mining network, the authentication annotations of the chat room interaction text samples differ between the chat room interaction text sample sets, and the authentication annotations are used for distinguishing target online chat topics of different text keywords in the chat room interaction text samples; and debug the universal online chat topic mining network by using the chat room interaction text samples to obtain an online chat topic mining network;
each task thread in the universal online chat topic mining network comprises a set first local word vector generation model and a set second local word vector generation model, and debugging the universal online chat topic mining network by using the chat room interaction text sample to obtain the online chat topic mining network includes: performing target online chat topic mining on the chat room interactive text sample by using the universal online chat topic mining network to obtain first generated data of the set first local word vector generation model and second generated data of the set second local word vector generation model corresponding to the task thread; determining a first debugging cost index and a second debugging cost index corresponding to the task thread based on the first generated data, the second generated data and the authentication annotation of the chat room interactive text sample corresponding to the task thread; determining a debugging cost index corresponding to the task thread by combining the first debugging cost index and the second debugging cost index corresponding to the task thread; determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads; and optimizing network variables of the universal online chat topic mining network based on the debugging cost index of the universal online chat topic mining network, returning to the step of performing target online chat topic mining on the chat room interactive text sample by using the universal online chat topic mining network, and repeating the loop until the debugging cost index of the universal online chat topic mining network meets the set requirement, thereby obtaining the online chat topic mining network;
determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost index corresponding to the task thread includes: determining a previous debugging cost index mean value corresponding to the task thread for a current dynamic loop operator, wherein the current dynamic loop operator represents the x-th loop through the y-th loop; determining a previous debugging cost index mean value corresponding to the task thread for a previous dynamic loop operator, wherein the previous dynamic loop operator represents the (x-z)-th loop through the (y-z)-th loop, and z, x and y are positive integers; determining a debugging cost gradient corresponding to the task thread by combining the previous debugging cost index mean value corresponding to the task thread for the previous dynamic loop operator and the previous debugging cost index mean value corresponding to the task thread for the current dynamic loop operator; determining the debugging cost confidence corresponding to the task thread based on the debugging cost gradient corresponding to the task thread; and determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost confidence corresponding to the task thread and the debugging cost index corresponding to the task thread.
In a third aspect, the invention also provides a big data analysis cloud platform, which comprises a processor and a memory; the processor is in communication with the memory, and the processor is configured to read and execute a computer program from the memory to implement the method described above.
In a fourth aspect, the present invention also provides a computer readable storage medium having stored thereon a program which when executed by a processor implements the method described above.
In the embodiment of the invention, after the personalized comment viewpoint vectors of the chat room interaction text to be analyzed are obtained, comment viewpoint vectors of target online chat topics with different text keywords are extracted by different task threads, and a comment viewpoint collision operation is performed on the comment viewpoint vectors of the different task threads, so that hidden details among the target online chat topics with different text keywords are uncovered and the online chat topics worth pushing and analyzing can be found in the chat room interaction text to be analyzed. In this way, the mining precision and completeness of the online chat topics are improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a flow chart of a user personality analysis processing method based on big data according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiment provided by the embodiment of the invention can be executed in a big data analysis cloud platform, computer equipment or similar computing devices. Taking the example of running on a big data analysis cloud platform, the big data analysis cloud platform may include one or more processors (the processors may include, but are not limited to, a microprocessor MCU, a programmable logic device FPGA, or the like) and a memory for storing data, and optionally, the big data analysis cloud platform may further include a transmission device for communication functions. It will be appreciated by those of ordinary skill in the art that the above-described structure is merely illustrative, and is not intended to limit the structure of the big data analysis cloud platform. For example, the big data analysis cloud platform may also include more or fewer components than shown above, or have a different configuration than shown above.
The memory may be used to store a computer program, for example, a software program of application software and a module, for example, a computer program corresponding to a user personality analysis processing method based on big data in the embodiment of the present invention, and the processor executes the computer program stored in the memory to perform various functional applications and data processing, that is, implement the method described above. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, the memory may further include memory remotely located with respect to the processor, the remote memory being connectable to the big data analysis cloud platform through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a big data analysis cloud platform. In one example, the transmission means comprises a network adapter (Network Interface Controller, simply referred to as NIC) that can be connected to other network devices via a base station to communicate with the internet. In one example, the transmission device may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
Based on this, referring to fig. 1, fig. 1 is a flowchart of a user personality analysis processing method based on big data according to an embodiment of the present invention, where the method is applied to a big data analysis cloud platform, and further may include steps 110 to 150.
Step 110, obtaining interactive text of the chat room to be analyzed.
In the embodiment of the invention, the interactive text of the chat room to be analyzed comprises at least one target online chat topic. Further, the chat room interaction text to be analyzed may be user interaction text information corresponding to a multi-person chat room of any social platform, including but not limited to text communication records, expression communication records, voice communication records, and the like of the user. Still further, the target online chat topics may be online chat topics of different types; for example, a target online chat topic may be "overseas online shopping", "VR game", "sports competition", or "ChatGPT", etc.
Step 120, mining comment viewpoint vectors from the interactive text of the chat room to be analyzed to obtain personalized comment viewpoint vectors with different dimensions.
The comment viewpoint vector involved in the embodiment of the invention can be a comment viewpoint vector corresponding to a certain text area in the interactive text of the chat room to be analyzed, and can be recorded in a comment viewpoint vector matrix or a bag-of-words vector set. The dimension of the personalized comment viewpoint vector may be understood as a feature scale, and the personalized comment viewpoint vector may be understood as a preamble comment viewpoint vector or a core comment viewpoint vector.
Under some design ideas, a big data analysis cloud platform (big data analysis server) can acquire a trunk model and a ladder model.
The trunk model is used to mine comment viewpoint vectors from the interactive text of the chat room to be analyzed, yielding a plurality of trunk comment viewpoint vectors. The trunk model may be a recurrent neural network, a full convolution model, or the like. Under some design ideas, the trunk model performs vector extraction on the interactive text of the chat room to be analyzed to obtain the plurality of trunk comment viewpoint vectors.
The ladder model is then used to mine comment viewpoint vectors from the trunk comment viewpoint vectors, yielding personalized comment viewpoint vectors with different dimensions, for example personalized comment viewpoint vectors with 5 different dimensions. Encoding by the ladder model reduces the comment viewpoint vector matrix: the greater the reduction, the higher the level of the resulting comment viewpoint vector matrix and the lower its level of anti-regularization. A higher-level comment viewpoint vector matrix is biased toward classification features rather than content features, whereas a lower-level comment viewpoint vector matrix contains more content features, so that the classification features of the comment viewpoint vector matrix of each dimension can be enhanced.
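As a non-limiting illustration, the progressive reduction performed by the ladder model can be sketched as follows. The embodiment does not specify how the reduction is carried out; the stride-2 averaging, the function names `halve` and `ladder`, and the toy input are assumptions made only for this sketch.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]  # one row per text position


def halve(matrix: Matrix) -> Matrix:
    """Average adjacent row pairs (stride 2); a trailing odd row is kept as-is."""
    out: Matrix = []
    for i in range(0, len(matrix), 2):
        pair = matrix[i:i + 2]
        out.append([sum(col) / len(pair) for col in zip(*pair)])
    return out


def ladder(trunk: Matrix, levels: int = 5) -> List[Matrix]:
    """Return `levels` matrices, each roughly half as long as the previous one."""
    pyramid = [trunk]
    while len(pyramid) < levels:
        pyramid.append(halve(pyramid[-1]))
    return pyramid


# Toy trunk output (assumed): 16 text positions, 2 channels each.
trunk = [[float(i), float(2 * i)] for i in range(16)]
pyramid = ladder(trunk)
sizes = [len(level) for level in pyramid]  # lengths of the 5 levels
```

Each successive level covers the same text with fewer rows, which mirrors the statement that a more strongly reduced matrix sits at a higher level.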
Step 130, performing local comment viewpoint vector mining on the personalized comment viewpoint vectors to obtain local comment viewpoint vectors corresponding to a plurality of task threads, wherein the local comment viewpoint vectors are comment viewpoint vectors corresponding to target online chat topics of different text keywords in the interactive text of the chat room to be analyzed.
Different task threads can be used for mining target online chat topics of different text keywords in the interactive text of the chat room to be analyzed. Task threads can be understood as model branches, so local comment viewpoint vectors can be understood as branch comment viewpoint vectors.
The big data analysis cloud platform can acquire a multi-task processing model, and the multi-task processing model is utilized to conduct local comment viewpoint vector mining on personalized comment viewpoint vectors, so that local comment viewpoint vectors corresponding to a plurality of task threads are obtained. Wherein the multitasking model may be a deep learning model. The number of task threads corresponds to text keywords of a target online chat topic that can be mined, for example, if two text keywords (which can be understood as topic categories) can be mined, the number of task threads can be 2. The text keywords may also include a plurality of subordinate text keywords. For example, during debugging, the text keywords of the online chat topics corresponding to the authentication notes in each chat room interaction text sample set can be used as the same text keywords.
Under some design considerations, the task threads include a first task thread and a second task thread, and the multitasking model includes a first multitasking model and a second multitasking model. Then, utilizing the multi-task processing model to perform local comment viewpoint vector mining on the personalized comment viewpoint vector, and obtaining the local comment viewpoint vector corresponding to the plurality of task threads may include: carrying out local comment viewpoint vector mining on the personalized comment viewpoint vector by using a first multitasking model to obtain a first local comment viewpoint vector corresponding to the first task thread; and carrying out local comment viewpoint vector mining on the personalized comment viewpoint vector by using a second multitasking model to obtain a second local comment viewpoint vector corresponding to the second task thread.
Under some design considerations, each task thread may further include a first local task thread and a second local task thread, and the multitasking model may include two moving average layers, one for each local task thread.
Carrying out local comment viewpoint vector mining on the personalized comment viewpoint vector by utilizing a moving average layer corresponding to the first local task thread to obtain a corresponding first comment viewpoint word vector; and carrying out local comment viewpoint vector mining on the personalized comment viewpoint vector by utilizing a moving average layer corresponding to the second local task thread to obtain a corresponding second comment viewpoint word vector.
Step 140, performing a comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vectors corresponding to the task threads.
The big data analysis cloud platform can acquire a text characteristic collision model; and performing comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vectors corresponding to the task threads.
The text feature collision model is used for performing feature interaction processing, and the local comment viewpoint collision vector can be understood as the interaction feature of the task thread. According to the embodiment of the invention, the text characteristic collision model is introduced among different task threads, so that implicit details of the cross-task are provided for each task thread, and the deep connection among the target online chat topics of different text keywords can be conveniently discovered.
Under some design ideas, the text feature collision model may include a text feature collision sub-model and a comment viewpoint vector mining sub-model, and performing comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model may include the following sub-steps.
Step one, performing a comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the task threads to obtain a local comment viewpoint aggregate vector.
In this way, implicit classification features between local comment viewpoint vectors can be aggregated. The aggregated thought is not limited, and for example, the local comment viewpoint aggregated vector can be obtained by stitching a plurality of local comment viewpoint vectors.
Step two, performing comment viewpoint vector mining on the local comment viewpoint aggregate vector by using the comment viewpoint vector mining sub-model to obtain a plurality of local comment viewpoint derivative vectors. The comment viewpoint vector mining sub-model may be two moving average layers with a moving average operator of 3; that is, a vector convolution operation with a moving average operator of 3 is performed twice on the local comment viewpoint aggregate vector, generating for each task thread a local comment viewpoint derivative vector that contains implicit classification features.
Step three, performing bias coefficient modification on the local comment viewpoint derivative vector by using the text feature collision sub-model to obtain a modified local comment viewpoint derivative vector. The text feature collision sub-model may be a local focusing unit of a moving average block, or the like. The text feature collision sub-model thus gives more weight to the relatively important details in the local comment viewpoint derivative vector and weakens the less important ones, so that the modified local comment viewpoint derivative vector carries the core content of the relevant target online chat topics. The local comment viewpoint derivative vector can be understood as a newly added local comment viewpoint vector.
Step four, processing the modified local comment viewpoint derivative vector together with the corresponding local comment viewpoint vector to obtain the local comment viewpoint collision vector corresponding to the task thread.
Therefore, the local comment viewpoint collision vector generated by the text feature collision sub-model and the initial local comment viewpoint vector are processed, so that the content details of the target online chat topic can be further enhanced.
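As a non-limiting illustration, the four sub-steps above can be sketched on plain lists of numbers. Several choices here are assumptions that the embodiment leaves open: aggregation is done by stitching the vectors position by position, the two window-3 layers use fixed uniform weights instead of learned ones, the bias coefficient modification is replaced by a simple sigmoid gate, and the final processing is an element-wise addition. Because the weights are fixed, this sketch produces one shared derivative vector, whereas a trained sub-model would produce a distinct one per task thread. The names `moving_average3` and `collide` are hypothetical.

```python
import math
from typing import Dict, List

Seq = List[float]


def moving_average3(seq: Seq) -> Seq:
    """Window-3 moving average; edges are padded by repetition to keep the length."""
    if not seq:
        return []
    padded = [seq[0]] + seq + [seq[-1]]
    return [sum(padded[i:i + 3]) / 3.0 for i in range(len(seq))]


def collide(local: Dict[str, Seq]) -> Dict[str, Seq]:
    """Return one collision vector per task thread (illustrative arithmetic only)."""
    threads = sorted(local)
    length = len(local[threads[0]])
    # Step one: stitch the per-thread vectors together, position by position.
    aggregated = [[local[t][i] for t in threads] for i in range(length)]
    # Step two: mix the stitched channels, then smooth twice with a window of 3.
    mixed = [sum(row) / len(row) for row in aggregated]
    derived = moving_average3(moving_average3(mixed))
    # Step three: re-weight with a sigmoid gate (assumed stand-in for the focusing unit).
    gated = [d / (1.0 + math.exp(-d)) for d in derived]
    # Step four: combine with each thread's own vector (element-wise addition assumed).
    return {t: [a + g for a, g in zip(local[t], gated)] for t in threads}


# Two task threads with toy local comment viewpoint vectors of length 3.
result = collide({"first": [1.0, 2.0, 3.0], "second": [3.0, 2.0, 1.0]})
```

In this sketch every task thread keeps its own vector and receives the same cross-thread correction, which is the simplest way to show information from one task thread reaching another.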
Under some design ideas, each task thread may further include a first local task thread and a second local task thread, the local comment viewpoint vector corresponding to each task thread includes a first comment viewpoint word vector and a second comment viewpoint word vector, and the corresponding local comment viewpoint aggregate vector includes a first local comment viewpoint aggregate word vector and a second local comment viewpoint aggregate word vector. Therefore, a comment viewpoint vector aggregation operation can be performed on the first comment viewpoint word vectors corresponding to the plurality of task threads to obtain the first local comment viewpoint aggregate word vector, and a comment viewpoint vector aggregation operation can be performed on the second comment viewpoint word vectors corresponding to the plurality of task threads to obtain the second local comment viewpoint aggregate word vector.
Then for each task thread: utilizing the corresponding comment viewpoint vector mining sub-model to carry out comment viewpoint vector mining on the first local comment viewpoint aggregate word vector to obtain a plurality of new first comment viewpoint word vectors; and excavating comment viewpoint vectors of the second local comment viewpoint aggregate word vectors by utilizing the corresponding comment viewpoint vector excavating sub-model to obtain a plurality of new second comment viewpoint word vectors.
Carrying out bias coefficient modification on the new first comment viewpoint word vector by utilizing the corresponding text feature collision sub-model to obtain a modified new first comment viewpoint word vector; and carrying out bias coefficient modification on the new second comment viewpoint word vector by using the corresponding text feature collision submodel to obtain a modified new second comment viewpoint word vector.
Processing the modified new first comment viewpoint word vector and the corresponding first comment viewpoint word vector to obtain a local comment viewpoint collision vector corresponding to the first local task thread; and processing the modified new second comment viewpoint word vector and the corresponding second comment viewpoint word vector to obtain a local comment viewpoint collision vector corresponding to the second local task thread.
For example, the processed local comment viewpoint collision vector can be subjected to numerical mapping processing through a numerical mapping sub-network (normalization model) for subsequent target online chat topic mining.
Steps 130 and 140 above are described using the task threads corresponding to the personalized comment viewpoint vector of one dimension as an example. It can be understood that steps 130 and 140 can likewise be executed on the task threads corresponding to the personalized comment viewpoint vectors of the other dimensions, to obtain the local comment viewpoint collision vector corresponding to each personalized comment viewpoint vector.
Step 150, mining the target online chat topic in the interactive text of the chat room to be analyzed based on the local comment viewpoint collision vector corresponding to the task thread.
According to the embodiment of the invention, the target online chat topics in the chat room interaction text to be analyzed can be mined in combination with the generated data of each task thread corresponding to each personalized comment viewpoint vector. The generated data corresponding to each task thread can be the text keywords, the distribution quantization characteristic and the range quantization characteristic of the text units in the interactive text of the chat room to be analyzed corresponding to each task thread. The output mode of mining out the target online chat topics is not limited, for example, the target online chat topics can be highlighted by using a text window, and text keywords and the like can be marked out.
For each task thread, the first local word vector generation model and the second local word vector generation model corresponding to the task thread can be obtained. A vector convolution operation is performed on the local comment viewpoint collision vector corresponding to the first comment viewpoint word vector by using the first local word vector generation model, to obtain the text keywords and the distribution quantization features (position area weights of the text units) of the text units in the interactive text of the chat room to be analyzed. A vector convolution operation is performed on the local comment viewpoint collision vector corresponding to the second comment viewpoint word vector by using the second local word vector generation model, to obtain the range quantization features of the text units in the interactive text of the chat room to be analyzed.
The text keywords, the distribution quantization features and the range quantization features of the text units in the interactive text of the chat room to be analyzed can be obtained by generating a relation between the position areas of the text units in the local comment viewpoint collision vector and the position areas of the text units in the interactive text of the chat room to be analyzed. For example, highlighting the target online chat topic with a text window, wherein text keywords of a text unit can represent text keywords of the text unit corresponding to the target online chat topic; the distribution quantization feature is the correlation of each text unit and the target reference unit (the reference unit of the text window), and can be used for realizing noise suppression processing; the range quantization feature may be a positional correlation of a text unit within the text window with the text window.
The target online chat topics in the interactive text of the chat room to be analyzed are then mined based on the text keywords, the distribution quantization features and the range quantization features of the text units in the interactive text of the chat room to be analyzed. In other words, by combining the text keywords, the distribution quantization features and the range quantization features of the text units for every task thread corresponding to every personalized comment viewpoint vector, all the target online chat topics with push analysis value in the interactive text of the chat room to be analyzed can be mined.
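As a non-limiting illustration, one way to turn the three per-unit outputs into text windows is sketched below. The embodiment does not give a decoding rule, so the following are assumptions: the distribution quantization feature is treated as a weight in [0, 1] that multiplies the keyword confidence (to suppress text units far from the reference unit), the range quantization feature is treated as a pair of offsets to the window boundaries, and a fixed threshold selects the windows. The names `UnitOutput` and `decode` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UnitOutput:
    keyword: str          # best-scoring text keyword for this text unit
    keyword_score: float  # confidence of that keyword, in [0, 1]
    centrality: float     # distribution quantization feature, in [0, 1]
    left: int             # range feature: units back to the window start
    right: int            # range feature: units forward to the window end


Window = Tuple[int, int, str, float]  # (start, end, keyword, score)


def decode(units: List[UnitOutput], threshold: float = 0.5) -> List[Window]:
    """Keep windows whose keyword score, damped by centrality, passes the threshold."""
    windows: List[Window] = []
    for index, unit in enumerate(units):
        score = unit.keyword_score * unit.centrality
        if score >= threshold:
            windows.append((index - unit.left, index + unit.right, unit.keyword, score))
    return windows


# Three text units that all point at the same window; only the central one survives.
units = [
    UnitOutput("VR game", 0.9, 0.2, 0, 2),
    UnitOutput("VR game", 0.9, 0.9, 1, 1),
    UnitOutput("VR game", 0.8, 0.3, 2, 0),
]
windows = decode(units)
```

Damping by the distribution quantization feature is what removes the two off-center text units here, which corresponds to the noise suppression role described above.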
Under some design ideas, before the step of obtaining the interactive text of the chat room to be analyzed, the big data analysis cloud platform can also debug the universal online chat topic mining network to obtain the online chat topic mining network. The online chat topic mining network may include, but is not limited to, all of the models referred to in steps 110-150 above; that is, the online chat topic mining network may include a backbone model, a ladder model, a multitasking model, a text feature collision model, a task generation model (a first local word vector generation model and a second local word vector generation model), and so on.
Further, the specific implementation thought of debugging the universal online chat topic mining network is not limited, and the method can comprise the following steps.
STEP10, obtaining a plurality of chat room interaction text sample sets and a universal online chat topic mining network.
The universal online chat topic mining network is an online chat topic mining network which is not debugged yet. The chat room interaction text sample sets are in one-to-one correspondence with task threads in the universal online chat topic mining network, authentication notes of the chat room interaction text sample sets are different, and the authentication notes are used for distinguishing target online chat topics of different text keywords in the chat room interaction text sample sets. Debugging a task thread to obtain a model may also be referred to as a processing cycle, where each task thread corresponds to a processing cycle, so that the processing cycle corresponds to a chat room interaction text sample set one by one. Further, a sample may be understood as a sample for performing network training.
Under some design considerations, the number of chat room interaction text sample sets is 2 (i.e., the number of task threads is 2). Online chat topics with push analysis value are difficult to mine where authentication annotations are missing; the online chat topic mining network addresses this by mining all target online chat topics even where authentication annotations are missing.
According to the embodiment of the invention, the chat room interaction text sample set can be split into the chat room interaction text sample set sample1 and the chat room interaction text sample set sample2, wherein the sample1 corresponds to a first task thread and the sample2 corresponds to a second task thread. The authentication annotation of the chat room interaction text sample in sample1 only matches the target online chat topic of the first text keyword, but the chat room interaction text sample comprises the target online chat topic of the second text keyword; the first text keywords may include 32 subordinate text keywords. The authentication notes of the chat room interaction text sample in sample2 only match the target online chat topics of the second text keywords, but include the target online chat topics of the first text keywords; the second text keywords may include 48 subordinate text keywords. The target online chat topics corresponding to the first text keywords and the second text keywords are not repeated.
STEP20, debugging the universal online chat topic mining network by using the chat room interaction text samples to obtain the online chat topic mining network.
Under some design considerations, debugging a generic online chat topic mining network using chat room interaction text samples may include, but is not limited to, the following steps.
1) Performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network to obtain first generated data of the set first local word vector generation model and second generated data of the set second local word vector generation model corresponding to each task thread.
Under some design ideas, network variables can be flexibly set during debugging.
The trunk model is used to mine comment viewpoint vectors from the chat room interaction text samples to obtain trunk comment viewpoint vectors of the chat room interaction text samples, and the ladder model is used to mine comment viewpoint vectors from the plurality of trunk comment viewpoint vectors to obtain personalized comment viewpoint vectors of different dimensions for the plurality of chat room interaction text samples. The AI network model corresponding to the personalized comment viewpoint vector of each dimension is then debugged. Taking one AI network model as an example, for each task thread in the AI network model: local comment viewpoint vector mining is performed on the personalized comment viewpoint vector by using the moving average layer corresponding to the first local task thread to obtain a corresponding first comment viewpoint word vector, and local comment viewpoint vector mining is performed on the personalized comment viewpoint vector by using the moving average layer corresponding to the second local task thread to obtain a corresponding second comment viewpoint word vector.
In the corresponding text feature collision model, a comment viewpoint vector aggregation operation can be performed on the first comment viewpoint word vectors corresponding to the two task threads to obtain a first local comment viewpoint aggregate word vector, and a comment viewpoint vector aggregation operation can be performed on the second comment viewpoint word vectors corresponding to the two task threads to obtain a second local comment viewpoint aggregate word vector. Thus, implicit classification features between comment viewpoint vectors of different target online chat topics in different sample sets can be aggregated.
In the corresponding text feature collision model, the comment viewpoint vector mining sub-model can be used to mine comment viewpoint vectors from the first local comment viewpoint aggregate word vector, so as to obtain a new first comment viewpoint word vector corresponding to the first task thread and a new first comment viewpoint word vector corresponding to the second task thread; and the corresponding comment viewpoint vector mining sub-model can be used to mine comment viewpoint vectors from the second local comment viewpoint aggregate word vector, so as to obtain a new second comment viewpoint word vector corresponding to the first task thread and a new second comment viewpoint word vector corresponding to the second task thread. In this way, a local comment viewpoint derivative vector containing implicit classification features may be generated for each task thread.
Then for each task thread: the bias coefficient of the new first comment viewpoint word vector can be modified by utilizing the corresponding text feature collision submodel, so that the modified new first comment viewpoint word vector is obtained; and carrying out bias coefficient modification on the new second comment viewpoint word vector by using the corresponding text feature collision submodel to obtain a modified new second comment viewpoint word vector. The text feature collision sub-model can be more biased to relatively important details in the local comment viewpoint derivative vector, and weakens details which are not relatively important, so that the modified local comment viewpoint derivative vector has core contents of related target online chat topics.
The modified new first comment viewpoint word vector and the corresponding first comment viewpoint word vector are processed to obtain a local comment viewpoint collision vector corresponding to the first local task thread, and the modified new second comment viewpoint word vector and the corresponding second comment viewpoint word vector are processed to obtain a local comment viewpoint collision vector corresponding to the second local task thread. This enhances the characterization capability of the classification features while weakening noisy comment viewpoint vectors.
Under some design ideas, local comment viewpoint collision vectors can be selected based on distinguishing information of chat room interaction text samples in sample1 and sample2, and after the local comment viewpoint collision vectors corresponding to each local task thread are selected, the characteristics of the chat room interaction text samples in the corresponding sample sets are left, so that the task threads corresponding to each sample set only determine the debugging cost (model training loss) of the chat room interaction text samples with real authentication comments. The first local task thread and the second local task thread of the first task thread only comprise local comment viewpoint collision vectors corresponding to the chat room interaction text sample in sample 1.
Performing numerical mapping processing on the local comment viewpoint collision vector corresponding to the selected first local task thread by using a numerical mapping sub-network to obtain a local comment viewpoint collision vector corresponding to the first local task thread after numerical mapping; and carrying out numerical mapping processing on the local comment viewpoint collision vector corresponding to the selected second local task thread by using a numerical mapping sub-network to obtain the local comment viewpoint collision vector corresponding to the second local task thread after numerical mapping.
And finally, obtaining first generated data of a set first local word vector generation model corresponding to each task thread and second generated data of a set second local word vector generation model based on the local comment viewpoint collision vector corresponding to the first local task thread and the local comment viewpoint collision vector corresponding to the second local task thread.
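As a non-limiting illustration, the sample-set-based selection described in step 1) can be sketched as a mask applied before the debugging cost is averaged, so that each task thread only counts the chat room interaction text samples that carry its own authentication annotations. The per-sample cost values, the source tags and the function name `masked_mean_cost` are assumptions made for this sketch.

```python
from typing import List


def masked_mean_cost(costs: List[float], sources: List[str], wanted: str) -> float:
    """Average only the per-sample costs whose sample came from the `wanted` set."""
    kept = [cost for cost, source in zip(costs, sources) if source == wanted]
    return sum(kept) / len(kept) if kept else 0.0


# A mixed batch: per-sample costs and the sample set each sample came from.
batch_costs = [1.0, 3.0, 5.0, 7.0]
batch_sources = ["sample1", "sample2", "sample1", "sample2"]

first_thread_cost = masked_mean_cost(batch_costs, batch_sources, "sample1")
second_thread_cost = masked_mean_cost(batch_costs, batch_sources, "sample2")
```

With this masking, a sample from sample2 never penalizes the first task thread for a topic that sample2 does not annotate.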
2) Determining a first debugging cost index and a second debugging cost index corresponding to the task thread based on the first generated data and the second generated data corresponding to the task thread and the authentication annotation of the chat room interaction text sample.
Under some design considerations, a debugging cost index may be determined based on a loss function.
3) Determining the debugging cost index corresponding to the task thread based on the first debugging cost index and the second debugging cost index corresponding to the task thread.
Under some design ideas, the first debug cost index and the second debug cost index of each task thread can be summed to obtain the debug cost index corresponding to each task thread. Then, the debugging cost indexes corresponding to the first task threads in all AI network models can be summed to obtain the debugging cost indexes corresponding to all the first task threads; and summing the debugging cost indexes corresponding to the second task thread in all AI network models to obtain the debugging cost index of the second task thread.
Under some design ideas, an influence coefficient can be configured for each AI network model, and the summation used to determine the debugging cost index of the first task thread and the debugging cost index of the second task thread is then weighted by these influence coefficients.
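As a non-limiting illustration, the summation in step 3) can be sketched as follows, with one (first, second) pair of debugging cost indexes per AI network model and an optional influence coefficient per model. The function name `thread_cost` and the numeric values are assumptions made for this sketch.

```python
from typing import Optional, Sequence, Tuple


def thread_cost(
    per_model: Sequence[Tuple[float, float]],
    coefficients: Optional[Sequence[float]] = None,
) -> float:
    """Sum first + second cost over all AI network models, optionally weighted."""
    if coefficients is None:
        coefficients = [1.0] * len(per_model)
    if len(coefficients) != len(per_model):
        raise ValueError("one influence coefficient is needed per AI network model")
    return sum(c * (first + second) for c, (first, second) in zip(coefficients, per_model))


# One task thread, two AI network models (one per feature scale).
costs = [(0.5, 0.25), (1.0, 0.5)]
plain = thread_cost(costs)                 # unweighted sum
weighted = thread_cost(costs, [1.0, 0.5])  # second model counts half
```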
4) Determining the debugging cost index of the universal online chat topic mining network based on the debugging cost index corresponding to the task thread.
In the embodiment of the invention, a processing thought based on a dynamic loop operator is provided, and the dynamic loop operator can be compatible with the influence coefficient of the debugging cost index of each task thread by combining iterative processing and window sliding processing, so that the effectiveness and the robustness of each task thread are ensured.
Under some design ideas, determining the debugging cost gradient corresponding to a task thread includes the following. First, the prior debugging cost index mean value corresponding to the local task thread under the current dynamic loop operator, and the prior debugging cost index mean value of the universal online chat topic mining network under the current dynamic loop operator, are determined, where the current dynamic loop operator covers the x-th loop to the y-th loop. Second, the prior debugging cost index mean value corresponding to the local task thread under the previous dynamic loop operator is determined, where the previous dynamic loop operator covers the (x-z)-th loop to the (y-z)-th loop; z characterizes the update period, the value of y-x may be referred to as the operator coverage size of the dynamic loop operator, and x, y and z are positive integers that can be set flexibly. Finally, the debugging cost gradient corresponding to the task thread is determined based on the prior debugging cost index mean value of the local task thread under the previous dynamic loop operator, the prior debugging cost index mean value of the local task thread under the current dynamic loop operator, and the prior debugging cost index mean value of the universal online chat topic mining network under the current dynamic loop operator. The debugging cost gradient can be understood as the rate of change of the loss function.
Under some design ideas, the prior debugging cost index mean value of the universal online chat topic mining network under the current dynamic loop operator can be determined; a cost variable label corresponding to the task thread is determined based on the prior debugging cost index mean value of the local task thread under the current dynamic loop operator and the prior debugging cost index mean value of the universal online chat topic mining network under the current dynamic loop operator; and the debugging cost confidence corresponding to the task thread is determined based on the cost variable label corresponding to the task thread and the debugging cost gradient corresponding to the task thread.
And then, determining the debugging cost index of the universal online chat topic mining network based on the debugging cost confidence corresponding to the task thread and the debugging cost index corresponding to the task thread. Under some design ideas, the debugging cost indexes corresponding to the task threads can be integrated by utilizing the debugging cost confidence degrees corresponding to the task threads, so that the debugging cost indexes of the universal online chat topic mining network are obtained.
5) Optimizing the network variables of the universal online chat topic mining network based on its debugging cost index, returning to the step of performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network, and repeating until the debugging cost index of the universal online chat topic mining network meets the set requirement, so as to obtain the online chat topic mining network. The set requirement may be that the universal online chat topic mining network tends to be stable.
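The window-based weighting described in step 4) can be illustrated with a minimal Python sketch. The source does not give formulas, so everything concrete here is an assumption: the gradient is taken as the ratio of the current window mean to the previous window mean, the confidence is a softmax over those ratios, and the cost variable label is omitted for brevity. The names `window_mean` and `network_cost_index` are hypothetical, and cost indexes are assumed to be positive.

```python
import math


def window_mean(history, start, end):
    """Mean of the cost indexes for loops start..end (1-based, inclusive)."""
    window = history[start - 1:end]
    return sum(window) / len(window)


def network_cost_index(thread_histories, thread_costs, x, y, z):
    """Combine per-thread debugging cost indexes into one network-level index.

    thread_histories: one list of past cost indexes per task thread.
    thread_costs: the cost index of each task thread in the present loop.
    The gradient and confidence formulas are assumptions, not the source's.
    """
    if x - z < 1 or y < x:
        raise ValueError("need 1 <= x - z and x <= y")
    gradients = []
    for history in thread_histories:
        current = window_mean(history, x, y)           # loops x..y
        previous = window_mean(history, x - z, y - z)  # loops x-z..y-z
        gradients.append(current / previous)           # assumed rate-of-change proxy
    # Assumed confidence: softmax over the gradients, so a thread whose cost
    # falls more slowly receives a larger weight.
    exps = [math.exp(g) for g in gradients]
    total_exp = sum(exps)
    confidences = [e / total_exp for e in exps]
    cost = sum(c * t for c, t in zip(confidences, thread_costs))
    return cost, confidences
```

With two threads whose histories are `[4.0, 4.0, 2.0, 2.0]` and `[2.0, 2.0, 2.0, 2.0]`, and x=3, y=4, z=2, the second thread has stopped improving and therefore receives the larger confidence under these assumptions.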
Therefore, the invention can acquire a plurality of chat room interaction text sample sets that correspond one-to-one to a plurality of task threads in the universal online chat topic mining network. The authentication notes of the chat room interaction text samples differ between sample sets, and the authentication notes are used to distinguish target online chat topics of different text keywords in the chat room interaction text samples. When the chat room interaction text samples are used to debug the universal online chat topic mining network, an AI network model is introduced between different task threads, which provides each task thread with implicit cross-task details and makes it easier to mine deep connections among target online chat topics from different sample sets. Furthermore, an approach based on a dynamic loop operator is provided: by combining iterative processing with sliding-window processing, the dynamic loop operator can take into account the influence coefficient of the debugging cost index of each task thread, ensuring the effectiveness and robustness of each task thread. Finally, the debugged online chat topic mining network is used to mine the target online chat topics of the chat room interaction text to be analyzed, so that all online chat topics with push analysis value in that text can be mined, which improves the quality of target online chat topic mining.
In other embodiments, a user personality analysis processing method based on big data is provided, including the following steps.
Step 210, obtaining a plurality of chat room interaction text sample sets and a universal online chat topic mining network.
Step 220, debugging the universal online chat topic mining network by using the chat room interaction text sample to obtain the online chat topic mining network.
Step 230, obtaining a chat room interaction text to be analyzed, wherein the chat room interaction text to be analyzed comprises at least one target online chat topic.
The chat room interaction text to be analyzed may be the chat room interaction text in the test sample in step 210.
Step 240, mining comment viewpoint vectors from the chat room interaction text to be analyzed by using a trunk model and a ladder model in the online chat topic mining network to obtain personalized comment viewpoint vectors of different dimensions.
Step 250, performing local comment viewpoint vector mining on the personalized comment viewpoint vectors by using a multitasking model in the online chat topic mining network to obtain local comment viewpoint vectors corresponding to a plurality of task threads, wherein the local comment viewpoint vectors are comment viewpoint vectors corresponding to target online chat topics of different text keywords in the chat room interaction text to be analyzed.
Step 260, performing a comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using a text feature collision model in the online chat topic mining network to obtain the local comment viewpoint collision vector corresponding to each task thread.
Step 270, mining the target online chat topics in the chat room interaction text to be analyzed based on the local comment viewpoint collision vectors corresponding to the task threads.
Steps 230-270 may be implemented with reference to the design ideas of steps 110-150.
As in the embodiment above, each chat room interaction text sample set corresponds to one task thread of the universal online chat topic mining network and carries its own authentication notes, which distinguish target online chat topics of different text keywords. The AI network model introduced between task threads during debugging supplies each task thread with implicit cross-task details, so that deep connections among target online chat topics can be mined from different sample sets. The dynamic loop operator, which combines iterative processing with sliding-window processing, takes into account the influence coefficient of the debugging cost index of each task thread and ensures the effectiveness and robustness of each task thread. The debugged online chat topic mining network can then mine all online chat topics with push analysis value from the chat room interaction text to be analyzed, which improves the quality of target online chat topic mining.
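The comment viewpoint collision operation of step 260 (aggregate the per-thread vectors, derive a vector for each thread, modify it with a bias coefficient, then combine it with the thread's own vector) can be sketched in plain Python. This is only an illustration under stated assumptions: the aggregation is an element-wise mean, the aggregate itself stands in for the derivative vector that the mining sub-model would produce, the bias coefficients are given scalars rather than learned values, and the final combination is an element-wise sum. The function name `collide` is hypothetical.

```python
def collide(local_vectors, bias_coefficients):
    """Cross-thread "collision" of per-thread local vectors (illustrative).

    local_vectors: one equal-length list of floats per task thread.
    bias_coefficients: one assumed scalar weight per task thread.
    """
    n = len(local_vectors)
    dim = len(local_vectors[0])
    # Aggregation (assumed: element-wise mean across task threads).
    aggregate = [sum(v[i] for v in local_vectors) / n for i in range(dim)]
    collided = []
    for vec, coeff in zip(local_vectors, bias_coefficients):
        # Derivative vector (assumed: the aggregate itself), scaled by the
        # thread's bias coefficient.
        modified = [coeff * a for a in aggregate]
        # Combination with the thread's own vector (assumed: element-wise sum).
        collided.append([v + m for v, m in zip(vec, modified)])
    return collided
```

Each output vector keeps the thread's own information and adds a weighted share of what the other threads contributed, which is one simple way to pass cross-task details between task threads.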
Based on the above, after step 150, the method further comprises step 160.
Step 160, determining a topic preference label of the target user terminal according to the target online chat topic, and performing topic pushing processing on the target user terminal by using the topic preference label.
In the embodiment of the invention, the topic preference label of the target user terminal can be determined from the target online chat topic. The target user terminal is the user terminal of the chat user corresponding to the chat room interaction text to be analyzed, and the topic preference label reflects the interest preferences of that chat user, so that topic pushing for the target user terminal can be performed accurately based on the topic preference label.
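As a rough illustration of how a topic preference label could be derived from mined topics, the sketch below simply takes the most frequent topics. The source does not specify the rule, so the frequency-based choice, the function name `preference_labels` and the `top_k` parameter are all assumptions.

```python
from collections import Counter


def preference_labels(mined_topics, top_k=2):
    """Return the top_k most frequent mined topics as preference labels.

    The frequency rule is an assumption; the source only states that the
    label is determined from the target online chat topics.
    """
    counts = Counter(mined_topics)
    return [topic for topic, _ in counts.most_common(top_k)]
```

For a user whose chat text yields the topics games, music, games, sports, music, games, this rule would label the user with games and music.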
Based on the above, performing topic pushing processing on the target user terminal by using the topic preference label in step 160 includes steps 161-164.
Step 161, selecting an online chat topic to be pushed from a preset topic pool by utilizing the topic preference label.
The preset topic pool may be a push topic database generated in advance.
Step 162, collecting topic text information corresponding to the online chat topic to be pushed.
After the online chat topic to be pushed is determined, the topic text information can be accurately collected through different search engines for subsequent pushing.
Step 163, performing data desensitization processing on the topic text information, on the premise that the topic text information carries a data desensitization instruction, to obtain desensitized text information.
It can be understood that if the topic text information contains sensitive personal privacy information, a data desensitization instruction can be bound to it in advance; in that case the data must be anonymized and desensitized before being pushed, so that personal privacy is protected.
Step 164, pushing the online chat topic to be pushed and the desensitized text information to the target user terminal in association with each other.
In this way, by applying steps 161-164, the topic to be pushed can be accurately selected from the preset topic pool based on the topic preference label, the topic text information can then be collected in a targeted manner, and the data is anonymized and desensitized before pushing in order to protect personal privacy information. Accurate pushing to the target user terminal is thus achieved while personal privacy is protected.
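Steps 161-164 can be sketched as a small pipeline. The sketch is illustrative only: `TopicText`, `desensitize` and `build_push` are hypothetical names, the boolean flag stands in for the data desensitization instruction, the masking rule (hide any run of seven or more digits) is an assumption, and text without the flag is assumed to be pushed unchanged, a case the source does not address.

```python
import re
from dataclasses import dataclass


@dataclass
class TopicText:
    topic: str
    text: str
    needs_desensitization: bool  # stands in for the data desensitization instruction


def desensitize(text):
    # Assumed rule: mask any run of 7 or more digits (e.g. phone numbers).
    return re.sub(r"\d{7,}", "***", text)


def build_push(preference_labels, topic_pool, collect):
    """topic_pool maps topic -> label; collect(topic) returns a TopicText."""
    pushes = []
    for topic, label in topic_pool.items():
        if label not in preference_labels:   # step 161: select by label
            continue
        info = collect(topic)                # step 162: collect topic text
        if info.needs_desensitization:       # step 163: desensitize if flagged
            text = desensitize(info.text)
        else:
            text = info.text
        pushes.append((topic, text))         # step 164: associated pair to push
    return pushes
```

Passing the collector as a callable keeps the sketch independent of any particular search engine or data source.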
Based on the above, in some designs that can be implemented independently, performing data desensitization processing on the topic text information in step 163 to obtain desensitized text information includes steps 1631-1634.
Step 1631, performing privacy knowledge extraction on the topic text information according to a first privacy knowledge extraction rule to obtain first privacy knowledge, wherein the first privacy knowledge extraction rule is an extraction rule based on a decision tree.
Step 1632, determining at least one initial topic text message matched with the topic text message based on the first privacy knowledge of the topic text message, and performing privacy knowledge extraction on the topic text message and each initial topic text message according to a second privacy knowledge extraction rule to obtain second privacy knowledge, wherein the second privacy knowledge extraction rule is an extraction rule based on a non-decision tree.
Step 1633, based on the extracted second privacy knowledge, performing text phrase matching processing on the topic text information and each initial topic text information in sequence, and determining first topic text information among the at least one initial topic text information as a data desensitization reference text, wherein the number of matched text phrases between the first topic text information and the topic text information is greater than a number threshold.
Step 1634, performing data desensitization processing on the topic text information by using the data desensitization instruction of the first topic text information to obtain desensitized text information.
In the embodiment of the invention, the first privacy knowledge is extracted according to a decision-tree-based extraction rule, which ensures the precision of the first privacy knowledge, and the second privacy knowledge is extracted according to a non-decision-tree-based extraction rule, which ensures the comprehensiveness of the second privacy knowledge. The data desensitization reference text of the topic text information is determined by combining the first privacy knowledge and the second privacy knowledge, which ensures a high degree of matching for the reference text, so that accurate and reliable data desensitization processing can be performed on the topic text information according to the data desensitization instruction.
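The two-stage selection of a data desensitization reference text in steps 1631-1633 can be sketched as follows. The sketch is hedged: `pick_reference`, `coarse_key` and `phrases` are hypothetical names, the two callables merely stand in for the decision-tree-based and non-decision-tree-based extraction rules, and matching initial texts by equal coarse keys is an assumption.

```python
def pick_reference(topic_text, candidates, coarse_key, phrases, threshold):
    """Return the first candidate sharing more than `threshold` phrases.

    coarse_key(text) stands in for the first (decision-tree-based) rule.
    phrases(text) stands in for the second (non-decision-tree) rule and
    returns a set of text phrases. Both callables are hypothetical.
    """
    key = coarse_key(topic_text)                               # step 1631
    initial = [c for c in candidates if coarse_key(c) == key]  # step 1632
    target = phrases(topic_text)
    for cand in initial:                                       # step 1633
        if len(target & phrases(cand)) > threshold:
            return cand
    return None  # no candidate exceeds the number threshold
```

The desensitization instruction bound to the returned text would then be applied to the topic text, as in step 1634.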
Based on the above, in some designs that can be implemented independently, performing privacy knowledge extraction on the topic text information according to the first privacy knowledge extraction rule includes: performing privacy knowledge extraction on the topic text information based on a decision tree model, wherein the decision tree model is trained on first authenticated topic text information and second authenticated topic text information; the first authenticated topic text information is a training sample carrying prior data, and the second authenticated topic text information is a training sample not carrying prior data.
Based on the same or a similar inventive concept, the invention also provides a user personality analysis processing system based on big data, which comprises a big data analysis cloud platform and a user terminal that communicate with each other;
the big data analysis cloud platform is used for:
acquiring a chat room interaction text to be analyzed, wherein the chat room interaction text to be analyzed comprises at least one target online chat topic;
performing comment viewpoint vector mining on the chat room interaction text to be analyzed to obtain personalized comment viewpoint vectors of different dimensions;
performing local comment viewpoint vector mining on the personalized comment viewpoint vectors to obtain local comment viewpoint vectors corresponding to a plurality of task threads, wherein the local comment viewpoint vectors are comment viewpoint vectors corresponding to target online chat topics of different text keywords in the chat room interaction text to be analyzed;
performing a comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vector corresponding to each task thread;
and mining the target online chat topics in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vectors corresponding to the task threads.
Optionally, the big data analysis cloud platform is further used for: determining a topic preference label of the target user terminal according to the target online chat topic; and performing topic pushing processing on the target user terminal by using the topic preference label;
wherein performing topic pushing processing on the target user terminal by using the topic preference label includes: selecting an online chat topic to be pushed from a preset topic pool by using the topic preference label; collecting topic text information corresponding to the online chat topic to be pushed; on the premise that the topic text information carries a data desensitization instruction, performing data desensitization processing on the topic text information to obtain desensitized text information; and pushing the online chat topic to be pushed and the desensitized text information to the target user terminal in association with each other.
Optionally, performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vector corresponding to each task thread includes: acquiring a text feature collision model; and performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vector corresponding to each task thread;
the text feature collision model comprises a text feature collision sub-model and a comment viewpoint vector mining sub-model, and performing the comment viewpoint collision operation by using the text feature collision model includes: performing a comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain a local comment viewpoint aggregate vector; performing comment viewpoint vector mining on the local comment viewpoint aggregate vector by using the comment viewpoint vector mining sub-model to obtain a plurality of local comment viewpoint derivative vectors; performing bias coefficient modification on the local comment viewpoint derivative vectors by using the text feature collision sub-model to obtain modified local comment viewpoint derivative vectors; and processing each modified local comment viewpoint derivative vector together with the corresponding local comment viewpoint vector to obtain the local comment viewpoint collision vector corresponding to the task thread;
the local comment viewpoint vector includes a first comment viewpoint word vector and a second comment viewpoint word vector, the local comment viewpoint aggregate vector includes a first local comment viewpoint aggregate word vector and a second local comment viewpoint aggregate word vector, and performing the comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint aggregate vector includes: performing the comment viewpoint vector aggregation operation on the first comment viewpoint word vectors corresponding to the plurality of task threads to obtain the first local comment viewpoint aggregate word vector; and performing the comment viewpoint vector aggregation operation on the second comment viewpoint word vectors corresponding to the plurality of task threads to obtain the second local comment viewpoint aggregate word vector;
wherein the big data analysis cloud platform is further used for: acquiring a first local word vector generation model and a second local word vector generation model corresponding to the task thread; the local comment viewpoint collision vector corresponding to the task thread includes a local comment viewpoint collision vector corresponding to the first comment viewpoint word vector and a local comment viewpoint collision vector corresponding to the second comment viewpoint word vector, and mining the target online chat topics in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vectors corresponding to the task threads includes: performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the first comment viewpoint word vector by using the first local word vector generation model to obtain text keywords and distribution quantization characteristics of text units in the chat room interaction text to be analyzed; performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the second comment viewpoint word vector by using the second local word vector generation model to obtain range quantization characteristics of the text units in the chat room interaction text to be analyzed; and mining the target online chat topics in the chat room interaction text to be analyzed by combining the text keywords, the distribution quantization characteristics and the range quantization characteristics of the text units in the chat room interaction text to be analyzed.
Optionally, before acquiring the chat room interaction text to be analyzed, the big data analysis cloud platform is further used for: acquiring a plurality of chat room interaction text sample sets and a universal online chat topic mining network, wherein the chat room interaction text sample sets correspond one-to-one to a plurality of task threads in the universal online chat topic mining network, the authentication notes of the chat room interaction text samples differ between the chat room interaction text sample sets, and the authentication notes are used for distinguishing target online chat topics of different text keywords in the chat room interaction text samples; and debugging the universal online chat topic mining network by using the chat room interaction text samples to obtain an online chat topic mining network;
each task thread in the universal online chat topic mining network comprises a set first local word vector generation model and a set second local word vector generation model, and debugging the universal online chat topic mining network by using the chat room interaction text samples to obtain the online chat topic mining network includes: performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network to obtain first generated data of the set first local word vector generation model and second generated data of the set second local word vector generation model corresponding to the task thread; determining a first debugging cost index and a second debugging cost index corresponding to the task thread based on the first generated data, the second generated data and the authentication notes of the chat room interaction text samples corresponding to the task thread; determining the debugging cost index corresponding to the task thread by combining the first debugging cost index and the second debugging cost index corresponding to the task thread; determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads; and optimizing the network variables of the universal online chat topic mining network based on the debugging cost index of the universal online chat topic mining network, returning to the step of performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network, and repeating until the debugging cost index of the universal online chat topic mining network meets the set requirement, so as to obtain the online chat topic mining network;
determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads includes: determining the prior debugging cost index mean corresponding to the task thread under the current dynamic loop operator, wherein the current dynamic loop operator covers the x-th loop to the y-th loop; determining the prior debugging cost index mean corresponding to the task thread under the previous dynamic loop operator, wherein the previous dynamic loop operator covers the (x-z)-th loop to the (y-z)-th loop, and z, x and y are positive integers; determining the debugging cost gradient corresponding to the task thread by combining the prior debugging cost index mean corresponding to the task thread under the previous dynamic loop operator and the prior debugging cost index mean corresponding to the task thread under the current dynamic loop operator; determining the debugging cost confidence corresponding to the task thread based on the debugging cost gradient corresponding to the task thread; and determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost confidence corresponding to the task thread and the debugging cost index corresponding to the task thread.
Further, there is also provided a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the above-described method.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus and method embodiments described above are merely illustrative, for example, flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art can make various modifications and variations to the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A user personality analysis processing method based on big data, characterized in that the method is applied to a big data analysis cloud platform and comprises the following steps:
acquiring a chat room interaction text to be analyzed, wherein the chat room interaction text to be analyzed comprises at least one target online chat topic;
performing comment viewpoint vector mining on the chat room interaction text to be analyzed to obtain personalized comment viewpoint vectors of different dimensions;
performing local comment viewpoint vector mining on the personalized comment viewpoint vectors to obtain local comment viewpoint vectors corresponding to a plurality of task threads, wherein the local comment viewpoint vectors are comment viewpoint vectors corresponding to target online chat topics of different text keywords in the chat room interaction text to be analyzed;
performing a comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vector corresponding to each task thread;
and mining the target online chat topics in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vectors corresponding to the task threads.
2. The method of claim 1, wherein the method further comprises:
determining topic preference labels of the target user terminals according to the target online chat topics;
and carrying out topic pushing processing on the target user terminal by utilizing the topic preference label.
3. The method of claim 2, wherein performing topic pushing processing on the target user terminal by using the topic preference label comprises:
selecting an online chat topic to be pushed from a preset topic pool by utilizing the topic preference label;
collecting topic text information corresponding to the online chat topic to be pushed;
on the premise that the topic text information carries data desensitization instructions, performing data desensitization processing on the topic text information to obtain desensitized text information;
and pushing the online chat topic to be pushed and the desensitized text information to the target user terminal in association with each other.
4. The method of claim 1, wherein performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vector corresponding to each task thread comprises: acquiring a text feature collision model; and performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vector corresponding to each task thread;
the text feature collision model comprises a text feature collision sub-model and a comment viewpoint vector mining sub-model, and performing the comment viewpoint collision operation by using the text feature collision model comprises: performing a comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain a local comment viewpoint aggregate vector; performing comment viewpoint vector mining on the local comment viewpoint aggregate vector by using the comment viewpoint vector mining sub-model to obtain a plurality of local comment viewpoint derivative vectors; performing bias coefficient modification on the local comment viewpoint derivative vectors by using the text feature collision sub-model to obtain modified local comment viewpoint derivative vectors; and processing each modified local comment viewpoint derivative vector together with the corresponding local comment viewpoint vector to obtain the local comment viewpoint collision vector corresponding to the task thread;
The local comment viewpoint vector includes a first comment viewpoint vector and a second comment viewpoint vector, the local comment viewpoint aggregate vector includes a first local comment viewpoint aggregate word vector and a second local comment viewpoint aggregate word vector, and the local comment viewpoint vector corresponding to the task threads is subjected to comment viewpoint vector aggregate operation to obtain a local comment viewpoint aggregate vector, including: performing comment viewpoint vector aggregation operation on first comment viewpoint word vectors corresponding to the task threads to obtain first local comment viewpoint aggregation word vectors; performing comment viewpoint vector aggregation operation on second comment viewpoint word vectors corresponding to the plurality of task threads to obtain second local comment viewpoint aggregation word vectors;
wherein the method further comprises: acquiring a first local word vector generation model and a second local word vector generation model corresponding to the task thread; the local comment viewpoint collision vector corresponding to the task thread includes a local comment viewpoint collision vector corresponding to the first comment viewpoint word vector and a local comment viewpoint collision vector corresponding to the second comment viewpoint word vector, and mining the target online chat topic in the chat room interaction text to be analyzed in combination with the local comment viewpoint collision vector corresponding to the task thread comprises: performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the first comment viewpoint word vector by using the first local word vector generation model to obtain text keywords and distribution quantization features of text units in the chat room interaction text to be analyzed; performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the second comment viewpoint word vector by using the second local word vector generation model to obtain range quantization features of the text units in the chat room interaction text to be analyzed; and mining the target online chat topic in the chat room interaction text to be analyzed by combining the text keywords, the distribution quantization features and the range quantization features of the text units.
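The four collision steps recited in claim 4 (aggregation, derivative mining, bias coefficient modification, combination with the original local vector) can be sketched as follows. The claim does not fix the arithmetic of any step, so everything concrete here is an assumption made for illustration: aggregation is taken as an element-wise mean, the mining sub-model is replaced by a per-thread element-wise scaling, the bias coefficient is a single scalar per thread, and the final combination is an element-wise sum. The names `aggregate`, `mine_derivatives`, `modify_bias` and `collide` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def aggregate(local_vectors: Dict[str, Vector]) -> Vector:
    """Aggregation step. Assumption: element-wise mean over all task threads."""
    count = len(local_vectors)
    dim = len(next(iter(local_vectors.values())))
    return [sum(v[i] for v in local_vectors.values()) / count for i in range(dim)]


def mine_derivatives(agg: Vector, thread_weights: Dict[str, Vector]) -> Dict[str, Vector]:
    """Mining step. Assumption: a per-thread element-wise scaling of the
    aggregate stands in for the comment viewpoint vector mining sub-model."""
    return {t: [w * a for w, a in zip(weights, agg)]
            for t, weights in thread_weights.items()}


def modify_bias(derived: Dict[str, Vector], bias: Dict[str, float]) -> Dict[str, Vector]:
    """Bias coefficient modification. Assumption: one scalar coefficient per thread."""
    return {t: [x * bias[t] for x in vec] for t, vec in derived.items()}


def collide(local_vectors: Dict[str, Vector],
            thread_weights: Dict[str, Vector],
            bias: Dict[str, float]) -> Dict[str, Vector]:
    """Run the four steps and return one collision vector per task thread."""
    agg = aggregate(local_vectors)
    derived = modify_bias(mine_derivatives(agg, thread_weights), bias)
    # Assumption: the modified derivative is added to the thread's own local vector.
    return {t: [d + l for d, l in zip(derived[t], local_vectors[t])]
            for t in local_vectors}
```

In this sketch every thread's output depends on the vectors of all threads through the shared aggregate, which is one plausible reading of the cross-thread "collision" the claim describes.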
5. The method of claim 1, wherein prior to obtaining the chat room interaction text to be analyzed, further comprising: acquiring a plurality of chat room interaction text sample sets and a universal online chat topic mining network, wherein the chat room interaction text sample sets are in one-to-one correspondence with a plurality of task threads in the universal online chat topic mining network, authentication notes of the chat room interaction text sample in the chat room interaction text sample sets are different, and the authentication notes are used for distinguishing target online chat topics of different text keywords in the chat room interaction text sample; debugging the universal online chat topic mining network by using the chat room interaction text sample to obtain an online chat topic mining network;
wherein each task thread in the universal online chat topic mining network comprises a preset first local word vector generation model and a preset second local word vector generation model, and debugging the universal online chat topic mining network by using the chat room interaction text samples to obtain the online chat topic mining network comprises: performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network to obtain, for each task thread, first generated data of the preset first local word vector generation model and second generated data of the preset second local word vector generation model; determining a first debugging cost index and a second debugging cost index corresponding to the task thread based on the first generated data, the second generated data and the authentication annotations of the chat room interaction text samples corresponding to the task thread; determining a debugging cost index corresponding to the task thread by combining the first debugging cost index and the second debugging cost index corresponding to the task thread; determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads; and optimizing network variables of the universal online chat topic mining network based on the debugging cost index of the universal online chat topic mining network, then returning to the step of performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network, and repeating until the debugging cost index of the universal online chat topic mining network meets a set requirement, thereby obtaining the online chat topic mining network;
wherein determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads comprises: determining a prior debugging cost index mean value corresponding to the task thread for a current dynamic loop operator, wherein the current dynamic loop operator represents the x-th loop through the y-th loop; determining a prior debugging cost index mean value corresponding to the task thread for a previous dynamic loop operator, wherein the previous dynamic loop operator represents the (x-z)-th loop through the (y-z)-th loop, and x, y and z are positive integers; determining a debugging cost gradient corresponding to the task thread by combining the prior debugging cost index mean value for the previous dynamic loop operator and the prior debugging cost index mean value for the current dynamic loop operator; determining a debugging cost confidence corresponding to the task thread based on the debugging cost gradient corresponding to the task thread; and determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost confidence corresponding to the task thread and the debugging cost index corresponding to the task thread.
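The windowed weighting at the end of claim 5 can be illustrated with a short sketch. The claim names the quantities (window means, a gradient, a confidence, a combined index) but gives no formulas, so the formulas below are assumptions: the gradient is taken as the ratio of the current-window mean to the previous-window mean, the confidence is a softmax over those ratios, and the network-level index is the confidence-weighted sum of per-thread indexes. The function names are hypothetical.

```python
import math
from typing import Dict, List


def window_mean(history: List[float], start: int, end: int) -> float:
    """Mean debugging cost over loops start..end (1-indexed, inclusive)."""
    if start < 1 or end > len(history) or start > end:
        raise ValueError("window must lie inside the recorded loops")
    window = history[start - 1:end]
    return sum(window) / len(window)


def task_confidences(histories: Dict[str, List[float]],
                     x: int, y: int, z: int) -> Dict[str, float]:
    """Per-thread confidence from the change between two loop windows."""
    gradients = {}
    for task, history in histories.items():
        current = window_mean(history, x, y)           # loops x..y
        previous = window_mean(history, x - z, y - z)  # loops (x-z)..(y-z)
        # Assumption: gradient = current / previous; a larger value means the
        # thread's cost is falling more slowly (or rising).
        gradients[task] = current / max(previous, 1e-12)
    # Assumption: confidence is a softmax over the gradients.
    exps = {task: math.exp(g) for task, g in gradients.items()}
    total = sum(exps.values())
    return {task: e / total for task, e in exps.items()}


def network_cost(task_costs: Dict[str, float],
                 confidences: Dict[str, float]) -> float:
    """Assumption: network-level index = confidence-weighted sum of thread indexes."""
    return sum(confidences[task] * task_costs[task] for task in task_costs)
```

Under these assumptions a thread whose cost has stalled between the two windows receives more weight than one whose cost is still dropping, so the optimizer spends more effort on the slower thread.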
6. A user personality analysis processing system based on big data, characterized by comprising a big data analysis cloud platform and a user terminal that communicate with each other;
the big data analysis cloud platform is used for:
acquiring a chat room interaction text to be analyzed, wherein the chat room interaction text to be analyzed comprises at least one target online chat topic;
performing comment viewpoint vector mining on the chat room interaction text to be analyzed to obtain personalized comment viewpoint vectors of different dimensions;
performing local comment viewpoint vector mining on the personalized comment viewpoint vectors to obtain local comment viewpoint vectors corresponding to a plurality of task threads, wherein the local comment viewpoint vectors are comment viewpoint vectors corresponding to target online chat topics of different text keywords in the chat room interaction text to be analyzed;
performing comment viewpoint collision operation on local comment viewpoint vectors corresponding to a plurality of task threads to obtain the local comment viewpoint collision vectors corresponding to the task threads;
and mining the target online chat topics in the chat room interaction text to be analyzed by combining the local comment viewpoint collision vectors corresponding to the task threads.
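The data flow of the five platform steps in claim 6 can be sketched as a simple composition of stages. This is only a structural illustration: the stage implementations are passed in as callables because the claim does not define them, and the name `run_pipeline` and all parameter names are hypothetical.

```python
from typing import Callable, Dict, List

Vector = List[float]


def run_pipeline(
    text: str,
    mine_viewpoints: Callable[[str], List[Vector]],
    split_by_thread: Callable[[List[Vector]], Dict[str, Vector]],
    collide: Callable[[Dict[str, Vector]], Dict[str, Vector]],
    mine_topics: Callable[[str, Dict[str, Vector]], object],
) -> object:
    """Chain the claimed steps; `text` is the acquired chat room interaction text."""
    personalized = mine_viewpoints(text)   # personalized vectors of different dimensions
    local = split_by_thread(personalized)  # one local vector per task thread
    collided = collide(local)              # cross-thread collision vectors
    return mine_topics(text, collided)     # target online chat topics
```

Keeping each stage behind a callable makes it possible to swap in a trained model for any single step without touching the others.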
7. The system of claim 6, wherein the big data analysis cloud platform is further configured to: determine a topic preference label of a target user terminal according to the target online chat topic; and perform topic pushing processing on the target user terminal by using the topic preference label;
wherein performing the topic pushing processing on the target user terminal by using the topic preference label comprises: selecting an online chat topic to be pushed from a preset topic pool by using the topic preference label; collecting topic text information corresponding to the online chat topic to be pushed; when the topic text information carries a data desensitization instruction, performing data desensitization processing on the topic text information to obtain desensitized text information; and pushing the online chat topic to be pushed, in association with the desensitized text information, to the target user terminal.
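The push flow of claim 7 (select a topic by preference label, desensitize the text only when it is flagged, then push topic and text together) might look like the sketch below. The claim does not say what desensitization consists of or how the pool is organized, so the following are assumptions for illustration only: the pool is a dictionary keyed by preference label, the desensitization instruction is a boolean flag, and desensitization masks the middle digits of 11-digit numbers. All names are hypothetical.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class TopicText(NamedTuple):
    topic: str
    text: str
    needs_desensitization: bool  # stands in for the "data desensitization instruction"


# Assumption: sensitive data is an 11-digit number such as a phone number.
PHONE_PATTERN = re.compile(r"\b\d{11}\b")


def desensitize(text: str) -> str:
    """Mask the middle digits of every 11-digit number."""
    return PHONE_PATTERN.sub(
        lambda m: m.group()[:3] + "****" + m.group()[-4:], text)


def select_topic(pool: Dict[str, List[TopicText]],
                 preference_label: str) -> Optional[TopicText]:
    """Pick the first pooled topic filed under the preference label, if any."""
    candidates = pool.get(preference_label, [])
    return candidates[0] if candidates else None


def build_push(pool: Dict[str, List[TopicText]],
               preference_label: str) -> Optional[Dict[str, str]]:
    """Return the payload to send to the user terminal, or None if no topic matches."""
    item = select_topic(pool, preference_label)
    if item is None:
        return None
    text = desensitize(item.text) if item.needs_desensitization else item.text
    return {"topic": item.topic, "text": text}
```

Text that carries no desensitization flag is passed through unchanged, matching the conditional wording of the claim.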
8. The system of claim 6, wherein performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint collision vectors corresponding to the task threads comprises: acquiring a text feature collision model; and performing the comment viewpoint collision operation on the local comment viewpoint vectors corresponding to the plurality of task threads by using the text feature collision model to obtain the local comment viewpoint collision vectors corresponding to the task threads; wherein the text feature collision model comprises a text feature collision sub-model and a comment viewpoint vector mining sub-model, and performing the comment viewpoint collision operation by using the text feature collision model comprises: performing a comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain a local comment viewpoint aggregation vector; performing comment viewpoint vector mining on the local comment viewpoint aggregation vector by using the comment viewpoint vector mining sub-model to obtain a plurality of local comment viewpoint derivative vectors; performing bias coefficient modification on the local comment viewpoint derivative vectors by using the text feature collision sub-model to obtain modified local comment viewpoint derivative vectors; and processing each modified local comment viewpoint derivative vector together with the corresponding local comment viewpoint vector to obtain the local comment viewpoint collision vector corresponding to the task thread;
wherein the local comment viewpoint vector includes a first comment viewpoint word vector and a second comment viewpoint word vector, the local comment viewpoint aggregation vector includes a first local comment viewpoint aggregation word vector and a second local comment viewpoint aggregation word vector, and performing the comment viewpoint vector aggregation operation on the local comment viewpoint vectors corresponding to the plurality of task threads to obtain the local comment viewpoint aggregation vector comprises: performing the comment viewpoint vector aggregation operation on the first comment viewpoint word vectors corresponding to the plurality of task threads to obtain the first local comment viewpoint aggregation word vector; and performing the comment viewpoint vector aggregation operation on the second comment viewpoint word vectors corresponding to the plurality of task threads to obtain the second local comment viewpoint aggregation word vector;
wherein the big data analysis cloud platform is further configured to: acquire a first local word vector generation model and a second local word vector generation model corresponding to the task thread; the local comment viewpoint collision vector corresponding to the task thread includes a local comment viewpoint collision vector corresponding to the first comment viewpoint word vector and a local comment viewpoint collision vector corresponding to the second comment viewpoint word vector, and mining the target online chat topic in the chat room interaction text to be analyzed in combination with the local comment viewpoint collision vector corresponding to the task thread comprises: performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the first comment viewpoint word vector by using the first local word vector generation model to obtain text keywords and distribution quantization features of text units in the chat room interaction text to be analyzed; performing a vector convolution operation on the local comment viewpoint collision vector corresponding to the second comment viewpoint word vector by using the second local word vector generation model to obtain range quantization features of the text units in the chat room interaction text to be analyzed; and mining the target online chat topic in the chat room interaction text to be analyzed by combining the text keywords, the distribution quantization features and the range quantization features of the text units.
9. The system of claim 6, wherein prior to the obtaining chat room interaction text to be analyzed, the big data analysis cloud platform is further configured to: acquiring a plurality of chat room interaction text sample sets and a universal online chat topic mining network, wherein the chat room interaction text sample sets are in one-to-one correspondence with a plurality of task threads in the universal online chat topic mining network, authentication notes of the chat room interaction text sample in the chat room interaction text sample sets are different, and the authentication notes are used for distinguishing target online chat topics of different text keywords in the chat room interaction text sample; debugging the universal online chat topic mining network by using the chat room interaction text sample to obtain an online chat topic mining network;
wherein each task thread in the universal online chat topic mining network comprises a preset first local word vector generation model and a preset second local word vector generation model, and debugging the universal online chat topic mining network by using the chat room interaction text samples to obtain the online chat topic mining network comprises: performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network to obtain, for each task thread, first generated data of the preset first local word vector generation model and second generated data of the preset second local word vector generation model; determining a first debugging cost index and a second debugging cost index corresponding to the task thread based on the first generated data, the second generated data and the authentication annotations of the chat room interaction text samples corresponding to the task thread; determining a debugging cost index corresponding to the task thread by combining the first debugging cost index and the second debugging cost index corresponding to the task thread; determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads; and optimizing network variables of the universal online chat topic mining network based on the debugging cost index of the universal online chat topic mining network, then returning to the step of performing target online chat topic mining on the chat room interaction text samples by using the universal online chat topic mining network, and repeating until the debugging cost index of the universal online chat topic mining network meets a set requirement, thereby obtaining the online chat topic mining network;
wherein determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost indexes corresponding to the task threads comprises: determining a prior debugging cost index mean value corresponding to the task thread for a current dynamic loop operator, wherein the current dynamic loop operator represents the x-th loop through the y-th loop; determining a prior debugging cost index mean value corresponding to the task thread for a previous dynamic loop operator, wherein the previous dynamic loop operator represents the (x-z)-th loop through the (y-z)-th loop, and x, y and z are positive integers; determining a debugging cost gradient corresponding to the task thread by combining the prior debugging cost index mean value for the previous dynamic loop operator and the prior debugging cost index mean value for the current dynamic loop operator; determining a debugging cost confidence corresponding to the task thread based on the debugging cost gradient corresponding to the task thread; and determining the debugging cost index of the universal online chat topic mining network by combining the debugging cost confidence corresponding to the task thread and the debugging cost index corresponding to the task thread.
10. A big data analysis cloud platform, characterized by comprising a processor and a memory, wherein the processor is communicatively connected to the memory and is configured to read a computer program from the memory and execute the computer program to implement the method of any one of claims 1-5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310690368.4A CN116628168B (en) 2023-06-12 2023-06-12 User personality analysis processing method and system based on big data and cloud platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310690368.4A CN116628168B (en) 2023-06-12 2023-06-12 User personality analysis processing method and system based on big data and cloud platform

Publications (2)

Publication Number Publication Date
CN116628168A (en) 2023-08-22
CN116628168B (en) 2023-11-14

Family

ID=87597337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310690368.4A Active CN116628168B (en) 2023-06-12 2023-06-12 User personality analysis processing method and system based on big data and cloud platform

Country Status (1)

Country Link
CN (1) CN116628168B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815946A (en) * 2018-12-03 2019-05-28 东南大学 Multithreading business license positioning identifying method based on intensive connection network
CN110019830A (en) * 2017-09-20 2019-07-16 腾讯科技(深圳)有限公司 Corpus processing, term vector acquisition methods and device, storage medium and equipment
CN110781276A (en) * 2019-09-18 2020-02-11 平安科技(深圳)有限公司 Text extraction method, device, equipment and storage medium
CN111737464A (en) * 2020-06-12 2020-10-02 网易(杭州)网络有限公司 Text classification method and device and electronic equipment
CN112241289A (en) * 2019-07-18 2021-01-19 中移(苏州)软件技术有限公司 Text data processing method and electronic equipment
US10984193B1 (en) * 2020-01-08 2021-04-20 Intuit Inc. Unsupervised text segmentation by topic
US20220188306A1 (en) * 2019-07-16 2022-06-16 Splunk Inc. Executing one query based on results of another query
CN115016950A (en) * 2022-08-09 2022-09-06 深圳市乙辰科技股份有限公司 Data analysis method and system based on multithreading model
CN115203412A (en) * 2022-07-06 2022-10-18 腾讯科技(深圳)有限公司 Emotion viewpoint information analysis method and device, storage medium and electronic equipment
CN116204647A (en) * 2023-03-17 2023-06-02 中国建设银行股份有限公司 Method and device for establishing target comparison learning model and text clustering
CN116244417A (en) * 2023-03-23 2023-06-09 山东倩倩网络科技有限责任公司 Question-answer interaction data processing method and server applied to AI chat robot

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
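ALI YADOLLAHI et al.: "Current State of Text Sentiment Analysis from Opinion to Emotion Mining", 《ACM COMPUTING SURVEYS》, vol. 50, no. 2, pages 1, XP058666307, DOI: 10.1145/3057270 *
RAYMOND SO et al.: "Extract Aspect-based Financial Opinion Using Natural Language Inference", 《ICEMC '22: PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON E-BUSINESS AND MOBILE COMMERCE》, pages 83 *
武嘉文: "Research on Text Recommendation and Summary Generation Methods Based on Topic Analysis Models", 《China Master's Theses Full-text Database, Information Science and Technology》, no. 09, pages 138 - 265 *
焦凯楠 et al.: "Fine-grained Entity Recognition in the Counter-terrorism Domain Based on MacBERT-BiLSTM-CRF", 《Science Technology and Engineering》, vol. 21, no. 29, pages 12638 - 12648 *
颜永超: "Research on Multi-label Text Classification Based on Label Information Extraction", 《China Master's Theses Full-text Database, Information Science and Technology》, no. 11, pages 138 - 289 *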

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422063A (en) * 2023-12-18 2024-01-19 四川省大数据技术服务中心 Big data processing method applying intelligent auxiliary decision and intelligent auxiliary decision system
CN117422063B (en) * 2023-12-18 2024-02-23 四川省大数据技术服务中心 Big data processing method applying intelligent auxiliary decision and intelligent auxiliary decision system

Also Published As

Publication number Publication date
CN116628168B (en) 2023-11-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant