CN116662769B - User behavior analysis system and method based on deep learning model

Info

Publication number
CN116662769B
Authority
CN
China
Prior art keywords
malicious
reporting
language
user
formula
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310961231.8A
Other languages
Chinese (zh)
Other versions
CN116662769A (en)
Inventor
左长彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Digital Yuedong Technology Co ltd
Original Assignee
Beijing Digital Yuedong Technology Co ltd
Application filed by Beijing Digital Yuedong Technology Co ltd
Priority to CN202310961231.8A
Publication of CN116662769A
Application granted
Publication of CN116662769B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a user behavior analysis system and method based on a deep learning model, and relates in particular to the field of deep learning. The system comprises a task acquisition module, a priority calculation module and a user language analysis module. The task acquisition module acquires reporting tasks in a network platform. The priority calculation module obtains the priority index of each reporting task and sorts the reporting tasks by priority index; it analyzes each reporting task to obtain the reporter's information, derives the importance of the reporting task, and adjusts the processing order of the reporting tasks according to that importance. The user language analysis module identifies whether the reported content is malicious language and evaluates its degree of malice. The application addresses the problems that existing user behavior analysis systems process reporting tasks slowly and cannot handle reported content in time.

Description

User behavior analysis system and method based on deep learning model
Technical Field
The application relates to the technical field of deep learning, in particular to a user behavior analysis system and method based on a deep learning model.
Background
With the development of the internet, more and more internet platforms are in use, and people publish remarks and express their views on them. Because network platforms are virtual and information spreads easily on them, what people say online can differ greatly from what they say in real life, and people are more likely to post malicious language there. Because network transmission is fast, malicious language spreads quickly, which makes it harder to build a healthy network environment; internet platforms therefore need to monitor what users say.
At present, supervision of internet platforms relies mainly on user reports: a platform administrator analyzes the reported user language and judges whether it is malicious. Relying on manual handling makes the processing of reporting tasks slow, and existing user behavior analysis systems process reporting tasks slowly and cannot accurately identify improper user behavior. In addition, because internet platforms have very large numbers of users, the volume of reports is massive, so platforms struggle to process reported content in time, which hinders the maintenance of a healthy network environment.
Disclosure of Invention
To overcome the above defects in the prior art, the application provides a user behavior analysis system and method based on a deep learning model, which analyzes reporting tasks to obtain their importance, adjusts the processing order of reporting tasks according to that importance, and builds a malicious language model through deep learning, so as to solve the problems described in the Background.
In order to achieve the above purpose, the present application provides the following technical solutions: a user behavior analysis system based on a deep learning model comprises a task acquisition module, a priority calculation module, a user language analysis module, a reporting behavior analysis module and a user management module,
the task acquisition module is used for acquiring reporting tasks in the network platform: a user sends a reporting task request to the management end, the information of the reporting task comprises information about the reporter, information about the reported person, and the text targeted by the report, and the reporting task is transmitted to the priority calculation module;
the priority calculation module is used for obtaining the priority index of each reporting task and sorting the reporting tasks by priority index, and comprises, for each reporter, an activity parameter calculation unit, a quality score calculation unit, an attention degree parameter calculation unit and a task priority index calculation unit;
the user language analysis module is used for identifying whether the reported content is malicious language and evaluating its degree of malice, comprises a speech text preprocessing unit, a malicious speech recognition model and a speech malicious degree evaluation unit, and transmits the user language analysis result to the reporting behavior analysis module;
the reporting behavior analysis module judges, based on the result of the user language analysis module, whether the reporter's reporting behavior is malicious, evaluates the degree of harm of malicious reporting, calculates the malicious reporting behavior score loss value, and calculates the successful reporting behavior score reward value;
the priority index is obtained as follows: given m users who report a user's language, the activity parameter HY_i, quality score SZ_i and attention degree parameter SG_i of each reporter are obtained, normalized, and input into the priority index calculation formula, where λ_1 and λ_2 are preset constants with values in the range [0.1, 1.0], t_i denotes the real time at which the priority is calculated, and t_0 denotes the time of the earliest reporting task request; the priority index of the reporting task is thereby calculated, and reporting tasks with a higher priority index are processed first.
Preferably, the normalization process is one of linear normalization, nonlinear normalization, or zero-mean normalization.
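As an illustration of two of the options named above, the following Python sketch applies linear (min-max) normalization and zero-mean normalization to a list of reporter parameters. The function names and sample values are hypothetical and are not taken from the application; nonlinear normalization is left out because the text does not say which nonlinear mapping is meant.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Linear normalization: rescale the values onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def zero_mean_normalize(values):
    """Zero-mean normalization: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical activity parameters HY_i for three reporters.
activity = [2.0, 4.0, 6.0]
print(min_max_normalize(activity))    # [0.0, 0.5, 1.0]
print(zero_mean_normalize(activity))  # values with mean 0 and unit spread
```

Normalizing HY_i, SQ-style scores and follower counts separately puts them on a comparable scale before they are combined into one index.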
Preferably, the user activity parameter HY_i is calculated by a formula in which ta represents the reporter's online time in the current month, tb represents how long the reporter's account has been in use, and sa represents the number of comments the user has posted in the current month.
Preferably, the user quality score SZ_i is calculated by a formula in which SZ_0 represents the user's initial quality score and YE_i represents the malicious language score loss value; YE_i in turn satisfies a formula in which ey represents the number of malicious utterances by the user in the current month, sa represents the number of comments the user has posted in the current month, and σ_1 represents the degree of malice of the malicious language; XE_i represents the malicious reporting behavior score loss value, CE_i represents the successful reporting behavior score reward value, and the initial values of YE_i, XE_i and CE_i are 0.
Preferably, the user attention degree parameter SG_i is calculated by a formula in which SFen represents the user's number of fans, SDia represents the cumulative endorsements (likes) received by the user, and μ_1 and μ_2 are preset coefficients satisfying 0 ≤ μ_1 ≤ 1, 0 ≤ μ_2 ≤ 1 and μ_1^2 + μ_2^2 = 1.
Preferably, the speech text preprocessing unit is configured to obtain the keywords of the speech text. The keywords are the words that remain after the sentences of the speech text are split into words using regular expressions, stop words are removed, and meaningless words are filtered out; preprocessing includes, but is not limited to, converting letters to lower case and converting emoticons to text, and is not specifically limited by the application.
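The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration for English-like text only: the stop-word list, the emoticon table and the function name `extract_keywords` are hypothetical, and a real platform would need a tokenizer suited to its own language.

```python
import re

# Hypothetical stop-word list and emoticon table, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "you"}
EMOTICONS = {":)": "smile", ":(": "sad"}


def extract_keywords(text):
    """Return the keywords of a piece of speech text."""
    # Convert emoticons to words so they survive tokenization.
    for symbol, word in EMOTICONS.items():
        text = text.replace(symbol, f" {word} ")
    # Convert letters to lower case.
    text = text.lower()
    # Split the sentence into words with a regular expression.
    tokens = re.findall(r"[a-z0-9']+", text)
    # Remove stop words and single-character tokens treated as meaningless.
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


print(extract_keywords("You are an IDIOT :("))  # ['idiot', 'sad']
```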
Preferably, the user language analysis module judges whether language is malicious by means of the malicious speech recognition model, into which the speech text processed by the speech text preprocessing unit is input. The malicious speech recognition model comprises a first channel and a second channel: the first channel acquires spatial features of the speech, and the second channel acquires a vector feature space of the speech text through a first-order Markov chain algorithm. The features extracted by the two channels are fused and concatenated based on a cross-attention mechanism, passed through a fully connected layer, and classified into two classes by a softmax classifier: an output-layer value close to 1 indicates that the language is malicious, and a value close to 0 indicates that it is not, so that malicious and non-malicious language are distinguished. The malicious speech recognition model is built on a deep learning framework: the neural network weight parameters, bias parameters and activation functions of the first channel and the second channel are initialized respectively, a loss function is obtained through forward propagation, the weight parameters and bias parameters are updated based on the loss function, and back propagation is performed with the updated weight parameters and bias parameters, repeating until the accuracy and precision of the model meet the requirements.
Preferably, the speech malicious degree evaluation unit is configured to evaluate the degree of malice σ_1 of malicious language. σ_1 takes values in (0, 1) and satisfies a formula in which Ye is the original degree of malice, μ is the mean of the original degrees of malice, and δ is the standard deviation of the original degrees of malice. The evaluation further uses a formula in which W_Ai(A_j) represents the weight corresponding to the topic A_j to which the malicious language belongs, L represents the cumulative keyword length of the malicious language, L_0 represents the unit keyword length, and M represents the number of reporters. To obtain the topic category of the malicious language, the malicious language is assumed to belong to one of m topics, denoted A = {A1, A2, …, Aj, …, Am}, and the weight coefficient of each topic is denoted W_Ai. The malicious language topic classification model is built on a deep learning framework comprising an input layer, a hidden layer and an output layer, with m neurons in the hidden layer.
Preferably, obtaining the malicious language topic classification model comprises the following steps:
step S01, initializing the model: define the initial parameters of deep learning, namely the weight parameters W_ij between neurons of the neural network, the bias parameters b_i and the activation function f(·); the output satisfies a formula in which E_i represents the input word vector, w_ij represents the connection weight between the i-th neuron and the j-th neuron, and P_ij represents the probability that the malicious language E_i belongs to topic A_j;
step S02, forward propagation: input the malicious language E_i, which comprises n keywords and is denoted E_i = {S_m1, S_m2, …, S_mn}, and output the probability P_ij that the malicious language belongs to topic A_j; the probabilities that the malicious language belongs to each topic are obtained and recorded as the set P = {P_i1, P_i2, …, P_ij, …, P_im}; the topic corresponding to max(P) is taken as the topic inferred by the model, denoted A_i, and the probability of belonging to topic A_i is P_max;
step S03, calculating the loss function: let the actual topic of the malicious language E_i be A_j and let the probability, inferred by the model, that E_i belongs to topic A_j be P_ij; the loss function is calculated from these by the loss formula;
step S04, back propagation: the weight parameters and bias parameters are updated according to the loss value calculated by the loss function and the error is propagated backward; forward and backward propagation are repeated until the loss function meets the threshold requirement, which completes the training of the model and yields the malicious language topic classification model.
Preferably, the updated weight parameter W'_ij satisfies a weight update formula and the updated bias parameter satisfies a bias update formula, in which α is the learning rate of deep learning; α satisfies a formula in which α_0 is the initially set learning rate constant, with a value in the range [0.01, 0.05], and epoch_num is the number of completed rounds of forward and backward propagation.
Preferably, the reporting behavior analysis module comprises a malicious reporting behavior analysis unit and a successful reporting behavior analysis unit. The malicious reporting behavior analysis unit calculates the malicious reporting behavior score loss value: given the reporter's number of reports ZJ, the reporter's number of successful reports CJ and the reporter's activity HY_i, the malicious reporting behavior score loss value satisfies a formula. The successful reporting behavior analysis unit calculates the successful reporting behavior score reward value, which likewise satisfies a formula.
Preferably, the user management module processes the reporting task based on the analysis results of the user language analysis module and the reporting behavior analysis module and updates the user's quality score: the obtained malicious reporting behavior score loss value XE_i, successful reporting behavior score reward value CE_i and malicious language score loss value YE_i are substituted into the calculation formula of the user quality score SZ_i, which completes the update of the user quality score.
In order to achieve the above purpose, the present application provides the following technical solutions: a user behavior analysis method based on a deep learning model comprises the following steps:
step S001, acquiring the reporting task: on the internet platform, a set of users A reports to the management platform that the language of user B is malicious; the reporting action yields the reporters A, the reported person B and the reported target content C, and the number of users in set A is greater than or equal to 1;
step S002, calculating task priority: the activity parameter HY_i, quality score SZ_i and attention degree parameter SG_i of each reporter are obtained, normalized, and input into the priority index calculation formula to obtain the priority index of the reporting task, and the reporting tasks are sorted by priority index;
step S003, user language analysis: based on a deep learning framework, the neural network weight parameters, bias parameters and activation functions of the first channel and the second channel are initialized respectively, a loss function is obtained through forward propagation, the weight parameters and bias parameters are updated based on the loss function, and back propagation is performed with the updated weight parameters and bias parameters to obtain the malicious speech recognition model; the malicious speech recognition model is used to analyze whether the user's language is malicious; the topic of the malicious language is obtained from the malicious language topic classification model, the weight coefficient corresponding to the topic A_j to which the malicious language belongs is obtained, and the degree of malice of the malicious language is calculated;
step S004, reporting behavior analysis: the reporting behavior analysis module judges, based on the malicious language analysis result, whether the reporter's reporting behavior is malicious, evaluates the degree of harm of malicious reporting, calculates the malicious reporting behavior score loss value, and calculates the successful reporting behavior score reward value;
step S005, user management: based on the analysis results of the user language analysis module and the reporting behavior analysis module, the reporting task is processed and the user's quality score is updated.
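The flow of steps S001 to S004 can be outlined as a small Python skeleton. Everything named here is hypothetical: `ReportTask`, `process_reports`, and the two callables `priority_fn` and `is_malicious_fn`, which stand in for the priority index formula and the malicious speech recognition model. Step S005 (updating quality scores) is left out because it depends on formulas not shown in this text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ReportTask:
    reporter_ids: List[str]  # user set A, at least one reporter
    reported_id: str         # reported person B
    content: str             # reported target content C
    priority_index: float = 0.0


def process_reports(
    tasks: List[ReportTask],
    priority_fn: Callable[[ReportTask], float],
    is_malicious_fn: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Order tasks by priority (S002), then classify content (S003) and the report (S004)."""
    for task in tasks:
        task.priority_index = priority_fn(task)
    ordered = sorted(tasks, key=lambda t: t.priority_index, reverse=True)

    results = []
    for task in ordered:
        malicious = is_malicious_fn(task.content)
        # A report on non-malicious content is only a candidate for
        # malicious-report analysis; it is not judged malicious here.
        outcome = "successful_report" if malicious else "unfounded_report"
        results.append((task.reported_id, outcome))
    return results


tasks = [
    ReportTask(["u1"], "b1", "have a nice day"),
    ReportTask(["u2", "u3"], "b2", "you idiot"),
]
results = process_reports(
    tasks,
    priority_fn=lambda t: float(len(t.reporter_ids)),    # toy stand-in
    is_malicious_fn=lambda text: "idiot" in text.lower(),  # toy stand-in
)
print(results)  # [('b2', 'successful_report'), ('b1', 'unfounded_report')]
```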
The application has the technical effects and advantages that:
By analyzing reporting tasks, the application obtains their importance and adjusts the order in which they are processed. The task acquisition module and the priority calculation module improve the efficiency with which the platform management end processes reporting tasks: the more important reporting tasks are identified by calculating task priority, and limited computing resources are used to process high-priority reporting tasks first, which helps maintain a healthy network environment. A user language model is built through deep learning to judge whether a reporting task involves malicious language, and the reporting behavior analysis module determines the harm and effect of the user's reporting behavior. The user's malicious language or malicious reporting behavior is identified from the analysis of the reporting task, and the user's quality score is updated with the analysis result. This addresses the problems raised in the Background: existing user behavior analysis systems process reporting tasks slowly and cannot accurately identify improper user behavior, and platforms struggle to process reporting tasks in time, which hinders the maintenance of a healthy network environment.
Drawings
Fig. 1 is a block diagram of the overall structure of the system of the present application.
Fig. 2 is a flowchart for constructing a malicious language topic classification model according to the present application.
Fig. 3 is a flow chart of the method of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.
The terms "module," "system," and the like as used herein are intended to include a computer-related entity, such as, but not limited to, hardware, firmware, a combination of hardware and software, or software in execution. For example, a module may be, but is not limited to: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device can be a module. One or more modules may be located in one process and/or thread of execution, and one module may be located on one computer and/or distributed between two or more computers.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
Example 1
The application provides a user behavior analysis system based on a deep learning model as shown in figure 1, which comprises a task acquisition module, a priority calculation module, a user language analysis module, a reporting behavior analysis module and a user management module,
the task acquisition module is used for acquiring reporting tasks in the network platform: a user sends a reporting task request to the management end, the information of the reporting task comprises information about the reporter, information about the reported person, and the text targeted by the report, and the reporting task is transmitted to the priority calculation module;
the priority calculation module is used for obtaining the priority index of each reporting task and sorting the reporting tasks by priority index, and comprises an activity parameter calculation unit, a quality score calculation unit, an attention degree parameter calculation unit and a task priority index calculation unit;
the user language analysis module is used for identifying whether the reported content is malicious language and evaluating its degree of malice, comprises a speech text preprocessing unit, a malicious speech recognition model and a speech malicious degree evaluation unit, and transmits the user language analysis result to the reporting behavior analysis module and the user management module;
the reporting behavior analysis module judges, based on the result of the user language analysis module, whether the reporter's reporting behavior is malicious, evaluates the degree of harm of malicious reporting, calculates the malicious reporting behavior score loss value and the successful reporting behavior score reward value, and transmits the reporting behavior analysis result to the user management module;
and the user management module processes the reporting task based on the analysis results of the user language analysis module and the reporting behavior analysis module and updates the quality score of the user.
Further, the priority index is obtained as follows: given m users who report a user's language, the activity parameter HY_i, quality score SZ_i and attention degree parameter SG_i of each reporter are obtained, normalized, and input into the priority index calculation formula, where λ_1 and λ_2 are preset constants with values in the range [0.1, 1.0], t_i denotes the real time at which the priority is calculated, and t_0 denotes the time of the earliest reporting task request; the priority index of the reporting task is thereby calculated, and reporting tasks with a higher priority index are processed first.
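Processing the highest-priority reporting task first maps naturally onto a priority queue. The Python sketch below assumes the priority index has already been computed elsewhere (the index formula itself is not reproduced in this text); the class names, task identifiers and index values are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedReport:
    sort_key: float                     # negated priority index
    task_id: str = field(compare=False)


class ReportQueue:
    """Hands out reporting tasks in descending order of priority index."""

    def __init__(self):
        self._heap = []

    def push(self, task_id, priority_index):
        # heapq is a min-heap, so the index is negated to pop the largest first.
        heapq.heappush(self._heap, QueuedReport(-priority_index, task_id))

    def pop(self):
        return heapq.heappop(self._heap).task_id

    def __len__(self):
        return len(self._heap)


queue = ReportQueue()
queue.push("report-17", 0.42)
queue.push("report-03", 0.91)
queue.push("report-08", 0.10)

order = [queue.pop() for _ in range(len(queue))]
print(order)  # ['report-03', 'report-17', 'report-08']
```

A heap keeps insertion and removal at O(log n), which suits a platform where new reports arrive continuously while older ones are still waiting.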
Further, the normalization process is one of linear normalization, nonlinear normalization or zero-mean normalization; in this embodiment of the application, zero-mean normalization is adopted.
Further, the user activity parameter HY_i is calculated by a formula in which ta represents the reporter's online time in the current month, tb represents how long the reporter's account has been in use, and sa represents the number of comments the user has posted in the current month.
Further, the user quality score SZ_i is calculated by a formula in which SZ_0 represents the user's initial quality score and YE_i represents the malicious language score loss value; YE_i in turn satisfies a formula in which ey represents the number of malicious utterances by the user in the current month, sa represents the number of comments the user has posted in the current month, and σ_1 represents the degree of malice of the malicious language; XE_i represents the malicious reporting behavior score loss value, CE_i represents the successful reporting behavior score reward value, and the initial values of YE_i, XE_i and CE_i are 0.
Further, the user attention degree parameter SG_i is calculated by a formula in which SFen represents the user's number of fans, SDia represents the cumulative endorsements (likes) received by the user, and μ_1 and μ_2 are preset coefficients satisfying 0 ≤ μ_1 ≤ 1, 0 ≤ μ_2 ≤ 1 and μ_1^2 + μ_2^2 = 1.
Further, the speech text preprocessing unit is configured to obtain the keywords of the speech text. The keywords are the words that remain after the sentences of the speech text are split into words using regular expressions, stop words are removed and meaningless words are filtered out; the operations include converting letters to lower case and converting emoticons to text.
Further, the user language analysis module judges whether language is malicious by means of the malicious speech recognition model, into which the speech text processed by the speech text preprocessing unit is input. The malicious speech recognition model comprises a first channel and a second channel: the first channel acquires spatial features of the speech, and the second channel acquires a vector feature space of the speech text through a first-order Markov chain algorithm. The features extracted by the two channels are fused and concatenated based on a cross-attention mechanism, passed through a fully connected layer, and classified into two classes by a softmax classifier: an output-layer value close to 1 indicates that the language is malicious, and a value close to 0 indicates that it is not, so that malicious and non-malicious language are distinguished. The malicious speech recognition model is built on a deep learning framework: the neural network weight parameters, bias parameters and activation functions of the first channel and the second channel are initialized respectively, a loss function is obtained through forward propagation, the weight parameters and bias parameters are updated based on the loss function, and back propagation is performed with the updated weight parameters and bias parameters, repeating until the accuracy and precision of the model meet the requirements.
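The text does not spell out how the second channel turns a first-order Markov chain into features. One common reading, shown here purely as an assumption, is to estimate the probability of each keyword given the keyword immediately before it; the function name and the sample tokens are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(tokens):
    """Estimate first-order transition probabilities P(next | current) from a token sequence."""
    pair_counts = defaultdict(Counter)
    # A first-order chain only looks at adjacent pairs of tokens.
    for current, nxt in zip(tokens, tokens[1:]):
        pair_counts[current][nxt] += 1

    probs = {}
    for current, counter in pair_counts.items():
        total = sum(counter.values())
        probs[current] = {nxt: count / total for nxt, count in counter.items()}
    return probs


tokens = ["you", "are", "bad", "you", "are", "awful"]
probs = transition_probabilities(tokens)
print(probs["are"])  # {'bad': 0.5, 'awful': 0.5}
```

Each row of the resulting table sums to 1, so it can be flattened into a fixed-order vector over a vocabulary before being passed to a downstream network.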
Further, the speech malicious degree evaluation unit is configured to evaluate the degree of malice σ_1 of malicious language. σ_1 takes values in (0, 1) and satisfies a formula in which Ye is the original degree of malice, μ is the mean of the original degrees of malice, and δ is the standard deviation of the original degrees of malice. The evaluation further uses a formula in which W_Ai(A_j) represents the weight corresponding to the topic A_j to which the malicious language belongs, L represents the cumulative keyword length of the malicious language, L_0 represents the unit keyword length, and M represents the number of reporters. To obtain the topic category of the malicious language, the malicious language is assumed to belong to one of m topics, denoted A = {A1, A2, …, Aj, …, Am}, and the weight coefficient of each topic is denoted W_Ai.
Further, the malicious degree evaluation unit includes a malicious language topic classification model built on a deep learning framework comprising an input layer, a hidden layer and an output layer, with m neurons in the hidden layer; as shown in fig. 2, building the model comprises the following steps:
step S01, initializing the model: define the initial parameters of deep learning, and use P_ij to represent the probability that the malicious language E_i belongs to topic A_j;
step S02, forward propagation: input the malicious language E_i and output the probability P_ij that it belongs to topic A_j; the probabilities that the malicious language belongs to each topic are obtained and recorded as the set P; the topic corresponding to max(P) is taken as the topic inferred by the model, denoted A_i, and the probability of belonging to topic A_i is P_max;
step S03, calculating the loss function: let the actual topic of the malicious language E_i be A_j, that is, the manually labeled topic to which the malicious language belongs, and let the probability, inferred by the model, that E_i belongs to topic A_j be P_ij; the probability P_max obtained by the model and the probability P_ij are input into the loss function;
step S04, back propagation: the weight parameters and bias parameters are updated according to the loss value calculated by the loss function and the error is propagated backward; forward and backward propagation are repeated until the loss function meets the threshold requirement, which completes the training of the model and yields the malicious language topic classification model.
Further, in step S01, a weight parameter W_ij between the neural network layers, a bias parameter b_i and an activation function f(·) are defined, and the output satisfies the formula, where E_i represents the input word vector, w_ij represents the connection weight between the i-th neuron and the j-th neuron, and P_ij represents the probability that malicious speech E_i belongs to topic A_j.
Further, in step S02, the topic A_j refers to the manually labeled topic to which the malicious speech belongs.
Further, in step S03, the loss function is
Further, the updated weight parameter W'_ij satisfies the formula, and the updated bias parameter satisfies the formula, where α is the learning rate of deep learning and satisfies the formula, where α_0 is the initially set learning rate constant with a value in [0.01, 0.05], and epoch_num is the number of completed forward and backward propagation passes.
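Steps S01 to S04 can be illustrated with a minimal training sketch. Several parts are assumptions rather than the method of the application: the loss is taken to be cross-entropy, the learning rate is decayed as alpha0 / (1 + decay * epoch_num), and the hidden layer is omitted so that the model reduces to a single softmax layer. The function names and the toy data are hypothetical.

```python
import math

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(w, b, x):
    # S02: probability P_ij of input x belonging to each topic j.
    return softmax([sum(wk * xk for wk, xk in zip(w[j], x)) + b[j]
                    for j in range(len(w))])

def train(samples, labels, num_topics, alpha0=0.05, decay=0.001,
          loss_threshold=0.1, max_epochs=5000):
    dim = len(samples[0])
    # S01: initialise weight and bias parameters (zeros, for reproducibility).
    w = [[0.0] * dim for _ in range(num_topics)]
    b = [0.0] * num_topics
    mean_loss = float("inf")
    for epoch_num in range(max_epochs):
        alpha = alpha0 / (1.0 + decay * epoch_num)  # assumed decay schedule
        total_loss = 0.0
        for x, y in zip(samples, labels):
            probs = forward(w, b, x)
            # S03: assumed cross-entropy loss on the labelled topic.
            total_loss += -math.log(probs[y])
            # S04: gradient of cross-entropy w.r.t. each logit is p_j - 1[j == y].
            for j in range(num_topics):
                grad = probs[j] - (1.0 if j == y else 0.0)
                for k in range(dim):
                    w[j][k] -= alpha * grad * x[k]
                b[j] -= alpha * grad
        mean_loss = total_loss / len(samples)
        if mean_loss < loss_threshold:  # stop once the loss meets the threshold
            break
    return w, b, mean_loss

def predict(w, b, x):
    # Topic with the largest probability, together with that probability (Pmax).
    probs = forward(w, b, x)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
           [1.0, 0.1, 0.0], [0.0, 1.0, 0.1]]
labels = [0, 1, 2, 0, 1]
w, b, final_loss = train(samples, labels, num_topics=3)
predictions = [predict(w, b, x)[0] for x in samples]
```

The default alpha0 of 0.05 sits at the upper end of the stated [0.01, 0.05] range; the decay constant has no counterpart in the text and is chosen only so that the toy problem converges.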
Further, the reporting behavior analysis module comprises a malicious reporting behavior analysis unit and a successful reporting behavior analysis unit. The malicious reporting behavior analysis unit calculates the malicious reporting behavior score loss value: given the number of reports ZJ made by the reporter, the number of successful reports CJ of the reporter, and the activity HY_i of the reporter, the malicious reporting behavior score loss value satisfies the formula. The successful reporting behavior analysis unit calculates the successful reporting behavior score reward value, which satisfies the formula
Further, the user management module determines from the malicious speech recognition model whether the reported content is malicious speech. If so, it performs deletion and hiding operations on the malicious speech, reduces the quality score of the reported person, and increases the quality score of the reporter. If not, it evaluates whether the reporting behavior of the reporter is malicious and reduces the quality score of the reporter according to the malicious reporting behavior. The obtained malicious reporting behavior score loss value XE_i, successful reporting behavior score reward value CE_i and malicious speech score loss value YE_i are substituted into the calculation formula of the user quality score SZ_i to complete the update of the user quality score.
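The branching performed by the user management module can be sketched as follows. The calculation formula of SZ_i is not reproduced in the text, so the plain addition and subtraction of the loss and reward values below is an assumption, and the class, function and action names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class User:
    user_id: str
    quality_score: float  # plays the role of SZ_i

def handle_report(reporter, reported, speech_is_malicious,
                  report_is_malicious, ye_loss, xe_loss, ce_reward):
    # Returns the action taken on the reporting task.
    if speech_is_malicious:
        # Malicious speech: hide it, penalise the reported person,
        # and reward the reporter for a successful report.
        reported.quality_score -= ye_loss
        reporter.quality_score += ce_reward
        return "hide_speech"
    if report_is_malicious:
        # The speech was acceptable and the report itself was malicious.
        reporter.quality_score -= xe_loss
        return "penalize_reporter"
    # The speech was acceptable and the report was made in good faith.
    return "dismiss_report"

reporter = User("A", 100.0)
reported = User("B", 100.0)
action = handle_report(reporter, reported, True, False,
                       ye_loss=5.0, xe_loss=3.0, ce_reward=2.0)
```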
Example 2
As shown in fig. 3, the present application provides a user behavior analysis method based on a deep learning model, which includes the following steps:
step S001, acquiring a reporting task: on the internet platform, a user set A reports to the management platform that the speech of a user B is malicious; the reporting action yields a reporter A, a reported person B and reported content C, and the user set A contains at least one user;
step S002, calculating task priority: the activity parameter HY_i, quality score SZ_i and attention degree parameter SG_i of each reporter are acquired, normalized, and input into the priority index calculation formula to obtain the priority index of the reporting task, and the reporting tasks are sorted according to their priority indexes;
step S003, user speech analysis: the malicious speech recognition model analyzes whether the user speech is malicious, the malicious speech topic classification model obtains the topic of the malicious speech, the weight coefficient corresponding to the topic A_j to which the malicious speech belongs is obtained, and the malicious degree of the malicious speech is calculated;
step S004, reporting behavior analysis: the reporting behavior analysis module judges whether the reporting behavior of the reporter is malicious based on the malicious speech analysis result, evaluates the degree of harm of the malicious report, calculates the malicious reporting behavior score loss value, and calculates the successful reporting behavior score reward value;
step S005, user management: based on the analysis results of the user speech analysis module and the reporting behavior analysis module, the reporting tasks are processed and the quality scores of the users are updated.
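The ordering in step S002 can be sketched under simplifying assumptions. The priority index calculation formula is not reproduced in the text, so the sketch uses min-max normalization and an unweighted mean of the three normalized parameters as a hypothetical stand-in; it also assumes one aggregated HY, SZ and SG value per reporting task. The field names and sample values are illustrative only.

```python
def min_max(values):
    # Scale values into [0, 1]; a constant column maps to all zeros.
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def rank_tasks(tasks):
    # Normalize each parameter across tasks, then combine them.
    hy = min_max([t["hy"] for t in tasks])
    sz = min_max([t["sz"] for t in tasks])
    sg = min_max([t["sg"] for t in tasks])
    scored = [((h + s + g) / 3.0, t["task_id"])  # hypothetical priority index
              for h, s, g, t in zip(hy, sz, sg, tasks)]
    # Tasks with a higher priority index are processed first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [task_id for _, task_id in scored]

tasks = [
    {"task_id": "t1", "hy": 10.0, "sz": 80.0, "sg": 5.0},
    {"task_id": "t2", "hy": 30.0, "sz": 90.0, "sg": 50.0},
    {"task_id": "t3", "hy": 20.0, "sz": 60.0, "sg": 20.0},
]
order = rank_tasks(tasks)
```

A fuller version would also account for how long each task has been waiting, which this sketch leaves out.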
Further, in step S003, the neural network weight parameters, bias parameters and activation functions of the first channel and the second channel are initialized respectively based on the deep learning framework, the loss function is obtained by forward propagation, the weight parameters and bias parameters are updated based on the loss function, and backward propagation is performed based on the updated weight parameters and bias parameters, so as to obtain the malicious speech recognition model.
Finally, the foregoing describes only preferred embodiments of the application and is not intended to limit the application; any modifications, equivalents and alternatives falling within the spirit and principles of the application are intended to be included within its scope.

Claims (8)

1. A user behavior analysis system based on a deep learning model, characterized in that it comprises a task acquisition module, a priority calculation module, a user speech analysis module and a reporting behavior analysis module;
the task acquisition module is used for acquiring a reporting task in the network platform, wherein the information of the reporting task comprises the information of the reporter, the information of the reported person and the reported speech text;
the priority calculation module is used for acquiring the priority indexes of the reporting tasks and sorting the reporting tasks according to their priority indexes;
the user speech analysis module is used for identifying whether the reported content is malicious speech and evaluating its malicious degree, and comprises a speech malicious degree evaluation unit; the speech malicious degree evaluation unit is used for evaluating the malicious degree σ1 of the malicious speech, which satisfies the formula, where Ye is the original malicious degree, μ is the mean of the original malicious degrees, and δ is the standard deviation of the original malicious degrees; the value of σ1 lies in (0, 1) and satisfies the formula, where W_ai(A_j) represents the weight corresponding to the topic A_j to which the malicious speech belongs, L represents the accumulated keyword length of the malicious speech, L_0 represents the unit keyword length, and m represents the number of reporters; the topic category of the malicious speech is obtained;
the reporting behavior analysis module judges whether the reporting behavior of the reporter is malicious based on the malicious speech analysis result, evaluates the degree of harm of the malicious report, calculates the malicious reporting behavior score loss value, and calculates the successful reporting behavior score reward value;
the priority index is obtained as follows: given that m users report a user's speech, the activity parameter HY_i, quality score SZ_i and attention degree parameter SG_i of each reporter are acquired, normalized, and input into the priority index calculation formula, where λ1 and λ2 are preset constants with values in [0.1, 1.0], t_i represents the real time at which the priority is calculated, and t_0 is the earliest time of the reporting task request; the priority indexes of the reporting tasks are calculated, and reporting tasks with a high priority index are processed preferentially;
the activity parameter HY_i of the reporter satisfies the formula, where ta represents the online time of the reporter in the current month, tb represents the account usage time of the reporter, and sa represents the number of comments posted by the user in the current month;
the user quality score SZ_i satisfies the formula, where SZ_0 represents the initial quality score of the user, YE_i represents the malicious speech score loss value and satisfies the formula, where ey represents the number of malicious speeches of the user in the current month, sa represents the number of comments posted by the user in the current month, and σ1 represents the malicious degree of the malicious speech; XE_i represents the malicious reporting behavior score loss value, CE_i represents the successful reporting behavior score reward value, and the initial values of YE_i, XE_i and CE_i are 0;
the attention degree parameter SG_i of the reporter satisfies the formula, where SFen represents the number of fans of the user, SDia represents the cumulative endorsements received by the user, and μ1 and μ2 are preset coefficients with 0 ≤ μ1 ≤ 1, 0 ≤ μ2 ≤ 1 and μ1² + μ2² = 1.
2. A deep learning model based user behavior analysis system as claimed in claim 1, wherein: the user speech analysis module judges whether a speech is malicious through a malicious speech recognition model, and comprises a speech text preprocessing unit, the malicious speech recognition model and the speech malicious degree evaluation unit; the speech text processed by the speech text preprocessing unit is input into the malicious speech recognition model; the malicious speech recognition model comprises a first channel and a second channel, the first channel being used for acquiring spatial features of the speech and the second channel being used for acquiring the vector feature space of the speech text through a first-order Markov chain algorithm; the features extracted by the first channel and the second channel are fused and spliced based on a cross attention mechanism, passed through a fully connected layer, and subjected to binary classification by a softmax classifier; when the value of the output layer is close to 1, the speech is malicious, and when the value is close to 0, the speech is not malicious, whereby malicious and non-malicious speech are obtained; the malicious speech recognition model is based on a deep learning framework, in which the neural network weight parameters, bias parameters and activation functions of the first channel and the second channel are initialized respectively, the loss function is obtained through forward propagation, the weight parameters and bias parameters are updated based on the loss function, and backward propagation is performed based on the updated weight parameters and bias parameters until the model meets the accuracy requirement.
3. A deep learning model based user behavior analysis system as claimed in claim 1, wherein: the malicious speech is assumed to belong to one of m topics, the topics are denoted A = {A_1, A_2, …, A_j, …, A_m}, and the weight coefficient of each topic is denoted W_ai; the speech malicious degree evaluation unit comprises a malicious speech topic classification model built on a deep learning framework, the deep learning framework comprises an input layer, a hidden layer and an output layer, and the hidden layer includes m neurons.
4. A deep learning model based user behavior analysis system according to claim 3, wherein: the malicious speech topic classification model is obtained through the following steps:
step S01, model initialization: the initial parameters of deep learning are defined, including a weight parameter W_ij between the neural network layers, a bias parameter b_i and an activation function f(·), and the output satisfies the formula, where E_i represents the input word vector, w_ij represents the connection weight between the i-th neuron and the j-th neuron, and P_ij represents the probability that malicious speech E_i belongs to topic A_j;
step S02, forward propagation: the malicious speech E_i is input, the malicious speech comprising n keywords and being denoted E_i = {S_m1, S_m2, …, S_mn}; the probability P_ij that the malicious speech belongs to topic A_j is output, and the probabilities for all topics are recorded as a set P = {P_i1, P_i2, …, P_ij, …, P_im}; the topic corresponding to max(P) is taken as the topic A_i inferred by the model, and the probability of belonging to topic A_i is P_max;
step S03, loss function calculation: the actual topic of malicious speech E_i is set as A_j, the probability inferred by the model that E_i belongs to topic A_j is P_ij, and the loss function is calculated by the formula;
step S04, back propagation: the weight parameters and bias parameters are updated according to the loss value computed by the loss function, the error is propagated backward, and the update is repeated until the loss function meets the threshold requirement, which completes training and yields the malicious speech topic classification model.
5. A deep learning model based user behavior analysis system according to claim 4, wherein: the updated weight parameter W'_ij satisfies the formula, and the updated bias parameter satisfies the formula, where α is the learning rate of deep learning and satisfies the formula, where α_0 is the initially set learning rate constant with a value in [0.01, 0.05], and epoch_num is the number of completed forward and backward propagation passes.
6. A deep learning model based user behavior analysis system as claimed in claim 1, wherein: the reporting behavior analysis module comprises a malicious reporting behavior analysis unit and a successful reporting behavior analysis unit; the malicious reporting behavior analysis unit calculates the malicious reporting behavior score loss value: given the number of reports ZJ made by the reporter, the number of successful reports CJ of the reporter, and the activity HY_i of the reporter, the malicious reporting behavior score loss value satisfies the formula; the successful reporting behavior analysis unit calculates the successful reporting behavior score reward value, which satisfies the formula
7. The deep learning model based user behavior analysis system of claim 6, wherein: the system comprises a user management module, which processes the reporting tasks based on the analysis results of the user speech analysis module and the reporting behavior analysis module and updates the quality scores of the users; the obtained malicious reporting behavior score loss value XE_i, successful reporting behavior score reward value CE_i and malicious speech score loss value YE_i are substituted into the calculation formula of the user quality score SZ_i to complete the update of the user quality score.
8. A user behavior analysis method based on a deep learning model, implemented with the user behavior analysis system based on a deep learning model as set forth in any one of claims 1 to 7, characterized in that it comprises the following steps:
step S001, acquiring a reporting task: on the internet platform, a user set A reports to the management platform that the speech of a user B is malicious; the reporting action yields a reporter A, a reported person B and reported content C, and the user set A contains at least one user;
step S002, calculating task priority: the activity parameter HY_i, quality score SZ_i and attention degree parameter SG_i of each reporter are acquired, normalized, and input into the priority index calculation formula to obtain the priority index of the reporting task, and the reporting tasks are sorted according to their priority indexes,
the activity parameter HY_i of the reporter satisfies the formula, where ta represents the online time of the reporter in the current month, tb represents the account usage time of the reporter, and sa represents the number of comments posted by the user in the current month;
the user quality score SZ_i satisfies the formula, where SZ_0 represents the initial quality score of the user, YE_i represents the malicious speech score loss value and satisfies the formula, where ey represents the number of malicious speeches of the user in the current month, sa represents the number of comments posted by the user in the current month, and σ1 represents the malicious degree of the malicious speech; XE_i represents the malicious reporting behavior score loss value, CE_i represents the successful reporting behavior score reward value, and the initial values of YE_i, XE_i and CE_i are 0;
the attention degree parameter SG_i of the reporter satisfies the formula, where SFen represents the number of fans of the user, SDia represents the cumulative endorsements received by the user, and μ1 and μ2 are preset coefficients with 0 ≤ μ1 ≤ 1, 0 ≤ μ2 ≤ 1 and μ1² + μ2² = 1;
step S003, user speech analysis: the neural network weight parameters, bias parameters and activation functions of the first channel and the second channel are initialized respectively based on the deep learning framework, the loss function is obtained through forward propagation, the weight parameters and bias parameters are updated based on the loss function, and backward propagation is performed based on the updated weight parameters and bias parameters to obtain the malicious speech recognition model; the malicious speech recognition model analyzes whether the user speech is malicious, the malicious speech topic classification model obtains the topic of the malicious speech, the weight coefficient corresponding to the topic A_j to which the malicious speech belongs is obtained, and the malicious degree of the malicious speech is calculated; the malicious degree σ1 of the malicious speech satisfies the formula, where Ye is the original malicious degree, μ is the mean of the original malicious degrees, and δ is the standard deviation of the original malicious degrees; the value of σ1 lies in (0, 1) and satisfies the formula, where W_ai(A_j) represents the weight corresponding to the topic A_j to which the malicious speech belongs, L represents the accumulated keyword length of the malicious speech, L_0 represents the unit keyword length, and m represents the number of reporters;
step S004, reporting behavior analysis: the reporting behavior analysis module judges whether the reporting behavior of the reporter is malicious based on the malicious speech analysis result, evaluates the degree of harm of the malicious report, calculates the malicious reporting behavior score loss value, and calculates the successful reporting behavior score reward value;
step S005, user management: based on the analysis results of the user speech analysis module and the reporting behavior analysis module, the reporting tasks are processed and the quality scores of the users are updated.
CN202310961231.8A 2023-08-02 2023-08-02 User behavior analysis system and method based on deep learning model Active CN116662769B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310961231.8A CN116662769B (en) 2023-08-02 2023-08-02 User behavior analysis system and method based on deep learning model

Publications (2)

Publication Number Publication Date
CN116662769A CN116662769A (en) 2023-08-29
CN116662769B true CN116662769B (en) 2023-10-13

Family

ID=87724688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310961231.8A Active CN116662769B (en) 2023-08-02 2023-08-02 User behavior analysis system and method based on deep learning model

Country Status (1)

Country Link
CN (1) CN116662769B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678331A (en) * 2012-09-05 2014-03-26 阿里巴巴集团控股有限公司 Reported message processing method and device
CN105681257A (en) * 2014-11-19 2016-06-15 腾讯科技(深圳)有限公司 Information reporting method and system based on instant messaging interactive platform
CN105704005A (en) * 2014-11-28 2016-06-22 深圳市腾讯计算机系统有限公司 Malicious user reporting method and device, and reporting information processing method and device
CN106157119A (en) * 2016-07-11 2016-11-23 广东聚联电子商务股份有限公司 A kind of method that e-commerce purchases system platform report processes
KR20180116560A (en) * 2017-04-17 2018-10-25 이세진 Monitoring method about media reporting current issues and the system for the same
CN115840844A (en) * 2022-12-17 2023-03-24 深圳市新联鑫网络科技有限公司 Internet platform user behavior analysis system based on big data
CN116244441A (en) * 2023-03-16 2023-06-09 四川大学 Social network offensiveness language detection method based on multitasking learning

Also Published As

Publication number Publication date
CN116662769A (en) 2023-08-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant