CN113822684A - Black product user recognition model training method and device, electronic equipment and storage medium - Google Patents

Black product user recognition model training method and device, electronic equipment and storage medium

Info

Publication number
CN113822684A
CN113822684A
Authority
CN
China
Prior art keywords
model
loss
user
submodel
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111145600.3A
Other languages
Chinese (zh)
Other versions
CN113822684B (en)
Inventor
张徵
秦超
陈柏宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202111145600.3A priority Critical patent/CN113822684B/en
Publication of CN113822684A publication Critical patent/CN113822684A/en
Application granted granted Critical
Publication of CN113822684B publication Critical patent/CN113822684B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a black product user recognition model training method and device, an electronic device and a storage medium. The method includes: training a first basic model with a first sample set until a first constraint condition is met, to obtain a first recognition model; determining a second basic model based on a second sub-model in the first recognition model; and training the second basic model with a second sample set to obtain a second recognition model. Because the second sub-model has been jointly trained with the first sub-model, and the second basic model is derived from the second sub-model, training can be completed with only a small number of user behavior feature sequences labeled with label data in the second sample set, which reduces the influence that the number of positive samples available for training has on the accuracy of the user recognition model.

Description

Black product user recognition model training method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a device for training a black product user recognition model, electronic equipment and a storage medium.
Background
With the development of internet technology, the internet services provided by internet service providers keep increasing. However, in actual internet service scenarios, some people carry out abnormal activities based on these services, for example: stealing normal users' information, maliciously inflating likes, comments and orders, and publishing illegal transaction information or fraud messages. A person who carries out such abnormal activities is called an abnormal user or a black product user. Internet service providers need to continuously identify these abnormal users in order to ensure the account security of normal users and the normal operation of internet services.
In the related art, a trained user recognition model analyzes a user's behavior features to identify abnormal users with abnormal behaviors. The training process of the user recognition model is as follows: the user behavior features of abnormal users are manually selected and labeled with abnormal-user labels as positive samples, the user behavior features of normal users are manually selected as negative samples, and the user recognition model is trained with the positive and negative samples to obtain a trained user recognition model.
However, the inventors found in research that this training method requires manually selecting a large number of abnormal users' behavior features as positive samples. In actual scenarios, the number of abnormal users is far smaller than the number of normal users, so a sufficient number of abnormal-user behavior features cannot be selected as positive samples; moreover, the workload of manually selecting abnormal users' behavior features is large, which further limits the number of such positive samples. As a result, the number of positive samples available for training the user recognition model is small, which ultimately affects the accuracy of the model.
Disclosure of Invention
The embodiment of the invention aims to provide a black product user recognition model training method and device, an electronic device and a storage medium, so as to reduce the influence that the number of positive samples available for training the user recognition model has on the accuracy of the model. The specific technical scheme is as follows:
in a first aspect of the present invention, a method for training a black product user recognition model is provided, where the method includes:
training a first basic model with a first sample set until a first constraint condition is met, to obtain a first recognition model, where the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content; the first basic model includes a first sub-model and a second sub-model, the first sub-model is used for analyzing the text feature sequence to obtain a first spam-content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second spam-content prediction result; the first constraint condition is associated with a first loss, and the first loss includes a second loss, a third loss and a fourth loss, where the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is the feature loss between the first sub-model and the second sub-model;
determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model with a second sample set to obtain a second recognition model, where the second recognition model is used for recognizing, based on a user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black product user.
In a second aspect of the present invention, there is also provided a black product user identification method, where the method includes:
acquiring a user behavior feature sequence of a user to be identified;
inputting the user behavior feature sequence of the user to be identified into a trained second recognition model to obtain a prediction result for the user to be identified, where the trained second recognition model is obtained through any of the black product user recognition model training methods described above;
and determining, based on the prediction result for the user to be identified, whether the user to be identified is a black product user.
In a third aspect of the present invention, there is also provided a training apparatus for a black product user recognition model, the training apparatus including:
the first training module is used for training a first basic model with a first sample set until a first constraint condition is met, to obtain a first recognition model, where the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content; the first basic model includes a first sub-model and a second sub-model, the first sub-model is used for analyzing the text feature sequence to obtain a first spam-content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second spam-content prediction result; the first constraint condition is associated with a first loss, and the first loss includes a second loss, a third loss and a fourth loss, where the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is the feature loss between the first sub-model and the second sub-model;
the second training module is used for taking the second sub-model in the first recognition model as a second basic model, and training the second basic model with a second sample set to obtain a second recognition model, where the second recognition model is used for recognizing, based on a user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black product user.
In a fourth aspect of the present invention, there is also provided a black product user identification apparatus, which includes:
the acquisition module is used for acquiring a user behavior characteristic sequence of a user to be identified;
the recognition module is used for inputting the user behavior feature sequence of the user to be identified into a trained second recognition model to obtain a prediction result for the user to be identified, where the trained second recognition model is obtained through any of the black product user recognition model training apparatuses described above;
and the determining module is used for determining, based on the prediction result for the user to be identified, whether the user to be identified is a black product user.
In a fifth aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface and the memory communicate with each other via the communication bus;
a memory for storing a computer program;
a processor for implementing the steps of any of the methods described herein when executing the program stored in the memory.
In a sixth aspect of the present invention, there is also provided a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of any of the methods described herein.
In a seventh aspect of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the steps of any of the methods described herein.
The embodiment of the invention provides a black product user recognition model training method and device, an electronic device and a storage medium. The method includes: training a first basic model with a first sample set until a first constraint condition is met, to obtain a first recognition model; and determining a second basic model based on the second sub-model in the first recognition model, and training the second basic model with a second sample set to obtain a second recognition model. The first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content. The first basic model includes a first sub-model and a second sub-model, where the first sub-model is used for analyzing the text feature sequence to obtain a first spam-content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second spam-content prediction result. The first constraint condition is associated with a first loss, and the first loss includes a second loss, a third loss and a fourth loss, where the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is the feature loss between the first sub-model and the second sub-model. The second recognition model is used for recognizing, based on a user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black product user.
As can be seen, the embodiment of the invention adopts a joint training mode: the first sub-model and the second sub-model in the first basic model are jointly trained with the first sample set to obtain the second loss of the first sub-model, the third loss of the second sub-model, and the fourth loss between the two sub-models, and the first loss, which includes the second, third and fourth losses, is used to adjust the training parameters of the first basic model, thereby obtaining the parameters of the first recognition model. A second basic model is then obtained from the second sub-model in the first recognition model and trained with the second sample set. Because the second sub-model has already been jointly trained with the first sub-model, and the second basic model is derived from the second sub-model, training can be completed with only a small number of user behavior feature sequences labeled with label data in the second sample set. This reduces the influence that the number of positive samples available for training has on the accuracy of the user recognition model, and the accuracy of the second recognition model can be improved even when only a small number of labeled user behavior feature sequences are used. Furthermore, because the workload of labeling label data for text feature sequences is far smaller than that of labeling label data for user behavior feature sequences, the workload of manually selecting and labeling abnormal users' behavior features can be reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flowchart of a first implementation of a method for training a black product user recognition model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a second implementation manner of a black product user recognition model training method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a network structure according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a third implementation manner of a method for training a black product user recognition model according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for identifying a black product user according to an embodiment of the present invention;
FIG. 6 is an overall frame diagram in an embodiment of the present invention;
FIG. 7 is a schematic flow chart of a fourth implementation of a method for training a black product user recognition model according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a training apparatus for a black product user recognition model according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of a black product user identification device according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
In an actual internet service scenario, some people carry out abnormal activities based on the internet services provided by internet service providers, such as: stealing normal users' information, maliciously inflating likes and comments on a video comment page, or publishing illegal transaction information or fraud messages through a bullet-screen function. People who carry out such abnormal activities are called abnormal users or black product users. Internet service providers need to continuously identify these abnormal users with a recognition model to ensure the account security of normal users and the normal operation of internet services.
However, the inventors found that, in the related art, a large number of abnormal users' behavior features usually need to be manually selected for training the recognition model. Since the number of abnormal users is far smaller than that of normal users, a sufficient number of abnormal-user behavior features cannot be selected as positive samples; moreover, the workload of manually selecting them is large, which further limits the number of such positive samples. As a result, the number of positive samples available for training the user recognition model is small, which ultimately affects the accuracy of the model.
In order to solve the above problems, embodiments of the present invention provide a training method and apparatus for a black product user recognition model, an electronic device, and a storage medium, so as to reduce workload of manually selecting and labeling user behavior features of an abnormal user.
Next, a black product user recognition model training method provided by an embodiment of the present invention is described with reference to fig. 1, which is a flowchart of a first implementation of the method. The method may include:
s110, training the first basic model by using the first sample set until a first constraint condition is met to obtain a first recognition model; the first recognition model is used for predicting whether the text corresponding to the text characteristic sequence is junk content or not based on the text characteristic sequence and the user behavior characteristic sequence; the first base model includes: the system comprises a first submodel and a second submodel, wherein the first submodel is used for analyzing a text characteristic sequence to obtain a first junk content prediction result, and the second submodel is used for analyzing a user behavior characteristic sequence to obtain a second junk content prediction result; wherein the first constraint is associated with a first penalty, the first penalty comprising: the method comprises the following steps of obtaining a first loss, a second loss, a third loss and a fourth loss, wherein the second loss is the loss of a first submodel, the third loss is the loss of a second submodel, and the fourth loss is the characteristic loss between the first submodel and the second submodel;
s120, determining a second basic model based on a second sub-model in the first recognition model, and training the second basic model by using a second sample set to obtain a second recognition model; the second identification model is used for identifying whether the user corresponding to the user behavior feature sequence is a black user or not based on the user behavior feature sequence.
In order to reduce the influence that the number of positive samples available for training the user recognition model has on the model's accuracy, in the embodiment of the present invention, a first sample set including text feature sequences, user behavior feature sequences and first label data may be used to train the first basic model, where the first label data indicates whether the text corresponding to a text feature sequence is spam content.
In some examples, the text feature sequence is a word set obtained by processing text content published by a user on an internet service platform. The text content may be normal or abnormal comment information published on a comment page, normal or abnormal bullet-screen information published through a bullet-screen function, or normal or abnormal post content published in a forum, a post bar or a microblog; the abnormal comment, bullet-screen or post information may be illegal transaction information, fraud information and the like. For example, the text content may be the text content shown in table 1, and the corresponding first label may be the label shown in table 1.
For example, the text feature sequence may be a word set obtained by performing word segmentation processing on the text content, a word set obtained by processing the text content using a bag-of-words model, or the like.
In some examples, the text content shown in table 1 is relatively easy to obtain, and whether it is spam content is also relatively easy to judge, so the process of obtaining the text feature sequence and the first label data is relatively easy. It is understood that the text contents shown in table 1 are only for illustration; in practical applications, the number of text contents may be much larger, for example, on the order of millions or tens of millions.
Table 1 Text content and label example table (reproduced as an image in the original publication)
In still other examples, the text feature sequence may be a word set obtained by processing the text content with a bag-of-words model, a word set obtained by processing the text content with a TF-IDF (term frequency-inverse document frequency) model, or a word set obtained by segmenting the text content with another word segmentation method; any of these is acceptable.
The bag-of-words model performs word segmentation on the text content and counts the number of times each word appears in the text, so that all words contained in the text content and their occurrence counts can be obtained.
The TF-IDF model performs word segmentation on the text content, calculates the frequency of each word in the text content and the inverse document frequency of texts containing the word, obtains a weight for the word based on these two quantities, and finally selects the words meeting a preset selection condition as the word set of the text feature sequence based on the weights. The preset selection condition may be that a word's weight ranks in the top N in descending order, or that the weight is greater than or equal to a preset weight threshold.
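As an illustration only, the short Python sketch below shows one way such a TF-IDF word set could be built with scikit-learn; the patent does not name any library, and the example texts and the value of N are assumptions.

```python
# Hedged sketch: TF-IDF-based word selection, assuming scikit-learn.
# For the bag-of-words variant, CountVectorizer could be used instead.
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [
    "great video thanks for sharing",      # illustrative normal comment
    "cheap vip recharge add contact xxx",  # illustrative spam-like comment
]

vectorizer = TfidfVectorizer()
weights = vectorizer.fit_transform(texts).toarray()  # (n_texts, vocab_size)
vocab = vectorizer.get_feature_names_out()

N = 3  # preset selection condition: keep each text's N highest-weighted words
for row in weights:
    top = row.argsort()[::-1][:N]
    print([vocab[i] for i in top if row[i] > 0])
```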
In some examples, when a user publishes text content, the text content corresponds to that user. When a user operates on the internet service platform, corresponding user behaviors are generated, and these behaviors also correspond to the user. Therefore, a user behavior feature sequence can be acquired. The user behavior feature sequence is a data representation of the user's different behaviors; for example, when the user logs in to a video website, each user behavior feature may be represented by at least one of "login device fingerprint", "login IP", "wifi flag" and "name of the video the comment targets", and when the user logs in to a forum, a post bar or a microblog, each user behavior feature may be represented by at least one of "login device account", "posting time", "post modification time" and "posting destination".
In some examples, user behavior feature sequences are difficult to obtain, and the workload of labeling label data for user behavior feature sequences is much greater than that of labeling label data for text feature sequences. Therefore, if a joint training mode is adopted to train the first sub-model (which analyzes text feature sequences) together with the second sub-model (which analyzes user behavior feature sequences), a second sub-model capable of analyzing user behavior feature sequences can be obtained by labeling label data only for the text feature sequences.
Therefore, in the embodiment of the present invention, the first basic model may include two sub-models forming a twin (Siamese) neural network: a first sub-model and a second sub-model. The text feature sequence can be input into the first sub-model and the user behavior feature sequence into the second sub-model to obtain the first loss of the first basic model, and the model parameters of the first basic model are then adjusted based on the first loss, thereby training the first basic model.
In still other examples, after the first recognition model is obtained by training the first basic model, a second basic model capable of analyzing user behavior feature sequences may be determined based on the second sub-model in the first recognition model, which is itself capable of analyzing user behavior feature sequences;
for example, the second sub-model may be extracted from the first recognition model and used directly as the second basic model;
alternatively, the model parameters of the second sub-model in the first recognition model may be obtained and stored in a preset knowledge base, where the preset knowledge base includes the model parameters of multiple models, that is, the preset knowledge base is a set of model parameters of multiple models;
the model parameters of the second sub-model are then obtained from the knowledge base and migrated into a second basic model whose model structure is the same as that of the second sub-model.
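A minimal sketch of this parameter hand-off, assuming a PyTorch-style workflow; the patent does not specify a framework, and the helper below is a hypothetical stand-in for the real sub-model structure.

```python
# Hedged sketch: storing the second sub-model's parameters in a "knowledge
# base" and migrating them into a structurally identical second basic model.
import torch
import torch.nn as nn

def make_behavior_submodel() -> nn.Module:
    # Hypothetical stand-in; the real structure (embedding, BiLSTM, ...) is
    # the one described for the second sub-model in this patent.
    return nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))

second_submodel = make_behavior_submodel()  # assume: already jointly trained

# The preset knowledge base is a set of model parameters of multiple models.
knowledge_base = {"second_submodel": second_submodel.state_dict()}
torch.save(knowledge_base, "knowledge_base.pt")

# Later: the second basic model has the same structure as the second
# sub-model, so the stored parameters can be migrated into it directly.
second_basic_model = make_behavior_submodel()
state = torch.load("knowledge_base.pt")["second_submodel"]
second_basic_model.load_state_dict(state)
```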
It will be appreciated that when the first recognition model is trained, the second sub-model's knowledge of user behavior is also obtained, and this knowledge can be represented by the model parameters of the second sub-model; hence the knowledge base here can be the model parameters of the second sub-model.
After the second basic model is obtained, since it can already be used for analyzing user behavior feature sequences, it can be further trained to obtain higher analysis accuracy. At this point, the second basic model can be trained with a second sample set including user behavior feature sequences and second label data until a second constraint condition is satisfied, yielding a second recognition model that recognizes, based on a user behavior feature sequence, whether the corresponding user is a black product user.
The second constraint condition may be that the loss of the second basic model converges, that the number of training iterations reaches a preset count threshold, or that the number of times the difference between the losses of two adjacent training iterations is smaller than a preset error threshold reaches a preset count threshold, and so on; the second constraint condition is not limited here.
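For illustration, a minimal fine-tuning loop under these assumptions: PyTorch, a softmax-output model, and "loss convergence" as the second constraint condition; all names are illustrative.

```python
# Hedged sketch: training the second basic model on the (small) labeled
# behavior-sequence set until a simple second constraint condition is met.
import torch
import torch.nn.functional as F

def train_second_basic_model(model, loader, lr=1e-4, eps=1e-4, max_epochs=100):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    prev_loss = float("inf")
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for behavior_seq, labels in loader:  # second sample set batches
            _, pred = model(behavior_seq)    # model returns (features, probabilities)
            # pred are softmax probabilities, so use NLL on their log
            loss = F.nll_loss(torch.log(pred + 1e-9), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if abs(prev_loss - epoch_loss) < eps:  # losses of adjacent epochs agree
            break                              # treat as converged
        prev_loss = epoch_loss
    return model
```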
In the embodiment of the present invention, a joint training mode is adopted: the first sub-model and the second sub-model in the first basic model are jointly trained with the first sample set to obtain the second loss of the first sub-model, the third loss of the second sub-model, and the fourth loss between the two sub-models, and the first loss, which includes the second, third and fourth losses, is used to adjust the training parameters of the first basic model, thereby obtaining the parameters of the first recognition model. A second basic model is then obtained from the second sub-model in the first recognition model and trained with the second sample set. Because the second sub-model has already been jointly trained with the first sub-model, and the second basic model is derived from the second sub-model, training can be completed with only a small number of user behavior feature sequences labeled with label data in the second sample set. This reduces the influence that the number of positive samples available for training has on the accuracy of the user recognition model, and the accuracy of the second recognition model can be improved even when only a small number of labeled user behavior feature sequences are used. Furthermore, because the workload of labeling label data for text feature sequences is far smaller than that of labeling label data for user behavior feature sequences, the workload of manually selecting and labeling abnormal users' behavior features can be reduced.
On the basis of the training method for the black product user recognition model shown in fig. 1, an embodiment of the present invention further provides a possible implementation manner, as shown in fig. 2, which is a flowchart of a second implementation manner of the training method for the black product user recognition model in the embodiment of the present invention, and the method may include:
s210, inputting the text feature sequence into the first sub-model to obtain a first full-link layer feature output by a second full-link layer in the first sub-model and a first garbage content prediction result output by a normalization layer of the first sub-model;
s220, inputting the user behavior characteristic sequence into a second sub-model to obtain a second full-link layer characteristic output by a second full-link layer in the second sub-model and a second garbage content prediction result output by a normalization layer of the second sub-model;
s230, calculating a second loss based on the first label data and the first junk content prediction result, calculating a third loss based on the first label data and the second junk content prediction result, and calculating a fourth loss based on the first full-link layer characteristic and the second full-link layer characteristic;
s240, determining a first loss according to the second loss, the third loss and the fourth loss;
and S250, adjusting the training parameters in the first basic model according to the first loss until the first constraint condition is met, and obtaining a first recognition model.
S260, determining a second basic model based on a second sub-model in the first recognition model, and training the second basic model by using a second sample set to obtain a second recognition model; the second identification model is used for identifying whether the user corresponding to the user behavior feature sequence is a black user or not based on the user behavior feature sequence.
In some examples, after the text feature sequence is input into the first sub-model, the first fully-connected layer feature and the first spam-content prediction result output by the first sub-model can be obtained;
after the user behavior feature sequence is input into the second sub-model, the second fully-connected layer feature and the second spam-content prediction result output by the second sub-model can be obtained;
at this point, the second loss can be calculated based on the first label data and the first spam-content prediction result, the third loss based on the first label data and the second spam-content prediction result, and the fourth loss based on the first fully-connected layer feature and the second fully-connected layer feature; the first loss is then determined from the second, third and fourth losses; finally, the training parameters in the first basic model can be adjusted according to the first loss until the first constraint condition is met, yielding the first recognition model.
The second loss indicates the deviation between the first spam-content prediction result and the first label data, and the third loss indicates the deviation between the second spam-content prediction result and the first label data.
The first fully-connected layer feature is a feature of the text feature sequence and can represent it, and the second fully-connected layer feature is a feature of the user behavior feature sequence and can represent it. The fourth loss therefore represents the distance between the first fully-connected layer feature, obtained by the first sub-model's transformation of the text feature sequence, and the second fully-connected layer feature, obtained by the second sub-model's transformation of the user behavior feature sequence.
In this way, when the first loss is minimized, the distance between the first and second fully-connected layer features is also minimized; that is, the first sub-model's transformation of the text feature sequence is closest to the second sub-model's transformation of the user behavior feature sequence. The result of training the first sub-model on text features can thus be used in the second sub-model.
In still other examples, the first constraint condition may include at least one of: the first loss converges; or the number of times the difference between the first losses of two adjacent training iterations is smaller than a preset error threshold is greater than or equal to a preset count threshold; and so on. The first constraint condition is not limited here, as long as it is associated with the first loss. By associating the first constraint condition with the first loss, the recognition accuracy of the finally trained first recognition model is tied to the magnitude of the first loss: the larger the first loss, the lower the recognition accuracy of the trained first recognition model; conversely, the smaller the first loss, the higher its recognition accuracy.
In still other examples, when the first basic model is trained with the first sample set until the first constraint condition is satisfied to obtain the first recognition model, a text feature sequence in the first sample set, the user behavior feature sequence corresponding to it, and the corresponding first label data may first be obtained;
the text feature sequence is then input into the first sub-model to obtain the first fully-connected layer feature and the first spam-content prediction result, and the user behavior feature sequence is input into the second sub-model to obtain the second fully-connected layer feature and the second spam-content prediction result;
further, the second loss may be calculated based on the first label data and the first spam-content prediction result, the third loss based on the first label data and the second spam-content prediction result, and the fourth loss based on the first and second fully-connected layer features, and the first loss is determined from the second, third and fourth losses;
finally, the training parameters in the first basic model may be adjusted according to the first loss, and the procedure returns to the step of obtaining a text feature sequence in the first sample set, its corresponding user behavior feature sequence and the corresponding first label data, until the first constraint condition is met and the first recognition model is obtained.
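A minimal sketch of this joint-training loop, assuming PyTorch; `mmd_loss` is a fourth-loss function (a sketch appears later in this description), the weighting coefficients anticipate the weighted-sum form given below, and all other names are illustrative.

```python
# Hedged sketch: one possible joint-training procedure for the first basic model.
import torch
import torch.nn.functional as F

def train_first_basic_model(first_submodel, second_submodel, loader, mmd_loss,
                            a=1.0, b=1.0, c=0.1, max_epochs=10):
    params = list(first_submodel.parameters()) + list(second_submodel.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-3)
    for _ in range(max_epochs):  # stand-in for "until the first constraint is met"
        for text_seq, behavior_seq, labels in loader:   # first sample set
            fc1, pred1 = first_submodel(text_seq)       # 1st FC feature + prediction
            fc2, pred2 = second_submodel(behavior_seq)  # 2nd FC feature + prediction
            # predictions are softmax probabilities, so use NLL on their log
            loss1 = F.nll_loss(torch.log(pred1 + 1e-9), labels)  # second loss
            loss2 = F.nll_loss(torch.log(pred2 + 1e-9), labels)  # third loss
            domain_loss = mmd_loss(fc1, fc2)                     # fourth loss
            total_loss = a * loss1 + b * loss2 + c * domain_loss  # first loss
            optimizer.zero_grad()
            total_loss.backward()
            optimizer.step()
```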
In some examples, the first sub-model and the second sub-model may adopt the same network structure or different network structures. For example, when they are the same, both may be BERT (Bidirectional Encoder Representations from Transformers) models, or both may be ALBERT (A Lite BERT, a lightweight BERT) models; when they are different, the first sub-model may be a BERT model and the second sub-model an ALBERT model, or the first sub-model an ALBERT model and the second sub-model a BERT model.
In still other examples, the first and second sub-models employ the same network structure, which may be the network structure shown in fig. 3. The network structure includes: a plurality of embedding layers 301, a plurality of bidirectional long-short term memory layers 302, a reverse feedforward neural network layer 303, a forward feedforward neural network layer 304, a first fully-connected layer 305, a hidden representation layer 306, a second fully-connected layer 307, a logistic regression layer 308 and a normalization layer 309; wherein,
as shown in fig. 3, each embedded layer 301 is connected to a bidirectional long-short term memory layer 302; the bidirectional long and short term memory layers 302 are respectively connected with the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304; a reverse feedforward neural network layer 303 and a forward feedforward neural network layer 304, both connected to the first fully-connected layer 305; the first fully-connected layer 305, the hidden representation layer 306, the second fully-connected layer 307, the logistic regression layer 308, and the normalization layer 309 are connected in sequence.
After the text feature sequence is input into the first submodel, the plurality of embedding layers 301 of the first submodel can acquire and process the text feature sequence, so that continuous word vectors can be output; then, the continuous word vectors can be input into the bidirectional long-short term memory layers 302 of the first submodel, so as to obtain the first word vectors output by the bidirectional long-short term memory layers 302 of the first submodel; then, the first word vector is respectively input into the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304 of the first submodel to obtain a second word vector output by the reverse feedforward neural network layer 303 of the first submodel and a third word vector output by the forward feedforward neural network layer 304 of the first submodel; further, a second word vector and a third word vector may be input to the first fully-connected layer 305 of the first sub-model, so as to obtain a third fully-connected layer feature output by the first fully-connected layer 305 of the first sub-model;
after obtaining the third fully-connected layer feature, the third fully-connected layer feature may be input to the hidden representation layer 306 of the first submodel to obtain a first hidden feature output by the hidden representation layer 306 of the first submodel; inputting the first hidden feature to the second full-link layer 307 of the first sub-model to obtain a first full-link layer feature output by the second full-link layer 307 of the first sub-model; then, inputting the first full-connected layer feature to the logistic regression layer 308 of the first submodel to obtain a first to-be-normalized prediction result output by the logistic regression layer 308 of the first submodel; finally, the first to-be-normalized prediction result is input to the normalization layer 309 of the first sub-model, so as to obtain the first garbage content prediction result output by the normalization layer 309 of the first sub-model. The first spam prediction result is derived based on the text feature sequence and is used to indicate whether the text is spam.
After the user behavior feature sequence is input into the second submodel, the multiple embedding layers 301 of the second submodel may acquire and process the user behavior feature sequence, so that a continuous behavior feature vector may be output. Then, the continuous behavior feature vectors can be input into the plurality of bidirectional long-short term memory layers 302 of the second submodel, so as to obtain first behavior feature vectors output by the plurality of bidirectional long-short term memory layers 302 of the second submodel; then, the first behavior feature vector is respectively input into the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304 of the second submodel, so as to obtain a second behavior feature vector output by the reverse feedforward neural network layer 303 of the second submodel and a third behavior feature vector output by the forward feedforward neural network layer 304 of the second submodel; further, the second behavior feature vector and the third behavior feature vector may be input to the first fully-connected layer 305 of the second sub-model, so as to obtain a fourth fully-connected layer feature output by the first fully-connected layer 305 of the second sub-model;
after obtaining the fourth fully-connected layer feature, the fourth fully-connected layer feature may be input to the hidden representation layer 306 of the second submodel, so as to obtain a second hidden feature output by the hidden representation layer 306 of the second submodel; inputting the second hidden feature to the second full-link layer 307 of the second submodel to obtain a second full-link layer feature output by the second full-link layer 307 of the second submodel; then, inputting the characteristics of the second full-connected layer to the logistic regression layer 308 of the second submodel to obtain a second prediction result to be normalized output by the logistic regression layer 308 of the second submodel; and finally, inputting a second prediction result to be normalized to the normalization layer 309 of the second submodel to obtain a second garbage content prediction result output by the normalization layer 309 of the second submodel. The second spam prediction result is obtained based on the user behavior feature sequence and is used for indicating whether the text is spam or not.
In some examples, the bidirectional long-short term memory layer 302 is a recurrent neural network. Because it contains multiple gating structures, which together with the backbone of the recurrent neural network form the special structure of a bidirectional long-short term memory layer, it can alleviate the problems of vanishing and exploding gradients when training on continuous word vectors and continuous behavior feature vectors.
With the network structure of the embodiment of the invention, discrete word vectors can be converted into continuous word vectors through the embedding layer 301, and discrete behavior features can likewise be converted into continuous behavior feature vectors. Moreover, the bidirectional long-short term memory layer 302 can alleviate vanishing and exploding gradients during training. Furthermore, using the same network structure for both sub-models makes it convenient to adjust their model parameters together, reducing the time overhead of training and improving training efficiency.
In some examples, after the first spam-content prediction result, the second spam-content prediction result, the first fully-connected layer feature and the second fully-connected layer feature are obtained, the second loss may be calculated based on the first label data and the first spam-content prediction result, the third loss based on the first label data and the second spam-content prediction result, and the fourth loss based on the first and second fully-connected layer features, where the second loss indicates the deviation between the first spam-content prediction result and the first label data, and the third loss indicates the deviation between the second spam-content prediction result and the first label data.
In some examples, the fourth loss satisfies the following equation:
MMD(X, Y) = \left\| \frac{1}{n} \sum_{i=1}^{n} \phi(x_i) - \frac{1}{m} \sum_{j=1}^{m} \phi(y_j) \right\|_{\mathcal{H}}
where MMD(X, Y) is the maximum mean discrepancy between the first fully-connected layer feature X = [x_1, …, x_i, …, x_n] and the second fully-connected layer feature Y = [y_1, …, y_j, …, y_m]; φ(x_i) is the value of the i-th first fully-connected layer feature after it is mapped into the reproducing kernel Hilbert space, and φ(y_j) is the value of the j-th second fully-connected layer feature after it is mapped into the reproducing kernel Hilbert space. The maximum mean discrepancy represents the distance between the first fully-connected layer feature and the second fully-connected layer feature.
It will be appreciated that the goal of transfer learning is to apply knowledge learned in a source domain to a different but related target domain. To achieve this, the distance between the source-domain data and the target-domain data needs to be minimized, and the maximum mean discrepancy can measure the distance between two different but related domains. Therefore, the maximum mean discrepancy can be used to measure the distance between the text feature sequence and the user behavior feature sequence.
In some examples, calculating the maximum mean discrepancy between the first and second fully-connected layer features and adding it to the network's loss when training the first basic model enables the subsequent transfer of the second sub-model in the first recognition model, i.e., migrating the model parameters of the second sub-model, as a knowledge base, into the second basic model.
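A minimal sketch of an empirical MMD term, assuming a Gaussian (RBF) kernel for the mapping φ into the reproducing kernel Hilbert space; the patent does not fix a kernel, and sigma is an illustrative assumption.

```python
# Hedged sketch: biased empirical estimate of the (squared) maximum mean
# discrepancy between two batches of fully-connected layer features.
import torch

def mmd_loss(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """x: (n, d) first FC-layer features; y: (m, d) second FC-layer features."""
    def rbf(a, b):
        # k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * sigma^2))
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    # E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return rbf(x, x).mean() + rbf(y, y).mean() - 2 * rbf(x, y).mean()
```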
After the second, third and fourth losses are obtained, the first loss of the first basic model may be calculated from them, and the training parameters in the first basic model are then adjusted according to the first loss until the first constraint condition is satisfied, yielding the first recognition model. A second basic model can then be determined based on the first recognition model and trained with the second sample set to obtain the second recognition model.
It is understood that step S260 in the embodiment of the present invention is the same as or similar to step S120 in the first embodiment, and is not described herein again.
On the basis of the black product user recognition model training method shown in fig. 2, an embodiment of the present invention further provides another possible implementation. As shown in fig. 4, which is a flowchart of a third implementation of the black product user recognition model training method in the embodiment of the present invention, the method may include:
s410, inputting the text feature sequence into the first sub-model to obtain a first full-link layer feature output by a second full-link layer in the first sub-model and a first garbage content prediction result output by a normalization layer of the first sub-model;
s420, inputting the user behavior characteristic sequence into a second sub-model to obtain a second full-link layer characteristic output by a second full-link layer in the second sub-model and a second garbage content prediction result output by a normalization layer of the second sub-model;
s430, calculating a second loss based on the first label data and the first junk content prediction result, calculating a third loss based on the first label data and the second junk content prediction result, and calculating a fourth loss based on the first full link layer characteristic and the second full link layer characteristic;
s440, a weighting process is performed on the second loss, the third loss, and the fourth loss to obtain a first loss.
S450, adjusting the training parameters in the first basic model according to the first loss until the first constraint condition is met, and obtaining a first recognition model.
S460, determining a second basic model based on a second sub-model in the first recognition model, and training the second basic model by using a second sample set to obtain a second recognition model; the second identification model is used for identifying whether the user corresponding to the user behavior feature sequence is a black user or not based on the user behavior feature sequence.
In some examples, the second loss, the third loss, and the fourth loss may be weighted, so that the first loss may be obtained, wherein the weighting process may be weighted summation or weighted average.
In still other examples, when the first loss is calculated by weighted summation, the second loss may first be multiplied by a first weighting coefficient, the third loss by a second weighting coefficient and the fourth loss by a third weighting coefficient, and the weighted second, third and fourth losses are then summed to obtain the first loss. For example, the first loss may be calculated with the following equation:
Total Loss=a*loss1+b*loss2+c*domain_loss。
where Total Loss is the first loss, loss1 is the second loss, loss2 is the third loss, domain_loss is the fourth loss, a is the first weighting coefficient, b is the second weighting coefficient, c is the third weighting coefficient, and a, b and c are manually set hyper-parameters.
In some examples, the first weighting coefficient, the second weighting coefficient, and the third weighting coefficient may be preset constant parameters or parameters that are adjusted each time the model parameters are adjusted.
It is understood that steps S410 to S430 and S450 to S460 in the embodiment of the present invention are the same as or similar to steps S210 to S230 and S250 to S260 in the second embodiment, and are not described again here.
In some examples, after obtaining the second recognition model, the user may be recognized using the second recognition model. As shown in fig. 5, which is a flowchart of a black product user identification method according to an embodiment of the present invention, the method may include:
s510, acquiring a user behavior characteristic sequence of a user to be identified;
and S520, inputting the user behavior characteristic sequence of the user to be recognized into a trained second recognition model, and determining whether the user to be recognized is a black product user, wherein the trained second recognition model is obtained by training through the black product user recognition model training method shown in any one of the embodiments.
Specifically, the user behavior feature sequence of the user to be identified may be input into the second recognition model; based on this sequence, the second recognition model may output the probability that the user to be identified is a black product user, and whether the user is a black product user may then be determined from this probability.
For example, when the probability that the user to be identified is a black product user is greater than 50%, the user may be determined to be a black product user; otherwise, the user is determined not to be a black product user.
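A minimal sketch of this threshold decision, assuming a hypothetical model that returns the probability of the user being a black product user as a scalar tensor (the 0.5 threshold mirrors the 50% example above):

```python
import torch

def is_black_product_user(model, behavior_seq, threshold=0.5):
    """Return True when the predicted probability exceeds the threshold."""
    model.eval()
    with torch.no_grad():
        prob = model(behavior_seq)  # assumed: scalar probability output
    return bool(prob.item() > threshold)
```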
In still other examples, the second recognition model may also directly output whether the user to be identified is a black product user based on the user behavior feature sequence; for example, "yes" may be output to indicate that the user is a black product user, and "no" to indicate that the user is not.
In still other examples, the second sample set may be updated periodically, and the second recognition model may be retrained with the updated second sample set. In this way, the second recognition model is kept up to date and can identify black product users in a timely manner.
To describe the embodiments of the present invention more clearly, the following description is provided with reference to the overall framework diagram shown in fig. 6 and the flowchart shown in fig. 7.
First, massive text content may be collected offline. The text content may be comment information published by users on a comment page, or barrage (bullet-screen) information published by users through a barrage function; the comment or barrage information may include normal comment information or abnormal comment information, and the abnormal comment information may be illegal transaction information, fraud information, or the like. The massive text content is then labeled with first label data, and feature extraction is performed on it to obtain a text feature set (that is, a content feature set) and a user behavior feature set.
In still other examples, the text content may also be information published by users in other scenarios, such as post content published in forums, post bars, or microblogs, and the text content may include normal post content or abnormal post content.
Then, a twin network is constructed; as shown in fig. 6, the first sub-model 610 and the second sub-model 620 use the same network structure. The network structure includes: a plurality of embedding layers 301, a plurality of bidirectional long short-term memory layers 302, a reverse feedforward neural network layer 303, a forward feedforward neural network layer 304, a first fully-connected layer 305, a hidden representation layer 306, a second fully-connected layer 307, a logistic regression layer 308, and a normalization layer 309; wherein,
as shown in fig. 3, each embedding layer 301 is connected to a bidirectional long short-term memory layer 302; the bidirectional long short-term memory layers 302 are connected to the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304, respectively; the reverse feedforward neural network layer 303 and the forward feedforward neural network layer 304 are both connected to the first fully-connected layer 305; and the first fully-connected layer 305, the hidden representation layer 306, the second fully-connected layer 307, the logistic regression layer 308, and the normalization layer 309 are connected in sequence.
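For orientation, the following is a hypothetical PyTorch sketch of one such sub-model. The embodiment specifies only the layer sequence, so the layer sizes, the exact wiring of the forward and reverse feedforward layers, the Tanh hidden representation, and the use of softmax for the logistic-regression and normalization layers are all assumptions.

```python
import torch
import torch.nn as nn

class SubModel(nn.Module):
    """Rough sketch of the layer sequence 301 -> 302 -> 303/304 -> 305..309."""

    def __init__(self, vocab_size=10000, emb_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)        # embedding layer 301
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)                 # BiLSTM layer 302
        self.reverse_ffn = nn.Linear(hidden_dim, hidden_dim)      # reverse FFN layer 303
        self.forward_ffn = nn.Linear(hidden_dim, hidden_dim)      # forward FFN layer 304
        self.fc1 = nn.Linear(2 * hidden_dim, hidden_dim)          # first FC layer 305
        self.hidden_act = nn.Tanh()                               # hidden representation 306
        self.fc2 = nn.Linear(hidden_dim, num_classes)             # second FC layer 307
        self.softmax = nn.Softmax(dim=-1)                         # layers 308/309

    def forward(self, token_ids):
        h, _ = self.bilstm(self.embedding(token_ids))   # (batch, seq, 2*hidden_dim)
        fwd = h[:, -1, : h.size(-1) // 2]               # final forward state
        bwd = h[:, 0, h.size(-1) // 2:]                 # final backward state
        z = torch.cat([self.forward_ffn(fwd), self.reverse_ffn(bwd)], dim=-1)
        fc_feat = self.fc2(self.hidden_act(self.fc1(z)))  # the FC-layer feature
        return fc_feat, self.softmax(fc_feat)             # feature + prediction
```

In the twin network, two instances of this structure are used: one consumes the text feature sequence and the other the user behavior feature sequence, and their second-fully-connected-layer outputs are compared by the fourth (feature) loss.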
Since the embedding layer 301 can transform discrete features into continuous vectors, after obtaining a text feature set and a user behavior feature set, the text feature set can be converted into a text feature sequence, and the user behavior feature set can be converted into a user behavior feature sequence.
At this time, the text feature sequence and the user behavior feature sequence may be input into the twin network, and the twin network may be trained offline. Training the twin network may be considered herein as task 1, and thus training the twin network may also be referred to herein as training the twin network on task 1.
After the training is completed, a trained first recognition model is obtained, and the model parameters of the second sub-model in the first recognition model can then be stored in a knowledge base and migrated to the second base model 630.
Then, the second base model 630 obtained through the migration is fine-tuned offline by using the second sample set. The fine-tuning of the second base model 630 is referred to herein as task 2, so this step may also be described as continuing the fine-tuning on task 2. The fine-tuned model is the second recognition model, which can then be deployed online as an online risk-control model, as sketched below.
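The migration and fine-tuning could look like the following sketch, which reuses the hypothetical SubModel above; second_submodel, second_sample_set, the Adam optimizer, and the learning rate are all assumptions made for illustration.

```python
import torch

# Migration: the second base model has the same structure as the second
# sub-model, so the sub-model's parameters (the "knowledge base" entry)
# can be loaded into it directly.
second_base_model = SubModel()
second_base_model.load_state_dict(second_submodel.state_dict())

# Offline fine-tuning on task 2 with the second sample set
# (user behavior feature sequence -> black product user label).
optimizer = torch.optim.Adam(second_base_model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()
for behavior_seq, label in second_sample_set:
    optimizer.zero_grad()
    logits, _ = second_base_model(behavior_seq)
    criterion(logits, label).backward()
    optimizer.step()
```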
Therefore, when a real-time stream of user behavior is acquired, behavior features are extracted from the stream sequentially with a sliding window of a preset size, following the time order of the user behaviors in the stream. The extracted behavior features are then input into the second recognition model deployed online, which performs online recognition and outputs a user recognition result, that is, a risk-control result, in real time. The user recognition result indicates that the user either is or is not a black product user.
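A minimal sketch of this sliding-window extraction; the window size and step are assumed hyper-parameters, and encode/online_model are placeholder names rather than names from this embodiment:

```python
def sliding_windows(behavior_stream, window_size=50, step=1):
    """Yield fixed-size windows over the time-ordered behavior stream."""
    for start in range(0, len(behavior_stream) - window_size + 1, step):
        yield behavior_stream[start:start + window_size]

# Each window is encoded into a behavior feature sequence and scored online:
# for window in sliding_windows(stream):
#     prob = online_model(encode(window))  # real-time risk-control result
```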
With the embodiments of the present invention, training the second recognition model does not depend on labeling a large amount of user behavior data, which reduces the labor cost of such labeling. In addition, the black product user recognition model training method does not require formulating rules at various levels, avoiding the problems of rule failure and contradictions between rules. Furthermore, the second recognition model obtained by training with the black product user recognition model training method provided by the embodiments of the present invention can detect user behavior in real time; detection is highly real-time, and a user can be detected as soon as an abnormality appears in a short-term behavior sequence.
Corresponding to the above method embodiments, an embodiment of the present invention further provides a model training apparatus. Fig. 8 is a schematic structural diagram of the model training apparatus in an embodiment of the present invention; the apparatus may include:
a first training module 810, configured to train a first base model by using a first sample set until a first constraint condition is met, so as to obtain a first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content; the first base model includes a first sub-model and a second sub-model, where the first sub-model is used for analyzing the text feature sequence to obtain a first spam content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second spam content prediction result; the first constraint condition is associated with a first loss, and the first loss includes: a second loss, a third loss, and a fourth loss, where the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is the feature loss between the first sub-model and the second sub-model;
a second training module 820, configured to use the second sub-model in the first recognition model as a second base model, and train the second base model by using a second sample set to obtain a second recognition model; the second recognition model is used for identifying, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black product user.
In some examples, the first sample set includes: a text feature sequence, a user behavior feature sequence, and first label data, where the first label data is used for indicating whether the text corresponding to the text feature sequence is spam content;
the first training module 810 is specifically configured to:
inputting the text feature sequence into the first sub-model to obtain a first fully-connected layer feature and a first spam content prediction result;
inputting the user behavior feature sequence into the second sub-model to obtain a second fully-connected layer feature and a second spam content prediction result;
calculating a second loss based on the first label data and the first spam content prediction result, calculating a third loss based on the first label data and the second spam content prediction result, and calculating a fourth loss based on the first fully-connected layer feature and the second fully-connected layer feature;
determining a first loss according to the second loss, the third loss and the fourth loss;
and adjusting the training parameters in the first base model according to the first loss until the first constraint condition is met, so as to obtain the first recognition model.
In some examples, the first sub-model and the second sub-model each include:
a plurality of embedding layers, a plurality of bidirectional long short-term memory layers, a reverse feedforward neural network layer, a forward feedforward neural network layer, a first fully-connected layer, a hidden representation layer, a second fully-connected layer, a logistic regression layer, and a normalization layer;
wherein the second fully-connected layer in the first sub-model outputs the first fully-connected layer feature, and the second fully-connected layer in the second sub-model outputs the second fully-connected layer feature.
In some examples, the first training module 810 is specifically configured to:
calculating the maximum mean discrepancy (MMD) between the first fully-connected layer feature and the second fully-connected layer feature; and determining the maximum mean discrepancy as the fourth loss.
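The MMD term can be sketched as follows; this embodiment does not name a kernel, so the Gaussian (RBF) kernel and its bandwidth here are assumptions.

```python
import torch

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel matrix between two feature batches."""
    sq_dist = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dist / (2 * sigma ** 2))

def mmd_loss(feat1, feat2, sigma=1.0):
    """Biased estimate of squared MMD between the two FC-layer feature batches."""
    return (rbf_kernel(feat1, feat1, sigma).mean()
            + rbf_kernel(feat2, feat2, sigma).mean()
            - 2 * rbf_kernel(feat1, feat2, sigma).mean())
```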
In some examples, the first training module 810 is further to:
weighting the second loss, the third loss, and the fourth loss to obtain the first loss.
In some examples, second training module 820 is specifically configured to:
acquiring model parameters of the second sub-model in the first recognition model, and storing the model parameters of the second sub-model into a preset knowledge base, where the preset knowledge base includes model parameters of a plurality of models;
and obtaining the model parameters of the second sub-model from the knowledge base and migrating them to the second base model, where the model structure of the second base model is the same as that of the second sub-model.
In some examples, an embodiment of the present invention further provides a black product user identification apparatus. Fig. 9 is a schematic structural diagram of the black product user identification apparatus in an embodiment of the present invention; the apparatus may include:
an obtaining module 910, configured to obtain a user behavior feature sequence of a user to be identified;
an identification module 920, configured to input the user behavior feature sequence of the user to be identified into a trained second recognition model, and determine whether the user to be identified is a black product user, where the trained second recognition model is obtained by training with the model training apparatus shown in fig. 8.
An embodiment of the present invention further provides an electronic device. As shown in fig. 10, the electronic device includes a processor 1001, a communication interface 1002, a memory 1003, and a communication bus 1004, where the processor 1001, the communication interface 1002, and the memory 1003 communicate with one another through the communication bus 1004;
a memory 1003 for storing a computer program;
the processor 1001 is configured to implement the steps shown in any of the embodiments described above when executing the program stored in the memory 1003.
The communication bus mentioned above for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The Memory may include a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; the device can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program realizes the steps shown in any one of the above embodiments when executed by a processor.
In a further embodiment provided by the present invention, there is also provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps shown in any of the embodiments described above.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
It should be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a/an ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for embodiments of the apparatus, the electronic device, the storage medium, and the like, since they are substantially similar to the method embodiments, the description is relatively simple, and for relevant points, reference may be made to part of the description of the method embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (11)

1. A black product user recognition model training method, characterized by comprising:
training a first base model by using a first sample set until a first constraint condition is met, to obtain a first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content; the first base model comprises: a first sub-model and a second sub-model, wherein the first sub-model is used for analyzing the text feature sequence to obtain a first spam content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second spam content prediction result; wherein the first constraint condition is associated with a first loss, the first loss comprising: a second loss, a third loss, and a fourth loss, wherein the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is the feature loss between the first sub-model and the second sub-model;
determining a second base model based on the second sub-model in the first recognition model, and training the second base model by using a second sample set to obtain a second recognition model; the second recognition model is used for identifying, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black product user.
2. The method of claim 1, wherein the first sample set comprises: the text feature sequence, the user behavior feature sequence, and first label data, wherein the first label data is used for indicating whether the text corresponding to the text feature sequence is spam content;
the training a first base model by using a first sample set until a first constraint condition is met, to obtain a first recognition model, comprises:
inputting the text feature sequence into the first sub-model to obtain a first fully-connected layer feature and the first spam content prediction result;
inputting the user behavior feature sequence into the second sub-model to obtain a second fully-connected layer feature and the second spam content prediction result;
calculating the second loss based on the first label data and the first spam content prediction result, calculating the third loss based on the first label data and the second spam content prediction result, and calculating the fourth loss based on the first fully-connected layer feature and the second fully-connected layer feature;
determining the first loss according to the second loss, the third loss, and the fourth loss;
and adjusting the training parameters in the first base model according to the first loss until the first constraint condition is met, so as to obtain the first recognition model.
3. The method of claim 2, wherein the first sub-model and the second sub-model each comprise:
a plurality of embedding layers, a plurality of bidirectional long short-term memory layers, a reverse feedforward neural network layer, a forward feedforward neural network layer, a first fully-connected layer, a hidden representation layer, a second fully-connected layer, a logistic regression layer, and a normalization layer;
wherein the second fully-connected layer in the first sub-model outputs the first fully-connected layer feature, and the second fully-connected layer in the second sub-model outputs the second fully-connected layer feature.
4. The method of claim 2, wherein the calculating the fourth loss based on the first fully-connected layer feature and the second fully-connected layer feature comprises:
calculating a maximum mean discrepancy between the first fully-connected layer feature and the second fully-connected layer feature; and determining the maximum mean discrepancy as the fourth loss.
5. The method of claim 2, wherein the determining the first loss according to the second loss, the third loss, and the fourth loss comprises:
weighting the second loss, the third loss, and the fourth loss to obtain the first loss.
6. The method of claim 1, wherein the determining a second base model based on the second sub-model in the first recognition model comprises:
obtaining model parameters of the second sub-model in the first recognition model, and storing the model parameters of the second sub-model into a preset knowledge base, wherein the preset knowledge base comprises model parameters of a plurality of models;
and obtaining the model parameters of the second sub-model from the knowledge base and migrating them to the second base model, wherein the model structure of the second base model is the same as that of the second sub-model.
7. A black product user identification method, characterized by comprising:
acquiring a user behavior feature sequence of a user to be identified;
inputting the user behavior feature sequence of the user to be identified into a trained second recognition model, and determining whether the user to be identified is a black product user, wherein the trained second recognition model is obtained by training with the black product user recognition model training method according to any one of claims 1 to 6.
8. A black product user recognition model training device, the device comprising:
a first training module, configured to train a first base model by using a first sample set until a first constraint condition is met, to obtain a first recognition model; the first recognition model is used for predicting, based on a text feature sequence and a user behavior feature sequence, whether the text corresponding to the text feature sequence is spam content; the first base model comprises: a first sub-model and a second sub-model, wherein the first sub-model is used for analyzing the text feature sequence to obtain a first spam content prediction result, and the second sub-model is used for analyzing the user behavior feature sequence to obtain a second spam content prediction result; wherein the first constraint condition is associated with a first loss, the first loss comprising: a second loss, a third loss, and a fourth loss, wherein the second loss is the loss of the first sub-model, the third loss is the loss of the second sub-model, and the fourth loss is the feature loss between the first sub-model and the second sub-model;
a second training module, configured to use the second sub-model in the first recognition model as a second base model, and train the second base model by using a second sample set to obtain a second recognition model; the second recognition model is used for identifying, based on the user behavior feature sequence, whether the user corresponding to the user behavior feature sequence is a black product user.
9. A black product user identification apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a user behavior feature sequence of a user to be identified;
an identification module, configured to input the user behavior feature sequence of the user to be identified into a trained second recognition model, and determine whether the user to be identified is a black product user, wherein the trained second recognition model is obtained by training with the black product user recognition model training device according to claim 8.
10. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the steps of the method of any one of claims 1 to 7 when executing the program stored in the memory.
11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202111145600.3A 2021-09-28 2021-09-28 Black-birth user identification model training method and device, electronic equipment and storage medium Active CN113822684B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111145600.3A CN113822684B (en) 2021-09-28 2021-09-28 Black-birth user identification model training method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111145600.3A CN113822684B (en) 2021-09-28 2021-09-28 Black-birth user identification model training method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113822684A true CN113822684A (en) 2021-12-21
CN113822684B CN113822684B (en) 2023-06-06

Family

ID=78915779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111145600.3A Active CN113822684B (en) 2021-09-28 2021-09-28 Black-birth user identification model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113822684B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115564577A (en) * 2022-12-02 2023-01-03 成都新希望金融信息有限公司 Abnormal user identification method and device, electronic equipment and storage medium

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180176241A1 (en) * 2016-12-21 2018-06-21 Hewlett Packard Enterprise Development Lp Abnormal behavior detection of enterprise entities using time-series data
US20190180207A1 (en) * 2017-12-12 2019-06-13 Electronics And Telecommunications Research Institute System and method for managing risk factors in aeo (authorized economic operator) certificate process
US20210064018A1 (en) * 2018-04-09 2021-03-04 Diveplane Corporation Model Reduction and Training Efficiency in Computer-Based Reasoning and Artificial Intelligence Systems
CN109168044A (en) * 2018-10-11 2019-01-08 北京奇艺世纪科技有限公司 A kind of determination method and device of video features
CN110008980A (en) * 2019-01-02 2019-07-12 阿里巴巴集团控股有限公司 Identification model generation method, recognition methods, device, equipment and storage medium
WO2020156004A1 (en) * 2019-02-01 2020-08-06 阿里巴巴集团控股有限公司 Model training method, apparatus and system
CN113392179A (en) * 2020-12-21 2021-09-14 腾讯科技(深圳)有限公司 Text labeling method and device, electronic equipment and storage medium
CN112686046A (en) * 2021-01-06 2021-04-20 上海明略人工智能(集团)有限公司 Model training method, device, equipment and computer readable medium
CN112926699A (en) * 2021-04-25 2021-06-08 恒生电子股份有限公司 Abnormal object identification method, device, equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HOU Lixian; LI Yanling; LIN Min; LI Chengcheng: "Joint Recognition of Intent and Semantic Slot Filling Incorporating Multiple Constraints", Journal of Frontiers of Computer Science and Technology *
LIU Quanchao; HUANG Heyan; FENG Chong: "Joint Extraction of Opinion Targets and Opinion Words for Chinese Microblogs", Acta Electronica Sinica *
CHEN Junjie: "Design and Application of a Network Abnormal Behavior Detection Algorithm Based on Affinity Propagation", China Master's Theses Full-text Database, Information Science and Technology *

Also Published As

Publication number Publication date
CN113822684B (en) 2023-06-06

Similar Documents

Publication Publication Date Title
CN111371806B (en) Web attack detection method and device
CN108874777B (en) Text anti-spam method and device
CN110929785B (en) Data classification method, device, terminal equipment and readable storage medium
CN110598070B (en) Application type identification method and device, server and storage medium
WO2023185539A1 (en) Machine learning model training method, service data processing method, apparatuses, and systems
Shindarev et al. Approach to identifying of employees profiles in websites of social networks aimed to analyze social engineering vulnerabilities
CN112598111B (en) Abnormal data identification method and device
CN110414581B (en) Picture detection method and device, storage medium and electronic device
CN111881398B (en) Page type determining method, device and equipment and computer storage medium
CN117176482B (en) Big data network safety protection method and system
CN110162958B (en) Method, apparatus and recording medium for calculating comprehensive credit score of device
CN103631787A (en) Webpage type recognition method and webpage type recognition device
CN112819024B (en) Model processing method, user data processing method and device and computer equipment
CN113139052A (en) Rumor detection method and device based on graph neural network feature aggregation
CN111159481B (en) Edge prediction method and device for graph data and terminal equipment
Boahen et al. Detection of compromised online social network account with an enhanced knn
Srinath et al. BullyNet: Unmasking cyberbullies on social networks
CN111523604A (en) User classification method and related device
CN113822684B (en) Black-birth user identification model training method and device, electronic equipment and storage medium
CN112287225A (en) Object recommendation method and device
CN111143533A (en) Customer service method and system based on user behavior data
CN116362894A (en) Multi-objective learning method, multi-objective learning device, electronic equipment and computer readable storage medium
Pei et al. Spammer detection via combined neural network
CN114443904A (en) Video query method, video query device, computer equipment and computer readable storage medium
Lijun et al. An intuitionistic calculus to complex abnormal event recognition on data streams

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant