CN113011646B - Data processing method, device and readable storage medium - Google Patents

Data processing method, device and readable storage medium

Info

Publication number
CN113011646B
CN113011646B (application CN202110275967.0A)
Authority
CN
China
Prior art keywords
target
sample data
node
value
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110275967.0A
Other languages
Chinese (zh)
Other versions
CN113011646A (en)
Inventor
孙艺芙
蓝利君
李超
王翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110275967.0A
Publication of CN113011646A
Application granted
Publication of CN113011646B
Legal status: Active
Anticipated expiration: legal status pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00: Payment architectures, schemes or protocols
    • G06Q20/38: Payment protocols; Details thereof
    • G06Q20/40: Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401: Transaction verification
    • G06Q20/4016: Transaction verification involving fraud or risk level assessment in transaction processing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03: Credit; Loans; Processing thereof
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Software Systems (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Medical Informatics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data processing method, device, and readable storage medium. The method includes: acquiring object sample data belonging to a target scene type and associated label sample data belonging to an associated scene type; training an initial category prediction model on the object label sample data and the associated label sample data to obtain a category prediction model, and determining, through the category prediction model, a predicted abnormality category for the unlabeled sample data to serve as its virtual abnormality category label; and determining both the object sample data and the associated label sample data as target sample data, then optimizing and adjusting the virtual abnormality category label according to the pairwise similarity between target sample data, the real abnormality category label of the object label sample data, and the real abnormality category label of the associated label sample data. With this method and device, the accuracy with which the model recognizes the target domain can be improved even when the labeled sample data of the target domain is scarce.

Description

Data processing method, device and readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, and readable storage medium.
Background
Currently, Artificial Intelligence (AI) technology is widely used. Machine Learning (ML) is the core of artificial intelligence and the fundamental means of endowing computers with intelligence; it is applied throughout the various fields of artificial intelligence.
Transfer learning based on the labeled sample data of a target domain (the current scene) is a branch of machine learning. Its goal is to transfer knowledge from the labeled sample data of a source domain (a scene associated with the current scene) to the labeled sample data of the target domain, so that the prior knowledge contained in the source-domain labeled samples assists learning on the target-domain labeled samples and improves the learning effect of the target domain's risk category prediction model.
It can be seen that both the labeled sample data of the target domain and the labeled sample data of the source domain are important to the target domain's risk category prediction model. In practice, however, the target domain usually has only a small amount of labeled sample data and a large amount of unlabeled sample data, while transfer learning methods generally require abundant labeled sample data in the source domain. When source-domain labeled data is plentiful but target-domain labeled data is too scarce, the target domain's risk category prediction model easily shifts toward the source domain, causing negative transfer: the model obtained from the final training cannot accurately identify the risk categories of target-domain data, and its recognition accuracy is low.
Disclosure of Invention
The embodiments of the present application provide a data processing method, a data processing device, and a readable storage medium, which can improve the accuracy with which a model recognizes the target domain even when the labeled sample data of the target domain is scarce.
In one aspect, an embodiment of the present application provides a data processing method, including:
Acquiring object sample data belonging to a target scene type and associated tag sample data belonging to an associated scene type; the object sample data comprises object label sample data and label-free sample data; the associated scene type and the target scene type have scene association relation;
training an initial category prediction model according to the object tag sample data and the associated tag sample data to obtain a category prediction model, determining a prediction abnormality category corresponding to the unlabeled sample data through the category prediction model, and taking the prediction abnormality category corresponding to the unlabeled sample data as a virtual abnormality category tag of the unlabeled sample data;
The object sample data and the associated label sample data are both determined to be target sample data, and the virtual abnormal class label is optimized and adjusted according to the similarity between every two target sample data, the real abnormal class label corresponding to the object label sample data and the real abnormal class label corresponding to the associated label sample data, so that the target abnormal class label is obtained; the target abnormal category label is used for carrying out optimization training on the category prediction model together with the real abnormal category label corresponding to the object label sample data.
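The method steps above (pre-train a category prediction model on labeled target-scene plus associated-scene samples, then use it to assign virtual abnormality category labels to the unlabeled samples before retraining) can be sketched roughly as follows. All names are illustrative assumptions, since the patent does not specify a model architecture: a toy nearest-centroid classifier stands in for the category prediction model, and the graph-based refinement of the virtual labels is omitted here.

```python
import numpy as np

class TinyModel:
    """Toy stand-in for the category prediction model (nearest class centroid)."""
    def fit(self, x, y):
        self.classes = np.unique(y)
        self.centroids = np.array([x[y == c].mean(axis=0) for c in self.classes])
        return self

    def predict(self, x):
        # distance from every sample to every class centroid; pick the closest
        d = np.linalg.norm(x[:, None, :] - self.centroids[None, :, :], axis=2)
        return self.classes[d.argmin(axis=1)]

def train_with_virtual_labels(target_lx, target_ly, source_x, source_y, unlabeled_x):
    # Step 1: pre-train on labeled target-scene + associated-scene samples.
    model = TinyModel().fit(np.vstack([target_lx, source_x]),
                            np.concatenate([target_ly, source_y]))
    # Step 2: the predicted abnormality category of each unlabeled sample
    # becomes its virtual abnormality category label.
    virtual_y = model.predict(unlabeled_x)
    # Step 3 (simplified): retrain on labeled target data plus the
    # pseudo-labeled data; the patent refines virtual_y over a similarity
    # graph before this step.
    model.fit(np.vstack([target_lx, unlabeled_x]),
              np.concatenate([target_ly, virtual_y]))
    return model, virtual_y
```

This sketch only illustrates the data flow; the real model would be a trainable neural classifier with a feature extraction layer, as the later embodiments describe.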
In one aspect, an embodiment of the present application provides a data processing apparatus, including:
the sample acquisition module is used for acquiring object sample data belonging to a target scene type and associated tag sample data belonging to an associated scene type; the object sample data comprises object label sample data and label-free sample data; the associated scene type and the target scene type have scene association relation;
The model training module is used for training the initial class prediction model according to the object label sample data and the associated label sample data to obtain a class prediction model;
The label prediction module is used for determining a prediction abnormality class corresponding to the unlabeled sample data through the class prediction model, and taking the prediction abnormality class corresponding to the unlabeled sample data as a virtual abnormality class label of the unlabeled sample data;
the target data determining module is used for determining the object sample data and the associated label sample data as target sample data;
The tag adjustment module is used for optimizing and adjusting the virtual abnormal category tag according to the similarity between every two target sample data, the real abnormal category tag corresponding to the object tag sample data and the real abnormal category tag corresponding to the associated tag sample data to obtain a target abnormal category tag; the target abnormal category label is used for carrying out optimization training on the category prediction model together with the real abnormal category label corresponding to the object label sample data.
In one embodiment, the model training module includes:
The model prediction unit is used for inputting the object tag sample data and the associated tag sample data into an initial category prediction model, and outputting a first prediction abnormal category and a first prediction scene type corresponding to the object tag sample data and a second prediction abnormal category and a second prediction scene type corresponding to the associated tag sample data through the initial category prediction model;
The real tag acquisition unit is used for acquiring a first real abnormal category tag corresponding to the object tag sample data and the target scene type, and a second real abnormal category tag corresponding to the associated tag sample data and the associated scene type;
The loss value determining unit is used for determining a first loss function value according to the first predicted abnormal category, the real abnormal category label corresponding to the object label sample data, the first predicted scene type and the target scene type;
The loss value determining unit is further configured to determine a second loss function value according to the second predicted abnormal category, the real abnormal category label corresponding to the associated label sample data, the second predicted scene type, and the associated scene type;
the model training unit is used for training the initial class prediction model according to the first loss function value and the second loss function value to obtain a class prediction model.
In one embodiment, the model training unit comprises:
a loss value generation subunit configured to generate a target loss function value according to the first loss function value and the second loss function value;
The model determining subunit is used for taking the initial class prediction model as a class prediction model if the objective loss function value meets the model convergence condition;
And the model adjustment subunit is used for acquiring a gradient optimization function if the objective loss function value does not meet the model convergence condition, and adjusting the model parameters of the initial class prediction model according to the gradient optimization function and the objective loss function value to obtain a class prediction model containing the adjusted model parameters.
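A minimal sketch of the training unit just described, assuming the two loss function values are combined by simple addition and that the gradient optimization function is plain gradient descent — neither choice is fixed by the text:

```python
import numpy as np

def training_step(params, grad_fn, loss1, loss2, lr=0.1, tol=1e-3):
    """One round of loss combination and the convergence check (illustrative)."""
    target_loss = loss1 + loss2      # combine first and second loss values
    if target_loss < tol:            # model convergence condition is met:
        return params, True          # keep current parameters as the final model
    # Otherwise apply the gradient optimization function (plain gradient
    # descent assumed here) to adjust the model parameters.
    params = params - lr * grad_fn(params)
    return params, False
```

For example, with a quadratic loss whose gradient is `2 * params`, one step from `params = [2.0]` with `lr = 0.1` moves the parameter to `1.6`.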
In one embodiment, the tag adjustment module includes:
the node determining unit is used for determining each target sample data as a node of the graph network;
The node determining unit is further used for taking the real abnormal category label corresponding to the object label sample data as a node value of the node belonging to the object label sample data;
the node determining unit is further used for taking the virtual abnormal category label corresponding to the unlabeled sample data as a node value of the node belonging to the unlabeled sample data;
The node determining unit is further used for taking the real abnormal category label corresponding to the associated label sample data as a node value of the node belonging to the associated label sample data;
A similarity determination unit configured to determine a similarity between each two pieces of target sample data;
the graph network construction unit is used for constructing a graph network according to the similarity, the nodes corresponding to each target sample data and the node value of each node;
and the label optimizing unit is used for optimizing and adjusting the virtual abnormal class labels according to the graph network to obtain target abnormal class labels.
In one embodiment, the target sample data includes target sample data S_i and target sample data S_j; i and j are positive integers;
The similarity determination unit includes:
A first feature extraction subunit, configured to input the target sample data S_i and the target sample data S_j into the class prediction model, and extract, through the feature extraction layer of the class prediction model, a hidden feature vector k_a corresponding to the target sample data S_i and a hidden feature vector k_b corresponding to the target sample data S_j; a and b are positive integers;
A distance determination subunit, configured to determine a vector distance between the hidden feature vector k_a and the hidden feature vector k_b;
The first similarity determining subunit is configured to use the vector distance as the similarity between the target sample data S_i and the target sample data S_j.
In one embodiment, the target sample data includes target sample data S_i and target sample data S_j; i and j are positive integers;
The similarity determination unit includes:
A second feature extraction subunit, configured to input the target sample data S_i and the target sample data S_j into the class prediction model, and extract, through the feature extraction layer of the class prediction model, a hidden feature vector k_a corresponding to the target sample data S_i and a hidden feature vector k_b corresponding to the target sample data S_j; a and b are positive integers;
The cosine determining subunit is configured to determine an angle value between the hidden feature vector k_a and the hidden feature vector k_b, and determine a cosine value between the hidden feature vector k_a and the hidden feature vector k_b according to the angle value;
And a second similarity determination subunit, configured to use the cosine value as the similarity between the target sample data S_i and the target sample data S_j.
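The two similarity variants above — the vector distance between hidden feature vectors, and the cosine of the angle between them — can be written directly. Computing the cosine from the normalized dot product is equivalent to first determining the angle value and then taking its cosine, as the text describes:

```python
import numpy as np

def euclidean_similarity(ka, kb):
    """Vector distance between two hidden feature vectors (first variant)."""
    return float(np.linalg.norm(ka - kb))

def cosine_similarity(ka, kb):
    """Cosine value between two hidden feature vectors (second variant)."""
    return float(np.dot(ka, kb) / (np.linalg.norm(ka) * np.linalg.norm(kb)))
```

Note the two measures point in opposite directions: a small vector distance means high similarity, while a cosine value near 1 means high similarity, so a threshold on one cannot be reused directly for the other.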
In one embodiment, the graph network construction unit includes:
a set determining subunit, configured to determine a similarity between each two nodes as a similarity set;
The target value determining subunit is used for comparing each similarity in the similarity set with a similarity threshold value and acquiring target similarity which is larger than or equal to the similarity threshold value from the similarity set;
and the graph network generation subunit is used for creating a correlation edge between two nodes with target similarity and generating a graph network containing the node corresponding to each target sample data, the node value of each node and the correlation edge.
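A sketch of the graph network construction unit above, assuming a generic pairwise similarity function is supplied: an associated edge is created only between node pairs whose similarity reaches the threshold (the "target similarities").

```python
import numpy as np

def build_graph(features, sim_fn, threshold):
    """Return {(i, j): similarity} edges between nodes whose pairwise
    similarity meets the threshold (names are illustrative)."""
    n = len(features)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            s = sim_fn(features[i], features[j])
            if s >= threshold:          # keep only target similarities
                edges[(i, j)] = s       # create an associated edge i <-> j
    return edges
```

The node values themselves (real labels for labeled nodes, virtual labels for unlabeled ones) would be stored alongside this edge dictionary to complete the graph network.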
In one embodiment, the tag optimization unit includes:
a node selection subunit, configured to obtain a node corresponding to the label-free sample data in the graph network, as a target node;
An associated node subunit, configured to obtain, in the graph network, a node having an associated edge with a target node, as a target associated node;
the node optimization subunit is used for optimizing and adjusting the node value of the target node according to the node value of the target associated node and the similarity between the target associated node and the target node to obtain the target node value;
And the label determining subunit is used for determining the target abnormal category label according to the target node value.
In one embodiment, the target associated nodes include a first target associated node and a second target associated node;
the node optimization subunit is further specifically configured to obtain a first similarity between the first target association node and the target node, and a second similarity between the second target association node and the target node;
The node optimization subunit is further specifically configured to obtain a first node value of the first target association node and a second node value of the second target association node;
the node optimization subunit is further specifically configured to multiply the first similarity with a first node value to obtain a first operation value;
The node optimization subunit is further specifically configured to multiply the second similarity with a second node value to obtain a second operation value;
The node optimization subunit is further specifically configured to optimize and adjust a node value of the target node according to the first operation value and the second operation value, so as to obtain the target node value.
In one embodiment, the node optimization subunit is further specifically configured to perform addition processing on the first operation value and the second operation value to obtain a target operation value;
The node optimization subunit is further specifically configured to obtain a tag value corresponding to the target operation value, and match the tag value corresponding to the target operation value with a node value of the target node;
the node optimization subunit is further specifically configured to replace the node value of the target node with the tag value corresponding to the target operation value if the tag value corresponding to the target operation value is different from the node value of the target node, and perform optimization adjustment on the node value of the target node according to the first adjustment node value, the first similarity, the second adjustment node value and the second similarity of the first target associated node, so as to obtain the target node value; the first adjustment node value is obtained by optimizing and adjusting the first node value according to the node value of the associated node corresponding to the first target associated node; the second adjustment node value is obtained by optimizing and adjusting the second node value according to the node value of the associated node corresponding to the second target associated node;
The node optimization subunit is further specifically configured to determine that the node value of the target node is in a convergence state if the tag value corresponding to the target operation value is the same as the node value of the target node, and take the node value of the target node as the target node value.
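The node-value update just described resembles classic label propagation: each unlabeled node's operation value is the similarity-weighted sum of its associated nodes' values, a tag value is derived from it, and the process iterates until the tag value stops changing. The sketch below assumes node values are encoded as ±1 and that the tag value of an operation value is its sign; both are illustrative assumptions, not fixed by the text.

```python
def propagate_labels(node_values, edges, unlabeled, rounds=20):
    """Iteratively refresh each unlabeled node's value from the
    similarity-weighted values of its associated nodes (simplified)."""
    values = dict(node_values)
    # Build an adjacency list from the associated edges.
    neighbors = {}
    for (i, j), s in edges.items():
        neighbors.setdefault(i, []).append((j, s))
        neighbors.setdefault(j, []).append((i, s))
    for _ in range(rounds):
        changed = False
        for node in unlabeled:
            # target operation value: sum of similarity * neighbor node value
            op = sum(s * values[nb] for nb, s in neighbors.get(node, []))
            label = 1.0 if op >= 0 else -1.0   # tag value of the operation value
            if label != values[node]:
                values[node] = label           # replace and keep iterating
                changed = True
        if not changed:                        # node values are in convergence state
            break
    return values
```

Labeled nodes keep their real abnormality category labels throughout; only the unlabeled (virtual-label) nodes are updated.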
In one embodiment, the label determining subunit is further specifically configured to determine, as the convergence operational value, a target operational value corresponding to a label value that is the same as the node value of the target node when the node value of the target node is in the convergence state;
The label determining subunit is further specifically configured to determine an absolute value of a difference between the target node value and the convergence operation value;
The label determining subunit is further specifically configured to match the absolute value of the difference value with a first label threshold value and a second label threshold value, and obtain a target absolute value of the difference value greater than the first label threshold value or less than the second label threshold value from the absolute value of the difference value; the second tag threshold is less than the first tag threshold;
the label determining subunit is further specifically configured to determine a target node value corresponding to the absolute value of the target difference value as a target abnormal class label.
In one embodiment, the apparatus further comprises:
The data input module is used for inputting the object label sample data and the label-free sample data into the class prediction model, and outputting a third prediction abnormal class corresponding to the object label sample data and a fourth prediction abnormal class corresponding to the label-free sample data through the class prediction model;
The label acquisition module is used for acquiring a real abnormal class label corresponding to the object label sample data and a target abnormal class label corresponding to the label-free sample data;
The model optimization module is used for determining a third loss value according to the third predicted abnormal category, the real abnormal category label corresponding to the object label sample data, the fourth predicted abnormal category and the target abnormal category label;
And the model optimization module is also used for carrying out optimization training on the class prediction model according to the third loss value to obtain a target class prediction model.
In one embodiment, the apparatus further comprises:
The target data acquisition module is used for acquiring data to be identified belonging to the type of the target scene;
The vector extraction module is used for inputting the data to be identified into the target category prediction model, and extracting the hidden feature vector of the data to be identified through the feature extraction layer of the target category prediction model;
The model application module is used for inputting the hidden feature vector of the data to be identified into a feature classification layer of the target category prediction model, and outputting the initial prediction abnormal category corresponding to the data to be identified and the prediction probability corresponding to the initial prediction abnormal category through the feature classification layer;
the model application module is further used for obtaining the maximum prediction probability from the prediction probabilities, and determining the initial prediction abnormal category corresponding to the maximum prediction probability as the prediction abnormal category corresponding to the data to be identified.
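The inference path just described — the feature classification layer outputs a prediction probability per initial predicted abnormality category, and the category with the maximum probability is returned — can be sketched as follows, assuming a softmax over raw classification-layer scores:

```python
import numpy as np

def identify(logits, classes):
    """Return the abnormality category with the largest prediction
    probability, plus that probability (softmax over raw scores assumed)."""
    exp = np.exp(logits - logits.max())   # subtract max for numerical stability
    probs = exp / exp.sum()
    best = int(probs.argmax())            # index of the maximum prediction probability
    return classes[best], float(probs[best])
```

Here `classes` is an illustrative list of category names aligned with the classification layer's output order.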
In one aspect, an embodiment of the present application provides a computer device, including: a processor and a memory;
The memory stores a computer program that, when executed by the processor, causes the processor to perform the methods of embodiments of the present application.
In one aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, perform a method according to embodiments of the present application.
In one aspect of the application, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the method provided in an aspect of the embodiment of the present application.
In the embodiments of the present application, an initial category prediction model is first trained on object label sample data belonging to the target scene type and associated label sample data belonging to the associated scene type, yielding a category prediction model that can output a predicted abnormality category for the unlabeled sample data of the target scene type. Then, based on the pairwise similarity between sample data and the real abnormality category labels of the labeled sample data (comprising the object label sample data and the associated label sample data), the predicted abnormality category is optimized and adjusted, producing an accurate target abnormality category label. Once the target abnormality category labels of the unlabeled sample data are obtained, the category prediction model can be further trained on those labels together with the real abnormality category labels of the object label sample data. In this way, the unlabeled sample data acquires abnormality category labels, so subsequent optimization training can use the object label sample data of the target scene type (i.e., the target domain) together with the previously unlabeled sample data, making the resulting category prediction model more accurate.
In summary, the method and device can effectively combine the unlabeled sample data of the target scene type with the labeled sample data during model training, so that the unlabeled sample data is exploited effectively even when the labeled sample data of the target domain is scarce, thereby improving the accuracy with which the model recognizes the target domain.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present invention or in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the invention; other drawings can be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a schematic diagram of a network architecture according to an embodiment of the present application;
Figs. 2a-2b are schematic diagrams of a model training architecture provided by embodiments of the present application;
Fig. 3 is a schematic view of a class identification scenario provided by an embodiment of the present application;
Fig. 4 is a schematic flow chart of a data processing method according to an embodiment of the present application;
Fig. 5 is a schematic diagram of adjusting the label of unlabeled sample data according to an embodiment of the present application;
Fig. 6 is a flowchart of training an initial class prediction model according to object label sample data and associated label sample data according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a system architecture according to an embodiment of the present application;
Fig. 8 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the present application.
The present application relates to the field of artificial intelligence, and for ease of understanding, artificial intelligence and its related technical concepts will be described below.
Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is thus the study of the design principles and implementation methods of various intelligent machines, enabling machines to perceive, reason, and make decisions.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include directions such as computer vision, speech processing, natural language processing, and machine learning/deep learning.
The scheme provided by the embodiments of the present application belongs to machine learning (Machine Learning, ML) in the field of artificial intelligence.
Machine learning (Machine Learning, ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout all fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Referring to fig. 1, fig. 1 is a schematic diagram of a network architecture according to an embodiment of the present application. As shown in fig. 1, the network architecture may include a service server 1000 and a user terminal cluster, and the user terminal cluster may include one or more user terminals, the number of which is not limited here. As shown in fig. 1, the plurality of user terminals may include a user terminal 100a, a user terminal 100b, a user terminal 100c, …, and a user terminal 100n; the user terminal 100a, the user terminal 100b, the user terminal 100c, …, and the user terminal 100n may each establish a network connection with the service server 1000, so that each user terminal can perform data interaction with the service server 1000 through the network connection.
It will be appreciated that each user terminal shown in fig. 1 may be provided with a target application; when the target application runs in a user terminal, it may interact with the service server 1000 shown in fig. 1, so that the service server 1000 can receive service data from each user terminal. The target application may include an application having data information processing functions for text, images, audio, video, and the like. For example, the application may be a risk identification application, used by a user to input data and obtain the risk category corresponding to the data. The application may be a stand-alone application, or may be an embedded application integrated in a client (e.g., a social client, an educational client, a multimedia client, etc.), which is not limited here.
As shown in fig. 1, the service server 1000 in the embodiment of the present application may be a server corresponding to the target application. The service server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, big data, and artificial intelligence platforms.
For easy understanding, the embodiment of the present application may select one user terminal from the plurality of user terminals shown in fig. 1 as a target user terminal. The user terminal may include: smart terminals carrying multimedia data processing functions (e.g., video data playing functions, music data playing functions), such as smartphones, tablet computers, notebook computers, desktop computers, smart televisions, smart speakers, smartwatches, and the like, but is not limited thereto. For example, the embodiment of the present application may use the user terminal 100a shown in fig. 1 as the target user terminal, and the target user terminal may have integrated therein the target application with the data information processing function. At this time, the target user terminal may implement data interaction between the service data platform corresponding to the target application and the service server 1000.
For example, when the target user is using a target application (e.g., a risk identification application) in the target user terminal, the target user may upload service data in the target application, and the service server 1000 may obtain the service data uploaded by the target user through the target application in the target user terminal. It can be understood that, the service server 1000 may be deployed with a target class prediction model, and after the service server 1000 obtains the service data uploaded by the target user, the service server 1000 may perform risk class identification on the service data through the target class prediction model, so as to identify whether the risk class of the service data is a "risk class" or a "risk-free class". The service server 1000 may return the risk category identification result to the target user terminal, and the target user may view the risk category to which the service data belongs in the display interface of the target user terminal.
It should be understood that, in order to improve the accuracy with which the target class prediction model identifies service data, the target class prediction model may be trained, so that the trained target class prediction model has higher identification accuracy. For ease of understanding, please refer to fig. 2a-2b, which are schematic diagrams of a model training architecture provided by an embodiment of the present application.
As shown in fig. 2a, labeled sample data belonging to a target scene type, unlabeled sample data belonging to the target scene type, and labeled sample data belonging to an associated scene type may be obtained. The two kinds of labeled sample data may then be input into an initial class prediction model, and preliminary training optimization may be performed on the initial class prediction model by using them, to obtain a class prediction model. The unlabeled sample data belonging to the target scene type may then be input into the class prediction model, which may output a risk class (the predicted anomaly class shown in fig. 2a) corresponding to the unlabeled sample data; this predicted anomaly class may be used as the virtual anomaly class label of the unlabeled sample data. Further, as shown in fig. 2b, a graph network may be constructed based on the real anomaly class labels of the labeled sample data belonging to the associated scene type, the real anomaly class labels of the labeled sample data belonging to the target scene type, and the virtual anomaly class labels of the unlabeled sample data belonging to the target scene type.
According to the real abnormal category labels corresponding to the labeled sample data in the graph network (which may include the labeled sample data belonging to the associated scene type and the labeled sample data belonging to the target scene type), the predicted abnormal category of the unlabeled sample data can be optimized and adjusted to obtain a target abnormal category label, and the target abnormal category label can be used as the real abnormal category label of the unlabeled sample data. In this way, both the labeled sample data and the unlabeled sample data of the target scene type carry real abnormal category labels. Further, the labeled sample data and the unlabeled sample data belonging to the target scene type can be input into the class prediction model, and the class prediction model can be trained and optimized according to this sample data and its real abnormal category labels, to obtain the final target class prediction model. For the specific implementation of constructing the graph network, and of optimizing and adjusting the predicted abnormal category of the unlabeled sample data according to the graph network to obtain the target abnormal category label, reference may be made to the description in the embodiment corresponding to fig. 4 below.
It should be appreciated that the trained target class prediction model may be deployed in the service server 1000 for online service invocation; that is, the target class prediction model may be used to perform risk category identification on data to be identified that belongs to the target scene type. For example, after the service server 1000 obtains the service data uploaded by the target user, the service server 1000 may identify the risk category of the service data through the target class prediction model. Since the target class prediction model is trained with the labeled sample data and the unlabeled sample data of the target scene type, the scene type to which the service data belongs should be the target scene type. The scene types may include transaction scenes (such as consumption scenes, shopping scenes, payment scenes, loan scenes, product supply scenes, etc.), rights allocation scenes, and the like. When a certain scene type is used as the target scene type, a scene type having an association relation with it (for example, high scene similarity and strong business association) may be used as an associated scene type of the target scene type; the service data may refer to the data used for risk assessment under the target scene type.
For example, suppose the target scene type is a loan scene. An enterprise initiates a loan request to a virtual property provider (e.g., a bank), and the virtual property provider may obtain relevant data of the enterprise (e.g., goods-related data, property-related data, sales-related data, cost-related data). The virtual property provider may serve as the target user and upload the relevant data of the enterprise through the target application; the relevant data of the enterprise may serve as the service data, and the service server 1000 may perform risk assessment on the relevant data of the enterprise through the target class prediction model and identify the risk category corresponding to the relevant data of the enterprise.
Alternatively, it may be understood that the target class prediction model may be deployed in a user terminal, and after the user terminal obtains the service data, the risk class of the service data may be identified by the target class prediction model in the user terminal.
Optionally, it may be appreciated that, to ensure the authenticity of the risk category of the service data, the service data and its corresponding risk category may be written to a blockchain; since a blockchain cannot be forged or tampered with, it can ensure that the risk category of the service data is authentic. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms; it is mainly used to sort data in chronological order and encrypt it into a ledger, so that the ledger cannot be tampered with or forged, while the data can still be verified, stored, and updated. A blockchain is essentially a decentralized database in which each node stores an identical copy of the chain. A blockchain network distinguishes its nodes into core nodes, data nodes, and light nodes, where the core nodes are responsible for the consensus of the blockchain network, i.e., the core nodes are the consensus nodes. The process of writing transaction data (such as the service data and its risk category) into the ledger may be as follows: the user terminal sends the transaction data to a data node or a light node; the transaction data is then relayed between the data nodes or light nodes in the blockchain network until a consensus node receives it; the consensus node packages the transaction data into a block and reaches consensus with the other consensus nodes; after the consensus passes, the block carrying the transaction data is written into the ledger.
It will be appreciated that the method provided by the embodiments of the present application may be performed by a computer device, including but not limited to a user terminal or a service server. The user terminal and the service server may be directly or indirectly connected through wired or wireless communication, which is not limited herein.
For ease of understanding, please refer to fig. 3, which is a schematic diagram of an identification scenario provided by an embodiment of the present application. The user terminal A shown in fig. 3 may be any user terminal selected from the user terminal cluster in the embodiment corresponding to fig. 1, for example, the user terminal 100a; the user terminal B shown in fig. 3 may likewise be any user terminal selected from that cluster, for example, the user terminal 100b; the service server shown in fig. 3 may be the service server 1000 in the embodiment corresponding to fig. 1.
As shown in fig. 3, a user a may initiate a loan request through the user terminal A. In the virtual property service system (an application installed in the user terminal A), the user a may input the loan amount desired (e.g., the borrowing amount of 100000 yuan shown in fig. 3); after the user a inputs the loan amount and clicks the confirmation control, the user terminal A may respond to this trigger operation of the user a and display an information input interface. As shown in fig. 3, the information input interface includes prompt messages requesting the relevant user information of the borrower and the information input boxes corresponding to the prompt messages; in the input boxes, the user a can input relevant user information (such as the account address of the user a, relevant information of the enterprise where the user a works, historical loan records, credit records, and reported records), and the user a can click the confirmation control when the information input is completed. The user terminal A may respond to this trigger operation of the user a, generate the loan request, obtain the user information input by the user a, and send the loan request and the user information of the user a to the user terminal B (the user terminal B may be the terminal corresponding to a virtual property provider). The user b (the virtual property provider) may send the user information of the user a to the service server through a risk identification application (installed in the user terminal B), and the service server may input part of the user information of the user a (including the enterprise-related information, historical loan records, credit records, and reported records) into the target class prediction model. The target class prediction model may be trained with sample data belonging to the loan scene type (including unlabeled sample data and labeled sample data) and labeled sample data belonging to associated scene types (scene types having an association relation with the loan scene type, such as the deposit scene type and the property investment scene type), and may output the risk category corresponding to the part of the user information of the user a. As shown in fig. 3, the risk category output by the target class prediction model is the "no risk" category, and the service server may return the result identified by the target class prediction model (the "no risk" category) to the user terminal B. The user b can view the identification result through the user terminal B; if, according to the identification result, the user b agrees to provide the user a with the virtual property data corresponding to the loan amount (100000 yuan), the user b can transfer the 100000 yuan to the account address provided by the user a. After the user b successfully transfers the virtual property data, the user terminal A may display a loan-approved prompt message in its display interface (such as the message "you have passed the application and the account with the tail number xxxxxx has received the virtual property 100000" shown in fig. 3), and may also display a "check balance" control; if the user a clicks the "check balance" control, the balance in the account address (including the 100000 yuan of virtual property data) can be viewed.
It should be appreciated that, by performing optimization training on the target class prediction model with the labeled sample data and unlabeled sample data belonging to the target scene type together with the labeled sample data belonging to the associated scene type, the target class prediction model after optimization training can quickly and accurately identify the risk category of the data to be identified belonging to the target scene type, so that subsequent service processing can be performed quickly according to the risk category.
Further, referring to fig. 4, fig. 4 is a flow chart of a data processing method according to an embodiment of the application. The method may be performed by a user terminal (e.g., any user terminal in the user terminal cluster shown in fig. 1, such as the user terminal 100 b) or by a service server (e.g., the service server 1000 shown in fig. 1), or by both the user terminal and the service server. For easy understanding, this embodiment is described by taking the method performed by the service server as an example, to describe a specific process of model training in the service server. Wherein, the method at least comprises the following steps S101-S103:
Step S101, obtaining object sample data belonging to a target scene type and associated tag sample data belonging to an associated scene type; the object sample data comprises object label sample data and label-free sample data; the associated scene type and the target scene type have scene association relation.
In the present application, a scene type may refer to a type of scene in which a risk identification (risk assessment) process exists. For example, in a sales scene, the two parties to the sale need to perform risk assessment (e.g., credit assessment) on each other to ensure that the sale can be completed successfully; a risk identification process therefore exists in the sales scene, and the scene types in the present application may include the sales scene type. It should be understood that, in addition to the above-mentioned sales scene, a risk identification process also exists in consumption scenes, shopping scenes, loan scenes, product supply scenes, leasing scenes, rights allocation scenes, property investment scenes, etc., and the scene types of the present application may also include these scene types. It can be understood that any one of the scene types can be used as the target scene type, and when a certain scene type is used as the target scene type, a scene type having a scene association relation with it can be used as an associated scene type of the target scene type; a scene type having a business association relation with the target scene type, or a scene type with high business similarity to the target scene type, can be determined as a scene type having a scene association relation with the target scene type.
It should be appreciated that the object tag sample data may refer to data that carries a risk category label (e.g., a "high risk" category label, a "low risk" category label, a "risky" category label, etc.) under the target scene type; the label-free sample data may refer to data that carries no risk category label under the target scene type; and the associated tag sample data may refer to data that carries a risk category label under the associated scene type. For positive sample data among the object tag sample data and the associated tag sample data, the risk category label may be represented by a uniform value (for example, the value 0); for negative sample data among the object tag sample data and the associated tag sample data, the risk category label may be represented by another uniform value (e.g., the value 1). It should be appreciated that when the risk category label of positive or negative sample data is represented by a uniform value, that uniform value may be referred to as the risk category label of the positive or negative sample data; for example, when negative sample data is represented using the uniform value 1, the risk category label of the negative sample data is 1. Negative sample data may refer to sample data whose risk category is the "risky" category or the "high risk" category among the object tag sample data and the associated tag sample data, and positive sample data may refer to sample data whose risk category is the "no risk" category or the "low risk" category.
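As an illustration of the uniform-value encoding described above, the mapping from a risk category to its label value could be sketched as below. The textual category names are illustrative placeholders, not identifiers from the source:

```python
def encode_risk_label(risk_category):
    """Map a textual risk category to the uniform label value used for
    training: positive samples ("no risk" / "low risk") -> 0,
    negative samples ("risky" / "high risk") -> 1."""
    positive = {"no risk", "low risk"}
    negative = {"risky", "high risk"}
    if risk_category in positive:
        return 0
    if risk_category in negative:
        return 1
    raise ValueError(f"unknown risk category: {risk_category}")
```

For instance, `encode_risk_label("risky")` yields 1, matching the example in the text.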
For example, if the risk category of a certain object tag sample data in the object tag sample data is a "risky" category, the risk category tag of the object tag sample data may be represented by a value of 1, and the risk category tag of the object tag sample data is 1; if the risk category of a certain object tag sample data in the object tag sample data is a "no risk" category, the risk category tag of the object tag sample data may be represented by a value of 0, and the risk category tag of the object tag sample data is 0.
It should be noted that, in general, the number of object tag sample data (i.e., data with risk category tags) under the target scene type is far smaller than the number of unlabeled sample data (i.e., data without risk category tags).
Step S102, training an initial category prediction model according to the object tag sample data and the associated tag sample data to obtain a category prediction model, determining a prediction abnormality category corresponding to the untagged sample data through the category prediction model, and taking the prediction abnormality category corresponding to the untagged sample data as a virtual abnormality category tag of the untagged sample data.
The predicted anomaly class is the predicted risk class corresponding to the unlabeled sample data. The risk class may include a "risky" class and a "no risk" class; the "risky" class may be characterized by the value 1, and the "no risk" class may be characterized by the value 0. For example, if the predicted risk class corresponding to the unlabeled sample data is the "no risk" class, the class prediction model may output the value 0 to characterize that the predicted anomaly class of the unlabeled sample data is the "no risk" class, and the value 0 may be used as the virtual anomaly class label of the unlabeled sample data.
According to the method, the initial class prediction model can be preliminarily trained through the object tag sample data and the associated tag sample data to obtain the class prediction model. The initial class prediction model may include a feature extraction layer and a feature classification layer. The object tag sample data and the associated tag sample data are input into the initial class prediction model; the hidden feature vectors corresponding to the object tag sample data and the associated tag sample data respectively can be extracted by the feature extraction layer, and the hidden feature vectors are input into the feature classification layer, which can output the predicted anomaly classes corresponding to the hidden feature vectors (i.e., the predicted risk classes corresponding to the object tag sample data and the associated tag sample data respectively). According to these predicted anomaly classes and the real anomaly class labels corresponding to the object tag sample data and the associated tag sample data respectively, a loss function value of the initial class prediction model can be calculated, and the initial class prediction model can be optimized and adjusted through the loss function value; for example, the model parameters of the initial class prediction model may be updated according to the loss function value, and the model parameters at which the loss function value converges may be used as the model parameters of the trained class prediction model.
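The preliminary training described above can be sketched as follows, as a minimal NumPy illustration: a single hidden layer stands in for the feature extraction layer, a sigmoid output for the feature classification layer, and gradient descent on a cross-entropy loss for updating the model parameters. The architecture, dimensions, and hyper-parameters are assumptions chosen for illustration; the source does not specify them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_initial_model(x, y, hidden_dim=8, lr=0.5, epochs=500, seed=0):
    """Train a minimal feature-extraction + classification model.

    x: (n, d) labeled samples (object-tag and associated-tag data combined);
    y: (n,) real anomaly class labels (0 = "no risk", 1 = "risky").
    w1/b1 act as the feature extraction layer, w2/b2 as the classification layer.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(0, 0.1, (d, hidden_dim)); b1 = np.zeros(hidden_dim)
    w2 = rng.normal(0, 0.1, hidden_dim);      b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)           # hidden feature vectors
        p = sigmoid(h @ w2 + b2)           # predicted anomaly probability
        g = (p - y) / n                    # gradient of the cross-entropy loss
        w2 -= lr * (h.T @ g); b2 -= lr * g.sum()
        gh = np.outer(g, w2) * (1 - h ** 2)
        w1 -= lr * (x.T @ gh); b1 -= lr * gh.sum(axis=0)
    return w1, b1, w2, b2

def hidden_features(x, w1, b1):
    """Hidden feature vectors produced by the feature extraction layer."""
    return np.tanh(x @ w1 + b1)

def predict_proba(x, w1, b1, w2, b2):
    """Predicted 'risky' probability from the feature classification layer."""
    return sigmoid(hidden_features(x, w1, b1) @ w2 + b2)
```

The sketch stops after a fixed number of epochs; in the scheme above, one would instead keep the parameters at which the loss function value converges.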
Further, the unlabeled sample data is input into the class prediction model, and the predicted anomaly class corresponding to the unlabeled sample data can be output through the class prediction model. The predicted anomaly class may refer to a predicted risk class. Since the class prediction model is obtained by training the initial class prediction model, the class prediction model also includes a feature extraction layer and a feature classification layer; similarly, after the unlabeled sample data is input into the class prediction model, the hidden feature vector corresponding to the unlabeled sample data can be extracted through the feature extraction layer, the hidden feature vector is input into the feature classification layer, and the predicted anomaly class corresponding to the unlabeled sample data can be output through the feature classification layer.
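The step of turning the model output into a virtual anomaly class label can be sketched as below, assuming the feature classification layer outputs a probability of the "risky" class and using an assumed decision threshold of 0.5 (the source does not specify how the output is discretized):

```python
import numpy as np

def virtual_anomaly_labels(risky_proba, threshold=0.5):
    """Discretize the predicted 'risky' probabilities of unlabeled samples
    into virtual anomaly class labels: 1 = "risky", 0 = "no risk"."""
    return (np.asarray(risky_proba, dtype=float) >= threshold).astype(int)
```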
Step S103, determining object sample data and associated label sample data as target sample data, and optimizing and adjusting the virtual abnormal class label according to the similarity between every two target sample data, the real abnormal class label corresponding to the object label sample data and the real abnormal class label corresponding to the associated label sample data to obtain a target abnormal class label; the target abnormal category label is used for carrying out optimization training on the category prediction model together with the real abnormal category label corresponding to the object label sample data.
In the application, the abnormal category can refer to a risk category, and the real abnormal category label of the object label sample data and the associated label sample data can refer to a risk category label. The risk category labels may include "high risk" category labels, "low risk" category labels, "risky" category labels, "no risk" category labels, and the like. Taking the example that the risk category label includes a "risk" category label and a "no risk" category label, if the "risk" category label is represented by a uniform value of 1 and the "no risk" category label is represented by a uniform value of 0, the risk category label may include 0 (i.e., the "no risk" category label) and 1 (i.e., the "risk" category label), and the real abnormal category label may include 0 and 1.
It should be appreciated that, to improve the generalization ability of the class prediction model (generalization ability refers to the model's ability to identify unseen data, i.e., its identification accuracy), the class prediction model may be optimally trained by using the unlabeled sample data under the target scene type together with the object label sample data, so as to obtain the target class prediction model. A specific method for performing this optimization training may be as follows: the virtual abnormal class label of the unlabeled sample data can be optimized and adjusted based on the object label sample data and the associated label sample data by using a graph-network label diffusion algorithm, to obtain a more accurate target abnormal class label; then, the class prediction model is optimally trained based on the object label sample data (with real abnormal class labels) and the unlabeled sample data (with target abnormal class labels) under the target scene type, to obtain the target class prediction model. A specific method for obtaining the target abnormal class label of the unlabeled sample data may be as follows: the object sample data (including the object label sample data and the unlabeled sample data) and the associated label sample data can be determined as target sample data; the similarity between every two pieces of target sample data can be determined; a graph network corresponding to the target sample data can be constructed according to the similarity; and the virtual abnormal class label of the unlabeled sample data can be adjusted by the graph-network label diffusion algorithm to obtain the target abnormal class label. Further, the unlabeled sample data and its target abnormal class label can be applied to the optimization training of the class prediction model.
It should be appreciated that, by obtaining the target anomaly class label of the unlabeled sample data based on the similarity and the labeled sample data (including the object label sample data and the associated label sample data), the target anomaly class label of the unlabeled sample data can be accurately determined and used as its real anomaly class label. Once the unlabeled sample data has a real anomaly class label, it can be applied to the subsequent optimization training of the class prediction model, so that the class prediction model can be jointly trained with both the object label sample data and the unlabeled sample data under the target scene type, instead of being trained with the object label sample data alone. In the common situation where the target scene type has little object label sample data and much unlabeled sample data, training the model with the small amount of object label sample data combined with the large amount of unlabeled sample data allows the risk category of the data to be identified (data under the target scene type) to be identified more accurately. It should be appreciated that the graph-network label diffusion algorithm is a graph-based semi-supervised learning method. The algorithm uses the relations between sample data (e.g., the target sample data) to build a complete graph, in which the nodes include labeled sample data (e.g., the object label sample data and the associated label sample data) and unlabeled sample data, and the label of a node can be transferred to other nodes according to the similarity, so that the label information of the unlabeled sample data can be predicted and adjusted by using the information of the labeled sample data.
For ease of understanding, the method for adjusting the virtual abnormal category label corresponding to the unlabeled sample data by the graph-network label diffusion algorithm to obtain the target abnormal category label is described below. Each piece of target sample data may be determined as a node of the graph network. Then, the real abnormal category label corresponding to the object label sample data can be used as the node value of the nodes belonging to the object label sample data; the virtual abnormal category label corresponding to the unlabeled sample data can be used as the node value of the nodes belonging to the unlabeled sample data; and the real abnormal category label corresponding to the associated label sample data can be used as the node value of the nodes belonging to the associated label sample data. Next, the similarity between every two pieces of target sample data can be determined, and the graph network can be constructed according to the similarity, the node corresponding to each piece of target sample data, and the node value of each node. Finally, the virtual abnormal category label can be optimized and adjusted according to the graph network to obtain the target abnormal category label.
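The node construction and diffusion just described can be sketched as classic graph label propagation. In this illustrative version, edge weights come from an RBF kernel over the hidden feature vectors, node values are diffused along a row-normalized transition matrix, and the nodes holding real abnormal category labels are clamped after every step; the kernel, bandwidth, and iteration count are assumptions, not details from the source.

```python
import numpy as np

def propagate_labels(features, labels, labeled_mask, sigma=1.0, iters=100):
    """Graph label diffusion over the target sample data.

    features: (n, d) hidden feature vectors of all target sample data;
    labels:   (n,) initial node values -- real abnormal category labels for
              labeled nodes, virtual abnormal category labels for the rest;
    labeled_mask: (n,) bool, True where the label is a real one (clamped).
    Returns soft node values in [0, 1].
    """
    # Edge weights from pairwise similarity (RBF kernel on vector distance).
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    t = w / w.sum(axis=1, keepdims=True)    # row-normalized transition matrix
    f = labels.astype(float).copy()
    for _ in range(iters):
        f = t @ f                           # diffuse labels along the edges
        f[labeled_mask] = labels[labeled_mask]  # clamp the real labels
    return f
```

Thresholding the returned soft value of an unlabeled node at 0.5 would give its target abnormal category label.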
In the following, a specific method for determining the similarity between every two pieces of target sample data is described, taking as an example target sample data S_i and target sample data S_j (i and j are both positive integers): the target sample data S_i and S_j may be input into the class prediction model, and the hidden feature vector k_a corresponding to S_i and the hidden feature vector k_b corresponding to S_j may be extracted by the feature extraction layer of the class prediction model, where a and b are positive integers; subsequently, the vector distance between the hidden feature vector k_a and the hidden feature vector k_b may be determined, and this vector distance may be taken as the similarity between the target sample data S_i and the target sample data S_j.

Optionally, the similarity between every two pieces of target sample data may instead be determined as follows: after extracting the hidden feature vectors k_a and k_b as above, the angle between the hidden feature vector k_a and the hidden feature vector k_b may be determined, and the cosine value of that angle may be computed; the cosine value may then be used as the similarity between the target sample data S_i and the target sample data S_j.

It can be understood that after extracting the hidden feature vectors corresponding to two pieces of target sample data, a vector distance or a cosine value between the two hidden feature vectors can be determined, and that vector distance or cosine value can be used as the similarity between the two pieces of target sample data.
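The two similarity measures above can be sketched in plain Python as follows (a minimal sketch; the function name `hidden_similarity` and the list-of-floats representation of the hidden feature vectors are illustrative assumptions, not from the source):

```python
import math

def hidden_similarity(k_a, k_b, metric="cosine"):
    """Similarity between two hidden feature vectors extracted by the
    feature extraction layer: either their vector distance or the cosine
    value of the angle between them."""
    if metric == "distance":
        # Euclidean vector distance between the two hidden feature vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(k_a, k_b)))
    # Cosine value of the angle between the two hidden feature vectors.
    dot = sum(x * y for x, y in zip(k_a, k_b))
    norm = math.sqrt(sum(x * x for x in k_a)) * math.sqrt(sum(y * y for y in k_b))
    return dot / norm
```

Note that the two metrics are oriented differently: a larger cosine value means more similar, while a larger distance means less similar, so any downstream threshold must match the metric chosen.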
Further, the graph network can be constructed according to the similarity, the node corresponding to each target sample data and the node value of each node, and the specific method can be as follows: the similarity between every two nodes can be determined as a similarity set; then, each similarity in the similarity set can be compared with a similarity threshold value, and target similarity which is larger than or equal to the similarity threshold value is obtained in the similarity set; and creating a correlation edge between two nodes with target similarity, and generating a graph network containing the node corresponding to each target sample data, the node value of each node and the correlation edge. It should be appreciated that if the similarity between any two target sample data is greater than or equal to the similarity threshold, then an associated edge may be created between the nodes corresponding to the two target sample data (i.e., the two nodes are edge connected), thereby resulting in a graph network that includes a plurality of nodes having different node values and one or more associated edges.
It should be appreciated that each piece of the object label sample data and the associated label sample data corresponds to a true anomaly class label, which can be characterized using a value of 0 or a value of 1: the true anomaly class label can be a value of 0 for positive sample data (e.g., sample data with a risk category of the "no risk" category) and a value of 1 for negative sample data (e.g., sample data with a risk category of the "risky" category). For the unlabeled sample data, a predicted anomaly category (i.e., a virtual anomaly class label) can be output by the class prediction model; the predicted anomaly category can be characterized by a value of 0 when it is the "no risk" category and by a value of 1 when it is the "risky" category. Each piece of target sample data (including the object label sample data, the associated label sample data and the unlabeled sample data) may be used as a node, and the value characterizing the label of each piece of target sample data may be used as the node value of its corresponding node; for example, if unlabeled sample data A is used as node A and its predicted anomaly category is 0 (i.e., the "no risk" category), then the node value of node A is 0. Then, the similarity between any two pieces of target sample data can be determined; if the similarity between two pieces of target sample data is greater than or equal to the similarity threshold, an associated edge can be created between the two corresponding nodes, and the similarity is determined as the edge weight of that associated edge. In this way a graph network comprising the nodes, the node values (0 or 1), the associated edges and the edge weights can be obtained.
The similarity threshold may be a manually specified value, for example 0.5, 0.51 or 0.6; the possible values are not exhaustively listed here.
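The graph-construction step described above — keeping only node pairs whose similarity reaches the threshold and using the similarity as the edge weight — might be sketched as follows (the function name and the dictionary-based graph representation are assumptions):

```python
def build_graph(node_values, similarities, threshold=0.5):
    """node_values: {node_id: 0 or 1} (the label of each target sample data);
    similarities: {(i, j): similarity} for every pair of nodes.
    An associated edge, weighted by the similarity, is created between any
    two nodes whose similarity is greater than or equal to the threshold."""
    edges = {}
    for (i, j), sim in similarities.items():
        if sim >= threshold:       # a "target similarity"
            edges[(i, j)] = sim    # edge weight = similarity
    return {"values": dict(node_values), "edges": edges}
```

Usage with the similarity threshold 0.5 from the text: pairs below the threshold simply receive no associated edge, so the resulting graph may be sparse even though every pairwise similarity was computed.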
Further, the virtual abnormal class label corresponding to the unlabeled exemplar data can be optimized and adjusted according to the graph network to obtain the target abnormal class label, and the specific method can be as follows: the nodes corresponding to the unlabeled sample data can be obtained in the graph network and used as target nodes; the node with the associated edge between the node and the target node can be obtained in the graph network and used as the target associated node; according to the node value of the target associated node and the similarity between the target associated node and the target node, the node value of the target node can be optimized and adjusted to obtain the target node value, and the target abnormal class label can be determined according to the target node value.
Taking the example that the target associated node comprises a first target associated node and a second target associated node, the specific method for optimizing and adjusting the node value of the target node according to the node value of the target associated node and the similarity between the target associated node and the target node to obtain the target node value is explained as follows: a first similarity between the first target associated node and the target node and a second similarity between the second target associated node and the target node can be obtained; subsequently, a first node value of the first target associated node and a second node value of the second target associated node may be obtained; multiplying the first similarity with the first node value to obtain a first operation value; multiplying the second similarity with the second node value to obtain a second operation value; the node value of the target node can be optimized and adjusted according to the first operation value and the second operation value, and the target node value is obtained. 
The specific method for optimizing and adjusting the node value of the target node according to the first operation value and the second operation value to obtain the target node value can be as follows: the first operation value and the second operation value can be added to obtain a target operation value; then, the label value corresponding to the target operation value can be obtained and matched against the node value of the target node. If the label value corresponding to the target operation value is different from the node value of the target node, the node value of the target node can be replaced with that label value, and the node value of the target node can continue to be optimized and adjusted according to the first adjustment node value of the first target associated node together with the first similarity, and the second adjustment node value of the second target associated node together with the second similarity, so as to obtain the target node value. Here, the first adjustment node value is obtained by optimizing and adjusting the first node value according to the node values of the associated nodes of the first target associated node, and the second adjustment node value is obtained by optimizing and adjusting the second node value according to the node values of the associated nodes of the second target associated node. If the label value corresponding to the target operation value is the same as the node value of the target node, the node value of the target node can be determined to be in a convergence state, and that node value can be taken as the target node value.
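A single optimization step for one target node, as described above (sum the neighbor node values weighted by similarity, threshold the result into a label value, and compare it with the current node value), could look like this sketch (the preset threshold 0.7 follows the later example in the text; function and variable names are hypothetical):

```python
def update_node(node_value, neighbors, preset_threshold=0.7):
    """One optimization step for a target node. `neighbors` is a list of
    (neighbor_node_value, similarity_to_target) pairs, e.g. the first and
    second target associated nodes. Returns (new_value, converged)."""
    # Target operation value: sum of each neighbor's node value multiplied
    # by the similarity (edge weight) between that neighbor and the target.
    target_operation_value = sum(v * s for v, s in neighbors)
    # Label value corresponding to the target operation value.
    label_value = 1 if target_operation_value >= preset_threshold else 0
    if label_value == node_value:
        return node_value, True    # convergence state: keep the node value
    return label_value, False      # replace the node value and continue
```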
It should be understood that the label value corresponding to the target operation value may be one of the labels used for the target sample data; for example, when a value A (e.g., the value 1) is used as the label of negative sample data (e.g., sample data with a risk category of the "risky" category) and a value B (e.g., the value 0) is used as the label of positive sample data (e.g., sample data with a risk category of the "no risk" category), the label value corresponding to the target operation value may be the value A or the value B (i.e., 0 or 1). Taking the value 1 as the label of negative sample data and the value 0 as the label of positive sample data as an example: after the graph network is constructed, the node value of each node is that node's label (e.g., 0 or 1). Each node in the graph network can be traversed; for each node, the edge weights (i.e., similarities) from its neighboring nodes (i.e., the nodes with which associated edges exist, e.g., the target associated nodes) to the node can be obtained, the node values of all the neighboring nodes can be weighted (i.e., the node value of each neighboring node is multiplied by the edge weight from that neighboring node to the node), and the multiplication results of all the neighboring nodes can be added. If the weighted result (i.e., the target operation value) is greater than or equal to a preset threshold (which may be a manually specified value, e.g., 0.7), the label value corresponding to the target operation value can be determined as 1; if the weighted result is smaller than the preset threshold, the label value corresponding to the target operation value can be determined as 0.
Further, the label value corresponding to the target operation value may be matched against the label (node value) of the node. If the label value is inconsistent with the label of the node, the label of the node may be updated to the label value corresponding to the weighted result; for example, for the target node, if the label value corresponding to the target operation value is not identical to the node value of the target node, the node value of the target node may be replaced with that label value. If the label value corresponding to the weighted result is consistent with the label of the node, the label of the node is not updated.
Further, the above method can be applied continuously: each node in the graph network is traversed, the node values of its neighboring nodes are weighted, and the node value of each node is updated (or left unchanged) based on the weighted result. After the graph network has been cycled n times (n is a positive integer), until no node in the graph network updates its label, the graph network can be determined to be stable (i.e., in a convergence state), and the node value of each node in the graph network can be taken as the final result. That is, after the graph network is stable, the node corresponding to the unlabeled sample data (i.e., the target node) also carries its final node value (i.e., the target node value).
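The full diffusion procedure — traversing every node, weighting its neighbors' node values, and repeating until no node updates its label — can be sketched as follows (a simplified synchronous variant; the patent does not fix the traversal order, and all names are hypothetical):

```python
def propagate_labels(values, edges, preset_threshold=0.7, max_rounds=100):
    """Graph-network label diffusion: values is {node_id: 0 or 1},
    edges is {(i, j): edge_weight}. Repeats weighted-neighbor updates
    until no node changes its label (the graph network is stable)."""
    values = dict(values)
    neighbors = {n: [] for n in values}
    for (i, j), w in edges.items():
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
    for _ in range(max_rounds):
        snapshot = dict(values)   # weight against the current node values
        changed = False
        for node in values:
            # Target operation value: neighbor node values times edge weights.
            weighted = sum(snapshot[m] * w for m, w in neighbors[node])
            label = 1 if weighted >= preset_threshold else 0
            if label != values[node]:
                values[node] = label
                changed = True
        if not changed:
            break   # convergence state: no node updated its label
    return values
```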
Further, after obtaining the target node value corresponding to each target node, the target node value of each target node may be determined as a target anomaly class label, and all the unlabeled sample data together with their target anomaly class labels may be applied to the optimization training of the class prediction model. The specific optimization training method may be as follows: the object label sample data and the unlabeled sample data can be input into the class prediction model, and a third predicted anomaly category corresponding to the object label sample data and a fourth predicted anomaly category corresponding to the unlabeled sample data can be output by the class prediction model; then, the real anomaly class label corresponding to the object label sample data and the target anomaly class label corresponding to the unlabeled sample data can be obtained; a third loss value can be determined from the third predicted anomaly category, the real anomaly class label of the object label sample data, the fourth predicted anomaly category and the target anomaly class label; and the class prediction model can be optimized and trained according to the third loss value to obtain the target class prediction model. For example, the model parameters of the class prediction model are updated based on the third loss value, and when the third loss value converges, the model parameters at convergence are taken as the model parameters of the trained target class prediction model, which can be used for identifying the risk categories of input data.
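One plausible form of the third loss value is a cross-entropy computed jointly over the object label sample data (against the real anomaly class labels) and the unlabeled sample data (against the target anomaly class labels). The patent does not specify the loss function, so the binary cross-entropy and the simple averaging below are assumptions:

```python
import math

def third_loss(probs_labeled, true_labels, probs_unlabeled, target_labels):
    """Sketch of the third loss value: binary cross-entropy over the object
    label sample data (against the real anomaly class labels, 0/1) plus the
    unlabeled sample data (against the target anomaly class labels, 0/1).
    Each prob is the model's predicted probability of the "risky" class."""
    def bce(p, y):
        # Standard binary cross-entropy for one sample.
        return -(y * math.log(p) + (1 - y) * math.log(1 - p))
    loss = sum(bce(p, y) for p, y in zip(probs_labeled, true_labels))
    loss += sum(bce(p, y) for p, y in zip(probs_unlabeled, target_labels))
    return loss / (len(true_labels) + len(target_labels))
```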
Optionally, it may be understood that the specific method for determining the target anomaly class label according to the target node value may further be: when the node value of the target node is in a convergence state, the target operation value whose corresponding label value is the same as the node value of the target node can be determined as the convergence operation value; subsequently, the absolute value of the difference between the target node value and the convergence operation value may be determined; the absolute values of the differences are matched against a first label threshold and a second label threshold, and the target absolute differences that are larger than the first label threshold or smaller than the second label threshold are obtained from among them, the second label threshold being less than the first label threshold; and the target node values corresponding to the target absolute differences are determined as target anomaly class labels. The first label threshold and the second label threshold may be manually specified values; for example, the first label threshold may be 0.9 or 0.97, and the second label threshold may be 0.1 or 0.15; the examples are not exhaustive.
It should be understood that after obtaining the target node value corresponding to each target node, the target node values of a portion of the target nodes (for example, the nodes with higher prediction confidence) may be selected as target anomaly class labels, and these target anomaly class labels may be applied to the training of the class prediction model. A node with higher prediction confidence may refer to a target node for which the absolute value of the difference between the convergence operation value and the target node value is greater than the first label threshold (e.g., 0.9) or less than the second label threshold (e.g., 0.1); the convergence operation value refers to the target operation value whose corresponding label value is the same as the node value of the target node when that node value is in a convergence state. For example, the node corresponding to unlabeled sample data q is node q; the target operation value obtained from the neighboring nodes of node q is 0.96, and the label value corresponding to that target operation value is 1. Since this label value is the same as the current node value 1 of node q, node q can be determined to be in a convergence state, the target operation value 0.96 can be determined as the convergence operation value, and the current node value 1 of node q can be determined as the target node value of node q.
Because the absolute value of the difference between the convergence operation value 0.96 and the target node value 1 of node q is 0.04 (|0.96-1|=0.04), and this absolute difference is smaller than the second label threshold 0.1, node q can be determined to be a node with higher prediction confidence, the target node value 1 of node q can be determined as the target anomaly class label, and the unlabeled sample data q corresponding to node q, together with its target anomaly class label 1, can be applied to the subsequent training of the class prediction model.
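The confidence filter described above — keeping only target nodes whose absolute difference between the convergence operation value and the target node value exceeds the first label threshold or falls below the second label threshold — might be sketched as follows (the thresholds 0.9 and 0.1 follow the example; the names are hypothetical):

```python
def select_confident_labels(nodes, first_threshold=0.9, second_threshold=0.1):
    """nodes: {node_id: (target_node_value, convergence_operation_value)}.
    Keeps only higher-confidence nodes, i.e. those whose absolute difference
    |convergence_operation_value - target_node_value| is greater than the
    first label threshold or smaller than the second label threshold."""
    selected = {}
    for node_id, (value, conv) in nodes.items():
        diff = abs(conv - value)
        if diff > first_threshold or diff < second_threshold:
            selected[node_id] = value   # the target anomaly class label
    return selected
```

For the node q of the example, |0.96 - 1| = 0.04 < 0.1, so node q passes the filter and its target node value 1 is kept as its target anomaly class label.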
Further, the trained target class prediction model may be deployed in a server for online invocation. A specific method for applying the target class prediction model may be: the data to be identified belonging to the target scene type is obtained; the data to be identified is input into the target class prediction model, and the hidden feature vector of the data to be identified is extracted by the feature extraction layer of the target class prediction model; the hidden feature vector of the data to be identified is input into the feature classification layer of the target class prediction model, and the initial predicted anomaly categories corresponding to the data to be identified, together with the prediction probability of each, are output by the feature classification layer; the maximum prediction probability is obtained from among the prediction probabilities, and the initial predicted anomaly category corresponding to the maximum prediction probability is determined as the predicted anomaly category of the data to be identified.
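The final selection step at inference time — choosing the initial predicted anomaly category with the maximum prediction probability — reduces to an argmax over the feature classification layer's outputs; a minimal sketch (the dictionary representation of the layer's output is an assumption):

```python
def predict_anomaly_category(prediction_probs):
    """prediction_probs: {initial predicted anomaly category: probability},
    as output by the feature classification layer for the data to be
    identified. Returns the category with the maximum prediction probability."""
    return max(prediction_probs, key=prediction_probs.get)
```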
In the embodiment of the application, the initial class prediction model is initially trained through object label sample data belonging to the target scene type and associated label sample data belonging to the associated scene type to obtain a class prediction model, and the class prediction model can output a prediction abnormality class of unlabeled sample data belonging to the target scene type, wherein the prediction abnormality class can be used as a virtual abnormality class label of unlabeled sample data; then, each sample data can be used as a node (the label corresponding to each node is the node value), a graph network is constructed based on the similarity between every two sample data in the sample data, and the real abnormal class label with the label sample data in the graph network can be transmitted to the label-free sample data according to the similarity, so that the virtual abnormal class label of the label-free sample data can be optimally adjusted, and an accurate target abnormal class label can be obtained; after the target abnormal class label of the label-free sample data is obtained, the class prediction model can be optimized and trained based on the target abnormal class label of the label-free sample data and the real abnormal class label of the object label sample data. 
It can be seen that in the present application, based on the real anomaly class labels of the object label sample data under the target scene type and the real anomaly class labels of the associated label sample data under the associated scene type, accurate target anomaly class labels of the unlabeled sample data can be predicted, and the model is trained on both the object label sample data (i.e., the labeled sample data) belonging to the target scene type (i.e., the target domain) and the unlabeled sample data, so that the object label sample data and the unlabeled sample data are effectively combined and the class prediction model obtained by optimization training is more accurate. In summary, the method and the device can effectively combine the unlabeled sample data and the labeled sample data of the target scene type in model training, so that the unlabeled sample data can be effectively utilized when the labeled sample data of the target domain is scarce, and the recognition accuracy of the model on the target domain can be improved.
In order to facilitate understanding of the method for obtaining the specific implementation of the target abnormal class label by adjusting the virtual abnormal class label of the unlabeled exemplar data by using the graph network label diffusion algorithm, please refer to fig. 5, fig. 5 is a schematic diagram of adjusting the label of the unlabeled exemplar data according to the embodiment of the present application. Taking the example that the target sample data includes target sample data 1, target sample data 2 and target sample data 3, wherein the target sample data 2 is unlabeled sample data under the target scene type. The node corresponding to the target sample data 1 is node 1, the node corresponding to the target sample data 2 is node 2, and the node corresponding to the target sample data 3 is node 3; after comparing the similarity between any two pieces of target sample data with the similarity threshold value, determining that the similarity (0.7) between the target sample data 1 and the target sample data 2, the similarity (0.6) between the target sample data 2 and the target sample data 3, and the similarity (0.8) between the target sample data 1 and the target sample data 3 are all greater than the similarity threshold value 0.5, and creating a correlation edge between the node 1 and the node 2, between the node 2 and the node 3, and between the node 1 and the node 3 (the edge weight of the correlation edge between the two nodes is the similarity between the two nodes), so that the obtained graph network is shown as a graph network 50 a. 
In the graph network 50a, the node value of each node is a label corresponding to the sample data, for example, the node 1 is a node corresponding to the target sample data 1, the true abnormal class label of the target sample data 1 is 0 (the value 0 can be used to characterize a "no risk" class), and then the node value of the node 1 is 0; and node 2 is the node corresponding to the target sample data 2, the virtual anomaly class label of the target sample data 2 is 1 (the value 1 can be used for representing the "risky" class), and then the node value of the node 2 is 1.
Further, the weighted result may be obtained by weighting the node values of all the neighboring nodes of a node based on the edge weights (i.e., similarities) from those neighboring nodes (the associated nodes with which associated edges exist) to the node; that is, the weighted result is the sum of the products of the node value of each neighboring node and the edge weight from that neighboring node to the current node. The weighted result may then be matched against the preset threshold: if the weighted result is greater than or equal to the preset threshold, the label value corresponding to the weighted result may be determined as the value 1; if the weighted result is smaller than the preset threshold, the label value corresponding to the weighted result may be determined as the value 0. Then, the label value corresponding to the weighted result can be matched against the node value of the node; if the label value corresponding to the weighted result is inconsistent with the node value of the node, the node value of the node can be updated; if it is consistent, the node value of the node is not updated. This may be looped over the graph network n times until no node is updated, ultimately resulting in the graph network 50n.
For example, as shown in the graph network 50a, the neighboring nodes of node 1 are node 2 and node 3, so the node value (1) of node 2 may be multiplied by the similarity (0.7) between node 2 and node 1, the node value (0) of node 3 may be multiplied by the similarity (0.8) between node 3 and node 1, and the two multiplication results may be added (i.e., 1×0.7+0×0.8) to obtain a weighted result of 0.7. It may then be determined whether the weighted result 0.7 is greater than or equal to the preset threshold (0.8); because the weighted result 0.7 is less than the preset threshold 0.8, the label value corresponding to the weighted result may be determined to be 0. Then, it may be determined whether the label value 0 is the same as the node value (0) of node 1 (if so, the node value of node 1 is not updated; if not, the node value of node 1 is updated to the label value corresponding to the weighted result); because the label value 0 is the same as the node value 0, it may be determined that the node value of node 1 is in a convergence state, and the node value of node 1 is no longer updated. Similarly, the neighboring nodes of node 2 are node 1 and node 3, so the node values of the neighboring nodes of node 2 may be weighted (i.e., 0×0.7+0×0.6), and the obtained weighted result is 0; because the weighted result 0 is less than the preset threshold 0.8, the label value corresponding to the weighted result may be determined to be 0, and since the label value 0 differs from the node value (1) of node 2, the node value of node 2 may be updated to 0. Similarly, the neighboring nodes of node 3 are node 1 and node 2, so the node values of the neighboring nodes of node 3 may be weighted (i.e., 1×0.6+0×0.8), and the obtained weighted result is 0.6; because the weighted result 0.6 is less than the preset threshold 0.8, the label value corresponding to the weighted result may be determined to be 0, and since the label value 0 is the same as the node value (0) of node 3, it may be determined that the node value of node 3 is in a convergence state and the node value of node 3 is not updated. Thus, the graph network 50b shown in fig. 5 can be obtained.
Further, in the graph network 50b, the node values of all the neighboring nodes of each node may be weighted again; for example, for node 2, the node values of its neighboring nodes may be weighted (i.e., 0×0.7+0×0.6). Since the weighted result 0 is smaller than the preset threshold 0.8, the label value corresponding to the weighted result 0 may be determined as 0; then, whether the label value 0 is the same as the node value (0) of node 2 may be determined, and since the label value 0 is the same as the node value 0, the node value of node 2 may be determined to be in a convergence state and is no longer updated. Thus, the graph network 50c shown in fig. 5 can be obtained. It should be understood that in the graph network 50c, the weighted result of every node's neighboring nodes yields the same label value as that node's node value, so it may be determined that the graph network 50c is already in a convergence state, and the graph network 50c may be determined as the final graph network. The target node value of node 1 in the graph network 50c is 0, the target node value of node 2 is 0, and the target node value of node 3 is 0.
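The fig. 5 walk-through can be reproduced numerically; the script below uses the node values, edge weights and preset threshold from the example and runs synchronous update rounds until the graph network stabilizes (the synchronous update order is a simplifying assumption):

```python
# Fig. 5 walk-through: three nodes, edge weights = similarities,
# preset threshold 0.8 (all values taken from the example in the text).
values = {1: 0, 2: 1, 3: 0}                       # node values (labels)
edges = {(1, 2): 0.7, (2, 3): 0.6, (1, 3): 0.8}   # associated edges
threshold = 0.8

neighbors = {n: [] for n in values}
for (i, j), w in edges.items():
    neighbors[i].append((j, w))
    neighbors[j].append((i, w))

while True:
    snapshot = dict(values)   # weight against the pre-round node values
    for node in values:
        weighted = sum(snapshot[m] * w for m, w in neighbors[node])
        values[node] = 1 if weighted >= threshold else 0
    if values == snapshot:    # no node updated: convergence state
        break

print(values)   # {1: 0, 2: 0, 3: 0}: node 2's virtual label is corrected to 0
```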
Further, it should be appreciated that, because the target sample data 2 is unlabeled sample data under the target scene type, the target node value 0 of node 2 may be regarded as the target anomaly class label corresponding to the target sample data 2. The target anomaly class label 0 of the target sample data 2 can then be applied to the subsequent optimization training of the class prediction model. For example, the labeled sample data (i.e., the object label sample data, such as the target sample data 1) and the unlabeled sample data (the target sample data 2) under the target scene type may be input into the class prediction model, through which the prediction probabilities corresponding to the target sample data 1 and the target sample data 2 may be output. For example, if the prediction probabilities corresponding to the target sample data 1 are 0.8 (corresponding to the "risky" category) and 0.2 (corresponding to the "no risk" category), then because the prediction probability 0.8 is greater than the prediction probability 0.2, the predicted anomaly category of the target sample data 1 may be determined as the "risky" category; if the prediction probabilities corresponding to the target sample data 2 are 0.3 (corresponding to the "risky" category) and 0.7 (corresponding to the "no risk" category), then because the prediction probability 0.3 is smaller than the prediction probability 0.7, the predicted anomaly category of the target sample data 2 may be determined as the "no risk" category.
Then, a loss value 1 may be determined from the predicted anomaly category of the target sample data 1 (i.e., the "risky" category) and its true anomaly class label (i.e., 0, the "no risk" category); a loss value 2 may be determined from the predicted anomaly category of the target sample data 2 (i.e., the "no risk" category) and its label (i.e., the target anomaly class label 0, the "no risk" category); and the final loss value of the model can be generated from the loss value 1 and the loss value 2, the model parameters of the class prediction model being adjusted according to the final loss value until the final loss value converges, at which point the trained target class prediction model is obtained. It should be noted that in the above graph network the target sample data 1 is object label sample data in the target scene; after the graph network stabilizes, the target node value of each node may be determined, e.g., the node value of the node of the target sample data 1 (i.e., node 1) may be determined to be 0. Although this node value 0 in the graph network 50c is consistent with the original label of the target sample data 1 (the true anomaly class label 0), when the class prediction model is trained, the label value of the target sample data 1 is its original label, not its target node value in the graph network 50c. Of course, optionally, training of the class prediction model may also be performed based on the target node value of the target sample data 1 in the graph network (e.g., the node value 0 of node 1).
Optionally, it may be appreciated that, to further improve the accuracy of determining the similarity between every two pieces of target sample data, a domain adaptation method may be used to map the sample data from different scene types onto a unified common feature space before measuring the similarity. This avoids the problem that, because feature embedding vectors from different scene types follow different distributions, computing the similarity on the original feature space leads to domain clustering (samples grouping by scene type) and hence inaccurate similarities. To map sample data from different scene types into a unified common feature space, a domain classification layer may be added after the feature extraction layer in the above model (e.g., the above initial class prediction model); it will be appreciated that the initial class prediction model may thus include a feature extraction layer, a domain classification layer and a feature classification layer. For ease of understanding, these three layers of the initial class prediction model are described below:
Feature extraction layer: the feature extraction layer may be a fully connected network layer, which extracts the feature embedding vector of the input data (e.g., the sample data described above).
Domain classification layer: the domain classification layer may be formed of multiple fully connected network layers and acts as a domain discrimination network. The feature embedding vector extracted by the feature extraction layer may be input into the domain classification layer, which outputs a domain discrimination result (i.e., a predicted scene type) for that vector. It should be appreciated that the domain classification layer is constrained to fit the true domain relationship: if two samples come from the same domain (the same scene type), their true domain relationship takes one value (e.g., 1), and if they come from different domains (different scene types), it takes the other value (e.g., 0).
It should be appreciated that by constraining the fit to the true domain relationship, the domain classification layer supervises the mapping of sample data from different domains (scene types) into embedding vectors on the same hidden space; that is, sample data from different domains (e.g., the object label sample data and the associated label sample data) may be mapped into a common space, yielding hidden feature vectors on that common space. It can be understood that the feature extraction layer and the domain classification layer in the initial class prediction model have an adversarial relationship. In forward propagation, the feature embedding vector extracted by the feature extraction layer is passed into the domain classification layer, which judges whether the incoming feature embedding vector comes from the target scene type or an associated scene type and computes a domain classification loss value from that judgment; the domain classification layer aims to distinguish, as far as possible, which scene type the input feature embedding vector comes from. In back propagation, a gradient reversal layer between the domain classification layer and the feature extraction layer makes the training target of the feature extraction layer opposite to that of the domain classification layer; that is, the feature extraction layer is expected to output feature embedding vectors from which the domain classification layer cannot correctly determine the scene type of origin.
It should be appreciated that this adversarial relationship ultimately prevents the domain classification layer from accurately distinguishing the received feature embedding vectors, while the feature extraction layer successfully blends object label sample data belonging to the target scene type with associated label sample data belonging to the associated scene type in a common feature space.
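For ease of understanding, the gradient reversal behavior described above can be sketched in plain Python (the function names and the scaling factor `lam` are illustrative, not part of the application): forward propagation through the gradient reversal layer is an identity mapping, while back propagation negates the gradient flowing from the domain classification layer into the feature extraction layer, producing the adversarial training target.

```python
def grl_forward(x):
    # Forward pass: the gradient reversal layer is an identity mapping,
    # so the feature embedding vector reaches the domain classification
    # layer unchanged.
    return x

def grl_backward(grad_from_domain_classifier, lam=1.0):
    # Backward pass: the incoming gradient is multiplied by -lam, so the
    # feature extraction layer is updated in the direction that makes the
    # domain classification layer's judgment harder.
    return [-lam * g for g in grad_from_domain_classifier]

grad = [0.5, -0.2, 0.1]
print(grl_backward(grad, lam=1.0))  # [-0.5, 0.2, -0.1]
```

In practice the same idea is usually realized inside an automatic-differentiation framework, but the sign flip in the backward pass is all the layer does.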
Feature classification layer: the feature classification layer may be formed of multiple fully connected network layers and acts as a feature recognition classification network. The feature embedding vector extracted by the feature extraction layer may be input into the feature classification layer, which outputs a risk category prediction result (i.e., a predicted abnormal category) for that vector. It is understood that the initial class prediction model may be a Domain Adaptive Neural Network (DANN) model.
It may be appreciated that when the domain classification layer is included in the initial class prediction model, then in the process of preliminarily training the initial class prediction model on the object label sample data and the associated label sample data to obtain the class prediction model, a cross-entropy loss function value may be calculated jointly from the real scene type label of the object label sample data (i.e., the target scene type), the real scene type label of the associated label sample data (i.e., the associated scene type), and the real abnormal category labels respectively corresponding to the object label sample data and the associated label sample data; the model parameters of the initial class prediction model may then be optimized and updated based on this cross-entropy loss value using a reverse gradient update mechanism. For a specific method, see the description of steps S201-S205 in the embodiment corresponding to fig. 6.
Referring to fig. 6, fig. 6 is a flowchart of training an initial class prediction model on object label sample data and associated label sample data according to an embodiment of the present application. As shown in fig. 6, the flow may include the following steps S201 to S205:
in step S201, the object tag sample data and the associated tag sample data are input to the initial category prediction model, and the first prediction abnormal category and the first prediction scene type corresponding to the object tag sample data and the second prediction abnormal category and the second prediction scene type corresponding to the associated tag sample data are output through the initial category prediction model.
In the application, the object label sample data and the associated label sample data are input into the initial class prediction model, and hidden feature vectors respectively corresponding to the object label sample data and the associated label sample data can be extracted through the feature extraction layer of the initial class prediction model. The hidden feature vector corresponding to the object label sample data is input into the feature classification layer, which outputs the first predicted abnormal category corresponding to the object label sample data; the hidden feature vector corresponding to the associated label sample data is input into the feature classification layer, which outputs the second predicted abnormal category corresponding to the associated label sample data. The hidden feature vector corresponding to the object label sample data is also input into the domain classification layer of the initial class prediction model, which outputs the first predicted scene type corresponding to the object label sample data; and the hidden feature vector of the associated label sample data is input into the domain classification layer, which outputs the second predicted scene type corresponding to the associated label sample data.
Step S202, a first real abnormal category label and a target scene type corresponding to object label sample data and a second real abnormal category label and an associated scene type corresponding to associated label sample data are obtained.
In the application, the target scene type is the real scene type tag of the object tag sample data, and the associated scene type is the real scene type tag of the associated tag sample data.
Step S203, a first loss function value is determined according to the first predicted abnormal category, the real abnormal category label corresponding to the object label sample data, the first predicted scene type and the target scene type.
According to the method, a first loss function value, which is a cross entropy loss function value of object tag sample data, can be generated according to a first prediction exception type, a real exception type tag corresponding to the object tag sample data, a first prediction scene type, a target scene type and a cross entropy loss function.
Step S204, a second loss function value is determined according to the second predicted abnormal category, the real abnormal category label corresponding to the associated label sample data, the second predicted scene type and the associated scene type.
In the application, a cross entropy loss function value of the associated label sample data, namely a second loss function value, can be generated according to the second prediction exception type, the real exception type label corresponding to the associated label sample data, the second prediction scene type, the associated scene type and the cross entropy loss function.
Step S205, training the initial class prediction model according to the first loss function value and the second loss function value to obtain a class prediction model.
In the application, training the initial class prediction model according to the first loss function value and the second loss function value may be carried out as follows: generating a target loss function value according to the first loss function value and the second loss function value; if the target loss function value meets the model convergence condition, the initial class prediction model may be used as the class prediction model; if the target loss function value does not meet the model convergence condition, a gradient optimization function may be obtained, and the model parameters of the initial class prediction model are adjusted according to the gradient optimization function and the target loss function value, so as to obtain a class prediction model containing the adjusted model parameters.
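The generation of the target loss function value from the two cross-entropy loss function values can be sketched as follows (a minimal illustration in plain Python; the probability vectors, the two-term class-plus-domain composition, and the simple sum are assumptions for illustration, not prescribed by the application):

```python
import math

def cross_entropy(pred_probs, true_index):
    # Cross-entropy loss for one sample: negative log of the probability
    # assigned to the true class (clamped to avoid log(0)).
    p = max(pred_probs[true_index], 1e-12)
    return -math.log(p)

# Hypothetical predictions: each loss combines an abnormal-category term
# and a scene-type (domain) term for one sample.
first_loss = cross_entropy([0.8, 0.2], 0) + cross_entropy([0.6, 0.4], 0)
second_loss = cross_entropy([0.3, 0.7], 1) + cross_entropy([0.4, 0.6], 1)

# Target loss function value generated from the first and second losses.
target_loss = first_loss + second_loss
print(target_loss > 0)  # True
```

A weighted sum (e.g., down-weighting the domain term) is an equally plausible combination rule; the convergence condition is then checked on `target_loss`.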
In the embodiment of the application, the initial class prediction model can be trained in a supervised manner based on the adversarial relationship between the feature extraction layer and the domain classification layer, so that the class prediction model obtained by training blends sample data of the target scene type and the associated scene type in a common feature space. That is, the hidden feature vectors of the object label sample data, the label-free sample data, and the associated label sample data extracted by the class prediction model are feature vectors on a unified common feature space; the domain clustering problem can therefore be avoided when the similarity between any two sample data is calculated based on these hidden feature vectors, and the calculated similarity is more accurate. The virtual abnormal category label of the label-free sample data can then be optimized and adjusted based on this accurate similarity to obtain a more accurate target abnormal category label, which in turn improves the model when it is later trained based on the target abnormal category label.
Further, referring to fig. 7, fig. 7 is a schematic diagram of a system architecture according to an embodiment of the present application. The system provided by the application can be a risk identification system, and as shown in fig. 7, the system can comprise a risk identification system sample preparation module, a risk identification system training module and a risk identification system identification module. In order to facilitate understanding of the function of each module in the system, the risk identification system sample preparation module, the risk identification system training module, and the risk identification system identification module will be described below:
Risk identification system sample preparation module: the sample preparation module may be configured to obtain sample data for the training module. The sample data may include all sample data from the target domain and from one or more source domains; specifically, labeled sample data and unlabeled sample data of the target domain, and labeled sample data of the source domain. It should be appreciated that a domain may refer to a scene type: the target domain refers to the target scene type, and a source domain refers to an associated scene type.
The sample preparation module of the risk identification system can divide the labeled sample data of the target domain into a labeled training set and a labeled test set; the labeled training set can be used for subsequent model training, and the labeled test set can be used for testing the trained model. When dividing the labeled sample data of the target domain, a cumulative distribution function (e.g., a K-S test function) may be used to verify whether two empirical distributions differ, or whether one empirical distribution differs from an ideal distribution; the KS value is commonly used as an evaluation index of a model's ability to distinguish positive from negative samples. It should be appreciated that using the cumulative distribution function can yield a balanced distribution of positive and negative samples in the resulting labeled training set (which matters because, for example, the number of sample data labeled with the "risky" category may be very small compared with the "no risk" category).
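The KS evaluation index mentioned above can be sketched as a two-sample Kolmogorov-Smirnov statistic in plain Python (the score values are illustrative; a real system would compute this over model scores on the labeled test set):

```python
def ks_statistic(pos_scores, neg_scores):
    # Two-sample K-S statistic: the maximum gap between the empirical
    # cumulative distribution functions of the positive and negative groups.
    xs = sorted(set(pos_scores) | set(neg_scores))
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    best = 0.0
    for x in xs:
        cdf_pos = sum(s <= x for s in pos_scores) / n_pos
        cdf_neg = sum(s <= x for s in neg_scores) / n_neg
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

# Scores a model assigns to "risky" (positive) vs "no risk" (negative) samples:
print(ks_statistic([0.9, 0.8, 0.7], [0.2, 0.3, 0.4]))  # 1.0 (perfect separation)
```

A KS value near 1.0 indicates the model separates the two classes well; a value near 0.0 indicates the score distributions are indistinguishable.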
Risk identification system training module: in the risk recognition system training module, training processing can be performed on the initial category prediction model through labeled sample data from a source domain and a labeled training set of a target domain, and specifically the method can comprise the following steps of 1-5:
Step 1, inputting the labeled sample data from the source domain and the labeled training set of the target domain into the feature extraction layer of the initial class prediction model, and performing unified coding processing on both through the feature extraction layer, so as to obtain hidden feature vectors, in a common space, of the labeled sample data of the source domain and of the labeled training set of the target domain.
And 2, inputting the hidden feature vector extracted by the feature extraction layer into a domain classification layer, predicting from which domain the labeled sample data and the labeled training set of the source domain come respectively (namely outputting the respective corresponding predicted scene types) through the domain classification layer, and performing domain countermeasure.
And step 3, inputting the hidden feature vector extracted by the feature extraction layer into a feature classification layer, predicting risk probabilities respectively corresponding to the labeled sample data and the labeled training set of the source domain through the feature classification layer, and predicting risk categories respectively corresponding to the labeled sample data and the labeled training set according to the risk probabilities.
Step 4, calculating a loss function value (e.g., cross entropy loss function value) of the initial class prediction model based on the real domain classification (i.e., real scene type label) and the real risk class label (i.e., real anomaly class label) of the sample data (including labeled sample data of the source domain and labeled training set of the target domain).
And 5, updating model parameters of the initial class prediction model based on the cross entropy loss function value until the cross entropy loss function value is converged, and taking the model when the cross entropy loss function value is converged as a class prediction model.
For the specific implementation manners of the steps 1 to 5, refer to the descriptions of the steps S201 to S205 in the embodiment corresponding to fig. 6, and the detailed descriptions will be omitted here.
Further, after model training is performed by using labeled sample data from a source domain and a labeled training set of a target domain to obtain a class prediction model, a graph network label diffusion algorithm can be utilized to obtain a target abnormal class label of unlabeled sample data. For a specific implementation manner, reference may be made to the description of the target abnormality category label obtained in step S103 in the embodiment corresponding to fig. 4, which will not be described herein.
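The graph network label diffusion step can be sketched as follows (a simplified illustration in plain Python; the similarity-weighted averaging, the fixed-node treatment of labeled data, and the stopping rule are assumptions consistent with the description of step S103, not a definitive implementation):

```python
def diffuse_labels(values, fixed, neighbors, n_iters=20):
    # values: initial node values (1 = "risky", 0 = "no risk");
    # fixed: indices of nodes whose real labels must not change;
    # neighbors: for each node, a list of (other_node, similarity) edges.
    v = list(values)
    for _ in range(n_iters):
        new_v = list(v)
        for i, edges in enumerate(neighbors):
            if i in fixed or not edges:
                continue
            # Similarity-weighted average of the associated nodes' values.
            total_w = sum(w for _, w in edges)
            new_v[i] = sum(w * v[j] for j, w in edges) / total_w
        if new_v == v:  # node values have converged
            break
        v = new_v
    return v

# Nodes 0 and 1 are labeled (0 = "no risk", 1 = "risky"); node 2 is
# unlabeled and connected more strongly to node 1, so diffusion pulls
# its node value toward 1.
values = [0.0, 1.0, 0.5]
neighbors = [[], [], [(0, 0.2), (1, 0.8)]]
final = diffuse_labels(values, fixed={0, 1}, neighbors=neighbors)
print(round(final[2], 2))  # 0.8
```

The converged node value of each unlabeled node is then mapped back to a target abnormal category label.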
Further, the class prediction model may be trained based on a labeled training set of the target domain and unlabeled sample data from which a target anomaly class label is obtained, to obtain a trained model (i.e., a target class prediction model).
Risk identification system identification module: in the risk recognition system recognition module, the recognition of the risk class may be performed on the data to be recognized based on the trained model, for example, the data to be recognized may be input into the trained model, and through the trained model, a prediction result (i.e., a recognition result of the risk class) corresponding to the data to be recognized may be output.
In the embodiment of the application, a risk identification system based on semi-supervised learning with graph network label diffusion and domain-adversarial transfer learning is provided. By fully combining a small amount of labeled sample data of the target domain, a large amount of unlabeled sample data of the target domain, and a large amount of labeled sample data of the source domain, a risk identification model (such as the target class prediction model) with strong generalization capability and high accuracy is established; this risk identification model can more accurately identify the risk category of data to be identified from the target domain.
Further, referring to fig. 8, fig. 8 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing apparatus may be a computer program (including program code) running in a computer device, for example the data processing apparatus is an application software; the data processing device may be used to perform the method shown in fig. 4. As shown in fig. 8, the data processing apparatus 1 may include: a sample acquisition module 11, a model training module 12, a tag prediction module 13, a target data determination module 14, and a tag adjustment module 15.
A sample acquiring module 11, configured to acquire object sample data belonging to a target scene type and associated tag sample data belonging to an associated scene type; the object sample data comprises object label sample data and label-free sample data; the associated scene type and the target scene type have scene association relation;
The model training module 12 is configured to train the initial class prediction model according to the object label sample data and the associated label sample data to obtain a class prediction model;
the label prediction module 13 is configured to determine a predicted abnormal class corresponding to the unlabeled exemplar data through a class prediction model, and take the predicted abnormal class corresponding to the unlabeled exemplar data as a virtual abnormal class label of the unlabeled exemplar data;
A target data determining module 14, configured to determine both the object sample data and the associated tag sample data as target sample data;
The tag adjustment module 15 is configured to perform optimization adjustment on the virtual abnormal category tag according to the similarity between every two target sample data, the real abnormal category tag corresponding to the object tag sample data, and the real abnormal category tag corresponding to the associated tag sample data, so as to obtain a target abnormal category tag; the target abnormal category label is used for carrying out optimization training on the category prediction model together with the real abnormal category label corresponding to the object label sample data.
The specific implementation manners of the sample acquiring module 11, the model training module 12, the tag predicting module 13, the target data determining module 14, and the tag adjusting module 15 may be referred to the description of step S101 to step S103 in the embodiment corresponding to fig. 4, and will not be described herein.
In one embodiment, model training module 12 may include: model prediction unit 121, real tag acquisition unit 122, loss value determination unit 123, and model training unit 124.
A model prediction unit 121, configured to input the object tag sample data and the associated tag sample data into an initial category prediction model, and output a first prediction anomaly category and a first prediction scene type corresponding to the object tag sample data and a second prediction anomaly category and a second prediction scene type corresponding to the associated tag sample data through the initial category prediction model;
The real tag obtaining unit 122 is configured to obtain a first real abnormal category tag corresponding to the object tag sample data and the target scene type, and a second real abnormal category tag corresponding to the associated tag sample data and the associated scene type;
A loss value determining unit 123, configured to determine a first loss function value according to the first predicted abnormal category, the real abnormal category label corresponding to the object label sample data, the first predicted scene type, and the target scene type;
The loss value determining unit 123 is further configured to determine a second loss function value according to the second predicted abnormal category, the real abnormal category label corresponding to the associated label sample data, the second predicted scene type, and the associated scene type;
the model training unit 124 is configured to train the initial class prediction model according to the first loss function value and the second loss function value, so as to obtain a class prediction model.
The specific implementation manners of the model prediction unit 121, the real tag obtaining unit 122, the loss value determining unit 123, and the model training unit 124 may be referred to the descriptions of step S201 to step S205 in the embodiment corresponding to fig. 6, and will not be described herein.
In one embodiment, model training unit 124 may include: a loss value generation subunit 1241, a model determination subunit 1242, and a model adjustment subunit 1243.
A loss value generation subunit 1241, configured to generate a target loss function value according to the first loss function value and the second loss function value;
a model determining subunit 1242, configured to take the initial class prediction model as a class prediction model if the objective loss function value meets the model convergence condition;
The model adjustment subunit 1243 is configured to obtain a gradient optimization function if the target loss function value does not meet the model convergence condition, and adjust the model parameters of the initial class prediction model according to the gradient optimization function and the target loss function value, so as to obtain a class prediction model containing the adjusted model parameters.
The specific implementation manners of the loss value generation subunit 1241, the model determination subunit 1242, and the model adjustment subunit 1243 may be referred to the description in step S205 in the embodiment corresponding to fig. 6, which will not be described herein.
In one embodiment, the tag adjustment module 15 may include: a node determination unit 151, a similarity determination unit 152, a graph network construction unit 153, and a label optimization unit 154.
A node determining unit 151 for determining each target sample data as a node of the graph network;
the node determining unit 151 is further configured to use the real anomaly class label corresponding to the object label sample data as a node value of a node belonging to the object label sample data;
the node determining unit 151 is further configured to use a virtual exception type label corresponding to the unlabeled exemplar data as a node value of a node belonging to the unlabeled exemplar data;
the node determining unit 151 is further configured to use the real abnormal category label corresponding to the associated label sample data as a node value of a node belonging to the associated label sample data;
A similarity determination unit 152 for determining a similarity between each two pieces of target sample data;
A graph network construction unit 153, configured to construct a graph network according to the similarity, the node corresponding to each target sample data, and the node value of each node;
The tag optimizing unit 154 is configured to perform optimization adjustment on the virtual abnormal category tag according to the graph network, so as to obtain a target abnormal category tag.
The specific implementation manners of the node determining unit 151, the similarity determining unit 152, the graph network constructing unit 153, and the label optimizing unit 154 may be referred to the description in step S103 in the embodiment corresponding to fig. 4, and will not be described herein.
In one embodiment, the target sample data includes target sample data S_i and target sample data S_j; i and j are positive integers;
The similarity determination unit 152 may include: a first feature extraction subunit 1521, a distance determination subunit 1522, and a first similarity determination subunit 1523.
A first feature extraction subunit 1521, configured to input the target sample data S_i and the target sample data S_j into the class prediction model, and extract, through the feature extraction layer of the class prediction model, a hidden feature vector k_a corresponding to the target sample data S_i and a hidden feature vector k_b corresponding to the target sample data S_j; a and b are positive integers;
a distance determination subunit 1522, configured to determine a vector distance between the hidden feature vector k_a and the hidden feature vector k_b;
The first similarity determination subunit 1523 is configured to use the vector distance as the similarity between the target sample data S_i and the target sample data S_j.
The specific implementation manner of the first feature extraction subunit 1521, the distance determination subunit 1522, and the first similarity determination subunit 1523 may be referred to the description in step S103 in the embodiment corresponding to fig. 4, and will not be described herein.
In one embodiment, the target sample data includes target sample data S_i and target sample data S_j; i and j are positive integers;
The similarity determination unit 152 may include: a second feature extraction subunit 1524, a cosine determination subunit 1525, and a second similarity determination subunit 1526.
The second feature extraction subunit 1524 is configured to input the target sample data S_i and the target sample data S_j into the class prediction model, and extract, through the feature extraction layer of the class prediction model, a hidden feature vector k_a corresponding to the target sample data S_i and a hidden feature vector k_b corresponding to the target sample data S_j; a and b are positive integers;
A cosine determining subunit 1525, configured to determine an angle value between the hidden feature vector k_a and the hidden feature vector k_b, and determine a cosine value between the hidden feature vector k_a and the hidden feature vector k_b according to the angle value;
The second similarity determining subunit 1526 is configured to use the cosine value as the similarity between the target sample data S_i and the target sample data S_j.
The specific implementation manner of the second feature extraction subunit 1524, the cosine determination subunit 1525, and the second similarity determination subunit 1526 may be referred to the description in step S103 in the embodiment corresponding to fig. 4, and will not be described herein.
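The two similarity measures used by the subunits above (the vector distance and the cosine value between hidden feature vectors) can be sketched in plain Python (the example vectors are illustrative):

```python
import math

def euclidean_distance(a, b):
    # Vector distance between two hidden feature vectors k_a and k_b.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Cosine of the angle between two hidden feature vectors: the dot
    # product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

k_a, k_b = [1.0, 0.0], [0.0, 1.0]
print(euclidean_distance(k_a, k_b))  # 1.4142...
print(cosine_similarity(k_a, k_b))   # 0.0
```

Note that a smaller distance means higher similarity, while a larger cosine value means higher similarity, so the two measures are used with opposite orientations when comparing against a threshold.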
In one embodiment, the graph network constructing unit 153 may include: a set determination subunit 1531, a target value determination subunit 1532, and a graph network generation subunit 1533.
A set determination subunit 1531 configured to determine a similarity between each two nodes as a set of similarities;
a target value determining subunit 1532, configured to compare each similarity in the similarity set with a similarity threshold value, and obtain a target similarity greater than or equal to the similarity threshold value in the similarity set;
a graph network generation subunit 1533 is configured to create a correlation edge for two nodes with target similarity, and generate a graph network including a node corresponding to each target sample data, a node value of each node, and the correlation edge.
The specific implementation manner of the set determining subunit 1531, the target value determining subunit 1532, and the graph network generating subunit 1533 may be referred to the description in step S103 in the embodiment corresponding to fig. 4, and will not be described herein.
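The construction of associated edges by comparing each similarity in the similarity set with the similarity threshold can be sketched as follows (the dictionary layout of the similarity set is an assumption for illustration):

```python
def build_graph(similarities, threshold):
    # similarities: {(i, j): value} for every pair of nodes i < j.
    # An associated edge is created only for the two nodes whose
    # similarity reaches the similarity threshold (the target similarity).
    return [(i, j, s) for (i, j), s in similarities.items() if s >= threshold]

sims = {(0, 1): 0.9, (0, 2): 0.3, (1, 2): 0.75}
print(build_graph(sims, threshold=0.7))  # [(0, 1, 0.9), (1, 2, 0.75)]
```

The resulting edge list, together with the node values, forms the graph network used for label diffusion.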
In one embodiment, the tag optimization unit 154 may include: a node selection subunit 1541, an association node subunit 1542, a node optimization subunit 1543, and a label determination subunit 1544.
A node selection subunit 1541, configured to obtain, in the graph network, a node corresponding to the label-free sample data as a target node;
An association node subunit 1542, configured to obtain, in the graph network, a node having an association edge with a target node, as a target association node;
the node optimization subunit 1543 is configured to perform optimization adjustment on the node value of the target node according to the node value of the target associated node and the similarity between the target associated node and the target node, so as to obtain the target node value;
the tag determination subunit 1544 is configured to determine the target abnormal category tag according to the target node value.
The specific implementation manner of the node selection subunit 1541, the association node subunit 1542, the node optimization subunit 1543, and the label determination subunit 1544 may be referred to the description in step S103 in the embodiment corresponding to fig. 4, and will not be described herein.
In one embodiment, the target associated nodes include a first target associated node and a second target associated node;
The node optimization subunit 1543 is further specifically configured to obtain a first similarity between the first target associated node and the target node, and a second similarity between the second target associated node and the target node;
The node optimization subunit 1543 is further specifically configured to obtain a first node value of the first target associated node and a second node value of the second target associated node;
The node optimization subunit 1543 is further specifically configured to multiply the first similarity with a first node value to obtain a first operation value;
The node optimization subunit 1543 is further specifically configured to multiply the second similarity with a second node value to obtain a second operation value;
The node optimization subunit 1543 is further specifically configured to perform optimization adjustment on the node value of the target node according to the first operation value and the second operation value, so as to obtain the target node value.
In one embodiment, the node optimization subunit 1543 is further specifically configured to add the first operand and the second operand to obtain a target operand;
The node optimization subunit 1543 is further specifically configured to obtain a tag value corresponding to the target operation value, and match the tag value corresponding to the target operation value with a node value of the target node;
The node optimization subunit 1543 is further specifically configured to replace the node value of the target node with the tag value corresponding to the target operation value if the tag value corresponding to the target operation value is different from the node value of the target node, and perform optimization adjustment on the node value of the target node according to the first adjustment node value, the first similarity, the second adjustment node value and the second similarity of the first target associated node, so as to obtain the target node value; the first adjustment node value is obtained by optimizing and adjusting the first node value according to the node value of the associated node corresponding to the first target associated node; the second adjustment node value is obtained by optimizing and adjusting the second node value according to the node value of the associated node corresponding to the second target associated node;
The node optimization subunit 1543 is further specifically configured to determine that the node value of the target node is in a convergence state if the tag value corresponding to the target operation value is the same as the node value of the target node, and take the node value of the target node as the target node value.
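The update-and-convergence logic handled by the node optimization subunit 1543 can be sketched as follows. This is a minimal illustration, not the patented implementation: binary label values {0, 1}, the 0.5 decision point, and the generalization to any number of associated nodes are all assumptions.

```python
# Hedged sketch of the node-value update described for the node optimization
# subunit 1543. Illustrative only: binary label values {0, 1}, a 0.5 decision
# point, and support for any number of associated nodes are assumptions.

def update_node_value(node_value, neighbors):
    """Re-estimate a target node's value from its associated nodes.

    neighbors: list of (similarity, neighbor_node_value) pairs.
    Returns (new_node_value, converged).
    """
    # Each "operation value" is a similarity multiplied by a neighbor's node
    # value; the target operation value is their sum.
    target_operation_value = sum(sim * val for sim, val in neighbors)
    # Map the target operation value back to a label value.
    label_value = 1 if target_operation_value >= 0.5 else 0
    if label_value == node_value:
        # Label value matches the current node value: convergence state.
        return node_value, True
    # Otherwise replace the node value and signal another iteration.
    return label_value, False
```

In an iterative label-propagation loop, this update would be repeated for each unlabeled node until every node reports convergence.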
In one embodiment, the tag determination subunit 1544 is further specifically configured to determine, when the node value of the target node is in the converged state, a target operation value corresponding to the tag value that is the same as the node value of the target node as the converged operation value;
The tag determination subunit 1544 is further specifically configured to determine an absolute value of a difference between the target node value and the convergence operation value;
the tag determination subunit 1544 is further specifically configured to match the absolute difference value with a first tag threshold and a second tag threshold, and obtain a target absolute difference value greater than the first tag threshold or less than the second tag threshold from the absolute difference value; the second tag threshold is less than the first tag threshold;
The tag determination subunit 1544 is further specifically configured to determine a target node value corresponding to the target absolute difference value as a target abnormal class tag.
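The threshold-based selection performed by the tag determination subunit 1544 might look like the following sketch; the function name, the default thresholds, and the binary label values are illustrative assumptions.

```python
# Hedged sketch of the tag determination subunit 1544's thresholding step.
# Function name, default thresholds, and binary labels are assumptions.

def select_target_labels(target_node_values, convergence_values,
                         first_tag_threshold=0.8, second_tag_threshold=0.2):
    """Keep a target node value as a target abnormal class tag only when the
    absolute difference between it and the convergence operation value is
    greater than the first tag threshold or less than the second tag
    threshold (the second threshold being less than the first)."""
    selected = []
    for node_value, convergence_value in zip(target_node_values,
                                             convergence_values):
        difference = abs(node_value - convergence_value)
        if difference > first_tag_threshold or difference < second_tag_threshold:
            selected.append(node_value)   # a confident label assignment
    return selected
```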
In one embodiment, the apparatus 1 may further comprise: a data input module 16, a tag acquisition module 17 and a model optimization module 18.
The data input module 16 is configured to input the object label sample data and the unlabeled sample data into a class prediction model, and output a third predicted abnormal class corresponding to the object label sample data and a fourth predicted abnormal class corresponding to the unlabeled sample data through the class prediction model;
The tag obtaining module 17 is configured to obtain a real abnormal category tag corresponding to the object tag sample data and a target abnormal category tag corresponding to the label-free sample data;
The model optimization module 18 is configured to determine a third loss value according to the third predicted anomaly class, the real anomaly class label corresponding to the object label sample data, the fourth predicted anomaly class, and the target anomaly class label;
the model optimization module 18 is further configured to perform optimization training on the class prediction model according to the third loss value, so as to obtain a target class prediction model.
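A hedged sketch of how the third loss value could combine the labeled and unlabeled terms follows; the use of binary cross-entropy, the positive-class probabilities, and the `unlabeled_weight` parameter are assumptions not taken from the source.

```python
import math

# Hedged sketch of how the third loss value might combine the two terms.
# Binary cross-entropy, positive-class probabilities, and the
# unlabeled_weight parameter are assumptions, not taken from the source.

def third_loss_value(pred_labeled, true_labels,
                     pred_unlabeled, target_labels, unlabeled_weight=1.0):
    """Loss over labeled predictions vs. real anomaly class labels, plus a
    weighted loss over unlabeled predictions vs. their propagated target
    anomaly class labels."""
    def cross_entropy(p, y):
        p = min(max(p, 1e-12), 1.0 - 1e-12)   # clamp for numerical safety
        return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

    labeled_term = sum(cross_entropy(p, y)
                       for p, y in zip(pred_labeled, true_labels))
    unlabeled_term = sum(cross_entropy(p, y)
                         for p, y in zip(pred_unlabeled, target_labels))
    return labeled_term + unlabeled_weight * unlabeled_term
```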
For the specific implementations of the data input module 16, the tag obtaining module 17, and the model optimization module 18, reference may be made to the description of step S103 in the embodiment corresponding to fig. 4; details are not repeated herein.
In one embodiment, the apparatus 1 may further comprise: a target data acquisition module 19, a vector extraction module 20 and a model application module 21.
A target data acquisition module 19, configured to acquire data to be identified belonging to a target scene type;
The vector extraction module 20 is configured to input data to be identified into the target class prediction model, and extract a hidden feature vector of the data to be identified through a feature extraction layer of the target class prediction model;
The model application module 21 is configured to input the hidden feature vector of the data to be identified to a feature classification layer of the target class prediction model, and output, through the feature classification layer, a prediction probability corresponding to an initial prediction anomaly class and an initial prediction anomaly class corresponding to the data to be identified;
The model application module 21 is further configured to obtain a maximum prediction probability from the prediction probabilities, and determine an initial prediction anomaly class corresponding to the maximum prediction probability as a prediction anomaly class corresponding to the data to be identified.
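The maximum-probability selection performed by the model application module 21 reduces to an argmax; a minimal sketch, assuming the feature classification layer's output is available as a class-to-probability mapping (the class names shown are illustrative):

```python
# The final selection performed by the model application module 21 is an
# argmax over class probabilities. Class names here are illustrative.

def predict_anomaly_class(class_probabilities):
    """class_probabilities: mapping from each initial predicted anomaly
    class to its prediction probability; returns the class with the
    maximum prediction probability."""
    return max(class_probabilities, key=class_probabilities.get)
```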
For the specific implementations of the target data acquisition module 19, the vector extraction module 20, and the model application module 21, reference may be made to the description of step S103 in the embodiment corresponding to fig. 4; details are not repeated herein.
In the embodiment of the application, the initial class prediction model is first trained with object label sample data belonging to the target scene type and associated label sample data belonging to the associated scene type to obtain a class prediction model, and the class prediction model can output a prediction abnormality class for the unlabeled sample data belonging to the target scene type; this prediction abnormality class can serve as the virtual abnormality class label of the unlabeled sample data. Then, each sample data can be used as a node (the label corresponding to each node being its node value), and a graph network can be constructed based on the similarity between every two sample data. In the graph network, the real abnormal class labels of the labeled sample data can be propagated to the unlabeled sample data according to the similarity, so that the virtual abnormal class label of the unlabeled sample data is optimally adjusted into an accurate target abnormal class label. After the target abnormal class label of the unlabeled sample data is obtained, the class prediction model can be optimized and trained based on the target abnormal class label of the unlabeled sample data and the real abnormal class label of the object label sample data.
It can be seen that in the application, based on the real abnormal class labels of the object label sample data under the target scene type and of the associated label sample data under the associated scene type, an accurate target abnormal class label can be predicted for the unlabeled sample data, and the model is trained according to both the object label sample data (namely, the labeled sample data belonging to the target scene type, i.e., the target domain) and the unlabeled sample data, so that labeled and unlabeled sample data are effectively combined and the class prediction model obtained by optimization training can be more accurate. In summary, the method and the device can effectively combine the unlabeled sample data and the labeled sample data of the target scene type in model training, so that the unlabeled sample data can be effectively utilized when the labeled sample data of the target domain is scarce, and the accuracy of the model in recognizing the target domain can be improved.
Further, referring to fig. 9, fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 9, the apparatus 1 in the embodiment corresponding to fig. 8 may be applied to the computer device 1000, and the computer device 1000 may include: a processor 1001, a network interface 1004, and a memory 1005. In addition, the computer device 1000 further includes a user interface 1003 and at least one communication bus 1002, where the communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and a keyboard; optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM, or may be a non-volatile memory, such as at least one disk memory. Optionally, the memory 1005 may also be at least one storage device located remotely from the processor 1001. As shown in fig. 9, the memory 1005, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
In the computer device 1000 shown in fig. 9, the network interface 1004 may provide network communication functions, the user interface 1003 is primarily used to provide an input interface for the user, and the processor 1001 may be used to invoke the device control application stored in the memory 1005 to implement:
Acquiring object sample data belonging to a target scene type and associated tag sample data belonging to an associated scene type; the object sample data comprises object label sample data and label-free sample data; the association scene type and the target scene type have scene association relation;
Training an initial category prediction model according to the object tag sample data and the associated tag sample data to obtain a category prediction model, determining a prediction abnormality category corresponding to the unlabeled sample data through the category prediction model, and taking the prediction abnormality category corresponding to the unlabeled sample data as a virtual abnormality category tag of the unlabeled sample data;
Determining the object sample data and the associated tag sample data as target sample data, and optimizing and adjusting the virtual abnormal category tag according to the similarity between every two target sample data, the real abnormal category tag corresponding to the object tag sample data and the real abnormal category tag corresponding to the associated tag sample data to obtain a target abnormal category tag; and the target abnormal category label is used for carrying out optimization training on the category prediction model together with the real abnormal category label corresponding to the object label sample data.
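As a non-authoritative sketch of the graph construction summarized above, assuming a pluggable pairwise similarity function and scalar label values as node values (all names are illustrative):

```python
# Non-authoritative sketch of the graph construction summarized above:
# every target sample becomes a node whose value is its (real or virtual)
# label, and an associated edge is created only when the pairwise
# similarity reaches a threshold. The similarity function is pluggable
# and all names are illustrative.

def build_graph(node_values, pairwise_similarity, similarity_threshold):
    """node_values: one label value per target sample.
    pairwise_similarity(i, j): similarity between target samples i and j.
    Returns a dict with nodes, node values, and weighted associated edges."""
    n = len(node_values)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            similarity = pairwise_similarity(i, j)
            if similarity >= similarity_threshold:
                edges.append((i, j, similarity))
    return {"nodes": list(range(n)), "values": node_values, "edges": edges}
```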
It should be understood that the computer device 1000 described in the embodiment of the present application may perform the data processing method described in the embodiments corresponding to fig. 4 to 6, and may also implement the functions of the data processing apparatus 1 described in the embodiment corresponding to fig. 8; details are not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
It should be further noted here that the embodiment of the present application also provides a computer-readable storage medium, which stores the computer program executed by the aforementioned computer device 1000 for data processing. The computer program includes program instructions which, when executed by the processor, can perform the data processing method described in the embodiments corresponding to fig. 4 to 6; details are therefore not repeated herein. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, please refer to the description of the method embodiments of the present application.
The computer-readable storage medium may be the data processing apparatus provided in any one of the foregoing embodiments or an internal storage unit of the computer device, for example, a hard disk or a memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the computer device. The computer-readable storage medium is used to store the computer program and other programs and data required by the computer device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
In one aspect of the application, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the method provided in an aspect of the embodiment of the present application.
The terms "first", "second", and the like in the description, the claims, and the drawings of the embodiments of the application are used to distinguish between different objects and not to describe a particular sequential order. Furthermore, the term "include" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, article, or device that comprises a list of steps or elements is not limited to the listed steps or modules, but may, in the alternative, include other steps or modules not listed or inherent to such a process, method, apparatus, article, or device.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The method and related apparatus provided in the embodiments of the present application are described with reference to the flowchart and/or schematic structural diagrams of the method provided in the embodiments of the present application, and each flow and/or block of the flowchart and/or schematic structural diagrams of the method may be implemented by computer program instructions, and combinations of flows and/or blocks in the flowchart and/or block diagrams. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or structural diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or structures.
The foregoing disclosure is illustrative of the present application and is not to be construed as limiting the scope of the application, which is defined by the appended claims.

Claims (15)

1. A method of data processing, the method performed by a computer device, the method comprising:
Acquiring object sample data belonging to a target scene type and associated tag sample data belonging to an associated scene type; the object sample data comprises object label sample data and label-free sample data; the association scene type and the target scene type have scene association relation; the target scene type refers to a scene type with a risk identification flow, and the target scene type refers to any one of a transaction scene type and a permission allocation scene type; the object tag sample data refers to data with risk category tags under the target scene type, the label-free sample data refers to data without risk category tags under the target scene type, and the associated tag sample data refers to data with risk category tags under the associated scene type;
Training an initial category prediction model according to the object tag sample data and the associated tag sample data to obtain a category prediction model, determining a prediction abnormality category corresponding to the unlabeled sample data through the category prediction model, and taking the prediction abnormality category corresponding to the unlabeled sample data as a virtual abnormality category tag of the unlabeled sample data; the prediction abnormality category corresponding to the unlabeled sample data refers to a prediction risk category corresponding to the unlabeled sample data; the category prediction model is obtained by optimizing and adjusting the initial category prediction model through a loss function value, and the loss function value is obtained by calculation according to the predicted risk categories and the real abnormal category labels respectively corresponding to the object tag sample data and the associated tag sample data; the real abnormal category label refers to a risk category label;
Determining the object sample data and the associated tag sample data as target sample data, and optimizing and adjusting the virtual abnormal category tag according to the similarity between every two target sample data, the real abnormal category tag corresponding to the object tag sample data and the real abnormal category tag corresponding to the associated tag sample data to obtain a target abnormal category tag; the target abnormal category label is obtained by optimizing and adjusting the virtual abnormal category label by adopting a graph network, and the graph network is constructed according to the similarity between every two target sample data, the real abnormal category label corresponding to the object tag sample data and the real abnormal category label corresponding to the associated label sample data; the target abnormal category label is used for carrying out optimization training on the category prediction model together with the real abnormal category label corresponding to the object label sample data; the target category prediction model after optimization training is used for identifying risk categories of data to be identified, wherein the data to be identified is data belonging to the target scene type.
2. The method according to claim 1, wherein training the initial class prediction model according to the object tag sample data and the associated tag sample data to obtain a class prediction model comprises:
Inputting the object tag sample data and the associated tag sample data into the initial category prediction model, and outputting a first prediction abnormal category and a first prediction scene type corresponding to the object tag sample data and a second prediction abnormal category and a second prediction scene type corresponding to the associated tag sample data through the initial category prediction model;
Acquiring a first real abnormal category label corresponding to the object label sample data and the target scene type, and acquiring a second real abnormal category label corresponding to the associated label sample data and the associated scene type;
determining a first loss function value according to the first prediction abnormal category, a real abnormal category label corresponding to the object label sample data, the first prediction scene type and the target scene type;
Determining a second loss function value according to the second predicted abnormal category, the real abnormal category label corresponding to the associated label sample data, the second predicted scene type and the associated scene type;
and training the initial class prediction model according to the first loss function value and the second loss function value to obtain the class prediction model.
3. The method of claim 2, wherein training the initial class prediction model based on the first loss function value and the second loss function value to obtain the class prediction model comprises:
generating a target loss function value according to the first loss function value and the second loss function value;
If the objective loss function value meets a model convergence condition, the initial class prediction model is used as the class prediction model;
And if the objective loss function value does not meet the model convergence condition, acquiring a gradient optimization function, and adjusting model parameters of the initial class prediction model according to the gradient optimization function and the objective loss function value to obtain a class prediction model containing the adjusted model parameters.
4. The method according to claim 1, wherein the optimizing the virtual anomaly class label according to the similarity between every two target sample data, the real anomaly class label corresponding to the object label sample data, and the real anomaly class label corresponding to the associated label sample data to obtain the target anomaly class label includes:
Determining each target sample data as a node of the graph network;
Taking the real abnormal category label corresponding to the object label sample data as a node value of a node belonging to the object label sample data;
taking the virtual abnormal category label corresponding to the unlabeled sample data as a node value of a node belonging to the unlabeled sample data;
taking the real abnormal category label corresponding to the associated label sample data as a node value of a node belonging to the associated label sample data;
Determining the similarity between every two pieces of target sample data, and constructing the graph network according to the similarity, the nodes corresponding to each piece of target sample data and the node value of each node;
and optimizing and adjusting the virtual abnormal category label according to the graph network to obtain the target abnormal category label.
5. The method of claim 4, wherein the target sample data comprises target sample data S_i and target sample data S_j; i and j are positive integers;
the determining the similarity between every two target sample data comprises the following steps:
Inputting the target sample data S_i and the target sample data S_j to the class prediction model, and extracting a hidden feature vector k_a corresponding to the target sample data S_i and a hidden feature vector k_b corresponding to the target sample data S_j through a feature extraction layer of the class prediction model; a and b are positive integers;
Determining a vector distance between the hidden feature vector k_a and the hidden feature vector k_b;
The vector distance is taken as the similarity between the target sample data S_i and the target sample data S_j.
6. The method of claim 4, wherein the target sample data comprises target sample data S_i and target sample data S_j; i and j are positive integers;
the determining the similarity between every two target sample data comprises the following steps:
Inputting the target sample data S_i and the target sample data S_j to the class prediction model, and extracting a hidden feature vector k_a corresponding to the target sample data S_i and a hidden feature vector k_b corresponding to the target sample data S_j through a feature extraction layer of the class prediction model; a and b are positive integers;
Determining an angle value between the hidden feature vector k_a and the hidden feature vector k_b, and determining a cosine value between the hidden feature vector k_a and the hidden feature vector k_b according to the angle value;
The cosine value is taken as the similarity between the target sample data S_i and the target sample data S_j.
7. The method of claim 4, wherein constructing the graph network based on the similarity, the node corresponding to each target sample data, and the node value of each node comprises:
Determining the similarity between every two nodes as a similarity set;
comparing each similarity in the similarity set with a similarity threshold value, and acquiring target similarity which is larger than or equal to the similarity threshold value from the similarity set;
And creating an associated edge between two nodes with the target similarity, and generating a graph network containing the node corresponding to each target sample data, the node value of each node and the associated edge.
8. The method of claim 7, wherein the optimizing the virtual anomaly class tag according to the graph network to obtain the target anomaly class tag comprises:
Acquiring a node corresponding to the unlabeled sample data in the graph network as a target node;
Acquiring a node with the associated edge between the node and the target node in the graph network as a target associated node;
And optimizing and adjusting the node value of the target node according to the node value of the target associated node and the similarity between the target associated node and the target node to obtain a target node value, and determining the target abnormal class label according to the target node value.
9. The method of claim 8, wherein the target association node comprises a first target association node and a second target association node;
The optimizing and adjusting the node value of the target node according to the node value of the target associated node and the similarity between the target associated node and the target node to obtain the target node value comprises the following steps:
acquiring a first similarity between the first target associated node and the target node and a second similarity between the second target associated node and the target node;
acquiring a first node value of the first target associated node and a second node value of the second target associated node;
multiplying the first similarity with the first node value to obtain a first operation value;
multiplying the second similarity with the second node value to obtain a second operation value;
and optimizing and adjusting the node value of the target node according to the first operation value and the second operation value to obtain the target node value.
10. The method according to claim 9, wherein the optimizing the node value of the target node according to the first operation value and the second operation value to obtain the target node value includes:
adding the first operation value and the second operation value to obtain a target operation value;
obtaining a label value corresponding to the target operation value, and matching the label value corresponding to the target operation value with a node value of the target node;
If the label value corresponding to the target operation value is different from the node value of the target node, replacing the node value of the target node with the label value corresponding to the target operation value, and optimally adjusting the node value of the target node according to the first adjustment node value of the first target associated node, the first similarity, the second adjustment node value of the second target associated node and the second similarity to obtain the target node value; the first adjustment node value is obtained by optimizing and adjusting the first node value according to the node value of the associated node corresponding to the first target associated node; the second adjustment node value is obtained by optimizing and adjusting the second node value according to the node value of the associated node corresponding to the second target associated node;
if the label value corresponding to the target operation value is the same as the node value of the target node, determining that the node value of the target node is in a convergence state, and taking the node value of the target node as the target node value.
11. The method of claim 10, wherein said determining the target anomaly class tag from the target node value comprises:
When the node value of the target node is in a convergence state, determining a target operation value corresponding to a label value identical to the node value of the target node as a convergence operation value;
Determining an absolute value of a difference between the target node value and the convergence calculation value;
matching the absolute value of the difference with a first label threshold value and a second label threshold value, and acquiring a target absolute value of the difference larger than the first label threshold value or smaller than the second label threshold value from the absolute value of the difference; the second tag threshold is less than the first tag threshold;
And determining a target node value corresponding to the target absolute value of the difference as the target abnormal class label.
12. The method according to claim 1, wherein the method further comprises:
inputting the object label sample data and the unlabeled sample data into the class prediction model, and outputting a third predicted abnormal class corresponding to the object label sample data and a fourth predicted abnormal class corresponding to the unlabeled sample data through the class prediction model;
acquiring a real abnormal category label corresponding to the object label sample data and the target abnormal category label corresponding to the label-free sample data;
Determining a third loss value according to the third predicted abnormal category, the real abnormal category label corresponding to the object label sample data, the fourth predicted abnormal category and the target abnormal category label;
and carrying out optimization training on the class prediction model according to the third loss value to obtain a target class prediction model.
13. The method according to claim 12, wherein the method further comprises:
acquiring data to be identified belonging to the target scene type;
Inputting the data to be identified into the target category prediction model, and extracting hidden feature vectors of the data to be identified through a feature extraction layer of the target category prediction model;
Inputting the hidden feature vector of the data to be identified into a feature classification layer of the target category prediction model, and outputting the initial prediction abnormal category corresponding to the data to be identified and the prediction probability corresponding to the initial prediction abnormal category through the feature classification layer;
And acquiring the maximum prediction probability from the prediction probabilities, and determining the initial prediction abnormal category corresponding to the maximum prediction probability as the prediction abnormal category corresponding to the data to be identified.
14. A computer device, comprising: a processor, a memory, and a network interface;
The processor is connected to the memory, the network interface for providing network communication functions, the memory for storing program code, the processor for invoking the program code to perform the method of any of claims 1-13.
15. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program adapted to be loaded by a processor and to perform the method of any of claims 1-13.
CN202110275967.0A 2021-03-15 2021-03-15 Data processing method, device and readable storage medium Active CN113011646B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110275967.0A CN113011646B (en) 2021-03-15 2021-03-15 Data processing method, device and readable storage medium


Publications (2)

Publication Number Publication Date
CN113011646A CN113011646A (en) 2021-06-22
CN113011646B true CN113011646B (en) 2024-05-31

Family

ID=76407175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110275967.0A Active CN113011646B (en) 2021-03-15 2021-03-15 Data processing method, device and readable storage medium

Country Status (1)

Country Link
CN (1) CN113011646B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569991B (en) * 2021-08-26 2024-05-28 深圳市捷顺科技实业股份有限公司 Person evidence comparison model training method, computer equipment and computer storage medium
CN113782187B (en) * 2021-09-10 2023-06-27 深圳平安智慧医健科技有限公司 Index data processing method, related equipment and medium
CN113656927B (en) * 2021-10-20 2022-02-11 腾讯科技(深圳)有限公司 Data processing method, related device and computer storage medium
CN113987324A (en) * 2021-10-21 2022-01-28 北京达佳互联信息技术有限公司 Data processing method, device, equipment and storage medium
CN114185881B (en) * 2021-12-14 2024-06-04 中国平安财产保险股份有限公司 Automatic abnormal data repairing method, device, equipment and storage medium
CN115096375B (en) * 2022-08-22 2022-11-04 启东亦大通自动化设备有限公司 Carrier roller running state monitoring method and device based on carrier roller carrying trolley detection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012165517A1 (en) * 2011-05-30 2012-12-06 日本電気株式会社 Probability model estimation device, method, and recording medium
CN111582342A (en) * 2020-04-29 2020-08-25 腾讯科技(深圳)有限公司 Image identification method, device, equipment and readable storage medium
CN111860677A (en) * 2020-07-29 2020-10-30 湖南科技大学 Rolling bearing transfer learning fault diagnosis method based on partial domain confrontation
CN112418442A (en) * 2020-12-02 2021-02-26 深圳前海微众银行股份有限公司 Data processing method, device, equipment and storage medium for federal transfer learning


Also Published As

Publication number Publication date
CN113011646A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
CN113011646B (en) Data processing method, device and readable storage medium
CN111724083B (en) Training method and device for financial risk identification model, computer equipment and medium
CN111401558A (en) Data processing model training method, data processing device and electronic equipment
CN111681091B (en) Financial risk prediction method and device based on time domain information and storage medium
CN113011895B (en) Associated account sample screening method, device and equipment and computer storage medium
CN112446310A (en) Age identification system, method and device based on block chain
CN113935738B (en) Transaction data processing method, device, storage medium and equipment
CN113850669A (en) User grouping method and device, computer equipment and computer readable storage medium
CN117422553A (en) Transaction processing method, device, equipment, medium and product of blockchain network
CN112989182A (en) Information processing method, information processing apparatus, information processing device, and storage medium
CN111260219A (en) Asset class identification method, device, equipment and computer readable storage medium
CN116522131A (en) Object representation method, device, electronic equipment and computer readable storage medium
Sui et al. Multi-level membership inference attacks in federated Learning based on active GAN
CN115328786A (en) Automatic testing method and device based on block chain and storage medium
CN116150429A (en) Abnormal object identification method, device, computing equipment and storage medium
CN115620019A (en) Commodity infringement detection method and device, equipment, medium and product thereof
CN116932878A (en) Content recommendation method, device, electronic equipment, storage medium and program product
CN114493850A (en) Artificial intelligence-based online notarization method, system and storage medium
CN115905605A (en) Data processing method, data processing equipment and computer readable storage medium
Wu et al. Applying a Probabilistic Network Method to Solve Business‐Related Few‐Shot Classification Problems
CN113946758B (en) Data identification method, device, equipment and readable storage medium
CN116628236B (en) Method and device for delivering multimedia information, electronic equipment and storage medium
CN112116441B (en) Training method, classification method, device and equipment for financial risk classification model
US20230419344A1 (en) Attribute selection for matchmaking
CN117009883A (en) Object classification model construction method, object classification method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40046033

Country of ref document: HK

GR01 Patent grant