CN110490304B - Data processing method and device - Google Patents
- Publication number: CN110490304B
- Application number: CN201910775569.8A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
Abstract
The application aims to provide a data processing method and device. An attention mechanism and a time-cycle (recurrent) neural network are used to extract features from acquired multidimensional, heterogeneous target data to be processed, while noise and useless parts of the target data are ignored. Important and non-important information in the target data are screened out and given different weights in the feature vector of the target data, ensuring multidimensional analysis and feature extraction. The resulting feature vector is input to a task model corresponding to the target task, and the execution result of the target task is obtained through the task model's processing. This avoids the information loss that can occur during manual data processing, saves manpower, improves the screening and fitting capability of feature extraction from the target data, improves the feature extraction effect, and improves the reliability of the execution result of the target task.
Description
Technical Field
The present application relates to the field of computers, and in particular, to a method and apparatus for data processing.
Background
In the prior art, data feature extraction, which extracts useful information from massive data sources, has become an indispensable technology in the big-data Internet era and a cornerstone of artificial intelligence. Data feature extraction automatically extracts the useful parts of different data sources (such as financial, image, text, and speech data) and converts them into codes that directly summarize the key information of the data, which are then used for training and inference in machine learning and deep learning models.
Currently, commonly used data feature extraction techniques fall into two main categories. 1. Feature extraction based on artificial feature templates: experts in the relevant field use their prior knowledge to summarize a rule system and convert it into input for an artificial intelligence model, such as a stock-trend mapping table compiled by financial experts or a table of character components and radicals compiled by linguists. 2. Automatic feature extraction based on statistical machine learning: a feature extractor is pre-trained on a large amount of manually annotated (supervised) or unannotated (unsupervised) data and then used as a pipeline between the data set and the target model; common methods include principal component analysis, autoencoders, and gradient-boosted decision trees.
Both traditional feature extraction techniques have drawbacks. Feature extraction based on artificial feature templates often suffers from low recall: it is accurate but cannot capture all information, so much useful information is lost, and manually constructing feature templates is very time-consuming and labor-intensive. Feature extraction based on statistical machine learning overcomes these defects to some extent, but its model fitting capability is limited, it cannot be highly parallelized, and it struggles with multidimensional, heterogeneous data sources.
Disclosure of Invention
The application aims to provide a data processing method and device to solve the problem in the prior art of how to process multidimensional, heterogeneous data to be processed.
According to an aspect of the present application, there is provided a data processing method comprising:
acquiring a target task, a task model corresponding to the target task and target data to be processed;
performing feature extraction on the target data based on an attention mechanism and a time-cycled neural network to obtain feature vectors of the target data;
and inputting the feature vector of the target data into the task model to execute the target task, obtaining an execution result of the target task.
Further, in the above data processing method, the obtaining the target task, the task model corresponding to the target task, and the target data to be processed includes:
acquiring a target task;
determining a corresponding task model and data to be processed based on the target task; and respectively performing denoising, desensitization, script cleaning, and normalization on the data to be processed to obtain target data to be processed, wherein the target data comprises at least one time step, each time step corresponding to at least one piece of original data.
Further, in the above data processing method, the feature extraction is performed on the target data based on the attention mechanism and the time-loop neural network to obtain a feature vector of the target data, including:
based on a soft attention mechanism and a time-cycle neural network in sequence, respectively carrying out feature extraction on the original data corresponding to each time step to obtain a feature sequence of the original data corresponding to each time step;
and extracting the characteristics of the characteristic sequences of the original data corresponding to all the time steps based on a multi-head attention mechanism, and integrating the extracted characteristics through a feedforward neural network to obtain the characteristic vector of the target data.
Further, in the above data processing method, the feature extraction is performed on the original data corresponding to each time step based on the soft attention mechanism and the time-cycle neural network in sequence, to obtain a feature sequence of the original data corresponding to each time step, including:
based on an embedding algorithm corresponding to the target data, carrying out vectorization processing on the original data corresponding to each time step respectively to obtain a dense vector of the original data corresponding to each time step;
respectively carrying out feature extraction and feature weighting on dense vectors of the original data corresponding to each time step based on the soft attention mechanism to obtain attention feature sequences of the original data corresponding to each time step;
and based on the time-loop neural network, respectively carrying out feature learning on the attention feature sequence of the original data corresponding to each time step to obtain the feature sequence of the original data corresponding to each time step, wherein the feature sequence comprises a long-distance feature dependency relationship.
Further, in the above data processing method, the feature extraction is performed on the feature sequences of the original data corresponding to all the time steps based on the multi-head attention mechanism, and the extracted features are integrated through a feedforward neural network to obtain feature vectors of the target data, including:
performing at least one affine transformation on the feature sequences of the original data corresponding to all time steps, then performing feature extraction and feature weighting, based on a multi-head attention mechanism, on each affine-transformed feature sequence of the target data, obtaining a feature sequence of the target data for each soft attention mechanism within the multi-head attention mechanism;
and integrating the characteristic sequences of the target data corresponding to each soft attention mechanism in the multi-head attention mechanism through the feedforward neural network to obtain the characteristic vector of the target data.
Further, in the above data processing method, the method further includes:
performing feature adjustment on the target data based on the execution result of the target task to obtain updated feature vectors of the target data;
and inputting the updated feature vector of the target data into the task model to execute the target task and simultaneously adjusting the task model to obtain an updated execution result of the target task.
Further, in the data processing method, the time-loop neural network is a long short-term memory (LSTM) network.
According to another aspect of the present application there is also provided a computer readable medium having stored thereon computer readable instructions which, when executed by a processor, cause the processor to implement a method as described in any of the preceding claims.
According to another aspect of the present application, there is also provided a data processing apparatus including:
one or more processors;
a computer readable medium for storing one or more computer readable instructions,
the one or more computer-readable instructions, when executed by the one or more processors, cause the one or more processors to implement the method of any of the preceding claims.
Compared with the prior art, the present application acquires a target task, a task model corresponding to the target task, and target data to be processed; performs feature extraction on the target data based on an attention mechanism and a time-cycle neural network to obtain a feature vector of the target data; and inputs the feature vector of the target data into the task model to execute the target task and obtain an execution result. By using the attention mechanism and the time-cycle neural network to extract features from the acquired multidimensional, heterogeneous target data to be processed, noise and useless parts of the target data are ignored, and important and non-important information are screened out and given different weights in the feature vector of the target data, ensuring multidimensional analysis and feature extraction of the target data. The feature vector then serves as the input of the task model corresponding to the target task, and the execution result of the target task is obtained through the task model's processing. This avoids the information loss that may occur during manual data processing, saves manpower, improves the screening and fitting capability of feature extraction from the target data, improves the feature extraction effect, and improves the reliability of the execution result of the target task.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1 illustrates a flow chart of a data processing method in accordance with an aspect of the application;
FIG. 2 illustrates a schematic diagram of feature extraction in an embodiment of a data processing method according to an aspect of the present application.
The same or similar reference numbers in the drawings refer to the same or similar parts.
Detailed Description
The application is described in further detail below with reference to the accompanying drawings.
In one exemplary configuration of the application, the terminal, the device of the service network, and the trusted party each include one or more processors (e.g., a central processing unit (CPU)), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer-readable medium, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change RAM (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media (transmission media), such as modulated data signals and carrier waves.
Fig. 1 shows a flow chart of a data processing method according to an aspect of the present application, the method includes step S11, step S12 and step S13, wherein the method specifically includes:
Step S11, acquiring a target task, a task model corresponding to the target task, and target data to be processed. Here, different target tasks correspond to different task models, and the target data corresponding to a target task may include raw data of different time dimensions.
Step S12, performing feature extraction on the target data based on an attention mechanism and a time-cycle neural network (i.e., a recurrent neural network) to obtain the feature vector of the target data. Here, the time-cycle neural network may include, but is not limited to, a Long Short-Term Memory (LSTM) network; the LSTM handles the learning of long-range information, thereby optimizing the feature extraction process and improving both the efficiency and the effect of feature extraction on the target data. The attention mechanism and the time-cycle neural network screen out important information (corresponding to important features) and non-important information (corresponding to non-important features) in the target data and assign them different weights to reflect their different importance, so as to obtain the feature vector of the target data.
Step S13, inputting the feature vector of the target data into the task model to execute the target task, obtaining an execution result of the target task. In step S13, the feature vector of the target data serves as the input of the task model; the task model processes it and outputs the execution result of the target task, thereby improving the accuracy of the execution result.
Through steps S11 to S13, the attention mechanism and the time-cycle neural network extract the multidimensional, heterogeneous features of the acquired target data to be processed while ignoring noise and useless parts. Important and non-important information in the target data are screened out and given different weights in the feature vector of the target data, ensuring multidimensional analysis and feature extraction. The feature vector of the target data is then used as the input of the task model corresponding to the target task, and the execution result of the target task is obtained through the task model's processing. This avoids the information loss that may occur during manual data processing, saves manpower, improves the screening and fitting capability of feature extraction, improves the feature extraction effect, and improves the reliability of the execution result of the target task.
For example, consider credit assessment in the financial field. If the target task is to predict whether a user will repay on time, the task model corresponding to the target task may be a logistic regression model. The target data to be processed reflects, across different time dimensions, the user's finance-related data such as basic information, hobbies, social network information, recent activity information, asset information, fund flow records, and bad credit records, so that each user's credit indicators can be monitored and on-time repayment predicted. Based on the attention mechanism and the time-cycle neural network, this target data is analyzed: features corresponding to important and non-important information are screened out, with relatively low weights set for non-important information and relatively high weights for important information, so that non-important information is not ignored while important information is emphasized. Multidimensional data processing and feature extraction on the target data then yield the feature vector of the target data.
Finally, the feature vector of the target data is used as the input of the task model, the logistic regression model, which processes the feature vector and outputs the execution result of the target task, namely a prediction of whether the user repays on time (for example, "repays on time" or "does not repay on time"). This avoids the information loss that may occur during manual data processing, saves manpower, improves the efficiency and effect of feature extraction on the target data, and improves the accuracy of the execution result of the target task.
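The logistic-regression task model described above can be sketched as follows. The weights, bias, and feature values are hypothetical stand-ins for trained parameters and an extracted feature vector; they are not values from the application.

```python
import numpy as np

def predict_on_time(feature_vec, w, b):
    """Logistic-regression task model: map the extracted feature vector
    of the target data to the probability that the user repays on time.
    All numeric values here are illustrative, not trained."""
    p = 1.0 / (1.0 + np.exp(-(feature_vec @ w + b)))   # sigmoid
    return p, "repays on time" if p >= 0.5 else "does not repay on time"

w = np.array([0.8, -1.2, 0.5, 0.3])   # hypothetical trained weights
feature_vec = np.array([0.9, 0.1, 0.4, 0.2])   # hypothetical feature vector
p, label = predict_on_time(feature_vec, w, b=-0.1)
```

In practice the weights would be fitted on labeled repayment histories; the sketch only shows how the feature vector produced by the attention/LSTM extractor becomes the task model's input.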
Next, in the above embodiment of the present application, the step S11 of obtaining the target task, the task model corresponding to the target task, and the target data to be processed includes:
acquiring a target task;
determining a corresponding task model and data to be processed based on the target task. Here, if the target task is to predict whether the user will repay on time, the corresponding data to be processed is the user's finance-related data, such as basic information, hobbies, social network information, recent activity information, asset information, fund flow records, and bad credit records. If the target task is to predict whether the user will continue to purchase financial products, the data to be processed is the user's basic information, asset information, financial product purchase history, fund flow records, and so on. Different target tasks thus yield correspondingly different data to be processed, ensuring that the subsequent target task is executed effectively. The data to be processed may include data of different time dimensions, and each step in a time dimension is referred to as a time step.
And respectively performing denoising, desensitization, script cleaning, and normalization on the data to be processed to obtain the target data to be processed, wherein the target data comprises at least one time step, each time step corresponding to at least one piece of original data.
For example, the data to be processed obtained for the target task is X_m^t, where t denotes the t-th time step, t is a positive integer less than or equal to T, T (a positive integer) is the total number of time steps contained in the data to be processed, each time step corresponds to at least one piece of original data, and X_m^t denotes the m-th piece of original data within time step t; that is, the data to be processed comprises m pieces of original data for each of the T time steps. In step S11, useless parts of the obtained data to be processed may be removed by denoising; data related to user privacy may be hidden by data desensitization; the coding format of the data may be unified by a cleaning script; and the dimensions of the information features may be unified by normalization. The target data to be processed is thus obtained, completing the preprocessing of the target data corresponding to the target task in preparation for subsequent feature extraction. After the target data to be processed is obtained, it can be stored in a corresponding database or large file to prevent loss; the storage format of the database or file may be, but is not limited to, HBase, MongoDB, CSV, HDF, and the like.
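As an illustration of this preprocessing step, the sketch below desensitizes and normalizes a small batch of records. The field names and the choice of min-max scaling are assumptions for the example; the application does not fix a particular schema or normalization formula.

```python
import numpy as np

def preprocess(records, sensitive_keys=("name", "id_number")):
    """Minimal preprocessing sketch: desensitize, then min-max normalize.
    `records` is a list of dicts of numeric features plus sensitive
    string fields; the keys are hypothetical, not from the application."""
    # Desensitization: drop fields that identify the user.
    cleaned = [{k: v for k, v in r.items() if k not in sensitive_keys}
               for r in records]
    keys = sorted(cleaned[0])
    X = np.array([[r[k] for k in keys] for r in cleaned], dtype=float)
    # Normalization: scale each feature column to [0, 1] so that
    # features of different dimensions become comparable.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return keys, (X - lo) / span

keys, X = preprocess([
    {"name": "A", "id_number": "123", "assets": 10.0, "overdue": 0.0},
    {"name": "B", "id_number": "456", "assets": 30.0, "overdue": 2.0},
])
```

A real pipeline would add the denoising and script-cleaning stages and persist the result to storage such as HBase or CSV, as the description notes.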
Next, in the above embodiment of the present application, the step S12 performs feature extraction on the target data based on an attention mechanism and a time-loop neural network, and the feature vector of the target data is obtained, including:
based on a soft attention mechanism and a time-cycle neural network in sequence, respectively carrying out feature extraction on the original data corresponding to each time step to obtain a feature sequence of the original data corresponding to each time step;
and extracting the characteristics of the characteristic sequences of the original data corresponding to all the time steps based on a multi-head attention mechanism, and integrating the extracted characteristics through a feedforward neural network to obtain the characteristic vector of the target data.
For example, if the target task is to predict whether the user repays on time, the acquired data to be processed (the user's basic information, hobbies, social network information, recent activity information, asset information, fund flow records, bad credit records, and so on) is first processed into the target data. Feature extraction is then performed on the original data corresponding to each time step: features corresponding to important and non-important information are screened out, with relatively low weights set for non-important information and relatively high weights for important information, yielding the feature sequence of the original data for each time step. Next, feature extraction is performed on the feature sequences of all time steps based on a multi-head attention mechanism, again weighting important and non-important features differently, and the extracted features are integrated through a feedforward neural network to obtain the feature vector of the target data. In this way, multidimensional parallel data processing and feature extraction on the target data yield its feature vector.
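The second stage, multi-head attention over the per-time-step feature sequences followed by feedforward integration, can be sketched as below. All weight matrices are random stand-ins for learned parameters, the dot-product scoring and mean-pooling are assumptions, and the shapes are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(H, heads, rng):
    """Multi-head self-attention over the feature sequences of all time
    steps, then a feedforward layer that integrates the heads' outputs
    into one feature vector for the target data."""
    seq_len, dim = H.shape
    head_outs = []
    for _ in range(heads):
        # Affine transformations into per-head query/key/value spaces.
        Wq, Wk, Wv = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))
        Q, K, V = H @ Wq, H @ Wk, H @ Wv
        A = softmax(Q @ K.T / np.sqrt(dim))   # attention weights per position
        head_outs.append(A @ V)
    concat = np.concatenate(head_outs, axis=-1)        # (seq_len, heads*dim)
    # Feedforward integration, then pool over time into a single vector.
    W_ff = rng.normal(size=(heads * dim, dim)) * 0.1
    return np.tanh(concat @ W_ff).mean(axis=0)

rng = np.random.default_rng(3)
features = rng.normal(size=(6, 4))   # feature sequences over 6 time steps
vec = multi_head_attention(features, heads=2, rng=rng)   # feature vector
```

Concatenating the heads and projecting through one feedforward layer mirrors the standard transformer-style integration the description alludes to; the pooling step is one simple way to collapse the time dimension into a fixed-size vector.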
Further, in the step S12, feature extraction is performed on the original data corresponding to each time step based on the soft attention mechanism and the time-cycle neural network in sequence, so as to obtain a feature sequence of the original data corresponding to each time step, including:
and carrying out vectorization processing on the original data corresponding to each time step based on an embedding algorithm corresponding to the target data, so as to obtain a dense vector of the original data corresponding to each time step. Here, the target data of different types relate to different embedding methods, for example, if the target data is text data, the embedding algorithm corresponding to the text data may include a word embedding algorithm, a sentence embedding algorithm, and the like, and if the target data is image data, the embedding algorithm corresponding to the image data may include a convolution layer algorithm, and the like, so that after the target data is vectorized by the embedding algorithm corresponding to the target data, the multidimensional heterogeneous target data can be converted into dense vectors capable of further performing feature extraction, so as to obtain dense vectors of the original data corresponding to each time step in the target data, which is favorable for improving efficiency when the target data is subsequently subjected to feature extraction, and further improving the feature extraction effect of the target data.
Feature extraction and feature weighting are respectively carried out on the dense vector of the original data corresponding to each time step based on the soft attention mechanism, to obtain the attention feature sequence of the original data corresponding to each time step. Here, the soft attention mechanism focuses on the important information in the dense vectors of the original data corresponding to each time step so as to acquire more of the information that needs attention, suppresses other useless information, and — by dynamically increasing the attention weights of the important features (i.e. the important information) — outputs a new, attention-weighted feature sequence, namely the attention feature sequence.
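A minimal sketch of one such soft-attention step, assuming an additive scoring function; the weight matrices here are random stand-ins for parameters that would be learned:

```python
import numpy as np

rng = np.random.default_rng(1)

def soft_attention(seq, W, w):
    """Weight each feature in `seq` by a softmax attention score.

    seq: (n, d) dense vectors of one time step. Important features receive
    larger weights; unimportant ones are suppressed rather than discarded.
    """
    scores = np.tanh(seq @ W) @ w              # (n,) unnormalised scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                       # softmax attention weights
    return alpha[:, None] * seq, alpha         # attention-weighted sequence

d = 8
seq = rng.normal(size=(5, d))                  # dense vectors of one time step
W = rng.normal(size=(d, d))
w = rng.normal(size=d)
weighted, alpha = soft_attention(seq, W, w)
```

Because the weights come from a softmax, every feature keeps a non-zero weight — important features are emphasized while non-important ones are down-weighted, not ignored.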
Feature learning is respectively carried out on the attention feature sequence of the original data corresponding to each time step based on the time-recurrent neural network, to obtain the feature sequence of the original data corresponding to each time step, the feature sequence comprising long-distance feature dependency relationships. Here, by virtue of its network structure, the time-recurrent neural network memorizes previous information and uses it to influence the output of subsequent nodes: it recurrently takes into account the relationship between the attention feature sequence of the original data corresponding to each time step and the feature sequences of the corresponding historical time steps, and transmits important features (i.e. important information) through a memory cell, so that long-distance feature dependency relationships about the important features are learned within the network, and a new feature sequence of the original data is output that is equal in length, in the time dimension, to the attention feature sequences of the original data corresponding to the time steps.
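Since the claims name the long short-term memory network (LSTM) as the time-recurrent network, a single LSTM step can be sketched as follows; the parameter shapes and random initialization are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, P):
    """One LSTM step: the memory cell `c` carries important features
    forward across time steps, which is how long-distance feature
    dependencies are learned."""
    z = np.concatenate([x, h])
    f = sigmoid(P["Wf"] @ z + P["bf"])         # forget gate
    i = sigmoid(P["Wi"] @ z + P["bi"])         # input gate
    g = np.tanh(P["Wg"] @ z + P["bg"])         # candidate memory
    o = sigmoid(P["Wo"] @ z + P["bo"])         # output gate
    c = f * c + i * g                          # updated memory cell
    h = o * np.tanh(c)                         # hidden state / output feature
    return h, c

d, hdim = 8, 4
P = {k: rng.normal(scale=0.1, size=(hdim, d + hdim))
     for k in ("Wf", "Wi", "Wg", "Wo")}
P.update({b: np.zeros(hdim) for b in ("bf", "bi", "bg", "bo")})
h, c = np.zeros(hdim), np.zeros(hdim)
for x in rng.normal(size=(6, d)):              # attention sequence, 6 steps
    h, c = lstm_step(x, h, c, P)
```

Running the cell over the attention sequence yields one output per input step, so the new feature sequence stays equal in length, in the time dimension, to the attention feature sequence.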
For example, if the target data to be processed is text data, it includes basic information, hobbies, social network information, recent activity information, asset information, fund flow records, bad credit records and the like of a large number of clients. First, the embedding algorithms corresponding to text data — the word embedding algorithm and the sentence embedding algorithm — respectively vectorize the original data corresponding to each time step in the target data to obtain the dense vector of the original data corresponding to each time step. Then, feature extraction and feature weighting are respectively carried out on the dense vector of the original data corresponding to each time step through the soft attention mechanism, so as to extract the important and non-important features in each dense vector and weight them to different degrees; the important features that need attention are thereby emphasized while the non-important features are not ignored, realizing multi-dimensional, all-round feature extraction of the data and yielding the attention feature sequences. Then, feature learning is respectively carried out on the attention feature sequence of the original data corresponding to each time step through the time-recurrent neural network, long-distance feature dependency relationships are learned, and a new feature sequence of the original data is output that is equal in length, in the time dimension, to the attention feature sequence of the original data. Feature extraction of the multi-dimensional, heterogeneous target data is thus realized while the feature extraction efficiency of the target data is improved, thereby improving the feature extraction effect of the target data.
Further, in step S12, the feature extraction performed on the feature sequences of the original data corresponding to all the time steps based on the multi-head attention mechanism, with the extracted features integrated through the feedforward neural network to obtain the feature vector of the target data, includes:
after at least one affine transformation is carried out on the feature sequences of the original data corresponding to all the time steps, feature extraction and feature weighting are respectively carried out, based on the multi-head attention mechanism, on the feature sequence corresponding to the target data after each affine transformation, to obtain the feature sequence of the target data corresponding to each soft attention mechanism in the multi-head attention mechanism; and the feature sequences of the target data corresponding to each soft attention mechanism in the multi-head attention mechanism are integrated through the feedforward neural network to obtain the feature vector of the target data.
For example, the feature sequences of the original data corresponding to all the time steps are subjected to H affine transformations to obtain H affine-transformed copies; these are respectively input into H attention layers, and feature extraction and feature weighting are carried out through H parallel soft attention mechanisms, realizing highly parallel processing of the feature sequences of the original data corresponding to all the time steps and finally outputting H different feature sequences. Finally, the H different feature sequences are spliced by the feedforward neural network into one feature sequence, namely the feature vector of the target data. Feature extraction of the target data is thus realized, over-fitting of the feature extraction is avoided, the feature extraction efficiency of the target data is improved, and the feature extraction effect of the target data is improved.
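A toy NumPy sketch of this multi-head step — H random affine maps, one soft attention per head, and a feed-forward splice. All weights here are hypothetical stand-ins for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head(seq, heads):
    """H parallel soft-attention heads over H affine copies of `seq`,
    spliced into one vector for the feed-forward projection."""
    outs = []
    for A, b, w in heads:                  # one affine map + scorer per head
        proj = seq @ A + b                 # affine transformation of the sequence
        alpha = softmax(proj @ w)          # per-position attention weights
        outs.append(alpha @ proj)          # weighted sum -> head feature
    return np.concatenate(outs)            # spliced feature, fed to the FFN

T, d, H = 6, 8, 4                          # sequence length, width, head count
seq = rng.normal(size=(T, d))              # feature sequence from the RNN layer
heads = [(rng.normal(size=(d, d)), rng.normal(size=d), rng.normal(size=d))
         for _ in range(H)]
Wff = rng.normal(size=(H * d, d))          # feed-forward integration weights
feature_vector = np.tanh(multi_head(seq, heads) @ Wff)   # feature vector V
```

Because the H heads are independent, they can be evaluated in parallel, which is the "highly parallel processing" the paragraph describes.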
Following the above-mentioned embodiments of the present application, the data processing method in an embodiment of the present application further includes:
performing feature adjustment on the target data based on the execution result of the target task to obtain updated feature vectors of the target data;
and inputting the updated feature vector of the target data into the task model to execute the target task and simultaneously adjusting the task model to obtain an updated execution result of the target task.
For example, if the target task is to predict whether the user will pay on time, the task model corresponding to the target task is a logistic regression model. The target data to be processed corresponding to the target task — predicting whether the user pays on time — are converted, after feature extraction, into the feature vector of the target data, which is input into the task model, i.e. the logistic regression model. After the logistic regression model executes the target task, an execution result is obtained; in each training step, the gradient of the logistic regression model flows back to the feature extraction process of the target data to be processed, so that the execution result adjusts the feature extraction of the target data and the feature extraction process is updated and optimized; the updated feature vector of the target data is then input into the logistic regression model to obtain the updated execution result. Joint updating and optimization of the feature extraction, the task model and the execution result obtained from the task model are thereby realized, ensuring that the obtained execution result of the target task is more reliable and accurate.
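A toy sketch of this joint update, with a single linear map standing in for the full feature extractor and synthetic labels standing in for real repayment data; it only illustrates how the log-loss gradient of the logistic-regression head reaches the extraction parameters:

```python
import numpy as np

rng = np.random.default_rng(4)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Hypothetical stand-in: linear "feature extraction" W followed by a
# logistic-regression head u. The gradient flows through u back into W,
# so the execution result adjusts the feature extraction itself.
X = rng.normal(size=(64, 5))                    # preprocessed target data
y = (X[:, 0] + X[:, 1] > 0).astype(float)       # pays on time? (synthetic)
W = rng.normal(scale=0.1, size=(5, 3))          # extraction parameters
u = rng.normal(scale=0.1, size=3)               # task-model parameters

lr, losses = 0.5, []
for _ in range(200):
    F = X @ W                                   # feature vectors of the data
    p = sigmoid(F @ u)                          # execution result
    q = np.clip(p, 1e-12, 1 - 1e-12)
    losses.append(-np.mean(y * np.log(q) + (1 - y) * np.log(1 - q)))
    err = (p - y) / len(y)                      # log-loss gradient
    u -= lr * F.T @ err                         # adjust the task model
    W -= lr * X.T @ (err[:, None] * u[None, :]) # adjust feature extraction
```

Each pass both executes the task and updates the extraction, mirroring the alternating "execute, then adjust" loop described above.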
According to another aspect of the present application, there is also provided a computer readable medium having stored thereon computer readable instructions which, when executed by a processor, cause the processor to implement the data processing method as described above.
According to another aspect of the present application, there is also provided a data processing apparatus characterized by comprising:
one or more processors;
a computer readable medium for storing one or more computer readable instructions,
the one or more computer-readable instructions, when executed by the one or more processors, causing the one or more processors to implement the data processing method as described above.
For details of each embodiment of the apparatus, reference may be made to the corresponding parts of the embodiments of the data processing method above, which are not described herein again.
In a practical application scenario of the data processing method provided by the application, taking credit assessment in the financial field as an example, if the target task is to predict whether a user pays on time, the task model corresponding to the target task may adopt a logistic regression model, and the corresponding data to be processed are acquired: data such as the basic information, hobbies, social network information, recent activity information, asset information, fund flow records and bad credit records of the user that relate to the financial field. The data to be processed may comprise data of different time dimensions, and each step of each time dimension may be referred to as a time step. The data to be processed X_m^t are acquired based on the target task, where t denotes the t-th time step, t is a positive integer less than or equal to T, and T, a positive integer, is the total number of time steps included in the data to be processed; each time step corresponds to at least one piece of original data, m is a positive integer greater than or equal to 1 and is the total number of pieces of original data corresponding to an individual time step; and X_m^t denotes the m-th piece of original data in the t-th time step. That is, the data to be processed comprise m pieces of original data corresponding to each of the T time steps, as shown in fig. 2.
Useless parts of the acquired data to be processed are rejected through data de-noising; data relating to user privacy are hidden through data desensitization, so as to realize desensitization of the data to be processed; the coding formats of the data to be processed are unified through script cleaning; and the dimensions of the information features of the data to be processed are unified through normalization and other means, so as to realize normalization of the data to be processed. The target data to be processed X_1^1…X_i^1, X_1^2…X_i^2, X_1^3…X_i^3, X_1^4…X_i^4, X_1^5…X_i^5, …, X_1^j…X_i^j are thereby obtained, where i is a positive integer less than or equal to m, j is a positive integer less than or equal to T, and X_i^j is the i-th piece of original data corresponding to the j-th of the T time steps; the total number of pieces of original data corresponding to each time step is less than or equal to m (i.e. i ≤ m). Preprocessing of the target data corresponding to the target task is thus realized, which facilitates subsequent feature extraction of the target data. Of course, after the target data to be processed are obtained, they can be stored in a corresponding database or large file, so that the target data to be processed are not lost; the data storage format of the database or large file may be, but is not limited to, HBase, MongoDB, CSV, HDF and the like, so that storage of the target data corresponding to the target task is realized.
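A minimal sketch of this preprocessing on a hypothetical record layout — the field names, the phone-masking rule and min-max normalization are assumptions for illustration, not the application's specification:

```python
import re
import numpy as np

def preprocess(records):
    """De-noise, desensitize and normalize raw records into target data."""
    # De-noising: reject records whose key field is missing.
    cleaned = [r for r in records if r.get("amount") is not None]
    for r in cleaned:
        # Desensitization: mask all but the last four digits of the phone.
        r["phone"] = re.sub(r"\d(?=\d{4})", "*", r["phone"])
    # Normalization: min-max scale the amount feature into [0, 1].
    amounts = np.array([r["amount"] for r in cleaned], dtype=float)
    lo, hi = amounts.min(), amounts.max()
    for r, a in zip(cleaned, (amounts - lo) / (hi - lo)):
        r["amount"] = float(a)
    return cleaned

data = preprocess([
    {"id": 1, "phone": "13912345678", "amount": 1000.0},
    {"id": 2, "phone": "13811112222", "amount": 5000.0},
    {"id": 3, "phone": "13700000000", "amount": None},   # noisy record
])
```

The noisy third record is dropped, private fields are hidden, and the numeric feature is put on a unified dimension before feature extraction.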
The feature extraction process of the target data involves an input layer, an embedding layer, an attention layer, a recurrent network layer, a multi-head attention layer, and a feature output layer. First, the input layer acquires the target data X_1^1…X_i^1, X_1^2…X_i^2, …, X_1^j…X_i^j. Next, in the embedding layer, the embedding algorithms corresponding to text data — the word embedding algorithm and the sentence embedding algorithm — respectively vectorize the original data corresponding to each time step in the target data to obtain the dense vectors V_1^1, V_1^2, …, V_1^j, where V_1^1 is the dense vector of the original data corresponding to the first time step, V_1^2 that of the second time step, and so on, V_1^j being the dense vector of the original data corresponding to the j-th time step. Then, in the attention layer, the soft attention mechanism performs feature extraction and feature weighting on the dense vectors V_1^1, V_1^2, …, V_1^j of the original data corresponding to each time step, so as to extract the important and non-important features in each dense vector and weight them to different degrees; the important features that need attention are thereby emphasized while the non-important features are not ignored, realizing multi-dimensional, all-round feature extraction of the data and yielding the attention feature sequences V_2^1, V_2^2, …, V_2^j, where V_2^1 is the attention feature sequence of the original data corresponding to the first time step, V_2^2 that of the second time step, and so on, V_2^j being the attention feature sequence of the original data corresponding to the j-th time step. Then, in the recurrent network layer, the time-recurrent neural network performs feature learning on the attention feature sequence of the original data corresponding to each time step, learning long-distance feature dependency relationships and outputting new feature sequences V_3^1, V_3^2, …, V_3^j of the original data that are equal in length, in the time dimension, to the attention feature sequences of the original data, where V_3^1 corresponds to the first time step, V_3^2 to the second, and so on, V_3^j to the j-th. Feature extraction of the multi-dimensional, heterogeneous target data is thus realized while the feature extraction efficiency of the target data is improved, thereby improving the feature extraction effect of the target data.
Then, in the multi-head attention layer, the feature sequences V_3^1, V_3^2, …, V_3^j of the original data corresponding to all the time steps are subjected to H affine transformations to obtain H affine-transformed copies, which are respectively input into H attention layers; feature extraction and feature weighting are carried out through H parallel soft attention mechanisms to obtain the feature sequence of the target data corresponding to each soft attention mechanism in the multi-head attention mechanism: V_4^1, V_4^2, …, V_4^x, …, V_4^(H-1), V_4^H, where x indicates the x-th of the H attention layers and is a positive integer less than or equal to H. Highly parallel processing of the feature sequences of the original data corresponding to all the time steps is thus realized, and H different feature sequences V_4^1, V_4^2, …, V_4^H are finally output. Finally, the feedforward neural network splices the feature sequences of the target data corresponding to the H soft attention mechanisms into one feature sequence, namely the feature vector V of the target data; feature extraction of the target data is thus realized, over-fitting of the feature extraction is avoided, the feature extraction efficiency of the target data is improved, and the feature extraction effect of the target data is improved. Finally, the obtained feature vector V of the target data is output through the feature output layer.
Then, the target data to be processed corresponding to the target task — predicting whether the user pays on time — are converted, after feature extraction, into the feature vector V of the target data, which is input into the task model, i.e. the logistic regression model. After the logistic regression model executes the target task, an execution result is obtained; in each training step, the gradient of the logistic regression model flows back to the feature extraction process of the target data to be processed, so that the execution result adjusts the feature extraction of the target data and the feature extraction process is updated and optimized; the updated feature vector of the target data is then input into the logistic regression model to obtain an updated execution result. Joint updating and optimization of the feature extraction, the task model and the execution result obtained from the task model are thereby realized, ensuring that the obtained execution result of the target task is more reliable and accurate.
In summary, the present application performs feature extraction on the acquired target data to be processed through the attention mechanism and the time-recurrent neural network: noise and useless parts of the target data are disregarded, important and non-important information in the target data is screened, and the two are given different weights in the feature vector of the target data, so that multi-dimensional analysis and feature extraction of the target data are ensured. The resulting feature vector of the target data is used as the input of the task model corresponding to the target task, and the execution result of the target task is obtained through the processing of the task model. The problem of information loss that may occur in manual data processing is thus avoided, manpower is saved, the screening capability and fitting capability of the feature extraction of the target data are improved, the feature extraction effect of the target data is improved, and the reliability of the execution result of the target task is improved.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, e.g., using Application Specific Integrated Circuits (ASIC), a general purpose computer or any other similar hardware device. In one embodiment, the software program of the present application may be executed by a processor to perform the steps or functions described above. Likewise, the software programs of the present application (including associated data structures) may be stored on a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. In addition, some steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
Furthermore, portions of the present application may be implemented as a computer program product, such as computer program instructions, which when executed by a computer, may invoke or provide methods and/or techniques in accordance with the present application by way of operation of the computer. Program instructions for invoking the inventive methods may be stored in fixed or removable recording media and/or transmitted via a data stream in a broadcast or other signal bearing medium and/or stored within a working memory of a computer device operating according to the program instructions. An embodiment according to the application comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to operate a method and/or a solution according to the embodiments of the application as described above.
It will be evident to those skilled in the art that the application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the apparatus claims can also be implemented by means of one unit or means in software or hardware. The terms first, second, etc. are used to denote a name, but not any particular order.
Claims (7)
1. A method of data processing, the method comprising:
acquiring a target task, a task model corresponding to the target task and target data to be processed, wherein the target data comprises at least one piece of original data corresponding to each time step, the time steps cover data of different time dimensions in the target data to be processed, and each step of each time dimension is one time step;
Performing feature extraction on the target data based on an attention mechanism and a time-cycled neural network to obtain feature vectors of the target data;
inputting the feature vector of the target data into the task model to execute the target task to obtain an execution result of the target task;
the method is applied to credit evaluation in the financial field, and the target data to be processed corresponding to the target task reflects, from different time dimensions, the basic information, hobbies, social network information, recent activity information, asset information, fund flow records and bad credit records of a user related to the financial field; wherein,
the feature extraction performed on the target data based on the attention mechanism and the time-recurrent neural network to obtain the feature vector of the target data comprises: based on, in sequence, a soft attention mechanism and the time-recurrent neural network, respectively performing feature extraction on the original data corresponding to each time step to obtain a feature sequence of the original data corresponding to each time step; and performing feature extraction on the feature sequences of the original data corresponding to all the time steps based on a multi-head attention mechanism, and integrating the extracted features through a feedforward neural network to obtain the feature vector of the target data; wherein the respectively performing feature extraction on the original data corresponding to each time step based on, in sequence, the soft attention mechanism and the time-recurrent neural network to obtain the feature sequence of the original data corresponding to each time step comprises: based on an embedding algorithm corresponding to the target data, respectively performing vectorization processing on the original data corresponding to each time step to obtain a dense vector of the original data corresponding to each time step;
respectively performing feature extraction and feature weighting on the dense vector of the original data corresponding to each time step based on the soft attention mechanism, to obtain an attention feature sequence of the original data corresponding to each time step; and based on the time-recurrent neural network, respectively performing feature learning on the attention feature sequence of the original data corresponding to each time step, to obtain the feature sequence of the original data corresponding to each time step, the feature sequence comprising long-distance feature dependency relationships.
2. The method according to claim 1, wherein the obtaining the target task, the task model corresponding to the target task, and the target data to be processed includes:
acquiring a target task;
determining a corresponding task model and data to be processed based on the target task; and respectively carrying out denoising, desensitizing, script cleaning and normalization on the data to be processed to obtain target data to be processed.
3. The method of claim 1, wherein the feature extraction is performed on the feature sequences of the original data corresponding to all the time steps based on the multi-head attention mechanism, and the extracted features are integrated through a feedforward neural network to obtain feature vectors of the target data, and the method comprises:
after at least one affine transformation is carried out on the feature sequences of the original data corresponding to all the time steps, respectively performing feature extraction and feature weighting, based on the multi-head attention mechanism, on the feature sequence corresponding to the target data after each affine transformation, to obtain the feature sequence of the target data corresponding to each soft attention mechanism in the multi-head attention mechanism;
and integrating the characteristic sequences of the target data corresponding to each soft attention mechanism in the multi-head attention mechanism through the feedforward neural network to obtain the characteristic vector of the target data.
4. A method according to any one of claims 1 to 3, further comprising:
performing feature adjustment on the target data based on the execution result of the target task to obtain updated feature vectors of the target data;
and inputting the updated feature vector of the target data into the task model to execute the target task and simultaneously adjusting the task model to obtain an updated execution result of the target task.
5. The method according to any one of claims 1 to 2, wherein the time-recurrent neural network is a long short-term memory network (LSTM).
6. A computer readable medium having stored thereon computer readable instructions which, when executed by a processor, cause the processor to implement the method of any of claims 1 to 5.
7. An apparatus for data processing, the apparatus comprising:
one or more processors;
a computer readable medium for storing one or more computer readable instructions,
when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910775569.8A CN110490304B (en) | 2019-08-21 | 2019-08-21 | Data processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910775569.8A CN110490304B (en) | 2019-08-21 | 2019-08-21 | Data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110490304A CN110490304A (en) | 2019-11-22 |
CN110490304B true CN110490304B (en) | 2023-10-27 |
Family
ID=68552668
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910775569.8A Active CN110490304B (en) | 2019-08-21 | 2019-08-21 | Data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490304B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111222981A (en) * | 2020-01-16 | 2020-06-02 | 中国建设银行股份有限公司 | Credibility determination method, device, equipment and storage medium |
CN111491262A (en) * | 2020-04-26 | 2020-08-04 | 中国信息通信研究院 | Method and device for determining signal strength of mobile broadband network |
CN113283979A (en) * | 2021-05-12 | 2021-08-20 | 广州市全民钱包科技有限公司 | Loan credit evaluation method and device for loan applicant and storage medium |
CN117876910B (en) * | 2024-03-06 | 2024-06-21 | 西北工业大学 | Unmanned aerial vehicle target detection key data screening method based on active learning |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108280104B (en) * | 2017-02-13 | 2020-06-02 | 腾讯科技(深圳)有限公司 | Method and device for extracting characteristic information of target object |
US20190034497A1 (en) * | 2017-07-27 | 2019-01-31 | Nec Laboratories America, Inc. | Data2Data: Deep Learning for Time Series Representation and Retrieval |
CN108460679B (en) * | 2018-02-28 | 2021-02-26 | 电子科技大学 | Data analysis method of deep network intelligent investment system integrating attention mechanism |
-
2019
- 2019-08-21 CN CN201910775569.8A patent/CN110490304B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110490304A (en) | 2019-11-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11720804B2 (en) | Data-driven automatic code review | |
CN110490304B (en) | Data processing method and device | |
CN112633419B (en) | Small sample learning method and device, electronic equipment and storage medium | |
CN111724083A (en) | Training method and device for financial risk recognition model, computer equipment and medium | |
US10346782B2 (en) | Adaptive augmented decision engine | |
CN106095942B (en) | Strong variable extracting method and device | |
US20210303970A1 (en) | Processing data using multiple neural networks | |
CN110598620B (en) | Deep neural network model-based recommendation method and device | |
US20220253856A1 (en) | System and method for machine learning based detection of fraud | |
US20210304055A1 (en) | Mechanisms for Continuous Improvement of Automated Machine Learning | |
CN110956278A (en) | Method and system for retraining machine learning models | |
CN116737581A (en) | Test text generation method and device, storage medium and electronic equipment | |
US20200175406A1 (en) | Apparatus and methods for using bayesian program learning for efficient and reliable knowledge reasoning | |
CN116522912B (en) | Training method, device, medium and equipment for package design language model | |
CN117421639A (en) | Multi-mode data classification method, terminal equipment and storage medium | |
RU2715024C1 (en) | Method of trained recurrent neural network debugging | |
CN116883179A (en) | Method and device for determining financial product investment strategy, processor and electronic equipment | |
KR102284440B1 (en) | Method to broker deep learning model transactions perfomed by deep learning model transaction brokerage servers | |
CN112328899B (en) | Information processing method, information processing apparatus, storage medium, and electronic device | |
CN112860652B (en) | Task state prediction method and device and electronic equipment | |
EP4356317A1 (en) | Blackbox optimization via model ensembling | |
CN118155227B (en) | Nuclear power equipment maintenance decision-making method and system based on intelligent technology | |
KR102311108B1 (en) | Method to broker deep learning model transactions perfomed by deep learning model transaction brokerage servers | |
US20240134846A1 (en) | Automatic Generation of Training and Testing Data for Machine-Learning Models | |
US20230008628A1 (en) | Determining data suitability for training machine learning models |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||