CN112015870B - Data uploading method and device

Info

Publication number
CN112015870B
Authority
CN
China
Prior art keywords
data
uploaded
uploading
index
target
Prior art date
Legal status
Active
Application number
CN202010963165.4A
Other languages
Chinese (zh)
Other versions
CN112015870A (en)
Inventor
屈晋宇
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010963165.4A priority Critical patent/CN112015870B/en
Publication of CN112015870A publication Critical patent/CN112015870A/en
Application granted granted Critical
Publication of CN112015870B publication Critical patent/CN112015870B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The specification provides a data uploading method and device. The data uploading method comprises the following steps: acquiring at least two data to be uploaded; for each of the at least two data to be uploaded, splitting the data to be uploaded to obtain a plurality of data fields, and determining a data index corresponding to each data field in the plurality of data fields; for a target data index, in a case that the data of the data fields corresponding to the target data index are acquired under the same acquisition condition, judging whether the data of the data fields corresponding to the target data index are the same, and if so, determining that the data fields corresponding to the target data index pass the audit; and for any data to be uploaded, if the plurality of data fields of the data to be uploaded all pass the audit, determining that the data to be uploaded passes the audit and uploading the data to be uploaded to a target uploading platform. The method and the device make the compliance risk known in advance, before the data to be uploaded is uploaded to the target uploading platform, and thereby ensure the compliance of the data.

Description

Data uploading method and device
Technical Field
The present disclosure relates to the technical field of compliance data management, and in particular, to a data uploading method and apparatus.
Background
With the rapid development of the economy and society, the service types handled by various service platforms are becoming more and more complex. Each service line has a corresponding partner institution to which data needs to be uploaded, and related data also needs to be uploaded to regulatory authorities for supervision and review, so a large amount of data needs to be uploaded to different uploading platforms. Uploading data to an uploading platform is, in effect, a data disclosure process, and as services develop, many data disclosure scenarios are involved, such as regulatory disclosure, financial data disclosure, fund data disclosure, service data disclosure, public-relations disclosure, and disclosure of key data to investors. In the prior art, the acquired data to be uploaded is often uploaded directly to the uploading platform, that is, the data is disclosed directly. However, the data to be uploaded may carry a compliance risk; that is, the data uploaded to the uploading platform may include non-compliant data. Before the data to be uploaded is uploaded to the uploading platform, the compliance risk cannot be known in advance and the non-compliant data cannot be handled. There is therefore a need for a data uploading method capable of discovering the compliance risk of data to be uploaded in advance.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a data uploading method. The present disclosure also relates to a data uploading device, a computing device, and a computer-readable storage medium, so as to solve the technical defects in the prior art.
According to a first aspect of embodiments of the present disclosure, there is provided a data uploading method, the method including:
acquiring at least two data to be uploaded;
for each of the at least two data to be uploaded, splitting the data to be uploaded to obtain a plurality of data fields, and determining a data index corresponding to each data field in the plurality of data fields;
For a target data index, judging whether the data of the data field corresponding to the target data index is the same or not under the condition that the data of the data field corresponding to the target data index is acquired under the same acquisition condition, and if so, determining that the data field corresponding to the target data index passes the audit;
and for any data to be uploaded, if a plurality of data fields of the data to be uploaded pass the audit, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform.
Optionally, after determining the data index corresponding to each of the plurality of data fields, the method further includes:
storing, according to the data index, the data of the data field corresponding to the data index in correspondence with target uploading information, wherein the target uploading information refers to the uploading information of the data to be uploaded to which the data field corresponding to the data index belongs, and the uploading information comprises an uploading platform, uploading time and/or uploading party information.
Optionally, the at least two data to be uploaded are received through the same interactive interface or different interactive interfaces, and the at least two data to be uploaded are uploaded to the same uploading platform or different uploading platforms.
Optionally, after the determining whether the data of the data field corresponding to the target data indicator is the same, the method further includes:
if the data fields are different, determining that the data fields corresponding to the target data indexes do not pass the auditing;
Determining the data to be uploaded corresponding to the data field which does not pass the audit as data to be processed;
and uploading the processed data to a target uploading platform under the condition that the data to be processed is processed.
Optionally, the splitting the data to be uploaded to obtain a plurality of data fields includes:
inputting the data to be uploaded into a preset segmentation model;
and determining a plurality of data fields corresponding to the data to be uploaded based on the preset segmentation model.
Optionally, the splitting the data to be uploaded to obtain a plurality of data fields includes:
carrying out semantic analysis on the data to be uploaded, and determining a plurality of initial segmentation points;
And splitting the data to be uploaded into a plurality of data fields based on the plurality of starting segmentation points, wherein the data between each starting segmentation point and the next starting segmentation point form a data field.
Optionally, the determining the data index corresponding to each of the plurality of data fields includes:
extracting target keywords of each data field in the plurality of data fields through a preset keyword extraction model;
And determining the data index corresponding to the target keyword according to the corresponding relation between the prestored keyword and the index.
Optionally, the preset keyword extraction model is obtained through training by the following method:
acquiring a data sample, wherein the data sample comprises a sample tag, and the sample tag comprises a target keyword;
Inputting the data sample into an initial model to obtain a predicted keyword;
and determining a loss value based on the predicted keywords and the target keywords, and training the initial model based on the loss value until a training stopping condition is reached, so as to obtain the preset keyword extraction model.
Optionally, the training the initial model based on the loss value until reaching a training stop condition includes:
Judging whether the loss value is smaller than a preset threshold value or not;
if not, returning to execute the step of acquiring the data sample, and continuing training;
if yes, determining that the training stopping condition is reached.
According to a second aspect of embodiments of the present specification, there is provided a data uploading apparatus, the apparatus comprising:
the acquisition module is configured to acquire at least two data to be uploaded;
The first determining module is configured to, for each of the at least two data to be uploaded, split the data to be uploaded to obtain a plurality of data fields, and determine a data index corresponding to each data field in the plurality of data fields;
The second determining module is configured to determine, for a target data index, whether data of a data field corresponding to the target data index are the same or not when the data of the data field corresponding to the target data index are acquired under the same acquisition condition, and if so, determine that the data field corresponding to the target data index passes the audit;
And the uploading module is configured to, for any data to be uploaded, determine that the data to be uploaded passes the audit if the plurality of data fields of the data to be uploaded all pass the audit, and upload the data to be uploaded to a target uploading platform.
According to a third aspect of embodiments of the present specification, there is provided a computing device comprising:
A memory and a processor;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions to implement the method of:
acquiring at least two data to be uploaded;
for each of the at least two data to be uploaded, splitting the data to be uploaded to obtain a plurality of data fields, and determining a data index corresponding to each data field in the plurality of data fields;
For a target data index, judging whether the data of the data field corresponding to the target data index is the same or not under the condition that the data of the data field corresponding to the target data index is acquired under the same acquisition condition, and if so, determining that the data field corresponding to the target data index passes the audit;
and for any data to be uploaded, if a plurality of data fields of the data to be uploaded pass the audit, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform.
According to a fourth aspect of embodiments of the present description, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the data upload method.
According to the data uploading method provided by the specification, after at least two data to be uploaded are obtained, each of the at least two data to be uploaded is split to obtain a plurality of data fields, and a data index corresponding to each data field in the plurality of data fields is determined; for a target data index, in a case that the data of the data fields corresponding to the target data index are acquired under the same acquisition condition, it is judged whether the data of the data fields corresponding to the target data index are the same, and if so, it is determined that the data fields corresponding to the target data index pass the audit; and for any data to be uploaded, if the plurality of data fields of the data to be uploaded all pass the audit, it is determined that the data to be uploaded passes the audit, and the data to be uploaded is uploaded to a target uploading platform. In this way, the data to be uploaded can be split into a plurality of data fields, each corresponding to a data index. For the same data index, if the data of the corresponding data fields are acquired under the same acquisition condition and are identical, the data fields corresponding to that data index pass the audit and carry no compliance risk; if they differ, the data may contain errors and a compliance risk exists. For any data to be uploaded, if every data field it includes passes the audit, there is no compliance risk and the data to be uploaded can then be uploaded to the uploading platform. The compliance risk can thus be known in advance, before the data to be uploaded is uploaded to the uploading platform, and the compliance of the data is ensured.
Drawings
FIG. 1 is a flow chart of a method for uploading data according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating a splitting process of data to be uploaded according to an embodiment of the present disclosure;
FIG. 3 is a process flow diagram of another method for uploading data according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a data uploading device according to an embodiment of the present disclosure;
Fig. 5 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present specification. However, this specification may be implemented in many forms other than those described herein, and those skilled in the art can make similar generalizations without departing from the spirit of this specification; therefore, this specification is not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of this specification, a first may also be referred to as a second, and similarly, a second may also be referred to as a first. Depending on the context, the word "if" as used herein may be interpreted as "when ..." or "upon ..." or "in response to determining".
In the present specification, a data uploading method is provided, and the present specification relates to a data uploading apparatus, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments one by one.
Fig. 1 shows a flowchart of a data uploading method according to an embodiment of the present disclosure, which specifically includes the following steps:
Step 102: and acquiring at least two data to be uploaded.
In practical applications, the service types handled by various service platforms are becoming more and more complex. Each service line has a corresponding partner institution to which data needs to be uploaded, and related data also needs to be uploaded to regulatory authorities for supervision and review, so a large amount of data needs to be uploaded to different uploading platforms; uploading data to an uploading platform is, in effect, a data disclosure process. In the prior art, the acquired data to be uploaded is often uploaded directly to the uploading platform, that is, the data is disclosed directly. However, although many disclosure scenarios are involved, the same data to be disclosed may appear in different disclosure scenarios. Because the data to be disclosed may be collected from different business channels in a scattered manner or processed by different staff members, differences may exist between data that should be the same; such unreasonable differences indicate that the data to be disclosed may contain errors. If the data to be disclosed is disclosed directly to the disclosure target, the compliance risk cannot be known in advance, and the non-compliant data cannot be handled.
Therefore, in order to know the compliance risk of the data to be uploaded in advance, the present specification provides a data uploading method, which can split the data to be uploaded to obtain a plurality of data fields for each data to be uploaded in at least two data to be uploaded after obtaining at least two data to be uploaded, and determine a data index corresponding to each data field in the plurality of data fields; for the target data index, under the condition that the data of the data field corresponding to the target data index is acquired under the same acquisition condition, judging whether the data of the data field corresponding to the target data index is the same, if so, determining that the data field corresponding to the target data index passes the audit; and if a plurality of data fields of the data to be uploaded pass the audit aiming at any data to be uploaded, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform. The method and the device realize self-checking of the data to be uploaded before uploading to the target uploading platform, and judge whether the data of the data fields corresponding to the same data index are the same, so that whether the data to be uploaded has a compliance risk or not is known in advance.
Specifically, the data to be uploaded refers to data acquired from each service channel that is to be uploaded to an uploading platform, and the at least two data to be uploaded may come from the same service channel or from different service channels.
In an optional implementation manner of this embodiment, since different service channels may each correspond to an interaction interface, the service data of the user may be transmitted, that is, different service channels correspond to different interaction interfaces; or the service data of each service channel can be transmitted through the same interactive interface, namely, different service channels correspond to the same interactive interface, so that the at least two data to be uploaded can be received through the same interactive interface or different interactive interfaces in the specification. In addition, the service types corresponding to the data to be uploaded acquired from different service channels may be different, and thus may need to be uploaded to different service platforms; however, it is not excluded that some service types are related and need to be uploaded to the same uploading platform, so that the at least two data to be uploaded in the present specification may be uploaded to the same uploading platform or different uploading platforms.
In this specification, at least two data to be uploaded are obtained in advance, so that the obtained data can be further analyzed later to judge whether data that should be the same are indeed the same, and thus whether the data to be uploaded carry a compliance risk is known in advance.
Step 104: and splitting the data to be uploaded to obtain a plurality of data fields aiming at each of at least two data to be uploaded, and determining a data index corresponding to each data field in the plurality of data fields.
Specifically, on the basis of obtaining at least two data to be uploaded, further, for each of the at least two data to be uploaded, the data to be uploaded is split to obtain a plurality of data fields, and a data index corresponding to each data field in the plurality of data fields is determined. The data index is the category attribute to which the data of the data field belongs. The category attribute can be a broad category to which the data belongs, such as a merchant number, a user number or fund details, or a more detailed category, such as a merchant number for region A, a user number for region B, or fund details for user A.
For example, assume that data field 1, data field 2 and data field 3 are obtained by splitting data A to be uploaded, data field 4 and data field 5 are obtained by splitting data B to be uploaded, and data field 6, data field 7 and data field 8 are obtained by splitting data C to be uploaded. It is then further determined that the data index corresponding to data field 1 is index 1, the data index corresponding to data field 2 is index 2, the data index corresponding to data field 3 is index 3, the data index corresponding to data field 4 is index 2, the data index corresponding to data field 5 is index 4, the data index corresponding to data field 6 is index 1, the data index corresponding to data field 7 is index 3, and the data index corresponding to data field 8 is index 4.
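As a purely illustrative sketch (the field contents and index names below are placeholders mirroring the example, not taken from the original), the result of this step can be thought of as a mapping from each data item to its split data fields and their data indexes:

```python
# Hypothetical result of splitting and index determination for the example above.
# Keys and values are placeholders; real fields would carry actual business data.
split_result = {
    "data_A": [  # data A to be uploaded -> data fields 1, 2, 3
        {"field": "data field 1", "index": "index 1"},
        {"field": "data field 2", "index": "index 2"},
        {"field": "data field 3", "index": "index 3"},
    ],
    "data_B": [  # data B to be uploaded -> data fields 4, 5
        {"field": "data field 4", "index": "index 2"},
        {"field": "data field 5", "index": "index 4"},
    ],
    "data_C": [  # data C to be uploaded -> data fields 6, 7, 8
        {"field": "data field 6", "index": "index 1"},
        {"field": "data field 7", "index": "index 3"},
        {"field": "data field 8", "index": "index 4"},
    ],
}
```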
In an optional implementation manner of this embodiment, the data to be uploaded may be split directly through a pre-trained preset segmentation model to obtain a plurality of data fields, and the specific implementation process may be as follows:
inputting data to be uploaded into a preset segmentation model;
and determining a plurality of data fields corresponding to the data to be uploaded based on a preset segmentation model.
It should be noted that, after at least two data to be uploaded are obtained, each data to be uploaded can be input into the pre-trained preset segmentation model, and the output of the preset segmentation model is the plurality of split data fields. Splitting the data to be uploaded through the preset segmentation model gives both high splitting efficiency and high accuracy.
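The patent does not specify the model architecture or framework; the following is a minimal sketch, assuming a hypothetical `preset_segmentation_model` callable that maps a raw data string to a list of data fields, to show only the calling convention described here:

```python
from typing import Callable, List

def split_with_model(data_to_upload: str,
                     preset_segmentation_model: Callable[[str], List[str]]) -> List[str]:
    """Split one piece of data to be uploaded into data fields.

    `preset_segmentation_model` is a hypothetical, pre-trained callable
    (e.g. a sequence-labelling model wrapped in a function).
    """
    data_fields = preset_segmentation_model(data_to_upload)
    return data_fields

# Usage with a trivial stand-in model that splits on semicolons:
fields = split_with_model("merchant no: 123; user no: 456", lambda s: s.split(";"))
```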
In specific implementation, the preset segmentation model can be obtained through training by the following method:
obtaining a data sample, wherein the data sample comprises a sample tag, and the sample tag comprises a target data field;
Inputting the data sample into an initial model to obtain a predicted data field;
and determining a loss value based on the predicted data field and the target data field, and training the initial model based on the loss value until a training stop condition is reached, so as to obtain the preset segmentation model.
The target data field refers to a plurality of data fields which are obtained by splitting a data sample, and the predicted data field refers to a plurality of data fields which are output by an initial model after the data sample is input.
Specifically, a cross-entropy loss function may be calculated based on the predicted data field and the target data field included in the sample label to generate a loss value. The sample label is the result (i.e., the plurality of data fields) that the preset segmentation model is actually expected to output, that is, the target data field included in the sample label is the actual result; the data sample is input into the initial model and the output predicted data field is the predicted result. When the difference between the predicted result and the actual result is sufficiently small, the predicted result is sufficiently close to the actual result, training of the initial model is complete, and the preset segmentation model is obtained.
In the specification, the difference between the prediction result (a plurality of data fields which are output) and the real result (a plurality of data fields which are included in a sample label) of the model can be intuitively shown through calculating the loss value, and then the initial model is pertinently trained, and parameters are adjusted, so that the model training speed and the model training effect can be effectively improved.
Wherein training the initial model based on the loss value until reaching a training stop condition may include:
Judging whether the loss value is smaller than a preset threshold value or not;
if not, returning to execute the step of acquiring the data sample, and continuing training;
if yes, determining that the training stopping condition is reached.
The preset threshold is a critical value for the loss value. When the loss value is greater than or equal to the preset threshold, there is still a certain deviation between the predicted result and the actual result of the initial model, the parameters of the initial model still need to be adjusted, and data samples of this type need to be acquired to continue training the model; when the loss value is smaller than the preset threshold, the predicted result of the initial model is sufficiently close to the actual result, and training can be stopped. The value of the preset threshold may be determined according to the actual situation, which is not limited in this specification.
In this specification, the training status of the initial model can be judged from the loss value, and when training is not yet satisfactory, the parameters of the initial model are adjusted in reverse according to the loss value so as to improve the analytical capability of the model; the training is therefore fast and the training effect is good.
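The text does not name a training framework, so the following PyTorch-style sketch is only illustrative: the model, the sample loader and the step cap are assumptions, and the same loop applies equally to the preset keyword extraction model trained later in this description. It computes a loss between the predicted and target outputs and keeps drawing samples until the loss falls below the preset threshold:

```python
import torch
import torch.nn as nn

def train_until_threshold(model: nn.Module,
                          sample_loader,           # yields (data_sample, target_label) pairs
                          preset_threshold: float,
                          lr: float = 1e-3,
                          max_steps: int = 10000) -> nn.Module:
    """Train an initial model until the loss value is below the preset threshold.

    `sample_loader` is a hypothetical iterable of (input tensor, target tensor)
    pairs; cross-entropy is used as the loss, as in the description. The
    max_steps cap is an added safety guard, not part of the original text.
    """
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    for _, (data_sample, target) in zip(range(max_steps), sample_loader):
        prediction = model(data_sample)        # predicted data fields / keywords
        loss = criterion(prediction, target)   # loss value between prediction and label

        if loss.item() < preset_threshold:     # training stop condition reached
            break

        optimizer.zero_grad()
        loss.backward()                        # adjust parameters based on the loss
        optimizer.step()

    return model
```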
In an optional implementation manner of this embodiment, the data to be uploaded can not only be split directly through the pre-trained preset segmentation model to obtain a plurality of data fields, but can also be split through semantic analysis to obtain a plurality of data fields, and the specific implementation process is as follows:
Carrying out semantic analysis on data to be uploaded, and determining a plurality of initial segmentation points;
based on a plurality of initial segmentation points, splitting the data to be uploaded into a plurality of data fields, wherein the data between each initial segmentation point and the next initial segmentation point form a data field.
In actual implementation, the initial segmentation points at the beginning and the end are handled differently from those in the middle: the first initial segmentation point has no previous segmentation point, and the last initial segmentation point has no next segmentation point. As shown in fig. 2, the data from the first data of the data to be uploaded to the first initial segmentation point (segmentation point 1) constitutes the first data field (data field 1), the data between the first initial segmentation point (segmentation point 1) and the second initial segmentation point (segmentation point 2) constitutes the second data field (data field 2), and so on, until the data between the second-to-last initial segmentation point (segmentation point 3) and the last initial segmentation point (segmentation point 4) constitutes the second-to-last data field (data field 3), and the data from the last initial segmentation point (segmentation point 4) to the end of the data to be uploaded constitutes the last data field (data field 4).
In the specification, a plurality of initial segmentation points can be determined through semantic analysis, then the data to be uploaded is split into a plurality of data fields based on the plurality of initial segmentation points, a large number of model training processes are not needed, and the efficiency of splitting the data to be uploaded is high.
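As a minimal, purely illustrative sketch of this point-based splitting (how semantic analysis produces the segmentation points is not specified, so the points below are hypothetical character offsets):

```python
from typing import List

def split_by_segmentation_points(data_to_upload: str, initial_points: List[int]) -> List[str]:
    """Split the data into fields: start..point1, point1..point2, ..., last point..end."""
    bounds = [0] + sorted(initial_points) + [len(data_to_upload)]
    return [data_to_upload[a:b] for a, b in zip(bounds[:-1], bounds[1:]) if a < b]

# Hypothetical segmentation points at character offsets 6 and 12:
print(split_by_segmentation_points("field1field2field3", [6, 12]))
# -> ['field1', 'field2', 'field3']
```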
In an optional implementation manner of this embodiment, the data index corresponding to each data field in the plurality of data fields may be determined based on the keywords included in the data fields, and the specific implementation process may be as follows:
for each of the plurality of data fields, extracting a target keyword of the data field through a preset keyword extraction model;
and determining the data index corresponding to the target keyword according to the corresponding relation between the prestored keyword and the index.
In practical applications, each data field generally includes, in addition to the data itself, attribute parameters corresponding to the data. A data field can therefore be input into the keyword extraction model to obtain the target keyword of the data field; that is, the keyword extraction model extracts a keyword that can indicate the data index corresponding to the data of the data field, and the indicated data index can then be further determined according to that keyword.
Following the above example, assuming that data field 1, data field 2 and data field 3 are obtained by splitting data A to be uploaded, the three data fields are input in turn into the trained keyword extraction model to obtain target keywords keyword 1, keyword 2 and keyword 3, respectively, and the corresponding data indexes are determined to be index 1, index 2 and index 3 according to the pre-stored correspondence between keywords and indexes.
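A minimal sketch of this lookup step (the extraction-model interface, keyword strings and index names are assumptions for illustration; the real keyword-to-index correspondence would be pre-stored as described):

```python
from typing import Callable, Dict, Optional

# Hypothetical pre-stored correspondence between keywords and data indexes.
KEYWORD_TO_INDEX: Dict[str, str] = {
    "keyword 1": "index 1",
    "keyword 2": "index 2",
    "keyword 3": "index 3",
}

def determine_data_index(data_field: str,
                         keyword_model: Callable[[str], str]) -> Optional[str]:
    """Extract the target keyword of a data field and look up its data index."""
    target_keyword = keyword_model(data_field)    # preset keyword extraction model
    return KEYWORD_TO_INDEX.get(target_keyword)   # None if no pre-stored mapping exists
```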
In specific implementation, the preset keyword extraction model can be obtained through training by the following method:
acquiring a data sample, wherein the data sample comprises a sample tag, and the sample tag comprises a target keyword;
inputting the data sample into an initial model to obtain a predicted keyword;
And determining a loss value based on the predicted keywords and the target keywords, and training the initial model based on the loss value until a training stop condition is reached, so as to obtain a preset keyword extraction model.
Specifically, a cross-entropy loss function may be calculated based on the predicted keyword and the target keyword included in the sample label to generate a loss value. The sample label is the result that the keyword extraction model is actually expected to output, that is, the target keyword included in the sample label is the real result; the data sample is input into the initial model and the output predicted keyword is the predicted result. When the difference between the predicted result and the real result is sufficiently small, the predicted result is sufficiently close to the real result, training of the initial model is complete, and the keyword extraction model is obtained.
Wherein training the initial model based on the loss value until reaching a training stop condition, comprises:
Judging whether the loss value is smaller than a preset threshold value or not;
if not, returning to the step of acquiring the data sample, and continuing training;
If yes, determining that the training stop condition is reached.
The preset threshold is a critical value for the loss value. When the loss value is greater than or equal to the preset threshold, there is still a certain deviation between the predicted result and the real result of the initial model, the parameters of the initial model still need to be adjusted, and data samples of this type need to be acquired to continue training the model; when the loss value is smaller than the preset threshold, the predicted result of the initial model is sufficiently close to the real result, and training can be stopped. The value of the preset threshold may be determined according to the actual situation, which is not limited in this specification.
Further, after determining the data index corresponding to each data field in the plurality of data fields, the data to be uploaded may be stored according to the data index, and the specific implementation process is as follows:
According to the data index, the data of the data field corresponding to the data index and the target uploading information are stored in correspondence, wherein the target uploading information is the uploading information of the data to be uploaded to which the data field corresponding to the data index belongs, and the uploading information comprises an uploading platform, uploading time and/or uploading party information.
It should be noted that any data to be uploaded that is to be self-checked by the data uploading method provided by this specification is stored after it is received, and it is stored according to the data index, so as to record the uploading information of each uploading scene. When the data of the data field corresponding to a data index and the target uploading information are stored in correspondence according to the data index, the data can be stored in a one-dimensional manner, that is, for a given data index, all corresponding data are simply stored in sequence; the data fields can also be stored in a multidimensional manner, for example a two-dimensional manner, with the data indexes as rows and the acquisition time as columns. The specific storage manner is not limited in this specification, as long as the determined data fields can be stored separately by index according to their data indexes.
Following the above example, it is assumed that data A to be uploaded is data collected by user X on January 1 and is to be uploaded to uploading platform P, data B to be uploaded is data collected by user Y on January 2 and is to be uploaded to uploading platform Q, and data C to be uploaded is data collected by user Z on January 3 and is to be uploaded to uploading platform W. In this case, the data may be stored in a one-dimensional manner as shown in Table 1 below, or in a two-dimensional manner as shown in Table 2 below.
Table 1: Data to be uploaded stored in a one-dimensional manner
Table 2: Data to be uploaded stored in a two-dimensional manner
It should be noted that, in this specification, after the data to be uploaded is received, its data fields may be stored separately according to their data indexes, so as to record the uploading information of each uploading scene and make it convenient for staff to query. When a staff member needs to upload new data, the relevant information of historically uploaded data can be queried to learn the rules, processes and other aspects of previous uploads, which facilitates the subsequent uploading of new data. Staff can check historical uploading information at a centralized storage location for the data to be uploaded, which improves the security of the historical data; in addition, for a sensitive index, the historical uploading information of that index can be obtained simply and conveniently, ensuring that the sensitive index is uploaded normally.
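A minimal sketch of the index-keyed storage, using the one-dimensional layout described above (record contents mirror the example but are placeholders; a real system could equally use a table keyed by index and acquisition time for the two-dimensional layout):

```python
from collections import defaultdict
from typing import Dict, List

# One-dimensional storage: for each data index, records are appended in sequence.
# Each record pairs the field data with its target uploading information.
storage: Dict[str, List[dict]] = defaultdict(list)

def store_field(data_index: str, field_data: str,
                upload_platform: str, upload_time: str, uploader: str) -> None:
    storage[data_index].append({
        "data": field_data,
        "upload_platform": upload_platform,  # target uploading platform
        "upload_time": upload_time,          # uploading time
        "uploader": uploader,                # uploading party information
    })

# Illustrative entries mirroring the example above (field contents are placeholders):
store_field("index 1", "data field 1", "platform P", "January 1", "user X")
store_field("index 1", "data field 6", "platform W", "January 3", "user Z")
```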
Step 106: for the target data index, under the condition that the data of the data field corresponding to the target data index is acquired under the same acquisition condition, judging whether the data of the data field corresponding to the target data index is the same, and if so, determining that the data field corresponding to the target data index passes the audit.
Specifically, on the basis of determining the data index corresponding to each data field in the plurality of data fields, further, for the target data index, if the data of the data fields corresponding to the target data index are acquired under the same acquisition condition, it is judged whether the data of the data fields corresponding to the target data index are the same, and if so, it is determined that the data fields corresponding to the target data index pass the audit. The target data index is any one of the determined data indexes; after the data index corresponding to each data field is determined, each data index is taken in turn as the target data index, and it is judged whether the data of its corresponding data fields are the same. The acquisition condition refers to the acquisition scene when the data to be uploaded is acquired, that is, conditions of the acquisition scene such as time, place and attributes.
In an optional implementation manner of this embodiment, for the same data index (i.e., the target data index), if the data of the data fields corresponding to the data index are acquired under the same acquisition condition, the data of each data field corresponding to the data index should be the same; if they are the same, the data of each data field corresponding to the data index are compliant, and each data field can pass the audit. If they are different, erroneous data may exist among the data fields, so it is determined that these data fields do not pass the audit, and the data should first be screened and processed and only then uploaded. The specific implementation process may be as follows:
If the data fields are different, determining that the data fields corresponding to the target data indexes do not pass the audit;
determining the data to be uploaded corresponding to the data field which does not pass the audit as the data to be processed;
And uploading the processed data to a target uploading platform under the condition that the data to be processed is processed.
It should be noted that, after the data to be uploaded corresponding to a data field that does not pass the audit is determined to be data to be processed, a risk reminder may be returned to the staff; the staff may then follow up on the data to be processed, either correcting it or confirming that it has no problem. When the processed data, or an instruction confirming the data to be processed, is received, it is determined that the data to be processed has been processed, and the processed data is uploaded to the target uploading platform.
Step 108: and if a plurality of data fields of the data to be uploaded pass the audit aiming at any data to be uploaded, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform.
It should be noted that any data to be uploaded may include a plurality of data fields, and as long as any one of the data fields fails to pass the audit, it is determined that the data to be uploaded fails to pass the audit, and cannot be directly uploaded to the target uploading platform, and the data to be uploaded needs to be processed first; only if all data fields included in the data to be uploaded pass the audit, the data to be uploaded is determined to pass the audit and can be directly uploaded to the target uploading platform. The target uploading platform refers to a platform for uploading data to be uploaded.
Following the above example, it is assumed that the data of the same index are all acquired under the same acquisition condition, that is, as long as the indexes are the same, the data of the corresponding data fields should be the same. Since data field 1 and data field 6 correspond to index 1, data field 2 and data field 4 correspond to index 2, data field 3 and data field 7 correspond to index 3, and data field 5 and data field 8 correspond to index 4, it is necessary to judge whether the data of data field 1 and data field 6 are the same, whether the data of data field 2 and data field 4 are the same, whether the data of data field 3 and data field 7 are the same, and whether the data of data field 5 and data field 8 are the same. Assuming that the data of data field 1 and data field 6, the data of data field 2 and data field 4, and the data of data field 5 and data field 8 are the same, while the data of data field 3 and data field 7 are different, then data field 1, data field 2, data field 4, data field 5, data field 6 and data field 8 pass the audit, and data field 3 and data field 7 do not pass the audit. In this case, data field 3 included in data A to be uploaded does not pass the audit, so data A to be uploaded is determined to be data to be processed and can be uploaded to platform P after processing is finished; data field 4 and data field 5 included in data B to be uploaded both pass the audit, so data B to be uploaded is determined to pass the audit and can be uploaded directly to platform Q; data field 7 included in data C to be uploaded does not pass the audit, so data C to be uploaded is determined to be data to be processed and can be uploaded to platform W after processing is finished.
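The following self-contained sketch walks through steps 106 and 108 on the worked example above (record shapes, field values and the acquisition-condition label are illustrative assumptions; fields 3 and 7 are deliberately given differing values so that, as in the example, data A and data C become data to be processed while data B can be uploaded):

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def audit_fields(fields: List[dict]) -> Dict[int, bool]:
    """Step 106: fields sharing a data index and an acquisition condition must hold identical data."""
    groups: Dict[Tuple[str, str], List[dict]] = defaultdict(list)
    for f in fields:
        groups[(f["index"], f["condition"])].append(f)
    passed: Dict[int, bool] = {}
    for group in groups.values():
        same = len({f["data"] for f in group}) == 1   # all data in the group are the same
        for f in group:
            passed[f["id"]] = same
    return passed

fields = [
    {"id": 1, "owner": "data A", "index": "index 1", "condition": "c1", "data": "v1"},
    {"id": 2, "owner": "data A", "index": "index 2", "condition": "c1", "data": "v2"},
    {"id": 3, "owner": "data A", "index": "index 3", "condition": "c1", "data": "v3"},
    {"id": 4, "owner": "data B", "index": "index 2", "condition": "c1", "data": "v2"},
    {"id": 5, "owner": "data B", "index": "index 4", "condition": "c1", "data": "v4"},
    {"id": 6, "owner": "data C", "index": "index 1", "condition": "c1", "data": "v1"},
    {"id": 7, "owner": "data C", "index": "index 3", "condition": "c1", "data": "v3-mismatch"},
    {"id": 8, "owner": "data C", "index": "index 4", "condition": "c1", "data": "v4"},
]

passed = audit_fields(fields)

# Step 108: a data item is uploaded only when every one of its data fields passed the audit.
for owner, platform in (("data A", "platform P"), ("data B", "platform Q"), ("data C", "platform W")):
    own = [f for f in fields if f["owner"] == owner]
    if all(passed[f["id"]] for f in own):
        print(f"{owner}: passes the audit, upload to {platform}")
    else:
        print(f"{owner}: data to be processed, upload to {platform} after processing")
```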
According to the data uploading method provided by the specification, after at least two data to be uploaded are obtained, each of the at least two data to be uploaded is split to obtain a plurality of data fields, and a data index corresponding to each data field in the plurality of data fields is determined; for a target data index, in a case that the data of the data fields corresponding to the target data index are acquired under the same acquisition condition, it is judged whether the data of the data fields corresponding to the target data index are the same, and if so, it is determined that the data fields corresponding to the target data index pass the audit; and for any data to be uploaded, if the plurality of data fields of the data to be uploaded all pass the audit, it is determined that the data to be uploaded passes the audit, and the data to be uploaded is uploaded to a target uploading platform. In this way, the data to be uploaded can be split into a plurality of data fields, each corresponding to a data index. For the same data index, if the data of the corresponding data fields are acquired under the same acquisition condition and are identical, the data fields corresponding to that data index pass the audit and carry no compliance risk; if they differ, the data may contain errors and a compliance risk exists. For any data to be uploaded, if every data field it includes passes the audit, there is no compliance risk and the data to be uploaded can then be uploaded to the uploading platform. The compliance risk can thus be known in advance, before the data to be uploaded is uploaded to the uploading platform, and the compliance of the data is ensured.
Fig. 3 shows a flowchart of another data uploading method according to an embodiment of the present disclosure, which specifically includes the following steps:
Step 302: and receiving at least two pieces of data to be uploaded through the same interactive interface or different interactive interfaces, wherein the at least two pieces of data to be uploaded can be uploaded to the same uploading platform or different uploading platforms.
Step 304: for each of at least two data to be uploaded, carrying out semantic analysis on the data to be uploaded, determining a plurality of initial segmentation points, and splitting the data to be uploaded into a plurality of data fields based on the initial segmentation points, wherein the data between each initial segmentation point and the next initial segmentation point form a data field.
Step 306: for each data field in the plurality of data fields, extracting target keywords of the data field through a preset keyword extraction model, and then determining data indexes corresponding to the target keywords according to the corresponding relation between the prestored keywords and the indexes.
Step 308: for the target data index, under the condition that the data of the data field corresponding to the target data index is acquired under the same acquisition condition, judging whether the data of the data field corresponding to the target data index is the same, if so, determining that the data field corresponding to the target data index passes the audit, and if not, determining that the data field corresponding to the target data index does not pass the audit.
Step 310: for any data to be uploaded, if a plurality of data fields of the data to be uploaded pass the audit, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform; if any data field of the data to be uploaded does not pass the audit, determining that the data to be uploaded does not pass the audit, and uploading the data to the target uploading platform under the condition that the data to be uploaded which does not pass the audit is processed.
According to the data uploading method provided by the specification, data to be uploaded can be split into a plurality of data fields, each corresponding to a data index. For the same data index, if the data of the corresponding data fields are acquired under the same acquisition condition and are identical, the data fields corresponding to that data index pass the audit and carry no compliance risk; if they differ, the data may contain errors and a compliance risk exists. For any data to be uploaded, if every data field it includes passes the audit, there is no compliance risk and the data to be uploaded can then be uploaded to the uploading platform. The compliance risk can thus be known in advance, before the data to be uploaded is uploaded to the uploading platform, and the compliance of the data is ensured.
Corresponding to the method embodiment, the present disclosure further provides an embodiment of a data uploading device, and fig. 4 shows a schematic structural diagram of the data uploading device according to an embodiment of the present disclosure. As shown in fig. 4, the apparatus includes:
an acquisition module 402 configured to acquire at least two data to be uploaded;
A first determining module 404, configured to split, for each of the at least two data to be uploaded, the data to be uploaded to obtain a plurality of data fields, and determine a data index corresponding to each of the plurality of data fields;
A second determining module 406, configured to determine, for a target data indicator, if data of a data field corresponding to the target data indicator is acquired under the same acquisition condition, whether the data of the data field corresponding to the target data indicator is the same, and if so, determine that the data field corresponding to the target data indicator passes the audit;
The uploading module 408 is configured to, for any one of the data to be uploaded, determine that the data to be uploaded passes the audit if the multiple data fields of the data to be uploaded all pass the audit, and upload the data to be uploaded to the target uploading platform.
In an optional implementation manner of this embodiment, the apparatus further includes:
The storage module is configured to correspondingly store data of a data field corresponding to the data index and target uploading information according to the data index, wherein the target uploading information refers to uploading information of data to be uploaded, to which the data field corresponding to the data index belongs, and the uploading information comprises an uploading platform, uploading time and/or uploading party information.
In an optional implementation manner of this embodiment, the at least two data to be uploaded are received through the same interaction interface or different interaction interfaces, and the at least two data to be uploaded are uploaded to the same uploading platform or different uploading platforms.
In an alternative implementation of this embodiment, the second determining module 406 is further configured to:
if the data fields are different, determining that the data fields corresponding to the target data indexes do not pass the auditing;
Determining the data to be uploaded corresponding to the data field which does not pass the audit as data to be processed;
and uploading the processed data to a target uploading platform under the condition that the data to be processed is processed.
In an alternative implementation of this embodiment, the first determining module 404 is further configured to:
inputting the data to be uploaded into a preset segmentation model;
and determining a plurality of data fields corresponding to the data to be uploaded based on the preset segmentation model.
In an alternative implementation of this embodiment, the first determining module 404 is further configured to:
carrying out semantic analysis on the data to be uploaded, and determining a plurality of initial segmentation points;
And splitting the data to be uploaded into a plurality of data fields based on the plurality of starting segmentation points, wherein the data between each starting segmentation point and the next starting segmentation point form a data field.
In an alternative implementation of this embodiment, the first determining module 404 is further configured to:
extracting target keywords of each data field in the plurality of data fields through a preset keyword extraction model;
And determining the data index corresponding to the target keyword according to the corresponding relation between the prestored keyword and the index.
In an optional implementation manner of this embodiment, the preset keyword extraction model is obtained through training by the following method:
acquiring a data sample, wherein the data sample comprises a sample tag, and the sample tag comprises a target keyword;
Inputting the data sample into an initial model to obtain a predicted keyword;
and determining a loss value based on the predicted keywords and the target keywords, and training the initial model based on the loss value until a training stopping condition is reached, so as to obtain the preset keyword extraction model.
In an alternative implementation of this embodiment, the first determining module 404 is further configured to:
Judging whether the loss value is smaller than a preset threshold value or not;
if not, returning to execute the step of acquiring the data sample, and continuing training;
if yes, determining that the training stopping condition is reached.
The data uploading device provided by the specification can split data to be uploaded into a plurality of data fields, each corresponding to a data index. For the same data index, if the data of the corresponding data fields are acquired under the same acquisition condition and are identical, the data fields corresponding to that data index pass the audit and carry no compliance risk; if they differ, the data may contain errors and a compliance risk exists. For any data to be uploaded, if every data field it includes passes the audit, there is no compliance risk and the data to be uploaded can then be uploaded to the uploading platform. The compliance risk can thus be known in advance, before the data to be uploaded is uploaded to the uploading platform, and the compliance of the data is ensured.
The foregoing is a schematic scheme of a data uploading device in this embodiment. It should be noted that, the technical solution of the data uploading device and the technical solution of the data uploading method belong to the same concept, and details of the technical solution of the data uploading device, which are not described in detail, can be referred to the description of the technical solution of the data uploading method.
Fig. 5 illustrates a block diagram of a computing device 500 provided in accordance with an embodiment of the present specification. The components of the computing device 500 include, but are not limited to, a memory 510 and a processor 520. Processor 520 is coupled to memory 510 via bus 530 and database 550 is used to hold data.
Computing device 500 also includes access device 540, access device 540 enabling computing device 500 to communicate via one or more networks 560. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. The access device 540 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 500, as well as other components not shown in FIG. 5, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device shown in FIG. 5 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 500 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 500 may also be a mobile or stationary server.
The processor 520 is configured to execute the following computer-executable instructions:
acquiring at least two pieces of data to be uploaded;
for each piece of data to be uploaded among the at least two pieces of data to be uploaded, splitting the data to be uploaded to obtain a plurality of data fields, and determining a data index corresponding to each of the plurality of data fields;
for a target data index, where the data of the data fields corresponding to the target data index is acquired under the same acquisition condition, determining whether that data is identical, and if so, determining that the data fields corresponding to the target data index pass the audit;
and for any data to be uploaded, if the plurality of data fields of the data to be uploaded all pass the audit, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform.
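The following sketch wires those instructions together end to end for two records: each record is split into data fields, each field is mapped to a data index through a pre-stored keyword-to-index table, the fields are audited per index, and a record is uploaded only when all of its fields pass. The delimiter-based splitting rule, the keyword table and the upload stub are illustrative assumptions rather than the mechanisms the specification actually describes (which allow, for example, a segmentation model or semantic analysis for splitting).

    from collections import defaultdict

    # Hypothetical pre-stored correspondence between keywords and data indexes.
    KEYWORD_TO_INDEX = {"amount": "transaction_amount", "region": "user_region"}

    def split_into_fields(record):
        # Illustrative splitting rule: split on ';' instead of a learned model.
        return [part.strip() for part in record.split(";") if part.strip()]

    def index_of(field):
        # Determine the data index of a field via the keyword-to-index table.
        for keyword, index in KEYWORD_TO_INDEX.items():
            if keyword in field:
                return index
        return "unknown"

    def upload(record, platform="target_platform"):
        print(f"uploaded to {platform}: {record!r}")

    records = [
        "amount=100.00; region=Hangzhou",  # first piece of data to be uploaded
        "region=Hangzhou; amount=100.00",  # second piece, same acquisition condition
    ]

    # Collect field values per data index; identical values mean the index passes.
    values_by_index = defaultdict(set)
    split_records = []
    for record in records:
        fields = split_into_fields(record)
        split_records.append((record, fields))
        for field in fields:
            values_by_index[index_of(field)].add(field)

    passing_indexes = {idx for idx, vals in values_by_index.items() if len(vals) == 1}

    for record, fields in split_records:
        if all(index_of(field) in passing_indexes for field in fields):
            upload(record)  # every data field of this record passed the audit

If any field value disagreed across the records, its data index would fail the audit, and every record containing a field under that index would be held back instead of being uploaded.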
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the data uploading method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the data uploading method.
An embodiment of the present specification also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the following method:
acquiring at least two pieces of data to be uploaded;
for each piece of data to be uploaded among the at least two pieces of data to be uploaded, splitting the data to be uploaded to obtain a plurality of data fields, and determining a data index corresponding to each of the plurality of data fields;
for a target data index, where the data of the data fields corresponding to the target data index is acquired under the same acquisition condition, determining whether that data is identical, and if so, determining that the data fields corresponding to the target data index pass the audit;
and for any data to be uploaded, if the plurality of data fields of the data to be uploaded all pass the audit, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the data uploading method belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the data uploading method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content carried by the computer readable medium may be added or removed as appropriate according to the requirements of legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for the sake of simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present description is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present description. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all necessary in the specification.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the disclosure and the practical application, to thereby enable others skilled in the art to best understand and utilize the disclosure. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (11)

1. A method of data upload, the method comprising:
acquiring at least two pieces of data to be uploaded, wherein the data to be uploaded refers to data, acquired from each service channel, that is to be uploaded to an uploading platform;
for each piece of data to be uploaded among the at least two pieces of data to be uploaded, splitting the data to be uploaded to obtain a plurality of data fields, and determining a data index corresponding to each of the plurality of data fields;
for a target data index, where the data of the data fields corresponding to the target data index is acquired under the same acquisition condition, determining whether that data is identical, and if so, determining that the data fields corresponding to the target data index pass the audit;
for any data to be uploaded, if the plurality of data fields of the data to be uploaded all pass the audit, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform;
Wherein the determining the data index corresponding to each of the plurality of data fields includes:
extracting target keywords of each data field in the plurality of data fields through a preset keyword extraction model;
and determining a data index corresponding to the target keyword according to a corresponding relation between the prestored keyword and the index, wherein the data index is a category attribute to which the data corresponding to the data field belongs.
2. The data uploading method according to claim 1, wherein after determining the data index corresponding to each of the plurality of data fields, the method further comprises:
correspondingly storing, according to the data index, the data of the data field corresponding to the data index together with target uploading information, wherein the target uploading information refers to the uploading information of the data to be uploaded to which the data field corresponding to the data index belongs, and the uploading information comprises an uploading platform, an uploading time and/or uploading party information.
3. The data uploading method according to claim 1, wherein the at least two pieces of data to be uploaded are received through the same interactive interface or through different interactive interfaces, and the at least two pieces of data to be uploaded are uploaded to the same uploading platform or to different uploading platforms.
4. The data uploading method according to claim 1, wherein after determining whether the data of the data fields corresponding to the target data index is identical, the method further comprises:
if the data is not identical, determining that the data fields corresponding to the target data index do not pass the audit;
determining the data to be uploaded to which the data fields that do not pass the audit belong as data to be processed;
and uploading the processed data to a target uploading platform after the data to be processed has been processed.
5. The data uploading method as claimed in claim 1, wherein the splitting the data to be uploaded to obtain a plurality of data fields comprises:
inputting the data to be uploaded into a preset segmentation model;
and determining a plurality of data fields corresponding to the data to be uploaded based on the preset segmentation model.
6. The data uploading method as claimed in claim 1, wherein the splitting the data to be uploaded to obtain a plurality of data fields comprises:
carrying out semantic analysis on the data to be uploaded, and determining a plurality of initial segmentation points;
and splitting the data to be uploaded into a plurality of data fields based on the plurality of initial segmentation points, wherein the data between each initial segmentation point and the next initial segmentation point forms a data field.
7. The data uploading method according to claim 1, wherein the preset keyword extraction model is trained in the following manner:
acquiring a data sample, wherein the data sample comprises a sample tag, and the sample tag comprises a target keyword;
Inputting the data sample into an initial model to obtain a predicted keyword;
and determining a loss value based on the predicted keywords and the target keywords, and training the initial model based on the loss value until a training stopping condition is reached, so as to obtain the preset keyword extraction model.
8. The data uploading method according to claim 7, wherein the training the initial model based on the loss value until a training stop condition is reached comprises:
Judging whether the loss value is smaller than a preset threshold value or not;
if not, returning to execute the step of acquiring the data sample, and continuing training;
if yes, determining that the training stopping condition is reached.
9. A data uploading apparatus, the apparatus comprising:
an acquisition module configured to acquire at least two pieces of data to be uploaded, wherein the data to be uploaded refers to data, acquired from each service channel, that is to be uploaded to an uploading platform;
a first determining module configured to, for each piece of data to be uploaded among the at least two pieces of data to be uploaded, split the data to be uploaded to obtain a plurality of data fields, and determine a data index corresponding to each of the plurality of data fields;
a second determining module configured to, for a target data index, where the data of the data fields corresponding to the target data index is acquired under the same acquisition condition, determine whether that data is identical, and if so, determine that the data fields corresponding to the target data index pass the audit;
an uploading module configured to, for any data to be uploaded, determine that the data to be uploaded passes the audit if the plurality of data fields of the data to be uploaded all pass the audit, and upload the data to be uploaded to a target uploading platform;
wherein the first determining module is further configured to:
extracting target keywords of each data field in the plurality of data fields through a preset keyword extraction model;
and determining a data index corresponding to the target keyword according to a corresponding relation between the prestored keyword and the index, wherein the data index is a category attribute to which the data corresponding to the data field belongs.
10. A computing device, comprising:
A memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to implement the following method:
acquiring at least two pieces of data to be uploaded, wherein the data to be uploaded refers to data, acquired from each service channel, that is to be uploaded to an uploading platform;
for each piece of data to be uploaded among the at least two pieces of data to be uploaded, splitting the data to be uploaded to obtain a plurality of data fields, and determining a data index corresponding to each of the plurality of data fields;
for a target data index, where the data of the data fields corresponding to the target data index is acquired under the same acquisition condition, determining whether that data is identical, and if so, determining that the data fields corresponding to the target data index pass the audit;
for any data to be uploaded, if the plurality of data fields of the data to be uploaded all pass the audit, determining that the data to be uploaded passes the audit, and uploading the data to be uploaded to a target uploading platform;
Wherein the determining the data index corresponding to each of the plurality of data fields includes:
extracting target keywords of each data field in the plurality of data fields through a preset keyword extraction model;
and determining a data index corresponding to the target keyword according to a corresponding relation between the prestored keyword and the index, wherein the data index is a category attribute to which the data corresponding to the data field belongs.
11. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the data uploading method of any one of claims 1 to 8.
CN202010963165.4A 2020-09-14 2020-09-14 Data uploading method and device Active CN112015870B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010963165.4A CN112015870B (en) 2020-09-14 2020-09-14 Data uploading method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010963165.4A CN112015870B (en) 2020-09-14 2020-09-14 Data uploading method and device

Publications (2)

Publication Number Publication Date
CN112015870A CN112015870A (en) 2020-12-01
CN112015870B true CN112015870B (en) 2024-06-07

Family

ID=73522820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010963165.4A Active CN112015870B (en) 2020-09-14 2020-09-14 Data uploading method and device

Country Status (1)

Country Link
CN (1) CN112015870B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113553281A (en) * 2021-07-23 2021-10-26 深圳市警威警用装备有限公司 Data acquisition method and device and 5G intelligent data acquisition workstation

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017101606A1 (en) * 2015-12-15 2017-06-22 北京奇虎科技有限公司 System and method for collecting and analyzing data
CN108062367A (en) * 2017-12-08 2018-05-22 平安科技(深圳)有限公司 The method for uploading and its terminal of a kind of data list
CN108629699A (en) * 2018-05-15 2018-10-09 中国平安人寿保险股份有限公司 Data uploading method, data uploading device, storage medium and device
CN109800225A (en) * 2018-12-24 2019-05-24 北京奇艺世纪科技有限公司 Acquisition methods, device, server and the computer readable storage medium of operational indicator
CN110175812A (en) * 2019-04-24 2019-08-27 平安科技(深圳)有限公司 Monitoring data transmission method, apparatus, computer equipment and storage medium
CN110225095A (en) * 2019-05-20 2019-09-10 中国银行股份有限公司 A kind of data processing method, apparatus and system
CN111343146A (en) * 2020-02-04 2020-06-26 北京字节跳动网络技术有限公司 Data auditing method, system, computer readable medium and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Exploration of intelligent management practice for water conservancy contracts in the big data era; Li Xia; Cheng Jianguo; Zhang Zhiyuan; 水利信息化 (Water Resources Informatization); 2019-12-31 (No. 06); full text *

Also Published As

Publication number Publication date
CN112015870A (en) 2020-12-01

Similar Documents

Publication Publication Date Title
CN110221965B (en) Test case generation method, test case generation device, test case testing method, test case testing device, test equipment and test system
US10402402B2 (en) Method, device, server and storage apparatus of reviewing SQL
US11244232B2 (en) Feature relationship recommendation method, apparatus, computing device, and storage medium
CN109447156B (en) Method and apparatus for generating a model
CN110020009B (en) Online question and answer method, device and system
US11403303B2 (en) Method and device for generating ranking model
US20120330662A1 (en) Input supporting system, method and program
CN107832291B (en) Man-machine cooperation customer service method, electronic device and storage medium
US20190163699A1 (en) Method and apparatus for information interaction
CN110659318A (en) Big data based strategy pushing method and system and computer equipment
CN112015747B (en) Data uploading method and device
CN112116168B (en) User behavior prediction method and device and electronic equipment
CN112085087A (en) Method and device for generating business rules, computer equipment and storage medium
CN111626813A (en) Product recommendation method and system
CN112015870B (en) Data uploading method and device
CN113836360B (en) Data detection method and device
CN111241153A (en) Enterprise natural person entity comprehensive judgment alignment method and system
CN116340831B (en) Information classification method and device, electronic equipment and storage medium
CN114550157A (en) Bullet screen gathering identification method and device
CN116415548A (en) Training method and device for label prediction model
CN114048148A (en) Crowdsourcing test report recommendation method and device and electronic equipment
CN110852854B (en) Method for generating quantitative gain model and method for evaluating risk control strategy
CN114266239A (en) Data set generation method and device
CN112115237A (en) Method and device for constructing tobacco scientific and technical literature data recommendation model
CN113536672B (en) Target object processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant