CN114896236A

CN114896236A - Big data denoising optimization method and big data system applying artificial intelligence analysis

Info

Publication number: CN114896236A
Application number: CN202210572020.0A
Authority: CN
Inventors: 杨文宝; 袁建超
Original assignee: Zhongwei Haoke Electronic Technology Co Ltd
Current assignee: Shandong Inspur New Century Technology Co Ltd; Shandong Inspur Smart Cultural Tourism Industry Development Co Ltd
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2022-08-12
Anticipated expiration: 2042-05-25
Also published as: CN114896236B

Abstract

The embodiment of the invention provides a big data acquisition optimization method and a big data system applying artificial intelligence analysis. A first noise characteristic point map, generated by mining noise characteristic points from a target big data acquisition log of a big data acquisition process, is extracted from a big data acquisition service database, and the big data acquisition process is updated based on the first noise characteristic point map. When the first noise characteristic point map needs to be adjusted, a target noise characteristic point sequence associated with a service updating activity corresponding to a preset service online time period, together with the noise characteristic point relation of the target noise characteristic point sequence corresponding to the service updating activity, is obtained from the first noise characteristic point map, and a corresponding second noise characteristic point map is reconstructed. The traversal big data acquisition process is then updated based on the second noise characteristic point map. The noise characteristic point map is thereby continuously reconstructed and optimized, so that the accuracy of subsequent big data sample acquisition can be improved.

Description

Big data denoising optimization method and big data system applying artificial intelligence analysis

Technical Field

The invention relates to the technical field of big data acquisition, in particular to a big data denoising optimization method and a big data system applying artificial intelligence analysis.

Background

In the related art, a large number of data samples on user behavior during the use of internet services can be collected to provide an artificial intelligence model with abundant learnable features, and the trained artificial intelligence model is then used to predict various user characteristics, such as user interest, user intention, user preference and user demand. It should be noted that the training effect of the artificial intelligence model also depends on the feature accuracy of the behavior big data samples, that is, on the amount of noise in those samples: high feature accuracy can be ensured only when the amount of noise is small. Therefore, acquisition noise needs to be strictly controlled during the acquisition phase of big data samples. However, in the related art, big data acquisition optimization is deployed directly after a preliminary noise analysis of one specific service section. This approach cannot adequately ensure the accuracy of big data sample acquisition and therefore also affects the accuracy of subsequent artificial intelligence model training.

Disclosure of Invention

In order to overcome at least the above disadvantages in the prior art, the present invention provides a big data denoising optimization method and a big data system using artificial intelligence analysis.

In a first aspect, the present application provides a big data acquisition optimization method applying artificial intelligence analysis, which is applied to a big data system, where the big data system performs data interaction with a plurality of big data acquisition processes, and the method includes:

extracting, from a big data acquisition service database, a first noise characteristic point map generated by mining noise characteristic points from a target big data acquisition log of a big data acquisition process, and, when it is determined after the big data acquisition process is updated based on the first noise characteristic point map that the first noise characteristic point map needs to be adjusted, acquiring from the first noise characteristic point map a target noise characteristic point sequence associated with a service updating activity corresponding to a preset service online time period and a noise characteristic point relation of the target noise characteristic point sequence corresponding to the service updating activity;

reconstructing a corresponding second noise characteristic point map based on a target noise characteristic point sequence and a noise characteristic point relation of the target noise characteristic point sequence corresponding to the service updating activity;

and updating a traversal big data acquisition process based on the second noise characteristic point map.
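As a non-limiting illustration, the reconstruction step above can be sketched as restricting the first map to the target noise characteristic points and to the relations among them. The `NoiseMap` class, its two fields, and the restriction rule itself are assumptions made for this sketch; the application does not specify a data structure or a reconstruction rule.

```python
from dataclasses import dataclass, field


@dataclass
class NoiseMap:
    """Hypothetical noise characteristic point map: points plus relations."""
    points: set = field(default_factory=set)
    # Each relation is an unordered pair of point names, stored as a frozenset.
    relations: set = field(default_factory=set)


def reconstruct_map(first_map, target_points):
    """Build a second map restricted to the target points (assumed rule)."""
    kept_points = set(target_points) & first_map.points
    # Keep a relation only when both of its endpoints are target points.
    kept_relations = {rel for rel in first_map.relations if rel <= kept_points}
    return NoiseMap(kept_points, kept_relations)


first = NoiseMap(
    points={"p1", "p2", "p3"},
    relations={frozenset({"p1", "p2"}), frozenset({"p2", "p3"})},
)
second = reconstruct_map(first, ["p1", "p2"])
```

Here `second` keeps the points `p1` and `p2` and the single relation between them; the relation involving `p3` is dropped because `p3` is not a target point.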

In a possible implementation manner of the first aspect, the step of extracting, from a big data acquisition service database, a first noise characteristic point map generated by mining noise characteristic points from a target big data acquisition log of a big data acquisition process, and, when it is determined after the big data acquisition process is updated based on the first noise characteristic point map that the first noise characteristic point map needs to be adjusted, acquiring from the first noise characteristic point map a target noise characteristic point sequence associated with a service updating activity corresponding to a preset service online time period and a noise characteristic point relation of the target noise characteristic point sequence corresponding to the service updating activity, includes:

extracting a first noise characteristic point map generated by mining noise characteristic points of a target big data acquisition log of a big data acquisition process from a big data acquisition service database, configuring target big data acquisition process updating information for the big data acquisition process according to the first noise characteristic point map, and monitoring big data acquisition updating deployment data of the big data acquisition process, wherein the big data acquisition updating deployment data are big data acquisition logs generated by traversing big data acquisition after big data acquisition process updating is performed according to the big data acquisition process updating information;

acquiring an estimated big data acquisition conclusion index of the big data acquisition process updating information and a real big data acquisition conclusion index of the big data acquisition updating deployment data;

outputting noise mining effect indexes between the big data acquisition process updating information and the big data acquisition updating deployment data according to the estimated big data acquisition conclusion indexes and the real big data acquisition conclusion indexes;

if analysis shows that the noise mining effect index is not smaller than a first effect index, loading the target big data acquisition process updating information to a sample big data acquisition activity;

and if analysis shows that the noise mining effect index is smaller than the first effect index, determining that the first noise characteristic point map needs to be adjusted, and acquiring, from the first noise characteristic point map, a target noise characteristic point sequence associated with a service updating activity corresponding to a preset service online time period and a noise characteristic point relation of the target noise characteristic point sequence corresponding to the service updating activity.
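As a non-limiting illustration, the threshold decision above can be sketched as follows. The application does not define how the noise mining effect index is calculated, so the formula in `noise_mining_effect` (one minus the mean relative gap between estimated and real conclusion indexes) is purely an assumption, as are the function names and the returned labels.

```python
def noise_mining_effect(estimated, real):
    """Hypothetical effect index: 1 minus the mean relative gap.

    Equals 1.0 when estimated and real indexes agree exactly; for
    non-negative indexes it stays within [0, 1].
    """
    if len(estimated) != len(real) or not estimated:
        raise ValueError("index lists must be non-empty and of equal length")
    gaps = []
    for est, act in zip(estimated, real):
        scale = max(abs(est), abs(act), 1e-12)  # avoid division by zero
        gaps.append(abs(est - act) / scale)
    return 1.0 - sum(gaps) / len(gaps)


def decide(estimated, real, first_effect_index):
    """Return 'deploy' when the index reaches the threshold, else 'adjust_map'."""
    index = noise_mining_effect(estimated, real)
    return "deploy" if index >= first_effect_index else "adjust_map"


outcome_ok = decide([0.9, 0.8], [0.9, 0.8], first_effect_index=0.8)
outcome_bad = decide([1.0], [0.5], first_effect_index=0.8)
```

With these inputs, `outcome_ok` is `"deploy"` and `outcome_bad` is `"adjust_map"`, which corresponds to the branch in which the first noise characteristic point map is adjusted.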

In a possible implementation manner of the first aspect, before the step of outputting the noise mining effect indicator between the big data collection procedure update information and the big data collection update deployment data according to the estimated big data collection conclusion indicator and the real big data collection conclusion indicator, the method further includes:

acquiring a key estimation big data acquisition event of the big data acquisition process updating information and a key real big data acquisition event of the big data acquisition updating deployment data, wherein the key estimation big data acquisition event is configured to an estimation big data acquisition event of which the acquisition span changes and the acquisition span change value is greater than a target change value in estimation big data acquisition events of different process updating labels expressing the corresponding big data acquisition process updating information, and the key real big data acquisition event is configured to a real big data acquisition event of which the acquisition span changes and the acquisition span change value is greater than the target change value in real big data acquisition events of different process updating labels expressing the corresponding big data acquisition updating deployment data;

outputting the updating information of the big data acquisition process and the first mining connectivity of the big data acquisition updating deployment data according to the first big data acquisition variable distribution of the key estimation big data acquisition event and the second big data acquisition variable distribution of the key real big data acquisition event;

the outputting the noise mining effect index between the big data acquisition process updating information and the big data acquisition updating deployment data according to the estimated big data acquisition conclusion index and the real big data acquisition conclusion index specifically includes:

and analyzing to obtain a second mining connectivity of the estimated big data acquisition conclusion index and the real big data acquisition conclusion index, and weighting the first mining connectivity and the second mining connectivity to output the noise mining effect index.
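As a non-limiting illustration, the weighting of mining connectivities can be sketched as a normalized weighted average. The application does not specify the weighting scheme or the weight values; the function name `fuse_connectivity` and the weights below are assumptions.

```python
def fuse_connectivity(scores, weights):
    """Weighted average of mining connectivity scores (assumed scheme)."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and aligned")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


# First mining connectivity 0.5, second mining connectivity 1.0,
# with the second weighted three times as heavily (hypothetical weights).
effect_index = fuse_connectivity([0.5, 1.0], [1.0, 3.0])
```

The same helper accepts any number of connectivity scores, so a further connectivity term can be fused by extending both lists.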

In a possible implementation manner of the first aspect, before the step of outputting the noise mining effect index between the big data collection process update information and the big data collection update deployment data according to the estimated big data collection conclusion index and the real big data collection conclusion index, the big data collection update deployment data further carries first collection knowledge point deployment data related to the big data collection process update information and second collection knowledge point deployment data related to the big data collection update deployment data, the method further includes:

outputting a third mining connectivity of the big data acquisition flow updating information and the big data acquisition updating deployment data according to the first acquisition knowledge point deployment data and the second acquisition knowledge point deployment data;

and analyzing to obtain a second mining connectivity of the estimated big data acquisition conclusion index and the real big data acquisition conclusion index, and weighting the second mining connectivity and the third mining connectivity to output the noise mining effect index.

In a possible implementation manner of the first aspect, before the step of obtaining the estimated big data collection conclusion indicator of the big data collection procedure update information and the real big data collection conclusion indicator of the big data collection update deployment data, the method further includes:

acquiring conflict information ratio of the big data acquisition flow updating information and the big data acquisition updating deployment data;

if the conflict information ratios of both the big data acquisition process updating information and the big data acquisition updating deployment data are not larger than a target conflict information ratio, executing the step of obtaining an estimated big data acquisition conclusion index of the big data acquisition process updating information and a real big data acquisition conclusion index of the big data acquisition updating deployment data; and if the conflict information ratio of either the big data acquisition process updating information or the big data acquisition updating deployment data is larger than the target conflict information ratio, ending the process.
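As a non-limiting illustration, the conflict information ratio gate can be sketched as follows. The application does not define what counts as conflict information; the boolean `conflict` flag on each record and both function names are assumptions made for this sketch.

```python
def conflict_ratio(records):
    """Share of records flagged as conflicting (the flag is hypothetical)."""
    if not records:
        return 0.0
    return sum(1 for record in records if record["conflict"]) / len(records)


def may_compare(update_info, deployment_data, target_ratio):
    """True only when both conflict ratios are within the target ratio."""
    return (
        conflict_ratio(update_info) <= target_ratio
        and conflict_ratio(deployment_data) <= target_ratio
    )


update_info = [{"conflict": False}] * 4
deployment_data = [{"conflict": True}, {"conflict": False}]
proceed_strict = may_compare(update_info, deployment_data, target_ratio=0.25)
proceed_loose = may_compare(update_info, deployment_data, target_ratio=0.5)
```

In this example half of the deployment records conflict, so the comparison step is skipped under the 0.25 target ratio and executed under the 0.5 target ratio.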

In a possible implementation manner of the first aspect, the extracting, from a big data acquisition service database, a first noise feature point map generated by performing noise feature point mining on a target big data acquisition log of the big data acquisition process, and configuring update information of a target big data acquisition process for the big data acquisition process according to the first noise feature point map specifically includes:

acquiring a target big data acquisition log of the big data acquisition process, and mining noise information of the target big data acquisition log to determine target noise information corresponding to the target big data acquisition log;

acquiring a first target collaborative big data acquisition log corresponding to the target big data acquisition log and a second target collaborative big data acquisition log corresponding to target noise information, and acquiring a collaborative acquisition process of the big data acquisition process, wherein the collaborative acquisition process is configured with a plurality of first previous big data acquisition logs and collaborative acquisition logs respectively corresponding to the plurality of first previous big data acquisition logs, and the collaborative acquisition logs comprise first collaborative big data acquisition logs corresponding to the first previous big data acquisition logs;

matching the target big data acquisition log, the target noise information, the first target collaborative big data acquisition log and the second target collaborative big data acquisition log with a first previous big data acquisition log and a first collaborative big data acquisition log in the collaborative acquisition process respectively, and outputting the first previous big data acquisition log and the first collaborative big data acquisition log matched with the target big data acquisition log, the target noise information, the first target collaborative big data acquisition log and the second target collaborative big data acquisition log;

performing log integration on the target big data acquisition log based on the matched first prior big data acquisition log and the first collaborative big data acquisition log;

acquiring the target big data acquisition log and an additional big data acquisition log which is log-integrated with the target big data acquisition log, and outputting the target big data acquisition log and the additional big data acquisition log as a target big data acquisition log;

acquiring training sample data clusters corresponding to AI training acquisition tasks of the target big data acquisition log in a first AI training process, wherein the first AI training process comprises at least two AI training acquisition tasks, and the training sample data clusters corresponding to the AI training acquisition tasks comprise acquisition big data of template features of the target training samples acquired by training sample acquisition activities in the target big data acquisition log in the corresponding AI training acquisition tasks;

analyzing and obtaining a training sample data atlas between training sample data clusters corresponding to all AI training acquisition tasks in the first AI training flow;

outputting noise characteristic point communication information of the target big data acquisition log in the first AI training flow based on a training sample data atlas between training sample data clusters corresponding to all AI training acquisition tasks in the first AI training flow;

analyzing and obtaining noise characteristic point relation information of the target big data acquisition log in the first AI training process based on the noise characteristic point communication information;

and configuring target big data acquisition process updating information for the big data acquisition process based on a first noise characteristic point map obtained by the noise characteristic point relation information.

In one possible implementation of the first aspect, the method further comprises:

acquiring large data of user behaviors to be learned, acquired after updating based on a large data acquisition process, for training a user interest mining model, and scheduling an initial user interest mining model and a collaborative training model for training the user interest mining model, wherein the initial user interest mining model comprises a plurality of first basic interest learning units for model parameter layer optimization and selection, and the collaborative training model comprises a plurality of second basic interest learning units for model parameter layer optimization and selection;

carrying out data division on the big data of the user behavior to be learned, and extracting the user pre-behavior activity and the user trigger behavior activity corresponding to the big data of the user behavior to be learned;

generating an interest learning data sequence based on the user pre-behavioral activity, the user triggered behavioral activity, and user activity interest points corresponding to the user triggered behavioral activity, and generating an interest difference learning data sequence based on the user triggered behavioral activity and the user activity interest points;

determining a first interest learning variable based on the first basic interest learning unit and according to the interest learning data in the interest learning data sequence;

determining a second interest learning variable based on a corresponding second basic interest learning unit and according to the interest difference learning data in the interest difference learning data sequence;

and adjusting and selecting the model parameter layer of the first basic interest learning unit based on the first interest learning variable and the second interest learning variable or the learning cost information of the user triggered behavior activity, and outputting the final user interest mining model which is adjusted and selected by the model parameter layer.

Based on the above steps, the first interest learning variable is determined from the interest learning data by the initial user interest mining model whose model parameter layers are to be optimized and selected, and the second interest learning variable is determined from the interest difference learning data by the collaborative training model whose model parameter layers have already been optimized and selected. The model parameter layers are then optimized and selected based on the first interest learning variable and the second interest learning variable, or on the learning cost information of the user triggered behavior activity. In this way, the initial user interest mining model, driven by user pre-behaviors, generates the fuzzy interest thermodynamic diagram, while the collaborative training model whose model parameter layers have been optimized and selected supplies the training basis information during model training, which further improves the reliability of model training.

configuring a learning cooperative relationship of the first basic interest learning unit and the second basic interest learning unit;

and determining interest learning data corresponding to the first basic interest learning unit in the interest learning data sequence according to the learning collaborative relationship, and determining interest difference learning data corresponding to the second basic interest learning unit in the interest difference learning data sequence.

In one possible implementation of the first aspect, the first basic interest learning unit includes a first explicit interest feature learning subunit, and the second basic interest learning unit includes a second explicit interest feature learning subunit; the interest learning data includes user pre-behavioral activity learning data, and the interest difference learning data includes user triggered behavioral activity learning data corresponding to the user pre-behavioral activity learning data;

the determining a first interest learning variable based on the first basic interest learning unit and according to the interest learning data in the interest learning data sequence comprises:

acquiring first explicit interest feature information based on the first explicit interest feature learning subunit and according to the user pre-behavior activity learning data;

the determining a second interest learning variable based on the corresponding second basic interest learning unit and according to the interest difference learning data in the interest difference learning data sequence includes:

acquiring second explicit interest feature information based on the second explicit interest feature learning subunit and according to the user triggered behavior activity learning data;

the tuning and selecting the model parameter layer of the first basic interest learning unit based on the first interest learning variable and the second interest learning variable or the learning cost information of the user triggered behavior activity includes:

calculating a first training loss function value based on learning cost information of the first explicit interest feature information and the second explicit interest feature information;

and adjusting and selecting the model parameter layer of the first explicit interest feature learning subunit based on the first training loss function value.
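As a non-limiting illustration, the loss-driven adjustment above can be sketched as one gradient step that pulls the output of the first explicit interest feature learning subunit toward the output of the second. The element-wise linear model, the mean squared error used as the "learning cost", the learning rate, and all names are assumptions; the application does not specify the model form or the loss function.

```python
def mse(a, b):
    """Mean squared error between two equally long feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def train_step(weights, pre_behavior, second_features, lr=0.1):
    """One gradient step on a hypothetical element-wise linear subunit.

    first_features[i] = weights[i] * pre_behavior[i]
    """
    first_features = [w * x for w, x in zip(weights, pre_behavior)]
    loss = mse(first_features, second_features)
    n = len(weights)
    # Gradient of the mean squared error with respect to each weight.
    grads = [
        2.0 * (f - s) * x / n
        for f, s, x in zip(first_features, second_features, pre_behavior)
    ]
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    return new_weights, loss


weights = [0.0, 0.0]
pre_behavior = [1.0, 2.0]      # user pre-behavioral activity learning data
second_features = [1.0, 1.0]   # output of the second subunit (assumed given)
weights, loss_1 = train_step(weights, pre_behavior, second_features)
weights, loss_2 = train_step(weights, pre_behavior, second_features)
```

In this toy run the first training loss function value decreases from one step to the next, which is the behavior the adjustment of the model parameter layer is meant to produce.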

For example, in one possible implementation manner of the first aspect, the first basic interest learning unit includes an implicit interest feature learning subunit, and the second basic interest learning unit includes a first implicit interest point explicit conversion subunit; the interest learning data includes user pre-behavioral activity learning data, and the interest difference learning data includes user triggered behavioral activity learning data corresponding to the user pre-behavioral activity learning data;

obtaining implicit interest feature information based on the implicit interest feature learning subunit and according to the user preposed behavior activity learning data;

acquiring first implicit interest point explicit conversion characteristic information based on the first implicit interest point explicit conversion subunit and according to the user triggered behavior activity learning data;

acquiring explicit conversion characteristic information of a second implicit interest point based on the implicit interest characteristic information;

calculating a second training loss function value based on learning cost information of the first implicit interest point explicit conversion characteristic information and the second implicit interest point explicit conversion characteristic information;

and adjusting and selecting the model parameter layer of the implicit interest feature learning subunit based on the second training loss function value.

For example, in a possible implementation manner of the first aspect, the second basic interest learning unit further includes: a second implicit point of interest explicit conversion subunit; the interest difference learning data includes: sample user activity interest points corresponding to the user triggered behavioral activity learning data; the obtaining of the second implicit interest point explicit conversion feature information based on the implicit interest feature information includes:

obtaining an explicit interest point feature based on the second implicit interest point explicit conversion subunit and according to the sample user activity interest points;

and performing explicit conversion on the explicit interest point characteristics and the implicit interest characteristic information to generate second implicit interest point explicit conversion characteristic information.

For example, in one possible implementation manner of the first aspect, the first basic interest learning unit includes a first prediction subunit, and the second basic interest learning unit includes an intention probability distribution matrix output unit; the interest learning data includes user triggered behavioral activity learning data and sample user activity interest points, and the interest difference learning data includes user triggered behavioral activity learning data;

acquiring third implicit interest point explicit conversion characteristic information based on the user triggered behavior activity learning data;

obtaining a first intention probability distribution matrix based on the first prediction subunit and according to the third implicit interest point explicit conversion characteristic information and the sample user activity interest points;

obtaining a second intention probability distribution matrix based on the intention probability distribution matrix output unit and according to the user triggered behavior activity learning data;

calculating a third training loss function value based on learning cost information of the first intention probability distribution matrix and the second intention probability distribution matrix;

and adjusting and selecting the model parameter layer of the first prediction subunit based on the third training loss function value.
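As a non-limiting illustration, the learning cost between the first and second intention probability distribution matrices can be sketched as a mean row-wise Kullback-Leibler divergence. The application does not name a loss function; the choice of KL divergence, the direction of the divergence (second matrix as reference), and the function name are assumptions.

```python
import math


def mean_kl_divergence(reference_rows, predicted_rows, eps=1e-12):
    """Mean row-wise KL(reference || predicted) for row-stochastic matrices."""
    if len(reference_rows) != len(predicted_rows) or not reference_rows:
        raise ValueError("matrices must be non-empty with equal row counts")
    total = 0.0
    for ref, pred in zip(reference_rows, predicted_rows):
        total += sum(
            r * math.log((r + eps) / (p + eps))
            for r, p in zip(ref, pred)
            if r > 0  # terms with zero reference probability contribute 0
        )
    return total / len(reference_rows)


second_matrix = [[0.5, 0.5], [0.9, 0.1]]   # from the output unit (reference)
first_matrix = [[0.9, 0.1], [0.9, 0.1]]    # from the first prediction subunit
third_loss = mean_kl_divergence(second_matrix, first_matrix)
```

The value is zero when the two matrices coincide and grows as the first prediction subunit's distributions drift from the reference, so it can serve as the third training loss function value under the stated assumptions.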

For example, in a possible implementation manner of the first aspect, the second basic interest learning unit further includes: a third implicit point of interest explicit conversion subunit;

the step of obtaining third implicit interest point explicit conversion feature information based on the user triggered behavior activity learning data includes:

and acquiring the third implicit interest point explicit conversion characteristic information based on the third implicit interest point explicit conversion subunit and according to the user triggered behavior activity learning data.

For example, in one possible implementation manner of the first aspect, the first basic interest learning unit includes a second prediction subunit, and the interest learning data includes the user triggered behavioral activity learning data and sample user activity interest points;

acquiring a third intention probability distribution matrix corresponding to the user triggered behavior activity learning data;

obtaining a fuzzy interest thermodynamic diagram based on the second prediction subunit and according to the third intention probability distribution matrix and the sample user activity interest points;

calculating a fourth training loss function value based on the fuzzy interest thermodynamic diagram and learning cost information of the user triggered behavioral activity learning data;

and adjusting and selecting the model parameter layer of the second prediction subunit based on the fourth training loss function value.

For example, in one possible implementation of the first aspect, the obtaining a third intent probability distribution matrix corresponding to the user triggered behavioral activity learning data includes:

acquiring third implicit interest point explicit conversion characteristic information based on the third implicit interest point explicit conversion subunit and according to the user triggered behavior activity learning data;

and obtaining the third intention probability distribution matrix based on the first prediction subunit and according to the third implicit interest point explicit conversion characteristic information.

In a second aspect, an embodiment of the present application further provides a big data acquisition optimization system applying artificial intelligence analysis, where the big data acquisition optimization system applying artificial intelligence analysis includes a big data system and a plurality of big data acquisition processes communicatively connected to the big data system;

the big data system is used for:

In a third aspect, an embodiment of the present application further provides a big data system, where the big data system includes a processor and a memory, the memory is used for storing a computer program that can be executed on the processor, and the processor, when executing the computer program, performs the big data acquisition optimization method applying artificial intelligence analysis according to the first aspect or any one of the possible implementation manners of the first aspect.

By adopting the implementation mode of any aspect, after extracting the first noise characteristic point map generated by mining the noise characteristic point of the target big data acquisition log of the big data acquisition process from the big data acquisition service database, and updating the big data acquisition process based on the first noise characteristic point map, when the first noise characteristic point map needs to be adjusted, the target noise characteristic point sequence associated with the service updating activity corresponding to the preset service online time period and the noise characteristic point relationship of the target noise characteristic point sequence corresponding to the service updating activity are obtained from the first noise characteristic point map, and the corresponding second noise characteristic point map is reconstructed, the traversal big data acquisition process is updated based on the second noise characteristic point map, the continuous reconstruction optimization of the noise characteristic point map is realized, and the traversal big data acquisition process is updated, the accuracy of follow-up big data sample collection can be improved.

Drawings

Fig. 1 is a schematic flowchart of a user demand decision method applying AI and big data analysis according to an embodiment of the present invention;

Fig. 2 is a schematic block diagram of the structure of the internet system for implementing the user demand decision method applying AI and big data analysis according to the embodiment of the present invention.

Detailed Description

The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "corresponding" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The architecture of the user demand decision system 10 applying AI and big data analysis according to an embodiment of the present invention is described below. The user demand decision system 10 may include an internet system 100 and a big data collecting process 200 communicatively connected to the internet system 100. The internet system 100 and the big data collecting process 200 may cooperate to perform the user demand decision method applying AI and big data analysis described in the following method embodiments; for the specific steps performed by the internet system 100 and the big data collecting process 200, refer to the detailed description of the method embodiments below.

The user demand decision method for applying AI and big data analysis provided in this embodiment may be executed by the internet system 100, and is described in detail below with reference to FIG. 1.

Process100, extracting, from a big data acquisition service database, a first noise feature point map generated by mining noise feature points from a target big data acquisition log of a big data acquisition process; and, after the big data acquisition process is updated based on the first noise feature point map, when the first noise feature point map needs to be adjusted, acquiring from the first noise feature point map a target noise feature point sequence associated with a service update activity corresponding to a preset service online period, and the noise feature point relationship of the target noise feature point sequence corresponding to the service update activity.

In this embodiment, the first noise feature point map may represent the feature field information corresponding to different noise feature points and the association relationships between different noise feature points. Both the noise feature points themselves and their association relationships with other noise feature points are therefore taken into account when the corresponding big data acquisition process is updated. A specific updating manner may be, for example, filtering out, in the big data acquisition template, the feature field information corresponding to the noise feature points and the feature field information corresponding to their association relationships with other noise feature points.

However, the inventors found through research that, after the big data acquisition process is updated based on the first noise feature point map, the acquired big data samples may still contain associated noise information, because the first noise feature point map is not fully accurate. One main reason is that service update activities in newly online services introduce errors into the originally mined first noise feature point map. This embodiment may therefore further acquire, from the first noise feature point map, a target noise feature point sequence associated with a service update activity (such as the service update field distribution corresponding to the service update activity) corresponding to a preset service online period, and the noise feature point relationship of the target noise feature point sequence corresponding to the service update activity.

Process200, reconstructing a corresponding second noise feature point map based on the target noise feature point sequence and the noise feature point relationship of the target noise feature point sequence corresponding to the service update activity.

In this embodiment, once the target noise feature point sequence and the noise feature point relationship of the target noise feature point sequence corresponding to the service update activity have been extracted, a corresponding second noise feature point map may be reconstructed. Compared with the first noise feature point map, the second noise feature point map better conforms to the service update activity corresponding to the latest preset service online period.
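As an illustrative sketch only, the reconstruction can be pictured as taking the part of the first noise feature point map that is spanned by the target noise feature point sequence. The `NoiseMap` container, the `reconstruct_map` function, and the choice of an induced sub-map are assumptions made for illustration; the embodiment does not prescribe a data structure or a reconstruction algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class NoiseMap:
    # Hypothetical container: noise feature point id -> feature field names.
    fields: dict = field(default_factory=dict)
    # Undirected relations stored as frozenset({point_a, point_b}).
    relations: set = field(default_factory=set)


def reconstruct_map(first_map, target_points):
    """Build a second map from the target points and the relations among them.

    Assumption: the second map is the sub-map of the first map induced by
    the target noise feature point sequence.
    """
    targets = set(target_points)
    kept_fields = {p: f for p, f in first_map.fields.items() if p in targets}
    # Keep a relation only when both of its endpoints are target points.
    kept_relations = {r for r in first_map.relations if r <= targets}
    return NoiseMap(kept_fields, kept_relations)
```

Here the target points would be those associated with the service update activity of the preset service online period; how that association is established is left open by the text.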

Process300, performing traversal big data acquisition process updating based on the second noise feature point map.

For example, the specific updating manner may be to filter out, in the big data acquisition template, the feature field information corresponding to the noise feature points in the second noise feature point map and the feature field information corresponding to the association relationships between those noise feature points and other noise feature points.
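The filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions: the acquisition template is modeled as an ordered list of field names, and `filter_template`, `point_fields`, and `relation_fields` are hypothetical names that do not appear in the embodiment.

```python
def filter_template(template_fields, point_fields, relation_fields):
    """Return the template fields that survive noise filtering.

    template_fields: ordered list of field names in the acquisition template.
    point_fields: noise feature point id -> field names tied to that point.
    relation_fields: (point id, point id) -> field names tied to that relation.
    All three shapes are assumptions made for this sketch.
    """
    blocked = set()
    for names in point_fields.values():
        blocked.update(names)
    for names in relation_fields.values():
        blocked.update(names)
    # Preserve the original field order of the template.
    return [name for name in template_fields if name not in blocked]
```

Fields tied to a relation are removed even when neither endpoint lists them directly, which mirrors the point that the association relationships are filtered as well as the noise feature points themselves.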

Based on the above steps, in this embodiment, a first noise feature point map generated by mining noise feature points from the target big data acquisition log of the big data acquisition process is extracted from the big data acquisition service database, and the big data acquisition process is updated based on the first noise feature point map. When the first noise feature point map later needs to be adjusted, the target noise feature point sequence associated with the service update activity corresponding to the preset service online period, and the noise feature point relationship of the target noise feature point sequence corresponding to the service update activity, are obtained from the first noise feature point map, and a corresponding second noise feature point map is reconstructed. Traversal big data acquisition process updating is then performed based on the second noise feature point map. The noise feature point map is thereby continuously reconstructed and optimized, which can improve the accuracy of subsequent big data sample acquisition.

For some exemplary design considerations, Process100 may be implemented by the following exemplary steps.

Process110, extracting, from a big data acquisition service database, a first noise feature point map generated by mining noise feature points from a target big data acquisition log of the big data acquisition process 200, configuring target big data acquisition process update information for the big data acquisition process 200 according to the first noise feature point map, and monitoring big data acquisition update deployment data of the big data acquisition process 200, wherein the big data acquisition update deployment data is a big data acquisition log generated by traversal big data acquisition after the big data acquisition process is updated according to the big data acquisition process update information.

Process120, the big data system 100 obtains the estimated big data collection conclusion index of the big data collection process update information and the real big data collection conclusion index of the big data collection update deployment data.

Process130, the big data system 100 outputs a noise mining effect index between the big data collection process update information and the big data collection update deployment data according to the estimated big data collection conclusion index and the real big data collection conclusion index.

Process140, loading the target big data acquisition process update information to the sample big data acquisition activity if the noise mining effect index is not smaller than the first effect index, and outputting optimization indication information for optimizing the first noise feature point map if the noise mining effect index is smaller than the first effect index.

With this technical solution, after the big data acquisition update deployment data is obtained, the noise mining effect index between the big data acquisition process update information and the big data acquisition update deployment data is evaluated to judge whether the big data acquisition update deployment data conforms to the big data acquisition process update information, and thus whether the big data acquisition process update meets the effective deployment requirement. When the requirement is met, the target big data acquisition process update information can be loaded to the sample big data acquisition activity and recorded, forming a sample big data acquisition activity resource with higher reliability. When the requirement is not met, optimization indication information is output for optimizing the first noise feature point map on which the big data acquisition process update information was configured, which facilitates updating the noise feature points.
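The branch taken in Process140 can be sketched as a simple threshold check. The function name `process140`, the list used to stand in for the sample big data acquisition activity, and the shape of the returned indication are all hypothetical; only the comparison against the first effect index comes from the text.

```python
def process140(effect_index, first_effect_index, update_info, sample_activity):
    """Load the update info or ask for the first map to be optimized.

    sample_activity is a plain list standing in for the sample big data
    acquisition activity (an assumption made for this sketch).
    """
    if effect_index >= first_effect_index:
        # "Not smaller than" the first effect index: record the update info.
        sample_activity.append(update_info)
        return {"action": "loaded"}
    # Smaller than the first effect index: request map optimization.
    return {"action": "optimize_first_noise_map", "effect_index": effect_index}
```

Because the condition is "not smaller than", an effect index exactly equal to the first effect index is treated as meeting the effective deployment requirement.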

Based on the description of the foregoing embodiment, the following describes a flow of a big data collection optimization method using artificial intelligence analysis according to another embodiment of the present application, where the embodiment includes:

Process210, the big data collecting process 200 sends a big data process update evaluation instruction to the big data system 100, where the big data process update evaluation instruction carries big data acquisition update deployment data, and the big data acquisition update deployment data is a big data acquisition log generated by traversal big data acquisition after the big data acquisition process is updated according to the big data acquisition process update information.

The big data acquisition updating deployment data refers to a big data acquisition log generated by traversing big data acquisition after a big data acquisition process updates a big data acquisition flow. The big data acquisition update deployment data may include deployment information for training sample data acquisition in a big data acquisition process, and the like. The big data acquisition process update information refers to big data acquisition process update information generated according to the first noise feature point map mined in advance, and may include a series of acquisition process update information for a big data acquisition field, for example.

For other design considerations, when uploading the big data collection update deployment data, the big data collection process 200 is automatically triggered to send a big data process update evaluation instruction to the big data system 100.

Process220, the big data system 100 receives the big data process update evaluation instruction from the big data collecting process 200.

Process230, the big data system 100 obtains the conflict information ratio between the big data acquisition process update information and the big data acquisition update deployment data, determines whether that conflict information ratio is not greater than the target conflict information ratio, and executes Process240 if it is not greater than the target conflict information ratio.

The target conflict information ratio is a preset conflict information ratio, and the target conflict information ratio is used for judging whether the conflict information ratio of the big data acquisition process updating information and the big data acquisition updating deployment data meets an expected condition or not.

For some design ideas, after the big data system 100 obtains the conflict information ratio between the big data acquisition process update information and the big data acquisition update deployment data, if that conflict information ratio is greater than the target conflict information ratio, the procedure ends.

Through the conflict information ratio check, the big data acquisition process update information and the big data acquisition update deployment data whose conflict information ratio meets the condition can be screened out. This improves the accuracy of the subsequent big data acquisition process update evaluation, makes the features in the big data acquisition process update information and the big data acquisition update deployment data more distinct, and improves the accuracy of feature extraction.
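The text does not define how the conflict information ratio is computed. Purely as an assumption for illustration, the sketch below treats both the update information and the deployment data as mappings from field name to handling rule, and takes the ratio as the share of common fields whose rules disagree. The names `conflict_ratio` and `passes_conflict_check` are hypothetical.

```python
def conflict_ratio(update_info, deployment_data):
    """Share of fields present in both mappings whose values disagree.

    Assumed definition: the embodiment only names the ratio and does not
    say how it is calculated.
    """
    shared = update_info.keys() & deployment_data.keys()
    if not shared:
        return 0.0
    conflicts = sum(1 for key in shared if update_info[key] != deployment_data[key])
    return conflicts / len(shared)


def passes_conflict_check(update_info, deployment_data, target_ratio):
    """Gate of Process230: continue only when the ratio is not greater than the target."""
    return conflict_ratio(update_info, deployment_data) <= target_ratio
```

When the check fails the procedure simply ends, so the later effect index evaluation is only ever run on pairs that are reasonably consistent.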

Process240, the big data system 100 obtains the estimated big data collection conclusion index of the big data collection process update information and the real big data collection conclusion index of the big data collection update deployment data.

The estimated big data collection conclusion index is a big data collection effect variable expressing the big data collection process update information. The real big data collection conclusion index is a big data collection effect variable expressing the big data collection update deployment data. The real big data collection conclusion index (big data collection effect variable) may be member estimation acquisition parameter information of different process update labels. An estimated big data collection event refers to a basic big data collection event expected by the big data collection process update information.

For some design ideas, the big data system 100 obtains the estimated big data acquisition conclusion index of the big data acquisition process update information as follows: after the big data acquisition process update information is obtained, the estimated big data acquisition conclusion index is determined according to the ratio of the estimated number of filtered noise feature points to the current cumulative number of noise feature points.

For some design ideas, the big data system 100 obtains the real big data acquisition conclusion index of the big data acquisition update deployment data as follows: the real big data acquisition conclusion index is determined according to the ratio of the number of noise feature points actually filtered after the big data acquisition update to the current cumulative number of noise feature points.
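Both conclusion indexes are described as ratios over the same denominator, the current cumulative number of noise feature points. A minimal sketch, in which the function name `conclusion_index` and the example counts are illustrative assumptions:

```python
def conclusion_index(filtered_count, cumulative_count):
    """Ratio of filtered noise feature points to the cumulative number of points."""
    if cumulative_count <= 0:
        raise ValueError("cumulative_count must be positive")
    return filtered_count / cumulative_count


# Illustrative numbers only: 50 noise feature points accumulated so far.
estimated_index = conclusion_index(40, 50)  # 40 points expected to be filtered
real_index = conclusion_index(30, 50)       # 30 points actually filtered
```

The estimated index uses the number of points expected to be filtered by the update information, and the real index uses the number actually filtered after the update was deployed.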

Process250, the big data system 100 outputs the second mining connectivity between the estimated big data acquisition conclusion index and the real big data acquisition conclusion index, according to the estimated big data acquisition conclusion index of the big data acquisition process update information and the real big data acquisition conclusion index of the big data acquisition update deployment data.

The second mining connectivity is the degree of overlap between the estimated big data acquisition conclusion index and the real big data acquisition conclusion index.

Process260, the big data system 100 obtains an important estimated big data collecting event of the big data collecting process update information and an important real big data collecting event of the big data collecting update deployment data, wherein the important estimated big data collecting event is an estimated big data collecting event, among the estimated big data collecting events of different process update labels of the corresponding big data collecting process update information, whose collecting span has changed and whose collecting span change value is greater than the target change value.

The target change value is used for measuring the acquisition span update state of an estimated big data acquisition event. For example, an estimated big data acquisition event whose acquisition span change value is greater than the target change value is taken as a markedly updated important estimated big data acquisition event.

For some design ideas, after acquiring the big data acquisition process update information and the big data acquisition update deployment data, the big data system 100 analyzes them and determines, among the big data acquisition events of different process update labels, the events whose acquisition span has changed and whose acquisition span change value is greater than the target change value, so as to obtain the important estimated big data acquisition events of the big data acquisition process update information and the important real big data acquisition events of the big data acquisition update deployment data.
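Selecting the important events amounts to a filter on the acquisition span change value. In the sketch below, `important_events` and the mapping from event id to change value are hypothetical, and comparing the magnitude of the change (so that a shrinking span also counts) is an assumption; the text only says the change value must be greater than the target change value.

```python
def important_events(span_changes, target_change):
    """Keep events whose acquisition span changed by more than the target value.

    span_changes: event id -> acquisition span change value, 0 meaning unchanged.
    Using abs() is an assumption of this sketch, not stated in the text.
    """
    return [
        event
        for event, change in span_changes.items()
        if change != 0 and abs(change) > target_change
    ]
```

The same filter would be applied once to the estimated events of the update information and once to the real events of the deployment data.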

Process270, the big data system 100 outputs the first mining connectivity of the big data acquisition process update information and the big data acquisition update deployment data according to the first big data acquisition variable distribution of the important estimated big data acquisition event and the second big data acquisition variable distribution of the important real big data acquisition event.

The first mining connectivity may refer to the degree of overlap between the first big data collection variable distribution and the second big data collection variable distribution.
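The text does not say how the degree of overlap between the two distributions is measured. One common choice, used here only as an assumed stand-in, is histogram intersection: normalize both distributions and sum the smaller share in each bin. The function name `distribution_overlap` and the dictionary representation are hypothetical.

```python
def distribution_overlap(first, second):
    """Histogram intersection of two count distributions, in the range 0 to 1.

    first, second: variable value -> count. Histogram intersection is a
    swapped-in technique; the embodiment does not name a specific measure.
    """
    total_first = sum(first.values())
    total_second = sum(second.values())
    if total_first == 0 or total_second == 0:
        return 0.0
    keys = set(first) | set(second)
    return sum(
        min(first.get(k, 0) / total_first, second.get(k, 0) / total_second)
        for k in keys
    )
```

Identical distributions score 1 and distributions with no common values score 0, which matches the intuitive reading of "degree of overlap".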

The obtaining process of the big data acquisition variable distribution can be executed before, after or simultaneously with obtaining the estimated big data acquisition conclusion index of the big data acquisition process updating information and the real big data acquisition conclusion index of the big data acquisition updating deployment data. The execution sequence of the acquisition process of the large data acquisition variable distribution is not limited in the embodiment of the application.

Process280, the big data system 100 outputs the noise mining effect index by weighting the first mining connectivity and the second mining connectivity.

For some design ideas, after the big data system 100 determines the first mining connectivity and the second mining connectivity, the noise mining effect index is output by weighting them according to their importance weights. By adjusting the importance weights of the first mining connectivity and the second mining connectivity, the two effect measures can be combined more appropriately, which improves the accuracy of the noise mining effect index evaluation.
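The weighting step can be sketched as a normalized weighted average. The function name `noise_mining_effect`, the default weights of 0.5, and the normalization by the weight sum are assumptions; the text only states that the two connectivities are weighted by importance.

```python
def noise_mining_effect(first_conn, second_conn, w_first=0.5, w_second=0.5):
    """Weighted average of the first and second mining connectivity.

    The default weights are placeholders, not values given in the text.
    """
    total = w_first + w_second
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_first * first_conn + w_second * second_conn) / total
```

Setting `w_first` to 0 reduces the result to the second mining connectivity alone, which corresponds to the variant in which the distribution-based steps are skipped.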

Process260 to Process280 are optional steps. For some design ideas, the big data system 100 uses the second mining connectivity between the estimated big data acquisition conclusion index and the real big data acquisition conclusion index directly as the noise mining effect index, and then performs the subsequent process.

Process290, if the noise mining effect index meets the target condition, the big data system 100 sends update passing information to the big data collecting process 200, where the update passing information is used to indicate that the big data collecting update deployment data meets the effective deployment requirement.

For some design ideas, if the noise mining effect index is not smaller than the first effect index, it is determined that the big data acquisition update deployment data meets the effective deployment requirement, and the target big data acquisition process update information is loaded to the sample big data acquisition activity. The first effect index is a preset fixed threshold used for judging whether the big data acquisition update deployment data meets the effective deployment requirement.

For other design considerations, the big data system 100 can also perform an effect index evaluation process according to noise mining effect index evaluation and feature matching. The big data flow updating and evaluating instruction also carries first acquisition knowledge point deployment data related to the big data acquisition flow updating information and second acquisition knowledge point deployment data related to the big data acquisition updating deployment data. The following describes a flow of a big data collection optimization method using artificial intelligence analysis according to another embodiment of the present application, where the embodiment includes:

Process310, the big data collecting process 200 sends a big data process update evaluation instruction to the big data system 100, where the big data process update evaluation instruction carries big data acquisition process update information, first acquisition knowledge point deployment data, big data acquisition update deployment data, and second acquisition knowledge point deployment data.

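The first acquisition knowledge point deployment data refers to the acquisition knowledge point deployment data of the big data acquisition process update information. Likewise, the second acquisition knowledge point deployment data refers to the acquisition knowledge point deployment data of the big data acquisition update deployment data.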

Process320, the big data system 100 receives the big data process update evaluation instruction from the big data collection process 200.

Process330, the big data system 100 obtains the conflict information ratio between the big data acquisition process update information and the big data acquisition update deployment data, determines whether that conflict information ratio is not greater than the target conflict information ratio, and executes Process340 if it is not greater than the target conflict information ratio.

Process340, the big data system 100 obtains the estimated big data collection conclusion index of the big data collection process update information and the real big data collection conclusion index of the big data collection update deployment data.

Process350, the big data system 100 outputs the second mining connectivity between the estimated big data acquisition conclusion index and the real big data acquisition conclusion index, according to the estimated big data acquisition conclusion index of the big data acquisition process update information and the real big data acquisition conclusion index of the big data acquisition update deployment data.

For Process320 to Process350, refer to the description of Process220 to Process250; the details are not repeated here.

Process360, the big data system 100 outputs a third mining connectivity of the big data collecting process update information and the big data collecting update deployment data according to the first acquisition knowledge point deployment data and the second acquisition knowledge point deployment data.

The third mining connectivity refers to the degree of overlap between the first acquisition knowledge point deployment data and the second acquisition knowledge point deployment data.

Process370, the big data system 100 outputs the noise mining effect index by weighting the second mining connectivity and the third mining connectivity.

Process380, if the noise mining effect index meets the target condition, the big data system 100 sends update passing information to the big data collecting process 200, where the update passing information is used to indicate that the big data collecting update deployment data meets the effective deployment requirement.

For Process380, refer to the description of Process290; the details are not repeated here.

For some design considerations, in Process110, the procedure of extracting, from the big data acquisition service database, the first noise feature point map generated by mining noise feature points from the target big data acquisition log of the big data acquisition process 200, and configuring target big data acquisition process update information for the big data acquisition process 200 according to the first noise feature point map, may be implemented by the following steps.

STEP101, obtaining a target big data acquisition log of the big data acquisition process 200, performing noise information mining on the target big data acquisition log, and determining target noise information corresponding to the target big data acquisition log.

STEP102, obtaining a first target collaborative big data acquisition log corresponding to the target big data acquisition log and a second target collaborative big data acquisition log corresponding to the target noise information, and obtaining a collaborative acquisition process of the big data acquisition process 200, where the collaborative acquisition process is configured with a plurality of first previous big data acquisition logs and a plurality of collaborative acquisition logs corresponding to the first previous big data acquisition logs, and the collaborative acquisition logs include first collaborative big data acquisition logs corresponding to the first previous big data acquisition logs.

STEP103, matching the target big data acquisition log, the target noise information, the first target collaborative big data acquisition log and the second target collaborative big data acquisition log with the first previous big data acquisition logs and the first collaborative big data acquisition logs in the collaborative acquisition process acquired in STEP102, respectively, and outputting the first previous big data acquisition log and the first collaborative big data acquisition log matched with the target big data acquisition log, the target noise information, the first target collaborative big data acquisition log and the second target collaborative big data acquisition log.

For example, the target big data acquisition log, the target noise information, the first target collaborative big data acquisition log, and the second target collaborative big data acquisition log are respectively matched with each first previous big data acquisition log and the first collaborative big data acquisition log thereof in the collaborative acquisition process, so as to obtain the first previous big data acquisition log and the first collaborative big data acquisition log matched with the target big data acquisition log, the target noise information, the first target collaborative big data acquisition log, and the second target collaborative big data acquisition log.

STEP104, based on the first previous big data acquisition log and the first collaborative big data acquisition log obtained by STEP103 matching, performs log integration on the target big data acquisition log.

Aiming at some design ideas, the application also provides another big data acquisition optimization method applying artificial intelligence analysis, and the method comprises the following steps.

STEP501, acquiring a target big data acquisition log and an additional big data acquisition log for log integration with the target big data acquisition log, and outputting the target big data acquisition log and the additional big data acquisition log as a target big data acquisition log.

STEP502, acquiring a training sample data cluster corresponding to each AI training acquisition task of the target big data acquisition log in a first AI training process, wherein the first AI training process comprises at least two AI training acquisition tasks, and the training sample data cluster corresponding to each AI training acquisition task comprises acquisition big data of the template characteristics of the target training sample acquired by the training sample acquisition activities in the target big data acquisition log in the corresponding AI training acquisition task.

STEP503, outputting a training sample data atlas between training sample data clusters corresponding to each AI training acquisition task in the first AI training flow.

STEP504, based on the training sample data atlas between training sample data clusters corresponding to each AI training acquisition task in the first AI training flow, outputs the noise feature point communication information of the target big data acquisition log in the first AI training flow.

STEP505, determining noise feature point relation information of the target big data acquisition log in the first AI training process based on the noise feature point communication information, and configuring target big data acquisition process update information for the big data acquisition process 200 based on a first noise feature point map obtained from the noise feature point relation information.

For example, for some design considerations, STEP502 can be implemented by the following STEPs.

STEP5021, when the training sample acquisition activity in the target big data acquisition log acquires the target training sample template feature in the preset acquisition partition after the first AI training acquisition task is started, outputting the training sample data cluster corresponding to the first AI training acquisition task according to the acquired big data of the target training sample template feature acquired in the preset acquisition partition, where the first AI training acquisition task is any AI training acquisition task in the first AI training flow.

STEP5022, when the training sample acquisition activity in the target big data acquisition log does not acquire the target training sample template feature in the preset acquisition partition after the second AI training acquisition task is started, outputting the training sample data cluster corresponding to the second AI training acquisition task based on the acquired big data of the target training sample template feature received by the training sample acquisition activity in the target big data acquisition log, where the second AI training acquisition task is any AI training acquisition task other than the first AI training acquisition task in the first AI training flow.

In this embodiment, when a training sample acquisition activity in the target big data acquisition log does not acquire a target training sample template feature in a preset acquisition partition after the third AI training acquisition task is started, and training sample data clusters corresponding to continuous first-target-number AI training acquisition tasks before the third AI training acquisition task are determined based on the acquired big data of the target training sample template feature received by the training sample acquisition activity, a target training sample template feature acquisition request is sent to the training sample acquisition activity, so that the training sample acquisition activity acquires the target training sample template feature in response to the target training sample template feature acquisition request, and the third AI training acquisition task is any one of AI training acquisition tasks other than the first AI training acquisition task and the second AI training acquisition task in the first AI training flow.

Therefore, the big data of the training sample template characteristics acquired by the training sample acquisition activity responding to the target training sample template characteristic acquisition request can be acquired, and the training sample data cluster corresponding to the third AI training acquisition task can be output.
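The three cases described for the first, second, and third AI training acquisition tasks can be read as a fallback rule applied task by task. The sketch below is one possible reading under stated assumptions: each task is a dictionary with hypothetical keys `partition_data` and `received_data`, and `request_features` is a hypothetical callback standing in for the target training sample template feature acquisition request.

```python
def build_clusters(tasks, first_target_number, request_features):
    """Pick the data source of each task's training sample data cluster.

    tasks: list of dicts; "partition_data" is None when nothing was acquired
    in the preset acquisition partition, "received_data" is the fallback.
    request_features: callback taking the task index and returning data
    acquired in response to an explicit acquisition request.
    """
    clusters = []
    fallback_streak = 0  # consecutive tasks served from received data
    for index, task in enumerate(tasks):
        partition_data = task.get("partition_data")
        if partition_data is not None:
            # Feature acquired in the partition (first task case).
            clusters.append(partition_data)
            fallback_streak = 0
        elif fallback_streak >= first_target_number:
            # Too many consecutive fallbacks: send an explicit request (third task case).
            clusters.append(request_features(index))
            fallback_streak = 0
        else:
            # Use the data received by the acquisition activity (second task case).
            clusters.append(task["received_data"])
            fallback_streak += 1
    return clusters
```

Resetting the streak after an explicit request is a design choice of this sketch; the text does not say what happens to the count afterwards.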

For example, for some design ideas, outputting the training sample data atlas between the training sample data clusters corresponding to each AI training acquisition task in the first AI training flow may be done as follows: determining a dynamic acquisition big data cluster from the training sample data clusters corresponding to the AI training acquisition tasks in the first AI training flow, and then respectively determining the training sample data atlas between the dynamic acquisition big data cluster and each of the other training sample data clusters corresponding to the AI training acquisition tasks in the first AI training flow. Alternatively, the training sample data atlas between the training sample data clusters corresponding to every two associated AI training acquisition tasks in the first AI training flow may be respectively determined.

The training sample data atlas may include a plurality of training sample data and training cooperative relationship information between the training sample data.

For some design ideas, the training sample data cluster corresponding to each AI training acquisition task in the first AI training flow includes a migratable training sample data cluster and a non-migratable training sample data cluster, and the noise feature point communication information includes first noise feature point communication information determined based on a training sample data atlas corresponding to the migratable training sample data cluster of each AI training acquisition task specified in the first AI training flow, and second noise feature point communication information determined based on a training sample data atlas corresponding to the non-migratable training sample data cluster of each AI training acquisition task specified in the first AI training flow.

Based on the description of the foregoing embodiment, noise feature point relationship information of the target big data acquisition log in the first AI training procedure is determined based on the noise feature point communication information, and specifically, the noise feature point relationship information of the target big data acquisition log in the first AI training procedure may be output based on the first noise feature point communication information and the second noise feature point communication information.

For example, for some design considerations, STEP504 can be implemented as follows.

STEP5041, outputting a plurality of target migratable training sample data clusters with noise output value higher than first target value corresponding to noise characteristic point relation information of the cooperative features of the target training sample template features from training sample data clusters corresponding to each AI training acquisition task in the first AI training process, and outputting a plurality of target non-migratable training sample data clusters with noise output value higher than second target value corresponding to noise characteristic point relation information of the cooperative features of the target training sample template features.

STEP5042, determining first noise characteristic point communication information based on training sample data maps corresponding to a plurality of target migratable training sample data clusters, and determining second noise characteristic point communication information according to training sample data maps corresponding to a plurality of target non-migratable training sample data clusters.
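The selection in STEP5041 can be read as a threshold filter over the two kinds of clusters. The following Python sketch is illustrative only: the `SampleCluster` type, its `noise_output` field, and the function name are hypothetical, and the source does not specify how the noise output value itself is computed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SampleCluster:
    """Hypothetical record for one training sample data cluster."""
    name: str
    migratable: bool
    noise_output: float  # assumed to be precomputed elsewhere


def select_target_clusters(
    clusters: List[SampleCluster],
    first_target: float,
    second_target: float,
) -> Tuple[List[SampleCluster], List[SampleCluster]]:
    """Split clusters into target migratable / non-migratable sets.

    A migratable cluster is kept when its noise output value is higher
    than first_target; a non-migratable cluster is kept when its value
    is higher than second_target.
    """
    target_migratable = [
        c for c in clusters if c.migratable and c.noise_output > first_target
    ]
    target_non_migratable = [
        c for c in clusters if not c.migratable and c.noise_output > second_target
    ]
    return target_migratable, target_non_migratable
```

The two returned lists would then feed the determination of the first and second noise feature point communication information in STEP5042.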

Outputting the noise feature point relationship information of the target big data acquisition log in the first AI training process based on the first noise feature point communication information and the second noise feature point communication information may be, for example, as follows. Here, the communication continuity of a piece of noise feature point communication information may indicate the number of continuous noise feature points having a communication relationship.

When the communication continuity of the first noise feature point communication information is not less than a preset first target communication continuity and the communication continuity of the second noise feature point communication information is not less than a preset second target communication continuity, the noise feature point relationship information of the target big data acquisition log in the first AI training process is output as first noise feature point relationship information (that is, including the first noise feature point communication information and the second noise feature point communication information).

When the communication continuity of the first noise feature point communication information is not less than the first target communication continuity and the communication continuity of the second noise feature point communication information is less than the second target communication continuity, the noise feature point relationship information of the target big data acquisition log in the first AI training process is output as second noise feature point relationship information (that is, including the first noise feature point communication information).

When the communication continuity of the first noise feature point communication information is less than the first target communication continuity and the communication continuity of the second noise feature point communication information is less than the second target communication continuity, the noise feature point relationship information of the target big data acquisition log in the first AI training process is output as third noise feature point relationship information (that is, including fuzzy noise feature point communication information other than the first noise feature point communication information and the second noise feature point communication information, where the fuzzy noise feature point communication information may be predicted noise feature point communication information that is possibly related).
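The three cases above amount to a small decision rule over two threshold comparisons. A minimal Python sketch follows; the enum and function names are hypothetical. The description does not cover the case in which the first continuity is below its target while the second is not, so the sketch returns `None` there rather than guessing.

```python
from enum import Enum
from typing import Optional


class RelationKind(Enum):
    FIRST = "first noise feature point relationship information"
    SECOND = "second noise feature point relationship information"
    THIRD = "third noise feature point relationship information"


def classify_relation(
    first_continuity: int,
    second_continuity: int,
    first_target: int,
    second_target: int,
) -> Optional[RelationKind]:
    """Map two communication continuities to a relationship kind."""
    first_ok = first_continuity >= first_target      # "not less than"
    second_ok = second_continuity >= second_target   # "not less than"
    if first_ok and second_ok:
        return RelationKind.FIRST
    if first_ok and not second_ok:
        return RelationKind.SECOND
    if not first_ok and not second_ok:
        return RelationKind.THIRD
    # first below target, second at or above target: not described in the text
    return None
```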

For some design ideas, if the noise feature point relationship information is the third noise feature point relationship information, N kinds of big data acquisition path sequences corresponding to the third noise feature point relationship information, and an acquisition member category chain corresponding to each kind of big data acquisition path sequence, may be obtained, where each kind of big data acquisition path sequence includes M different key big data acquisition paths, and N and M are positive integers. Then, for each big data acquisition path sequence, the current cyclic acquisition member category corresponding to the big data acquisition path sequence is determined in the acquisition member category chain corresponding to that sequence, and acquisition member category variables are extracted using the current cyclic acquisition member category, so that the acquisition member category variable of each key big data acquisition path in the big data acquisition path sequence is output. Next, according to the acquisition member category variables of the key big data acquisition paths in the N kinds of big data acquisition path sequences, extension reference is performed on the current cyclic acquisition member category corresponding to the big data acquisition path sequence, and a real-time extension reference acquisition member category corresponding to the big data acquisition path sequence is output and added to the acquisition member category chain corresponding to the big data acquisition path sequence.

The method then returns to the step of determining the current cyclic acquisition member category corresponding to the big data acquisition path sequence in the acquisition member category chain corresponding to the big data acquisition path sequence, and repeats until the global acquisition coverage rate corresponding to the N kinds of big data acquisition path sequences is greater than the set acquisition coverage rate; update information of the big data acquisition path interval corresponding to the N kinds of big data acquisition path sequences is then obtained according to the global acquisition coverage rate.
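The iteration described above (pick a current category from the chain, derive an extended category, append it, and stop once the global coverage exceeds the set value) can be sketched as a loop. The Python below is a simplified illustration under stated assumptions: `extend` and `coverage` are hypothetical callables standing in for the extension-reference step and the global acquisition coverage rate, the most recent chain entry is assumed to be the current category, and `max_rounds` is an added safety bound that is not in the source.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def refine_until_covered(
    chain: List[T],
    extend: Callable[[T], T],
    coverage: Callable[[List[T]], float],
    target_coverage: float,
    max_rounds: int = 100,
) -> List[T]:
    """Grow a category chain until coverage exceeds target_coverage."""
    if not chain:
        raise ValueError("chain must contain at least one category")
    rounds = 0
    # Stop only when coverage is strictly greater than the set value.
    while coverage(chain) <= target_coverage and rounds < max_rounds:
        current = chain[-1]            # assumed choice of current category
        chain.append(extend(current))  # add the extended category to the chain
        rounds += 1
    return chain
```

The selection of the current category is described in more detail in the source than this sketch models; here it is reduced to taking the last entry.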

Determining the current cyclic acquisition member category corresponding to the big data acquisition path sequence in the acquisition member category chain corresponding to the big data acquisition path sequence may be, for example, as follows. First, the associated cyclic acquisition member category corresponding to the big data acquisition path sequence, the current big data acquisition path coverage information corresponding to the big data acquisition path sequence, and the current big data acquisition path coverage information corresponding to the target big data acquisition path sequences are determined, where the target big data acquisition path sequences are all the big data acquisition path sequences in the N kinds of big data acquisition path sequences, including the big data acquisition path sequence. The current big data acquisition path coverage information corresponding to the big data acquisition path sequence is compared with the current big data acquisition path coverage information corresponding to the target big data acquisition path sequences, and a first coverage category distribution of the current big data acquisition path coverage information corresponding to the big data acquisition path sequence is output.
Then, the current big data acquisition path coverage information corresponding to the big data acquisition path sequence is compared with the associated cyclic acquisition member category corresponding to the big data acquisition path sequence, and a second coverage category distribution of the current big data acquisition path coverage information of the big data acquisition path sequence is output. Finally, according to the second coverage category distribution and the first coverage category distribution, either the associated cyclic acquisition member category corresponding to the big data acquisition path sequence or the current big data acquisition path coverage information corresponding to the big data acquisition path sequence is determined as the acquisition member category corresponding to the current time sequence node of the big data acquisition path sequence.

In a possible implementation manner, the method provided by this embodiment may further include the following steps.

Process400 acquires user behavior big data to be learned for user interest mining model training, which is collected after the big data acquisition process is updated, and schedules an initial user interest mining model and a collaborative training model for the user interest mining model training, wherein the initial user interest mining model comprises a plurality of first basic interest learning units for model parameter layer optimization and selection, and the collaborative training model comprises a plurality of second basic interest learning units for model parameter layer optimization and selection.

Process500 performs data classification on the user behavior big data to be learned, and extracts the user pre-behavior activity and the user triggered behavior activity corresponding to the user behavior big data to be learned.

The Process600 generates an interest learning data sequence based on the user pre-behavior activity, the user triggered behavior activity, and a user activity interest point corresponding to the user triggered behavior activity, and generates an interest difference learning data sequence based on the user triggered behavior activity and the user activity interest point.

The Process700 determines a first interest learning variable based on the first basic interest learning unit and according to the interest learning data in the interest learning data sequence.

The Process800 determines a second interest learning variable based on the corresponding second basic interest learning unit and according to the interest difference learning data in the interest difference learning data sequence.

The Process900 optimizes and selects the model parameter layer of the first basic interest learning unit based on the first interest learning variable and the second interest learning variable or the learning cost information of the user triggered behavior activity, and outputs the final user interest mining model which is optimized and selected by the model parameter layer.

In a possible implementation manner, before the above embodiment is implemented, the present embodiment may configure a learning coordination relationship between the first basic interest learning unit and the second basic interest learning unit, and then determine, in the interest learning data sequence, interest learning data corresponding to the first basic interest learning unit according to the learning coordination relationship, and determine, in the interest difference learning data sequence, interest difference learning data corresponding to the second basic interest learning unit.

In one possible implementation, the first basic interest learning unit includes a first explicit interest feature learning subunit, and the second basic interest learning unit includes a second explicit interest feature learning subunit; the interest learning data includes user pre-behavior activity learning data, and the interest difference learning data includes user triggered behavior activity learning data corresponding to the user pre-behavior activity learning data;

one implementation of Process700 may be: acquiring first explicit interest feature information based on the first explicit interest feature learning subunit and according to the user pre-behavior activity learning data;

one implementation of Process800 may be: acquiring second explicit interest feature information based on the second explicit interest feature learning subunit and according to the user triggered behavior activity learning data;

one implementation of Process900 may be: calculating a first training loss function value based on learning cost information of the first explicit interest feature information and the second explicit interest feature information; and adjusting and selecting the model parameter layer of the first explicit interest feature learning subunit based on the first training loss function value.
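As an illustration of how a training loss could be computed from the two pieces of explicit interest feature information, the following Python sketch uses a mean squared error. This is an assumption: the source only refers to "learning cost information" and does not name a loss function. The function name and the treatment of features as flat lists of floats are likewise hypothetical.

```python
from typing import List, Tuple


def mse_loss_and_grad(
    first_features: List[float],
    second_features: List[float],
) -> Tuple[float, List[float]]:
    """Mean squared error between two feature vectors.

    Returns the loss and its gradient with respect to first_features,
    which is the side whose parameters are adjusted in this sketch.
    """
    if len(first_features) != len(second_features) or not first_features:
        raise ValueError("feature vectors must be non-empty and equal in length")
    n = len(first_features)
    diffs = [a - b for a, b in zip(first_features, second_features)]
    loss = sum(d * d for d in diffs) / n
    grad = [2.0 * d / n for d in diffs]
    return loss, grad
```

In a real training setup the gradient would be propagated into the model parameter layer of the first explicit interest feature learning subunit by an optimizer; that part is omitted here.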

For example, in one possible implementation, the first basic interest learning unit includes an implicit interest feature learning subunit, and the second basic interest learning unit includes a first implicit interest point explicit conversion subunit; the interest learning data includes user pre-behavior activity learning data, and the interest difference learning data includes user triggered behavior activity learning data corresponding to the user pre-behavior activity learning data.

One implementation of Process700 may be: obtaining implicit interest feature information based on the implicit interest feature learning subunit and according to the user preposed behavior activity learning data;

one implementation of Process800 may be: acquiring first implicit interest point explicit conversion characteristic information based on the first implicit interest point explicit conversion subunit and according to the user triggered behavior activity learning data;

one implementation of Process900 may be: acquiring explicit conversion characteristic information of a second implicit interest point based on the implicit interest characteristic information; calculating a second training loss function value based on learning cost information of the first implicit interest point explicit conversion characteristic information and the second implicit interest point explicit conversion characteristic information; and adjusting and selecting the model parameter layer of the implicit interest feature learning subunit based on the second training loss function value.

For example, in one possible implementation, the second basic interest learning unit further includes a second implicit interest point explicit conversion subunit, and the interest difference learning data includes sample user activity interest points corresponding to the user triggered behavior activity learning data.

Acquiring second implicit interest point explicit conversion feature information based on the implicit interest feature information comprises the following steps: obtaining an explicit interest point feature based on the second implicit interest point explicit conversion subunit and according to the example user activity interest point; and performing explicit conversion on the explicit interest point characteristics and the implicit interest characteristic information to generate second implicit interest point explicit conversion characteristic information.

For example, in one possible implementation, the first basic interest learning unit includes a first prediction subunit, and the second basic interest learning unit includes an intention probability distribution matrix output unit; the interest learning data includes user triggered behavior activity learning data and sample user activity interest points, and the interest difference learning data includes user triggered behavior activity learning data;

one implementation of Process700 may be: acquiring third implicit interest point explicit conversion feature information based on the user triggered behavior activity learning data; obtaining a first intention probability distribution matrix based on the first prediction subunit and according to the third implicit interest point explicit conversion feature information and the example user activity interest points;

one implementation of Process800 may be: obtaining a second intention probability distribution matrix based on the intention probability distribution matrix output unit and according to the user triggered behavior activity learning data;

one implementation of Process900 may be: calculating a third training loss function value based on learning cost information of the first intention probability distribution matrix and the second intention probability distribution matrix; and adjusting and selecting the model parameter layer of the first prediction subunit based on the third training loss function value.
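One common way to compare two probability distribution matrices is a row-wise Kullback-Leibler divergence. The sketch below assumes that choice; the source does not state which measure is used for the third training loss function value. The function name, the row-as-distribution layout, and the `eps` smoothing term are hypothetical.

```python
import math
from typing import List


def kl_divergence_rows(
    reference_rows: List[List[float]],
    predicted_rows: List[List[float]],
    eps: float = 1e-12,
) -> float:
    """Mean over rows of KL(reference_row || predicted_row).

    Each row is assumed to be a probability distribution. Here the second
    intention probability distribution matrix plays the reference role and
    the first (from the first prediction subunit) is the prediction.
    """
    if len(reference_rows) != len(predicted_rows) or not reference_rows:
        raise ValueError("matrices must be non-empty with the same row count")
    total = 0.0
    for ref, pred in zip(reference_rows, predicted_rows):
        if len(ref) != len(pred):
            raise ValueError("rows must have the same length")
        # Terms with zero reference probability contribute nothing.
        total += sum(
            p * math.log((p + eps) / (q + eps))
            for p, q in zip(ref, pred)
            if p > 0.0
        )
    return total / len(reference_rows)
```

The value is zero when the two matrices match and grows as the predicted rows diverge from the reference rows, so it can serve as a quantity to minimize when adjusting the first prediction subunit.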

For example, in one possible implementation, the second basic interest learning unit further includes a third implicit interest point explicit conversion subunit. Acquiring the third implicit interest point explicit conversion feature information based on the user triggered behavior activity learning data specifically comprises: acquiring the third implicit interest point explicit conversion feature information based on the third implicit interest point explicit conversion subunit and according to the user triggered behavior activity learning data.

For example, in one possible implementation, the first basic interest learning unit includes a second prediction subunit, and the interest learning data includes the user triggered behavior activity learning data and example user activity interest points.

One implementation of Process700 may be: acquiring a third intention probability distribution matrix corresponding to the user triggered behavior activity learning data; obtaining a fuzzy interest thermodynamic diagram based on the second prediction subunit and according to the third intention probability distribution matrix and the example user activity interest points;

one implementation of Process900 may be: calculating a fourth training loss function value based on the fuzzy interest thermodynamic diagram and learning cost information of the user triggered behavioral activity learning data; and adjusting and selecting the model parameter layer of the second prediction subunit based on the fourth training loss function value.

For instance, in one possible implementation, obtaining a third intention probability distribution matrix corresponding to the user triggered behavior activity learning data includes: acquiring third implicit interest point explicit conversion feature information based on the third implicit interest point explicit conversion subunit and according to the user triggered behavior activity learning data; and obtaining the third intention probability distribution matrix based on the first prediction subunit and according to the third implicit interest point explicit conversion feature information.

Fig. 2 illustrates a hardware structural diagram of the internet system 100 for implementing the user demand decision system applying AI and big data analysis as described above, according to an embodiment of the present invention. As shown in Fig. 2, the internet system 100 may include a processor 110, a machine-readable storage medium 120, a bus 130, and a communication unit 140.

The processor 110 may perform various suitable actions and processes based on a program stored in the machine-readable storage medium 120, such as program instructions associated with the user demand decision method applying AI and big data analysis described in the foregoing embodiments. The processor 110, the machine-readable storage medium 120, and the communication unit 140 perform signal transmission through the bus 130.

In particular, the processes described in the above exemplary flow diagrams may be implemented as computer software programs, according to embodiments of the present invention. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication unit 140, and when executed by the processor 110, performs the above-described functions defined in the methods of the embodiments of the present invention.

Yet another embodiment of the present invention further provides a computer-readable storage medium, in which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the computer-readable storage medium is used for implementing the user demand decision method applying AI and big data analysis according to any of the above embodiments.

The computer readable medium of the present invention may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the methods shown in the above embodiments.

Yet another embodiment of the present invention further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the user demand decision method for applying AI and big data analysis according to any of the above embodiments.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A big data acquisition optimization method applying artificial intelligence analysis is applied to a big data system, and comprises the following steps:

2. The big data collection optimization method applying artificial intelligence analysis according to claim 1, wherein the step of extracting a first noise feature point map generated by mining noise feature points of a target big data collection log of a big data collection process from a big data collection service database, and acquiring a target noise feature point sequence associated with a service update activity corresponding to a preset service on-line period and a noise feature point relationship of the target noise feature point sequence corresponding to the service update activity from the first noise feature point map when the first noise feature point map needs to be adjusted after a big data collection process is updated based on the first noise feature point map, comprises:

3. The big data collecting and optimizing method using artificial intelligence analysis according to claim 2, wherein before the step of outputting the noise mining effect indicator between the big data collecting process updating information and the big data collecting and updating deployment data according to the estimated big data collecting conclusion indicator and the actual big data collecting conclusion indicator, the method further comprises:

according to the first big data acquisition variable distribution of the key point estimation big data acquisition event and the second big data acquisition variable distribution of the key point real big data acquisition event, outputting the first mining connectivity of the big data acquisition process updating information and the big data acquisition updating deployment data;

4. The big data collection optimization method applying artificial intelligence analysis according to claim 2, wherein before the step of outputting the noise mining effect indicator between the big data collection procedure update information and the big data collection update deployment data according to the estimated big data collection conclusion indicator and the actual big data collection conclusion indicator, the big data collection update deployment data further carries first collection knowledge point deployment data related to the big data collection procedure update information and second collection knowledge point deployment data related to the big data collection update deployment data, the method further comprises:

5. The big data collection optimization method applying artificial intelligence analysis according to claim 2, wherein before the step of obtaining the estimated big data collection conclusion index of the big data collection procedure update information and the actual big data collection conclusion index of the big data collection update deployment data, the method further comprises:

6. The big data collection optimization method applying artificial intelligence analysis according to any one of claims 1 to 5, wherein the extracting a first noise feature point map generated by performing noise feature point mining on a target big data collection log of the big data collection process from a big data collection service database, and configuring target big data collection process update information for the big data collection process according to the first noise feature point map specifically comprises:

7. The big data collection optimization method applying artificial intelligence analysis according to any one of claims 1-6, wherein the method further comprises:

8. The method for optimizing big data collection by applying artificial intelligence analysis according to claim 7, further comprising:

9. The big data collection optimization method applying artificial intelligence analysis according to claim 8, wherein the first basic interest learning unit comprises: a first explicit interest feature learning subunit, the second base interest learning unit comprising: a second explicit interest feature learning subunit; the interest learning data includes: the user preposition behavior activity learning data, wherein the interest difference learning data comprises: user triggered behavioral activity learning data corresponding to the user pre-behavioral activity learning data;

10. A big data system, comprising a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to execute the big data collection optimization method applying artificial intelligence analysis of any of claims 1-9 when running the computer program.