CN115563654A

CN115563654A - Digital marketing big data processing method

Info

Publication number: CN115563654A
Application number: CN202211469771.6A
Authority: CN
Inventors: 孙晓琛; 葛强
Original assignee: Shandong Zhidou Digital Technology Co ltd
Current assignee: Shandong Zhidou Digital Technology Co ltd
Priority date: 2022-11-23
Filing date: 2022-11-23
Publication date: 2023-01-03
Anticipated expiration: 2042-11-23
Also published as: CN115563654B

Abstract

The invention relates to the technical field of big data processing and provides a digital marketing big data processing method comprising the following steps: acquiring digital marketing big data and establishing a database; performing preliminary feature cleaning on the digital marketing big data in the database; extracting the features of all the digital marketing big data, obtaining positive connection parameters and negative connection parameters from how the features are distributed across the entries of the database, obtaining the profitability of each feature from the density with which it appears in the database, obtaining the connectivity between features from the positive and negative connection parameters, and quantifying the sensitivity of each feature from its connectivity and profitability; using the feature sensitivities to obtain the sensitivity of each entry in the database and thereby the entries that correspond to sensitive data; and performing security processing on the sensitive data so obtained. The invention aims to solve the problem that encrypting digital marketing big data takes too long because of the huge data volume.

Description

Digital marketing big data processing method

Technical Field

The application relates to the field of big data processing, in particular to a digital marketing big data processing method.

Background

With the development of science and technology and the arrival of the digital era, traditional marketing modes, such as the promotion of offline physical stores, no longer dominate the sale of commodities because of their small coverage, while digital marketing has become more popular because of its precision and wide coverage. In the course of digital marketing, big data is generated for the commodities of each enterprise. This data is very important for updating and promoting the enterprise's subsequent products, so its security is an important concern for the enterprise, and the digital marketing big data needs to be given corresponding security processing.

Disclosure of Invention

The invention provides a digital marketing big data processing method that aims to solve the problem that encrypting digital marketing big data with existing algorithms takes too long because of the huge data volume. It adopts the following technical scheme:

One embodiment of the invention provides a digital marketing big data processing method comprising the following steps:

constructing a database of the digital marketing big data, and performing feature cleaning on all entries of the digital marketing big data in the database;

acquiring the features of all entries; obtaining the feature relevance of each feature in each entry according to the positional relationship between different features in the same entry; taking the mean, over all entries, of the feature relevance of each feature as the positive connection parameter of that feature; obtaining the negative connection parameter of each feature according to the overall occurrence counts of the feature and of the features that never appear in the same entry with it, together with their occurrence frequencies within given entry ranges; and obtaining the connectivity of each feature according to its positive connection parameter and negative connection parameter;

obtaining the profitability of each feature according to its inter-entry density across different entries and its intra-entry density within the same entry, and obtaining the sensitivity of each feature according to its connectivity and profitability;

and, using the sensitivities of the features in the digital marketing big data, taking the sum of the sensitivities of all features in the same entry as the sensitivity of that entry, identifying the sensitive data contained in the entries according to the entry sensitivities, and performing security processing on the sensitive data.
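The last step can be illustrated with a minimal Python sketch. It assumes that feature sensitivities have already been computed and that an entry counts as sensitive when its summed sensitivity exceeds a threshold; the function names, the sample values and the threshold of 0.5 are hypothetical and are not taken from the patent.

```python
from typing import Dict, List


def entry_sensitivity(entry_features: List[str],
                      feature_sensitivity: Dict[str, float]) -> float:
    """Sum the sensitivities of all features listed for one entry."""
    return sum(feature_sensitivity.get(f, 0.0) for f in entry_features)


def select_sensitive_entries(entries: List[List[str]],
                             feature_sensitivity: Dict[str, float],
                             threshold: float) -> List[int]:
    """Return the indices of entries whose summed sensitivity exceeds the threshold.

    The comparison against a threshold is an assumption made for illustration.
    """
    return [
        i for i, feats in enumerate(entries)
        if entry_sensitivity(feats, feature_sensitivity) > threshold
    ]


# Hypothetical feature sensitivities and entries (each entry is a list of features).
feature_sensitivity = {"price": 0.5, "discount": 0.25, "color": 0.125}
entries = [["price", "discount"], ["color"], ["price", "color", "discount"]]

sensitive = select_sensitive_entries(entries, feature_sensitivity, threshold=0.5)
print(sensitive)
```

Only the entries selected in this way would then be passed to the security processing step, which is what keeps the volume of data to be processed small.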

Optionally, the step of constructing the database of the digital marketing big data is as follows:

acquiring the digital marketing big data, classifying it by source and establishing a database for each source, and structuring the digital marketing big data of the same source in the database as table entries ordered by the time at which the data was obtained, so as to obtain the preprocessed digital marketing big data.

Optionally, the step of performing feature cleaning includes:

obtaining the characters that repeat across the entries corresponding to all digital marketing big data in the database, and cleaning out the characters corresponding to the small number of features that do not repeat, thereby reducing the workload of subsequent feature extraction and feature sensitivity calculation.

Optionally, the method for acquiring the features of all the entries includes:

taking the text data of each entry as the input of a named entity recognition technique, and taking the entities it outputs as the features of the digital marketing big data.

Optionally, the method for obtaining the feature relevance of each feature in each entry includes:

wherein the quantities involved are: the feature relevance of the j-th feature in the i-th entry; the total number of all features in the i-th entry; and the feature association parameter between the j-th feature and the k-th feature in the i-th entry, which is obtained from the positional relationship of the two features appearing in the same entry.

Optionally, the method for obtaining the positive connection parameter of each feature includes:

wherein the quantities involved are: the positive connection parameter of the j-th feature; the number of structured entries of the digital marketing big data in the database; the number of times the j-th feature occurs in the i-th entry; the total number of occurrences in the i-th entry of all features other than the j-th feature; and the feature relevance of the j-th feature in the i-th entry.

Optionally, the method for obtaining the negative connection parameter of each feature includes:

wherein the quantities involved are: the negative connection parameter of the j-th feature; the k-th of the features that never appear in the same entry as the j-th feature, and the total number of such features; the total number of times the j-th feature appears in the database; the total number of times the k-th such feature appears in the database; the frequency of occurrence of the j-th feature within the r-th entry range; the frequency of occurrence of the k-th such feature within the r-th entry range; and the total number of entry ranges, where an entry range is a range formed by a certain number of entries.

Optionally, the method for obtaining the connectivity of each feature includes:

wherein the quantities involved are: the connectivity of the j-th feature; the normalized positive connection parameter of the j-th feature; and the normalized negative connection parameter of the j-th feature.

Optionally, the method for obtaining the profitability of each feature includes:

wherein the quantities involved are: the inter-entry density of the j-th feature; the distance between the two entries in which the j-th feature is located at its t-th adjacent occurrence; and the maximum number of adjacent occurrences. The intra-entry density of the j-th feature is calculated as follows:

wherein the quantities involved are: the intra-entry density of the j-th feature; the number of occurrences of the j-th feature in the i-th entry; the positions of two consecutive occurrences of the j-th feature in the i-th entry; and the length of the i-th entry. The profitability of the feature is obtained from the product of the inter-entry density, the intra-entry density and the total occurrence frequency of the feature.

The beneficial effects of the invention are as follows: by extracting features from the digital marketing big data and using them to quantify the sensitivity of the data, a large amount of computation for screening sensitive data is saved; by calculating sensitivity from positive and negative connectivity and feature profitability, the sensitive data in the digital marketing big data is screened more accurately, so that the subsequent security processing handles far less data and takes less time.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a schematic flow chart of a digital marketing big data processing method according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to Fig. 1, a flowchart of a digital marketing big data processing method according to an embodiment of the present invention is shown. The method includes the following steps:

Step S001: acquire the digital marketing big data and establish a database.

Compared with structured data in a database, digital marketing big data is scattered and irregular in structure, which is very inconvenient for subsequent feature extraction and feature sensitivity calculation. Specifically, feature extraction would have to search through irregular data (in particular, irregular data structures), which greatly increases the amount of computation. In addition, data from different sources is only weakly connected; if features were identified and feature sensitivity calculated across sources, too many features would be extracted, making the sensitivity calculation inaccurate and causing the curse of dimensionality. Therefore, a database based on data source needs to be established for the digital marketing big data, and the data in each database then needs to be structured.

First, the digital marketing big data is acquired. It is recorded by the enterprise as it is collected and is then classified by data source, with data from the same source forming one class.

A database is established for the digital marketing big data of each source. Preferably, the database is established with existing technology such as HBase, which is well known and is not described in detail here.

The digital marketing big data from the same source in each database is structured as table entries ordered by the time at which the data was obtained, yielding a set of entries. The total number of entries may differ from database to database; for convenience of description, a single symbol is used for it throughout.

The preprocessed digital marketing big data is thus obtained by acquiring the digital marketing big data, classifying it and establishing databases by source, and structuring it accordingly.
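The preprocessing of step S001 can be sketched in a few lines of Python. The sketch assumes that each raw record carries a source label, an acquisition time and a text body; the field names `source`, `time` and `text`, and the use of in-memory dictionaries in place of a real database such as HBase, are assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import datetime


def build_databases(records):
    """Group raw records by source, then order each group by acquisition time.

    Returns a dict mapping each source to its list of entry texts, oldest first.
    """
    by_source = defaultdict(list)
    for rec in records:
        by_source[rec["source"]].append(rec)
    return {
        source: [r["text"] for r in sorted(recs, key=lambda r: r["time"])]
        for source, recs in by_source.items()
    }


# Hypothetical raw records from two sources.
records = [
    {"source": "web_shop", "time": datetime(2022, 11, 2), "text": "red shoes discount"},
    {"source": "app", "time": datetime(2022, 11, 1), "text": "blue hat"},
    {"source": "web_shop", "time": datetime(2022, 11, 1), "text": "red shoes price"},
]

dbs = build_databases(records)
print(dbs["web_shop"])
```

Keeping one time-ordered entry list per source mirrors the reasoning above: features are later extracted and scored within a single source, so weakly related data from other sources does not inflate the feature set.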

The entries of the structured digital marketing big data in each database differ in how sensitive they are with respect to the commodities. This shows up in two ways: the connectivity between the different features extracted from the entries, and the revenue each feature contributes to marketing. Features with stronger connectivity describe the commodity more precisely, while features with weaker connectivity describe it more vaguely; and the greater the influence of a feature on marketing revenue, the more important it is among all the features of the commodity.

Step S002: determine the features of the digital marketing big data in the database, perform preliminary cleaning, and obtain the features of all the digital marketing big data.

When the entries in the database are analyzed, whole entries may be too long and may contain noise in the form of other, invalid information. Therefore, preliminary feature cleaning is performed on all entries corresponding to the digital marketing big data in the database; the features in the entries are then extracted from the cleaned data with a named entity recognition technique, and these features are used as labels of the entries when calculating the sensitivity of the data.

Preliminary feature cleaning is performed on the entries corresponding to all digital marketing big data in the database. Specifically, the characters that repeat across these entries are obtained. Because features are the important words that describe an entry, the characters corresponding to most features are repeated. A small number of features correspond to characters that do not repeat, but such features are of little importance in big data, which is concerned with the overall trend rather than with a handful of data points. Cleaning out these features reduces the workload of subsequent feature extraction and feature sensitivity calculation and eliminates the few features that are irrelevant to the overall trend.
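A minimal sketch of this cleaning step is given below. It assumes that entries are whitespace-separated text and that a token counts as "repeated" when it occurs at least twice across all entries; the patent speaks of characters and does not state a tokenization or a count threshold, so both the tokenization and the `min_count` parameter are assumptions.

```python
from collections import Counter
from typing import List


def clean_entries(entries: List[str], min_count: int = 2) -> List[str]:
    """Drop tokens that occur fewer than min_count times across all entries."""
    counts = Counter(tok for entry in entries for tok in entry.split())
    return [
        " ".join(tok for tok in entry.split() if counts[tok] >= min_count)
        for entry in entries
    ]


# Hypothetical entries; "today", "price" and "blue" each occur only once.
entries = ["red shoes discount today", "red shoes price", "blue shoes discount"]
cleaned = clean_entries(entries)
print(cleaned)
```

Tokens that never repeat are removed before feature extraction, so the later named entity recognition and sensitivity calculations operate on shorter entries.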

Further, features are extracted from the cleaned data with a named entity recognition technique. Specifically, the input is the text of an entry, and each entity output by the named entity recognition technique is a feature of the digital marketing big data, appearing in the entry in the form of a word.

Specifically, applying this method to all entries in the database after the digital marketing big data has been structured yields the set of all features, in which subscripts distinguish the features: the j-th feature is the j-th of the features of all the digital marketing big data in the database, where j runs from 1 to the maximum number of features extracted from the digital marketing big data in the current database after preliminary feature cleaning. This maximum may differ from database to database; for convenience of description, a single symbol is used for it throughout.

Step S003: quantify the sensitivity of the features obtained from all the digital marketing big data.

Sensitivity is a parameter that quantifies how important an extracted feature is in the digital marketing big data, or equivalently whether security processing is necessary for it. It is calculated from the connectivity between the features in the digital marketing big data and from the profitability of each feature, that is, its contribution to marketing. The more strongly a feature is connected with the remaining features, the more important it is in the digital marketing process, since digital marketing proceeds through the cooperation of most features and thereby generates the big data of those features. Likewise, the more frequently and the more uniformly a feature appears, the greater the revenue attributable to it in marketing. The more sensitive the feature, the more sensitive the digital marketing big data corresponding to it, and the stronger the need for security processing.

It should be noted that the connectivity between features includes positive connection and negative connection. Positive connection means that the appearance of one feature is usually accompanied by the appearance of the remaining features; negative connection means that when one feature appears, most other features usually do not. These two properties are used to quantify the connectivity between features.

Further, when the features of a commodity are described, the stronger the connectivity between two features, the smaller the Euclidean distance between them within an entry should be, that is, one feature reinforces the other; the weaker the connectivity between two features, the greater the Euclidean distance between them in the same entry, that is, one feature supplements the other. Therefore, for each entry containing the j-th feature, the Euclidean distances between features are calculated in order to determine the positive connection between the j-th feature and the remaining features.

Specifically, taking the j-th feature as an example, its positive connection parameter is quantified as follows:

wherein the quantities involved are: the positive connection parameter of the j-th feature; the number of times the j-th feature occurs in the i-th entry; the total number of occurrences in the i-th entry of all features other than the j-th feature; and the feature relevance of the j-th feature in the i-th entry.

The feature relevance of the j-th feature in the i-th entry is calculated as follows:

wherein the quantities involved are: the feature relevance of the j-th feature in the i-th entry; the total number of all features in the i-th entry; and the feature association parameter between the j-th feature and the k-th feature in the i-th entry.

The feature association parameter between the j-th feature and the k-th feature in the i-th entry is calculated as follows:

wherein the quantities involved are: the number of occurrences of the j-th feature in the i-th entry; the position of the t-th occurrence of the j-th feature in the i-th entry; and the position of the k-th feature that is nearest to the j-th feature at its t-th occurrence in the entry. It should be noted that the number of occurrences of the k-th feature in the entry is not necessarily the same as that of the j-th feature, and that the nearest k-th feature positions corresponding to different occurrences of the j-th feature may coincide.

To explain: the words contained in the i-th entry form a word sequence from left to right. The j-th feature is itself a word of the entry and may appear several times in the word sequence; the position of that word in the sequence is the position at which the j-th feature occurs in the i-th entry. The positions at which the k-th feature occurs in the i-th entry are obtained in the same way. The larger the feature association parameter, the closer the j-th feature and the k-th feature appear to each other in the same entry and the stronger the connection between the two; the smaller it is, the farther apart they appear and the weaker their connection. The larger the feature relevance, the closer the j-th feature appears to all other features in the same entry, that is, the more features lie near it and the stronger its relevance to the other features in the entry; conversely, the farther it appears from all other features, the fewer features lie near it and the weaker that relevance. The larger the positive connection parameter, the stronger the relationship between the j-th feature and all other features across all entries, the more important and sensitive the feature is in the database, and the better its changes reflect the overall trend of the digital marketing big data.

The overall relevance of a feature to the other features within the same entry is regarded as its positive connection: the stronger this relevance, the higher the sensitivity of the feature and the more important it is in the database.
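The positional reasoning above can be sketched in Python. The nearest-position search follows the description: for every occurrence of one feature in an entry, the closest occurrence of the other feature is located. The mapping from those distances to an association value is not recoverable from this text, so `association` uses the reciprocal of the mean nearest distance purely as an assumed stand-in that grows as the features move closer together. All function names are hypothetical.

```python
from typing import List


def positions(tokens: List[str], feature: str) -> List[int]:
    """Positions of a feature in the left-to-right word sequence of an entry."""
    return [idx for idx, tok in enumerate(tokens) if tok == feature]


def nearest_distances(tokens: List[str], feat_a: str, feat_b: str) -> List[int]:
    """For each occurrence of feat_a, the distance to the nearest feat_b."""
    pos_a = positions(tokens, feat_a)
    pos_b = positions(tokens, feat_b)
    if not pos_a or not pos_b:
        return []
    return [min(abs(p - q) for q in pos_b) for p in pos_a]


def association(tokens: List[str], feat_a: str, feat_b: str) -> float:
    """ASSUMED mapping: reciprocal of the mean nearest distance (not the patent's formula)."""
    if feat_a == feat_b:
        raise ValueError("features must differ")
    dists = nearest_distances(tokens, feat_a, feat_b)
    if not dists:
        return 0.0  # the two features do not co-occur in this entry
    return len(dists) / sum(dists)


# Hypothetical entry: "red" occurs twice, the other words once.
tokens = "red shoes big discount red hat".split()
print(nearest_distances(tokens, "red", "shoes"))
print(association(tokens, "shoes", "red"), association(tokens, "shoes", "hat"))
```

Note that different occurrences of one feature may share the same nearest occurrence of the other, as the description points out; here both occurrences of "red" are matched to the single occurrence of "shoes".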

Further, the negative connection of the j-th feature is quantified as follows. The remaining features and the j-th feature all describe the same digital marketing activity, that is, all features belong to the digital marketing process. Some of them, however, are contrary to one another: the appearance of the j-th feature goes against the appearance of certain other features. If the j-th feature occurs many times and those features also occur many times, yet despite these large counts they never occur in the same entry, the negative connection between them is large. Therefore, an overall negative connection is calculated from the total number of occurrences of the j-th feature and of the remaining contrary features, a local negative connection is calculated from the product of their occurrence frequencies within entry ranges, and the overall and local terms are multiplied to represent the negative connection of the j-th feature with the remaining features.

Specifically, taking the j-th feature as an example, its negative connection parameter is quantified as follows:

wherein the quantities involved are: the negative connection parameter of the j-th feature; the k-th of the features that never appear in the same entry as the j-th feature, and the total number of such features; the total number of times the j-th feature appears in the database; the total number of times the k-th such feature appears in the database; the frequency of occurrence of the j-th feature within the r-th entry range; the frequency of occurrence of the k-th such feature within the r-th entry range; and the total number of entry ranges, where an entry range is a range formed by a certain number of entries.

Preferably, the entry range is given an empirical value of 100 entries. Specifically, the entries are divided into groups of 100, and each resulting group is one entry range.

The larger the ratio of the number of occurrences of the j-th feature to the total number of occurrences of a feature that never appears in the same entry with it, the larger the feature counts involved while the two still never appear together, that is, the stronger the negative connection between the two features. Within a given entry range, the ratio of the occurrence frequency of the j-th feature to the occurrence frequency of a feature that never appears in the same entry with it indicates the strength of the negative connection in the same way. The larger the negative connection parameter, the more often the feature and the features unrelated to it occur without ever appearing in the same entry; this indicates that the feature has many unrelated features, so its overall importance in the database is reduced and its sensitivity is reduced as well.
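The ingredients of the negative connection parameter can be gathered with a short Python sketch. It computes which features never share an entry with a given feature, the total occurrence counts, and the per-range occurrence counts. How these are combined is not recoverable from this text, so the sketch stops at the ingredients. Treating "frequency" as a raw count per range, and the function names, are assumptions.

```python
from collections import Counter
from typing import List


def never_cooccurring(entries: List[List[str]], feature: str) -> List[str]:
    """Features present in the database that never share an entry with `feature`."""
    all_feats = {f for e in entries for f in e}
    together = set()
    for e in entries:
        if feature in e:
            together.update(e)
    return sorted(all_feats - together - {feature})


def total_counts(entries: List[List[str]]) -> Counter:
    """Total number of times each feature appears in the database."""
    return Counter(f for e in entries for f in e)


def range_frequencies(entries: List[List[str]], feature: str,
                      range_size: int = 100) -> List[int]:
    """Occurrences of `feature` in each consecutive group of range_size entries."""
    return [
        sum(e.count(feature) for e in entries[start:start + range_size])
        for start in range(0, len(entries), range_size)
    ]


# Hypothetical database; each entry is the list of features extracted from it.
entries = [["price", "discount"], ["color"], ["price"], ["color", "size"]]
print(never_cooccurring(entries, "price"))
print(range_frequencies(entries, "discount", range_size=2))
```

The default `range_size` of 100 follows the empirical value given above; the example uses a range of 2 only so that the toy database spans more than one range.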

Further, the positive connection parameter and the negative connection parameter of every feature are calculated by the above method and then normalized in order to calculate the connectivity.

Specifically, taking the connectivity of the j-th feature as an example, the calculation method is as follows:

wherein the quantities involved are: the connectivity of the j-th feature; the normalized positive connection parameter of the j-th feature; and the normalized negative connection parameter of the j-th feature.

Calculating in this way for every feature gives the connectivity of all features. Positive connection raises the sensitivity of a feature and negative connection lowers it, so the overall connectivity of a feature is obtained by subtracting its negative connection from its positive connection. The larger the positive connection and the smaller the negative connection, that is, the more related features and the fewer unrelated features a feature has, the greater its connectivity and the greater its importance and sensitivity in the database. Conversely, if the positive connection is small and the negative connection is large, unrelated features far outnumber related ones, so the importance of the feature in the database is greatly reduced and its sensitivity is low.
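The subtraction described above can be sketched as follows. The text says that the positive and negative connection parameters are normalized before the subtraction but does not name the normalization; min-max scaling is assumed here, and the function names and sample values are hypothetical.

```python
from typing import Dict


def min_max(values: Dict[str, float]) -> Dict[str, float]:
    """Scale values to [0, 1]; ASSUMED normalization, not specified in the text."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


def connectivity(positive: Dict[str, float],
                 negative: Dict[str, float]) -> Dict[str, float]:
    """Normalized positive connection minus normalized negative connection."""
    pos_n, neg_n = min_max(positive), min_max(negative)
    return {k: pos_n[k] - neg_n[k] for k in positive}


# Hypothetical connection parameters for three features.
positive = {"price": 4.0, "color": 2.0, "size": 0.0}
negative = {"price": 0.0, "color": 1.0, "size": 2.0}

conn = connectivity(positive, negative)
print(conn)
```

With min-max scaling the result lies between -1 and 1: a feature with the strongest positive connection and the weakest negative connection scores 1, and the reverse case scores -1.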

Further, the profitability of the features is calculated. The profitability of a feature refers to the revenue attributable to that feature when the digital marketing big data is used for marketing. The underlying logic is that the more times a feature appears across all entries, the greater the revenue it brings when the digital marketing big data is used for marketing.

Further, the profitability of the j-th feature is calculated from the density and the number of its occurrences in the whole database. Because data is entered into the database in time order, the denser and more uniform the appearances of the j-th feature, the more the digital marketing big data in the database owes to that feature in marketing, that is, the greater the corresponding revenue. The feature profitability is composed of two parts: density and overall occurrence frequency. For density, the distance between the different entries in which the j-th feature appears is calculated first; the larger the resulting density value, the more often the j-th feature occurs. This is the inter-entry density. It is then multiplied by the intra-entry density, which accounts for repeated occurrences within a single entry: the more times the j-th feature occurs within an entry, the more important it is in that entry. Finally, the product with the overall number of occurrences of the j-th feature is taken as the feature profitability of the j-th feature.

Specifically, taking the j-th feature as an example, its feature profitability is calculated as follows:

wherein the quantities involved are: the inter-entry density of the j-th feature; the intra-entry density of the j-th feature; the total number of occurrences of the j-th feature; and the total number of all entries. The inter-entry density of the j-th feature is calculated as follows:

wherein the quantities involved are: the distance between the two entries in which the j-th feature is located at its t-th adjacent occurrence; and the maximum number of adjacent occurrences. The t-th adjacent occurrence and its distance are to be understood as follows. Suppose the j-th feature appears first in one entry, next in a second entry, and then in a third. The first and second appearances form the first adjacent occurrence, whose distance is the distance between the first and second of these entries; the second and third appearances form the second adjacent occurrence, whose distance is the distance between the second and third of these entries.

The intra-entry density of the j-th feature is calculated as follows:

wherein the quantities involved are: the number of occurrences of the j-th feature in the i-th entry; the positions of two consecutive occurrences of the j-th feature in the i-th entry; and the length of the i-th entry.

Further, taking the feature profitability of the j-th feature as an example, the profitability is obtained after normalization.

The feature profitability so obtained includes the inter-entry density and the intra-entry density. The inter-entry density is obtained from the mean of the distances between entries in which the feature appears; the smaller this mean distance between entries containing the same feature, the more entries contain the feature and the more uniformly they are distributed, and the greater the feature profitability. The intra-entry density is obtained from the ratio of the sum of the distances between consecutive occurrences of the same feature in an entry to the total length of the entry; the larger this ratio, the more sparsely the feature is spread within the entry, so that a small number of occurrences contributes more, and the greater the feature profitability.
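The two quantities described in this paragraph can be computed with a short Python sketch: the mean distance between adjacent entries that contain a feature, and the ratio of the summed gaps between consecutive occurrences within an entry to the entry length. How the densities and the final profitability are derived from these quantities is not recoverable from this text, so the sketch stops at the two intermediate values. The function names are hypothetical.

```python
from typing import List, Optional


def mean_adjacent_entry_distance(entry_indices: List[int]) -> Optional[float]:
    """Mean distance between consecutive entries that contain the feature.

    Returns None when the feature appears in fewer than two entries.
    """
    idx = sorted(entry_indices)
    gaps = [b - a for a, b in zip(idx, idx[1:])]
    return sum(gaps) / len(gaps) if gaps else None


def intra_entry_ratio(occurrence_positions: List[int], entry_length: int) -> float:
    """Sum of gaps between consecutive occurrences, divided by the entry length."""
    pos = sorted(occurrence_positions)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    return sum(gaps) / entry_length if entry_length else 0.0


# Hypothetical feature: appears in entries 0, 2 and 6 of the database, and at
# word positions 1, 3 and 7 inside one entry that is 8 words long.
print(mean_adjacent_entry_distance([0, 2, 6]))
print(intra_entry_ratio([1, 3, 7], entry_length=8))
```

In the example the adjacent-occurrence distances are 2 and 4, giving a mean of 3, and the within-entry gaps sum to 6 over a length of 8.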

The feature profit of every feature is calculated by this method, and the profitability of all features is obtained after normalization.
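The two density quantities described above can be illustrated with a minimal Python sketch. It computes only the raw quantities named in the text (the mean distance between adjacent entries containing a feature, and the ratio of summed in-entry gaps to entry length); how the method turns these into the final feature profit is not reproduced here. The function names, the use of integer entry indices and character positions, and the min-max form of the normalization are all assumptions made for illustration.

```python
from statistics import mean


def inter_entry_mean_distance(entry_indices):
    """Mean distance between adjacent entries that contain the feature.

    entry_indices: sorted indices of the entries in which the feature occurs
    (hypothetical representation). Returns None when the feature occurs in
    fewer than two entries, because no adjacent pair exists.
    """
    gaps = [b - a for a, b in zip(entry_indices, entry_indices[1:])]
    return mean(gaps) if gaps else None


def intra_entry_gap_ratio(positions, entry_length):
    """Sum of distances between consecutive occurrences of the feature in one
    entry, divided by the total length of that entry."""
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(gaps) / entry_length


def min_max_normalize(values):
    """Assumed normalization: rescale values to [0, 1]. The source only says
    'normalization' and does not name a specific scheme."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For a feature found in entries 2, 5, and 9, the adjacent-entry distances are 3 and 4, so the mean distance is 3.5. For occurrences at positions 1, 4, and 9 in an entry of length 10, the gap ratio is 0.8.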

Further, the sensitivity of each feature is calculated from its connection with the remaining features and its overall profit. Taking a single feature as an example, its sensitivity is calculated as follows:

wherein the terms denote, respectively: the sensitivity of the feature; the connectivity of the feature; and the profitability of the feature.

The sensitivity of all features is obtained by applying this calculation to each feature. The stronger the connection between a feature and the other features, the more important the feature is to the whole digital marketing process; the greater its profit, the more it contributes to that process. Such a feature is more sensitive and more in need of security processing. Conversely, a less sensitive feature is less important and does not need processing.

Step S004, using the quantified feature sensitivity of the digital marketing big data to acquire the entries in the database that correspond to the structured sensitive data.

Specifically, the sensitivity of each feature has been obtained in the above process, and an overall sensitivity calculation is performed on each entry to obtain the sensitivity of each entry. Taking a single entry as an example, the calculation method is as follows:

wherein the terms denote, respectively: the sensitivity of the entry; the number of all features in the entry; and the sensitivity of each feature in the entry. The larger the entry sensitivity, the more sensitive the entry. Preferably, a first threshold is given for the judgment.

The overall sensitivity of each entry is calculated by this method, and the entries corresponding to sensitive data are determined according to the first threshold: the data contained in an entry whose sensitivity is greater than the first preset threshold is sensitive data.
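As a rough illustration of the entry-level judgment described above, the following Python sketch scores each entry from the sensitivities of its features and keeps the entries whose score exceeds the first threshold. The aggregation used here (the arithmetic mean over the features in the entry) is an assumption suggested by the listed terms, not a formula confirmed by the text; the function names, the dictionary layout, and the threshold value in the usage note are likewise hypothetical.

```python
def entry_sensitivity(feature_sensitivities):
    """Assumed aggregation: arithmetic mean of the sensitivities of all
    features in one entry. An entry with no features scores 0.0."""
    if not feature_sensitivities:
        return 0.0
    return sum(feature_sensitivities) / len(feature_sensitivities)


def sensitive_entry_ids(entries, feature_sensitivity, threshold):
    """Return the ids of entries whose sensitivity is greater than threshold.

    entries: dict mapping entry id -> list of feature names (hypothetical layout)
    feature_sensitivity: dict mapping feature name -> sensitivity value
    threshold: the first preset threshold (value chosen by the caller)
    """
    selected = []
    for entry_id, features in entries.items():
        score = entry_sensitivity([feature_sensitivity[f] for f in features])
        if score > threshold:
            selected.append(entry_id)
    return selected
```

With feature sensitivities `{"a": 0.9, "b": 0.5, "c": 0.1}`, entries `{1: ["a", "b"], 2: ["c"], 3: ["a"]}`, and an illustrative threshold of 0.6, entries 1 and 3 are selected as sensitive and entry 2 is not.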

The sensitive data in the digital marketing big data is the data that is most closely connected and most relevant in the marketing process for the whole database, and it is more important than the other data in the database. Applying security processing only to this data in the subsequent step therefore greatly reduces the amount of data to be processed and shortens the processing time.

Step S005, performing security processing on the sensitive data in the acquired digital marketing big data.

Specifically, the digital marketing big data is partitioned into sensitive data and non-sensitive data. Security processing is then applied to the sensitive data, which completes the security processing of the whole digital marketing big data. Specifically, the sensitive data may be encrypted using the AES algorithm.
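The partition-then-encrypt step can be sketched as follows. This is a minimal illustration only: `encrypt` is a hypothetical callable standing in for an AES cipher supplied by a cryptography library of the reader's choice, and the sketch itself does not implement AES. The function name and the mapping of entry ids to byte payloads are assumptions.

```python
def secure_sensitive_entries(entries, sensitive_ids, encrypt):
    """Apply encryption only to the entries marked as sensitive.

    entries: dict mapping entry id -> bytes payload (hypothetical layout)
    sensitive_ids: set of entry ids judged to contain sensitive data
    encrypt: callable bytes -> bytes; a stand-in for an AES implementation
    Non-sensitive entries are passed through unchanged.
    """
    processed = {}
    for entry_id, payload in entries.items():
        if entry_id in sensitive_ids:
            processed[entry_id] = encrypt(payload)
        else:
            processed[entry_id] = payload
    return processed
```

Because only the sensitive subset passes through the cipher, the encryption cost scales with the size of that subset rather than with the size of the whole database, which is the time saving the method aims for.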

The present invention is not limited to the above-described preferred embodiments, and any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. A digital marketing big data processing method is characterized by comprising the following steps:

obtaining the profitability of each feature according to the inter-entry density of the feature appearing in different entries and the intra-entry density of the feature appearing in the same entry, and obtaining the sensitivity of each feature according to the connectivity and the profitability of each feature;

2. The digital marketing big data processing method of claim 1, wherein the step of constructing the database of the digital marketing big data is:

3. The digital marketing big data processing method of claim 1, wherein the step of performing feature cleaning comprises:

4. The method for processing the digital marketing big data according to claim 1, wherein the method for acquiring the characteristics of all entries comprises the following steps:

5. The method for processing the digital marketing big data according to claim 1, wherein the method for acquiring the feature relevance of each feature in each entry comprises the following steps:

wherein the terms denote, respectively: the feature relevance of a given feature in a given entry; the total number of all features in that entry; and the given feature in that entry.

6. The digital marketing big data processing method of claim 1, wherein the positive connection parameter of each feature is obtained by:

wherein the terms denote, respectively: the positive connection parameter of a given feature; the number of times the given feature occurs in a given entry; the total number of occurrences, in that entry, of features other than the given feature; and the feature relevance of the given feature in that entry.

7. The digital marketing big data processing method of claim 1, wherein the method for acquiring the negative connection parameter of each feature is as follows:

wherein the terms denote, respectively: the negative connection parameter of a given feature; a feature among those that never appear in the same entry as the given feature, together with the collection of all such features; the total number of times the given feature appears in the database; the total number of times that other feature appears in the database; the frequency of occurrence of the given feature within a given entry range; the frequency of occurrence of that other feature within the same entry range; and the total number of entry ranges, an entry range being a range formed by a certain number of entries.

8. The digital marketing big data processing method according to claim 1, wherein the method for acquiring the connectivity of each feature is as follows:

wherein the terms denote, respectively: the connectivity of a given feature; the normalized positive connection parameter of the feature; and the normalized negative connection parameter of the feature.

9. The digital marketing big data processing method of claim 1, wherein the method for acquiring the profitability of each feature comprises the following steps:

wherein the terms denote, respectively: the inter-entry density of a given feature; the distance between the two entries in which the feature is located at a given adjacent occurrence; and the maximum number of adjacent occurrences; the intra-entry density of the feature is calculated as follows:

wherein the terms denote, respectively: the intra-entry density of the feature; the number of occurrences of the feature in a given entry; the positions of two consecutive occurrences of the feature in that entry; and the length of that entry; and the profitability of the feature is obtained according to the product of the inter-entry density, the intra-entry density, and the ratio of the total number of occurrences of the feature to the total number of entries.