CN115563654A - Digital marketing big data processing method - Google Patents

Publication number
CN115563654A
Authority
CN
China
Prior art keywords
entry
feature
big data
individual
digital marketing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211469771.6A
Other languages
Chinese (zh)
Other versions
CN115563654B (en)
Inventor
孙晓琛
葛强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Zhidou Digital Technology Co ltd
Original Assignee
Shandong Zhidou Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Zhidou Digital Technology Co ltd
Priority to CN202211469771.6A
Publication of CN115563654A
Application granted
Publication of CN115563654B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245: Protecting personal data, e.g. for financial or medical purposes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/284: Lexical analysis, e.g. tokenisation or collocates
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of big data processing and provides a digital marketing big data processing method comprising the following steps: acquiring digital marketing big data and establishing a database; performing preliminary feature cleaning on the digital marketing big data in the database; extracting the features of all the digital marketing big data, obtaining positive connection parameters and negative connection parameters from how the entries in the database are distributed among the features, obtaining the profitability of each feature from how densely the feature occurs in the database, obtaining the connectivity among features from the positive and negative connection parameters, and quantifying the sensitivity of each feature from its connectivity and profitability; using the feature sensitivities to obtain the sensitivity of each entry in the database and thereby identify the entries that correspond to sensitive data; and applying security processing to the sensitive data so identified. The invention aims to solve the problem that encrypting digital marketing big data takes too long because of the huge data volume.

Description

Digital marketing big data processing method
Technical Field
The application relates to the field of big data processing, in particular to a digital marketing big data processing method.
Background
With the development of science and technology and the arrival of the digital era, traditional marketing, such as promotion through offline physical stores, no longer dominates the sale of goods because its coverage is small, whereas digital marketing has become more popular because it is precisely targeted and covers a wide area. Digital marketing generates big data for each enterprise's goods, and this data is very important for updating and promoting the enterprise's subsequent products. The security of digital marketing big data is therefore an important concern for enterprises, and the data needs to undergo corresponding security processing.
Disclosure of Invention
The invention provides a digital marketing big data processing method to solve the problem that, because the data volume is huge, encrypting digital marketing big data with existing algorithms takes too long. It adopts the following technical solution:
One embodiment of the invention provides a digital marketing big data processing method comprising the following steps:
constructing a database of the digital marketing big data, and performing feature cleaning on all entries of the digital marketing big data in the database;
acquiring the features of all entries; obtaining the feature relevance of each feature in each entry from the positional relationship between different features in the same entry; taking the mean of that feature relevance over all entries as the positive connection parameter of the feature; obtaining the negative connection parameter of each feature from the overall occurrence counts of the features that never appear in the same entry as it, together with the occurrence frequencies of those features within given entry ranges; and obtaining the connectivity of each feature from its positive and negative connection parameters;
obtaining the profitability of each feature from the inter-entry density with which the feature appears across different entries and the intra-entry density with which it appears within the same entry, and obtaining the sensitivity of each feature from its connectivity and profitability;
and, using the feature sensitivities, taking the sum of the sensitivities of all features in the same entry as the sensitivity of that entry, identifying the sensitive data contained in the entries according to the entry sensitivities, and applying security processing to the sensitive data.
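The final step above, summing feature sensitivities per entry and then picking out the sensitive entries, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the representation of an entry as a list of its distinct features, and the use of a fixed threshold to decide which entries count as sensitive are all assumptions made for the example.

```python
def entry_sensitivities(entries, feature_sensitivity):
    """Return, for each entry, the sum of the sensitivities of its features.

    entries: list of entries, each a list of distinct feature names (assumed layout).
    feature_sensitivity: dict mapping feature name to its sensitivity value.
    """
    return [
        sum(feature_sensitivity.get(feature, 0.0) for feature in features)
        for features in entries
    ]


def select_sensitive(entries, feature_sensitivity, threshold):
    """Return the indices of entries whose sensitivity reaches the threshold.

    The threshold rule is a hypothetical selection criterion for illustration.
    """
    scores = entry_sensitivities(entries, feature_sensitivity)
    return [index for index, score in enumerate(scores) if score >= threshold]


# Toy data: two entries and three features with made-up sensitivities.
entries = [["price", "discount"], ["color"]]
sensitivity = {"price": 0.5, "discount": 0.25, "color": 0.125}
scores = entry_sensitivities(entries, sensitivity)
sensitive_ids = select_sensitive(entries, sensitivity, threshold=0.5)
```

Only the entries returned by `select_sensitive` would then be passed on to security processing, which is where the reduction in processing time comes from.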
Optionally, the database of the digital marketing big data is constructed as follows:
acquiring the digital marketing big data, classifying it by source and establishing a database for each source, and structuring the digital marketing big data from the same source in each database into table entries ordered by the time at which the data was obtained, thereby obtaining the preprocessed digital marketing big data.
Optionally, the feature cleaning comprises:
obtaining the repeated characters in the entries corresponding to all the digital marketing big data in the database, and cleaning out the characters corresponding to the small number of features that do not repeat, so as to reduce the workload of subsequent feature extraction and feature sensitivity calculation.
Optionally, the features of all entries are acquired as follows:
taking the text data of each entry as the input to named entity recognition, and taking the entities it outputs as the features of the digital marketing big data.
Optionally, the feature relevance of each feature in each entry is obtained as follows:
[Formula image not reproduced]
where the result is the feature relevance of a given feature in a given entry, and the terms of the formula are the total number of features in that entry and, for each other feature in the entry, the feature association parameter between that feature and the given feature, which is obtained from the positional relationship of the two features within the same entry.
Optionally, the positive connection parameter of each feature is obtained as follows:
[Formula image not reproduced]
where the result is the positive connection parameter of a given feature, and the terms of the formula are: the number of structured entries of digital marketing big data in the database; the number of times the given feature occurs in each entry; the total number of occurrences of all other features in that entry; and the feature relevance of the given feature in that entry.
Optionally, the negative connection parameter of each feature is obtained as follows:
[Formula image not reproduced]
where the result is the negative connection parameter of a given feature, and the terms of the formula are: an index over the features that never appear in the same entry as the given feature; the total number of such features; the total number of times the given feature appears in the database; the total number of times each such never-co-occurring feature appears in the database; the occurrence frequency of the given feature within each entry range; the occurrence frequency of the never-co-occurring feature within the same entry range; and the total number of entry ranges. An entry range is a range formed by a certain number of entries.
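Two of the inputs to the negative connection parameter, the set of features that never share an entry with a given feature and the total occurrence count of each feature, can be gathered as in the sketch below. It does not reproduce the formula itself, which is not available in this text. The function names and the representation of an entry as a list of feature occurrences are assumptions made for the example.

```python
from collections import Counter


def never_cooccurring(entries, target):
    """Return the features that never appear in the same entry as `target`."""
    all_features = set().union(*entries)
    seen_with_target = set()
    for features in entries:
        if target in features:
            seen_with_target.update(features)
    return all_features - seen_with_target - {target}


def total_counts(entries):
    """Return how many times each feature occurs across the whole database."""
    counts = Counter()
    for features in entries:
        counts.update(features)
    return counts


# Toy database: each entry is the list of feature occurrences found in it.
entries = [["a", "b"], ["a", "c"], ["d"], ["d", "b"]]
opposing = never_cooccurring(entries, "a")
counts = total_counts(entries)
```

Here `"d"` is the only feature that never shares an entry with `"a"`, so it is the only one that would contribute to the negative connection parameter of `"a"`.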
Optionally, the connectivity of each feature is obtained as follows:
[Formula image not reproduced]
where the result is the connectivity of a given feature, computed from the normalized positive connection parameter and the normalized negative connection parameter of that feature.
Optionally, the profitability of each feature is obtained as follows:
[Formula image not reproduced]
where the result is the inter-entry density of a given feature, computed from the distance between the two entries in which the feature appears for each pair of adjacent occurrences, up to the maximum number of adjacent occurrences. The intra-entry density of the feature is calculated as follows:
[Formula image not reproduced]
where the result is the intra-entry density of the feature, and the terms of the formula are: the number of times the feature occurs in a given entry; the positions of two of its occurrences within that entry; and the length of that entry. The profitability is obtained from the product of the feature's inter-entry density, intra-entry density and total occurrence frequency.
The invention has the following advantages: features are extracted from the digital marketing big data and used to quantify the sensitivity of the data, which saves a large amount of the computation otherwise needed to screen for sensitive data; sensitivity is calculated from positive and negative connectivity and from feature profitability, so that sensitive data in the digital marketing big data is screened more accurately before security processing is applied, which greatly reduces the amount of underlying data to be processed and shortens the processing time.
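The distances between adjacent occurrences that the inter-entry density relies on can be illustrated with a short sketch. It covers only that input, not the density formula, which is not available in this text. Measuring the distance between two entries as the difference of their row indices is an assumption, as is the function name.

```python
def adjacent_entry_gaps(entries, feature):
    """Return the index gaps between consecutive entries that contain `feature`.

    entries: list of entries in table order, each a list of feature occurrences.
    The gap is the difference in row index (an assumed distance measure).
    """
    rows = [index for index, features in enumerate(entries) if feature in features]
    return [later - earlier for earlier, later in zip(rows, rows[1:])]


# Feature "a" appears in rows 0, 2 and 3 of this toy table.
entries = [["a"], ["b"], ["a"], ["a", "b"], ["c"]]
gaps = adjacent_entry_gaps(entries, "a")
```

Small gaps mean the feature recurs in nearby entries; a feature that occurs in a single entry has no adjacent occurrences and yields an empty list.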
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a digital marketing big data processing method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, which shows a flowchart of a digital marketing big data processing method according to an embodiment of the present invention, the method includes the following steps:
Step S001: acquire the digital marketing big data and establish a database.
Compared with the structured data in a database, raw digital marketing big data is scattered and irregular in structure, which makes subsequent feature extraction and feature sensitivity calculation very inconvenient. Specifically, feature extraction has to search through irregular data (in particular, irregular data structures), which greatly increases the amount of computation. In addition, data from different sources are only weakly connected; if features are identified and feature sensitivity is calculated across sources, too many features are extracted, which makes the sensitivity calculation inaccurate and leads to the curse of dimensionality. Therefore, a database needs to be established for the digital marketing big data on the basis of data source, and the data in each database then needs to be structured.
First, the digital marketing big data is acquired. The enterprise records the data as it is collected, and the data is then classified by source, with data from the same source placed in the same class.
A database is established for the digital marketing big data of each source, preferably using existing technology such as HBase, which is well known and is not described in detail here.
The digital marketing big data from the same source in each database is structured into table entries ordered by the time at which the data was obtained. The total number of entries may differ between databases; for convenience of description, the same symbol is used for the number of entries in every database.
The preprocessed digital marketing big data is thus obtained by acquiring the digital marketing big data, classifying it by source and establishing databases, and applying the corresponding structuring.
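The classification by source and the time-ordered structuring of step S001 can be sketched as follows. This is a minimal in-memory illustration rather than an HBase setup; the record fields `source`, `time` and `text` and the function name `build_tables` are assumptions introduced for the example.

```python
from collections import defaultdict


def build_tables(records):
    """Group raw records by source and order each group by acquisition time."""
    tables = defaultdict(list)
    for record in records:
        tables[record["source"]].append(record)
    for rows in tables.values():
        rows.sort(key=lambda row: row["time"])
    return dict(tables)


# Toy records from two hypothetical sources.
records = [
    {"source": "web", "time": 3, "text": "red shoe sale"},
    {"source": "app", "time": 1, "text": "blue coat"},
    {"source": "web", "time": 2, "text": "shoe discount"},
]
tables = build_tables(records)
```

Each value of `tables` plays the role of one per-source database, and each row corresponds to one entry; the number of rows can differ from source to source.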
The entries of the structured digital marketing big data in each database differ in how sensitive they are with respect to the goods. Specifically, the features extracted from the entries differ in their connectivity with one another and in the revenue they contribute to marketing. Features with stronger connectivity describe the goods more precisely, while features with weaker connectivity describe them more vaguely; and the greater a feature's influence on marketing revenue, the more important it is among all the features of the goods.
Step S002: perform preliminary feature cleaning on the digital marketing big data in the database and obtain the features of all the digital marketing big data.
When the entries in the database are analyzed, whole entries may be too long and may contain noise from other, non-valid information. Therefore, preliminary feature cleaning is first performed on all the entries corresponding to the digital marketing big data in the database; the cleaned data is then passed through named entity recognition to extract the features in the entries; and the features serve as labels of the entries for calculating the sensitivity of the data.
For the preliminary feature cleaning, the repeated characters in the entries corresponding to all the digital marketing big data in the database are obtained. Because features describe the important words in an entry, the characters corresponding to most features recur. A small number of features correspond to characters that do not recur, but such features are insignificant in big data, which is concerned with the overall trend rather than with a handful of data points. Cleaning them out reduces the workload of subsequent feature extraction and feature sensitivity calculation and eliminates the few features unrelated to the overall trend.
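The preliminary cleaning described above, keeping what recurs and dropping what appears only once, can be sketched as follows. The sketch works on whitespace-separated tokens rather than individual characters and uses a minimum count of two as the cut-off; both choices, and the function name, are assumptions made for the example.

```python
from collections import Counter


def clean_entries(entries, min_count=2):
    """Drop tokens that occur fewer than `min_count` times in the whole database.

    entries: list of entries, each a list of tokens (assumed tokenisation).
    """
    counts = Counter(token for entry in entries for token in entry)
    return [
        [token for token in entry if counts[token] >= min_count]
        for entry in entries
    ]


# "red" and "xyz" each occur once and are removed; "shoe" and "sale" recur.
entries = [["red", "shoe", "sale"], ["shoe", "sale", "xyz"]]
cleaned = clean_entries(entries)
```

The cleaned entries are shorter, so the subsequent named entity recognition and sensitivity calculation have less text to process.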
Next, features are extracted from the cleaned data using named entity recognition. The input is the data of each entry, and the entities output by the named entity recognition are the features of the digital marketing big data, each appearing as a word in the entry.
Specifically, applying this method to all entries in the database after the digital marketing big data has been structured yields the set of all features:
[Formula image not reproduced]
where the subscripts distinguish the individual features, each element being one feature of the digital marketing big data in the database, and the largest subscript is the number of features extracted from the current database after preliminary feature cleaning. This number may differ between databases; for convenience of description, the same symbol is used for it in every database.
Step S003: quantify the sensitivity of the features obtained from all the digital marketing big data.
Sensitivity is a parameter that quantifies how important an extracted feature is in the digital marketing big data, or equivalently how necessary security processing is for it. It is calculated from the connectivity among the features and from the profitability of each feature, that is, its contribution to marketing. The more strongly a feature is connected with the other features, the more important it is relative to them in digital marketing, because digital marketing proceeds through the coordination of most features and thereby generates the big data for those features. Likewise, the more frequently and the more evenly a feature appears, the more marketing revenue is attributable to it. The more sensitive a feature is, the more sensitive the digital marketing big data corresponding to it, and the stronger the need for security processing.
Note that the connectivity between features has a positive and a negative component. Positive connectivity means that the appearance of one feature is usually accompanied by the appearance of other features; negative connectivity means that when one feature appears, most other features are usually absent. Both properties are used to quantify the connectivity between features.
Further, in descriptions of goods, the stronger the connectivity between two features, the smaller the Euclidean distance between them within an entry should be, meaning that one feature reinforces the other; the weaker the connectivity, the larger the Euclidean distance between the two features within the same entry, meaning that one feature merely supplements the other. Therefore, for every entry that contains a given feature, the Euclidean distances between that feature and the other features are calculated in order to determine the positive connectivity of the feature with the remaining features.
Specifically, taking a given feature as an example, its positive connection parameter is quantified as follows:
[Formula image not reproduced]
where the result is the positive connection parameter of the given feature, and the terms of the formula are: the number of structured entries of digital marketing big data in the database; the number of times the given feature occurs in each entry; the total number of occurrences of all other features in that entry; and the feature relevance of the given feature in that entry.
The feature relevance of the given feature in a given entry is calculated as follows:
[Formula image not reproduced]
where the result is the feature relevance of the given feature in that entry, and the terms of the formula are the total number of features in the entry and, for each other feature in the entry, the feature association parameter between that feature and the given feature.
The feature association parameter between another feature and the given feature within an entry is calculated as follows:
[Formula image not reproduced]
where the terms of the formula are: the number of times the given feature occurs in the entry; the position of each occurrence of the given feature in the entry; and, for each such occurrence, the position of the nearest occurrence of the other feature in the entry. Note that the other feature does not necessarily occur in the entry the same number of times as the given feature, and that different occurrences of the given feature may share the same nearest occurrence of the other feature.
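The positional inputs to the feature association parameter, namely where a feature occurs in an entry's word sequence and which occurrence of another feature lies nearest to each of those positions, can be sketched as follows. The sketch stops short of the parameter itself, whose formula is not available in this text. Zero-based positions, the tie-breaking rule (the leftmost of equally near occurrences wins) and the function names are assumptions made for the example.

```python
def positions(entry, feature):
    """Return the zero-based positions at which `feature` occurs in the entry."""
    return [index for index, word in enumerate(entry) if word == feature]


def nearest_positions(entry, feature, other):
    """For each occurrence of `feature`, return the nearest position of `other`.

    Returns an empty list if `other` does not occur in the entry.
    """
    other_positions = positions(entry, other)
    if not other_positions:
        return []
    return [
        min(other_positions, key=lambda q: abs(q - p))
        for p in positions(entry, feature)
    ]


# An entry as a left-to-right word sequence: "a" occurs twice, "b" once.
entry = ["b", "a", "a", "x", "c"]
a_positions = positions(entry, "a")
nearest_b = nearest_positions(entry, "a", "b")
```

In this entry both occurrences of `"a"` have the same nearest `"b"`, at position 0, which is the case the text above points out: the two features need not occur equally often, and several occurrences of one may map to a single occurrence of the other.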
To explain: the words contained in an entry form a word sequence from left to right. A feature is itself a word in the entry and may appear several times in the word sequence; the position of that word in the sequence is the position at which the feature occurs in the entry. The positions at which the other feature occurs in the entry are obtained in the same way. The larger the feature association parameter, the closer together the two features appear in the same entry and the stronger their connectivity; the smaller it is, the farther apart they appear and the weaker their connectivity. The larger the feature relevance, the closer the given feature lies to all the other features in the entry, that is, the more other features are near it, and the stronger its association with the other features in the entry; the farther it lies from all the other features, the fewer features are near it and the weaker that association. The larger the positive connection parameter, the stronger the relation between the given feature and all the other features across all entries, the more important and sensitive the feature is in the database, and the better changes in it reflect the overall trend of the digital marketing big data.
The overall relevance of a feature to the other features within the same entry is treated as its positive connection: the stronger the relevance, the higher the sensitivity of the feature and the more important it is in the database.
Further, the positive connection [Figure 844688DEST_PATH_IMAGE010] of feature [Figure 63814DEST_PATH_IMAGE051] having been quantified, note that the remaining features and this feature all describe the same digital marketing activity; that is, all features belong to the digital marketing process. Some features, however, are mutually exclusive: the appearance of feature [Figure 69499DEST_PATH_IMAGE051] excludes the appearance of certain other features. The more often the feature occurs and the more often those other features occur, while they still never occur in the same entry despite their large counts, the stronger the negative relationship between them. Therefore, an overall negative relation is calculated from the total number of occurrences of the feature and the total number of occurrences of each feature that never appears with it, a local negative relation is calculated by multiplying their occurrence frequencies within each entry range, and the overall and local negative relations are multiplied to represent the negative connection between feature [Figure 812458DEST_PATH_IMAGE051] and the remaining features.
Specifically, taking feature [Figure 253246DEST_PATH_IMAGE051] as an example, its negative connection [Figure 829721DEST_PATH_IMAGE063] is quantified as follows:
Figure 552826DEST_PATH_IMAGE015
where [Figure 163936DEST_PATH_IMAGE016] denotes the negative connection parameter of feature [Figure 783136DEST_PATH_IMAGE017];
[Figure 817214DEST_PATH_IMAGE018] indexes the features that never appear in the same entry as feature [Figure 344010DEST_PATH_IMAGE004];
[Figure 802990DEST_PATH_IMAGE019] denotes the total number of features that never appear in the same entry as feature [Figure 885216DEST_PATH_IMAGE017];
[Figure 887807DEST_PATH_IMAGE020] denotes the total number of times feature [Figure 240553DEST_PATH_IMAGE004] appears in the database;
[Figure 935976DEST_PATH_IMAGE021] denotes the total number of times feature [Figure 443181DEST_PATH_IMAGE018] appears in the database;
[Figure 311780DEST_PATH_IMAGE022] denotes the frequency with which feature [Figure 385357DEST_PATH_IMAGE004] occurs within entry range [Figure 17568DEST_PATH_IMAGE023];
[Figure 645438DEST_PATH_IMAGE024] denotes the frequency with which feature [Figure 815705DEST_PATH_IMAGE018] occurs within entry range [Figure 786569DEST_PATH_IMAGE023];
[Figure 587352DEST_PATH_IMAGE025] denotes the total number of entry ranges, an entry range being a group formed by a given number of entries.
Preferably, the entry range is given an empirical size of 100 entries. Specifically, the [Figure 769296DEST_PATH_IMAGE045] entries are divided into groups of 100, which yields [Figure 449676DEST_PATH_IMAGE064] groups, i.e., [Figure 923383DEST_PATH_IMAGE025] entry ranges.
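The grouping into entry ranges and the search for features that never share an entry with a given feature can be sketched as follows. This is only an illustration under stated assumptions, not the patented formula: entries are assumed to be lists of feature strings, all function names are hypothetical, and the per-range frequency is assumed to be the number of occurrences divided by the number of entries in the range, which the text does not specify.

```python
from typing import List, Set

Entry = List[str]  # assumption: an entry is the ordered list of its feature words


def split_into_ranges(entries: List[Entry], range_size: int = 100) -> List[List[Entry]]:
    """Group consecutive entries into entry ranges (100 per range by default)."""
    return [entries[i:i + range_size] for i in range(0, len(entries), range_size)]


def never_cooccurring(entries: List[Entry], target: str) -> Set[str]:
    """Return the features that never appear in the same entry as `target`."""
    all_features = {f for entry in entries for f in entry}
    seen_with_target: Set[str] = set()
    for entry in entries:
        if target in entry:
            seen_with_target.update(entry)
    # Exclude the target itself in case it never occurs at all.
    return all_features - seen_with_target - {target}


def range_frequency(entry_range: List[Entry], feature: str) -> float:
    """Assumed definition: occurrences of `feature` per entry within one range."""
    if not entry_range:
        return 0.0
    occurrences = sum(entry.count(feature) for entry in entry_range)
    return occurrences / len(entry_range)
```

The negative connection parameter itself would combine the total counts and the per-range frequencies of each returned feature; that combination is defined by the formula image and is not reproduced here.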
The larger the ratio between the number of occurrences of feature [Figure 626897DEST_PATH_IMAGE051] and the total number of occurrences of a feature that never appears in the same entry with it, the larger the counts of the two features are while they still never occur together, i.e., the stronger the negative relationship between them. Likewise, within a given entry range, the ratio between the occurrence frequency of the feature and that of a feature that never appears in the same entry with it also indicates the strength of the negative relationship between the two. The larger the negative connection [Figure 247179DEST_PATH_IMAGE063] is, the more often the feature and the features unrelated to it occur without ever appearing in the same entry. This indicates that the feature has many unrelated features, so its overall importance in the database is reduced and its sensitivity is reduced as well.
Further, the positive and negative connections of all features are calculated by the above method and then normalized in order to calculate the connectivity.
Specifically, taking the connectivity [Figure 392039DEST_PATH_IMAGE065] of feature [Figure 737066DEST_PATH_IMAGE017] as an example, it is calculated as follows:
Figure 207548DEST_PATH_IMAGE026
where [Figure 885654DEST_PATH_IMAGE027] is the connectivity of feature [Figure 477434DEST_PATH_IMAGE004];
[Figure 924596DEST_PATH_IMAGE028] is the normalized positive connection parameter of feature [Figure 707744DEST_PATH_IMAGE004];
[Figure 873147DEST_PATH_IMAGE029] is the normalized negative connection parameter of feature [Figure 767153DEST_PATH_IMAGE004].
Applying this method to every feature yields the connectivity of all features. A positive connection raises the sensitivity of a feature and a negative connection lowers it, so the overall connectivity of a feature is obtained by subtracting its negative connection from its positive connection. The larger the positive connection and the smaller the negative connection, i.e., the more related features and the fewer unrelated features a feature has, the larger its connectivity and the greater its importance and sensitivity in the database. Conversely, if the positive connection is small and the negative connection is large, the unrelated features far outnumber the related ones, so the importance of the feature in the database is greatly reduced and its sensitivity is low.
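The subtraction of the normalized negative connection from the normalized positive connection can be illustrated with a short sketch. The text does not say which normalization is used; min-max scaling is assumed here, and the function names and the dictionary layout are hypothetical.

```python
from typing import Dict


def min_max(values: Dict[str, float]) -> Dict[str, float]:
    """Scale values to [0, 1]; min-max scaling is an assumption, not from the source."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {name: 0.0 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


def connectivity(positive: Dict[str, float], negative: Dict[str, float]) -> Dict[str, float]:
    """Connectivity = normalized positive connection minus normalized negative connection."""
    pos_norm = min_max(positive)
    neg_norm = min_max(negative)
    return {name: pos_norm[name] - neg_norm[name] for name in positive}


# Illustrative parameters for three hypothetical features.
scores = connectivity(
    {"a": 2.0, "b": 4.0, "c": 6.0},
    {"a": 1.0, "b": 1.0, "c": 3.0},
)
```

In this made-up example feature "b" ends up with the highest connectivity, because its positive connection is moderate while its negative connection is the lowest.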
Further, the profitability of each feature is calculated. The profitability of a feature is the revenue attributable to that feature when the digital marketing big data is used for marketing; the underlying logic is that the more often a feature appears across all entries, the greater the revenue it brings when the digital marketing big data is used for marketing.
Further, the feature profit is calculated from the density and the number of occurrences of feature [Figure 727598DEST_PATH_IMAGE051] in the whole database. Because entries are added to the database in time order, the denser and more uniform the occurrences of the feature are, the more the feature contributes each time the digital marketing big data in the database is used for marketing, i.e., the larger the corresponding profit. The feature profit [Figure 266158DEST_PATH_IMAGE066] consists of two parts, density and overall occurrence count. The density part is first calculated from the distances between the different entries in which the feature appears; the larger this value, the more often the feature occurs. This is the inter-entry density. It is then multiplied by the intra-entry density, which covers repeated occurrences within a single entry: the more often the feature occurs within an entry, the more important it is in that entry. Finally, the product with the total number of occurrences of the feature is taken as the feature profit of the feature.
Specifically, taking feature [Figure 461592DEST_PATH_IMAGE051] as an example, its feature profit [Figure 780578DEST_PATH_IMAGE067] is calculated as follows:
Figure 151517DEST_PATH_IMAGE068
where [Figure 694494DEST_PATH_IMAGE031] is the inter-entry density of feature [Figure 518093DEST_PATH_IMAGE017];
[Figure 204552DEST_PATH_IMAGE037] is the intra-entry density of feature [Figure 429997DEST_PATH_IMAGE017];
[Figure 347137DEST_PATH_IMAGE069] denotes the total number of occurrences of feature [Figure 381138DEST_PATH_IMAGE051];
[Figure 257827DEST_PATH_IMAGE045] is the total number of entries.
The inter-entry density [Figure 559374DEST_PATH_IMAGE031] of feature [Figure 124632DEST_PATH_IMAGE017] is calculated as follows:
Figure 758275DEST_PATH_IMAGE030
where [Figure 256514DEST_PATH_IMAGE032] is the distance between the two entries containing the [Figure 781036DEST_PATH_IMAGE033]-th pair of adjacent occurrences of feature [Figure 597683DEST_PATH_IMAGE017];
[Figure 865853DEST_PATH_IMAGE034] is the maximum number of adjacent occurrences.
Note that the distance [Figure 421228DEST_PATH_IMAGE071] between the two entries containing the [Figure 717134DEST_PATH_IMAGE070]-th pair of adjacent occurrences of feature [Figure 146979DEST_PATH_IMAGE017] has the following meaning. For example, if feature [Figure 289826DEST_PATH_IMAGE051] occurs first in [Figure 933297DEST_PATH_IMAGE072], second in [Figure 861939DEST_PATH_IMAGE073], and third in [Figure 590861DEST_PATH_IMAGE074], then the first and second occurrences are adjacent and form the [Figure 764615DEST_PATH_IMAGE075] adjacency, with [Figure 793751DEST_PATH_IMAGE076]; the second and third occurrences are adjacent and form the [Figure 565398DEST_PATH_IMAGE077] adjacency, with [Figure 109512DEST_PATH_IMAGE078].
The intra-entry density [Figure 937977DEST_PATH_IMAGE035] of feature [Figure 257596DEST_PATH_IMAGE017] is calculated as follows:
Figure 381990DEST_PATH_IMAGE036
where [Figure 147820DEST_PATH_IMAGE060] denotes the number of times feature [Figure 99596DEST_PATH_IMAGE017] occurs in entry [Figure 634482DEST_PATH_IMAGE039];
[Figure 13511DEST_PATH_IMAGE040] denotes the position of the [Figure 127943DEST_PATH_IMAGE041]-th occurrence of feature [Figure 469900DEST_PATH_IMAGE017] in entry [Figure 535288DEST_PATH_IMAGE039];
[Figure 740190DEST_PATH_IMAGE042] denotes the position of the [Figure 722818DEST_PATH_IMAGE043]-th occurrence of feature [Figure 152717DEST_PATH_IMAGE017] in entry [Figure 508612DEST_PATH_IMAGE039];
[Figure 443649DEST_PATH_IMAGE044] denotes the length of entry [Figure 671368DEST_PATH_IMAGE039].
Further, the feature profit [Figure 24355DEST_PATH_IMAGE079] of feature [Figure 70306DEST_PATH_IMAGE051] is obtained after normalization. The feature profit comprises the inter-entry density and the intra-entry density. The inter-entry density is obtained from the mean distance between pairs of entries in which the feature appears; the smaller the mean distance between two entries containing the same feature, the more uniformly the entries containing the feature are distributed and the larger the feature profit. The intra-entry density is obtained from the ratio of the sum of the distances between consecutive occurrences of the same feature within an entry to the total length of the entry; the larger the ratio, the more sparsely the feature is spread within the entry, so that fewer occurrences contribute more and the feature profit is larger.
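The two quantities described here, the mean distance between adjacent entries containing a feature and the ratio of summed in-entry gaps to entry length, can be sketched as below. This is a hedged illustration: entries are assumed to be lists of feature strings stored in time order, the function names are hypothetical, and the way these raw values are turned into the final densities and combined across entries is defined by the formula images and is not reproduced.

```python
from typing import List, Optional

Entry = List[str]  # assumption: an entry is the ordered list of its feature words


def mean_adjacent_entry_gap(entries: List[Entry], feature: str) -> Optional[float]:
    """Mean index distance between consecutive entries that contain `feature`."""
    indices = [k for k, entry in enumerate(entries) if feature in entry]
    gaps = [b - a for a, b in zip(indices, indices[1:])]
    # Fewer than two containing entries means there is no adjacent pair.
    return sum(gaps) / len(gaps) if gaps else None


def intra_entry_spread(entry: Entry, feature: str) -> float:
    """Sum of gaps between consecutive occurrences of `feature`, over entry length."""
    positions = [p for p, word in enumerate(entry) if word == feature]
    if len(positions) < 2:
        return 0.0
    total_gap = sum(b - a for a, b in zip(positions, positions[1:]))
    return total_gap / len(entry)


# Hypothetical data: feature "x" occurs in entries 0, 2 and 3.
sample = [["x", "y"], ["z"], ["x"], ["x", "q", "x", "x"]]
gap = mean_adjacent_entry_gap(sample, "x")
spread = intra_entry_spread(sample[3], "x")
```

Here the adjacent entry distances are 2 and 1, and within the last entry the occurrences sit at positions 0, 2 and 3 of a four-word sequence.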
The feature profit of every feature is calculated by this method, and the profitability of all features is obtained after normalization.
Further, the sensitivity [Figure 219658DEST_PATH_IMAGE080] of feature [Figure 94707DEST_PATH_IMAGE051] is calculated from its connection with the remaining features and its overall profit. Specifically, taking feature [Figure 34610DEST_PATH_IMAGE051] as an example, its sensitivity [Figure 373187DEST_PATH_IMAGE080] is calculated as follows:
Figure 872302DEST_PATH_IMAGE081
where [Figure 234013DEST_PATH_IMAGE082] is the sensitivity of feature [Figure 273513DEST_PATH_IMAGE017];
[Figure 171324DEST_PATH_IMAGE083] is the connectivity of feature [Figure 638078DEST_PATH_IMAGE017];
[Figure 221506DEST_PATH_IMAGE084] is the profitability of feature [Figure 64697DEST_PATH_IMAGE017].
Applying this method to every feature yields the sensitivity of all features. The stronger the connection between a feature and the other features, the more important the feature is to the whole digital marketing process; the larger its profit, the more it contributes to that process. Such a feature is more sensitive and requires security processing. Conversely, a less sensitive feature is less important and need not be processed.
S004: using the quantified feature sensitivity of the digital marketing big data, obtain the entries in the database that correspond to the structured sensitive data.
Specifically, the sensitivity of each feature has been obtained in the above process, and an overall sensitivity is now calculated for each entry. Taking entry [Figure 581129DEST_PATH_IMAGE039] as an example, it is calculated as follows:
Figure 156467DEST_PATH_IMAGE085
where [Figure 790973DEST_PATH_IMAGE086] denotes the sensitivity of entry [Figure 532795DEST_PATH_IMAGE039];
[Figure 127900DEST_PATH_IMAGE087] denotes the number of features in entry [Figure 936456DEST_PATH_IMAGE039];
[Figure 25635DEST_PATH_IMAGE080] denotes the sensitivity of feature [Figure 882732DEST_PATH_IMAGE017] in the entry.
The larger the entry sensitivity [Figure 170494DEST_PATH_IMAGE088], the more sensitive the entry. Preferably, a first threshold [Figure 87635DEST_PATH_IMAGE089] is given for the judgment.
The overall sensitivity of each entry is calculated by this method, and the entries corresponding to sensitive data are identified according to the first threshold: the data contained in an entry whose sensitivity is greater than the first preset threshold is sensitive data.
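Scoring entries and splitting them at the first threshold can be sketched as follows. Claim 1 states that the entry sensitivity is the sum of the sensitivities of all features in the entry; summing over distinct features (rather than over every occurrence) is an assumption made here, as are the function names, the sample sensitivities, and the threshold value.

```python
from typing import Dict, List, Tuple

Entry = List[str]  # assumption: an entry is the ordered list of its feature words


def entry_sensitivity(entry: Entry, feature_sensitivity: Dict[str, float]) -> float:
    """Sum the sensitivities of the distinct features found in the entry."""
    return sum(feature_sensitivity.get(feature, 0.0) for feature in set(entry))


def partition_entries(
    entries: List[Entry],
    feature_sensitivity: Dict[str, float],
    threshold: float,
) -> Tuple[List[Entry], List[Entry]]:
    """Split entries into (sensitive, non_sensitive) using a strict '>' comparison."""
    sensitive: List[Entry] = []
    non_sensitive: List[Entry] = []
    for entry in entries:
        if entry_sensitivity(entry, feature_sensitivity) > threshold:
            sensitive.append(entry)
        else:
            non_sensitive.append(entry)
    return sensitive, non_sensitive


# Hypothetical feature sensitivities and threshold, for illustration only.
sens = {"phone": 0.9, "name": 0.5, "ad": 0.1}
data = [["phone", "name"], ["ad"], ["name", "ad", "ad"]]
sensitive_entries, other_entries = partition_entries(data, sens, threshold=0.5)
```

Only the entries in the first list would then be passed to the security processing step, which is what reduces the volume of data to be encrypted.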
Across the whole database, the sensitive data in the digital marketing big data is the data that is most strongly connected and most relevant to the marketing process, and it is more important than the other data in the database. Applying security processing only to this data in the subsequent step greatly reduces the amount of data to be processed and shortens the processing time.
S005: perform security processing on the sensitive data obtained from the digital marketing big data.
Specifically, the digital marketing big data is partitioned into sensitive data and non-sensitive data, and security processing is applied to the sensitive data, which completes the security processing of the whole digital marketing big data. For example, the sensitive data can be encrypted with the AES algorithm.
The present invention is not limited to the above-described preferred embodiments, and any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A digital marketing big data processing method, characterized by comprising the following steps:
constructing a database of the digital marketing big data, and performing feature cleaning on all entries of the digital marketing big data in the database;
acquiring the features of all entries; obtaining the feature relevance of each feature in each entry from the positional relationship between different features in the same entry; taking the mean of the feature relevance of each feature over all entries as the positive connection parameter of the feature; obtaining the negative connection parameter of each feature from the overall occurrence counts of the features that never appear in the same entry and their occurrence frequencies within given entry ranges; and obtaining the connectivity of each feature from the positive connection parameter and the negative connection parameter;
obtaining the profitability of each feature from the inter-entry density of the feature across different entries and the intra-entry density of the feature within the same entry, and obtaining the sensitivity of each feature from the connectivity and the profitability of the feature;
using the sensitivity of the features in the digital marketing big data, taking the sum of the sensitivities of all features in the same entry as the sensitivity of the entry, identifying the sensitive data contained in the entries according to the entry sensitivity, and performing security processing on the sensitive data.
2. The digital marketing big data processing method of claim 1, wherein the step of constructing the database of the digital marketing big data is:
acquiring the digital marketing big data, classifying it by source and establishing the database accordingly, and structuring the digital marketing big data of the same source in the database as table entries ordered by acquisition time, to obtain the preprocessed digital marketing big data.
3. The digital marketing big data processing method of claim 1, wherein the step of performing feature cleaning comprises:
identifying the recurring characters in the entries corresponding to all digital marketing big data in the database, and cleaning out the characters corresponding to the small number of non-recurring features, thereby reducing the workload of subsequent feature extraction and feature sensitivity calculation.
4. The method for processing the digital marketing big data according to claim 1, wherein the method for acquiring the characteristics of all entries comprises the following steps:
taking the text data of each entry as the input of a named entity recognition technique, and taking the output entities as the features of the digital marketing big data.
5. The method for processing the digital marketing big data according to claim 1, wherein the method for acquiring the feature relevance of each feature in each entry comprises the following steps:
Figure 387041DEST_PATH_IMAGE001
where [Figure 178411DEST_PATH_IMAGE002] denotes the feature relevance of feature [Figure 238213DEST_PATH_IMAGE004] in entry [Figure 885860DEST_PATH_IMAGE003];
[Figure 790942DEST_PATH_IMAGE005] is the total number of features in entry [Figure 730427DEST_PATH_IMAGE003];
[Figure 340269DEST_PATH_IMAGE006] denotes the feature association parameter between feature [Figure 693595DEST_PATH_IMAGE008] and feature [Figure 538185DEST_PATH_IMAGE004] in entry [Figure 510744DEST_PATH_IMAGE007], which is obtained from the positional relationship of the two features appearing in the same entry.
6. The digital marketing big data processing method of claim 1, wherein the positive connection parameter of each feature is obtained by:
Figure 148289DEST_PATH_IMAGE009
where [Figure 396606DEST_PATH_IMAGE010] denotes the positive connection parameter of feature [Figure 291137DEST_PATH_IMAGE004];
[Figure 708343DEST_PATH_IMAGE011] is the number of structured entries of the digital marketing big data in the database;
[Figure 456725DEST_PATH_IMAGE012] denotes the number of times feature [Figure 504939DEST_PATH_IMAGE004] occurs in entry [Figure 449127DEST_PATH_IMAGE007];
[Figure 744029DEST_PATH_IMAGE013] denotes the total number of occurrences in entry [Figure 351727DEST_PATH_IMAGE007] of the features other than feature [Figure 108724DEST_PATH_IMAGE004];
[Figure 889598DEST_PATH_IMAGE014] denotes the feature relevance of feature [Figure 122314DEST_PATH_IMAGE004] in entry [Figure 281397DEST_PATH_IMAGE007].
7. The digital marketing big data processing method of claim 1, wherein the method for acquiring the negative connection parameter of each feature is as follows:
Figure 773131DEST_PATH_IMAGE015
where [Figure 29800DEST_PATH_IMAGE016] denotes the negative connection parameter of feature [Figure 400739DEST_PATH_IMAGE017];
[Figure 287923DEST_PATH_IMAGE018] indexes the features that never appear in the same entry as feature [Figure 845944DEST_PATH_IMAGE004];
[Figure 99125DEST_PATH_IMAGE019] denotes the total number of features that never appear in the same entry as feature [Figure 157211DEST_PATH_IMAGE017];
[Figure 468107DEST_PATH_IMAGE020] denotes the total number of times feature [Figure 800999DEST_PATH_IMAGE004] appears in the database;
[Figure 664306DEST_PATH_IMAGE021] denotes the total number of times feature [Figure 158872DEST_PATH_IMAGE018] appears in the database;
[Figure 629168DEST_PATH_IMAGE022] denotes the frequency with which feature [Figure 215056DEST_PATH_IMAGE004] occurs within entry range [Figure 139653DEST_PATH_IMAGE023];
[Figure 190445DEST_PATH_IMAGE024] denotes the frequency with which feature [Figure 806420DEST_PATH_IMAGE018] occurs within entry range [Figure 600567DEST_PATH_IMAGE023];
[Figure 378740DEST_PATH_IMAGE025] denotes the total number of entry ranges, an entry range being a group formed by a given number of entries.
8. The digital marketing big data processing method according to claim 1, wherein the method for acquiring the connectivity of each feature is as follows:
Figure 519052DEST_PATH_IMAGE026
where [Figure 466279DEST_PATH_IMAGE027] is the connectivity of feature [Figure 375329DEST_PATH_IMAGE004];
[Figure 881135DEST_PATH_IMAGE028] is the normalized positive connection parameter of feature [Figure 610056DEST_PATH_IMAGE004];
[Figure 360975DEST_PATH_IMAGE029] is the normalized negative connection parameter of feature [Figure 62214DEST_PATH_IMAGE004].
9. The digital marketing big data processing method of claim 1, wherein the method for acquiring the profitability of each feature comprises the following steps:
Figure 99440DEST_PATH_IMAGE030
where [Figure 456604DEST_PATH_IMAGE031] is the inter-entry density of feature [Figure 870267DEST_PATH_IMAGE004];
[Figure 130741DEST_PATH_IMAGE032] is the distance between the two entries containing the [Figure 338868DEST_PATH_IMAGE033]-th pair of adjacent occurrences of feature [Figure 183327DEST_PATH_IMAGE017];
[Figure 72786DEST_PATH_IMAGE034] is the maximum number of adjacent occurrences;
the intra-entry density [Figure 330909DEST_PATH_IMAGE035] of feature [Figure 14197DEST_PATH_IMAGE017] is calculated as follows:
Figure 787298DEST_PATH_IMAGE036
where [Figure 916666DEST_PATH_IMAGE037] is the intra-entry density of feature [Figure 774901DEST_PATH_IMAGE017];
[Figure DEST_PATH_IMAGE038] denotes the number of times feature [Figure 996935DEST_PATH_IMAGE017] occurs in entry [Figure 550407DEST_PATH_IMAGE039];
[Figure 843985DEST_PATH_IMAGE040] denotes the position of the [Figure DEST_PATH_IMAGE042]-th occurrence of feature [Figure 697671DEST_PATH_IMAGE017] in entry [Figure 684082DEST_PATH_IMAGE039];
[Figure 23053DEST_PATH_IMAGE043] denotes the position of the [Figure 330034DEST_PATH_IMAGE044]-th occurrence of feature [Figure 730109DEST_PATH_IMAGE017] in entry [Figure 297356DEST_PATH_IMAGE039];
[Figure 185995DEST_PATH_IMAGE045] denotes the length of entry [Figure 460856DEST_PATH_IMAGE039];
and the profitability of the feature is obtained from the ratio of the product of the inter-entry density, the intra-entry density, and the total number of occurrences of the feature to the total number of entries.
CN202211469771.6A 2022-11-23 2022-11-23 Digital marketing big data processing method Active CN115563654B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211469771.6A CN115563654B (en) 2022-11-23 2022-11-23 Digital marketing big data processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211469771.6A CN115563654B (en) 2022-11-23 2022-11-23 Digital marketing big data processing method

Publications (2)

Publication Number Publication Date
CN115563654A true CN115563654A (en) 2023-01-03
CN115563654B CN115563654B (en) 2023-03-31

Family

ID=84770775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211469771.6A Active CN115563654B (en) 2022-11-23 2022-11-23 Digital marketing big data processing method

Country Status (1)

Country Link
CN (1) CN115563654B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115795112A (en) * 2023-02-08 2023-03-14 吉林交通职业技术学院 Data transmission method in scientific research innovation platform

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102012985A (en) * 2010-11-19 2011-04-13 国网电力科学研究院 Sensitive data dynamic identification method based on data mining
CN105404886A (en) * 2014-09-16 2016-03-16 株式会社理光 Feature model generating method and feature model generating device
CN113157678A (en) * 2021-04-19 2021-07-23 中国人民解放军91977部队 Multi-source heterogeneous data association method
US20210303725A1 (en) * 2020-03-30 2021-09-30 Google Llc Partially customized machine learning models for data de-identification

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
朱世玲;郑彦;: "改进的文本特征选取算法研究" *
杨云鹿: "支持隐私保护的数据挖掘方法研究及实现" *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115795112A (en) * 2023-02-08 2023-03-14 吉林交通职业技术学院 Data transmission method in scientific research innovation platform
CN115795112B (en) * 2023-02-08 2023-04-11 吉林交通职业技术学院 Data transmission method in scientific research innovation platform

Also Published As

Publication number Publication date
CN115563654B (en) 2023-03-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant