CN112270595A - Data reconciliation decision method, device, server and storage medium - Google Patents

Data reconciliation decision method, device, server and storage medium

Info

Publication number
CN112270595A
Authority
CN
China
Prior art keywords
data
consumption
module
classification
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011080428.3A
Other languages
Chinese (zh)
Inventor
杨国为
张凡龙
黄璞
万鸣华
杨章静
詹天明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NANJING AUDIT UNIVERSITY
Original Assignee
NANJING AUDIT UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NANJING AUDIT UNIVERSITY
Priority to CN202011080428.3A
Publication of CN112270595A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of financial services and discloses a data reconciliation decision method, device, server and storage medium. The method comprises: establishing a user classification model: constructing a user consumption classification model with a decision tree algorithm as its core, and mining the consumption characteristics of different types of consumer groups; data preprocessing: resolving fragmented and incomplete data by methods such as aggregation and mean-value replacement, deleting redundant data, and generating a consumption data record set that meets the requirements of classification mining; and generating a personalized statement: after the consumption of each cardholder is analyzed using the established classification models, targeted advertisements are delivered to the user through a personalized recommendation algorithm according to comprehensive information such as the relation between different customer groups and advertisement types, the ranking of the cardholder within the group classification, and the number and timing of advertisement copies; a personalized bill generation model for credit card customers is constructed, and a personalized statement is generated for each customer.

Description

Data reconciliation decision method, device, server and storage medium
Technical Field
The invention belongs to the technical field of financial services, and particularly relates to a data reconciliation decision method, a device, a server and a storage medium.
Background
At present, after a cardholder applies for a credit card, the cardholder rarely contacts the bank on their own initiative except to deposit or withdraw money or to replace the card. The monthly bill that the bank sends to the cardholder therefore becomes the carrier of communication between the bank and the cardholder. The statement is a document that users pay close attention to and keep; it is not a mere accessory to the credit card but an important communication medium between the bank and the user. Its characteristics are: users receive it periodically every month, it is a highly effective marketing and service promotion medium, its reading rate is high, and its feedback rate increases year by year.
Providing customers with mailed statements has become a necessary service, but sending statements to customers on schedule requires a large investment of manpower and material resources. In the face of this large cost, advertising revenue cannot be ignored and is an important way to offset the cost. Statement recipients form a large group of potential consumers whose advertising value is difficult to gauge. According to research by various enterprises, publicity in this form is more effective than in other media and is more readily accepted by users. Business advertisements are mainly printed on the back or in the blank areas of the bill to introduce services, tariffs, marketing measures, questionnaires and the like, which helps communicate with users and promote consumption while saving publicity costs. Because bank bills cover a wide area, are delivered frequently and have a high advertising hit rate, merchants closely related to users, such as those in mobile phones, communication terminals, household appliances, tourism, catering, real estate and home services, are gradually paying attention to this carrier.
The above analysis shows the problem and defect of the prior art: sending statements to users on schedule requires a large investment of manpower and material resources, and much of that investment is wasted.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method, a device, a server and a storage medium for data reconciliation decision.
The invention is realized in such a way that a data reconciliation decision method comprises the following steps:
step one, collecting and storing consumption data of users in a big data network through a data collection module;
the collecting consumption data of the users in the big data network through the data collecting module comprises the following steps:
(1.1) counting the consumption characteristic vector of each user according to records in the network data set to form a rough data set;
(1.2) screening out consumption characteristic vectors corresponding to all known first-class objects from the rough data set, and filtering the screened consumption characteristic vectors to obtain consumption sample data;
(1.3) carrying out the recoding processing of missing values, singular values and discrete character type fields on each field in the acquired sample data;
(1.4) carrying out normalization processing on each field in the sample data in a z-score mode, and carrying out discretization processing on each field in the sample after the normalization processing;
(1.5) constructing a regression model based on the processed consumption sample data, and then determining whether each of all known second-class target objects potentially belongs to the first-class target object using the constructed regression model;
step two, preliminarily classifying the acquired data to obtain required large-class data information;
the preliminary classifying the collected data comprises:
(2.1) obtaining a randomly generated rice population comprising a plurality of rice individuals;
(2.2) acquiring the collected consumption data, and calculating the fitness value of each rice individual from that data;
(2.3) dividing the plurality of rice individuals into maintainer line individuals, sterile line individuals and restorer line individuals according to the fitness value;
(2.4) obtaining the optimal hybrid individual generated by hybridizing the maintainer line individuals with the sterile line individuals;
(2.5) obtaining the optimal selfing individual generated by selfing the restorer line individuals;
(2.6) obtaining the individual with the best fitness value among the optimal hybrid individual and the optimal selfing individual;
(2.7) using the individual with the best fitness value as a clustering center to classify the data to be classified, and eliminating the data in the sterile line individuals to obtain the required large-class data information;
step three, data preprocessing: solving the problems of fragmentation and incompleteness by adopting methods such as aggregation, average value replacement and the like, deleting redundant data, and generating a consumption data record set meeting the classification mining requirement;
the specific method for deleting the redundant data comprises the following steps:
S11, dividing the collected data objects into data blocks in each logical channel using a fixed-length algorithm, with a different data block length in each channel;
S12, for the data preliminarily classified in step two, finding the first redundant data blocks of the current data object based on the standard used in each logical channel and the data blocks divided for the current data object in that logical channel using that standard;
S13, eliminating the overlapping parts between two or more first redundant data blocks based on the offset and the length of the first redundant data blocks found in each of the plurality of logical channels;
S14, deleting the duplicate data of the current data object by deleting second redundant data blocks, wherein the second redundant data blocks comprise the first redundant data blocks after the overlapping parts are eliminated;
step four, receiving the required big data that is formed by cleaning the redundant data or temporarily stored by the big data management module according to a preset management strategy;
step five, uploading the required big data to a big data server in a storage module, storing the required big data in blocks in the big data server, and distinguishing the subclasses again;
step six, establishing a user classification model: constructing a user consumption classification model with a decision tree algorithm as its core, and mining the consumption characteristics of different types of consumer groups;
step seven, generating a personalized statement: analyzing the consumption condition of each card holder by using the established classification model;
and step eight, according to the relation between different client groups and the advertisement types, the ranking of cardholders in group classification, the advertisement distribution number and time and other comprehensive information, carrying out targeted advertisement distribution on the users through a personalized recommendation algorithm, constructing a personalized bill generation model of the credit card clients, and generating personalized statement bills for each client.
Further, in step one, when the consumption data of the users in the big data network are collected, the customers are classified according to differences in geography, customer income, knowledge background and education level, and according to the different financial management habits and lifestyles of different age groups.
Further, in step three, after the data preprocessing, the consumption patterns of credit card customers are classified with a decision tree according to the customers' consumption information, and the specific classification method comprises the following steps:
S21, selecting a classification keyword: the total annual consumption amount of the customer is selected as the keyword for classifying customers, and the total amount a customer spends through the credit card in one year is taken as the standard for judging the customer's consumption capability;
S22, identifying the customer category;
S23, selecting training samples to construct a classification model;
and S24, analyzing the decision tree classification result.
Further, in step six, after the user consumption classification model is constructed, parameter optimization is first performed on the classification model, and the specific method adopted comprises the following steps:
S31, determining the number of construction parameters in the classification model, and generating a set number of parameter-related vectors whose dimension equals the number of parameters;
S32, initializing each parameter-related vector to obtain the set number of initial parameter-related vectors containing initial component information;
S33, iteratively updating each initial parameter-related vector according to a set updating strategy to obtain a target parameter-related vector containing globally optimal component information;
and S34, determining the optimal value of each construction parameter according to the globally optimal component information.
Another object of the present invention is to provide an apparatus for data reconciliation decision, comprising:
the data acquisition module is connected with the central control and processing module and is used for acquiring customer information and consumption data of users in the big data network;
the preliminary classification module is connected with the central control and processing module and is used for preliminarily classifying the acquired data to form required large-class data information;
the data preprocessing module is connected with the central control and processing module and is used for preprocessing the large-class data information, deleting redundant data and generating a consumption data record set meeting the classification mining requirement;
the central control and processing module is connected with the data acquisition module, the preliminary classification module, the data preprocessing module, the secondary classification module, the model building module and the data generation module and is used for processing the acquired data and performing coordination control on each module according to a processing result and preset parameters;
the secondary classification module is connected with the central control and processing module and is used for storing the required big data in blocks in the big data server and distinguishing the subclasses again;
the model building module is connected with the central control and processing module and used for constructing a user consumption classification model and mining consumption characteristics of different types of consumption groups;
and the data generation module is connected with the central control and processing module and is used for constructing a personalized bill generation model of the credit card client and generating a personalized statement bill.
Further, the preliminary classification module includes:
the key information extraction unit is used for extracting different keywords as classification bases;
the data block dividing unit is used for dividing the data block of the current data object by using different standards in each of the plurality of logical channels;
and the classified storage unit is used for respectively packaging and storing the divided data blocks and naming the keywords.
Further, the data preprocessing module comprises:
a first redundant data block determining unit, configured to find, in each logical channel, one or more first redundant data blocks of a current data object based on the data blocks into which the current data object is divided in that logical channel;
a data de-duplication unit, configured to de-duplicate the current data object using all the first redundant data blocks found by the first redundant data block determining unit;
and an overlap elimination unit, configured to eliminate the overlap between two or more first redundant data blocks based on the offset and the length of the first redundant data blocks found in each of the plurality of logical channels.
Further, the central control and processing module comprises:
the parameter presetting unit is used for presetting and inputting control parameters through external input equipment;
the data processing unit is used for processing and analyzing the acquired data by a user according to preset parameters;
and the control instruction generating unit is used for generating a control instruction according to the processing result and sending the control instruction to different controlled modules.
Combining all the above technical solutions, the advantages and positive effects of the invention are as follows: the invention takes the personalized financial statement as its research object and provides a personalized service solution based on data mining technology. The results can be applied to the processing of credit card statements and of statements in service industries such as telecommunications and networks, and open up a new approach to providing personalized services in other industries. The problem of duplicate records in credit card consumption data is also addressed. The algorithm selects important attributes and performs multiple independent basic neighbor sorting passes: for N specified important attributes, N basic neighbor sorting passes must be run independently, and the window size must be adjusted continuously according to the relation between the similarity and the threshold, so the operation takes longer and the matching efficiency of the algorithm is lower.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for data reconciliation decision according to an embodiment of the present invention.
Fig. 2 is a flowchart of a method for eliminating duplicate records according to an embodiment of the present invention.
Fig. 3 is a flowchart of the method for classifying consumption amounts provided by an embodiment of the present invention.
Fig. 4 is a flowchart of the method for optimizing the parameters of the classification model after the user consumption classification model is constructed, according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the server structure provided by an embodiment of the present invention.
In Fig. 5: 1. data acquisition module; 2. preliminary classification module; 3. data preprocessing module; 4. central control and processing module; 5. secondary classification module; 6. model building module; 7. data generation module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In view of the problems in the prior art, the present invention provides a method, an apparatus, a server and a storage medium for data reconciliation decision, which are described in detail below with reference to the accompanying drawings.
As shown in fig. 1, a method for data reconciliation decision provided in the embodiment of the present invention includes the following steps:
S101, collecting and storing consumption data of users in a big data network through a data collection module;
S102, preliminarily classifying the collected data to obtain the required large-class data information;
S103, data preprocessing: resolving fragmented and incomplete data by methods such as aggregation and mean-value replacement, deleting redundant data, and generating a consumption data record set meeting the classification mining requirement;
S104, receiving the required big data that is formed by cleaning the redundant data or temporarily stored by a big data management module according to a preset management strategy;
S105, uploading the required big data to a big data server in a storage module, storing the required big data in blocks in the big data server, and distinguishing the subclasses again;
S106, establishing a user classification model: constructing a user consumption classification model with a decision tree algorithm as its core, and mining the consumption characteristics of different types of consumer groups;
S107, generating a personalized statement: analyzing the consumption of each cardholder using the established classification model;
and S108, according to comprehensive information such as the relation between different customer groups and advertisement types, the ranking of cardholders within the group classification, and the number and timing of advertisement copies, delivering targeted advertisements to users through a personalized recommendation algorithm, constructing a personalized bill generation model for credit card customers, and generating a personalized statement for each customer.
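Step S108 can be illustrated with a deliberately simple sketch. The patent does not specify the personalized recommendation algorithm, so the rule used here (each advertisement goes to the top-ranked cardholders of its target group until its copies run out) is only an assumed stand-in, and the names `Cardholder`, `Ad` and `assign_ads` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cardholder:
    name: str
    group: str  # customer group produced by the classification model
    rank: int   # ranking inside the group, 1 = highest


@dataclass
class Ad:
    title: str
    target_group: str  # customer group this advertisement type is linked to
    copies: int        # number of copies available for this billing cycle


def assign_ads(cardholders, ads):
    """Assumed rule: give each ad to the top-ranked cardholders of its
    target group, up to the number of available copies."""
    statements = {c.name: [] for c in cardholders}
    for ad in ads:
        eligible = sorted(
            (c for c in cardholders if c.group == ad.target_group),
            key=lambda c: c.rank,
        )
        for cardholder in eligible[: ad.copies]:
            statements[cardholder.name].append(ad.title)
    return statements


cardholders = [
    Cardholder("A", "high", 1),
    Cardholder("B", "high", 2),
    Cardholder("C", "low", 1),
]
ads = [Ad("travel", "high", 1), Ad("grocery", "low", 5)]
statements = assign_ads(cardholders, ads)
```

Each entry of `statements` lists the advertisements that would be printed on that cardholder's statement; a real system would replace the ranking rule with whatever recommendation model the bank adopts.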
As shown in fig. 2, in step S103 provided in the embodiment of the present invention, the specific method for deleting redundant data includes:
S201, dividing the collected data objects into data blocks in each logical channel using a fixed-length algorithm, with a different data block length in each channel;
S202, for the data preliminarily classified in step S102, finding the first redundant data blocks of the current data object based on the standard used in each logical channel and the data blocks divided for the current data object in that logical channel using that standard;
S203, eliminating the overlapping parts between two or more first redundant data blocks based on the offset and the length of the first redundant data blocks found in each of the plurality of logical channels;
and S204, deleting the duplicate data of the current data object by deleting second redundant data blocks, wherein the second redundant data blocks comprise the first redundant data blocks after the overlapping parts are eliminated.
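The four de-duplication steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each logical channel is modeled simply as one fixed block length, a block counts as redundant when identical content already appeared earlier in the same channel, and the function names are hypothetical.

```python
def find_redundant_blocks(data: bytes, block_len: int):
    """One logical channel: split data into fixed-length blocks and return
    (offset, length) for every block whose content was already seen."""
    seen = set()
    redundant = []
    for offset in range(0, len(data) - block_len + 1, block_len):
        block = data[offset:offset + block_len]
        if block in seen:
            redundant.append((offset, block_len))
        else:
            seen.add(block)
    return redundant


def merge_overlaps(blocks):
    """Merge overlapping or adjacent (offset, length) ranges so that no
    byte is counted twice."""
    merged = []
    for offset, length in sorted(blocks):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


def deduplicate(data: bytes, block_lengths=(4, 8)):
    # First redundant blocks, found independently in every channel.
    first_redundant = []
    for block_len in block_lengths:
        first_redundant.extend(find_redundant_blocks(data, block_len))
    # Second redundant blocks: the same ranges after overlap elimination.
    second_redundant = merge_overlaps(first_redundant)
    # Delete the second redundant blocks from the data object.
    kept = bytearray()
    cursor = 0
    for offset, length in second_redundant:
        kept += data[cursor:offset]
        cursor = offset + length
    kept += data[cursor:]
    return bytes(kept), second_redundant


result, removed = deduplicate(b"ABCDEFGH" * 2)
```

In this example the 4-byte channel reports two redundant blocks and the 8-byte channel reports one that covers both; the overlap elimination reduces them to a single range before deletion.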
In step S101, the collecting consumption data of users in a big data network by a data collection module according to the embodiment of the present invention includes:
(1.1) counting the consumption characteristic vector of each user according to records in the network data set to form a rough data set;
(1.2) screening out consumption characteristic vectors corresponding to all known first-class objects from the rough data set, and filtering the screened consumption characteristic vectors to obtain consumption sample data;
(1.3) carrying out the recoding processing of missing values, singular values and discrete character type fields on each field in the acquired sample data;
(1.4) carrying out normalization processing on each field in the sample data in a z-score mode, and carrying out discretization processing on each field in the sample after the normalization processing;
(1.5) constructing a regression model based on the processed consumption sample data, and then determining whether each of all known second class target objects potentially belongs to the first class target object using the constructed regression model.
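Steps (1.3) and (1.4) can be illustrated with a short sketch. It covers only mean-value filling of missing entries, z-score normalization and equal-width discretization; outlier handling, recoding of character fields and the regression model of step (1.5) are left out. The function names, the bin count and the sample values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def fill_missing(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    average = mean(observed)
    return [average if v is None else v for v in values]


def z_score(values):
    """Standardize a field to zero mean and unit population standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def discretize(values, n_bins=3):
    """Equal-width binning: map each value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical monthly spending field with one missing record.
monthly_spend = [120.0, None, 80.0, 400.0, 100.0]
filled = fill_missing(monthly_spend)
scores = z_score(filled)
bins = discretize(scores)
```

Equal-width binning is only one possible discretization; the source does not say which scheme is intended.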
In step S101, when the consumption data of the users in the big data network are collected, the customers are classified according to differences in geography, customer income, knowledge background and education level, and according to the different financial management habits and lifestyles of different age groups.
In step S102, the preliminary classification of the collected data provided by the embodiment of the present invention includes:
(2.1) obtaining a randomly generated rice population comprising a plurality of rice individuals;
(2.2) acquiring the collected consumption data, and calculating the fitness value of each rice individual from that data;
(2.3) dividing the plurality of rice individuals into maintainer line individuals, sterile line individuals and restorer line individuals according to the fitness value;
(2.4) obtaining the optimal hybrid individual generated by hybridizing the maintainer line individuals with the sterile line individuals;
(2.5) obtaining the optimal selfing individual generated by selfing the restorer line individuals;
(2.6) obtaining the individual with the best fitness value among the optimal hybrid individual and the optimal selfing individual;
(2.7) using the individual with the best fitness value as a clustering center to classify the data to be classified, and eliminating the data in the sterile line individuals to obtain the required large-class data information.
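Steps (2.1) to (2.7) read like a hybrid-rice-style population search for cluster centers. The sketch below is one simplified reading under several assumptions: each rice individual is a sorted list of candidate centers for one-dimensional data, fitness is the sum of squared distances to the nearest center (lower is better), the three lines are equal thirds of the ranked population, and the hybridization and selfing rules are plain random blends. The removal of sterile-line data in step (2.7) is not modeled, and all names and constants are hypothetical.

```python
import random


def fitness(centers, data):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in data)


def hybrid_rice_cluster(data, k=2, pop_size=9, iterations=30, seed=0):
    rng = random.Random(seed)
    lo, hi = min(data), max(data)

    def score(individual):
        return fitness(individual, data)

    # (2.1) random population; each individual holds k sorted candidate centers
    population = [sorted(rng.uniform(lo, hi) for _ in range(k))
                  for _ in range(pop_size)]
    third = pop_size // 3
    for _ in range(iterations):
        # (2.2)-(2.3) rank by fitness and split into the three lines
        population.sort(key=score)
        maintainers = population[:third]
        restorers = population[third:2 * third]
        steriles = population[2 * third:]
        best = population[0]
        # (2.4) hybridization: blend a maintainer with each sterile individual
        new_steriles = []
        for sterile in steriles:
            partner = rng.choice(maintainers)
            r = rng.random()
            child = [r * m + (1 - r) * s for m, s in zip(partner, sterile)]
            new_steriles.append(min(child, sterile, key=score))
        # (2.5) selfing: move each restorer toward the current best individual
        new_restorers = []
        for restorer in restorers:
            r = rng.random()
            child = [c + r * (b - c) for c, b in zip(restorer, best)]
            new_restorers.append(min(child, restorer, key=score))
        population = maintainers + new_restorers + new_steriles
    # (2.6) individual with the best fitness value
    best = min(population, key=score)
    # (2.7) use its centers to assign every point to the nearest cluster
    labels = [min(range(k), key=lambda j: (x - best[j]) ** 2) for x in data]
    return best, labels


spend = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centers, labels = hybrid_rice_cluster(spend)
```

Because a child replaces its parent only when it is fitter, the best individual never gets worse from one iteration to the next.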
As shown in fig. 3, in step S103, after the data preprocessing, the method provided by the embodiment of the present invention for classifying the consumption patterns of credit card customers with a decision tree according to the customers' consumption information includes:
S301, selecting a classification keyword: the total annual consumption amount of the customer is selected as the keyword for classifying customers, and the total amount a customer spends through the credit card in one year is taken as the standard for judging the customer's consumption capability;
S302, identifying the customer category;
S303, selecting training samples to construct a classification model;
S304, analyzing the decision tree classification result.
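Steps S301 to S304 can be sketched with a tiny decision tree that splits on a single feature, the annual consumption total. The Gini-based split rule, the depth limit, the category labels and the training figures are all assumptions chosen for illustration; the patent does not name a specific decision tree variant.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(samples, max_depth=2):
    """samples: list of (annual_spend, category). Returns a nested dict,
    or a category label at a leaf."""
    labels = [label for _, label in samples]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    values = sorted({spend for spend, _ in samples})
    best = None
    # Try the midpoint between each pair of neighbouring feature values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [label for spend, label in samples if spend <= threshold]
        right = [label for spend, label in samples if spend > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
        if best is None or impurity < best[0]:
            best = (impurity, threshold)
    if best is None:  # every sample has the same feature value
        return majority(labels)
    threshold = best[1]
    return {
        "threshold": threshold,
        "left": build_tree([s for s in samples if s[0] <= threshold], max_depth - 1),
        "right": build_tree([s for s in samples if s[0] > threshold], max_depth - 1),
    }


def classify(tree, annual_spend):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if annual_spend <= tree["threshold"] else tree["right"]
    return tree


# Hypothetical training samples: (annual consumption total, customer category).
training = [
    (2_000, "low"), (3_500, "low"),
    (12_000, "medium"), (15_000, "medium"),
    (60_000, "high"), (85_000, "high"),
]
tree = build_tree(training)
```

Inspecting the `threshold` values stored in `tree` corresponds to the analysis in step S304: they show where the model draws the boundaries between consumption levels.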
As shown in fig. 4, in step S106, after the user consumption classification model is constructed, the method provided by the embodiment of the present invention first performs parameter optimization on the classification model, and the specific method adopted includes:
S401, determining the number of construction parameters in the classification model, and generating a set number of parameter-related vectors whose dimension equals the number of parameters;
S402, initializing each parameter-related vector to obtain the set number of initial parameter-related vectors containing initial component information;
S403, iteratively updating each initial parameter-related vector according to a set updating strategy to obtain a target parameter-related vector containing globally optimal component information;
S404, determining the optimal value of each construction parameter according to the globally optimal component information.
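The source leaves the "set updating strategy" unspecified. As one possible reading, the sketch below uses a standard particle swarm update, which matches the description of a fixed number of vectors that are initialized, iteratively updated and reduced to a global optimum. The choice of particle swarm optimization, the coefficients, the toy objective and all names are assumptions.

```python
import random


def optimize_parameters(objective, bounds, n_vectors=12, iterations=100, seed=0):
    """Minimize objective(vector) within bounds using a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)  # S401: one dimension per construction parameter
    # S402: initialize the parameter-related vectors inside the bounds
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_vectors)]
    velocities = [[0.0] * dim for _ in range(n_vectors)]
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    g = min(range(n_vectors), key=personal_score.__getitem__)
    global_best, global_score = personal_best[g][:], personal_score[g]
    # S403: iterative update (assumed strategy: inertia plus two attraction terms)
    for _ in range(iterations):
        for i in range(n_vectors):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    0.7 * velocities[i][d]
                    + 1.5 * r1 * (personal_best[i][d] - positions[i][d])
                    + 1.5 * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    # S404: the globally best vector holds one value per construction parameter
    return global_best, global_score


def toy_error(params):
    """Hypothetical stand-in for a validation error over two tree parameters."""
    depth, min_samples = params
    return (depth - 5) ** 2 + (min_samples - 20) ** 2


bounds = [(1, 10), (2, 50)]
best_params, best_error = optimize_parameters(toy_error, bounds)
```

In practice `toy_error` would be replaced by a function that trains the classification model with the candidate parameters and returns its validation error.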
As shown in fig. 5, an apparatus for data reconciliation decision provided in an embodiment of the present invention includes:
the data acquisition module 1 is connected with the central control and processing module and is used for acquiring customer information and consumption data of users in the big data network;
the preliminary classification module 2 is connected with the central control and processing module and is used for preliminarily classifying the acquired data to form required large-class data information;
the data preprocessing module 3 is connected with the central control and processing module and is used for preprocessing the large-class data information, deleting redundant data and generating a consumption data record set meeting the classification mining requirement;
the central control and processing module 4 is connected with the data acquisition module, the preliminary classification module, the data preprocessing module, the secondary classification module, the model building module and the data generation module, and is used for processing the acquired data and performing coordination control on each module according to a processing result and preset parameters;
the secondary classification module 5 is connected with the central control and processing module and is used for storing the required big data in blocks in the big data server and distinguishing the subclasses again;
the model building module 6 is connected with the central control and processing module and used for constructing a user consumption classification model and mining consumption characteristics of different types of consumption groups;
and the data generation module 7 is connected with the central control and processing module and is used for constructing a personalized bill generation model of the credit card client and generating a personalized statement bill.
The preliminary classification module provided by the embodiment of the invention comprises:
the key information extraction unit is used for extracting different keywords as classification bases;
the data block dividing unit is used for dividing the data block of the current data object by using different standards in each of the plurality of logical channels;
and the classified storage unit is used for respectively packaging and storing the divided data blocks and naming the keywords.
The data preprocessing module provided by the embodiment of the invention comprises:
a first redundant data block determining unit, configured to find, in each logical channel, one or more first redundant data blocks of a current data object based on the data blocks into which the current data object is divided in that logical channel;
a data de-duplication unit, configured to de-duplicate the current data object using all the first redundant data blocks found by the first redundant data block determining unit;
and an overlap elimination unit, configured to eliminate the overlap between two or more first redundant data blocks based on the offset and the length of the first redundant data blocks found in each of the plurality of logical channels.
The central control and processing module provided by the embodiment of the invention comprises:
the parameter presetting unit is used for presetting and inputting control parameters through external input equipment;
the data processing unit is used for processing and analyzing the acquired data according to the preset parameters;
and the control instruction generating unit is used for generating a control instruction according to the processing result and sending the control instruction to different controlled modules.
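The interplay of the three units can be sketched in Python as follows. This is only an illustrative sketch: the parameter names (min_records, max_duplicate_ratio), the module identifiers and the actions are hypothetical and do not come from the embodiment, which does not specify what the processing result contains or which instructions are issued.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ControlInstruction:
    module: str   # name of the controlled module (hypothetical identifier)
    action: str   # what the module is asked to do (hypothetical action)

def process(records, params):
    """Data processing unit: summarize acquired records against preset parameters."""
    total = len(records)
    duplicates = total - len(set(records))
    return {
        "total": total,
        "duplicate_ratio": duplicates / total if total else 0.0,
        "enough_data": total >= params["min_records"],
    }

def generate_instructions(result, params):
    """Control instruction generating unit: map a processing result to instructions."""
    if not result["enough_data"]:
        # Too few records: only ask the acquisition module for more data.
        return [ControlInstruction("data_acquisition", "collect_more")]
    instructions = []
    if result["duplicate_ratio"] > params["max_duplicate_ratio"]:
        instructions.append(ControlInstruction("data_preprocessing", "deduplicate"))
    instructions.append(ControlInstruction("model_building", "train"))
    return instructions

# Parameter presetting unit: values assumed to be entered by an operator.
params = {"min_records": 4, "max_duplicate_ratio": 0.2}
result = process(["a", "b", "a", "c", "a"], params)
instructions = generate_instructions(result, params)
```

In this sketch the control logic depends only on the preset parameters and the processing result, so the controlled modules need no knowledge of one another.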
The above description covers only the preferred embodiments of the present invention and does not limit its scope. Any modification, equivalent replacement or improvement made by those skilled in the art within the technical scope disclosed herein, and within the spirit and principle of the present invention, shall fall within the scope of protection of the present invention.

Claims (10)

1. A data reconciliation decision method, characterized in that the method comprises the following steps:
step one, collecting and storing consumption data of users in a big data network through a data collection module;
wherein collecting the consumption data of the users in the big data network through the data collection module comprises the following steps:
(1.1) computing the consumption feature vector of each user from the records in the network data set to form a rough data set;
(1.2) screening out, from the rough data set, the consumption feature vectors corresponding to all known first-class target objects, and filtering the screened consumption feature vectors to obtain consumption sample data;
(1.3) recoding missing values, outliers and discrete character-type fields for each field in the acquired sample data;
(1.4) normalizing each field in the sample data by the z-score method, and discretizing each field after the normalization;
(1.5) constructing a regression model based on the processed consumption sample data, and then using the constructed regression model to determine whether each known second-class target object potentially belongs to the first-class target objects;
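Sub-steps (1.3) and (1.4) can be illustrated with a minimal Python sketch for a single numeric field. It assumes that missing values are encoded as None and are replaced by the field mean, and that discretization uses equal-width bins; the claim fixes neither choice, and the function names and the bin count are hypothetical.

```python
from statistics import mean, pstdev

def fill_missing(values):
    """Replace None entries with the mean of the observed values (assumed recoding rule)."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]

def z_score(values):
    """Standardize to zero mean and unit population standard deviation."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]

def discretize(values, bins=3):
    """Map each value to an equal-width bin index in the range 0 .. bins - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), bins - 1) for v in values]

# Hypothetical monthly spend for five users, one value missing.
monthly_spend = [120.0, None, 80.0, 400.0, 100.0]
filled = fill_missing(monthly_spend)
normalized = z_score(filled)
levels = discretize(normalized, bins=3)
```

Outlier handling and the regression model of sub-step (1.5) are left out of this sketch.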
step two, preliminarily classifying the acquired data to obtain required large-class data information;
wherein preliminarily classifying the collected data comprises:
(2.1) obtaining a randomly generated rice population comprising a plurality of rice individuals;
(2.2) acquiring the collected consumption data, and calculating the fitness value of each rice individual according to the consumption data;
(2.3) dividing the plurality of rice individuals into maintainer line individuals, sterile line individuals and restorer line individuals according to the fitness values;
(2.4) obtaining the optimal hybrid individual generated in the process of hybridizing the maintainer line individuals with the sterile line individuals;
(2.5) obtaining the optimal selfed individual generated in the selfing process of the restorer line individuals;
(2.6) obtaining the individual with the optimal fitness value among the optimal hybrid individual and the optimal selfed individual;
(2.7) using the individual with the optimal fitness value as a clustering center to classify the data to be classified, and eliminating the data in the sterile line individuals to obtain the required large-class data information;
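One pass of sub-steps (2.1) to (2.7) can be sketched in Python for one-dimensional data. Several points are assumptions made for illustration and are not stated in the claim: each rice individual encodes a candidate set of clustering centers, fitness is the sum of distances from each data point to its nearest center (lower is better), the population is split into equal thirds by fitness rank, and the hybridization and selfing rules are simple random interpolations. The elimination of sterile-line data in (2.7) is not modeled.

```python
import random

def fitness(centers, data):
    """Sum of distances from each point to its nearest center (lower is better)."""
    return sum(min(abs(x - c) for c in centers) for x in data)

def split_lines(population, data):
    """Rank by fitness and split into maintainer, restorer and sterile lines."""
    ranked = sorted(population, key=lambda ind: fitness(ind, data))
    third = len(ranked) // 3
    return ranked[:third], ranked[third:len(ranked) - third], ranked[len(ranked) - third:]

def hybridize(maintainer, sterile, rng):
    """Assumed rule: gene-wise random weighted average of the two parents."""
    child = []
    for m, s in zip(maintainer, sterile):
        w = rng.random()
        child.append(w * m + (1 - w) * s)
    return child

def self_cross(individual, best, rng):
    """Assumed rule: move each gene a random fraction of the way toward the best individual."""
    return [g + rng.random() * (b - g) for g, b in zip(individual, best)]

def choose_centers(data, k=2, pop_size=9, seed=0):
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    # (2.1) random population; each individual holds k candidate centers.
    population = [[rng.uniform(lo, hi) for _ in range(k)] for _ in range(pop_size)]
    # (2.2)-(2.3) fitness ranking and division into three lines.
    maintainer, restorer, sterile = split_lines(population, data)
    # (2.4) hybridization of maintainer and sterile individuals.
    hybrids = [hybridize(m, s, rng) for m, s in zip(maintainer, sterile)]
    # (2.5) selfing of restorer individuals.
    selfed = [self_cross(r, maintainer[0], rng) for r in restorer]
    best_hybrid = min(hybrids, key=lambda ind: fitness(ind, data))
    best_selfed = min(selfed, key=lambda ind: fitness(ind, data))
    # (2.6) the better of the two optimal offspring.
    return min((best_hybrid, best_selfed), key=lambda ind: fitness(ind, data))

def classify(data, centers):
    """(2.7) label each point with the index of its nearest clustering center."""
    return [min(range(len(centers)), key=lambda i: abs(x - centers[i])) for x in data]

spend = [10.0, 12.0, 11.0, 95.0, 100.0, 98.0]   # hypothetical consumption values
centers = choose_centers(spend)
labels = classify(spend, centers)
```

A practical implementation would repeat the ranking, hybridization and selfing over many generations; a single pass is shown here only to mirror the order of the sub-steps.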
step three, data preprocessing: resolving fragmentation and incompleteness by methods such as aggregation and mean-value replacement, deleting redundant data, and generating a consumption data record set that meets the classification mining requirement;
the specific method for deleting the redundant data comprises the following steps:
S11, in each logical channel, dividing the collected data object into data blocks with a fixed-length algorithm, using a different data block length in each channel;
S12, for the data preliminarily classified in step two, finding the first redundant data blocks of the current data object in each logical channel, based on the standard used in that channel and the data blocks into which the current data object is divided under that standard;
S13, eliminating any overlapping portion between two or more first redundant data blocks, based on the offset and the length of each first redundant data block found in the plurality of logical channels;
S14, de-duplicating the current data object by deleting second redundant data blocks, wherein the second redundant data blocks comprise the first redundant data blocks after the overlapping portions are eliminated;
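Steps S11 to S14 can be sketched in Python as follows. The sketch assumes that the data object is a byte string, that each logical channel is identified only by its block length, and that a block counts as redundant when a block with the same SHA-256 digest occurred earlier in the same channel; the claim does not specify how redundancy is detected, and the function names are hypothetical. Redundant ranges are simply dropped here, whereas a real de-duplication store would typically keep a reference to the retained copy.

```python
import hashlib

def fixed_length_blocks(data, block_len):
    """S11: yield (offset, block) pairs; the final block may be shorter."""
    for offset in range(0, len(data), block_len):
        yield offset, data[offset:offset + block_len]

def find_redundant(data, block_len):
    """S12: return (offset, length) for blocks already seen in this channel."""
    seen = set()
    redundant = []
    for offset, block in fixed_length_blocks(data, block_len):
        digest = hashlib.sha256(block).hexdigest()
        if digest in seen:
            redundant.append((offset, len(block)))
        else:
            seen.add(digest)
    return redundant

def merge_overlaps(ranges):
    """S13: merge overlapping or adjacent (offset, length) ranges into disjoint ones."""
    merged = []
    for offset, length in sorted(ranges):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]

def deduplicate(data, block_lengths=(4, 8)):
    """S14: delete the merged redundant ranges from the data object."""
    found = []
    for block_len in block_lengths:   # one logical channel per block length
        found.extend(find_redundant(data, block_len))
    second_blocks = merge_overlaps(found)
    kept = bytearray()
    cursor = 0
    for offset, length in second_blocks:
        kept.extend(data[cursor:offset])
        cursor = offset + length
    kept.extend(data[cursor:])
    return bytes(kept), second_blocks

# The 8-byte channel and the 4-byte channel both flag bytes 8..15 as redundant,
# so overlap elimination collapses three findings into one range.
result, removed = deduplicate(b"ABCDEFGH" * 2)
```

Without the merge in S13, the overlapping findings from the two channels would cause the same bytes to be deleted more than once.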
step four, receiving the required big data, which is either formed by cleaning the redundant data or temporarily stored by the big data management module according to a preset management strategy;
step five, uploading the required big data to a big data server in a storage module, storing it in blocks in the big data server, and further dividing it into subclasses;
step six, establishing a user classification model: constructing a user consumption classification model with a decision tree algorithm as its core, and mining the consumption characteristics of different types of consumption groups;
step seven, generating a personalized statement: analyzing the consumption condition of each cardholder by using the established classification model;
and step eight, delivering targeted advertisements to users through a personalized recommendation algorithm, based on combined information such as the relation between different client groups and advertisement types, the ranking of cardholders within their group classification, and the number and timing of advertisement deliveries; constructing a personalized bill generation model of the credit card clients; and generating a personalized statement bill for each client.
2. The data reconciliation decision method according to claim 1, wherein in step one, when the consumption data of the users in the big data network is collected, the clients are classified according to geographic region, client income, differences in knowledge background and education level, differences in financial assets owned, and the lifestyles of different age groups.
3. The data reconciliation decision method according to claim 1, wherein in step three, after the data preprocessing, credit card clients are classified by consumption pattern using a decision tree, according to the cardholders' consumption information, and the specific classification method comprises:
S21, selecting a classification keyword: the client's total annual consumption amount is selected as the keyword for classifying clients, and the total amount the client spends by credit card in one year is taken as the standard for judging the client's consumption capability;
S22, identifying the client category;
S23, selecting training samples to construct a classification model;
and S24, analyzing the decision tree classification result.
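The core of steps S21 to S23 can be illustrated by the root split of a decision tree on the total annual consumption amount. The Python sketch below searches for the threshold that minimizes the weighted Gini impurity; the use of Gini impurity, the training amounts and the labels "low" and "high" are assumptions made for illustration, since the claim names only a decision tree and the classification keyword.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(samples):
    """Find the threshold on annual total that minimizes weighted Gini impurity.

    samples: list of (annual_total, label) pairs.
    """
    values = sorted({total for total, _ in samples})
    best_threshold, best_score = None, float("inf")
    # Candidate thresholds are midpoints between consecutive distinct totals.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [label for total, label in samples if total <= threshold]
        right = [label for total, label in samples if total > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical training samples: (total annual credit card spend, client category).
training = [(3000, "low"), (5000, "low"), (8000, "low"),
            (40000, "high"), (60000, "high"), (90000, "high")]
threshold, impurity = best_split(training)
```

A full decision tree would apply the same search recursively to each side of the split, possibly over further attributes.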
4. The data reconciliation decision method according to claim 1, wherein in step six, after the user consumption classification model is constructed, parameter optimization is first performed on the classification model, and the specific method comprises the following steps:
S31, determining the number of construction parameters of the classification model, and generating a set number of parameter-related vectors whose dimension equals the number of parameters;
S32, initializing each parameter-related vector to obtain the set number of initial parameter-related vectors containing initial component information;
S33, iteratively updating each initial parameter-related vector according to a set updating strategy to obtain a target parameter-related vector containing global optimal component information;
and S34, determining the optimal parameter value of each construction parameter according to the global optimal component information.
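Steps S31 to S34 describe a population-based search over parameter vectors. The Python sketch below shows one possible form under stated assumptions: the updating strategy (each vector moves a random fraction of the way toward the best vector found so far, plus small Gaussian noise), the initialization range and the toy objective are all hypothetical, since the claim leaves the strategy open. In practice the objective would be an error measure of the classification model for a given parameter setting.

```python
import random

def optimize(objective, num_params, num_vectors=10, iterations=50, seed=0):
    rng = random.Random(seed)
    # S31/S32: a set number of vectors, each with one component per parameter.
    vectors = [[rng.uniform(-5.0, 5.0) for _ in range(num_params)]
               for _ in range(num_vectors)]
    best = min(vectors, key=objective)[:]
    initial_best_score = objective(best)
    # S33: iterative update under the assumed strategy.
    for _ in range(iterations):
        for vec in vectors:
            for d in range(num_params):
                vec[d] += rng.random() * (best[d] - vec[d]) + rng.gauss(0.0, 0.1)
            # The global best is replaced only by a strictly better vector.
            if objective(vec) < objective(best):
                best = vec[:]
    # S34: the components of the best vector are the chosen parameter values.
    return best, objective(best), initial_best_score

def sphere(v):
    """Toy objective standing in for a model's validation error."""
    return sum(x * x for x in v)

best, best_score, initial_score = optimize(sphere, num_params=3)
```

Because the best vector is only ever replaced by a strictly better one, the returned score can never be worse than the best score in the initial population.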
5. An apparatus for performing the data reconciliation decision method according to any one of claims 1 to 4, wherein the apparatus comprises:
the data acquisition module is connected with the central control and processing module and is used for acquiring customer information and consumption data of users in the big data network;
the preliminary classification module is connected with the central control and processing module and is used for preliminarily classifying the acquired data to form required large-class data information;
the data preprocessing module is connected with the central control and processing module and is used for preprocessing the large-class data information, deleting redundant data and generating a consumption data record set meeting the classification mining requirement;
the central control and processing module is connected with the data acquisition module, the preliminary classification module, the data preprocessing module, the secondary classification module, the model building module and the data generation module and is used for processing the acquired data and coordinating each module according to the processing result and preset parameters;
the secondary classification module is connected with the central control and processing module and is used for storing the required big data in blocks in the big data server and further dividing it into subclasses;
the model building module is connected with the central control and processing module and used for constructing a user consumption classification model and mining consumption characteristics of different types of consumption groups;
and the data generation module is connected with the central control and processing module and is used for constructing a personalized bill generation model of the credit card client and generating a personalized statement bill.
6. The apparatus of data reconciliation decision of claim 5 wherein the preliminary classification module comprises:
the key information extraction unit is used for extracting different keywords as classification bases;
the data block dividing unit is used for dividing the current data object into data blocks, using a different standard in each of a plurality of logical channels;
and the classified storage unit is used for packaging and storing the divided data blocks separately and naming them by keyword.
7. The apparatus of data reconciliation decision of claim 5 wherein the data preprocessing module comprises:
a first redundant data block determining unit, configured to find, in each logical channel, one or more first redundant data blocks of a current data object based on the data blocks into which the current data object is divided in that logical channel;
a data de-duplication unit for de-duplicating the current data object with respect to all the first redundant data blocks found by the first redundant data block determining unit;
and an overlap elimination unit for eliminating any overlap between two or more first redundant data blocks based on the offset and the length of each first redundant data block found in the plurality of logical channels.
8. The apparatus for data reconciliation decision of claim 5 wherein the central control and processing module comprises:
the parameter presetting unit is used for presetting and inputting control parameters through external input equipment;
the data processing unit is used for processing and analyzing the acquired data according to the preset parameters;
and the control instruction generating unit is used for generating a control instruction according to the processing result and sending the control instruction to different controlled modules.
9. A server, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of data reconciliation decision according to any of claims 1-6.
10. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out a method of data reconciliation decision according to one of the claims 1 to 6.
CN202011080428.3A 2020-10-10 2020-10-10 Data reconciliation decision method, device, server and storage medium Withdrawn CN112270595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011080428.3A CN112270595A (en) 2020-10-10 2020-10-10 Data reconciliation decision method, device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011080428.3A CN112270595A (en) 2020-10-10 2020-10-10 Data reconciliation decision method, device, server and storage medium

Publications (1)

Publication Number Publication Date
CN112270595A true CN112270595A (en) 2021-01-26

Family

ID=74338941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011080428.3A Withdrawn CN112270595A (en) 2020-10-10 2020-10-10 Data reconciliation decision method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN112270595A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077292A (en) * 2021-04-20 2021-07-06 北京沃东天骏信息技术有限公司 User classification method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210126