CN117729264B - Digital financial service mass information transmission method - Google Patents

Info

Publication number
CN117729264B
CN117729264B CN202410176308.5A CN202410176308A CN117729264B CN 117729264 B CN117729264 B CN 117729264B CN 202410176308 A CN202410176308 A CN 202410176308A CN 117729264 B CN117729264 B CN 117729264B
Authority
CN
China
Prior art keywords
data
cluster
row
structured matrix
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410176308.5A
Other languages
Chinese (zh)
Other versions
CN117729264A (en)
Inventor
石桥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital China Rongxin Cloud Technology Service Co ltd
Original Assignee
Digital China Rongxin Cloud Technology Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Digital China Rongxin Cloud Technology Service Co ltd filed Critical Digital China Rongxin Cloud Technology Service Co ltd
Priority to CN202410176308.5A priority Critical patent/CN117729264B/en
Publication of CN117729264A publication Critical patent/CN117729264A/en
Application granted granted Critical
Publication of CN117729264B publication Critical patent/CN117729264B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of financial data transmission, and in particular to a mass information transmission method for digital financial services. The method comprises: acquiring financial service information transmission data; establishing a sensitive database and a structured matrix; analyzing each row of the structured matrix to construct transition change values and business mode variation amplitudes, and from them an overall business mode difference stability index; constructing a difference-varying robust factor for each datum of the structured matrix; analyzing the time-series data in the structured matrix together with the correlation between rows to construct a comprehensive mode association degree; obtaining a time mode association factor and a business change influence factor; and improving the sliding window length of the compression algorithm, then compressing and encrypting the sensitive data to complete the transmission of financial service information. The invention combines characteristics of the data across different dimensions to improve the sliding window used during compression, with the aim of improving the compression and transmission efficiency of mass financial service information.

Description

Digital financial service mass information transmission method
Technical Field
The application relates to the technical field of financial data transmission, in particular to a mass information transmission method of digital financial services.
Background
In recent years, the rapid development of information technology, and in particular the maturing of cloud computing, big data and artificial intelligence, has given the financial industry more advanced tools and platforms. At the same time, consumer demand for financial services keeps growing: consumers expect to trade and manage financial resources more conveniently and to receive more personalized and intelligent services. Digital financial services have therefore developed rapidly, demand for them is rising, and their prospects are broad. They can bring financial institutions more business opportunities and income sources, improve the efficiency and quality of financial services, reduce operating costs, and strengthen competitiveness. They also promote financial inclusion, so that more people can enjoy convenient financial services, which in turn supports economic development and social progress.
However, with the rapid development of digital financial services, financial institutions are under pressure to manage and analyze massive volumes of data. To address this, they need efficient data compression algorithms that reduce storage space and transmission bandwidth and improve data processing efficiency. When processing massive data, however, existing compression algorithms generally take a long time, which leads to information loss, low efficiency, and an inability to compress the repeated patterns in the data accurately.
Disclosure of Invention
In order to solve the technical problems, the invention provides a digital financial service mass information transmission method to solve the existing problems.
The invention relates to a mass information transmission method of digital financial services, which adopts the following technical scheme:
The embodiment of the invention provides a digital financial service mass information transmission method, which comprises the following steps:
Acquiring financial service information transmission data; dividing the financial service information transmission data into sensitive data and non-sensitive data according to the openness of the data; establishing a sensitive database from the sensitive data; building a structured matrix from the non-sensitive data;
acquiring the transition change value of each cluster from each row of the structured matrix in combination with a clustering algorithm; acquiring the business mode variation amplitude of each row of the structured matrix from the transition change value of each cluster and the numerical distribution of each cluster; acquiring an overall business mode difference stability index from the business mode variation amplitudes of all rows of the structured matrix; for each row of the structured matrix, acquiring the difference-varying robust factor of each datum from the overall business mode difference stability index of the structured matrix and the data distribution of the row; acquiring the comprehensive mode association degree of each month from the correlation between each row of the structured matrix and the remaining rows; obtaining the time mode association factor of each month from the comprehensive mode association degree of that month combined with the difference-varying robust factors of the data of the structured matrix; taking the product of the time mode association factor of each month and the overall business mode difference stability index as the business change influence factor of that month;
improving the sliding window length of the LZ77 algorithm according to the business change influence factor of each month; and compressing and encrypting the sensitive data with the improved sliding window to complete the transmission of financial service information.
Preferably, the building a structured matrix according to the non-sensitive data includes:
Normalizing the non-sensitive data with the Z-score algorithm, and taking each type of normalized non-sensitive data, arranged in month order, as one row of the structured matrix.
Preferably, the obtaining the transition change value of each cluster according to each row of data of the structured matrix by combining with a clustering algorithm specifically includes:
Clustering each row of the structured matrix with DBSCAN density clustering to obtain the clusters, and taking the data of the minimum month and the maximum month in each cluster as the left boundary point and the right boundary point of that cluster, respectively;
For each cluster, taking the absolute value of the difference between the value of its left boundary point and the value of the right boundary point of the left-adjacent cluster as the left transition change value; taking the absolute value of the difference between the value of its right boundary point and the value of the left boundary point of the right-adjacent cluster as the right transition change value; and taking one half of the sum of the left and right transition change values as the transition change value of the cluster.
Preferably, obtaining the business mode variation amplitude of each row of the structured matrix from the transition change value of each cluster and the numerical distribution of each cluster specifically includes:
For each row of the structured matrix, calculating the mean of the data in each cluster; for each cluster, calculating the negative of the difference between each datum in the cluster and that mean; taking this negative as the exponent of an exponential function with the natural constant as base; calculating the product of the exponential function's result and the transition change value of the cluster; summing these products over all data of the cluster; and taking the total of these sums over all clusters of the row as the business mode variation amplitude of that row.
Preferably, the overall business mode difference stability index is specifically the sum of the business mode variation amplitudes of all rows of the structured matrix.
Preferably, for each row of the structured matrix, obtaining the difference-varying robust factor of each datum from the overall business mode difference stability index of the structured matrix combined with the data distribution of the row specifically includes:
For each row of the structured matrix, setting an initial window centered on each datum and a step size by which the window grows toward both sides; the difference-varying robust factor $R_{i,g}$ of the g-th datum of the i-th row of the structured matrix is then computed from the following quantities:
$R_{i,g}$ is the difference-varying robust factor of the g-th datum of row i, which represents how robust the data variation around that datum is; $CV_{i,g}$ is the in-cluster variation coefficient of the cluster containing the g-th datum of row i; $B_i$ is the business mode variation amplitude of row i of the structured matrix K; $k_{i,g}$ is the g-th datum of row i of K; $\mu_p$ and $\sigma_p$ are the mean and the standard deviation of the data in the window after its p-th growth; and p indexes the window growths;
The in-cluster variation coefficient is specifically the ratio of the standard deviation to the mean of the cluster containing the g-th datum of row i.
Preferably, obtaining the comprehensive mode association degree of each month from the correlation between each row of the structured matrix and the remaining rows includes:
Taking the Pearson correlation coefficients between each row of the structured matrix and the remaining rows as the corresponding row of a correlation coefficient matrix; setting a correlation threshold, and obtaining the number S of entries of the correlation coefficient matrix that exceed the threshold;
Fitting a straight line to each column of the structured matrix to obtain the slope of each column; the comprehensive mode association degree $Z_q$ of each month is then computed from the following quantities:
$Z_q$ is the comprehensive mode association degree of the q-th month; $a_q$, $a_{q-1}$ and $a_{q+1}$ are the slopes of the straight lines fitted to the q-th, (q-1)-th and (q+1)-th columns, respectively; S is the number of entries above the correlation threshold; m is the number of rows of the correlation coefficient matrix; and $\max(\cdot)$ is the maximum function.
Preferably, obtaining the time mode association factor of each month from the comprehensive mode association degree of that month and the difference-varying robust factors of the data of the structured matrix includes:
Taking the negative of the difference-varying robust factor of each datum of the structured matrix as the exponent of an exponential function with the natural constant as base, and summing the results of the exponential function over all data of each column of the structured matrix; and taking the product of this sum and the comprehensive mode association degree of the corresponding month as the time mode association factor of that month.
Preferably, improving the sliding window length of the LZ77 algorithm according to the business change influence factor of each month specifically includes:
Presetting an initial window length and an adjustment parameter for the LZ77 algorithm; calculating the sum of the normalized business change influence factor of each month and the adjustment parameter; and taking the rounded product of this sum and the initial window length as the improved sliding window length of the LZ77 algorithm.
Preferably, the compressing and encrypting the sensitive data by combining with the improved sliding window to complete the transmission of the financial service information includes:
Compressing the data of the sensitive database by adopting an LZ77 algorithm in combination with the improved sliding window length; and encrypting the data compressed by the sensitive database by adopting an AES algorithm.
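The window-length rule described above can be sketched in a few lines. This is a minimal illustration rather than the patented implementation: the text does not say how the business change influence factor is normalized, so min-max normalization is an assumption here, and the function names, the adjustment parameter value and the sample factors are hypothetical.

```python
def normalize(values):
    """Min-max normalize to [0, 1] (the normalization method is an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def adaptive_window_lengths(factors, initial_length, adjust):
    """Window length per month: round((normalized factor + adjust) * initial length)."""
    return [round((y + adjust) * initial_length) for y in normalize(factors)]


# Hypothetical business change influence factors for three months.
monthly_factors = [2.0, 6.0, 4.0]
lengths = adaptive_window_lengths(monthly_factors, initial_length=128, adjust=1.0)
```

With an adjustment parameter of 1.0 the window never shrinks below its initial length and at most doubles in the month with the largest factor; other parameter values shift that range.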
The invention has at least the following beneficial effects:
The invention constructs a sensitive database from the sensitive data and a structured matrix from the non-sensitive data. It analyzes each datum of the structured matrix on the basis of density clustering and constructs an overall business mode difference stability index, which reflects the amplitude of the enterprise's business mode changes in the non-sensitive data over one year. It sets a window and constructs a difference-varying robust factor for each datum of the matrix, which reflects how robust the data variation around that datum is. It constructs a time mode association factor using the timestamp, which reflects the strength of the association between different business modes in different months. Finally, it combines the time mode association factor and the overall business mode difference stability index into a business change influence factor, which reflects the influence of the enterprise's business changes on mode changes in different months.
When the data in the sensitive database is compressed, the window length is adjusted according to the business change influence factor of the month given by each datum's timestamp. This enables efficient compression of massive sensitive data and improves the efficiency and robustness of the algorithm.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for transmitting mass information of digital financial services provided by the invention;
FIG. 2 is a flow chart for improving the sliding window length of the LZ77 algorithm.
Detailed Description
In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following detailed description refers to the specific implementation, structure, characteristics and effects of a digital financial service mass information transmission method according to the invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following specifically describes a specific scheme of the digital financial service mass information transmission method provided by the invention with reference to the accompanying drawings.
The invention provides a mass information transmission method for digital financial services.
Specifically, the following method for transmitting mass information of digital financial services is provided, referring to fig. 1, the method includes the following steps:
Step S001: Retrieve the sensitive data and non-sensitive data from the data storage system of the financial enterprise, construct a sensitive database from the sensitive data, and construct a structured matrix from the non-sensitive data.
This embodiment takes a financial enterprise (such as a securities exchange) as an example. When such an enterprise transmits financial service information, the data to be transmitted includes a large amount of sensitive data and non-sensitive data. Sensitive data is information of a sensitive nature that requires special protection; non-sensitive data is general information that does not involve privacy. In this embodiment, sensitive data specifically refers to confidential information related to personal identity, finances and accounts; non-sensitive data specifically refers to transaction data, market data, business operation data, payment data, insurance data and the like.
The relevant sensitive data is extracted from the data storage system of the financial enterprise, a timestamp is added to it, and all of the sensitive data is integrated into structured data to construct the sensitive database. Taking users' personal identity information as an example: the user name, address, identity card number, social security number and user service number are retrieved from the data storage system, and a structured data table is created in which each row represents a data type (such as user service number or name) and each column holds the specific information. A timestamp with a granularity of one day is added to record when each item was created or updated. Updating the timestamp whenever data is inserted or updated makes it easy to track changes to user information and keeps the data timely. A relational database is selected, and structured data tables containing timestamps are built in the same way for the other data types, such as enterprise financial information, customers' personal credential information and transaction details.
Because the time stamp of the data in the database is in days, the sensitive database is huge and contains massive financial service information.
With the month as the timestamp, the non-sensitive data of each month is retrieved from the data storage system of the financial enterprise. The types of non-sensitive data include, but are not limited to, transaction amount, transaction time, transaction location and transaction type (purchase, sale, transfer); such data usually contains no direct personal identity information such as names, describes general business operations and transactions, and is in many cases publicly available. The market data includes, but is not limited to, product price, exchange rate and interest rate. The business operation data includes, but is not limited to, service cost, revenue and customer satisfaction. The payment data includes, but is not limited to, payment history, payment means and refund records. The insurance data includes, but is not limited to, policy information, claim records and premiums. Let the number of non-sensitive data types be m. With the month as the timestamp, the collected data is standardized with the Z-score algorithm, and the structured matrix K is then constructed as follows:
$$K=\begin{bmatrix} k_{1,1} & k_{1,2} & \cdots & k_{1,n} \\ k_{2,1} & k_{2,2} & \cdots & k_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ k_{m,1} & k_{m,2} & \cdots & k_{m,n} \end{bmatrix}$$
where $k_{m,n}$ is the value of the m-th data type in the n-th month. In this embodiment, m is 16 and n is 12.
At this point, the sensitive database and the structured matrix of non-sensitive data have been obtained.
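The construction of the structured matrix can be sketched as follows. This is a minimal illustration under stated assumptions: each data type is standardized over its own monthly series using the population standard deviation, and the function names and sample values are hypothetical (only four months and two data types are shown, whereas the embodiment uses 16 types and 12 months).

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one data type's monthly series to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def build_structured_matrix(monthly_by_type):
    """Return K as a list of rows: one row per data type, one column per month."""
    return [zscore(series) for series in monthly_by_type.values()]


# Hypothetical monthly values, already ordered by month.
raw = {
    "transaction_amount": [120.0, 135.0, 128.0, 160.0],
    "service_cost": [40.0, 42.0, 39.0, 41.0],
}
K = build_structured_matrix(raw)
```

Each row of `K` then has mean 0 and standard deviation 1, which puts data types with very different units on a comparable scale before clustering.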
Step S002: Analyze the structured matrix to determine how, and to what degree, the enterprise's modes change, and reflect the influence of changes in the enterprise's business on mode changes through the correlated changes of the different data in each month.
Sensitive data and non-sensitive data are usually two mutually independent data sets: a change in the non-sensitive data does not cause a change in the sensitive data, and vice versa. Non-sensitive data, however, is widely available and unrelated to specific private data, so it can be used more generally for business analysis. Because it is easy to acquire and process, it gives a more intuitive picture of overall market trends, customer behavior, product performance and the like. Business performance, market competitiveness and the effectiveness of products or services can be tracked through changes in the non-sensitive data, which makes the analysis more comprehensive and not restricted to a limited set of individuals.
Each row of the structured matrix is taken as input. In this embodiment the minimum number of neighborhood points is set to 2 and the maximum radius to 1, and each row of the structured matrix is clustered with DBSCAN density clustering, which is a well-known technique in the art and is not described in detail here. Clustering the data of the i-th row with DBSCAN yields $c_i$ clusters; the more data form a cluster, the greater the density variation of the i-th row's data in the feature space. Because each datum in a cluster represents one month of data, this embodiment regards each cluster as one business mode of the enterprise. The data of the minimum month and the maximum month in a business mode are marked as its left boundary point and right boundary point, respectively. Let the j-th cluster of row i contain $n_{i,j}$ data with in-cluster mean $\mu_{i,j}$, and let the position indices of its left and right boundary points be $l_{i,j}$ and $r_{i,j}$. The overall business mode difference stability index $W$ of the structured matrix is constructed as:
$$W=\sum_{i=1}^{m}B_i,\qquad B_i=\sum_{j=1}^{c_i}\sum_{v=1}^{n_{i,j}}T_{i,j}\,e^{-\left(x_{i,j,v}-\mu_{i,j}\right)},\qquad T_{i,j}=\frac{\left|k_{i,l_{i,j}}-k_{i,r_{i,j-1}}\right|+\left|k_{i,r_{i,j}}-k_{i,l_{i,j+1}}\right|}{2}$$
where $W$ is the overall business mode difference stability index of the structured matrix; $B_i$ is the business mode variation amplitude of the i-th row; $x_{i,j,v}$ is the value of the v-th datum in the j-th cluster of the i-th row; $\mu_{i,j}$ is the mean of the data in the j-th cluster of the i-th row; $T_{i,j}$ is the transition change value of the j-th cluster of the i-th row; $k_{i,l_{i,j}}$ and $k_{i,r_{i,j-1}}$ are the values of the left boundary point of the j-th cluster and the right boundary point of the (j-1)-th cluster of the i-th row; $k_{i,r_{i,j}}$ and $k_{i,l_{i,j+1}}$ are the values of the right boundary point of the j-th cluster and the left boundary point of the (j+1)-th cluster of the i-th row; $m$ is the number of rows of the structured matrix; $c_i$ is the number of clusters obtained by DBSCAN clustering of the i-th row; and $n_{i,j}$ is the number of data in the j-th cluster of the i-th row. If the left or right boundary of the j-th cluster is the boundary of the i-th row, the difference between the value of that boundary point and the value of the boundary point of the adjacent cluster is defined as 0. $\left|k_{i,l_{i,j}}-k_{i,r_{i,j-1}}\right|$ is saved as the left transition change value, and $\left|k_{i,r_{i,j}}-k_{i,l_{i,j+1}}\right|$ is saved as the right transition change value.
Since each cluster of each row of the structured matrix is an overall mode of the enterprise, the change in boundary points from one cluster to the next can be understood as the enterprise's transition between modes. In the calculation of $T_{i,j}$, the terms $\left|k_{i,l_{i,j}}-k_{i,r_{i,j-1}}\right|$ and $\left|k_{i,r_{i,j}}-k_{i,l_{i,j+1}}\right|$ are the absolute differences between the boundary points on the left and right sides of the j-th cluster and the boundary points of the adjacent clusters. The larger these values, the more complex the business mode transition, which may involve multiple dimensions or factors; the larger $T_{i,j}$ is, the larger $W$ becomes. The term $x_{i,j,v}-\mu_{i,j}$ is the offset of each data point in the j-th cluster relative to the cluster mean. It measures the difference between each data point and the typical pattern of its cluster, and thus the magnitude of change within the business mode. The larger this offset, the more unstable the business mode: $e^{-\left(x_{i,j,v}-\mu_{i,j}\right)}$ is smaller, $B_i$ is smaller, and $W$ decreases accordingly.
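The transition change value, the business mode variation amplitude and the overall stability index can be sketched as below. This is a minimal illustration, not the patented implementation. It assumes the cluster labels of each row have already been produced by some DBSCAN implementation (for example scikit-learn's `DBSCAN(eps=1, min_samples=2)`), that noise points carry the label -1 and are ignored, and that clusters are ordered by their first month; all function names and sample values are hypothetical.

```python
import math


def clusters_by_time(labels):
    """Group month indices by cluster label, skipping noise (-1), ordered by first month."""
    groups = {}
    for month, label in enumerate(labels):
        if label != -1:
            groups.setdefault(label, []).append(month)
    return sorted(groups.values(), key=lambda months: months[0])


def row_amplitude(row, labels):
    """Business mode variation amplitude of one row of the structured matrix."""
    clusters = clusters_by_time(labels)
    total = 0.0
    for j, months in enumerate(clusters):
        left, right = months[0], months[-1]  # minimum and maximum month of the cluster
        # Jumps to the neighbouring clusters; 0 where no neighbouring cluster exists.
        left_t = abs(row[left] - row[clusters[j - 1][-1]]) if j > 0 else 0.0
        right_t = (abs(row[right] - row[clusters[j + 1][0]])
                   if j + 1 < len(clusters) else 0.0)
        transition = (left_t + right_t) / 2
        mu = sum(row[t] for t in months) / len(months)
        total += sum(transition * math.exp(-(row[t] - mu)) for t in months)
    return total


def stability_index(matrix, all_labels):
    """Overall index: sum of the amplitudes of all rows."""
    return sum(row_amplitude(row, labels) for row, labels in zip(matrix, all_labels))


# Hypothetical row with two clearly separated clusters.
row = [0.0, 0.2, 3.0, 3.2]
labels = [0, 0, 1, 1]
amplitude = row_amplitude(row, labels)
```

A row that forms a single cluster has no transitions, so its amplitude is 0 and it contributes nothing to the overall index.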
Next, take the g-th datum of the i-th row as an example. A window of initial size $f$ is constructed with the g-th datum at its center, where $f$ is 3; the window grows toward both sides with a step of 2 and stops after growing twice. Let the mean and standard deviation of the data in the window after its p-th growth be $\mu_p$ and $\sigma_p$, and let the cluster containing the selected g-th datum be the j-th cluster of the i-th row. The difference-varying robust factor $R_{i,g}$ of the g-th datum of the i-th row is then constructed from the following quantities.
$R_{i,g}$ is the difference-varying robust factor of the g-th datum of row i, which represents how robust the data variation around that datum is; $CV_{i,g}$ is the in-cluster variation coefficient of the cluster containing the g-th datum of row i; $B_i$ is the business mode variation amplitude of row i of the structured matrix K; $k_{i,g}$ is the value of the g-th datum of row i of K; $\mu_p$ and $\sigma_p$ are the mean and the standard deviation of the data in the window after its p-th growth; and p is the summation index, which counts the window growths. It should be noted that the in-cluster variation coefficient of the cluster containing the g-th datum of row i is specifically the ratio of the standard deviation to the mean of that cluster.
The product of the in-cluster variation coefficient and the business mode variation amplitude, $CV_{i,g}\,B_i$, reveals the fluctuation of the enterprise's mode in the month corresponding to the datum: the larger this value, the more unstable the mode and the smaller the difference-varying robust factor. The mean and standard deviation are introduced to measure the combined difference between the selected datum and its surrounding data: the larger this difference, the larger the gap between the enterprise's mode in the selected month and in the surrounding months, the less robust the enterprise's mode, and the smaller the difference-varying robust factor.
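The ingredients of the robust factor, namely the statistics of the growing window and the in-cluster variation coefficient, can be sketched as follows. The sketch deliberately stops short of combining them into the factor itself. It assumes that each growth extends the total window length by the step (lengths 5 and 7 from an initial length of 3), that the window is clipped at the row boundaries, and that population standard deviations are used; the function names are hypothetical.

```python
from statistics import mean, pstdev


def window_stats(row, g, f=3, step=2, growths=2):
    """Return [(mean, std)] of the window centred on index g after each growth."""
    stats = []
    for p in range(1, growths + 1):
        half = (f + p * step) // 2  # half-width after the p-th growth
        lo, hi = max(0, g - half), min(len(row), g + half + 1)
        window = row[lo:hi]
        stats.append((mean(window), pstdev(window)))
    return stats


def coefficient_of_variation(cluster_values):
    """In-cluster variation coefficient: standard deviation divided by mean."""
    return pstdev(cluster_values) / mean(cluster_values)


# Hypothetical 12-month row.
row = [float(v) for v in range(12)]
stats_mid = window_stats(row, g=5)
```

Because the rows are Z-score standardized, a cluster mean can be close to zero, in which case the ratio in `coefficient_of_variation` becomes very large or undefined; a real implementation would need to guard against that.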
In the structured matrix, different columns hold the values of the data types at different points in time. Many industries in financial services are affected by seasonal factors: different seasons can change consumer behavior, demand and market conditions, so the correlation between business modes can change from one seasonal period to another. Changes in market trends and the competitive situation also lead enterprises to adjust their strategies and business modes, which changes the correlation further, as do economic factors, industry technology, relevant policies and changes in consumer behavior. A change in the correlation of business modes shows up mainly as a large number of business modes changing together within the same time period. For example, a large volume of purchase or sale transactions may affect market liquidity, which changes the transaction volume and transaction amount of the industry's insurance business, and in turn triggers a series of correlated changes in market decisions, consumption behavior, customer satisfaction and so on.
The Pearson correlation coefficient between each row of the structured matrix and the remaining rows is calculated, and the Pearson correlation coefficient between the c-th row and the d-th row is denoted $\rho_{c,d}$. An $m \times m$ correlation coefficient matrix $P$ is constructed from all of the Pearson correlation coefficients, and a correlation threshold $\tau$ is set; in this embodiment $\tau$ is 0.7. The matrix $P$ is traversed for entries greater than $\tau$, and the number of such entries is recorded as S. The data of the q-th column of the structured matrix K is then taken as input and fitted with a straight line to obtain the best-fit line $y=a_q x+b_q$ (linear fitting is a well-known technique in the art and is not described in detail here), where $a_q$ is the fitted slope and $b_q$ is the fitted intercept. The time mode association factor $F_q$ of the q-th month is constructed as:
$$F_q=Z_q\sum_{h=1}^{m}e^{-R_{h,q}}$$
where $F_q$ is the time mode association factor of the q-th month; $Z_q$ is the comprehensive mode association degree of the q-th month, which is built from S, m and the fitted slopes; $R_{h,q}$ is the difference-varying robust factor of the q-th datum of the h-th row; S is the number of entries of the correlation coefficient matrix $P$ greater than $\tau$; $a_q$, $a_{q-1}$ and $a_{q+1}$ are the slopes of the lines fitted to the q-th, (q-1)-th and (q+1)-th columns, respectively; m is the number of rows of the correlation coefficient matrix; and $\max(\cdot)$ is the maximum function. When q is 1, the column is the left boundary of the structured matrix K and the slope difference with the (q-1)-th column is taken as 0; similarly, when q is 12, the column is the right boundary of K and the slope difference with the (q+1)-th column is taken as 0.
In the calculation of the comprehensive mode association degree $Z_q$ of the q-th month, the first component is the ratio of the number of Pearson correlation coefficients between pairs of distinct rows that exceed the correlation threshold to the number of combinations of two distinct rows. It expresses the share of data types with strongly correlated modes: the larger it is, the more modes are correlated and the larger $Z_q$ becomes. The second component is the maximum of the slope differences between the fitted line of the q-th month and those of the two adjacent months. If only a very few modes change between two months, the slope of the fitted line is barely affected; conversely, if many modes change in a correlated way between two months, the fitted slope changes markedly, so this value represents the range of the correlated modes. In $F_q$, the more robust the data variation around the q-th month's datum in the h-th row, the more robust the change of that month relative to its surrounding months; the more robust the data variation of the q-th month as a whole, the lower the probability of a correlated mode change and the smaller $\sum_{h=1}^{m}e^{-R_{h,q}}$. Multiplying the probability of a correlated mode change by the range of the correlated modes gives the overall time mode association factor $F_q$ of the q-th month, which reflects the strength of influence of the overall correlated mode change in that month.
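The building blocks of this step can be sketched as follows: counting strongly correlated row pairs, fitting a slope to each column, taking the largest slope gap to the neighbouring months, and forming the time mode association factor from a given comprehensive mode association degree. This is a minimal illustration under stated assumptions: each unordered pair of distinct rows is counted once, each column is fitted against the row index, and the slope gap uses absolute differences. The function names, the sample matrix and the sample robust factors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def count_correlated_pairs(matrix, threshold=0.7):
    """Number of distinct row pairs whose correlation exceeds the threshold."""
    m = len(matrix)
    return sum(1 for c in range(m) for d in range(c + 1, m)
               if pearson(matrix[c], matrix[d]) > threshold)


def column_slope(matrix, q):
    """Least-squares slope of column q fitted against the row index."""
    ys = [row[q] for row in matrix]
    xs = list(range(len(ys)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


def max_slope_gap(slopes, q):
    """Largest absolute slope difference to the adjacent months (0 at the edges)."""
    left = abs(slopes[q] - slopes[q - 1]) if q > 0 else 0.0
    right = abs(slopes[q] - slopes[q + 1]) if q + 1 < len(slopes) else 0.0
    return max(left, right)


def time_mode_factor(association_degree, robust_column):
    """Association degree times the sum of exp(-R) over the column's robust factors."""
    return association_degree * sum(math.exp(-r) for r in robust_column)


K = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
S = count_correlated_pairs(K)
slopes = [column_slope(K, q) for q in range(4)]
```

In this sample only the first two rows move together, so `S` is 1; the third row is perfectly anti-correlated with both and does not pass the threshold.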
Finally, the business change influence factor of the q-th month is calculated by combining the overall business mode difference stability index described above. The expression is:

$$Y_q = T_q \cdot W$$

In the formula, $Y_q$ is the business change influence factor of the q-th month, $T_q$ is the time pattern association factor of the q-th month, and $W$ is the overall business mode difference stability index. The business change influence factor comprehensively expresses the influence of the state-owned enterprise's business change in the q-th month on the pattern change. It considers both the relevance of the time pattern and the difference stability of the overall business mode, so that it reflects the influence of the enterprise's business change on the pattern change.
Step S003: Improve the compression window of the LZ77 algorithm according to the business change influence factors of the different months, then compress the data in the sensitive database and encrypt it to ensure the security of data transmission.
The collected data in the sensitive database is compressed with the LZ77 compression algorithm. During compression, the LZ77 algorithm maintains a window of fixed size that contains the most recently seen input data. When the algorithm finds a repeated string in the window, it represents the repetition by a pointer to the same string in the window, which achieves compression. However, because the volume of data to be transmitted is very large, a window of fixed length may miss some longer repeated strings, which reduces compression efficiency. This embodiment therefore sets the initial length of the sliding window in the LZ77 algorithm to L, where L takes a value of 128Kb.
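The effect of the window length on LZ77 can be seen in a small Python sketch. This is a textbook-style illustration, not the embodiment's implementation: the triple format `(offset, length, next_byte)`, the function names `lz77_compress` and `lz77_decompress`, and the parameters `window` and `max_len` are assumptions made for the example, and the brute-force search is far too slow for a 128Kb window.

```python
def lz77_compress(data: bytes, window: int = 4096, max_len: int = 255):
    """Encode data as (offset, length, next_byte) triples using a sliding window."""
    i, out = 0, []
    while i < len(data):
        start = max(0, i - window)          # oldest position still inside the window
        best_off, best_len = 0, 0
        for j in range(start, i):           # brute-force search for the longest match
            length = 0
            # Stop one byte early so a literal "next byte" always remains.
            while (length < max_len and i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decompress(tokens) -> bytes:
    """Rebuild the original bytes from (offset, length, next_byte) triples."""
    buf = bytearray()
    for off, length, nxt in tokens:
        for _ in range(length):             # byte-by-byte copy handles overlapping matches
            buf.append(buf[-off])
        buf.append(nxt)
    return bytes(buf)
```

For example, `lz77_compress(b"abcabcabcabcx", window=8)` needs far fewer triples than the same call with `window=1`, because the short window can no longer see the earlier `abc` repetitions.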
In the sensitive database, a change in the enterprise's business mode can cause the sensitive data to change on a large scale, so a larger window is used during periods in which the business mode changes in order to maintain compression efficiency. Because a timestamp attribute was introduced when the sensitive database was established, compression starts with a window of the initial length. During compression, the timestamp attribute of the data at the window center is checked in real time to determine the month in which that data was collected. Denoting this month as the q-th month, the sliding window length of the LZ77 algorithm is adaptively improved as:

$$L' = \mathrm{round}\big((\alpha + \mathrm{sigmoid}(Y_q)) \cdot L\big)$$

where $L'$ is the window length at the next slide, $\alpha$ is the adjustment parameter, $Y_q$ is the business change influence factor of the q-th month, $\mathrm{sigmoid}(\cdot)$ is the sigmoid normalization function, $L$ is the initial window length of the LZ77 algorithm, and round() is the rounding function. In this embodiment, the adjustment parameter takes a value of 1. The flow for improving the sliding window length of the LZ77 algorithm is shown in fig. 2.
After the window length is improved, the algorithm evaluates the business change influence of the month in which the window-center data lies each time the window moves and adjusts the window length in real time: the window is lengthened when the degree of business change is large and stays near the initial length when it is small.
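The window adjustment can be sketched in a few lines of Python. The sketch assumes the rule stated in claim 9 (the rounded product of the initial length and the sum of the adjustment parameter and the normalized factor) and assumes that the normalization is the plain logistic sigmoid; the names `sigmoid`, `next_window_length`, `window_for_timestamp` and `factors_by_month` are hypothetical.

```python
import math
from datetime import datetime


def sigmoid(x: float) -> float:
    """Numerically stable logistic function (assumed normalization)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def next_window_length(factor: float, initial_length: int, alpha: float = 1.0) -> int:
    """Window length at the next slide: round((alpha + sigmoid(factor)) * initial_length)."""
    return round((alpha + sigmoid(factor)) * initial_length)


def window_for_timestamp(ts: datetime, factors_by_month: dict, initial_length: int) -> int:
    """Use the factor of the month in which the window-center record was collected."""
    return next_window_length(factors_by_month[ts.month], initial_length)
```

Under the plain logistic sigmoid assumed here and an adjustment parameter of 1, a factor of 0 gives 1.5 times the initial length and a very large factor approaches twice the initial length; a different normalization in the source would shift these bounds.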
After the data in the sensitive database is compressed, the compressed data is taken as input and encrypted with the AES algorithm to ensure the security of the data in transmission, which completes the transmission of the mass financial service information.
In summary, the embodiment of the invention constructs a sensitive database from the sensitive data and a structured matrix from the non-sensitive data, and then analyzes the structured matrix in four steps. First, each data in the matrix is analyzed based on density clustering to construct the overall business mode difference stability index, which reflects the amplitude of the enterprise's business mode change within one year in the non-sensitive data. Second, a window is set and a difference-varying robust factor is constructed for each data in the matrix, which reflects how robust the data variation around that data is. Third, a time pattern association factor is constructed by combining the characteristics of the timestamp, which reflects the intensity of association among different business modes in different months. Fourth, a business change influence factor is constructed by combining the time pattern association factor and the overall business mode difference stability index, which reflects the influence of the enterprise's business change on the pattern change in different months.
Meanwhile, when the data in the sensitive database is compressed, the window length used during compression is improved according to the month of the data timestamp and the business change influence factor of that month. This achieves efficient compression of massive sensitive data and enhances the efficiency and robustness of the algorithm.
It should be noted that the order of the embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments. The foregoing describes specific embodiments of this specification. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, the embodiments are described in a progressive manner. Identical or similar parts of the embodiments may be cross-referenced, and each embodiment mainly describes its differences from the other embodiments.
The above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Modifications to the technical solutions described in the foregoing embodiments, or equivalent replacements of some of their technical features, that do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application shall all fall within the protection scope of the present application.

Claims (10)

1. The mass information transmission method for the digital financial service is characterized by comprising the following steps of:
Acquiring financial service information transmission data; dividing financial service information transmission data into sensitive data and non-sensitive data according to the openness of the data; establishing a sensitive database according to the sensitive data; building a structured matrix according to the non-sensitive data;
acquiring a transition change value of each cluster according to each row of data of the structured matrix in combination with a clustering algorithm; acquiring the business mode variation amplitude of each row of the structured matrix according to the transition change value of each cluster and the numerical distribution of each cluster; acquiring an overall business mode difference stability index according to the business mode variation amplitudes of all rows of the structured matrix; for each row of the structured matrix, acquiring a difference-varying robust factor of each data according to the overall business mode difference stability index of the structured matrix and the data distribution of each row; acquiring the comprehensive mode association degree of each month according to the correlation between each row of the structured matrix and the remaining rows; obtaining the time mode association factor of each month according to the comprehensive mode association degree of each month in combination with the difference-varying robust factors of the data of the structured matrix; taking the product of the time mode association factor of each month and the overall business mode difference stability index as the business change influence factor of each month;
improving the sliding window length of the LZ77 algorithm according to the business change influence factor of each month; and compressing and encrypting the sensitive data in combination with the improved sliding window to complete the transmission of the financial service information.
2. The method of claim 1, wherein the creating a structured matrix based on non-sensitive data comprises:
Normalizing the non-sensitive data using the Z-score algorithm, and taking the normalized non-sensitive data of each type, in month order, as each row of data of the structured matrix.
3. The method for transmitting mass information of digital financial services according to claim 1, wherein the step of acquiring the transition change value of each cluster according to each row of data of the structured matrix by combining a clustering algorithm comprises the following steps:
Clustering each row of data of the structured matrix by adopting DBSCAN density clustering to obtain each cluster, and taking the minimum month and the maximum month corresponding to the data in each cluster as a left boundary point and a right boundary point of each cluster respectively;
For each cluster, storing the absolute value of the difference value of the left boundary point value of each cluster and the right boundary point value of the left adjacent cluster as a left transition change value; storing the absolute value of the difference value of the right boundary point value of each cluster and the left boundary point value of the right adjacent cluster as a right transition change value; and taking one half of the sum value of the left transition change value and the right transition change value as the transition change value of each cluster.
4. The method for transmitting mass information of digital financial services according to claim 1, wherein the step of acquiring the business mode variation amplitude of each row of the structured matrix by combining the numerical distribution of each cluster according to the transition variation value of each cluster comprises the following steps:
Calculating the average value of the data in each cluster aiming at each row of the structured matrix; for each cluster, calculating the opposite number of the difference value between each data in the cluster and the mean value; taking the opposite number as an exponent of an exponential function based on a natural constant; calculating the product of the calculation result of the exponential function and the transition change value of the cluster; calculating the sum of the products of all the data of each cluster; and taking the result of adding the sum values of all the clusters of each row of the structured matrix as the service mode change amplitude of each row of the structured matrix.
5. The method of claim 1, wherein the global traffic pattern difference stability index is a sum of traffic pattern variation amplitudes of all rows of the structured matrix.
6. The method for transmitting mass information of digital financial services according to claim 1, wherein, for each row of the structured matrix, acquiring the difference-varying robust factor of each data according to the overall business mode difference stability index of the structured matrix and the data distribution of each row is specifically:
for each row of the structured matrix, setting an initial window centered on each data and a step size by which the window grows toward both sides; the expression of the difference-varying robust factor of the g-th data of the i-th row of the structured matrix is:
In the formula, $R_{i,g}$ is the difference-varying robust factor of the g-th data of the i-th row, which represents the degree of robustness of the data variation around that data; $CV_{i,g}$ is the in-cluster coefficient of variation of the cluster in which the g-th data of the i-th row is located; $A_i$ is the business mode variation amplitude of the i-th row of the structured matrix K; $x_{i,g}$ is the g-th data of the i-th row of the structured matrix K; $\mu_p$ is the mean of the data in the window after the p-th increase; $\sigma_p$ is the standard deviation of the data in the window after the p-th increase; and p is the number of window increases;
The in-cluster coefficient of variation is specifically the ratio of the standard deviation to the mean of the cluster in which the g-th data of the i-th row is located.
7. The method for transmitting mass information of digital financial services according to claim 1, wherein the step of obtaining the degree of association of the integrated mode for each month according to the correlation between each row of the structured matrix and the remaining rows comprises:
Taking the Pearson correlation coefficients between each row of the structured matrix and the remaining rows as each row of data of the correlation coefficient matrix; setting a correlation threshold, and acquiring the number S of data in the correlation coefficient matrix that are greater than the correlation threshold;
Performing straight-line fitting on each column of data of the structured matrix to obtain the slope of each column; the expression of the comprehensive mode association degree of each month is:
In the formula, $Z_q$ is the comprehensive mode association degree of the q-th month; $k_q$, $k_{q-1}$ and $k_{q+1}$ are the slopes of the fitted straight lines of the q-th, (q-1)-th and (q+1)-th columns, respectively; m is the number of rows of the similarity matrix; and $\max(\cdot)$ is the maximum-value function.
8. The method for transmitting mass information of digital financial services according to claim 1, wherein the step of obtaining the time mode association factor of each month according to the comprehensive mode association degree of each month in combination with the difference-varying robust factors of the data of the structured matrix comprises:
Taking the opposite number of the difference-varying robust factor of each data of the structured matrix as the exponent of an exponential function based on the natural constant, and calculating the sum of the results of the exponential function over all data of each column of the structured matrix; and taking the product of the sum and the comprehensive mode association degree of each month as the time mode association factor of each month.
9. The method for transmitting mass information of digital financial services according to claim 1, wherein the improvement of the sliding window length of the LZ77 algorithm according to the business change influence factor of each month is specifically as follows:
Presetting an initial window length and an adjustment parameter of an LZ77 algorithm; calculating the sum of the normalized value of the business change influence factor of each month and the adjustment parameter; and taking the rounded value of the product of the sum value and the initial window length as the sliding window length after the LZ77 algorithm is improved.
10. The method for transmitting mass information of digital financial services according to claim 9, wherein the compressing and encrypting the sensitive data in combination with the improved sliding window to complete the transmission of the financial service information comprises:
Compressing the data of the sensitive database by adopting an LZ77 algorithm in combination with the improved sliding window length; and encrypting the data compressed by the sensitive database by adopting an AES algorithm.
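The cluster-based steps recited in claims 3 to 5 can be sketched in Python as follows. This is only an illustrative reading of the claim wording under stated assumptions: the clusters are passed in as lists of month indices (for example, produced beforehand by DBSCAN), noise points are ignored, a missing left or right neighbour contributes a transition change value of 0, and the names `transition_values`, `row_amplitude` and `stability_index` are hypothetical.

```python
import math


def transition_values(row, clusters):
    """Return (members, transition change value) for each cluster of one row.

    row: values of one data type in month order.
    clusters: list of lists of month indices, one list per cluster.
    """
    ordered = sorted(clusters, key=min)     # order clusters by their left boundary month
    result = []
    for c, members in enumerate(ordered):
        left, right = min(members), max(members)
        # Left transition: own left boundary value vs. right boundary value of left neighbour.
        left_t = abs(row[left] - row[max(ordered[c - 1])]) if c > 0 else 0.0
        # Right transition: own right boundary value vs. left boundary value of right neighbour.
        right_t = (abs(row[right] - row[min(ordered[c + 1])])
                   if c + 1 < len(ordered) else 0.0)
        result.append((members, (left_t + right_t) / 2))
    return result


def row_amplitude(row, clusters):
    """Business mode variation amplitude of one row (claim 4, read literally)."""
    total = 0.0
    for members, t in transition_values(row, clusters):
        mean = sum(row[i] for i in members) / len(members)
        # exp(-(x - mean)) weighted by the cluster's transition change value.
        total += sum(math.exp(-(row[i] - mean)) * t for i in members)
    return total


def stability_index(K, clusters_per_row):
    """Overall business mode difference stability index: sum over all rows (claim 5)."""
    return sum(row_amplitude(r, c) for r, c in zip(K, clusters_per_row))
```

For a row `[1.0, 1.0, 3.0, 3.0]` split into the clusters `[[0, 1], [2, 3]]`, each cluster has a transition change value of 1 and the row amplitude is 4.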
CN202410176308.5A 2024-02-08 2024-02-08 Digital financial service mass information transmission method Active CN117729264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410176308.5A CN117729264B (en) 2024-02-08 2024-02-08 Digital financial service mass information transmission method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410176308.5A CN117729264B (en) 2024-02-08 2024-02-08 Digital financial service mass information transmission method

Publications (2)

Publication Number Publication Date
CN117729264A CN117729264A (en) 2024-03-19
CN117729264B true CN117729264B (en) 2024-04-26

Family

ID=90200142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410176308.5A Active CN117729264B (en) 2024-02-08 2024-02-08 Digital financial service mass information transmission method

Country Status (1)

Country Link
CN (1) CN117729264B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063045A (en) * 2018-07-18 2018-12-21 程欣悦 A kind of financial service method and financial service terminal
WO2022251317A1 (en) * 2021-05-27 2022-12-01 Rutgers, The State University Of New Jersey Systems of neural networks compression and methods thereof
CN116361840A (en) * 2023-06-02 2023-06-30 深圳市力博实业有限公司 Bank self-service equipment data security management system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070079012A1 (en) * 2005-02-14 2007-04-05 Walker Richard C Universal electronic payment system: to include "PS1 & PFN Connect TM", and the same technology to provide wireless interoperability for first responder communications in a national security program
US20210089927A9 (en) * 2018-06-12 2021-03-25 Ciena Corporation Unsupervised outlier detection in time-series data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
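Methods and progress in deep neural network model compression; Lai Yejing; Hao Shanfeng; Huang Dingjiang; Journal of East China Normal University (Natural Science); 2020-09-30 (No. 05); pp. 68-82 *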

Also Published As

Publication number Publication date
CN117729264A (en) 2024-03-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant