CN113902496B - Data analysis method and device and electronic equipment

Info

Publication number: CN113902496B
Application number: CN202111502973.1A
Authority: CN
Inventors: 贺园
Original assignee: Beijing Qingsongchou Information Technology Co ltd
Current assignee: Beijing Easy Yikang Information Technology Co ltd
Priority date: 2021-12-10
Filing date: 2021-12-10
Publication date: 2022-03-01
Anticipated expiration: 2041-12-10
Also published as: CN113902496A

Abstract

The invention provides a data analysis method, a data analysis device and electronic equipment. When a data anomaly analysis instruction is received, a logic tree is obtained and a target index whose data fluctuates abnormally is determined based on the logic tree. Influence degree analysis is then performed layer by layer on the index data of the target sub-indexes, based on the logical connection relation between the target sub-indexes and the target index, to determine the target sub-index that causes the data of the target index to fluctuate abnormally, thereby obtaining the root cause of the abnormal data fluctuation.

Description

Data analysis method and device and electronic equipment

Technical Field

The invention relates to the field of attribution analysis, in particular to a data analysis method and device and electronic equipment.

Background

As the degree of digitization continues to increase, data has become an important basis for enterprise operating decisions. In practice, the data collected by an enterprise changes constantly, and when the data changes abnormally, the reason for the abnormal change needs to be analyzed in order to improve the security of the data.

At present, abnormally changing data is analyzed manually, based on experience, to find the reasons for the abnormal change. However, manual analysis has low accuracy and low efficiency, which in turn lowers the accuracy of the subsequent data processing rules that are determined from those reasons.

Disclosure of Invention

In view of the above, the present invention provides a data analysis method, a data analysis device and electronic equipment, so as to solve the problem that manually analyzing the reasons for abnormal data changes has low accuracy and low efficiency, which in turn lowers the accuracy of the subsequent data processing rules determined from those reasons.

In order to solve the technical problems, the invention adopts the following technical scheme:

a method of data analysis, comprising:

under the condition of receiving a data anomaly analysis instruction, acquiring a logic tree; the logic tree is constructed in advance and comprises a plurality of indexes which are logically connected according to a preset logic relationship;

screening out, from the logic tree, an index whose data fluctuates abnormally, and taking the index as a target index;

determining an index which is located at a level after that of the target index and has a logical connection relation with the target index, and taking the index as a target sub-index;

analyzing the influence degree of the index data of the target sub-indexes layer by layer according to the logical connection relation between the target sub-indexes and the target index, to determine the target sub-index that causes the data of the target index to fluctuate abnormally; the influence degree is the degree to which a target sub-index influences the abnormal fluctuation of the data of the target index.

Optionally, screening out, from the logic tree, an index whose data fluctuates abnormally, and taking the index as a target index, includes:

acquiring an index analysis starting point;

screening out, from the logic tree, the to-be-processed indexes that match the index analysis starting point;

performing abnormal fluctuation analysis on the index data corresponding to the index to be processed to obtain an abnormal fluctuation analysis result;

and determining the to-be-processed index of which the abnormal fluctuation analysis result meets a preset abnormal fluctuation threshold rule, and taking the to-be-processed index as a target index.

Optionally, determining an index which is located at a level after that of the target index and has a logical connection relation with the target index, and taking the index as a target sub-index, includes:

acquiring the number of index analysis levels;

and determining the indexes which are located after the level of the target index, have a logical connection relation with the target index, and whose level difference from the target index satisfies the number of index analysis levels, and taking the indexes as target sub-indexes.

Optionally, analyzing the influence degree of the index data of the target sub-indexes layer by layer according to the logical connection relation between the target sub-indexes and the target index, so as to determine the target sub-index that causes the data of the target index to fluctuate abnormally, includes:

taking each target sub-index with the smallest level number among the target sub-indexes as a first target sub-index in a first target sub-index set;

determining, based on the index data of each first target sub-index in the first target sub-index set, the first target sub-index that causes the data of the target index to fluctuate, and taking the determined first target sub-index as a second target sub-index;

acquiring at least one target sub-index which is located after the second target sub-index and has the smallest level number, and taking the at least one target sub-index as a new first target sub-index set;

returning to the step of determining, based on the index data of each first target sub-index in the first target sub-index set, the first target sub-index that causes the data of the target index to fluctuate and taking the determined first target sub-index as a second target sub-index, and executing these steps in sequence until the second target sub-index is determined in the level whose level difference from the target index corresponds to the maximum level number allowed by the number of index analysis levels;

and taking the second target sub-index in the level corresponding to the maximum level number as the target sub-index that causes the data of the target index to fluctuate abnormally.

Optionally, determining, based on the index data of each first target sub-index in the first target sub-index set, the first target sub-index that causes the data of the target index to fluctuate includes:

when the first target sub-index is a transversely built index, determining the sum of the index data of all indexes under the dimension of the first target sub-index;

and screening out the first target sub-index with the largest ratio of its index data to the sum of the index data;

when the first target sub-index is a longitudinally built index, taking the ratio of the data at a first specified time to the data at a second specified time in the index data of the first target sub-index as the influence degree value of the first target sub-index;

calculating the sum of the influence degree values of all the first target sub-indexes;

and screening out, based on the sum of the influence degree values, the first target sub-index whose influence degree value satisfies a preset influence degree rule.

Optionally, the logic tree is pre-constructed through the following steps:

obtaining a plurality of pre-selected indexes and the logical relationship of the indexes; the logical relations comprise dimension division relations and upper and lower hierarchy division relations;

and taking a preset index as the starting point of the logic tree, and carrying out transverse and longitudinal construction on the logic tree based on the dimension division relation and the upper and lower hierarchy division relation of the index to obtain the logic tree.

A data analysis apparatus comprising:

the logic tree acquisition module is used for acquiring a logic tree when a data anomaly analysis instruction is received; the logic tree is constructed in advance and comprises a plurality of indexes which are logically connected according to a preset logic relationship;

the index screening module is used for screening out, from the logic tree, an index whose data fluctuates abnormally, and taking the index as a target index;

the first index determining module is used for determining an index which is located at a level after that of the target index and has a logical connection relation with the target index, and taking the index as a target sub-index;

the second index determining module is used for analyzing the influence degree of the index data of the target sub-indexes layer by layer according to the logical connection relation between the target sub-indexes and the target index, so as to determine the target sub-index that causes the data of the target index to fluctuate abnormally; the influence degree is the degree to which a target sub-index influences the abnormal fluctuation of the data of the target index.

Optionally, the index screening module includes:

the starting point acquisition submodule is used for acquiring an index analysis starting point;

the index screening submodule is used for screening out the indexes to be processed which accord with the index analysis starting point from the logic tree;

the fluctuation analysis submodule is used for carrying out abnormal fluctuation analysis on the index data corresponding to the index to be processed to obtain an abnormal fluctuation analysis result;

and the index determining submodule is used for determining the to-be-processed index of which the abnormal fluctuation analysis result meets the preset abnormal fluctuation threshold value rule and using the to-be-processed index as a target index.

An electronic device, comprising: a memory and a processor;

wherein the memory is used for storing programs;

the processor calls the program to perform the data analysis method described above.

Compared with the prior art, the invention has the following beneficial effects:

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only embodiments of the present invention; those skilled in the art can obtain other drawings from the provided drawings without creative effort.

FIG. 1 is a flowchart of a data analysis method according to an embodiment of the present invention;

FIG. 2 is a flow chart of another method for data analysis according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a logic tree according to an embodiment of the present invention;

FIG. 4 is a flowchart of a method for analyzing data according to another embodiment of the present invention;

FIG. 5 is a flowchart of a method of analyzing data according to another embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a data analysis apparatus according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As the degree of digitization increases, data has become an important basis for companies' operating decisions. An internet insurance company in particular often faces the business question of "why is the business rising, and why is it falling?" To find the reason for a data change, business personnel must raise a data requirement with the corresponding analysts based on their understanding of the business. The analysts then pull the data with Structured Query Language (SQL), perform multidimensional analysis, produce charts, write an analysis report and report the cause of the problem. This process usually takes at least one day.

At present, the following disadvantages exist in the manual mode:

1) Wasted manpower. Because the insurance business model is largely fixed and the analysis approach and framework are mature, most of the analysis done by business parties and analysts is repetitive, tedious labor that takes up a great deal of time.

2) Poor accuracy of analysis results. The company has a very large number of business indexes with complex calculation logic. With manual analysis, index definitions easily diverge between analysts, the quality of the final result varies, and omissions occur through negligence. Each person also judges anomalies differently: analysts rely mainly on personal experience to judge whether data is abnormal, so different analysts apply different standards and ultimately reach different conclusions.

3) Low overall efficiency and poor timeliness. Data analysis currently takes at least one day, by which time the best decision point has usually been missed; this hinders the determination and use of subsequent data processing rules.

From the above, manually analyzing the reasons for abnormal data changes suffers from low accuracy, low efficiency and wasted manpower. Therefore, manually investigating the reasons for abnormal data changes cannot meet business requirements.

In order to solve the problems of low accuracy, low efficiency and wasted manpower in manually analyzing the causes of abnormal data changes, the inventors found through research that if data anomaly analysis can be performed automatically by a program, the errors and low efficiency of manual analysis can be avoided. The reason for a data change can then be analyzed quickly and the cause of the data anomaly given directly, so that data processing rules can be determined quickly afterwards, reducing both the analysis cost and the decision cycle.

In order to analyze the cause of a data anomaly automatically, a general index attribution analysis method could be adopted. However, such methods are used in traffic-related or transaction-related scenarios and are not suitable for scenarios such as insurance reimbursement, policy cancellation and monthly payment.

In addition, attribution analysis could be performed by combining dimensions at random, but this approach requires a large amount of calculation, produces disordered analysis results and has poor readability.

Therefore, the invention provides a data analysis method, a data analysis device and electronic equipment. When a data anomaly analysis instruction is received, a logic tree is obtained and a target index whose data fluctuates abnormally is determined based on the logic tree. Influence degree analysis is then performed layer by layer on the index data of the target sub-indexes, based on the logical connection relation between the target sub-indexes and the target index, to determine the target sub-index that causes the data of the target index to fluctuate abnormally, so that the root cause of the abnormal data fluctuation is obtained.

It should be noted that the data analysis method, the data analysis device and the electronic device can be well applied to an application scenario of insurance.

On the basis of the above, an embodiment of the present invention provides a data analysis method, which is applied to a server or a processor. Referring to FIG. 1, the method may include:

S11, acquiring the logic tree when a data anomaly analysis instruction is received.

In this embodiment, after finding a data anomaly, the user may trigger the generation of a data anomaly analysis instruction, for example by clicking a button; the server or the processor obtains the logic tree after receiving the instruction.

In practical applications, the logic tree is pre-constructed and includes a plurality of indexes logically connected according to a preset logic relationship. That is, the logical tree is built based on the logical relationship of the plurality of indexes.

The indexes in this embodiment are manually pre-selected. Specifically, insurance transaction order detail data, user click behavior data, payment order detail data, policy detail data and renewal detail data are acquired manually; information of different dimensions is joined to the relevant detail data to establish data models (also called data tables); and indexes are created on the basis of these data models. An index includes an index name, an index definition and technical implementation code. The index name may take the form: time + modifier + atomic index. The index definition describes the meaning of the index and is constructed as an SQL statement. The index name, index definition and technical implementation code may be determined by the user according to the business (for example, according to the attributes or processes of an insurance business).

In addition, index attribute information is set for each index, such as which table the index belongs to and which field it is computed from.

After the indexes are determined, data verification (data quality verification) is carried out manually through a data table provided by a data warehouse to determine whether the indexes are proper, and if so, the indexes are used for building a logic tree.

Specifically, referring to FIG. 2, the logic tree is pre-constructed through the following steps:

S21, acquiring a plurality of pre-selected indexes and the logical relationship of the indexes.

In this embodiment, the pre-selected indexes may include annualized premium, new-policy annualized premium, renewal annualized premium, online applet channel, H5 channel, other channels, successful policy count, policy price, policy count and the like. For the specific indexes, refer to the text boxes in FIG. 3, where each text box corresponds to one index.

In addition, the logical relationship in the present embodiment includes a dimension division relationship and an upper and lower hierarchy division relationship.

The dimension division relation means that the same index is split along multiple dimensions, generating multiple branches. Taking the annualized premium as an example, it can be divided into new-policy annualized premium and renewal annualized premium; by insurance product, into accident insurance, medical insurance, critical illness insurance and life insurance annualized premium; by application channel, into online applet channel, H5 channel and other-channel annualized premium; or by region, into region A, region B, region C and so on. The upper and lower hierarchy division relation refers to longitudinal division through a multiplication model, for example: annualized premium = successful policy count × policy price; successful policy count = policy count × underwriting success rate; policy count = paid order count × average policies per paid order; paid order count = transaction order count × payment success rate; transaction order count = product detail page UV × access rate; and so on.
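As a worked illustration of the multiplication model, the chain can be evaluated from the bottom up. This is only a sketch: every figure below is hypothetical and chosen so the arithmetic is easy to follow, and the variable names are illustrative rather than taken from the embodiment.

```python
# Hypothetical daily figures; none of these numbers come from the embodiment.
product_page_uv = 10_000           # product detail page UV
access_rate = 0.25                 # visitors who go on to create a transaction order
payment_success_rate = 0.5         # transaction orders that are paid
policies_per_paid_order = 1.5      # average policies per paid order
underwriting_success_rate = 0.75   # policies that are successfully underwritten
policy_price = 100.0               # premium per successful policy

# Each level is the product of the two factors one level below it.
transaction_orders = product_page_uv * access_rate
paid_orders = transaction_orders * payment_success_rate
policies = paid_orders * policies_per_paid_order
successful_policies = policies * underwriting_success_rate
annualized_premium = successful_policies * policy_price

print(annualized_premium)
```

Because every level is a product of the factors below it, a relative change in any single factor carries through proportionally to the annualized premium.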

S22, taking a preset index as the starting point of the logic tree, and building the logic tree horizontally and vertically based on the dimension division relation and the upper and lower hierarchy division relation of the indexes to obtain the logic tree.

In this embodiment, since the annualized premium is the index of ultimate concern, it is used as the starting point of the logic tree. The initial logic tree (containing only the annualized premium node) is first built horizontally according to the dimension division relation of the indexes; the resulting tree is then built vertically according to the upper and lower hierarchy division relation; horizontal and vertical construction then alternate in the same way until the logic tree is complete. The construction result is shown in FIG. 3.

That is, in this embodiment horizontal and vertical construction are performed alternately: one layer of horizontal construction, then one layer of vertical construction, repeated until the final logic tree is obtained.

For horizontal construction, refer to the first level of FIG. 3. (A "level" here is a level of the hierarchy; the annualized premium is the root node and does not occupy a level, and the closer a level is to the root node, the smaller its level number.) That is, the annualized premium is divided along multiple dimensions, such as new policy versus renewal, insurance product, application channel, region and so on.

Through horizontal construction, the logic tree has a plurality of branches, and each branch subsequently needs to be built vertically.

For vertical construction, refer to the second level of FIG. 3, for example annualized premium = successful policy count × policy price.

In this embodiment, only the new-policy annualized premium is taken as an example; the renewal annualized premium, the online applet channel, the H5 channel and other channels are handled similarly.

In addition, when an index is split downwards (taking the new-policy annualized premium as an example) and a horizontal split is encountered again, the split uses dimensions other than the new-policy/renewal dimension; that is, the same horizontal split is not applied to the new-policy annualized premium again.

In addition, a horizontal split may use various dimensions, and different dimensions may be parallel to one another. Priorities may be set for the dimensions: if priorities are set, the horizontal split follows the priority order; if not, the dimensions are split in random order.
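The alternating horizontal and vertical construction can be pictured with a small tree structure. The following is a minimal sketch, not the embodiment's implementation: `IndexNode`, `add_child`, `build_demo_tree` and `nodes_at_level` are hypothetical names, and only one branch of FIG. 3 is reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class IndexNode:
    """One index in the logic tree (hypothetical structure for illustration)."""
    name: str
    level: int = 0          # the root node does not occupy a level
    split: str = "root"     # how the node was derived: "horizontal" or "vertical"
    children: list = field(default_factory=list)

    def add_child(self, name, split):
        child = IndexNode(name, self.level + 1, split)
        self.children.append(child)
        return child


def build_demo_tree():
    root = IndexNode("annualized premium")
    # Level 1: horizontal split along the new-policy / renewal dimension.
    new_policy = root.add_child("new-policy annualized premium", "horizontal")
    root.add_child("renewal annualized premium", "horizontal")
    # Level 2: vertical split of one branch through the multiplication model.
    new_policy.add_child("successful policy count", "vertical")
    new_policy.add_child("policy price", "vertical")
    return root


def nodes_at_level(root, level):
    """Return all nodes at the given level, left to right."""
    current = [root]
    for _ in range(level):
        current = [child for node in current for child in node.children]
    return current


tree = build_demo_tree()
print([node.name for node in nodes_at_level(tree, 2)])
```

Recording the split type on each node keeps the later analysis simple, since horizontally and vertically built indexes are scored with different rules.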

S12, screening out, from the logic tree, the index whose data fluctuates abnormally, and taking the index as the target index.

In this embodiment, the indexes whose data fluctuates abnormally are screened out. The screening uses an index analysis starting point parameter, which may be carried in the data anomaly analysis instruction or may be a parameter set and stored in advance.

The index analysis starting point is the level from which the analysis starts; generally, the analysis starts from the first level (the level of new-policy annualized premium and renewal annualized premium).

After the index analysis starting point is obtained, the index with abnormal data fluctuation at that starting point is screened out; that is, the index with abnormal data fluctuation is screened out from the first level (the level of new-policy annualized premium and renewal annualized premium) and taken as the target index.

S13, determining the indexes which are located after the level of the target index and have a logical connection relation with the target index, and taking the indexes as target sub-indexes.

S14, analyzing the influence degree of the index data of the target sub-indexes layer by layer according to the logical connection relation between the target sub-indexes and the target index, and determining the target sub-index that causes the data of the target index to fluctuate abnormally; the influence degree is the degree to which a target sub-index influences the abnormal fluctuation of the data of the target index.

Generally, the five levels after the index analysis starting point are analyzed, that is, from the second level (the level of successful policy count) to the sixth level (the level of paid order count), and finally the index causing the abnormal fluctuation of the data of the target index is selected from the sixth level.

In this embodiment, analysis of six levels is taken as an example; the number of levels to analyze may also be chosen according to actual needs. The index determined at the last level is the most fundamental index causing the data of the target index to fluctuate abnormally.

In this embodiment, when a data anomaly analysis instruction is received, a logic tree is obtained and a target index whose data fluctuates abnormally is determined based on the logic tree. Influence degree analysis is then performed layer by layer on the index data of the target sub-indexes, based on the logical connection relation between the target sub-indexes and the target index, to determine the target sub-index that causes the data of the target index to fluctuate abnormally, so as to obtain the root cause of the abnormal data fluctuation.
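The layer-by-layer descent of steps S13 and S14 can be sketched as a walk that keeps the most influential index at each level. This is an illustrative simplification: `drill_down`, the `children` and `influence` dictionaries and all numbers are hypothetical, and "largest influence value" stands in for whatever preset influence degree rule is actually configured.

```python
def drill_down(children, influence, target, max_level, start_level=1):
    """Follow the most influential connected index at each level.

    children  : dict mapping an index name to the names one level below it
    influence : dict mapping an index name to its influence degree value
    Returns the path from the target index down to the last analysed level.
    """
    path = [target]
    current, level = target, start_level
    while level < max_level and children.get(current):
        # Candidates are the indexes logically connected to the current one.
        candidates = children[current]
        current = max(candidates, key=lambda name: influence[name])
        path.append(current)
        level += 1
    return path


# Hypothetical slice of the tree and hypothetical influence values.
children = {
    "new-policy annualized premium": ["successful policy count", "policy price"],
    "successful policy count": ["policy count", "underwriting success rate"],
    "policy count": ["paid order count", "average policies per paid order"],
}
influence = {
    "successful policy count": 0.30, "policy price": 0.05,
    "policy count": 0.25, "underwriting success rate": 0.02,
    "paid order count": 0.20, "average policies per paid order": 0.01,
}

path = drill_down(children, influence, "new-policy annualized premium", max_level=4)
print(" -> ".join(path))
```

The last element of the returned path plays the role of the root-cause index; the intermediate elements show how the fluctuation was traced down to it.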

In another implementation of the present invention, a specific implementation of step S12, "screening out, from the logic tree, the index whose data fluctuates abnormally, and taking the index as the target index", is given. Referring to FIG. 4, step S12 may include:

S41, acquiring an index analysis starting point.

For a detailed explanation of the index analysis starting point, refer to the corresponding explanation above.

S42, screening out, from the logic tree, the to-be-processed indexes that match the index analysis starting point.

In this embodiment, if the index analysis starting point is the first level, all to-be-processed indexes at the first level, such as new-policy annualized premium and renewal annualized premium, are screened out.

S43, performing abnormal fluctuation analysis on the index data corresponding to the to-be-processed index to obtain an abnormal fluctuation analysis result.

In this embodiment, the index data corresponding to the to-be-processed index may be index data acquired in advance, and may be hourly, daily or monthly data.

The abnormal fluctuation analysis includes standard deviation analysis or interquartile range (IQR) analysis. Specifically, the index data is compared against the standard deviation of the index's variation (for example, three times the standard deviation) or against the IQR to determine whether the data fluctuates abnormally. In addition, abnormal fluctuation analysis may also use manually set parameters, mean + 1.5 IQR and similar methods.
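The two checks named above can be sketched with Python's standard `statistics` module. This is one plausible reading rather than the embodiment's exact rule: the function names, the use of raw index values (not period-over-period changes), the population standard deviation and the conventional 1.5 × IQR fences are all assumptions made for illustration.

```python
import statistics


def is_abnormal_3sigma(history, latest, k=3.0):
    """Flag `latest` if it lies more than k standard deviations from the mean."""
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)
    return abs(latest - mean) > k * std


def is_abnormal_iqr(history, latest, k=1.5):
    """Flag `latest` if it falls outside the fences Q1 - k*IQR .. Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(history, n=4)
    iqr = q3 - q1
    return latest < q1 - k * iqr or latest > q3 + k * iqr


# Hypothetical daily values of one to-be-processed index.
history = [100, 102, 98, 101, 99, 100, 103, 97]

print(is_abnormal_3sigma(history, 150), is_abnormal_iqr(history, 150))
print(is_abnormal_3sigma(history, 101), is_abnormal_iqr(history, 101))
```

The IQR check is less sensitive to a few extreme historical values than the standard deviation check, which may matter when the history itself contains earlier anomalies.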

S44, determining the to-be-processed index whose abnormal fluctuation analysis result satisfies the preset abnormal fluctuation threshold rule, and taking the to-be-processed index as a target index.

In this embodiment, after the abnormal fluctuation analysis result is obtained, it is determined whether the result is greater than the corresponding preset abnormal fluctuation threshold. If so, it is further analyzed whether the acquisition time of the index data falls on a predetermined date, such as a working day or a holiday. If not, the to-be-processed index corresponding to the abnormal fluctuation is the target index.

If the abnormal fluctuation is caused by the date, such as a working day or a holiday, the to-be-processed index is not taken as a target index.
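The threshold rule with its calendar exception can be sketched as a small filter. The function name, the score values, the threshold and the holiday date below are all hypothetical; how the predetermined dates are actually stored is not specified in the embodiment.

```python
from datetime import date


def select_target_indexes(scores, threshold, observed_on, explained_dates):
    """Return the indexes whose score exceeds the threshold, unless the
    observation date is one on which fluctuation is expected anyway."""
    if observed_on in explained_dates:
        # The fluctuation is attributed to the calendar, so nothing is flagged.
        return []
    return [name for name, score in scores.items() if score > threshold]


# Hypothetical abnormal fluctuation analysis results for the first level.
scores = {"new-policy annualized premium": 4.2, "renewal annualized premium": 0.8}
holidays = {date(2021, 10, 1)}  # hypothetical predetermined date

print(select_target_indexes(scores, 3.0, date(2021, 12, 10), holidays))
print(select_target_indexes(scores, 3.0, date(2021, 10, 1), holidays))
```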

In another implementation manner of the present invention, step S13 may include:

1) Acquiring the number of index analysis levels.

In this embodiment, the number of index analysis levels may be carried in the data anomaly analysis instruction, or may be a value set and stored in advance.

Generally, the number of index analysis levels is six, and it may also be set according to the actual scenario, for example to 4 or 5 levels.

It should be noted that the number of index analysis levels should not be set too large, for example to 20, because the more levels there are, the smaller the volume of index data at the deeper levels and the larger its fluctuation; the data then changes irregularly, which is unfavorable for data analysis. Therefore, the number of index analysis levels should be set to an appropriate value according to the actual scenario.

2) Determining the indexes which are located after the level of the target index, have a logical connection relation with the target index, and whose level difference from the target index satisfies the number of index analysis levels, and taking the indexes as target sub-indexes.

In this embodiment, referring to FIG. 3, the first level has already been determined and the target index has been determined from the indexes of the first level. Taking the new-policy annualized premium as the target index, the next question is which indexes cause it to fluctuate. If the number of index analysis levels is 6, the indexes which are located after the new-policy annualized premium, have a logical connection relation with it, and whose level difference from it is less than 6 are screened out. As can be seen from FIG. 3, the indexes of the second level (the level of successful policy count) to the sixth level (the level of paid order count) are screened out as target sub-indexes.
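Collecting the target sub-indexes amounts to walking the tree below the target index and stopping at the configured depth. The sketch below assumes a dictionary-based tree and interprets the level limit as an absolute maximum level number; `collect_sub_indexes` and the sample data are hypothetical.

```python
from collections import deque


def collect_sub_indexes(children, target, target_level, max_level):
    """Breadth-first walk below the target, keeping indexes up to max_level."""
    found = []
    queue = deque((name, target_level + 1) for name in children.get(target, []))
    while queue:
        name, level = queue.popleft()
        if level > max_level:
            continue  # deeper than the configured number of analysis levels
        found.append((name, level))
        queue.extend((child, level + 1) for child in children.get(name, []))
    return found


# Hypothetical slice of the logic tree below the target index.
children = {
    "new-policy annualized premium": ["successful policy count", "policy price"],
    "successful policy count": ["policy count", "underwriting success rate"],
    "policy count": ["paid order count", "average policies per paid order"],
}

subs = collect_sub_indexes(children, "new-policy annualized premium", 1, max_level=3)
print(subs)
```

Only indexes reachable from the target are visited, which reflects the requirement that a target sub-index have a logical connection relation with the target index.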

On the basis of the present embodiment, referring to fig. 5, step S14 may include:

and S51, taking each target sub-index with the minimum level number in the target sub-indexes as a first target sub-index in the first target sub-index set.

In this embodiment, the closer the root node is, the smaller the number of hierarchies is. After the target indexes are determined from the first level, the second level is the level with the minimum level number, the target sub-indexes of the second level are analyzed firstly, at the moment, all the target sub-indexes of the second level are respectively used as first target sub-indexes, and all the target sub-indexes form a first target sub-index set.

Referring to FIG. 3, the first target sub-indicators are the successful insurance amount and the insurance policy price.

S52, determining a first target sub-indicator that fluctuates the data of the target indicators based on the indicator data of each first target sub-indicator in the first target sub-indicator set, and using the determined first target sub-indicator as a second target sub-indicator.

As can be seen from the above description, when the successful number of insurance policies and the policy price are obtained by vertical construction, if the first target sub-indicator is an indicator of vertical construction, the influence degree value of the first target sub-indicator is defined as a ratio of data at a first specified time (which may be a certain time of today, such as 6.00 of today (monday)) and data at a second specified time (which may be a time of yesterday, such as 6.00 of yesterday, or a time of last week, such as 6.00 of last monday) in the indicator data of the first target sub-indicator.

The sum of the influence degree values of all the first target sub-indexes is then calculated. In this embodiment, a denotes the influence degree value of the successful policy number and b denotes the influence degree value of the policy price, so the sum of the influence degree values is a + b.

A first target sub-index whose influence degree value satisfies a preset influence degree rule is then screened out based on the sum of the influence degree values.

Specifically, the values a/(1 + a + b + ab) and b/(1 + a + b + ab) are calculated, and the first target sub-index with the larger value is selected as the second target sub-index, where ab represents the interaction of the two indexes.

In practical applications, let X1 = A × B, where A may be the successful policy number and B may be the policy price, and let

X2 = A(1 + a) × B(1 + b) = A × B × (1 + a + b + ab).

Then a × X1/X2 and b × X1/X2 are calculated, which are equal to a/(1 + a + b + ab) and b/(1 + a + b + ab), respectively.
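The calculation for vertically constructed indexes can be illustrated with a short sketch. The function names and the sample numbers below are hypothetical, and a and b are treated as values such that the later-period data equal A(1 + a) and B(1 + b), which is the reading implied by the expression for X2.

```python
import math


def vertical_influence_shares(a, b):
    """Return (a / (1 + a + b + ab), b / (1 + a + b + ab))."""
    denom = 1 + a + b + a * b
    if denom == 0:
        raise ValueError("1 + a + b + ab must be non-zero")
    return a / denom, b / denom


def pick_vertical(name_a, a, name_b, b):
    """Select the index with the larger share as the second target sub-index."""
    share_a, share_b = vertical_influence_shares(a, b)
    return name_a if share_a >= share_b else name_b


# Illustrative numbers only (not taken from the source).
A, B = 100.0, 50.0   # base-period values of the two indexes
a, b = 0.20, 0.05    # influence degree values
X1 = A * B
X2 = A * (1 + a) * B * (1 + b)

share_a, share_b = vertical_influence_shares(a, b)
# a * X1 / X2 and b * X1 / X2 reduce to the same two shares.
assert math.isclose(a * X1 / X2, share_a)
assert math.isclose(b * X1 / X2, share_b)
print(pick_vertical("successful policy number", a, "policy price", b))
```

Because both shares have the same denominator, comparing them amounts to comparing a and b whenever the denominator is positive; the denominator only matters when the shares themselves are reported.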

In addition, if a first target sub-index is determined during the layer-by-layer analysis to be a horizontally (transversely) constructed index, the sum of the index data of all indexes under the dimension of the first target sub-index is determined, and the first target sub-index with the largest ratio of its index data to that sum is screened out.

In this embodiment, the index data of one index may be denoted y and the sum of the index data may be denoted Y; that is, the largest y/Y is screened out, and the index corresponding to that y is used as the second target sub-index.

It should be noted that y/Y is calculated using a damping coefficient: the horizontal axis is the cumulative base-period ratio of the feature groups, the vertical axis is the cumulative ratio of the fluctuation values (which may be negative), and the larger the damping coefficient, the better the feature explains the fluctuation.
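The share-based selection for horizontally constructed indexes can be sketched as follows. This is a simplified illustration only: it computes the plain ratio of each index's data to the sum over the dimension and does not model the damping coefficient mentioned above, whose exact formula is not given here. The function name and the sample values are hypothetical.

```python
def pick_horizontal(index_data):
    """Return (best_index, shares), where shares[name] = y / Y.

    `index_data` maps each index under one dimension to its index data y;
    Y is the sum of the index data over that dimension.
    """
    total = sum(index_data.values())
    if total == 0:
        raise ValueError("sum of index data must be non-zero")
    shares = {name: y / total for name, y in index_data.items()}
    best = max(shares, key=shares.get)
    return best, shares


# Placeholder data for one dimension (values are illustrative).
best, shares = pick_horizontal({"online applet channel": 60.0, "other channel": 40.0})
print(best, shares[best])
```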

S53, judging whether a second target sub-index has been determined in the level whose level difference from the target index reaches the maximum level number of the index analysis level number; if yes, step S55 is executed; if not, step S54 is executed.

In this embodiment, this step determines whether the last level covered by the index analysis level number has been analyzed. If it has, the second target sub-index in the level corresponding to the maximum level number is directly used as the target sub-index that causes the data of the target index to fluctuate abnormally.

If the last level is not analyzed, the subsequent levels are processed in sequence until the last level is processed.

S54, acquiring at least one target sub-index which is located after the second target sub-index and has the smallest level number, and taking the at least one target sub-index as a new first target sub-index set.

In this embodiment, after the processing of the current level (the level of the successful policy number) is completed, and assuming that the determined second target sub-index is the successful policy number, the next level after the successful policy number is obtained, i.e. the level of the online applet channel. The operation of step S52, namely determining a first target sub-index that causes the data of the target index to fluctuate based on the index data of each first target sub-index in the first target sub-index set and using it as a second target sub-index, is then repeated on this level to determine a new second target sub-index. The next level is then judged in the same way, until a second target sub-index whose level difference from the target index reaches the maximum level (such as the sixth level) of the index analysis level number is determined. That second target sub-index, such as the number of paid orders, is used as the target sub-index that causes the data of the target index to fluctuate abnormally.

That is, the above analysis concludes that the number of paid orders changed abnormally, which caused the abnormal change in the data of the new annual premium and ultimately the abnormal change in the annual premium.

S55, taking the second target sub-index in the level corresponding to the maximum level number as the target sub-index that causes the data of the target index to fluctuate abnormally.

In this embodiment, the root cause of the abnormal change in the data of the target index is determined through layer-by-layer analysis, and a subsequent data processing rule can then be set according to the root cause so as to eliminate the cause of the abnormal change. In addition, after the root cause is determined, an attribution analysis report of the index change can be generated and displayed.
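Steps S51 to S55 can be summarized in a minimal sketch of the layer-by-layer loop. The function `drill_down`, the `pick` callback, the toy tree, and the scores are all hypothetical; `pick` stands in for whichever influence degree rule applies at a level (vertical or horizontal), and stopping early when an index has no indexes after it is an assumption not stated in the text.

```python
def drill_down(children, target, analysis_levels, pick):
    """Return the chain of second target sub-indexes, one per analyzed level.

    children: maps an index name to the index names directly after it.
    pick:     callable taking the first target sub-index set (a list) and
              returning the index judged to cause the fluctuation.
    """
    path = []
    current = target
    # The target's own level counts as level 1 (assumption), so at most
    # analysis_levels - 1 further levels are analyzed.
    for _ in range(analysis_levels - 1):
        candidates = children.get(current, [])   # S51 / S54
        if not candidates:
            break                                # nothing left to analyze
        current = pick(candidates)               # S52
        path.append(current)
    return path                                  # last entry corresponds to S55


# Toy tree compressed to four levels; names beyond those in the text are placeholders.
tree = {
    "new annual premium": ["successful policy number", "policy price"],
    "successful policy number": ["online applet channel", "other channel"],
    "online applet channel": ["number of paid orders", "other index"],
}
scores = {
    "successful policy number": 0.3, "policy price": 0.1,
    "online applet channel": 0.7, "other channel": 0.3,
    "number of paid orders": 0.8, "other index": 0.2,
}
path = drill_down(tree, "new annual premium", 4, lambda c: max(c, key=scores.get))
print(path[-1])
```

Returning the whole path rather than only the last index keeps the intermediate findings available, which may be convenient when generating the attribution analysis report.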

On the basis of the above embodiment of the data analysis method, another embodiment of the present invention provides a data analysis apparatus, which may include, with reference to fig. 6:

a logic tree obtaining module 11, configured to obtain a logic tree in a case where a data anomaly analysis instruction is received; the logic tree is constructed in advance and comprises a plurality of indexes which are logically connected according to a preset logic relationship;

the index screening module 12 is used for screening the indexes of the abnormal fluctuation of the data from the logic tree and taking the indexes as target indexes;

a first index determining module 13, configured to determine an index that is located behind the hierarchy of the target index and has a logical connection relationship with the target index, and use the index as a target sub-index;

a second index determining module 14, configured to perform influence degree analysis on the index data of the target sub-index layer by layer according to a logical connection relationship between the target sub-index and the target index, so as to determine a target sub-index that causes data of the target index to fluctuate abnormally; the influence degree is the influence degree of the target sub-indexes on the abnormal fluctuation of the data of the target indexes.

Further, the index screening module comprises:

Further, the first index determining module 13 is specifically configured to:

acquiring an index analysis level number, determining an index which is located after the level of the target index, has a logical connection relationship with the target index, and whose level difference from the target index satisfies the index analysis level number, and taking the index as a target sub-index.

Further, the second index determining module 14 includes:

the first sub-index determining sub-module is used for taking each target sub-index with the minimum level number in the target sub-indexes as a first target sub-index in a first target sub-index set;

a second sub-index determining sub-module configured to determine, based on index data of each first target sub-index in the first target sub-index set, a first target sub-index that fluctuates data of the target index, and to use the determined first target sub-index as a second target sub-index;

the judgment sub-module is used for judging whether a second target sub-index has been determined in the level whose level difference from the target index reaches the maximum level number of the index analysis level number;

the first sub-index determining sub-module is further configured to, if not determined, obtain at least one target sub-index that is located behind the second target sub-index and has the smallest number of tiers, and use the at least one target sub-index as a new first target sub-index set;

and the third sub-index determining sub-module is used for taking the second target sub-index in the level corresponding to the maximum level number as the target sub-index for enabling the data of the target index to fluctuate abnormally if the determination is made.

Further, the second sub-indicator determining sub-module, when determining the first target sub-indicator that fluctuates the data of the target indicator based on the indicator data of each first target sub-indicator in the first target sub-indicator set, is specifically configured to:

and under the condition that the first target sub-index is a transversely constructed index, determining the sum of the index data of all indexes under the dimensionality of the first target sub-index, and screening out the first target sub-index with the maximum ratio of the index data to the sum of the index data.

Further, the apparatus also includes:

a logic tree construction module, configured to construct a logic tree in advance, specifically to:

It should be noted that, for the working processes of each module and sub-module in this embodiment, please refer to the corresponding description in the above embodiments, which is not described herein again.

On the basis of the embodiments of the data analysis method and apparatus, another embodiment of the present invention provides an electronic device, including: a memory and a processor;

wherein the memory is used for storing programs;

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method of data analysis, comprising:

analyzing the influence degree of the index data of the target sub-indexes layer by layer according to the logical connection relation between the target sub-indexes and the target indexes to determine the target sub-indexes which enable the data of the target indexes to fluctuate abnormally; the influence degree is the influence degree of the target sub-indexes on the abnormal fluctuation of the data of the target indexes;

the logic tree is constructed in advance, and specifically comprises the following steps:

2. The data analysis method of claim 1, wherein the step of screening the index of the abnormal fluctuation of the data from the logic tree as a target index comprises:

acquiring an index analysis starting point;

3. The data analysis method according to claim 1, wherein determining, as a target sub-indicator, an indicator that is located after the hierarchy of the target indicator and has a logical connection relationship with the target indicator, comprises:

acquiring the grade number of an index analysis layer;

4. The data analysis method according to claim 3, wherein analyzing the influence degree of the target data of the target sub-indicator layer by layer according to the logical connection relationship between the target sub-indicator and the target indicator to determine the target sub-indicator that causes the data of the target indicator to fluctuate abnormally, comprises:

5. The data analysis method according to claim 4, wherein determining a first target sub-indicator that fluctuates data of the target indicator based on the indicator data of each first target sub-indicator in the first target sub-indicator set includes:

6. The data analysis method according to claim 4, wherein determining a first target sub-indicator that fluctuates data of the target indicator based on the indicator data of each first target sub-indicator in the first target sub-indicator set includes:

7. A data analysis apparatus, comprising:

the logic tree acquisition module is used for acquiring a logic tree under the condition of receiving a data exception analysis instruction; the logic tree is constructed in advance and comprises a plurality of indexes which are logically connected according to a preset logic relationship; the logic tree is constructed in advance, and specifically comprises the following steps: obtaining a plurality of pre-selected indexes and the logical relationship of the indexes; the logical relations comprise dimension division relations and upper and lower hierarchy division relations; taking a preset index as a starting point of the logic tree, and carrying out transverse and longitudinal construction on the logic tree based on the dimension division relation and the upper and lower hierarchy division relation of the index to obtain the logic tree;

8. The data analysis device of claim 7, wherein the indicator screening module comprises:

9. An electronic device, comprising: a memory and a processor;

wherein the memory is used for storing programs;

a processor calls a program and is arranged to perform the data analysis method of any of claims 1 to 6.