CN113449024B - Insurance data analysis method, device, equipment and medium based on big data

Info

Publication number
CN113449024B
Authority: CN (China)
Prior art keywords: data, target, value, processed, field
Legal status: Active
Application number: CN202110696233.XA
Other languages: Chinese (zh)
Other versions: CN113449024A
Inventor: 吴先祥
Current Assignee: Ping An Puhui Enterprise Management Co Ltd
Original Assignee: Ping An Puhui Enterprise Management Co Ltd
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202110696233.XA
Publication of CN113449024A
Application granted
Publication of CN113449024B

Classifications

    • G06F16/25: Information retrieval of structured data; integrating or interfacing systems involving database management systems
    • G06F16/214: Design, administration or maintenance of databases; database migration support
    • G06F16/2282: Indexing and storage structures; tablespace storage structures and management thereof
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; distributed database system architectures therefor
    • G06Q10/10: Administration; management; office automation; time management
    • G06Q40/08: Finance; insurance


Abstract

The invention relates to the field of big data, and provides an insurance data analysis method, device, equipment and medium based on big data. The method calls Sqoop to extract data to be processed from upstream data and write it into a Hive table, calls Hive to process the data to be processed to obtain a flag bit, and executes the calculation only when the flag bit is detected to meet the configuration condition, effectively saving the computing resources of the system. Hive is then called to extract the calculation factors under each target dimension from the data to be processed and to calculate the UPR (Unexpired Premium Reserve) value of each target dimension, and Sqoop is called to synchronize the UPR value of each target dimension to a local database. By combining big data calculation with a unified operation mode, the waste of system resources caused by adopting different calculation modes for different data volumes is avoided, labor cost is effectively reduced, the larger errors introduced by manual calculation are avoided, the calculated UPR value is more accurate, and calculation efficiency is improved. In addition, the invention also relates to blockchain technology, and the UPR value can be stored in a blockchain node.

Description

Insurance data analysis method, device, equipment and medium based on big data
Technical Field
The invention relates to the technical field of big data, and in particular to an insurance data analysis method, device, equipment and medium based on big data.
Background
At present, in companies and enterprises on the market, the operation mode for calculating the UPR (Unexpired Premium Reserve) is generally divided into manual processing and SAS (Statistical Analysis System) system processing, depending on the data volume.
In actual operation, the income details of the previous month are extracted from the finance department, the UPR of each period is then calculated according to the guarantee fee (premium) income and the unexpired risk period of each period, and finally the accounted data is sent back to the finance department for posting.
However, whether manual processing or SAS system processing is used, the labor cost is high, the accuracy of the calculation result is poor, and the process is time-consuming.
Disclosure of Invention
In view of the above, it is necessary to provide a method, an apparatus, a device and a medium for analyzing insurance data based on big data, which can combine big data calculation and unify operation modes, avoid waste of system resources and manpower due to different calculation modes caused by differences in data amount, automatically calculate the UPR value in each dimension by a Hive data warehouse tool, effectively reduce labor cost, avoid introducing higher errors due to manual calculation, make the calculated UPR value more accurate, and improve calculation efficiency.
An insurance data analysis method based on big data, comprising:
scanning upstream data at preset time intervals, calling a Sqoop data migration tool to extract data to be processed from the upstream data, and writing the data to be processed into a Hive table;
calling a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit;
starting an Azkaban task scheduler to detect whether the flag bit meets configuration conditions or not at regular time;
when the flag bit is detected to meet the configuration condition, determining at least one target dimension according to a received requirement table;
calling the Hive data warehouse tool to extract a calculation factor under each target dimension from the data to be processed, and calculating a UPR value of each target dimension according to the calculation factor under each target dimension;
and calling the Sqoop data migration tool to synchronize the UPR value of each target dimension to a local database.
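For orientation only, the following Python sketch shows one way the six steps above could be chained into a single job; the connection strings, table names, file paths and the flag query are assumptions introduced here for illustration and are not part of the disclosed method.

```python
import subprocess

# Hypothetical orchestration of the pipeline; every name below is an assumption.
SQOOP_IMPORT = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://upstream-db:3306/insurance",   # assumed upstream database
    "--username", "etl", "--password-file", "/user/etl/.pwd",
    "--table", "premium_income",                              # assumed upstream table
    "--hive-import", "--hive-table", "ods.premium_income",    # assumed Hive table
]

FLAG_SQL = "SELECT max(flag) FROM ods.premium_income_flag"    # assumed flag table written by Hive

def run_pipeline() -> None:
    # 1. Sqoop extracts the data to be processed and writes it into the Hive table.
    subprocess.run(SQOOP_IMPORT, check=True)
    # 2./3. Hive derives the flag bit; Azkaban would normally schedule this check periodically.
    # Simplified: assumes the query result is the only line on stdout.
    flag = subprocess.run(["hive", "-e", FLAG_SQL],
                          capture_output=True, text=True, check=True).stdout.strip()
    if flag != "1":   # configuration condition not met, so the calculation is skipped
        return
    # 4./5. Hive computes the UPR value for each target dimension (HiveQL kept in a separate file).
    subprocess.run(["hive", "-f", "/opt/etl/upr_by_dimension.hql"], check=True)
    # 6. Sqoop synchronizes the per-dimension UPR values back to the local database.
    subprocess.run(["sqoop", "export",
                    "--connect", "jdbc:mysql://local-db:3306/report",
                    "--username", "etl", "--password-file", "/user/etl/.pwd",
                    "--table", "upr_result",
                    "--export-dir", "/user/hive/warehouse/dm.db/upr_result"], check=True)

if __name__ == "__main__":
    run_pipeline()
```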
According to a preferred embodiment of the present invention, the invoking the Sqoop data migration tool to extract the data to be processed from the upstream data includes:
acquiring a current timestamp;
determining a target time range according to the preset time interval and the current timestamp;
determining a designated database storing the upstream data;
starting the Sqoop data migration tool to be connected to the specified database, and extracting data in the target time range from the specified database as the data to be processed; and/or
And detecting data with a specified format generated in the target time range in the upstream data, and determining the detected data as the data to be processed.
According to a preferred embodiment of the present invention, the calling the Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain the flag bit includes:
acquiring a first field, a second field and a third field which are configured in advance;
acquiring data matched with the first field from the data to be processed as guarantee fee income data, acquiring data matched with the second field as a guarantee period, and acquiring data matched with the third field as an evaluation time point;
when the guarantee fee income data is less than zero and/or the guarantee period is less than or equal to the evaluation time point, determining that the Flag bit is Flag =0;
and when the guarantee fee income data is greater than or equal to zero and the guarantee period is greater than the evaluation time point, determining that the Flag bit is Flag =1.
According to a preferred embodiment of the present invention, the starting of the Azkaban task scheduler to regularly detect whether the flag bit meets the configuration condition includes:
starting the Azkaban task scheduler to detect the value of the flag bit at fixed time;
when the Flag bit is detected to be Flag =1, determining that the configuration condition is satisfied; or
When the Flag bit is detected to be Flag =0, it is determined that the configuration condition is not satisfied.
According to a preferred embodiment of the present invention, the invoking the Hive data warehouse tool to extract a calculation factor in each target dimension from the data to be processed, and calculating the UPR value in each target dimension according to the calculation factor in each target dimension includes:
for each target dimension, acquiring at least one field name configured in advance, wherein the at least one field name comprises the first field, the third field and a fourth field corresponding to a guarantee period;
extracting a calculation factor under each target dimension from the data to be processed according to the at least one field name, wherein the calculation factor comprises target guarantee fee income data corresponding to the first field, a target evaluation time point corresponding to the third field and a target guarantee period corresponding to the fourth field;
calculating target passing time according to the target evaluation time point and the target guarantee period;
calculating a target current insurance start period according to the target guarantee period and the target elapsed time;
calculating the sum of the target current insurance start period and a preset value as a target current insurance end period;
calculating the difference value between the target current insurance expiration date and the target evaluation time point and the preset value as a first target value;
calculating the difference value between the target current insurance deadline and the target current insurance start period as a second target value;
calculating a product of the second target value and the target premium revenue data as a third target value;
and calculating the quotient of the first target value and the third target value as the UPR value of the target dimension.
According to a preferred embodiment of the present invention, after the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to the local database, the method further includes:
when the preset identification is detected, the synchronization is determined to be completed;
transmitting the preset identification to Kafka;
connecting to a mail notification interface;
and when the mail notification interface monitors that the kafak consumes the preset identifier, sending a prompt mail to a specified terminal device through the mail notification interface, wherein the prompt mail is used for prompting that the UPR value of each target dimension is successfully synchronized to the local database.
According to a preferred embodiment of the present invention, after the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to the local database, the method further includes:
monitoring a pre-established query interface and a pre-established download interface in real time;
when the inquiry interface and/or the download interface is detected to be triggered, acquiring a requester triggering the inquiry interface and/or the download interface;
verifying the authority of the requester;
when the authority of the requester is verified, determining target data requested by the requester through the query interface and/or the download interface;
and feeding back the target data to the terminal equipment of the requester through the query interface and/or the download interface.
A big-data based insurance data analysis apparatus, the big-data based insurance data analysis apparatus comprising:
the extracting unit is used for scanning upstream data at preset time intervals, calling a Sqoop data migration tool to extract data to be processed from the upstream data, and writing the data to be processed into a Hive table;
the processing unit is used for calling a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit;
the detection unit is used for starting the Azkaban task scheduler to detect whether the flag bit meets the configuration condition or not at regular time;
the determining unit is used for determining at least one target dimension according to the received requirement table when the flag bit is detected to meet the configuration condition;
the calculating unit is used for calling the Hive data warehouse tool to extract a calculation factor under each target dimension from the data to be processed and calculating a UPR value of each target dimension according to the calculation factor under each target dimension;
and the synchronization unit is used for calling the Sqoop data migration tool to synchronize the UPR value of each target dimension to a local database.
A computer device, the computer device comprising:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the big-data based insurance data analysis method.
A computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the big-data based insurance data analysis method.
It can be seen from the above technical solutions that the present invention can scan upstream data at preset time intervals, invoke a Sqoop data migration tool to extract data to be processed from the upstream data, and write the data to be processed into a Hive table; invoke a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit; start an Azkaban task scheduler to periodically detect whether the flag bit satisfies a configuration condition; and, when it is detected that the flag bit satisfies the configuration condition, determine at least one target dimension according to a received requirement table. Because subsequent calculations are executed only when the flag bit is detected to satisfy the configuration condition, the computing resources of the system are effectively saved. The Hive data warehouse tool is then called to extract a calculation factor under each target dimension from the data to be processed, the UPR value of each target dimension is calculated according to the calculation factor under each target dimension, and the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to a local database. In this way, big data calculation is combined with a unified operation mode, which avoids the waste of system resources and manpower caused by adopting different calculation modes for different data volumes; the UPR value under each dimension is calculated automatically by the Hive data warehouse tool, which effectively reduces labor cost, avoids the larger errors introduced by manual calculation, makes the calculated UPR value more accurate, and improves calculation efficiency.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the big data based insurance data analysis method of the present invention.
FIG. 2 is a functional block diagram of a preferred embodiment of the big data based insurance data analysis apparatus of the present invention.
FIG. 3 is a schematic structural diagram of a computer device for implementing a big-data-based insurance data analysis method according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a flow chart of a method for analyzing insurance data based on big data according to a preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The insurance data analysis method based on big data is applied to one or more computer devices, wherein the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and the hardware thereof includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device and the like.
The computer device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The computer device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network servers.
The Network in which the computer device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
S10, scanning upstream data at preset time intervals, calling a Sqoop data migration tool to extract data to be processed from the upstream data, and writing the data to be processed into a Hive table.
In this embodiment, the preset time interval may be configured by a user, for example, 1 day.
In this embodiment, the upstream data refers to the raw data records of system operation, and the upstream data is usually stored in a relational database, such as a MySQL database, an Oracle database, and the like.
In at least one embodiment of the present invention, the invoking the Sqoop data migration tool to extract the data to be processed from the upstream data includes:
acquiring a current timestamp;
determining a target time range according to the preset time interval and the current timestamp;
determining a designated database storing the upstream data;
starting the Sqoop data migration tool to be connected to the specified database, and extracting data in the target time range from the specified database as the data to be processed; and/or
And detecting data with a specified format generated in the target time range in the upstream data, and determining the detected data as the data to be processed.
For example: when the current timestamp is day 1, 12 of 1990 and the preset time interval is one month, the target time range is one month forward from day 1, 12 of 1990, i.e., day 1, 11 of 1990.
In this embodiment, the designated database is generally a relational database for storing the upstream data.
In this embodiment, the specified format includes a csv format.
In this embodiment, when the data that meets the target time range is stored in the specified database, the data is directly acquired from the specified database as the to-be-processed data by using the Sqoop data migration tool.
Meanwhile, for data which is not stored in the specified database, the data with the specified format generated in the target time range is detected, so that the obtained data is more comprehensive, and data omission is avoided.
Further, in this embodiment, after the Sqoop data migration tool is called to extract the to-be-processed data from the upstream data, the to-be-processed data is written into the Hive table, so that a Hive data warehouse tool is called subsequently to perform data calculation.
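As a hedged sketch of this step (the JDBC URL, credentials, table and column names are all assumptions), the target time range derived from the current timestamp and the preset time interval can be turned into a Sqoop --where filter so that only data generated inside that window is written to the Hive table:

```python
import subprocess
from datetime import datetime, timedelta

PRESET_INTERVAL = timedelta(days=30)          # assumed "one month" preset time interval

def import_window_to_hive(now: datetime) -> None:
    """Extract upstream rows created inside the target time range and write them to Hive."""
    start = now - PRESET_INTERVAL              # target time range: [now - interval, now]
    where = (f"create_time >= '{start:%Y-%m-%d %H:%M:%S}' "
             f"AND create_time < '{now:%Y-%m-%d %H:%M:%S}'")   # assumed timestamp column
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@upstream-db:1521/ins",  # assumed upstream database
        "--username", "etl", "--password-file", "/user/etl/.pwd",
        "--table", "PREMIUM_INCOME",                            # assumed upstream table
        "--where", where,                                       # only the target time range
        "--hive-import", "--hive-table", "ods.premium_income",  # assumed Hive table
        "-m", "1",
    ]
    subprocess.run(cmd, check=True)

import_window_to_hive(datetime.now())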
And S11, calling a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit.
The flag bit may be used as a determination flag to determine whether to perform a calculation.
In at least one embodiment of the present invention, the invoking the Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit includes:
acquiring a first field, a second field and a third field which are configured in advance;
acquiring data matched with the first field from the data to be processed as guarantee fee income data, acquiring data matched with the second field as a guarantee period, and acquiring data matched with the third field as an evaluation time point;
when the guarantee fee income data is less than zero and/or the guarantee period is less than or equal to the evaluation time point, determining that the Flag bit is Flag =0;
and when the guarantee fee income data is larger than or equal to zero and the guarantee period is larger than the evaluation time point, determining that the Flag bit is Flag =1.
The first field, the second field and the third field can be configured by self-definition and used for acquiring corresponding data.
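A minimal HiveQL sketch of this flag derivation, submitted through the Hive command line from Python, might look as follows; the table name and the column names premium_income, guarantee_period and eval_point are assumptions standing in for the first, second and third fields:

```python
import subprocess

# Assumed table and column names; the CASE expression mirrors the two rules above:
# Flag = 0 when the income is negative or the guarantee period is not later than the
# evaluation time point, Flag = 1 otherwise.
FLAG_SQL = """
INSERT OVERWRITE TABLE ods.premium_income_flag
SELECT policy_id,
       CASE
         WHEN premium_income < 0 OR guarantee_period <= eval_point THEN 0
         ELSE 1
       END AS flag
FROM ods.premium_income;
"""

subprocess.run(["hive", "-e", FLAG_SQL], check=True)
```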
And S12, starting the Azkaban task scheduler to detect whether the flag bit meets the configuration condition at regular time.
In at least one embodiment of the present invention, the starting of the Azkaban task scheduler to periodically detect whether the flag bit meets the configuration condition includes:
starting the Azkaban task scheduler to detect the value of the flag bit at fixed time;
when the Flag bit is detected to be Flag =1, determining that the configuration condition is met; or
When the Flag bit is detected to be Flag =0, determining that the configuration condition is not satisfied.
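Azkaban itself is driven by job definition files rather than application code; the sketch below writes two hypothetical .job definitions in which a periodically scheduled check_flag job exits non-zero when the flag is not 1, so that the dependent compute_upr job only runs when the configuration condition is met (the file names, script paths and query are assumptions):

```python
from pathlib import Path

# check_flag.job: scheduled periodically from the Azkaban web UI; the referenced shell
# script (an assumption) runs "hive -e 'SELECT max(flag) ...'" and exits non-zero when
# the flag is not 1, so the dependent job is not started.
CHECK_FLAG_JOB = """\
type=command
command=bash /opt/etl/check_flag.sh
"""

# compute_upr.job: runs the UPR calculation only after check_flag has succeeded.
COMPUTE_UPR_JOB = """\
type=command
dependencies=check_flag
command=hive -f /opt/etl/upr_by_dimension.hql
"""

flow_dir = Path("azkaban_flow")
flow_dir.mkdir(exist_ok=True)
(flow_dir / "check_flag.job").write_text(CHECK_FLAG_JOB)
(flow_dir / "compute_upr.job").write_text(COMPUTE_UPR_JOB)
```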
S13, when the flag bit is detected to meet the configuration condition, determining at least one target dimension according to the received requirement table.
In at least one embodiment of the present invention, when it is detected that the Flag bit does not satisfy the configuration condition, i.e., when it is detected that the Flag bit is Flag =0, no calculation is performed.
By the embodiment, when the acquired data do not meet the configuration condition, namely the acquired data are dirty data, the calculation is not executed, the waste of calculation resources is avoided, and the performance of the system is effectively ensured.
In this embodiment, the requirement table may be uploaded by a relevant worker (e.g., a developer, a project manager, etc.), and all dimensions that may involve calculations are stored in the requirement table.
The target dimension can be configured by self-definition, and the invention is not limited.
By way of example, the target dimensions include, but are not limited to, one or a combination of the following: date, year and month, legal subject, company section, cost center, product section and business section.
For example: the target dimension may be: UPR value at xxx cost center under xxx legal subjects at time 2021-03.
In the above embodiment, only when the flag bit is detected to satisfy the configuration condition, the subsequent calculation is performed, which effectively saves the calculation resources of the system.
S14, calling the Hive data warehouse tool to extract a calculation factor under each target dimension from the data to be processed, and calculating a UPR (Unexpired Premium Reserve) value of each target dimension according to the calculation factor under each target dimension.
Taking the above embodiment into consideration, the invoking the Hive data warehouse tool to extract the calculation factor in each target dimension from the data to be processed, and calculating the UPR value in each target dimension according to the calculation factor in each target dimension includes:
for each target dimension, acquiring at least one field name configured in advance, wherein the at least one field name comprises the first field, the third field and a fourth field corresponding to a guarantee period;
extracting a calculation factor under each target dimension from the data to be processed according to the at least one field name, wherein the calculation factor comprises target guarantee fee income data corresponding to the first field, a target evaluation time point corresponding to the third field and a target guarantee period corresponding to the fourth field;
calculating target passing time according to the target evaluation time point and the target guarantee period;
calculating the target current insurance start period according to the target guarantee period and the target elapsed time;
calculating the sum of the target current insurance start period and a preset value as a target current insurance end period;
calculating the difference value between the target current insurance deadline and the target evaluation time point and the preset value as a first target value;
calculating the difference value between the target current insurance deadline and the target current insurance start period as a second target value;
calculating a product of the second target value and the target premium revenue data as a third target value;
and calculating the quotient of the first target value and the third target value as the UPR value of the target dimension.
Specifically, the calculation is as follows:
(1) Target elapsed time (unit: month) = (target evaluation time point (year) - target guarantee period (year)) × 12 + (target evaluation time point (month) - target guarantee period (month));
wherein the target evaluation time point (year, month and day) is the last natural day of the previous month;
wherein the target elapsed time is a term concept, indicating how many periods, i.e. how many months, the client has passed through;
(2) Target current insurance start period (year, month and day) = target guarantee period (year, month and day) + target elapsed time (unit: month);
wherein the target current insurance period refers to the insurance period of each client in the current period, and because the dates of the clients' target guarantee periods differ, the target current insurance period of each client is also different;
(3) Target current insurance end period (year, month and day) = target current insurance start period (year, month and day) + 1;
wherein 1 is the preset value, namely one month;
wherein the target current insurance end period (deadline) represents the end of the client's current guarantee period;
(4) UPR value = ((target current insurance end period (year, month and day) - target evaluation time point (year, month and day)) - 1) / (target current insurance end period (year, month and day) - target current insurance start period (year, month and day)) × target guarantee fee income data.
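A small worked example in Python, following formulas (1) to (4) as given above with the preset value of one month; the sample policy values, the month-shifting helper and the day-count convention are assumptions made purely for illustration:

```python
from datetime import date

def elapsed_months(eval_point: date, guarantee_start: date) -> int:
    # formula (1): year difference * 12 + month difference
    return (eval_point.year - guarantee_start.year) * 12 + (eval_point.month - guarantee_start.month)

def add_months(d: date, months: int) -> date:
    # shift a date by whole months, keeping the same day-of-month (sufficient for this example)
    y, m = divmod(d.month - 1 + months, 12)
    return date(d.year + y, m + 1, d.day)

premium_income = 1200.0                 # assumed target guarantee fee income data
guarantee_start = date(2020, 12, 15)    # assumed target guarantee period (policy start date)
eval_point = date(2021, 3, 31)          # target evaluation time point: last natural day of the previous month

elapsed = elapsed_months(eval_point, guarantee_start)        # (1) -> 3 months
current_start = add_months(guarantee_start, elapsed)         # (2) -> 2021-03-15
current_end = add_months(current_start, 1)                   # (3) preset value = 1 month -> 2021-04-15

# formula (4): UPR = ((end - eval) - 1) / (end - start) * premium income
upr = ((current_end - eval_point).days - 1) / (current_end - current_start).days * premium_income
print(round(upr, 2))                                         # about 541.94 for these assumed inputs
```

Under these assumed inputs, 14 of the 31 days of the current insurance period remain unexpired at the evaluation point after the one-day adjustment in formula (4), so roughly 14/31 of the premium income is held as the UPR value.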
Through the embodiment, the UPR value under each dimensionality can be automatically calculated through a Hive data warehouse tool, the labor cost is effectively reduced, the phenomenon that a higher error is introduced due to manual calculation is avoided, the calculated UPR value is more accurate, and meanwhile the calculation efficiency is improved.
And S15, calling the Sqoop data migration tool to synchronize the UPR value of each target dimension to a local database.
The local database is a database deployed locally, from which data can be acquired directly, making the data acquisition process more efficient and convenient.
The embodiment combines big data calculation, unifies operation modes, and avoids waste of system resources and human resources caused by different calculation modes due to differences of data quantity and the like.
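One hedged way to express this synchronization is a Sqoop export from the Hive result directory into the local relational table; using an update key with allowinsert makes a re-run of the flow idempotent. Every connection string, path and key column below is an assumption:

```python
import subprocess

EXPORT_CMD = [
    "sqoop", "export",
    "--connect", "jdbc:mysql://local-db:3306/report",          # assumed local database
    "--username", "etl", "--password-file", "/user/etl/.pwd",
    "--table", "upr_result",                                   # assumed target table
    "--export-dir", "/user/hive/warehouse/dm.db/upr_result",   # assumed Hive warehouse path
    "--input-fields-terminated-by", "\\001",                   # default Hive field delimiter
    "--update-key", "target_dimension",                        # assumed key column
    "--update-mode", "allowinsert",                            # insert new rows, update existing ones
]
subprocess.run(EXPORT_CMD, check=True)
```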
In at least one embodiment of the present invention, after the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to the local database, the method further includes:
when the preset identification is detected, the synchronization is determined to be completed;
transmitting the preset identification to Kafka;
connecting to a mail notification interface;
and when the mail notification interface monitors that Kafka has consumed the preset identifier, sending a prompt mail to a specified terminal device through the mail notification interface, wherein the prompt mail is used for prompting that the UPR value of each target dimension is successfully synchronized to the local database.
Wherein the preset identification is used for marking whether synchronization is completed or not.
The specified terminal device may include, but is not limited to: terminal equipment of relevant financial staff and terminal equipment of sales personnel.
Through the implementation mode, after the calculated UPR value is successfully synchronized to the local database, the automatic mail can inform relevant workers so as to prompt the relevant workers to check in time, and the work efficiency of the relevant workers is improved in an auxiliary manner.
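As an illustrative sketch only (the broker address, topic name, mail server and addresses are assumptions), the completion identifier could be produced to Kafka with kafka-python and the prompt mail sent with smtplib once the identifier has been consumed:

```python
import smtplib
from email.mime.text import MIMEText
from kafka import KafkaConsumer, KafkaProducer

BROKER, TOPIC = "kafka:9092", "upr-sync-done"         # assumed broker and topic

# Producer side: emit the preset identifier once the Sqoop export has finished.
producer = KafkaProducer(bootstrap_servers=BROKER)
producer.send(TOPIC, b"UPR_SYNC_DONE")                # assumed preset identifier
producer.flush()

# Consumer side (the "mail notification interface"): send a prompt mail on consumption.
consumer = KafkaConsumer(TOPIC, bootstrap_servers=BROKER,
                         auto_offset_reset="earliest", consumer_timeout_ms=60000)
for record in consumer:
    if record.value == b"UPR_SYNC_DONE":
        msg = MIMEText("The UPR value of each target dimension has been synchronized to the local database.")
        msg["Subject"] = "UPR synchronization completed"
        msg["From"], msg["To"] = "etl@example.com", "finance@example.com"   # assumed addresses
        with smtplib.SMTP("mail.example.com") as smtp:                      # assumed mail server
            smtp.send_message(msg)
        break
```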
In at least one embodiment of the present invention, after the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to the local database, the method further includes:
monitoring a pre-established query interface and a pre-established download interface in real time;
when the inquiry interface and/or the download interface is detected to be triggered, acquiring a requester triggering the inquiry interface and/or the download interface;
verifying the authority of the requester;
when the authority of the requester is verified, determining target data requested by the requester through the query interface and/or the download interface;
and feeding the target data back to the terminal equipment of the requester through the query interface and/or the download interface.
For example: when the button of the query interface is detected to be clicked, a target user clicking the button of the query interface is obtained, the authority of the target user can be queried from a pre-configured authority list, when the target user has the query authority, the target user is determined to pass verification, data queried by the target user through the query interface is further detected to serve as target data, and the target data is transmitted through the query interface and displayed on the terminal device of the requester.
Through the implementation mode, the local inquiry and downloading can be supported after the UPR value obtained through calculation is successfully synchronized to the local database, so that the data can be conveniently read and checked by related workers, the working efficiency is further improved, and the user experience is better.
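A minimal sketch of such a query interface with a permission check, written here with Flask purely for illustration; the route, the permission list and the in-memory stand-in for the local database are assumptions and not part of the disclosed method:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

PERMISSIONS = {"alice": {"query"}, "bob": {"query", "download"}}   # assumed pre-configured permission list
UPR_RESULTS = {"2021-03/cost_center_A": 541.94}                    # stand-in for the local database

@app.route("/upr/query")
def query_upr():
    requester = request.args.get("requester", "")
    if "query" not in PERMISSIONS.get(requester, set()):           # verify the authority of the requester
        return jsonify({"error": "permission denied"}), 403
    dimension = request.args.get("dimension", "")
    return jsonify({"dimension": dimension,                        # feed the target data back
                    "upr": UPR_RESULTS.get(dimension)})

if __name__ == "__main__":
    app.run()
```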
It should be noted that, in order to further improve the security of the data and avoid malicious tampering of the data, the UPR value may be stored in the blockchain node.
It can be seen from the above technical solutions that the present invention can scan upstream data at preset time intervals, invoke a Sqoop data migration tool to extract data to be processed from the upstream data, and write the data to be processed into a Hive table; invoke a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit; start an Azkaban task scheduler to periodically detect whether the flag bit satisfies a configuration condition; and, when it is detected that the flag bit satisfies the configuration condition, determine at least one target dimension according to a received requirement table. Because subsequent calculations are executed only when the flag bit is detected to satisfy the configuration condition, the computing resources of the system are effectively saved. The Hive data warehouse tool is then called to extract a calculation factor under each target dimension from the data to be processed, the UPR value of each target dimension is calculated according to the calculation factor under each target dimension, and the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to a local database. In this way, big data calculation is combined with a unified operation mode, which avoids the waste of system resources and manpower caused by adopting different calculation modes for different data volumes; the UPR value under each dimension is calculated automatically by the Hive data warehouse tool, which effectively reduces labor cost, avoids the larger errors introduced by manual calculation, makes the calculated UPR value more accurate, and improves calculation efficiency.
Fig. 2 is a functional block diagram of an insurance data analysis device based on big data according to a preferred embodiment of the present invention. The big-data-based insurance data analysis device 11 comprises an extraction unit 110, a processing unit 111, a detection unit 112, a determination unit 113, a calculation unit 114 and a synchronization unit 115. The module/unit referred to in the present invention refers to a series of computer program segments that can be executed by the processor 13, that can perform a fixed function, and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
The extraction unit 110 scans upstream data at preset time intervals, invokes a Sqoop data migration tool to extract data to be processed from the upstream data, and writes the data to be processed into the Hive table.
In this embodiment, the preset time interval may be configured by a user, for example, 1 day.
In this embodiment, the upstream data refers to the raw data records of system operation, and the upstream data is usually stored in a relational database, such as a MySQL database, an Oracle database, and the like.
In at least one embodiment of the present invention, the extracting unit 110 invoking the Sqoop data migration tool to extract the data to be processed from the upstream data includes:
acquiring a current timestamp;
determining a target time range according to the preset time interval and the current timestamp;
determining a designated database storing the upstream data;
starting the Sqoop data migration tool to be connected to the specified database, and extracting data in the target time range from the specified database to serve as the data to be processed; and/or
And detecting data with a specified format generated in the target time range in the upstream data, and determining the detected data as the data to be processed.
For example: when the current timestamp is December 1, 1990 and the preset time interval is one month, the target time range extends one month back from December 1, 1990, i.e., from November 1, 1990 to December 1, 1990.
In this embodiment, the designated database is generally a relational database for storing the upstream data.
In this embodiment, the specified format includes a csv format.
In this embodiment, when the data corresponding to the target time range is stored in the specified database, the data is directly acquired from the specified database as the data to be processed by using the Sqoop data migration tool.
Meanwhile, the data which is not stored in the specified database is detected to be generated in the target time range and has the specified format, so that the acquired data is more comprehensive, and data omission is avoided.
Further, in this embodiment, after the Sqoop data migration tool is called to extract the to-be-processed data from the upstream data, the to-be-processed data is written into the Hive table, so that a Hive data warehouse tool is called subsequently to perform data calculation.
The processing unit 111 calls a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit.
The flag bit may be used as a determination flag to determine whether to perform a calculation.
In at least one embodiment of the present invention, the processing unit 111 invokes a Hive data warehouse tool to process the to-be-processed data stored in the Hive table, and obtaining the flag bit includes:
acquiring a first field, a second field and a third field which are configured in advance;
acquiring data matched with the first field from the data to be processed as guarantee fee income data, acquiring data matched with the second field as a guarantee period, and acquiring data matched with the third field as an evaluation time point;
when the guarantee fee income data is less than zero and/or the guarantee period is less than or equal to the evaluation time point, determining that the Flag bit is Flag =0;
and when the guarantee fee income data is larger than or equal to zero and the guarantee period is larger than the evaluation time point, determining that the Flag bit is Flag =1.
The first field, the second field and the third field can be configured by self-definition and used for acquiring corresponding data.
The detection unit 112 starts the Azkaban task scheduler to detect whether the flag bit meets the configuration condition or not at regular time.
In at least one embodiment of the present invention, the starting, by the detection unit 112, of the Azkaban task scheduler to periodically detect whether the flag bit meets the configuration condition includes:
starting the Azkaban task scheduler to detect the value of the flag bit at fixed time;
when the Flag bit is detected to be Flag =1, determining that the configuration condition is satisfied; or
When the Flag bit is detected to be Flag =0, determining that the configuration condition is not satisfied.
When detecting that the flag bit satisfies the configuration condition, the determining unit 113 determines at least one target dimension according to the received requirement table.
In at least one embodiment of the present invention, when it is detected that the Flag does not satisfy the configuration condition, that is, when it is detected that the Flag is Flag =0, no calculation is performed.
By the embodiment, when the acquired data do not meet the configuration condition, namely the acquired data are dirty data, the calculation is not executed, the waste of calculation resources is avoided, and the performance of the system is effectively ensured.
In this embodiment, the requirement table may be uploaded by a relevant worker (e.g., a developer, a project manager, etc.), and all dimensions that may involve calculations are stored in the requirement table.
The target dimension can be configured by self-definition, and the invention is not limited.
For example, the target dimensions include, but are not limited to, one or a combination of the following dimensions: date, year and month, legal body, company section, cost center, product section and service section.
For example: the target dimension may be: UPR values for xxx cost centers under xxx jurisdictions at times 2021-03.
In the above embodiment, only when the flag bit is detected to satisfy the configuration condition, the subsequent calculation is performed, which effectively saves the calculation resources of the system.
The calculating unit 114 calls the Hive data warehouse tool to extract a calculation factor in each target dimension from the data to be processed, and calculates a UPR (Unexpired Premium Reserve) value in each target dimension according to the calculation factor in each target dimension.
Taking the above embodiment into consideration, the invoking the Hive data warehouse tool to extract the calculation factor in each target dimension from the data to be processed, and calculating the UPR value in each target dimension according to the calculation factor in each target dimension includes:
for each target dimension, acquiring at least one field name configured in advance, wherein the at least one field name comprises the first field, the third field and a fourth field corresponding to a guarantee period;
extracting a calculation factor under each target dimension from the data to be processed according to the at least one field name, wherein the calculation factor comprises target guarantee fee income data corresponding to the first field, a target evaluation time point corresponding to the third field and a target guarantee period corresponding to the fourth field;
calculating target passing time according to the target evaluation time point and the target guarantee period;
calculating the target current insurance start period according to the target guarantee period and the target elapsed time;
calculating the sum of the target current insurance start period and a preset value as a target current insurance end period;
calculating the difference value between the target current insurance deadline and the target evaluation time point and the preset value as a first target value;
calculating the difference value between the target current insurance deadline and the target current insurance start period as a second target value;
calculating a product of the second target value and the target premium revenue data as a third target value;
and calculating the quotient of the first target value and the third target value as the UPR value of the target dimension.
Specifically, the calculation is as follows:
(1) Target elapsed time (unit: month) = (target evaluation time point (year) - target guarantee period (year)) × 12 + (target evaluation time point (month) - target guarantee period (month));
wherein the target evaluation time point (year, month and day) is the last natural day of the previous month;
wherein the target elapsed time is a term concept, indicating how many periods, i.e. how many months, the client has passed through;
(2) Target current insurance start period (year, month and day) = target guarantee period (year, month and day) + target elapsed time (unit: month);
wherein the target current insurance period refers to the insurance period of each client in the current period, and because the dates of the clients' target guarantee periods differ, the target current insurance period of each client is also different;
(3) Target current insurance end period (year, month and day) = target current insurance start period (year, month and day) + 1;
wherein 1 is the preset value, namely one month;
wherein the target current insurance end period (deadline) represents the end of the client's current guarantee period;
(4) UPR value = ((target current insurance end period (year, month and day) - target evaluation time point (year, month and day)) - 1) / (target current insurance end period (year, month and day) - target current insurance start period (year, month and day)) × target guarantee fee income data.
Through the embodiment, the UPR value under each dimensionality can be automatically calculated through a Hive data warehouse tool, the labor cost is effectively reduced, the phenomenon that a higher error is introduced due to manual calculation is avoided, the calculated UPR value is more accurate, and meanwhile the calculation efficiency is improved.
The synchronization unit 115 invokes the Sqoop data migration tool to synchronize the UPR value of each target dimension to the local database.
The local database is a database deployed locally, from which data can be acquired directly, making the data acquisition process more efficient and convenient.
The embodiment combines big data calculation, unifies operation modes, and avoids waste of system resources and human resources caused by different calculation modes due to differences of data quantity and the like.
In at least one embodiment of the invention, after the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to a local database, when a preset identifier is detected, synchronization is determined to be completed;
transmitting the preset identification to Kafka;
connecting to a mail notification interface;
and when the mail notification interface monitors that Kafka has consumed the preset identifier, sending a prompt mail to a specified terminal device through the mail notification interface, wherein the prompt mail is used for prompting that the UPR value of each target dimension is successfully synchronized to the local database.
Wherein the preset identification is used for marking whether synchronization is completed or not.
Wherein, the specified terminal device may include, but is not limited to: terminal equipment of related financial staff and terminal equipment of salesmen.
Through the implementation mode, after the calculated UPR value is successfully synchronized to the local database, the automatic mail can inform relevant workers so as to prompt the relevant workers to check in time, and the work efficiency of the relevant workers is improved in an auxiliary manner.
In at least one embodiment of the invention, after the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to a local database, a pre-established query interface and a pre-established download interface are monitored in real time;
when the inquiry interface and/or the download interface is detected to be triggered, acquiring a requester triggering the inquiry interface and/or the download interface;
verifying the authority of the requester;
when the authority of the requester is verified, determining target data requested by the requester through the query interface and/or the download interface;
and feeding back the target data to the terminal equipment of the requester through the query interface and/or the download interface.
For example: when the button of the query interface is detected to be clicked, a target user clicking the button of the query interface is obtained, the authority of the target user can be queried from a pre-configured authority list, when the target user has the query authority, the target user is determined to pass verification, data queried by the target user through the query interface is further detected to serve as target data, and the target data is transmitted through the query interface and displayed on the terminal device of the requester.
Through the embodiment, after the UPR value obtained through calculation is successfully synchronized to the local database, local query and downloading are supported, so that relevant workers can conveniently read and check data, the working efficiency is further improved, and the user experience is better.
It should be noted that, in order to further improve the security of the data and avoid malicious tampering of the data, the UPR value may be stored in the blockchain node.
According to the above technical solution, upstream data can be scanned at preset time intervals, the Sqoop data migration tool is called to extract the data to be processed from the upstream data, and the data to be processed is written into the Hive table; the Hive data warehouse tool is called to process the data to be processed stored in the Hive table to obtain the flag bit; the Azkaban task scheduler is started to periodically detect whether the flag bit meets the configuration condition; and, when the flag bit is detected to meet the configuration condition, at least one target dimension is determined according to the received requirement table. Because subsequent calculation is executed only when the flag bit meets the configuration condition, the computing resources of the system are effectively saved. The Hive data warehouse tool is then called to extract the calculation factor under each target dimension from the data to be processed, the UPR value of each target dimension is calculated according to the calculation factor under each target dimension, and the Sqoop data migration tool is called to synchronize the UPR value of each target dimension to the local database. In this way, big data calculation is combined with a unified operation mode, the waste of system resources and manpower caused by adopting different calculation modes for different data volumes is avoided, the UPR value under each dimension is calculated automatically by the Hive data warehouse tool, labor cost is effectively reduced, the larger errors introduced by manual calculation are avoided, the calculated UPR value is more accurate, and calculation efficiency is improved.
Fig. 3 is a schematic structural diagram of a computer device for implementing the insurance data analysis method based on big data according to the preferred embodiment of the present invention.
The computer device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program, such as a big data based insurance data analysis program, stored in the memory 12 and executable on the processor 13.
It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the computer device 1, and does not constitute a limitation to the computer device 1, the computer device 1 may be in a bus structure or a star structure, the computer device 1 may include more or less hardware or software than those shown, or different component arrangements, for example, the computer device 1 may further include an input and output device, a network access device, and the like.
It should be noted that the computer device 1 is only an example, and other electronic products that are currently available or may come into existence in the future, such as electronic products that can be adapted to the present invention, should also be included in the scope of the present invention, and are included herein by reference.
The memory 12 includes at least one type of readable storage medium, which includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the computer device 1, for example a removable hard disk of the computer device 1. The memory 12 may also be an external storage device of the computer device 1 in other embodiments, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the computer device 1. The memory 12 can be used not only for storing application software installed in the computer device 1 and various types of data such as codes of insurance data analysis programs based on big data, etc., but also for temporarily storing data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same function or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the computer device 1, connects various components of the whole computer device 1 by various interfaces and lines, and executes various functions of the computer device 1 and processes data by running or executing programs or modules stored in the memory 12 (for example, executing an insurance data analysis program based on big data, etc.) and calling data stored in the memory 12.
The processor 13 executes the operating system of the computer device 1 and various installed application programs. The processor 13 executes the application program to implement the steps of the various big-data based insurance data analysis method embodiments described above, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the computer device 1. For example, the computer program may be divided into an extraction unit 110, a processing unit 111, a detection unit 112, a determination unit 113, a calculation unit 114, a synchronization unit 115.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute the portions of the big data based insurance data analysis method according to the embodiments of the present invention.
The modules/units integrated by the computer device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments described above may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, read-Only Memory (ROM), random-access Memory, or the like.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain (Blockchain) is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods, in which each data block contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
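As a minimal illustration only (not part of the claimed method), the chained structure described above can be sketched in a few lines: each block carries a batch of transaction records together with the cryptographic hash of its predecessor, so altering any earlier block invalidates every later hash. All class and field names below are hypothetical.

```python
import hashlib
import json
import time

class Block:
    """A data block holding a batch of transaction records and the hash of its predecessor."""
    def __init__(self, transactions, previous_hash):
        self.timestamp = time.time()
        self.transactions = transactions        # information of a batch of network transactions
        self.previous_hash = previous_hash      # cryptographic link to the previous block
        self.hash = self.compute_hash()

    def compute_hash(self):
        payload = json.dumps({"timestamp": self.timestamp,
                              "transactions": self.transactions,
                              "previous_hash": self.previous_hash}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# The second block's hash covers the first block's hash, which is what supports
# the validity (anti-counterfeiting) check described above.
genesis = Block(transactions=["policy #1 issued"], previous_hash="0" * 64)
next_block = Block(transactions=["policy #2 issued"], previous_hash=genesis.hash)
assert next_block.previous_hash == genesis.compute_hash()
```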
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one line is shown in fig. 3, but this does not mean that there is only one bus or only one type of bus. The bus is arranged to enable communication between the memory 12, the at least one processor 13, and the other components.
Although not shown, the computer device 1 may further include a power supply (such as a battery) for supplying power to the components. Preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so that functions such as charge management, discharge management, and power consumption management are realized through the power management device. The power supply may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and any other components. The computer device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the computer device 1 may also include a network interface, which may optionally include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), generally used for establishing a communication connection between the computer device 1 and other computer devices.
Optionally, the computer device 1 may further comprise a user interface, which may include a display (Display) and an input unit such as a keyboard (Keyboard), and may optionally include a standard wired interface and a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying the information processed in the computer device 1 and for displaying a visualized user interface.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 shows only the computer device 1 with components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the computer device 1, which may comprise fewer or more components than shown, combine certain components, or have a different arrangement of components.
Referring to fig. 1, the memory 12 of the computer device 1 stores a plurality of instructions to implement a big data based insurance data analysis method, and the processor 13 can execute the plurality of instructions to implement:
scanning upstream data at preset time intervals, calling a Sqoop data migration tool to extract data to be processed from the upstream data, and writing the data to be processed into a Hive table;
calling a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit;
starting an Azkaban task scheduler to periodically detect whether the flag bit satisfies a configuration condition;
when the flag bit is detected to meet the configuration condition, determining at least one target dimension according to a received requirement table;
calling the Hive data warehouse tool to extract a calculation factor under each target dimension from the data to be processed, and calculating a UPR value of each target dimension according to the calculation factor under each target dimension;
and calling the Sqoop data migration tool to synchronize the UPR value of each target dimension to a local database.
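By way of a non-limiting sketch of how the above instructions could be chained together, the following Python driver invokes Sqoop and Hive from the command line; the JDBC URL, table and column names, and the timed loop (which stands in for the Azkaban schedule) are illustrative assumptions rather than the claimed implementation.

```python
import subprocess
import time

INTERVAL_SECONDS = 3600  # the "preset time interval"; the value is illustrative

def extract_to_hive(jdbc_url, user, password, source_table, hive_table):
    """Step 1: call Sqoop to pull the data to be processed into a Hive table."""
    subprocess.run([
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", user, "--password", password,
        "--table", source_table,
        "--hive-import", "--hive-table", hive_table,
    ], check=True)

def flag_bit_satisfied(hive_table):
    """Steps 2-3: derive the flag bit in Hive and test the configuration condition (Flag=1)."""
    query = ("SELECT MIN(CASE WHEN premium_income >= 0 AND guarantee_period > eval_time "
             f"THEN 1 ELSE 0 END) FROM {hive_table}")  # column names are assumptions
    result = subprocess.run(["hive", "-e", query],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip().endswith("1")

def run_pipeline(target_dimensions):
    while True:  # a plain loop standing in for the Azkaban schedule
        extract_to_hive("jdbc:mysql://upstream-db/insurance", "etl", "secret",
                        "policy_records", "ods.policy_records")
        if flag_bit_satisfied("ods.policy_records"):
            for dim in target_dimensions:
                # Steps 4-5: the per-dimension UPR calculation of claim 1 would run here.
                print(f"computing UPR for dimension {dim}")
            # Step 6: a sqoop export back to the local database would run here.
        time.sleep(INTERVAL_SECONDS)
```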
For the specific implementation of the above instructions by the processor 13, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only one kind of logical functional division, and other divisions may be adopted in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the present invention may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the same, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (9)

1. An insurance data analysis method based on big data is characterized by comprising the following steps:
scanning upstream data at preset time intervals, calling a Sqoop data migration tool to extract data to be processed from the upstream data, and writing the data to be processed into a Hive table;
calling a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit;
starting an Azkaban task scheduler to detect whether the flag bit meets configuration conditions at regular time;
when the flag bit is detected to meet the configuration condition, determining at least one target dimension according to a received requirement table;
calling the Hive data warehouse tool to extract a calculation factor under each target dimension from the data to be processed, and calculating a UPR value of each target dimension according to the calculation factor under each target dimension, which comprises the following steps: for each target dimension, acquiring at least one field name configured in advance, wherein the at least one field name comprises a first field, a third field and a fourth field corresponding to a guarantee period; extracting a calculation factor under each target dimension from the data to be processed according to the at least one field name, wherein the calculation factor comprises target guarantee fee income data corresponding to the first field, a target evaluation time point corresponding to the third field and a target guarantee period corresponding to the fourth field; calculating a target elapsed time according to the target evaluation time point and the target guarantee period; calculating a target current insurance start period according to the target guarantee period and the target elapsed time; calculating the sum of the target current insurance start period and a preset value as a target current insurance end period; calculating the difference value between the target current insurance end period and the target evaluation time point and the preset value as a first target value; calculating the difference value between the target current insurance end period and the target current insurance start period as a second target value; calculating the product of the second target value and the target guarantee fee income data as a third target value; and calculating the quotient of the first target value and the third target value as the UPR value of the target dimension;
and calling the Sqoop data migration tool to synchronize the UPR value of each target dimension to a local database.
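Read purely as arithmetic, the UPR steps recited in claim 1 can be sketched as follows. Claim 1 does not fix the functional form of the elapsed-time and start-period steps, nor the exact reading of the three-term difference, so those points are left as explicit assumptions; every identifier is illustrative and not claim language.

```python
def upr_value(premium_income, evaluation_time, guarantee_period, preset_value,
              elapsed_time_fn, current_start_fn):
    """Arithmetic skeleton of the UPR steps in claim 1.

    The derivations of the target elapsed time and of the target current
    insurance start period are not fixed by the claim, so they are passed in
    as functions; the three-term difference is given one plausible reading.
    """
    elapsed_time = elapsed_time_fn(evaluation_time, guarantee_period)   # target elapsed time
    current_start = current_start_fn(guarantee_period, elapsed_time)    # target current insurance start period
    current_end = current_start + preset_value                          # sum with the preset value
    first_target = (current_end - evaluation_time) - preset_value       # assumed reading of the difference
    second_target = current_end - current_start                         # end period minus start period
    third_target = second_target * premium_income                       # product with the premium income data
    return first_target / third_target                                  # quotient = UPR value of the dimension
```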
2. The big-data-based insurance data analysis method according to claim 1, wherein the invoking of the Sqoop data migration tool to extract the data to be processed from the upstream data comprises:
acquiring a current timestamp;
determining a target time range according to the preset time interval and the current timestamp;
determining a designated database storing the upstream data;
starting the Sqoop data migration tool to connect to the specified database, and extracting the data within the target time range from the specified database as the data to be processed; and/or
detecting data in a specified format generated within the target time range in the upstream data, and determining the detected data as the data to be processed.
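A sketch of the incremental extraction of claim 2: the target time range is derived from the current timestamp and the preset interval and pushed down to the source database as a predicate. The Sqoop invocation, connection details, and the timestamp column name are illustrative assumptions.

```python
import subprocess
import time

def extract_window(jdbc_url, user, password, source_table, hive_table,
                   interval_seconds, time_column="created_at"):
    """Pull only the upstream rows generated inside the target time range."""
    now = int(time.time())                      # current timestamp
    window_start = now - interval_seconds       # target time range: [now - interval, now]
    predicate = f"{time_column} >= {window_start} AND {time_column} <= {now}"
    subprocess.run([
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", user, "--password", password,
        "--table", source_table,
        "--where", predicate,                   # restricts extraction to the target time range
        "--hive-import", "--hive-table", hive_table,
    ], check=True)
```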
3. The big-data-based insurance data analysis method according to claim 1, wherein the calling Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit comprises:
acquiring the first field, the second field and the third field which are configured in advance;
acquiring data matched with the first field from the data to be processed as guarantee fee income data, acquiring data matched with the second field as a guarantee period, and acquiring data matched with the third field as an evaluation time point;
when the guarantee fee income data is less than zero and/or the guarantee period is less than or equal to the evaluation time point, determining that the Flag bit is Flag=0;
and when the guarantee fee income data is greater than or equal to zero and the guarantee period is greater than the evaluation time point, determining that the Flag bit is Flag=1.
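Per record, the flag-bit rule of claim 3 reduces to a simple predicate over the three extracted values; in the sketch below the parameter names merely stand in for the data matched to the configured first, second, and third fields.

```python
def flag_bit(premium_income, guarantee_period, evaluation_time):
    """Claim 3: Flag=0 for negative premium income or an expired guarantee period, otherwise Flag=1."""
    if premium_income < 0 or guarantee_period <= evaluation_time:
        return 0
    return 1
```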
4. The big-data-based insurance data analysis method according to claim 1, wherein the starting of the Azkaban task scheduler to periodically detect whether the flag bit satisfies a configuration condition comprises:
starting the Azkaban task scheduler to periodically detect the value of the flag bit;
when the Flag bit is detected to be Flag=1, determining that the configuration condition is satisfied; or
when the Flag bit is detected to be Flag=0, determining that the configuration condition is not satisfied.
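One way to realize the timed check of claim 4 is to let the scheduler run a small job on a fixed schedule that queries the flag value and reports, via its exit code, whether the configuration condition is met; the Hive query, column name, and table name below are illustrative assumptions, and the scheduling itself is left to the task scheduler.

```python
import subprocess
import sys

def flag_condition_met(hive_table):
    """Query the stored flag bit and report whether the configuration condition (Flag=1) holds."""
    result = subprocess.run(
        ["hive", "-e", f"SELECT MIN(flag) FROM {hive_table}"],  # the flag column name is an assumption
        capture_output=True, text=True, check=True)
    return result.stdout.strip().endswith("1")

if __name__ == "__main__":
    # A non-zero exit code tells the scheduler the condition is not yet met,
    # so the downstream UPR-calculation step is not triggered on this run.
    sys.exit(0 if flag_condition_met("ods.policy_flags") else 1)
```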
5. The big-data-based insurance data analysis method according to claim 1, wherein after invoking the Sqoop data migration tool to synchronize the UPR value of each target dimension to the local database, the method further comprises:
when a preset identifier is detected, determining that the synchronization is completed;
transmitting the preset identifier to Kafka;
connecting to a mail notification interface;
and when the mail notification interface monitors that Kafka has consumed the preset identifier, sending a prompt mail to a specified terminal device through the mail notification interface, wherein the prompt mail is used for prompting that the UPR value of each target dimension has been successfully synchronized to the local database.
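A sketch of the notification chain of claim 5, assuming the kafka-python client and an SMTP relay: a producer transmits the preset identifier to Kafka once synchronization completes, and a consumer-side hook sends the prompt mail when that identifier is consumed. Topic names, hosts, and mail addresses are placeholders.

```python
import smtplib
from email.mime.text import MIMEText
from kafka import KafkaConsumer, KafkaProducer  # kafka-python client (assumed dependency)

SYNC_DONE_MARKER = b"UPR_SYNC_DONE"             # stands in for the "preset identifier"

def publish_sync_marker(bootstrap_servers="kafka:9092", topic="upr-sync"):
    """After the Sqoop export completes, transmit the preset identifier to Kafka."""
    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    producer.send(topic, value=SYNC_DONE_MARKER)
    producer.flush()

def mail_on_sync(bootstrap_servers="kafka:9092", topic="upr-sync",
                 smtp_host="mail.example.com", recipient="ops@example.com"):
    """Consume the identifier and send the prompt mail to the specified terminal device."""
    consumer = KafkaConsumer(topic, bootstrap_servers=bootstrap_servers,
                             auto_offset_reset="earliest")
    for message in consumer:
        if message.value == SYNC_DONE_MARKER:
            mail = MIMEText("The UPR value of each target dimension was successfully "
                            "synchronized to the local database.")
            mail["Subject"] = "UPR synchronization completed"
            mail["From"] = "etl@example.com"
            mail["To"] = recipient
            with smtplib.SMTP(smtp_host) as server:
                server.sendmail("etl@example.com", [recipient], mail.as_string())
            break
```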
6. The big-data-based insurance data analysis method according to claim 1, wherein after invoking the Sqoop data migration tool to synchronize the UPR value of each target dimension to a local database, the method further comprises:
monitoring a pre-established query interface and a pre-established download interface in real time;
when it is detected that the query interface and/or the download interface is triggered, acquiring the requester that triggered the query interface and/or the download interface;
verifying the authority of the requester;
when the authority of the requester is verified, determining target data requested by the requester through the query interface and/or the download interface;
and feeding back the target data to the terminal equipment of the requester through the query interface and/or the download interface.
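The monitored query and download interfaces of claim 6 can be sketched as two HTTP endpoints guarded by a shared authority check; the use of Flask, the token header, and the permission table are illustrative assumptions, and the returned literals stand in for the target data read from the local database.

```python
from flask import Flask, abort, jsonify, request

app = Flask(__name__)

AUTHORIZED_TOKENS = {"token-abc": "analyst"}    # placeholder permission table

def verify_requester():
    """Verify the authority of the requester before serving any target data."""
    token = request.headers.get("X-Auth-Token", "")
    if token not in AUTHORIZED_TOKENS:
        abort(403)

@app.route("/upr/query")
def query_interface():
    verify_requester()
    dimension = request.args.get("dimension", "all")
    # The target data would be read from the local database here; the literal is a stand-in.
    return jsonify({"dimension": dimension, "upr": 0.0})

@app.route("/upr/download")
def download_interface():
    verify_requester()
    # Feed the target data back to the requester's terminal device as a downloadable file.
    return app.response_class("dimension,upr\nall,0.0\n", mimetype="text/csv")
```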
7. An insurance data analysis apparatus based on big data, characterized in that the insurance data analysis apparatus based on big data comprises:
the extraction unit is used for scanning upstream data at preset time intervals, calling a Sqoop data migration tool to extract data to be processed from the upstream data, and writing the data to be processed into a Hive table;
the processing unit is used for calling a Hive data warehouse tool to process the data to be processed stored in the Hive table to obtain a flag bit;
the detection unit is used for starting the Azkaban task scheduler to periodically detect whether the flag bit satisfies the configuration condition;
the determining unit is used for determining at least one target dimension according to the received requirement table when the flag bit is detected to meet the configuration condition;
the calculating unit is used for calling the Hive data warehouse tool to extract a calculation factor under each target dimension from the data to be processed, and calculating a UPR value of each target dimension according to the calculation factor under each target dimension, which comprises: for each target dimension, acquiring at least one field name configured in advance, wherein the at least one field name comprises a first field, a third field and a fourth field corresponding to a guarantee period; extracting a calculation factor under each target dimension from the data to be processed according to the at least one field name, wherein the calculation factor comprises target guarantee fee income data corresponding to the first field, a target evaluation time point corresponding to the third field and a target guarantee period corresponding to the fourth field; calculating a target elapsed time according to the target evaluation time point and the target guarantee period; calculating a target current insurance start period according to the target guarantee period and the target elapsed time; calculating the sum of the target current insurance start period and a preset value as a target current insurance end period; calculating the difference value between the target current insurance end period and the target evaluation time point and the preset value as a first target value; calculating the difference value between the target current insurance end period and the target current insurance start period as a second target value; calculating the product of the second target value and the target guarantee fee income data as a third target value; and calculating the quotient of the first target value and the third target value as the UPR value of the target dimension;
and the synchronization unit is used for calling the Sqoop data migration tool to synchronize the UPR value of each target dimension to a local database.
8. A computer device, characterized in that the computer device comprises:
a memory storing at least one instruction; and
a processor executing the instructions stored in the memory to implement the big-data-based insurance data analysis method of any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that: the computer-readable storage medium has stored therein at least one instruction that is executed by a processor in a computer device to implement the big data-based insurance data analysis method of any one of claims 1 to 6.
CN202110696233.XA 2021-06-23 2021-06-23 Insurance data analysis method, device, equipment and medium based on big data Active CN113449024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110696233.XA CN113449024B (en) 2021-06-23 2021-06-23 Insurance data analysis method, device, equipment and medium based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110696233.XA CN113449024B (en) 2021-06-23 2021-06-23 Insurance data analysis method, device, equipment and medium based on big data

Publications (2)

Publication Number Publication Date
CN113449024A CN113449024A (en) 2021-09-28
CN113449024B true CN113449024B (en) 2023-02-14

Family

ID=77812326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110696233.XA Active CN113449024B (en) 2021-06-23 2021-06-23 Insurance data analysis method, device, equipment and medium based on big data

Country Status (1)

Country Link
CN (1) CN113449024B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120197666A1 (en) * 2011-01-28 2012-08-02 Assurant, Inc. Method and apparatus for providing unemployment insurance
CN107133727A (en) * 2017-04-22 2017-09-05 南昌航空大学 Dimensionality optimization method is assessed based on the failure effect for differentiating force coefficient
CN107977897A (en) * 2017-12-28 2018-05-01 平安健康保险股份有限公司 Insurance business data analysis method, system and computer-readable recording medium
CA3046542A1 (en) * 2018-06-15 2019-12-15 Bank Of Montreal New issue management system
CN111695751A (en) * 2019-03-15 2020-09-22 北京京东尚科信息技术有限公司 Data processing method and data processing system
CN110716989A (en) * 2019-08-27 2020-01-21 苏宁云计算有限公司 Dimension data processing method and device, computer equipment and storage medium
CN112579586A (en) * 2020-12-23 2021-03-30 平安普惠企业管理有限公司 Data processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113449024A (en) 2021-09-28

Similar Documents

Publication Publication Date Title
CN111754110A (en) Method, device, equipment and medium for evaluating operation index based on artificial intelligence
WO2022147908A1 (en) Table association-based lost data recovery method and apparatus, device, and medium
CN112015663A (en) Test data recording method, device, equipment and medium
CN111754123A (en) Data monitoring method and device, computer equipment and storage medium
CN112256783A (en) Data export method and device, electronic equipment and storage medium
CN111985545A (en) Target data detection method, device, equipment and medium based on artificial intelligence
CN110647409A (en) Message writing method, electronic device, system and medium
CN114185776A (en) Big data point burying method, device, equipment and medium for application program
CN112866285B (en) Gateway interception method and device, electronic equipment and storage medium
CN111858604B (en) Data storage method and device, electronic equipment and storage medium
CN113449024B (en) Insurance data analysis method, device, equipment and medium based on big data
CN115147031B (en) Clearing workflow execution method, device, equipment and medium
CN111950707A (en) Behavior prediction method, apparatus, device and medium based on behavior co-occurrence network
CN114399397A (en) Renewal tracking method, device, equipment and medium
CN114626948A (en) Block chain transaction accounting method and device, electronic equipment and storage medium
CN115145870A (en) Method and device for positioning reason of failed task, electronic equipment and storage medium
CN113469649A (en) Project progress analysis method and device, electronic equipment and storage medium
CN110941536B (en) Monitoring method and system, and first server cluster
CN113723813A (en) Performance ranking method and device, electronic equipment and readable storage medium
CN113240351A (en) Business data consistency checking method and device, electronic equipment and medium
CN113419916B (en) Wind control inspection program uninterrupted operation method, device, equipment and storage medium
CN113360505B (en) Time sequence data-based data processing method and device, electronic equipment and readable storage medium
CN115065642B (en) Code table request method, device, equipment and medium under bandwidth limitation
CN113297228B (en) MySQL writing method, device, equipment and medium based on multiple live instances
CN113419718A (en) Data transmission method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant