CN113656704A - Insurance data processing method and equipment based on similarity matching and storage medium

Info

Publication number: CN113656704A
Application number: CN202111011057.8A
Authority: CN
Inventors: 程克喜
Original assignee: Ping An Property and Casualty Insurance Company of China Ltd
Current assignee: Ping An Property and Casualty Insurance Company of China Ltd
Priority date: 2021-08-31
Filing date: 2021-08-31
Publication date: 2021-11-16

Abstract

The application relates to artificial intelligence and provides an insurance data processing method, device, and storage medium based on similarity matching. The method comprises the following steps: using a first data processing tool, Sqoop, to periodically synchronize data of a target customer to a HIVE database of a big data platform, wherein the HIVE database stores data of one or more agents; using a second data processing tool, Hadoop, to extract a first feature of the target customer and a second feature of each of the one or more agents; and determining a target agent from the one or more agents based on the first feature and the one or more second features, the target agent being designated to follow up on the insurance requirements of the target customer. The embodiments of the application can improve the customer conversion rate in insurance business.

Description

Insurance data processing method and equipment based on similarity matching and storage medium

Technical Field

The present application relates to the field of data processing, and in particular, to an insurance data processing method, device, and storage medium based on similarity matching.

Background

An agent generally refers to a customer service representative who answers questions by taking consultation calls, and is an important bridge between customers and the company. When a customer has insurance requirements, the customer's information needs to be distributed to an agent, who then follows up on those requirements.

Currently, customer information is generally distributed manually. However, as insurance business continues to grow, users' insurance demands keep increasing, and the amount of data generated becomes larger and larger. If manual allocation continues, the workload of the relevant personnel increases greatly and operation or calculation errors are liable to occur. Further, since manual allocation relies largely on human experience, allocation may be uneven or improper. Therefore, a big-data-based processing method is required for distribution.

Disclosure of Invention

The embodiments of the application provide an insurance data processing method, device, and storage medium based on similarity matching, which can improve the customer conversion rate in insurance business.

In a first aspect, an embodiment of the present application provides an insurance data processing method based on similarity matching, where the method may include: using a first data processing tool, Sqoop, to periodically synchronize data of a target customer to a HIVE database of a big data platform, wherein the HIVE database stores data of one or more agents; using a second data processing tool, Hadoop, to extract a first feature of the target customer and a second feature of each of the one or more agents; and determining a target agent from the one or more agents based on the first feature and the one or more second features, the target agent being designated to follow up on the insurance requirements of the target customer.

According to the first aspect, in one possible implementation manner, determining a target agent from one or more agents based on the first feature and the one or more second features includes: inputting the first feature and the one or more second features into a prediction model; matching the first feature against the one or more second features through the prediction model to obtain one or more matching results, wherein each matching result corresponds to a matching score; taking the matching result with the highest matching score among the one or more matching results as a target matching result; and taking the agent corresponding to the target matching result as the target agent.

According to the first aspect, in a possible implementation manner, after determining a target agent from one or more agents based on the first feature and the second feature, the method further includes: sending the data of the target customer to the target agent through the computing engine Spark, wherein the data of the target customer is displayed on the terminal device corresponding to the target agent.

According to the first aspect, in a possible implementation manner, before synchronizing data of a target client to a big data platform HIVE database at a timing by using a first data processing tool Sqoop, the method further includes:

acquiring data of the target customer in a current scheduling period, wherein the target customer is a customer with insurance requirements and the data includes behavior data of the target customer browsing insurance products; and storing the data of the target customer in the current scheduling period in a database.

According to the first aspect, in a possible implementation manner, synchronizing data of a target client to a big data platform HIVE database at regular time by using a first data processing tool Sqoop includes: scanning a database at a first moment by adopting a first data processing tool Sqoop, wherein the first moment is the ending moment of the current scheduling period; and if the data of the target client in the current scheduling period is scanned from the database, synchronizing the data of the target client to a HIVE database of the big data platform.

According to the first aspect, in one possible implementation, if no data of the target client in the current scheduling period is scanned from the database, starting a next scheduling period; and storing the data of the target client in the next scheduling period in a database.

According to the first aspect, in a possible implementation manner, acquiring data of a target customer in a current scheduling period includes: collecting, in the current scheduling period, source customer data of customers browsing insurance products; and screening the collected source customer data according to a preset rule to obtain the data of the target customer.

In a second aspect, an embodiment of the present application provides an insurance data processing system based on similarity matching, where the system includes a source data platform, a processing platform, and a front-end display unit; the source data platform is used for screening collected source customer data according to a preset rule to obtain data of a target customer; the processing platform is used for periodically scanning the source data platform using a first data processing tool, Sqoop, and synchronizing the scanned data of the target customer to a HIVE database of a big data platform, wherein the HIVE database stores data of one or more agents; the processing platform is further used for extracting a first feature of the target customer and a second feature of each of the one or more agents using a second data processing tool, Hadoop, and determining a target agent from the one or more agents based on the first feature and the one or more second features, the target agent being designated to follow up on the insurance requirements of the target customer; the processing platform is further used for sending the data of the target customer to the target agent through the computing engine Spark, the target agent being one of the one or more agents; and the front-end display unit is used for displaying the data of the target customer to the target agent.

In a third aspect, an embodiment of the present application provides an insurance data processing apparatus based on similarity matching, where the apparatus may include: a synchronization unit, used for periodically synchronizing data of a target customer to a HIVE database of a big data platform using a first data processing tool, Sqoop, wherein the HIVE database stores data of one or more agents; a processing unit, used for extracting a first feature of the target customer and a second feature of each of the one or more agents using a second data processing tool, Hadoop; and a matching unit, used for determining a target agent from the one or more agents based on the first feature and the one or more second features, the target agent being designated to follow up on the insurance requirements of the target customer.

In a fourth aspect, embodiments of the present application provide an electronic device that may include a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for performing the steps of any of the methods of the first aspect of the present application.

In a fifth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, where the computer program is executed by a processor to implement part or all of the steps described in any one of the methods of the first aspect of the present application.

It can be seen that, after the data of a target customer with insurance requirements is obtained, the data may be synchronized to the big data platform HIVE at a first time point using the first data processing tool Sqoop, and sent to the target agent using the computing engine Spark. This reduces manual operations by service staff, makes allocation timely, and achieves automatic customer allocation. Furthermore, the target agent is obtained through the prediction model and can follow up on the insurance requirements of the target customer, so that the customer conversion rate and the customer experience can be improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments or the background art of the present application, the drawings required to be used in the embodiments or the background art of the present application will be described below.

FIG. 1 is a block diagram of an insurance data processing system based on similarity matching according to an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of an insurance data processing method based on similarity matching according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an insurance data processing apparatus based on similarity matching according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The embodiments of the present application will be described below with reference to the drawings.

The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

As used in this specification, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between 2 or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from two components interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).

The embodiment of the application can acquire and process related data (such as behavior data of users browsing insurance products) based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.

The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice recognition technology, a natural language processing technology, machine learning/deep learning and the like.

Referring to fig. 1, fig. 1 is a schematic diagram of an architecture of an insurance data processing system based on similarity matching according to an embodiment of the present application, and a data system 100 may include a source data platform 101, a processing platform 102, and a front-end presentation unit 103.

The source data platform 101 includes, but is not limited to, a mobile terminal, an application server, and a network. The mobile terminal may be a smart phone, a smart watch, a desktop computer, a notebook computer, and the like. The application server may be an independent server or a server cluster composed of a plurality of servers. The network may be an intranet, the internet, or the like. The source data platform 101 can collect source customer data of customers browsing insurance products, wherein the source customer data comprises the customer's basic information and information on the customer's attention behavior toward insurance products.

For example, when a customer has insurance needs, basic information such as name, gender, age, telephone number, address, and insurance category of interest can be filled in through a client application such as a web page or an app. Thus, the source data platform 101 can obtain the basic information filled in by the customer. When the insurance company promotes and markets insurance products to customers through such client applications, customers respond to these promotion and marketing activities according to their personal wishes. Therefore, the source data platform 101 can acquire, in real time, information on customers' attention behavior toward insurance products. The attention behavior information reflects a customer's degree of attention to an insurance product, and the degree of attention is related to the customer's willingness to purchase the product. For example, a customer who is willing to purchase an insurance product wants to learn about it, and therefore clicks to open the product's introduction page. A customer who is not willing to purchase an insurance product does not want to learn about it, and may simply browse the page without clicking to open the product's introduction page.

After the source data platform 101 obtains the data of the source client, the collected client data of the source client can be screened in real time according to a preset rule, so as to obtain the screened data of the target client.

The processing platform 102 may scan the source data platform 101 using the first data processing tool Sqoop, and synchronize the scanned data of the target customer in the source data platform 101 to the HIVE database of the big data platform. The HIVE database stores data of one or more agents. The processing platform 102 may be a server or another device for data processing, which is not limited in this application. The server may be an independent server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.

After synchronizing the data of the target customer to the HIVE database through Sqoop, the processing platform 102 may further use a second data processing tool, Hadoop, to extract a first feature of the target customer and a second feature of each of the one or more agents, and determine the target agent from the one or more agents based on the first feature and the one or more second features. Finally, the data of the target customer is sent to the target agent through the computing engine Spark, the target agent being one of the one or more agents stored in the HIVE database.

The front-end display unit 103 may be a work terminal of the target agent, and may receive and display the data of the target customer. Further, after the front-end display unit 103 receives the data of the target customer, the customer contact channel may be accessed automatically, or a corresponding task prompt may be given to the target agent, for example, a pop-up window prompt. Therefore, the target agent can contact the target customer in time and follow up on the target customer's requirements, which improves the customer conversion rate.

Referring to fig. 2, fig. 2 is a flowchart illustrating an insurance data processing method based on similarity matching according to an embodiment of the present disclosure, where the method can be applied to the system 100 in fig. 1. As shown in fig. 2, an insurance data processing method based on similarity matching provided in an embodiment of the present application may include:

step S201, a first data processing tool Sqoop is adopted to synchronize the data of the target client to a HIVE database of the big data platform at regular time.

Specifically, in the embodiments of the present application, Sqoop is an open-source tool mainly used for data transfer between HIVE and conventional databases; it can import data from a relational database into HIVE, and can also export data from HIVE into a relational database. HIVE is a Hadoop-based data warehouse tool used for data extraction, transformation, and loading, and provides a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Before Sqoop is used to periodically synchronize the data of the target customer to the HIVE database, the data of the target customer in the current scheduling period can be acquired. Specifically, in the current scheduling period, source customer data of customers browsing insurance products can be collected, and the collected source customer data is screened according to a preset rule to obtain the data of the target customer.

It can be understood that an insurance company can promote and market insurance products to customers through client applications such as web pages and apps. When customers browse insurance products on devices such as mobile terminals, smart phones, and notebook computers, they generally need to register and fill in their basic information before they can browse more insurance products. The basic information includes, but is not limited to: name, gender, age, telephone number, address, and insurance category of interest. Customers can respond to the insurance company's promotion and marketing activities in these client applications according to their personal wishes. Therefore, after a customer responds, information on the customer's attention behavior toward insurance products can be acquired. The attention behavior information reflects a customer's degree of attention to an insurance product, and the degree of attention is related to the customer's willingness to purchase the product. For example, a customer who is willing to purchase an insurance product wants to learn about it, and therefore clicks to open the product's introduction page. A customer who is not willing to purchase an insurance product does not want to learn about it, and may simply browse the page without clicking to open the product's introduction page.

Thus, the source customer data may include the customer's basic information and the customer's attention behavior information for insurance products. It will be appreciated that the source customer data covers both customers with insurance requirements and customers without insurance requirements, and that customers with insurance requirements are the customers with a higher conversion rate. Therefore, after the source customer data is obtained, it can be screened according to the preset rule to obtain the data of target customers with insurance requirements. For example, a customer who opens the introduction page of an insurance product may be a target customer, a customer who chooses to consult customer service about an insurance product may be a target customer, a customer who adds an insurance product to the shopping cart may be a target customer, and a customer who has browsed an insurance page for a relatively long time may be a target customer. To ensure fair customer allocation, after the target customer is obtained through screening, the basic information of the target customer can be masked, for example by masking the telephone number and the name. Further, the data of the target customer in the current scheduling period may be stored in the database. It should be noted that the scheduling period is a period for acquiring customer data and may be set manually according to actual needs. For example, within a 24-hour day, the scheduling period may be set to 5 minutes; starting from midnight (00:00), the data of source customers who browsed insurance products in the scheduling period from 00:00 to 00:05 may be acquired.
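The screening and masking (shielding) step described above can be illustrated with a minimal Python sketch. It is not the implementation of the application: the `SourceCustomer` fields, the `LONG_BROWSE_SECONDS` threshold, and the masking format are all hypothetical choices made only for illustration, since the application does not specify the preset rule or the masking scheme.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class SourceCustomer:
    # Hypothetical record layout; the application does not define a schema.
    name: str
    phone: str
    opened_product_page: bool = False
    consulted_customer_service: bool = False
    added_to_cart: bool = False
    browse_seconds: int = 0


# Assumed threshold for "browsed for a relatively long time".
LONG_BROWSE_SECONDS = 120


def is_target_customer(c: SourceCustomer) -> bool:
    """Example preset rule: any one of the four behaviors marks a target customer."""
    return (
        c.opened_product_page
        or c.consulted_customer_service
        or c.added_to_cart
        or c.browse_seconds >= LONG_BROWSE_SECONDS
    )


def mask(value: str, keep: int) -> str:
    """Keep the first `keep` characters and replace the rest with '*'."""
    return value[:keep] + "*" * max(len(value) - keep, 0)


def screen_and_mask(customers: Iterable[SourceCustomer]) -> List[SourceCustomer]:
    """Screen source customers, then mask the name and phone of each target customer."""
    return [
        replace(c, name=mask(c.name, 1), phone=mask(c.phone, 3))
        for c in customers
        if is_target_customer(c)
    ]
```

In this sketch a customer who only browsed briefly is dropped, while a customer who opened a product page is kept with the name and telephone number masked before storage.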

Therefore, the first data processing tool Sqoop can be used to scan the database at the end time of the current scheduling period. If data of the target customer in the current scheduling period is found in the database, the data of the target customer is synchronized to the HIVE database of the big data platform. In this way, the data of the target customer can be processed promptly. If no data of the target customer in the current scheduling period is found in the database, which indicates that the data of the target customer in the current scheduling period has already been processed, the next scheduling period can be started. In the next scheduling period, source customer data of customers browsing insurance products is collected and screened according to the preset rule to obtain data of the target customer, and the data of the target customer in the next scheduling period is stored in the database.
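The scan-at-period-end logic can be sketched as follows. This is a simplified, in-memory illustration only: plain Python lists stand in for the relational database and the HIVE database, `hive.extend` stands in for the Sqoop synchronization, and the `collected_at` field is a hypothetical column name. The 5-minute period comes from the example in the text.

```python
from datetime import datetime, timedelta
from typing import Dict, List

SCHEDULING_PERIOD = timedelta(minutes=5)  # 5-minute example from the description


def run_period(database: List[Dict], hive: List[Dict], start: datetime) -> datetime:
    """Scan `database` at the end of the scheduling period that begins at `start`.

    Rows whose 'collected_at' timestamp falls in [start, end) are copied to
    `hive` (a stand-in for the Sqoop sync). If nothing is found, nothing is
    copied. In both cases the start of the next scheduling period is returned.
    """
    end = start + SCHEDULING_PERIOD
    scanned = [row for row in database if start <= row["collected_at"] < end]
    if scanned:
        hive.extend(scanned)
    return end
```

Calling `run_period` repeatedly with the returned timestamp walks through consecutive scheduling periods, which mirrors the "start the next scheduling period" branch described above.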

It should be noted that, because the basic information in the data of the target customer is masked (for example, the telephone number) after the data is obtained through screening, when the data of the target customer is synchronized to the HIVE database of the big data platform through the first data processing tool Sqoop, the masked information (such as the telephone number and name) is synchronized to the HIVE database as well. It is understood that the HIVE database also stores information of one or more agents, such as the agents' basic information and service information.

Step S202, a second data processing tool Hadoop is adopted to respectively extract a first feature of the target customer and a second feature of each of the one or more agents.

Specifically, data of the target customer and data of one or more agents are stored in the HIVE database, so that a first feature of the target customer can be extracted from the data of the target customer and a second feature of the agents can be extracted from the data of the one or more agents by using the second data processing tool Hadoop.

The first feature of the target customer may include an insurance information class and a customer basic information class. The insurance information class includes, but is not limited to: the insurance category filled in by the user (for example, long-term insurance, short-term insurance, or vehicle insurance), the insured object, and the like. When the insurance category filled in by the user is vehicle insurance, the insurance information class may further include the vehicle brand, vehicle type, vehicle series, vehicle value, and the like. The customer basic information class includes, but is not limited to: birthday, gender, address (region or native place), recent contact status, classification of web pages browsed, contact time distribution, etc.

The second feature of an agent may include a policy information class and a basic information class. The policy information class includes, but is not limited to: the agent's daily number of calls over the past month, the agent's daily call duration over the past month, the agent's online time over the past month, the agent's attendance data over the past month, the number of tasks the agent currently has to follow, and the categories of tasks the agent currently has to follow. The basic information class includes, but is not limited to: birthday, gender, zodiac sign, address (region or native place), preferences, etc.

Step S203, determining a target agent from the one or more agents based on the first feature and the one or more second features.

Specifically, when the similarity between the first feature of the target customer and the second feature of an agent is high, the agent and the target customer have much in common, and the agent can communicate well with the target customer. For example, if the agent and the target customer share the same native place or are geographically close, or have close birthdays, or the target customer's contact time distribution is close to the agent's online time over the past month, or the classification of web pages browsed by the target customer is close to the task category the agent is currently following, the first feature of the target customer can be considered similar to the second feature of the agent. In that case, when the agent does not have many tasks to follow or the agent's attendance data over the past month is good, having the agent follow up with the target customer can improve the conversion rate of the target customer, that is, the probability that the target customer purchases insurance products. Therefore, the target agent may be determined based on the first feature of the target customer and the second features of the one or more agents.

Further, after the first feature of the target customer and the second features of the one or more agents are extracted using the second data processing tool Hadoop, the first feature and the one or more second features can be input into a prediction model, and the first feature is matched against the one or more second features through the prediction model to obtain one or more matching results. Each matching result corresponds to a matching score; the matching result with the highest matching score among the one or more matching results is taken as the target matching result, and the agent corresponding to the target matching result is taken as the target agent.
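The selection of the agent with the highest matching score can be sketched as below. The application does not disclose the prediction model, so `overlap_score` is only a toy stand-in (the percentage of shared feature keys whose values are equal), and the feature dictionaries and agent identifiers are hypothetical.

```python
from typing import Callable, Dict, Tuple

Features = Dict[str, object]


def overlap_score(customer: Features, agent: Features) -> float:
    """Toy stand-in for the prediction model (an assumption, not the patented model).

    Returns, on a 0-100 scale, the share of feature keys present in both
    records whose values are equal.
    """
    shared = set(customer) & set(agent)
    if not shared:
        return 0.0
    equal = sum(1 for key in shared if customer[key] == agent[key])
    return 100.0 * equal / len(shared)


def select_target_agent(
    customer: Features,
    agents: Dict[str, Features],
    score_fn: Callable[[Features, Features], float],
) -> Tuple[str, float]:
    """Score every agent against the customer and return (agent_id, best_score)."""
    if not agents:
        raise ValueError("at least one agent is required")
    scores = {agent_id: score_fn(customer, feats) for agent_id, feats in agents.items()}
    best_id = max(scores, key=lambda agent_id: scores[agent_id])
    return best_id, scores[best_id]
```

Passing the scoring function as a parameter keeps the selection step independent of whichever trained model actually produces the matching scores.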

It can be understood that the prediction model is a model that has been trained on a large amount of data and meets actual requirements. In the prediction model, the more similar the first feature of the target customer is to the second feature of an agent, the higher the matching score of the corresponding matching result; a higher matching score indicates a better match between the agent and the target customer and a higher probability that the customer will eventually make a purchase. Further, the score may be set on a 100-point scale (0-100). A score below 60 represents a poor match; a score greater than or equal to 60 and less than 80 represents a good match; a score greater than or equal to 80 and less than 90 represents a medium match; and a score greater than or equal to 90 and less than 100 represents an excellent match. Note that the score values and the match grades are not limited to these.
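The score bands above can be written as a small lookup function. This is a sketch that reproduces the thresholds exactly as stated in the text; the function name is hypothetical, and treating a score of exactly 100 as "excellent" is an assumption, because the text leaves that boundary value unassigned.

```python
def match_band(score: float) -> str:
    """Map a 0-100 matching score to the band named in the description."""
    if not 0 <= score <= 100:
        raise ValueError("score must be within 0-100")
    if score < 60:
        return "poor"
    if score < 80:
        return "good"
    if score < 90:
        return "medium"
    # Assumption: a score of exactly 100 is also treated as excellent.
    return "excellent"
```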

Further, after the agent corresponding to the matching result with the highest score is confirmed as the target agent, the data of the target customer is sent to the target agent through the computing engine Spark. Subsequently, the target agent follows up on the insurance requirements of the target customer.

Further, when the matching score is medium or above, the data of the target customer may be allocated, in a first time period, to an agent with better attendance, and it is then judged whether the agent's service volume (namely, the customer conversion rate) in the first time period has increased compared with the previous time period; if not, subsequent assignments are no longer made preferentially to that agent based on attendance data. For example, the first feature and the second features are matched through the prediction model to obtain one or more matching results. Suppose four matching results have matching scores greater than or equal to 80; in descending order of score, they are the 95-point, 93-point, 89-point, and 87-point matching results. Ordered by attendance data from best to worst, however, they are the 89-point, 93-point, 95-point, and 87-point matching results. Therefore, in the first time period (which may be the month following the month with the better attendance data), the data of the target customer is allocated to the agent corresponding to the 89-point matching result. If the agent's service volume in the first time period increases compared with the previous time period, this shows that, when the matching score is high, allocating the target customer's data to an agent with good attendance data improves the customer conversion rate. If the service volume does not increase, this shows that allocating the target customer's data to an agent that does not have the highest matching degree does not improve the customer conversion rate, and agents are therefore no longer preferentially allocated based on attendance data.
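The attendance-priority allocation and its feedback check can be sketched with the four-agent example above. The `Match` record, the `attendance_rank` encoding (1 means best attendance), and the function names are hypothetical; the threshold of 80 is the lower bound of the "medium" band in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Match:
    agent_id: str
    score: float
    attendance_rank: int  # hypothetical encoding: 1 = best attendance


MEDIUM_THRESHOLD = 80  # lower bound of the "medium" band


def allocate(matches: List[Match], prefer_attendance: bool) -> Match:
    """Pick the agent who receives the target customer's data.

    With attendance priority, the best-attendance agent among matches scoring
    at least MEDIUM_THRESHOLD is chosen; otherwise (or if no match reaches the
    threshold) the highest-scoring match is chosen.
    """
    if not matches:
        raise ValueError("at least one match is required")
    if prefer_attendance:
        eligible = [m for m in matches if m.score >= MEDIUM_THRESHOLD]
        if eligible:
            return min(eligible, key=lambda m: m.attendance_rank)
    return max(matches, key=lambda m: m.score)


def keep_attendance_priority(previous_volume: int, current_volume: int) -> bool:
    """Keep attendance priority only if service volume rose in the first time period."""
    return current_volume > previous_volume
```

With the scores 95, 93, 89, and 87 and the attendance order 89, 93, 95, 87, attendance priority selects the 89-point agent, while plain score ordering selects the 95-point agent.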

Referring to fig. 3, fig. 3 is a schematic diagram of an insurance data processing apparatus based on similarity matching according to an embodiment of the present application. As shown in fig. 3, the insurance data processing apparatus 300 may include a synchronization unit 301, a processing unit 302, and a matching unit 303.

a synchronization unit 301, configured to synchronize data of a target customer to a big data platform HIVE database at regular intervals by using a first data processing tool Sqoop, wherein the HIVE database stores data of one or more agents;

the processing unit 302 is configured to respectively extract a first feature of a target customer and a second feature of one or more agents by using a second data processing tool Hadoop;

a matching unit 303, configured to determine a target agent from the one or more agents based on the first characteristic and the one or more second characteristics, the target agent being designated to follow up the insurance requirements of the target customer.

In a possible implementation manner, the matching unit 303 is specifically configured to: input the first feature and the one or more second features into a prediction model; match the first feature against the one or more second features through the prediction model to obtain one or more matching results, wherein each matching result corresponds to a matching score; take the matching result with the highest matching score among the one or more matching results as a target matching result; and take the agent corresponding to the target matching result as the target agent.
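The selection of the target agent from the matching scores can be sketched as follows. The prediction model is not specified in this passage, so cosine similarity over numeric feature vectors is used here purely as an assumed stand-in; the function names and the feature values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Assumed stand-in for the prediction model: cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_target_agent(first_feature, second_features, score_fn=cosine_similarity):
    """Score every agent against the customer and return (best agent id, all scores)."""
    results = {
        agent_id: score_fn(first_feature, feature)
        for agent_id, feature in second_features.items()
    }
    if not results:
        return None, results
    target = max(results, key=results.get)  # matching result with the highest score
    return target, results


customer_feature = [1.0, 0.0, 1.0]   # first feature (hypothetical values)
agent_features = {                   # second features (hypothetical values)
    "agent_a": [1.0, 0.0, 1.0],
    "agent_b": [0.0, 1.0, 0.0],
    "agent_c": [1.0, 1.0, 0.0],
}
target, scores = select_target_agent(customer_feature, agent_features)
print(target)  # agent_a
```

Passing a different score_fn lets any other scoring model be swapped in without changing the selection step.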

In one possible implementation, the synchronization unit 301 is further configured to: send the data of the target customer to the target agent through the computing engine Spark, wherein the data of the target customer is displayed on the terminal device corresponding to the target agent.

In one possible implementation, the processing unit 302 is further configured to: acquire data of the target customer in a current scheduling period, wherein the target customer is a customer with insurance requirements, and the behavior data comprises data of the target customer browsing insurance products; and store the data of the target customer in the current scheduling period in a database.

In a possible implementation manner, the synchronization unit 301 is specifically configured to: scan the database at a first moment by using the first data processing tool Sqoop, wherein the first moment is the ending moment of the current scheduling period; and if data of the target customer in the current scheduling period is found when scanning the database, synchronize the data of the target customer to the HIVE database of the big data platform.

In a possible implementation manner, the synchronization unit 301 is further specifically configured to: if no data of the target customer in the current scheduling period is found when scanning the database, start the next scheduling period; and store the data of the target customer in the next scheduling period in the database.
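The scan-then-synchronize behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: a Python dict and list stand in for the source database and the HIVE database, no Sqoop command is issued, and the names scan_and_sync and run_periods are hypothetical.

```python
def scan_and_sync(source_db, hive_store, period_id):
    """Scan the source database at the end of a period; synchronize rows if any exist.

    source_db maps a scheduling period id to the rows stored during that period.
    hive_store is a list standing in for the HIVE database.
    Returns True if data was synchronized, False if nothing was found.
    """
    rows = source_db.get(period_id, [])
    if not rows:
        return False          # nothing found: the caller starts the next period
    hive_store.extend(rows)   # stand-in for the Sqoop-to-HIVE synchronization
    return True


def run_periods(source_db, hive_store, period_ids):
    """Process scheduling periods in order and return the ids that were synchronized."""
    synced = []
    for period_id in period_ids:
        if scan_and_sync(source_db, hive_store, period_id):
            synced.append(period_id)
    return synced


source_db = {1: [], 2: [{"customer_id": "c1"}]}  # period 1 is empty
hive_store = []
print(run_periods(source_db, hive_store, [1, 2]))  # [2]
```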

In a possible implementation manner, the processing unit 302 is specifically configured to: collect data of source customers who browse insurance products in the current scheduling period; and screen the collected data of the source customers according to a preset rule to obtain the data of the target customer.
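The screening step can be sketched as a simple filter. The preset rule is not specified in this passage, so the rule below (a minimum number of insurance product views) and the field name product_views are assumptions made only for illustration.

```python
def screen_source_customers(source_rows, rule):
    """Keep only the source-customer rows that satisfy the preset rule."""
    return [row for row in source_rows if rule(row)]


def viewed_enough_products(row, min_views=3):
    """Hypothetical preset rule: the customer viewed at least min_views products."""
    return row.get("product_views", 0) >= min_views


source_rows = [
    {"customer_id": "c1", "product_views": 5},
    {"customer_id": "c2", "product_views": 1},
    {"customer_id": "c3"},  # no browsing record
]
target_rows = screen_source_customers(source_rows, viewed_enough_products)
print([row["customer_id"] for row in target_rows])  # ['c1']
```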

Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 4, the electronic device 40 may include: one or more processors 401, one or more memories 402, and one or more communication interfaces 403. These components may be connected by a bus 404 or by other means; fig. 4 takes connection by a communication bus as an example. Wherein:

the communication interface 403 may be used by the electronic device 40 to communicate with other communication devices, such as other electronic devices. In particular, the communication interface 403 may be a wired interface.

The memory 402 may be coupled to the processor 401 via the bus 404 or an input/output port, or the memory 402 may be integrated with the processor 401. The memory 402 is used to store various software programs and/or sets of instructions or data. Specifically, the memory 402 may be a Read-Only Memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM) or another type of dynamic storage device that can store information and instructions, an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 402 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 402 may store an operating system (hereinafter referred to as a system), for example an embedded operating system such as uCOS, VxWorks, or RTLinux. The memory 402 may also store a network communication program that may be used to communicate with one or more additional devices, one or more user devices, or one or more electronic devices.

The memory 402 is used for storing the application program code for executing the above scheme, and execution is controlled by the processor 401. The processor 401 is configured to execute the application program code stored in the memory 402.

The processor 401 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor may also be a combination of components that implements computing functions, e.g., a combination comprising one or more microprocessors, or a combination of a digital signal processor and a microprocessor.

The processor 401 may be configured to invoke an application program stored in the memory 402 to implement the steps of the insurance data processing method based on similarity matching in the embodiment corresponding to fig. 2; in particular implementations, one or more instructions in the computer storage medium are loaded by the processor 401 to perform the following steps:

synchronizing data of a target customer to a big data platform HIVE database at regular intervals by using a first data processing tool Sqoop, wherein the HIVE database stores data of one or more agents;

respectively extracting a first feature of the target customer and a second feature of each of the one or more agents by using a second data processing tool Hadoop;

determining a target agent from the one or more agents based on the first feature and the one or more second features, the target agent being designated to follow up the insurance requirements of the target customer.

In one possible implementation, the processor 401 is specifically configured to: input the first feature and the one or more second features into a prediction model; match the first feature against the one or more second features through the prediction model to obtain one or more matching results, wherein each matching result corresponds to a matching score; take the matching result with the highest matching score among the one or more matching results as a target matching result; and take the agent corresponding to the target matching result as the target agent.

In one possible implementation, the processor 401, through the communication interface 403, is specifically configured to: send the data of the target customer to the target agent through the computing engine Spark, wherein the data of the target customer is displayed on the terminal device corresponding to the target agent.

In one possible implementation, the processor 401, through the communication interface 403, is specifically configured to: acquire data of the target customer in a current scheduling period, wherein the target customer is a customer with insurance requirements, and the behavior data comprises data of the target customer browsing insurance products; and store the data of the target customer in the current scheduling period in a database.

In one possible implementation, the processor 401, through the communication interface 403, is specifically configured to: scan the database at a first moment by using the first data processing tool Sqoop, wherein the first moment is the ending moment of the current scheduling period; and if data of the target customer in the current scheduling period is found when scanning the database, synchronize the data of the target customer to the HIVE database of the big data platform.

In one possible implementation, the processor 401 is specifically configured to: if no data of the target customer in the current scheduling period is found when scanning the database, start the next scheduling period; and store the data of the target customer in the next scheduling period in the database.

In one possible implementation, the processor 401 is specifically configured to: collect data of source customers who browse insurance products in the current scheduling period; and screen the collected data of the source customers according to a preset rule to obtain the data of the target customer.

In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

Embodiments of the present application also provide a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, perform some or all of the steps performed in the above-described method embodiments. Optionally, the computer storage medium may be volatile or non-volatile.

It is also noted that, while for simplicity of explanation the foregoing method embodiments have been described as a series of acts or a combination of acts, those skilled in the art will appreciate that the present application is not limited by the order of the acts described, as some steps may, in accordance with the present application, occur in other orders or concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules referred to are not necessarily required by this application.

The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims

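1. An insurance data processing method based on similarity matching, characterized in that the method comprises:

synchronizing data of a target customer to a big data platform HIVE database at regular intervals by using a first data processing tool Sqoop, wherein the HIVE database stores data of one or more agents;

respectively extracting a first feature of the target customer and a second feature of the one or more agents by using a second data processing tool Hadoop;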

determining a target agent from the one or more agents based on the first feature and one or more of the second features, the target agent being designated to follow up the insurance requirements of the target customer.

2. The method of claim 1, wherein determining a target agent from the one or more agents based on the first feature and the one or more second features comprises:

inputting the first feature and one or more of the second features into a predictive model;

matching the first feature and one or more second features through the prediction model to obtain one or more matching results, wherein each matching result corresponds to a matching score;

taking the matching result with the highest matching score in the one or more matching results as a target matching result;

and taking the agent corresponding to the target matching result as the target agent.

3. The method of claim 1 or 2, wherein after determining a target agent from the one or more agents based on the first feature and the second feature, the method further comprises:

sending the data of the target customer to the target agent through a computing engine Spark, wherein the data of the target customer is displayed on a terminal device corresponding to the target agent.

4. The method of claim 1, wherein before synchronizing the data of the target customer to the big data platform HIVE database at regular intervals by using the first data processing tool Sqoop, the method further comprises:

acquiring data of the target customer in a current scheduling period, wherein the target customer is a customer with insurance requirements, and the behavior data comprises data of the target customer browsing insurance products;

and storing the data of the target customer in the current scheduling period in a database.

5. The method of claim 4, wherein synchronizing the data of the target customer to the big data platform HIVE database at regular intervals by using the first data processing tool Sqoop comprises:

scanning the database at a first moment by using the first data processing tool Sqoop, wherein the first moment is the ending moment of the current scheduling period;

and if data of the target customer in the current scheduling period is found when scanning the database, synchronizing the data of the target customer to the big data platform HIVE database.

6. The method of claim 5, further comprising: if no data of the target customer in the current scheduling period is found when scanning the database, starting a next scheduling period;

and storing the data of the target customer in the next scheduling period in the database.

7. The method according to any one of claims 4 to 6, wherein acquiring the data of the target customer in the current scheduling period comprises:

collecting data of source customers who browse insurance products in the current scheduling period;

and screening the collected data of the source customers according to a preset rule to obtain the data of the target customer.

8. An insurance data processing apparatus based on similarity matching, the apparatus comprising:

a synchronization unit, configured to synchronize data of a target customer to a big data platform HIVE database at regular intervals by using a first data processing tool Sqoop, wherein the HIVE database stores data of one or more agents;

a processing unit, configured to respectively extract a first feature of the target customer and a second feature of the one or more agents by using a second data processing tool Hadoop;

a matching unit, configured to determine a target agent from the one or more agents based on the first feature and the one or more second features, the target agent being designated to follow up the insurance requirements of the target customer.

9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps of the method of any of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to carry out the method according to any one of claims 1 to 7.