CN112508720A - Insurance client identity attribute screening method and screening device and electronic equipment - Google Patents
- Publication number
- CN112508720A (CN202011459951.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- screening
- client
- customer
- identity attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Data Mining & Analysis (AREA)
- Development Economics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Game Theory and Decision Science (AREA)
- Technology Law (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The embodiments of this specification provide a screening method and a screening device for identity attributes of insurance clients, and electronic equipment. The screening method comprises the following steps: extracting feature data from a plurality of data sources according to the customer identity attribute features, and preprocessing it to obtain normalized data, wherein the normalized data comprises data matching at least two of the customer identity attribute features; based on the customer identity attribute features shared by different normalized data, associating and deduplicating the normalized data to obtain a customer retrieval data set; and screening the customer retrieval data set according to a preset screening condition to obtain a target attribute client set. With this technical scheme, the number of data sources to be searched and the screening difficulty can be greatly reduced, and the screening efficiency is improved.
Description
Technical Field
The embodiment of the specification relates to the technical field of big data, in particular to a screening method and a screening device for identity attributes of insurance clients and electronic equipment.
Background
With the development of information technology, insurance companies now manage massive volumes of customer data electronically. In the prior art, customer profiles for different customer types, service types and the like are generally stored in different databases for ease of management. Because different customer sources and service types involve different kinds and quantities of customer data (for example, personal customer data includes names and certificate numbers, while long-term insurance customer data includes nationality and residential address), the corresponding databases differ, which makes screening massive customer identity attributes across these databases very inconvenient.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a method for screening identity attributes of insurance clients, so as to solve the technical defect of inconvenient screening of identity attributes of insurance clients in the prior art. Another object of the present invention is to provide a screening device for insurance client identity attribute. It is a further object of the invention to provide an electronic device for performing a method for screening of insurance client identity attributes.
In a first aspect, one or more embodiments of the present specification provide a method for screening identity attributes of insurance clients, including:
extracting feature data corresponding to a plurality of data sources according to the identity attribute features of the client, and preprocessing it to obtain normalized data; wherein the normalized data comprises data matching at least two of the customer identity attribute features;
based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set;
and screening the client retrieval data set according to a preset screening condition to obtain a target attribute client set.
Further, the step of extracting feature data corresponding to a plurality of data sources and preprocessing the feature data to obtain normalized data includes:
matching the customer identity attribute features with the feature tags of the data sources;
extracting corresponding characteristic data according to the characteristic label matched with the client identity attribute characteristic;
and carrying out normalization processing on the feature data of different data sources to obtain the normalized data.
Further, the step of screening the client search data set according to a preset screening condition to obtain a target client set includes:
constructing a data table corresponding to each feature according to the screening features in the screening conditions and the client retrieval data set;
and screening and combining the data tables of each characteristic according to the screening conditions to obtain a target attribute client set.
Further, before the step of constructing the data table corresponding to each feature, the method further includes:
and establishing a data index for the client retrieval data set according to the screening characteristics in the screening conditions.
Further, the building of the data table corresponding to each feature is realized by a classification marking mode.
Further, the normalized data is processed by adopting a Hadoop distributed file architecture.
Further, the client identity attribute characteristics include name, certificate type, certificate number, nationality, birth address, residence address, and contact address.
Further, the plurality of data sources includes an individual customer data source, a group customer data source, a long-term insurance data source, and a short-term insurance data source.
In a second aspect, an embodiment of the present specification provides an apparatus for screening identity attributes of insurance clients, including:
a data acquisition module configured to: extract feature data corresponding to a plurality of data sources according to the identity attribute features of the client, and preprocess it to obtain normalized data; wherein the normalized data comprises data matching at least two of the customer identity attribute features;
a data integration module configured to: based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set;
a data screening module configured to: and screening the client retrieval data set according to a preset screening condition to obtain a target attribute client set.
In a third aspect, embodiments of the present specification further provide an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the screening method according to any of the foregoing descriptions when executing the program.
As can be seen from the above description, in the screening method for identity attributes of insurance clients provided in one or more embodiments of the present specification, the normalized data is obtained by extracting feature data corresponding to a plurality of data sources according to the identity attribute features of the clients and preprocessing the feature data; based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set; therefore, the customer identity attribute data of the data sources are integrated, and only the customer retrieval data set needs to be queried and retrieved in the subsequent screening process, so that the quantity of the retrieval data sources and the screening difficulty are greatly reduced, and the screening efficiency is improved.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
FIG. 1 is a schematic flow chart of the related prior art;
FIG. 2 is a schematic flow diagram illustrating a method for screening identity attributes of insurance clients according to one or more embodiments of the present disclosure;
FIG. 3 is a schematic flow chart for obtaining normalized data according to one or more embodiments of the present disclosure;
FIG. 4 is a schematic structural diagram of a screening apparatus for identity attributes of insurance clients according to one or more embodiments of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device provided in one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Referring to FIG. 1, the prior-art method for querying and retrieving massive client identity attribute data across different databases is briefly described.
Note that the data source a, the data source B, and the data source N correspond to different databases, respectively. It should be understood that here, the types of databases may be the same or different. Illustratively, data source A and data source B are both MySQL databases, while data source C employs an Oracle database.
Specifically, data source A is queried and associated by keyword with screening condition 1 in data source B for retrieval; if the retrieval conditions are still insufficient, further data sources such as data source C must also be associated and queried to obtain the final result set. If the data sources are not the same type of database (for example, data source A is a MySQL database and data source B is an Oracle database), the query links and query statements must also be modified and adjusted. In other words, to screen massive customer identity attributes, the prior art usually first performs query and retrieval within a single database or between two databases, then gradually adds other databases, and finally traverses all data sources to establish query links between them. This makes the retrieval logic and implementation complex and the retrieval inefficient, which is unfavorable for retrieving and screening massive amounts of data.
Based on the above, one or more embodiments of the present specification provide a method for screening identity attributes of insurance clients.
Referring to fig. 2, the screening method includes:
S201: extracting feature data corresponding to a plurality of data sources according to the identity attribute features of the client, and preprocessing it to obtain normalized data; wherein the normalized data comprises data matching at least two of the customer identity attribute features.
It should be noted that the client identity attribute may be resident or non-resident; or a coastal area client, an inland area client, etc., and is not limited herein.
For the identity attribute characteristics of the client, those skilled in the art can perform reasonable setting according to the screened target client set, and the setting is not specifically limited herein.
As an alternative embodiment, the client identity attribute features include name, certificate type, certificate number, nationality, birth address, residence address, and contact address. Optionally, the customer identity attribute features can be used to screen whether a customer is a non-resident. Optionally, here, residents include not only natural persons but also legal persons.
For insurance clients, the client identity attribute data typically exists in multiple data sources. It should be understood that each data source may include the same customer or may include different customers. That is, the identity attribute data for the same client may appear in only one data source, or may appear in multiple data sources.
As an alternative embodiment, the plurality of data sources include an individual customer data source, a group customer data source, a long-term insurance data source, a short-term insurance data source, and the like.
Further, since each data source corresponds to a different business type or customer source, different data sources may hold different parts of the same customer's identity attribute data. That is, from a given data source, only the customer identity attribute features it actually contains can be extracted, not all of the customer identity attribute features.
Illustratively, the personal customer data source and the group customer data source generally include names, certificate types, certificate numbers, and the like in the customer identity attribute feature. It can be appreciated that only the name, certificate type and certificate number in the customer identity attribute feature can be extracted from the personal customer data source.
Illustratively, the long-term insurance data source and the short-term insurance data source generally include the nationality, birth address, residence address, contact address, and the like among the customer identity attribute features. It can be understood that, from the long-term insurance data source, only the nationality, birth address, residence address and contact address among the customer identity attribute features can be extracted.
As an alternative embodiment, the normalized data is processed using a Hadoop distributed file architecture. Here, the Hadoop distributed file architecture places big data processing engines as close to the storage as possible, which suits batch operations such as ETL, because the batch results of such operations can go directly to storage. Hadoop's MapReduce function breaks a single task into fragments, sends the fragments (Map) to a plurality of nodes, and then loads (Reduce) the results into the data warehouse as a single data set. With this technical scheme, the screening efficiency can be effectively improved and the screening time reduced.
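The map and reduce steps described above can be illustrated with a minimal, single-process Python sketch. It does not use Hadoop itself; the function names, the field names, and the choice of the certificate number as the grouping key are assumptions made only for illustration.

```python
from collections import defaultdict
from itertools import chain

def map_record(record):
    """Map step: emit (key, value) pairs; the key here is the certificate number."""
    yield record["certificate_number"], record

def reduce_records(key, values):
    """Reduce step: fold all records sharing a key into one merged record."""
    merged = {}
    for value in values:
        for field, content in value.items():
            merged.setdefault(field, content)  # keep the first value seen per field
    return merged

def run_map_reduce(chunks):
    """Map every record of every chunk, group the pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    pairs = chain.from_iterable(map_record(r) for chunk in chunks for r in chunk)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce_records(key, values) for key, values in grouped.items()}

# Each inner list stands in for the fragment of work sent to one node (hypothetical data).
chunks = [
    [{"certificate_number": "P001", "name": "Li Lei"}],
    [{"certificate_number": "P001", "nationality": "CN"},
     {"certificate_number": "P002", "name": "Ann Lee"}],
]
result = run_map_reduce(chunks)
```

In a real cluster the chunks would be processed on separate nodes; the sketch only shows how the fragments are recombined into a single data set.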
Those skilled in the art will understand that the normalized data may also be processed by a MySQL database, a Microsoft SQL Server database, or an Oracle database, which is not limited herein.
Here, MySQL database is an open source relational database management system (RDBMS) that can perform database operations using the most common structured query language.
The Microsoft SQL Server database is an extensible, high-performance database management system designed for distributed client/server computing; it is tightly integrated with Windows NT and provides a transaction-based, enterprise-level information management solution.
The Oracle database is a relational database management system, has good system portability, convenient use and strong function, and is suitable for various large, medium, small and microcomputer environments. The database scheme is high in efficiency, good in reliability and suitable for high throughput.
S202: based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set.
As previously mentioned, the normalized data may originate from different data sources, and different data sources may contain different customer identity attribute features. Therefore, the normalized data needs to be associated according to identity attribute features shared by the same customer, such as that customer's name and certificate number, so that the multiple identity attribute features of the same customer are integrated.
Optionally, if the same client corresponds to multiple normalized data after the integration, a deduplication operation is further required, so that the client search data set avoids the occurrence of invalid duplicate data on the basis of keeping the integrity.
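As a rough illustration of step S202, the following Python sketch associates normalized records on a shared key and drops duplicates. The key fields (name and certificate number), the record layout, and the rule that the first value seen for a field is kept are assumptions, not details given in the specification.

```python
def associate_and_deduplicate(normalized_records,
                              key_fields=("name", "certificate_number")):
    """Merge records that share the same key fields into one entry per customer."""
    retrieval_set = {}
    for record in normalized_records:
        key = tuple(record.get(field) for field in key_fields)
        if None in key:
            continue  # assumption: a record lacking a key field cannot be associated
        merged = retrieval_set.setdefault(key, {})
        for field, value in record.items():
            merged.setdefault(field, value)  # repeated values are not stored twice
    return list(retrieval_set.values())

# Hypothetical normalized records for one customer from two data sources.
records = [
    {"name": "ann lee", "certificate_number": "P002", "certificate_type": "passport"},
    {"name": "ann lee", "certificate_number": "P002", "nationality": "US"},
    {"name": "ann lee", "certificate_number": "P002", "nationality": "US"},  # exact duplicate
]
customer_retrieval_set = associate_and_deduplicate(records)
```

The three input records collapse into a single entry that carries the certificate type from one source and the nationality from the other.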
S203: and screening the client retrieval data set according to a preset screening condition to obtain a target attribute client set.
It should be noted that, the preset screening conditions are reasonably set by those skilled in the art according to the characteristics of the target attribute client set, and the preset screening conditions are not limited herein.
For better understanding of the present step, the preset filtering condition and the target attribute client set are explained below by way of examples.
Suppose the target attribute client set is the set of non-resident clients. Under the preset screening conditions, a client whose nationality is not Chinese is determined to be a non-resident. For client retrieval data that does not include nationality information: if the certificate type is not an identity card but a passport, and the birth address and residence address are both outside China, the client is determined to be a non-resident; if the certificate type is a passport, the birth address and residence address are in China, and the contact information is not a contact in China, the client is determined to be a suspected non-resident. Here, the region of the contact information may be determined according to whether the number belongs to a communication carrier such as China Mobile, China Unicom, or China Telecom.
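The example rules above can be written as a small decision function. This is only a sketch: the field names, the region code "CN", the label "not flagged" for clients the rules do not cover, and the assumption that region fields are already populated are all illustrative choices rather than part of the specification.

```python
def classify_residency(customer):
    """Apply the example non-resident rules to one customer retrieval record."""
    nationality = customer.get("nationality")
    if nationality is not None:
        # Rule 1: a known non-Chinese nationality means non-resident.
        return "non-resident" if nationality != "CN" else "not flagged"

    # Remaining rules apply only to records without nationality information.
    if customer.get("certificate_type") != "passport":
        return "not flagged"

    # Assumption: the region fields are present whenever these rules are reached.
    birth_in_china = customer.get("birth_region") == "CN"
    residence_in_china = customer.get("residence_region") == "CN"
    contact_in_china = customer.get("contact_region") == "CN"

    if not birth_in_china and not residence_in_china:
        return "non-resident"
    if birth_in_china and residence_in_china and not contact_in_china:
        return "suspected non-resident"
    return "not flagged"

label = classify_residency({
    "certificate_type": "passport",
    "birth_region": "CN",
    "residence_region": "CN",
    "contact_region": "US",
})
```

How the contact region is derived (for example, from the carrier of a phone number) is left outside the sketch.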
As can be seen from the above, the normalized data is obtained by preprocessing the feature data corresponding to the plurality of data sources extracted according to the identity attribute features of the client; based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set; therefore, the customer identity attribute data of the data sources are integrated, and only the customer retrieval data set needs to be queried and retrieved in the subsequent screening process, so that the quantity of the retrieval data sources and the screening difficulty are greatly reduced, and the screening efficiency is improved.
Referring to fig. 3, in one or more embodiments of the present disclosure, the step of extracting feature data corresponding to a plurality of data sources and preprocessing the feature data to obtain normalized data includes:
S301: matching the identity attribute features of the client with the feature tags of the data source.
In general, different data sources may include a plurality of feature tags. Some of these features can be matched with the customer identity attribute features and used for screening customer identities; others are unrelated to the client identity attributes, so the large number of features included in a data source needs to be screened. For example, a long-term insurance data source includes name, nationality, insurance item, insurance amount, disclaimer, and the like. As another example, the personal customer data source includes name, certificate type, certificate number, physical health status, and the like. As will be understood by those skilled in the art, for determining whether a customer is a non-resident, the insurance amount, disclaimer, and the like are irrelevant features, while the name, certificate type, and nationality are customer identity attribute features.
S302: and extracting corresponding characteristic data according to the characteristic label matched with the client identity attribute characteristic.
S303: and carrying out normalization processing on the feature data of different data sources to obtain the normalized data.
It should be noted that the normalization processing may be to unify data of different data sources and different formats according to a set format, so as to obtain feature data with a consistent format. Of course, the normalization processing may also include other processing manners that can make the feature data uniform and convenient for retrieval, and is not limited specifically here.
Optionally, after the normalization process, a deduplication step is further included to reduce the presence of completely duplicated data in the same or different data sources.
Through the technical scheme, the corresponding feature data are extracted from the plurality of data sources and the normalized data are obtained according to the identity attribute features of the client, so that the follow-up inquiry and retrieval of the plurality of data sources when the identity attributes of the insurance client are screened is avoided, the retrieval difficulty is greatly reduced, and the retrieval efficiency is improved.
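Steps S301 to S303 can be sketched in Python as a tag lookup followed by a format normalization. The source names, column labels, and the normalization rule (trim whitespace and upper-case) are hypothetical; the specification leaves the concrete format open.

```python
# Customer identity attribute features of interest (a subset, for illustration).
IDENTITY_FEATURES = {"name", "certificate_type", "certificate_number", "nationality"}

# Hypothetical mapping from each source's own feature tags to identity features.
# A value of None marks a tag that is irrelevant to identity screening.
FEATURE_TAGS = {
    "personal": {"cust_name": "name", "id_type": "certificate_type",
                 "id_no": "certificate_number", "health": None},
    "long_term": {"NAME": "name", "NATION": "nationality", "SUM_INSURED": None},
}

def extract_and_normalize(source, row):
    """S301/S302: keep only matched features; S303: bring values to one format."""
    tags = FEATURE_TAGS[source]
    normalized = {}
    for column, value in row.items():
        feature = tags.get(column)
        if feature in IDENTITY_FEATURES and value is not None:
            normalized[feature] = str(value).strip().upper()
    return normalized

personal_row = {"cust_name": " Ann Lee ", "id_type": "passport",
                "id_no": "p002", "health": "good"}
normalized_personal = extract_and_normalize("personal", personal_row)
```

Unmatched tags such as the health status or the sum insured are simply dropped, so later steps only see identity attribute features in a consistent format.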
In one or more embodiments of the present specification, the step of filtering the client search data set according to a preset filtering condition to obtain a target client set includes:
and constructing a data table corresponding to each characteristic according to the screening characteristics in the screening conditions and the client retrieval data set.
Illustratively, the screening characteristics include whether the nationality is a Chinese nationality, whether the certificate type is an identity card, whether the address is a region of China, and the like. Therefore, a nationality dimension data table, a certificate type dimension table and an address dimension table are respectively established on the basis of the client retrieval data set.
By establishing the data table corresponding to each characteristic, the retrieval order of magnitude can be effectively reduced, and the screening efficiency is improved.
And further, screening and combining the data tables of each characteristic according to the screening conditions to obtain a target attribute client set.
The following illustrates how the data tables of each feature are screened and combined to obtain a target attribute client set, using the screening conditions for non-resident clients as an example.
For the nationality dimension data table, a client whose nationality is not Chinese can be directly regarded as a non-resident.
For the certificate type dimension table, clients whose certificate type is not an identity card but a passport are screened; for the address dimension table, clients whose birth address and residence address are both outside China are screened. The two screening results are then combined to screen out the non-residents.
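A minimal Python sketch of building per-feature tables and combining their screening results is given below. Customer identifiers, field names, and the use of sets for combination are assumptions; in line with the earlier example conditions, the passport rule is applied only to clients without nationality information.

```python
def build_dimension_tables(retrieval_set):
    """Split the customer retrieval data set into one small table per screening feature."""
    tables = {"nationality": {}, "certificate_type": {}, "address": {}}
    for customer_id, customer in retrieval_set.items():
        if "nationality" in customer:
            tables["nationality"][customer_id] = customer["nationality"]
        if "certificate_type" in customer:
            tables["certificate_type"][customer_id] = customer["certificate_type"]
        if "birth_region" in customer and "residence_region" in customer:
            tables["address"][customer_id] = (customer["birth_region"],
                                              customer["residence_region"])
    return tables

def screen_non_residents(tables):
    """Screen each table separately, then combine the partial results."""
    by_nationality = {cid for cid, n in tables["nationality"].items() if n != "CN"}
    passport_holders = {cid for cid, t in tables["certificate_type"].items()
                        if t == "passport"}
    foreign_addresses = {cid for cid, (birth, residence) in tables["address"].items()
                         if birth != "CN" and residence != "CN"}
    without_nationality = (passport_holders & foreign_addresses) - set(tables["nationality"])
    return by_nationality | without_nationality

# Hypothetical customer retrieval data set keyed by a customer identifier.
retrieval_set = {
    "c1": {"nationality": "US", "certificate_type": "passport"},
    "c2": {"certificate_type": "passport", "birth_region": "US", "residence_region": "US"},
    "c3": {"certificate_type": "id_card", "birth_region": "CN", "residence_region": "CN"},
    "c4": {"nationality": "CN", "certificate_type": "id_card"},
}
non_residents = screen_non_residents(build_dimension_tables(retrieval_set))
```

Each table holds only one feature, so every individual screening pass touches far less data than a scan over the full records.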
In one or more embodiments of the present specification, before the step of constructing the data table corresponding to each feature, the method further includes: establishing a data index for the client retrieval data set according to the screening features in the screening conditions. Establishing the data index improves retrieval efficiency and also makes constructing the data table for each feature more efficient.
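One possible form of such a data index is an inverted index from each screening feature value to the customers that carry it. The Python sketch below assumes this structure; the specification does not state which kind of index is used, and all names and data are illustrative.

```python
from collections import defaultdict

def build_index(retrieval_set, screening_features):
    """Build feature -> value -> set of customer ids for the given screening features."""
    index = {feature: defaultdict(set) for feature in screening_features}
    for customer_id, customer in retrieval_set.items():
        for feature in screening_features:
            if feature in customer:
                index[feature][customer[feature]].add(customer_id)
    return index

# Hypothetical customer retrieval data set keyed by a customer identifier.
retrieval_set = {
    "c1": {"nationality": "US", "certificate_type": "passport"},
    "c2": {"certificate_type": "passport"},
    "c3": {"nationality": "CN", "certificate_type": "id_card"},
}
index = build_index(retrieval_set, ["nationality", "certificate_type"])
passport_customers = index["certificate_type"].get("passport", set())
```

With the index in place, collecting all customers that share a feature value is a single lookup instead of a scan over the whole retrieval data set.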
As an optional embodiment, the constructing of the data table corresponding to each feature is realized by a classification marking mode.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
It should be noted that the above description describes certain embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, corresponding to the method of any embodiment, one or more embodiments of the present specification further provide a screening apparatus for identity attributes of insurance clients.
Referring to fig. 4, the apparatus for screening identity attributes of insurance clients includes:
a data acquisition module 401 configured to: extract feature data corresponding to a plurality of data sources according to the identity attribute features of the client, and preprocess it to obtain normalized data; wherein the normalized data comprises data matching at least two of the customer identity attribute features;
a data integration module 402 configured to: based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set;
a data screening module 403 configured to: and screening the client retrieval data set according to a preset screening condition to obtain a target attribute client set.
As an optional embodiment, the data acquisition module 401 is further configured to:
matching the customer identity attribute features with the feature tags of the data sources;
extracting corresponding characteristic data according to the characteristic label matched with the client identity attribute characteristic;
and carrying out normalization processing on the feature data of different data sources to obtain the normalized data.
As an optional embodiment, the data screening module 403 is configured to:
constructing a data table corresponding to each feature according to the screening features in the screening conditions and the client retrieval data set;
and screening and combining the data tables of each characteristic according to the screening conditions to obtain a target attribute client set.
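As one possible illustration, the per-feature data tables and their combination may be sketched as follows. The sketch assumes that the screening conditions are combined conjunctively (a customer must satisfy every condition); the disclosure does not state how the tables are combined, and the customer identifiers and sample values are hypothetical.

```python
# Illustrative sketch only; identifiers, data, and AND-combination are assumptions.
def build_feature_tables(retrieval_set, screening_features):
    """Build one table per screening feature: {customer_id: feature_value}."""
    return {
        feature: {cid: record.get(feature) for cid, record in retrieval_set.items()}
        for feature in screening_features
    }


def screen_and_combine(tables, conditions):
    """Screen each feature table with its predicate, then intersect the results."""
    matched = None
    for feature, predicate in conditions.items():
        ids = {
            cid
            for cid, value in tables[feature].items()
            if value is not None and predicate(value)
        }
        matched = ids if matched is None else matched & ids
    return matched or set()


retrieval_set = {
    "C1": {"nationality": "US", "residence_address": "Beijing"},
    "C2": {"nationality": "US", "residence_address": "Shanghai"},
    "C3": {"nationality": "CN", "residence_address": "Beijing"},
}
conditions = {
    "nationality": lambda v: v == "US",
    "residence_address": lambda v: "Beijing" in v,
}
tables = build_feature_tables(retrieval_set, conditions.keys())
target_set = screen_and_combine(tables, conditions)
```

Splitting the retrieval data set into one table per screening feature means that each condition only touches the column it concerns, and the final combination reduces to a set operation on customer identifiers.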
As an optional embodiment, the data screening module 403 is further configured to:
establish a data index for the client retrieval data set according to the screening features in the screening conditions.
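The form of the data index is not specified. One common choice, shown here purely as an assumption, is an inverted index that maps each value of a screening feature to the set of customers holding that value, so that an equality condition can be answered without scanning the whole client retrieval data set. The identifiers and sample data are hypothetical.

```python
# Illustrative sketch only; an inverted index is an assumed index structure.
from collections import defaultdict


def build_index(retrieval_set, screening_features):
    """Build {feature: {value: {customer_id, ...}}} for the screening features."""
    index = {feature: defaultdict(set) for feature in screening_features}
    for cid, record in retrieval_set.items():
        for feature in screening_features:
            if feature in record:
                index[feature][record[feature]].add(cid)
    return index


def lookup(index, feature, value):
    """Return the customers whose indexed feature equals the given value."""
    return set(index[feature].get(value, ()))


retrieval_set = {
    "C1": {"nationality": "US", "certificate_type": "passport"},
    "C2": {"nationality": "US", "certificate_type": "id_card"},
    "C3": {"nationality": "CN", "certificate_type": "id_card"},
}
index = build_index(retrieval_set, ["nationality", "certificate_type"])
us_customers = lookup(index, "nationality", "US")
```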
As an optional embodiment, the data table corresponding to each feature is constructed by means of classification labeling.
As an alternative embodiment, the normalized data is processed using a Hadoop distributed file architecture.
As an alternative embodiment, the client identity attribute features include name, certificate type, certificate number, nationality, birth address, residence address, and contact address.
As an alternative embodiment, the plurality of data sources includes an individual customer data source, a group customer data source, a long-term insurance data source, and a short-term insurance data source.
For convenience of description, the above apparatus is described as being divided into modules by function, with each module described separately. Of course, when implementing one or more embodiments of the present specification, the functions of the modules may be implemented in the same one or more pieces of software and/or hardware.
The apparatus of the foregoing embodiment is used to implement the screening method for the identity attribute of the corresponding insurance client in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, corresponding to any of the above embodiments, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method for screening the identity attribute of the insurance client according to any of the above embodiments is implemented.
Fig. 5 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment. The electronic device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively coupled to each other within the device via the bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used to connect an input/output module for inputting and outputting information. The input/output module may be configured as a component within the device (not shown in the drawings) or may be external to the device to provide the corresponding functions. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used to connect a communication module (not shown in the drawings) to implement communication between the present device and other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, Wi-Fi, or Bluetooth).
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
The electronic device of the above embodiment is used to implement the screening method for the identity attribute of the corresponding insurance client in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
Based on the same inventive concept, corresponding to any of the above-mentioned embodiment methods, one or more embodiments of the present specification further provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method for screening of insurance client identity attributes according to any of the above-mentioned embodiments.
Computer-readable media of the present embodiments include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
The computer instructions stored in the storage medium of the above embodiment are used to enable the computer to execute the method for screening identity attributes of insurance clients according to any of the above embodiments, and have the beneficial effects of corresponding method embodiments, which are not described herein again.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples. Within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of the different aspects of one or more embodiments of the present specification as described above, which are not provided in detail for the sake of brevity.
In addition, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown in the provided figures, for simplicity of illustration and discussion, and so as not to obscure one or more embodiments of the disclosure. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the understanding of one or more embodiments of the present description, and this also takes into account the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the one or more embodiments of the present description are to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that one or more embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (10)
1. A method for screening identity attributes of insurance clients is characterized by comprising the following steps:
extracting characteristic data corresponding to a plurality of data sources according to the identity attribute characteristics of the client, and preprocessing to obtain normalized data; wherein the normalized data comprises at least data matching two of the customer identity attribute features;
based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set;
and screening the client retrieval data set according to a preset screening condition to obtain a target attribute client set.
2. The screening method according to claim 1, wherein the step of extracting feature data corresponding to a plurality of data sources and preprocessing the feature data to obtain normalized data comprises:
matching the customer identity attribute features with the feature tags of the data sources;
extracting corresponding characteristic data according to the characteristic label matched with the client identity attribute characteristic;
and carrying out normalization processing on the feature data of different data sources to obtain the normalized data.
3. The screening method according to claim 1, wherein the step of screening the client retrieval data set according to a preset screening condition to obtain a target attribute client set comprises:
constructing a data table corresponding to each feature according to the screening features in the screening conditions and the client retrieval data set;
and screening and combining the data tables of each characteristic according to the screening conditions to obtain a target attribute client set.
4. The screening method of claim 3, wherein the step of constructing a data table corresponding to each feature is preceded by the step of:
and establishing a data index for the client retrieval data set according to the screening characteristics in the screening conditions.
5. The screening method of claim 3, wherein the constructing of the data table corresponding to each feature is performed by a classification labeling method.
6. The screening method of claim 1, wherein the normalized data is processed using a Hadoop distributed file architecture.
7. The screening method of claim 1, wherein the customer identity attribute characteristics include name, certificate type, certificate number, nationality, birth address, residence address, and contact address.
8. The screening method of claim 1, wherein the plurality of data sources includes an individual customer data source, a group customer data source, a long-term insurance data source, and a short-term insurance data source.
9. An insurance client identity attribute screening device, comprising:
a data acquisition module configured to: extracting characteristic data corresponding to a plurality of data sources according to the identity attribute characteristics of the client, and preprocessing to obtain normalized data; wherein the normalized data comprises at least data matching two of the customer identity attribute features;
a data integration module configured to: based on the same customer identity attribute characteristics of different normalized data, correlating and deduplicating the normalized data to obtain a customer retrieval data set;
a data screening module configured to: and screening the client retrieval data set according to a preset screening condition to obtain a target attribute client set.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the screening method of any one of claims 1 to 7 when executing the program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011459951.7A CN112508720A (en) | 2020-12-11 | 2020-12-11 | Insurance client identity attribute screening method and screening device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112508720A (en) | 2021-03-16 |
Family
ID=74972040
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011459951.7A Pending CN112508720A (en) | 2020-12-11 | 2020-12-11 | Insurance client identity attribute screening method and screening device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112508720A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107895280A (en) * | 2017-10-27 | 2018-04-10 | 深圳索信达数据技术股份有限公司 | A kind of marketing program method for pushing, system, terminal and storage medium |
CN108388675A (en) * | 2018-03-26 | 2018-08-10 | 深圳市买买提信息科技有限公司 | Circulation method and terminal device are drawn in a kind of identity |
CN108520073A (en) * | 2018-04-13 | 2018-09-11 | 深圳壹账通智能科技有限公司 | Air control data integration method, device, equipment and computer readable storage medium |
CN110046933A (en) * | 2019-04-02 | 2019-07-23 | 上海网商电子商务有限公司 | A kind of sale of automobile clue automatically screening system Internet-based |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112883073A (en) * | 2021-03-22 | 2021-06-01 | 北京同邦卓益科技有限公司 | Data screening method, device, equipment, readable storage medium and product |
CN113449221A (en) * | 2021-03-22 | 2021-09-28 | 北京新氧科技有限公司 | Page screening method and device, terminal equipment and storage medium |
CN113449221B (en) * | 2021-03-22 | 2023-09-01 | 北京新氧科技有限公司 | Page screening method and device, terminal equipment and storage medium |
CN112883073B (en) * | 2021-03-22 | 2024-04-05 | 北京同邦卓益科技有限公司 | Data screening method, device, equipment, readable storage medium and product |
CN114240630A (en) * | 2021-12-21 | 2022-03-25 | 中国建设银行股份有限公司 | Data processing method, data processing apparatus, electronic device, medium, and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20210316 |