CN116362750A - Data screening method and device, electronic equipment and storage medium - Google Patents
Data screening method and device, electronic equipment and storage medium
- Publication number
- CN116362750A; CN202310261949.6A
- Authority
- CN
- China
- Prior art keywords
- data table
- screening
- data
- primary key
- time period
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4016—Transaction verification involving fraud or risk level assessment in transaction processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0207—Discounts or incentives, e.g. coupons or rebates
- G06Q30/0224—Discounts or incentives, e.g. coupons or rebates based on user history
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Development Economics (AREA)
- Computer Security & Cryptography (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Economics (AREA)
- Marketing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application provides a data screening method, a data screening device, electronic equipment, and a storage medium, and relates to the technical field of intelligent big data analysis. The method comprises: acquiring a preset screening rule, wherein the screening rule comprises a time period and screening conditions; acquiring, from a production environment, a first data table and a second data table related to the screening conditions, wherein the first data table comprises first historical service data corresponding to each of a plurality of accounts within an arbitrary time period, and the second data table comprises second historical service data corresponding to each of a plurality of accounts within the arbitrary time period and within the time period before the arbitrary time period; and, based on the first data table and the second data table, screening out from the first data table, by means of primary key association, target accounts that meet the screening conditions within the time period. The volume of data conforming to the rule can thus be accurately estimated before the rule is deployed in the production environment, which avoids situations such as an excessive volume of conforming data and the resulting risks to data warehouse capacity and business processing.
Description
Technical Field
The present disclosure relates to the field of intelligent analysis of big data, and in particular, to a data screening method, a device, an electronic apparatus, and a storage medium.
Background
Currently, to assess the risk of user accounts, for example to determine whether a user has engaged in fraudulent activity, the service data of each user account in the real production environment may be screened by a preset screening rule, and the user accounts that meet the screening rule are determined to be risk accounts.
Because data generated in real time is fluid and transient, the related art is mostly limited to deploying the preset screening rule in the production environment first and then judging the effect of the rule in risk assessment from the amount of data screened out in the production environment that conforms to the rule. As a result, the amount of conforming data screened out in the production environment can easily be excessive, which puts data warehouse capacity and business processing at risk. A method is therefore needed for estimating the volume of data that conforms to a screening rule before the rule is deployed in the production environment.
Disclosure of Invention
The application provides a data screening method, a device, electronic equipment, and a storage medium to solve the following problem in the prior art: a preset screening rule is deployed in the production environment before its effect in risk assessment is judged from the amount of conforming data screened out there, so the amount of conforming data can easily be excessive, which puts data warehouse capacity and business processing at risk.
In a first aspect, the present application provides a data screening method, including: acquiring a preset screening rule, wherein the screening rule comprises a time period and preset screening conditions; acquiring a first data table related to the screening conditions in a production environment, wherein the first data table comprises first historical service data corresponding to each of a plurality of accounts within an arbitrary time period; acquiring a second data table related to the screening conditions in the production environment, wherein the second data table comprises second historical service data corresponding to each of a plurality of accounts within the arbitrary time period and within the time period before the arbitrary time period; and, based on the first data table and the second data table, screening out from the first data table, by means of primary key association, target accounts that meet the screening conditions within the time period.
In a second aspect, the present application provides a data screening apparatus, comprising: a first acquisition module, configured to acquire a preset screening rule, wherein the screening rule comprises a time period and preset screening conditions; a second acquisition module, configured to acquire a first data table related to the screening conditions in a production environment, wherein the first data table comprises first historical service data corresponding to each of a plurality of accounts within an arbitrary time period; a third acquisition module, configured to acquire a second data table related to the screening conditions in the production environment, wherein the second data table comprises second historical service data corresponding to each of a plurality of accounts within the arbitrary time period and within the time period before the arbitrary time period; and a screening module, configured to screen out from the first data table, by means of primary key association and based on the first data table and the second data table, target accounts that meet the screening conditions within the time period.
In a third aspect, the present application provides an electronic device, comprising: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes computer-executable instructions stored in the memory to implement the method as described in the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein computer-executable instructions for performing the method according to the first aspect when executed by a processor.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of the first aspect.
According to the data screening method, the device, the electronic equipment, and the storage medium provided by the application, a preset screening rule comprising a time period and preset screening conditions is acquired; a first data table related to the screening conditions in a production environment is acquired, the first data table comprising first historical service data corresponding to each of a plurality of accounts within an arbitrary time period; a second data table related to the screening conditions in the production environment is acquired, the second data table comprising second historical service data corresponding to each of a plurality of accounts within the arbitrary time period and within the time period before the arbitrary time period; and, based on the first data table and the second data table, target accounts that meet the screening conditions within the time period are screened out from the first data table by means of primary key association. The volume of data conforming to the screening rule can therefore be accurately estimated before the rule is deployed in the production environment, which avoids situations such as an excessive volume of conforming data being screened out in the production environment and the resulting risks to data warehouse capacity and business processing.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a flowchart of a data screening method according to an embodiment of the present application;
Fig. 2 is a second flowchart of a data screening method according to an embodiment of the present application;
Fig. 3 is a third flowchart of a data screening method according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a data screening device according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
It should be noted that, in the technical scheme of the application, the acquisition, storage, use, processing and the like of the data all conform to the relevant regulations of national laws and regulations.
In order to clearly understand the technical solutions proposed in the present application, first, the technical names in the embodiments of the present application are explained.
The screening rule is a logic method for judging a certain event through various conditions.
The time period is the time span covered by the screening rule.
The test environment is used for simulating production and is independent of the production environment.
The production environment is an environment for actual production, is a real environment and is independent of the test environment.
Real-time data is data generated during a current time period in a production environment.
Historical data is data generated during a historical period of time in a production environment.
It can be understood that, because data generated in real time is fluid and transient, the related art is mostly limited to deploying a preset screening rule in the production environment first and then judging the effect of the rule in risk assessment from the amount of data screened out in the production environment that conforms to the rule. The amount of conforming data screened out in the production environment can therefore easily be excessive, which puts data warehouse capacity and business processing at risk.
The related art addresses this problem by slicing the data: historical data related to the screening rule is selected for several time periods in the production environment, the historical data in each time period is screened according to the rule, and the screening results of the time periods are averaged, so that the volume of conforming data is estimated before the rule is deployed in the production environment.
For example, take the following screening rule: screen accounts whose number of transactions with other payees within T hours is greater than or equal to N and whose total transaction amount exceeds M. Historical data related to the rule is selected for 5 time periods in the production environment, and the historical data in each time period is screened for accounts whose number of transactions with other payees within T hours is greater than or equal to N and whose total transaction amount exceeds M, yielding 5 screening results. The average of the 5 results is taken as the estimated volume of data conforming to the screening rule.
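The slicing estimate described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the related art's actual implementation: the record layout `(account, time, amount)`, the function names `count_matching_accounts` and `sliced_estimate`, and the sample data are all hypothetical. The sample data also shows one way a fixed slice boundary can hide a match.

```python
from datetime import datetime, timedelta
from statistics import mean


def count_matching_accounts(records, start, end, n, m):
    """Count accounts with at least n transactions totalling more than m in [start, end)."""
    stats = {}
    for account, when, amount in records:
        if start <= when < end:
            count, total = stats.get(account, (0, 0.0))
            stats[account] = (count + 1, total + amount)
    return sum(1 for count, total in stats.values() if count >= n and total > m)


def sliced_estimate(records, slice_starts, t_hours, n, m):
    """Average the per-slice counts, as in the slicing approach."""
    window = timedelta(hours=t_hours)
    counts = [
        count_matching_accounts(records, start, start + window, n, m)
        for start in slice_starts
    ]
    return mean(counts)


# Hypothetical data: one account transacts twice within 40 minutes,
# but the two transactions fall on either side of a slice boundary.
records = [
    ("A", datetime(2023, 1, 1, 23, 40), 10.0),
    ("A", datetime(2023, 1, 2, 0, 20), 10.0),
]
slices = [datetime(2023, 1, 1, 23, 0), datetime(2023, 1, 2, 0, 0)]
estimate = sliced_estimate(records, slices, t_hours=1, n=2, m=15.0)
```

With these inputs each one-hour slice sees only one of the two transactions, so neither slice counts account A, even though both transactions lie within a single hour of each other.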
This approach supports only a rough estimate of data volume for simple screening rules. Computing over flow data by slicing can introduce large errors, so the estimated volume of data conforming to the screening rule is inaccurate.
In the related art, a preset screening rule is deployed in the production environment before its effect in risk assessment is judged from the amount of conforming data screened out there, so the amount of conforming data can easily be excessive, which puts data warehouse capacity and business processing at risk. To solve this problem, the present application proposes the following technical concept:
Acquiring a preset screening rule, wherein the screening rule comprises a time period and preset screening conditions; acquiring a first data table related to the screening conditions in a production environment, wherein the first data table comprises first historical service data corresponding to each of a plurality of accounts within an arbitrary time period; acquiring a second data table related to the screening conditions in the production environment, wherein the second data table comprises second historical service data corresponding to each of a plurality of accounts within the arbitrary time period and within the time period before the arbitrary time period; and, based on the first data table and the second data table, screening out from the first data table, by means of primary key association, target accounts that meet the screening conditions within the time period.
The volume of data conforming to the screening rule can therefore be accurately estimated before the rule is deployed in the production environment, which avoids situations such as an excessive volume of conforming data being screened out in the production environment and the resulting risks to data warehouse capacity and business processing. In a risk prediction scenario, accurately estimating the volume of conforming data before the rule is deployed in the production environment also makes it possible to arrive at a more suitable screening rule.
The following describes the technical solutions of the present application and how the technical solutions of the present application solve the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
First, with reference to fig. 1, a data screening method provided in an embodiment of the present application will be described. Fig. 1 is a flowchart of a data screening method according to an embodiment of the present application.
It should be noted that the data screening method provided in the embodiments of the present application may be executed by a data screening device. The data screening device may be an electronic device or may be configured in an electronic device, so that the volume of data conforming to the screening rule is accurately estimated before the rule is deployed in the production environment, which avoids situations such as an excessive volume of conforming data being screened out in the production environment and the resulting risks to data warehouse capacity and service processing.
The electronic device may be any device with computing capability, for example, may be a personal computer, a mobile terminal, a server, etc., and the mobile terminal may be, for example, a vehicle-mounted device, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, etc., which have various operating systems, touch screens, and/or display screens.
As shown in fig. 1, the data screening method includes the steps of:
S101, acquiring a preset screening rule, wherein the screening rule comprises a time period and preset screening conditions.
The screening rule comprises a time period and preset screening conditions and is used to screen out the accounts that meet the screening conditions within the time period. The time period and the preset screening conditions can be set according to the actual application scenario, which is not limited in the application.
For example, assuming that the time period is T hours (T is a number greater than 0), the screening conditions may include a number of transactions greater than or equal to N (N is an integer greater than 0) and a total transaction amount exceeding M (M is a number greater than 0). The screening rule may then be: screen accounts whose number of transactions with other payees within T hours is greater than or equal to N and whose total transaction amount exceeds M. Accounts that meet these conditions within T hours can be screened out according to the rule.
Alternatively, assuming that the time period is T hours, the screening conditions may include a number of transactions greater than or equal to N and a total transaction amount exceeding M, and the screening rule may be: screen accounts whose number of transactions with other payees within T hours of a residence address change is greater than or equal to N and whose total transaction amount exceeds M. Accounts that meet these conditions within T hours of an address change can be screened out according to the rule.
Alternatively, assuming that the time period is T hours, the screening condition may include a number of personal information changes greater than or equal to N, and the screening rule may be: screen accounts whose number of personal information changes within T hours is greater than or equal to N. Accounts that meet this condition within T hours can be screened out according to the rule.
It should be noted that the above screening rules are only illustrative and should not be construed as limiting the screening rules of the present application. Those skilled in the art may set the screening rule as needed, provided that it includes a time period and screening conditions.
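A screening rule of this shape, a time period plus threshold conditions, could be represented as a small data structure. The sketch below is only an illustration: the class name `ScreeningRule` and its fields `period`, `min_count`, and `min_total` are hypothetical names for T, N, and M, and the optional amount threshold is an assumption made to cover rules, such as the personal information change rule, that have no amount condition.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class ScreeningRule:
    period: timedelta                  # the time period T covered by the rule
    min_count: int                     # N: the event count must be >= N
    min_total: Optional[float] = None  # M: the total must exceed M; None if unused

    def is_satisfied(self, count: int, total: float = 0.0) -> bool:
        """Check the screening conditions against per-account aggregates."""
        if count < self.min_count:
            return False
        return self.min_total is None or total > self.min_total


# Hypothetical rules: at least 3 transactions exceeding 1000 in total within 24 hours,
# and at least 2 personal information changes within 24 hours.
transaction_rule = ScreeningRule(timedelta(hours=24), min_count=3, min_total=1000.0)
info_change_rule = ScreeningRule(timedelta(hours=24), min_count=2)
```

Note that the count condition is inclusive (greater than or equal to N) while the amount condition is strict (exceeding M), matching the wording of the example rules.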
S102, acquiring a first data table related to screening conditions in a production environment, wherein the first data table comprises first historical service data corresponding to a plurality of accounts in any time period.
The arbitrary time period is any time period before the current time and may be set as needed, for example from 00:00 on January 1 to 24:00 on January 30, from 00:00 to 24:00 on January 1, or from 13:00 to 14:00 on January 2.
Taking as an example screening conditions that include a number of transactions greater than or equal to N and a total transaction amount exceeding M, with the arbitrary time period being 00:00 to 24:00 on January 11, a first data table related to the screening conditions in the production environment can be obtained. The first data table may include the historical transaction data of each of a plurality of accounts from 00:00 to 24:00 on January 11, where the historical transaction data includes the transaction amount, transaction time, and so on of transactions with other parties.
Taking as an example a screening condition that includes a number of personal information changes greater than or equal to N, with the arbitrary time period being 00:00 to 24:00 on January 11, a first data table related to the screening condition in the production environment can be obtained. The first data table may include the information change data of each of a plurality of accounts from 00:00 to 24:00 on January 11, where the information change data includes the information change time, the type of the changed information, and so on.
S103, acquiring a second data table related to screening conditions in the production environment, wherein the second data table comprises second historical service data corresponding to each of a plurality of accounts in any time period and a time period before any time period.
The arbitrary time period is the same as the arbitrary time period in step S102, and the time period is the time period included in the screening rule. The time period before the arbitrary time period is the span, one time period long, that immediately precedes the start time of the arbitrary time period. For example, assuming that the arbitrary time period is 00:00 to 24:00 on January 11 and the time period is one week, the time period before the arbitrary time period runs from 00:00 on January 4 to 24:00 on January 10.
Taking as an example screening conditions that include a number of transactions greater than or equal to N and a total transaction amount exceeding M, with the arbitrary time period being 00:00 to 24:00 on January 11 and the time period being one week, a second data table related to the screening conditions in the production environment can be obtained. The second data table may include the historical transaction data of each of a plurality of accounts from 00:00 on January 4 to 24:00 on January 11, where the historical transaction data includes the transaction amount, transaction time, and so on of transactions with other parties.
Taking as an example a screening condition that includes a number of personal information changes greater than or equal to N, with the arbitrary time period being 00:00 to 24:00 on January 11 and the time period being one week, a second data table related to the screening condition in the production environment can be obtained. The second data table may include the information change data of each of a plurality of accounts from 00:00 on January 4 to 24:00 on January 11, where the information change data includes the information change time, the type of the changed information, and so on.
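The time ranges covered by the two data tables in these examples can be derived from the arbitrary time period and the rule's time period. The following is a minimal sketch; the function name `table_ranges` is hypothetical, the year 2023 is an arbitrary assumption (the source gives only month and day), and 24:00 on January 11 is represented as 00:00 on January 12.

```python
from datetime import datetime, timedelta


def table_ranges(arbitrary_start, arbitrary_end, period):
    """Return the (start, end) ranges covered by the first and second data tables."""
    first_range = (arbitrary_start, arbitrary_end)
    # The second table additionally covers one time period before the arbitrary period.
    second_range = (arbitrary_start - period, arbitrary_end)
    return first_range, second_range


first_range, second_range = table_ranges(
    datetime(2023, 1, 11, 0, 0),   # 00:00 on January 11
    datetime(2023, 1, 12, 0, 0),   # 24:00 on January 11
    timedelta(weeks=1),
)
```

Here the second range starts at 00:00 on January 4 and ends at 24:00 on January 11, which matches the one-week example above.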
S104, based on the first data table and the second data table, screening out from the first data table, by means of primary key association, target accounts that meet the screening conditions within the time period.
In this embodiment of the present application, both the first data table and the second data table may include a primary key for distinguishing accounts, for example the account number, the identity card number of the user to whom the account belongs, or the mobile phone number of that user. The first data table may be used as the primary table, and the second data table is associated with it through the primary key in the first data table, so as to obtain, from the second data table, the data associated with each primary key value of the primary key. The data associated with each primary key value includes the historical service data corresponding to that primary key value within the time period. Data screening is then performed on the data associated with each primary key value of the primary key in the first data table, so as to obtain the target accounts in the first data table that meet the screening conditions within the time period.
For example, suppose the screening conditions include a number of transactions greater than or equal to N and a total transaction amount exceeding M, the arbitrary time period is 00:00 to 24:00 on January 11, the time period is one week, and the screening rule is to screen accounts whose number of transactions with other payees within one week is greater than or equal to N and whose total transaction amount exceeds M. The first data table may then include the historical transaction data of each of a plurality of accounts from 00:00 to 24:00 on January 11, and the second data table may include the historical transaction data of each of a plurality of accounts from 00:00 on January 4 to 24:00 on January 11, where the historical transaction data includes the transaction amount, transaction time, and so on of transactions with other parties.
Assume that the first data table includes the historical transaction data of account 1, account 2, account 3, and account 4, and that the second data table includes the historical transaction data of account 1 through account 8. The first data table may be used as the primary table, and the second data table is associated with it through the primary key (such as the account number) that distinguishes accounts in the first data table, so as to obtain, from the second data table, the data associated with the primary key values corresponding to account 1, account 2, account 3, and account 4. Taking account 1 as an example, the data associated with its primary key value includes the historical transaction data of account 1 within one week.
The data associated with the primary key values corresponding to account 1, account 2, account 3, and account 4 can then be screened according to the screening rule, so as to obtain the target accounts in the first data table whose number of transactions with other payees within one week is greater than or equal to N and whose total transaction amount exceeds M.
After the data is sliced, the historical service data corresponding to each primary key value in the first data table within the time period is obtained by primary key association, and the target accounts meeting the screening conditions within the time period are then screened out from the first data table according to the screening conditions. This faithfully reproduces the processing logic of the screening rule, so the volume of data conforming to the rule can be accurately estimated before the rule is deployed in the production environment.
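The screening by primary key association can be sketched end to end as follows. This is one possible reading under stated assumptions, not a definitive implementation: the row layout (`account`, `time`, `amount`), the function name `screen_target_accounts`, the helper `tx`, and the sample data are hypothetical, and the look-back window is assumed to end at the time of each record in the first data table and to be open at its start.

```python
from datetime import datetime, timedelta


def screen_target_accounts(first_table, second_table, period, n, m):
    """Return accounts in first_table that meet the rule within `period`."""
    # Index the second table by primary key (here: the account number).
    by_account = {}
    for row in second_table:
        by_account.setdefault(row["account"], []).append(row)

    targets = set()
    for row in first_table:  # the first table acts as the primary table
        window_start = row["time"] - period
        related = [
            r for r in by_account.get(row["account"], [])
            if window_start < r["time"] <= row["time"]
        ]
        if len(related) >= n and sum(r["amount"] for r in related) > m:
            targets.add(row["account"])
    return targets


def tx(account, day, hour, amount):
    """Build one hypothetical transaction row dated January 2023."""
    return {"account": account, "time": datetime(2023, 1, day, hour), "amount": amount}


first_table = [tx("1", 11, 10, 60.0), tx("2", 11, 9, 500.0)]
second_table = first_table + [
    tx("1", 6, 8, 30.0),
    tx("1", 9, 8, 20.0),
    tx("1", 4, 8, 90.0),   # more than a week before account 1's January 11 record
    tx("5", 10, 8, 40.0),  # account 5 is absent from the first table
    tx("5", 10, 9, 40.0),
    tx("5", 10, 10, 40.0),
]
targets = screen_target_accounts(
    first_table, second_table, timedelta(weeks=1), n=3, m=100.0
)
estimated_volume = len(targets)
```

Because the first table drives the loop, accounts that appear only in the second table are never evaluated, and the number of target accounts serves as the estimated data volume for the rule.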
According to the data screening method provided by this embodiment, a preset screening rule comprising a time period and preset screening conditions is acquired; a first data table related to the screening conditions in a production environment is acquired, the first data table comprising first historical service data corresponding to each of a plurality of accounts within an arbitrary time period; a second data table related to the screening conditions in the production environment is acquired, the second data table comprising second historical service data corresponding to each of a plurality of accounts within the arbitrary time period and within the time period before the arbitrary time period; and, based on the first data table and the second data table, target accounts that meet the screening conditions within the time period are screened out from the first data table by means of primary key association. The volume of data conforming to the screening rule can therefore be accurately estimated before the rule is deployed in the production environment, which avoids situations such as an excessive volume of conforming data being screened out in the production environment and the resulting risks to data warehouse capacity and business processing. In a risk prediction scenario, accurately estimating the volume of conforming data before the rule is deployed in the production environment also makes it possible to arrive at a more suitable screening rule.
The process of screening target accounts meeting the screening conditions in the time period from the first data table by the primary key association method based on the first data table and the second data table in the data screening method provided in the embodiment of the present application is further described below with reference to fig. 2.
Fig. 2 is a second flowchart of a data screening method according to an embodiment of the present application. As shown in fig. 2, the data screening method includes the steps of:
S201, acquiring a preset screening rule, wherein the screening rule comprises a time period and preset screening conditions.
S202, acquiring a first data table related to screening conditions in a production environment, wherein the first data table comprises first historical service data corresponding to a plurality of accounts in any time period.
S203, acquiring a second data table related to the screening condition in the production environment, wherein the second data table comprises second historical service data corresponding to each of a plurality of accounts in any time period and a time period before any time period.
It should be noted that, the related descriptions of the steps S201 to S203 may refer to other embodiments, and are not described herein.
S204, taking the first data table as a main table, and associating the second data table through a first primary key used for distinguishing accounts in the first data table to obtain first target data respectively associated with each primary key value of the first primary key.
The first target data comprises the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the first primary key, and whose service handling time falls within the time period before the service handling time corresponding to the associated primary key value of the first primary key. The first target data associated with each primary key value of the first primary key may include second historical service data corresponding to one or more accounts.
The first primary key is used for distinguishing accounts, for example, an account number, an identity card number of a user to whom the account belongs, a mobile phone number of the user to which the account belongs, and the like.
The primary key value of the first primary key is a specific account identifier of an account, such as a specific account number, a specific identification card number, a specific mobile phone number, and the like.
The business handling time is the time of handling business, for example, the business handling time is the transaction time for transaction business, the business handling time is the information changing time for information changing business, and the like.
The business handling time included in a certain second historical business data is the business handling time of the user corresponding to the second historical business data; the business handling time corresponding to the primary key value of the first primary key is the business handling time of the user corresponding to the primary key value in the first data table.
In this embodiment of the present application, the first data table may be used as the main table, and the second data table is associated through the first primary key used for distinguishing accounts in the first data table. During the association, two restrictions may be applied: the primary key value of the second historical service data in the second data table is the same as the associated primary key value of the first primary key, and the service handling time included in the second historical service data falls within the time period before the service handling time corresponding to the associated primary key value of the first primary key. In this way, the associated first target data can be obtained for each primary key value of the first primary key in the first data table. The first target data associated with each primary key value of the first primary key comprises the historical service data corresponding to that primary key value within the time period.
For example, suppose the screening condition is that the number of transactions is greater than or equal to N and the total transaction amount exceeds M, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week, so that the screening rule screens accounts whose number of transactions with other payees within one week is greater than or equal to N and whose total transaction amount exceeds M. The first data table may then include the historical transaction data of each of the plurality of accounts from 00:00 to 24:00 on January 11, where the historical transaction data includes the transaction amount, transaction time and the like of transactions with other parties. The second data table may include the historical transaction data of each of the plurality of accounts from 00:00 on January 4 to 24:00 on January 11, where the historical transaction data likewise includes the transaction amount, transaction time and the like of transactions with other parties.
Assuming that the first data table includes the historical transaction data of account 1, account 2, account 3 and account 4, and the second data table includes the historical transaction data of account 1 through account 8, the first data table may be used as the main table and the second data table is associated through the first primary key (such as the account number) in the first data table, so as to obtain the first target data in the second data table associated with each primary key value of the first primary key. Taking account 1 as an example, the first target data associated with the primary key value corresponding to account 1 may include at least one piece of historical transaction data corresponding to account 1 in the second data table, where the service handling time of that data falls within the time period before the service handling time corresponding to account 1 in the first data table. For example, if the first data table includes the transaction amount of a transaction performed by account 1 at 13:00 on January 11, the first target data associated with the primary key value corresponding to account 1 may include the transaction time and transaction amount of the transactions of account 1 in the second data table performed between 13:00 on January 4 and 13:00 on January 11. Thus, the first target data associated with the primary key value corresponding to account 1 includes the historical transaction data of account 1 within one week.
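The association in step S204 can be illustrated with a minimal Python sketch. It is not the implementation of the application: the row layout `(account_number, transaction_time, amount)`, the function name `associate`, the year 2023, and the choice to treat the window as open at its start and closed at the handling time are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical rows: (account_number, transaction_time, amount). The year is arbitrary.
first_table = [
    ("account1", datetime(2023, 1, 11, 13), 50.0),
    ("account2", datetime(2023, 1, 11, 9), 20.0),
]
second_table = [
    ("account1", datetime(2023, 1, 3, 8), 10.0),    # more than one week earlier: excluded
    ("account1", datetime(2023, 1, 5, 16), 30.0),
    ("account1", datetime(2023, 1, 11, 13), 50.0),
    ("account2", datetime(2023, 1, 11, 9), 20.0),
    ("account5", datetime(2023, 1, 6, 10), 99.0),   # absent from the main table: never joined
]

def associate(first_rows, second_rows, period):
    """Join second_rows onto first_rows by account number.

    For a main-table row handled at time t, keep the second-table rows of the
    same account whose time lies in (t - period, t] (assumed boundaries).
    """
    result = {}
    for account, handled_at, _ in first_rows:
        window_start = handled_at - period
        result[(account, handled_at)] = [
            row for row in second_rows
            if row[0] == account and window_start < row[1] <= handled_at
        ]
    return result

first_target_data = associate(first_table, second_table, timedelta(weeks=1))
```

Here the main table drives the join, so accounts that appear only in the second data table (such as the hypothetical `account5`) do not contribute any first target data.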
S205, screening the first target data respectively associated with the primary key values of the first primary keys according to the screening conditions to obtain target accounts meeting the screening conditions in the time period in the first data table.
In one possible implementation, the screening condition may include the number of business transactions being greater than the number threshold and the amount of business being greater than the amount threshold, and accordingly, step S205 may be implemented by:
For each primary key value of the first primary key, determining the number of service handlings and the service amount corresponding to the primary key value within the time period according to the second historical service data in the associated first target data; determining, among the primary key values of the first primary key, a primary key value whose corresponding number of service handlings within the time period is greater than the number threshold and whose corresponding service amount is greater than the amount threshold as a first target primary key value; and determining the account corresponding to the first target primary key value as a target account meeting the screening condition within the time period. Therefore, for a screening condition that requires the number of service handlings to be greater than the number threshold and the service amount to be greater than the amount threshold, the target accounts meeting the screening condition within the time period can be accurately screened out.
For example, continuing the above example, assume that the screening condition is that the number of transactions is greater than or equal to N and the total transaction amount exceeds M, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week, so that the screening rule screens accounts whose number of transactions with other payees within one week is greater than or equal to N and whose total transaction amount exceeds M. Since the historical transaction data within one week can be obtained through the above steps for each account in the first data table, the number of transactions and the total transaction amount within one week can be determined for each account, and the target accounts whose number of transactions with other payees within one week is greater than or equal to N and whose total transaction amount exceeds M can then be screened out.
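The screening in step S205 amounts to a per-account aggregation followed by a threshold check. The following Python sketch assumes the first target data has already been reduced to a hypothetical mapping from account number to the transaction amounts found in its one-week window; the function name `screen` and the values chosen for N and M are illustrative only.

```python
# Hypothetical first target data: account number -> transaction amounts in the window.
first_target_data = {
    "account1": [30.0, 50.0, 40.0],
    "account2": [20.0],
    "account3": [500.0, 1.0],
}

def screen(target_data, min_count, min_total):
    """Return accounts with at least min_count transactions (N)
    whose total amount exceeds min_total (M)."""
    targets = []
    for account, amounts in target_data.items():
        if len(amounts) >= min_count and sum(amounts) > min_total:
            targets.append(account)
    return targets

# N = 3 and M = 100.0 are example values only.
target_accounts = screen(first_target_data, min_count=3, min_total=100.0)
```

With these sample values only `account1` satisfies both conditions: `account3` exceeds the amount but has too few transactions, which shows why the two conditions are checked together.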
In another possible implementation manner, the screening condition may include that the number of business processes is greater than the number threshold, and accordingly, the step S205 may be implemented by: for each primary key value of the first primary key, determining the business handling times corresponding to the primary key value in the time period according to the second historical business data in the associated first target data; determining a primary key value of which the corresponding business handling times in a time period are greater than a times threshold value in all primary key values of a first primary key as a first target primary key value; and determining the account corresponding to the first target primary key value as a target account meeting the screening condition in the time period. Therefore, for the screening conditions including that the business handling times are larger than the times threshold, the target account meeting the screening conditions in the time period can be accurately screened out.
For example, assume that the screening condition is that the number of personal information changes is greater than or equal to N, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week, so that the screening rule screens accounts whose number of personal information changes within one week is greater than or equal to N. Since the information change data within one week can be obtained through the above steps for each account in the first data table, the number of personal information changes within one week can be determined for each account, and the target accounts whose number of personal information changes within one week is greater than or equal to N can then be screened out.
Therefore, after the data is sliced, the historical service data within the time period corresponding to each primary key value in the first data table can be obtained by primary key association, and the target accounts meeting the screening condition within the time period are then screened out from the first data table according to the screening condition. This faithfully reproduces the processing logic of the screening rule, so that the amount of data conforming to the screening rule can be accurately estimated before the screening rule is deployed in the production environment, which avoids situations such as an excessive amount of data matching the rule in the production environment and the resulting risks to data warehouse capacity and business processing.
As can be seen from the above analysis, in the embodiment of the present application the screening rule may include a time period and a screening condition. For a screening rule that does not include a trigger event, the first data table may be used as the main table and the second data table is associated through the first primary key used for distinguishing accounts in the first data table, so as to obtain the first target data associated with each primary key value of the first primary key; the first target data associated with each primary key value is then screened according to the screening condition to obtain the target accounts in the first data table that meet the screening condition within the time period. In one possible implementation, the screening rule may further include a trigger event, in which case the screening rule is used to screen out accounts that meet the screening condition within the time period after the trigger event occurs. Correspondingly, step S104 may include: screening out, from the first data table by primary key association based on the first data table and the second data table, the target accounts meeting the screening condition within the time period after the trigger event occurs. Therefore, for a screening rule comprising a screening condition, a time period and a trigger event, the amount of data conforming to the screening rule can likewise be accurately estimated before the screening rule is deployed in the production environment, which avoids situations such as an excessive amount of data matching the rule and the resulting risks to data warehouse capacity and business processing.
The data screening method provided in the embodiment of the present application is further described below with reference to fig. 3, for the case that the above screening rule includes a time period, a preset screening condition, and a trigger event.
Fig. 3 is a third flowchart of a data screening method according to an embodiment of the present application. As shown in fig. 3, the data screening method includes the steps of:
S301, acquiring a preset screening rule, wherein the screening rule comprises a time period, preset screening conditions and a trigger event.
A screening rule comprising a time period, preset screening conditions and a trigger event is used for screening out accounts that meet the screening conditions within the time period after the trigger event occurs. The trigger event, the time period and the preset screening conditions can be set according to the actual application scenario, which is not limited in the present application.
The trigger event may be, for example, an address change, a mobile phone number change, etc.
For example, assuming that the trigger event is an address change, the time period is T hours, and the screening condition is that the number of transactions is greater than or equal to N and the total transaction amount exceeds M, the screening rule may be: screen accounts whose number of transactions with other payees within T hours after an address change is greater than or equal to N and whose total transaction amount exceeds M. Accounts satisfying this description can then be screened out according to the screening rule.
S302, acquiring a first data table related to screening conditions in a production environment, wherein the first data table comprises first historical service data corresponding to a plurality of accounts in any time period.
S303, acquiring a second data table related to screening conditions in the production environment, wherein the second data table comprises second historical service data corresponding to each of a plurality of accounts in any time period and a time period before any time period.
It should be noted that, the related descriptions of the steps S301 to S303 may refer to other embodiments, and are not described herein.
S304, acquiring a third data table related to the trigger event in the production environment, wherein the third data table comprises the occurrence time of the trigger event corresponding to each of a plurality of accounts in the any time period and in the time period before the any time period.
The arbitrary time period in step S304 is the same as the arbitrary time period in steps S302 to S303, and the time period is the time period included in the screening rule. The time period before the arbitrary time period is the time period that immediately precedes the start time of the arbitrary time period. For example, assuming that the arbitrary time period is 00:00 to 24:00 on January 11 and the time period is one week, the time period before the arbitrary time period runs from 00:00 on January 4 to 24:00 on January 10.
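The date arithmetic in this example can be checked with a short Python sketch. The year 2023 and the helper name `preceding_window` are assumptions for illustration; 24:00 on January 10 is represented as 00:00 on January 11.

```python
from datetime import datetime, timedelta

def preceding_window(period_start, time_period):
    """Return (start, end) of the window that immediately precedes period_start."""
    return period_start - time_period, period_start

# Arbitrary time period: 00:00 to 24:00 on January 11 (year chosen arbitrarily).
arbitrary_start = datetime(2023, 1, 11, 0, 0)
arbitrary_end = arbitrary_start + timedelta(days=1)

# Time period in the screening rule: one week.
prev_start, prev_end = preceding_window(arbitrary_start, timedelta(weeks=1))
```

Under these assumptions, the second and third data tables would together cover the range from `prev_start` to `arbitrary_end`, that is, from 00:00 on January 4 to 24:00 on January 11.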
Taking as an example the case where the trigger event is an address change, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week, a third data table related to the trigger event in the production environment can be obtained, and the third data table may include the address change times of each of a plurality of accounts from 00:00 on January 4 to 24:00 on January 11.
S305, taking the first data table as a main table, and associating the third data table through the first primary key used for distinguishing accounts in the first data table to obtain a fourth data table.
The fourth data table comprises the first historical service data in the first data table whose primary key value is the same as a primary key value of a second primary key used for distinguishing accounts in the third data table, together with a target occurrence time associated with that first historical service data. The target occurrence time is the earliest of the occurrence times of the trigger events in the third data table whose primary key value is the same as the primary key value of the first historical service data.
The first primary key is used for distinguishing accounts, for example, an account number, an identity card number of a user to whom the account belongs, a mobile phone number of the user to which the account belongs, and the like.
The primary key value of the first primary key is a specific account identifier of an account, such as a specific account number, a specific identification card number, a specific mobile phone number, and the like.
The second primary key is used for distinguishing the account, for example, the account number, the identification card number of the user to which the account belongs, the mobile phone number of the user to which the account belongs, and the like, which is not limited in the application.
The primary key value of the second primary key is a specific account identifier of the account, such as a specific account number, a specific identification card number, a specific mobile phone number, and the like.
The first primary key and the second primary key may be the same or different, and the present application is not limited thereto.
For example, suppose the trigger event is an address change, the screening condition is that the number of transactions is greater than or equal to N and the total transaction amount exceeds M, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week, so that the screening rule screens accounts whose number of transactions with other payees within one week after an address change is greater than or equal to N and whose total transaction amount exceeds M. The first data table may then include the historical transaction data of each of a plurality of accounts from 00:00 to 24:00 on January 11, where the historical transaction data includes the transaction amount, transaction time and the like of transactions with other parties, and the third data table may include the address change times of each of a plurality of accounts from 00:00 on January 4 to 24:00 on January 11.
Assuming that the first data table includes the historical transaction data of account 1, account 2, account 3 and account 4 from 00:00 to 24:00 on January 11, and the third data table includes the address change times of account 1, account 2, account 4, account 5, account 6 and account 7 from 00:00 on January 4 to 24:00 on January 11, the first data table can be used as the main table and the third data table is associated through the first primary key (such as the account number) used for distinguishing accounts in the first data table, so as to obtain the fourth data table.
Since the primary key values corresponding to account 1, account 2 and account 4 in the first data table are the same as the primary key values corresponding to account 1, account 2 and account 4 in the third data table, the fourth data table includes the historical transaction data of account 1, account 2 and account 4 from 00:00 to 24:00 on January 11 in the first data table, and the target occurrence time associated with the historical transaction data of each of these accounts. The target occurrence time associated with the historical transaction data of account 1 is the earliest address change time of account 1 in the third data table between 00:00 on January 4 and 24:00 on January 11; the target occurrence time associated with the historical transaction data of account 2 is the earliest address change time of account 2 in the third data table in the same range; and the target occurrence time associated with the historical transaction data of account 4 is the earliest address change time of account 4 in the third data table in the same range.
Taking account 1 as an example, assume that the first data table includes the transaction amount of a transaction performed by account 1 at 13:00 on January 11, and that the third data table records that account 1 changed its address to address 1 at 10:00 on January 10, to address 2 at 14:00 on January 9, and to address 3 at 15:00 on January 5. The target occurrence time associated with the historical transaction data of account 1 is then 15:00 on January 5.
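A minimal Python sketch of how the fourth data table in step S305 might be built is given below. The row layouts, the function name `build_fourth_table` and the year 2023 are assumptions for illustration, and the trigger-event rows are assumed to be stored as `(account_number, address_change_time)` pairs.

```python
from datetime import datetime

# Hypothetical first data table rows: (account_number, transaction_time, amount).
first_table = [
    ("account1", datetime(2023, 1, 11, 13), 50.0),
    ("account3", datetime(2023, 1, 11, 10), 15.0),  # no address change: dropped
]
# Hypothetical third data table rows: (account_number, address_change_time).
third_table = [
    ("account1", datetime(2023, 1, 10, 10)),
    ("account1", datetime(2023, 1, 9, 14)),
    ("account1", datetime(2023, 1, 5, 15)),
    ("account5", datetime(2023, 1, 6, 9)),          # not in the main table
]

def build_fourth_table(first_rows, trigger_rows):
    """Attach the earliest trigger time to each main-table row whose account
    also appears in the trigger table; other rows are left out."""
    earliest = {}
    for account, occurred_at in trigger_rows:
        if account not in earliest or occurred_at < earliest[account]:
            earliest[account] = occurred_at
    return [
        (account, handled_at, amount, earliest[account])
        for account, handled_at, amount in first_rows
        if account in earliest
    ]

fourth_table = build_fourth_table(first_table, third_table)
```

For the sample data, the only resulting row is the transaction of `account1`, carrying 15:00 on January 5 as its target occurrence time, which matches the worked example.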
Thus, the historical transaction data in the arbitrary time period can be obtained for a subset of the accounts in the first data table, namely the accounts for which the trigger event occurred in the arbitrary time period or in the time period before it, together with the earliest time at which the trigger event occurred for each of those accounts in that range.
S306, based on the fourth data table and the second data table, screening out from the first data table, by primary key association, the target accounts meeting the screening condition within the time period after the trigger event occurs.
By taking the first data table as the main table, associating the third data table through the first primary key used for distinguishing accounts in the first data table to obtain the fourth data table, and then screening out from the first data table, by primary key association based on the fourth data table and the second data table, the target accounts meeting the screening condition within the time period after the trigger event occurs, the amount of data conforming to the screening rule can be accurately estimated before the screening rule is deployed in the production environment. This avoids situations such as an excessive amount of data matching the rule in the production environment and the resulting risks to data warehouse capacity and business processing. For a risk prediction scenario, accurately estimating in advance the amount of data conforming to the screening rule also helps obtain a more suitable screening rule.
In some embodiments, step S306 may be implemented by the following steps a1 to a2:
a1, taking the fourth data table as a main table, and associating the second data table through a third primary key used for distinguishing accounts in the fourth data table to obtain second target data respectively associated with each primary key value of the third primary key; the second target data comprises the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the third primary key, and whose service handling time is before the service handling time corresponding to the associated primary key value of the third primary key and within the time period after the target occurrence time;
a2, screening the second target data respectively associated with each primary key value of the third primary key according to the screening condition to obtain the target accounts in the first data table that meet the screening condition within the time period after the trigger event occurs.
The second target data associated with each primary key value of the third primary key may include second historical service data corresponding to one or more accounts.
The third primary key is used for distinguishing accounts, for example, an account number, an identity card number of a user to whom the account belongs, a mobile phone number of the user to which the account belongs, and the like.
The primary key value of the third primary key is a specific account identifier of the account, such as a specific account number, a specific identification card number, a specific mobile phone number, and the like.
The business handling time is the time of handling business, for example, the business handling time is the transaction time for transaction business, the business handling time is the information changing time for information changing business, and the like.
The business handling time included in a certain second historical business data is the business handling time of the user corresponding to the second historical business data; and the business handling time corresponding to the primary key value of the third primary key is the business handling time of the user corresponding to the primary key value in the fourth data table.
In this embodiment of the present application, the fourth data table may be used as the main table, and the second data table is associated through the third primary key used for distinguishing accounts in the fourth data table. During the association, two restrictions may be applied: the primary key value of the second historical service data in the second data table is the same as the associated primary key value of the third primary key, and the service handling time included in the second historical service data is before the service handling time corresponding to the associated primary key value of the third primary key and within the time period after the target occurrence time. In this way, the associated second target data can be obtained for each primary key value of the third primary key in the fourth data table. The second target data associated with each primary key value of the third primary key comprises the historical service data corresponding to that primary key value within the time period after the target occurrence time.
For example, continuing the above example, suppose the trigger event is an address change, the screening condition is that the number of transactions is greater than or equal to N and the total transaction amount exceeds M, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week, so that the screening rule screens accounts whose number of transactions with other payees within one week after an address change is greater than or equal to N and whose total transaction amount exceeds M. The fourth data table may then include the historical transaction data of account 1, account 2 and account 4 in the first data table from 00:00 to 24:00 on January 11, and the target occurrence time, between 00:00 on January 4 and 24:00 on January 11, associated with the historical transaction data of each of account 1, account 2 and account 4.
Assume that the second data table includes the historical transaction data of account 1 through account 8 from 00:00 on January 4 to 24:00 on January 11. The fourth data table can then be used as the main table, and the second data table is associated through the third primary key (such as the account number) used for distinguishing accounts in the fourth data table, so as to obtain the second target data in the second data table associated with each primary key value of the third primary key. Taking account 1 as an example, the second target data associated with the primary key value corresponding to account 1 may include at least one piece of historical transaction data corresponding to account 1 in the second data table, where the service handling time of that data is before the service handling time corresponding to account 1 in the fourth data table and within the time period after the target occurrence time corresponding to account 1. For example, if the fourth data table includes the transaction amount of a transaction performed by account 1 at 13:00 on January 11 and the target occurrence time of 15:00 on January 5, the second target data associated with the primary key value corresponding to account 1 may include the transaction time and transaction amount of the transactions of account 1 in the second data table performed before 13:00 on January 11 and within one week after 15:00 on January 5, that is, between 15:00 on January 5 and 13:00 on January 11. Thus, the second target data associated with the primary key value corresponding to account 1 includes the historical transaction data of account 1 within one week after the address change.
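Step a1 can be sketched in Python as follows. This is one reading of the window described above, not the definitive one: the window is assumed to start at the target occurrence time and to end at the earlier of the handling time and the target occurrence time plus the time period, with both ends inclusive. The row layouts, the function name `associate_after_trigger` and the year 2023 are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical fourth data table rows:
# (account_number, transaction_time, amount, target_occurrence_time).
fourth_table = [
    ("account1", datetime(2023, 1, 11, 13), 50.0, datetime(2023, 1, 5, 15)),
]
# Hypothetical second data table rows: (account_number, transaction_time, amount).
second_table = [
    ("account1", datetime(2023, 1, 4, 9), 10.0),    # before the address change: excluded
    ("account1", datetime(2023, 1, 6, 12), 30.0),
    ("account1", datetime(2023, 1, 11, 13), 50.0),
    ("account1", datetime(2023, 1, 11, 20), 70.0),  # after the handling time: excluded
    ("account2", datetime(2023, 1, 7, 8), 5.0),     # different account: excluded
]

def associate_after_trigger(fourth_rows, second_rows, period):
    """Join second_rows onto fourth_rows by account number, keeping rows whose
    time lies in [target_time, min(handled_at, target_time + period)]."""
    result = {}
    for account, handled_at, _, target_time in fourth_rows:
        window_end = min(handled_at, target_time + period)
        result[(account, handled_at)] = [
            row for row in second_rows
            if row[0] == account and target_time <= row[1] <= window_end
        ]
    return result

second_target_data = associate_after_trigger(
    fourth_table, second_table, timedelta(weeks=1)
)
```

For the sample row of `account1`, the window runs from 15:00 on January 5 to 13:00 on January 11, so two of the five second-table rows are kept.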
In one possible implementation, the screening condition may include the number of business transactions being greater than the number threshold and the amount of business being greater than the amount threshold, and accordingly, step a2 may be implemented by:
For each primary key value of the third primary key, determining the number of service handlings and the service amount corresponding to the primary key value within the time period after the trigger event occurs according to the second historical service data in the associated second target data; determining, among the primary key values of the third primary key, a primary key value whose corresponding number of service handlings within the time period after the trigger event occurs is greater than the number threshold and whose corresponding service amount is greater than the amount threshold as a second target primary key value; and determining the account corresponding to the second target primary key value as a target account meeting the screening condition within the time period after the trigger event occurs. Therefore, for a screening condition that requires the number of service handlings to be greater than the number threshold and the service amount to be greater than the amount threshold, the target accounts meeting the screening condition within the time period after the trigger event occurs can be accurately screened out.
For example, continuing the above example, assume that the trigger event is an address change, the screening condition is that the number of transactions is greater than or equal to N and the total transaction amount exceeds M, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week, so that the screening rule screens accounts whose number of transactions with other payees within one week after an address change is greater than or equal to N and whose total transaction amount exceeds M. Since the historical transaction data within one week after the address change can be obtained through the above steps for each account in the fourth data table, the number of transactions and the total transaction amount within one week after the address change can be determined for each account, and the target accounts whose number of transactions with other payees within one week after the address change is greater than or equal to N and whose total transaction amount exceeds M can then be screened out.
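Step a2 can be sketched in Python as an aggregation over the second target data. The sketch assumes a hypothetical layout in which the second target data is keyed by `(account_number, transaction_time)` of the fourth-table row, and it treats an account as a target account as soon as any one of its rows satisfies the condition; that de-duplication choice, the function name `screen_after_trigger` and the values of N and M are assumptions.

```python
from datetime import datetime

# Hypothetical second target data:
# (account_number, handling_time) -> joined rows (account_number, time, amount).
second_target_data = {
    ("account1", datetime(2023, 1, 11, 13)): [
        ("account1", datetime(2023, 1, 6, 12), 30.0),
        ("account1", datetime(2023, 1, 11, 13), 80.0),
    ],
    ("account1", datetime(2023, 1, 11, 9)): [
        ("account1", datetime(2023, 1, 6, 12), 30.0),
    ],
    ("account2", datetime(2023, 1, 11, 10)): [
        ("account2", datetime(2023, 1, 8, 8), 500.0),
    ],
}

def screen_after_trigger(target_data, min_count, min_total):
    """Return the sorted accounts having at least one row whose window holds
    at least min_count transactions (N) totalling more than min_total (M)."""
    targets = set()
    for (account, _), rows in target_data.items():
        total = sum(amount for _, _, amount in rows)
        if len(rows) >= min_count and total > min_total:
            targets.add(account)
    return sorted(targets)

# N = 2 and M = 100.0 are example values only.
target_accounts = screen_after_trigger(second_target_data, min_count=2, min_total=100.0)
```

Counting the distinct accounts returned gives the estimate of how many accounts the rule would flag, which is the quantity the method aims to know before deployment.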
In another possible implementation, assuming that the trigger event is an address change, the screening condition may include the number of business transactions being greater than the number threshold, and accordingly, step a2 may be implemented as follows: for each primary key value of the third primary key, determining, according to the second historical service data in the associated second target data, the number of business transactions corresponding to the primary key value in the time period after the triggering event occurs; determining, among the primary key values of the third primary key, each primary key value whose corresponding number of business transactions in the time period after the triggering event occurs is greater than the number threshold as a second target primary key value; and determining the account corresponding to the second target primary key value as a target account meeting the screening condition in the time period after the triggering event occurs. In this way, for a screening condition that includes the number of business transactions being greater than the number threshold, the target accounts meeting the screening condition in the time period after the triggering event occurs can be accurately screened out.
For example, assume that the trigger event is an address change, the screening condition is that the number of personal information changes is greater than or equal to N, the arbitrary time period is 00:00 to 24:00 on January 11, and the time period is one week. The screening rule is then to screen out accounts whose number of personal information changes within one week after the address change is greater than or equal to N. Because the steps above yield, for each account in the fourth data table, the information change data of that account within one week after the address change, the number of personal information changes within that week can be determined for each account, and the target accounts whose number of personal information changes within one week after the address change is greater than or equal to N can then be screened out.
Therefore, after the data is sliced, the historical service data corresponding to each primary key value in the fourth data table in the time period after the triggering event occurs can be obtained by means of primary key association, and the target accounts meeting the screening condition in the time period after the triggering event occurs can then be screened out from the fourth data table according to the screening condition. This faithfully reproduces the processing logic of the screening rule, so that the volume of data conforming to the screening rule can be accurately estimated before the rule is deployed in the production environment. Situations such as the rule matching an excessive volume of data in the production environment are thereby avoided, along with the resulting risks to data warehouse capacity and business processing.
It should be noted that the data screening method provided by this scheme can also be applied to screening rules that combine multiple trigger events, for example, a rule that screens out accounts whose mobile phone number is changed within T hours after an address change and which, within X hours after the mobile phone number change, make N or more transactions with other people with a total transaction amount exceeding M. For such rules, too, the volume of data conforming to the screening rule can be accurately estimated before the rule is deployed in the production environment. The specific implementation process is similar to that of the foregoing embodiments and is not repeated here.
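A rule that chains two trigger events could be checked per account roughly as sketched below. This is only an illustration under stated assumptions: the text does not give the implementation, the function name `matches_fused_rule` and its parameters are hypothetical, and the rule is read as "phone number changed within T hours after the address change, then at least N transactions totalling more than M within X hours after the phone number change".

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def matches_fused_rule(address_changed_at: datetime, phone_changed_at: datetime,
                       transactions: List[Tuple[datetime, int]],
                       t_hours: int, x_hours: int,
                       min_count: int, min_total: int) -> bool:
    """Check one account against the two-stage rule (assumed reading, see lead-in)."""
    # Stage 1: the second trigger must fall within T hours after the first trigger.
    first_window_end = address_changed_at + timedelta(hours=t_hours)
    if not (address_changed_at <= phone_changed_at <= first_window_end):
        return False
    # Stage 2: aggregate transactions within X hours after the second trigger.
    second_window_end = phone_changed_at + timedelta(hours=x_hours)
    amounts = [amount for handled_at, amount in transactions
               if phone_changed_at <= handled_at <= second_window_end]
    return len(amounts) >= min_count and sum(amounts) > min_total


# Hypothetical account history.
txs = [(datetime(2023, 1, 8, 13), 100), (datetime(2023, 1, 9, 10), 300),
       (datetime(2023, 1, 11, 9), 1000)]
hit = matches_fused_rule(datetime(2023, 1, 8, 9), datetime(2023, 1, 8, 12), txs,
                         t_hours=24, x_hours=48, min_count=2, min_total=300)
```

Each stage reuses the same pattern as the single-trigger case: the occurrence time of one event becomes the start of the window in which the next condition is evaluated.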
Fig. 4 is a schematic structural diagram of a data screening device according to an embodiment of the present application. As shown in fig. 4, the data screening apparatus 400 includes: the first acquisition module 410, the second acquisition module 420, the third acquisition module 430, and the screening module 440.
The first obtaining module 410 is configured to obtain a preset screening rule, where the screening rule includes a time period and a preset screening condition;
the second obtaining module 420 is configured to obtain a first data table related to the screening condition in the production environment, where the first data table includes first historical service data corresponding to each of a plurality of accounts in an arbitrary time period;
a third obtaining module 430, configured to obtain a second data table related to the screening condition in the production environment, where the second data table includes second historical service data corresponding to each of the plurality of accounts in the arbitrary time period and in the time period before the arbitrary time period;
and the screening module 440 is configured to screen, based on the first data table and the second data table, the target account that satisfies the screening condition in the time period from the first data table by means of primary key association.
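The division into four modules can be pictured as a small pipeline. The sketch below is only one way to express it in code: the class names `ScreeningRule` and `DataScreeningDevice`, the field `number_threshold`, and the callable signatures are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, List


@dataclass(frozen=True)
class ScreeningRule:
    time_period: timedelta   # e.g. one week
    number_threshold: int    # hypothetical encoding of the screening condition


@dataclass
class DataScreeningDevice:
    get_rule: Callable[[], ScreeningRule]                      # first obtaining module 410
    get_first_table: Callable[[ScreeningRule], list]           # second obtaining module 420
    get_second_table: Callable[[ScreeningRule], list]          # third obtaining module 430
    screen: Callable[[ScreeningRule, list, list], List[str]]   # screening module 440

    def run(self) -> List[str]:
        """Run the four modules in order and return the target accounts."""
        rule = self.get_rule()
        first_table = self.get_first_table(rule)
        second_table = self.get_second_table(rule)
        return self.screen(rule, first_table, second_table)


# Stub modules standing in for real data access; purely illustrative.
device = DataScreeningDevice(
    get_rule=lambda: ScreeningRule(timedelta(weeks=1), 2),
    get_first_table=lambda rule: ["A", "B"],
    get_second_table=lambda rule: ["A", "A", "A", "B"],
    screen=lambda rule, first, second: [a for a in first
                                        if second.count(a) > rule.number_threshold],
)
result = device.run()
```

Keeping each module behind a plain callable mirrors the remark that the modules are a logical division only and may be integrated or separated freely.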
It should be noted that the data screening apparatus 400 provided in the embodiment of the present application may perform the data screening method in the foregoing embodiment. The data screening apparatus may be an electronic device or may be configured in an electronic device, so that the volume of data conforming to the screening rule is accurately estimated before the rule is deployed in the production environment. This avoids situations such as the rule matching an excessive volume of data in the production environment, and the resulting risks to data warehouse capacity and business processing.
The electronic device may be any device with computing capability, for example, may be a personal computer, a mobile terminal, a server, etc., and the mobile terminal may be, for example, a vehicle-mounted device, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, etc., which have various operating systems, touch screens, and/or display screens.
In some embodiments, the screening module 440 includes:
the association unit is configured to take the first data table as the main table and associate the second data table through a first primary key used for distinguishing accounts in the first data table, to obtain first target data respectively associated with each primary key value of the first primary key; wherein the first target data includes the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the first primary key, and the service handling time included in the second historical service data in the first target data is within the time period before the service handling time corresponding to the associated primary key value of the first primary key;
and the first screening unit is configured to screen the first target data respectively associated with the primary key values of the first primary key according to the screening condition, to obtain the target accounts in the first data table that meet the screening condition in the time period.
In some embodiments, the screening criteria includes a number of business transactions greater than a number threshold;
the first screening unit is specifically configured to:
for each primary key value of the first primary key, determining, according to the second historical service data in the associated first target data, the number of business transactions corresponding to the primary key value in the time period;
determining, among the primary key values of the first primary key, each primary key value whose corresponding number of business transactions in the time period is greater than the number threshold as a first target primary key value;
and determining the account corresponding to the first target primary key value as a target account meeting the screening condition in the time period.
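The association unit and the first screening unit can be sketched together as below. This is a simplified illustration, not the implementation of the application: tables are plain lists of tuples, the first data table is assumed to hold one row per account, the window is taken to include the first-table record itself, and the names `associate` and `screen_by_count` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def associate(first_table: List[Tuple[str, datetime]],
              second_table: List[Tuple[str, datetime]],
              time_period: timedelta) -> Dict[str, List[datetime]]:
    """Join on the account key, keeping second-table rows in the window before each first-table row."""
    by_account = defaultdict(list)
    for account, handled_at in second_table:
        by_account[account].append(handled_at)
    first_target_data = {}
    for account, anchor in first_table:      # the first data table is the main table
        start = anchor - time_period
        first_target_data[account] = sorted(
            t for t in by_account.get(account, []) if start <= t <= anchor)
    return first_target_data


def screen_by_count(first_target_data: Dict[str, List[datetime]],
                    number_threshold: int) -> List[str]:
    """Keep accounts whose number of transactions exceeds the number threshold."""
    return sorted(a for a, times in first_target_data.items()
                  if len(times) > number_threshold)


first = [("A", datetime(2023, 1, 11, 10)), ("B", datetime(2023, 1, 11, 12))]
second = [("A", datetime(2023, 1, 11, 10)), ("A", datetime(2023, 1, 9, 8)),
          ("A", datetime(2023, 1, 1, 8)), ("A", datetime(2023, 1, 11, 15)),
          ("B", datetime(2023, 1, 11, 12))]
joined = associate(first, second, timedelta(weeks=1))
targets = screen_by_count(joined, number_threshold=1)
```

For account A, the January 1 record lies outside the one-week window and the 15:00 record is later than the first-table record, so two records remain and A passes a threshold of 1; account B has one record and does not.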
In some embodiments, the screening rules further include a trigger event;
the screening module 440 includes:
and the second screening unit is configured to screen out from the first data table, by means of primary key association and based on the first data table and the second data table, the target accounts meeting the screening condition in the time period after the triggering event occurs.
In some embodiments, the second screening unit comprises:
the acquisition subunit is configured to acquire a third data table related to the trigger event in the production environment, wherein the third data table includes the occurrence times of the trigger event corresponding to each of the plurality of accounts in the arbitrary time period and in the time period before the arbitrary time period;
the association subunit is configured to take the first data table as the main table and associate the third data table through the first primary key used for distinguishing accounts in the first data table, to obtain a fourth data table; wherein the fourth data table includes the first historical service data in the first data table whose primary key value is the same as a primary key value of a second primary key used for distinguishing accounts in the third data table, together with a target occurrence time associated with that first historical service data, the target occurrence time being the earliest of the occurrence times of the trigger event in the third data table whose primary key value is the same as the primary key value of the associated first historical service data;
and the screening subunit is configured to screen out from the first data table, by means of primary key association and based on the fourth data table and the second data table, the target accounts meeting the screening condition in the time period after the triggering event occurs.
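Building the fourth data table amounts to an inner join on the account key plus a minimum over the trigger occurrence times. The sketch below assumes tuple-based tables and uses the hypothetical name `build_fourth_table`; it is not taken from the application.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def build_fourth_table(first_table: List[Tuple[str, datetime]],
                       third_table: List[Tuple[str, datetime]]
                       ) -> List[Tuple[str, datetime, datetime]]:
    """Attach the earliest trigger occurrence time to each first-table row with a matching account."""
    earliest: Dict[str, datetime] = {}
    for account, occurred_at in third_table:
        if account not in earliest or occurred_at < earliest[account]:
            earliest[account] = occurred_at   # target occurrence time = earliest occurrence
    # Rows whose account never triggered the event are dropped.
    return [(account, handled_at, earliest[account])
            for account, handled_at in first_table if account in earliest]


first = [("A", datetime(2023, 1, 11, 10)), ("B", datetime(2023, 1, 11, 12)),
         ("C", datetime(2023, 1, 11, 14))]
third = [("A", datetime(2023, 1, 9, 9)), ("A", datetime(2023, 1, 8, 9)),
         ("C", datetime(2023, 1, 10, 9))]
fourth = build_fourth_table(first, third)
```

Account B has no address change in the third data table and therefore does not appear in the fourth data table; account A carries its earlier change time of January 8.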
In some embodiments, the screening subunit is specifically configured to:
taking the fourth data table as the main table, and associating the second data table through a third primary key used for distinguishing accounts in the fourth data table, to obtain second target data respectively associated with each primary key value of the third primary key; wherein the second target data includes the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the third primary key, and the service handling time included in the second historical service data in the second target data is before the service handling time corresponding to the associated primary key value of the third primary key and within the time period after the target occurrence time;
and screening the second target data respectively associated with the primary key values of the third primary key according to the screening condition, to obtain the target accounts in the first data table that meet the screening condition in the time period after the triggering event occurs.
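The association performed by the screening subunit applies two time bounds at once: a record must fall within the time period after the target occurrence time and must not be later than the handling time of the fourth-table row. A minimal sketch follows; the name `associate_after_trigger`, the tuple layouts, and the inclusive boundaries are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def associate_after_trigger(fourth_table: List[Tuple[str, datetime, datetime]],
                            second_table: List[Tuple[str, datetime, int]],
                            time_period: timedelta
                            ) -> Dict[str, List[Tuple[datetime, int]]]:
    """Collect, per account, the second-table records inside the post-trigger window."""
    second_target_data = {}
    for account, handled_at, target_time in fourth_table:   # fourth table is the main table
        window_end = target_time + time_period
        second_target_data[account] = [
            (t, amount) for acc, t, amount in second_table
            if acc == account
            and target_time <= t <= window_end   # within the time period after the trigger
            and t <= handled_at                  # not later than the fourth-table record
        ]
    return second_target_data


fourth = [("A", datetime(2023, 1, 11, 10), datetime(2023, 1, 8, 9))]
second = [("A", datetime(2023, 1, 7, 9), 50), ("A", datetime(2023, 1, 9, 9), 100),
          ("A", datetime(2023, 1, 11, 10), 200), ("A", datetime(2023, 1, 11, 18), 400),
          ("B", datetime(2023, 1, 9, 9), 999)]
data = associate_after_trigger(fourth, second, timedelta(weeks=1))
```

The January 7 record precedes the trigger and the 18:00 record is later than the fourth-table row, so only two of account A's records survive the association.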
In some embodiments, the screening criteria include a number of business transactions greater than a number threshold and a business amount greater than an amount threshold;
a screening subunit further configured to:
for each primary key value of the third primary key, determining, according to the second historical service data in the associated second target data, the number of business transactions and the business amount corresponding to the primary key value in the time period;
determining, among the primary key values of the third primary key, each primary key value whose corresponding number of business transactions in the time period is greater than the number threshold and whose corresponding business amount is greater than the amount threshold as a second target primary key value;
and determining the account corresponding to the second target primary key value as a target account meeting the screening condition in a time period after the triggering event occurs.
The data screening device provided in the embodiment of the present application may be used to execute the technical scheme of the data screening method in the above embodiment, and its implementation principle and technical effect are similar, and are not described herein again.
The data screening device provided in the embodiment of the present application acquires a preset screening rule that includes a time period and a preset screening condition; acquires a first data table related to the screening condition in a production environment, where the first data table includes first historical service data corresponding to each of a plurality of accounts in an arbitrary time period; acquires a second data table related to the screening condition in the production environment, where the second data table includes second historical service data corresponding to each of the plurality of accounts in the arbitrary time period and in the time period before the arbitrary time period; and, based on the first data table and the second data table, screens out from the first data table, by means of primary key association, the target accounts meeting the screening condition in the time period. In this way, the volume of data conforming to the screening rule can be accurately estimated before the rule is deployed in the production environment, which avoids situations such as the rule matching an excessive volume of data in the production environment and the resulting risks to data warehouse capacity and business processing.
It should be noted that the division of the above apparatus into modules is merely a division by logical function; in an actual implementation, the modules may be fully or partially integrated into one physical entity, or may be physically separate. The modules may all be implemented as software invoked by a processing element, or may all be implemented in hardware; alternatively, some modules may be implemented as software invoked by a processing element while others are implemented in hardware. For example, the screening module 440 may be a separately provided processing element, may be integrated into a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code that a processing element of the apparatus calls to execute the functions of the screening module 440. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in the processing element or by instructions in software form.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device may include: a transceiver 121, a processor 122, a memory 123.
The transceiver 121 may be used to acquire a task to be run and configuration information of the task to be run.
The system bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus may be classified into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or only one type of bus. The transceiver is used to enable communication between the database access device and other computers (e.g., clients, read-write libraries, and read-only libraries). The memory may include random access memory (RAM) and may also include non-volatile memory.
The electronic device provided in the embodiment of the present application may be a terminal device in the above embodiment.
The embodiment of the application also provides a chip for running the instruction, and the chip is used for executing the technical scheme of the data screening method in the embodiment.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores computer instructions, and when the computer instructions run on a computer, the computer is caused to execute the technical scheme of the data screening method of the embodiment.
The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program is stored in a computer readable storage medium, the computer program can be read from the computer readable storage medium by at least one processor, and the technical scheme of the data screening method in the embodiment can be realized when the computer program is executed by the at least one processor.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Claims (17)
1. A method of data screening comprising:
acquiring a preset screening rule, wherein the screening rule comprises a time period and preset screening conditions;
acquiring a first data table related to the screening condition in a production environment, wherein the first data table comprises first historical service data corresponding to each of a plurality of accounts in an arbitrary time period;
acquiring a second data table related to the screening condition in the production environment, wherein the second data table comprises second historical service data corresponding to each of the plurality of accounts in the arbitrary time period and in the time period before the arbitrary time period;
and screening target accounts meeting the screening conditions in the time period from the first data table in a primary key association mode based on the first data table and the second data table.
2. The method according to claim 1, wherein the screening the target account satisfying the screening condition in the time period from the first data table by a primary key association method based on the first data table and the second data table includes:
taking the first data table as a main table, and associating the second data table through a first primary key used for distinguishing accounts in the first data table, to obtain first target data respectively associated with each primary key value of the first primary key; wherein the first target data comprises the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the first primary key, and the service handling time included in the second historical service data in the first target data is within the time period before the service handling time corresponding to the associated primary key value of the first primary key;
and screening the first target data respectively associated with the primary key values of the first primary key according to the screening conditions to obtain target accounts meeting the screening conditions in the time period in the first data table.
3. The method of claim 2, wherein the screening criteria includes a number of business transactions greater than a number threshold;
screening the first target data respectively associated with the primary key values of the first primary key according to the screening conditions to obtain target accounts meeting the screening conditions in the time period in the first data table, wherein the screening comprises the following steps:
for each primary key value of the first primary key, determining, according to the second historical service data in the associated first target data, the number of business transactions corresponding to the primary key value in the time period;
determining, among the primary key values of the first primary key, each primary key value whose corresponding number of business transactions in the time period is greater than the number threshold as a first target primary key value;
and determining the account corresponding to the first target primary key value as a target account meeting the screening condition in the time period.
4. The method of claim 1, wherein the screening rule further comprises a trigger event;
the screening, based on the first data table and the second data table, the target account meeting the screening condition in the time period from the first data table in a primary key association manner includes:
and screening target accounts meeting the screening conditions in the time period after the triggering event occurs from the first data table in a primary key association mode based on the first data table and the second data table.
5. The method of claim 4, wherein the screening, based on the first data table and the second data table, the target account satisfying the screening condition in the time period after the trigger event occurs from the first data table by means of primary key association, includes:
acquiring a third data table related to the trigger event in the production environment, wherein the third data table comprises the occurrence times of the trigger event corresponding to each of the plurality of accounts in the arbitrary time period and in the time period before the arbitrary time period;
taking the first data table as a main table, and associating the third data table through a first primary key used for distinguishing accounts in the first data table, to obtain a fourth data table; wherein the fourth data table comprises the first historical service data in the first data table whose primary key value is the same as a primary key value of a second primary key used for distinguishing accounts in the third data table, together with a target occurrence time associated with that first historical service data, the target occurrence time being the earliest of the occurrence times of the trigger event in the third data table whose primary key value is the same as the primary key value of the associated first historical service data;
and screening target accounts meeting the screening conditions in the time period after the triggering event occurs from the first data table in a primary key association mode based on the fourth data table and the second data table.
6. The method according to claim 5, wherein the screening the target account satisfying the screening condition in the time period after the trigger event occurs from the first data table by a primary key association method based on the fourth data table and the second data table comprises:
taking the fourth data table as a main table, and associating the second data table through a third primary key used for distinguishing accounts in the fourth data table, to obtain second target data respectively associated with each primary key value of the third primary key; wherein the second target data comprises the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the third primary key, and the service handling time included in the second historical service data in the second target data is before the service handling time corresponding to the associated primary key value of the third primary key and within the time period after the target occurrence time;
and screening the second target data respectively associated with the primary key values of the third primary key according to the screening conditions to obtain target accounts which meet the screening conditions in the time period after the trigger event occurs in the first data table.
7. The method of claim 6, wherein the screening criteria includes a number of business transactions greater than a number threshold and a business amount greater than an amount threshold;
screening the second target data respectively associated with the primary key values of the third primary key according to the screening condition to obtain a target account meeting the screening condition in the time period after the trigger event occurs in the first data table, wherein the screening method comprises the following steps:
for each primary key value of the third primary key, determining, according to the second historical service data in the associated second target data, the number of business transactions and the business amount corresponding to the primary key value in the time period after the triggering event occurs;
determining, among the primary key values of the third primary key, each primary key value whose corresponding number of business transactions in the time period after the triggering event occurs is greater than the number threshold and whose corresponding business amount is greater than the amount threshold as a second target primary key value;
and determining the account corresponding to the second target primary key value as a target account meeting the screening condition in the time period after the triggering event occurs.
8. A data screening apparatus, comprising:
the first acquisition module is used for acquiring a preset screening rule, wherein the screening rule comprises a time period and preset screening conditions;
a second acquisition module, configured to acquire a first data table related to the screening condition in a production environment, wherein the first data table comprises first historical service data corresponding to each of a plurality of accounts in an arbitrary time period;
a third acquisition module, configured to acquire a second data table related to the screening condition in the production environment, wherein the second data table comprises second historical service data corresponding to each of the plurality of accounts in the arbitrary time period and in the time period before the arbitrary time period;
and a screening module, configured to screen out from the first data table, by means of primary key association and based on the first data table and the second data table, the target accounts meeting the screening condition in the time period.
9. The apparatus of claim 8, wherein the screening module comprises:
an association unit, configured to take the first data table as a main table and associate the second data table through a first primary key used for distinguishing accounts in the first data table, to obtain first target data respectively associated with each primary key value of the first primary key; wherein the first target data comprises the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the first primary key, and the service handling time included in the second historical service data in the first target data is within the time period before the service handling time corresponding to the associated primary key value of the first primary key;
and a first screening unit, configured to screen the first target data respectively associated with the primary key values of the first primary key according to the screening condition, to obtain the target accounts in the first data table that meet the screening condition in the time period.
10. The apparatus of claim 9, wherein the screening criteria includes a number of business transactions greater than a number threshold;
the first screening unit is specifically configured to:
for each primary key value of the first primary key, determining, according to the second historical service data in the associated first target data, the number of business transactions corresponding to the primary key value in the time period;
determining, among the primary key values of the first primary key, each primary key value whose corresponding number of business transactions in the time period is greater than the number threshold as a first target primary key value;
and determining the account corresponding to the first target primary key value as a target account meeting the screening condition in the time period.
11. The apparatus of claim 8, wherein the screening rule further comprises a trigger event;
the screening module comprises:
a second screening unit, configured to screen out from the first data table, by means of primary key association and based on the first data table and the second data table, the target accounts meeting the screening condition in the time period after the triggering event occurs.
12. The apparatus of claim 11, the second screening unit comprising:
an obtaining subunit, configured to obtain a third data table related to the trigger event in the production environment, wherein the third data table comprises the occurrence times of the trigger event corresponding to each of the plurality of accounts in the arbitrary time period and in the time period before the arbitrary time period;
an association subunit, configured to take the first data table as a main table and associate the third data table through a first primary key used for distinguishing accounts in the first data table, to obtain a fourth data table; wherein the fourth data table comprises the first historical service data in the first data table whose primary key value is the same as a primary key value of a second primary key used for distinguishing accounts in the third data table, together with a target occurrence time associated with that first historical service data, the target occurrence time being the earliest of the occurrence times of the trigger event in the third data table whose primary key value is the same as the primary key value of the associated first historical service data;
and a screening subunit, configured to screen out from the first data table, by means of primary key association and based on the fourth data table and the second data table, the target accounts meeting the screening condition in the time period after the trigger event occurs.
13. The apparatus of claim 12, wherein the screening subunit is specifically configured to:
taking the fourth data table as a main table, and associating the second data table through a third primary key used for distinguishing accounts in the fourth data table, to obtain second target data respectively associated with each primary key value of the third primary key; wherein the second target data comprises the second historical service data in the second data table whose primary key value is the same as the associated primary key value of the third primary key, and the service handling time included in the second historical service data in the second target data is before the service handling time corresponding to the associated primary key value of the third primary key and within the time period after the target occurrence time;
and screening the second target data respectively associated with the primary key values of the third primary key according to the screening conditions to obtain target accounts which meet the screening conditions in the time period after the trigger event occurs in the first data table.
14. The apparatus of claim 13, wherein the screening condition comprises the number of service handlings being greater than a count threshold and the service amount being greater than an amount threshold;
the screening subunit is further configured to:
for each primary key value of the third primary key, determine the number of service handlings and the corresponding service amount for that primary key value within the time period, according to the second historical service data in the associated second target data;
determine, among the primary key values of the third primary key, each primary key value whose number of service handlings within the time period is greater than the count threshold and whose service amount is greater than the amount threshold as a second target primary key value;
and determine the account corresponding to the second target primary key value as a target account that meets the screening condition within the time period after the trigger event occurs.
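The threshold screening of claim 14 can be illustrated with a short sketch. It assumes, hypothetically, that the per-account second target data is a dictionary mapping an account key to its records, that each record carries an `amount` field, and that the "service amount" is the sum of those amounts over the time period; the function name, thresholds, and sample values are illustrative only.

```python
def screen_target_accounts(second_target_data, count_threshold, amount_threshold):
    """Return account keys whose handling count and total amount both
    strictly exceed their thresholds."""
    targets = []
    for key, records in second_target_data.items():
        handling_count = len(records)
        # Assumption: the service amount is the sum of the record amounts.
        total_amount = sum(rec["amount"] for rec in records)
        if handling_count > count_threshold and total_amount > amount_threshold:
            targets.append(key)
    return targets


# Hypothetical per-account records already restricted to the time period.
sample_target_data = {
    "A": [{"amount": 300.0}, {"amount": 250.0}, {"amount": 500.0}],
    "B": [{"amount": 5000.0}],
}
target_accounts = screen_target_accounts(
    sample_target_data, count_threshold=2, amount_threshold=1000.0
)
```

Account "B" has a large amount but only one handling, so it fails the count test; both conditions must hold together for an account to be selected.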
15. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1-7.
16. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method of any one of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310261949.6A CN116362750A (en) | 2023-03-16 | 2023-03-16 | Data screening method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310261949.6A CN116362750A (en) | 2023-03-16 | 2023-03-16 | Data screening method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116362750A (en) | 2023-06-30 |
Family
ID=86934936
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310261949.6A Pending CN116362750A (en) | 2023-03-16 | 2023-03-16 | Data screening method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116362750A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117910969A (en) * | 2024-01-24 | 2024-04-19 | 宁波一起益企科技有限公司 | Policy project collaborative office-based flow information management method and system |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107886414B (en) | Order combination method and equipment and computer storage medium | |
CN110188990B (en) | Resource request and fund request splitting method, device and equipment | |
CN108134944B (en) | Identification method and device for anchor user with abnormal income and electronic equipment | |
CN116362750A (en) | Data screening method and device, electronic equipment and storage medium | |
CN114022151A (en) | Block chain data visualization method and system, electronic device and storage medium | |
CN107688959B (en) | Breakpoint list processing method, storage medium and server | |
CN113282460A (en) | Distributed alarm system, method and device | |
CN112596985A (en) | IT asset detection method, device, equipment and medium | |
CN111428197A (en) | Data processing method, device and equipment | |
CN112631808B (en) | Data synchronization method, device, electronic equipment and storage medium | |
CN111951011B (en) | Monitoring system threshold value determining method and device | |
CN109146122A (en) | A kind of probability forecasting method, device, electronic equipment and computer storage medium | |
CN114169451A (en) | Behavior data classification processing method, device, equipment and storage medium | |
CN110458707B (en) | Behavior evaluation method and device based on classification model and terminal equipment | |
CN110148044B (en) | Method and device for setting buffering threshold for accounting | |
CN113836130A (en) | Data quality evaluation method, device, equipment and storage medium | |
CN108694648B (en) | Article interaction method, article identification registration method, system, device and storage medium | |
CN116382924B (en) | Recommendation method and device for resource allocation, electronic equipment and storage medium | |
CN110851441A (en) | Data storage method and related equipment | |
CN112559805A (en) | Index optimization method and device | |
CN110891097A (en) | Cross-device user identification method and device | |
CN117234929A (en) | Method and device for generating test scene, electronic equipment and storage medium | |
CN117764327A (en) | Service application dispatching method and device | |
CN112115132B (en) | Data association method, device, equipment and storage medium | |
CN116228308A (en) | Resource pushing method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |