CN114116802A - Data processing method, device, equipment and storage medium of Flink computing framework - Google Patents

Data processing method, device, equipment and storage medium of Flink computing framework

Info

Publication number
CN114116802A
Authority
CN
China
Prior art keywords
data stream
processed
verification
processing
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111420013.0A
Other languages
Chinese (zh)
Inventor
汪照阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Xingyun Digital Technology Co Ltd
Original Assignee
Nanjing Xingyun Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Xingyun Digital Technology Co Ltd
Priority to CN202111420013.0A
Publication of CN114116802A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a data processing method and device of a Flink computing framework, computer equipment and a storage medium. The method comprises the following steps: acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream; distributing the data stream to be processed to corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework; and carrying out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result. In the application, the user behavior data stream and the transaction order data stream are analyzed in real time based on the Flink computing framework, which improves the reliability of data verification; parallel processing of the data to be processed is achieved through the plurality of processing windows in the Flink computing framework, which improves the efficiency of data processing.

Description

Data processing method, device, equipment and storage medium of Flink computing framework
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method, an apparatus, a device, and a storage medium for a Flink computation framework.
Background
At present, as online shopping becomes more common, fraud risk increases as well. Fraudsters continually refine their methods and try to avoid detection. Many users register the same account name and password on several websites, so information leaked from one website can be used by hackers for credential stuffing against other websites, and a successful attempt affects transactions. For example, after stealing accounts, criminals frequently use the stolen accounts to cash out, or trick customers into disclosing information in order to cash out, and profit from the price difference. Risk checking of the various online behaviors of users is therefore an important issue faced by businesses.
In the conventional technology, offline behavior data of a user is generally collected, a model is trained on that offline data, and data analysis is performed with the trained model to check the risk of the user's behavior. However, the detection accuracy of this scheme is not high.
Disclosure of Invention
In view of the above, it is necessary to provide a data processing method and apparatus, a computer device, and a storage medium for a Flink computation framework.
A data processing method of a Flink computing framework comprises the following steps:
acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream;
distributing the data streams to be processed to corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework;
and carrying out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result.
In one embodiment, the acquiring the data stream to be processed in real time includes:
collecting data flow to be processed in real time;
classifying the data stream to be processed into different partitions of a preset message queue according to the data stream type of the data stream to be processed;
and acquiring the data stream to be processed from different partitions of a preset message queue.
In one embodiment, the allocating the data streams to be processed to the corresponding processing windows according to a plurality of processing windows preset in the Flink computation framework includes:
and distributing the data stream to be processed to the corresponding processing window according to the window range description information of each processing window and the data stream type of the data stream to be processed, wherein the window range description information is used for describing the data stream type processed by each processing window.
In one embodiment, the risk verification policy includes a verification condition, and performing parallel verification on the data stream to be processed and the transaction order data stream in each processing window according to the preconfigured risk verification policy to obtain a verification result includes:
according to the window function of each processing window, retrieving in parallel, from a preset Redis cache, the risk verification strategy corresponding to the data stream to be processed in each processing window;
extracting metadata from the data stream to be processed in the processing window;
and when the metadata meets the verification condition in the risk verification strategy, generating a verification result that marks the data stream to be processed as abnormal data, so that the data streams to be processed in the processing windows are verified in parallel.
In one embodiment, the verification condition includes a plurality of conditions, and the method further includes:
when the data stream to be processed meets the preset number of verification conditions, risk prompt information is generated;
and pushing the risk prompt to a preset target terminal for display.
In one embodiment, the method further includes:
storing the checking result into a preset database;
and reading the verification result in the database within the preset time period in real time, and sending the verification result within the preset time period to the target terminal for displaying.
In one embodiment, the method further includes:
receiving a request, sent by a terminal, to add a risk verification strategy;
generating a new risk verification strategy according to the new verification condition in the add request;
and writing the new risk verification strategy into the Redis cache.
A data processing apparatus of a Flink computation framework, the apparatus comprising:
the acquisition module is used for acquiring data streams to be processed in real time, wherein the data streams to be processed comprise user behavior data streams and transaction order data streams;
the distribution module is used for distributing the data streams to be processed to the corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework;
and the checking module is used for performing parallel checking on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk checking strategy to obtain a checking result.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream;
distributing the data streams to be processed to corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework;
and carrying out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream;
distributing the data streams to be processed to corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework;
and carrying out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result.
According to the data processing method and device of the Flink computing framework, the computer equipment and the storage medium, the data stream to be processed is obtained in real time and comprises a user behavior data stream and a transaction order data stream; the data streams to be processed are distributed to corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework; and parallel verification is carried out on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result. In the application, the user behavior data stream and the transaction order data stream are analyzed in real time based on the Flink computing framework, which improves the reliability of data verification; parallel processing of the data to be processed is achieved through the plurality of processing windows in the Flink computing framework, which improves the efficiency of data processing.
Drawings
FIG. 1 is a diagram of an application environment of a data processing method of the Flink computing framework in one embodiment;
FIG. 2 is a flow diagram illustrating a data processing method of the Flink computing framework in one embodiment;
FIG. 3 is a diagram illustrating an application environment of a data processing method of the Flink computing framework in one embodiment;
FIG. 4 is a flow diagram of the calculation of the Flink calculation framework in another embodiment;
FIG. 5 is a block diagram of a data processing apparatus of the Flink computing framework in one embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Referring to fig. 1, fig. 1 is a schematic application environment diagram of a data processing method of a Flink computing framework according to an exemplary embodiment of the present application. As shown in fig. 1, the application environment includes a server 100 and a terminal 101, and the server 100 and the terminal 101 can be communicatively connected through a network 102 to implement the data processing method of the Flink computing framework of the present application.
The server 100 is configured to obtain a to-be-processed data stream in real time, where the to-be-processed data stream includes a user behavior data stream and a transaction order data stream; to distribute the data streams to be processed to corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework; and to carry out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result. The server 100 is further configured to store the verification result in a preset database, read the verification results within a preset time period from the database in real time, and send the verification results within the preset time period to the target terminal 101 for display. The server 100 may be implemented as an independent server or a server cluster composed of a plurality of servers.
The terminal 101 is configured to receive the verification results within the preset time period pushed by the server 100, format them, and display them on a user interface of the terminal, for example in the form of a chart. The terminal 101 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
The network 102 is used for network connection between the terminal 101 and the server 100, and in particular, the network 102 may include various types of wired or wireless networks.
In one embodiment, as shown in fig. 2, a data processing method of a Flink computing framework is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
S11: acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream.
In this application, the data stream to be processed may include a user behavior data stream and a transaction order data stream. The user behavior data stream may include a user click data stream, a user access data stream, a user login data stream, and the like. The metadata in the user behavior data stream may include the user's account information, shipping address, recipient mobile phone number, device number used, IP address, and the like. The metadata in the transaction order data stream may include the buyer member number, purchase channel code (for example cash, app, bank card, or credit card), payer member type, payment time, payment completion time, order code, order amount, sub-order amount, product code, seller merchant number, transaction initiator merchant number, commodity type, payer merchant number, payer member number, payment method, merchant order details, and order details.
In one embodiment, the acquiring the data stream to be processed in real time may include:
collecting data flow to be processed in real time;
classifying the data stream to be processed into different partitions of a preset message queue according to the data stream type of the data stream to be processed;
and acquiring the data stream to be processed from different partitions of a preset message queue.
In this application, the message queue may be a Kafka message queue. Multiple partitions are arranged in the Kafka message queue and are used for storing the data streams to be processed. The data stream types may include a user click data stream, a user access data stream, a user login data stream, and a transaction order data stream. The data streams to be processed are stored in different partitions according to their data stream types. This makes it easier for the subsequent Flink computing framework to consume the data streams to be processed from the message queue and improves data processing efficiency.
In the application, event-tracking (instrumentation) technology can be used to collect the user behavior data streams (including access, click, exposure, login, and the like) generated by a user on APP pages, modules, and content slots, together with the user's basic information, such as account information, shipping address, recipient mobile phone number, transaction order information, and payment information. The collected data is pushed to the Kafka message queue platform in real time. The Flink streaming computation framework subscribes to and consumes the relevant data streams to be processed, cleans and processes them, and handles the real-time streaming data classified by data attribute.
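The routing of records into partitions by data stream type can be illustrated with a minimal sketch. It uses an in-memory stand-in rather than a real Kafka client, and the class name `PartitionedQueue`, the stream type names, and the record fields are assumptions made for illustration only; they are not taken from the application.

```python
from collections import defaultdict, deque

# Hypothetical identifiers for the four stream types named in the text.
STREAM_TYPES = ("user_click", "user_access", "user_login", "transaction_order")


class PartitionedQueue:
    """In-memory stand-in for a message queue with one partition per stream type."""

    def __init__(self, stream_types):
        # Fixed mapping: stream type -> partition number.
        self._index = {name: i for i, name in enumerate(stream_types)}
        self._partitions = defaultdict(deque)

    def partition_for(self, stream_type):
        # Records of the same type always land in the same partition.
        return self._index[stream_type]

    def publish(self, record):
        partition = self.partition_for(record["stream_type"])
        self._partitions[partition].append(record)

    def consume(self, partition):
        # Drain one partition in arrival order.
        pending = self._partitions[partition]
        while pending:
            yield pending.popleft()


queue = PartitionedQueue(STREAM_TYPES)
queue.publish({"stream_type": "user_login", "account": "u1"})
queue.publish({"stream_type": "transaction_order", "order_code": "o1"})
queue.publish({"stream_type": "user_login", "account": "u2"})
logins = list(queue.consume(queue.partition_for("user_login")))
```

In this sketch a consumer that reads one partition sees only one data stream type, which is the property the text relies on to simplify downstream consumption.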
S12: distributing the data streams to be processed to corresponding processing windows according to a plurality of processing windows preset in the Flink computing framework.
In this application, the Flink computation framework may be a distributed data processing system, and is used for parallel processing of real-time data streams. In the present application, a plurality of processing windows are defined in advance, and window range description information of each processing window is set. The window range description information is used for describing the type of the data stream processed by each processing window. Further, the present application may further specifically define a time trigger of each processing window, where the time trigger is used to describe the start running time, the end running time, and the calculation frequency of each processing window.
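One way to picture a processing window definition with its window range description information and time trigger is the sketch below. It is not Flink API code; the names `TimeTrigger` and `ProcessingWindow`, the field names, and the example times are hypothetical and serve only to show the three trigger attributes named above (start running time, end running time, calculation frequency).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimeTrigger:
    start: datetime    # start running time
    end: datetime      # end running time
    every: timedelta   # calculation frequency

    def __post_init__(self):
        if self.every <= timedelta(0):
            raise ValueError("calculation frequency must be positive")

    def fire_times(self):
        """Yield each calculation time from start (inclusive) to end (exclusive)."""
        current = self.start
        while current < self.end:
            yield current
            current += self.every


@dataclass(frozen=True)
class ProcessingWindow:
    name: str
    stream_type: str      # window range description: the stream type handled
    trigger: TimeTrigger


# Example values are arbitrary and chosen only for illustration.
trigger = TimeTrigger(
    start=datetime(2024, 1, 1, 0, 0),
    end=datetime(2024, 1, 1, 0, 5),
    every=timedelta(minutes=1),
)
window = ProcessingWindow("window_1", "transaction_order", trigger)
times = list(window.trigger.fire_times())
```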
In one embodiment, the allocating the data streams to be processed to the corresponding processing windows according to a plurality of processing windows preset in the Flink computation framework may include:
and distributing the data stream to be processed to the corresponding processing window according to the window range description information of each processing window and the data stream type of the data stream to be processed, wherein the window range description information is used for describing the data stream type processed by each processing window.
In the present application, the data stream to be processed is classified in advance, and specifically, the data stream to be processed may be classified into a plurality of data streams according to the data stream type of the data stream to be processed. For example, it can be divided into a user click data stream, a user access data stream, a user login data stream, and a transaction order data stream.
Further, the window range description information of each processing window describes the type of data stream processed by that processing window. Therefore, the window range description information can be matched against the data stream types to obtain the data streams to be processed that correspond to each processing window. For example, the following may be preset: processing window 1, whose window range description information indicates that it processes the transaction order data stream; processing window 2, whose window range description information indicates that it processes the user access data stream; and processing window 3, whose window range description information indicates that it processes the user click data stream. According to the window range description information, the transaction order data stream in the data stream to be processed is allocated to processing window 1, the user access data stream is allocated to processing window 2, and the user click data stream is allocated to processing window 3.
According to the data processing method and device, the data in the data stream to be processed are classified and distributed to different processing windows for processing, and therefore the data processing efficiency is improved.
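The matching of data stream types against window range description information can be sketched as follows. This is plain Python rather than Flink code, and the window names, stream type identifiers, and the `allocate` helper are illustrative assumptions. The treatment of records with no matching window is also an assumption, since the text does not say what happens to them.

```python
from collections import defaultdict

# Hypothetical window range descriptions: window name -> stream type it processes.
WINDOW_RANGES = {
    "window_1": "transaction_order",
    "window_2": "user_access",
    "window_3": "user_click",
}


def allocate(records, window_ranges):
    """Assign each record to the window whose description matches its stream type."""
    window_by_type = {stream_type: name for name, stream_type in window_ranges.items()}
    assigned = defaultdict(list)
    unassigned = []  # assumption: records with no matching window are set aside
    for record in records:
        window = window_by_type.get(record["stream_type"])
        if window is None:
            unassigned.append(record)
        else:
            assigned[window].append(record)
    return dict(assigned), unassigned


records = [
    {"stream_type": "transaction_order", "order_code": "o1"},
    {"stream_type": "user_click", "account": "u1"},
    {"stream_type": "user_login", "account": "u1"},
    {"stream_type": "transaction_order", "order_code": "o2"},
]
assigned, unassigned = allocate(records, WINDOW_RANGES)
```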
S13: performing parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result.
In the application, the Flink computation framework is used to consume the data stream to be processed from the message queue and to clean and preprocess it; it is also used to connect to a database, read and write data indicators or intermediate results in real time, and perform calculation and storage operations; and it is also used to call or embed a risk verification strategy and to filter or compute on the data.
In this application, the risk verification policy may include a verification item and a corresponding verification condition. Specifically, the check items may include, but are not limited to, the following items:
(1) judging whether the user places the order from a commonly used IP (Internet Protocol) address;
(2) judging whether the user places the order on a commonly used device;
(3) judging whether the user's shipping address is a commonly used shipping address;
(4) judging whether the user's recipient mobile phone number is a commonly used mobile phone number;
(5) judging whether the user has recently modified the login password;
(6) judging whether the user has recently modified the payment password;
(7) judging whether the user has recently modified the mobile phone number;
(8) judging whether the order is paid by cash on delivery;
(9) judging whether the quantity of commodities at a specified price in the order meets a threshold value;
(10) judging whether the price of commodities of a specified category in the order meets a threshold value;
(11) judging whether the total value of the order reaches a certain threshold value.
Further, the check result may include information of whether the current data stream to be processed is successfully checked and details of the current data stream to be processed. For example, if the current data stream to be processed is a transaction order data stream, the verification result may include information on whether the current transaction order is successfully verified and order details of the current transaction order. The order detail information may include information such as an order code, payment time, payment completion time, an order amount, a product code, a seller merchant number, a transaction initiator merchant number, a commodity type, a payer member number, a payment method, merchant order details, and order details. If the verification fails, the verification result may include notification information that the current trade order is an abnormal order, and the like.
In one embodiment, the risk verification policy further includes a verification condition. The parallel verification of the to-be-processed data stream and the transaction order data stream in each processing window according to the preconfigured risk verification policy to obtain a verification result may include:
according to the window function of each processing window, retrieving in parallel, from a preset Redis cache, the risk verification strategy corresponding to the data stream to be processed in each processing window;
extracting metadata from the data stream to be processed in the processing window;
and when the metadata meets the verification condition in the risk verification strategy, generating a verification result that marks the data stream to be processed as abnormal data, so that the data streams to be processed in the processing windows are verified in parallel.
In this application, the verification condition may include:
when the user does not place the order from a commonly used IP address, determining that the current data stream to be processed meets the verification condition;
when the user does not place the order on a commonly used device, determining that the current data stream to be processed meets the verification condition;
when the user's shipping address is not a commonly used shipping address, determining that the current data stream to be processed meets the verification condition; and the like.
For example, when "judging whether the user's shipping address is a commonly used shipping address" is set as the check item and it is determined that the user's shipping address is not a commonly used shipping address, the verification condition is determined to be satisfied and the current order is regarded as an abnormal order. By default, the system stores the detail data under the current order number in the database as the verification result, and business personnel can filter and view the desired fields, such as the order number, shipping address, user name, order time, device number, and IP address, through the risk control management platform. Further, to judge whether the user's shipping address is a commonly used shipping address, the server needs the user's history information, such as the user's commonly used shipping addresses. Therefore, the server needs to store the user's history information in the Redis cache in advance so that it can be retrieved. For example, when the server processes the transaction order data stream, it relies on the user's historical data, such as the mobile phone number and shipping address, and this historical data needs to be stored in Redis in advance.
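The evaluation of verification conditions against a user's history can be sketched as below. A plain dictionary stands in for the Redis cache, and the key names, condition names, record fields, and the choice to treat a user with no stored history as having no commonly used values are all assumptions made for illustration.

```python
# Dictionary standing in for the Redis cache of per-user history (hypothetical layout).
HISTORY_CACHE = {
    "u1": {
        "common_ips": {"10.0.0.1"},
        "common_devices": {"device-a"},
        "common_addresses": {"1 Example Road"},
    }
}

# Assumption: a user without stored history has no commonly used values.
EMPTY_HISTORY = {
    "common_ips": frozenset(),
    "common_devices": frozenset(),
    "common_addresses": frozenset(),
}

# Each verification condition returns True when the order looks abnormal.
CONDITIONS = {
    "not_common_ip": lambda order, history: order["ip"] not in history["common_ips"],
    "not_common_device": lambda order, history: order["device"] not in history["common_devices"],
    "not_common_address": lambda order, history: order["address"] not in history["common_addresses"],
}


def verify(order, cache, conditions=CONDITIONS):
    """Return a verification result listing every condition the order meets."""
    history = cache.get(order["account"], EMPTY_HISTORY)
    met = [name for name, check in conditions.items() if check(order, history)]
    return {
        "order_code": order["order_code"],
        "abnormal": bool(met),
        "conditions_met": met,
    }


usual = {"account": "u1", "order_code": "o1", "ip": "10.0.0.1",
         "device": "device-a", "address": "1 Example Road"}
odd_address = dict(usual, order_code="o2", address="9 Other Street")
result_usual = verify(usual, HISTORY_CACHE)
result_odd = verify(odd_address, HISTORY_CACHE)
```

Here `result_odd` is flagged as abnormal because only the shipping address falls outside the user's history, which mirrors the shipping address example in the text.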
Different verification conditions are set for different verification items in the application. Specifically, the check items and the corresponding check conditions may be set according to actual service requirements, and are not specifically limited herein. The risk verification strategy may further include a pre-trained verification model.
The risk verification strategy in the application is stored in a preset Redis cache so that the Flink computing framework can read and write it in real time. In the application, the data processing of the processing windows is carried out concurrently, and processing in several processes at the same time improves data processing efficiency. By combining the Flink computing framework with the database, the application processes real-time data streams and detects the user behavior data stream and the transaction order data stream in real time, which improves the accuracy and soundness of data detection.
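The concurrent handling of several processing windows can be approximated with the sketch below. A thread pool from the Python standard library stands in for the parallelism that a stream processing framework would provide, so this shows the idea rather than the application's implementation. The helper names, the window contents, and the example predicate (an order amount above a hypothetical limit) are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def verify_window(window_name, records, is_abnormal):
    """Verify one window's records and return the abnormal ones."""
    return window_name, [record for record in records if is_abnormal(record)]


def verify_all(windows, is_abnormal, max_workers=4):
    """Run verification for every window concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(verify_window, name, records, is_abnormal)
            for name, records in windows.items()
        ]
        # Each future resolves to a (window_name, abnormal_records) pair.
        return dict(future.result() for future in futures)


windows = {
    "window_1": [
        {"order_code": "o1", "amount": 50},
        {"order_code": "o2", "amount": 5000},
    ],
    "window_2": [{"order_code": "o3", "amount": 10}],
}
# Hypothetical condition: an order amount above 1000 counts as abnormal.
abnormal_by_window = verify_all(windows, lambda record: record["amount"] > 1000)
```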
In one embodiment, the verification condition includes a plurality of conditions, and the method further includes:
when the data stream to be processed meets the preset number of verification conditions, risk prompt information is generated;
and pushing the risk prompt to a preset target terminal for display.
In this application, the risk verification policy may include a plurality of verification items and corresponding verification conditions. And when the data stream to be processed simultaneously meets the preset number of verification conditions, generating risk prompt information. For example, the preset number may be preset to be M, and if the current trade order data stream simultaneously satisfies M verification conditions, the risk prompt information is generated according to the current trade order data stream. The specific content of the risk prompt message may be a prompt message that the current trade order is an abnormal order. The prompt information can be specifically sent to the target terminal in a form of a short message or an email, specifically, the prompt can be performed in a form of characters or voice, and a specific prompt form can be set according to requirements and is not specifically limited herein.
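The preset-number rule can be sketched as a small function. The function name, the message wording, and the reading of "meets M verification conditions" as "meets at least M" are assumptions; the text does not say whether exactly M or at least M is intended.

```python
def build_risk_prompt(order_code, conditions_met, preset_number):
    """Return risk prompt text when at least preset_number conditions are met, else None."""
    if len(conditions_met) < preset_number:
        return None
    return (
        f"Order {order_code} is an abnormal order: "
        f"{len(conditions_met)} verification conditions met "
        f"({', '.join(conditions_met)})."
    )


# Example with a hypothetical preset number M = 2.
prompt = build_risk_prompt("o2", ["not_common_ip", "not_common_address"], 2)
no_prompt = build_risk_prompt("o3", ["not_common_ip"], 2)
```

Delivery of the returned text by short message, email, or voice is left out of the sketch, since the text leaves the prompt form open.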
According to the method and the device, an early warning can be raised for abnormal orders and the relevant personnel can be notified in time, so that they can take corresponding measures.
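The counting rule described above can be illustrated with a minimal sketch. It assumes that "meets the preset number of verification conditions" means that at least M conditions hold for the same order; the names `Condition`, `CONDITIONS`, `M` and `build_risk_prompt`, as well as the three example conditions and their thresholds, are hypothetical and do not come from the application. Pushing the prompt to the target terminal is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Condition:
    """One verification item with its verification condition (hypothetical)."""
    item: str
    predicate: Callable[[Dict], bool]


# Hypothetical verification conditions; real ones depend on the business.
CONDITIONS: List[Condition] = [
    Condition("amount", lambda o: o.get("amount", 0) > 10_000),
    Condition("new_account", lambda o: o.get("account_age_days", 9999) < 3),
    Condition("night_order", lambda o: o.get("hour", 12) < 6),
]

M = 2  # preset number of conditions that must hold at the same time (assumed)


def build_risk_prompt(order: Dict,
                      conditions: List[Condition] = CONDITIONS,
                      m: int = M) -> Optional[Dict]:
    """Return risk prompt information if at least m conditions hold, else None."""
    hits = [c.item for c in conditions if c.predicate(order)]
    if len(hits) < m:
        return None
    return {
        "order_id": order["order_id"],
        "message": "current transaction order is an abnormal order",
        "matched_items": hits,
    }
```

An order that exceeds the amount threshold and comes from a new account matches two conditions and therefore yields a prompt, while an order matching fewer than M conditions yields `None`.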
In one embodiment, the method may further include:
storing the checking result into a preset database;
and reading the verification result in the database within the preset time period in real time, and sending the verification result within the preset time period to the target terminal for displaying.
In the application, the verification result can be stored in the database; specifically, it can be stored in a MySQL database with a fixed table structure. The verification result may include order detail data of abnormal orders, a risk personnel list, dimension index data that business personnel want to have shown on the risk control management platform, and the like. The verification results stored in the database can be classified and stored by date or attribute, which facilitates subsequent queries.
According to the method and the device, when the query request of the verification result sent by the terminal is received, the corresponding verification result can be obtained and sent to the terminal for display. And the verification result in the preset time period in the database can be read in real time, and the verification result in the preset time period is sent to the target terminal for display. The specific display can be formatted, for example, the specific display can be displayed on a web page in a chart form for business personnel to perform data analysis, and the specific display is visual and vivid.
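The storing and time-ranged reading described above can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in `sqlite3` module stands in for the MySQL database, and the table name `verification_result`, its columns, and the helper names `open_store`, `save_result` and `results_between` are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a fixed table structure (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE verification_result ("
        " order_id TEXT NOT NULL,"
        " result TEXT NOT NULL,"
        " checked_at INTEGER NOT NULL)"  # Unix timestamp in seconds
    )
    return conn


def save_result(conn: sqlite3.Connection, order_id: str,
                result: str, checked_at: int) -> None:
    """Store one verification result."""
    conn.execute(
        "INSERT INTO verification_result (order_id, result, checked_at)"
        " VALUES (?, ?, ?)",
        (order_id, result, checked_at),
    )
    conn.commit()


def results_between(conn: sqlite3.Connection,
                    start: int, end: int) -> List[Tuple[str, str, int]]:
    """Read the results whose timestamp lies in the half-open period [start, end)."""
    cur = conn.execute(
        "SELECT order_id, result, checked_at FROM verification_result"
        " WHERE checked_at >= ? AND checked_at < ? ORDER BY checked_at",
        (start, end),
    )
    return cur.fetchall()
```

The rows returned by `results_between` are what would be sent to the target terminal for display, for example as the data behind a chart.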
In one embodiment, the method may further include:
receiving a request, sent by a terminal, to add a risk verification policy;
generating a new risk verification policy according to the new verification condition in the request;
and updating the new risk verification policy into the Redis cache.
The risk verification policies in the present application can be modified, deleted, and added. Specifically, the user can import a custom risk rule configuration as a CSV (Comma-Separated Values) file, add in real time a risk rule model that already exists in the database, set a rule validity period, delete a rule, and so on. On the one hand, the application supports importing statements of business-defined risk verification policies in a fixed format. On the other hand, the verification items and verification conditions can be offered as components through the risk control management platform for operators to select, and several conditions can be selected and combined at the same time.
In a possible application scenario, an operator may upload a risk verification policy to be newly added to a server in the form of a file, and the server generates a corresponding risk verification policy according to the file. The configuration items of the plurality of check items and the check conditions can be displayed on the front-end interface, the check items are directly selected by an operator and then submitted to the server, and the server generates the corresponding risk check strategy according to the configuration items selected by the operator. According to the method and the device, the risk checking strategy can be flexibly modified, and the flexibility of the scheme is improved.
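The file-based way of adding policies can be sketched as below. This is a minimal illustration, not the format used by the application: the CSV column layout (`policy_id`, `item`, `operator`, `threshold`, `valid_until`), the set of supported operators, and the function names are all hypothetical, and a plain dictionary stands in for the Redis cache.

```python
import csv
import io
from typing import Dict

# Comparison operators accepted in the hypothetical CSV format.
SUPPORTED_OPERATORS = {">", "<", ">=", "<=", "=="}


def load_policies(csv_text: str, cache: Dict[str, Dict]) -> int:
    """Parse CSV rows into policies, write them to the cache, return the count."""
    added = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["operator"] not in SUPPORTED_OPERATORS:
            raise ValueError(f"unsupported operator: {row['operator']!r}")
        cache[row["policy_id"]] = {
            "item": row["item"],
            "operator": row["operator"],
            "threshold": float(row["threshold"]),
            # Kept as text here; enforcing the validity period is not shown.
            "valid_until": row["valid_until"],
        }
        added += 1
    return added


def delete_policy(policy_id: str, cache: Dict[str, Dict]) -> bool:
    """Remove a policy from the cache; return True if it existed."""
    return cache.pop(policy_id, None) is not None
```

Because the window functions read policies from the cache each time they run, a policy written by `load_policies` would take effect without restarting the stream job, which is the flexibility the paragraph above refers to.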
Referring to fig. 3, fig. 3 is a schematic application environment diagram of a data processing method of a Flink computing framework according to an exemplary embodiment of the present application. As shown in fig. 3, the application environment includes a Kafka message queue 21, a Flink computation framework 22, a MySQL database 23, a Redis cache database 24, and an operation management platform 25. The Flink computation framework 22 includes an input module 26 of a data source, a processing window 27, and an output module 28.
Specifically, the Kafka message queue 21 is used for storing real-time data streams, such as the transaction order data stream and the user behavior data stream (behavior information such as APP page visits, placement slot clicks, and exposure of placement slots in page modules) collected by event-tracking instrumentation, so as to serve as the data source of the Flink computation framework 22.
The Flink computation framework 22 described above is used for:
1. consuming the real-time metadata in Kafka, and processing and preprocessing the transaction order data stream and the user behavior data stream;
2. connecting to the database, reading and writing data indexes or intermediate results in real time, and performing calculation and storage operations;
3. calling or embedding a risk verification policy, and filtering or calculating the data stream to be processed.
The MySQL database 23 is used for storing the verification result calculated by the Flink computation framework 22. The verification result may specifically include abnormal order data, a risk personnel list, and dimension index data that business personnel want to have shown on the risk control management platform.
The Redis cache database 24 is used for storing risk verification policies. On the one hand, the policies are read by the Flink program in real time; on the other hand, business personnel can modify the risk control rules in real time through the risk control management platform.
The operation management platform 25 uses a visual page, with the MySQL database and the Redis database as back-end storage, to provide an entry through which a user operates and modifies the risk verification policy. It can also show the verification result, that is, consumers with risky or illegal behavior, in the form of charts. The back end of the operation management platform 25 reads the verification result directly from the MySQL database and displays it on the operator's terminal in the form of a chart.
Specifically, the Flink computing framework is connected with the Kafka message queue platform, and subscribes to and consumes the partitions in which the data stream to be processed is stored. The server acquires detailed metadata such as orders, traffic and payments, and then preprocesses the data: the data is cleaned, processed and converted, or stored in an intermediate table. A plurality of processing windows are written in the stream processing pipeline, such as an order information processing window, a traffic information processing window, and a payment information processing window. Each processing window performs classified, highly concurrent calculation on data with a different service attribute, and calls the Redis database interface in its window function to read and write the risk verification policy, thereby implementing personalized service logic. The verification result information calculated by the Flink computing framework is written into the MySQL database and stored in the form of a table.
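The first two stages of this flow, classifying records into partitions by data stream type and preprocessing what is consumed, can be sketched as follows. The sketch makes no use of Kafka or Flink: `PartitionedQueue` is a hypothetical in-memory stand-in for the message queue, and the record fields `stream_type`, `order_id` and `amount` are assumed for illustration only.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterator


class PartitionedQueue:
    """In-memory stand-in for a message queue with one partition per stream type."""

    def __init__(self) -> None:
        self._partitions: Dict[str, Deque[dict]] = defaultdict(deque)

    def publish(self, record: dict) -> None:
        # Classify the record into a partition by its data stream type.
        self._partitions[record["stream_type"]].append(record)

    def consume(self, stream_type: str) -> Iterator[dict]:
        # Drain the partition for one stream type in arrival order.
        partition = self._partitions[stream_type]
        while partition:
            yield partition.popleft()


def preprocess(record: dict) -> dict:
    """Cleaning and format conversion: trim strings, cast amount to float."""
    cleaned = {k: v.strip() if isinstance(v, str) else v
               for k, v in record.items()}
    if "amount" in cleaned:
        cleaned["amount"] = float(cleaned["amount"])
    return cleaned
```

Each processing window would then consume only the partition for its own service attribute, for example `[preprocess(r) for r in queue.consume("order")]` for the order information window.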
Referring to fig. 4, fig. 4 is a flowchart illustrating a Flink computation framework according to an exemplary embodiment of the present application. In fig. 4, the Flink computation framework includes, when processing data:
S31, inputting a data stream to be processed;
S32, preprocessing data;
S33, processing the data of the windows in parallel;
and S34, outputting.
The parallel processing of the data in the processing window specifically includes:
the processing window 1 is used for processing a transaction order data stream;
the processing window 2 is used for processing the user access data stream;
the processing window 3 is used for processing the user click data stream;
and so on, until the processing window n is used for processing the user login data stream.
The data preprocessing may include data cleaning, format conversion, and the like. The output data is stored in a preset database for operators to inquire. When each processing window performs parallel processing on the data stream to be processed, the data to be processed is verified by calling a risk verification strategy, so that a verification result is output.
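Steps S31 to S34 can be sketched in a few lines. In this illustration a thread pool stands in for the parallel execution that the Flink computing framework would provide, and the two window checks, their thresholds, and the names `WINDOW_CHECKS`, `run_window` and `run_pipeline` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def check_order(record: dict) -> bool:
    """Hypothetical check for the transaction order window."""
    return record.get("amount", 0) > 10_000


def check_login(record: dict) -> bool:
    """Hypothetical check for the user login window."""
    return record.get("failed_attempts", 0) >= 5


# One processing window per data stream type.
WINDOW_CHECKS: Dict[str, Callable[[dict], bool]] = {
    "order": check_order,
    "login": check_login,
}


def run_window(stream_type: str, records: List[dict]) -> List[dict]:
    """Verify the records of one window and return the abnormal ones."""
    check = WINDOW_CHECKS[stream_type]
    return [r for r in records if check(r)]


def run_pipeline(records: List[dict]) -> Dict[str, List[dict]]:
    # S31 and S32: take the input and drop records of unknown stream type.
    grouped: Dict[str, List[dict]] = {name: [] for name in WINDOW_CHECKS}
    for record in records:
        if record.get("stream_type") in grouped:
            grouped[record["stream_type"]].append(record)
    # S33: process all windows in parallel.
    with ThreadPoolExecutor(max_workers=len(grouped)) as pool:
        futures = {name: pool.submit(run_window, name, recs)
                   for name, recs in grouped.items()}
        # S34: collect the output of every window.
        return {name: future.result() for name, future in futures.items()}
```

The returned dictionary maps each window to the records it flagged, which corresponds to the verification result that is written to the preset database.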
In one embodiment, as shown in fig. 5, there is provided a data processing apparatus of a Flink computation framework, including: an acquisition module 11, a distribution module 12 and a verification module 13, wherein:
the acquiring module 11 is configured to acquire a data stream to be processed in real time, where the data stream to be processed includes a user behavior data stream and a transaction order data stream;
the distribution module 12 is configured to distribute the data stream to be processed to corresponding processing windows according to a plurality of processing windows preset in the Flink calculation framework;
and the checking module 13 is configured to perform parallel checking on the user behavior data stream and the transaction order data stream in each processing window according to a preconfigured risk checking policy to obtain a checking result.
In one embodiment, the obtaining module 11 may collect data streams to be processed in real time, store the data streams to be processed in different partitions of a preset message queue according to the data stream type of the data streams to be processed, and obtain the data streams to be processed from the different partitions of the preset message queue.
In one embodiment, the allocating module 12 may allocate the data stream to be processed to the corresponding processing window according to the window range description information of each processing window and the data stream type of the data stream to be processed, where the window range description information is used to describe the data stream type processed by each processing window.
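The allocation by window range description information can be sketched as a lookup from data stream type to processing window. The mapping `WINDOW_RANGES`, the window names and the stream type labels below are hypothetical; the application does not specify how the description information is encoded.

```python
from typing import Dict, FrozenSet, List, Optional

# Window range description information: which stream types each window handles.
WINDOW_RANGES: Dict[str, FrozenSet[str]] = {
    "window_1": frozenset({"transaction_order"}),
    "window_2": frozenset({"user_access"}),
    "window_3": frozenset({"user_click"}),
}


def pick_window(stream_type: str,
                ranges: Dict[str, FrozenSet[str]] = WINDOW_RANGES) -> Optional[str]:
    """Return the window whose range description covers the stream type."""
    for window, types in ranges.items():
        if stream_type in types:
            return window
    return None


def allocate(records: List[dict],
             ranges: Dict[str, FrozenSet[str]] = WINDOW_RANGES) -> Dict[str, List[dict]]:
    """Distribute records to their windows; records with no matching window are skipped."""
    allocation: Dict[str, List[dict]] = {window: [] for window in ranges}
    for record in records:
        window = pick_window(record["stream_type"], ranges)
        if window is not None:
            allocation[window].append(record)
    return allocation
```

Keeping the description information in a separate table means that a new data stream type can be routed by adding an entry, without changing the allocation logic.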
In one embodiment, the risk checking policy includes a checking condition, the checking module 13 may invoke, in parallel, a risk checking policy corresponding to the to-be-processed data stream in each processing window from a preset Redis cache according to a window function of each processing window, extract metadata in the to-be-processed data stream in the processing window, and generate a checking result that the to-be-processed data stream is abnormal data when the metadata meets the checking condition in the risk checking policy, so as to check the to-be-processed data stream in each processing window in parallel.
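What a window function might do with the cached policy can be sketched as follows. A dictionary stands in for the Redis cache, and the policy encoding (`field`, `op`, `value`), the `metadata` key of a record, and the function names are assumptions made for illustration. Here a record is treated as abnormal when any one condition of its policy is met.

```python
import operator
from typing import Callable, Dict, List

# Stand-in for the Redis cache: policies keyed by data stream type (assumed).
POLICY_CACHE: Dict[str, List[dict]] = {
    "transaction_order": [{"field": "amount", "op": "gt", "value": 10000}],
    "user_login": [{"field": "failed_attempts", "op": "ge", "value": 5}],
}

OPS: Dict[str, Callable] = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "eq": operator.eq,
}


def extract_metadata(record: dict) -> dict:
    """Pull the metadata out of one record of the data stream."""
    return record.get("metadata", {})


def window_function(stream_type: str, records: List[dict],
                    cache: Dict[str, List[dict]] = POLICY_CACHE) -> List[dict]:
    """Fetch the policy for this window and flag records whose metadata meets it."""
    policies = cache.get(stream_type, [])
    results = []
    for record in records:
        meta = extract_metadata(record)
        abnormal = any(
            p["field"] in meta and OPS[p["op"]](meta[p["field"]], p["value"])
            for p in policies
        )
        if abnormal:
            results.append({"record_id": record["id"], "result": "abnormal"})
    return results
```

Running one such function per processing window, each with its own stream type, gives the parallel verification described in this embodiment.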
In one embodiment, the verification conditions include a plurality of verification conditions, and the apparatus further includes a prompt module (not shown), where the prompt module may generate risk prompt information when the data stream to be processed satisfies a preset number of verification conditions, and push the risk prompt to a preset target terminal for display.
In one embodiment, the apparatus further includes a storage module (not shown), and the storage module may store the verification result in a preset database, read the verification result in the preset time period in the database in real time, and send the verification result in the preset time period to the target terminal for displaying.
In one embodiment, the apparatus further includes a newly adding module (not shown), where the newly adding module may receive a newly adding request of the risk verification policy sent by the terminal, generate a new risk verification policy according to a new adding verification condition in the newly adding request, and update the new verification policy to the Redis cache.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing data such as verification results. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement a data processing method of a Flink computing framework.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream; distributing the data streams to be processed to corresponding processing windows according to a plurality of processing windows preset in a Flink computing frame; and carrying out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result.
In an embodiment, when the processor executes the computer program to implement the step of acquiring the data stream to be processed in real time, the following steps are specifically implemented:
collecting data flow to be processed in real time;
classifying the data stream to be processed into different partitions of a preset message queue according to the data stream type of the data stream to be processed;
and acquiring the data stream to be processed from different partitions of a preset message queue.
In an embodiment, when the processor executes the computer program to implement the step of allocating the data stream to be processed to the corresponding processing window according to the plurality of processing windows preset in the Flink computing framework, the following steps are specifically implemented:
and distributing the data stream to be processed to the corresponding processing window according to the window range description information of each processing window and the data stream type of the data stream to be processed, wherein the window range description information is used for describing the data stream type processed by each processing window.
In an embodiment, the risk verification policy includes a verification condition, and when the processor executes the computer program to implement the step of performing parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to the pre-configured risk verification policy to obtain the verification result, the following steps are specifically implemented:
according to the window function of each processing window, calling a risk checking strategy corresponding to the data stream to be processed in each processing window in parallel from a preset Redis cache;
extracting metadata in a data stream to be processed in a processing window;
and when the metadata meets the verification condition in the risk verification strategy, generating a verification result with the data stream to be processed as abnormal data so as to verify the data stream to be processed in each processing window in parallel.
In one embodiment, the check condition includes a plurality of conditions, and the processor executes the computer program to further specifically implement the following steps:
when the data stream to be processed meets the preset number of verification conditions, risk prompt information is generated;
and pushing the risk prompt to a preset target terminal for display.
In one embodiment, the processor when executing the computer program further specifically implements the following steps:
storing the checking result into a preset database;
and reading the verification result in the database within the preset time period in real time, and sending the verification result within the preset time period to the target terminal for displaying.
In one embodiment, the processor when executing the computer program further specifically implements the following steps:
receiving a request, sent by a terminal, to add a risk verification policy;
generating a new risk verification policy according to the new verification condition in the request;
and updating the new risk verification policy into the Redis cache.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream; distributing the data streams to be processed to corresponding processing windows according to a plurality of processing windows preset in a Flink computing frame; and carrying out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result.
In one embodiment, when the processor executes the step of acquiring the to-be-processed data stream in real time, the following steps are specifically implemented:
collecting data flow to be processed in real time;
classifying the data stream to be processed into different partitions of a preset message queue according to the data stream type of the data stream to be processed;
and acquiring the data stream to be processed from different partitions of a preset message queue.
In one embodiment, when the computer program is executed by the processor to implement the step of allocating the data stream to be processed to the corresponding processing window according to the plurality of processing windows preset in the Flink computing framework, the following steps are specifically implemented:
and distributing the data stream to be processed to the corresponding processing window according to the window range description information of each processing window and the data stream type of the data stream to be processed, wherein the window range description information is used for describing the data stream type processed by each processing window.
In an embodiment, the risk verification policy includes a verification condition, and when the computer program is executed by the processor to implement the step of performing parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to the pre-configured risk verification policy to obtain the verification result, the following steps are specifically implemented:
according to the window function of each processing window, calling a risk checking strategy corresponding to the data stream to be processed in each processing window in parallel from a preset Redis cache;
extracting metadata in a data stream to be processed in a processing window;
and when the metadata meets the verification condition in the risk verification strategy, generating a verification result with the data stream to be processed as abnormal data so as to verify the data stream to be processed in each processing window in parallel.
In one embodiment, the check condition includes a plurality of conditions, and the computer program when executed by the processor further specifically implements the following steps:
when the data stream to be processed meets the preset number of verification conditions, risk prompt information is generated;
and pushing the risk prompt to a preset target terminal for display.
In one embodiment, the computer program when executed by the processor further embodies the steps of:
storing the checking result into a preset database;
and reading the verification result in the database within the preset time period in real time, and sending the verification result within the preset time period to the target terminal for displaying.
In one embodiment, the computer program when executed by the processor further embodies the steps of:
receiving a request, sent by a terminal, to add a risk verification policy;
generating a new risk verification policy according to the new verification condition in the request;
and updating the new risk verification policy into the Redis cache.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A data processing method of a Flink computing framework, the method comprising:
acquiring a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream;
distributing the data stream to be processed to corresponding processing windows according to a plurality of processing windows preset in a Flink computing frame;
and carrying out parallel verification on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk verification strategy to obtain a verification result.
2. The method of claim 1, wherein the obtaining the data stream to be processed in real time comprises:
collecting the data stream to be processed in real time;
classifying the data stream to be processed into different partitions of a preset message queue according to the data stream type of the data stream to be processed;
and acquiring the data stream to be processed from different partitions of the preset message queue.
3. The method according to claim 1, wherein the allocating the data streams to be processed to the corresponding processing windows according to a plurality of processing windows preset in a Flink computation framework comprises:
and allocating the data stream to be processed to the corresponding processing window according to the window range description information of each processing window and the data stream type of the data stream to be processed, wherein the window range description information is used for describing the data stream type processed by each processing window.
4. The method of claim 1, wherein the risk verification policy includes a verification condition, and the performing parallel verification on the data stream to be processed and the data stream of the trade order within each processing window according to a pre-configured risk verification policy to obtain a verification result includes:
according to the window function of each processing window, calling a risk checking strategy corresponding to the data stream to be processed in each processing window in parallel from a preset Redis cache;
extracting metadata in the data stream to be processed in the processing window;
and when the metadata meets the verification condition in the risk verification strategy, generating a verification result that the data stream to be processed is abnormal data so as to verify the data stream to be processed in each processing window in parallel.
5. The method of claim 4, wherein the verification condition comprises a plurality of conditions, the method further comprising:
when the data stream to be processed meets the preset number of verification conditions, risk prompt information is generated;
and pushing the risk prompt to a preset target terminal for display.
6. The method of claim 5, further comprising:
storing the checking result into a preset database;
and reading the verification result in the database within a preset time period in real time, and sending the verification result within the preset time period to the target terminal for displaying.
7. The method of claim 4, further comprising:
receiving a newly increased request of the risk verification strategy sent by a terminal;
generating a new risk checking strategy according to the new verification condition in the new request;
and updating the new checking strategy to the Redis cache.
8. A data processing apparatus of a Flink computing framework, the apparatus comprising:
an acquisition module, configured to acquire a data stream to be processed in real time, wherein the data stream to be processed comprises a user behavior data stream and a transaction order data stream;
the distribution module is used for distributing the data stream to be processed to corresponding processing windows according to a plurality of processing windows preset in a Flink computing frame;
and the checking module is used for performing parallel checking on the user behavior data stream and the transaction order data stream in each processing window according to a pre-configured risk checking strategy to obtain a checking result.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202111420013.0A 2021-11-26 2021-11-26 Data processing method, device, equipment and storage medium of Flink computing framework Pending CN114116802A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111420013.0A CN114116802A (en) 2021-11-26 2021-11-26 Data processing method, device, equipment and storage medium of Flink computing framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111420013.0A CN114116802A (en) 2021-11-26 2021-11-26 Data processing method, device, equipment and storage medium of Flink computing framework

Publications (1)

Publication Number Publication Date
CN114116802A true CN114116802A (en) 2022-03-01

Family

ID=80369838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111420013.0A Pending CN114116802A (en) 2021-11-26 2021-11-26 Data processing method, device, equipment and storage medium of Flink computing framework

Country Status (1)

Country Link
CN (1) CN114116802A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358911A (en) * 2022-03-16 2022-04-15 深圳高灯计算机科技有限公司 Invoicing data risk control method and device, computer equipment and storage medium
CN114358911B (en) * 2022-03-16 2022-08-02 深圳高灯计算机科技有限公司 Invoicing data risk control method and device, computer equipment and storage medium
CN114723413A (en) * 2022-04-19 2022-07-08 南京慧尔视软件科技有限公司 Real-time processing method, device, equipment and medium of stream data
CN114723413B (en) * 2022-04-19 2023-12-19 南京慧尔视软件科技有限公司 Real-time processing method, device, equipment and medium for stream data
CN114860549A (en) * 2022-05-30 2022-08-05 北京新唐思创教育科技有限公司 Buried point data checking method, device, equipment and storage medium
CN114860549B (en) * 2022-05-30 2024-02-20 北京新唐思创教育科技有限公司 Buried data verification method, buried data verification device, buried data verification equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination