CN115907840A - Transaction risk prediction method and device for transaction risk prediction - Google Patents

Transaction risk prediction method and device for transaction risk prediction

Info

Publication number
CN115907840A
Authority
CN
China
Prior art keywords
transaction
information
refund
historical
seller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211234310.0A
Other languages
Chinese (zh)
Inventor
王明毅
陈杰瑛
冯景华
陈荣奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Alibaba Overseas Internet Industry Co ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd
Priority to CN202211234310.0A
Publication of CN115907840A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for predicting transaction risk, comprising: acquiring information related to a current transaction; extracting a first feature based on the information related to the current transaction, the first feature being related to a statistical characteristic of the current transaction; and inputting the first feature into a pre-trained first prediction model to obtain a transaction risk level; wherein the first prediction model is trained based on a plurality of second features, each second feature being associated with a statistical characteristic of at least one historical transaction and corresponding to the first feature. By constructing an order-dimension (transaction-by-transaction) refund and return prediction model and predicting refunds and returns for individual orders, the method can accurately characterize refund and return risk at order granularity. It can serve as a reference in designing freight insurance underwriting and pricing schemes and is also suitable for identifying false transactions.

Description

Transaction risk prediction method and device for transaction risk prediction
Technical Field
The present application relates to the field of machine learning technologies, and in particular, to a method and an apparatus for predicting transaction risk, a device for generating a transaction risk prediction model, and a non-transitory computer-readable storage medium.
Background
Freight insurance is an insurance service, provided by an insurance company, that covers the one-way shipping fee incurred when a buyer initiates a refund and return after a sale. By adding freight insurance, a seller's goods can gain dedicated traffic exposure, and buyers' purchasing confidence improves. In the underwriting and pricing stage of freight insurance, a common scheme is to price at the merchant dimension: for a given merchant, all orders follow the same pricing logic. Merchant-dimension underwriting and pricing is easy to communicate and understand and is convenient to operate, but it has a limitation: orders with different risks receive the same price. This yields a profit on low-risk orders but creates potential claim risk on high-risk orders, and it does not allow targeted pricing for orders with a higher probability of a claim. Accurate order-by-order prediction of refund and return risk is therefore a technical problem that urgently needs to be solved in the prior art.
Disclosure of Invention
In view of at least one of the drawbacks of the prior art, the present invention provides a method for predicting transaction risk, comprising:
acquiring information related to a current transaction;
extracting a first feature based on the information related to the current transaction, the first feature being related to a statistical characteristic of the current transaction;
inputting the first feature into a pre-trained first prediction model to obtain a transaction risk level;
wherein the first prediction model is trained based on a plurality of second features, each second feature being associated with a statistical characteristic of at least one historical transaction and corresponding to the first feature.
According to an aspect of the invention, extracting the first feature based on the information related to the current transaction comprises:
extracting the first feature based on one or more of seller information, buyer-seller pair information, category information, and order information of the current transaction.
According to an aspect of the invention, extracting the first feature based on the information related to the current transaction further comprises:
extracting the first feature based on one or more of: historical refund and return statistical information of the seller of the current transaction; historical refund and return statistical information of the buyer of the current transaction; historical refund and return statistical information of the buyer and the seller of the current transaction in their historical transactions with each other; historical refund and return statistical information of the primary category to which the commodity of the current transaction belongs; and the order amount, commodity quantity and logistics information of the current transaction;
wherein the historical refund and return statistical information is obtained by shifting the current date back (delay processing) and counting refunds and returns within a preset time range.
According to an aspect of the invention, extracting the first feature based on the information related to the current transaction further comprises:
extracting the first feature based on one or more of: recent refund and return statistical information of the seller of the current transaction; transaction behavior information of the seller of the current transaction; shop operation information of the seller of the current transaction; recent refund and return statistical information of the buyer of the current transaction; affiliate attribute analysis information of the buyer of the current transaction; recent refund and return statistical information of the buyer and the seller of the current transaction in their recent transactions with each other; and recent refund and return statistical information of the primary category to which the commodity of the current transaction belongs;
wherein the recent refund and return statistical information is obtained by counting refunds and returns within a preset time range counted back from the current date.
According to an aspect of the invention, extracting the first feature based on the information related to the current transaction further comprises:
extracting the first feature through feature engineering and constructing a feature vector, wherein the feature vector comprises: a seller dimension, a buyer-seller pair dimension, a category dimension, and an order dimension.
According to an aspect of the invention, the training based on the plurality of second features comprises:
extracting the plurality of second features based on one or more of seller information, buyer-seller pair information, category information and order information of a plurality of historical transactions, and training the first prediction model based on the plurality of second features.
According to an aspect of the invention, the training based on the plurality of second features further comprises:
extracting the plurality of second features based on one or more of: historical refund and return statistical information of the seller of a historical transaction; historical refund and return statistical information of the buyer of the historical transaction; historical refund and return statistical information of the buyer and the seller in their historical transactions with each other; historical refund and return statistical information of the primary category to which the commodity of the historical transaction belongs; and the order amount, commodity quantity and logistics information of the historical transaction.
According to an aspect of the invention, the training based on the plurality of second features further comprises:
extracting the plurality of second features based on one or more of: recent refund and return statistical information of the seller of a historical transaction; transaction behavior information of the seller of the historical transaction; shop operation information of the seller of the historical transaction; recent refund and return statistical information of the buyer of the historical transaction; affiliate attribute analysis information of the buyer of the historical transaction; recent refund and return statistical information of the buyer and the seller of the historical transaction in their recent transactions with each other; and recent refund and return statistical information of the primary category to which the commodity of the historical transaction belongs.
According to one aspect of the invention, the seller information, buyer-seller pair information, category information and order information of the plurality of historical transactions are obtained from the big data of a transaction platform.
According to an aspect of the invention, training the first prediction model based on the plurality of second features comprises:
performing stratified (hierarchical) sampling on the plurality of historical transactions through a binary classification model according to the labels and/or the amounts of the plurality of historical transactions, wherein the labels comprise refunded and not refunded;
and inputting the stratified-sampled data into a lightGBM model to generate the model parameters of the first prediction model.
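The sampling step above can be illustrated with a minimal sketch. It assumes that the sampling is stratified on the refund label and on an amount bucket; the bucket edges, the field names `refunded` and `amount`, and the sampling fraction are hypothetical and are not taken from the source. The binary classification model and the lightGBM training step that would consume the sample are omitted.

```python
import random
from collections import defaultdict


def amount_bucket(amount, edges=(100.0, 1000.0)):
    """Return the index of the amount bucket (bucket edges are hypothetical)."""
    for index, edge in enumerate(edges):
        if amount < edge:
            return index
    return len(edges)


def stratified_sample(transactions, fraction, seed=0):
    """Draw the same fraction from every (label, amount bucket) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for tx in transactions:
        strata[(tx["refunded"], amount_bucket(tx["amount"]))].append(tx)
    sample = []
    for key in sorted(strata):
        group = strata[key]
        # Keep at least one row per stratum so rare strata are not lost.
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample


# Synthetic history: every tenth order is labeled as refunded.
history = [
    {"order_id": i, "refunded": i % 10 == 0, "amount": float(50 + 30 * i)}
    for i in range(200)
]
sample = stratified_sample(history, fraction=0.25)
```

Sampling per stratum keeps the rare refunded orders and the rare amount ranges represented in the training set, which a plain random sample might miss.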
According to one aspect of the invention, the method further comprises:
determining the freight insurance premium of the current transaction according to the transaction risk level; and/or
determining whether the current transaction is a false transaction according to the transaction risk level.
The present invention also provides an electronic device comprising:
a processor; and
a memory storing a computer program which, when executed by the processor, causes the processor to perform the method as described above.
The invention also provides a device for generating a transaction risk prediction model, the transaction risk prediction model being obtained by training on historical transaction data that comprises a plurality of committed transactions, the device comprising:
a binary classification model configured to perform stratified (hierarchical) sampling on the historical transaction data according to the labels and/or the amounts of the committed transactions, wherein the labels comprise refunded and not refunded;
and a lightGBM model configured to receive the stratified-sampled historical transaction data as input and generate the model parameters of the transaction risk prediction model.
The present invention also provides a non-transitory computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor, cause the processor to perform a method as described above.
By constructing an order-dimension (transaction-by-transaction) refund and return prediction model, the transaction risk prediction method provided by the invention predicts refunds and returns for individual orders (current transactions) and can accurately characterize refund and return risk at order granularity. It can serve as a reference in designing freight insurance underwriting and pricing schemes and is also suitable for identifying false transactions. Furthermore, the method subdivides the refund and return statistical information into statistics that allow for a presentation period and absolute statistics, which further improves the prediction accuracy of the transaction risk prediction model.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on them without exceeding the protection scope of the present application.
FIG. 1 illustrates an application scenario of one or more embodiments of the present invention;
FIG. 2 illustrates a method for predicting transaction risk provided by an embodiment of the invention;
FIG. 3 illustrates a process for extracting features from information associated with a current transaction to generate a feature vector according to an embodiment of the present invention;
FIG. 4 illustrates an apparatus for transaction risk prediction provided by an embodiment of the present invention;
FIG. 5 illustrates a generation apparatus of a transaction risk prediction model provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The detailed description of the embodiments of the present application is presented to illustrate the principles and implementations of the present application, and is intended only to facilitate understanding of the methods of the present application and their core concepts. A person skilled in the art may, following the idea of the present application, change or modify the embodiments and their applications within the scope of the present application. In view of the above, the description should not be taken as limiting the application.
The invention provides an order-dimension freight insurance pricing method, which is applied to order-dimension transaction risk prediction on a transaction platform. As shown in FIG. 1, when seller A issues an offer to buyer B on transaction platform C and buyer B accepts it, a contract is normally established. In a commodity circulation scenario, seller A then sends a physical commodity or a virtual product to buyer B, and buyer B pays the corresponding price. In another case, within the scope permitted by the contract, buyer B decides to refund and return before or after receiving the goods. Seller A is then usually compensated by freight insurance for the goods in transit. Existing freight insurance calculation models usually price the insurance according to the collected historical transaction data of seller A; such pricing may cause a loss to the insurer, or may cause the policyholder (seller A or transaction platform C) to overpay. For example, seller A has a high refund/return risk if it mainly sells clothing and daily-use goods, and a low refund/return risk if it mainly sells household goods; this is the logic applied by existing freight insurance pricing and refund/return risk prediction. In other cases, however, the refund/return risk correlates more strongly with the statistical characteristics of the order than with those of seller A. For example, buyer B may be unlikely to refund or return when purchasing from other sellers (say, seller D or seller E), yet likely to do so when purchasing from seller A, because of the particular relationship between buyer B and seller A. The refund/return risk is also related to the characteristics of each individual order.
That is, the probability of a refund/return is related to the statistical characteristics of the corresponding order (a single transaction). By constructing an order-dimension refund and return prediction model, refunds and returns are predicted for individual orders, and the predictions serve as a reference in designing freight insurance underwriting and pricing schemes. The method can accurately characterize refund and return risk at order granularity and is also suitable for identifying false transactions.
According to an embodiment of the present invention, as shown in fig. 2, the present invention provides a method 10 for predicting transaction risk, which includes steps S101 to S103. Wherein:
In step S101, information related to the current transaction is acquired. Optionally, the transaction platform may obtain the related information when seller A issues the offer, when buyer B accepts it, or when the goods are actually shipped. Optionally, the information related to the current transaction is obtained by screening and extracting from real-time big data. A great deal of public information about the current transaction can be obtained, only some of which is applicable to predicting transaction risk. The information acquired is that required by the pre-established transaction risk prediction model, for example the seller information, buyer information, buyer-seller pair information, category information, and order information of the current transaction.
In step S102, a first feature is extracted based on the information related to the current transaction, the first feature being related to a statistical characteristic of the current transaction. The information related to the current transaction acquired in step S101 is raw data, and the first feature is extracted from it through feature engineering for use by the pre-established transaction risk prediction model.
In step S103, the first feature is input into the pre-trained first prediction model to obtain a transaction risk level. The first prediction model is trained based on a plurality of second features, each of which is associated with a statistical characteristic of at least one historical transaction and corresponds to the first feature. The pre-trained first prediction model is the transaction risk prediction model, which is a prior model: a plurality of second features are extracted from the committed transactions in the historical transaction big data, and the first prediction model is trained on this large set of second features. Whether each committed transaction in the historical transaction data was refunded and returned is known, so the model parameters of the transaction risk prediction model are obtained by training on the correspondence between the committed transactions and their refund and return outcomes, which completes the model construction. In practical application, the first feature is input into the pre-established transaction risk prediction model to obtain the posterior probability of a refund and return, which can be converted into the transaction risk level of the current transaction.
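The conversion from a refund and return probability to a transaction risk level is not specified in the text. The following is a minimal sketch of one possible conversion, assuming fixed probability thresholds; the threshold values and the level names are hypothetical.

```python
import bisect

# Hypothetical cut points and level names; the source does not specify them.
RISK_THRESHOLDS = (0.05, 0.15, 0.40)
RISK_LEVELS = ("low", "medium", "high", "very high")


def risk_level(refund_probability):
    """Map a predicted refund/return probability to a discrete risk level."""
    if not 0.0 <= refund_probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # bisect_right counts how many thresholds are <= the probability.
    return RISK_LEVELS[bisect.bisect_right(RISK_THRESHOLDS, refund_probability)]
```

With these assumed thresholds, a probability of 0.02 maps to "low" and a probability of 0.2 maps to "high".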
According to an embodiment of the present invention, in the method 10 for predicting transaction risk, the information related to the current transaction includes: seller information, buyer information, buyer-seller pair information, category information and order information. The information related to the current transaction is raw data, and can be used by the transaction risk prediction model after feature extraction.
According to an embodiment of the present invention, the seller information includes: historical refund and return statistical information of the seller of the current transaction. Whether the current transaction is risky depends strongly on the seller of the current transaction, so the historical refund and return statistical information of the seller is necessary information for the transaction risk prediction model to predict the transaction risk.
The buyer information includes: historical refund and return statistical information of the buyer of the current transaction. Whether the current transaction is risky likewise depends on the purchasing habits of the buyer of the current transaction, so the historical refund and return statistical information of the buyer is also necessary information for the transaction risk prediction model to predict the transaction risk.
The buyer-seller pair information includes: historical refund and return statistical information of the buyer and the seller of the current transaction in their historical transactions with each other. Whether the current transaction is risky depends on how well the buyer and the seller of the current transaction match. For example, if the buyer is a long-standing regular customer of the seller, the risk of a refund and return is low; if a large proportion of the historical transactions between the seller and the buyer were refunded and returned, the risk of a refund and return in the current transaction is high. Therefore, the historical refund and return statistical information of the buyer and the seller in their historical transactions is also necessary information for the transaction risk prediction model to predict the transaction risk.
The category information includes: historical refund and return statistical information of the primary category to which the commodity of the current transaction belongs.
According to one embodiment of the present invention, the primary category to which the commodity of the current transaction belongs is determined as follows:
According to the statistics of a trading platform, about 98% of trade orders contain only 1 primary category, about 94% contain only 1 secondary category, and about 92% contain only 1 commodity category. When determining the primary category to which the commodity of the current transaction belongs, the category that accounts for the highest share of the order amount is chosen. For example, if the current transaction contains 3 primary categories, C1 (100 yuan), C2 (200 yuan) and C3 (50 yuan), then C2, the primary category with the highest amount, is the primary category to which the current transaction belongs. Optionally, the secondary category and the leaf category to which the current transaction belongs may be determined by the same logic.
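The selection rule above can be sketched as follows. The function name and the representation of an order as `(category, amount)` pairs are assumptions made for illustration.

```python
from collections import defaultdict


def dominant_category(order_lines):
    """Return the category with the largest total amount in one order.

    order_lines is an iterable of (category, amount) pairs; an empty
    order raises ValueError because no category can be chosen.
    """
    totals = defaultdict(float)
    for category, amount in order_lines:
        totals[category] += amount
    return max(totals, key=totals.get)


# The example from the text: C1 (100 yuan), C2 (200 yuan), C3 (50 yuan).
primary = dominant_category([("C1", 100.0), ("C2", 200.0), ("C3", 50.0)])
```

The same function can be applied to secondary or leaf categories by passing those category labels instead.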
Whether the current transaction has risks is dependent on the primary category to which the current transaction commodity belongs, for example, the clothing commodity has a higher risk of refunding and returning goods, and the household commodity has a lower risk of refunding and returning goods. Therefore, the historical refund and return statistical information of the primary category to which the current transaction commodities belong is also necessary information for the transaction risk prediction model to predict the transaction risk.
The order information includes: the order amount, commodity quantity and logistics information of the current transaction. The order amount, the commodity quantity and the logistics information (including the logistics provider and the logistics fee) of the current transaction also strongly influence whether the current transaction results in a refund and return, and they discriminate clearly between the two outcomes. Therefore, the order amount, the commodity quantity and the logistics information of the current transaction are also necessary information for the transaction risk prediction model to predict the transaction risk.
According to an embodiment of the present invention, the historical refund and return statistical information is obtained by shifting the current date back (delay processing) and counting refunds and returns within a preset time range.
The historical refund and return statistical information covers those committed transactions whose refund and return outcome is already determined, and is therefore a reliable basis for predicting the risk level of the current transaction. According to one embodiment of the invention, the historical refund and return statistical information is determined as follows:
According to the statistics of a certain trading platform, 85%-90% of trade orders are finished (that is, the whole process of receiving the goods and deciding whether to refund and return is complete) within 20 days, and more than 90% of orders are finished within 30 days. Therefore, for subjects of different dimensions (such as the seller, the buyer, the buyer-seller pair, and the primary category of the commodity of the current transaction), the 30 days preceding the current date are taken as the presentation period (the period during which the transaction risk may still change), and refund and return information is counted for the 30-day, 60-day and 90-day windows preceding the presentation period. Optionally, the refund and return information specifically includes: the number of refunded transactions, the order amount, the number of buyers corresponding to the seller or the number of sellers corresponding to the buyer, the ranking of the refund rate among industry peers, the year-over-year and period-over-period change in the refund rate, and the like.
That is, the current date is shifted back by a preset presentation period (such as one month), the refunds and returns within a preset time range (such as 30 days/60 days/90 days) before that point are counted, and the historical refund and return statistical information of the subjects of different dimensions is obtained.
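The delayed counting window can be sketched as follows, assuming that the window ends where the presentation period begins and that each transaction record carries a `date` and a `refunded` flag. These field names and the choice of a refund rate as the statistic are hypothetical.

```python
from datetime import date, timedelta


def historical_refund_rate(transactions, today, presentation_days=30, window_days=30):
    """Refund rate over a window that ends presentation_days before today.

    Transactions inside the presentation period are skipped because their
    refund/return outcome may not be settled yet.
    """
    window_end = today - timedelta(days=presentation_days)
    window_start = window_end - timedelta(days=window_days)
    in_window = [t for t in transactions if window_start <= t["date"] < window_end]
    if not in_window:
        return None  # no settled transactions in the window
    return sum(1 for t in in_window if t["refunded"]) / len(in_window)


seller_history = [
    {"date": date(2022, 7, 20), "refunded": False},
    {"date": date(2022, 8, 20), "refunded": True},
    {"date": date(2022, 9, 1), "refunded": False},
    {"date": date(2022, 9, 5), "refunded": False},
    {"date": date(2022, 9, 5), "refunded": True},
    {"date": date(2022, 10, 1), "refunded": True},  # inside the presentation period
]
today = date(2022, 10, 10)
rate_30 = historical_refund_rate(seller_history, today, window_days=30)
rate_60 = historical_refund_rate(seller_history, today, window_days=60)
```

Calling the function with `window_days` set to 30, 60 and 90 yields the three window statistics described above for one subject, such as a seller.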
According to an embodiment of the present invention, in the method 10 for predicting transaction risk, the seller information further includes: recent refund and return statistical information of the seller of the current transaction, transaction behavior information of the seller of the current transaction, and shop operation information of the seller of the current transaction. The transaction behavior information includes the DSR rating (a comprehensive merchant performance rating) of the seller of the current transaction. The shop operation information includes the number of search visitors to the seller's shop, the return rate of the shop, the content operation capability, and the like.
According to an embodiment of the invention, the buyer information further comprises: the recent refund and return statistical information of the buyer of the current transaction and the affiliation attribute analysis information of the buyer of the current transaction.
According to one embodiment of the present invention, the affiliate attribute analysis information of the buyer of the current transaction is determined by the following method:
An attribute indicating whether the buyer is a suspected consignment-sales buyer (that is, a buyer suspected of purchasing on behalf of others) is constructed for the buyer using the following two judgment modes: absolute value judgment and proportion judgment.
Absolute value judgment: if the buyer has more than 50 transaction orders in the last 30 days and more than 5 transaction addresses in the last 30 days, the buyer is judged to be a suspected consignment-sales buyer.
Proportion judgment: if known consignment-sales orders account for more than 80% of the buyer's transaction orders in the last 30 days, the buyer is judged to be a suspected consignment-sales buyer.
The buyer's affiliate attribute has a significant impact on whether the current transaction carries a risk of refund and return.
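The two judgment modes above can be sketched as a single rule. Combining them with a logical OR is an assumption, since the text does not state how the two modes interact; the function and parameter names are also hypothetical.

```python
def is_suspected_consignment_buyer(order_count_30d, address_count_30d,
                                   consignment_order_count_30d):
    """Flag a buyer using the absolute-value rule or the proportion rule."""
    # Absolute value judgment: more than 50 orders and more than 5 addresses.
    absolute_rule = order_count_30d > 50 and address_count_30d > 5
    # Proportion judgment: known consignment orders exceed 80% of all orders.
    ratio_rule = (
        order_count_30d > 0
        and consignment_order_count_30d / order_count_30d > 0.8
    )
    return absolute_rule or ratio_rule
```

The resulting boolean can be used directly as a binary feature in the buyer dimension.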
According to one embodiment of the present invention, the seller and buyer pair information further comprises: the buyer and seller of the current transaction have recent refund and return statistics in the recent transaction.
According to an embodiment of the invention, the category information further comprises: recent refund and return statistical information of the primary category to which the commodity of the current transaction belongs.
According to an embodiment of the present invention, the recent refund and return statistical information is obtained by counting refunds and returns within a preset time range counted back from the current date.
The recent refund and return statistical information covers those committed transactions whose refund and return outcome is not yet determined, and has reference value for predicting the risk level of the current transaction. According to one embodiment of the invention, the recent refund and return statistical information is determined as follows:
For subjects of different dimensions (the seller, the buyer, the buyer-seller pair, and the primary category of the commodity of the current transaction, as described above), the refund and return information within a preset time range counted back from the current date is collected. For example, the refund and return information for the past 30 days from the current date is counted. Optionally, the refund and return information specifically includes: the number of refunded transactions, the order amount, the number of buyers corresponding to the seller or the number of sellers corresponding to the buyer, the ranking of the refund rate among industry peers, the year-over-year and period-over-period change in the refund rate, and the like.
That is, taking the current date as the starting point, the refunds and returns within a preset time range (such as the past 30 days) are counted to obtain the recent refund and return statistical information of the subjects of different dimensions. The recent refund and return statistical information is an absolute statistic.
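A minimal sketch of the recent (absolute) statistics follows. It assumes that each record carries a `date`, an `amount` and a `refund_requested` flag; these field names and the particular statistics returned are hypothetical. Unlike the historical statistics, no presentation period is skipped.

```python
from datetime import date, timedelta


def recent_refund_stats(transactions, today, window_days=30):
    """Absolute refund/return counts over the window ending at today."""
    window_start = today - timedelta(days=window_days)
    recent = [t for t in transactions if window_start <= t["date"] <= today]
    refunded = [t for t in recent if t["refund_requested"]]
    return {
        "orders": len(recent),
        "refund_orders": len(refunded),
        "refund_amount": sum(t["amount"] for t in refunded),
    }


buyer_history = [
    {"date": date(2022, 8, 1), "amount": 80.0, "refund_requested": True},
    {"date": date(2022, 9, 20), "amount": 20.0, "refund_requested": True},
    {"date": date(2022, 9, 25), "amount": 60.0, "refund_requested": False},
    {"date": date(2022, 10, 5), "amount": 35.5, "refund_requested": True},
]
stats = recent_refund_stats(buyer_history, today=date(2022, 10, 10))
```

Because outcomes in this window may still change, these counts describe refund behavior observed so far rather than settled results.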
According to an embodiment of the present invention, in the method 10 for predicting transaction risk, step S102 of extracting the first feature based on the information related to the current transaction comprises:
extracting the first feature from the information related to the current transaction through feature engineering, and constructing a feature vector. The feature engineering comprises data preprocessing, feature selection and dimension reduction. The data preprocessing comprises standardization, normalization and nondimensionalization of the data, binarization of quantitative features, and dummy coding of qualitative features. The feature selection comprises screening features with a preset algorithm, and the dimension reduction comprises reducing the amount of information with a preset algorithm. The feature vector includes a seller dimension, a buyer dimension, a buyer-seller pair dimension, a category dimension and an order dimension, which correspond respectively to the seller information, buyer information, buyer-seller pair information, category information and order information in the information related to the current transaction. A schematic diagram of extracting features from the information related to a current transaction and constructing the feature vector is shown in FIG. 3.
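The preprocessing operations named above can be illustrated with a minimal sketch that builds one feature vector. All field names, value ranges, the binarization threshold and the list of logistics providers are hypothetical; feature selection and dimension reduction are omitted.

```python
def min_max(value, low, high):
    """Scale a value into [0, 1] for a known range (normalization)."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)


def binarize(value, threshold):
    """Turn a quantitative feature into 0/1 (binarization)."""
    return 1.0 if value > threshold else 0.0


def one_hot(value, vocabulary):
    """Dummy-code a qualitative feature over a fixed vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


LOGISTICS_PROVIDERS = ("provider_a", "provider_b", "provider_c")  # hypothetical


def build_feature_vector(raw):
    """Concatenate seller, buyer, pair, category and order features."""
    seller = [raw["seller_refund_rate"]]
    buyer = [raw["buyer_refund_rate"], binarize(raw["buyer_orders_30d"], 50)]
    pair = [raw["pair_refund_rate"]]
    category = [raw["category_refund_rate"]]
    order = [
        min_max(raw["order_amount"], 0.0, 10000.0),
        min_max(raw["item_count"], 0, 100),
    ] + one_hot(raw["logistics_provider"], LOGISTICS_PROVIDERS)
    return seller + buyer + pair + category + order


vector = build_feature_vector({
    "seller_refund_rate": 0.1,
    "buyer_refund_rate": 0.2,
    "buyer_orders_30d": 60,
    "pair_refund_rate": 0.0,
    "category_refund_rate": 0.3,
    "order_amount": 2500.0,
    "item_count": 10,
    "logistics_provider": "provider_b",
})
```

Keeping the dimensions in a fixed order means that the first features extracted at prediction time line up with the second features used in training.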
According to one embodiment of the invention, in the method 10 for predicting transaction risk, the first prediction model is obtained by training on historical transaction data. The first prediction model is a transaction risk prediction model; it is a prior model trained on the committed transactions in the historical transaction data. Whether each committed transaction in the historical transaction data was refunded or returned is known, so the model parameters of the transaction risk prediction model are obtained by training on the correspondence between the committed transactions and their refund and return outcomes, which completes the construction of the model. In practical application, information about the current transaction that is suitable for transaction risk prediction is obtained, the first feature is extracted from that information and input into the pre-built transaction risk prediction model, and the posterior probability of a refund or return is obtained, which can be converted into the transaction risk level of the current transaction. Optionally, the historical transaction data used to train the first prediction model has the same dimensions as the input data (the information related to the current transaction) used when the first prediction model is applied.
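The source does not say how the posterior probability is converted into a risk level. One simple possibility, shown here purely as an assumption, is to bucket the probability with fixed cut points; the cut points and level names below are hypothetical.

```python
from bisect import bisect_right

# Hypothetical cut points on the predicted refund/return probability.
CUT_POINTS = [0.05, 0.20, 0.50]
LEVELS = ["low", "medium", "high", "very_high"]

def risk_level(probability):
    """Map a probability in [0, 1] to one of the discrete risk levels."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    # bisect_right counts how many cut points are <= probability.
    return LEVELS[bisect_right(CUT_POINTS, probability)]
```

In practice the cut points would presumably be calibrated on historical data, for example so that each level covers a chosen share of orders.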
According to an embodiment of the present invention, in the method 10 for predicting transaction risk, the historical transaction data includes: seller information, buyer-seller pair information, category information and order information of committed transactions. The historical transaction data is raw data and can be used to train the transaction risk prediction model after feature extraction.
According to an embodiment of the present invention, the seller information in the historical transaction data includes: historical refund and return statistical information of the seller of a committed transaction. Whether a transaction is risky depends strongly on its seller, so the historical refund and return statistical information of the sellers of committed transactions is used as training data for the transaction risk prediction model.
A committed transaction is a transaction for which receipt of the goods has been completed. Committed transactions include closed transactions (receipt of the goods is complete and no further return risk arises) and unfinished transactions (receipt of the goods is complete but a return may still occur afterwards).
According to an embodiment of the present invention, the buyer information in the historical transaction data comprises: historical refund and return statistical information of the buyer of a committed transaction. Whether a transaction is risky likewise depends on the purchasing habits of its buyer, so the historical refund and return statistical information of the buyers of committed transactions is also used as training data for the transaction risk prediction model.
According to an embodiment of the present invention, the buyer-seller pair information in the historical transaction data comprises: historical refund and return statistical information of the buyer and the seller of a committed transaction in their past transactions with each other. Whether a transaction is risky also depends on how well its buyer and seller are matched. For example, the refund risk is lower when the buyer is a long-standing purchasing customer of the seller; if a large proportion of the past transactions between the seller and the buyer ended in refunds or returns, the refund and return risk of the transaction is high. Therefore, the historical refund and return statistical information of the buyer and the seller in their past transactions is also used as training data for the transaction risk prediction model.
According to an embodiment of the invention, the category information in the historical transaction data comprises: historical refund and return statistical information of the primary category to which the commodity of a committed transaction belongs. Whether a transaction is risky depends on the primary category of the commodity; for example, clothing carries a higher refund and return risk, while household appliances carry a lower one. Therefore, the historical refund and return statistical information of the primary category to which the commodity of a committed transaction belongs is also used as training data for the transaction risk prediction model.
According to an embodiment of the invention, the order information in the historical transaction data comprises: the order amount, the commodity quantity and the logistics information of a committed transaction. The order amount, the commodity quantity and the logistics information (including the logistics supplier and the logistics fee) also strongly influence whether a transaction results in a refund or return, and they discriminate clearly between the two outcomes. Therefore, the order amount, the commodity quantity and the logistics information of committed transactions are also used as training data for the transaction risk prediction model.
The method for computing the historical refund and return statistical information in the historical transaction data, and the method for determining the primary category of the commodity of a committed transaction, are substantially the same as the corresponding methods for the information related to the current transaction; reference may be made to the description of those methods above, and details are not repeated here.
According to an embodiment of the present invention, in the method 10 for predicting transaction risk, the seller information of the historical transaction data further includes: recent refund and return statistical information of the seller of a committed transaction, transaction behavior information of that seller, and shop operation information of that seller. The transaction behavior information includes the seller's DSR score and the like; the shop operation information includes the number of users searching for the seller's shop, the shop return rate, the content operation capability and the like.
According to an embodiment of the invention, the buyer information of the historical transaction data further comprises: recent refund and return statistical information of the buyer of a committed transaction and affiliate attribute analysis information of that buyer. The buyer's affiliate attributes also have a significant impact on whether a transaction results in a refund or return.
According to an embodiment of the present invention, the buyer-seller pair information of the historical transaction data further comprises: recent refund and return statistical information of the buyer and the seller of a committed transaction in their recent transactions with each other.
According to an embodiment of the invention, the category information of the historical transaction data further comprises: recent refund and return statistical information of the primary category to which the commodity of a committed transaction belongs.
The method for computing the recent refund and return statistical information in the historical transaction data, and the method for determining the affiliate attribute analysis information of the buyer, are substantially the same as the corresponding methods for the information related to the current transaction; reference may be made to the description of those methods above, and details are not repeated here.
According to an embodiment of the present invention, in the method 10 for predicting transaction risk, the historical transaction data includes a plurality of committed transactions, and the seller information, buyer-seller pair information, category information and order information of the committed transactions are obtained through big data of a transaction platform.
According to an embodiment of the present invention, in the method 10 for predicting transaction risk, training the first prediction model on historical transaction data includes:
performing stratified sampling on the historical transaction data according to the label and/or the amount of the committed transactions through a binary classification model, wherein the label is either refunded/returned or not refunded/returned;
inputting the stratified-sampled historical transaction data into a LightGBM model, and generating the model parameters of the first prediction model.
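Stratified sampling by label and amount can be illustrated as below. This is a sketch under stated assumptions: the amount band edges, the sampling fraction and the record fields are hypothetical, and the source does not describe how its binary classification model carries out the sampling.

```python
import random
from collections import defaultdict

def amount_band(amount, edges=(50.0, 200.0)):
    """Map an order amount to a band index; the edges are hypothetical."""
    return sum(amount >= e for e in edges)

def stratified_sample(transactions, fraction, seed=0):
    """Sample the same fraction from every (label, amount band) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for t in transactions:
        strata[(t["refunded"], amount_band(t["amount"]))].append(t)
    sample = []
    for key in sorted(strata):
        group = strata[key]
        k = max(1, round(len(group) * fraction))  # keep every stratum represented
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical imbalanced data: 90 non-refunded and 10 refunded orders.
transactions = (
    [{"id": i, "refunded": False, "amount": 30.0} for i in range(90)]
    + [{"id": 100 + i, "refunded": True, "amount": 30.0} for i in range(10)]
)
sample = stratified_sample(transactions, fraction=0.2)
```

Sampling within each stratum keeps the ratio of refunded to non-refunded orders in the sample close to that of the full data set, which matters when refunds are the rare class.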
According to an embodiment of the present invention, the method 10 for predicting transaction risk further comprises:
determining the freight insurance premium of the current transaction according to the transaction risk level.
If the transaction risk level output by the first prediction model (the transaction risk prediction model) is high, the current transaction is judged to carry a high risk of refund and return, and the premium charged can be raised appropriately in the freight insurance underwriting and pricing scheme. If the transaction risk level output by the first prediction model is low, the current transaction is judged to carry a low risk of refund and return, and the premium charged can be lowered appropriately in the freight insurance underwriting and pricing scheme.
According to an embodiment of the present invention, the method 10 for predicting transaction risk further comprises:
determining whether the current transaction is a false transaction according to the transaction risk level.
If the transaction risk level output by the first prediction model (the transaction risk prediction model) is extremely high, the current transaction is judged to have a high probability of failing and is usually an abnormal transaction, and the seller or buyer should be reminded to watch for the risk of a false transaction.
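The two downstream uses of the risk level described above (adjusting the premium and warning about a possible false transaction) could be wired up as in the following sketch. The multipliers, the level names and both function names are assumptions for illustration; the source gives no concrete pricing rule or threshold.

```python
# Hypothetical multipliers applied to a base premium for each risk level.
RATE_MULTIPLIER = {"low": 0.8, "medium": 1.0, "high": 1.5, "very_high": 2.5}

def freight_insurance_premium(base_premium, level):
    """Scale the base premium up or down according to the risk level."""
    return round(base_premium * RATE_MULTIPLIER[level], 2)

def needs_false_transaction_warning(level):
    """Only the highest level triggers a reminder to the seller or buyer."""
    return level == "very_high"

for level in ("low", "medium", "high", "very_high"):
    print(level,
          freight_insurance_premium(1.00, level),
          needs_false_transaction_warning(level))
```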
According to an embodiment of the present invention, there is also provided an electronic apparatus including: a processor and a memory. Wherein:
the memory stores a computer program which, when executed by the processor, causes the processor to perform the method 10 of predicting transaction risk as described above.
According to an embodiment of the present invention, as shown in fig. 4, the present invention further provides an apparatus 100 for transaction risk prediction, comprising: an information acquisition unit 110, a feature extraction unit 120, and a prediction unit 130. Wherein:
The information acquisition unit 110 is configured to obtain information related to the current transaction.
The feature extraction unit 120 is configured to extract the first feature based on the information.
The prediction unit 130 is configured to input the first feature into a pre-trained first prediction model, and obtain a transaction risk level.
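The three units of the apparatus 100 form a simple pipeline, which can be sketched as follows. The class name, the callable signatures and the stand-in unit implementations are all hypothetical; they only illustrate how the output of each unit feeds the next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class RiskPredictionApparatus:
    acquire: Callable[[str], Dict]            # information acquisition unit 110
    extract: Callable[[Dict], List[float]]    # feature extraction unit 120
    predict: Callable[[List[float]], str]     # prediction unit 130

    def run(self, order_id: str) -> str:
        info = self.acquire(order_id)
        features = self.extract(info)
        return self.predict(features)

# Stand-in units, purely illustrative.
def acquire(order_id):
    return {"order_id": order_id, "amount": 120.0, "seller_refund_rate": 0.3}

def extract(info):
    return [info["amount"], info["seller_refund_rate"]]

def predict(features):
    # Placeholder rule standing in for the pre-trained first prediction model.
    return "high" if features[1] >= 0.2 else "low"

apparatus = RiskPredictionApparatus(acquire, extract, predict)
level = apparatus.run("order-1")
```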
According to an embodiment of the present invention, as shown in fig. 5, the present invention further provides a device 200 for generating a transaction risk prediction model. The transaction risk prediction model is obtained by training on historical transaction data that includes a plurality of committed transactions, and the device 200 includes a binary classification model 210 and a LightGBM model 220. Wherein:
The binary classification model 210 is configured to perform stratified sampling on the historical transaction data based on the labels and/or amounts of the committed transactions, wherein the label is either refunded/returned or not refunded/returned.
The LightGBM model 220 is configured to receive the stratified-sampled historical transaction data and generate the model parameters of the transaction risk prediction model.
The present invention also provides a non-transitory computer readable storage medium having stored thereon computer readable instructions which, when executed by a processor, cause the processor to perform the method 10 of predicting transaction risk as described above.
According to the transaction risk prediction method and the device for transaction risk prediction provided by the invention, a refund and return prediction model is built at the order dimension and refund and return predictions are made for individual orders. The method can accurately characterize refund and return risk at order granularity; it can serve as a reference in designing freight insurance underwriting and pricing schemes and is also suitable for identifying false transactions. In the method and the device, the refund and return statistical information is further subdivided into two aspects, refund and return behavior with a performance period and absolute statistics, which further improves the prediction accuracy of the transaction risk prediction model.
The freight insurance of the invention works as follows: if the buyer submits a return request before the transaction is completed, the insurance company insures the one-way freight incurred by the return. The claim amount typically ranges from 6 to 50 dollars and is calculated by the insurer; it is not the actual shipping cost paid by the buyer. The insurance company usually reviews the claim within 72 hours after the seller confirms the refund, and the payout is paid directly to the buyer's account.
The freight insurance underwriting and pricing scheme involved in the invention is: freight insurance premium = freight rate. In the prior art, every calendar month the insurance company determines the risk rate of the first party according to the seller's transactions and returns in the three months before the insurance is taken out; the freight rate is calculated automatically by the system and is not the freight actually paid by the merchant.
The claims and advance-payment losses involved in the invention are as follows: a claim is the amount advanced by the insurance company. High claims create claim risk; if the insurance company cannot recover the amount from the merchant after advancing it, the loss is a cost risk borne by the insurance company.
The LightGBM model is a gradient boosting framework based on decision tree algorithms.
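To make the idea of gradient boosting over decision trees concrete, the following toy sketch boosts depth-1 trees (stumps) on a single feature using squared loss. It is not LightGBM's implementation, which uses histogram-based split finding and leaf-wise tree growth, and a refund classifier would normally use a logistic loss; the data, the function names and the hyperparameters are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on one feature that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lv, rv)
    return best[1:]

def fit_boosting(xs, ys, rounds=20, learning_rate=0.5):
    """Each round fits a stump to the current residuals and adds it to the ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        # For squared loss the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, lv, rv = fit_stump(xs, residuals)
        stumps.append((threshold, lv, rv))
        preds = [p + learning_rate * (lv if x <= threshold else rv)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, lr = model
    return base + sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)

# Hypothetical data: seller refund rate as the feature, refunded (1) or not (0).
xs = [0.01, 0.02, 0.05, 0.20, 0.30, 0.40]
ys = [0, 0, 0, 1, 1, 1]
model = fit_boosting(xs, ys)
```

With this data every round picks the same split between 0.05 and 0.20 and halves the remaining residual, so the prediction for a low-refund-rate seller approaches 0 and that for a high-refund-rate seller approaches 1.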

Claims (14)

1. A method for predicting transaction risk, comprising:
acquiring information related to current transaction;
extracting a first feature based on the information related to the current transaction, the first feature being related to a statistical characteristic of the current transaction;
inputting the first feature into a pre-trained first prediction model to obtain a transaction risk level;
wherein the first predictive model is trained based on a plurality of second features, wherein each second feature is associated with a statistical characteristic of at least one historical transaction and corresponds to the first feature.
2. The method of claim 1, wherein said extracting a first feature based on the current transaction related information comprises:
the first feature is extracted based on one or more of seller information, buyer-seller pair information, category information, and order information for the current transaction.
3. The method of claim 2, wherein said extracting a first feature based on said current transaction related information further comprises:
extracting the first feature based on one or more of: historical refund and return statistical information of the seller of the current transaction, historical refund and return statistical information of the buyer of the current transaction, historical refund and return statistical information of the buyer and the seller of the current transaction in their historical transactions, historical refund and return statistical information of the primary category to which the commodity of the current transaction belongs, and the order amount, commodity quantity and logistics information of the current transaction; wherein
the historical refund and return statistical information is obtained by shifting back from the current date and counting the refunds and returns within a preset time range.
4. The method of claim 3, wherein said extracting a first feature based on said current transaction related information further comprises:
extracting the first feature based on one or more of: recent refund and return statistical information of the seller of the current transaction, transaction behavior information of the seller of the current transaction, shop operation information of the seller of the current transaction, recent refund and return statistical information of the buyer of the current transaction, affiliate attribute analysis information of the buyer of the current transaction, recent refund and return statistical information of the buyer and the seller of the current transaction in their recent transactions, and recent refund and return statistical information of the primary category to which the commodity of the current transaction belongs; wherein
the recent refund and return statistical information is obtained by counting the refunds and returns within a preset past time range counted back from the current date.
5. The method of any of claims 2-4, wherein said extracting a first feature based on the current transaction related information further comprises:
extracting the first features through feature engineering, and constructing a feature vector, wherein the feature vector comprises: seller dimensions, buyer-seller pair dimensions, category dimensions, order dimensions.
6. The method of any of claims 1-4, wherein the training based on the plurality of second features comprises:
extracting the plurality of second features based on one or more of seller information, buyer-seller pair information, category information and order information of a plurality of historical transactions, and training the first prediction model based on the plurality of second features.
7. The method of claim 6, wherein the training based on the plurality of second features further comprises:
extracting the plurality of second features based on one or more of: historical refund and return statistical information of the sellers of the historical transactions, historical refund and return statistical information of the buyers of the historical transactions, historical refund and return statistical information of the buyers and the sellers in their historical transactions, historical refund and return statistical information of the primary categories to which the commodities of the historical transactions belong, and the order amounts, commodity quantities and logistics information of the historical transactions.
8. The method of claim 7, wherein the training based on the plurality of second features further comprises:
extracting the plurality of second features based on one or more of: recent refund and return statistical information of the sellers of the historical transactions, transaction behavior information of the sellers of the historical transactions, shop operation information of the sellers of the historical transactions, recent refund and return statistical information of the buyers of the historical transactions, affiliate attribute analysis information of the buyers of the historical transactions, recent refund and return statistical information of the buyers and the sellers of the historical transactions in their recent transactions, and recent refund and return statistical information of the primary categories to which the commodities of the historical transactions belong.
9. The method of claim 6, wherein seller information, buyer-seller pair information, category information, order information for the plurality of historical transactions are obtained through big data of a transaction platform.
10. The method of any of claims 6-9, wherein the training the first predictive model based on the plurality of second features comprises:
performing stratified sampling on the plurality of historical transactions through a binary classification model according to the labels and/or the amounts of the plurality of historical transactions, wherein the label is either refunded/returned or not refunded/returned;
inputting the stratified-sampled data into a LightGBM model to generate model parameters of the first prediction model.
11. The method of any of claims 1-4, further comprising:
determining the freight insurance premium of the current transaction according to the transaction risk level; and/or
determining whether the current transaction is a false transaction according to the transaction risk level.
12. An electronic device, comprising:
a processor; and
a memory storing a computer program that, when executed by the processor, causes the processor to perform the method of any of claims 1-11.
13. An apparatus for generating a transaction risk prediction model, the transaction risk prediction model being trained from historical transaction data, the historical transaction data including a plurality of committed transactions, the apparatus comprising:
a binary classification model configured to perform stratified sampling on the historical transaction data according to a label and/or an amount of the committed transactions, wherein the label is either refunded/returned or not refunded/returned;
a LightGBM model configured to receive the stratified-sampled historical transaction data and generate model parameters of the transaction risk prediction model.
14. A non-transitory computer readable storage medium having computer readable instructions stored thereon, which when executed by a processor, cause the processor to perform the method of any one of claims 1-11.
CN202211234310.0A 2022-10-10 2022-10-10 Transaction risk prediction method and device for transaction risk prediction Pending CN115907840A (en)

Publications (1)

Publication Number Publication Date
CN115907840A true CN115907840A (en) 2023-04-04

Family

ID=86483088

Country Status (1)

Country Link
CN (1) CN115907840A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116739719A (en) * 2023-08-14 2023-09-12 南京大数据集团有限公司 Flow configuration system of transaction platform
CN116739719B (en) * 2023-08-14 2023-11-03 南京大数据集团有限公司 Flow configuration system and method of transaction platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240226

Address after: Room 303, 3rd Floor, Building 5, No. 699 Wangshang Road, Changhe Street, Binjiang District, Hangzhou City, Zhejiang Province, 310052

Applicant after: Hangzhou Alibaba Overseas Internet Industry Co.,Ltd.

Country or region after: China

Address before: Room 554, 5 / F, building 3, 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province 311100

Applicant before: Alibaba (China) Co.,Ltd.

Country or region before: China