CN112418864A - Data sending method and device - Google Patents

Info

Publication number
CN112418864A
Authority
CN
China
Prior art keywords
service
data
party
service data
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011241257.8A
Other languages
Chinese (zh)
Inventor
王喜
熊秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN202011241257.8A
Publication of CN112418864A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders
    • G06Q30/0637 Approvals

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present specification discloses a data sending method and device. A service platform obtains service data of service parties and, for each piece of service data, performs feature extraction in each service dimension to obtain the feature data of that service data. For each service party, the service platform then clusters the feature data of the service data corresponding to that service party in a preset manner to obtain a clustering result, and determines abnormal service data among the service data of that service party according to the clustering result. Finally, the service platform determines service strategy data for the abnormal service data and sends the service strategy data to the service party.

Description

Data sending method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for data transmission.
Background
Currently, users can handle various services online through a service platform; for example, a user can purchase goods through the platform. In practice, a service executed by a user may pose a risk to the service platform. For example, if the promotion policy that the platform sets for a merchant's sales promotion has a loophole and users place fraudulent orders with that merchant (order brushing), the merchant may suffer a heavy loss. Therefore, to provide a safer service and protect the platform, the service platform may, after services have been processed, manually pick out service data that may carry a risk, so as to discover risky services and provide a corresponding service strategy to the merchant.
Disclosure of Invention
The specification provides a data transmission method and device for providing business strategies for merchants.
The technical scheme adopted by the specification is as follows:
the present specification provides a method of data transmission, comprising:
acquiring service data of a service party, wherein the service data is generated when a user executes a service provided by the service party;
for each service data, performing feature extraction on the service data under each service dimension to obtain feature data of the service data;
clustering the characteristic data of each service data corresponding to each service party according to a preset mode aiming at each service party to obtain a clustering result;
determining abnormal service data from the service data corresponding to the service party according to the clustering result;
and determining the service strategy data aiming at the abnormal service data according to the abnormal service data, and sending the service strategy data to the service party.
Optionally, for each service party, clustering feature data of each service data corresponding to the service party according to a preset manner to obtain a clustering result, specifically including:
determining the weight of each service dimension under each service party aiming at each service party;
and according to a preset mode, clustering the characteristic data of each service data corresponding to the service party according to the corresponding weight of each service dimension under the service party to obtain a clustering result.
Optionally, for each service party, determining a weight corresponding to each service dimension under the service party includes:
for each service dimension, determining the information entropy of other service dimensions under the service party according to the feature data of other service dimensions except the service dimension in each service data corresponding to the service party, and determining the information entropy of all service dimensions under the service party according to the feature data of each service data corresponding to the service party;
determining the information entropy difference between the information entropies of all the service dimensions under the service party and the information entropies of other service dimensions under the service party;
and determining the corresponding weight of the service dimension under the service according to the information entropy difference.
Optionally, determining abnormal service data from the service data of the service party according to the clustering result, which specifically includes:
for each cluster contained in the clustering result, if the number of feature points contained in the cluster is less than a first set number, determining that the service data corresponding to the cluster is abnormal service data, wherein each feature point corresponds to the feature data of one service data.
Optionally, for each service party, clustering feature data of each service data corresponding to the service party according to a preset manner to obtain a clustering result, specifically including:
determining the distance between the characteristic points of the service data corresponding to the service party according to the characteristic data of the service data corresponding to the service party, wherein each characteristic point corresponds to the characteristic data of one service data;
for each feature point, if the number of other feature points in the preset neighborhood range of the feature point is determined to be not less than a second set number according to the distance, the feature point is used as a core feature point, and the core feature point and the feature point in the preset neighborhood range of the core feature point are used as core clusters corresponding to the core feature point;
determining feature points between core feature points in any two core clusters as target points;
and if any two adjacent target points are determined to meet the preset condition, merging the two core clusters, wherein for any two adjacent target points, if one target point is located in the preset neighborhood range of the other target point, the two adjacent target points meet the preset condition.
Optionally, for each service party, clustering feature data of each service data corresponding to the service party according to a preset manner to obtain a clustering result, specifically including:
for each service data of the service party, if the time interval between the current time and the service execution time corresponding to the service data is determined to exceed the set duration, deleting the service data;
and clustering the characteristic data of the residual service data corresponding to the service party to obtain the clustering result.
Optionally, the method further comprises:
acquiring newly added service data corresponding to the service party;
updating the clustering result according to the newly added service data to obtain an updated clustering result;
and sending a service strategy to the service party according to the updated clustering result.
This specification provides an apparatus for data transmission, comprising:
the acquisition module is used for acquiring service data of a service party, wherein the service data is generated when a user executes a service provided by the service party;
the extraction module is used for extracting the characteristics of the service data under each service dimension aiming at each service data to obtain the characteristic data of the service data;
the clustering module is used for clustering the characteristic data of each service data corresponding to each service party according to a preset mode for each service party to obtain a clustering result;
the determining module is used for determining abnormal service data from all service data corresponding to the service party according to the clustering result;
and the sending module is used for determining the service strategy data aiming at the abnormal service data according to the abnormal service data and sending the service strategy data to the service party.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described method of data transmission.
The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above-mentioned data transmission method when executing the program.
The technical scheme adopted by the specification can achieve the following beneficial effects:
in the data sending method provided in this specification, a service platform may obtain service data of a service party, and perform feature extraction on the service data in each service dimension for each service data, to obtain feature data of the service data. Then, the service platform can cluster the feature data of each service data corresponding to each service party according to a preset mode for each service party to obtain a clustering result, and determine abnormal service data from each service data corresponding to each service party according to the clustering result. The service platform can determine the service strategy data aiming at the abnormal service data according to the abnormal service data and send the service strategy data to the service party.
It can be seen from the above method that the service platform can cluster the service data in real time after acquiring it and determine abnormal service data from the clustering result. The service platform can therefore find abnormal service data quickly after acquiring the service data and provide a service strategy for the service party according to the abnormal service data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification and constitute a part of this specification, illustrate embodiments of the specification and, together with the description, serve to explain the specification without limiting it. In the drawings:
Fig. 1 is a schematic flow chart of a data sending method in this specification;
Fig. 2 is a schematic diagram of clustering according to a preset manner provided in the present specification;
Fig. 3 is a schematic diagram of a data sending apparatus provided in the present specification;
Fig. 4 is a schematic diagram of an electronic device corresponding to Fig. 1 provided in the present specification.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more clear, the technical solutions of the present disclosure will be clearly and completely described below with reference to the specific embodiments of the present disclosure and the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without any creative effort belong to the protection scope of the present specification.
In practice, after a user performs service processing on a service platform, the service platform stores the user's service data. For example, after a user purchases a commodity on the service platform, the platform may store service data such as the time, place, and amount of the order. If the service platform needs to determine whether the service processing carries a risk, abnormal service data can be identified manually in the stored service data. However, this approach is inefficient. In addition, abnormal service data is usually detected in some manner (such as clustering or classification) only after a large amount of service data has accumulated on the service platform, so the platform often cannot find abnormal service data in time.
In order to solve the above problems, the present specification provides a data sending method, in which a service platform may obtain service data of a service party in real time and, for each piece of service data, perform feature extraction in each service dimension to obtain the feature data of that service data. Then, for each service party, the service platform can cluster the feature data of the service data corresponding to that service party, and determine abnormal service data among the service data of that service party according to the obtained clustering result.
It can be seen that the method can determine in time that abnormal service data exists, which improves the efficiency of detecting abnormal data. If users place fraudulent orders with a merchant (order brushing), as in the example above, the service platform can detect this behavior through the method and provide the merchant with a corresponding service strategy (for example, an adjusted promotion strategy), so that the merchant's loss is reduced in time.
The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a data sending method in this specification, which specifically includes the following steps:
S101: Acquire service data of a service party, where the service data is generated when a user executes a service provided by the service party.
In practice, a service platform may include a plurality of service parties that provide services for users. A service party here may be a merchant resident on the service platform. Of course, a service party may also be a sub-platform included in the service platform for providing a particular service; for example, the service platform may include a takeout platform, a taxi-hailing platform, and the like.
Based on this, the service platform may obtain service data of each service party, where the service data of a service party is generated when users execute the services provided by that service party. For example, if a user orders takeout from merchant A through the service platform, the service platform generates order data (i.e., service data) for that takeout order, which may include the order amount, the time the user placed the order, data on the goods purchased, and so on.
S102: For each service data, perform feature extraction on the service data under each service dimension to obtain feature data of the service data.
S103: For each service party, cluster the feature data of each service data corresponding to the service party according to a preset mode to obtain a clustering result.
After the service platform obtains the service data of the service party, the service platform can perform feature extraction on the service data under each service dimension aiming at each service data to obtain the feature data of the service data. That is, the service platform needs to extract feature data corresponding to each service dimension.
Wherein, each service dimension can be set according to actual requirements. For example, if the service data is order data, each service dimension may include order payment time, order payment amount, order transaction location, network traffic consumed by a user to place an order, and the like. The business platform needs to cluster the feature data based on these business dimensions.
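The feature extraction of S102 can be pictured with a minimal sketch. The order field names (`paid_at`, `amount`, `lat`, `lon`, `traffic_kb`) and the encoding of each service dimension are assumptions made purely for illustration; the specification does not define a data format.

```python
from datetime import datetime
from typing import List

def extract_features(order: dict) -> List[float]:
    """Turn one order record into a numeric feature vector, one entry per assumed dimension."""
    paid_at = datetime.fromisoformat(order["paid_at"])
    return [
        float(paid_at.hour * 60 + paid_at.minute),  # order payment time, minutes since midnight
        float(order["amount"]),                     # order payment amount
        float(order["lat"]),                        # order transaction location, latitude
        float(order["lon"]),                        # order transaction location, longitude
        float(order["traffic_kb"]),                 # network traffic consumed to place the order
    ]

# Hypothetical order record used only to exercise the function.
order = {"paid_at": "2020-11-09T12:15:00", "amount": 35.5,
         "lat": 39.9, "lon": 116.4, "traffic_kb": 120}
features = extract_features(order)
```

Each resulting vector plays the role of one feature point in the clustering step.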
It should be noted that the service platform may clean the service data before performing feature extraction, for example by deduplicating records and filling in missing values. If the feature data needs to include a time attribute such as the order payment time, the service platform can clean that attribute as follows. If the order payment time is missing from the service data or is wrong (for example, it is earlier than the time when the user placed the order), the service platform may obtain the time data related to the user's execution of the corresponding service, sort that time data, and correct the wrong order payment time or supplement the missing one according to the sorting result and a preset time constraint.
The time data mentioned above may include the time when the user opened the application (APP), the final time when the user added goods to the shopping cart, the time when the user placed the order, the order payment time, and the like. The chronological order of these times and the intervals between them usually follow a certain pattern: the user opens the APP, then finishes adding goods to the shopping cart, then places the order, and then pays; and the intervals from the final add-to-cart time to the order placing time, and from the order placing time to the order payment time, are generally not long.
For example, if the service platform determines that the order payment time is missing from certain order data (i.e., service data), it acquires the time data related to that order: the user opened the APP at 12:00, the final time the user added goods to the shopping cart was 12:11, and the user placed the order at 12:13. After sorting these times, the service platform determines that they follow the expected sequence, so it can estimate the order payment time from these three times and supplement it into the order data. Given the constraint that the interval between the order placing time and the order payment time cannot exceed 15 minutes, and given that the interval between the final add-to-cart time and the order placing time is 2 minutes, the service platform can predict the order payment time as 12:15. Of course, the service platform may also determine the order payment time in other ways, for example from the user's historical average interval between the order placing time and the order payment time.
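The worked example above can be reproduced with a short sketch. It assumes one possible reading of the text, namely that the missing payment delay is taken to equal the interval between the final add-to-cart time and the order placing time, capped by the 15-minute constraint; the function name and signature are hypothetical.

```python
from datetime import datetime, timedelta

MAX_PAY_DELAY = timedelta(minutes=15)  # constraint quoted in the example

def estimate_payment_time(app_open, cart_final, order_placed):
    """Estimate a missing order payment time, or return None if the known times are out of order."""
    if not (app_open <= cart_final <= order_placed):
        return None  # the sequence rule is violated, so no estimate is made
    # Assumption: reuse the cart-to-order interval as the order-to-payment delay.
    delay = min(order_placed - cart_final, MAX_PAY_DELAY)
    return order_placed + delay

day = datetime(2020, 11, 9)
estimated = estimate_payment_time(
    day.replace(hour=12, minute=0),   # APP opened
    day.replace(hour=12, minute=11),  # final add-to-cart time
    day.replace(hour=12, minute=13),  # order placed
)
```

A platform using the historical-average alternative would replace the `delay` line with the user's average order-to-payment interval.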
In this specification, after the service platform has cleaned the service data and extracted the feature data of each service data, it may, for each service party, cluster the feature data of the service data corresponding to that service party in a preset manner to obtain a clustering result. After clustering, similar feature data is grouped into one cluster, whereas abnormal service data forms a smaller cluster or is not merged into any cluster, so that the service platform can subsequently detect abnormal service data from the clustering result. If the service parties are merchants resident on the service platform, a clustering result is calculated separately for each merchant from the feature data of that merchant's service data, and the service platform can finally determine whether abnormal service data exists in the service data of a given merchant.
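How a clustering result might be turned into a list of abnormal service data can be sketched as follows. The label convention (an integer cluster id per feature point, with -1 for points that joined no cluster) is an assumption borrowed from common clustering libraries, and `first_set_number` stands for the first set number mentioned in the summary above.

```python
from collections import Counter

NOISE = -1  # assumed label for feature points that were not merged into any cluster

def abnormal_indices(labels, first_set_number):
    """Return indices of service data in clusters smaller than first_set_number, or in no cluster."""
    sizes = Counter(label for label in labels if label != NOISE)
    return [i for i, label in enumerate(labels)
            if label == NOISE or sizes[label] < first_set_number]

labels = [0, 0, 0, 0, 1, 1, -1]  # hypothetical clustering result for one service party
flagged = abnormal_indices(labels, first_set_number=3)
```

Here the two-point cluster and the unclustered point are flagged, while the four-point cluster is treated as normal.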
Because the feature data corresponds to different service dimensions, and the influence degrees of the different service dimensions on the clustering result may also be different, the weights for calculating the clustering result can be given to the different service dimensions in practical application. That is, the service platform may determine, for each service party, a weight corresponding to each service dimension below the service party, and then, when clustering the feature data, may cluster the feature data of each service data corresponding to the service party according to the weight corresponding to each service dimension below the service party in a preset manner. That is to say, when calculating the clustering result, the feature data under different service dimensions have different weights, and the service platform needs to calculate the clustering result according to the weights.
In practical applications, the way of determining the corresponding weight of each service dimension under the service is not unique. Specifically, the service platform may determine, for each service dimension, the information entropy of other service dimensions under the service party according to the feature data of other service dimensions except the service dimension in each service data corresponding to the service party, and determine the information entropy of all service dimensions under the service party according to the feature data of each service data corresponding to the service party. Then, the service platform can determine the information entropy difference between the information entropies of all the service dimensions under the service party and the information entropies of other service dimensions under the service party, and determine the corresponding weight of the service dimension under the service party according to the determined information entropy difference.
That is, to determine the weight of one service dimension under the service party, this approach calculates the information entropy of the feature data over all service dimensions corresponding to the service party (i.e., the information entropy of all service dimensions under the service party) and the information entropy of the feature data with only that service dimension removed (i.e., the information entropy of the other service dimensions under the service party). Since information entropy represents the uncertainty of information, the difference between the two represents the uncertainty that the service dimension contributes to the overall feature data. The service platform can therefore determine the weight of the service dimension from the information entropy difference: the lower the difference, the higher the weight should be, so that the weight represents the importance of the service dimension in the overall feature data corresponding to the service party.
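A rough sketch of the entropy-difference weighting follows. It makes several assumptions that the specification leaves open: feature values are already bucketed into discrete categories, entropies are estimated from the empirical frequency of each combination of values, and the mapping from difference to weight is a normalized `exp(-difference)`, chosen only because it gives higher weight to a lower difference.

```python
import math
from collections import Counter

def entropy(rows):
    """Shannon entropy (natural log) of the empirical distribution of hashable rows."""
    counts = Counter(rows)
    n = len(rows)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def entropy_difference_weights(rows):
    """One weight per service dimension; a lower entropy difference gives a higher weight."""
    h_all = entropy([tuple(r) for r in rows])  # all service dimensions
    diffs = []
    for d in range(len(rows[0])):
        others = [tuple(v for j, v in enumerate(r) if j != d) for r in rows]
        diffs.append(h_all - entropy(others))  # all dimensions minus the other dimensions
    raw = [math.exp(-diff) for diff in diffs]  # assumed monotone-decreasing mapping
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical bucketed feature data: (payment time bucket, amount bucket).
rows = [("lunch", "low"), ("lunch", "low"), ("dinner", "low"), ("dinner", "high")]
weights = entropy_difference_weights(rows)
```

In this toy data, removing the amount dimension changes the entropy less than removing the time dimension, so the amount dimension receives the larger weight.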
Of course, the service platform may also directly calculate the information entropy of a service dimension under the service party from the feature data of that service dimension, and determine the weight of the service dimension from this information entropy. Because the information entropy here represents the uncertainty of the feature data of that service dimension, the lower the information entropy, the higher the weight should be. For example, for each service dimension, the service platform may take the reciprocal of the information entropy of that service dimension under the service party, normalize the reciprocals, and use the normalized result as the weight of the service dimension. Alternatively, the service platform may determine the weight from the difference between a set value and the information entropy. The information entropy is calculated as follows:
$$H = -\sum_{i=1}^{k} q(x_i)\,\log q(x_i)$$

If the information entropy of a single service dimension under the service party is calculated directly, $H$ is the information entropy, $k$ is the number of feature data corresponding to the service party, $x_i$ is the $i$-th feature data of the service party in that service dimension, and $q(x_i)$ is the probability corresponding to $x_i$. The service platform can determine $q(x_i)$ as the ratio of $x_i$ to the sum of the feature data of that service dimension corresponding to the service party.
If the weight of the service dimension is instead determined from the difference between the information entropy of all service dimensions under the service party and the information entropy of the other service dimensions, both information entropies can be calculated with the same formula. When calculating the information entropy of all service dimensions under the service party, $x_i$ is the feature data corresponding to the $i$-th service data of the service party; the service platform can determine the probability corresponding to the feature data of $x_i$ under each service dimension and then determine $q(x_i)$ from these probabilities.
The same applies to the information entropy of the other service dimensions under the service party: $x_i$ is the feature data of the $i$-th service data of the service party with the feature data of the service dimension in question removed (i.e., the feature data of the other service dimensions); the service platform can determine the probability corresponding to the feature data of $x_i$ under each of the other service dimensions and then determine $q(x_i)$ from these probabilities. It should be noted that, when calculating $q(x_i)$, the service platform may first normalize the feature data under the same service dimension and then calculate $q(x_i)$ from the normalized feature data.
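The direct per-dimension variant can be sketched as below, using the ratio-based probability (each value divided by the sum over the dimension) and the normalized reciprocal of the entropy as the weight. The natural logarithm, the assumption of non-negative feature values, and the small `eps` guard against a zero entropy are choices made for this sketch, not details given in the specification.

```python
import math

def dimension_entropy(values):
    """Entropy of one service dimension, with q(x_i) = x_i / sum of the dimension's values."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]  # zero values contribute nothing
    return -sum(p * math.log(p) for p in probs)

def reciprocal_entropy_weights(columns, eps=1e-12):
    """Normalize the reciprocals of the per-dimension entropies into weights."""
    inverse = [1.0 / (dimension_entropy(col) + eps) for col in columns]
    total = sum(inverse)
    return [w / total for w in inverse]

# Hypothetical feature data of one service party, one list per service dimension.
amounts = [10.0, 10.0, 10.0, 10.0]  # evenly spread, so entropy is high
traffic = [1.0, 1.0, 1.0, 97.0]     # concentrated, so entropy is low
weights = reciprocal_entropy_weights([amounts, traffic])
```

The concentrated dimension has the lower entropy and therefore receives the higher weight, matching the rule stated in the text.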
In practical applications, the service platform may perform clustering with any of a number of clustering algorithms (e.g., the K-MEANS algorithm, the CLARANS algorithm, etc.). In this specification, the service platform may cluster the feature data of the service data corresponding to the service party in the following manner:
the service platform may determine a distance between feature points of each service data corresponding to the service party according to the feature data of each service data corresponding to the service party, where each feature point corresponds to a feature data of one service data. The distance between feature points may be calculated by using a euclidean distance, a cosine distance, or the like. Then, for each feature point, if the number of other feature points located in the preset neighborhood range of the feature point is determined to be not less than a second set number according to the determined distance, the service platform may use the feature point as a core feature point, and use the core feature point and the feature point located in the preset neighborhood range of the core feature point as a core cluster corresponding to the core feature point. The second setting number and the preset neighborhood range can be set according to actual requirements.
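The core-cluster step can be sketched as follows. Folding the per-dimension weights into a weighted Euclidean distance is an assumption about how the weights enter the calculation; `eps` stands for the preset neighborhood range and `second_set_number` for the second set number.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature points (weighting scheme assumed)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def core_clusters(points, weights, eps, second_set_number):
    """Map the index of each core feature point to the indices in its core cluster."""
    clusters = {}
    for i, p in enumerate(points):
        neighbors = [j for j, q in enumerate(points)
                     if j != i and weighted_distance(p, q, weights) <= eps]
        if len(neighbors) >= second_set_number:  # "not less than a second set number"
            clusters[i] = [i] + neighbors
    return clusters

# Hypothetical two-dimensional feature points for one service party.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
clusters = core_clusters(points, weights=(1.0, 1.0), eps=0.2, second_set_number=2)
```

The three nearby points each become a core feature point, while the isolated point at (5.0, 5.0) has no neighbors and forms no core cluster.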
After each core cluster is determined, the service platform can determine a feature point between core feature points in any two core clusters as a target point. If it is determined that any two adjacent target points all meet the preset condition, merging the two core clusters, wherein for any two adjacent target points, if one target point is located in the preset neighborhood range of the other target point, the two adjacent target points meet the preset condition, as shown in fig. 2.
Fig. 2 is a schematic diagram of clustering performed according to a preset manner provided in the present specification.
In fig. 2, a five-pointed star represents a core feature point and a circle represents any other feature point. Clustering in this manner amounts to first finding the core feature points among the feature points, then taking each core feature point together with the feature points in its preset neighborhood range as a core cluster (a basic cluster centered on the core feature point), and finally deciding whether to merge core clusters according to the relationship between their core feature points. As can be seen from fig. 2, the core clusters centered on points A, B, C, D, E, and F are merged into one cluster containing more feature points, whereas the core cluster centered on point T cannot be merged with any other core cluster, and point G is not merged into any cluster.
As mentioned above, if every pair of adjacent target points among the target points between the core feature points of two core clusters meets the preset condition, the two core clusters may be merged. A target point between core feature points may be a feature point located along the straight line connecting the two core feature points. Take one pair of core feature points in fig. 2 as an example: the dashed double-arrow line between points A and B indicates the direction of the straight line connecting the two core feature points. Points O, X, Y, and D are all located in this direction, while point Z is not. Point B is located within the preset neighborhood range of point A, so the core cluster centered on point B can be merged with the core cluster centered on point A. Point C is located within the preset neighborhood range of point B, so the core cluster centered on point C can be merged with the first two core clusters. Likewise, point E is located within the preset neighborhood range of point C, point F is located within the neighborhood range of point E, and point D is located within the neighborhood range of point B, so the core clusters centered on points E, F, and D can also be merged with the core cluster centered on point A.
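The merging of core clusters can be sketched with a union-find structure. The sketch deliberately simplifies the rule: it merges two core clusters whenever one core feature point lies within the neighborhood radius of the other (as with points A and B in the walk-through), and it does not model chains of intermediate target points along the connecting line. The input format, the radius parameter `eps`, and the function name are assumptions for illustration.

```python
import math


def merge_core_clusters(points, core_clusters, eps):
    """Merge core clusters whose core points are within `eps` of each other.

    points: list of coordinate tuples.
    core_clusters: {core_point_index: set of member indices}.
    Returns a list of merged clusters (sets of point indices).
    """
    parent = {c: c for c in core_clusters}

    def find(c):
        # Find the representative of c, compressing the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    cores = sorted(core_clusters)
    for idx, a in enumerate(cores):
        for b in cores[idx + 1:]:
            if math.dist(points[a], points[b]) <= eps:
                parent[find(a)] = find(b)

    merged = {}
    for c, members in core_clusters.items():
        merged.setdefault(find(c), set()).update(members)
    return list(merged.values())
```

Because merging is transitive, a chain such as A to B to C ends up in one cluster even when A and C are not within range of each other.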
S104: Determine abnormal service data from the service data corresponding to the service party according to the clustering result.
S105: Determine service policy data for the abnormal service data according to the abnormal service data, and send the service policy data to the service party.
After the service platform clusters the feature data of the service data corresponding to the service party to obtain a clustering result, abnormal service data can be determined from each service data corresponding to the service party according to the clustering result.
For each cluster included in the clustering result, if the number of feature points included in the cluster is smaller than a first set number, the service platform may determine that the service data corresponding to the cluster is abnormal service data. That is, if a cluster contains only a few feature points, the service data corresponding to each feature point in the cluster can be regarded as abnormal service data. Of course, if a feature point is not included in any cluster, the service data corresponding to that feature point may also be regarded as abnormal service data (e.g., point G in fig. 2).
After determining the abnormal service data according to the clustering result, the service platform can determine the service policy data for the abnormal service data and send the service policy data to the service party.
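The rule for picking out abnormal service data from a clustering result can be sketched as follows. The representation of clusters as sets of point indices and the name `find_abnormal` are assumptions made for illustration.

```python
def find_abnormal(num_points, clusters, first_set_number):
    """Return the indices of feature points regarded as abnormal.

    A point is abnormal if its cluster has fewer than `first_set_number`
    members, or if it does not belong to any cluster at all.
    """
    abnormal = set()
    clustered = set()
    for members in clusters:
        clustered |= set(members)
        if len(members) < first_set_number:
            abnormal |= set(members)
    # Points that were never assigned to a cluster are also abnormal.
    abnormal |= set(range(num_points)) - clustered
    return abnormal
```

For instance, with seven points, clusters `{0, 1, 2, 3}` and `{4, 5}`, and a first set number of 3, the small cluster and the unclustered point 6 are flagged.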
For example, if the service data is order data generated when users purchase takeout and the service party is a merchant on the service platform, the abnormal service data may be order data from users who place orders by exploiting a loophole in the merchant's preferential policy. The service policy data may then be the merchant preferential policy involved in the orders corresponding to the abnormal service data, an adjusted preferential policy that the service platform suggests to the merchant, and so on. After receiving the service policy data, the merchant may adjust or close the relevant preferential policy to avoid more serious losses.
For another example, if the service party is a shopping platform, the detected abnormal service data may be order data corresponding to a user who uses a script to perform billing operations. After detecting the abnormal service data, the service platform may determine whether the corresponding user performed a large number of billing operations in a very short time. If so, the user may be using a script, and the service platform may send service policy data containing the user's identifier to the shopping platform, so that the shopping platform can verify the user's identity according to the received service policy data.
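The check for "a large number of operations in a very short time" can be sketched with a sliding window over each user's timestamps. The thresholds `window_seconds` and `max_orders`, the input format, and the function name are hypothetical; the specification does not fix any of them.

```python
from collections import defaultdict


def users_with_bursts(orders, window_seconds, max_orders):
    """Return the users who have more than `max_orders` operations inside any
    time window of length `window_seconds`.

    orders: iterable of (user_id, timestamp_in_seconds).
    """
    by_user = defaultdict(list)
    for user_id, ts in orders:
        by_user[user_id].append(ts)

    flagged = set()
    for user_id, times in by_user.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            # Shrink the window until it spans at most window_seconds.
            while t - times[start] > window_seconds:
                start += 1
            if end - start + 1 > max_orders:
                flagged.add(user_id)
                break
    return flagged
```

The identifiers returned by such a check are the kind of user identifier that the service policy data could carry to the shopping platform.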
In addition, in order to detect abnormal service data in time, the service platform may obtain newly added service data corresponding to the service party (i.e., service data of the service party that is newly generated in real time), and update the clustering result according to the newly added service data to obtain an updated clustering result. The service platform may then determine the abnormal service data in the newly added service data according to the updated clustering result, and send a service policy to the service party according to the abnormal service data.
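One simple way to handle a newly added feature point is sketched below. It is an assumption rather than the update procedure of this specification: the new point joins the cluster of the nearest existing core feature point if that core point is within the neighborhood radius `eps`, and otherwise stays unclustered as a candidate for abnormal service data. The sketch does not re-evaluate which points are core feature points.

```python
import math


def assign_new_point(new_point, core_points, eps):
    """Return the cluster id for `new_point`, or None if it fits no cluster.

    core_points: list of (cluster_id, coordinates) for existing core points.
    """
    best_id, best_dist = None, eps
    for cluster_id, coords in core_points:
        d = math.dist(new_point, coords)
        if d <= best_dist:
            best_id, best_dist = cluster_id, d
    return best_id
```

A production system would also need to decide when accumulated new points justify a full re-clustering.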
In addition, before clustering, the service platform may delete service data that was generated long ago, thereby saving computing and storage resources. Specifically, for each service data of the service party, if the service platform determines that the time interval between the current time and the service execution time corresponding to the service data exceeds a set duration, it may delete the service data, and then cluster the feature data of the remaining service data corresponding to the service party to obtain the clustering result. The set duration may be a relatively long period chosen according to actual requirements.
In the data transmission method provided in this specification, the steps of extracting features from the service data, determining the weights of the service dimensions, processing the feature data of the service data corresponding to the service party, and so on may be executed on the Flink computing framework. For example, the weights of the service dimensions may be determined in parallel by the Flink computing framework. For another example, the Flink computing framework can compute the clustering results of the respective service parties in parallel. The method can therefore use the Flink computing framework to improve data processing efficiency, so that abnormal service data can be detected in time.
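The retention rule can be sketched in a few lines. The record layout (a tuple whose first element is the service execution time) and the name `drop_expired` are assumptions for illustration.

```python
from datetime import datetime, timedelta


def drop_expired(records, now, max_age):
    """Keep only the records whose execution time is within `max_age` of `now`.

    records: list of (execution_time, features) tuples.
    """
    return [r for r in records if now - r[0] <= max_age]
```

For example, with `max_age=timedelta(days=30)`, a record executed the previous day is kept for clustering while one executed ten months earlier is dropped.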
In the process, the service platform can detect abnormal service data in the service data of the service party in a clustering mode, and can also obtain real-time newly added service data and update a clustering result according to the newly added service data, so that the abnormal service data in the newly added service data can be detected. Therefore, the method can detect the abnormal service data in time, thereby ensuring the safety of the service provided by each service party in the service platform and improving the efficiency of detecting the abnormal service data.
Based on the same idea, the present specification also provides a corresponding data transmission apparatus, as shown in fig. 3, for the method for data transmission provided in one or more embodiments of the present specification.
Fig. 3 is a schematic diagram of a data transmission apparatus provided in this specification, which specifically includes:
an obtaining module 301, configured to obtain service data of a service party, where the service data is generated when a user executes a service provided by the service party;
an extraction module 302, configured to perform feature extraction on each service dimension on each service data to obtain feature data of the service data;
the clustering module 303 is configured to cluster, according to a preset manner, feature data of each service data corresponding to each service party to obtain a clustering result;
a determining module 304, configured to determine, according to the clustering result, abnormal service data from the service data corresponding to the service party;
a sending module 305, configured to determine, according to the abnormal service data, service policy data for the abnormal service data, and send the service policy data to the service party.
Optionally, the clustering module 303 is specifically configured to, for each service party, determine a weight corresponding to each service dimension under the service party; and according to a preset mode, clustering the characteristic data of each service data corresponding to the service party according to the corresponding weight of each service dimension under the service party to obtain a clustering result.
Optionally, the clustering module 303 is specifically configured to, for each service dimension, determine, according to feature data of other service dimensions except the service dimension in each service data corresponding to the service party, information entropies of the other service dimensions in the service party, and determine, according to the feature data of each service data corresponding to the service party, information entropies of all service dimensions in the service party; determining the information entropy difference between the information entropies of all the service dimensions under the service party and the information entropies of other service dimensions under the service party; and determining the corresponding weight of the service dimension under the service according to the information entropy difference.
Optionally, the determining module 304 is specifically configured to, for each cluster included in the clustering result, if the number of feature points included in the cluster is smaller than a first set number, determine that the service data corresponding to the cluster is abnormal service data, where each feature point corresponds to the feature data of one service data.
Optionally, the clustering module 303 is specifically configured to determine, according to the feature data of each service data corresponding to the service party, a distance between feature points of each service data corresponding to the service party, where each feature point corresponds to a feature data of one service data; for each feature point, if the number of other feature points in the preset neighborhood range of the feature point is determined to be not less than a second set number according to the distance, the feature point is used as a core feature point, and the core feature point and the feature point in the preset neighborhood range of the core feature point are used as core clusters corresponding to the core feature point; determining feature points between core feature points in any two core clusters as target points; and if any two adjacent target points are determined to meet the preset condition, merging the two core clusters, wherein for any two adjacent target points, if one target point is located in the preset neighborhood range of the other target point, the two adjacent target points meet the preset condition.
Optionally, the clustering module 303 is specifically configured to, for each service data of the service party, delete the service data if it is determined that a time interval between the current time and a service execution time corresponding to the service data exceeds a set duration; and clustering the characteristic data of the residual service data corresponding to the service party to obtain the clustering result.
Optionally, the apparatus further comprises:
an updating module 306, configured to obtain newly added service data corresponding to the service party; update the clustering result according to the newly added service data to obtain an updated clustering result; and send a service policy to the service party according to the updated clustering result.
The present specification also provides a computer-readable storage medium storing a computer program, which can be used to execute the method of data transmission shown in fig. 1 described above.
This specification also provides a schematic block diagram of the electronic device shown in fig. 4. As shown in fig. 4, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs the computer program to implement the data transmission method described in fig. 1. Of course, besides the software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may be hardware or logic devices.
In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). However, as technology has advanced, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized with hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Furthermore, nowadays, instead of manually making integrated circuit chips, such programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, the same functions can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing various functions may be regarded as structures within the hardware component. Alternatively, the means for performing the various functions may even be regarded as both software modules implementing the method and structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described as being divided into various units by function, and the units are described separately. Of course, when implementing this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in this specification are described in a progressive manner; for the parts that are the same or similar among the embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly, and the relevant parts of the method embodiment may be consulted for details.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (10)

1. A method of data transmission, comprising:
acquiring service data of a service party, wherein the service data is generated when a user executes a service provided by the service party;
for each service data, performing feature extraction on the service data under each service dimension to obtain feature data of the service data;
clustering the characteristic data of each service data corresponding to each service party according to a preset mode aiming at each service party to obtain a clustering result;
determining abnormal service data from the service data corresponding to the service party according to the clustering result;
and determining the service policy data for the abnormal service data according to the abnormal service data, and sending the service policy data to the service party.
2. The method of claim 1, wherein for each service party, clustering feature data of each service data corresponding to the service party according to a preset manner to obtain a clustering result, specifically comprising:
determining the weight of each service dimension under each service party aiming at each service party;
and according to a preset mode, clustering the characteristic data of each service data corresponding to the service party according to the corresponding weight of each service dimension under the service party to obtain a clustering result.
3. The method of claim 2, wherein determining, for each business party, a weight corresponding to each business dimension under the business party specifically comprises:
for each service dimension, determining the information entropy of other service dimensions under the service party according to the feature data of other service dimensions except the service dimension in each service data corresponding to the service party, and determining the information entropy of all service dimensions under the service party according to the feature data of each service data corresponding to the service party;
determining the information entropy difference between the information entropies of all the service dimensions under the service party and the information entropies of other service dimensions under the service party;
and determining the corresponding weight of the service dimension under the service according to the information entropy difference.
4. The method according to claim 1, wherein determining abnormal service data from the service data of the service party according to the clustering result specifically comprises:
for each cluster contained in the clustering result, if the number of feature points contained in the cluster is less than a first set number, determining that the service data corresponding to the cluster is abnormal service data, wherein each feature point corresponds to the feature data of one service data.
5. The method according to claim 1 or 3, wherein for each service party, clustering the feature data of each service data corresponding to the service party according to a preset manner to obtain a clustering result, specifically comprising:
determining the distance between the characteristic points of the service data corresponding to the service party according to the characteristic data of the service data corresponding to the service party, wherein each characteristic point corresponds to the characteristic data of one service data;
for each feature point, if the number of other feature points in the preset neighborhood range of the feature point is determined to be not less than a second set number according to the distance, the feature point is used as a core feature point, and the core feature point and the feature point in the preset neighborhood range of the core feature point are used as core clusters corresponding to the core feature point;
determining feature points between core feature points in any two core clusters as target points;
and if any two adjacent target points are determined to meet the preset condition, merging the two core clusters, wherein for any two adjacent target points, if one target point is located in the preset neighborhood range of the other target point, the two adjacent target points meet the preset condition.
6. The method of claim 1, wherein for each service party, clustering feature data of each service data corresponding to the service party according to a preset manner to obtain a clustering result, specifically comprising:
for each service data of the service party, if the time interval between the current time and the service execution time corresponding to the service data is determined to exceed the set duration, deleting the service data;
and clustering the characteristic data of the residual service data corresponding to the service party to obtain the clustering result.
7. The method of claim 1, wherein the method further comprises:
acquiring newly added service data corresponding to the service party;
updating the clustering result according to the newly added service data to obtain an updated clustering result;
and sending a service policy to the service party according to the updated clustering result.
8. An apparatus for data transmission, comprising:
an acquisition module, configured to acquire service data of a service party, wherein the service data is generated when a user executes a service provided by the service party;
the extraction module is used for extracting the characteristics of the service data under each service dimension aiming at each service data to obtain the characteristic data of the service data;
the clustering module is used for clustering the characteristic data of each service data corresponding to each service party according to a preset mode for each service party to obtain a clustering result;
the determining module is used for determining abnormal service data from all service data corresponding to the service party according to the clustering result;
and the sending module is used for determining the service strategy data aiming at the abnormal service data according to the abnormal service data and sending the service strategy data to the service party.
9. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 7 when executing the program.
CN202011241257.8A 2020-11-09 2020-11-09 Data sending method and device Pending CN112418864A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011241257.8A CN112418864A (en) 2020-11-09 2020-11-09 Data sending method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011241257.8A CN112418864A (en) 2020-11-09 2020-11-09 Data sending method and device

Publications (1)

Publication Number Publication Date
CN112418864A (en) 2021-02-26

Family

ID=74780815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011241257.8A Pending CN112418864A (en) 2020-11-09 2020-11-09 Data sending method and device

Country Status (1)

Country Link
CN (1) CN112418864A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115564450A (en) * 2022-12-06 2023-01-03 支付宝(杭州)信息技术有限公司 Risk control method, device, storage medium and equipment
US20230376962A1 (en) * 2022-05-20 2023-11-23 Socure, Inc. System and Method for Automated Feature Generation and Usage in Identity Decision Making

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230376962A1 (en) * 2022-05-20 2023-11-23 Socure, Inc. System and Method for Automated Feature Generation and Usage in Identity Decision Making
CN115564450A (en) * 2022-12-06 2023-01-03 支付宝(杭州)信息技术有限公司 Risk control method, device, storage medium and equipment
CN115564450B (en) * 2022-12-06 2023-03-10 支付宝(杭州)信息技术有限公司 Risk control method, device, storage medium and equipment

Similar Documents

Publication Publication Date Title
CN108460523B (en) Risk control rule generation method and device
KR102175226B1 (en) Methods and devices for controlling data risk
TWI782205B (en) Risk control model training, risk control method, device and equipment for identifying the theft of second-hand door number accounts
EP3780541B1 (en) Identity information identification method and device
CN107424069B (en) Risk control feature generation method, risk monitoring method and equipment
CN108460681B (en) Risk management and control method and device
CN108665143B (en) Risk control model evaluation method and device
CN109118053B (en) Method and device for identifying card stealing risk transaction
CN110020427B (en) Policy determination method and device
CN108830705B (en) Method, device and equipment for summarizing transaction data
CN110008991B (en) Risk event identification method, risk identification model generation method, risk event identification device, risk identification equipment and risk identification medium
CN111028084A (en) Transaction processing method, device and equipment based on block chain
CN111932273B (en) Transaction risk identification method, device, equipment and medium
CN111639687A (en) Model training and abnormal account identification method and device
CN112418864A (en) Data sending method and device
CN106033574B (en) Method and device for identifying cheating behaviors
CN110543317A (en) Transaction request processing method, device, gateway and storage medium
CN111582872A (en) Abnormal account detection model training method, abnormal account detection device and abnormal account detection equipment
CN110458571B (en) Risk identification method, device and equipment for information leakage
CN110033092B (en) Data label generation method, data label training device, event recognition method and event recognition device
CN111275071B (en) Prediction model training method, prediction device and electronic equipment
CN110322139B (en) Policy recommendation method and device
CN113297462A (en) Data processing method, device, equipment and storage medium
CN109063967B (en) Processing method and device for risk control scene feature tensor and electronic equipment
CN116485391A (en) Payment recommendation processing method and device

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210226