CN110689218A

CN110689218A - Risk user identification method and device, computer equipment and storage medium

Info

Publication number: CN110689218A
Application number: CN201910746104.XA
Authority: CN
Inventors: 丁露涛
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-08-13
Filing date: 2019-08-13
Publication date: 2020-01-14
Also published as: WO2021027407A1

Abstract

The embodiments of the invention provide a method and an apparatus for identifying a risk user, computer equipment and a storage medium. The method applies to the field of data processing and comprises: acquiring a position data set corresponding to a terminal if order data sent by a user through the terminal is received; clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set; clustering the position data cluster according to a preset second clustering algorithm to obtain a centroid corresponding to the position data cluster; judging whether the user is a risk user according to the centroid, the reserved position data corresponding to the user and the preset risk position data; and, if the user is a risk user, determining the order data corresponding to the user as risk data. The embodiments of the invention help improve the accuracy and speed of risk user identification.

Description

Risk user identification method and device, computer equipment and storage medium

Technical Field

The present invention relates to the field of computer data processing, and in particular, to a method and an apparatus for identifying a risky user, a computer device, and a computer-readable storage medium.

Background

With the rapid development of the internet, more and more transactions (such as commodity transactions and service transactions) are carried out online. To keep internet transactions secure, risky users (e.g., advertisers who run fraudulent websites, merchants who sell illegal products, and users who falsify information to commit fraud) need to be identified and prevented from participating in transactions. In the prior art, risky users are screened by manually reviewing order data for risk after a user submits an order. Manual review is easily influenced by the reviewer's subjective judgment, which lowers identification accuracy, and it is time-consuming, which makes identification slow.

Disclosure of Invention

The embodiment of the invention provides a method and a device for identifying a risk user, computer equipment and a storage medium, and aims to solve the problems of low identification accuracy, low identification speed and the like of the risk user.

In a first aspect, an embodiment of the present invention provides a method for identifying a risky user, including: if order data sent by a user through a terminal is received, obtaining a position data set corresponding to the terminal, wherein the position data set comprises at least two pieces of position information of the terminal; clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after clustering; clustering the position data clusters according to a preset second clustering algorithm to obtain a centroid corresponding to the position data clusters after clustering; judging whether the centroid matches the reserved position data corresponding to the user; if the centroid matches the reserved position data corresponding to the user, judging whether the centroid matches the preset risk position data; and if the centroid matches the preset risk position data, determining that the user is a risk user and determining order data corresponding to the user as risk data.

In a second aspect, an embodiment of the present invention provides an apparatus for identifying a risky user, including:

a first acquisition unit, configured to acquire a position data set corresponding to the terminal if order data sent by a user through the terminal is received, wherein the position data set comprises at least two pieces of position information of the terminal;

the first clustering unit is used for clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after clustering;

the second clustering unit is used for clustering the position data clusters according to a preset second clustering algorithm so as to obtain the centroids corresponding to the position data clusters after clustering;

the first judgment unit is used for judging whether the centroid matches the reserved position data corresponding to the user;

a second judging unit, configured to judge whether the centroid is matched with the preset risk location data if the centroid is matched with the reserved location data corresponding to the user;

and the order determining unit is used for determining that the user is a risk user and determining order data corresponding to the user as risk data if the centroid matches the preset risk position data.

In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the above-mentioned method for identifying a risky user when executing the program.

In a fourth aspect, the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to execute the above risk user identification method.

The embodiments of the invention provide a method and an apparatus for identifying a risk user, computer equipment and a computer-readable storage medium. The method comprises: if order data sent by a user through a terminal is received, obtaining a position data set corresponding to the terminal, wherein the position data set comprises at least two pieces of position information of the terminal; clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after clustering; clustering the position data clusters according to a preset second clustering algorithm to obtain a centroid corresponding to the position data clusters after clustering; judging whether the centroid matches the reserved position data corresponding to the user; if the centroid matches the reserved position data corresponding to the user, judging whether the centroid matches the preset risk position data; and if the centroid matches the preset risk position data, determining that the user is a risk user and determining order data corresponding to the user as risk data. In the embodiments of the invention, the position data set is clustered by the preset first clustering algorithm and the preset second clustering algorithm to obtain a centroid, and the risk user is then identified according to the centroid, the reserved position data corresponding to the user and the preset risk position data. The whole process is not influenced by human subjective factors, which helps improve both the accuracy and the speed of risk user identification.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

Fig. 1 is a schematic flowchart of a method for identifying a risky user according to an embodiment of the present invention;

fig. 2 is a schematic view of an application scenario of a risk user identification method according to an embodiment of the present invention;

fig. 3 is another schematic flow chart of a method for identifying a risky user according to an embodiment of the present invention;

fig. 4 is another schematic flow chart of a method for identifying a risky user according to an embodiment of the present invention;

fig. 5 is another schematic flow chart of a method for identifying a risky user according to an embodiment of the present invention;

fig. 6 is another schematic flow chart of a method for identifying a risky user according to an embodiment of the present invention;

fig. 7 is a schematic block diagram of an apparatus for identifying a risky user according to an embodiment of the present invention;

fig. 8 is another schematic block diagram of an apparatus for identifying a risky user according to an embodiment of the present invention;

fig. 9 is another schematic block diagram of an apparatus for identifying a risky user according to an embodiment of the present invention;

fig. 10 is another schematic block diagram of an apparatus for identifying a risky user according to an embodiment of the present invention;

fig. 11 is another schematic block diagram of an apparatus for identifying a risky user according to an embodiment of the present invention;

fig. 12 is a schematic block diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

Fig. 1 is a flowchart illustrating a method for identifying a risky user according to an embodiment of the present invention. The method for identifying the risky users provided by the embodiment of the invention can be applied to the server 20. The server 20 may be a server within an enterprise for processing order data for risk user identification. The server 20 may be an independent server, or may be a server cluster composed of a plurality of servers. The server 20 may establish a communication connection with the terminal 10 for data interaction, for example, the server 20 may establish a communication connection with the terminal 10 to receive order data sent by the terminal. The terminal 10 may be an electronic terminal such as a mobile phone, a tablet computer, and a desktop computer. As shown in fig. 1, the risky user identifying method includes steps S110 to S160.

S110, if order data sent by a user through a terminal is received, a position data set corresponding to the terminal is obtained, and the position data set comprises at least two pieces of position information of the terminal.

In specific implementation, the terminal and the server can realize data interaction by establishing communication connection. The user can send order data to the server by operating the terminal. The order data may be order data of various types of goods. For example, order data includes, but is not limited to: travel order data, take-away order data, insurance order data, and the like.

The position data set corresponding to the terminal comprises at least two pieces of position information of the terminal. Acquiring the position data set corresponding to the terminal may specifically be performed by acquiring a plurality of pieces of position information corresponding to the terminal; the set of the acquired pieces of position information is the position data set corresponding to the terminal.

In some embodiments, as shown in FIG. 3, step S110 includes, but is not limited to, steps S111-S112.

S111, if order data sent by a user through a terminal is received, generating a position acquisition time range according to the sending time of the order data and a preset time period.

In specific implementation, the preset time period can be set according to actual requirements, for example 7 days, 30 days or 60 days. Generating the position acquisition time range according to the sending time of the order data and the preset time period specifically comprises the following steps: subtracting the preset time period from the sending time of the order data to obtain a position acquisition start time; determining the sending time of the order data as the position acquisition end time; and determining the time range between the position acquisition start time and the position acquisition end time as the position acquisition time range.
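The range computation in S111 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `position_acquisition_range` and the sample sending time are hypothetical, and the 30-day period is one of the example values mentioned above.

```python
from datetime import datetime, timedelta


def position_acquisition_range(send_time, preset_period):
    """Return (start, end) of the position acquisition time range.

    start = sending time of the order data minus the preset time period
    end   = sending time of the order data
    """
    start = send_time - preset_period
    end = send_time
    return start, end


# Hypothetical order sent on 2019-08-13 10:00:00 with a 30-day preset period.
send_time = datetime(2019, 8, 13, 10, 0, 0)
start, end = position_acquisition_range(send_time, timedelta(days=30))
```

Only position information whose coordinate acquisition time falls between `start` and `end` would then be considered in S112.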

S112, acquiring the position information matched with the position acquisition time range in a preset position database according to the terminal identification code corresponding to the terminal, and generating the position data set according to the position information matched with the position acquisition time range.

In a specific implementation, each terminal corresponds to a unique terminal identifier, and the terminal identifier is, for example, an International Mobile Equipment Identity (IMEI). And the preset position database is used for storing the acquired position information corresponding to the terminal. The position information includes position coordinates and coordinate acquisition time corresponding to the position coordinates, and the position coordinates include longitude coordinate information and latitude coordinate information. The position coordinate corresponding to the terminal may be obtained by, but not limited to, a Global Positioning System (GPS), a Location Based Service (LBS), or a combination thereof.

Specifically, the storage format of the location information may be "L1, L2; T", where L1 denotes longitude coordinate information, L2 denotes latitude coordinate information, and T denotes coordinate acquisition time. For example, the location information includes: 114.059818, 22.540215; 2019-1-14 16:52:55, where 114.059818 is the longitude coordinate information, 22.540215 is the latitude coordinate information, and 2019-1-14 16:52:55 is the coordinate acquisition time.

Specifically, the generating the location data set according to the location information matched with the location acquisition time range specifically includes: judging whether coordinate acquisition time corresponding to position information in a preset position database is within the position acquisition time range or not; if the coordinate acquisition time corresponding to the position information in the preset position database is within the position acquisition time range, determining the position information as the position information matched with the position acquisition time range; storing the location information that matches the location acquisition time range to form the location data set. And if the coordinate acquisition time corresponding to the position information in the preset position database is not within the position acquisition time range, determining that the position information is position information which does not match with the position acquisition time range.
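A minimal Python sketch of this filtering step follows. It assumes records are stored as text in the "longitude, latitude; time" format described above, with the time written as `2019-1-14 16:52:55`; the function names, the second sample record, and the choice of inclusive range bounds are assumptions made for illustration, not details given in this description.

```python
from datetime import datetime


def parse_record(record):
    """Split one stored record into (longitude, latitude, acquisition time)."""
    coords, timestamp = record.split(";")
    lon, lat = (float(value) for value in coords.split(","))
    acquired = datetime.strptime(timestamp.strip(), "%Y-%m-%d %H:%M:%S")
    return lon, lat, acquired


def build_position_data_set(records, start, end):
    """Keep only records whose acquisition time lies within [start, end]."""
    data_set = []
    for record in records:
        lon, lat, acquired = parse_record(record)
        if start <= acquired <= end:  # inclusive bounds are an assumption
            data_set.append((lon, lat, acquired))
    return data_set


records = [
    "114.059818, 22.540215; 2019-1-14 16:52:55",   # example from the text
    "114.061000, 22.541000; 2018-11-02 09:00:00",  # hypothetical, out of range
]
data_set = build_position_data_set(
    records, datetime(2019, 1, 1), datetime(2019, 1, 31)
)
```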

In some embodiments, as shown in fig. 4, step S110 may be preceded by step S210.

S210, acquiring the position information of the terminal according to a preset time interval, and storing the position information in the preset position database.

In specific implementation, the preset time interval can be set according to actual requirements. The preset time interval is, for example, 5 minutes, 10 minutes, 30 minutes. The smaller the preset time interval is, the higher the recognition accuracy is. Wherein the preset time interval is less than or equal to the preset time period.

S120, clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after clustering.

Specifically, the preset first clustering algorithm may be the DBSCAN algorithm (Density-Based Spatial Clustering of Applications with Noise). DBSCAN is a density-based spatial clustering algorithm. It divides regions of sufficient density into clusters and can find clusters of arbitrary shape in a spatial database containing noise; it defines a cluster as the largest set of density-connected points.

The operation parameters of the DBSCAN algorithm must be set in advance for the algorithm to run properly. The operation parameters include a scan radius Eps and a minimum inclusion point number MinPts. (1) The scan radius Eps is the radius of a circular neighborhood centered on a point P, where P is any unvisited data point in the data set; (2) the minimum inclusion point number MinPts is the minimum number of points that the neighborhood centered on the point P must contain. If the number of points in the neighborhood centered on the point P with scan radius Eps is not less than MinPts, the point P is called a core point.

The operation parameters can be adjusted according to actual requirements. With the minimum inclusion point number MinPts held fixed, too large a scan radius Eps causes most data points to be gathered into the same cluster, while too small a scan radius Eps causes a cluster to be split. With the scan radius Eps held fixed, too large a MinPts causes points that belong to the same cluster to be labeled as outliers, while too small a MinPts causes a large number of core points to be found. In a specific implementation, the scan radius Eps may be set to 2 kilometers, and the minimum inclusion point number MinPts may be set to 5.
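To make the roles of Eps and MinPts concrete, the following is a minimal pure-Python DBSCAN sketch using the example values Eps = 2 km and MinPts = 5. It is an illustration, not the implementation of this embodiment: the function names and sample coordinates are hypothetical, and distances are computed with an equirectangular approximation, which is an assumption chosen for brevity and is adequate only at neighborhood scales of a few kilometers.

```python
import math


def approx_distance_km(p, q, radius_km=6371.0):
    """Equirectangular approximation of the distance between (lon, lat) points."""
    lon1, lat1 = math.radians(p[0]), math.radians(p[1])
    lon2, lat2 = math.radians(q[0]), math.radians(q[1])
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return radius_km * math.hypot(x, y)


def dbscan(points, eps_km=2.0, min_pts=5):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    noise, unvisited = -1, None
    labels = [unvisited] * len(points)

    def neighbors(i):
        # The neighborhood of a point includes the point itself.
        return [j for j, q in enumerate(points)
                if approx_distance_km(points[i], q) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:      # not a core point (may become a border point)
            labels[i] = noise
            continue
        cluster += 1                  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:    # previously noise: now a border point
                labels[j] = cluster
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:   # j is also a core point: expand
                queue.extend(j_neighbors)
    return labels


# Hypothetical (lon, lat) samples: five points close together, one far away.
points = [
    (114.0598, 22.5402), (114.0601, 22.5405), (114.0605, 22.5399),
    (114.0595, 22.5408), (114.0610, 22.5401), (116.4074, 39.9042),
]
labels = dbscan(points)
```

With these sample points the five nearby positions form one cluster and the distant position is labeled as noise.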

S130, clustering the position data clusters according to a preset second clustering algorithm to obtain the centroids corresponding to the position data clusters after clustering.

Specifically, the preset second clustering algorithm may be the K-means clustering algorithm. The K-means algorithm takes K objects selected in advance as the initial cluster centers, so the value of K needs to be set in advance. The distance between each object and each initial cluster center is then calculated, and each object is assigned to the cluster center closest to it. A cluster center together with the objects assigned to it represents a cluster. Once all objects are assigned, the cluster center of each cluster is recalculated from the objects currently in the cluster. This process is repeated until a termination condition is met. The termination condition may be that no (or a minimum number of) objects are reassigned to different clusters, that no (or a minimum number of) cluster centers change, or that the sum of squared errors reaches a local minimum.

In a specific implementation, there may be one or more position data clusters. Clustering the position data clusters according to the preset second clustering algorithm to obtain the corresponding centroids specifically includes: clustering each position data cluster with K-means to obtain the centroid corresponding to that position data cluster, where the value of K is preset to 1 in every case.
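With K fixed at 1, K-means assigns every point of a cluster to the single center, and the recalculated center is the arithmetic mean of the points, so the procedure converges after one update. The sketch below computes that mean directly for each cluster. The function name, the label convention (-1 for noise) and the sample data are assumptions for illustration; averaging longitude and latitude directly is itself a planar approximation that is reasonable only for geographically small clusters.

```python
def cluster_centroids(points, labels, noise_label=-1):
    """Return {cluster label: (mean lon, mean lat)}, skipping noise points."""
    sums = {}
    for (lon, lat), label in zip(points, labels):
        if label == noise_label:
            continue
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += lon
        total[1] += lat
        total[2] += 1
    return {label: (t[0] / t[2], t[1] / t[2]) for label, t in sums.items()}


# Hypothetical input: two points in cluster 0 and one noise point.
sample_points = [(114.0, 22.0), (114.2, 22.4), (116.0, 40.0)]
sample_labels = [0, 0, -1]
centroids = cluster_centroids(sample_points, sample_labels)
```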

S140, judging whether the centroid matches the reserved position data corresponding to the user.

Specifically, the reserved location data corresponding to the user is location information that is stored in the server in advance by the user, and the reserved location data corresponding to the user includes, but is not limited to, home address information, office address information, and the like.

The type of the preset risk location data may be one or more, and the type of the preset risk location data may be determined according to the type of the order data and a preset type mapping relationship. The preset type mapping relationship is used for determining the corresponding relationship between the order data type and the preset risk position data type.

For example, assume that the order data is insurance order data and the insurance type corresponding to the insurance order data is critical illness insurance. The type of the preset risk position data is then determined to be a hospital according to the preset type mapping relationship and the type of the order data.

In specific implementation, judging whether the centroid matches the reserved position data corresponding to the user serves to judge whether the error of the centroid obtained through the preset first clustering algorithm and the preset second clustering algorithm is smaller than a preset error threshold. If the centroid matches the reserved position data corresponding to the user, the error of the obtained centroid is determined to be smaller than the preset error threshold, and the risk user is identified according to the obtained centroid. If the centroid does not match the reserved position data corresponding to the user, the error of the obtained centroid is determined to be not smaller than the preset error threshold, and a reminding message is sent to a manager to prompt the manager to modify the operation parameters, so that the accuracy of the obtained centroid, and in turn the accuracy of risk user identification, is improved. The preset error threshold may be set according to actual requirements and is, for example, 1 km.

In some embodiments, as shown in FIG. 5, step S140 includes, but is not limited to, steps S141-S143.

S141, calculating a distance difference between the centroid and the reserved position data corresponding to the user.

In a specific implementation, the distance difference between the centroid and the reserved location data corresponding to the user may be calculated by a first formula, which may be the Haversine formula. The first formula is specifically:

$$\operatorname{haversin}\left(\frac{d_1}{R}\right) = \operatorname{haversin}(\varphi_2 - \varphi_1) + \cos(\varphi_1)\cos(\varphi_2)\,\operatorname{haversin}(\Delta\lambda)$$

where $\operatorname{haversin}(\theta) = \sin^2(\theta/2) = (1 - \cos\theta)/2$; $d_1$ is the distance difference between the centroid and the reserved location data corresponding to the user; $R$ is the radius of the earth, for which the mean value of 6371 km may be used; $\varphi_1$ and $\varphi_2$ are the latitudes of the centroid and of the reserved location data; and $\Delta\lambda$ is the difference between the longitudes of the centroid and of the reserved location data.
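A small Python sketch of the Haversine distance follows, using the mean earth radius of 6371 km mentioned above. The function name and argument order are assumptions made for illustration; inputs are in decimal degrees.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_lambda = math.radians(lon2 - lon1)
    # haversin(d / R) = haversin(phi2 - phi1) + cos(phi1) cos(phi2) haversin(d_lambda)
    hav = (math.sin((phi2 - phi1) / 2) ** 2
           + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Invert haversin: d = 2 R asin(sqrt(hav))
    return 2 * radius_km * math.asin(math.sqrt(hav))


# One degree of latitude along a meridian is R * pi / 180 kilometers.
one_degree_km = haversine_km(0.0, 0.0, 1.0, 0.0)
```

The resulting distance would then be compared with the preset difference threshold (for example 1 km) to decide whether the two positions match.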

S142, judging whether the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold value.

Specifically, the preset first difference threshold may be set according to an actual requirement, for example, the preset first difference threshold may be set to 1 km.

S143, if the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold, determining that the centroid is matched with the reserved position data corresponding to the user.

Specifically, if the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold, it is determined that the centroid is matched with the reserved position data corresponding to the user. And if the distance difference between the centroid and the reserved position data corresponding to the user is not smaller than a preset first difference threshold, determining that the centroid is not matched with the reserved position data corresponding to the user.

S150, if the centroid matches the reserved position data corresponding to the user, judging whether the centroid matches the preset risk position data.

Specifically, if the centroid matches the reserved position data corresponding to the user, the error of the centroid obtained through the preset first clustering algorithm and the preset second clustering algorithm is smaller than the preset error threshold, so the centroid is highly reliable and can be used for risk user identification; it is then judged whether the centroid matches the preset risk position data.

In some embodiments, as shown in FIG. 6, step S150 includes, but is not limited to, steps S151-S153.

S151, if the centroid matches the reserved position data corresponding to the user, calculating a distance difference between the centroid and the preset risk position data.

In a specific implementation, the distance difference between the centroid and the preset risk position data may be calculated by a second formula, which may be the Haversine formula. The second formula is specifically:

$$\operatorname{haversin}\left(\frac{d_2}{R}\right) = \operatorname{haversin}(\varphi_2 - \varphi_1) + \cos(\varphi_1)\cos(\varphi_2)\,\operatorname{haversin}(\Delta\lambda)$$

where $\operatorname{haversin}(\theta) = \sin^2(\theta/2) = (1 - \cos\theta)/2$; $d_2$ is the distance difference between the centroid and the preset risk position data; $R$ is the radius of the earth, for which the mean value of 6371 km may be used; $\varphi_1$ and $\varphi_2$ are the latitudes of the centroid and of the preset risk position data; and $\Delta\lambda$ is the difference between the longitudes of the centroid and of the preset risk position data.

S152, judging whether the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold value.

Specifically, the preset second difference threshold may be set according to an actual requirement, for example, the preset second difference threshold may be set to 1 km.

S153, if the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold value, determining that the centroid matches the preset risk position data.

Specifically, if the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold, it is determined that the centroid is matched with the preset risk position data. And if the distance difference between the centroid and the preset risk position data is not smaller than a preset second difference threshold value, determining that the centroid is not matched with the preset risk position data.

S160, if the centroid matches the preset risk position data, determining that the user is a risk user and determining order data corresponding to the user as risk data.

Specifically, assuming that the order data is insurance order data, if the centroid matches the preset risk location data, the user was active at a preset risk location (such as a hospital) before sending the insurance order data and may be applying for insurance while already ill, so the user is determined to be a risk user.

If the user is a risk user, the order data corresponding to the user is determined to be risk data, so that subsequent monitoring or manual follow-up can be performed conveniently. For example, assuming that the order data is insurance order data, if the user is a risk user, it indicates that the fraud probability of the insurance order data is high, and further determines the policy corresponding to the user as a risk policy for the auditor to manually perform investigation, so as to reduce the policy fraud risk.
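The decision flow of steps S140 to S160 can be summarized in a short Python sketch that takes the two distance differences as inputs. The function name and the returned labels are hypothetical. The description does not state what happens when the centroid matches the reserved position but not the risk position, so the "not_flagged" result is an assumption. Both thresholds default to the 1 km example value.

```python
def identify_risk_user(d1_km, d2_km, first_threshold_km=1.0, second_threshold_km=1.0):
    """Classify a user from the two distance differences.

    d1_km: distance between the centroid and the user's reserved position.
    d2_km: distance between the centroid and the preset risk position.
    """
    # S140-S143: the centroid must match the reserved position to be trusted.
    if not d1_km < first_threshold_km:
        # Centroid considered unreliable: a manager would be reminded
        # to modify the clustering operation parameters.
        return "centroid_unreliable"
    # S150-S153: the trusted centroid is compared with the risk position.
    if d2_km < second_threshold_km:
        # S160: the user is a risk user and the order data is risk data.
        return "risk_user"
    return "not_flagged"  # assumption: outcome not spelled out in the text


result = identify_risk_user(d1_km=0.3, d2_km=0.5)
```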

Fig. 7 is a schematic block diagram of an apparatus 100 for identifying a risky user according to an embodiment of the present invention. As shown in fig. 7, the present invention also provides a risky user identification apparatus 100 corresponding to the above risky user identification method. The risky user identification apparatus 100 includes units for performing the risky user identification method described above, and the apparatus 100 may be configured in a server. The server may be an independent server, or a server cluster composed of a plurality of servers. As shown in fig. 7, the apparatus 100 includes a first obtaining unit 110, a first clustering unit 120, a second clustering unit 130, a first judging unit 140, a second judging unit 150, and an order determining unit 160.

A first obtaining unit 110, configured to obtain, if order data sent by a user through a terminal is received, a location data set corresponding to the terminal, where the location data set includes at least two pieces of location information of the terminal.

In some embodiments, as shown in fig. 8, the first obtaining unit 110 includes a first generating unit 111 and a second generating unit 112.

The first generating unit 111 is configured to generate a location acquisition time range according to a sending time of order data and a preset time period if the order data sent by a user through a terminal is received.

A second generating unit 112, configured to obtain, in a preset location database, location information matched with the location obtaining time range according to a terminal identifier corresponding to the terminal, and generate the location data set according to the location information matched with the location obtaining time range.

In some embodiments, as shown in fig. 9, the apparatus 100 further comprises a location storage unit 210.

The location storage unit 210 is configured to obtain location information of the terminal according to a preset time interval, and store the location information in the preset location database.

The first clustering unit 120 is configured to perform clustering processing on the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after the clustering processing.

The second clustering unit 130 is configured to perform clustering processing on the position data clusters according to a preset second clustering algorithm to obtain a centroid corresponding to the position data clusters after the clustering processing.

A first judging unit 140, configured to judge whether the centroid matches the reserved location data corresponding to the user.

In some embodiments, as shown in fig. 10, the first judging unit 140 includes a first calculating unit 141, a fourth judging unit 142, and a second determining unit 143.

A first calculating unit 141, configured to calculate a distance difference between the centroid and the reserved location data corresponding to the user.

A fourth judging unit 142, configured to judge whether the distance difference between the centroid and the reserved location data corresponding to the user is smaller than a preset first difference threshold.

A second determining unit 143, configured to determine that the centroid is matched with the reserved location data corresponding to the user if a distance difference between the centroid and the reserved location data corresponding to the user is smaller than a preset first difference threshold.
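The distance comparison performed by units 141 to 143 can be sketched as follows. The embodiment does not state which distance measure is used, so the sketch assumes positions are `(latitude, longitude)` pairs in degrees and uses the haversine great-circle distance with a mean Earth radius of 6371 km; the names `haversine_km` and `matches` are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def matches(centroid, reference, threshold_km):
    """True when the distance difference is strictly smaller than the threshold."""
    return haversine_km(centroid, reference) < threshold_km

if __name__ == "__main__":
    print(matches((22.54, 114.05), (22.55, 114.06), threshold_km=5.0))
```

The strict `<` follows the wording "smaller than a preset first difference threshold"; the same helper applies unchanged to the comparison against the preset risk location data with the second threshold.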

The second judging unit 150 is configured to judge whether the centroid matches the preset risk location data if the centroid matches the reserved location data corresponding to the user.

In some embodiments, as shown in fig. 11, the second judging unit 150 includes a second calculating unit 151, a fifth judging unit 152, and a third determining unit 153.

A second calculating unit 151, configured to calculate a distance difference between the centroid and the preset risk location data if the centroid is matched with the reserved location data corresponding to the user.

A fifth judging unit 152, configured to judge whether a distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold.

A third determining unit 153, configured to determine that the centroid matches the preset risk position data if a distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold.

The order determining unit 160 is configured to determine, if the user is a risk user, order data corresponding to the user as risk data.
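The overall decision made by units 140 to 160 can be summarised in a short sketch. It follows the order of checks described above: the centroid is first compared with the reserved location data, and only if that matches is it compared with the risk location data. The sketch assumes that the preset risk location data is a list of positions, that a match with any one of them suffices, and that planar distance is adequate for illustration; `is_risk_user` and `classify_order` are hypothetical names.

```python
import math

def is_risk_user(centroid, reserved, risk_locations,
                 first_threshold, second_threshold, distance=math.dist):
    """Apply the two-step match: reserved location first, then risk locations."""
    if distance(centroid, reserved) >= first_threshold:
        return False  # centroid does not match the reserved location data
    return any(distance(centroid, r) < second_threshold for r in risk_locations)

def classify_order(order, centroid, reserved, risk_locations,
                   first_threshold, second_threshold):
    """Return a copy of the order flagged as risk data when the user is a risk user."""
    risky = is_risk_user(centroid, reserved, risk_locations,
                         first_threshold, second_threshold)
    return {**order, "risk": risky}

if __name__ == "__main__":
    print(classify_order({"id": 1}, (0.0, 0.0), (0.1, 0.0), [(0.2, 0.0)], 0.5, 0.5))
```

The `distance` parameter lets a geodesic measure replace the planar default without changing the decision logic.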

It should be noted that, as can be clearly understood by those skilled in the art, the detailed implementation process of the risk user identification apparatus 100 and each unit may refer to the corresponding description in the foregoing method embodiment, and for convenience and brevity of description, no further description is provided herein.

The apparatus 100 described above may be implemented in the form of a computer program which may be run on a computer device as shown in fig. 12.

Referring to fig. 12, fig. 12 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device 500 may be a server. The server may be an independent server, or a server cluster composed of a plurality of servers.

The computer device 500 includes a processor 520, memory, and a network interface 550 coupled by a system bus 510, where the memory may include a non-volatile storage medium 530 and an internal memory 540.

The non-volatile storage medium 530 may store an operating system 531 and computer programs 532. The computer program 532, when executed, may cause the processor 520 to perform a method of risk user identification.

The processor 520 is used to provide computing and control capabilities that support the operation of the overall computer device 500.

The internal memory 540 provides an environment for running the computer program 532 stored in the non-volatile storage medium 530. When executed by the processor 520, the computer program 532 causes the processor 520 to perform a method of risk user identification.

The network interface 550 is used for network communication with other devices. It will be appreciated by those skilled in the art that the schematic block diagram of the computer device shows only the part of the structure relevant to the solution of the present invention and does not limit the computer device 500 to which the solution is applied; a particular computer device 500 may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.

Wherein the processor 520 is configured to run the program code stored in the memory to implement the following functions: if order data sent by a user through a terminal is received, a position data set corresponding to the terminal is obtained, wherein the position data set comprises position information of at least two terminals; clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after clustering; clustering the position data clusters according to a preset second clustering algorithm to obtain a centroid corresponding to the position data clusters after clustering; judging whether the centroid is matched with the reserved position data corresponding to the user; if the centroid is matched with the reserved position data corresponding to the user, judging whether the centroid is matched with the preset risk position data; and if the centroid is matched with the preset risk position data, determining that the user is a risk user and determining order data corresponding to the user as risk data.

In an embodiment, when the processor 520 performs the step of acquiring the location data set corresponding to the terminal if the order data sent by the user through the terminal is received, the following steps are specifically performed: if order data sent by a user through a terminal is received, generating a position acquisition time range according to the sending time of the order data and a preset time period; and acquiring the position information matched with the position acquisition time range in a preset position database according to the terminal identification code corresponding to the terminal, and generating the position data set according to the position information matched with the position acquisition time range.

In an embodiment, before performing the step of acquiring the location data set corresponding to the terminal if the order data sent by the user through the terminal is received, the processor 520 further performs the following step: acquiring the position information of the terminal at a preset time interval, and storing the position information in the preset position database.

In an embodiment, when the processor 520 performs the step of determining whether the centroid is matched with the reserved location data corresponding to the user, the following steps are specifically performed: calculating a distance difference between the centroid and the reserved position data corresponding to the user; judging whether the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold value or not; and if the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold, determining that the centroid is matched with the reserved position data corresponding to the user.

In an embodiment, when the processor 520 performs the step of determining whether the centroid is matched with the preset risk location data if the centroid is matched with the reserved location data corresponding to the user, the following steps are specifically performed: if the centroid is matched with the reserved position data corresponding to the user, calculating a distance difference value between the centroid and the preset risk position data; judging whether the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold value or not; and if the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold value, determining that the centroid is matched with the preset risk position data.

It should be understood that, in the embodiment of the present invention, the processor 520 may be a Central Processing Unit (CPU), and the processor 520 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. The general-purpose processor may be a microprocessor or any conventional processor.

Those skilled in the art will appreciate that the schematic block diagram does not limit the computer device 500, which may include more or fewer components than those shown, combine certain components, or arrange the components differently.

In a further embodiment of the invention, a computer-readable storage medium is provided, in which a computer program is stored, wherein the computer program, when executed by a processor, realizes the steps of: if order data sent by a user through a terminal is received, a position data set corresponding to the terminal is obtained, wherein the position data set comprises position information of at least two terminals; clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after clustering; clustering the position data clusters according to a preset second clustering algorithm to obtain a centroid corresponding to the position data clusters after clustering; judging whether the centroid is matched with the reserved position data corresponding to the user; if the centroid is matched with the reserved position data corresponding to the user, judging whether the centroid is matched with the preset risk position data; and if the centroid is matched with the preset risk position data, determining that the user is a risk user and determining order data corresponding to the user as risk data.

In an embodiment, when the computer program is executed by a processor to implement the step of acquiring the location data set corresponding to the terminal if order data sent by a user through the terminal is received, the following steps are specifically implemented: if order data sent by a user through a terminal is received, generating a position acquisition time range according to the sending time of the order data and a preset time period; and acquiring the position information matched with the position acquisition time range in a preset position database according to the terminal identification code corresponding to the terminal, and generating the position data set according to the position information matched with the position acquisition time range.

In an embodiment, before the computer program is executed by a processor to implement the step of acquiring the location data set corresponding to the terminal if the order data sent by the user through the terminal is received, the following step is further implemented: acquiring the position information of the terminal at a preset time interval, and storing the position information in the preset position database.

In an embodiment, when the computer program is executed by a processor to implement the step of determining whether the centroid matches the reserved location data corresponding to the user, the following steps are specifically implemented: calculating a distance difference between the centroid and the reserved position data corresponding to the user; judging whether the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold value or not; and if the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold, determining that the centroid is matched with the reserved position data corresponding to the user.

In an embodiment, when the computer program is executed by a processor to implement the step of determining whether the centroid matches the preset risk location data if the centroid matches the reserved location data corresponding to the user, the following steps are specifically implemented: if the centroid is matched with the reserved position data corresponding to the user, calculating a distance difference between the centroid and the preset risk position data; judging whether the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold; and if the distance difference between the centroid and the preset risk position data is smaller than the preset second difference threshold, determining that the centroid is matched with the preset risk position data.

The computer-readable storage medium may be any of various media that can store program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To illustrate the interchangeability of hardware and software clearly, the components and steps of the examples have been described above generally in terms of their functions. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative. For example, the division of each unit is only one logic function division, and there may be another division manner in actual implementation. For example, more than one unit or component may be combined or may be integrated into another system, or some features may be omitted, or not implemented.

The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be merged, divided and deleted according to actual needs.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method for identifying an at-risk user, the method comprising:

if order data sent by a user through a terminal is received, a position data set corresponding to the terminal is obtained, wherein the position data set comprises position information of at least two terminals;

clustering the position data set according to a preset first clustering algorithm to obtain a position data cluster corresponding to the position data set after clustering;

clustering the position data clusters according to a preset second clustering algorithm to obtain a centroid corresponding to the position data clusters after clustering;

judging whether the centroid is matched with the reserved position data corresponding to the user;

if the centroid is matched with the reserved position data corresponding to the user, judging whether the centroid is matched with the preset risk position data;

and if the centroid is matched with the preset risk position data, determining that the user is a risk user and determining order data corresponding to the user as risk data.

2. The method of claim 1, wherein the obtaining of the location data set corresponding to the terminal if order data sent by the user through the terminal is received comprises:

if order data sent by a user through a terminal is received, generating a position acquisition time range according to the sending time of the order data and a preset time period;

and acquiring the position information matched with the position acquisition time range in a preset position database according to the terminal identification code corresponding to the terminal, and generating the position data set according to the position information matched with the position acquisition time range.

3. The method of claim 1, wherein before the step of obtaining the location data set corresponding to the terminal if the order data sent by the user through the terminal is received, the method further comprises:

and acquiring the position information of the terminal at a preset time interval, and storing the position information in a preset position database.

4. The method of claim 1, wherein the determining whether the centroid matches reserved location data corresponding to the user comprises:

calculating a distance difference between the centroid and the reserved position data corresponding to the user;

judging whether the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold value or not;

and if the distance difference between the centroid and the reserved position data corresponding to the user is smaller than a preset first difference threshold, determining that the centroid is matched with the reserved position data corresponding to the user.

5. The method of claim 1, wherein the determining whether the centroid matches the preset risk position data if the centroid matches the reserved position data corresponding to the user comprises:

if the centroid is matched with the reserved position data corresponding to the user, calculating a distance difference value between the centroid and the preset risk position data;

judging whether the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold value or not;

and if the distance difference between the centroid and the preset risk position data is smaller than a preset second difference threshold value, determining that the centroid is matched with the preset risk position data.

6. An apparatus for identifying an at-risk user, the apparatus comprising:

7. The apparatus of claim 6, wherein the first obtaining unit comprises:

a first generating unit, configured to generate a position acquisition time range according to the sending time of order data and a preset time period if the order data sent by a user through a terminal is received;

and the second generation unit is used for acquiring the position information matched with the position acquisition time range in a preset position database according to the terminal identification code corresponding to the terminal, and generating the position data set according to the position information matched with the position acquisition time range.

8. The apparatus of claim 6, wherein the apparatus further comprises:

a position storage unit, configured to acquire the position information of the terminal at a preset time interval and to store the position information in a preset position database.

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method for identifying an at-risk user according to any one of claims 1-5 when executing the program.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to carry out the method of risk user identification according to any one of claims 1-5.