WO2018040944A1 - System, method, and apparatus for identifying malicious addresses/malicious orders - Google Patents

System, method, and apparatus for identifying malicious addresses/malicious orders

Info

Publication number
WO2018040944A1
Authority
WO
WIPO (PCT)
Prior art keywords
address
identified
probability
order
malicious
Prior art date
Application number
PCT/CN2017/097953
Other languages
English (en)
French (fr)
Inventor
肖谦
赵争超
林君
潘林林
张一昌
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2018040944A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0609 Buyer or seller confidence or verification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders
    • G06Q30/0637 Approvals

Definitions

  • the present invention relates to the field of Internet technologies, and in particular, to a system, method, and apparatus for identifying a malicious address/malicious order.
  • the normal addresses may be misidentified as malicious addresses.
  • a keyword may be malicious in one address but normal in another, so if addresses are flagged by matching preset malicious keywords, a normal address may be misjudged as a malicious address.
  • because the blacklist/whitelist is maintained manually from actual feedback after merchants ship goods, identification using the blacklist/whitelist not only requires manpower but also cannot identify a new malicious address in time.
  • the present invention provides a system, method, and apparatus for identifying a malicious address/malicious order, which can solve the problem of low accuracy in identifying a malicious address/malicious order in the prior art.
  • the present invention provides a system for identifying a malicious address, the system including a user client, a server, and a merchant client;
  • the user client is configured to receive the input to-be-identified address, and send the to-be-identified address to the server;
  • the server is configured to receive the to-be-identified address sent by the user client, perform address stratification processing on the to-be-identified address, and obtain each address level of the to-be-identified address;
  • using an address-level jump probability distribution obtained by analyzing historical normal addresses, calculate the jump probability of each address level in the to-be-identified address jumping to the adjacent next address level, where the address-level jump probability distribution includes the jump probability of any address level jumping to another address level; multiply the obtained jump probabilities to obtain a normal address probability of the to-be-identified address; and send an identification result of the malicious address identification based on the normal address probability to the merchant client;
  • the merchant client is configured to receive and output the identification result sent by the server.
  • the present invention provides a method for identifying a malicious address, the method comprising:
  • using an address-level jump probability distribution obtained by analyzing historical normal addresses, calculating the jump probability of each address level in the to-be-identified address jumping to the adjacent next address level, where the address-level jump probability distribution includes the jump probability of any address level jumping to another address level;
  • the present invention provides a device for identifying a malicious address, the device comprising:
  • a receiving unit configured to receive an address to be identified sent by the user client
  • a first processing unit configured to perform address stratification processing on the to-be-identified address, and obtain each address level of the to-be-identified address
  • a calculating unit configured to calculate, by using an address-level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the to-be-identified address jumping to the adjacent next address level, where the address-level jump probability distribution includes the jump probability of any address level jumping to another address level
  • a second processing unit configured to perform multiplication processing on each jump probability obtained by the calculating unit, to obtain a normal address probability of the to-be-identified address.
  • the present invention provides a system for identifying a malicious order, the system comprising a user client, a server, and a merchant client;
  • the user client is configured to receive an input order to be identified, and send the to-be-identified order to the server;
  • the server is configured to receive the to-be-identified order sent by the user client; calculate, according to an address-level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address-level jump probability distribution includes the jump probability of any address level jumping to another address level; multiply the obtained jump probabilities to obtain a normal address probability of the address; determine, according to the normal address probability, whether the to-be-identified order is a malicious order; and send the determination result to the merchant client;
  • the merchant client is configured to receive and display the determination result sent by the server.
  • the present invention provides a method for identifying a malicious order, the method comprising:
  • calculating, according to an address-level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address-level jump probability distribution includes the jump probability of any address level jumping to another address level
  • the present invention provides an apparatus for identifying a malicious order, the apparatus comprising:
  • a receiving unit configured to receive an order to be identified sent by a user client
  • a calculating unit configured to calculate, according to an address-level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address-level jump probability distribution includes the jump probability of any address level jumping to another address level;
  • a processing unit configured to perform multiplication processing on each obtained jump probability to obtain a normal address probability of the address
  • the determining unit is configured to determine, according to the normal address probability, whether the to-be-identified order is a malicious order.
  • with the system, method, and apparatus for identifying a malicious address/malicious order provided by the present invention, after obtaining the to-be-identified address and the address-level jump probability distribution obtained by analyzing historical normal addresses, the server can perform address stratification on the to-be-identified address to obtain each of its address levels, use the distribution to calculate the jump probability of each address level jumping to the adjacent next address level, and multiply the jump probabilities to obtain the probability that the to-be-identified address is a normal address, so as to determine, according to that probability, whether the to-be-identified address is a malicious address or whether the order containing the to-be-identified address is a malicious order.
  • whereas the prior art uses malicious keywords, a blacklist/whitelist, or the address hierarchy structure to determine whether the to-be-identified address is a malicious address, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the to-be-identified address, and then obtains from the jump probabilities the probability that the whole to-be-identified address is a normal address.
  • a normal address probability can therefore be obtained not only for addresses that contain malicious keywords, addresses included in the blacklist/whitelist, and addresses with a complete address hierarchy structure, but also for addresses that contain no malicious keywords, addresses not included in the blacklist/whitelist, and addresses with an incomplete address hierarchy structure.
  • whether the to-be-identified address is a malicious address can then be determined from its normal address probability, and whether the to-be-identified order is a malicious order can be determined from whether its address is malicious, which improves the accuracy of malicious address/malicious order identification.
  • FIG. 1 is a schematic diagram of a system for identifying a malicious address according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a merchant client side selection interface according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method for identifying a malicious address according to an embodiment of the present invention
  • FIG. 4 is a flowchart of another method for identifying a malicious address according to an embodiment of the present invention.
  • FIG. 5 is a diagram showing interaction between a server and a client in a malicious address recognition process according to an embodiment of the present invention
  • FIG. 6 is a block diagram showing the composition of a device for identifying a malicious address according to an embodiment of the present invention.
  • FIG. 7 is a block diagram showing the composition of another malicious address recognition apparatus according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of a method for identifying a malicious order according to an embodiment of the present invention.
  • FIG. 9 is a block diagram showing the composition of a device for identifying a malicious order according to an embodiment of the present invention.
  • FIG. 10 is a block diagram showing the composition of another device for identifying a malicious order according to an embodiment of the present invention.
  • the embodiment of the present invention provides a system for identifying a malicious address.
  • the system includes a user client 11, a server 12, and a merchant client 13;
  • the user client 11 is configured to receive the input to-be-identified address, and send the to-be-identified address to the server 12;
  • the server 12 is configured to receive the to-be-identified address sent by the user client 11; perform address stratification on the to-be-identified address to obtain each of its address levels; calculate, using the address-level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the to-be-identified address jumping to the adjacent next address level, where the address-level jump probability distribution includes the jump probability of any address level jumping to another address level; multiply the obtained jump probabilities to obtain the normal address probability of the to-be-identified address; and send the identification result of the malicious address identification based on the normal address probability to the merchant client 13;
  • the merchant client 13 is for receiving and outputting the recognition result transmitted by the server 12.
  • the system for identifying a malicious address provided by the embodiment of the present invention can, after receiving the to-be-identified address sent by the user client, perform address stratification on the to-be-identified address to obtain each of its address levels, then use the address-level jump probability distribution to calculate the jump probability of each address level jumping to the adjacent next address level, and multiply the jump probabilities to obtain the probability that the to-be-identified address is a normal address, so as to determine according to that probability whether the to-be-identified address is a malicious address.
  • whereas the prior art uses malicious keywords, a blacklist/whitelist, or the address hierarchy structure to determine whether the to-be-identified address is a malicious address, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the to-be-identified address, and then obtains from the jump probabilities the probability that the whole to-be-identified address is a normal address.
  • a normal address probability can therefore be obtained not only for addresses that contain malicious keywords, addresses included in the blacklist/whitelist, and addresses with a complete address hierarchy structure, but also for addresses that contain no malicious keywords, addresses not included in the blacklist/whitelist, and addresses with an incomplete address hierarchy structure.
  • whether the to-be-identified address is a malicious address can then be determined from its normal address probability, which improves the accuracy of malicious address identification.
  • the server 12 is configured to send alert prompt information to the merchant client 13 when the identification result is that the to-be-identified address is a malicious address;
  • the merchant client 13 is configured to receive and output the alert prompt information sent by the server 12.
  • the merchant client 13 is configured to output a selection interface for selecting the result of secondary identification of the to-be-identified address, receive the secondary identification result entered via the selection interface, and return the secondary identification result to the server 12.
  • when the merchant client receives the alert prompt information, it not only displays the alert prompt information on the interface but also displays a selection interface for the merchant to select the secondary identification result.
  • warning prompt information may be located on the selection interface or may be located on another interface.
  • the merchant client 13 is configured to output a selection interface for selecting the result of secondary identification of the to-be-identified address, receive via the selection interface an identification result indicating that the to-be-identified address is a malicious address, and return the to-be-identified address carrying a malicious identifier to the server 12.
  • another embodiment of the present invention further provides a method for identifying a malicious address. As shown in FIG. 3, the method mainly includes:
  • the user client (i.e., the buyer client)
  • the server can perform the malicious address identification operation on the order, and send the order together with the identification result of the order.
  • the server needs to pre-process the to-be-identified order first, and then extract the to-be-identified address from the pre-processed order.
  • the specific implementation process of obtaining the to-be-identified address may be: obtaining the to-be-identified order; performing redundancy removal and formatting on the to-be-identified order; and obtaining the to-be-identified address from the processed order.
  • the redundancy removal and formatting of the to-be-identified order specifically includes:
  • because the user may fill in emoticons, meaningless English letters, and other meaningless data in the address, the server can detect whether the to-be-identified address contains such information and, if so, filter it out.
  • the server may have saved some dirty data, including HTML (HyperText Markup Language) text, JSON (JavaScript Object Notation) strings, and other abnormal information, so the server can filter out this dirty data.
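The filtering described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `clean_address`, the regular expressions, and the assumed JSON key `"address"` are all hypothetical, and filtering of meaningless letters is not attempted.

```python
import html
import json
import re

# Hypothetical patterns; the source does not say how dirty data is detected.
TAG_RE = re.compile(r"<[^>]+>")
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def clean_address(raw: str) -> str:
    """Strip HTML tags, a JSON wrapper, and emoticons from a raw address string."""
    text = raw.strip()
    # If the field holds a JSON object, keep only its "address" value (assumed key).
    if text.startswith("{"):
        try:
            obj = json.loads(text)
            if isinstance(obj, dict):
                text = str(obj.get("address", ""))
        except json.JSONDecodeError:
            pass  # not valid JSON, treat it as plain text
    text = html.unescape(TAG_RE.sub("", text))
    text = EMOJI_RE.sub("", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()
```

For instance, `clean_address("<b>Supo Street</b> 😀 Qingyang")` yields `"Supo Street Qingyang"`.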
  • the address hierarchy satisfies the Markov property, so address stratification can be performed using a conditional random field model.
  • the specific implementation of address stratification is as follows: after obtaining the to-be-identified address, the server may use the conditional random field model to perform word segmentation and address-level labeling on it, thereby obtaining each address level of the to-be-identified address.
  • for example, the address to be identified is Unit 1, Building 5, **Home, Supo Street, Qingyang District, Chengdu City, Sichuan Province.
  • the address levels are: “Province: Sichuan Province, City: Chengdu, District/county: Qingyang District, Road: Supo Street, Residential community: **Home, Building number: Building 5, Unit number: Unit 1.”
  • the address-level jump probability distribution includes the jump probability of any address level jumping to another address level. Since historical normal addresses are addresses to which merchants delivered successfully, after obtaining a large number of historical normal addresses the server can perform statistics and analysis on their address-level jumps to obtain the address-level jump probability distribution, and then use that distribution to determine the jump probabilities between the address levels of the to-be-identified address.
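One way to picture the statistics step is to count adjacent-level transitions over already stratified historical normal addresses and normalize the counts. The sketch below assumes this simple maximum-likelihood counting; the function name `estimate_jump_probabilities` and the toy history are made up for illustration and are not taken from the patent.

```python
from collections import Counter, defaultdict


def estimate_jump_probabilities(addresses):
    """Estimate P(next level | current level) from stratified normal addresses.

    `addresses` is a list of lists of address levels, outermost level first.
    """
    counts = defaultdict(Counter)
    for levels in addresses:
        for cur, nxt in zip(levels, levels[1:]):
            counts[cur][nxt] += 1
    return {
        cur: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
        for cur, nxts in counts.items()
    }


# Toy history (illustrative data only).
history = [
    ["Sichuan", "Chengdu", "Qingyang District"],
    ["Sichuan", "Chengdu", "Wuhou District"],
    ["Sichuan", "Mianyang", "Fucheng District"],
    ["Sichuan", "Chengdu", "Qingyang District"],
]
dist = estimate_jump_probabilities(history)
```

Here three of the four toy addresses jump from Sichuan to Chengdu, so `dist["Sichuan"]["Chengdu"]` is 0.75.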
  • the server may use the address-level jump probability distribution to calculate the jump probability between adjacent address levels in the to-be-identified address, that is, the probability that level N jumps to level N+1.
  • for example, after obtaining the address levels of the to-be-identified address, “Province: Sichuan Province, City: Chengdu, District/county: Qingyang District, Road: Supo Street, Residential community: **Home, Building number: Building 5, Unit number: Unit 1”, the server can use the address-level jump probability distribution to obtain the probability of Sichuan Province jumping to Chengdu, of Chengdu jumping to Qingyang District, of Qingyang District jumping to Supo Street, of Supo Street jumping to **Home, of **Home jumping to Building 5, and of Building 5 jumping to Unit 1.
  • the address-level jump probability distribution can be trained using a large number of addresses nationwide; after the jump probability of each address level of the to-be-identified address is obtained, these jump probabilities can be multiplied to obtain the probability that the to-be-identified address is a normal address.
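The multiplication step can be sketched as below. The nested-dictionary layout of the distribution, the function name `normal_address_probability`, and the small `floor` value used for jumps never seen in the historical data are assumptions made for this example; the source does not say how unseen jumps are handled.

```python
def normal_address_probability(levels, dist, floor=1e-6):
    """Multiply the jump probabilities of adjacent address levels.

    `dist[cur][nxt]` is the probability of `cur` jumping to `nxt`.
    Unseen jumps fall back to `floor` (an assumption, not from the source).
    """
    prob = 1.0
    for cur, nxt in zip(levels, levels[1:]):
        prob *= dist.get(cur, {}).get(nxt, floor)
    return prob


# Toy distribution (illustrative numbers only).
dist = {"Sichuan": {"Chengdu": 0.75}, "Chengdu": {"Qingyang District": 0.5}}
p_ok = normal_address_probability(["Sichuan", "Chengdu", "Qingyang District"], dist)
p_bad = normal_address_probability(["Sichuan", "Qingyang District", "Chengdu"], dist)
```

With these toy numbers the well-ordered address scores 0.75 × 0.5 = 0.375, while the scrambled one falls back to the floor for both jumps and scores far lower.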
  • the probability calculation formula for calculating the address to be identified as a normal address may be:
  • S represents the address to be identified
  • w_i represents the i-th address level in the address to be identified
  • C represents the province to which the address to be identified belongs.
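One form of the formula that is consistent with the symbol definitions above and with the multiplication of adjacent-level jump probabilities is sketched here. It assumes that each jump probability is conditioned on the province C and that n, a symbol introduced only for this sketch, denotes the number of address levels in S:

```latex
P(S) = \prod_{i=1}^{n-1} P\left(w_{i+1} \mid w_i,\, C\right)
```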
  • the method for identifying a malicious address provided by the embodiment of the present invention can, after obtaining the to-be-identified address, perform address stratification on it to obtain each of its address levels, then use the address-level jump probability distribution to calculate the jump probability of each address level jumping to the adjacent next address level, and multiply the jump probabilities to obtain the probability that the to-be-identified address is a normal address, so as to determine according to that probability whether the to-be-identified address is a malicious address.
  • whereas the prior art uses malicious keywords, a blacklist/whitelist, or the address hierarchy structure to determine whether the to-be-identified address is a malicious address, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the to-be-identified address, and then obtains from the jump probabilities the probability that the whole to-be-identified address is a normal address. A normal address probability can therefore be obtained not only for addresses that contain malicious keywords, addresses included in the blacklist/whitelist, and addresses with a complete address hierarchy structure, but also for addresses that contain no malicious keywords.
  • whether the to-be-identified address is a malicious address may be determined according to a preset identification rule and the normal address probability of the to-be-identified address.
  • the normal address probability can be used directly to determine whether the to-be-identified address is a malicious address; alternatively, other features corresponding to the to-be-identified address can be analyzed, and the determination made comprehensively from the normal address probability together with those features (as described in steps 305 to 307 below).
  • the specific implementation of determining whether the to-be-identified address is a malicious address from the normal address probability is: determining whether the normal address probability of the to-be-identified address is greater than a preset probability threshold; if it is greater than the preset probability threshold, determining that the to-be-identified address is a normal address; if it is less than or equal to the preset probability threshold, determining that the to-be-identified address is a malicious address.
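The threshold rule reduces to a single comparison. In this sketch the function name `classify_address` and the default threshold value are placeholders; the source only states that the threshold is preset.

```python
def classify_address(normal_prob: float, threshold: float = 1e-4) -> str:
    """Return "normal" if the probability exceeds the threshold, else "malicious".

    The default threshold is a hypothetical placeholder value.
    """
    return "normal" if normal_prob > threshold else "malicious"
```

Note that a probability exactly equal to the threshold is classified as malicious, matching the "less than or equal to" wording above.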
  • the identification result may be sent to the merchant client, so that the merchant client receives and displays the recognition result for the merchant to determine whether to deliver the product according to the recognition result.
  • another embodiment of the present invention further provides a method for identifying a malicious address. As shown in FIG. 4, the method mainly includes:
  • the preset identification feature includes any one of the following or a combination of any of the following: an address text information feature, a historical shopping behavior feature, an order feature, and a cross feature.
  • this step can be specifically refined into the following steps a-d:
  • the address text information feature includes: whether the address contains a number of a preset length, whether it contains a preset sensitive word, and whether it contains advertising information.
  • the preset length includes the length of a mobile phone number, the length of a landline number, and the length of a QQ number.
  • the address text information feature is extracted from the address so that whether the to-be-identified address is a malicious address can be analyzed from this dimension.
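The three text features can be sketched with a regular expression and keyword lookups. The digit lengths, sensitive words, and advertising markers below are hypothetical values chosen for the example; the source lists the kinds of features but not their concrete values.

```python
import re

# Hypothetical configuration; none of these values come from the source.
PRESET_LENGTHS = {7, 8, 11}  # assumed landline and mobile number lengths
SENSITIVE_WORDS = {"do not ship", "fake order"}
AD_MARKERS = {"http://", "https://", "www."}


def address_text_features(address: str) -> dict:
    """Extract the three address text features as booleans."""
    digit_runs = re.findall(r"\d+", address)
    lowered = address.lower()
    return {
        "has_preset_length_number": any(len(r) in PRESET_LENGTHS for r in digit_runs),
        "has_sensitive_word": any(w in lowered for w in SENSITIVE_WORDS),
        "has_advertising": any(m in lowered for m in AD_MARKERS),
    }
```

Short digit runs such as building or unit numbers do not trigger the length feature, because only runs of a configured length count.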
  • a user's historical shopping behavior can indicate how likely the user is to fill in a malicious address. For example, a user who often has disputes with merchants, often requests refunds, and has a low transaction success rate is more likely to fill in a malicious address, whereas a user who has never had a dispute with a merchant, has never requested a refund, and has a high transaction success rate is less likely to do so. The historical shopping behavior feature can therefore be extracted from the historical orders corresponding to the to-be-identified order and used as one dimension for judging whether the address is a malicious address.
  • the historical shopping behavior features mainly include: the number of paid orders within a preset time period, the total payment amount within a preset time period, the total amount of refunds initiated within a preset time period, the transaction success rate within a preset time period, the number of merchants with which disputes occurred within a preset time period, the complaint initiation rate within a preset time period, and the proportion of refund disputes within a preset time period.
  • the preset time periods of each historical shopping behavior feature may be the same or different.
  • the order feature includes: whether the phone number in the to-be-identified order is normal, whether the number of times the to-be-identified address has been used is greater than a preset usage threshold, the related status of the store corresponding to the to-be-identified order, and the related status of the product corresponding to the to-be-identified order.
  • the related status of the store includes: the opening time of the store, the fluctuation of the store rating in the latest time period, the number of times the store has been maliciously attacked, etc.; the related status of the product includes: the sales volume of the product, the price of the product, and whether the product is popular.
  • the server can extract these order features from the to-be-identified order and analyze, through the dimension of the order feature, whether the to-be-identified address is a malicious address.
  • the basic features, namely the address text information feature, the historical shopping behavior feature, the order feature, and the normal address probability of the to-be-identified address, are cross-combined to generate more abstract feature descriptions, for example by crossing address text information features with order features.
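A simple way to cross-combine numeric base features is to take pairwise products. The sketch below assumes that scheme; the source does not specify how the crossing is done, and the function name `cross_features` and the feature names are hypothetical.

```python
from itertools import combinations


def cross_features(base: dict) -> dict:
    """Return pairwise products of numeric base features, keyed as 'a*b'."""
    return {
        f"{a}*{b}": base[a] * base[b]
        for a, b in combinations(sorted(base), 2)
    }


# Illustrative base features: two binary flags and a normal address probability.
base = {"has_sensitive_word": 1, "phone_abnormal": 1, "normal_prob": 0.25}
crossed = cross_features(base)
```

The crossed feature `has_sensitive_word*phone_abnormal` is 1 only when both flags are set, which is the kind of joint signal a single base feature cannot express.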
  • the server may train the preset recognition model as follows: first acquire historical orders; then obtain, according to the address-level jump probability distribution, the normal address probability of the historical address carried in each historical order; extract the preset identification features from the historical orders; and finally train the preset recognition model using the normal address probability of each historical address and the corresponding preset identification features.
  • the historical order includes a preset proportion of historical normal orders and historical malicious orders, and when the ratio of historical normal orders to historical malicious orders is about 4:1, the accuracy of malicious address recognition is relatively high.
  • the preset recognition model to be trained in this step may be a GBDT (Gradient Boosting Decision Tree) model, or another model such as an SVM (Support Vector Machine) model, an LR (Logistic Regression) model, or a neural network model.
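As a dependency-free illustration of the training step, the sketch below fits a logistic regression model, one of the alternatives named above, by plain batch gradient descent. It is not the GBDT model and not the patent's implementation. The two features (normal address probability and a sensitive-word flag), the toy data with a 4:1 normal-to-malicious ratio, and the hyperparameters are all assumptions.

```python
import math


def train_logistic_regression(X, y, lr=0.5, epochs=5000):
    """Batch gradient descent; returns weights with the bias as the last element."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err  # gradient of the bias term
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict_malicious_probability(w, x):
    """Probability that feature vector x belongs to a malicious address."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-z))


# Toy data: [normal address probability, has sensitive word]; label 1 = malicious.
X = [[0.9, 0], [0.8, 0], [0.85, 0], [0.7, 0], [0.95, 0],
     [0.75, 0], [0.9, 0], [0.8, 0], [0.05, 1], [0.1, 1]]
y = [0] * 8 + [1] * 2
w = train_logistic_regression(X, y)
```

After training, `predict_malicious_probability(w, [0.05, 1])` is above 0.5 and `predict_malicious_probability(w, [0.9, 0])` is below it; the resulting probability would then be compared with a preset threshold.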
  • the server may input the features into the preset recognition model, so that the model analyzes the features comprehensively and outputs the probability that the to-be-identified address is a normal address or the probability that it is a malicious address; whether the to-be-identified address is a malicious address is then determined according to a preset normal address probability threshold or a preset malicious address probability threshold.
  • if the to-be-identified address is a malicious address, the server sends alert prompt information to the merchant client so that the merchant client receives and outputs the alert prompt information.
  • after the server determines that the to-be-identified address is a malicious address, to avoid the economic, reputational, and other losses that a malicious address could cause, the server sends the to-be-identified order to the merchant client together with alert prompt information indicating that the address may be a malicious address. After receiving the alert prompt information, the merchant can contact the buyer using the phone number in the order to confirm whether the address is a malicious address; if the merchant determines that it is a malicious address, the merchant can refuse to ship; if the merchant determines that it is a normal address, the merchant can ship the goods with confidence.
  • if the to-be-identified address is a normal address, the server may send only the to-be-identified order to the merchant client, without alert prompt information; when the merchant finds that the received order carries no alert prompt information, the merchant ships directly to the address in the order.
  • the server may misjudge a malicious address as a normal address. When the merchant finds during actual shipping that the address cannot be delivered to, the merchant can click, in the merchant client, the button that marks the address as a bad address, so that the merchant client sends the to-be-identified address carrying the malicious identifier to the server; after receiving it, the server updates the historical normal address database and the historical malicious address database and retrains the preset recognition model.
  • when the merchant determines that the to-be-identified address is a malicious address, the merchant may select, on the page of the early warning tool (or the selection interface mentioned in the above system embodiment), the button indicating that the address is a malicious address, so that the merchant client sends the to-be-identified address carrying the malicious identifier to the server; when the merchant determines that the to-be-identified address is a normal address rather than a malicious address, the merchant may select, on the page of the early warning tool, the button indicating that the address is a normal address, so that the merchant client sends the to-be-identified address carrying the normal identifier to the server.
  • when the recognition result is that the to-be-identified address is a normal address, the server determines that it made a misjudgment, promptly updates the historical normal address database and the historical malicious address database, re-analyzes the address level jump probability distribution, and retrains the preset recognition model.
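The feedback handling described above could be sketched as follows. This is a minimal illustration only: the function name `handle_feedback`, the use of Python sets as the two address databases, and the boolean "retrain needed" return value are assumptions, not details given in the source.

```python
def handle_feedback(address, predicted_malicious, merchant_says_malicious,
                    normal_db, malicious_db):
    """Update the two address sets and report whether retraining is needed."""
    if merchant_says_malicious:
        # The merchant's verdict is treated as ground truth (assumption).
        normal_db.discard(address)
        malicious_db.add(address)
    else:
        malicious_db.discard(address)
        normal_db.add(address)
    # A disagreement between the server and the merchant is a misjudgment.
    return predicted_malicious != merchant_says_malicious


# Hypothetical data: the server flagged "addr-1", the merchant says it is normal.
normal_db, malicious_db = set(), {"addr-1"}
needs_retrain = handle_feedback("addr-1", True, False, normal_db, malicious_db)
```

When `needs_retrain` is true, the server would re-analyze the jump probability distribution and retrain the recognition model, as the text describes.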
  • the interaction between the server and the clients in the embodiment of the present invention may be as shown in FIG. 5. The embodiment can not only obtain, based on the address level jump probability distribution, the probability that the address to be identified is a normal address, but also obtain other preset identification features, such as the address text information feature, the historical shopping behavior feature, the order feature, and the cross feature, from the historical orders and the to-be-identified order. The normal address probability of the address to be identified and these preset identification features are then input into the GBDT model (or another recognition model) for joint analysis to determine whether the address to be identified is a malicious address, which further improves the accuracy of malicious address recognition.
  • when the server finally determines that the to-be-identified address is a malicious address, it can also send warning prompt information to the merchant client, so that the merchant can contact the buyer to verify whether the address is malicious before deciding whether to ship, thereby avoiding losses. Further, after the merchant has determined from the actual situation whether the address is malicious, the merchant can select the corresponding confirmation button on the merchant client, so that the merchant client feeds the actual result back to the server and the server can determine from that feedback whether it misjudged. If a misjudgment occurred, the GBDT model can be retrained promptly to refine it, which improves the accuracy of subsequent malicious address recognition.
  • another embodiment of the present invention further provides a device for identifying a malicious address.
  • the device mainly includes a receiving unit 41, a first processing unit 42, a calculating unit 43, and a second processing unit 44, wherein:
  • the receiving unit 41 is configured to receive an address to be identified sent by the user client;
  • the first processing unit 42 is configured to perform address stratification processing on the address to be identified, and obtain each address level of the address to be identified;
  • the calculating unit 43 is configured to calculate, using an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address to be identified jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level;
  • the second processing unit 44 is configured to perform multiplication processing on each jump probability obtained by the calculating unit 43 to obtain a normal address probability of the address to be identified.
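The calculation carried out by the calculating unit 43 and the second processing unit 44 can be illustrated with a minimal Python sketch. The function name `normal_address_probability`, the dictionary layout of the distribution, and the probability values are assumptions made for illustration; the source only states that the jump probabilities of adjacent levels are looked up and multiplied. Treating a transition that never occurs in historical normal addresses as probability 0 is likewise an assumption.

```python
from math import prod


def normal_address_probability(levels, jump_prob):
    """Multiply the jump probabilities of adjacent address levels.

    levels    -- address levels ordered from coarsest to finest
    jump_prob -- dict mapping (level, next_level) to a probability
    """
    # Assumption: a transition absent from the distribution gets 0.0.
    steps = [jump_prob.get(pair, 0.0) for pair in zip(levels, levels[1:])]
    return prod(steps)


# Illustrative values only; they are not taken from the source.
jump_prob = {("Province A", "City B"): 0.5, ("City B", "District C"): 0.4}
p_known = normal_address_probability(
    ["Province A", "City B", "District C"], jump_prob)
p_unknown = normal_address_probability(["Province A", "City X"], jump_prob)
```

For addresses with many levels, summing log-probabilities instead of multiplying would avoid floating-point underflow; that is a design choice of this sketch's author, not something the source prescribes.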
  • the device for identifying a malicious address provided by the embodiment of the present invention can, after obtaining the to-be-identified address, perform address stratification processing on it to obtain each of its address levels, then use the address level jump probability distribution to calculate the jump probability of each address level jumping to the adjacent next address level, and multiply the jump probabilities to obtain the probability that the to-be-identified address is a normal address, so that this probability can be used to determine whether the address is a malicious address.
  • compared with the prior art, which crudely uses malicious keywords, blacklists and whitelists, or the address hierarchy structure to determine whether the address to be identified is a malicious address, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the address to be identified, and derives from those jump probabilities the probability that the whole address is a normal address. It can therefore obtain the normal address probability not only of addresses that contain malicious keywords, appear in the blacklist or whitelist, or have a complete address hierarchy structure, but also of addresses that contain no malicious keywords, appear in neither list, or have an incomplete address hierarchy structure. Whether the address to be identified is a malicious address can then be determined from the normal address probability, which improves the accuracy of malicious address recognition.
  • the device further includes:
  • the determining unit 45 is configured to determine, according to the preset identification rule and the normal address probability of the to-be-identified address, whether the address to be identified is a malicious address after obtaining the normal address probability of the to-be-identified address.
  • the determining unit 45 includes:
  • the extracting module 451 is configured to extract, from the to-be-identified order corresponding to the to-be-identified address and/or the historical orders corresponding to the to-be-identified order, preset identification features for identifying whether the to-be-identified address is a malicious address;
  • the obtaining module 452 is configured to acquire a preset recognition model trained by the historical order;
  • the first determining module 453 is configured to determine, according to the normal address probability of the address to be identified, the preset identification features, and the preset recognition model, whether the address to be identified is a malicious address.
  • the extraction module 451 includes:
  • a first extraction sub-module 4511 configured to extract a corresponding address text information feature from the to-be-identified address
  • a second extraction sub-module 4512 configured to extract a historical shopping behavior feature from a historical order corresponding to the to-be-identified order
  • the third extraction sub-module 4513 is configured to extract a corresponding order feature from the to-be-identified order.
  • the extraction module 451 further includes:
  • the obtaining sub-module 4514 is configured to obtain a cross feature corresponding to the to-be-identified address according to a combination of at least two of the address text information feature, the historical shopping behavior feature, the order feature, and the normal address probability of the to-be-identified address.
  • the address text information feature extracted by the first extraction sub-module 4511 includes: whether the address contains a number of a preset length, whether it contains a preset sensitive word, and whether it contains advertisement information;
  • the order feature extracted by the third extraction sub-module 4513 includes: whether the phone number in the to-be-identified order is normal, whether the number of times of use of the to-be-identified address is greater than a preset usage threshold, the relevant state of the store corresponding to the to-be-identified order, and the corresponding to-be-identified order The relevant status of the goods.
  • the obtaining module 452 is further configured to acquire historical orders, where the historical orders include historical normal orders and historical malicious orders in a certain proportion;
  • the obtaining module 452 is further configured to obtain a normal address probability of the historical address carried in the historical order according to the conditional random field model and the address level jump probability distribution;
  • the extraction module 451 is further configured to extract a preset identification feature from the historical order
  • the determining unit 45 further includes:
  • the training module 454 is configured to train the preset recognition model by using a normal address probability of each historical address and a corresponding preset identification feature.
  • the determining unit 45 includes:
  • the second determining module 455 is configured to determine whether a normal address probability of the to-be-identified address is greater than a preset probability threshold
  • the determining module 456 is configured to: when the judgment result of the second determining module is that the normal address probability of the to-be-identified address is greater than the preset probability threshold, determine that the to-be-identified address is a normal address, and when the determining result of the second determining module is the to-be-identified address When the normal address probability is less than or equal to the preset probability threshold, it is determined that the to-be-identified address is a malicious address.
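The threshold rule applied by modules 455 and 456 is simple enough to state in a few lines of Python. The function name `classify_address` and the threshold value used below are illustrative assumptions; the source does not give a concrete threshold.

```python
def classify_address(normal_prob, threshold):
    """Return "normal" if normal_prob exceeds threshold, else "malicious"."""
    # Per the text: strictly greater is normal; less than or equal is malicious.
    return "normal" if normal_prob > threshold else "malicious"


# The threshold 0.01 is an arbitrary illustrative value.
above = classify_address(0.2, 0.01)
equal = classify_address(0.01, 0.01)
```

Note that an address whose probability equals the threshold is classified as malicious, matching the "less than or equal to" wording above.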
  • the device further includes:
  • the first sending unit 46 is configured to send a recognition result that determines whether the address to be identified is a malicious address to the merchant client, so that the merchant client receives and outputs the recognition result.
  • the device further includes:
  • the second sending unit 47 is configured to: when the determining unit 45 determines that the to-be-identified address is a malicious address, send the warning prompt information to the merchant client, so that the merchant client receives and outputs the warning prompt information;
  • the receiving unit 41 is configured to receive the recognition result, sent by the merchant client, of the secondary identification of the address to be identified that was performed on the basis of the warning prompt information;
  • the first updating unit 48 is configured to update the historical normal address database, the historical malicious address database, and the preset recognition model when the recognition result received by the receiving unit 41 is that the to-be-identified address is a normal address.
  • the receiving unit 41 is configured to receive the to-be-identified address that carries the malicious identifier sent by the merchant client.
  • the device further includes:
  • the second updating unit 49 is configured to update the historical normal address database, the historical malicious address database, and the preset recognition model.
  • the to-be-identified address is an address obtained after the first processing unit 42 performs redundancy processing and formatting processing on the order to be identified.
  • the first processing unit 42 includes:
  • the filtering module 421 is configured to filter the text that meets the preset filtering condition in the to-be-identified address of the order to be identified;
  • the filtering module 421 is further configured to filter the dirty data in the order to be identified;
  • the processing module 422 is configured to perform formatting processing on the to-be-identified order filtered by the filtering module 421 according to the preset formatting processing rule.
  • the device for identifying a malicious address provided by the embodiment of the present invention can not only obtain, based on the address level jump probability distribution, the probability that the address to be identified is a normal address, but also obtain other preset identification features, such as the address text information feature, the historical shopping behavior feature, the order feature, and the cross feature, from the historical orders and the to-be-identified order. The normal address probability of the address to be identified and these preset identification features are input into a preset recognition model for joint analysis to determine whether the address to be identified is a malicious address, which further improves the accuracy of malicious address recognition.
  • when the server finally determines that the to-be-identified address is a malicious address, it can also send warning prompt information to the merchant client, so that the merchant can contact the buyer to verify whether the address is malicious before deciding whether to ship, thereby avoiding losses. Further, after the merchant has determined from the actual situation whether the address is malicious, the merchant can select the corresponding confirmation button on the merchant client, so that the merchant client feeds the actual result back to the server and the server can determine from that feedback whether it misjudged. If a misjudgment occurred, the preset recognition model can be retrained promptly to refine it, which improves the accuracy of subsequent malicious address recognition.
  • another embodiment of the present invention provides a system for identifying a malicious order, the system including a user client, a server, and a merchant client;
  • the user client is configured to receive the input to-be-identified order and send it to the server;
  • the server is configured to receive the to-be-identified order sent by the user client; calculate, based on the address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level; multiply the obtained jump probabilities to obtain the normal address probability of the address; determine, according to the normal address probability, whether the to-be-identified order is a malicious order; and send the judgment result to the merchant client;
  • the merchant client is used to receive and display the judgment result sent by the server.
  • the system for identifying a malicious order provided by the embodiment of the present invention, after receiving the to-be-identified order sent by the user client, first uses the address level jump probability distribution to calculate the probability that the address in the order is a normal address, and then uses that probability to determine whether the to-be-identified order is a malicious order. Compared with the prior art, which crudely uses malicious keywords, blacklists and whitelists, or the address hierarchy structure to determine whether an address is malicious, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the address to be identified, and derives from those jump probabilities the probability that the whole address is a normal address. It can therefore obtain the normal address probability not only of addresses that contain malicious keywords, appear in the blacklist or whitelist, or have a complete address hierarchy structure, but also of addresses that contain no malicious keywords, appear in neither list, or have an incomplete address hierarchy structure. Whether the address is malicious can then be determined from the normal address probability, which improves the accuracy of identifying malicious addresses and, in turn, of identifying malicious orders.
  • another embodiment of the present invention provides a method for identifying a malicious order. As shown in FIG. 8, the method mainly includes:
  • the user client can upload the order to the server, and after receiving the order, the server can perform malicious address recognition operation on the order.
  • the address level jump probability distribution includes a jump probability of any one of the address level jumps to another address level.
  • the server may first perform address stratification processing on the address of the to-be-identified order to obtain each address level of the address (see step 202 above), and then calculate, based on the address level jump probability distribution, the jump probability of each address level jumping to the adjacent next address level (see step 203 above).
  • the server may first determine, according to the normal address probability, whether the address of the to-be-identified order is a malicious address; if the address of the to-be-identified order is a malicious address, determine that the to-be-identified order is a malicious order; if the address of the to-be-identified order is a normal address, Then determine that the order to be identified is a normal order.
  • the specific implementation manner of determining whether the address of the to-be-identified order is a malicious address according to the normal address probability is the same as the specific implementation manner in the foregoing embodiment of the method for identifying a malicious address, and details are not described herein again.
  • besides causing trouble for the merchant by entering a malicious address, a malicious user often harasses the merchant by other means, such as filling in an out-of-service telephone number so that the merchant cannot reach the buyer. Therefore, when the address of the to-be-identified order is judged to be a normal address, it is also necessary to determine whether the telephone number in the order is normal. If the phone number is abnormal, the to-be-identified order is determined to be a malicious order; if the phone number is normal, the order is determined to be a normal order.
  • whether a phone number is abnormal may be determined as follows: build a normal phone number database and match the phone number to be identified against it; if the match fails, the phone number is determined to be abnormal, and if the match succeeds, the phone number is determined to be normal.
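The order-level decision and the phone number check described above could be sketched as follows. The function names, the use of a Python set as the normal phone number database, and the sample phone numbers are hypothetical; the source describes only the matching logic.

```python
def is_phone_normal(phone, normal_phone_db):
    """A phone number is normal only if it matches the normal-number database."""
    return phone in normal_phone_db


def classify_order(address_is_malicious, phone, normal_phone_db):
    """Apply the two-stage rule: address first, then phone number."""
    if address_is_malicious:
        return "malicious"
    if not is_phone_normal(phone, normal_phone_db):
        return "malicious"
    return "normal"


# Made-up numbers used purely for illustration.
normal_phone_db = {"13800000000"}
ok_order = classify_order(False, "13800000000", normal_phone_db)
bad_phone_order = classify_order(False, "00000000000", normal_phone_db)
bad_address_order = classify_order(True, "13800000000", normal_phone_db)
```

The phone check is reached only when the address has already been judged normal, which mirrors the order of checks in the text.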
  • the method for identifying a malicious order provided by the embodiment of the present invention, after receiving the to-be-identified order sent by the user client, first uses the address level jump probability distribution to calculate the probability that the address in the order is a normal address, and then uses that probability to determine whether the to-be-identified order is a malicious order. Compared with the prior art, which crudely uses malicious keywords, blacklists and whitelists, or the address hierarchy structure to determine whether an address is malicious, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the address to be identified, and derives from those jump probabilities the probability that the whole address is a normal address. It can therefore obtain the normal address probability not only of addresses that contain malicious keywords, appear in the blacklist or whitelist, or have a complete address hierarchy structure, but also of addresses that contain no malicious keywords, appear in neither list, or have an incomplete address hierarchy structure. Whether the address is malicious can then be determined from the normal address probability, which improves the accuracy of identifying malicious addresses and, in turn, of identifying malicious orders.
  • another embodiment of the present invention provides a device for identifying a malicious order.
  • the device mainly includes:
  • the receiving unit 61 is configured to receive a to-be-identified order sent by the user client;
  • the calculating unit 62 is configured to calculate, based on the address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level;
  • the processing unit 63 is configured to perform multiplication processing on each obtained jump probability to obtain a normal address probability of the address
  • the determining unit 64 is configured to determine, according to the normal address probability, whether the to-be-identified order is a malicious order.
  • the determining unit 64 includes:
  • the judging module 641 is configured to judge, according to the normal address probability, whether the address of the to-be-identified order is a malicious address;
  • the determining module 642 is configured to determine that the to-be-identified order is a malicious order when the address of the to-be-identified order is a malicious address.
  • the judging module 641 is further configured to judge, when the address of the to-be-identified order is a normal address, whether the phone number in the to-be-identified order is normal;
  • the determining module 642 is further configured to determine that the to-be-identified order is a malicious order when the phone number is abnormal.
  • the calculating unit 62 includes:
  • the processing module 621 is configured to perform address stratification processing on the address of the order to be recognized, and obtain each address level of the address;
  • the calculating module 622 is configured to calculate a jump probability of each address level jump to an adjacent next address level based on the address level jump probability distribution.
  • the device for identifying a malicious order provided by the embodiment of the present invention, after receiving the to-be-identified order sent by the user client, first uses the address level jump probability distribution to calculate the probability that the address in the order is a normal address, and then uses that probability to determine whether the to-be-identified order is a malicious order. Compared with the prior art, which crudely uses malicious keywords, blacklists and whitelists, or the address hierarchy structure to determine whether an address is malicious, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the address to be identified, and derives from those jump probabilities the probability that the whole address is a normal address. It can therefore obtain the normal address probability not only of addresses that contain malicious keywords, appear in the blacklist or whitelist, or have a complete address hierarchy structure, but also of addresses that contain no malicious keywords, appear in neither list, or have an incomplete address hierarchy structure. Whether the address is malicious can then be determined from the normal address probability, which improves the accuracy of identifying malicious addresses and, in turn, of identifying malicious orders.
  • modules in the devices of the embodiments can be adaptively changed and placed in one or more devices different from the embodiment.
  • the modules or units or components of the embodiments may be combined into one module or unit or component, and further they may be divided into a plurality of sub-modules or sub-units or sub-components.
  • all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
  • the various component embodiments of the present invention may be implemented in hardware, or in a software module running on one or more processors, or in a combination thereof.
  • a microprocessor or digital signal processor may be used in practice to implement some or all of the identification systems, methods, and devices of malicious addresses/malicious orders in accordance with embodiments of the present invention.
  • the invention can also be implemented as a device or device program (e.g., a computer program and a computer program product) for performing some or all of the methods described herein.
  • Such a program implementing the invention may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Computer And Data Communications (AREA)

Abstract

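The present invention discloses a system, method, and device for identifying malicious addresses/malicious orders. It relates to the field of Internet technology and can solve the problem of low accuracy in identifying malicious addresses/malicious orders in the prior art. The method of the present invention mainly includes: receiving an address to be identified sent by a user client; performing address stratification on the address to be identified to obtain its address levels; calculating, using an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address to be identified jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level; and multiplying the obtained jump probabilities to obtain the normal address probability of the address to be identified.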

Description

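System, method, and device for identifying malicious addresses/malicious orders
This application claims priority to Chinese patent application No. 201610797563.7, filed on August 31, 2016 and entitled "System, method, and device for identifying malicious addresses/malicious orders", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of Internet technology, and in particular to a system, method, and device for identifying malicious addresses/malicious orders.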
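Background
With the development of Internet technology, people can use the network not only to watch videos, browse web pages, and chat, but also to shop, and the shopping process itself is very convenient.
In practice, however, some buyers deliberately fill in incomplete or incorrect delivery addresses, or behave maliciously in other ways, so that the goods cannot be delivered, which causes merchants economic and reputational losses. Identifying malicious addresses is therefore extremely important to merchants. There are currently three main ways of identifying malicious addresses: (1) matching the address to be identified against preset malicious keywords to determine whether it is a malicious address; (2) matching the address to be identified against the addresses in a blacklist and a whitelist to determine whether it is a malicious address; (3) dividing the address to be identified into a hierarchical structure and matching it against a preset address hierarchy structure to determine whether it is a malicious address.
Although each of these three approaches can identify some malicious addresses, they cannot identify certain hidden malicious addresses, and they may misjudge normal addresses as malicious. For example, the same keyword may be malicious in one address but normal in another, so if that keyword is used as a preset malicious keyword for identification, a normal address may be misjudged as a malicious address. As another example, because the blacklist and whitelist are maintained manually from merchants' actual feedback after shipment, identification based on them not only consumes manpower but also fails to identify new malicious addresses in time. As a further example, an address whose hierarchy structure is complete but which does not exist in real life will be misjudged as a normal address if identification relies on the preset address hierarchy structure. Consequently, the accuracy of identifying malicious addresses in the prior art is low, and so is the accuracy of identifying malicious orders.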
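Summary of the Invention
In view of this, the present invention provides a system, method, and device for identifying malicious addresses/malicious orders, which can solve the problem of low accuracy in identifying malicious addresses/malicious orders in the prior art.
In a first aspect, the present invention provides a system for identifying a malicious address, the system including a user client, a server, and a merchant client, wherein:
the user client is configured to receive an input address to be identified and send the address to be identified to the server;
the server is configured to receive the address to be identified sent by the user client, and perform address stratification on it to obtain the address levels of the address to be identified; calculate, using an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address to be identified jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level; multiply the obtained jump probabilities to obtain the normal address probability of the address to be identified; and send the recognition result of the malicious address identification based on the normal address probability to the merchant client;
the merchant client is configured to receive and output the recognition result sent by the server.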
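In a second aspect, the present invention provides a method for identifying a malicious address, the method including:
receiving an address to be identified sent by a user client;
performing address stratification on the address to be identified to obtain the address levels of the address to be identified;
calculating, using an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address to be identified jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level;
multiplying the obtained jump probabilities to obtain the normal address probability of the address to be identified.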
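In a third aspect, the present invention provides a device for identifying a malicious address, the device including:
a receiving unit, configured to receive an address to be identified sent by a user client;
a first processing unit, configured to perform address stratification on the address to be identified to obtain the address levels of the address to be identified;
a calculating unit, configured to calculate, using an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address to be identified jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level;
a second processing unit, configured to multiply the jump probabilities obtained by the calculating unit to obtain the normal address probability of the address to be identified.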
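In a fourth aspect, the present invention provides a system for identifying a malicious order, the system including a user client, a server, and a merchant client, wherein:
the user client is configured to receive an input to-be-identified order and send the to-be-identified order to the server;
the server is configured to receive the to-be-identified order sent by the user client; calculate, based on an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level; multiply the obtained jump probabilities to obtain the normal address probability of the address; determine, according to the normal address probability, whether the to-be-identified order is a malicious order; and send the judgment result to the merchant client;
the merchant client is configured to receive and display the judgment result sent by the server.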
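In a fifth aspect, the present invention provides a method for identifying a malicious order, the method including:
receiving a to-be-identified order sent by a user client;
calculating, based on an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level;
multiplying the obtained jump probabilities to obtain the normal address probability of the address;
determining, according to the normal address probability, whether the to-be-identified order is a malicious order.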
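In a sixth aspect, the present invention provides a device for identifying a malicious order, the device including:
a receiving unit, configured to receive a to-be-identified order sent by a user client;
a calculating unit, configured to calculate, based on an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address of the to-be-identified order jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level;
a processing unit, configured to multiply the obtained jump probabilities to obtain the normal address probability of the address;
a judging unit, configured to determine, according to the normal address probability, whether the to-be-identified order is a malicious order.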
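By means of the above technical solution, the system, method, and device for identifying malicious addresses/malicious orders provided by the present invention enable the server, after obtaining the address to be identified and the address level jump probability distribution obtained by analyzing historical normal addresses, to first perform address stratification on the address to be identified to obtain its address levels, then use the obtained distribution to calculate the jump probability of each address level jumping to the adjacent next address level, and multiply the jump probabilities to obtain the probability that the address to be identified is a normal address. This probability can then be used to determine whether the address to be identified is a malicious address, or whether an order containing that address is a malicious order. Compared with the prior art, which crudely uses malicious keywords, blacklists and whitelists, or the address hierarchy structure to determine whether the address to be identified is a malicious address, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the address to be identified, and derives from those jump probabilities the probability that the whole address is a normal address. It can therefore obtain the normal address probability not only of addresses that contain malicious keywords, appear in the blacklist or whitelist, or have a complete address hierarchy structure, but also of addresses that contain no malicious keywords, appear in neither list, or have an incomplete address hierarchy structure. Whether the address to be identified is a malicious address can be determined from the normal address probability, and whether the to-be-identified order is a malicious order can be determined from whether its address is malicious, which improves the accuracy of identifying malicious addresses/malicious orders.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.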
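Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 is a schematic diagram of a system for identifying a malicious address according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a selection interface on the merchant client side according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for identifying a malicious address according to an embodiment of the present invention;
FIG. 4 is a flowchart of another method for identifying a malicious address according to an embodiment of the present invention;
FIG. 5 is a diagram of the interaction between the server and the clients during malicious address identification according to an embodiment of the present invention;
FIG. 6 is a block diagram of a device for identifying a malicious address according to an embodiment of the present invention;
FIG. 7 is a block diagram of another device for identifying a malicious address according to an embodiment of the present invention;
FIG. 8 is a flowchart of a method for identifying a malicious order according to an embodiment of the present invention;
FIG. 9 is a block diagram of a device for identifying a malicious order according to an embodiment of the present invention;
FIG. 10 is a block diagram of another device for identifying a malicious order according to an embodiment of the present invention.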
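Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
To improve the accuracy of identifying malicious addresses, an embodiment of the present invention provides a system for identifying a malicious address. As shown in FIG. 1, the system includes a user client 11, a server 12, and a merchant client 13, wherein:
the user client 11 is configured to receive an input address to be identified and send the address to be identified to the server 12;
the server 12 is configured to receive the address to be identified sent by the user client 11, and perform address stratification on it to obtain the address levels of the address to be identified; calculate, using an address level jump probability distribution obtained by analyzing historical normal addresses, the jump probability of each address level in the address to be identified jumping to the adjacent next address level, where the address level jump probability distribution includes the jump probability of any address level jumping to another address level; multiply the obtained jump probabilities to obtain the normal address probability of the address to be identified; and send the recognition result of the malicious address identification based on the normal address probability to the merchant client 13;
the merchant client 13 is configured to receive and output the recognition result sent by the server 12.
In the system for identifying a malicious address provided by the embodiment of the present invention, after the server receives the address to be identified sent by the user client, it first performs address stratification on the address to obtain its address levels, then uses the address level jump probability distribution to calculate the jump probability of each address level jumping to the adjacent next address level, and multiplies the jump probabilities to obtain the probability that the address to be identified is a normal address, so that this probability can be used to determine whether the address is a malicious address. Compared with the prior art, which crudely uses malicious keywords, blacklists and whitelists, or the address hierarchy structure to determine whether the address to be identified is a malicious address, the present invention performs statistics and analysis on the correlation between address levels in historical normal addresses, uses the analysis results to determine the jump probability of each address level of the address to be identified, and derives from those jump probabilities the probability that the whole address is a normal address. It can therefore obtain the normal address probability not only of addresses that contain malicious keywords, appear in the blacklist or whitelist, or have a complete address hierarchy structure, but also of addresses that contain no malicious keywords, appear in neither list, or have an incomplete address hierarchy structure. Whether the address to be identified is a malicious address can then be determined from the normal address probability, which improves the accuracy of malicious address identification.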
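Further, the server 12 is configured to send a warning message to the merchant client 13 when the identification result is that the address to be identified is a malicious address;
the merchant client 13 is configured to receive and output the warning message sent by the server 12.
Further, after receiving the warning message, the merchant client 13 is configured to output a selection interface for choosing the result of a secondary identification of the address, to receive the secondary identification result entered through that interface, and to return the secondary identification result to the server 12.
For example, as shown in FIG. 2, after the merchant client receives the warning message, it not only displays the message but also displays a selection interface on which the merchant chooses the secondary identification result. The interface may contain the text "Please contact the buyer to confirm again whether address *** is a malicious address" together with two buttons, "Yes" and "No", for the merchant to choose from.
It should be noted that the warning message may appear on the selection interface or on a separate interface.
Further, when no warning message has been received, the merchant client 13 is configured to output a selection interface for choosing the result of a secondary identification of the address, to receive through that interface an identification result stating that the address is a malicious address, and to return the address, tagged with a malicious flag, to the server 12.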
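Further, based on the above system embodiment, another embodiment of the present invention provides a malicious address identification method. As shown in FIG. 3, the method mainly includes:
201. Receive the address to be identified sent by the user client.
After a user places an order successfully, the user client (that is, the buyer client) can upload the order to the server. On receiving the order, the server can perform malicious address identification on it and send the order together with the identification result to the merchant client, so that the merchant can handle the order according to the result. Because an order received by the server often contains meaningless data, and to keep that data from interfering with identification of the address, the server first preprocesses the order to be identified and then extracts the address to be identified from the preprocessed order.
Accordingly, the address to be identified may be obtained as follows: obtain the order to be identified; perform redundancy processing and formatting on the order; obtain the address to be identified from the processed order.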
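The redundancy processing and formatting of the order to be identified specifically include:
(1) Filter out text in the address of the order that satisfies preset filter conditions.
Because a user may enter emoticons, meaningless English letters and other meaningless data in the address, the server can check whether the address contains such content and, if so, filter it out.
(2) Filter out dirty data in the order to be identified.
Because the server, when saving an order, may store dirty data containing abnormal content such as HTML (HyperText Markup Language) text or JSON (JavaScript Object Notation) strings, the server can filter out this dirty data.
(3) Format the filtered order according to preset formatting rules.
Because users filling in addresses, phone numbers and similar fields may add spaces, use traditional Chinese characters, use pinyin and so on, the filtered order also needs formatting operations such as removing spaces, converting between full-width and half-width characters, converting traditional to simplified characters, and converting pinyin to Chinese characters, so that the resulting addresses share a uniform format and can be identified accurately later.
It should be noted that the same preprocessing is also required when analyzing historical normal addresses and historical malicious addresses.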
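The filtering and formatting steps above can be sketched in a few lines. This is a minimal illustration rather than the patent's implementation: the function name `normalize_address` and the three regular expressions are assumptions, since the specification only speaks of "preset" filter conditions and formatting rules. Traditional-to-simplified and pinyin-to-character conversion need dictionaries and are left out.

```python
import html
import re
import unicodedata

# Hypothetical filter rules; the specification only says "preset filter conditions".
_TAG_RE = re.compile(r"<[^>]+>")                                   # leftover HTML tags
_EMOJI_RE = re.compile(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]")    # common emoji and symbols
_SPACE_RE = re.compile(r"\s+")                                     # any whitespace


def normalize_address(raw: str) -> str:
    """Strip dirty data and bring an address string into a uniform format."""
    text = html.unescape(raw)                    # decode entities such as &lt;
    text = _TAG_RE.sub("", text)                 # drop HTML markup
    text = unicodedata.normalize("NFKC", text)   # full-width -> half-width forms
    text = _EMOJI_RE.sub("", text)               # drop emoji-like symbols
    return _SPACE_RE.sub("", text)               # remove all spaces
```

The same function would be applied to historical addresses before any statistics are collected, so that training data and incoming addresses share one format.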
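202. Hierarchize the address to be identified to obtain its address levels.
Because each level of an address depends only on the immediately preceding level and not on any other level, the address hierarchy has the Markov property, so a conditional random field model can be used for hierarchization. Specifically, after obtaining the address to be identified, the server can use a conditional random field model to segment the address and label its address levels, thereby obtaining each address level of the address. For example, if the address to be identified is 四川省成都市青羊区苏坡街道**家园5号楼1单元 (Sichuan Province, Chengdu, Qingyang District, Supo Subdistrict, ** Jiayuan, Building 5, Unit 1), the address levels are: "province: 四川省; city: 成都市; district/county: 青羊区; road: 苏坡街道; residential community: **家园; building number: 5号楼; unit number: 1单元".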
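203. Use the address-level transition probability distribution derived from historical normal addresses to compute, for each address level of the address to be identified, the transition probability to the adjacent next address level.
The address-level transition probability distribution includes the transition probability from any address level to another address level. Because historical normal addresses are addresses to which merchants delivered successfully, the server, after collecting a large number of them, can collect statistics on and analyze how their address levels transition, and obtain from this the address-level transition probability distribution, which is then used to assess the transitions between the levels of the address to be identified.
Once the address levels of the address to be identified are available, the server can use the distribution to compute the transition probabilities between adjacent levels, that is, the probability of moving from level N to level N+1. For example, after obtaining the levels "province: 四川省; city: 成都市; district/county: 青羊区; road: 苏坡街道; residential community: **家园; building number: 5号楼; unit number: 1单元", the server can use the distribution to obtain the probability of 四川省 transitioning to 成都市, of 成都市 to 青羊区, of 青羊区 to 苏坡街道, of 苏坡街道 to **家园, of **家园 to 5号楼, and of 5号楼 to 1单元.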
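204. Multiply the obtained transition probabilities together to obtain the normal-address probability of the address to be identified.
The address-level transition probabilities can be trained on a large number of historical normal addresses from across the country. Once the transition probabilities of the address levels of the address to be identified have been obtained, they are multiplied together to give the probability that the address is a normal address.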
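Steps 203 and 204 can be sketched as follows. This is a minimal illustration under stated assumptions: addresses are already split into level values, transition probabilities are estimated by simple relative frequency, and the `floor` value used for transitions never seen in the history is a hypothetical choice, since the specification does not say how unseen transitions are handled.

```python
from collections import Counter, defaultdict


def fit_transitions(addresses):
    """Estimate P(next level value | current level value) from historical normal addresses.

    addresses: list of level-value lists, e.g. ["四川省", "成都市", "青羊区"].
    """
    pair_counts = defaultdict(Counter)
    for levels in addresses:
        for prev, nxt in zip(levels, levels[1:]):
            pair_counts[prev][nxt] += 1
    return {
        prev: {nxt: count / sum(counter.values()) for nxt, count in counter.items()}
        for prev, counter in pair_counts.items()
    }


def normal_address_probability(levels, transitions, floor=1e-6):
    """Multiply the adjacent-level transition probabilities of one address.

    floor is an assumed probability for transitions absent from the history.
    """
    prob = 1.0
    for prev, nxt in zip(levels, levels[1:]):
        prob *= transitions.get(prev, {}).get(nxt, floor)
    return prob
```

With a history in which 成都市 is followed by 青羊区 in two of three cases, an address that continues 成都市 with a district from another city receives the floor value for that step, and its product collapses accordingly.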
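In practice, some malicious addresses are stitched together in sequence from places in several different provinces, for example 上海市闸北区龙岗区龙岗镇龙岗街道高科技工业园区内深圳***有限公司, in which "上海市闸北区" (Zhabei District, Shanghai) is a Shanghai address and "龙岗区龙岗镇龙岗街道高科技工业园区内深圳***有限公司" (Longgang District, Longgang Town, Longgang Subdistrict, inside the High-Tech Industrial Park, Shenzhen *** Co., Ltd.) is a Guangdong address. If training uses a large number of normal addresses from across the country, only the transition from 闸北区 to 龙岗区 is abnormal and all other transitions are normal, so the probability that the whole address is normal comes out high and the address is misjudged as normal. If training instead uses only the large body of historical normal addresses within Shanghai, only the transition from 上海市 to 闸北区 is normal and all other transitions are abnormal, so the probability that the whole address is malicious comes out high and the address is identified as malicious. Adding the province as a variable therefore improves the accuracy of malicious address identification.
In practice, once the province is added as a variable, the probability that the address to be identified is a normal address may be computed with the following formula:
Figure PCTCN2017097953-appb-000001
where S denotes the address to be identified, w_i denotes the i-th address level of the address, and C denotes the province to which the address belongs.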
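The formula itself survives here only as an image reference. One form that would be consistent with the symbol definitions above and with the multiplication of adjacent-level transition probabilities in step 204 is given below; the index range and the level count n are assumptions, and this should not be read as the published formula.

```latex
P(S \mid C) \;=\; \prod_{i=1}^{n-1} P\left(w_{i+1} \mid w_i,\, C\right)
```

Under this reading, each factor is a transition probability estimated only from historical normal addresses of province C, which is what makes the cross-province example above score low.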
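In the malicious address identification method provided by this embodiment, after the address to be identified is obtained, it is first hierarchized to obtain its address levels; the address-level transition probability distribution is then used to compute the transition probability from each address level to the adjacent next level; and these transition probabilities are multiplied together to obtain the probability that the address is a normal address, so that this probability can be used to judge whether the address is malicious. Compared with prior-art approaches that crudely judge whether an address is malicious from malicious keywords, blacklists and whitelists, or the address hierarchy structure, the present invention collects statistics on and analyzes the correlations between address levels in historical normal addresses, uses the analysis results to determine the transition probability of each address level of the address to be identified, and derives from these transition probabilities the probability that the whole address is a normal address. A normal-address probability can thus be obtained not only for addresses that contain malicious keywords, appear in a blacklist or whitelist, or have a complete address hierarchy, but also for addresses that contain no malicious keywords, appear in no blacklist or whitelist, or have an incomplete address hierarchy. Whether the address to be identified is malicious can then be determined from this normal-address probability, which improves the accuracy of malicious address identification.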
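Further, after the probability that the address to be identified is a normal address has been obtained, whether the address is malicious can be judged according to a preset identification rule and the normal-address probability.
Specifically, the normal-address probability can be used directly to judge whether the address is malicious, or other features associated with the address can be analyzed and the judgment made from the normal-address probability together with those features (as described in steps 305 to 307 below). Using the normal-address probability directly works as follows: determine whether the normal-address probability of the address is greater than a preset probability threshold; if it is greater, the address is determined to be a normal address; if it is less than or equal to the threshold, the address is determined to be a malicious address.
In addition, once the server has the identification result, it can send the result to the merchant client, which receives and displays it so that the merchant can decide whether to ship.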
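The direct threshold rule is small enough to state exactly. The sketch below is only an illustration; the function name and the string labels are assumptions, and the threshold value is whatever the operator presets. Note that a probability exactly equal to the threshold counts as malicious, as the text specifies.

```python
def classify_address(normal_prob: float, threshold: float) -> str:
    """Return "normal" only when the probability strictly exceeds the preset threshold."""
    return "normal" if normal_prob > threshold else "malicious"
```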
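Further, based on the above embodiment, another embodiment of the present invention provides a malicious address identification method. As shown in FIG. 4, the method mainly includes:
301. Receive the address to be identified sent by the user client.
302. Hierarchize the address to be identified to obtain its address levels.
303. Use the address-level transition probability distribution derived from historical normal addresses to compute, for each address level of the address to be identified, the transition probability to the adjacent next address level.
304. Multiply the obtained transition probabilities together to obtain the normal-address probability of the address to be identified.
305. Extract, from the order to be identified that corresponds to the address and/or from the historical orders corresponding to that order, preset identification features used to identify whether the address is malicious.
Specifically, the preset identification features include any one or any combination of the following: address text features, historical shopping behavior features, order features, and cross features.
Accordingly, this step can be broken down into steps a to d below: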
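(a) Extract the address text features from the address to be identified.
The address text features include: whether the address contains a digit string of a preset length, whether it contains preset sensitive words, whether it contains advertising content, and so on. The preset lengths include the length of a mobile phone number, of a landline number, of a QQ number, and the like.
A user who wants to insult the merchant or to advertise themselves or their own goods may put abusive text, mobile numbers, advertising content and the like in the address, and a user who does so is likely to have entered a malicious address. Address text features can therefore be extracted from the address to be identified so that this dimension can be used to analyze whether the address is malicious.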
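The address text features in step (a) can be sketched with a regular expression and two word lists. Everything concrete here is an assumption for illustration: the function name, the default digit lengths (11 for a mainland mobile number, 7 and 8 as stand-ins for local landline numbers), and the word lists, which the specification leaves as "preset".

```python
import re


def text_features(address, digit_lengths=(7, 8, 11), sensitive_words=(), ad_words=()):
    """Extract the three address text features described in step (a).

    digit_lengths, sensitive_words and ad_words are hypothetical presets.
    """
    digit_runs = re.findall(r"\d+", address)  # maximal runs of consecutive digits
    return {
        "has_preset_length_digits": any(len(run) in digit_lengths for run in digit_runs),
        "has_sensitive_word": any(word in address for word in sensitive_words),
        "has_ad_info": any(word in address for word in ad_words),
    }
```

A building number such as "5" yields a one-digit run and does not trigger the first feature, while an embedded 11-digit string does.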
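(b) Extract historical shopping behavior features from the historical orders corresponding to the order to be identified.
A user's historical shopping behavior indicates how likely the user is to enter a malicious address. For example, users who frequently have disputes with merchants, frequently request refunds without reason, or have a low transaction success rate are more likely to enter a malicious address, whereas users who have never had a dispute with a merchant, have never requested a refund, and have a high transaction success rate are less likely to. Historical shopping behavior features can therefore be extracted from the historical orders corresponding to the order to be identified and used as one dimension for judging whether the address is malicious.
In practice, the historical shopping behavior features mainly include: the number of paid orders within a preset time period, the total amount paid within a preset time period, the total number of refund requests within a preset time period, the transaction success rate within a preset time period, the number of merchants in dispute within a preset time period, the complaint initiation rate within a preset time period, the proportion of refund disputes within a preset time period, and so on. The preset time periods of the individual features may be the same or different.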
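A few of the listed behavior features can be computed from a user's past orders as sketched below. The order record layout (keys `paid`, `amount`, `refund_requested`, `succeeded`, `dispute_seller`) is hypothetical, and the sketch assumes the list has already been restricted to the preset time period.

```python
def shopping_behavior_features(orders):
    """Aggregate a user's historical orders into behavior features.

    orders: list of dicts with the hypothetical keys described above,
    already filtered to the preset time period.
    """
    paid = [o for o in orders if o["paid"]]
    n_paid = len(paid)
    return {
        "paid_order_count": n_paid,
        "paid_total": sum(o["amount"] for o in paid),
        "refund_request_count": sum(1 for o in orders if o["refund_requested"]),
        # Success rate is taken over paid orders; 0.0 when there are none.
        "success_rate": sum(1 for o in paid if o["succeeded"]) / n_paid if n_paid else 0.0,
        # Number of distinct merchants the user has had a dispute with.
        "disputed_seller_count": len({o["dispute_seller"] for o in orders if o.get("dispute_seller")}),
    }
```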
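(c) Extract the order features from the order to be identified.
Specifically, the order features include: whether the phone number in the order is normal, whether the number of times the address has been used exceeds a preset usage threshold, the status of the shop associated with the order, and the status of the goods associated with the order. The shop status includes the shop's opening time, the fluctuation of the shop's rating over a recent period, the number of times the shop has been maliciously attacked, and so on; the goods status includes the sales volume of the goods, their price, whether they are popular, and so on.
Because a user filling in an address may deliberately enter a wrong phone number or a new address that has never been used before, and because malicious behavior tends to concentrate on large merchants or popular goods, the server can extract these order features from the order to be identified and use them as a further dimension for analyzing whether the address is malicious.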
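(d) Obtain the cross features of the address to be identified from a combination of at least two of: the address text features, the historical shopping behavior features, the order features, and the normal-address probability of the address.
In practice, cross-combining these basic features produces more abstract feature descriptions. For example, crossing the address text features with the order features can yield the feature that the address both contains no meaningless text (that is, it carries no phone number, QQ number, preset sensitive word, advertising content or similar) and is an address the user commonly uses. The cross features of the address to be identified can therefore serve as yet another dimension for identifying malicious addresses.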
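The cross feature given as an example above can be sketched as a logical conjunction of basic features. The feature keys are hypothetical: `text_feats` is assumed to hold boolean flags that are true when the address carries digits, sensitive words or advertising, and `order_feats` is assumed to carry the usage-threshold flag from step (c).

```python
def cross_features(text_feats, order_feats):
    """Combine basic features into the example cross feature from the text.

    text_feats: dict of boolean "contains something suspicious" flags.
    order_feats: dict with the hypothetical key "address_use_count_above_threshold".
    """
    clean_text = not any(text_feats.values())
    return {
        "clean_text_and_frequent_address": clean_text
        and order_feats["address_use_count_above_threshold"],
    }
```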
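306. Obtain the preset identification model trained on historical orders.
Specifically, the server may train the preset identification model as follows: first obtain historical orders; then obtain, according to the address-level transition probability distribution, the normal-address probability of the historical address carried in each historical order; next extract the preset identification features from the historical orders; and finally train the preset identification model on the normal-address probability of each historical address together with the corresponding preset identification features.
The historical orders include historical normal orders and historical malicious orders in a preset ratio; when the ratio of historical normal orders to historical malicious orders is about 4:1, the accuracy of malicious address identification is relatively high.
It should be noted that, in practice, the preset identification model trained in this step may be a GBDT (Gradient Boosting Decision Tree) model or another model, such as an SVM (Support Vector Machine) model, an LR (Logistic Regression) model, or a neural network model.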
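307. Judge whether the address to be identified is malicious according to its normal-address probability, the preset identification features, and the preset identification model.
After obtaining the normal-address probability of the address and the preset identification features, the server can feed them into the preset identification model, which analyzes them together to produce the final probability that the address is a normal address or a malicious address; a preset normal-address probability threshold or a preset malicious-address probability threshold is then used to determine whether the address is malicious.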
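Steps 306 and 307 can be illustrated with the simplest of the model families the text lists. The sketch below uses a hand-written logistic regression (LR) as a stand-in, not the GBDT model; the feature layout, learning rate and epoch count are assumptions chosen for illustration, and a production system would more likely use an established library.

```python
import math


def _sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression by stochastic gradient descent.

    rows: feature vectors, e.g. [normal_address_probability, has_digits, frequent_address].
    labels: 1 for a historical malicious order, 0 for a historical normal order.
    """
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = _sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def malicious_probability(model, x):
    """Score one address with the trained model."""
    weights, bias = model
    return _sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
```

The returned score would then be compared against the preset malicious-address probability threshold, in the same way as the direct threshold rule.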
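308. If the address to be identified is judged to be malicious, send a warning message to the merchant client so that the merchant client receives and outputs it.
When the server judges the address to be malicious, then, to spare the merchant financial, reputational and other losses caused by a malicious address, the server can send the merchant client, along with the order, a warning message indicating that the address may be malicious. On receiving the warning, the merchant can contact the buyer using the phone number in the order to determine whether the address really is malicious. If the merchant confirms that it is, the merchant can refuse to ship; if the merchant confirms that the address is normal, the merchant can ship with confidence.
If the server judges the address to be normal, it can send only the order to the merchant client, without a warning message; a merchant who sees that the received order carries no warning ships directly to the address in the order. The server may, however, misjudge a malicious address as normal. When the merchant finds during actual delivery that the address cannot be reached, the merchant can select the button in the merchant client marking the address as malicious, so that the merchant client sends the address, tagged with a malicious flag, to the server. On receiving it, the server updates the historical normal address library and the historical malicious address library and retrains the preset identification model.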
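309. Receive, from the merchant client, the result of the secondary identification of the address performed in response to the warning message.
In practice, when the merchant confirms that the address is malicious, the merchant can select, on the page of the warning tool (or on the selection interface mentioned in the system embodiment above), the button confirming a malicious address, so that the merchant client sends the address, tagged with a malicious flag, to the server. When the merchant confirms that the address is normal rather than malicious, the merchant can select the button confirming a normal address, so that the merchant client sends the address, tagged with a normal flag, to the server.
310. If the identification result is that the address is a normal address, update the historical normal address library, the historical malicious address library, and the preset identification model.
When the secondary identification result sent by the merchant client is that the address is normal, the server concludes that its own judgment was wrong, immediately updates the historical normal address library and the historical malicious address library, then re-analyzes the address-level transition probability distribution and retrains the preset identification model.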
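The library update driven by merchant feedback can be sketched as moving an address between two sets. Representing the two libraries as in-memory Python sets and the function name are assumptions for illustration; the return value signals whether the libraries changed, in which case the caller would re-estimate the transition distribution and retrain the model.

```python
def apply_merchant_feedback(address, is_malicious, normal_lib, malicious_lib):
    """Record the merchant's verdict for an address.

    normal_lib and malicious_lib are sets standing in for the historical
    normal and malicious address libraries. Returns True when the libraries
    changed, meaning the distribution and model should be refreshed.
    """
    target, other = (malicious_lib, normal_lib) if is_malicious else (normal_lib, malicious_lib)
    changed = address not in target or address in other
    other.discard(address)  # remove any contradicting entry
    target.add(address)
    return changed
```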
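Taking the GBDT model as an example, the interaction between the server and the clients in this embodiment may be as shown in FIG. 5. As the above embodiments show, this embodiment can not only obtain a preliminary probability that the address to be identified is normal from the address-level transition probability distribution, but can also obtain other preset identification features, such as address text features, historical shopping behavior features, order features and cross features, from the historical orders and the order to be identified, and feed the normal-address probability together with these features into the GBDT model (or another identification model) for combined analysis to judge whether the address is malicious, which further improves the accuracy of malicious address identification. In addition, when the server finally determines that the address is malicious, it can send a warning message to the merchant client, so that the merchant can contact the buyer to verify whether the address is malicious before deciding whether to ship, thereby avoiding losses. Furthermore, once the merchant has established from the actual situation whether the address is malicious, the merchant can select the corresponding confirmation button in the merchant client, so that the merchant client feeds the actual outcome back to the server. The server can then determine from this feedback whether it made a misjudgment and, if so, promptly retrain the GBDT model, making the model more complete and improving the accuracy of subsequent malicious address identification.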
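Further, based on the above method embodiments, another embodiment of the present invention provides a malicious address identification device. As shown in FIG. 6, the device mainly includes a receiving unit 41, a first processing unit 42, a calculation unit 43 and a second processing unit 44, where:
the receiving unit 41 is configured to receive the address to be identified sent by the user client;
the first processing unit 42 is configured to hierarchize the address to be identified to obtain its address levels;
the calculation unit 43 is configured to use the address-level transition probability distribution derived from historical normal addresses to compute, for each address level of the address to be identified, the transition probability to the adjacent next address level, where the distribution includes the transition probability from any address level to another address level;
the second processing unit 44 is configured to multiply together the transition probabilities obtained by the calculation unit 43 to obtain the normal-address probability of the address to be identified.
In the malicious address identification device provided by this embodiment, after the address to be identified is obtained, it is first hierarchized to obtain its address levels; the address-level transition probability distribution is then used to compute the transition probability from each address level to the adjacent next level; and these transition probabilities are multiplied together to obtain the probability that the address is a normal address, so that this probability can be used to judge whether the address is malicious. Compared with prior-art approaches that crudely judge whether an address is malicious from malicious keywords, blacklists and whitelists, or the address hierarchy structure, the present invention collects statistics on and analyzes the correlations between address levels in historical normal addresses, uses the analysis results to determine the transition probability of each address level of the address to be identified, and derives from these transition probabilities the probability that the whole address is a normal address. A normal-address probability can thus be obtained not only for addresses that contain malicious keywords, appear in a blacklist or whitelist, or have a complete address hierarchy, but also for addresses that contain no malicious keywords, appear in no blacklist or whitelist, or have an incomplete address hierarchy. Whether the address to be identified is malicious can then be determined from this normal-address probability, which improves the accuracy of malicious address identification.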
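Further, as shown in FIG. 7, the device also includes:
a judgment unit 45, configured to judge, after the normal-address probability of the address to be identified has been obtained, whether the address is malicious according to a preset identification rule and the normal-address probability.
Further, as shown in FIG. 7, the judgment unit 45 includes:
an extraction module 451, configured to extract, from the order to be identified that corresponds to the address and/or from the historical orders corresponding to that order, preset identification features used to identify whether the address is malicious;
an acquisition module 452, configured to obtain the preset identification model trained on historical orders;
a first judgment module 453, configured to judge whether the address is malicious according to its normal-address probability, the preset identification features, and the preset identification model.
Further, as shown in FIG. 7, the extraction module 451 includes:
a first extraction submodule 4511, configured to extract the address text features from the address to be identified;
a second extraction submodule 4512, configured to extract historical shopping behavior features from the historical orders corresponding to the order to be identified;
a third extraction submodule 4513, configured to extract the order features from the order to be identified.
Further, as shown in FIG. 7, the extraction module 451 also includes:
an acquisition submodule 4514, configured to obtain the cross features of the address to be identified from a combination of at least two of: the address text features, the historical shopping behavior features, the order features, and the normal-address probability of the address.
Further, the address text features extracted by the first extraction submodule 4511 include: whether the address contains a digit string of a preset length, whether it contains preset sensitive words, and whether it contains advertising content;
the order features extracted by the third extraction submodule 4513 include: whether the phone number in the order to be identified is normal, whether the number of times the address has been used exceeds a preset usage threshold, the status of the shop associated with the order, and the status of the goods associated with the order.
Further, the acquisition module 452 is also configured to obtain historical orders, which include historical normal orders and historical malicious orders in a preset ratio;
the acquisition module 452 is also configured to obtain, according to the conditional random field model and the address-level transition probability distribution, the normal-address probability of the historical addresses carried in the historical orders;
the extraction module 451 is also configured to extract the preset identification features from the historical orders;
as shown in FIG. 7, the judgment unit 45 also includes:
a training module 454, configured to train the preset identification model on the normal-address probability of each historical address and the corresponding preset identification features.
Further, as shown in FIG. 7, the judgment unit 45 includes:
a second judgment module 455, configured to determine whether the normal-address probability of the address to be identified is greater than a preset probability threshold;
a determination module 456, configured to determine that the address is a normal address when the second judgment module finds that the normal-address probability is greater than the preset probability threshold, and to determine that the address is a malicious address when the second judgment module finds that the normal-address probability is less than or equal to the preset probability threshold.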
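Further, as shown in FIG. 7, the device also includes:
a first sending unit 46, configured to send the result of judging whether the address to be identified is malicious to the merchant client, so that the merchant client receives and outputs the identification result.
Further, as shown in FIG. 7, the device also includes:
a second sending unit 47, configured to send a warning message to the merchant client when the judgment unit 45 judges the address to be malicious, so that the merchant client receives and outputs the warning message;
the receiving unit 41, configured to receive, from the merchant client, the result of the secondary identification of the address performed in response to the warning message;
a first update unit 48, configured to update the historical normal address library, the historical malicious address library and the preset identification model when the identification result received by the receiving unit 41 is that the address is a normal address.
Further, the receiving unit 41 is configured to receive, from the merchant client, the address to be identified tagged with a malicious flag;
as shown in FIG. 7, the device also includes:
a second update unit 49, configured to update the historical normal address library, the historical malicious address library and the preset identification model.
Further, the address to be identified is an address obtained after the first processing unit 42 performs redundancy processing and formatting on the order to be identified.
Further, as shown in FIG. 7, the first processing unit 42 includes:
a filter module 421, configured to filter out text in the address of the order to be identified that satisfies preset filter conditions;
the filter module 421 is also configured to filter out dirty data in the order to be identified;
a processing module 422, configured to format, according to preset formatting rules, the order filtered by the filter module 421.
The malicious address identification device provided by this embodiment can not only obtain a preliminary probability that the address to be identified is normal from the address-level transition probability distribution, but can also obtain other preset identification features, such as address text features, historical shopping behavior features, order features and cross features, from the historical orders and the order to be identified, and feed the normal-address probability together with these features into the preset identification model for combined analysis to judge whether the address is malicious, which further improves the accuracy of malicious address identification. In addition, when the server finally determines that the address is malicious, it can send a warning message to the merchant client, so that the merchant can contact the buyer to verify whether the address is malicious before deciding whether to ship, thereby avoiding losses. Furthermore, once the merchant has established from the actual situation whether the address is malicious, the merchant can select the corresponding confirmation button in the merchant client, so that the merchant client feeds the actual outcome back to the server. The server can then determine from this feedback whether it made a misjudgment and, if so, promptly retrain the preset identification model, making the model more complete and improving the accuracy of subsequent malicious address identification.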
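Further, to improve the accuracy of malicious order identification, another embodiment of the present invention provides a malicious order identification system that includes a user client, a server and a merchant client, where:
the user client is configured to receive an input order to be identified and send it to the server;
the server is configured to receive the order to be identified sent by the user client and, based on the address-level transition probability distribution derived from historical normal addresses, to compute, for each address level of the address in the order, the transition probability to the adjacent next address level, where the distribution includes the transition probability from any address level to another address level; to multiply the obtained transition probabilities together to obtain the normal-address probability of the address; and to judge from the normal-address probability whether the order is malicious and send the judgment result to the merchant client;
the merchant client is configured to receive and display the judgment result sent by the server.
In the malicious order identification system provided by this embodiment, after the server receives the order to be identified from the user client, it first uses the address-level transition probability distribution to compute the probability that the address in the order is a normal address, and then uses that probability to judge whether the order is malicious. Compared with prior-art approaches that crudely judge whether an address is malicious from malicious keywords, blacklists and whitelists, or the address hierarchy structure, the present invention collects statistics on and analyzes the correlations between address levels in historical normal addresses, uses the analysis results to determine the transition probability of each address level of the address to be identified, and derives from these transition probabilities the probability that the whole address is a normal address. A normal-address probability can thus be obtained not only for addresses that contain malicious keywords, appear in a blacklist or whitelist, or have a complete address hierarchy, but also for addresses that contain no malicious keywords, appear in no blacklist or whitelist, or have an incomplete address hierarchy. Whether the address is malicious is determined from the normal-address probability, which improves the accuracy of malicious address identification and, in turn, of malicious order identification.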
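Further, based on the malicious order identification system of the above embodiment, another embodiment of the present invention provides a malicious order identification method. As shown in FIG. 8, the method mainly includes:
501. Receive the order to be identified sent by the user client.
After a user places an order successfully, the user client can upload the order to the server, and on receiving the order the server can perform malicious address identification on it.
502. Based on the address-level transition probability distribution derived from historical normal addresses, compute, for each address level of the address in the order to be identified, the transition probability to the adjacent next address level.
The address-level transition probability distribution includes the transition probability from any address level to another address level.
Specifically, the server can first hierarchize the address in the order to obtain its address levels (see step 202 above), and then, based on the address-level transition probability distribution, compute the transition probability from each address level to the adjacent next level (see step 203 above).
503. Multiply the obtained transition probabilities together to obtain the normal-address probability of the address.
This step is implemented in the same way as step 204 above and is not described again here.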
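504. Judge from the normal-address probability whether the order to be identified is a malicious order.
Specifically, the server can first judge from the normal-address probability whether the address in the order is malicious. If the address is malicious, the order is determined to be a malicious order; if the address is normal, the order is determined to be a normal order.
Judging from the normal-address probability whether the address in the order is malicious is done in the same way as in the embodiments of the "malicious address identification method" above and is not described again here.
Further, in practice, malicious users trouble merchants not only by entering malicious addresses but often by other means as well, for example by entering a wrong phone number so that the merchant cannot contact them. Therefore, when the address in the order is judged to be normal, it is also necessary to check whether the phone number in the order is normal. If the phone number is abnormal, the order is determined to be a malicious order; if the phone number is normal, the order is determined to be a normal order.
Whether a phone number is abnormal may be judged as follows: build a library of normal phone numbers and match the phone number to be identified against it; if the match fails, the phone number is determined to be abnormal, and if the match succeeds, it is determined to be normal.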
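The two-stage order decision described in step 504 can be sketched as below. Representing the normal phone number library as an in-memory set and using exact string membership as the "match" are assumptions made for illustration; the specification does not say how the library is stored or matched.

```python
def classify_order(address_is_malicious: bool, phone: str, normal_phones: set) -> str:
    """Decide whether an order is malicious.

    A malicious address makes the order malicious outright; otherwise the
    phone number is checked against the normal phone number library.
    """
    if address_is_malicious:
        return "malicious"
    return "normal" if phone in normal_phones else "malicious"
```

The phone check is only reached when the address has passed, which mirrors the order of the checks in the text.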
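In the malicious order identification method provided by this embodiment, after the server receives the order to be identified from the user client, it first uses the address-level transition probability distribution to compute the probability that the address in the order is a normal address, and then uses that probability to judge whether the order is malicious. Compared with prior-art approaches that crudely judge whether an address is malicious from malicious keywords, blacklists and whitelists, or the address hierarchy structure, the present invention collects statistics on and analyzes the correlations between address levels in historical normal addresses, uses the analysis results to determine the transition probability of each address level of the address to be identified, and derives from these transition probabilities the probability that the whole address is a normal address. A normal-address probability can thus be obtained not only for addresses that contain malicious keywords, appear in a blacklist or whitelist, or have a complete address hierarchy, but also for addresses that contain no malicious keywords, appear in no blacklist or whitelist, or have an incomplete address hierarchy. Whether the address is malicious is determined from the normal-address probability, which improves the accuracy of malicious address identification and, in turn, of malicious order identification.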
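Further, based on the method shown in FIG. 8, another embodiment of the present invention provides a malicious order identification device. As shown in FIG. 9, the device mainly includes:
a receiving unit 61, configured to receive the order to be identified sent by the user client;
a calculation unit 62, configured to compute, based on the address-level transition probability distribution derived from historical normal addresses, the transition probability from each address level of the address in the order to the adjacent next address level, where the distribution includes the transition probability from any address level to another address level;
a processing unit 63, configured to multiply the obtained transition probabilities together to obtain the normal-address probability of the address;
a judgment unit 64, configured to judge from the normal-address probability whether the order to be identified is a malicious order.
Further, as shown in FIG. 10, the judgment unit 64 includes:
a judgment module 641, configured to judge from the normal-address probability whether the address in the order is malicious;
a determination module 642, configured to determine that the order is a malicious order when its address is malicious.
Further, the judgment module 641 is also configured to judge whether the phone number in the order is normal when the address in the order is normal;
the determination module 642 is also configured to determine that the order is a malicious order when the phone number is abnormal.
Further, as shown in FIG. 10, the calculation unit 62 includes:
a processing module 621, configured to hierarchize the address in the order to be identified to obtain its address levels;
a calculation module 622, configured to compute, based on the address-level transition probability distribution, the transition probability from each address level to the adjacent next address level.
In the malicious order identification device provided by this embodiment, after the server receives the order to be identified from the user client, it first uses the address-level transition probability distribution to compute the probability that the address in the order is a normal address, and then uses that probability to judge whether the order is malicious. Compared with prior-art approaches that crudely judge whether an address is malicious from malicious keywords, blacklists and whitelists, or the address hierarchy structure, the present invention collects statistics on and analyzes the correlations between address levels in historical normal addresses, uses the analysis results to determine the transition probability of each address level of the address to be identified, and derives from these transition probabilities the probability that the whole address is a normal address. A normal-address probability can thus be obtained not only for addresses that contain malicious keywords, appear in a blacklist or whitelist, or have a complete address hierarchy, but also for addresses that contain no malicious keywords, appear in no blacklist or whitelist, or have an incomplete address hierarchy. Whether the address is malicious is determined from the normal-address probability, which improves the accuracy of malicious address identification and, in turn, of malicious order identification.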
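In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, see the related descriptions of the other embodiments.
It will be understood that related features of the above methods, devices and systems may be cross-referenced. In addition, "first", "second" and the like in the above embodiments are used to distinguish the embodiments and do not indicate that any embodiment is better or worse than another.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
The algorithms and displays provided here are not inherently related to any particular computer, virtual system or other apparatus. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such systems is apparent from the above description. Moreover, the present invention is not directed at any particular programming language. It should be understood that the content of the invention described here may be implemented in a variety of programming languages, and that the above description of a specific language is made to disclose the best mode of the invention.
Numerous specific details are set out in the description provided here. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in a device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of an embodiment may be combined into one module or unit or component, and may furthermore be divided into multiple submodules or subunits or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described here include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of these. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the system, method and device for identifying malicious addresses/malicious orders according to embodiments of the invention. The invention may also be implemented as an apparatus or device program (for example, a computer program and a computer program product) for performing part or all of the methods described here. Such a program implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet site, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as names.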

Claims (25)

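  1. A malicious address identification system, characterized in that the system comprises a user client, a server and a merchant client, wherein:
    the user client is configured to receive an input address to be identified and send the address to be identified to the server;
    the server is configured to receive the address to be identified sent by the user client and hierarchize the address to be identified to obtain its address levels; to use an address-level transition probability distribution derived from historical normal addresses to compute, for each address level of the address to be identified, the transition probability to the adjacent next address level, the address-level transition probability distribution including the transition probability from any address level to another address level; to multiply the obtained transition probabilities together to obtain the normal-address probability of the address to be identified; and to send the result of malicious address identification based on the normal-address probability to the merchant client;
    the merchant client is configured to receive and output the identification result sent by the server.
  2. The system according to claim 1, characterized in that the server is configured to send a warning message to the merchant client when the identification result is that the address to be identified is a malicious address;
    the merchant client is configured to receive and output the warning message sent by the server.
  3. The system according to claim 2, characterized in that the merchant client is configured, after receiving the warning message, to output a selection interface for choosing the result of a secondary identification of the address to be identified, to receive the secondary identification result entered through the selection interface, and to return the secondary identification result to the server.
  4. The system according to claim 2 or 3, characterized in that the merchant client is configured, when the warning message has not been received, to output a selection interface for choosing the result of a secondary identification of the address to be identified, to receive through the selection interface an identification result stating that the address to be identified is a malicious address, and to return the address to be identified, tagged with a malicious flag, to the server.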
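  5. A malicious address identification method, characterized in that the method comprises:
    receiving an address to be identified sent by a user client;
    hierarchizing the address to be identified to obtain its address levels;
    using an address-level transition probability distribution derived from historical normal addresses to compute, for each address level of the address to be identified, the transition probability to the adjacent next address level, the address-level transition probability distribution including the transition probability from any address level to another address level;
    multiplying the obtained transition probabilities together to obtain the normal-address probability of the address to be identified.
  6. The method according to claim 5, characterized in that, after the normal-address probability of the address to be identified is obtained, the method further comprises:
    judging whether the address to be identified is a malicious address according to a preset identification rule and the normal-address probability of the address to be identified.
  7. The method according to claim 6, characterized in that judging whether the address to be identified is a malicious address according to a preset identification rule and the normal-address probability of the address to be identified comprises:
    extracting, from the order to be identified corresponding to the address to be identified and/or from the historical orders corresponding to the order to be identified, preset identification features used to identify whether the address to be identified is a malicious address;
    obtaining a preset identification model trained on historical orders;
    judging whether the address to be identified is a malicious address according to the normal-address probability of the address to be identified, the preset identification features and the preset identification model.
  8. The method according to claim 7, characterized in that extracting, from the order to be identified corresponding to the address to be identified and/or from the historical orders corresponding to the order to be identified, preset identification features used to identify whether the address to be identified is a malicious address comprises:
    extracting address text features from the address to be identified;
    and/or extracting historical shopping behavior features from the historical orders corresponding to the order to be identified;
    and/or extracting order features from the order to be identified.
  9. The method according to claim 8, characterized in that extracting, from the order to be identified corresponding to the address to be identified and/or from the historical orders corresponding to the order to be identified, preset identification features used to identify whether the address to be identified is a malicious address further comprises:
    obtaining the cross features of the address to be identified from a combination of at least two of: the address text features, the historical shopping behavior features, the order features, and the normal-address probability of the address to be identified.
  10. The method according to claim 8, characterized in that the address text features include: whether the address contains a digit string of a preset length, whether it contains preset sensitive words, and whether it contains advertising content;
    the order features include: whether the phone number in the order to be identified is normal, whether the number of times the address to be identified has been used exceeds a preset usage threshold, the status of the shop associated with the order to be identified, and the status of the goods associated with the order to be identified.
  11. The method according to claim 7, characterized in that, before the preset identification model trained on historical orders is obtained, the method further comprises:
    obtaining historical orders, the historical orders including historical normal orders and historical malicious orders in a preset ratio;
    obtaining, according to the address-level transition probability distribution, the normal-address probability of the historical addresses carried in the historical orders;
    extracting preset identification features from the historical orders;
    training the preset identification model on the normal-address probability of each historical address and the corresponding preset identification features.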
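  12. The method according to claim 6, characterized in that judging whether the address to be identified is a malicious address according to a preset identification rule and the normal-address probability of the address to be identified comprises:
    determining whether the normal-address probability of the address to be identified is greater than a preset probability threshold;
    if the normal-address probability of the address to be identified is greater than the preset probability threshold, determining that the address to be identified is a normal address;
    if the normal-address probability of the address to be identified is less than or equal to the preset probability threshold, determining that the address to be identified is a malicious address.
  13. The method according to claim 6, characterized in that the method further comprises:
    sending the result of judging whether the address to be identified is a malicious address to a merchant client, so that the merchant client receives and outputs the identification result.
  14. The method according to claim 6, characterized in that the method further comprises:
    if the address to be identified is judged to be a malicious address, sending a warning message to a merchant client, so that the merchant client receives and outputs the warning message;
    receiving, from the merchant client, the result of a secondary identification of the address to be identified performed in response to the warning message;
    if the identification result is that the address to be identified is a normal address, updating the historical normal address library, the historical malicious address library and the preset identification model.
  15. The method according to claim 6, characterized in that the method further comprises:
    receiving, from a merchant client, the address to be identified tagged with a malicious flag;
    updating the historical normal address library, the historical malicious address library and the preset identification model.
  16. The method according to claim 5, characterized in that the address to be identified is an address obtained after redundancy processing and formatting are performed on an order to be identified.
  17. The method according to claim 16, characterized in that performing redundancy processing and formatting on the order to be identified comprises:
    filtering out text in the address of the order to be identified that satisfies preset filter conditions;
    filtering out dirty data in the order to be identified;
    formatting the filtered order to be identified according to preset formatting rules.
  18. The method according to any one of claims 5 to 17, characterized in that hierarchizing the address to be identified comprises:
    hierarchizing the address to be identified based on a conditional random field model.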
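  19. A malicious address identification device, characterized in that the device comprises:
    a receiving unit, configured to receive an address to be identified sent by a user client;
    a first processing unit, configured to hierarchize the address to be identified to obtain its address levels;
    a calculation unit, configured to use an address-level transition probability distribution derived from historical normal addresses to compute, for each address level of the address to be identified, the transition probability to the adjacent next address level, the address-level transition probability distribution including the transition probability from any address level to another address level;
    a second processing unit, configured to multiply together the transition probabilities obtained by the calculation unit to obtain the normal-address probability of the address to be identified.
  20. A malicious order identification system, characterized in that the system comprises a user client, a server and a merchant client, wherein:
    the user client is configured to receive an input order to be identified and send the order to be identified to the server;
    the server is configured to receive the order to be identified sent by the user client and, based on an address-level transition probability distribution derived from historical normal addresses, to compute, for each address level of the address in the order to be identified, the transition probability to the adjacent next address level, the address-level transition probability distribution including the transition probability from any address level to another address level; to multiply the obtained transition probabilities together to obtain the normal-address probability of the address; and to judge from the normal-address probability whether the order to be identified is a malicious order and send the judgment result to the merchant client;
    the merchant client is configured to receive and display the judgment result sent by the server.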
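  21. A malicious order identification method, characterized in that the method comprises:
    receiving an order to be identified sent by a user client;
    based on an address-level transition probability distribution derived from historical normal addresses, computing, for each address level of the address in the order to be identified, the transition probability to the adjacent next address level, the address-level transition probability distribution including the transition probability from any address level to another address level;
    multiplying the obtained transition probabilities together to obtain the normal-address probability of the address;
    judging from the normal-address probability whether the order to be identified is a malicious order.
  22. The method according to claim 21, characterized in that judging from the normal-address probability whether the order to be identified is a malicious order comprises:
    judging from the normal-address probability whether the address in the order to be identified is a malicious address;
    if the address in the order to be identified is a malicious address, determining that the order to be identified is a malicious order.
  23. The method according to claim 22, characterized in that, if the address in the order to be identified is a normal address, the method further comprises:
    judging whether the phone number in the order to be identified is normal;
    if the phone number is abnormal, determining that the order to be identified is a malicious order.
  24. The method according to any one of claims 21 to 23, characterized in that computing, based on the address-level transition probability distribution derived from historical normal addresses, the transition probability from each address level of the address in the order to be identified to the adjacent next address level comprises:
    hierarchizing the address in the order to be identified to obtain the address levels of the address;
    computing, based on the address-level transition probability distribution, the transition probability from each address level to the adjacent next address level.
  25. A malicious order identification device, characterized in that the device comprises:
    a receiving unit, configured to receive an order to be identified sent by a user client;
    a calculation unit, configured to compute, based on an address-level transition probability distribution derived from historical normal addresses, the transition probability from each address level of the address in the order to be identified to the adjacent next address level, the address-level transition probability distribution including the transition probability from any address level to another address level;
    a processing unit, configured to multiply the obtained transition probabilities together to obtain the normal-address probability of the address;
    a judgment unit, configured to judge from the normal-address probability whether the order to be identified is a malicious order.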
PCT/CN2017/097953 2016-08-31 2017-08-18 恶意地址/恶意订单的识别系统、方法及装置 WO2018040944A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610797563.7A CN107798571B (zh) 2016-08-31 2016-08-31 恶意地址/恶意订单的识别系统、方法及装置
CN201610797563.7 2016-08-31

Publications (1)

Publication Number Publication Date
WO2018040944A1 true WO2018040944A1 (zh) 2018-03-08

Family

ID=61301279

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/097953 WO2018040944A1 (zh) 2016-08-31 2017-08-18 恶意地址/恶意订单的识别系统、方法及装置

Country Status (3)

Country Link
CN (1) CN107798571B (zh)
TW (1) TW201812689A (zh)
WO (1) WO2018040944A1 (zh)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109587248A (zh) * 2018-12-06 2019-04-05 腾讯科技(深圳)有限公司 用户识别方法、装置、服务器及存储介质
CN110852080A (zh) * 2018-08-01 2020-02-28 北京京东尚科信息技术有限公司 订单地址的识别方法、系统、设备和存储介质
CN110874778A (zh) * 2018-08-31 2020-03-10 阿里巴巴集团控股有限公司 异常订单检测方法及装置
CN111132144A (zh) * 2019-12-25 2020-05-08 中国联合网络通信集团有限公司 异常号码识别方法及设备
CN111461815A (zh) * 2020-03-17 2020-07-28 上海携程国际旅行社有限公司 订单识别模型生成方法、识别方法、系统、设备和介质
CN111915256A (zh) * 2020-07-31 2020-11-10 上海寻梦信息技术有限公司 构建派件围栏的方法、异地签收识别方法及相关设备
CN111935646A (zh) * 2020-07-22 2020-11-13 北京明略昭辉科技有限公司 移动设备用户的常用地址估算方法及系统
CN112101993A (zh) * 2020-09-11 2020-12-18 厦门美图之家科技有限公司 离线反作弊方法、装置、电子设备和可读存储介质
CN112446425A (zh) * 2020-11-20 2021-03-05 北京思特奇信息技术股份有限公司 一种用于自动获取疑似养卡渠道的方法和装置
CN112491863A (zh) * 2020-11-23 2021-03-12 中国联合网络通信集团有限公司 Ip地址黑灰名单分析方法、服务器、终端及存储介质
CN112950298A (zh) * 2019-11-26 2021-06-11 北京沃东天骏信息技术有限公司 一种恶意订单识别方法、装置及存储介质
CN113449523A (zh) * 2021-06-29 2021-09-28 京东科技控股股份有限公司 异常地址文本的确定方法、装置、电子设备和存储介质
CN117371893A (zh) * 2023-10-09 2024-01-09 杭州正马软件科技有限公司 一种自动更改电商订单地址的系统和方法

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108683749B (zh) * 2018-05-18 2021-07-06 携程旅游信息技术(上海)有限公司 一种随机邮箱地址的判断方法、设备和介质
CN108876545A (zh) * 2018-06-22 2018-11-23 北京小米移动软件有限公司 订单识别方法、装置和可读存储介质
CN109345332A (zh) * 2018-08-27 2019-02-15 中国民航信息网络股份有限公司 一种航空订票恶意行为的智能检测方法
CN109407504B (zh) * 2018-11-30 2021-05-14 华南理工大学 一种基于智能手表的人身安全检测系统及方法
CN116126538A (zh) * 2019-03-07 2023-05-16 创新先进技术有限公司 业务处理方法、装置、设备及存储介质
CN110335115A (zh) * 2019-07-01 2019-10-15 阿里巴巴集团控股有限公司 一种业务订单处理方法及装置
CN110503517A (zh) * 2019-08-13 2019-11-26 蚌埠聚本电子商务产业园有限公司 一种用于电子商务的恶意信息检测及处置方法
CN110807685B (zh) * 2019-10-22 2021-09-07 上海钧正网络科技有限公司 信息处理方法、装置、终端及可读存储介质
CN111859956B (zh) * 2020-07-09 2021-08-27 睿智合创(北京)科技有限公司 一种用于金融行业的地址分词方法
CN112686732B (zh) * 2021-01-06 2023-07-11 中国联合网络通信集团有限公司 异常地址数据识别方法、装置、设备、介质
CN113240480A (zh) * 2021-01-25 2021-08-10 天津五八到家货运服务有限公司 订单处理方法、装置、电子终端及存储介质
CN113076752A (zh) * 2021-03-26 2021-07-06 中国联合网络通信集团有限公司 识别地址的方法和装置
CN116934418B (zh) * 2023-06-15 2024-03-19 广州淘通科技股份有限公司 一种异常订单的检测预警方法、系统、设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008038017A1 (en) * 2006-09-29 2008-04-03 British Telecommunications Public Limited Company Information processing system and related method
CN103095711A (zh) * 2013-01-18 2013-05-08 重庆邮电大学 一种针对网站的应用层DDoS攻击检测方法和防御系统
CN104462059A (zh) * 2014-12-01 2015-03-25 银联智惠信息服务(上海)有限公司 商户地址信息识别方法和装置
CN105389722A (zh) * 2015-11-20 2016-03-09 小米科技有限责任公司 恶意订单识别方法及装置
CN105468742A (zh) * 2015-11-25 2016-04-06 小米科技有限责任公司 恶意订单识别方法及装置

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110852080A (zh) * 2018-08-01 2020-02-28 北京京东尚科信息技术有限公司 订单地址的识别方法、系统、设备和存储介质
CN110874778A (zh) * 2018-08-31 2020-03-10 阿里巴巴集团控股有限公司 异常订单检测方法及装置
CN110874778B (zh) * 2018-08-31 2023-04-25 阿里巴巴集团控股有限公司 异常订单检测方法及装置
CN109587248A (zh) * 2018-12-06 2019-04-05 腾讯科技(深圳)有限公司 用户识别方法、装置、服务器及存储介质
CN109587248B (zh) * 2018-12-06 2023-08-29 腾讯科技(深圳)有限公司 用户识别方法、装置、服务器及存储介质
CN112950298A (zh) * 2019-11-26 2021-06-11 北京沃东天骏信息技术有限公司 一种恶意订单识别方法、装置及存储介质
CN111132144A (zh) * 2019-12-25 2020-05-08 中国联合网络通信集团有限公司 异常号码识别方法及设备
CN111132144B (zh) * 2019-12-25 2022-09-13 中国联合网络通信集团有限公司 异常号码识别方法及设备
CN111461815B (zh) * 2020-03-17 2023-04-28 上海携程国际旅行社有限公司 订单识别模型生成方法、识别方法、系统、设备和介质
CN111461815A (zh) * 2020-03-17 2020-07-28 上海携程国际旅行社有限公司 订单识别模型生成方法、识别方法、系统、设备和介质
CN111935646A (zh) * 2020-07-22 2020-11-13 北京明略昭辉科技有限公司 移动设备用户的常用地址估算方法及系统
CN111915256A (zh) * 2020-07-31 2020-11-10 上海寻梦信息技术有限公司 构建派件围栏的方法、异地签收识别方法及相关设备
CN111915256B (zh) * 2020-07-31 2023-09-26 上海寻梦信息技术有限公司 构建派件围栏的方法、异地签收识别方法及相关设备
CN112101993A (zh) * 2020-09-11 2020-12-18 厦门美图之家科技有限公司 离线反作弊方法、装置、电子设备和可读存储介质
CN112101993B (zh) * 2020-09-11 2022-12-23 厦门美图之家科技有限公司 离线反作弊方法、装置、电子设备和可读存储介质
CN112446425A (zh) * 2020-11-20 2021-03-05 北京思特奇信息技术股份有限公司 一种用于自动获取疑似养卡渠道的方法和装置
CN112491863A (zh) * 2020-11-23 2021-03-12 中国联合网络通信集团有限公司 Ip地址黑灰名单分析方法、服务器、终端及存储介质
CN112491863B (zh) * 2020-11-23 2022-07-29 中国联合网络通信集团有限公司 Ip地址黑灰名单分析方法、服务器、终端及存储介质
CN113449523A (zh) * 2021-06-29 2021-09-28 京东科技控股股份有限公司 异常地址文本的确定方法、装置、电子设备和存储介质
CN113449523B (zh) * 2021-06-29 2024-05-24 京东科技控股股份有限公司 异常地址文本的确定方法、装置、电子设备和存储介质
CN117371893A (zh) * 2023-10-09 2024-01-09 杭州正马软件科技有限公司 一种自动更改电商订单地址的系统和方法

Also Published As

Publication number Publication date
TW201812689A (zh) 2018-04-01
CN107798571A (zh) 2018-03-13
CN107798571B (zh) 2019-08-30

Similar Documents

Publication Publication Date Title
WO2018040944A1 (zh) 恶意地址/恶意订单的识别系统、方法及装置
CN112084383A (zh) 基于知识图谱的信息推荐方法、装置、设备及存储介质
WO2018188576A1 (zh) 资源推送方法及装置
US11074634B2 (en) Probabilistic item matching and searching
CN109711955B (zh) 基于当前订单的差评预警方法、系统、黑名单库建立方法
WO2017088496A1 (zh) 一种搜索推荐方法、装置、设备及计算机存储介质
CN112100513A (zh) 基于知识图谱的推荐方法、装置、设备及计算机可读介质
CN111680165B (zh) 信息匹配方法、装置、可读存储介质和电子设备
CN107767152B (zh) 产品购买倾向分析方法及服务器
CN111429214B (zh) 一种基于交易数据的买卖双方匹配方法及装置
CN113761219A (zh) 基于知识图谱的检索方法、装置、电子设备及存储介质
CN110781428A (zh) 评论展示方法、装置、计算机设备及存储介质
CN116796027A (zh) 商品图片标签生成方法及其装置、设备、介质、产品
JP2024041849A (ja) 確率的アイテムマッチングおよび検索
CN113887214B (zh) 基于人工智能的意愿推测方法、及其相关设备
US9626356B2 (en) System support for evaluation consistency
CN112182126A (zh) 用于确定匹配度的模型训练方法、装置、电子设备及可读存储介质
CN110992076A (zh) 商家质量评价方法、装置、电子设备及可读存储介质
CN116010707A (zh) 商品价格异常识别方法、装置、设备和存储介质
CN113706207A (zh) 基于语义解析的订单成交率分析方法、装置、设备及介质
WO2024183225A1 (zh) 一种商品匹配方法、装置、计算机设备及介质
US12073947B1 (en) Meta-learning for automated health scoring
US11989660B1 (en) Transaction entity prediction with a global list
CN113486145B (zh) 基于网络节点的用户咨询回复方法、装置、设备及介质
US11238490B2 (en) Determining performance metrics for delivery of electronic media content items by online publishers

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17845246

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17845246

Country of ref document: EP

Kind code of ref document: A1