CN115544247B - Information processing method, apparatus, computer device, and storage medium - Google Patents

Information processing method, apparatus, computer device, and storage medium

Publication number
CN115544247B
Authority
CN
China
Prior art keywords
abnormal
identifier
code
identifiers
information corresponding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210987082.8A
Other languages
Chinese (zh)
Other versions
CN115544247A (en)
Inventor
许良锋
王丰
高黎明
陈嵩
王红亮
荆华
杨韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Post Bureau Postal Industry Security Center
Original Assignee
State Post Bureau Postal Industry Security Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Post Bureau Postal Industry Security Center filed Critical State Post Bureau Postal Industry Security Center
Priority to CN202210987082.8A priority Critical patent/CN115544247B/en
Publication of CN115544247A publication Critical patent/CN115544247A/en
Application granted granted Critical
Publication of CN115544247B publication Critical patent/CN115544247B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Creation or modification of classes or clusters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/381 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using identifiers, e.g. barcodes, RFIDs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Library & Information Science (AREA)
  • Marketing (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to an information processing method, an information processing apparatus, a computer device, and a storage medium. The method comprises: obtaining a first data list, that is, obtaining logistics information corresponding to a plurality of code identifiers within a preset time period, wherein the code identifiers indicate telephone numbers corresponding to user identifiers; clustering the code identifiers according to their logistics information, so that the code identifiers are grouped into clusters and abnormal code identifiers that belong to no cluster; and marking the abnormal code identifiers with preset abnormal labels, that is, classifying the abnormal code identifiers and determining the abnormality type of each, to obtain abnormal label data. The abnormal label data indicates the code identifiers with abnormal sending or receiving activity, so telephone numbers with abnormal sending or receiving activity need not be screened manually, which reduces the cost of screening logistics information and speeds up screening.

Description

Information processing method, apparatus, computer device, and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an information processing method, an information processing apparatus, a computer device, and a storage medium.
Background
As China has entered the "Internet+" era, the express delivery industry has grown explosively. Advances in science and technology keep making the industry more intelligent, and the industry's growth in turn drives higher production efficiency. Large enterprises have begun to deploy automated information equipment throughout production, including collection terminals, automatic sorting equipment, self-service delivery terminals, electronic scales, and security screening machines. As the industry's technology has improved, new ways of committing crimes through delivery channels have emerged, and a series of public security problems has appeared in the express delivery industry, seriously endangering public security and the safety of citizens' lives and property. Criminal suspects often exploit the convenience of the Internet to construct fictitious identities for mailing. Typical abnormal mailing behavior includes frequently sending or receiving under changing names or addresses, and using the same telephone number for sender and recipient on a single waybill. Meanwhile, postal regulators and case investigators, who rely on past experience and on manual inspection and inquiry, cannot quickly and efficiently screen out problem mail, and so cannot effectively combat crimes committed through delivery channels.
Crimes that exploit delivery channels are characterized by the low cost of selling counterfeit goods, high efficiency, concealment, a wide range of people and places involved, and frequent cross-regional activity. They are currently screened mainly by hand, in large-scale, multi-scenario, multi-batch, periodic, and ad hoc campaigns, which makes combating them costly and inefficient.
Disclosure of Invention
In order to solve the technical problems, the application provides an information processing method, an information processing device, a computer device and a storage medium.
In a first aspect, the present application provides an information processing method, including:
obtaining a first data list, wherein the first data list comprises logistics information corresponding to a plurality of code identifiers within a preset time period, and the code identifiers indicate telephone numbers that do not carry an e-commerce label;
clustering the code identifiers according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and abnormal code identifiers, wherein an abnormal code identifier is a code identifier that does not belong to any cluster;
and marking each abnormal code identifier according to preset abnormal labels to obtain corresponding abnormal label data, wherein the abnormal label data indicates the code identifiers with abnormal sending or receiving activity.
In a second aspect, the present application provides an information processing apparatus including:
an acquisition module, configured to acquire a first data list, wherein the first data list comprises logistics information corresponding to a plurality of code identifiers within a preset time period, and the code identifiers indicate telephone numbers that do not carry an e-commerce label;
a clustering module, configured to cluster the code identifiers according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and abnormal code identifiers, wherein an abnormal code identifier is a code identifier that does not belong to any cluster;
a marking module, configured to mark each abnormal code identifier according to preset abnormal labels to obtain corresponding abnormal label data, wherein the abnormal label data indicates the code identifiers with abnormal sending or receiving activity.
In a third aspect, the present application provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
obtaining a first data list, wherein the first data list comprises logistics information corresponding to a plurality of code identifiers within a preset time period, and the code identifiers indicate telephone numbers that do not carry an e-commerce label;
clustering the code identifiers according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and abnormal code identifiers, wherein an abnormal code identifier is a code identifier that does not belong to any cluster;
and marking each abnormal code identifier according to preset abnormal labels to obtain corresponding abnormal label data, wherein the abnormal label data indicates the code identifiers with abnormal sending or receiving activity.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
obtaining a first data list, wherein the first data list comprises logistics information corresponding to a plurality of code identifiers within a preset time period, and the code identifiers indicate telephone numbers that do not carry an e-commerce label;
clustering the code identifiers according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and abnormal code identifiers, wherein an abnormal code identifier is a code identifier that does not belong to any cluster;
and marking each abnormal code identifier according to preset abnormal labels to obtain corresponding abnormal label data, wherein the abnormal label data indicates the code identifiers with abnormal sending or receiving activity.
With this information processing method, a first data list is obtained, that is, logistics information corresponding to a plurality of code identifiers within a preset time period, where the code identifiers indicate telephone numbers corresponding to user identifiers. The code identifiers are clustered according to their logistics information, so that they are grouped into clusters and abnormal code identifiers that belong to no cluster. The abnormal code identifiers are then marked with preset abnormal labels, that is, they are classified and the abnormality type of each is determined, to obtain abnormal label data. The abnormal label data indicates the code identifiers with abnormal sending or receiving activity, so telephone numbers with abnormal sending or receiving activity need not be screened manually, which reduces the cost of screening logistics information and speeds up screening.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of an information processing method in one embodiment;
FIG. 2 is a schematic diagram of a directly density-reachable relation in one embodiment;
FIG. 3 is a schematic diagram of a density-reachable relation in one embodiment;
FIG. 4 is a schematic diagram of a density-connected relation in one embodiment;
FIG. 5 is a schematic diagram of a logistics knowledge graph in one embodiment;
FIG. 6 is a block diagram showing the structure of an information processing apparatus in one embodiment;
FIG. 7 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present application based on the embodiments herein.
In one embodiment, FIG. 1 is a flow chart of an information processing method. Referring to FIG. 1, an information processing method is provided. This embodiment is described mainly with the method applied to a server, and the information processing method specifically comprises the following steps:
Step S110, a first data list is acquired.
The first data list comprises logistics information corresponding to a plurality of code identifiers within a preset time period, and the code identifiers indicate telephone numbers that do not carry an e-commerce label.
Specifically, the preset time period can be customized for the actual application scenario, for example one week, two weeks, one month, or six months. Because most crimes that exploit delivery channels are committed by individual users, the code identifiers in the first data list indicate telephone numbers that do not carry an e-commerce label; that is, the first data list contains no logistics information associated with e-commerce labels. A code identifier indicates either an 11-digit mobile number or a 7- to 8-digit landline number. To protect the privacy of the telephone number, hash encryption is applied to some of its digits; for example, a code identifier is displayed as "131****2345". The logistics information includes the code identifier, the user identifier corresponding to the code identifier, address information, the send/receive type, the generation time of the waybill, and so on. The user identifier can be any unique identifier that represents the user's identity, such as a user name or an identity card number; the address information includes a sending address and/or a receiving address; and the send/receive type is either sending or receiving. An example first data list is shown in Table 1:
Name      | Telephone number | Address                                              | Send/receive type
Zhang San | 131****2345      | Building 1, XX Community, Chaoyang District, Beijing | Sending
Zhang Si  | 131****2345      | XX Company, Haidian District, Beijing                | Receiving
Li Si     | 131****2345      | XX Community, Baoding, Hebei Province                | Receiving
Table 1: First data list
The first data list not only supports per-individual statistics on names and addresses as a basis for anomaly analysis, but can also serve as a basis for data restoration if delivery waybill data goes missing in the future.
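A minimal sketch of how a first data list like Table 1 might be assembled. The record fields, the `ecommerce` flag, and the masking rule (keep the first three and last four digits of an 11-digit mobile number) are assumptions for illustration; the source only states that some digits are hidden and that numbers carrying an e-commerce label are excluded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaybillRecord:
    """One row of the first data list (field names are hypothetical)."""
    name: str
    code_id: str   # masked telephone number
    address: str
    kind: str      # "send" or "receive"


def mask_phone(phone: str) -> str:
    """Mask the middle four digits of an 11-digit mobile number.

    Assumption: only 11-digit mobile numbers are handled in this sketch;
    the masking rule for 7- to 8-digit landline numbers is not specified.
    """
    if not phone.isdigit() or len(phone) != 11:
        raise ValueError("expected an 11-digit mobile number")
    return phone[:3] + "****" + phone[-4:]


def build_first_data_list(rows):
    """Drop rows flagged as e-commerce and mask the remaining numbers."""
    return [
        WaybillRecord(
            name=row["name"],
            code_id=mask_phone(row["phone"]),
            address=row["address"],
            kind=row["kind"],
        )
        for row in rows
        if not row.get("ecommerce", False)  # hypothetical e-commerce flag
    ]
```

Note that display masking alone can make different numbers look identical, so a real system would presumably key records on a full hash of the number rather than on the masked string.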
Step S120, the code identifiers are clustered according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and abnormal code identifiers.
An abnormal code identifier is a code identifier that does not belong to any cluster.
Specifically, the code identifiers are clustered based on the first data list to determine the use state of each code identifier, which is either normal or abnormal. Code identifiers whose use state is normal can be grouped into clusters, whereas code identifiers whose use state is abnormal cannot be grouped into any cluster formed by the clustering. The clustering therefore screens out the code identifiers whose use state is abnormal, that is, the abnormal code identifiers.
Step S130, each abnormal code identifier is marked according to preset abnormal labels to obtain corresponding abnormal label data.
The abnormal label data indicates the code identifiers with abnormal sending or receiving activity.
Specifically, the preset abnormal labels indicate the abnormality types of different abnormal use states, and include "same number for sender and recipient", "one number, multiple names", "one number, multiple addresses", "frequent mailing", and so on. "Same number for sender and recipient" indicates that the recipient telephone number and the sender telephone number on a waybill are the same. "One number, multiple names" indicates that the waybills associated with one telephone number carry multiple different user identifiers, and that the number of different user identifiers exceeds a first threshold. Normally one telephone number corresponds to one user identifier, whether that user is the sender or the recipient. When the same telephone number corresponds to different user identifiers on different waybills, the waybills for that number exhibit the "one person, multiple names" anomaly. For example, 131****2345 corresponds to one user identifier as sender on waybill 1, to a different user identifier as sender on waybill 2, and to yet another user identifier as recipient on waybill 3.
"One number, multiple addresses" indicates that the waybills associated with one telephone number carry multiple different addresses, and that the number of different addresses exceeds a second threshold. Normally the telephone number of an individual user corresponds to only a few addresses; when the same telephone number corresponds to many different addresses on different waybills, the waybills for that number exhibit the "one number, multiple addresses" anomaly.
"Frequent mailing" indicates that the same telephone number frequently sends items to one or more addresses within the preset time period.
Marking the abnormal code identifiers with the preset abnormal labels classifies them and determines the abnormality type of each, which yields the abnormal label data. The abnormal label data indicates the code identifiers with abnormal sending or receiving activity, so telephone numbers with abnormal sending or receiving activity need not be screened manually, which reduces the cost of screening logistics information and speeds up screening.
In one embodiment, the clustering processing is performed on each code identifier according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and an abnormal code identifier, including:
clustering the code identifiers according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and abnormal identifiers, wherein an abnormal identifier is a code identifier that does not belong to any cluster;
acquiring a second data list, wherein the second data list comprises identity information corresponding to a plurality of preset identifiers, and the preset identifiers indicate telephone numbers corresponding to preset user identifiers;
and determining the abnormal identifiers that match a preset identifier as the abnormal code identifiers.
Specifically, the code identifiers are clustered according to their logistics information to form clusters and abnormal identifiers that belong to no cluster. However, relying on the clustering result alone may wrongly flag the code identifiers of a few unusual but legitimate users as abnormal identifiers, so a further check against a second data list is needed. The second data list comprises a plurality of preset identifiers, that is, the code identifiers corresponding to preset user identifiers. Preset user identifiers are user identifiers that require attention, such as drug-related user identifiers and telecom-fraud user identifiers. The identity information corresponding to a preset user identifier includes the case type, the code identifier corresponding to the preset user identifier, the user type, the information source, and so on. Case types include drugs, telecom fraud, marketing, and the like; user types include drug users, drug manufacturers, telecom-fraud card users, and the like; and information sources include public tip-offs, investigators, and the like.
The abnormal identifiers screened out by clustering are matched against each preset identifier in the second data list, and the abnormal identifiers that match are determined to be abnormal code identifiers. In other words, the preset identifiers that require attention are combined with the abnormal identifiers found by clustering, so that persons of interest are cross-checked against abnormal mailing behavior (same number for sender and recipient, one person with multiple addresses, one person with multiple names). This captures likely data anomalies more precisely and improves the accuracy of identifying abnormal data.
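The matching step can be illustrated with a minimal sketch. It assumes the second data list is available as a dictionary keyed by code identifier and that a match means exact equality of identifiers; the function name and the identity-information fields are hypothetical.

```python
def confirm_abnormal_code_ids(cluster_outliers, second_data_list):
    """Keep only the clustering outliers that also appear in the second data list.

    cluster_outliers: iterable of code identifiers that belong to no cluster.
    second_data_list: dict mapping a preset identifier to its identity
        information (case type, user type, information source, ...).
    Returns a dict of confirmed abnormal code identifiers with that
    identity information attached.
    """
    return {
        code_id: second_data_list[code_id]
        for code_id in cluster_outliers
        if code_id in second_data_list
    }


# Example with made-up identifiers and identity information.
outliers = ["131****2345", "150****0001"]
watch_list = {
    "131****2345": {"case_type": "telecom fraud", "source": "public tip-off"},
    "188****7777": {"case_type": "drugs", "source": "investigator"},
}
confirmed = confirm_abnormal_code_ids(outliers, watch_list)
```

Here only "131****2345" is confirmed, because it is both a clustering outlier and a preset identifier; "150****0001" is dropped as a possible false positive.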
In one embodiment, the clustering processing is performed on each code identifier according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and an abnormal identifier, including:
determining the number of user identifications and the number of address information corresponding to each code identification according to the logistics information corresponding to the plurality of code identifications;
and clustering the coded identifiers according to the number of the user identifiers and the number of the address information corresponding to the coded identifiers to obtain at least one cluster and abnormal identifiers.
Specifically, the number of user identifiers and the number of addresses corresponding to each code identifier are counted. That is, across the waybills associated with each telephone number, the number of distinct user identifiers and the number of distinct addresses are counted, which gives the number of personal names and the number of addresses for each telephone number. Density clustering is then performed on these two counts to form clusters, and the noise points that belong to no cluster are marked as abnormal identifiers.
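The counting step can be sketched as follows. This is a minimal illustration that assumes each waybill is reduced to a (code identifier, name, address) tuple and that "number" means the count of distinct values; the function name is hypothetical.

```python
from collections import defaultdict


def count_names_and_addresses(waybills):
    """Return {code_id: (distinct name count, distinct address count)}."""
    names = defaultdict(set)
    addresses = defaultdict(set)
    for code_id, name, address in waybills:
        names[code_id].add(name)
        addresses[code_id].add(address)
    return {cid: (len(names[cid]), len(addresses[cid])) for cid in names}


# The three rows of Table 1, reduced to (code_id, name, address).
table_1 = [
    ("131****2345", "Zhang San", "Building 1, XX Community, Chaoyang District, Beijing"),
    ("131****2345", "Zhang Si", "XX Company, Haidian District, Beijing"),
    ("131****2345", "Li Si", "XX Community, Baoding, Hebei Province"),
]
features = count_names_and_addresses(table_1)
```

For the Table 1 rows, the single telephone number maps to three distinct names and three distinct addresses, so its feature pair is (3, 3).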
In one embodiment, the clustering processing is performed on each code identifier according to the number of user identifiers and the number of address information corresponding to each code identifier to obtain at least one cluster and an abnormal identifier, including:
determining the density connection relation between the coding identifications according to the number of the user identifications and the number of the address information corresponding to the coding identifications;
and obtaining at least one cluster and an abnormal identifier according to the density connection relation among the code identifiers, wherein the abnormal identifier is used for indicating the code identifier without the density connection relation.
Specifically, the number of user identifiers and the number of addresses corresponding to each code identifier are used to determine the density connection relations between code identifiers. That is, a data object is formed for each code identifier from its number of user identifiers and its number of addresses, and the density connection relations between the data objects are determined. The relations are: directly density-reachable, density-reachable, and density-connected. Each data object is treated as a sample object, and clusters are searched for within a preset neighborhood radius, denoted ε. Taking a target sample object p (any sample object) as the center, the other sample objects within its ε-neighborhood are found. If the number of sample objects in the neighborhood reaches a preset number, denoted MinPts, the target sample object is taken as a core object and forms a cluster. Every sample object within the neighborhood of a core object is directly density-reachable from that core object. FIG. 2 shows a sample object q that is directly density-reachable from the core object p.
In this way, several sample objects may be core objects. When successive core objects are each directly density-reachable from the previous one, they form an object chain; any two core objects on the chain that are not directly density-reachable from each other are density-reachable, and so are the sample objects at the two ends of the chain. That is, if there is a chain of core objects in which p_1 is directly density-reachable from p, p_2 from p_1, ..., p_n from p_{n-1}, and q from p_n, then q is density-reachable from p. The starting object and the intermediate objects of the chain must be core objects, but the end object may be any sample object. As shown in FIG. 3, q is directly density-reachable from p and t is directly density-reachable from q, so t is density-reachable from p.
When there is an intermediate core object from which two sample objects are both density-reachable, the two sample objects are density-connected. That is, if there is a core object O such that p is density-reachable from O and q is density-reachable from O, then p and q are density-connected. As shown in FIG. 4, the path from O to p passes through core objects p_1 and p_2, and the path from O to q passes through core objects q_1 and q_2. The objects along each path from the intermediate core object must be core objects, while the sample objects reached at the ends may be either core objects or ordinary sample objects.
In this way, it is determined whether each sample object is a core object and which density connection relations hold among the sample objects. All data objects that have a density connection relation are grouped into clusters, and the code identifier of any data object with no density connection relation is taken as an abnormal code identifier.
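The vocabulary used above (neighborhood radius ε, MinPts, core objects, density reachability, noise) is that of the DBSCAN family of density clustering. The following is a minimal pure-Python sketch in that style, not the patent's own implementation. It assumes Euclidean distance on (name count, address count) pairs and counts a point as part of its own neighborhood; the values of `eps` and `min_pts` in the example are arbitrary.

```python
import math

NOISE = -1  # label for objects that belong to no cluster


def neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def density_cluster(points, eps, min_pts):
    """Return one label per point: a cluster index, or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = neighborhood(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster            # i is a core object: start a cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core object
            if labels[j] is not None:
                continue               # already assigned, do not expand
            labels[j] = cluster
            j_neighbors = neighborhood(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
        cluster += 1
    return labels


# (name count, address count) per code identifier; the last one is an outlier.
code_ids = ["A", "B", "C", "D", "E", "F"]
points = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 1), (9, 8)]
labels = density_cluster(points, eps=1.5, min_pts=3)
abnormal = [cid for cid, label in zip(code_ids, labels) if label == NOISE]
```

In this example the five numbers with one or two names and addresses form a single cluster, and the number with nine names and eight addresses is left as noise, so it would be reported as an abnormal identifier.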
In one embodiment, the marking of each abnormal code identifier according to a preset abnormal label to obtain corresponding abnormal label data, where the abnormal label data is used to indicate a code identifier with abnormal receiving/sending behavior, includes:
determining the corresponding preset abnormal label according to the logistics information corresponding to each abnormal code identifier, wherein the logistics information comprises the waybill information, the number of user identifiers, and the number of address information items corresponding to the abnormal code identifier in a preset time period;
and marking the abnormal code identifier according to the preset abnormal label to obtain the corresponding abnormal label data.
Specifically, the logistics information includes the code identifier and, for that code identifier, the user identifiers, address information, receiving/sending type, commodity identifiers, logistics waybills, the number of waybills, the generation time of each waybill, and the like. The number of user identifiers and the number of address information items corresponding to an abnormal code identifier can therefore be obtained by analysis, and the preset abnormal label corresponding to each abnormal code identifier can be determined from the logistics information. For example, when a waybill corresponding to the abnormal code identifier has a recipient telephone number that is the same as the sender telephone number, the preset abnormal label corresponding to the abnormal code identifier is determined to include "same sender and recipient number".
When the number of user identifiers corresponding to the abnormal code identifier exceeds a first threshold, the preset abnormal label corresponding to the abnormal code identifier is determined to include "one number, multiple names". When the number of address information items corresponding to the abnormal code identifier exceeds a second threshold, the preset abnormal label is determined to include "one number, multiple addresses". When the number of logistics waybills corresponding to the abnormal code identifier within a preset time period exceeds a third threshold, the preset abnormal label is determined to include "frequent mailing". Each abnormal code identifier is then marked according to its preset abnormal labels to obtain the corresponding abnormal label data. The abnormal label data therefore carries at least one preset abnormal label and may carry several preset abnormal labels of different types.
Similarly, an abnormal identifier is marked according to the preset abnormal label to obtain corresponding abnormal label data.
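The labeling rules above can be illustrated with a short Python sketch. The `Waybill` record, its field names, the threshold values, and the English label strings are assumptions made for this example; this document does not fix any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waybill:
    """Hypothetical waybill record for one flagged telephone number."""
    sender_phone: str
    recipient_phone: str
    name: str       # name used with the flagged number on this waybill
    address: str    # address used with the flagged number on this waybill


def preset_labels(waybills, name_max, address_max, waybill_max):
    """Apply the four rules to the waybills of one abnormal code identifier."""
    labels = []
    if any(w.sender_phone == w.recipient_phone for w in waybills):
        labels.append("same sender and recipient number")
    if len({w.name for w in waybills}) > name_max:          # first threshold
        labels.append("one number, multiple names")
    if len({w.address for w in waybills}) > address_max:    # second threshold
        labels.append("one number, multiple addresses")
    if len(waybills) > waybill_max:                         # third threshold
        labels.append("frequent mailing")
    return labels


# Hypothetical waybills within the preset time period.
waybills = [
    Waybill("P", "Q", "Zhang", "Addr 1"),
    Waybill("P", "R", "Li", "Addr 1"),
    Waybill("P", "P", "Wang", "Addr 1"),
]
labels = preset_labels(waybills, name_max=2, address_max=2, waybill_max=5)
```

Because the rules are independent, one identifier can receive several labels at once, which matches the statement that abnormal label data may carry multiple preset abnormal labels.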
In one embodiment, after the marking processing is performed on each anomaly coded identifier according to a preset anomaly tag to obtain corresponding anomaly tag data, the method further includes:
and constructing a logistics knowledge graph according to the abnormal tag data and the logistics information corresponding to the coding identifier related to the abnormal tag data.
Specifically, the code identifier corresponding to each piece of abnormal label data is taken as a node, and the receiving/sending relationship is taken as an edge. Each edge is directed, and its direction indicates the sending relationship. The two ends of an edge are two nodes: one is the code identifier corresponding to the abnormal label data, and the other is a code identifier that has a receiving/sending relationship with it. The logistics knowledge graph is constructed from these nodes and edges using Neo4j. The logistics information or abnormal label data corresponding to each node is used as the attribute information of that node. For a node corresponding to abnormal label data, the attribute information includes the preset abnormal labels, user identifiers, address information, commodity identifiers, logistics waybills, and the like; for a node that does not correspond to abnormal label data, the attribute information includes user identifiers, address information, commodity identifiers, logistics waybills, and the like. Different types of nodes are displayed in different ways.
As shown in fig. 5, "key person" indicates an abnormal code identifier, "related person" indicates a normal code identifier that has a mailing relationship with an abnormal code identifier, and "abnormal label" indicates the preset abnormal labels of an abnormal code identifier. The node corresponding to an abnormal code identifier is drawn as a red circle, and the node corresponding to a normal code identifier that has a mailing relationship with it is drawn as a blue circle. The number of logistics waybills can be displayed on the edge between two nodes, for example the number of items sent and the number of items received within the preset time period. Fig. 5 shows only one example of a logistics knowledge graph.
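The node-and-edge structure described above can be sketched with plain Python dictionaries. This in-memory structure is only a stand-in for the Neo4j graph named in the text; the function name, the attribute keys, and the sample telephone placeholders are hypothetical.

```python
from collections import Counter


def build_graph(waybills, abnormal_labels):
    """Build nodes and directed edges from (sender, recipient) waybill pairs.

    abnormal_labels maps each abnormal code identifier to its preset labels.
    Only waybills that involve an abnormal code identifier are kept.
    """
    nodes, edges = {}, Counter()
    for sender, recipient in waybills:
        if sender not in abnormal_labels and recipient not in abnormal_labels:
            continue
        for phone in (sender, recipient):
            labels = abnormal_labels.get(phone, [])
            nodes[phone] = {
                "role": "key person" if labels else "related person",
                "labels": labels,
            }
        edges[(sender, recipient)] += 1   # edge direction: sender -> recipient
    return nodes, dict(edges)


# Hypothetical waybills; "K" is the only abnormal code identifier.
waybills = [("K", "A"), ("K", "A"), ("B", "K"), ("A", "C")]
nodes, edges = build_graph(waybills, {"K": ["frequent mailing"]})
```

The edge weight here plays the role of the waybill count shown on each edge in fig. 5.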
In one embodiment, the building a logistics knowledge graph according to the anomaly tag data and the logistics information corresponding to the coding identifier related to the anomaly tag data includes:
determining the coding identifier directly related to the abnormal tag data according to the logistics information corresponding to the abnormal tag data to obtain a first related identifier;
determining the coding identifier directly related to the first related identifier according to the logistics information corresponding to the first related identifier to obtain a second related identifier indirectly related to the abnormal tag data;
and constructing a logistics knowledge graph according to the logistics information corresponding to the abnormal label data, the logistics information corresponding to the first related identifier and the logistics information corresponding to the second related identifier.
Specifically, a first related identifier that is directly related to the abnormal code identifier and a second related identifier that is indirectly related to it are determined according to the logistics information corresponding to the abnormal label data. Directly related means that a direct receiving/sending relationship exists between the abnormal code identifier and the first related identifier. Indirectly related means that no direct receiving/sending relationship exists between the abnormal code identifier and the second related identifier, but a direct receiving/sending relationship exists between the first related identifier and the second related identifier. That is, the first related identifier is a code identifier that has a direct receiving/sending relationship with the abnormal code identifier, and the second related identifier is a code identifier that has a receiving/sending relationship with the first related identifier. Each code identifier is then taken as a node, and the receiving/sending relationships among the code identifiers are taken as edges, to construct the logistics knowledge graph.
Referring to fig. 5, the first related identifiers of the abnormal code identifier corresponding to the key person include the code identifiers beginning with 132, 133, and 151, and the code identifier beginning with 138 to which 10 items were sent. The second related identifier of the abnormal code identifier is the code identifier beginning with 138, with 20 items sent, that has a mailing relationship with the code identifier beginning with 151.
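Finding the first and second related identifiers amounts to collecting one-hop and two-hop neighbors. The sketch below assumes that a relationship counts in either direction (sending or receiving); the function name and the sample data are hypothetical.

```python
def related_identifiers(waybills, abnormal):
    """Return (first, second) related identifiers of one abnormal identifier.

    waybills is an iterable of (sender, recipient) pairs. A relationship is
    treated as undirected here, which is an assumption of this sketch.
    """
    adjacency = {}
    for sender, recipient in waybills:
        adjacency.setdefault(sender, set()).add(recipient)
        adjacency.setdefault(recipient, set()).add(sender)

    # Directly related: one hop from the abnormal identifier.
    first = adjacency.get(abnormal, set()) - {abnormal}

    # Indirectly related: one hop from a first related identifier,
    # excluding the abnormal identifier and the first related identifiers.
    second = set()
    for phone in first:
        second |= adjacency[phone]
    second -= first | {abnormal}
    return first, second


# Hypothetical waybills; "K" is the abnormal code identifier.
waybills = [("K", "A"), ("K", "A"), ("B", "K"), ("A", "C")]
first, second = related_identifiers(waybills, "K")
```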
The logistics knowledge graph displays the mailing relationships among telephone numbers intuitively, so that the upstream and downstream relationships within a group's organizational network, or within a certain type of case, can be judged clearly, providing an auxiliary basis for decisions by the relevant analysts.
In summary, the first data list is built and combined with the logistics knowledge graph to analyze abnormal mailing behavior. Massive mailing data is fully used to analyze, study, and judge mailing behavior across multiple dimensions, scenarios, and conditions, which improves analysis efficiency, saves time, and improves the accuracy of analysis and judgment. This provides a practical technical means for assessing mailing risk behavior and strengthens the supervision capability of regulatory departments.
FIG. 1 is a flow chart of an information processing method in one embodiment. It should be understood that, although the steps in the flowchart of fig. 1 are shown in sequence as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time and may be performed at different times, and they are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 6, there is provided an information processing apparatus including:
the obtaining module 210 is configured to obtain a first data list, where the first data list includes logistics information corresponding to a plurality of coding identifiers in a preset time period, where the coding identifiers are used to indicate phone numbers that do not carry an e-commerce tag;
a clustering module 220, configured to perform clustering processing on each code identifier according to the logistics information corresponding to the plurality of code identifiers, to obtain at least one cluster and an abnormal code identifier, where the abnormal code identifier is used to indicate the code identifier that does not belong to the cluster;
the marking module 230 is configured to perform marking processing on each of the abnormal code identifiers according to a preset abnormal label, so as to obtain corresponding abnormal label data, where the abnormal label data is used to indicate a code identifier with abnormal receiving/sending behavior.
In one embodiment, the clustering module 220 is specifically configured to:
clustering each code identifier according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and an abnormal identifier, wherein the abnormal identifier is used for indicating the code identifier which does not belong to the cluster;
acquiring a second data list, wherein the second data list comprises identity information corresponding to a plurality of preset identifiers, and the preset identifiers are used for indicating telephone numbers corresponding to preset user identifiers;
and determining the abnormal identifier matched with the preset identifier as the abnormal code identifier.
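The matching step in this embodiment can be sketched as a lookup against the second data list. The row layout (`phone` and `identity` keys) and the sample values are hypothetical; this document only states that the second data list holds identity information for preset identifiers.

```python
def match_abnormal(abnormal_identifiers, second_data_list):
    """Keep only abnormal identifiers that appear in the second data list.

    Returns a mapping from each matched identifier (an abnormal code
    identifier) to the identity information recorded for it.
    """
    preset = {row["phone"]: row["identity"] for row in second_data_list}
    return {p: preset[p] for p in abnormal_identifiers if p in preset}


# Hypothetical inputs: clustering output and a preset list of telephone numbers.
abnormal_identifiers = ["F", "G"]
second_data_list = [
    {"phone": "G", "identity": "user-7"},
    {"phone": "H", "identity": "user-9"},
]
abnormal_code_identifiers = match_abnormal(abnormal_identifiers, second_data_list)
```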
In one embodiment, the clustering module 220 is specifically configured to:
determining the number of user identifications and the number of address information corresponding to each code identification according to the logistics information corresponding to the plurality of code identifications;
and clustering the coded identifiers according to the number of the user identifiers and the number of the address information corresponding to the coded identifiers to obtain at least one cluster and abnormal identifiers.
In one embodiment, the clustering module 220 is specifically configured to:
determining the density connection relation between the coding identifications according to the number of the user identifications and the number of the address information corresponding to the coding identifications;
and obtaining at least one cluster and an abnormal identifier according to the density connection relation among the code identifiers, wherein the abnormal identifier is used for indicating the code identifier without the density connection relation.
In one embodiment, the marking module 230 is specifically configured to:
determining the corresponding preset abnormal label according to the logistics information corresponding to each abnormal code identifier, wherein the logistics information comprises the waybill information, the number of user identifiers, and the number of address information items corresponding to the abnormal code identifier in a preset time period;
and marking the abnormal code identifier according to the preset abnormal label to obtain the corresponding abnormal label data.
In one embodiment, the apparatus further comprises a building module for:
and constructing a logistics knowledge graph according to the abnormal tag data and the logistics information corresponding to the coding identifier related to the abnormal tag data.
In one embodiment, the building module is further to:
determining the coding identifier directly related to the abnormal tag data according to the logistics information corresponding to the abnormal tag data to obtain a first related identifier;
determining the coding identifier directly related to the first related identifier according to the logistics information corresponding to the first related identifier to obtain a second related identifier indirectly related to the abnormal tag data;
and constructing a logistics knowledge graph according to the logistics information corresponding to the abnormal label data, the logistics information corresponding to the first related identifier and the logistics information corresponding to the second related identifier.
FIG. 7 illustrates an internal block diagram of a computer device in one embodiment. The computer device may in particular be a server. As shown in fig. 7, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system, and may also store a computer program that, when executed by the processor, causes the processor to implement the information processing method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the information processing method. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen; a key, a trackball, or a touchpad arranged on the housing of the computer device; or an external keyboard, touchpad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, the information processing apparatus provided in the present application may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 7. The memory of the computer device may store various program modules constituting the information processing apparatus, such as the acquisition module 210, the clustering module 220, and the labeling module 230 shown in fig. 6. The computer program constituted by the respective program modules causes the processor to execute the steps in the information processing method of the respective embodiments of the present application described in the present specification.
Through the obtaining module 210 of the information processing apparatus shown in fig. 6, the computer device shown in fig. 7 may obtain a first data list, where the first data list includes logistics information corresponding to a plurality of coding identifiers in a preset time period, and the coding identifiers are used to indicate phone numbers that do not carry an e-commerce label. Through the clustering module 220, the computer device may perform clustering processing on each code identifier according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and an abnormal code identifier, where the abnormal code identifier is used to indicate the code identifier that does not belong to the cluster. Through the marking module 230, the computer device may perform marking processing on each of the abnormal code identifiers according to a preset abnormal label to obtain corresponding abnormal label data, where the abnormal label data is used to indicate a code identifier with abnormal receiving/sending behavior.
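How the three program modules cooperate can be sketched as a small Python class that chains injected callables. The class name, the callable signatures, and the stub modules below are hypothetical; in particular, the stub `cluster` function is a simple threshold stand-in and not the density-based clustering described earlier.

```python
class InformationProcessingApparatus:
    """Wires the three modules together; each module is an injected callable."""

    def __init__(self, acquire, cluster, mark):
        self.acquire = acquire    # obtaining module 210
        self.cluster = cluster    # clustering module 220
        self.mark = mark          # marking module 230

    def run(self):
        first_data_list = self.acquire()
        clusters, abnormal = self.cluster(first_data_list)
        tagged = {code: self.mark(code, first_data_list[code]) for code in abnormal}
        return clusters, tagged


# Stub modules, for illustration only.
def acquire():
    return {"A": {"waybills": 2}, "B": {"waybills": 3}, "F": {"waybills": 40}}


def cluster(data):
    abnormal = [code for code, info in data.items() if info["waybills"] > 10]
    normal = [code for code in data if code not in abnormal]
    return [normal], abnormal


def mark(code, info):
    return ["frequent mailing"] if info["waybills"] > 10 else []


apparatus = InformationProcessingApparatus(acquire, cluster, mark)
clusters, tagged = apparatus.run()
```

Injecting the modules as callables keeps each one replaceable, which mirrors the statement that the program modules together constitute the information processing apparatus.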
In one embodiment, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method of any of the above embodiments when the computer program is executed.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements a method as described in any of the above embodiments.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program, which may be stored on a non-transitory computer readable storage medium and which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. An information processing method, characterized in that the method comprises:
obtaining a first data list, wherein the first data list comprises logistics information corresponding to a plurality of coding identifiers in a preset time period, and the coding identifiers are used for indicating telephone numbers which do not carry E-commerce labels;
clustering the code identifiers according to the logistics information corresponding to the code identifiers to obtain at least one cluster and an abnormal code identifier, wherein the abnormal code identifier is used for indicating the code identifier which does not belong to the cluster;
marking each abnormal code identifier according to a preset abnormal label to obtain corresponding abnormal label data, wherein the abnormal label data are used for indicating the code identifier with abnormal receiving/sending behavior;
clustering the code identifiers according to the logistics information corresponding to the code identifiers to obtain at least one cluster and an abnormal code identifier, including:
clustering each code identifier according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and an abnormal identifier, wherein the abnormal identifier is used for indicating the code identifier which does not belong to the cluster;
acquiring a second data list, wherein the second data list comprises identity information corresponding to a plurality of preset identifiers, and the preset identifiers are used for indicating telephone numbers corresponding to preset user identifiers;
determining the abnormal identifier matched with the preset identifier as the abnormal code identifier;
clustering the coded identifiers according to the logistics information corresponding to the coded identifiers to obtain at least one cluster and an abnormal identifier, wherein the clustering comprises the following steps:
determining the number of user identifications and the number of address information corresponding to each code identification according to the logistics information corresponding to the plurality of code identifications;
and clustering the coded identifiers according to the number of the user identifiers and the number of the address information corresponding to the coded identifiers to obtain at least one cluster and abnormal identifiers.
2. The method of claim 1, wherein the clustering the coded identifiers according to the number of user identifiers and the number of address information corresponding to the coded identifiers to obtain at least one cluster and an abnormal identifier includes:
determining the density connection relation between the coding identifications according to the number of the user identifications and the number of the address information corresponding to the coding identifications;
and obtaining at least one cluster and an abnormal identifier according to the density connection relation among the code identifiers, wherein the abnormal identifier is used for indicating the code identifier without the density connection relation.
3. The method of claim 1, wherein the marking each anomaly coded identifier according to a preset anomaly tag to obtain corresponding anomaly tag data comprises:
determining the corresponding preset abnormal label according to the logistics information corresponding to each abnormal code identifier, wherein the logistics information comprises the waybill information, the number of user identifiers, and the number of address information items corresponding to the abnormal code identifier in a preset time period;
and marking the abnormal code identifier according to the preset abnormal label to obtain the corresponding abnormal label data.
4. The method according to claim 3, wherein after the marking process is performed on each of the anomaly coded identifiers according to a preset anomaly tag to obtain corresponding anomaly tag data, the method further comprises:
and constructing a logistics knowledge graph according to the abnormal tag data and the logistics information corresponding to the coding identifier related to the abnormal tag data.
5. The method according to claim 4, wherein constructing a logistics knowledge graph from the anomaly tag data and logistics information corresponding to the encoded identifications associated with the anomaly tag data comprises:
determining the coding identifier directly related to the abnormal tag data according to the logistics information corresponding to the abnormal tag data to obtain a first related identifier;
determining the coding identifier directly related to the first related identifier according to the logistics information corresponding to the first related identifier to obtain a second related identifier indirectly related to the abnormal tag data;
and constructing a logistics knowledge graph according to the logistics information corresponding to the abnormal label data, the logistics information corresponding to the first related identifier and the logistics information corresponding to the second related identifier.
6. An information processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a first data list, wherein the first data list comprises logistics information corresponding to a plurality of coding identifiers in a preset time period, and the coding identifiers are used for indicating telephone numbers which do not carry E-commerce labels;
the clustering module is used for carrying out clustering processing on each code identifier according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and an abnormal code identifier, wherein the abnormal code identifier is used for indicating the code identifier which does not belong to the cluster;
the marking module is used for marking each abnormal code identifier according to a preset abnormal label to obtain corresponding abnormal label data, wherein the abnormal label data is used for indicating the code identifier with abnormal receiving/sending behavior;
clustering the code identifiers according to the logistics information corresponding to the code identifiers to obtain at least one cluster and an abnormal code identifier, including:
clustering each code identifier according to the logistics information corresponding to the plurality of code identifiers to obtain at least one cluster and an abnormal identifier, wherein the abnormal identifier is used for indicating the code identifier which does not belong to the cluster;
acquiring a second data list, wherein the second data list comprises identity information corresponding to a plurality of preset identifiers, and the preset identifiers are used for indicating telephone numbers corresponding to preset user identifiers;
determining the abnormal identifier matched with the preset identifier as the abnormal code identifier;
clustering the coded identifiers according to the logistics information corresponding to the coded identifiers to obtain at least one cluster and an abnormal identifier, wherein the clustering comprises the following steps:
determining the number of user identifications and the number of address information corresponding to each code identification according to the logistics information corresponding to the plurality of code identifications;
and clustering the coded identifiers according to the number of the user identifiers and the number of the address information corresponding to the coded identifiers to obtain at least one cluster and abnormal identifiers.
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 5 when the computer program is executed by the processor.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
CN202210987082.8A 2022-08-17 2022-08-17 Information processing method, apparatus, computer device, and storage medium Active CN115544247B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210987082.8A CN115544247B (en) 2022-08-17 2022-08-17 Information processing method, apparatus, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210987082.8A CN115544247B (en) 2022-08-17 2022-08-17 Information processing method, apparatus, computer device, and storage medium

Publications (2)

Publication Number Publication Date
CN115544247A CN115544247A (en) 2022-12-30
CN115544247B (en) 2023-08-04

Family

ID=84724734

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210987082.8A Active CN115544247B (en) 2022-08-17 2022-08-17 Information processing method, apparatus, computer device, and storage medium

Country Status (1)

Country Link
CN (1) CN115544247B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118469428B (en) * 2024-04-01 2025-04-29 中通云仓科技有限公司 Logistics digital sharing method based on contactless coding

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113935696A (en) * 2021-12-14 2022-01-14 国家邮政局邮政业安全中心 Consignment behavior abnormity analysis method and system, electronic equipment and storage medium
CN114037395A (en) * 2022-01-07 2022-02-11 国家邮政局邮政业安全中心 Abnormal consignment data identification method and system, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114706899B (en) * 2022-01-24 2025-01-17 北京明朝万达科技股份有限公司 Sensitivity calculation method, device, storage medium and equipment for express data
CN114118880A (en) * 2022-01-25 2022-03-01 国家邮政局邮政业安全中心 Method and system for identifying consignment risk figure, electronic device and storage medium
CN114444936A (en) * 2022-01-27 2022-05-06 浙江玖重科技有限公司 A logistics analysis tool
CN114154595B (en) * 2022-02-07 2022-04-08 国家邮政局邮政业安全中心 Abnormal consignment behavior detection method, system, electronic device and storage medium

Also Published As

Publication number Publication date
CN115544247A (en) 2022-12-30

KR102733859B1 (en) System for providing union management service for redevelopment and reconstruction projects

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant