CN115795335A - Logistics network anomaly identification method and device and electronic equipment - Google Patents

Publication number
CN115795335A
Authority
CN
China
Prior art keywords
logistics
data
clustering
value
optimal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310053367.9A
Other languages
Chinese (zh)
Other versions
CN115795335B (en)
Inventor
许良锋
李惟聪
陈曦
王辉
闻克宇
朱家成
师雪娇
杨阳
王必红
胡梦阳
马全峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Post Bureau Postal Industry Security Center
Original Assignee
State Post Bureau Postal Industry Security Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Post Bureau Postal Industry Security Center filed Critical State Post Bureau Postal Industry Security Center
Priority to CN202310053367.9A priority Critical patent/CN115795335B/en
Publication of CN115795335A publication Critical patent/CN115795335A/en
Application granted granted Critical
Publication of CN115795335B publication Critical patent/CN115795335B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the present disclosure disclose a method and a device for identifying anomalies at logistics outlets, and an electronic device. The logistics outlet anomaly identification method comprises the following steps: performing K-means clustering on historical logistics data to obtain a first optimal k value and a first clustering result; acquiring logistics data to be analyzed, wherein the logistics data to be analyzed comprises the historical logistics data and newly added logistics data; performing K-means clustering on the logistics data to be analyzed to obtain a second optimal k value and a second clustering result; comparing the second optimal k value with the first optimal k value; and identifying abnormal logistics outlets according to the comparison result, the first clustering result and the second clustering result. The method can effectively identify abnormal logistics outlets.

Description

Logistics network anomaly identification method and device and electronic equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and a device for identifying anomalies at logistics outlets, and an electronic device.
Background
In recent years, the logistics industry in China has developed rapidly. Express delivery has become an indispensable part of people's lives and is gradually developing toward the trend of 'express living and life express delivery'.
Express delivery is a mode of article circulation in which the stability of logistics outlets plays a very important role: the outlets must receive parcels promptly after dispatch and complete the last leg of the express delivery relay, so the stability of the logistics network is closely related to its outlets.
The inventors found that the prior art lacks an intelligent method for monitoring logistics outlets, so abnormal logistics outlets cannot be identified in time, which prolongs delivery times or even leads to lost parcels.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a method and an apparatus for identifying an anomaly in a logistics node, and an electronic device, which can effectively identify the anomaly in the logistics node.
In a first aspect, an embodiment of the present disclosure provides a method for identifying an anomaly in a logistics node, which adopts the following technical scheme:
the logistics network anomaly identification method comprises the following steps:
performing K-means clustering on historical logistics data to obtain a first optimal K value and a first clustering result, wherein the historical logistics data comprises all logistics data which are distributed from a target transfer center to each logistics network point within a certain time period before a specific time;
acquiring logistics data to be analyzed, wherein the logistics data to be analyzed comprises the historical logistics data and newly added logistics data, and the newly added logistics data comprises all logistics data distributed from the target transfer center to each logistics outlet after the specific time;
performing K-means clustering on the logistics data to be analyzed to obtain a second optimal k value and a second clustering result;
comparing the second optimal k value with the first optimal k value;
and identifying abnormal logistics network points according to the comparison result, the first clustering result and the second clustering result.
Optionally, performing K-means clustering on the historical logistics data comprises: performing K-means clustering on the historical logistics data based on at least one type of time-related information;
performing K-means clustering on the logistics data to be analyzed comprises: performing K-means clustering on the logistics data to be analyzed based on at least one type of time-related information.
Optionally, the obtaining the first optimal k value comprises: determining the first optimal k value by a loss function; or traversing all the k values, and selecting the k value with the minimum sum of Euclidean distances from all the cluster nodes to the corresponding cluster center as the first optimal k value; or traversing all the k values, and selecting the k value with the minimum mean square error of the Euclidean distances from all the cluster nodes to the corresponding cluster center as the first optimal k value; or traversing all the k values, and selecting the k value with the highest proportion that the clustering nodes in each cluster of the clustering result belong to the same logistics network point as the first optimal k value.
Optionally, the logistics data includes actual time information and delay permission information; when clustering is performed according to the actual time information, the logistics outlet anomaly identification method further comprises: before K-means clustering is performed on the historical logistics data, removing from the historical logistics data the logistics data whose actual time duration is lower than the delay permission duration; and before K-means clustering is performed on the logistics data to be analyzed, removing from the logistics data to be analyzed the logistics data whose actual time duration is lower than the delay permission duration.
Optionally, the identifying an abnormal logistics node according to the comparison result, the first clustering result, and the second clustering result includes:
when the second optimal k value is equal to the first optimal k value, calculating cosine similarity between a first feature vector of the first clustering result and a second feature vector of the second clustering result;
if the cosine similarity is greater than or equal to a first threshold, no abnormal logistics outlet exists;
if the cosine similarity is smaller than the first threshold, decomposing the first feature vector into a plurality of first sub-feature vectors, and decomposing the second feature vector into a plurality of second sub-feature vectors;
and calculating cosine similarity between each first sub-feature vector and each corresponding second sub-feature vector, wherein all clustering nodes corresponding to the second sub-feature vectors with the cosine similarity smaller than a second threshold are abnormal logistics data, and logistics outlets corresponding to the abnormal logistics data are abnormal logistics outlets.
Optionally, the first feature vector is generated based on a proportion of the number of nodes owned by each cluster in the first clustering result, and the second feature vector is generated based on a proportion of the number of nodes owned by each cluster in the second clustering result.
Optionally, the identifying an abnormal logistics site according to the comparison result, the first clustering result, and the second clustering result further includes:
when the second optimal k value is larger than the first optimal k value, determining a newly added clustering center;
and judging all clustering nodes in the cluster where each newly added clustering center is located as abnormal logistics data, wherein the logistics network points corresponding to the abnormal logistics data are abnormal logistics network points.
Optionally, the determining a newly added cluster center includes: and selecting n clustering centers with the largest Euclidean distance average value with the k clustering centers in the first clustering result as new clustering centers in all clustering centers in the second clustering result, wherein n is the difference between the second optimal k value and the first optimal k value.
In a second aspect, an embodiment of the present disclosure further provides a device for identifying an anomaly in a logistics node, where the following technical scheme is adopted:
the logistics outlet anomaly identification device includes:
the first clustering module is used for carrying out K-means clustering on historical logistics data to obtain a first optimal K value and a first clustering result, wherein the historical logistics data comprises all logistics data which are distributed from a target transfer center to each logistics network point within a certain time period before a specific time;
the data acquisition module is used for acquiring logistics data to be analyzed, wherein the logistics data to be analyzed comprises the historical logistics data and newly added logistics data, and the newly added logistics data comprises all logistics data distributed from the target transfer center to each logistics outlet after the specific time;
the second clustering module is used for performing K-means clustering on the logistics data to be analyzed to obtain a second optimal k value and a second clustering result;
a comparison module for comparing the second optimal k value with the first optimal k value;
and the abnormal identification module is used for identifying abnormal logistics points according to the comparison result, the first clustering result and the second clustering result.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, which adopts the following technical scheme:
the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform any one of the above methods for logistics node anomaly identification.
In a fourth aspect, the disclosed embodiments also provide a computer-readable storage medium storing computer instructions for causing a computer to execute any one of the above-mentioned logistics node anomaly identification methods.
The embodiment of the disclosure provides an abnormal identification method and device for logistics nodes and electronic equipment, wherein in the abnormal identification method for logistics nodes, firstly, historical logistics data are subjected to K-means clustering to obtain a first optimal K value and a first clustering result, then, to-be-analyzed logistics data are obtained, the to-be-analyzed logistics data are subjected to K-means clustering to obtain a second optimal K value and a second clustering result, then, the second optimal K value is compared with the first optimal K value, and finally, the abnormal logistics nodes are identified according to the comparison result, the first clustering result and the second clustering result. Therefore, the abnormal logistics network points can be effectively identified through the abnormal logistics network point identification method in the embodiment of the disclosure.
The foregoing is only a summary of the technical solutions of the present disclosure, provided so that its technical means can be understood more clearly; the present disclosure may be embodied in other specific forms without departing from its spirit or essential attributes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of an anomaly identification method for a logistics node according to an embodiment of the present disclosure;
fig. 2 is a specific flowchart of step S5 provided in the embodiment of the present disclosure;
fig. 3 is a schematic block diagram of an anomaly identification device of a logistics node according to an embodiment of the present disclosure;
fig. 4 is a schematic block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
It is to be understood that the embodiments of the present disclosure are described below by way of specific examples, and that other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure herein. It is to be understood that the embodiments described are only a few embodiments of the present disclosure, and not all embodiments. The disclosure may be carried into practice or applied to various other specific embodiments, and various modifications and changes may be made in the details within the description and the drawings without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without inventive step, are intended to be within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be further noted that the drawings provided in the following embodiments are only schematic illustrations of the basic concepts of the present disclosure, and the drawings only show the components related to the present disclosure rather than the numbers, shapes and dimensions of the components in actual implementation, and the types, the numbers and the proportions of the components in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to provide a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the present disclosure provides a method for identifying an anomaly in a logistics node, specifically, as shown in fig. 1, the method for identifying an anomaly in a logistics node includes:
S1, performing K-means clustering on historical logistics data to obtain a first optimal k value and a first clustering result.
The historical logistics data comprises all logistics data distributed from the target transfer center to each logistics outlet within a certain time period before a specific time. This data is the logistics data of normal logistics outlets and does not involve the logistics data of abnormal logistics outlets. For example, the historical logistics data comprises all logistics data distributed from transfer center A to logistics outlets B1-Bn within one month before October 31, 2022. Each piece of logistics data can include various information such as the express waybill number, departure time, actual time, logistics outlet, person responsible for the logistics outlet, and delay permission. The delay permission information can be determined according to one or more of the sender's requirements, the recipient's requirements, the requirements of the delivered goods, and the like. The historical logistics data can be presented as a data table; see Table 1.
Table 1 (Figure SMS_1)
When K-means clustering is performed on the historical logistics data, it can be based on one or more types of information generated during actual express transportation (for example, information other than the express waybill number, the logistics outlet, and the person responsible for the logistics outlet). During clustering, the relevant information of each piece of logistics data forms a vector, and the end point of that vector serves as a clustering node.
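The vector-forming and clustering step described above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the two-dimensional feature choice (actual duration, permitted delay), the function name `kmeans`, and the deterministic initialization from the first k points are all assumptions made for the example.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm on tuples of floats.

    Assumption: the first k points are used as initial centers so that
    the sketch is deterministic; the source does not specify initialization.
    """
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each clustering node joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical records: (actual duration in hours, permitted delay in hours).
points = [(1.0, 2.0), (10.0, 2.0), (1.2, 2.0),
          (10.4, 2.0), (0.8, 2.0), (9.6, 2.0)]
centers, labels = kmeans(points, k=2)
```

Each tuple plays the role of a clustering node; adding departure time or other time-related fields would only lengthen the tuples.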
Considering that timeliness is particularly critical in express delivery and determines whether an article can arrive accurately, in the embodiment of the disclosure K-means clustering is performed on the historical logistics data based on at least one type of time-related information, so that abnormal logistics outlets can be identified along the time dimension. The time-related information may be: actual time information; departure time information and actual time information; actual time information and delay permission information; or departure time information, actual time information, and delay permission information.
Optionally, when the logistics data includes actual time information and delay permission information and clustering is performed using the actual time information, the logistics outlet anomaly identification method further includes: before K-means clustering is performed on the historical logistics data, removing from the historical logistics data the logistics data whose actual time duration is lower than the delay permission duration. Although the actual time duration in such logistics data may be large, it is still normal logistics data, and removing it helps improve identification efficiency.
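The optional pre-filtering step can be sketched as below. The record type `Shipment` and its field names are hypothetical; only the rule itself (drop records whose actual duration is lower than the permitted delay duration) comes from the text.

```python
from dataclasses import dataclass

@dataclass
class Shipment:
    waybill: str                # express waybill number (hypothetical field name)
    outlet: str                 # logistics outlet (hypothetical field name)
    actual_hours: float         # actual time duration
    allowed_delay_hours: float  # delay permission duration

def drop_within_allowance(records):
    """Remove records whose actual duration is lower than the permitted delay."""
    return [r for r in records if r.actual_hours >= r.allowed_delay_hours]

records = [
    Shipment("W1", "B1", actual_hours=20.0, allowed_delay_hours=48.0),  # removed
    Shipment("W2", "B1", actual_hours=60.0, allowed_delay_hours=48.0),  # kept
    Shipment("W3", "B2", actual_hours=48.0, allowed_delay_hours=48.0),  # kept
]
kept = drop_within_allowance(records)
```

The same function would apply unchanged to the logistics data to be analyzed before its own clustering pass.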
In the embodiment of the present disclosure, the first optimal k value may be obtained in a plurality of ways as follows:
First, determining the first optimal k value through a loss function; second, traversing all the k values and selecting the k value with the minimum sum of Euclidean distances from all the clustering nodes to their corresponding clustering centers as the first optimal k value; third, traversing all the k values and selecting the k value with the minimum mean square error of the Euclidean distances from all the clustering nodes to their corresponding clustering centers as the first optimal k value; fourth, traversing all the k values and selecting the k value with the highest proportion of clustering nodes in each cluster of the clustering result that belong to the same logistics outlet as the first optimal k value.
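The second and third ways can be illustrated with the helper functions below. This is a sketch under assumptions: the function names are invented for the example, "mean square error of the Euclidean distances" is interpreted here as the mean of the squared distances, and the candidate k values and their scores in `scores` are made-up numbers.

```python
import math

def sum_of_distances(points, centers, labels):
    """Sum of Euclidean distances from each node to its cluster center."""
    return sum(math.dist(p, centers[l]) for p, l in zip(points, labels))

def mean_squared_distance(points, centers, labels):
    """Mean of squared node-to-center distances (one reading of 'mean square error')."""
    total = sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
    return total / len(points)

def pick_k_with_min_score(score_by_k):
    """Traverse the candidate k values and return the one with the smallest score."""
    return min(score_by_k, key=score_by_k.get)

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
centers = [(0.0, 1.0), (10.0, 0.0)]
labels = [0, 0, 1]
total = sum_of_distances(points, centers, labels)
mse = mean_squared_distance(points, centers, labels)

# Hypothetical scores obtained by clustering once per candidate k.
scores = {2: 5.0, 3: 2.0, 4: 3.5}
best = pick_k_with_min_score(scores)
```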
In an exemplary embodiment, the fourth way (traversing all the k values and selecting, as the first optimal k value, the k value with the highest proportion of clustering nodes in each cluster of the clustering result that belong to the same logistics outlet) may be carried out as follows. For example, when k is 4, the clustering result includes 4 clusters, and the proportions of clustering nodes in the 4 clusters that belong to the same logistics outlet are 60%, 70%, 80% and 90% respectively, the highest being 90%; when k is 5, the clustering result includes 5 clusters, and the proportions of clustering nodes in the 5 clusters that belong to the same logistics outlet are 55%, 70%, 60%, 65% and 80% respectively, the highest being 80%. In that case k = 4 is preferred over k = 5, and so on, until the first optimal k value is finally determined. The above numerical values are merely examples and are not limiting. The proportion of clustering nodes in one cluster that belong to the same logistics outlet can be determined as follows: if all the clustering nodes in a cluster belong to the same logistics outlet, the proportion is 100%; if the clustering nodes in a cluster belong to at least two logistics outlets, the proportion of clustering nodes belonging to each logistics outlet is calculated, and the highest value is taken as the required proportion. Alternatively, the average of the proportions can be calculated (for example, 75% when k is 4 and 66% when k is 5) and then compared to determine the k value with the highest proportion of clustering nodes in each cluster of the clustering result that belong to the same logistics outlet.
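The worked example above can be reproduced with a short sketch. The function names `cluster_purity` and `best_k` are hypothetical, and the per-cluster proportions are the illustrative figures from the text rather than real measurements.

```python
from collections import Counter

def cluster_purity(outlets):
    """Share of nodes in one cluster that belong to its most common outlet."""
    counts = Counter(outlets)
    return max(counts.values()) / len(outlets)

def best_k(purity_by_k, use_mean=False):
    """Pick the k whose clusters score highest, by maximum or by average proportion."""
    if use_mean:
        score = lambda props: sum(props) / len(props)
    else:
        score = max
    return max(purity_by_k, key=lambda k: score(purity_by_k[k]))

# Proportions taken from the example in the text.
example = {
    4: [0.60, 0.70, 0.80, 0.90],
    5: [0.55, 0.70, 0.60, 0.65, 0.80],
}
k_by_max = best_k(example)                  # compares 90% against 80%
k_by_mean = best_k(example, use_mean=True)  # compares 75% against 66%

# A cluster with three nodes from outlet B1 and one from B2.
purity = cluster_purity(["B1", "B1", "B2", "B1"])
```

Both comparison rules select k = 4 for these figures, which matches the conclusion drawn in the text.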
And the clustering result corresponding to the first optimal k value is the first clustering result.
S2, acquiring logistics data to be analyzed.
The logistics data to be analyzed comprises the historical logistics data and newly added logistics data, and the newly added logistics data comprises all logistics data distributed from the target transfer center to each logistics outlet after the specific time. For example, the newly added logistics data comprises all logistics data distributed from transfer center A to logistics outlets B1-Bn within 15 days from October 31, 2022. Each piece of logistics data can include various information such as the express waybill number, departure time, actual time, logistics outlet, person responsible for the logistics outlet, and delay permission. The types of information included in the newly added logistics data should at least cover the types of information included in the historical logistics data. The logistics data to be analyzed can also be presented as a data table; see Table 1.
S3, performing K-means clustering on the logistics data to be analyzed to obtain a second optimal k value and a second clustering result.
In the embodiment of the disclosure, K-means clustering is performed on the logistics data to be analyzed based on at least one type of time-related information, so that abnormal logistics outlets can be identified along the time dimension.
Optionally, the logistics data includes actual time information and delay permission information; when clustering is performed using the actual time information, the logistics outlet anomaly identification method further includes: before K-means clustering is performed on the logistics data to be analyzed, removing from the logistics data to be analyzed the logistics data whose actual time duration is lower than the delay permission duration.
For the specific manner of performing K-means clustering on the logistics data to be analyzed to obtain the second optimal k value and the second clustering result, reference may be made to the manner of performing K-means clustering on the historical logistics data in step S1 to obtain the first optimal k value and the first clustering result, which is not repeated here.
S4, comparing the second optimal k value with the first optimal k value.
The second optimal k value and the first optimal k value are compared directly to check whether they are equal, and there are two possible comparison results: the second optimal k value is equal to the first optimal k value, or the second optimal k value is greater than the first optimal k value.
S5, identifying abnormal logistics outlets according to the comparison result, the first clustering result and the second clustering result.
Optionally, as shown in fig. 2, in the embodiment of the present disclosure, identifying an abnormal logistics node according to the comparison result, the first clustering result, and the second clustering result includes:
Substep S51, judging whether the second optimal k value is equal to the first optimal k value.
The judgment result may be that the second optimal k value is equal to the first optimal k value, or that the second optimal k value is greater than the first optimal k value. Since the number of the clustering nodes in the second clustering result is greater than that of the clustering nodes in the first clustering result, the situation that the second optimal k value is less than the first optimal k value does not exist.
Substep S52, when the second optimal k value is equal to the first optimal k value, calculating the cosine similarity between the first feature vector of the first clustering result and the second feature vector of the second clustering result.
The first feature vector and the second feature vector may be any vectors that can express the first clustering result and the second clustering result. Illustratively, in the embodiment of the present disclosure, the first feature vector is generated based on the proportion of the number of nodes owned by each cluster in the first clustering result, for example, the first clustering result includes 4 clusters, and the proportion of the number of nodes owned by each cluster is 10. Correspondingly, the second feature vector is generated based on the proportion of the number of nodes owned by each cluster in the second clustering result. Alternatively, the first feature vector is generated based on the proportion of clustering nodes in each cluster of the first clustering result that belong to the same logistics outlet, and the second feature vector is generated based on the proportion of clustering nodes in each cluster of the second clustering result that belong to the same logistics outlet. These proportions are calculated as described above, which is not repeated here.
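A feature vector built from cluster-size proportions, and the cosine similarity between two such vectors, can be sketched as follows. The function names and the sample labels are hypothetical. The sketch also assumes that cluster index c refers to the same cluster in both clustering results; the text does not say how the clusters of the two results are matched.

```python
import math
from collections import Counter

def size_proportions(labels, k):
    """Feature vector: share of clustering nodes owned by each of the k clusters."""
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(k)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (returns 0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical cluster labels from the first and second clustering runs (k = 3).
first_labels = [0, 0, 1, 2]
second_labels = [0, 0, 0, 1, 1, 2, 2, 2]
first_vector = size_proportions(first_labels, k=3)
second_vector = size_proportions(second_labels, k=3)
similarity = cosine_similarity(first_vector, second_vector)
```

The resulting `similarity` would then be compared with the first threshold in the next substep.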
Substep S53, determining whether an abnormal logistics outlet exists according to the cosine similarity.
Specifically, determining whether an abnormal logistics outlet exists according to the cosine similarity includes:
if the cosine similarity is greater than or equal to the first threshold, no abnormal logistics outlet exists; that is, the logistics outlets included in the pieces of logistics data corresponding to the clustering nodes are all normal.
If the cosine similarity is smaller than a first threshold, decomposing the first feature vector into a plurality of first sub-feature vectors, and decomposing the second feature vector into a plurality of second sub-feature vectors;
and calculating cosine similarity between each first sub-characteristic vector and each corresponding second sub-characteristic vector, wherein all cluster nodes corresponding to the second sub-characteristic vectors with the cosine similarity smaller than a second threshold value are abnormal logistics data, and logistics outlets corresponding to the abnormal logistics data are abnormal logistics outlets.
The first threshold and the second threshold can be determined according to actual needs. The first feature vector and the second feature vector are decomposed in the same way, which can likewise be determined according to actual needs. "A plurality" means 2, 3, 4, and so on. For example, the first feature vector may be decomposed according to the number of clustering nodes: if the first clustering result includes 10 clustering nodes and the first feature vector is decomposed into 2 first sub-feature vectors, each first sub-feature vector corresponds to 5 clustering nodes.
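The two-level threshold check can be sketched as follows. The decomposition used here (consecutive slices of equal length) is only one possible choice, since the text leaves the decomposition open; the function names, the sample vectors, and both threshold values are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (returns 0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def flag_subvectors(first, second, size, first_threshold, second_threshold):
    """Return the index groups whose sub-vector similarity falls below the second threshold.

    An empty list means no anomaly was found at either level.
    """
    if cosine_similarity(first, second) >= first_threshold:
        return []  # overall similarity is high enough: nothing abnormal
    flagged = []
    for start in range(0, len(first), size):
        sub_first = first[start:start + size]
        sub_second = second[start:start + size]
        if cosine_similarity(sub_first, sub_second) < second_threshold:
            flagged.append(list(range(start, start + len(sub_first))))
    return flagged

# Hypothetical feature vectors with four components each.
first_vector = [0.4, 0.1, 0.25, 0.25]
second_vector = [0.1, 0.4, 0.25, 0.25]
flagged = flag_subvectors(first_vector, second_vector, size=2,
                          first_threshold=0.9, second_threshold=0.9)
```

The clustering nodes behind the flagged components would then be treated as abnormal logistics data, and their outlets as abnormal logistics outlets.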
Optionally, identifying the abnormal logistics node according to the comparison result, the first clustering result and the second clustering result further includes:
Substep S54, when the second optimal k value is greater than the first optimal k value, determining the newly added clustering centers;
Substep S55, judging all the clustering nodes in the cluster of each newly added clustering center to be abnormal logistics data, wherein the logistics outlets corresponding to the abnormal logistics data are abnormal logistics outlets.
Optionally, determining the newly added clustering centers includes: among all the clustering centers in the second clustering result, selecting the n clustering centers with the largest average Euclidean distance to the k clustering centers in the first clustering result as the newly added clustering centers, where n is the difference between the second optimal k value and the first optimal k value. The average Euclidean distance between one clustering center in the second clustering result and the k clustering centers in the first clustering result is calculated as follows: the Euclidean distances between that clustering center and the 1st to k-th clustering centers in the first clustering result are calculated respectively, and then the average of all these Euclidean distances is calculated. If a cluster in the second clustering result is an original one (that is, it exists in the first clustering result), the Euclidean distance between its clustering center and the clustering centers in the first clustering result is smaller; if a cluster in the second clustering result is newly added, the Euclidean distance between its clustering center and the clustering centers in the first clustering result is larger. The newly added clustering centers can be determined in this way.
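The selection rule described above can be sketched as follows. The function name `newly_added_centers` and the sample coordinates are hypothetical; the rule itself (rank the second-result centers by their average Euclidean distance to the first-result centers and take the top n) follows the text.

```python
import math

def newly_added_centers(first_centers, second_centers):
    """Return indices of the n second-result centers farthest, on average,
    from the first-result centers, where n is the difference in k."""
    n = len(second_centers) - len(first_centers)
    if n <= 0:
        return []

    def mean_distance(center):
        # Average Euclidean distance to the 1st to k-th centers of the first result.
        return sum(math.dist(center, f) for f in first_centers) / len(first_centers)

    ranked = sorted(range(len(second_centers)),
                    key=lambda i: mean_distance(second_centers[i]),
                    reverse=True)
    return sorted(ranked[:n])

# Hypothetical centers: two original clusters plus one far-away new cluster.
first_centers = [(1.0, 1.0), (5.0, 1.0)]
second_centers = [(1.1, 1.0), (5.2, 1.0), (20.0, 1.0)]
new_indices = newly_added_centers(first_centers, second_centers)
```

All clustering nodes assigned to the returned indices would then be judged abnormal logistics data, as in substep S55.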
Optionally, the method for identifying an anomaly in a logistics site in the embodiment of the present disclosure further includes: and counting the quantity of the abnormal logistics data corresponding to each abnormal logistics network point, and reminding a responsible person of the related abnormal logistics network point and a superior supervisor or supervision department of the related abnormal logistics network point when the quantity exceeds a corresponding threshold value.
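The counting and reminder step can be sketched as below. The function name, the per-outlet threshold table, and the default threshold are hypothetical, and how the reminder is actually delivered is left outside the sketch.

```python
from collections import Counter

def outlets_needing_alert(abnormal_outlets, thresholds, default_threshold=0):
    """Count abnormal records per outlet and keep outlets whose count exceeds
    their threshold (falling back to default_threshold when none is set)."""
    counts = Counter(abnormal_outlets)
    return {outlet: count for outlet, count in counts.items()
            if count > thresholds.get(outlet, default_threshold)}

# One entry per abnormal logistics record, naming the outlet it belongs to.
abnormal_outlets = ["B1", "B2", "B1", "B1"]
thresholds = {"B1": 2, "B2": 3}  # hypothetical per-outlet thresholds
alerts = outlets_needing_alert(abnormal_outlets, thresholds)
```

Each key in `alerts` identifies an outlet whose responsible person and supervising department would be reminded.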
In the logistics outlet anomaly identification method, K-means clustering is first performed on historical logistics data to obtain a first optimal k value and a first clustering result; logistics data to be analyzed is then acquired and K-means clustering is performed on it to obtain a second optimal k value and a second clustering result; the second optimal k value is then compared with the first optimal k value; and finally abnormal logistics outlets are identified according to the comparison result, the first clustering result and the second clustering result. Therefore, abnormal logistics outlets can be effectively identified by the logistics outlet anomaly identification method in the embodiment of the disclosure.
In addition, an embodiment of the present disclosure further provides a logistics outlet anomaly identification device. Specifically, as shown in fig. 3, the device includes:
the first clustering module 100, configured to perform K-means clustering on historical logistics data to obtain a first optimal k value and a first clustering result, where the historical logistics data includes all logistics data distributed from a target transfer center to each logistics outlet within a certain time period before a specific time;
the data acquisition module 200, configured to acquire logistics data to be analyzed, where the logistics data to be analyzed includes the historical logistics data and newly added logistics data, and the newly added logistics data includes all logistics data distributed from the target transfer center to each logistics outlet after the specific time;
the second clustering module 300, configured to perform K-means clustering on the logistics data to be analyzed to obtain a second optimal k value and a second clustering result;
the comparison module 400, configured to compare the second optimal k value with the first optimal k value;
and the anomaly identification module 500, configured to identify abnormal logistics outlets according to the comparison result, the first clustering result and the second clustering result.
Optionally, when the logistics data includes actual time information and delay permission information, and clustering is performed based on the actual time information, the logistics outlet anomaly identification device further includes a preprocessing module. The preprocessing module is configured to remove, before K-means clustering is performed on the historical logistics data, the logistics data in the historical logistics data whose actual duration is lower than the permitted delay duration, and to remove, before K-means clustering is performed on the logistics data to be analyzed, the logistics data in the logistics data to be analyzed whose actual duration is lower than the permitted delay duration.
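The preprocessing described above amounts to a simple filter applied before clustering. The sketch below assumes each record is a dictionary with hypothetical fields `actual_duration` and `permitted_delay` expressed in the same unit (hours here); these names and units are illustrative and not taken from the disclosure.

```python
def remove_within_permitted_delay(records):
    """Drop records whose actual duration is lower than the permitted
    delay duration; the remaining records go on to K-means clustering."""
    return [
        record
        for record in records
        if record["actual_duration"] >= record["permitted_delay"]
    ]


# Hypothetical logistics records (durations in hours).
records = [
    {"waybill": "W1", "actual_duration": 30, "permitted_delay": 48},
    {"waybill": "W2", "actual_duration": 60, "permitted_delay": 48},
    {"waybill": "W3", "actual_duration": 48, "permitted_delay": 48},
]
kept = remove_within_permitted_delay(records)
kept_ids = [record["waybill"] for record in kept]
```

The same filter would be applied once to the historical logistics data and once to the logistics data to be analyzed, so that both clustering runs see only records at or beyond the permitted delay.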
It should be noted that the relevant details of each step in the logistics outlet anomaly identification method also apply to the corresponding modules of the logistics outlet anomaly identification device and are not repeated here.
In addition, the embodiment of the disclosure also provides an electronic device, which adopts the following technical scheme:
the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform any of the above methods for anomaly identification at a logistics site.
In addition, the embodiment of the disclosure also provides a computer-readable storage medium, which stores computer instructions for causing a computer to execute any one of the above methods for identifying an anomaly in a logistics node.
An electronic device according to an embodiment of the present disclosure includes a memory and a processor. The memory is used to store non-transitory computer-readable instructions. In particular, the memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, and flash memory.
The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions. In an embodiment of the present disclosure, the processor is configured to execute the computer readable instructions stored in the memory, so that the electronic device performs all or part of the foregoing steps of the logistics node abnormality identification method according to the embodiments of the present disclosure.
Those skilled in the art should understand that, in order to solve the technical problem of how to obtain a good user experience, the present embodiment may also include well-known structures such as a communication bus, an interface, and the like, and these well-known structures should also be included in the protection scope of the present disclosure.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, showing a structure suitable for implementing the embodiments of the present disclosure. The electronic device shown in fig. 4 is only an example and should not limit the functions or scope of use of the embodiments of the present disclosure.
As shown in fig. 4, the electronic device may include a processing means (e.g., a central processing unit, a graphic processor, etc.) that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) or a program loaded from a storage means into a Random Access Memory (RAM). In the RAM, various programs and data necessary for the operation of the electronic apparatus are also stored. The processing device, the ROM, and the RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.
Generally, the following devices may be connected to the I/O interface: input means including, for example, a sensor or a visual information acquisition device; output devices including, for example, display screens and the like; storage devices including, for example, magnetic tape, hard disk, and the like; and a communication device. The communication means may allow the electronic device to communicate wirelessly or by wire with other devices, such as edge computing devices, to exchange data. While fig. 4 illustrates an electronic device having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, the processes described above with reference to the flow diagrams may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means, or installed from a storage means, or installed from a ROM. When the computer program is executed by a processing device, all or part of the steps of the logistics node abnormality identification method of the embodiment of the disclosure are executed.
For the detailed description of the present embodiment, reference may be made to the corresponding descriptions in the foregoing embodiments, which are not repeated herein.
A computer-readable storage medium according to an embodiment of the present disclosure has non-transitory computer-readable instructions stored thereon. When the non-transitory computer readable instructions are executed by a processor, all or part of the steps of the logistics node abnormality identification method of the embodiments of the present disclosure are executed.
The computer-readable storage media include, but are not limited to: optical storage media (e.g., CD-ROMs and DVDs), magneto-optical storage media (e.g., MOs), magnetic storage media (e.g., magnetic tapes or removable disks), media with built-in rewritable non-volatile memory (e.g., memory cards), and media with built-in ROMs (e.g., ROM cartridges).
For the detailed description of the present embodiment, reference may be made to the corresponding descriptions in the foregoing embodiments, which are not repeated herein.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present disclosure, relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. The block diagrams of components, apparatuses, devices, and systems referred to in the present disclosure are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will appreciate, these components, apparatuses, devices, and systems may be connected, arranged, or configured in any manner. Words such as "including," "comprising," and "having" are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. As used herein, the words "or" and "and" refer to, and are used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
Also, as used herein, "or" as used in a list of items beginning with "at least one" indicates a disjunctive list, such that, for example, a list of "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.
It is also noted that in the systems and methods of the present disclosure, components or steps may be decomposed and/or re-combined. Such decomposition and/or recombination should be considered as equivalents of the present disclosure.
Various changes, substitutions and alterations to the techniques described herein may be made without departing from the teachings as defined by the appended claims. Moreover, the scope of the claims of the present disclosure is not limited to the particular aspects of the process, machine, manufacture, composition of matter, means, methods and acts described above. Processes, machines, manufacture, compositions of matter, means, methods, or acts, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein may be utilized. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or acts.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. A logistics outlet anomaly identification method, characterized by comprising the following steps:
performing K-means clustering on historical logistics data to obtain a first optimal k value and a first clustering result, wherein the historical logistics data comprises all logistics data distributed from a target transfer center to each logistics outlet within a certain time period before a specific time;
acquiring logistics data to be analyzed, wherein the logistics data to be analyzed comprises the historical logistics data and newly added logistics data, and the newly added logistics data comprises all logistics data distributed from the target transfer center to each logistics outlet after the specific time;
performing K-means clustering on the logistics data to be analyzed to obtain a second optimal k value and a second clustering result;
comparing the second optimal k value with the first optimal k value;
and identifying abnormal logistics outlets according to the comparison result, the first clustering result and the second clustering result.
2. The logistics outlet anomaly identification method of claim 1, wherein the K-means clustering of the historical logistics data comprises: performing K-means clustering on the historical logistics data based on at least one item of time-related information;
the K-means clustering of the logistics data to be analyzed comprises: performing K-means clustering on the logistics data to be analyzed based on at least one item of time-related information.
3. The logistics outlet anomaly identification method of claim 2, wherein obtaining the first optimal k value comprises: determining the first optimal k value by a loss function; or traversing all k values and selecting, as the first optimal k value, the k value with the minimum sum of Euclidean distances from all clustering nodes to their corresponding clustering centers; or traversing all k values and selecting, as the first optimal k value, the k value with the minimum mean square error of the Euclidean distances from all clustering nodes to their corresponding clustering centers; or traversing all k values and selecting, as the first optimal k value, the k value for which the proportion of clustering nodes in each cluster of the clustering result that belong to the same logistics outlet is highest.
4. The logistics outlet anomaly identification method of claim 2, wherein the logistics data comprises actual time information and delay permission information; when clustering is performed based on the actual time information, the method further comprises: before performing K-means clustering on the historical logistics data, removing from the historical logistics data the logistics data whose actual duration is lower than the permitted delay duration; and before performing K-means clustering on the logistics data to be analyzed, removing from the logistics data to be analyzed the logistics data whose actual duration is lower than the permitted delay duration.
5. The logistics outlet anomaly identification method of claim 1, wherein identifying abnormal logistics outlets according to the comparison result, the first clustering result and the second clustering result comprises:
when the second optimal k value is equal to the first optimal k value, calculating the cosine similarity between a first feature vector of the first clustering result and a second feature vector of the second clustering result;
if the cosine similarity is greater than or equal to a first threshold, no abnormal logistics outlet exists;
if the cosine similarity is smaller than the first threshold, decomposing the first feature vector into a plurality of first sub-feature vectors and decomposing the second feature vector into a plurality of second sub-feature vectors;
and calculating the cosine similarity between each first sub-feature vector and its corresponding second sub-feature vector, wherein all clustering nodes corresponding to second sub-feature vectors whose cosine similarity is smaller than a second threshold are abnormal logistics data, and the logistics outlets corresponding to the abnormal logistics data are abnormal logistics outlets.
6. The logistics outlet anomaly identification method of claim 5, wherein the first feature vector is generated based on the proportion of nodes contained in each cluster of the first clustering result, and the second feature vector is generated based on the proportion of nodes contained in each cluster of the second clustering result.
7. The logistics outlet anomaly identification method of claim 5 or 6, wherein identifying abnormal logistics outlets according to the comparison result, the first clustering result and the second clustering result further comprises:
when the second optimal k value is larger than the first optimal k value, determining the newly added clustering centers;
and judging all clustering nodes in the cluster of each newly added clustering center to be abnormal logistics data, wherein the logistics outlets corresponding to the abnormal logistics data are abnormal logistics outlets.
8. The logistics outlet anomaly identification method of claim 7, wherein determining the newly added clustering centers comprises: from all clustering centers in the second clustering result, selecting the n clustering centers with the largest average Euclidean distance to the k clustering centers in the first clustering result as the newly added clustering centers, wherein n is the difference between the second optimal k value and the first optimal k value.
9. A logistics outlet anomaly identification device, comprising:
a first clustering module, configured to perform K-means clustering on historical logistics data to obtain a first optimal k value and a first clustering result, wherein the historical logistics data comprises all logistics data distributed from a target transfer center to each logistics outlet within a certain time period before a specific time;
a data acquisition module, configured to acquire logistics data to be analyzed, wherein the logistics data to be analyzed comprises the historical logistics data and newly added logistics data, and the newly added logistics data comprises all logistics data distributed from the target transfer center to each logistics outlet after the specific time;
a second clustering module, configured to perform K-means clustering on the logistics data to be analyzed to obtain a second optimal k value and a second clustering result;
a comparison module, configured to compare the second optimal k value with the first optimal k value;
and an anomaly identification module, configured to identify abnormal logistics outlets according to the comparison result, the first clustering result and the second clustering result.
10. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the logistics outlet anomaly identification method of any one of claims 1-8.
CN202310053367.9A 2023-02-02 2023-02-02 Logistics network point anomaly identification method and device and electronic equipment Active CN115795335B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310053367.9A CN115795335B (en) 2023-02-02 2023-02-02 Logistics network point anomaly identification method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310053367.9A CN115795335B (en) 2023-02-02 2023-02-02 Logistics network point anomaly identification method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN115795335A true CN115795335A (en) 2023-03-14
CN115795335B CN115795335B (en) 2023-07-25

Family

ID=85429617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310053367.9A Active CN115795335B (en) 2023-02-02 2023-02-02 Logistics network point anomaly identification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115795335B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180114016A1 (en) * 2016-10-24 2018-04-26 Samsung Sds Co., Ltd. Method and apparatus for detecting anomaly based on behavior-analysis
CN109978070A (en) * 2019-04-03 2019-07-05 北京市天元网络技术股份有限公司 A kind of improved K-means rejecting outliers method and device
CN112446660A (en) * 2019-09-05 2021-03-05 顺丰科技有限公司 Network point clustering method, device, server and storage medium
CN113298162A (en) * 2021-05-30 2021-08-24 福建中锐网络股份有限公司 Bridge health monitoring method and system based on K-means algorithm
CN114330584A (en) * 2021-12-31 2022-04-12 北京明朝万达科技股份有限公司 Data clustering method and device, storage medium and electronic equipment
CN115496249A (en) * 2022-04-26 2022-12-20 国网山西省电力公司营销服务中心 Industrial adjustable load potential analysis method and system based on clustering algorithm
CN114926042A (en) * 2022-05-27 2022-08-19 上海东普信息科技有限公司 Network logistics monitoring method, device, equipment and storage medium
CN115454779A (en) * 2022-09-28 2022-12-09 建信金融科技有限责任公司 Cloud monitoring stream data detection method and device based on cluster analysis and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Yaling; Li Tao: "Research on the identification of abnormal wind power data with an improved K-means algorithm", Computer Era (计算机时代), no. 02 *

Also Published As

Publication number Publication date
CN115795335B (en) 2023-07-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant