CN111258874B - User operation track analysis method and device based on web data - Google Patents
User operation track analysis method and device based on web data
- Publication number
- CN111258874B CN201811453609.9A
- Authority
- CN
- China
- Prior art keywords
- user operation
- track
- default
- tracks
- service type
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/366—Software debugging using diagnostics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3692—Test management for test results analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the invention provides a method and device for analyzing user operation tracks based on web data. The method comprises: acquiring a user operation track in real time, wherein the user operation track at least comprises a service type; comparing the user operation track with all default tracks of the same service type according to a track model obtained in advance through a clustering algorithm, wherein the track model comprises at least one default track corresponding to each service type; and, if the user operation track is different from the default track, marking the user operation track as an abnormal track. By comparing the acquired user operation track with the default tracks of the corresponding service type and treating a differing track as abnormal, the embodiment allows user operation tracks to be analyzed accurately, more simply and more efficiently.
Description
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a user operation track analysis method and device based on web data.
Background
The popularity of cloud computing and container clouds has led to a large number of IT application systems being gradually deployed in virtualized, containerized environments. The continuous enrichment of business scenarios and the explosive growth of business volume pose great challenges to the maintainability of systems and applications. In the telecommunications industry in particular, operators build a very large number of application systems to provide various services to a wide range of consumers, and some system functions involve sub-functions of multiple service systems and require multi-system coordination to work properly. The complexity of such business systems is further compounded by the evolution of their architecture, which places higher demands on the analysis of user operation behavior.
To address these problems, the prior art mainly adopts the following schemes.
Scheme one: operation track arrangement based on manual sorting. A maintainer who wants to know the operation track of the whole business handling process needs the business operation process to be sorted out in the project stage and passed to subsequent maintainers in the form of an interaction manual. If a service is changed or newly added, updating the operation track relies on the subjective behavior of developers and maintainers. This method is suitable for small application systems with a low rate of change, where its effect is relatively stable.
Scheme two: operation track output based on code embedding. The information required by the operation track is output by code written in advance during the code development stage. After the system is put into the production environment, each operation of a user is output to a track analysis center, and the analysis center sorts out the operation path of each service through dimensions such as IP address, user identification, serial number and time. When new service code is subsequently developed, coding according to the reserved development specification ensures that the new services are also incorporated into the operation track center.
Scheme three: probe-based track acquisition. Method-level call records are collected from a deployed middleware by introducing a probe package into the middleware. Injection of the operation track code is completed automatically through a software development kit (SDK) that automatically embeds code and collects data. In this way, the developer may need to modify only a small amount of code, possibly a single line. The operation track analysis of the user is then completed through a subsequent mapping between the called methods and the business operations.
However, the prior art has serious defects. For scheme one, with the continuous expansion of the cluster size of various systems, simple manual sorting has become a difficult task, and the adoption of agile development further increases the amount of application code change and therefore the number of operation types and steps. The rapidly growing body of business operation knowledge cannot be sorted quickly and accurately, and the existing knowledge manual becomes more and more inaccurate. This approach is therefore not suitable for medium and large systems. In scheme two, a developer needs to design an operation track output scheme in advance for the whole project, but an existing production system is often built from several projects, with different vendors and different technical frameworks, and old, middle-aged and new generations of systems may coexist. As a result, the embedding retrofit cannot be completed in one pass, and if only part of the system outputs data, the effect is not significant. This approach is therefore difficult to popularize in practice. Scheme three acquires operation tracks through a middleware probe and leaves the code basically unchanged, but in terms of system stability, rapid deployment and the extensibility of data acquisition the existing technology still falls short of actual requirements and cannot yet meet actual production needs. In summary, the prior art is too complex and inefficient in terms of data analysis capability.
Disclosure of Invention
The embodiment of the invention provides a user operation track analysis method and device based on web data, which are used to solve the problem that the prior art is too complex and inefficient in terms of data analysis capability.
In a first aspect, an embodiment of the present invention provides a method for analyzing a user operation track based on web data, including:
acquiring a user operation track in real time, wherein the user operation track at least comprises a service type;
comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type;
and if the user operation track is different from the default track, marking the user operation track as an abnormal track.
In a second aspect, an embodiment of the present invention provides a user operation trajectory analysis device for web data, including:
the flow acquisition unit is used for acquiring a user operation track in real time, wherein the user operation track at least comprises a service type;
the track analysis unit is used for comparing the user operation track with all default tracks with the same service type according to a track model which is obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type;
and the cross recognition unit is used for marking the user operation track as an abnormal track if the user operation track is different from the default track.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
a processor, a memory, a communication interface, and a communication bus; wherein,
the processor, the memory and the communication interface complete communication with each other through the communication bus;
the communication interface is used for information transmission between communication devices of the electronic device;
the memory stores computer program instructions executable by the processor, the processor invoking the program instructions capable of performing the method of:
acquiring a user operation track in real time, wherein the user operation track at least comprises a service type;
comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type;
and if the user operation track is different from the default track, marking the user operation track as an abnormal track.
In a fourth aspect, embodiments of the present invention also provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the following method:
acquiring a user operation track in real time, wherein the user operation track at least comprises a service type;
comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type;
and if the user operation track is different from the default track, marking the user operation track as an abnormal track.
According to the web data-based user operation track analysis method and device, the collected user operation track is compared with the default track of the corresponding service type, if the collected user operation track is different from the default track, the user operation track is judged to be the abnormal track, and therefore accurate analysis can be performed on the user operation track more simply and efficiently.
Drawings
FIG. 1 is a flowchart of a user operation track analysis method based on web data according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for analyzing user operation tracks based on web data according to an embodiment of the present invention;
FIG. 3 is a flowchart of a user operation track analysis method based on web data according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a user operation trajectory analysis device for web-based data according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another embodiment of a user operation trajectory analysis device for web data according to the present invention;
FIG. 6 is a schematic diagram of a user operation trajectory analysis device for web data according to another embodiment of the present invention;
FIG. 7 is a schematic diagram of the physical structure of an electronic device.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a flowchart of a method for analyzing a user operation track based on web data according to an embodiment of the present invention, as shown in fig. 1, where the method includes:
Step S01, acquiring a user operation track in real time, wherein the user operation track at least comprises a service type.
By collecting the web data in the network, the user operation track generated when a user transacts a service can be obtained. The user operation track comprises multidimensional data about the user's operations, specifically at least the service type, job number, address, time, duration and the like.
In the acquisition process, traditional network mirroring collects traffic through a switch mirror port or an optical splitter, which is not flexible enough for today's agile releases, cloud deployments and container deployments. The embodiment of the invention adopts a fused acquisition technology that not only supports traditional physical switches but also supports flexible deployment on virtual machines and containers, and can therefore fully support current technical architectures.
1) For a system deployed on a physical machine, a traffic mirror is introduced from the existing switch, and a collection program deployed on a physical collector collects the traffic data and outputs it to a traffic aggregation machine for subsequent analysis.
2) For a system deployed on virtual machines such as VMware, a virtual machine traffic collection program is deployed in the virtual machine cluster through the mirroring scheme of a virtual switch (vSwitch), so that VMware network packets can be unpacked automatically and the unpacked traffic data is output to the traffic aggregation machine for subsequent analysis.
3) For a system deployed with container technology, where the traffic of an individual container cannot be determined because containers are deployed dynamically, the traffic of the network load layer is collected and output to the traffic aggregation machine for analysis.
For the collected web data, a preset automatic service information processing function extracts the job number, IP, service type, address, time and duration from the web data and returns the result; after customized data conversion, the processed data is output as the user operation track in a specified format.
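As an illustration of this extraction step, the following minimal Python sketch turns raw records into a user operation track. The field names, the dictionary input format, the ISO timestamp format and the use of the address as the operation identifier are all assumptions made for the example; the embodiment does not specify the output format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class OperationRecord:
    """One user operation extracted from collected web data (hypothetical field names)."""
    job_number: str
    ip: str
    service_type: str
    address: str        # assumed here to identify the operation, e.g. a URL path
    time: datetime
    duration_ms: int


def parse_record(raw: dict) -> OperationRecord:
    """Convert one raw key/value record into the assumed output format."""
    return OperationRecord(
        job_number=str(raw["job_number"]),
        ip=str(raw["ip"]),
        service_type=str(raw["service_type"]),
        address=str(raw["address"]),
        time=datetime.fromisoformat(raw["time"]),
        duration_ms=int(raw["duration_ms"]),
    )


def build_track(records):
    """Order the records of one service transaction by time.

    Returns (service_type, list of operations); assumes a non-empty list of
    records that all belong to the same transaction.
    """
    ordered = sorted(records, key=lambda r: r.time)
    return ordered[0].service_type, [r.address for r in ordered]
```

Grouping records into transactions (for example by job number and serial number) is left out of the sketch, since the source does not describe how it is done.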
Step S02, comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model includes at least one default trajectory corresponding to each traffic type.
The default track corresponding to each service type is obtained in advance and is equivalent to the typical user operation track when a user handles that service. The default tracks may be preset directly, at least one per service type, according to the classification of service types and the settings in the overall service center; alternatively, they may be derived from actual historical data through a preset algorithm such as a clustering algorithm, which amounts to compiling statistics on user operation tracks during actual use. The track model comprises at least one default track corresponding to each service type. For example, the default track for the package change service is operations a1, a2, a3, a4, a5, and the default track for the subscription service is operations b1, b2, b3, b4.
Step S03, if the user operation track is different from the default track, marking the user operation track as an abnormal track.
After the comparison, if the user operation track is the same as a default track, no subsequent operation is performed. If it is different, the user operation track is marked as an abnormal track, and an alarm may be raised or the track may simply be recorded. For example, if the user operation track during a package change is operations a1, a2, a3, a6, a7, a5, it differs from the default track of operations a1, a2, a3, a4, a5, so the user operation track is recorded as an abnormal track.
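Steps S02 and S03 can be sketched as a simple lookup and comparison. The sketch below assumes the track model is a dictionary mapping each service type to a list of default tracks, and that "different" means an exact mismatch of the operation sequence against every default track of that service type; both are assumptions, as is treating an unknown service type as abnormal.

```python
def is_abnormal(service_type, operations, track_model):
    """Return True if the operations match none of the default tracks of the service type."""
    defaults = track_model.get(service_type, [])
    return list(operations) not in [list(d) for d in defaults]


# Example track model using the operations named in the text.
track_model = {
    "change_package": [["a1", "a2", "a3", "a4", "a5"]],
    "subscription": [["b1", "b2", "b3", "b4"]],
}

observed = ["a1", "a2", "a3", "a6", "a7", "a5"]
if is_abnormal("change_package", observed, track_model):
    print("abnormal track:", observed)
```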
According to the embodiment of the invention, the acquired user operation track is compared with the default track of the corresponding service type, and if the acquired user operation track is different from the default track, the user operation track is judged to be the abnormal track, so that the user operation track can be accurately analyzed more simply and efficiently.
FIG. 2 is a flowchart of another method for analyzing user operation tracks based on web data according to an embodiment of the present invention, as shown in FIG. 2, the method further includes:
Step S10, periodically acquiring all user operation tracks within a preset historical time range.
In order to obtain the default track corresponding to each service type, all user operation tracks obtained from web data within a preset historical time range, for example the half year or the year preceding the current time, need to be acquired in advance.
Step S11, applying a clustering algorithm to all the user operation tracks to obtain at least one cluster.
A clustering algorithm is used to classify and aggregate all the user operation tracks according to different dimensional conditions, so that at least one cluster is obtained according to the specific settings of the clustering algorithm.
Further, the clustering algorithm is a K-Means clustering algorithm.
There are many clustering algorithms, such as the K-Means clustering algorithm, the K-Medians clustering algorithm, the mean-shift clustering algorithm and agglomerative hierarchical clustering algorithms. The K-Means clustering algorithm is used here merely as an example.
The K-Means algorithm partitions similar user operation tracks according to a preset K value and an initial centroid for each category, and obtains an optimal clustering result by iteratively optimizing the mean of each partition. The sum of the squared error (SSE) is used as the objective function of the clustering: of two different clusterings generated by running K-Means twice, the one with the smaller SSE has the higher similarity within its clusters. The clustering with the minimum SSE is therefore taken as the final result. The K value may be set according to the number of service types. After the final clustering is obtained, it is verified and adjusted as needed.
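The procedure described above (run K-Means several times and keep the clustering with the smallest SSE) can be sketched in plain Python as follows. The sketch assumes each user operation track has already been encoded as a numeric feature vector of fixed length; the source does not specify this encoding, and the function names, the number of runs and the random initialization from the data points are illustrative choices.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, max_iter=100):
    """Run K-Means once and return (centroids, labels, sse)."""
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


def best_kmeans(points, k, runs=10, seed=0):
    """Run K-Means several times and keep the run with the smallest SSE."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(runs)), key=lambda r: r[2])
```

Here `points` is expected to be a list of tuples. Repeating the run with different initial centroids is what makes the SSE comparison meaningful, since a single K-Means run may stop at a local optimum.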
Step S12, analyzing the user operation tracks contained in each cluster to obtain the track model.
By analyzing the user operation tracks contained in each cluster, the main user operation track of each cluster can be obtained; alternatively, the user operation track corresponding to the cluster center can be regarded as the main one. This track is used as the default track of the corresponding service type, and the track model is obtained by compiling these default tracks.
The track model can be rebuilt periodically according to actual needs, for example every month or every half year, by adding the new user operation tracks obtained in that period to the historical data and thereby obtaining a new track model.
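One possible reading of step S12 is sketched below: the "main" user operation track of a cluster is taken to be its most frequent track, and it becomes a default track of its service type. Taking the most frequent track is an assumption (the text also allows using the track at the cluster center), and the input layout, a list of clusters each holding (service type, operations) pairs, is hypothetical.

```python
from collections import Counter


def build_track_model(clusters):
    """Build {service_type: [default tracks]} from clustered user operation tracks.

    clusters: iterable of lists of (service_type, operations) pairs.
    """
    model = {}
    for cluster in clusters:
        if not cluster:
            continue
        # The most frequent (service_type, operations) pair is treated as the main track.
        counts = Counter((st, tuple(ops)) for st, ops in cluster)
        (service_type, ops), _ = counts.most_common(1)[0]
        defaults = model.setdefault(service_type, [])
        if list(ops) not in defaults:
            defaults.append(list(ops))
    return model
```

Rebuilding the model periodically then amounts to calling this function again on clusters computed from the extended historical data.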
According to the embodiment of the invention, the track model is obtained by analyzing all the user operation tracks in the historical time range through the clustering algorithm, the user operation tracks acquired in real time are compared with the default tracks of the corresponding service types, and if the user operation tracks are different, the user operation tracks are judged to be abnormal tracks, so that the user operation tracks can be analyzed more simply and efficiently.
FIG. 3 is a flowchart of another method for analyzing user operation tracks based on web data according to an embodiment of the present invention, as shown in FIG. 3, the method further includes:
Step S20, counting each abnormal track.
All the abnormal tracks obtained are tallied by service type, so that the kinds of abnormal track produced under each service type are identified and each kind is counted. For example, if operations a1, a2, a3, a6, a7, a5 and operations a1, a2, a3, a8, a5 both occur while users change packages, the package change service is determined to have two kinds of abnormal track, and the count of the corresponding kind is incremented each time an abnormal track is received.
Step S21, if the count exceeds a preset count threshold, sending out early warning information.
A count threshold is preset. If the count of an abnormal track exceeds the count threshold, or exceeds it within a preset time range, corresponding early warning information is sent to indicate that the default track of the corresponding service type may have changed or that a new default track has appeared.
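Steps S20 and S21 can be sketched as a per-track counter with an optional sliding time window. The class name, the use of numeric timestamps in seconds and the interpretation of "exceeds" as strictly greater than the threshold are assumptions made for this example.

```python
from collections import defaultdict, deque


class AbnormalTrackCounter:
    """Counts each kind of abnormal track per service type and signals early warnings."""

    def __init__(self, threshold, window_seconds=None):
        self.threshold = threshold
        self.window_seconds = window_seconds  # None means count over all time
        self._events = defaultdict(deque)     # (service_type, operations) -> timestamps

    def record(self, service_type, operations, timestamp):
        """Count one abnormal track; return True if early warning should be sent."""
        events = self._events[(service_type, tuple(operations))]
        events.append(timestamp)
        if self.window_seconds is not None:
            # Drop occurrences that fell out of the preset time range.
            while events and timestamp - events[0] > self.window_seconds:
                events.popleft()
        return len(events) > self.threshold


counter = AbnormalTrackCounter(threshold=2, window_seconds=60)
track = ["a1", "a2", "a3", "a6", "a7", "a5"]
for t in (0, 10, 20):
    if counter.record("change_package", track, t):
        print("early warning: default track may have changed for change_package")
```

In the usage example the warning is printed on the third occurrence, because the count within the 60-second window then exceeds the threshold of 2.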
According to the embodiment of the invention, through counting the abnormal tracks of each service type, if the count of one abnormal track exceeds the preset count threshold, the early warning information is sent out, so that accurate analysis of the user operation track is facilitated more simply and efficiently.
Fig. 4 is a schematic structural diagram of a user operation track analysis device for web data according to an embodiment of the present invention, as shown in fig. 4, the device includes: a flow acquisition unit 10, a trajectory analysis unit 11 and a cross recognition unit 12, wherein,
the flow acquisition unit 10 is configured to acquire a user operation track in real time, where the user operation track at least includes a service type; the track analysis unit 11 is configured to compare the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type; the cross recognition unit 12 is configured to mark the user operation track as an abnormal track if the user operation track is different from a default track. Specifically:
the flow collection unit 10 may obtain a user operation track when the user transacts the service by collecting Web data in the network, where the user operation track includes multidimensional data of the user operation, and specifically includes at least a service type, a job number, an address, a time, a duration, and the like.
The trajectory analysis unit 11 obtains in advance the default track corresponding to each service type, which is equivalent to the typical user operation track when a user handles that service. The default tracks may be preset directly, at least one per service type, according to the classification of service types and the settings in the overall service center; alternatively, the track model may be obtained by applying a preset algorithm to actual historical data, that is, by compiling statistics on user operation tracks during actual use. The track model comprises at least one default track corresponding to each service type.
After the comparison, if the user operation track is the same as a default track, no subsequent operation is performed. If it is different, the cross recognition unit 12 marks the user operation track as an abnormal track, and an alarm may be raised or the track may simply be recorded.
The device provided in the embodiment of the present invention is used for executing the above method, and the function of the device specifically refers to the above method embodiment, and the specific method flow is not repeated herein.
In the embodiment of the invention, the user operation track acquired by the flow acquisition unit 10 is compared with the default track of the corresponding service type in the track analysis unit 11, and if the user operation track is different from the default track, the cross recognition unit 12 judges that the user operation track is an abnormal track, so that the user operation track can be accurately analyzed more simply and efficiently.
FIG. 5 is a schematic structural diagram of another apparatus for analyzing user operation tracks based on web data according to an embodiment of the present invention, as shown in FIG. 5, the apparatus includes: a flow acquisition unit 10, a trajectory analysis unit 11, a cross recognition unit 12, a data warehouse unit 13, an association calculation unit 14 and a modeling unit 15, wherein,
the data warehouse unit 13 is configured to periodically acquire all user operation tracks within a preset historical time range; the association calculation unit 14 is configured to obtain at least one cluster by using a clustering algorithm on all user operation tracks; the modeling unit 15 is configured to analyze the user operation tracks included in each cluster, to obtain the track model.
In order to acquire the default track corresponding to each service type, the data warehouse unit 13 is required to acquire all user operation tracks obtained from the web data in advance within a preset history time range, for example, within the first half year of the current time, or 1 year, etc.
The association calculation unit 14 uses a clustering algorithm to classify and aggregate all the user operation tracks in the data warehouse unit 13 according to different dimensional conditions, so that at least one cluster is obtained according to the specific settings of the clustering algorithm.
Further, the clustering algorithm is a K-Means clustering algorithm.
There are many clustering algorithms, such as the K-Means clustering algorithm, the K-Medians clustering algorithm, the mean-shift clustering algorithm and agglomerative hierarchical clustering algorithms. The K-Means clustering algorithm is used here merely as an example.
The K-Means algorithm partitions similar user operation tracks according to a preset K value and an initial centroid for each category, and obtains an optimal clustering result by iteratively optimizing the mean of each partition. The sum of the squared error (SSE) is used as the objective function of the clustering: of two different clusterings generated by running K-Means twice, the one with the smaller SSE has the higher similarity within its clusters. The clustering with the minimum SSE is therefore taken as the final result. The K value may be set according to the number of service types. After the final clustering is obtained, it is verified and adjusted as needed.
The modeling unit 15 may obtain the main user operation track of each cluster by analyzing the user operation tracks contained in each cluster obtained by the association calculation unit 14; alternatively, the user operation track corresponding to the cluster center may be regarded as the main one. This track is used as the default track of the corresponding service type. The track model obtained by compiling these default tracks is sent to the trajectory analysis unit 11.
The track model may be rebuilt periodically according to actual needs: the flow acquisition unit 10 adds the new user operation tracks obtained in that period to the data warehouse unit 13, and a new track model is obtained.
The device provided in the embodiment of the present invention is used for executing the above method, and the function of the device specifically refers to the above method embodiment, and the specific method flow is not repeated herein.
According to the embodiment of the invention, the association calculation unit 14 analyzes all user operation tracks in the historical time range in the data warehouse unit 13 through a clustering algorithm, the modeling unit 15 obtains the track model, then the user operation tracks acquired by the flow acquisition unit 10 in real time are compared with default tracks of corresponding service types in the track analysis unit 11, if different, the intersection recognition unit 12 judges that the user operation tracks are abnormal tracks, and therefore, the user operation tracks can be analyzed more simply and efficiently.
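The real-time comparison step might look like the following minimal sketch. It assumes the track model is a mapping from service type to a set of default tracks, that "different" means an exact mismatch against every default track of that service type, and that a track with an unknown service type is treated as abnormal; none of these details are fixed by the text, and `is_abnormal` is a hypothetical name.

```python
def is_abnormal(track, service_type, track_model):
    """Return True if `track` matches none of the default tracks of its service type.

    `track_model` maps service_type -> set of default tracks (assumed layout).
    A service type with no default tracks yields True (an assumption).
    """
    defaults = track_model.get(service_type, set())
    return track not in defaults
```

A looser similarity measure, such as an edit distance with a tolerance, could replace the exact membership test if small deviations should not be flagged.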
FIG. 6 is a schematic structural diagram of a user operation track analysis device for web data according to another embodiment of the present invention. As shown in FIG. 6, the device includes:
a flow acquisition unit 10, a trajectory analysis unit 11, a cross recognition unit 12, a data warehouse unit 13, an association calculation unit 14, a modeling unit 15 and a quantization unit 16, wherein,
the quantization unit 16 is configured to count each abnormal trajectory; the quantization unit 16 is further configured to send out early warning information if the count exceeds a preset count threshold. Specifically:
the quantization unit 16 counts all the obtained abnormal trajectories according to different service types. Thus, the types of the abnormal tracks generated by each service type can be obtained, and each abnormal track is counted.
The quantization unit 16 presets a count threshold. If the count of any abnormal track exceeds the count threshold, or exceeds the preset count threshold within a preset time range, corresponding early warning information may be sent to indicate that the default track of the corresponding service type may have changed or that a new default track has appeared.
The device provided in this embodiment of the present invention is used to execute the above method. For its specific functions, refer to the above method embodiment; the specific method flow is not repeated here.
In this embodiment of the invention, the quantization unit 16 counts the abnormal tracks of each service type and sends out early warning information if the count of any abnormal track exceeds the preset count threshold, which helps analyze user operation tracks accurately, more simply and efficiently.
FIG. 7 illustrates the physical structure of an electronic device. As shown in FIG. 7, the electronic device may include: a processor 810, a communication interface 820, a memory 830, and a communication bus 840, wherein the processor 810, the communication interface 820, and the memory 830 communicate with each other through the communication bus 840. The processor 810 may call logic instructions in the memory 830 to perform the following method: acquiring a user operation track in real time, wherein the user operation track at least comprises a service type; comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type; and if the user operation track is different from the default track, marking the user operation track as an abnormal track.
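The counting and early-warning rule can be sketched as below. The class name `AbnormalTrackCounter`, the use of numeric timestamps in seconds, and the sliding-window interpretation of "within a preset time range" are all assumptions made for illustration.

```python
from collections import defaultdict, deque

class AbnormalTrackCounter:
    """Count abnormal tracks per (service type, track) and signal when a threshold is exceeded."""

    def __init__(self, threshold, window_seconds=None):
        self.threshold = threshold
        # None means the count is cumulative; otherwise only events inside the
        # sliding window are counted (an assumed reading of "preset time range").
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)

    def record(self, service_type, track, timestamp):
        """Record one abnormal track; return True if a warning should be sent."""
        events = self._events[(service_type, track)]
        events.append(timestamp)
        if self.window_seconds is not None:
            # Drop events that have fallen out of the time window.
            while events and timestamp - events[0] > self.window_seconds:
                events.popleft()
        return len(events) > self.threshold
```

A caller would send the early warning whenever `record` returns `True`, for example to prompt a review of whether the default track of that service type has changed.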
Further, embodiments of the present invention disclose a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above-described method embodiments, for example comprising: acquiring a user operation track in real time, wherein the user operation track at least comprises a service type; comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type; and if the user operation track is different from the default track, marking the user operation track as an abnormal track.
Further, embodiments of the present invention provide a non-transitory computer readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above-described method embodiments, for example, including: acquiring a user operation track in real time, wherein the user operation track at least comprises a service type; comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type; and if the user operation track is different from the default track, marking the user operation track as an abnormal track.
Further, the logic instructions in the memory 830 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above-described embodiments of the electronic device and the like are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A web data-based user operation trajectory analysis method, comprising:
acquiring a user operation track in real time, wherein the user operation track at least comprises a service type;
comparing the user operation track with all default tracks with the same service type according to a track model obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type;
if the user operation track is different from the default track, marking the user operation track as an abnormal track;
classifying and aggregating all user operation tracks according to different dimension conditions by adopting a clustering algorithm, and obtaining a clustering according to the specific settings of the clustering algorithm, wherein the clustering comprises at least one cluster;
the clustering algorithm is a K-Means clustering algorithm, the K-Means algorithm partitions similar user operation tracks using a preset K value and an initial centroid of each category, and an optimal clustering result is obtained through iterative optimization of the partition means;
using the sum of squared errors as the objective function of clustering, wherein, of two different clusterings generated by running K-Means twice, the one with the smaller sum of squared errors has the higher similarity, so that the clustering with the minimum sum of squared errors is the final result; the K value is set according to the number of service types, and after the final clustering is obtained, verification is performed and adjustment is made as needed;
and presetting a counting threshold, and if the counting of one abnormal track exceeds the counting threshold or exceeds the preset counting threshold within a preset time range, sending corresponding early warning information to inform that the default track of the corresponding service type changes or a new default track appears.
2. The method according to claim 1, wherein the method further comprises:
all user operation tracks in a preset historical time range are obtained regularly;
obtaining at least one cluster from all user operation tracks by adopting a clustering algorithm;
and analyzing the user operation tracks contained in each cluster respectively to obtain the track model.
3. The method according to claim 1, wherein the method further comprises:
counting each abnormal track;
and if the count exceeds a preset count threshold, sending out early warning information.
4. A user operation trajectory analysis device for web data, comprising:
the flow acquisition unit is used for acquiring a user operation track in real time, wherein the user operation track at least comprises a service type;
the track analysis unit is used for comparing the user operation track with all default tracks with the same service type according to a track model which is obtained in advance through a clustering algorithm; wherein the trajectory model comprises at least one default trajectory corresponding to each service type;
the cross recognition unit is used for marking the user operation track as an abnormal track if the user operation track is different from a default track;
classifying and aggregating all user operation tracks according to different dimension conditions by adopting a clustering algorithm, and obtaining a clustering according to the specific settings of the clustering algorithm, wherein the clustering comprises at least one cluster;
the clustering algorithm is a K-Means clustering algorithm, the K-Means algorithm partitions similar user operation tracks using a preset K value and an initial centroid of each category, and an optimal clustering result is obtained through iterative optimization of the partition means;
using the sum of squared errors as the objective function of clustering, wherein, of two different clusterings generated by running K-Means twice, the one with the smaller sum of squared errors has the higher similarity, so that the clustering with the minimum sum of squared errors is the final result; the K value is set according to the number of service types, and after the final clustering is obtained, verification is performed and adjustment is made as needed;
and presetting a counting threshold, and if the counting of one abnormal track exceeds the counting threshold or exceeds the preset counting threshold within a preset time range, sending corresponding early warning information to inform that the default track of the corresponding service type changes or a new default track appears.
5. The apparatus of claim 4, wherein the apparatus further comprises:
the data warehouse unit is used for periodically acquiring all user operation tracks in a preset historical time range;
the association calculation unit is used for obtaining at least one cluster by adopting a clustering algorithm on all user operation tracks;
and the modeling unit is used for respectively analyzing the user operation tracks contained in each cluster to obtain the track model.
6. The apparatus of claim 4, wherein the apparatus further comprises:
a quantization unit for counting each abnormal track;
and the quantization unit is also used for sending out early warning information if the count exceeds a preset count threshold.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the user operation trajectory analysis method according to any one of claims 1 to 3 when executing the program.
8. A non-transitory computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor implements the steps of the user operation trajectory analysis method according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811453609.9A CN111258874B (en) | 2018-11-30 | 2018-11-30 | User operation track analysis method and device based on web data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111258874A (en) | 2020-06-09 |
CN111258874B (en) | 2023-09-05 |
Family
ID=70948489
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811453609.9A Active CN111258874B (en) | 2018-11-30 | 2018-11-30 | User operation track analysis method and device based on web data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111258874B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111782876A (en) * | 2020-06-30 | 2020-10-16 | 杭州海康机器人技术有限公司 | Data processing method, device and system and storage medium |
CN112667277B (en) * | 2020-12-25 | 2023-07-25 | 中国平安人寿保险股份有限公司 | Information pushing method and device based on small program and computer equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102946317A (en) * | 2012-08-07 | 2013-02-27 | 甘利俭 | User behavior analysis system |
KR20140073345A (en) * | 2012-12-06 | 2014-06-16 | 한국과학기술원 | Method for task list recommanation associated with user interation and mobile device using the same |
CN107306252A (en) * | 2016-04-21 | 2017-10-31 | 中国移动通信集团河北有限公司 | A kind of data analysing method and system |
CN107426177A (en) * | 2017-06-13 | 2017-12-01 | 努比亚技术有限公司 | A kind of user behavior clustering method and terminal, computer-readable recording medium |
CN108512806A (en) * | 2017-02-24 | 2018-09-07 | 中国移动通信集团公司 | A kind of operation behavior analysis method and server based on virtual environment |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8321251B2 (en) * | 2010-03-04 | 2012-11-27 | Accenture Global Services Limited | Evolutionary process system |
US8855361B2 (en) * | 2010-12-30 | 2014-10-07 | Pelco, Inc. | Scene activity analysis using statistical and semantic features learnt from object trajectory data |
US8660368B2 (en) * | 2011-03-16 | 2014-02-25 | International Business Machines Corporation | Anomalous pattern discovery |
US10423892B2 (en) * | 2016-04-05 | 2019-09-24 | Omni Ai, Inc. | Trajectory cluster model for learning trajectory patterns in video data |
Non-Patent Citations (1)
Title |
---|
成静 ; 朱怡安 ; 张涛 ; 杨艳丽 ; .一种基于操作轨迹模型的移动应用易用性评估方法.西北工业大学学报.2016,(04),全文. * |
Also Published As
Publication number | Publication date |
---|---|
CN111258874A (en) | 2020-06-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||