CN108197142B - Method, device, storage medium and equipment for determining relevance of network transaction - Google Patents
- Publication number
- CN108197142B (application CN201711195221.9A)
- Authority
- CN
- China
- Prior art keywords
- transaction
- name
- hash value
- list
- word vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2255—Hash tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1865—Transactional file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The disclosure relates to a method, a device, a storage medium and equipment for determining network transaction relevance. The method comprises: obtaining a first transaction list of a target log; determining, through a preset document word vector model, a word vector corresponding to the name of each transaction in the first transaction list; when a first transaction to be viewed is determined, obtaining a first relation tree according to the word vectors and a preset relation tree creation rule; and determining the association degree between the first transaction and other transactions according to the first relation tree. Operation and maintenance personnel can thus conveniently analyze the relevance between any transaction and other transactions, so that system settings can be better optimized according to that relevance when the system is configured, improving system performance.
Description
Technical Field
The present disclosure relates to the field of network technologies, and in particular, to a method, an apparatus, a storage medium, and a device for determining a network transaction relevance.
Background
With the development of network technology, enterprise network systems have become increasingly elaborate and process more and more transactions, so the relationships among a large number of transactions can no longer be recorded and analyzed by relying on developers' documents alone. In the prior art, for business transactions in a complex enterprise network system, system developers cannot clearly and accurately determine the association degree between transactions, and the associations between business transactions conceived during early development may differ from those observed after the enterprise network system goes online. If the association between transactions in the actually running system can be known, however, the cache optimization strategy can be adjusted and system performance improved.
Therefore, how to obtain the relationship between network transactions in the log by analyzing a large number of logs is an urgent problem to be solved.
Disclosure of Invention
The invention aims to provide a method, a device, a storage medium and equipment for determining the relevance of network transactions, which are used for overcoming the problem that the relationship among the network transactions cannot be obtained by analyzing a large number of logs in a manual mode.
In order to achieve the above object, the present disclosure provides a method for determining network transaction relevance, the method including:
acquiring a first transaction list of a target log, wherein the first transaction list comprises a name hash value of each transaction in the target log and a timestamp corresponding to each transaction;
determining a word vector corresponding to the name of each transaction in the first transaction list through a preset document word vector model;
when a first transaction needing to be viewed is determined, acquiring a first relation tree according to the word vector and a preset relation tree creating rule, wherein the first relation tree comprises the first transaction and other transactions related to the first transaction;
and determining the association degree of the first transaction and the other transactions according to the first relation tree.
Optionally, the obtaining the first transaction list of the target log includes:
extracting the name of each transaction and the timestamp corresponding to each transaction from the target log;
sorting all the transactions according to the timestamp corresponding to each transaction to obtain a first ordering;
replacing the name of each transaction with the name hash value of each transaction by using a preset hash algorithm;
and generating the first transaction list comprising the name hash value of each transaction and the timestamp corresponding to each transaction according to the first ordering.
Optionally, replacing the name of each transaction with the name hash value of each transaction by using a preset hash algorithm includes:
acquiring an integer hash value of the name of each transaction by using a preset hash value calculation formula according to the name of each transaction;
converting the integer hash value of the name of each transaction into a hexadecimal hash value;
adding the character 'A' to each digit of the hexadecimal hash value of each transaction to obtain the name hash value of each transaction;
replacing the name of each transaction with the name hash value of each transaction;
wherein the hash value calculation formula includes:
wherein HV(i) is the integer hash value of the name of the i-th transaction in the first ordering; n represents the total number of characters in the name of the i-th transaction; t represents the t-th character in the name of the i-th transaction; and s[t] represents the Unicode code point of the t-th character.
Optionally, the preset document word vector model is determined by training using a continuous bag-of-words model and a history log.
Optionally, the relationship tree creation rule includes a number of child nodes of the relationship tree, a depth and a transaction non-repetition principle; when the first transaction needing to be viewed is determined, the obtaining of the first relation tree according to the word vector and a preset relation tree creating rule comprises the following steps:
determining the first transaction as a root node of the first relationship tree;
determining the number and depth of child nodes of the first relation tree according to the relation tree creating rule;
and establishing the first relation tree by utilizing depth-first search according to the root node, the number of the child nodes, the depth and the word vector under the transaction non-repetition principle.
In a second aspect of the embodiments of the present disclosure, an apparatus for determining network transaction relevance is provided, where the apparatus includes:
a list acquisition module, configured to acquire a first transaction list of a target log, wherein the first transaction list comprises a name hash value of each transaction in the target log and a timestamp corresponding to each transaction;
the word vector acquisition module is used for determining a word vector corresponding to the name of each transaction in the first transaction list through a preset document word vector model;
the relation tree determining module is used for acquiring a first relation tree according to the word vector and a preset relation tree creating rule when a first transaction needing to be checked is determined, wherein the first relation tree comprises the first transaction and other transactions related to the first transaction;
and the association degree determining module is used for determining the association degree of the first transaction and the other transactions according to the first relation tree.
Optionally, the list obtaining module includes:
the extraction submodule is used for extracting the name of each transaction and the timestamp corresponding to each transaction from the target log;
the ordering submodule is used for ordering all the transactions in the first transaction list according to the timestamp corresponding to each transaction to obtain a first ordering;
the calculation submodule is used for replacing the name of each transaction with the name hash value of each transaction by using a preset hash algorithm;
and the list generation submodule is used for generating the first transaction list comprising the name hash value of each transaction and the timestamp corresponding to each transaction according to the first ordering.
Optionally, the calculation sub-module includes:
the first obtaining submodule is used for obtaining an integer hash value of the name of each transaction by utilizing a preset hash value calculation formula according to the name of each transaction;
a conversion submodule, configured to convert the integer hash value of the name of each transaction into a hexadecimal hash value;
the second obtaining submodule is used for adding the character 'A' to each digit of the hexadecimal hash value of each transaction to obtain the name hash value of each transaction;
a replacing submodule, configured to replace the name of each transaction with the name hash value of each transaction;
wherein the hash value calculation formula includes:
wherein HV(i) is the integer hash value of the name of the i-th transaction in the first ordering; n represents the total number of characters in the name of the i-th transaction; t represents the t-th character in the name of the i-th transaction; and s[t] represents the Unicode code point of the t-th character.
Optionally, the preset document word vector model is determined by training using a continuous bag-of-words model and a history log.
Optionally, the relationship tree creation rule includes a number of child nodes of the relationship tree, a depth and a transaction non-repetition principle; the relationship tree determination module includes:
a first determining submodule, configured to determine the first transaction as a root node of the first relationship tree;
the second determining submodule is used for determining the number and the depth of the child nodes of the first relation tree according to the relation tree creating rule;
and the relation tree establishing submodule is used for establishing the first relation tree by utilizing depth-first search according to the root node, the number of the child nodes, the depth and the word vector under the transaction non-repetition principle.
In a third aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, implements the steps of the method of any one of the first aspect.
In a fourth aspect of the embodiments of the present disclosure, an electronic device is provided, including:
the computer-readable storage medium of the third aspect; and
one or more processors to execute the computer program in the computer-readable storage medium.
According to the method, the device, the storage medium and the equipment for determining the relevance of the network transaction, a first transaction list of a target log is obtained, wherein the first transaction list comprises a name hash value of each transaction in the target log and a timestamp corresponding to each transaction; determining a word vector corresponding to the name of each transaction in the first transaction list through a preset document word vector model; when a first transaction needing to be viewed is determined, acquiring a first relation tree according to the word vector and a preset relation tree creating rule, wherein the first relation tree comprises the first transaction and other transactions related to the first transaction; and determining the association degree of the first transaction and the other transactions according to the first relation tree. Therefore, the relevance among the network transactions can be analyzed in the system log of the network system with a large amount of data, and the relationship tree is established, so that the transaction related to the abnormal transaction can be determined through the relationship tree after the abnormal transaction occurs in a certain network transaction, thereby quickly and conveniently solving the abnormal transaction and optimizing the system setting.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flow diagram illustrating a method for network transaction affinity determination in accordance with an exemplary embodiment;
FIG. 2 is a schematic structural diagram of a CBOW model;
FIG. 3 is a flow diagram illustrating another method of network transaction affinity determination in accordance with an exemplary embodiment;
FIG. 4 is a flow diagram illustrating yet another method of network transaction affinity determination in accordance with an illustrative embodiment;
FIG. 5 is a flow diagram illustrating yet another method of network transaction affinity determination in accordance with an illustrative embodiment;
FIG. 6 is a diagram illustrating a first relationship tree structure, according to an example embodiment;
FIG. 7 is a block diagram illustrating a network transaction association determination apparatus in accordance with an exemplary embodiment;
FIG. 8 is a block diagram illustrating a list acquisition module in accordance with an exemplary embodiment;
FIG. 9 is a block diagram illustrating a computation submodule in accordance with an exemplary embodiment;
FIG. 10 is a block diagram illustrating a relationship tree determination module in accordance with an exemplary embodiment;
FIG. 11 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
Before the embodiments of the present disclosure are described, the terms and the application scenario involved in the present disclosure are explained. The terms are described first:
A hash value is the fixed-length output obtained by transforming an input of arbitrary length through a hash algorithm. This conversion is a compression mapping: the space occupied by the hash value is usually much smaller than the space occupied by the input, and different inputs may hash to the same output, so the input value cannot be uniquely determined from the hash value.
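The fixed-length property can be illustrated with a short sketch. SHA-256 from the Python standard library is used here purely as a stand-in; it is an assumption for illustration and is not the hash algorithm of this disclosure, and the function name `fixed_length_digest` is hypothetical.

```python
import hashlib


def fixed_length_digest(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Inputs of very different sizes map to outputs of the same length.
short_digest = fixed_length_digest("a")
long_digest = fixed_length_digest("a" * 10_000)
print(len(short_digest), len(long_digest))
```

Because the output space is finite while the input space is not, collisions are unavoidable in principle, which is why the input cannot be recovered uniquely from the hash value.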
The document word vector model (Word to Vector model) constructs a multilayer neural network, obtains the corresponding inputs and outputs from a given text, continuously corrects the parameters of the neural network during training, and finally yields word vectors. The most common such model is the continuous bag-of-words model employed in this disclosure.
The Continuous Bag of Words Model (CBOW Model for short) is a Model for predicting a current word by using a context word, and when the Model is used in a network system, for a network transaction record recorded in a log file in the network system, it predicts what transaction the target transaction is most likely to be according to network transactions recorded in the log before and after the target transaction.
Huffman coding is a variable-length coding method, proposed by Huffman in 1952, which assigns each character a unique code according to its probability of appearing in the document to be coded and ensures that the average code length is the shortest. To construct a Huffman tree, also known as an optimal tree, n weights are assigned to n leaf nodes of a binary tree; the binary tree whose weighted path length is minimal is called the optimal binary tree, i.e. the Huffman tree. The Huffman tree is thus the tree with the shortest weighted path length, in which nodes with larger weights are closer to the root. A numerical value with a certain meaning assigned to a node in the tree is called the weight of that node.
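A minimal sketch of Huffman code construction follows, using a priority queue from the Python standard library. The function name `huffman_codes` and the sample weights are hypothetical; the sketch only illustrates the general technique described above.

```python
import heapq
from itertools import count


def huffman_codes(weights):
    """Build a prefix-free code: heavier symbols receive shorter codes."""
    if not weights:
        return {}
    if len(weights) == 1:
        return {next(iter(weights)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(w, next(tie), {symbol: ""}) for symbol, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


codes = huffman_codes({"a": 5, "b": 2, "c": 1, "d": 1})
print(codes)
```

The symbol with the largest weight ends up closest to the root and therefore gets the shortest code.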
Depth-First Search (DFS) is a search algorithm that traverses the nodes of a tree along its depth, exploring each branch as deeply as possible. For example, when crawling HTML (Hypertext Markup Language) files, selecting a hyperlink causes the linked HTML file to be searched depth-first: a single chain of links must be followed to its end before the remaining hyperlinks are searched. The search follows one hyperlink in an HTML file downward until it can go no further, then returns to the previous HTML file and continues with the other hyperlinks in that file. When no hyperlinks remain to be selected, the search ends. DFS is a graph algorithm; in brief, it goes as deep as possible along each possible branch path, and each node is visited only once.
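The hyperlink example can be sketched as follows. The page names and the `links` mapping are hypothetical illustration data; the function simply follows each link chain to its end before returning to the previous page, and visits each page once.

```python
def dfs_preorder(links, start):
    """Visit pages depth-first; each page is visited only once."""
    order = []
    seen = set()

    def visit(page):
        if page in seen:
            return
        seen.add(page)
        order.append(page)
        # Follow every hyperlink of this page to its end before moving on.
        for target in links.get(page, []):
            visit(target)

    visit(start)
    return order


links = {"index": ["a", "b"], "a": ["a1", "a2"], "b": ["index"]}
print(dfs_preorder(links, "index"))
```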
The application scenario of the present disclosure is described next. After a network system goes online, the actual relevance between transactions may differ from what was assumed during development. For example, for a group-buying application, the developers may assume that a user who opens a store page will most likely click through to the store details, and therefore set the store details as the default cache. In practice, a user viewing a store page may care more about other users' reviews of the store. As a result, the system as developed treats the store page request transaction as closely related to the store detail request transaction, whereas in the online system the store page request transaction is more strongly associated with the store review request transaction. The real transaction relevance of the system therefore needs to be analyzed in a timely manner, so that the system cache can be adjusted accordingly and the system optimized; that is, for the information of a store, the store reviews rather than the store details should be cached preferentially.
Therefore, according to the technical scheme provided by the embodiment of the disclosure, the relevance between each network transaction in the log file can be determined, so that when a certain transaction is abnormal or the relevance of a certain transaction is specially checked, the relevance between each transaction can be checked through the method, and then the transaction related to the transaction is determined.
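Fig. 1 is a flowchart illustrating a method for determining network transaction relevance according to an exemplary embodiment. As shown in Fig. 1, the method includes the following steps:
Step 110: acquiring a first transaction list of a target log, wherein the first transaction list comprises a name hash value of each transaction in the target log and a timestamp corresponding to each transaction.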
During the operation of a network service, transactions with a high association degree are likely to occur shortly before or after a given network transaction. For example, highly associated network transactions generally occur within a single user access: a user request for a news list and a user request for specific news content are highly related, because after requesting the news list the user will, with high probability, request the content of a particular news item. In other words, associated network transactions occur close together in time and with similar frequency, which shows up in the log file as records that are close in position and similar in frequency. These characteristics resemble the distribution of words in a natural-language document, where words with closely related meanings appear with similar frequencies and are likely to appear in the context of a target word. A network transaction can therefore be regarded as a word, and the content of the log file within a certain time window as a document made up of those words; log segments captured in different time windows can be regarded as separate documents.
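The analogy between log windows and documents can be sketched as below. The text does not specify how windows are chosen, so fixed, non-overlapping windows of `window_seconds` are an assumption, as are the function name `split_into_documents` and the sample transaction names.

```python
from collections import defaultdict


def split_into_documents(transactions, window_seconds):
    """Group (timestamp, name) records into per-window 'documents'.

    Each document is the time-ordered list of transaction names ('words')
    that fall into one fixed window (an assumed windowing scheme).
    """
    buckets = defaultdict(list)
    for timestamp, name in sorted(transactions):
        buckets[int(timestamp // window_seconds)].append(name)
    return [buckets[key] for key in sorted(buckets)]


records = [(0, "news_list"), (5, "news_detail"), (65, "news_list"), (61, "login")]
print(split_into_documents(records, window_seconds=60))
```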
In this step, all transactions in the target log are sorted by timestamp, and the name hash value of each transaction is determined using a hash algorithm so that it can be converted into a word vector in the next step. The closer together transactions occur in the first transaction list, the higher their association degree; the timestamps are also used to merge and distinguish the large number of transactions in the target log.
Step 120: determining a word vector corresponding to the name of each transaction in the first transaction list through a preset document word vector model.
The document word vector model is used for vector conversion of the transaction, the conversion efficiency is high, the method is suitable for processing of large batches of texts, meanwhile, distributed computing can be adopted during actual operation, and computing operation is simplified.
Illustratively, the preset document word vector model adopted by the embodiment is determined by training using a continuous bag-of-words model and a history log.
The network structure of the CBOW model comprises three layers, an input layer, a projection layer and an output layer, as shown in Fig. 2. The determination of the document word vector model is described below, taking the sample (Context(w), w) as an example:
Input layer: contains the word vectors $v(\mathrm{Context}(w)_1), v(\mathrm{Context}(w)_2), \ldots, v(\mathrm{Context}(w)_{2C}) \in \mathbb{R}^m$ of the 2C words in Context(w), where m denotes the length of a word vector.
That is, at the start of model building, the word vectors of the C words before and the C words after a sample w are taken as input. Here w represents a transaction in the history log; a word vector is assumed for that transaction, and the word vectors of the 2C other transactions are determined from the assumed vector and the correlation between w and those 2C transactions.
Projection layer: accumulates the 2C input word vectors, that is, $x_w = \sum_{i=1}^{2C} v(\mathrm{Context}(w)_i) \in \mathbb{R}^m$.
Output layer: corresponds to a Huffman tree structure. Each leaf node in the binary tree corresponds to a word, and the tree structure shows the correlation among the leaf nodes.
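The input and projection layers described above can be sketched in a few lines. This is a minimal illustration, not the training procedure of the disclosure: the function names `cbow_samples` and `project` are hypothetical, and truncating the context at the edges of a document (so that fewer than 2C context words are available there) is an assumption.

```python
def cbow_samples(document, c):
    """Return (context, target) pairs using up to c words on each side."""
    samples = []
    for i, target in enumerate(document):
        before = document[max(0, i - c):i]
        after = document[i + 1:i + 1 + c]
        samples.append((before + after, target))
    return samples


def project(context_vectors):
    """Projection layer: element-wise sum of the context word vectors."""
    if not context_vectors:
        raise ValueError("at least one context vector is required")
    m = len(context_vectors[0])
    return [sum(vec[k] for vec in context_vectors) for k in range(m)]


document = ["A", "B", "C", "D"]  # stand-ins for hashed transaction names
vectors = {"A": [1.0, 2.0], "B": [0.5, -1.0], "C": [0.0, 1.0], "D": [2.0, 0.0]}
for context, target in cbow_samples(document, c=1):
    x_w = project([vectors[name] for name in context])
    print(target, context, x_w)
```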
Based on the above concept, first, the conditional probability of predicting a word w using a context is expressed as:
then, the objective function may use a log-likelihood function as shown in equation 2.2:
$f = \sum_{w \in c} \log P(w \mid \mathrm{Context}(w))$ (2.2)
then, substituting equation 2.1 into log-likelihood function 2.2 can result in equation 2.3, where equation 2.3 is as follows:
where $p^w$ is the path from the root node to the leaf node corresponding to w; $l^w$ is the number of nodes contained in path $p^w$; $p^w_1, p^w_2, \ldots, p^w_{l^w}$ are the $l^w$ nodes in path $p^w$, where $p^w_1$ is the root node and $p^w_{l^w}$ is the node corresponding to w; $d^w_2, d^w_3, \ldots, d^w_{l^w} \in \{0, 1\}$ is the Huffman code of the word w, consisting of $l^w - 1$ bits; $\theta^w_1, \theta^w_2, \ldots, \theta^w_{l^w - 1} \in \mathbb{R}^m$ are the vectors corresponding to the non-leaf nodes in path $p^w$; and $\theta^w_j$ is the vector corresponding to the j-th non-leaf node in path $p^w$.
In this process, a stochastic gradient method can be used to optimize f(w, j) (maximizing the log-likelihood is equivalent to minimizing its negative), so that all parameters of the document word vector model can be determined.
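Equations 2.1 and 2.3 are not reproduced in this text. As a hedged sketch only, assuming the standard hierarchical-softmax formulation of CBOW (which matches the path, code and node-vector quantities described above), they would commonly be written as follows. Here $x_w$ is the accumulated projection-layer vector, $\sigma$ is the sigmoid function, c is read as the set of training samples, and the convention that code bit 0 marks the positive class is an assumption.

```latex
% (2.1) probability of reaching the leaf of w along its Huffman path
P(w \mid \mathrm{Context}(w)) = \prod_{j=2}^{l^w} p\left(d^w_j \mid x_w, \theta^w_{j-1}\right),
\qquad
p\left(d^w_j \mid x_w, \theta^w_{j-1}\right)
  = \left[\sigma\left(x_w^{\top}\theta^w_{j-1}\right)\right]^{1-d^w_j}
    \left[1-\sigma\left(x_w^{\top}\theta^w_{j-1}\right)\right]^{d^w_j}

% (2.3) log-likelihood after substituting (2.1) into (2.2)
f = \sum_{w \in c} \sum_{j=2}^{l^w} f(w, j),
\qquad
f(w, j) = \left(1-d^w_j\right)\log \sigma\left(x_w^{\top}\theta^w_{j-1}\right)
        + d^w_j \log\left[1-\sigma\left(x_w^{\top}\theta^w_{j-1}\right)\right]
```

Under this reading, each term f(w, j) depends only on $x_w$ and one non-leaf vector $\theta^w_{j-1}$, which is why the gradient updates can be applied term by term.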
With the above model determination method, after the history log is used as the training sample to determine the document word vector model, the hash value of the name of each transaction in the first transaction list of step 110 is used as the input, the output of the model is the word vector of the name corresponding to each transaction, and then the first relationship tree indicating the association between the transactions is determined according to the word vector.
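Step 130: when a first transaction needing to be viewed is determined, acquiring a first relation tree according to the word vector and a preset relation tree creation rule, wherein the first relation tree comprises the first transaction and other transactions related to the first transaction.
Illustratively, the relationship tree creation rules include the number of child nodes of the relationship tree, the depth, and the transaction non-repetition principle. In addition, the first transaction needing to be viewed may be a transaction in which an exception has occurred, or a transaction that specifically needs to be checked.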
The threshold for the number of child nodes of the first relation tree can be determined when the document word vector model is trained on the history log, and the number of child nodes of the relation tree is preset to this threshold in the network system. The depth of the relation tree indicates its number of layers; for example, the number of child nodes may be set to 2 and the depth to 3. Under the transaction non-repetition principle, nodes newly added while the relation tree is being built must not already exist in the tree, so any node appears in the relation tree only once.
Step 140: determining the association degree of the first transaction and other transactions according to the first relation tree.
The first relation tree clearly indicates which other transactions the first transaction is related to, and to what degree. Analyzing transaction relevance in the network system relies only on the log file: the code of the network system does not need to be modified, no new file needs to be configured, and the format of the log file is not restricted. This simplifies the work of operation and maintenance personnel in locating and resolving problems in the network system.
In addition, the system can be optimally configured according to the first relation tree. For example, the first relationship tree indicates that the user is more interested in the notification announcements and the news of innovation in the network system, so that the content of news lists of the notification announcements and the news of innovation can be cached in the default setting of the network system, thereby improving the response time of the network system and improving the performance of the network system.
In summary, according to the method for determining relevance of network transactions provided by the present disclosure, a first transaction list of a target log is obtained; a word vector corresponding to the name of each transaction in the first transaction list is determined through a preset document word vector model; when a first transaction to be viewed is determined, a first relation tree is obtained according to the word vectors and a preset relation tree creation rule; and the association degree between the first transaction and other transactions is determined according to the first relation tree. Operation and maintenance personnel can thus conveniently analyze the relevance between any transaction and other transactions, so that system settings can be better optimized according to that relevance when the system is configured, improving system performance.
Fig. 3 is a flowchart illustrating another method for determining network transaction relevance according to an exemplary embodiment. As shown in Fig. 3, obtaining the first transaction list of the target log in step 110 includes the following steps:
Step 111: extracting the name of each transaction and the timestamp corresponding to each transaction from the target log.
Step 112: sorting all the transactions according to the timestamp corresponding to each transaction to obtain a first ordering.
Step 113: replacing the name of each transaction with the name hash value of each transaction by using a preset hash algorithm.
This step may further comprise the following sub-steps, as shown in Fig. 4:
Step 1131: obtaining an integer hash value of the name of each transaction by using a preset hash value calculation formula according to the name of each transaction.
For example, the name of each transaction is mapped to a hash value of fixed length. Similar transactions generally have similar names (the names of database transactions, for instance, may include the same database name and indication), so the text distance between transaction names is reflected in their hash values.
Wherein, the hash value calculation formula includes:
wherein HV(i) is the integer hash value of the name of the i-th transaction in the first ordering; n represents the total number of characters in the name of the i-th transaction; t represents the t-th character in the name of the i-th transaction; and s[t] represents the Unicode code point of the t-th character.
The hash value calculation formula above shows that the first character in the name of the ith transaction has a greater effect on the hash value.
Step 1132: convert the integer hash value of the name of each transaction into a hexadecimal hash value.
Step 1133: add the character 'A' to each digit of the hexadecimal hash value of each transaction to obtain the name hash value of each transaction.
Expressing the hash value in hexadecimal makes word segmentation convenient during the subsequent word vector conversion. Adding the character 'A' to each digit of the hexadecimal hash value of each transaction converts the hash value entirely into a letter representation, which makes it convenient to determine the word vector.
Step 1134: replace the name of each transaction with the name hash value of each transaction.
That is, the name of each transaction in the first ordering obtained in step 112 is replaced with its name hash value.
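The sub-steps above can be sketched as follows. The hash value calculation formula is not reproduced in this text, so the integer hash below is only a stand-in assumption: a base-31 polynomial hash in which earlier characters carry larger weights, chosen because it is consistent with the statement that the first character has a greater effect on the hash value. Reading "adding the character 'A'" as offsetting each hexadecimal digit from 'A' (0 becomes A, 15 becomes P) is likewise an interpretation, and all function names are hypothetical.

```python
def integer_hash(name: str, base: int = 31, modulus: int = 2**32) -> int:
    """Stand-in integer hash (assumption, not the patent's formula).

    Computes sum(s[t] * base**(n - t)) modulo `modulus`, where s[t] is the
    Unicode code point of the t-th character, so earlier characters carry
    larger weights.
    """
    value = 0
    for ch in name:
        value = (value * base + ord(ch)) % modulus
    return value


def letters_from_hex(hex_digits: str) -> str:
    """Offset each hexadecimal digit from 'A': 0 -> 'A', 1 -> 'B', ..., f -> 'P'."""
    return "".join(chr(ord("A") + int(digit, 16)) for digit in hex_digits)


def name_hash(name: str) -> str:
    """Integer hash -> fixed-length (8-digit) hexadecimal -> letters only."""
    hex_digits = format(integer_hash(name), "08x")
    return letters_from_hex(hex_digits)


if __name__ == "__main__":
    print(name_hash("ab"))
```

Under these assumptions every name hash value is an eight-letter string over the alphabet A to P, which a word vector model can treat as a single token.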
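Step 114: generate, according to the first ordering, the first transaction list including the name hash value of each transaction and the timestamp corresponding to each transaction.
Through steps 111 to 114, the name and the timestamp of each transaction are extracted from the target log in turn to determine the first ordering, and the names of the transactions in the first ordering are replaced with hash values to generate the first transaction list with name hash values.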
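As a minimal sketch of steps 111 to 114, the function below builds a transaction list from raw log lines. The log format (one `timestamp<TAB>name` record per line, with ISO 8601 timestamps in a single time zone) and the function name are assumptions made for illustration, and the hash function is passed in as a parameter rather than fixed.

```python
from typing import Callable, List, Tuple


def build_transaction_list(
    log_lines: List[str],
    name_hash: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (name hash value, timestamp) pairs ordered by timestamp."""
    # Step 111: extract the timestamp and name of each transaction.
    records = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        timestamp, name = line.split("\t", 1)
        records.append((timestamp, name))

    # Step 112: sort by timestamp. ISO 8601 strings in one time zone
    # sort chronologically when compared as text.
    records.sort(key=lambda record: record[0])

    # Steps 113 and 114: replace each name with its name hash value and
    # emit the list in the first ordering.
    return [(name_hash(name), timestamp) for timestamp, name in records]


if __name__ == "__main__":
    lines = [
        "2017-11-24T10:00:02\tnews",
        "2017-11-24T10:00:01\thome",
    ]
    # str.upper is only a placeholder for a real name hash function.
    print(build_transaction_list(lines, str.upper))
```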
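Fig. 5 is a flowchart illustrating a further method for determining network transaction relevance according to an exemplary embodiment. As shown in fig. 5, when the first transaction to be viewed is determined in step 130, obtaining the first relation tree according to the word vectors and the preset relation tree creation rule includes the following steps:
Step 131: determine the first transaction as the root node of the first relation tree.
Step 132: determine the number of child nodes and the depth of the first relation tree according to the relation tree creation rule.
Step 133: establish the first relation tree by depth-first search according to the root node, the number of child nodes, the depth, and the word vectors, under the principle that transactions are not repeated.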
For example, after the word vector corresponding to the name of each transaction is obtained, the association between transactions may be determined by comparing the distance between the word vector of the first transaction and the word vectors of the other transactions in the first transaction list. The relationships among the transactions associated with the first transaction may then be determined by comparing the pairwise distances between them, so as to determine the first relation tree including the first transaction and the other transactions associated with it.
For example, as shown in fig. 6, the operations of steps 131 to 133 establish a first relation tree without repeated transactions, in which the first transaction (the website home page) is the root node, the number of child nodes is 2, and the depth is 3. By default, a left-side node is more strongly associated with its parent node than a right-side node is. As can be seen from fig. 6, after a user requests the website home page, the user is most likely to access the two secondary pages, notification announcement and innovation news. When the user accesses these two secondary pages, the user is more likely to click daily recommendation than picture news or news hotspots. The daily recommended news list is therefore preferentially cached in the website cache, which improves the cache hit rate of the network system and speeds up its overall response.
In addition, it should be noted that the root node of the first relation tree may be set according to user requirements. An abnormal transaction may be used as the root node of the first relation tree, or a default root node may be set. For example, the home page of the network system may be used as the root node to establish the first relation tree and thereby determine the degrees of association of all transactions in the network system, so that operation and maintenance personnel can obtain a more comprehensive view of the associations between transactions in the network system when viewing the first relation tree.
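The tree construction described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: cosine similarity stands in for the unspecified word vector distance, depth counts levels including the root, each node reserves its children before the search descends (so that no transaction is repeated), and all names are hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_relation_tree(
    root: str,
    vectors: Dict[str, List[float]],
    num_children: int = 2,
    depth: int = 3,
) -> dict:
    """Build a relation tree rooted at `root` by depth-first search."""
    used = {root}  # transactions already placed in the tree

    def expand(node: str, level: int) -> dict:
        tree = {"name": node, "children": []}
        if level >= depth:
            return tree
        # Rank unused transactions by similarity to this node, most similar
        # first, so the left-most child is the most strongly associated.
        candidates = [name for name in vectors if name not in used]
        candidates.sort(
            key=lambda name: cosine_similarity(vectors[node], vectors[name]),
            reverse=True,
        )
        chosen = candidates[:num_children]
        used.update(chosen)  # reserve the children before descending
        for child in chosen:
            tree["children"].append(expand(child, level + 1))
        return tree

    return expand(root, 1)


if __name__ == "__main__":
    toy_vectors = {
        "home": [1.0, 0.0],
        "a": [1.0, 0.1],
        "b": [1.0, 0.5],
        "c": [0.0, 1.0],
    }
    print(build_relation_tree("home", toy_vectors))
```

Reserving a node's children before recursing keeps the root's nearest neighbours directly under the root; a variant that descends before choosing the next sibling would also satisfy the non-repetition rule but could place them deeper in the tree.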
In summary, according to the method for determining network transaction relevance provided by the present disclosure, a first transaction list of a target log is obtained, and a word vector corresponding to the name of each transaction in the first transaction list is then determined through a preset document word vector model. When a first transaction to be viewed is determined, a first relation tree is obtained according to the word vectors and a preset relation tree creation rule, and the degree of association between the first transaction and other transactions is then determined according to the first relation tree. Operation and maintenance personnel can therefore conveniently analyze the association between any transaction and other transactions, so that system settings can be better optimized according to the associations between transactions when the system is configured, thereby improving system performance.
Fig. 7 is a block diagram illustrating a network transaction relevance determination apparatus 700 according to an exemplary embodiment. The apparatus 700 may be used to execute the method described in any of figs. 1 to 6. Referring to fig. 7, the apparatus 700 includes:
The list obtaining module 710 is configured to obtain a first transaction list of the target log, where the first transaction list includes a name hash value of each transaction in the target log and a timestamp corresponding to each transaction.
The word vector obtaining module 720 is configured to determine, through a preset document word vector model, a word vector corresponding to the name of each transaction in the first transaction list.
The relation tree determining module 730 is configured to, when a first transaction to be viewed is determined, obtain a first relation tree according to the word vectors and a preset relation tree creation rule, where the first relation tree includes the first transaction and other transactions associated with the first transaction.
The association degree determining module 740 is configured to determine, according to the first relation tree, the degree of association between the first transaction and the other transactions.
Fig. 8 is a block diagram illustrating a list obtaining module according to an exemplary embodiment. As shown in fig. 8, the list obtaining module 710 includes:
The extracting sub-module 711 is configured to extract the name of each transaction and the timestamp corresponding to each transaction from the target log.
The sorting sub-module 712 is configured to sort all the transactions in the first transaction list according to the timestamp corresponding to each transaction to obtain a first ordering.
The calculation sub-module 713 is configured to replace the name of each transaction with the name hash value of each transaction by using a preset hash algorithm.
The list generation sub-module 714 is configured to generate, according to the first ordering, the first transaction list including the name hash value of each transaction and the timestamp corresponding to each transaction.
Fig. 9 is a block diagram illustrating a calculation sub-module according to an exemplary embodiment. As shown in fig. 9, the calculation sub-module 713 includes:
The first obtaining sub-module 7131 is configured to obtain, according to the name of each transaction, an integer hash value of the name of each transaction by using a preset hash value calculation formula.
The conversion sub-module 7132 is configured to convert the integer hash value of the name of each transaction into a hexadecimal hash value.
The second obtaining sub-module 7133 is configured to add the character 'A' to each digit of the hexadecimal hash value of each transaction to obtain the name hash value of each transaction.
The replacing sub-module 7134 is configured to replace the name of each transaction with the name hash value of each transaction.
Wherein, the hash value calculation formula includes:
where HV(i) is the integer hash value representing the name of the i-th transaction in the first ordering; n represents the total number of characters in the name of the i-th transaction; t represents the t-th character in the name of the i-th transaction; and s[t] represents the Unicode code of the t-th character.
Optionally, the preset document word vector model is determined by training using a continuous bag-of-words model and a history log.
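A continuous bag-of-words model learns a vector for each word by predicting the word from its surrounding context. As a hedged illustration of how a time-ordered history log could supply that training data, the sketch below turns a sequence of name hash values into (context, target) pairs. The window size, the function name, and the sample tokens are assumptions, and the model training itself is not shown.

```python
from typing import List, Tuple


def cbow_training_pairs(
    sequence: List[str],
    window: int = 2,
) -> List[Tuple[List[str], str]]:
    """Pair each token with up to `window` neighbours on each side."""
    pairs = []
    for i, target in enumerate(sequence):
        left = sequence[max(0, i - window):i]
        right = sequence[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip a lone token with no neighbours
            pairs.append((context, target))
    return pairs


if __name__ == "__main__":
    # Hypothetical name hash values in timestamp order.
    history = ["AAAAAMCB", "AAAABKCD", "AAAAPPAB"]
    for context, target in cbow_training_pairs(history, window=1):
        print(context, "->", target)
```

Transactions that repeatedly occur close together in time share contexts, which is why their learned vectors would be expected to lie close to each other.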
Fig. 10 is a block diagram illustrating a relation tree determining module according to an exemplary embodiment. As shown in fig. 10, the relation tree creation rule includes the number of child nodes of the relation tree, the depth of the relation tree, and a transaction non-repetition rule, and the relation tree determining module 730 includes:
The first determining sub-module 731 is configured to determine the first transaction as the root node of the first relation tree.
The second determining sub-module 732 is configured to determine the number of child nodes and the depth of the first relation tree according to the relation tree creation rule.
The relation tree building sub-module 733 is configured to establish the first relation tree by depth-first search according to the root node, the number of child nodes, the depth, and the word vectors, under the principle that transactions are not repeated.
In summary, according to the apparatus for determining network transaction relevance provided by the present disclosure, a first transaction list of a target log is obtained, and a word vector corresponding to the name of each transaction in the first transaction list is then determined through a preset document word vector model. When a first transaction to be viewed is determined, a first relation tree is obtained according to the word vectors and a preset relation tree creation rule, and the degree of association between the first transaction and other transactions is then determined according to the first relation tree. Operation and maintenance personnel can therefore conveniently analyze the association between any transaction and other transactions, so that system settings can be better optimized according to the associations between transactions when the system is configured, thereby improving system performance.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 11 is a block diagram illustrating an electronic device 1100 in accordance with an example embodiment. As shown in fig. 11, the electronic device 1100 may include: a processor 1101, a memory 1102, multimedia components 1103, input/output (I/O) interfaces 1104, and communication components 1105.
The processor 1101 is configured to control the overall operation of the electronic device 1100 so as to complete all or part of the steps of the network transaction relevance determination method described above. The memory 1102 is used to store various types of data to support operation of the electronic device 1100, such as instructions for any application or method operating on the electronic device 1100 and application-related data, for example contact data, messages, pictures, audio, and video. The memory 1102 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The multimedia components 1103 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory 1102 or transmitted through the communication component 1105. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 1104 provides an interface between the processor 1101 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 1105 provides wired or wireless communication between the electronic device 1100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them; accordingly, the communication component 1105 may include a Wi-Fi module, a Bluetooth module, and an NFC module.
In an exemplary embodiment, the electronic device 1100 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the network transaction relevance determination method described above.
In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided, such as the memory 1102 including program instructions, where the program instructions are executable by the processor 1101 of the electronic device 1100 to perform the network transaction relevance determination method described above.
The preferred embodiments of the present disclosure are described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solution of the present disclosure within its technical idea, and these simple modifications all fall within the protection scope of the present disclosure.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. In order to avoid unnecessary repetition, various possible combinations will not be separately described in this disclosure.
In addition, the various embodiments of the present disclosure may be combined in any manner, and such combinations should likewise be regarded as content disclosed by the present disclosure, as long as they do not depart from the spirit of the present disclosure.
Claims (10)
1. A method for determining network transaction relevance, the method comprising:
acquiring a first transaction list of a target log, wherein the first transaction list comprises a name hash value of each transaction in the target log and a timestamp corresponding to each transaction;
inputting the name hash value of each transaction in the first transaction list into a preset document word vector model to obtain a word vector corresponding to the name of each transaction in the first transaction list output by the document word vector model;
when a first transaction needing to be viewed is determined, acquiring a first relation tree according to the word vector and a preset relation tree creating rule, wherein the first relation tree comprises the first transaction and other transactions related to the first transaction;
determining the association degree of the first transaction and the other transactions according to the first relation tree;
the name hash value is obtained by the following steps:
acquiring an integer hash value of the name of each transaction by using a preset hash value calculation formula according to the name of each transaction;
converting the integer hash value of the name of each transaction into a hexadecimal hash value;
and adding the character 'A' to each digit of the hexadecimal hash value of each transaction to obtain the name hash value of each transaction.
2. The method of claim 1, wherein obtaining the first transaction list of the target log comprises:
extracting the name of each transaction and the timestamp corresponding to each transaction from the target log;
sequencing all the transactions in the first transaction list according to the timestamp corresponding to each transaction to obtain a first sequence;
replacing the name of each transaction with the name hash value of each transaction by using a preset hash algorithm;
and generating the first transaction list comprising the name hash value of each transaction and the timestamp corresponding to each transaction according to the first sequence.
3. The method of claim 2, wherein the hash calculation formula comprises:
wherein HV(i) is the integer hash value representing the name of the i-th transaction in the first ordering; n represents the total number of characters in the name of the i-th transaction; t represents the t-th character in the name of the i-th transaction; and s[t] represents the Unicode code of the t-th character.
4. The method of claim 1, wherein the preset document word vector model is determined by training using a continuous bag of words model and a history log.
5. The method of claim 1, wherein the relationship tree creation rules include a relationship tree child node number, a relationship tree depth, and a transaction non-duplication rule; when the first transaction needing to be viewed is determined, the obtaining of the first relation tree according to the word vector and a preset relation tree creating rule comprises the following steps:
determining the first transaction as a root node of the first relationship tree;
determining the number and depth of child nodes of the first relation tree according to the relation tree creating rule;
and establishing the first relation tree by utilizing depth-first search according to the root node, the number of the child nodes, the depth and the word vector under the transaction non-repetition principle.
6. An apparatus for determining network transaction relevance, the apparatus comprising:
a list obtaining module, configured to obtain a first transaction list of a target log, wherein the first transaction list comprises a name hash value of each transaction in the target log and a timestamp corresponding to each transaction;
a word vector obtaining module, configured to input the name hash value of each transaction in the first transaction list into a preset document word vector model, so as to obtain a word vector corresponding to the name of each transaction in the first transaction list output by the document word vector model;
the relation tree determining module is used for acquiring a first relation tree according to the word vector and a preset relation tree creating rule when a first transaction needing to be checked is determined, wherein the first relation tree comprises the first transaction and other transactions related to the first transaction;
the association degree determining module is used for determining the association degree of the first transaction and the other transactions according to the first relation tree;
the name hash value is obtained by the following steps:
acquiring an integer hash value of the name of each transaction by using a preset hash value calculation formula according to the name of each transaction;
converting the integer hash value of the name of each transaction into a hexadecimal hash value;
and adding the character 'A' to each digit of the hexadecimal hash value of each transaction to obtain the name hash value of each transaction.
7. The apparatus of claim 6, wherein the list obtaining module comprises:
the extraction submodule is used for extracting the name of each transaction and the timestamp corresponding to each transaction from the target log;
the ordering submodule is used for ordering all the transactions in the first transaction list according to the timestamp corresponding to each transaction to obtain a first ordering;
the calculation submodule is used for replacing the name of each transaction with the name hash value of each transaction by using a preset hash algorithm;
and the list generation submodule is used for generating the first transaction list comprising the name hash value of each transaction and the timestamp corresponding to each transaction according to the first sequence.
8. The apparatus of claim 7, wherein the hash calculation formula comprises:
wherein HV(i) is the integer hash value representing the name of the i-th transaction in the first ordering; n represents the total number of characters in the name of the i-th transaction; t represents the t-th character in the name of the i-th transaction; and s[t] represents the Unicode code of the t-th character.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
10. An electronic device, comprising:
the computer-readable storage medium of claim 9; and
one or more processors to execute the program in the computer-readable storage medium.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711195221.9A CN108197142B (en) | 2017-11-24 | 2017-11-24 | Method, device, storage medium and equipment for determining relevance of network transaction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108197142A (en) | 2018-06-22 |
CN108197142B (en) | 2020-10-30 |
Family
ID=62573086
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711195221.9A Active CN108197142B (en) | 2017-11-24 | 2017-11-24 | Method, device, storage medium and equipment for determining relevance of network transaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108197142B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101888309A (en) * | 2010-06-30 | 2010-11-17 | 中国科学院计算技术研究所 | Online log analysis method |
CN102054004A (en) * | 2009-11-04 | 2011-05-11 | 清华大学 | Webpage recommendation method and device adopting same |
CN102158355A (en) * | 2011-03-11 | 2011-08-17 | 广州蓝科科技股份有限公司 | Log event correlation analysis method and device capable of concurrent and interrupted analysis |
CN102855309A (en) * | 2012-08-21 | 2013-01-02 | 亿赞普(北京)科技有限公司 | Information recommendation method and device based on user behavior associated analysis |
CN104917627A (en) * | 2015-01-20 | 2015-09-16 | 杭州安恒信息技术有限公司 | Log cluster scanning and analysis method used for large-scale server cluster |
CN106452808A (en) * | 2015-08-04 | 2017-02-22 | 北京奇虎科技有限公司 | Data processing method and data processing device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8060540B2 (en) * | 2007-06-18 | 2011-11-15 | Microsoft Corporation | Data relationship visualizer |
Non-Patent Citations (1)
Title |
---|
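Chinese Natural Language Search Engine Based on Lucene; Hu Changchun; China Master's Theses Full-text Database, Information Science and Technology; 2009-12-15; main text, p. 22 * |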
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |