Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone.
In order to solve the problems that structural information containing positional correlation in logs cannot be judged, and that, when one category of similar logs contains multiple structures, a vectorization model with a single structure leads to long clustering time and low accuracy of the clustering result, embodiments of the present invention provide a log clustering method, a log clustering apparatus, a log clustering device, and a computer-readable storage medium.
The technical solutions of the embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a log clustering method provided in an embodiment of the present invention. As shown in Fig. 1, the execution subject of the method is a log clustering device, and the log clustering method may include the following steps:
S101, first log data are acquired.
First, second log data are obtained, where the second log data are a large amount of original log data generated by the daily operation of the system; the original log data may include normal log data generated when the system runs normally and abnormal log data generated when an application program in the system fails.
The original log data usually contain some semi-structured data with incomplete structural information or log data with missing content; for example, some logs may lose their link tracking code (Trace ID, TID) during generation. Therefore, sample denoising may be performed on the original log data to remove sample noise points with missing information, that is, the semi-structured data or the log data with missing content are removed to obtain first log data containing structured information. Clustering is then performed with the first log data, which can improve the subsequent clustering result.
Then, in one embodiment, the obtained first log data may be subjected to feature analysis, including: cleaning useless information such as meaningless numbers and punctuation marks from the first log data by using regular expressions, analyzing the cleaned first log data by methods such as word segmentation, classification and statistics, and determining the characteristic content in the first log data to obtain log features, where the log features include the TID and the log description.
Optionally, in some embodiments, the log features may also include application system name, project name, host address, and log content.
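As a non-limiting illustration of the cleaning and feature-extraction step described above, the following Python sketch removes digits and punctuation with regular expressions and splits each log line into a feature dictionary. The pipe-separated layout and the field names (app_system, tid, log_description, and so on) are assumptions made only for this example and are not the format mandated by this embodiment.

```python
import re

def clean_text(text: str) -> str:
    """Strip numbers and punctuation that carry no clustering signal."""
    text = re.sub(r"[0-9]+", " ", text)       # remove numbers
    text = re.sub(r"[^\w\s]", " ", text)      # remove punctuation marks
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace

def extract_features(raw_line: str) -> dict:
    """Split one raw log line into log features (layout is assumed)."""
    # Assumed pipe-separated layout: system|project|host|TID|description
    system, project, host, tid, description = raw_line.split("|", 4)
    return {
        "app_system": system,
        "project": project,
        "host": host,
        "tid": tid.strip(),
        "log_description": clean_text(description),
    }
```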
S102, classifying the first log data based on the link tracking code TID to obtain log data of a plurality of TID categories.
The first log data are rapidly classified by their TIDs, and log data sharing the same TID are grouped into one class, thereby obtaining log data of multiple TID categories.
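A minimal sketch of this TID-based grouping is shown below, assuming each log record is a dictionary carrying a "tid" key as in the previous example; it is one possible realization, not the only one.

```python
from collections import defaultdict

def group_by_tid(logs: list[dict]) -> dict[str, list[dict]]:
    """Group log records by their link tracking code (TID)."""
    tid_classes = defaultdict(list)
    for log in logs:
        tid_classes[log["tid"]].append(log)
    return dict(tid_classes)
```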
S103, carrying out clustering processing on the text information corresponding to the log descriptions in the log data of the multiple TID categories according to a K-means clustering algorithm and an edit distance algorithm to obtain a clustering result of the first log data.
In the embodiment of the present invention, a clustering algorithm may be adopted to perform clustering processing on the text information for the log descriptions in the log data of multiple TID categories, so as to obtain a clustering result of the first log data.
In one embodiment, the clustering algorithm may include a K-means clustering algorithm and an edit distance algorithm.
Firstly, before clustering processing is carried out on log data of a plurality of TID categories, vectorization is carried out on log features in the log data of the TID categories respectively, and a plurality of feature dimensions are obtained.
Then, the log descriptions among the multiple feature dimensions are selected, and the text information corresponding to the log descriptions in the log data of the multiple TID categories is clustered to obtain the clustering result of the first log data.
In some embodiments, clustering text information corresponding to log descriptions in log data of multiple TID categories may include the following steps:
S1031, performing term frequency-inverse document frequency (TF-IDF) numericalization on the text information corresponding to the log descriptions in the log data of the multiple TID categories respectively.
First, term frequency (TF) statistics are performed, according to formula (1), on the text information corresponding to the log descriptions in the log data of the multiple TID categories, calculating the frequency with which a given term appears in the log to which it belongs:

tf_{i,j} = n_{i,j} / Σ_k n_{k,j}  (1)

where, for a given term t_i, n_{i,j} is the number of occurrences of the term t_i in the log d_j, Σ_k n_{k,j} is the sum of the occurrence counts of all terms in the log d_j, and tf_{i,j} is the frequency of occurrence of the term t_i in the log d_j to which it belongs.
Then, inverse document frequency (IDF) statistics are performed, according to formula (2), on the text information corresponding to the log descriptions in the log data of the multiple TID categories, evaluating the importance of a given term across all logs:

idf_i = log( |D| / |{ j : t_i ∈ d_j }| )  (2)

where, for a given term t_i, |{ j : t_i ∈ d_j }| is the number of logs containing the term t_i, |D| is the total number of logs, and idf_i is the inverse document frequency of the term t_i.
Finally, the high-dimensional vector of each log in the log data of all TID categories is calculated according to formula (3):

tfidf_{i,j} = tf_{i,j} × idf_i  (3)

Through TF-IDF numericalization, each log generates a high-dimensional vector of fixed length.
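A hedged sketch of this TF-IDF numericalization is given below; it uses scikit-learn's TfidfVectorizer as an assumed convenience, and the tokenization parameters are illustrative choices rather than part of the claimed method, which may equally be implemented with formulas (1)-(3) directly.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def tfidf_vectorize(descriptions: list[str]):
    """Turn log descriptions into fixed-length high-dimensional TF-IDF vectors."""
    vectorizer = TfidfVectorizer(lowercase=True, token_pattern=r"\b\w+\b")
    matrix = vectorizer.fit_transform(descriptions)  # shape: (n_logs, n_terms)
    return matrix.toarray(), vectorizer
```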
S1032, reducing the dimension of the high-dimensional vectors according to principal component analysis (PCA).
After the high-dimensional vectors are obtained, they are subjected to dimensionality reduction by principal component analysis (PCA), so that the number of feature variables to be considered is reduced and low-dimensional vectors are obtained.
In some embodiments, dimensionality reduction of the high-dimensional vectors according to PCA may include the following steps:
Step 1, in order to eliminate the influence of different scales and excessively large numerical differences among the components of the high-dimensional vectors, the high-dimensional vectors need to be standardized to obtain a standardized matrix.
As a specific example, the high-dimensional vectors are represented as a data matrix X = (X_{ij})_{n×p}, where i = 1, 2, ..., n, j = 1, 2, ..., p, and X_{ij} represents the j-th index value of the i-th unit.
First, the arithmetic mean X̄_j of the j-th column of the data matrix X is calculated according to formula (4).
Then, the standard deviation σ_j of the j-th column of the data matrix X is calculated according to formula (5).
Finally, the standardized data matrix element y_{ij} = (X_{ij} − X̄_j) / σ_j is calculated according to formula (6).
Step 2, a correlation matrix is established according to the standardized data matrix, and the eigenvalues and eigenvectors of the correlation matrix are calculated.
As a specific example, the correlation matrix R can be determined from the standardized data matrix Y, and the eigenvalues λ_j (j = 1, 2, ..., p) of R can be obtained; the eigenvalues are arranged in descending order so that λ_1 ≥ λ_2 ≥ ... ≥ λ_p. Then, the corresponding eigenvectors α_i = (α_{i1}, α_{i2}, ..., α_{ip}), i = 1, 2, ..., p, are solved from the characteristic polynomial.
Step 3, the variance contribution rate and the cumulative variance contribution rate are calculated according to the eigenvalues of the correlation matrix: the variance contribution rate of the i-th principal component is λ_i / Σ_{j=1}^{p} λ_j, and the cumulative variance contribution rate of the first s principal components is β(s) = Σ_{i=1}^{s} λ_i / Σ_{j=1}^{p} λ_j.
Each eigenvalue of the correlation matrix is equal to the variance of the corresponding principal component, and its magnitude reflects the proportion of the information of the original data contained in that principal component, i.e., the contribution of each principal component.
Step 4, the principal components of the high-dimensional vectors are calculated according to formula (7):

Z = Yα  (7)

where Y is the standardized data matrix and α is formed from the eigenvectors of the correlation matrix.
If the cumulative variance contribution rate β(s) of the first s principal components satisfies β(s) ≥ α, then Z_1, Z_2, ..., Z_s represent the samples X_1, X_2, ..., X_p at significance level α, and the principal components Z_1, Z_2, ..., Z_s are used in place of the samples X_1, X_2, ..., X_p. This not only reduces the dimensionality of the input high-dimensional vectors but also eliminates the autocorrelation of the original sample space, thereby yielding the low-dimensional vectors.
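The dimensionality-reduction steps above can be sketched as follows. This sketch leans on scikit-learn and is only an assumed implementation: scikit-learn's PCA centers the data but does not scale it, so an explicit StandardScaler is applied first to approximate formulas (4)-(6), and the 0.95 cumulative-variance threshold is an example value for β(s) ≥ α, not a value fixed by this embodiment.

```python
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def reduce_dimension(high_dim_vectors, cumulative_variance: float = 0.95):
    """Standardize the TF-IDF matrix and keep enough principal components
    to reach the assumed cumulative variance contribution rate."""
    standardized = StandardScaler().fit_transform(high_dim_vectors)
    pca = PCA(n_components=cumulative_variance)  # keep components until the ratio reaches 0.95
    low_dim_vectors = pca.fit_transform(standardized)
    return low_dim_vectors, pca.explained_variance_ratio_
```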
S1033, clustering the low-dimensional vectors according to the K-means clustering algorithm and the edit distance algorithm to obtain a clustering result of the first log data.
After the low-dimensional vectors are obtained, they are first subjected to preliminary clustering using a K-means clustering algorithm to obtain a first clustering result.
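A minimal sketch of this preliminary K-means step, assuming scikit-learn; the value of K is an assumed parameter that in practice can be chosen from an evaluation curve such as the one discussed around Fig. 2.

```python
from sklearn.cluster import KMeans

def preliminary_clustering(low_dim_vectors, k: int = 10, seed: int = 0):
    """First-stage clustering of the low-dimensional vectors with K-means."""
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(low_dim_vectors)
    return labels, kmeans.cluster_centers_
```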
The PCA dimension reduction in S1032 has a certain positive influence on the result of the preliminary clustering.
In some embodiments, nearly 100 abnormal log samples are used for K-means preliminary clustering, and the obtained first clustering result is evaluated to obtain a clustering-result evaluation graph. Fig. 2 is an evaluation graph of a clustering result with PCA dimension reduction provided in an embodiment of the present invention, where the vertical axis represents the evaluation coefficient and the horizontal axis represents the choice of the K value; the higher the evaluation coefficient, the better the clustering effect. As shown in Fig. 2, the evaluation coefficient of the abnormal log samples subjected to PCA dimension reduction is highest, about 0.997, when K is in the interval of 8 to 11. Fig. 3 is an evaluation graph of a clustering result without PCA dimension reduction provided in an embodiment of the present invention. As shown in Fig. 3, for the abnormal log samples without PCA dimension reduction, the evaluation coefficient is too low when the clustering K value is in the interval of 2 to 5, and the evaluation coefficients for K values in the interval of 8 to 100 are substantially the same and generally below 0.99. Therefore, performing PCA dimension reduction on the log data before clustering can improve the quality of the preliminary clustering result to a certain extent.
On the basis of the preliminary clustering, the low-dimensional vectors are further clustered using the edit distance (Edit Distance), which is the minimum number of edit operations required to convert one string into another. The edit distance therefore represents well the degree of similarity between different logs: similar logs have a small edit distance, and dissimilar logs have a large edit distance.
Specifically, the edit distance is calculated according to formula (8):

lev_{a,b}(i, j) = max(i, j), if min(i, j) = 0;
lev_{a,b}(i, j) = min( lev_{a,b}(i−1, j) + 1, lev_{a,b}(i, j−1) + 1, lev_{a,b}(i−1, j−1) + 1_{(a_i ≠ b_j)} ), otherwise  (8)

where lev_{a,b}(|a|, |b|) represents the edit distance between the two strings a and b, and i and j correspond to lengths of prefixes of a and b, respectively.
In some embodiments, a distance threshold is preset as the evaluation basis of the current clustering. If the minimum edit distance between a log A to be clustered and a log B in an existing TID category is smaller than the distance threshold, log A is classified into the sub-category with the minimum edit distance under the TID category of log B; otherwise, log A is classified into a new TID category. The whole clustering process is completed by repeating the above procedure.
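A sketch of this threshold-based edit-distance assignment follows. The Levenshtein implementation follows recurrence (8); the normalization by the longer string length and the 0.2 default threshold are assumptions chosen for illustration (compare the thresholds in Table 1), not values fixed by this embodiment.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance following recurrence (8), computed iteratively."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def assign_to_cluster(log_text: str, clusters: dict[int, list[str]],
                      threshold: float = 0.2) -> int:
    """Put the log into the closest existing sub-category, or open a new one."""
    best_id, best_dist = None, float("inf")
    for cluster_id, members in clusters.items():
        for member in members:
            dist = edit_distance(log_text, member) / max(len(log_text), len(member), 1)
            if dist < best_dist:
                best_id, best_dist = cluster_id, dist
    if best_id is not None and best_dist < threshold:
        clusters[best_id].append(log_text)
        return best_id
    new_id = len(clusters)
    clusters[new_id] = [log_text]
    return new_id
```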
In some embodiments, 5000 abnormal log samples containing TIDs are input, and the changes of the threshold, the number of clusters, and the running time during edit distance clustering are analyzed. Table 1 shows these changes when clustering is performed according to the edit distance algorithm in an embodiment of the present invention. As shown in Table 1, in the clustering of the 5000 abnormal log samples input this time, the larger the threshold value, the smaller the number of clusters and the shorter the program running time.
TABLE 1

Threshold | Number of clusters | Run time (seconds) | Run time (minutes)
0.05      | 17                 | 3248.016           | 54.1336
0.1       | 12                 | 2120.227           | 35.33711667
0.15      | 11                 | 2121.372           | 35.3562
0.2       | 10                 | 1658.652           | 27.6442
0.25      | 9                  | 1632.79            | 27.21316667
0.3       | 8                  | 1798.63            | 29.97716667
In some embodiments, a number of tests yielded the relationships of running time and cluster number to the distance threshold when clustering according to the edit distance algorithm. Fig. 4 is a graph of distance threshold versus running time, where the horizontal axis represents the distance threshold and the vertical axis represents the running time (unit: seconds); as shown in Fig. 4, the larger the distance threshold, the shorter the clustering running time, and the running time levels off once the threshold exceeds 0.2. Fig. 5 is a line graph of distance threshold versus number of clusters provided in an embodiment of the present invention; as shown in Fig. 5, the horizontal axis represents the distance threshold and the vertical axis represents the number of clusters, and the larger the distance threshold, the smaller the number of clusters obtained by clustering.
According to the log clustering method described above, request calls can be tracked through the TID, so that when an application program fails, the failure source can be located quickly and the performance bottleneck on each link can be monitored. The logs are classified based on the TIDs generated in the logs and then clustered and analyzed using an algorithm that fuses multiple clustering techniques, which effectively compensates for the case in which a single log category corresponds to log content with multiple structures, thereby effectively improving both the accuracy of the log clustering result and the clustering speed.
Fig. 6 is a schematic flowchart of another log clustering method according to an embodiment of the present invention, and as shown in fig. 6, the log clustering method may include S101 to S104.
S101, first log data are acquired.
S102, classifying the first log data based on the link tracking code TID to obtain log data of a plurality of TID categories.
S103, carrying out clustering processing on the text information corresponding to the log descriptions in the log data of the multiple TID categories according to a K-means clustering algorithm and an edit distance algorithm to obtain a clustering result of the first log data.
S104, evaluating the clustering result according to clustering algorithm evaluation indexes to obtain an evaluation result, where the clustering algorithm evaluation indexes include the silhouette coefficient, the Calinski-Harabasz index, and the Davies-Bouldin index.
The silhouette coefficient (Silhouette Coefficient) is a way to evaluate the clustering effect; by combining the two factors of cohesion and separation, it can be used to evaluate, on the same original data, the influence of different algorithms, or of different ways of running the same algorithm, on the clustering result.
The silhouette coefficient of each vector in a cluster is calculated according to formula (9):

s(i) = ( b(i) − a(i) ) / max{ a(i), b(i) }  (9)

where a(i) represents the average distance from vector i to the other points in the cluster to which it belongs, and b(i) represents the minimum, over the other clusters, of the average distance from vector i to all points in that cluster (i.e., the average distance to the nearest cluster). The silhouette coefficient takes values in the range [−1, 1]; the closer it is to 1, the better the cohesion and separation.
The silhouette coefficients of all points are averaged to obtain the overall silhouette coefficient of the clustering result; the higher the silhouette coefficient, the better the clustering effect.
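A brief sketch of this silhouette evaluation; scikit-learn's implementation of formula (9) is used here as an assumed convenience rather than computing a(i) and b(i) by hand.

```python
from sklearn.metrics import silhouette_score

def evaluate_silhouette(low_dim_vectors, labels) -> float:
    """Mean silhouette coefficient over all points; closer to 1 is better."""
    return silhouette_score(low_dim_vectors, labels)
```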
In some embodiments, 5000 abnormal log samples containing TIDs are input for edit distance clustering, the obtained clustering result is evaluated using the silhouette coefficient, and the relationship between the silhouette coefficient and the number of clusters is analyzed. Fig. 7 is a silhouette coefficient evaluation index graph based on TID clustering according to an embodiment of the present invention; as shown in Fig. 7, the vertical axis represents the silhouette coefficient and the horizontal axis represents the number of clusters in the clustering result. In TID-based clustering, the clustering result has the highest silhouette coefficient, and thus the best clustering effect, when the number of clusters is between 2 and 5, and the second highest when the number of clusters is between 8 and 11.
The Calinski-Harabasz (CHI) index is calculated according to formula (10); the higher the CHI index value, the better the clustering effect:

s(k) = [ tr(B_k) / tr(W_k) ] × [ (m − k) / (k − 1) ]  (10)

where m is the number of samples in the training set, k is the number of classes, B_k is the between-class covariance matrix, W_k is the within-class covariance matrix of the data, and tr denotes the trace of a matrix.
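A corresponding one-line sketch for formula (10); the scikit-learn implementation is used as an assumed convenience rather than computing B_k and W_k explicitly.

```python
from sklearn.metrics import calinski_harabasz_score

def evaluate_chi(low_dim_vectors, labels) -> float:
    """Calinski-Harabasz (CHI) index; higher means better-separated clusters."""
    return calinski_harabasz_score(low_dim_vectors, labels)
```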
In some embodiments, 5000 abnormal log samples containing TIDs are input for K-means clustering and edit distance clustering, the obtained clustering result is evaluated using the CHI index, and the relationship between the CHI index and the number of clusters is analyzed. Fig. 8 is a CHI index evaluation graph based on TID clustering according to an embodiment of the present invention; as shown in Fig. 8, the vertical axis represents the CHI index and the horizontal axis represents the number of clusters in the clustering result. In TID-based clustering, the CHI index of the clustering result is highest, and the clustering effect best, when the number of clusters is between 2 and 3, and the clustering effect is second best when the number of clusters is between 5 and 9.
The Davies-Bouldin index (DBI) is the mean, over all classes, of the maximum ratio of the sum of the within-class average distances of any two classes to the distance between their two cluster centroids; the smaller the DBI index, the better the clustering effect.
Calculating the DBI index of the clustering result may include the following steps:
Step 1, the degree of scatter is calculated according to formula (11):

S_i = [ (1/T_i) Σ_{j=1}^{T_i} |X_j − A_i|^q ]^{1/q}  (11)

where X_j denotes the j-th data point in the i-th class, A_i denotes the center of the i-th class, and T_i denotes the number of data points in the i-th class; when q = 1, S_i is the mean of the distances from each point to the center, and when q = 2 it is the standard deviation of those distances; S_i measures the degree of scatter of the data points in the i-th class.
Step 2, the distance between classes is calculated according to formula (12):

M_ij = [ Σ_{k=1}^{N} |a_{ki} − a_{kj}|^p ]^{1/p}  (12)

where a_{ki} denotes the value of the k-th attribute of the center point of the i-th class, a_{kj} denotes the value of the k-th attribute of the center point of the j-th class, N denotes the number of attributes, and M_ij denotes the distance between the centers of the i-th and j-th classes; when p = 1, M_ij is the Manhattan distance between the two centers, and when p = 2 it is the Euclidean distance.
Step 3, the similarity between classes is calculated according to formula (13):

R_ij = (S_i + S_j) / M_ij  (13)

where S_i denotes the degree of scatter of the data points in the i-th class, S_j denotes the degree of scatter of the data points in the j-th class, M_ij denotes the distance between the centers of the i-th and j-th classes, and R_ij denotes the similarity between the i-th and j-th classes.
Step 4, the maximum value of R_ij over j ≠ i, that is, the maximum of the similarities between the i-th class and the other classes, is taken, and the mean of these maximum similarities is calculated according to formula (14):

DBI = (1/N) Σ_{i=1}^{N} max_{j≠i} R_ij  (14)

where N denotes the number of classes; the mean of the maximum similarities is the DBI index of the clustering result, and the number of classes influences the size of the DBI index.
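The DBI calculation of steps 1 to 4 can be sketched directly from formulas (11)-(14) with q = p = 2; NumPy and the input conventions (a dense vector array plus integer labels) are assumptions made for illustration only.

```python
import numpy as np

def dbi_index(vectors: np.ndarray, labels: np.ndarray) -> float:
    """Davies-Bouldin index following formulas (11)-(14), with q = p = 2."""
    classes = np.unique(labels)
    centers = np.array([vectors[labels == c].mean(axis=0) for c in classes])
    # Formula (11): within-class scatter S_i (root-mean-square distance to the center)
    scatter = np.array([
        np.sqrt(np.mean(np.sum((vectors[labels == c] - centers[k]) ** 2, axis=1)))
        for k, c in enumerate(classes)
    ])
    n = len(classes)
    max_similarity = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            m_ij = np.linalg.norm(centers[i] - centers[j])  # formula (12), p = 2
            r_ij = (scatter[i] + scatter[j]) / m_ij         # formula (13)
            max_similarity[i] = max(max_similarity[i], r_ij)
    return float(max_similarity.mean())                     # formula (14)
```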
In some embodiments, 5000 abnormal log samples containing TIDs are input for K-means clustering and edit distance clustering, the obtained clustering result is evaluated using the DBI index, and the relationship between the DBI index and the number of clusters is analyzed. Fig. 9 is a DBI index evaluation graph based on TID clustering according to an embodiment of the present invention; as shown in Fig. 9, the vertical axis represents the DBI index and the horizontal axis represents the number of clusters in the clustering result. In TID-based clustering, the DBI index indicates that the clustering effect is best when the number of clusters is between 2 and 3, and second best when the number of clusters is between 8 and 11.
In some embodiments, after the clustering result is evaluated according to the clustering algorithm evaluation indexes, the clustering result, the evaluation result, the TF-IDF parameter conversion table, and the edit distance clustering parameter conversion table are output, where the TF-IDF parameter conversion table and the edit distance clustering parameter conversion table can be stored in a database in document form, which is convenient for development and maintenance personnel to consult.
In some embodiments, 5000 abnormal log samples are input for clustering with a traditional hierarchical algorithm (without TID-based classification), and the obtained clustering result is evaluated using the CHI index. Fig. 10 is a CHI index evaluation graph based on the traditional hierarchical clustering algorithm provided by an embodiment of the present invention; as shown in Fig. 10, the vertical axis represents the CHI index and the horizontal axis represents the number of clusters in the clustering result. The clustering results with a CHI index greater than 0.7 are basically distributed in the region with a high number of clusters (between 50 and 100), and as the number of clusters increases, the CHI index keeps rising, which clearly does not conform to the principle of extracting the data classes as clusters; the clustering effect is therefore poor.
In some embodiments, 5000 abnormal log samples containing TIDs are input for K-means clustering and edit distance clustering, and the obtained clustering result is evaluated using the CHI index. Fig. 11 is a CHI index evaluation graph based on TID clustering according to an embodiment of the present invention; as shown in Fig. 11, the vertical axis represents the CHI index and the horizontal axis represents the number of clusters in the clustering result. In TID-based clustering, the clustering results with a CHI index greater than 0.85 are substantially distributed in the region with a lower number of clusters (between 14 and 25), and as the number of clusters increases, the CHI index tends to decrease; the clustering effect is therefore better.
According to the log clustering method provided by the embodiment of the present invention, the clustering result of the logs is evaluated through the silhouette coefficient, the CHI index, and the DBI index, and the relationship between the evaluation indexes and the number of clusters in the clustering result can be analyzed. Based on this relationship, a suitable log clustering algorithm can be selected and the number of clusters in the clustering result can be adjusted, thereby effectively improving the log clustering effect.
Fig. 12 is a schematic structural diagram of a log clustering apparatus according to an embodiment of the present invention, and as shown in fig. 12, the log clustering apparatus 200 may include: an obtaining module 210, a classifying module 220, and a clustering module 230.
The obtaining module 210 is configured to obtain first log data, where the first log data includes a link tracking code TID and a log description; the classification module 220 is configured to classify the first log data based on the TID to obtain log data of multiple TID categories; and the clustering module 230 is configured to perform clustering processing on text information corresponding to log descriptions in the log data of the multiple TID categories according to a K-means clustering algorithm and an edit distance algorithm to obtain a clustering result of the first log data.
In some embodiments, the obtaining module 210 is specifically configured to: acquiring second log data; and removing the semi-structured data or the log data with missing content in the second log data to obtain the first log data.
In some embodiments, the apparatus further includes a determining module 240 configured to, after the first log data are obtained, clean the first log data by using regular expressions and determine the TID and the log description in the first log data according to the cleaned first log data.
In some embodiments, the clustering module 230 is specifically configured to: perform, according to the K-means clustering algorithm and the edit distance algorithm, clustering processing on the text information corresponding to the log descriptions in the log data of the multiple TID categories to obtain the clustering result of the first log data, including: vectorizing the TIDs and the log descriptions in the log data of the multiple TID categories respectively to obtain multiple feature dimensions; selecting the log descriptions in the first log data corresponding to the multiple feature dimensions, and performing term frequency-inverse document frequency (TF-IDF) numericalization on the text information corresponding to the log descriptions in the log data of the multiple TID categories to obtain multiple high-dimensional vectors; performing dimensionality reduction on the high-dimensional vectors according to principal component analysis (PCA) to obtain low-dimensional vectors; and clustering the low-dimensional vectors according to the K-means clustering algorithm and the edit distance algorithm to obtain the clustering result of the first log data.
In some embodiments, the apparatus further includes an evaluation module 250 configured to evaluate the clustering result according to clustering algorithm evaluation indexes to obtain an evaluation result, where the clustering algorithm evaluation indexes include the silhouette coefficient, the Calinski-Harabasz index, and the Davies-Bouldin index.
In some embodiments, the first log data further comprises: application system name, project name, host address, and log content.
The log clustering apparatus provided by the embodiment of the present invention can mine, from massive logs, effective information that helps operation and maintenance personnel with detection, thereby compensating for the error produced by a vectorization model with a single structure, improving the accuracy of the clustering result, and saving clustering time.
Fig. 13 is a schematic diagram of a hardware structure of a log clustering device according to an embodiment of the present invention.
As shown in Fig. 13, the log clustering device 300 in the present embodiment includes an input device 301, an input interface 302, a central processor 303, a memory 304, an output interface 305, and an output device 306. The input interface 302, the central processor 303, the memory 304, and the output interface 305 are connected to each other through a bus 310, and the input device 301 and the output device 306 are connected to the bus 310 through the input interface 302 and the output interface 305, respectively, and are thereby further connected to the other components of the log clustering device 300.
Specifically, the input device 301 receives input information from the outside and transmits the input information to the central processor 303 through the input interface 302; the central processor 303 processes the input information based on computer-executable instructions stored in the memory 304 to generate output information, stores the output information temporarily or permanently in the memory 304, and then transmits the output information to the output device 306 through the output interface 305; the output device 306 outputs the output information to the outside of the log clustering device 300 for use by the user.
In one embodiment, the log clustering device 300 shown in Fig. 13 includes: a memory 304 for storing a program; and a processor 303 for executing the program stored in the memory to perform the method of the embodiment shown in Fig. 1 or Fig. 6 provided by the embodiment of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement the method of the embodiment of fig. 1 or fig. 6 provided by embodiments of the present invention.
It is to be understood that the invention is not limited to the specific arrangements and instrumentality described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions or change the order between the steps after comprehending the spirit of the present invention.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, electronic circuits, application-specific integrated circuits (ASICs), suitable firmware, plug-ins, function cards, and the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, read-only memories (ROMs), flash memories, erasable ROMs (EROMs), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, radio frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the Internet or an intranet.
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
As described above, only the specific embodiments of the present invention are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.