CN111625342B - Data tracing method, device and server - Google Patents

Info

Publication number
CN111625342B
Authority
CN
China
Prior art keywords
data
tracing
target
vector
thread
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010509005.2A
Other languages
Chinese (zh)
Other versions
CN111625342A (en
Inventor
梁成敏
杨乐忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GUIZHOU BOCHENG TECHNOLOGY Co.,Ltd.
Original Assignee
Guizhou Zheng Hi Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Zheng Hi Tech Co Ltd
Priority to CN202010509005.2A
Publication of CN111625342A
Application granted
Publication of CN111625342B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5017 Task decomposition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/508 Monitor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention relates to the field of data processing technologies, and in particular to a data tracing method, an apparatus, and a server. The method comprises the following steps: when the number of detected target service processing threads is lower than a set value, determining the remaining memory resources of the server; acquiring a target data feature vector from a preset data storage area; starting a data tracing thread; loading the target data feature vector into the data tracing thread to determine a data tracing result; determining the running log corresponding to the data tracing thread; extracting from the running log a script file for determining the data tracing result; and storing the script file after associating it with the target data feature vector. The method can determine a scripted tracing result without affecting the normal service operation of the server. The scripted tracing result can later be called and run to trace data efficiently and quickly, which reduces the difficulty of tracing data at a later stage.

Description

Data tracing method, device and server
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data tracing method, an apparatus, and a server.
Background
With the development of big data, social production and daily life have become inseparable from data. Analyzing data makes it possible to mine the massive information behind it and thereby guide the normal business operation of various industries. To facilitate unified management, data is often stored centrally in a database. In general, the database may be a cloud server or a local server. To improve management efficiency, large amounts of data in a database are usually stored in a specific manner, but this increases the difficulty of tracing the source of the data at a later stage.
Disclosure of Invention
To solve the technical problem in the related art that tracing and updating of big data according to an activity track is difficult to achieve effectively, the invention provides a data tracing method, a data tracing device, and a server.
A data tracing method, applied to a server, comprising at least the following steps:
detecting whether the number of target service processing threads in a running state is lower than a set value, wherein the target service processing threads are started by a server based on a service processing request initiated by a user terminal, the target service processing threads occupy part of memory resources of the server when running, and the memory resources occupied by different target service processing threads are different in size;
when detecting that the number of the target service processing threads in the running state is lower than the set value, determining the residual memory resources corresponding to the server according to the memory resources occupied by the target service processing threads and the memory resources occupied by the system program of the server;
acquiring a target data characteristic vector having a preset mapping relation with the residual memory resources from a preset data storage area, and starting a data tracing thread according to the vector dimension of the target data characteristic vector; the preset mapping relation is obtained through vector dimensions of data characteristic vectors in the data storage area and resource fragmentation sequences of residual memory resources;
loading the target data feature vector into the data tracing thread to determine a data tracing result of the target data feature vector through the data tracing thread;
determining a running log of the data tracing thread for the period in which it runs and determines the data tracing result, and extracting from the running log a script file for determining the data tracing result;
and, after the association between the script file and the target data feature vector is completed, storing the script file in a target device in communication with the server and returning to the step of detecting whether the number of target service processing threads in the running state is lower than the set value.
Preferably, the step of obtaining the target data feature vector having a preset mapping relationship with the remaining memory resources from a preset data storage area specifically includes:
sequentially selecting a first data feature vector in the data storage area according to feature weight, and determining, among the data feature vectors in the data storage area, a second data feature vector whose feature weight ranks before that of the first data feature vector;
acquiring a weight distribution parameter of each vector value in the second data characteristic vector, and performing parameter characteristic mapping on the first data characteristic vector according to a parameter corresponding relation between the weight distribution parameter and the performance parameters of the residual memory resources to obtain a parameter mapping list of the first data characteristic vector; the performance parameter of the residual memory resource is used for representing a memory occupation coefficient of a reference data feature vector which is pre-distributed by taking the operating efficiency of the residual memory resource as a reference;
splitting the parameter mapping list to obtain a plurality of parameter list units corresponding to the parameter mapping list; determining a mapping identifier of the first data characteristic vector from the parameter mapping list to obtain identifier description information corresponding to the mapping identifier of the first data characteristic vector;
according to the identification description information and the parameter list units, determining a first mapping parameter of the first data characteristic vector relative to the data storage area and a second mapping parameter relative to the residual memory resource from the first data characteristic vector; and judging whether the similarity of the first mapping parameter and the second mapping parameter reaches a set threshold value, and determining the current first data feature vector as the target data feature vector when the similarity of the first mapping parameter and the second mapping parameter reaches the set threshold value.
Preferably, the step of starting a data tracing thread according to the vector dimension of the target data feature vector specifically includes:
generating a dimension distribution grid corresponding to a vector dimension of the target data feature vector and a vector value sequence distribution grid corresponding to a vector value of the target data feature vector, wherein the dimension distribution grid and the vector value sequence distribution grid respectively comprise a plurality of grid units with different grid characteristics;
extracting the unit attribute of the vector dimension of the target data feature vector in any grid unit of the dimension distribution grid, and determining the grid unit with the minimum grid feature in the vector value sequence distribution grid as a reference grid unit;
mapping the unit attribute to the reference grid unit according to a preset thread performance parameter and thread structure description, obtaining a current mapping attribute in the reference grid unit, and generating a conversion list between a vector dimension of the target data characteristic vector and a vector value of the target data characteristic vector according to the unit attribute and the current mapping attribute;
obtaining a tracing direction parameter in the reference grid unit by taking the current mapping attribute as a current attribute index, mapping the tracing direction parameter to the grid unit where the unit attribute is located according to a reverse conversion list corresponding to the conversion list, and obtaining tracing thread direction information corresponding to the tracing direction parameter in the grid unit where the unit attribute is located;
and determining a corresponding data tracing thread according to the tracing thread direction information and starting the data tracing thread corresponding to the tracing thread direction information.
Preferably, the step of loading the target data feature vector into the data tracing thread to determine a data tracing result of the target data feature vector through the data tracing thread specifically includes:
acquiring loading flow direction information of the target data feature vector as it is loaded into the data tracing thread; analyzing the loading flow direction information to obtain a plurality of information flows;
sorting the plurality of information flows by information weight to obtain an information flow sorting sequence of the loading flow direction information, wherein the information flow sorting sequence describes the information flow direction characteristics of the target data feature vector, relative to the data tracing thread, during loading;
sorting the plurality of information flows by information capacity to obtain an information capacity sorting sequence of the loading flow direction information, wherein the information capacity sorting sequence describes the information capacity characteristics of the target data feature vector, relative to the data tracing thread, during loading;
when the loading flow direction information comprises at least two information flows, extracting sequence features from the information flow sorting sequence and from the information capacity sorting sequence, so as to obtain a first sequence feature and a second sequence feature for each information flow in the loading flow direction information;
searching in a preset cloud database based on the first sequence feature and the second sequence feature to obtain a data packet corresponding to the first sequence feature and the second sequence feature;
and obtaining a data tracing result of the target data feature vector in the data tracing thread according to the mapping result of the data packet in the data tracing thread.
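The two sorting steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `InfoFlow` type, the descending sort order, and the choice of each flow's rank position as its "sequence feature" are all assumptions made for illustration, since the text does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoFlow:
    """Hypothetical record for one information flow (names are assumptions)."""
    flow_id: str
    weight: float   # information weight
    capacity: int   # information capacity


def sequence_features(flows):
    """Return {flow_id: (first_feature, second_feature)}.

    Assumption: each feature is the flow's rank in the weight-sorted and
    capacity-sorted sequences, with larger values ranked first.
    """
    by_weight = sorted(flows, key=lambda f: f.weight, reverse=True)
    by_capacity = sorted(flows, key=lambda f: f.capacity, reverse=True)
    weight_rank = {f.flow_id: i for i, f in enumerate(by_weight)}
    capacity_rank = {f.flow_id: i for i, f in enumerate(by_capacity)}
    return {
        f.flow_id: (weight_rank[f.flow_id], capacity_rank[f.flow_id])
        for f in flows
    }
```

The resulting pair per flow could then serve as the lookup key for the database search described in the next step, although the actual key format is not specified in the source.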
Preferably, the step of determining the running log of the data tracing thread for the period in which it runs and determines the data tracing result specifically includes:
determining, within the current time slice resources, a first moment at which the data tracing thread begins to occupy the time slice resources and a second moment at which the data tracing thread releases the time slice resources;
and determining the log file between the first moment and the second moment as the running log.
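Read as a filtering rule, the running log is the slice of the thread's log that lies between the two moments. The sketch below shows that reading in Python. The `(timestamp, message)` entry format and the inclusive bounds are assumptions, not details given in the text.

```python
from datetime import datetime


def running_log(entries, first_moment, second_moment):
    """Keep the log entries recorded between the two moments (inclusive).

    `entries` is assumed to be an iterable of (timestamp, message) pairs.
    """
    if second_moment < first_moment:
        raise ValueError("second moment must not precede first moment")
    return [
        (ts, msg) for ts, msg in entries
        if first_moment <= ts <= second_moment
    ]
```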
Preferably, the step of extracting a script file for determining the data tracing result from the running log specifically includes:
acquiring data-based description information of the data tracing result, and splitting the description information into a plurality of data segments; pushing the data segments in parallel to preset data analysis threads that are in an activated state; wherein each data segment instructs the corresponding preset data analysis thread to generate script parameters for that data segment by: converting the data segment into data description parameters and data logic parameters, extracting a first parameter sequence from each parameter array of the data description parameters, extracting a second parameter sequence from each logic array of the data logic parameters, determining a scripted description according to the first parameter sequence, and determining a tracing logic topology according to the second parameter sequence;
analyzing each scripted description and tracing logic topology to obtain the script parameters corresponding to the plurality of data segments; and removing redundant system data from the script parameters generated by each preset data analysis thread, and combining the remaining script parameters to generate the script file corresponding to the data tracing result.
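The split, parallel analysis, and merge pattern described above can be sketched as follows. This is only an outline under stated assumptions: `analyse_segment` is a hypothetical stand-in for a preset data analysis thread, the description is modeled as a list of strings, and the `redundant` predicate stands for whatever rule identifies redundant system data, which the text does not specify.

```python
from concurrent.futures import ThreadPoolExecutor


def split_segments(description, size):
    """Split the description (a list of items) into fixed-size segments."""
    return [description[i:i + size] for i in range(0, len(description), size)]


def analyse_segment(segment):
    """Hypothetical analysis step: emit one script parameter per item."""
    return [f"step {item}" for item in segment]


def build_script(description, segment_size=2, redundant=lambda p: False):
    """Analyse segments in parallel, drop redundant parameters, and merge."""
    segments = split_segments(description, segment_size)
    with ThreadPoolExecutor(max_workers=max(1, len(segments))) as pool:
        # pool.map returns results in the same order as the input segments
        per_segment = list(pool.map(analyse_segment, segments))
    params = [p for seg in per_segment for p in seg if not redundant(p)]
    return "\n".join(params)
```

Because `pool.map` preserves input order, the merged script keeps the order of the original description even though the segments are processed concurrently.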
Preferably, the method further comprises:
acquiring a request instruction which is sent by a service terminal and used for acquiring target data;
acquiring a characteristic vector of data to be traced from the data storage area according to the request instruction;
determining a script file to be processed which is associated with the characteristic vector of the data to be traced from the target equipment;
running the script file to be processed to obtain target data corresponding to the characteristic vector of the data to be traced;
and sending the target data to the service terminal.
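The request flow above amounts to a keyed lookup followed by a script run. A minimal sketch, assuming the association is a mapping from the feature vector to its script and that a script can be modeled as a Python callable, might look like this. `ScriptStore`, `handle_request`, and the `data_id` request field are hypothetical names, not part of the source.

```python
class ScriptStore:
    """Stand-in for the target device holding associated script files."""

    def __init__(self):
        self._scripts = {}

    def associate(self, vector, script):
        # A tuple of the vector values serves as the association key (assumption).
        self._scripts[tuple(vector)] = script

    def lookup(self, vector):
        return self._scripts[tuple(vector)]


def handle_request(request, data_storage, store):
    """Resolve the feature vector, run its associated script, return the data."""
    vector = data_storage[request["data_id"]]   # vector of the data to be traced
    script = store.lookup(vector)               # script file to be processed
    return script(vector)                       # target data for the terminal
```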
A data tracing device is applied to a server, and the device at least comprises:
the thread detection module is used for detecting whether the number of target service processing threads in a running state is lower than a set value, the target service processing threads are started by a server based on a service processing request initiated by a user terminal, the target service processing threads occupy part of memory resources of the server when running, and the memory resources occupied by different target service processing threads are different in size;
the resource determining module is used for determining the residual memory resources corresponding to the server according to the memory resources occupied by the target business processing threads and the memory resources occupied by the system programs of the server when detecting that the number of the target business processing threads in the running state is lower than the set value;
the vector acquisition module is used for acquiring a target data characteristic vector which has a preset mapping relation with the residual memory resources from a preset data storage area and starting a data tracing thread according to the vector dimension of the target data characteristic vector; the preset mapping relation is obtained through vector dimensions of data characteristic vectors in the data storage area and resource fragmentation sequences of residual memory resources;
a source tracing determining module, configured to load the target data feature vector into the data source tracing thread to determine a data source tracing result of the target data feature vector through the data source tracing thread;
the script extraction module is used for determining the running log of the data tracing thread for the period in which it runs and determines the data tracing result, and extracting from the running log a script file for determining the data tracing result;
and the script storage module is used for storing the script file in a target device in communication with the server after the association between the script file and the target data feature vector is completed, and returning to the step of detecting whether the number of target service processing threads in the running state is lower than the set value.
A server, comprising:
a processor, and
a non-volatile memory and a network interface connected with the processor;
the processor is used for calling the computer program in the nonvolatile memory through a network interface and running the computer program through the memory of the processor so as to execute the data tracing method.
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the data tracing method described above.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects.
A scripted tracing result can be determined without affecting the normal service operation of the server. When data tracing is needed at a later stage, the scripted tracing result can be called and run to trace data efficiently and quickly, which reduces the difficulty of tracing data at a later stage.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow diagram illustrating a data tracing method in accordance with an exemplary embodiment.
FIG. 2 is a schematic diagram illustrating a server according to an example embodiment.
FIG. 3 is a block diagram illustrating an apparatus in accordance with an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
To facilitate effective tracing of stored data, the present disclosure provides a data tracing method, a data tracing device, and a server, which can trace different data in turn during the service processing window periods of the server and store each scripted tracing result in association with its data. With this method, a scripted tracing result can be determined without affecting the normal service operation of the server. When data tracing is needed at a later stage, the scripted tracing result can be called and run to trace data efficiently, which reduces the difficulty of tracing data at a later stage.
FIG. 1 shows a flowchart of a data tracing method according to one possible embodiment of the present application. The data tracing method may be applied to a server and may be implemented through the following steps.
Step S110, detecting whether the number of target service processing threads in a running state is lower than a set value, where the target service processing threads are started by a server based on a service processing request initiated by a user terminal, and the target service processing threads occupy part of memory resources of the server when running, and the memory resources occupied by different target service processing threads are different in size.
In a possible implementation manner, the set value may be determined according to the total number of the target service processing threads in the set time period and an average value of the memory resources occupied by the target service processing threads in the set time period. In detail, the set period may be a certain period before the current time, for example, several hours or several days ago.
Step S120, when it is detected that the number of the target service processing threads in the running state is lower than the set value, determining the remaining memory resources corresponding to the server according to the memory resources occupied by the target service processing threads and the memory resources occupied by the system program of the server.
In step S120, the remaining memory resources may be determined as the rated memory resources of the server minus both the memory resources occupied by the target service processing threads and the memory resources occupied by the system program of the server. Determining the remaining memory resources allows sufficient memory to be reserved for the subsequent data tracing and avoids tracing errors caused by insufficient memory.
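The subtraction in step S120 can be written out directly. The function name and the byte-count arguments below are illustrative assumptions. The source only states that the remaining memory is the rated memory minus the two occupied amounts.

```python
def remaining_memory(rated_bytes, service_thread_bytes, system_bytes):
    """Remaining memory = rated memory - (service threads + system program).

    `service_thread_bytes` holds one entry per running target service
    processing thread, since each thread may occupy a different amount.
    """
    used = sum(service_thread_bytes) + system_bytes
    remaining = rated_bytes - used
    if remaining < 0:
        raise ValueError("reported usage exceeds rated memory")
    return remaining
```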
Step S130, acquiring a target data characteristic vector having a preset mapping relation with the residual memory resources from a preset data storage area, and starting a data tracing thread according to a vector dimension of the target data characteristic vector; and the preset mapping relation is obtained through the vector dimension of the data characteristic vector in the data storage area and the resource fragmentation sequence of the residual memory resource.
In one example, the starting of the data tracing thread occupies the remaining memory resources, and the corresponding target data characteristic vector is determined through the preset mapping relation, so that the utilization rate of the remaining memory resources can be ensured, and the underload or overload of the remaining memory resources is avoided.
Step S140, loading the target data feature vector into the data tracing thread to determine a data tracing result of the target data feature vector through the data tracing thread.
In this embodiment, the data tracing result may include information of multiple dimensions, such as a user behavior corresponding to the target data feature vector, and a data interaction record of the data application scenario. Further, the storage space occupied by the data tracing result is larger than the storage space occupied by the target data characteristic vector.
Step S150, determining a running log of the data tracing thread for the period in which it runs and determines the data tracing result, and extracting from the running log a script file for determining the data tracing result.
In a specific example, the running log is updated in real time, so the log of the data tracing thread during general running differs from the log recorded while the data tracing result is being determined. In this embodiment, the running log covers the period in which the data tracing result is determined. Put simply, the running log is the log produced while the data tracing thread determines the data tracing result.
With the above example, the script file is used to characterize the whole process of determining the data tracing result, the script file retains the node information of the whole process, and when the script file is re-run, the data tracing result can be quickly determined.
Step S160, after the association between the script file and the target data feature vector is completed, storing the script file into a target device in communication with the server, and returning to the step of detecting whether the number of target service processing threads in the running state is lower than a set value.
In one possible embodiment, the target device is configured to store a script file that completes the association. When a data tracing result of the target data feature vector needs to be obtained at a later stage, the script file can be called from the target equipment and run, and therefore the data tracing result can be determined quickly.
Since the above steps are executed while the server is relatively idle, the normal service operation of the server is not affected. It can be understood that by repeating these steps, script files corresponding to a plurality of stored data feature vectors can be determined, so that data tracing can be carried out rapidly at a later stage.
It can be understood that, when the data tracing method described in steps S110 to S160 is applied, a scripted tracing result can be determined without affecting the normal service operation of the server. When data tracing is needed at a later stage, the scripted tracing result can be called and run to trace data efficiently and quickly, which reduces the difficulty of tracing data at a later stage.
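The control flow of steps S110 to S160 can be summarized in a short Python sketch of a single pass. The helper callables `pick_vector`, `trace`, and `extract_script` are hypothetical placeholders for steps S130 to S150, and the dictionary `target_device` stands in for the device that stores the associated scripts. None of these names come from the source.

```python
from dataclasses import dataclass


@dataclass
class ServiceThread:
    """Hypothetical view of a target service processing thread."""
    memory: int
    running: bool = True


def tracing_pass(threads, set_value, rated_memory, system_memory,
                 pick_vector, trace, extract_script, target_device):
    """Run one pass of S110 to S160 and report whether a script was stored."""
    running = [t for t in threads if t.running]
    if len(running) >= set_value:                 # S110: server is busy
        return False
    remaining = (rated_memory                     # S120: remaining memory
                 - sum(t.memory for t in running)
                 - system_memory)
    vector = pick_vector(remaining)               # S130: matching vector
    if vector is None:
        return False
    _result, log = trace(vector)                  # S140: tracing result and log
    script = extract_script(log)                  # S150: script from the log
    target_device[tuple(vector)] = script         # S160: associate and store
    return True
```

A caller would invoke `tracing_pass` repeatedly, which corresponds to returning to the detection step after each script is stored.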
In a possible implementation manner, in order to ensure the utilization rate of the remaining memory resources, the step of obtaining the target data feature vector having the preset mapping relationship with the remaining memory resources from the preset data storage area, which is described in step S130, may be further implemented by the method described in the following substeps.
Step S1311, sequentially selecting a first data feature vector in the data storage area according to feature weight, and determining, in the data storage area, a second data feature vector whose feature weight ranks before that of the first data feature vector.
Step S1312, obtaining a weight distribution parameter of each vector value in the second data feature vector, and performing parameter feature mapping on the first data feature vector with reference to a parameter correspondence between the weight distribution parameter and the performance parameters of the remaining memory resources to obtain a parameter mapping list of the first data feature vector; the performance parameter of the remaining memory resources represents a memory occupation coefficient of a reference data feature vector that is pre-allocated on the basis of the operating efficiency of the remaining memory resources.
Step S1313, splitting the parameter mapping list to obtain a plurality of parameter list units corresponding to the parameter mapping list; and determining the mapping identifier of the first data feature vector from the parameter mapping list to obtain identifier description information corresponding to the mapping identifier of the first data feature vector.
Step S1314, determining a first mapping parameter of the first data feature vector with respect to the data storage area and a second mapping parameter with respect to the remaining memory resources from the first data feature vector according to the identifier description information and the plurality of parameter list units; and judging whether the similarity of the first mapping parameter and the second mapping parameter reaches a set threshold value, and determining the current first data feature vector as the target data feature vector when the similarity of the first mapping parameter and the second mapping parameter reaches the set threshold value.
It should be understood that, when the method steps described in steps S1311 to S1314 are executed, similarity comparison can be performed on the first mapping parameter and the second mapping parameter corresponding to the data feature vector according to the feature weight of the data feature vector stored in the data storage area, so as to determine the target data feature vector according to the result of the similarity comparison. Therefore, the target data feature vector can be matched with the residual memory resources, and the utilization rate of the residual memory resources is further ensured.
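The selection logic of steps S1311 to S1314 can be illustrated with a minimal sketch. It rests on several assumptions that are not in the text above: cosine similarity stands in for the unspecified similarity measure, candidates are visited from the highest feature weight downward, and the two mapping parameters are produced by caller-supplied functions (`storage_mapping` and `memory_mapping` are hypothetical names) rather than by the parameter mapping list described in S1312 and S1313.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either input has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_target_vector(
    candidates: List[Tuple[float, Vector]],
    storage_mapping: Callable[[Vector], Vector],
    memory_mapping: Callable[[Vector], Vector],
    threshold: float,
) -> Optional[Vector]:
    """Return the first candidate whose two mapping parameters are similar enough.

    Each candidate is a (feature_weight, vector) pair. Visiting the highest
    weight first is an assumption; the text only says "according to the
    feature weight".
    """
    for _, vector in sorted(candidates, key=lambda item: item[0], reverse=True):
        first = storage_mapping(vector)   # mapping parameter w.r.t. the data storage area
        second = memory_mapping(vector)   # mapping parameter w.r.t. the remaining memory
        if cosine_similarity(first, second) >= threshold:
            return vector                 # S1314: similarity reached the set threshold
    return None                           # no candidate qualified
```

Returning `None` when no candidate qualifies is a design choice of this sketch; the text does not say what happens in that case.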
In another possible example, the step of starting the data tracing thread according to the vector dimension of the target data feature vector described in the above step S130 may be further implemented by a method described in the following step.
Step S1321, generating a dimension distribution grid corresponding to a vector dimension of the target data feature vector and generating a vector value sequence distribution grid corresponding to a vector value of the target data feature vector, where the dimension distribution grid and the vector value sequence distribution grid respectively include a plurality of grid units with different grid features.
Step S1322, extracting a unit attribute of the vector dimension of the target data feature vector in any grid unit of the dimension distribution grid, and determining the grid unit having the minimum grid feature in the vector value sequence distribution grid as a reference grid unit.
Step S1323, mapping the unit attribute to the reference grid unit according to a preset thread performance parameter and a thread structure description, obtaining a current mapping attribute in the reference grid unit, and generating a conversion list between the vector dimension of the target data feature vector and the vector value of the target data feature vector according to the unit attribute and the current mapping attribute.
Step S1324, obtaining a tracing direction parameter in the reference grid unit by taking the current mapping attribute as a current attribute index, mapping the tracing direction parameter to the grid unit where the unit attribute is located according to the reverse conversion list corresponding to the conversion list, and obtaining tracing thread direction information corresponding to the tracing direction parameter in the grid unit where the unit attribute is located.
Step S1325, determining a corresponding data tracing thread according to the tracing thread direction information and starting the data tracing thread corresponding to the tracing thread direction information.
In specific implementation, by executing the steps described in the steps S1321 to S1325, the vector dimension and the vector value of the target data feature vector can be analyzed, so that the mapping relationship and the confirmation of the conversion list are performed by using the dimension distribution grid and the vector value sequence distribution grid, and the tracing thread direction information corresponding to the tracing direction parameter is accurately determined. Therefore, the data tracing thread can be determined based on the tracing thread direction information, and reliable data tracing can be realized by the opened data tracing thread.
On the basis, in order to accurately and comprehensively obtain the data tracing result corresponding to the target data feature vector, the step of loading the target data feature vector into the data tracing thread to determine the data tracing result of the target data feature vector through the data tracing thread described in the step S140 may specifically be implemented according to a method principle corresponding to the following sub-steps.
Step S1401, obtaining loading flow direction information of the target data feature vector loaded into the data tracing thread, and analyzing the loading flow direction information to obtain a plurality of information flows.
Step S1402, sorting the plurality of information flows by information weight to obtain an information flow ordering sequence of the loading flow direction information, where the information flow ordering sequence describes the information flow direction characteristics of the target data feature vector, relative to the data tracing thread, during loading.
Step S1403, sorting the plurality of information flows by information capacity to obtain an information capacity ordering sequence of the loading flow direction information, where the information capacity ordering sequence describes the information capacity characteristics of the target data feature vector, relative to the data tracing thread, during loading.
Step S1404, when the loading flow direction information includes at least two information flows, for each information flow in the loading flow direction information, performing sequence feature extraction on the information flow ordering sequence and the information capacity ordering sequence of the information flow, respectively, to obtain a first sequence feature and a second sequence feature.
Step S1405, based on the first sequence feature and the second sequence feature, searching in a preset cloud database to obtain a data packet corresponding to the first sequence feature and the second sequence feature.
Step S1406, according to the mapping result of the data packet in the data tracing thread, obtaining a data tracing result of the target data feature vector in the data tracing thread.
When the method described in the above steps S1401 to S1406 is applied, the data tracing result corresponding to the target data feature vector can be accurately and comprehensively obtained.
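As a rough illustration of steps S1402 to S1405, the sketch below sorts information flows by weight and by capacity, uses each flow's rank in the two orderings as its pair of sequence features, and looks the pair up in a dictionary. All of this is assumed for illustration: the `InfoFlow` type, the use of rank positions as "sequence features", and the in-memory dictionary standing in for the preset cloud database are hypothetical and not specified in the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class InfoFlow:
    """Hypothetical record for one information flow."""
    name: str
    weight: float
    capacity: int


def rank_features(flows: List[InfoFlow]) -> Dict[str, Tuple[int, int]]:
    """Map each flow name to (rank by weight, rank by capacity), largest first."""
    by_weight = sorted(flows, key=lambda f: f.weight, reverse=True)      # S1402
    by_capacity = sorted(flows, key=lambda f: f.capacity, reverse=True)  # S1403
    weight_rank = {f.name: i for i, f in enumerate(by_weight)}
    capacity_rank = {f.name: i for i, f in enumerate(by_capacity)}
    return {f.name: (weight_rank[f.name], capacity_rank[f.name]) for f in flows}


def lookup_packets(
    flows: List[InfoFlow], database: Dict[Tuple[int, int], str]
) -> Dict[str, str]:
    """Return the data packet found for each flow (S1404 and S1405).

    Feature extraction only runs when there are at least two flows,
    mirroring the condition stated in S1404.
    """
    if len(flows) < 2:
        return {}
    features = rank_features(flows)
    return {name: database[key] for name, key in features.items() if key in database}
```

Step S1406, which maps the retrieved data packet inside the tracing thread, is left out because the text gives no detail on that mapping.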
Further, in step S150, the step of determining the running log generated while the data tracing thread runs and determines the data tracing result may specifically include the following sub-steps.
Determining, within the current time slice resources, a first moment at which the data tracing thread begins to occupy the time slice resources and a second moment at which it releases them;
and determining the log file recorded between the first moment and the second moment as the running log.
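The two sub-steps above amount to filtering log entries by a time window. A minimal sketch, assuming that log entries are available as `(timestamp, message)` pairs and that both boundary moments are inclusive (neither detail is stated in the text):

```python
from datetime import datetime
from typing import List, Tuple

LogEntry = Tuple[datetime, str]


def extract_running_log(
    entries: List[LogEntry], first_moment: datetime, second_moment: datetime
) -> List[LogEntry]:
    """Keep the entries recorded between the two moments, boundaries included."""
    if second_moment < first_moment:
        raise ValueError("second_moment must not precede first_moment")
    return [entry for entry in entries if first_moment <= entry[0] <= second_moment]
```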
Further, on the basis of the above, the step of extracting the script file for determining the data tracing result from the running log in step S150 may be implemented by the method described in the following sub-step.
Step S1501, obtaining the datamation description information of the data tracing result, and splitting the datamation description information into a plurality of data segments; pushing each data segment to each preset data analysis thread in an activation process in parallel; the data segments are used for indicating corresponding preset data analysis threads to generate script parameters corresponding to the data segments, the data segments are also used for indicating corresponding preset data analysis threads to respectively convert the data segments into data description parameters and data logic parameters, respectively extracting a first parameter sequence from each parameter array of the data description parameters, respectively extracting a second parameter sequence from each logic array of the data logic parameters, determining scripted description according to the first parameter sequence and determining tracing logic topology according to the second parameter sequence.
Step S1502, analyzing each scripted description and the tracing logic topology to obtain script parameters corresponding to the plurality of data segments; and removing redundant system data in the script parameters generated by each preset data analysis thread, and combining and generating a script file corresponding to the data tracing result according to the script parameters left after the redundant system data are removed.
It can be understood that, based on steps S1501 and S1502, the script file can be simplified while its accuracy is preserved, which improves the efficiency of determining the script file. It also allows more script files to be stored in the same storage space later on.
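The split, parallel analysis, and merge pattern of steps S1501 and S1502 can be sketched as follows. This is only an illustration under stated assumptions: the description information is modelled as a list of text lines, the per-segment analysis is reduced to whitespace trimming, and "redundant system data" is taken to mean lines carrying a hypothetical `sys:` prefix. The extraction of parameter sequences, scripted descriptions, and the tracing logic topology is not modelled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

SYSTEM_PREFIX = "sys:"  # hypothetical marker for redundant system data


def split_into_segments(lines: List[str], segment_size: int) -> List[List[str]]:
    """Split the description lines into consecutive segments of fixed size."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [lines[i:i + segment_size] for i in range(0, len(lines), segment_size)]


def analyse_segment(segment: List[str]) -> List[str]:
    """Stand-in for a data analysis thread: trim lines and drop empty ones."""
    return [line.strip() for line in segment if line.strip()]


def build_script(lines: List[str], segment_size: int = 2, workers: int = 4) -> str:
    """Analyse segments in parallel, drop system data, and merge the rest."""
    segments = split_into_segments(lines, segment_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so the merged
        # script keeps the original ordering of the segments.
        per_segment = list(pool.map(analyse_segment, segments))
    params = [p for seg in per_segment for p in seg if not p.startswith(SYSTEM_PREFIX)]
    return "\n".join(params)
```

A thread pool is used here because the text speaks of pushing segments to analysis threads in parallel; for CPU-bound analysis in Python a process pool might be the better fit.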
It is understood that, on the basis of the above steps S110 to S160, the data tracing method may further include the following steps.
Step S1701, acquiring a request instruction, sent by the service terminal, for acquiring target data.
Step S1702, obtaining a to-be-traced data feature vector from the data storage area according to the request instruction.
Step S1703, determining, from the target device, a to-be-processed script file associated with the to-be-traced data feature vector.
Step S1704, running the to-be-processed script file to obtain the target data corresponding to the to-be-traced data feature vector.
Step S1705, sending the target data to the service terminal.
When the method described in the above steps S1701 to S1705 of this embodiment is applied, data tracing can be quickly implemented based on the script file stored by the target device and the association relationship between the script file and the data feature vector in the data storage area, and the efficiency of data tracing is improved.
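The request path of steps S1701 to S1705 is essentially two lookups followed by running the stored script. A minimal sketch, assuming the data storage area and the target device can be modelled as dictionaries and that a stored script is a zero-argument callable (`handle_request`, `data_storage`, and `device_scripts` are hypothetical names):

```python
from typing import Callable, Dict, Tuple

FeatureVector = Tuple[float, ...]


def handle_request(
    request_key: str,
    data_storage: Dict[str, FeatureVector],
    device_scripts: Dict[FeatureVector, Callable[[], str]],
) -> str:
    """Resolve a request to target data by running the associated script.

    Raises KeyError if the request or the script association is unknown.
    """
    vector = data_storage[request_key]   # S1702: feature vector to be traced
    script = device_scripts[vector]      # S1703: script associated with the vector
    return script()                      # S1704; the return value is what S1705 sends back
```

Tuples are used for the feature vectors so that they can serve as dictionary keys, which models the association between a script file and its data feature vector.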
In an alternative embodiment, the step of storing the script file to the target device in communication with the server in step S160 may be specifically implemented by the method described in the following steps.
Step S1601, acquiring a message sequence including the communication protocol state of the target device, which is acquired in real time; and identifying the communication protocol state track of the communication protocol state of the target equipment in the message sequence.
Step S1602, when the communication protocol state track is a target state track for receiving the communication packet of the server, determining a packet encapsulation logic corresponding to the communication protocol state of the target device in the packet sequence according to a packet encoding mode of a specified transmission format in the communication protocol state of the target device.
Step S1603, encapsulating the script file according to the message encapsulation logic to obtain a message to be processed, and judging whether the message sending channel between the server and the target device is occupied; if so, waiting for a set duration and then returning to the step of judging whether the message sending channel is occupied; if not, sending the message to be processed to the target device through the message sending channel, so that the target device unseals the message to be processed to obtain the script file and stores the script file.
It can be understood that, based on steps S1601 to S1603, the script file is encapsulated before being sent, so that distortion and garbled characters during transmission are avoided and the target device is able to store a complete and accurate script file.
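The encapsulate, wait, and send behaviour of step S1603 can be sketched as below. Several parts are assumptions rather than the method of the text: a simple length plus CRC32 header replaces the encapsulation logic that S1602 derives from the target device's protocol state, the channel is represented by two caller-supplied callables (`channel_busy` and `send` are hypothetical), and `max_attempts` is added so the sketch cannot loop forever, whereas the text describes retrying without a stated limit.

```python
import struct
import time
import zlib
from typing import Callable

HEADER = struct.Struct("!II")  # assumed framing: payload length, CRC32 checksum


def encapsulate(script: bytes) -> bytes:
    """Prefix the script with its length and checksum."""
    return HEADER.pack(len(script), zlib.crc32(script)) + script


def unseal(message: bytes) -> bytes:
    """Recover the script on the target device, rejecting corrupted messages."""
    length, checksum = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != checksum:
        raise ValueError("message corrupted in transit")
    return payload


def send_when_free(
    message: bytes,
    channel_busy: Callable[[], bool],
    send: Callable[[bytes], None],
    wait_seconds: float = 0.1,
    max_attempts: int = 50,
) -> bool:
    """Send once the channel is free; return False if it stays occupied."""
    for _ in range(max_attempts):
        if not channel_busy():
            send(message)
            return True
        time.sleep(wait_seconds)  # wait the set duration, then check again
    return False
```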
Corresponding to the embodiment of the data tracing method, the application also provides embodiments of a data tracing device and a server.
The embodiment of the data tracing device can be applied to the server. The embodiments of the apparatus may be implemented by software, or by hardware, or by a combination of hardware and software. Taking a software implementation as an example, as a logical device, the device is formed by reading corresponding computer program instructions in the nonvolatile memory into the memory for operation through the processor of the server where the device is located.
From a hardware perspective, Fig. 2 shows a hardware structure diagram of the server 200 in which the data tracing apparatus 201 of the present application is located. In addition to the processor 210, the memory 230, the network interface 240, and the nonvolatile memory 220 shown in Fig. 2, the server in which the apparatus is located may also include other hardware according to its actual functions, which is not shown item by item in Fig. 2.
Further, the processor 210 is configured to call the computer program in the nonvolatile memory 220 through the network interface 240, and run the computer program through the memory 230 of the processor 210 to execute the data tracing method.
In another example, the present application further provides a computer-readable storage medium on which a computer program is stored, and the computer program, when executed by the processor 210, implements the data tracing method described above.
Fig. 3 is a functional block diagram of the data tracing apparatus 201 according to the present application. Specifically, the data tracing apparatus 201 includes the following functional modules.
The thread detection module 2011 is configured to detect whether the number of target service processing threads in a running state is lower than a set value, where the target service processing threads are started by the server based on a service processing request initiated by the user terminal, and the target service processing threads occupy part of memory resources of the server when running, and the memory resources occupied by different target service processing threads are different in size.
The resource determining module 2012 is configured to determine, when it is detected that the number of the target service processing threads in the running state is lower than the set value, the remaining memory resource corresponding to the server according to the memory resource occupied by the target service processing threads and the memory resource occupied by the system program of the server.
The vector obtaining module 2013 is configured to obtain a target data feature vector having a preset mapping relationship with the remaining memory resources from a preset data storage area, and start a data tracing thread according to a vector dimension of the target data feature vector; the preset mapping relationship is obtained from the vector dimension of the data feature vectors in the data storage area and the resource fragmentation sequence of the remaining memory resources.
The tracing determining module 2014 is configured to load the target data feature vector into the data tracing thread to determine a data tracing result of the target data feature vector through the data tracing thread.
The script extraction module 2015 is configured to determine the running log generated while the data tracing thread runs and determines the data tracing result, and to extract from the running log a script file for determining the data tracing result.
The script storage module 2016 is configured to store the script file in a target device in communication with the server after the association between the script file and the target data feature vector is completed, and to return to the step of detecting whether the number of target service processing threads in the running state is lower than a set value.
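The check performed by the thread detection module and the resource determining module can be sketched as a small function. The arithmetic is an assumption: the text says only that the remaining memory is determined from the memory occupied by the target service processing threads and by the system program, so subtracting both from a known total is one plausible reading. The names `idle_memory_if_quiet` and `total_memory` are hypothetical.

```python
from typing import List, Optional


def idle_memory_if_quiet(
    thread_memory_usage: List[int],
    set_value: int,
    total_memory: int,
    system_memory: int,
) -> Optional[int]:
    """Return the remaining memory when few service threads are running.

    thread_memory_usage holds one entry per running target service
    processing thread. None is returned when the thread count is not below
    the set value, in which case no tracing would be started.
    """
    if len(thread_memory_usage) >= set_value:
        return None
    remaining = total_memory - sum(thread_memory_usage) - system_memory
    return max(remaining, 0)  # clamp so the caller never sees a negative budget
```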
In summary, the data tracing method, the data tracing device and the server of the present disclosure can determine a scripted tracing result without affecting the normal service operation of the server. When data tracing is needed later, the scripted tracing result can be invoked and run to trace the data efficiently and quickly, which reduces the difficulty of later data tracing.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A data tracing method is applied to a server, and the method at least comprises the following steps:
detecting whether the number of target service processing threads in a running state is lower than a set value, wherein the target service processing threads are started by a server based on a service processing request initiated by a user terminal, the target service processing threads occupy part of memory resources of the server when running, and the memory resources occupied by different target service processing threads are different in size;
when detecting that the number of the target service processing threads in the running state is lower than the set value, determining the residual memory resources corresponding to the server according to the memory resources occupied by the target service processing threads and the memory resources occupied by the system program of the server;
acquiring a target data characteristic vector having a preset mapping relation with the residual memory resources from a preset data storage area, and starting a data tracing thread according to the vector dimension of the target data characteristic vector; the preset mapping relation is obtained through vector dimensions of data characteristic vectors in the data storage area and resource fragmentation sequences of residual memory resources;
loading the target data feature vector into the data tracing thread to determine a data tracing result of the target data feature vector through the data tracing thread;
determining an operation log of the data tracing thread during operation and determining the data tracing result, and extracting a script file for determining the data tracing result from the operation log;
and after the association between the script file and the target data characteristic vector is completed, storing the script file into target equipment communicated with the server and returning to the step of detecting whether the number of the target service processing threads in the running state is lower than a set numerical value.
2. The data tracing method according to claim 1, wherein the step of obtaining the target data feature vector having a preset mapping relationship with the remaining memory resources from a preset data storage area specifically comprises:
sequentially selecting first data characteristic vectors in the data storage area according to the characteristic weight, and determining second data characteristic vectors in the data characteristic vectors before the first data characteristic vectors in the data storage area;
acquiring a weight distribution parameter of each vector value in the second data characteristic vector, and performing parameter characteristic mapping on the first data characteristic vector according to a parameter corresponding relation between the weight distribution parameter and the performance parameters of the residual memory resources to obtain a parameter mapping list of the first data characteristic vector; the performance parameter of the residual memory resource is used for representing a memory occupation coefficient of a reference data feature vector which is pre-distributed by taking the operating efficiency of the residual memory resource as a reference;
splitting the parameter mapping list to obtain a plurality of parameter list units corresponding to the parameter mapping list; determining a mapping identifier of the first data characteristic vector from the parameter mapping list to obtain identifier description information corresponding to the mapping identifier of the first data characteristic vector;
according to the identification description information and the parameter list units, determining a first mapping parameter of the first data characteristic vector relative to the data storage area and a second mapping parameter relative to the residual memory resource from the first data characteristic vector; and judging whether the similarity of the first mapping parameter and the second mapping parameter reaches a set threshold value, and determining the current first data feature vector as the target data feature vector when the similarity of the first mapping parameter and the second mapping parameter reaches the set threshold value.
3. The data tracing method of claim 2, wherein the step of starting a data tracing thread according to the vector dimension of the target data feature vector specifically comprises:
generating a dimension distribution grid corresponding to a vector dimension of the target data feature vector and a vector value sequence distribution grid corresponding to a vector value of the target data feature vector, wherein the dimension distribution grid and the vector value sequence distribution grid respectively comprise a plurality of grid units with different grid characteristics;
extracting the unit attribute of the vector dimension of the target data feature vector in any grid unit of the dimension distribution grid, and determining the grid unit with the minimum grid feature in the vector value sequence distribution grid as a reference grid unit;
mapping the unit attribute to the reference grid unit according to a preset thread performance parameter and thread structure description, obtaining a current mapping attribute in the reference grid unit, and generating a conversion list between a vector dimension of the target data characteristic vector and a vector value of the target data characteristic vector according to the unit attribute and the current mapping attribute;
obtaining a tracing direction parameter in the reference grid unit by taking the current mapping attribute as a current attribute index, mapping the tracing direction parameter to the grid unit where the unit attribute is located according to a reverse conversion list corresponding to the conversion list, and obtaining tracing thread direction information corresponding to the tracing direction parameter in the grid unit where the unit attribute is located;
and determining a corresponding data tracing thread according to the tracing thread direction information and starting the data tracing thread corresponding to the tracing thread direction information.
4. The data tracing method according to any one of claims 1 to 3, wherein the step of loading the target data feature vector into the data tracing thread to determine a data tracing result of the target data feature vector by the data tracing thread specifically comprises:
acquiring loading flow direction information of a target data feature vector loaded into the data tracing thread; analyzing the loading flow direction information to obtain at least a plurality of information flows;
performing information weight sorting on the at least a plurality of information flows to obtain an information flow sorting sequence of the loading flow direction information, wherein the information flow sorting sequence is used for describing information flow direction characteristics of the target data characteristic vector in a loading process relative to the data tracing thread;
sequencing information capacity of the at least a plurality of information flows to obtain an information capacity sequencing sequence of the loading flow direction information, wherein the information capacity sequencing sequence is used for describing information capacity characteristics of the target data characteristic vector in a loading process relative to the data tracing thread;
when the loading flow direction information comprises at least two information flows, respectively extracting sequence characteristics of an information flow sequencing sequence and an information capacity sequencing sequence of the information flows to obtain a first sequence characteristic and a second sequence characteristic for each information flow in the loading flow direction information;
searching in a preset cloud database based on the first sequence feature and the second sequence feature to obtain a data packet corresponding to the first sequence feature and the second sequence feature;
and obtaining a data tracing result of the target data feature vector in the data tracing thread according to the mapping result of the data packet in the data tracing thread.
5. The data tracing method of claim 1, wherein the step of determining the running log of the data tracing thread when running and the step of determining the data tracing result specifically comprises:
determining a first moment when the occupation behavior of the time slice resources is generated and a second moment when the occupation behavior of the time slice resources is released, which correspond to the data tracing thread, in the current time slice resources;
and determining the log file between the first time and the second time as the running log.
6. The data tracing method of claim 5, wherein the step of extracting a script file for determining the data tracing result from the running log specifically comprises:
acquiring the datamation description information of the data tracing result, and splitting the datamation description information into a plurality of data segments; pushing each data segment to each preset data analysis thread in an activation process in parallel; the data segments are used for indicating corresponding preset data analysis threads to generate script parameters corresponding to the data segments, the data segments are also used for indicating corresponding preset data analysis threads to respectively convert the data segments into data description parameters and data logic parameters, respectively extracting a first parameter sequence from each parameter array of the data description parameters, respectively extracting a second parameter sequence from each logic array of the data logic parameters, determining scripted description according to the first parameter sequence and determining a source tracing logic topology according to the second parameter sequence;
analyzing each scripted description and the tracing logic topology to obtain script parameters corresponding to the plurality of data segments; and removing redundant system data in the script parameters generated by each preset data analysis thread, and combining and generating a script file corresponding to the data tracing result according to the script parameters left after the redundant system data are removed.
7. The data tracing method of claim 1, wherein said method further comprises:
acquiring a request instruction which is sent by a service terminal and used for acquiring target data;
acquiring a characteristic vector of data to be traced from the data storage area according to the request instruction;
determining a script file to be processed which is associated with the characteristic vector of the data to be traced from the target equipment;
running the script file to be processed to obtain target data corresponding to the characteristic vector of the data to be traced;
and sending the target data to the service terminal.
8. A data tracing apparatus, applied to a server, the apparatus at least comprises:
the thread detection module is used for detecting whether the number of target service processing threads in a running state is lower than a set value, the target service processing threads are started by a server based on a service processing request initiated by a user terminal, the target service processing threads occupy part of memory resources of the server when running, and the memory resources occupied by different target service processing threads are different in size;
the resource determining module is used for determining the residual memory resources corresponding to the server according to the memory resources occupied by the target business processing threads and the memory resources occupied by the system programs of the server when detecting that the number of the target business processing threads in the running state is lower than the set value;
the vector acquisition module is used for acquiring a target data characteristic vector which has a preset mapping relation with the residual memory resources from a preset data storage area and starting a data tracing thread according to the vector dimension of the target data characteristic vector; the preset mapping relation is obtained through vector dimensions of data characteristic vectors in the data storage area and resource fragmentation sequences of residual memory resources;
a source tracing determining module, configured to load the target data feature vector into the data source tracing thread to determine a data source tracing result of the target data feature vector through the data source tracing thread;
the script extraction module is used for determining the running logs of the data tracing thread during running and the data tracing result, and extracting script files for determining the data tracing result from the running logs;
and the script storage module is used for storing the script file into target equipment communicated with the server after the association between the script file and the target data characteristic vector is completed, and returning to the step of detecting whether the number of the target service processing threads in the running state is lower than a set numerical value.
9. A server, comprising:
a processor, and
a non-volatile memory and a network interface connected with the processor;
the processor is used for calling a computer program in the nonvolatile memory through a network interface and running the computer program through a memory of the processor so as to execute the data tracing method of any one of the claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the data tracing method according to any one of claims 1 to 7.
CN202010509005.2A 2020-06-07 2020-06-07 Data tracing method, device and server Active CN111625342B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010509005.2A CN111625342B (en) 2020-06-07 2020-06-07 Data tracing method, device and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010509005.2A CN111625342B (en) 2020-06-07 2020-06-07 Data tracing method, device and server

Publications (2)

Publication Number Publication Date
CN111625342A CN111625342A (en) 2020-09-04
CN111625342B true CN111625342B (en) 2020-11-17

Family

ID=72258247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010509005.2A Active CN111625342B (en) 2020-06-07 2020-06-07 Data tracing method, device and server

Country Status (1)

Country Link
CN (1) CN111625342B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113641544B (en) * 2021-08-23 2024-04-26 北京百度网讯科技有限公司 Method, apparatus, device, medium and product for detecting application state
CN114185937B (en) * 2021-11-03 2022-12-06 苏州汇成软件开发科技有限公司 Big data tracing method and system based on digital finance
CN114722014B (en) * 2022-06-09 2022-09-02 杭银消费金融股份有限公司 Batch data time sequence transmission method and system based on database log file

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9081560B2 (en) * 2013-09-30 2015-07-14 Sungard Systems International Inc. Code tracing processor selection
CN105373930A (en) * 2015-09-15 2016-03-02 仲恺农业工程学院 RFID tag estimation method and RFID tag estimation device for tracing system
CN106547531A (en) * 2015-09-23 2017-03-29 云智慧(北京)科技有限公司 PHP-based application performance management method and module thereof
CN109657110A (en) * 2018-12-13 2019-04-19 上海达梦数据技术有限公司 A kind of data source tracing method and corresponding data are traced to the source device
CN110990382A (en) * 2019-12-19 2020-04-10 国网安徽省电力有限公司信息通信分公司 Data traceability management system for information operation monitoring
CN111178471A (en) * 2020-01-20 2020-05-19 四川九哈科技股份有限公司 Intelligent collection weight transmission traceability system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9569336B2 (en) * 2013-03-06 2017-02-14 International Business Machines Corporation System and method for managing traceability suspicion with suspect profiles
US10872409B2 (en) * 2018-02-07 2020-12-22 Analogic Corporation Visual augmentation of regions within images

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
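Tracing with Less Data: Active Learning for Classification-Based Traceability Link Recovery; Chris Mills, Javier-Avila, etc.; 2019 IEEE International Conference on Software Maintenance and Evolution; 20191205; 103-113 *
Design and Implementation of an Enterprise Traceability Management System Based on fibjs; Shen Yang; China Master's Theses Full-text Database, Information Science and Technology; 20200215; I138-623 *
Research on Efficient Storage Management of Provenance and Its Application in Security; Xie Yulai; China Doctoral Dissertations Full-text Database, Information Science and Technology; 20150215; I137-1 *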

Also Published As

Publication number Publication date
CN111625342A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
CN111625342B (en) Data tracing method, device and server
CN111506498B (en) Automatic generation method and device of test case, computer equipment and storage medium
CN110287696B (en) Detection method, device and equipment for rebound shell process
CN109271545B (en) Feature retrieval method and device, storage medium and computer equipment
CN107203464B (en) Method and device for positioning service problem
CN111367782B (en) Regression testing data automatic generation method and device
CN116755891A (en) Event queue processing method and system based on multithreading
CN116756298B (en) Cloud database-oriented AI session information optimization method and big data optimization server
CN111193631B (en) Information processing method, system, and computer-readable storage medium
CN110188033B (en) Data detection device, method, computer device, and computer-readable storage medium
CN106649678B (en) Data processing method and system
CN109767546B (en) Quality checking and scheduling device and quality checking and scheduling method for valuable bills
CN115080401A (en) Automatic testing method and related device
CN114625612A (en) User behavior analysis method and service system based on big data office
CN114490164B (en) Log collection method, system, device and computer storage medium
CN117216011B (en) File transmission method and device and electronic equipment
CN112737812B (en) Data transmission method and device
CN112685653B (en) Question bank pushing configuration method and system of talent employment model
CN117610970B (en) Intelligent evaluation method and system for data migration work
CN114328076B (en) Log information extraction method, device, computer equipment and storage medium
CN117827382B (en) Container cloud resource management method based on resource deployment audit
CN115840834B (en) Face database quick search method and system
CN114666231B (en) Visual operation and maintenance management method and system under multi-cloud environment and storage medium
CN112600282B (en) Intelligent solar luggage charging and discharging control method and system
CN115145810A (en) Method, device, equipment, medium and product for obtaining test data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201030

Address after: Room B508, Standard Workshop Auxiliary Room, Jinyang Science and Technology Industrial Park, National High-tech Industrial Development Zone, Guiyang City, Guizhou Province

Applicant after: Guizhou Zheng Hi Tech Co., Ltd

Address before: 510700 Room 601, No.16, Kehui 1st Street, Huangpu District, Guangzhou City, Guangdong Province

Applicant before: Zhiboyun information technology (Guangzhou) Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211124

Address after: 550000 Room 312, Building B, College Student Entrepreneurship Park, Jinyang Science and Technology Industrial Park, Guiyang National High-tech Industrial Development Zone, Guiyang City, Guizhou Province

Patentee after: GUIZHOU BOCHENG TECHNOLOGY Co.,Ltd.

Address before: 550000 Room B508, Standard Workshop Auxiliary Room, Jinyang Science and Technology Industrial Park, National High-tech Industrial Development Zone, Guiyang City, Guizhou Province

Patentee before: Guizhou Zheng Hi Tech Co., Ltd
