CN116668503A - Data acquisition method, device, system, electronic equipment and storage medium - Google Patents

Data acquisition method, device, system, electronic equipment and storage medium

Info

Publication number
CN116668503A
Authority
CN
China
Prior art keywords
data
character string
encoded
application program
link
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310427742.1A
Other languages
Chinese (zh)
Inventor
胡一帆
闫鹏
周小帆
司徒放
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202310427742.1A priority Critical patent/CN116668503A/en
Publication of CN116668503A publication Critical patent/CN116668503A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services
    • G06F9/548 Object oriented; Remote method invocation [RMI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/133 Protocols for remote procedure calls [RPC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC

Abstract

The application provides a data acquisition method, device, system, electronic equipment and storage medium. The data acquisition method comprises: acquiring link data generated when an application program node remotely calls an application program, the link data at least comprising the service interface name requested during the remote call, a user permission identifier of the application program node, an identifier of the called application program, and related tag information; compressing the service interface name to obtain compressed data; encoding the user permission identifier, the application program identifier and the tag information to obtain an encoded character string and a mapping relation between the encoded character string and the original text; and uploading the compressed data, the encoded character string and the mapping relation to a data collection device. The application compresses the link data and thereby reduces the communication overhead and performance overhead of link data transmission.

Description

Data acquisition method, device, system, electronic equipment and storage medium
Technical Field
The application belongs to the technical field of remote invocation, and particularly relates to a data acquisition method, a device, a system, electronic equipment and a storage medium.
Background
The development of technologies such as cloud native and microservices brings many advantages to the development and deployment of application programs, particularly distributed systems: it enables elastic scaling, automated management and high availability of application programs, as well as greater flexibility and maintainability. However, these complex capabilities also demand higher stability and reliability from the system. The need to observe systems is therefore growing: during software development and operation and maintenance, link (Trace) data, index (Metric) data, log (Log) data and the like of remotely called application programs need to be collected, so that the state, performance, health and behavior characteristics of the system can be observed and the system can be diagnosed, optimized and have its faults handled.
In practical applications, since the size of the link data depends on the number of requests received by the user service, the data size of the link data is generally much larger than the Metric data, and thus more communication overhead and performance overhead are caused in the process of reporting to the server.
Disclosure of Invention
The application provides a data acquisition method, a device, a system, electronic equipment and a storage medium, which can compress link data and effectively reduce communication overhead and performance overhead in the link data transmission process.
An embodiment of a first aspect of the present application provides a data acquisition method, including:
acquiring link data generated by remotely calling an application program by an application program node; the link data at least comprises a service interface name requested in a remote calling process, a user permission identifier of an application program node, an identifier of a called application program and related tag information;
compressing the service interface name to obtain compressed data;
encoding the user permission identifier, the application program identifier and the tag information to obtain an encoded character string and a mapping relation between the encoded character string and an original text;
and uploading the compressed data, the encoded character string and the mapping relation to a data collection device.
In some embodiments of the present application, compressing the service interface name to obtain compressed data includes:
acquiring, from the link data, the service interface name requested by the application program node in each remote call;
and merging the identical characters in the service interface names, and obtaining the compressed data based on the merged identical characters and the unmerged differing characters.
In some embodiments of the present application, merging the identical characters in the service interface names and obtaining the compressed data based on the merged identical characters and the unmerged differing characters includes:
determining a tree data structure formed by all the service interface names, and the prefix tree parent nodes and prefix tree leaf nodes corresponding to each service interface name;
and merging the prefix tree parent nodes that the service interface names have in common, and obtaining the compressed data based on the merged prefix tree parent nodes and the prefix tree leaf node of each service interface name.
In some embodiments of the present application, encoding the user permission identifier, the application program identifier and the tag information to obtain an encoded character string and a mapping relation between the encoded character string and an original text includes:
arranging the user permission identifier, the application program identifier and the tag information in a preset order to form a character string to be encoded;
and encoding the character string to be encoded to obtain the encoded character string and the mapping relation between the encoded character string and the original text.
In some embodiments of the present application, encoding the character string to be encoded to obtain the encoded character string and the mapping relation between the encoded character string and the original text includes:
encoding the character string to be encoded using cyclic redundancy check (CRC) encoding to obtain the encoded character string and the mapping relation between the encoded character string and the original text.
In some embodiments of the present application, uploading the compressed data, the encoded character string and the mapping relation to a data collection device includes:
for each remote call, forming a link unit based on the encoded character string and the unmerged character string in the compressed data; the link unit is a data structure for storing link data and describes one remote call between two services;
and uploading the link unit, the encoded character string, the mapping relation between the encoded character string and the original text, and the merged character string in the compressed data to the data collection device.
In some embodiments of the present application, uploading the link unit, the encoded character string, the mapping relation between the encoded character string and the original text, and the merged character string in the compressed data to the data collection device includes:
generating a link data packet based on the link unit generated in each remote call, the mapping relation, and the merged character string in the compressed data;
and uploading the link data packet to the data collection device.
An embodiment of a second aspect of the present application provides a data acquisition method applied to a data acquisition system, where the system includes a data acquisition device and a data collection device, the method including:
the data acquisition device executes the data acquisition method described in the first aspect;
the data collection device receives the data uploaded by the data acquisition device, decompresses the compressed data included in the uploaded data, restores the encoded character string based on the mapping relation between the encoded character string and the original text, and sends the decompressed data and the restored data to the link data storage device.
An embodiment of a third aspect of the present application provides a data acquisition device, including:
the data acquisition module is used for acquiring link data generated by remotely calling an application program by the application program node; the link data at least comprises a service interface name requested in a remote calling process, a user permission identifier of an application program node, an identifier of a called application program and related tag information;
the data compression module is used for compressing the service interface name based on a preset algorithm to obtain compressed data;
the data encoding module is used for encoding the user permission identifier, the application program identifier and the tag information to obtain an encoded character string and a mapping relation between the encoded character string and an original text;
and the data uploading module is used for uploading the compressed data, the encoded character string and the mapping relation between the encoded character string and the original text to a data collection device.
An embodiment of a fourth aspect of the present application provides a data acquisition system, including a data acquisition device and a data collection device;
the data acquisition device is used for executing the data acquisition method described in the first aspect;
the data collection device is used for decompressing the compressed data based on the preset algorithm, restoring the encoded character string based on the mapping relation between the encoded character string and the original text, and sending the decompressed data and the restored data to the link data storage device.
An embodiment of a fifth aspect of the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor running the computer program to implement the method of the first or second aspect.
An embodiment of a sixth aspect of the present application provides a computer readable storage medium having stored thereon a computer program for execution by a processor to implement the method of the first or second aspect.
The technical scheme provided by the embodiment of the application has at least the following technical effects or advantages:
In the embodiment of the application, link data generated when an application program node remotely calls an application program is first acquired; the service interface name in the link data is then compressed to obtain compressed data; the user permission identifier, the application program identifier and the tag information in the link data are encoded to obtain an encoded character string and a mapping relation between the encoded character string and the original text; and the compressed data, the encoded character string and the mapping relation are uploaded to the data collection device. Thus, when link data is collected, the service interface names, which may differ from one remote call to the next, are compressed, while the user permission identifier, the application program identifier and the tag information, which normally do not change between remote calls, are encoded. Reporting the compressed and encoded data reduces the amount of data transmitted while preserving data integrity, and so reduces the communication overhead and performance overhead of link data transmission.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a schematic diagram of an observable system including a data acquisition device and system in an embodiment of the application;
FIG. 2 is a flow chart illustrating a data acquisition method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of step S2 in an embodiment of the application;
FIG. 4 is a schematic flow chart of step S22 in an embodiment of the application;
FIG. 5 is a schematic diagram of a prefix tree formed by the service interface names of Span1, Span2, and Span3 according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of step S4 in an embodiment of the application;
FIG. 7 is a schematic diagram of a link packet according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a data acquisition device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a storage medium according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the application to those skilled in the art.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
At present, in the field of observability, the amount of link data to be reported is usually reduced by sampling. As a result, not all of the system's link data can be acquired, and the observability of the system may be greatly reduced. It is also possible, however, to start from the link data itself and reduce its size with an encoding or compression algorithm, thereby reducing the communication overhead of the link data.
Based on the above considerations, the embodiment of the application provides a data acquisition method, which includes: acquiring link data generated when an application program node remotely calls an application program; compressing the service interface name in the link data to obtain compressed data; encoding the user permission identifier, the application program identifier and the tag information in the link data to obtain an encoded character string and a mapping relation between the encoded character string and the original text; and uploading the compressed data, the encoded character string and the mapping relation to the data collection device. Thus, when link data is collected, the service interface names, which may differ from one remote call to the next, are compressed, while the user permission identifier, the application program identifier and the tag information, which normally do not change between remote calls, are encoded. Reporting the compressed and encoded data reduces the amount of data transmitted while preserving data integrity, and so reduces the communication overhead and performance overhead of link data transmission.
Link data, also referred to as trace data, is data generated while an application program is remotely called. It describes the call trajectory of an external request in a distributed system and traces and records the path along which the call request flows through the system, so that a developer can understand how the request is processed and where its performance bottlenecks lie. Specifically, link data includes, but is not limited to, the call links of a request and cross-service call relationships. With the TraceId as its unique identifier, each piece of link data is a directed acyclic graph composed of multiple link units (Spans) that carry the same TraceId. In a distributed system, a request is typically passed between multiple services by remote calls; each remote call may be regarded as a "link" representing the transfer of the request from one service to another, and a link unit may be regarded as a data structure that stores link data and describes one remote call between two services in the distributed system. Each Span is uniquely identified by its Span ID and includes ServiceName, startTime, endTime, traceId and parentspan (a reference to the associated Span), together with some tag data.
The service interface name in a Span describes the name of the server interface requested in one remote call. Tag data in the Span describes and annotates the Span in more detail and can record metadata, service information, environment information, and any other information customized to service requirements. Metadata includes, for example, the request URL (Uniform Resource Locator), the HTTP (Hypertext Transfer Protocol) method, database query statements, and the RPC (Remote Procedure Call) protocol. Service information includes, for example, the order number, user identification, and product name. Environment information includes, for example, the hostname, IP (Internet Protocol) address, port number, and running environment (e.g., production environment or test environment).
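As a rough illustration of the Span fields described above, the following Python sketch models a Span as a dataclass. The snake_case field names, the types, and the millisecond timestamps are assumptions made for this example; the application does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Span:
    """One remote call between two services (field layout is illustrative)."""
    trace_id: str                          # shared by every Span of one trace
    span_id: str                           # unique identifier of this Span
    service_name: str                      # requested service interface name
    start_time: int                        # assumed: epoch milliseconds
    end_time: int                          # assumed: epoch milliseconds
    parent_span_id: Optional[str] = None   # reference to the associated Span
    tags: Dict[str, str] = field(default_factory=dict)

    def duration(self) -> int:
        """Elapsed time of the remote call in the same unit as the timestamps."""
        return self.end_time - self.start_time


# Two Spans of the same trace; the second one is a child of the first.
root = Span("t-1", "s-1", "/test/abc", 1000, 1050, tags={"http.method": "GET"})
child = Span("t-1", "s-2", "/test/abe", 1010, 1030, parent_span_id="s-1")
```

Linking Spans through a parent reference is what lets the Spans of one TraceId be assembled into the call graph.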
The data acquisition method may be executed by a component deployed on an application program node in a distributed system or microservice architecture; this component may be called an application probe (Agent) or a data acquisition device. The executing body may also be a data acquisition device on which the component is installed, for example an application program node in the observable system shown in FIG. 1. The application program node can intercept service calls, generate Span data, and report the Span data to a data collection device. The data collection device can decompress and decode the received compressed and encoded data, restore the original text data, and send it to a link data storage device (which may be a single device or a cluster) for storage. The console can then query the observable data by tag; specifically, it can query Trace data from the Trace storage device by ServiceName or by tag text. The application program nodes may be deployed in a distributed system or microservice architecture on a cloud network, or on local distributed physical servers or clusters; the same applies to the data collection device. This embodiment places no specific limitation on the deployment, provided that the data acquisition method of this embodiment can be applied to collect link data.
The data acquisition method provided by the embodiment of the application is described in detail below with reference to the accompanying drawings.
Example 1
Referring to fig. 2, a flow chart of a data acquisition method according to an embodiment of the application is shown in fig. 2, and the data acquisition method may include the following steps:
step S1, obtaining link data generated by remotely calling an application program by an application program node.
The link data at least comprises the service interface name requested by the application program node during the remote call, the user permission identifier of the application program node, the identifier of the called application program, and related tag information. The user permission identifier of the application program node specifies the user's usage rights and restrictions for the called application program. It is typically provided by the provider of the application program and includes the user's authorization information for using the application program, such as the period of use, access rights, and user restrictions. The identifier of an application program, also referred to as the application ID (Application ID), uniquely identifies an application program. During application development, application IDs may be assigned by a developer or system administrator to distinguish between different application programs. The application ID may be a number, a character string, or another form of identifier that uniquely identifies an application program in the system. Application IDs are commonly used to identify and manage application programs, for example in system configuration, logging, and error tracking.
In practical applications, there may be thousands of application program nodes or more, and correspondingly thousands of application probes (which may be called the Agent end) or more are needed to intercept remote call services, generate link data, compress and encode it, and report the compressed and encoded data to the data collection device. The data collection device restores the received compressed and encoded data and sends the restored original text to the link data storage device. Data transmission between the data collection device and the link data storage device is internal transmission and can use a larger transmission bandwidth for the original text data, so the communication overhead it generates is small.
Step S2, compressing the service interface name to obtain compressed data.
In practical applications, a remotely called link usually involves multiple services, so multiple service interface names are requested; the service interface name data is therefore relatively long and generates a relatively large communication overhead when transmitted. In addition, the service interface names often differ from one remote call to the next, so compressing them by encoding often fails to achieve an ideal compression effect. This embodiment therefore compresses the service interface names and reports the resulting compressed data, and the data collection device can be preset with the matching decompression method to decompress the reported data, which effectively reduces the communication overhead generated by the service interface names.
In some embodiments, as shown in FIG. 3, step S2 may include the following steps: step S21, acquiring, from the link data, the service interface name requested by the application program node in each remote call; and step S22, merging the identical characters in the service interface names, and obtaining the compressed data based on the merged identical characters and the unmerged differing characters.
In this embodiment, the link data may include data generated by multiple remote calls, and the service interface names of these remote calls may be the same or different. The service interface name requested by the application program node in each remote call is acquired first. The acquired service interface names are then sorted and analyzed to determine which characters they have in common and which differ. The identical characters are merged, and the compressed data is obtained from the merged identical characters and the unmerged differing characters. When the application program node reports link data to the data collection device, it therefore only needs to report the compressed data, which effectively reduces the communication overhead generated by the service interface names during link data transmission.
For example, ServiceName1 in Span1 is /test/abc, ServiceName2 in Span2 is /test/abe, and ServiceName3 in Span3 is /test/cdf. The characters common to all three service interface names are /test/, and the differing characters are "abc", "abe" and "cdf"; the characters common to Span1 and Span2 are /test/ab, and the differing characters are "c" and "e". For link data including Span1, Span2 and Span3, the service interface names in the compressed data may comprise /test/ab, c, e and /test/cdf. Compared with transmitting /test/abc, /test/abe and /test/cdf in full, this reduces the amount of data transmitted and the communication overhead.
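The example above can be reproduced with a few lines of Python. This is only a sketch of the idea of merging identical leading characters; it uses the standard library function `os.path.commonprefix`, which compares strings character by character, and the variable names are made up for the illustration.

```python
import os.path

names = ["/test/abc", "/test/abe", "/test/cdf"]

# Characters shared by all three names, and by Span1 and Span2 only.
common_all = os.path.commonprefix(names)
common_first_two = os.path.commonprefix(names[:2])

# Differing characters that remain once the shared part is merged.
rest_all = [n[len(common_all):] for n in names]
rest_first_two = [n[len(common_first_two):] for n in names[:2]]

# Character count before merging, and for the merged form
# "/test/ab", "c", "e", "/test/cdf" given in the example.
size_before = sum(len(n) for n in names)
size_after = (len(common_first_two)
              + sum(len(r) for r in rest_first_two)
              + len(names[2]))

print(common_all, rest_all)
print(common_first_two, rest_first_two)
print(size_before, size_after)
```

In this small case the merged form needs 19 characters instead of 27; the prefix tree described next merges the shared /test/ of the third name as well.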
Further, as shown in FIG. 4, step S22 may include the following steps: step S221, determining a tree data structure formed by all the service interface names, and the prefix tree parent nodes and prefix tree leaf nodes corresponding to each service interface name; step S222, merging the prefix tree parent nodes that the service interface names have in common, and obtaining the compressed data based on the merged prefix tree parent nodes and the prefix tree leaf node of each service interface name.
Wherein the tree data structure is a non-linear data structure, and the nodes of the tree represent data elements in the tree. In this embodiment, a tree data structure (prefix tree for short) constructed based on common prefixes of different strings may be obtained specifically, so as to reduce the storage space of the strings to a greater extent.
In this embodiment, taking Span1, Span2 and Span3 described above as an example, a prefix tree as shown in FIG. 5 may be constructed. The Agent end first generates a prefix tree from the ServiceNames of all Spans in the Span array acquired in the current batch, and every node of the prefix tree holds a reference to its parent node. The prefix tree leaf node corresponding to each Span's ServiceName, namely "c", "e" and "f" in FIG. 5, is then stored in that Span. The prefix tree and the Span array comprising the Spans are stored in the link data packet reported in the current batch. Specifically, for Span1, Span2 and Span3 described above, leaf node c is stored in Span1, leaf node e is stored in Span2, and leaf node f is stored in Span3. Thus, in this scenario, the compression efficiency of ServiceName is:
where length(T) denotes the compressed data length for Span1, Span2 and Span3, which equals the 10 characters of the prefix tree parent nodes (/test/ + ab + cd) plus the 3 prefix tree leaf nodes ("c", "e" and "f"), namely 10 + 3; and length(ServiceName) denotes the length of each ServiceName before compression, which is 9 for each of /test/abc, /test/abe and /test/cdf.
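The prefix tree scenario above can be checked with a minimal Python sketch. It assumes a plain character-level trie in which every node keeps a reference to its parent, so a Span only needs to hold its leaf node to recover the full name. The class and function names are hypothetical, and treating length(T) divided by the total length before compression as the efficiency figure is an assumption of this sketch.

```python
class TrieNode:
    """One character of the prefix tree, with a reference to its parent."""
    def __init__(self, char, parent=None):
        self.char = char
        self.parent = parent
        self.children = {}


def build_trie(names):
    """Insert every name; return the root and the end node of each name."""
    root = TrieNode("")
    leaves = []
    for name in names:
        node = root
        for ch in name:
            if ch not in node.children:
                node.children[ch] = TrieNode(ch, node)
            node = node.children[ch]
        leaves.append(node)
    return root, leaves


def count_nodes(root):
    """Number of character nodes in the tree, excluding the empty root."""
    total, stack = 0, list(root.children.values())
    while stack:
        node = stack.pop()
        total += 1
        stack.extend(node.children.values())
    return total


def restore(leaf):
    """Walk the parent references from a leaf back to the root."""
    chars = []
    node = leaf
    while node.parent is not None:
        chars.append(node.char)
        node = node.parent
    return "".join(reversed(chars))


names = ["/test/abc", "/test/abe", "/test/cdf"]
root, leaves = build_trie(names)

length_t = count_nodes(root)                  # 10 parent characters + 3 leaves
length_before = sum(len(n) for n in names)    # 3 names of 9 characters each
ratio = length_t / length_before              # assumed efficiency measure
restored = [restore(leaf) for leaf in leaves]
```

Here `length_t` is 13 and `length_before` is 27, matching the 10 + 3 and 9 per name stated above, and `restored` reproduces the three original names from the leaf references alone.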
It can be understood that compressing the service interface name with the prefix tree algorithm is only one compression method of this embodiment, and this embodiment is not limited to it; any method that can compress the service interface name may be used. For example, a dictionary compression algorithm may be adopted: a dictionary is constructed and repeated data fragments are replaced with the corresponding dictionary indexes, thereby compressing the data. A sliding window algorithm may also be adopted: the data is divided into fixed-size windows and the data in each window is compressed; if data in a window appears repeatedly, it only needs to be stored once and is represented in subsequent windows by a reference or an index, which reduces the storage space of the data. Sliding window algorithms are often applied to compress data in network transmission.
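As a stand-in for the sliding window alternative, the sketch below uses Python's standard `zlib` module, whose DEFLATE format combines LZ77-style back-references within a sliding window with Huffman coding. This is not the method of the application; the newline-joined layout and the repetition factor are assumptions chosen only to show the round trip.

```python
import zlib

# Many Spans typically repeat a small set of service interface names.
names = ["/test/abc", "/test/abe", "/test/cdf"] * 100

# Agent side: join the names and compress them in one buffer.
raw = "\n".join(names).encode("utf-8")
compressed = zlib.compress(raw, 9)

# Collector side: the matching decompression restores the original names.
restored = zlib.decompress(compressed).decode("utf-8").split("\n")

print(len(raw), len(compressed))
```

For very short inputs the fixed header and checksum of the format can outweigh the saving, which is one reason a structure-aware scheme such as the prefix tree can be attractive for small batches.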
Step S3, encoding the user permission identifier, the application program identifier and the tag information to obtain an encoded character string and a mapping relation between the encoded character string and the original text.
In this embodiment, in addition to compressing the service interface name, the user permission identifier, the application program identifier and the tag information in the link data are also encoded, further reducing the data size of the link data and the communication overhead of its transmission. The user permission identifier, the application program identifier and the tag information are usually fixed: the user permission identifier of the same user and the identifier of the same application program do not change from one remote call to the next. An encoding can therefore be used to generate one character string from the user permission identifier, the application program identifier and the tag information in a piece of link data, and the mapping relation between that character string and the original user permission identifier, application program identifier and tag information is uploaded to the data collection device. The data collection device can then restore the user permission identifier, the application program identifier and the tag information in the link data based on the received mapping relation and encoded character string.
In some embodiments, the above step S3 may include the following processes: arranging the user permission identification, the application program identification and the label information according to a preset sequence to form a character string to be coded; and encoding the character string to be encoded to obtain the encoded character string and the mapping relation between the encoded character string and the original text.
The preset sequence may be any ordering of the user permission identification, the application program identification and the tag information, which is not specifically limited in this embodiment, as long as the three items can be concatenated into one long character string to form the character string to be encoded.
In practical application, the arranged character string to be encoded is encoded, and the mapping relation between the character string to be encoded and the resulting encoded character string is saved. Within one remote call process, the same mapping relation is used for the character strings to be encoded in all Spans. When the link data is uploaded, the mapping relation and the encoded character strings are uploaded to the data collection device; because all Spans share the same mapping relation, only one mapping relation needs to be uploaded together with the encoded character strings in the Spans. This greatly reduces the amount of data and further reduces the communication overhead of transmitting the link data.
Specifically, the step of encoding the character string to be encoded to obtain the encoded character string and the mapping relation between the encoded character string and the original text may include the following process: the character string to be encoded is encoded using cyclic redundancy check coding to obtain an 8-byte encoded character string and the mapping relation between the encoded character string and the original text.
The cyclic redundancy check (Cyclic Redundancy Check, CRC) coding refers to generating a section of redundancy check code with a fixed length by performing polynomial calculation on data, and generating a coding string based on the redundancy check code. The fixed length is usually 8 bytes, so the data volume of the original data can be obviously reduced.
In this embodiment, after the redundancy check code is generated, it may be converted into a hexadecimal encoded character string, that is, the encoded character string is a fixed 8-byte string. The encoded character string and the mapping relation between the encoded character string and the original text may be stored in the link data packet reported this time, and the encoded character string may be stored in the corresponding Span.
It can be understood that the cyclic redundancy check coding adopted above is only one encoding scheme of this embodiment, and the specific encoding scheme is not specifically limited in this embodiment, as long as the encoding is applied only to the common fields related to the Agent end, that is, the character string to be encoded, whose value is independent of the Span. For example, MD5 (Message-Digest Algorithm 5) may also be used to encode the character string to be encoded.
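The encoding step can be sketched in Python as follows. This is a minimal illustration, assuming the CRC-32 variant provided by the standard `zlib` module, whose hexadecimal form is 8 characters long; the embodiment does not name a specific CRC polynomial. The `|` separator, the field order, the function name and the sample values are likewise hypothetical.

```python
import zlib


def encode_common_fields(user_permission_id, app_id, tag_info, sep="|"):
    """Concatenate the three common fields and encode them with CRC-32.

    Returns the 8-character hexadecimal encoded string and a mapping from
    the encoded string back to the original text.
    """
    # Arrange the fields in a preset order to form the string to be encoded.
    to_encode = sep.join([user_permission_id, app_id, tag_info])
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
    checksum = zlib.crc32(to_encode.encode("utf-8"))
    # Zero-padded hexadecimal, always 8 characters.
    encoded = format(checksum, "08x")
    mapping = {encoded: to_encode}
    return encoded, mapping


# Hypothetical field values.
encoded, mapping = encode_common_fields("license-001", "order-app", "env=prod")
```

Because a CRC is a checksum rather than a reversible transform, the original text can only be recovered through the uploaded mapping. Different originals can in principle produce the same checksum, and this sketch does not handle such collisions.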
Step S4: upload the compressed data, the encoded character string and the mapping relation to the data collection device.
In some embodiments, as shown in fig. 6, the step S4 may include the following steps: step S41, for each remote call process, forming a link unit based on the coded character string and the character string which is not combined in the compressed data; step S42, uploading the link unit, the coded character string, the mapping relation and the character string combined in the compressed data to the data collection device.
In this embodiment, for each remote invocation procedure, a plurality of link units may be formed, and data with individual variability in one link data may be formed into a link unit and uploaded to the data collection device together with a common field without individual variability, so that communication overhead of the link data and performance overhead of the device may be reduced on the basis of data integrity.
Specifically, the above step S42 may include the following processes: generating a link data packet based on the link unit generated in each remote calling process, the mapping relation between the coding character string and the original text, and the character string combined in the compressed data; the link data packet is uploaded to the data collection device.
In this embodiment, a plurality of link units may be formed for each remote call process. One link data packet may include a set of encoded character strings, the mapping relation between the encoded character strings and the original text, the character strings that are combined in the compressed data, and a Span array formed by a plurality of link units, where each link unit includes the characters that are not combined in the compressed data, as shown in fig. 7. In this way, the data with individual variability in a piece of link data forms the link units, and the link units together with the common fields without individual variability (the encoded character string of the user permission identification, the application program identification and the tag information, and the mapping relation between the encoded character string and the original text) form the link data packet. The generated link data packet is uploaded to the data collection device as a whole, so that the link data can be transmitted using a larger transmission bandwidth on the basis of data integrity, which further reduces the communication overhead of the link data and the performance overhead of the device.
It should be understood that steps S2 and S3 are presented in the above order only for ease of description, and the processing order of the two steps is not limited. That is, step S3 may be performed before step S2 or together with step S2, which is not particularly limited in this embodiment.
The data acquisition method provided by this embodiment first acquires the link data generated when an application program node remotely calls an application program; then compresses the service interface name in the link data to obtain compressed data; encodes the user permission identification, the application program identification and the tag information in the link data to obtain an encoded character string and a mapping relation between the encoded character string and the original text; and uploads the compressed data, the encoded character string and the mapping relation to the data collection device. Therefore, when link data is collected, the service interface names, which may differ from one remote call to another, are compressed, while the user permission identification, the application program identification and the tag information, which normally do not change between remote calls, are encoded, and the compressed and encoded data is then reported. The amount of transmitted data can thus be effectively reduced on the basis of data integrity, which reduces the communication overhead and the performance overhead of transmitting the link data.
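One possible layout of such a link data packet is sketched below in Python. The key names (`encoded_string`, `mapping`, `merged_strings`, `spans`, `code`, `name_suffix`), the JSON serialization and the sample values are assumptions for illustration; the embodiment does not prescribe a concrete wire format.

```python
import json


def build_link_packet(encoded, mapping, merged_strings, unmerged_per_call):
    """Assemble one link data packet from common fields and per-call data."""
    # Each link unit (Span) carries only the data that varies per call,
    # plus the short encoded string standing in for the common fields.
    spans = [
        {"code": encoded, "name_suffix": suffix}
        for suffix in unmerged_per_call
    ]
    return {
        "encoded_string": encoded,         # common fields, encoded once
        "mapping": mapping,                # encoded string -> original text
        "merged_strings": merged_strings,  # merged part of the interface names
        "spans": spans,                    # Span array of link units
    }


# Hypothetical values for two remote calls sharing the same common fields.
packet = build_link_packet(
    encoded="1a2b3c4d",
    mapping={"1a2b3c4d": "license-001|order-app|env=prod"},
    merged_strings=["com.example.order.OrderService"],
    unmerged_per_call=["create", "cancel"],
)
payload = json.dumps(packet)  # serialized form to upload as a whole
```

In this layout the mapping and the merged strings appear once per packet, regardless of how many link units the Span array holds.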
It should be noted that, the data (including, but not limited to, data used for model training, stored data, displayed data, etc.) related to the present application are all information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
Example 2
Based on the above data acquisition method, some embodiments of the present application further provide another data acquisition method, which is applied to a data acquisition system including a data acquisition device and a data collection device, and includes the following steps:
the data acquisition device performs the data acquisition method of any of the embodiments of example 1 above;
the data collection device receives the data uploaded by the data acquisition device, decompresses the compressed data included in the uploaded data, restores the encoded character string based on the mapping relation between the encoded character string and the original text, and sends the decompressed data and restored data to the link data storage device.
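The restoring step on the data collection device side could look roughly like the following Python sketch. The packet layout, the key names, the `|` separator and the assumption of a single merged prefix are all hypothetical and serve only to make the example self-contained.

```python
def restore_packet(packet, sep="|"):
    """Restore full link records from a received link data packet."""
    # Assumption: the packet carries exactly one merged interface-name prefix.
    prefix = packet["merged_strings"][0]
    restored = []
    for span in packet["spans"]:
        # Look up the original text of the common fields via the mapping.
        original = packet["mapping"][span["code"]]
        user_permission_id, app_id, tag_info = original.split(sep)
        restored.append({
            # Decompress: rejoin the merged prefix with the unmerged part.
            "service_interface": prefix + "." + span["name_suffix"],
            "user_permission_id": user_permission_id,
            "app_id": app_id,
            "tag_info": tag_info,
        })
    return restored


# Hypothetical packet as it might arrive from the data acquisition device.
received = {
    "mapping": {"1a2b3c4d": "license-001|order-app|env=prod"},
    "merged_strings": ["com.example.order.OrderService"],
    "spans": [
        {"code": "1a2b3c4d", "name_suffix": "create"},
        {"code": "1a2b3c4d", "name_suffix": "cancel"},
    ],
}
records = restore_packet(received)
```

The restored records would then be forwarded to the link data storage device. The sketch assumes that none of the three fields contains the separator character.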
It can be understood that the present embodiment 2 is based on the same concept as that of the above embodiment 1, and any implementation manner of the above embodiment 1 can be applied to the embodiment 2, and the same beneficial effects are achieved, which is not described herein.
Example 3
Some embodiments of the present application further provide a data acquisition device for performing the data acquisition method provided in any of the foregoing embodiments 1. Fig. 8 shows a schematic diagram of the data acquisition device. As shown in fig. 8, the data acquisition device includes:
the data acquisition module is used for acquiring link data generated by remotely calling an application program by the application program node; the link data at least comprises a service interface name requested in a remote calling process, a user permission identifier of an application program node, an identifier of a called application program and related tag information;
the data compression module is used for compressing the service interface name based on a preset algorithm to obtain compressed data;
the data coding module is used for respectively coding the user permission identification, the application program identification and the label information to obtain a coding character string and a mapping relation between the coding character string and the original text;
and the data uploading module is used for uploading the compressed data, the coding character strings and the mapping relation between the coding character strings and the original text to the data collecting equipment.
It can be understood that, for the same inventive concept, the data acquisition device provided in this embodiment 3 and the data acquisition method provided in embodiment 1 of the present application at least can achieve the same beneficial effects of the data acquisition method, and various implementation manners of the data acquisition method embodiment are also applicable to the embodiment of the data acquisition device, which is not described herein again.
Example 4
Some embodiments of the present application also provide a data acquisition system which, as shown in fig. 1, includes a data acquisition device and a data collection device;
the data acquisition device is configured to perform the data acquisition method of any of the embodiments 1;
the data collection device is used for decompressing the compressed data based on a preset algorithm, restoring the coded character string based on the mapping relation between the coded character string and the original text, and sending the decompressed data and the restored data to the link data storage device.
It should be noted that, the data acquisition system may be deployed in a distributed system or a micro-service architecture in a cloud network, or may be deployed in a local distributed physical server or a cluster, which is not particularly limited in this embodiment, so long as the data acquisition method provided in this embodiment can be applied to acquire link data.
It can be understood that, being based on the same inventive concept, the data acquisition system provided in this embodiment 4 can achieve at least the same beneficial effects as the data acquisition method provided in embodiment 1 of the present application, and the various implementation manners of the data acquisition method embodiment are also applicable to the embodiment of the data acquisition system, which is not described herein again.
Example 5
The embodiment of the application also provides an electronic device for executing the data acquisition method provided by the embodiment 1. Referring to fig. 9, a schematic diagram of an electronic device according to some embodiments of the present application is shown. As shown in fig. 9, the electronic device 4 includes: processor 400, memory 401, bus 402 and communication interface 403, processor 400, communication interface 403 and memory 401 being connected by bus 402; the memory 401 stores a computer program executable on the processor 400, and the processor 400 executes the data acquisition method according to any of the foregoing embodiments of the present application when the computer program is executed.
The memory 401 may include a high-speed random access memory (RAM) and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between this device network element and at least one other network element is achieved through at least one communication interface 403 (which may be wired or wireless), and the internet, a wide area network, a local area network, a metropolitan area network, etc. may be used.
Bus 402 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be divided into address buses, data buses, control buses, etc. The memory 401 is configured to store a program, and the processor 400 executes the program after receiving an execution instruction, and the data collection method disclosed in any of the foregoing embodiments of the present application may be applied to the processor 400 or implemented by the processor 400.
The processor 400 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 400 or by instructions in the form of software. The processor 400 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 401, and the processor 400 reads the information in the memory 401 and, in combination with its hardware, performs the steps of the above method.
Being based on the same inventive concept as the data acquisition method provided by embodiment 1 of the present application, the electronic device provided by the embodiment of the present application has the same beneficial effects as the method that it adopts, runs or implements.
Example 6
The embodiment of the present application further provides a computer readable storage medium corresponding to the data collection method provided in the foregoing embodiment 1, referring to fig. 10, the computer readable storage medium is shown as an optical disc 50, on which a computer program (i.e. a program product) is stored, and the computer program when executed by a processor performs the data collection method provided in any of the foregoing embodiments.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
The computer readable storage medium provided by the above embodiment of the present application has the same advantages as the method adopted, operated or implemented by the application program stored therein, because of the same inventive concept as the data acquisition method provided by the embodiment of the present application.
It should be noted that:
in the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the following intention: namely, that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the application and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (12)

1. A method of data acquisition, comprising:
acquiring link data generated by remotely calling an application program by an application program node; the link data at least comprises a service interface name requested in a remote calling process, a user permission identifier of an application program node, an identifier of a called application program and related tag information;
compressing the service interface name to obtain compressed data;
encoding the user permission identifier, the application program identifier and the tag information to obtain an encoding character string and a mapping relation between the encoding character string and an original text;
and uploading the compressed data, the coded character strings and the mapping relation to data collection equipment.
2. The method of claim 1, wherein compressing the service interface name to obtain compressed data comprises:
acquiring service interface names requested by application program nodes in each remote calling process from the link data;
and merging the identical characters in the service interface names, and obtaining the compressed data based on the identical characters that are merged and the different characters that are not merged.
3. The method of claim 2, wherein the merging the same character in each service interface name, based on the same character being merged and different characters not being merged, results in compressed data, comprising:
determining a tree data structure formed by all the service interface names, and the prefix tree parent nodes and prefix tree leaf child nodes corresponding to each service interface name;
and merging the identical prefix tree parent nodes of the service interface names, and obtaining the compressed data based on each prefix tree parent node after merging and the prefix tree leaf child node of each service interface name.
4. The method of claim 1, wherein encoding the user permission identifier, the application identifier, and the tag information to obtain an encoded string and a mapping relationship between the encoded string and an original text comprises:
arranging the user permission identification, the application program identification and the tag information according to a preset sequence to form a character string to be encoded;
and encoding the character string to be encoded to obtain an encoded character string and a mapping relation between the encoded character string and the original text.
5. The method of claim 4, wherein the encoding the character string to be encoded to obtain an encoded character string and a mapping relationship between the encoded character string and an original text, comprises:
and encoding the character string to be encoded by adopting a cyclic redundancy code checking encoding mode to obtain an encoded character string and a mapping relation between the encoded character string and an original text.
6. The method of claim 1, wherein uploading the compressed data, the encoded string, and the mapping relationship to a data collection device comprises:
for each remote call process, forming a link unit based on the coded character string and the character string which is not combined in the compressed data; the link unit represents a data structure for storing link data and is used for describing one remote call between two services;
and uploading the link unit, the coding character string, the mapping relation between the coding character string and the original text and the character string combined in the compressed data to data collection equipment.
7. The method of claim 6, wherein uploading the link unit, the encoded string, the mapping between the encoded string and the original text, and the combined string in the compressed data to a data collection device comprises:
generating a link data packet based on the link unit, the mapping relation and the character strings combined in the compressed data generated in each remote call process;
and uploading the link data packet to data collection equipment.
8. A data acquisition method, characterized by being applied to a data acquisition system, the system comprising a data acquisition device and a data collection device, the method comprising:
the data acquisition device performing the data acquisition method of any one of claims 1-7;
the data collection device receives the data uploaded by the data acquisition device, decompresses the compressed data included in the uploaded data, restores the encoded character string based on the mapping relation between the encoded character string and the original text, and sends the decompressed data and restored data to the link data storage device.
9. A data acquisition device, comprising:
the data acquisition module is used for acquiring link data generated by remotely calling an application program by the application program node; the link data at least comprises a service interface name requested in a remote calling process, a user permission identifier of an application program node, an identifier of a called application program and related tag information;
the data compression module is used for compressing the service interface name based on a preset algorithm to obtain compressed data;
the data coding module is used for respectively coding the user permission identification, the application program identification and the tag information to obtain a coding character string and a mapping relation between the coding character string and an original text;
and the data uploading module is used for uploading the compressed data, the coded character string and the mapping relation between the coded character string and the original text to data collecting equipment.
10. A data acquisition system comprising a data acquisition device and a data collection device;
the data acquisition device is configured to perform the data acquisition method of any one of claims 1-7;
the data collection device is used for decompressing the compressed data based on the preset algorithm, restoring the coded character string based on the mapping relation between the coded character string and the original text, and sending the decompressed data and the restored data to the link data storage device.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the method of any of claims 1-8.
12. A computer readable storage medium having stored thereon a computer program, characterized in that the program is executed by a processor to implement the method of any of claims 1-8.
CN202310427742.1A 2023-04-18 2023-04-18 Data acquisition method, device, system, electronic equipment and storage medium Pending CN116668503A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310427742.1A CN116668503A (en) 2023-04-18 2023-04-18 Data acquisition method, device, system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116668503A true CN116668503A (en) 2023-08-29

Family

ID=87710773

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310427742.1A Pending CN116668503A (en) 2023-04-18 2023-04-18 Data acquisition method, device, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116668503A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination