CN113297057A - Memory analysis method, device and system - Google Patents

Publication number
CN113297057A
CN113297057A CN202010222504.3A CN202010222504A CN113297057A CN 113297057 A CN113297057 A CN 113297057A CN 202010222504 A CN202010222504 A CN 202010222504A CN 113297057 A CN113297057 A CN 113297057A
Authority
CN
China
Prior art keywords
memory
data
table structure
snapshot file
class
Prior art date
Legal status
Pending
Application number
CN202010222504.3A
Other languages
Chinese (zh)
Inventor
王烨
周祥
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010222504.3A
Publication of CN113297057A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/366 Software debugging using diagnostics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/128 Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a memory analysis method, device and system. The method comprises the following steps: acquiring a memory snapshot file taken while a system is running, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification; creating at least one table structure, wherein each table structure corresponds to one type of memory object; parsing the memory snapshot file, and parsing the attribute information of each memory object into the corresponding table structure according to the corresponding data specification; and retrieving the at least one table structure so that memory analysis can be carried out according to the retrieval result. The method collects the attribute information of the memory objects in the memory snapshot file into table structures, so that memory problems can be analyzed and located by retrieving the data in those table structures.

Description

Memory analysis method, device and system
Technical Field
The invention relates to the technical field of computers, in particular to a memory analysis method, device and system.
Background
Object-oriented languages are general-purpose high-level languages that are often used to build complex systems, such as large e-commerce websites. Such systems often have to process network requests from thousands, or even hundreds of millions, of users in real time, and to do so the running system must maintain a large number of objects. Complex systems are prone to various problems, among which memory problems are common. Memory problems mainly include memory leaks and memory overflows. A memory leak arises when poorly organized code prevents some objects in memory from being released in time. A memory overflow arises when, for example, a sudden surge in traffic causes the system to create a large number of objects at once and request a large amount of memory; the memory is then exhausted almost instantly, causing faults such as server downtime. Memory problems have a serious impact on the system and ultimately degrade the user experience.
Several tools for analyzing memory problems are available. The most commonly used are the jmap and jhat tools shipped with the JDK: jmap can dump a memory snapshot file, which records the memory state at a certain moment while the system is running, and jhat can analyze the dumped memory snapshot file. There are also many other tools, such as the Eclipse Memory Analyzer (MAT). These tools are of great help to developers. However, most of them have shortcomings. For example, they only observe objects at one specific point in time during system operation and cannot analyze how objects change over time. As another example, it is difficult to correlate multiple objects with these tools.
Disclosure of Invention
In view of the above, the present invention is directed to a memory analysis method, device and system that provide the association analysis capability that existing analysis tools lack.
To achieve this object, according to a first aspect of the present invention, an embodiment of the present invention provides a memory analysis method, including:
acquiring a memory snapshot file during system operation, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification;
creating at least one table structure, wherein each table structure corresponds to one type of memory object;
parsing the memory snapshot file, and parsing the attribute information of each memory object into a corresponding table structure according to a corresponding data specification;
and searching the at least one table structure so as to carry out memory analysis according to the search result.
Optionally, the at least one type includes a class object and an instance object, and the parsing the attribute information of each memory object into a corresponding table structure includes:
parsing the attribute information of the class object in the memory snapshot file into a class table;
and parsing the attribute information of the instance object in the memory snapshot file into an instance table.
Optionally, the retrieving of the at least one table structure includes selecting at least two items from the class table, the instance table, and other data structures for associated retrieval.
Optionally, the following retrieval results for memory analysis are obtained by associated retrieval over the class table and the instance table:
the names of the several class objects with the most instance objects;
the names of the class objects corresponding to the several instance objects that occupy the most memory;
statistics of the instance objects.
Optionally, the retrieving the at least one table structure comprises:
receiving an SQL statement for retrieving data from the at least one table structure;
and constructing n levels of tasks according to the semantics of the SQL statement, wherein each level comprises one or more tasks, tasks of the same level are executed in parallel, and tasks of different levels are executed serially; the first-level tasks acquire data from the at least one table structure, the tasks of the second to (n-1)th levels receive the output data of the previous level and pass their own output to the next level, and the nth-level tasks summarize the data as the response to the SQL statement, n being greater than 1.
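The multi-level task construction can be illustrated with a minimal sketch for n = 2. Everything in it is an assumption made for illustration: the partitioned in-memory rows stand in for a table structure, `scan_task` and `merge_task` are hypothetical names, and the query being served is imagined to be a per-class instance count. It is not the scheduler of the invention.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table structure, split into partitions of (class_name, size) rows.
PARTITIONS = [
    [("Order", 48), ("User", 32), ("Order", 48)],
    [("User", 32), ("Order", 48)],
]

def scan_task(partition):
    """Level 1 task: read one partition and count instances per class."""
    partial = {}
    for class_name, _size in partition:
        partial[class_name] = partial.get(class_name, 0) + 1
    return partial

def merge_task(partials):
    """Level n task: summarize the partial counts into the final response."""
    total = {}
    for partial in partials:
        for class_name, count in partial.items():
            total[class_name] = total.get(class_name, 0) + count
    # Most instances first; ties broken by class name for a stable order.
    return sorted(total.items(), key=lambda item: (-item[1], item[0]))

def run_levels(partitions):
    # Tasks of the same level run in parallel on the thread pool.
    with ThreadPoolExecutor() as pool:
        level1_output = list(pool.map(scan_task, partitions))
    # The next level starts only after the previous level has finished.
    return merge_task(level1_output)

result = run_levels(PARTITIONS)
```

With more levels, each intermediate level would consume the list produced by the previous one and hand its own output to the next, in the same way `merge_task` consumes `level1_output` here.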
Optionally, the corresponding relationship between the at least one table structure and the at least one type of memory object is established through configuration information.
Optionally, memory snapshot files are obtained at different times while the system is running; for the memory snapshot file of each time, the attribute information of each memory object it contains is parsed into the corresponding table structure; and the at least one table structure is retrieved to obtain state information of each memory object at the different times, so as to analyze how each memory object changes across those times.
Optionally, the system is executed on a virtual machine, and the memory snapshot file is a memory state file of the virtual machine at a certain moment.
Optionally, the virtual machine is a JAVA virtual machine, and the memory snapshot file of the virtual machine is obtained through a DUMP command of the JAVA virtual machine, or the memory snapshot file of the heap of the JAVA virtual machine is obtained through a DUMP command of the JAVA virtual machine.
Optionally, the method is implemented using a data lake system.
Optionally, the at least one type comprises one or more of the following types: class, instance, data structure, array, pointer.
In a second aspect, an embodiment of the present invention provides a memory analysis apparatus, including:
the data preparation module is used for acquiring a memory snapshot file during system operation, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification;
the table creating module is used for creating at least one table structure, and each table structure corresponds to one type of memory object;
the data parsing module is used for parsing the memory snapshot file and parsing the attribute information of each memory object into a corresponding table structure according to a corresponding data specification;
and the retrieval analysis module is used for retrieving the at least one table structure so as to perform memory analysis according to the retrieval result.
Optionally, the at least one type includes a class object and an instance object, and the data parsing module is configured for:
parsing the attribute information of the class object in the memory snapshot file into a class table;
and parsing the attribute information of the instance object in the memory snapshot file into an instance table.
Optionally, the retrieval analysis module selects at least two items from the class table, the instance table and other data structures for association retrieval.
Optionally, the retrieval analysis module obtains the following retrieval results for memory analysis:
the names of several class objects with the most instance objects;
the names of the class objects corresponding to the several instance objects that occupy the most memory;
statistics of all instance objects.
Optionally, the retrieval analysis module is configured for: receiving an SQL statement for retrieving data from the at least one table structure; and constructing n levels of tasks according to the semantics of the SQL statement, wherein each level comprises one or more tasks, tasks of the same level are executed in parallel, and tasks of different levels are executed serially; the first-level tasks acquire data from the at least one table structure, the tasks of the second to (n-1)th levels receive the output data of the previous level and pass their own output to the next level, and the nth-level tasks summarize the data as the response to the SQL statement, n being greater than 1.
Optionally, the memory analysis device further includes: a configuration module, configured to establish configuration information, where the configuration information indicates a correspondence between the plurality of table structures and the plurality of types of memory objects.
Optionally, the data preparation module obtains memory snapshot files at different times while the system is running, the data parsing module parses the attribute information of each memory object included in the memory snapshot file of each time into the corresponding table structure, and the retrieval analysis module retrieves the at least one table structure to obtain state information of each memory object at the different times, so as to analyze how each memory object changes across those times.
Optionally, the system is executed on a virtual machine, the virtual machine is a JAVA virtual machine, and the data preparation module obtains the memory snapshot file of the virtual machine through a DUMP command of the JAVA virtual machine, or obtains the memory snapshot file of the heap of the JAVA virtual machine through a DUMP command of the JAVA virtual machine.
Optionally, the at least one type comprises one or more of the following types: class, instance, data structure, array, pointer.
In a third aspect, the present invention provides a memory analysis system, including a plurality of front nodes, a plurality of computing nodes and a plurality of data sources, where the front nodes receive various data requests issued by applications and distribute the various data requests to the computing nodes, the computing nodes process the various data requests, the data requests include SQL query requests related to memory analysis, where,
at least one of the plurality of computing nodes retrieves at least one table structure in a relational data source according to the SQL query request, so that memory analysis can be performed according to the retrieval results,
the at least one computing node is further configured to perform:
obtaining a memory snapshot file during system operation and storing the memory snapshot file into an object data source, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification;
creating at least one table structure of the relational data source, wherein each table structure corresponds to one type of memory object;
and parsing the memory snapshot file, and parsing the attribute information of each memory object into a corresponding table structure according to the corresponding data specification.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the memory further stores computer instructions executable by the processor, and the computer instructions, when executed, implement any one of the methods described above.
In a fifth aspect, the present invention provides a computer-readable medium storing computer instructions executable by an electronic device, where the computer instructions, when executed, implement any one of the above methods.
According to the memory analysis method, device and system provided by the embodiments of the invention, the attribute information of the memory objects is parsed from the memory snapshot file and stored in the corresponding table structures, so that memory problems can be analyzed and located by retrieving data in the table structures. Furthermore, the data in a plurality of table structures can be retrieved in association, and the memory problem can be determined according to the result of the associated retrieval.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing embodiments of the present invention with reference to the following drawings, in which:
FIG. 1 is a diagram of an exemplary JAVA virtual machine memory;
FIG. 2 is a flowchart of a memory analysis method according to an embodiment of the present invention;
FIG. 3 illustrates data records for a class table and an instance table;
FIG. 4 is a structural diagram of a memory analysis apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a multi-level task for processing SQL statements;
FIG. 6a is a schematic diagram of a software architecture of an exemplary data lake system;
FIG. 6b shows a deployment diagram of the data lake service 602 of FIG. 6a;
FIG. 7 is a schematic diagram of an electronic device for implementing the methods and systems of embodiments of the present invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, and processes have not been described in detail so as not to obscure the present invention. The figures are not necessarily drawn to scale.
In the following, before the embodiments of the present invention are described in detail based on the drawings, some aspects of systems that are interpreted and executed by a virtual machine will be described, taking a JAVA virtual machine as an example.
The JAVA virtual machine divides the memory it manages into several different data areas while executing a JAVA program. These data areas have their own purposes and their own creation and destruction times: some exist as soon as the virtual machine process starts, while others are created and destroyed with the start and end of user threads. Fig. 1 is a schematic diagram of an exemplary JAVA virtual machine memory. As shown in the figure, the data areas comprise a method area, a virtual machine stack, a native method stack, a heap and a program counter.
A program counter (Program Counter Register) is a small block of memory that can be regarded as the line-number indicator of the bytecode being executed by the current thread (JAVA source code must be compiled to bytecode before it can be executed in the virtual machine). Since multithreading in a JAVA virtual machine is implemented by switching threads in turn and allocating processor execution time among them, at any given moment a processor (or one core of a multi-core processor) executes instructions of only one thread. Therefore, in order to resume at the correct execution position after a thread switch, each thread needs its own program counter; the counters of different threads do not affect each other and are stored independently, so this memory is 'thread private'.
The virtual machine stack describes the memory model of JAVA method execution: when a method is executed, a stack frame is created to store information such as the local variable table, the operand stack, dynamic linking information and the method exit. The span from a method's invocation to its completion corresponds to a stack frame being pushed onto and then popped off the virtual machine stack. The virtual machine stack is thread private. Some existing virtual machines directly combine the native method stack and the virtual machine stack into one.
A heap is a memory area shared by all threads. The heap exists to hold object instances, and almost all object instances are allocated memory on the heap. However, as compilation technology advances, the rule that object instances are allocated on the heap is becoming less absolute.
The method area is a memory area shared by all threads and is used for storing data loaded by the virtual machine, such as class information, constants, static variables and code compiled by the just-in-time compiler. A runtime constant pool (Runtime Constant Pool) is part of the method area and holds the various literals and symbolic references generated at compile time.
Although the memory of every virtual machine is divided into a plurality of data areas, the type and version of the virtual machine affect how the data areas are divided. For example, JAVA virtual machines of different JDK versions divide the native method stack and the virtual machine stack differently. For another example, the Sun HotSpot virtual machine divides its data areas differently from other JAVA virtual machines. In practice, one reads the specification of the virtual machine concerned to learn how its data areas are divided and what data specification each data area follows.
Hereinafter, embodiments of the present invention will be described in detail based on the drawings.
Fig. 2 is a flowchart of a memory analysis method according to an embodiment of the present invention. The method specifically comprises the following steps.
In step S211, a memory snapshot file of the system runtime is obtained.
The system may be any system, such as a social networking system, a surveillance system, a banking system, a telecommunications system, an office system, and so forth. The memory snapshot file is a record of the memory state at a certain moment while the system is running. For a system programmed in an object-oriented language, a memory snapshot file comprises one or more types of memory objects. Each memory object is the storage area, in computer memory, of a variable constructed from one of the data types supported by the object-oriented programming language, and the type of the memory object is the data type from which it was constructed. Accordingly, memory objects constructed from different data types, such as classes, instances, data structures, arrays and pointers, are referred to herein as class objects, instance objects, data structure objects, array objects and pointer objects. Each type of memory object includes attribute information organized according to a predetermined data specification.
A system programmed in an object-oriented language is generally interpreted and executed by a virtual machine, so the state information of all memory objects of the system at a certain moment can be obtained by collecting the memory snapshot file of the virtual machine. However, under the memory reclamation mechanism of the virtual machine, most memory objects are created and reclaimed normally by the virtual machine, and only a small number cannot be reclaimed because of problems such as erroneous dependencies. Collecting most memory objects is therefore of little value for analyzing the memory of the virtual machine, while the huge data volume increases the workload. For this step, it is thus preferable to collect only the memory objects that are most likely to be at fault. Because the classes of instance objects have complex association relationships, instance objects are more likely than other objects to be unreclaimable, so it is preferable to collect only instance objects for analysis. Further, since instance objects are generally created on the heap of the virtual machine, the memory snapshot file in this step may be a memory snapshot file of the heap only. Of course, if class information also needs to be compared against the instances, memory snapshot files of other areas may additionally be obtained.
Take the JAVA virtual machine as an example. The JAVA virtual machine interprets and executes the JAVA system. The memory snapshot file of the JAVA virtual machine can be obtained through a JAVA dump command. An example of a JAVA dump command is: jmap -dump:format=b,file=/home/admin/zhujl.dump 15065, where 15065 is the process number, /home/admin/zhujl.dump is the name of the output file, and b indicates that the file format is binary. A binary dump file needs to be converted (for example, to decimal values) before its contents can be identified.
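As a minimal sketch, the dump could be triggered from a script as follows. It assumes the standard JDK syntax `jmap -dump:format=b,file=<path> <pid>` and reuses the process number and output file name mentioned above; the helper names `build_jmap_command` and `dump_heap` are hypothetical.

```python
import subprocess

def build_jmap_command(pid, output_path):
    """Return the argument list for a binary heap dump of one JVM process."""
    return ["jmap", f"-dump:format=b,file={output_path}", str(pid)]

def dump_heap(pid, output_path):
    """Run jmap; assumes a JDK on PATH and permission to attach to the process."""
    subprocess.run(build_jmap_command(pid, output_path), check=True)

# Only the command is built here; dump_heap() would actually execute it.
command = build_jmap_command(15065, "/home/admin/zhujl.dump")
```

Keeping command construction separate from execution makes it possible to log or inspect the command before attaching to a production process.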
Of course, if the system does not use virtual machine interpretation and execution, the memory snapshot file may be obtained by other means, such as by using an operating system interface.
In step S212, at least one table structure is created. In this step, a library (database) structure may be constructed first, and one or more table structures are then created under it. The library structure specifies where the table structures are stored. Each table structure corresponds to one type of memory object. As described above, the type of a memory object is its data type, that is, a class, an instance, a data structure, an array, a pointer and so on can each be a type of memory object. Therefore, after determining which types of memory objects need memory analysis, a table structure is created for each of those types, and each table structure stores the attribute information of memory objects of its type.
It should be noted that the table structure in the present invention may be a database table created with an existing database management system, or a table structure not built on an existing database management system. For example, a self-developed data analysis system may create the table structure in memory or in a file system; such a system may be organized similarly to an existing database management system and may support standard SQL statements, which lowers the barrier to use for technicians.
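A minimal sketch of step S212, using SQLite as a stand-in for whichever database management system is chosen. The column names and types are assumptions made for illustration; the patent's own column definitions are given in its Table 1 and Table 2.

```python
import sqlite3

def create_table_structures(conn):
    """Create one table per memory-object type (columns are assumptions)."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS table_of_class (
            id   INTEGER PRIMARY KEY,  -- unique identifier of the class object
            name TEXT NOT NULL         -- class name
        );
        CREATE TABLE IF NOT EXISTS table_of_instance (
            id       INTEGER PRIMARY KEY,  -- unique identifier of the instance object
            class_id INTEGER NOT NULL,     -- refers to table_of_class.id
            size     INTEGER NOT NULL      -- memory occupied by the instance
        );
        """
    )

# The in-memory database plays the role of the library structure holding the tables.
conn = sqlite3.connect(":memory:")
create_table_structures(conn)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Because each statement uses `IF NOT EXISTS`, the function can be called again safely when further snapshots are loaded into the same database.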
In step S213, the memory snapshot file is parsed, and the attribute information of each memory object is parsed into the corresponding table structure according to the corresponding data specification. If the memory snapshot file is a memory snapshot file of a virtual machine, the virtual machine specification can be consulted: the storage position of each memory object in the memory snapshot file is determined first, and the attribute information of each memory object is then parsed from that position according to the data specification of the memory object and added to the corresponding table structure.
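The parsing step can be sketched with a deliberately simplified binary layout. The record format below (a one-byte tag followed by big-endian fields) is invented for illustration and is not the real JAVA heap dump format; a real parser would follow the data specification of the virtual machine concerned. Table and column names are likewise assumptions.

```python
import sqlite3
import struct

CLASS_TAG, INSTANCE_TAG = 1, 2  # hypothetical record tags of the toy format

def parse_snapshot(data, conn):
    """Walk the toy snapshot and insert each record into its table."""
    offset = 0
    while offset < len(data):
        tag = data[offset]
        offset += 1
        if tag == CLASS_TAG:
            # Class record: u32 id, u16 name length, then the UTF-8 name.
            class_id, name_len = struct.unpack_from(">IH", data, offset)
            offset += 6
            name = data[offset:offset + name_len].decode("utf-8")
            offset += name_len
            conn.execute("INSERT INTO table_of_class VALUES (?, ?)", (class_id, name))
        elif tag == INSTANCE_TAG:
            # Instance record: u32 id, u32 class id, u32 size.
            row = struct.unpack_from(">III", data, offset)
            offset += 12
            conn.execute("INSERT INTO table_of_instance VALUES (?, ?, ?)", row)
        else:
            raise ValueError(f"unknown record tag {tag} at offset {offset - 1}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_of_class (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table_of_instance (id INTEGER, class_id INTEGER, size INTEGER)")

# A tiny hand-built snapshot: one class and two of its instances.
snapshot = (
    bytes([CLASS_TAG]) + struct.pack(">IH", 1, 5) + b"Order"
    + bytes([INSTANCE_TAG]) + struct.pack(">III", 100, 1, 48)
    + bytes([INSTANCE_TAG]) + struct.pack(">III", 101, 1, 48)
)
parse_snapshot(snapshot, conn)
class_rows = conn.execute("SELECT id, name FROM table_of_class").fetchall()
instance_rows = conn.execute("SELECT id, class_id, size FROM table_of_instance ORDER BY id").fetchall()
```

The tag decides which data specification applies to the bytes that follow, which mirrors the idea of locating each memory object first and then reading its attributes according to its type.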
In step S214, at least one table structure is retrieved so that memory analysis can be performed according to the retrieval result. The preceding steps have turned the attribute information of the various types of memory objects into data in at least one table structure; in this step, memory problems are analyzed and located by retrieving the data in each table structure. The retrieval may be a single-table retrieval or an associated retrieval across multiple tables, the aim being to pin down the memory problem as quickly as possible.
In this embodiment, the attribute information of the memory objects is parsed from the memory snapshot file and stored in the corresponding table structures, so that memory problems can be analyzed and located by retrieving data in the table structures. Furthermore, the data in a plurality of table structures can be retrieved in association, and the memory problem can be determined according to the result of the associated retrieval.
As an alternative embodiment, only two types of memory objects are collected from the memory snapshot file: class objects and instance objects. Specifically, a class table is created for class objects and an instance table is created for instance objects. The attribute information of the class objects is then parsed from the memory snapshot file according to the data specification of class objects and stored in the class table, and the attribute information of the instance objects is parsed from the memory snapshot file according to the data specification of instance objects and stored in the instance table. The instance table and the class table are then retrieved, and memory problems are analyzed and located according to the retrieval results. In an object-oriented programming language, the relationships between classes include inheritance, implementation, association, dependency, aggregation and the like, and an instance is a data object constructed from a class, so a memory problem in one instance affects other instances or other structures. For example, if an instance of class A requests a segment of memory on the heap of the virtual machine but never releases it, this memory problem affects instances created from other classes that are associated with class A. Analyzing memory problems based on instances and classes is therefore valuable.
As an optional embodiment, memory snapshot files may be obtained at different times while the system is running. For the memory snapshot file of each time, the attribute information of each memory object it contains is parsed into the corresponding table structure, and the state information of each memory object at the different times is obtained by retrieving the table structure, so that the change of each memory object across those times can be analyzed by comparison.
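One way to sketch the comparison across snapshots is to tag every row with the time of the snapshot it came from and let a single query compute the difference. The `snapshot_time` column, the `class_name` column and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_of_instance "
    "(snapshot_time TEXT, id INTEGER, class_name TEXT, size INTEGER)"
)
rows = [
    # Snapshot taken at time t1.
    ("t1", 1, "Order", 48), ("t1", 2, "User", 32),
    # Snapshot taken at time t2: two more Order instances are alive.
    ("t2", 1, "Order", 48), ("t2", 2, "User", 32),
    ("t2", 3, "Order", 48), ("t2", 4, "Order", 48),
]
conn.executemany("INSERT INTO table_of_instance VALUES (?, ?, ?, ?)", rows)

# Per class: instance count at t2 minus instance count at t1.
GROWTH_QUERY = """
SELECT class_name,
       SUM(CASE WHEN snapshot_time = 't2' THEN 1 ELSE 0 END)
     - SUM(CASE WHEN snapshot_time = 't1' THEN 1 ELSE 0 END) AS growth
FROM table_of_instance
GROUP BY class_name
ORDER BY growth DESC, class_name
"""
growth = conn.execute(GROWTH_QUERY).fetchall()
```

A class whose instance count keeps growing from snapshot to snapshot is a natural first candidate when looking for a leak.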
As an alternative embodiment, the configuration information indicates a corresponding relationship between at least one table structure and at least one type of memory object, and then step S213 may implement parsing the attribute information of the memory object in the memory snapshot file into the corresponding table structure based on the corresponding relationship. The configuration information may be stored in files or database tables.
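A minimal sketch of such configuration information, stored as JSON. The keys, table names and column lists are assumptions; the point is only that the parser looks up the target table from the configuration instead of hard-coding it.

```python
import json

# Hypothetical configuration: memory-object type -> table structure and columns.
CONFIG_TEXT = """
{
  "class":    {"table": "table_of_class",    "columns": ["id", "name"]},
  "instance": {"table": "table_of_instance", "columns": ["id", "class_id", "size"]}
}
"""

def load_mapping(text):
    """Parse the configuration text into a type-to-table mapping."""
    return json.loads(text)

def insert_statement(mapping, object_type):
    """Build the parameterized INSERT for one memory-object type."""
    entry = mapping[object_type]
    columns = ", ".join(entry["columns"])
    placeholders = ", ".join("?" for _ in entry["columns"])
    return f"INSERT INTO {entry['table']} ({columns}) VALUES ({placeholders})"

mapping = load_mapping(CONFIG_TEXT)
statement = insert_statement(mapping, "instance")
```

Supporting a further memory-object type, such as arrays, would then only require a new configuration entry and a matching table.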
As an optional embodiment, a query interface capable of inputting an SQL statement is provided for a user, after the SQL statement provided by the user is received, the table structure is retrieved according to the SQL statement, and if the table structure related to the SQL statement lacks corresponding data, the memory snapshot file can be read from a designated server in real time and parsed into the table structure, so as to complete retrieval.
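The load-on-demand behaviour can be sketched as follows. `load_snapshot_rows` is a hypothetical stand-in for reading a memory snapshot file from the designated server and parsing it; here it simply returns fixed rows, and the table layout is an assumption.

```python
import sqlite3

def load_snapshot_rows():
    """Hypothetical stand-in for fetching and parsing a snapshot from a server."""
    return [(100, 1, 48), (101, 1, 48), (102, 2, 32)]

def query_with_lazy_load(conn, sql, table="table_of_instance"):
    """Run the user's SQL; populate the table first if it is still empty."""
    (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    if count == 0:
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", load_snapshot_rows())
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_of_instance (id INTEGER, class_id INTEGER, size INTEGER)")

USER_SQL = (
    "SELECT class_id, COUNT(*) FROM table_of_instance "
    "GROUP BY class_id ORDER BY class_id"
)
first = query_with_lazy_load(conn, USER_SQL)   # triggers the load
second = query_with_lazy_load(conn, USER_SQL)  # served from the loaded data
```

The emptiness check is the simplest possible test for missing data; a fuller implementation might instead track which snapshots have already been loaded.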
The class table, instance table, data parsing and data retrieval processes are described in detail below with the following table structures and with reference to FIG. 3.
Table 1 shows a class table structure table _ of _ class storing attribute information of class objects.
Table 1 [image: Figure BDA0002426577900000111]
Table 2 is a table structure table_of_instance storing attribute information of instance objects.
Table 2 [image: Figure BDA0002426577900000112]
The table structures table_of_class and table_of_instance are associated by class identifier. It should be noted that the table structures do not include method information, because method information contributes little to the analysis of the virtual machine's memory. However, if a method contains memory objects such as classes, instances, data structures, arrays, or pointers, these memory objects are also collected and stored in the corresponding table structures.
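The two table structures and their association by class identifier might be sketched as follows with Python's standard sqlite3 module. Because Tables 1 and 2 are available only as images, the column set is an assumption: apart from id and class_id, the name and size columns and the sample rows are hypothetical.

```python
import sqlite3

def create_tables(conn: sqlite3.Connection) -> None:
    """Create the class table and the instance table (columns illustrative)."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS table_of_class (
            id   INTEGER PRIMARY KEY,  -- class identifier
            name TEXT NOT NULL         -- hypothetical: class name
        );
        CREATE TABLE IF NOT EXISTS table_of_instance (
            id       INTEGER PRIMARY KEY,  -- instance identifier
            class_id INTEGER NOT NULL REFERENCES table_of_class(id),
            size     INTEGER NOT NULL      -- hypothetical: memory occupied
        );
        """
    )

conn = sqlite3.connect(":memory:")
create_tables(conn)
conn.execute("INSERT INTO table_of_class VALUES (1, 'com.example.Order')")
conn.executemany(
    "INSERT INTO table_of_instance VALUES (?, ?, ?)",
    [(10, 1, 64), (11, 1, 48)],
)

# Associate each instance with its class through the class identifier.
rows = conn.execute(
    "SELECT c.name, i.id FROM table_of_instance AS i "
    "JOIN table_of_class AS c ON i.class_id = c.id ORDER BY i.id"
).fetchall()
```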
Based on the class table and the instance table, the data records shown in fig. 3 are obtained. As shown in the figure, id and class_id in the instance table and the class table are allocated by the system at run time and serve as unique identifiers of the corresponding records. An associated retrieval on the field class_id of the instance table and the field id of the class table yields statistical information about the instances of each class, which the user analyzes in order to locate memory problems.
An associated retrieval may be performed based on the class table and the instance table. For example, the table structures table_of_class and table_of_instance may be associated by class identifier, and the associated data may be grouped, de-duplicated, sorted in descending order, and so on, to obtain the names of the ten classes that generate the most instance objects. Generally speaking, the more instances are created from a class, the more complex its dependency relationships are and the more likely memory problems become, so analyzing these classes helps to locate memory problems quickly. As another example, table_of_class and table_of_instance may be jointly queried by class identifier, and the resulting data grouped and sorted, to obtain the names of the 10 classes whose data members occupy the most memory space together with the total size of that memory space (represented by bits). Generally speaking, the memory space occupied by the data members of each class can be estimated, and the statistical value returned by the SQL statement can be compared with the estimated value to troubleshoot memory problems. As a third example, table_of_class and table_of_instance may be jointly queried by class identifier to count how many instances have been created for each class, and the statistics exported into the table structure target_table. As described above, classes that create more instances deserve focused analysis, which helps to locate memory problems quickly.
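The three retrievals described above could be written roughly as the SQL statements below, run here through Python's sqlite3 module on a few sample rows. The column names name and size, the sample data, and the column layout of target_table are assumptions; the source gives only the table names and the join on the class identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_of_class (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table_of_instance (
    id INTEGER PRIMARY KEY, class_id INTEGER, size INTEGER
);
INSERT INTO table_of_class VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO table_of_instance VALUES
    (1, 1, 100), (2, 1, 100), (3, 1, 100),
    (4, 2, 500), (5, 2, 500),
    (6, 3, 50);
""")

# Example 1: the ten classes with the most instance objects.
TOP_BY_COUNT = """
SELECT c.name, COUNT(*) AS instance_count
FROM table_of_instance AS i
JOIN table_of_class AS c ON i.class_id = c.id
GROUP BY c.id, c.name
ORDER BY instance_count DESC, c.name
LIMIT 10
"""

# Example 2: the ten classes whose instances occupy the most memory.
TOP_BY_MEMORY = """
SELECT c.name, SUM(i.size) AS total_size
FROM table_of_instance AS i
JOIN table_of_class AS c ON i.class_id = c.id
GROUP BY c.id, c.name
ORDER BY total_size DESC, c.name
LIMIT 10
"""

# Example 3: export per-class instance counts into target_table.
EXPORT_COUNTS = """
CREATE TABLE target_table AS
SELECT c.name AS class_name, COUNT(i.id) AS instance_count
FROM table_of_class AS c
LEFT JOIN table_of_instance AS i ON i.class_id = c.id
GROUP BY c.id, c.name
"""

top_by_count = conn.execute(TOP_BY_COUNT).fetchall()
top_by_memory = conn.execute(TOP_BY_MEMORY).fetchall()
conn.execute(EXPORT_COUNTS)
exported = conn.execute(
    "SELECT class_name, instance_count FROM target_table ORDER BY class_name"
).fetchall()
```

The two rankings can disagree, as they do for the sample rows: class A has the most instances while class B occupies the most memory, which is why both views are useful.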
Of course, the associated queries that the present invention can provide are not limited to the above examples. Moreover, associated queries can be performed not only on the table structures of a single memory snapshot file but also on the table structures of multiple memory snapshot files. For example, given two systems A and B, where A runs on virtual machine v1 and B runs on virtual machine v2, each system can export its own memory snapshot file, and an associated query can be performed on the resulting table structures. If system A and system B both use instances constructed from the same class, the two instances can be associated through the class name, and it can then be determined whether either instance is abnormal, so that some problems can be ruled out. In addition, the memory snapshot files of systems A and B can be imported into heterogeneous data sources. For example, the memory snapshot file of system A is imported into a file storage system, the memory snapshot file of system B is imported into an Oracle database, and the associated query between the two is implemented programmatically.
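As a rough illustration of associating two systems through the class name, the sketch below keeps each system's tables in a separate SQLite database and joins them in one statement. Using SQLite's ATTACH DATABASE for this is an assumption made to keep the example self-contained; the schema name sys_b, the column names, and the sample rows are hypothetical, and truly heterogeneous sources would need a federated query layer instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                   # stands in for system A
conn.execute("ATTACH DATABASE ':memory:' AS sys_b")  # stands in for system B
conn.executescript("""
CREATE TABLE main.table_of_class (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE main.table_of_instance (id INTEGER PRIMARY KEY, class_id INTEGER);
CREATE TABLE sys_b.table_of_class (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sys_b.table_of_instance (id INTEGER PRIMARY KEY, class_id INTEGER);
INSERT INTO main.table_of_class VALUES (1, 'Shared'), (2, 'OnlyA');
INSERT INTO main.table_of_instance VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO sys_b.table_of_class VALUES (7, 'Shared');
INSERT INTO sys_b.table_of_instance VALUES (1, 7);
""")

# Class identifiers differ between the two systems, so the per-class
# instance counts are matched on the class name.
SHARED_CLASSES = """
WITH a AS (
    SELECT c.name, COUNT(*) AS n
    FROM main.table_of_instance i
    JOIN main.table_of_class c ON i.class_id = c.id
    GROUP BY c.name
),
b AS (
    SELECT c.name, COUNT(*) AS n
    FROM sys_b.table_of_instance i
    JOIN sys_b.table_of_class c ON i.class_id = c.id
    GROUP BY c.name
)
SELECT a.name, a.n, b.n FROM a JOIN b ON a.name = b.name ORDER BY a.name
"""
shared = conn.execute(SHARED_CLASSES).fetchall()
```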
Fig. 4 is a structural diagram of a memory analysis apparatus according to an embodiment of the present invention.
As shown in fig. 4, the memory analysis apparatus 40 is divided into a data preparation module 401, a table creation module 402, a data parsing module 403, and a retrieval analysis module 404.
The data preparation module 401 is configured to obtain a memory snapshot file during system operation, where the memory snapshot file includes at least one type of memory object, and each type of memory object includes attribute information organized according to a predetermined data specification. Specifically, the system 40 may include configuration information that indicates the server IP and port number from which the memory snapshot file is obtained, as well as the instruction (including input parameters and output parameters) for obtaining the memory snapshot. Using this configuration information, the system 40 may actively (periodically or conditionally) obtain the memory snapshot file from a specified server. For example, the system 40 provides the user with an interface for querying the data of each table structure. When the user submits a request through the interface and the data preparation module 401 finds that the corresponding table structure lacks the corresponding data, the memory snapshot file is obtained in real time according to the configuration information, and the query request is executed after the file has been parsed into the table structure. The system 40 may also provide the user with an interface for uploading the memory snapshot file, in which case the user decides when to upload the memory snapshot file to the system.
The table creation module 402 is configured to create at least one table structure, each table structure corresponding to one type of memory object. Data types such as classes, instances, data structures, arrays, and pointers may be used as the types of memory objects, in which case the memory objects are class objects, instance objects, data structure objects, array objects, pointer objects, and so on.
The data parsing module 403 is configured to parse the memory snapshot file and to parse the attribute information of each memory object into the corresponding table structure according to the corresponding data specification. The data specification may be created by a developer or provided by a user; for example, the user may submit the data specification corresponding to a memory snapshot file when uploading that file. The data specification is descriptive information. For example, it may indicate that the first through 100th binary characters of the memory snapshot file hold the attribute information of a class object, that the first through tenth binary characters hold the class name, and so on. The attribute information of a memory object can be parsed out using the data specification and then added to the table structure according to the correspondence between memory objects and table structures.
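A data specification of this kind can be pictured as a list of named byte ranges. The sketch below parses one fixed-width record under a purely hypothetical 16-byte layout (a 10-byte name, a 4-byte identifier, and a 2-byte field count); the FieldSpec type and the layout are illustrations, not the format of any real snapshot file.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FieldSpec:
    name: str
    start: int  # inclusive byte offset within the record
    end: int    # exclusive byte offset within the record
    kind: str   # "str" or "int"

# Hypothetical specification for a 16-byte class record.
CLASS_SPEC = [
    FieldSpec("name", 0, 10, "str"),
    FieldSpec("id", 10, 14, "int"),
    FieldSpec("field_count", 14, 16, "int"),
]

def parse_record(raw: bytes, spec):
    """Extract the attributes of one memory object according to spec."""
    row = {}
    for field in spec:
        chunk = raw[field.start:field.end]
        if field.kind == "str":
            # Strings are assumed to be NUL-padded ASCII.
            row[field.name] = chunk.rstrip(b"\x00").decode("ascii")
        else:
            # Integers are assumed to be big-endian and unsigned.
            row[field.name] = int.from_bytes(chunk, "big")
    return row

raw = b"Order".ljust(10, b"\x00") + (7).to_bytes(4, "big") + (3).to_bytes(2, "big")
row = parse_record(raw, CLASS_SPEC)
```

The resulting dictionary maps directly onto one row of the corresponding table structure.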
The retrieval analysis module 404 is configured to retrieve data in the table structures so that memory analysis can be performed according to the retrieval result. Optionally, retrieval is triggered by a received user request. The user request generally describes how the various table structures are to be retrieved and may include an SQL statement. The retrieval analysis module 404 executes the SQL statement to obtain a retrieval result, and the user then analyzes the retrieval result to locate the memory problem.
In this embodiment, the attribute information of the memory object is parsed from the memory snapshot file and stored in the corresponding table structure, so that the data in the table structure can be retrieved to determine the memory problem. Furthermore, the data in the plurality of table structures can be subjected to associated retrieval, and the memory problem is determined according to the result of the associated retrieval.
As an alternative embodiment, the data parsing module 403 may parse only class objects and instance objects. That is, the attribute information of the class objects in the memory snapshot file is parsed into the class table, the attribute information of the instance objects in the memory snapshot file is parsed into the instance table, and the retrieval analysis module 404 performs associated retrieval only on the class table and the instance table.
As an alternative embodiment, the data of the class table, the instance table and other data sources are also subjected to associated retrieval. For example, the names of a plurality of class objects which generate the most instance objects are obtained from the data of the class table, the instance table and other data sources, and the names of class objects corresponding to a plurality of instance objects which occupy the most memory are obtained.
As an alternative embodiment, the retrieval analysis module 404 constructs n levels of tasks according to the semantics of the SQL statement submitted by the user, where n is greater than 1 and each level includes one or more tasks. Tasks at the same level are executed in parallel, and tasks at different levels are executed in series. The first-level tasks obtain data from a data source, the tasks at the second through (n-1)th levels receive the output data of the previous level and pass their output to the next level, and the nth-level task summarizes the data to respond to the user request. This multitask processing mode helps to improve retrieval efficiency. A more detailed description is given below with reference to fig. 5.
For any SQL statement, the retrieval analysis module 404 constructs a multi-level task as shown in fig. 5. As shown in the figure, each node is a task. The first-level tasks each acquire data from one of three data sources, the tasks at the later levels then filter, aggregate, join, sort, and limit the data, and finally a retrieval result conforming to the SQL statement is output. The TableScanOperator represents a task of acquiring data from a data source; this task supports heterogeneous data sources, so data can be acquired from a relational database or from a memory snapshot file, for example. The FilterOperator represents a task of filtering data, that is, filtering the data output by the TableScanOperator and retaining only part of it. The ProjectOperator maps data, the JoinOperator joins data, the AggregationOperator aggregates data, the SortOperator sorts data, and the LimitOperator limits the amount of data.
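The level-by-level execution can be illustrated with a small scheduler built on Python's concurrent.futures. This is a simplified sketch, not the operator framework of fig. 5: the run_levels function, the convention that every task receives the list of outputs of the previous level, and the sample scan, join, and aggregate tasks are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def run_levels(levels):
    """Run tasks level by level; tasks within one level run in parallel."""
    outputs = []
    with ThreadPoolExecutor() as pool:
        for tasks in levels:
            previous = outputs
            # pool.map preserves task order; list() waits for the whole
            # level to finish before the next level starts.
            outputs = list(pool.map(lambda task: task(previous), tasks))
    return outputs

classes = [(1, "A"), (2, "B")]         # (id, name)
instances = [(10, 1), (11, 1), (12, 2)]  # (id, class_id)

def scan_classes(_):
    return classes

def scan_instances(_):
    return instances

def join(prev):
    class_rows, instance_rows = prev
    names = dict(class_rows)
    return [names[class_id] for _, class_id in instance_rows]

def aggregate_sort_limit(prev):
    (names,) = prev
    counts = {}
    for name in names:
        counts[name] = counts.get(name, 0) + 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:10]

# Level 1 scans two sources in parallel, level 2 joins, level 3 summarizes.
levels = [[scan_classes, scan_instances], [join], [aggregate_sort_limit]]
result = run_levels(levels)[0]
```

Only the scans share a level here, so they are the only tasks that actually overlap; a wider plan would place independent filters or partial aggregations side by side in the same way.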
As an optional embodiment, the system further includes a configuration module, configured to establish configuration information, where the configuration information indicates a correspondence between the plurality of table structures and the plurality of types of memory objects. The configuration information may be stored in files or database tables.
As an alternative embodiment, the system is implemented as a web system that provides the user with a query interface accepting SQL statements. After receiving an SQL statement from the user, the system triggers the data preparation module and the data parsing module to prepare the data, then retrieves the data according to the SQL statement and returns the result to the user, so that data can be acquired in real time on demand. Of course, if the amount of data is large, it is preferable to prepare the data and parse it into the table structures in advance and then make the table structures available to the user for retrieval.
As an alternative embodiment, the above method and system may be implemented on a Data Lake (DLA) system. The data lake system is a large warehouse that stores the various raw data of an enterprise, where the data is available for access, processing, analysis, and transmission. The data lake system obtains raw data from multiple data sources of an enterprise, and for different purposes the same raw data may have multiple copies that each satisfy a particular internal model format. Thus, the data processed in the data lake system may be any type of information, from structured data to completely unstructured data. Furthermore, the data lake system provides a serverless interactive federated query service, so that data from sources such as an Object Storage Service (OSS), databases (PostgreSQL, MySQL, etc.), and NoSQL stores (TableStore) can be analyzed using standard SQL.
Fig. 6a is a schematic diagram of the structure of an exemplary data lake system. As shown, data lake service 602 can collect data from multiple heterogeneous data sources 601 for joint query. The data lake service 602 also provides a plurality of data interfaces 603 to the user. The user can submit an SQL statement to the data lake service 602 through the data interfaces 603 and obtain a retrieval result, and the data lake service 602 can acquire various data from the data interfaces 603 and store the data in the data sources 601 after processing. Fig. 6b shows a deployment diagram of data lake service 602. As shown, data lake service 602 is deployed on a computer cluster 610. The computer cluster 610 comprises a plurality of front nodes FN and a plurality of computing nodes CN. The front node FN receives the various data requests issued by applications and distributes them to the computing nodes CN using the configuration metadata Meta to achieve load balancing. The computing node CN actually processes the data requests and may optimize them using the data source metadata Meta. The data sources 601 include, for example, relational data sources (RDS), Table Store, and object storage (OSS).
The memory analysis method and apparatus provided by the present invention can be applied to at least one computing node CN shown in fig. 6b. The data requests forwarded by the front node FN include SQL query requests related to memory analysis. After the computing node CN receives such a request, it retrieves at least one table structure of the relational data source according to the SQL query request, so that the user can perform memory analysis according to the retrieval result. The computing node CN is further configured to parse the memory snapshot file to generate the at least one table structure of the relational data source that is retrieved in the above step. Specifically, the computing node may obtain the memory snapshot file of the running system, with or without going through the front node FN, and store it in the object data source. It then creates at least one table structure in the relational data source, where each table structure corresponds to one type of memory object, parses the memory snapshot file, and parses the attribute information of each memory object into the corresponding table structure according to the corresponding data specification. The system architecture based on the data lake service therefore helps to construct the memory analysis apparatus provided by the present invention quickly.
Corresponding to the above embodiments, the present invention further provides an electronic device. As shown in fig. 7, at the hardware level the electronic device 70 includes a memory 703 and a processor 702 and, in some cases, an input/output device 704 and other hardware 701. The memory 703 may be, for example, a random-access memory (RAM) or a non-volatile memory, such as at least one disk memory. The input/output device 704 is, for example, a display, a keyboard, a mouse, or a network controller. The processor 702 may be built on any of the processor models currently on the market. The processor 702, the memory 703, the input/output device 704, and the other hardware 701 are connected to each other via a bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one line is shown in fig. 7, but this does not mean that there is only one bus or one type of bus.
The memory 703 is used for storing programs. In particular, a program may comprise program code comprising computer instructions. The memory 703 may include both volatile memory and non-volatile storage and provides computer instructions and data to the processor 702. The processor 702 reads the corresponding computer program from the non-volatile storage into memory and then runs it, forming the memory analysis apparatus at the logical level and specifically performing the following steps: acquiring a memory snapshot file during system operation, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification; creating at least one table structure, wherein each table structure corresponds to one type of memory object; parsing the memory snapshot file, and parsing the attribute information of each memory object into a corresponding table structure according to a corresponding data specification; and retrieving the at least one table structure so as to carry out memory analysis according to the retrieval result.
As will be appreciated by one skilled in the art, the present invention may be embodied as systems, methods and computer program products. Accordingly, the present invention may be embodied in the form of entirely hardware, entirely software (including firmware, resident software, micro-code), or in a combination of software and hardware. Furthermore, in some embodiments, the invention may also be embodied as a computer program product in one or more computer-readable media having computer-readable program code embodied in the medium.
Any combination of one or more computer-readable media may be employed. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium is, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical memory, a magnetic memory, or any suitable combination of the foregoing. In this context, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with a processing unit, apparatus, or device.
A computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium that is not a computer-readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., and any suitable combination of the foregoing.
Computer program code for carrying out embodiments of the present invention may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java and C++, and may also include conventional procedural programming languages such as C. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (23)

1. A memory analysis method, comprising:
acquiring a memory snapshot file during system operation, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification;
creating at least one table structure, wherein each table structure corresponds to one type of memory object;
analyzing the memory snapshot file, and analyzing the attribute information of each memory object into a corresponding table structure according to a corresponding data specification;
and searching the at least one table structure so as to carry out memory analysis according to the search result.
2. The memory analysis method of claim 1, wherein the at least one type comprises a class object and an instance object, and the parsing the attribute information of each memory object into the corresponding table structure comprises:
analyzing the attribute information of the class object in the memory snapshot file into a class table;
and analyzing the attribute information of the instance object in the memory snapshot file into an instance table.
3. The memory analysis method of claim 2, wherein the retrieving the at least one table structure comprises selecting at least two of the class table, the instance table, and other data structures for associative retrieval.
4. The memory analysis method according to claim 3, wherein the following retrieval results for the memory analysis are obtained according to the association retrieval of the class table and the instance table:
the names of several class objects with the most instance objects;
names of class objects corresponding to a plurality of instance objects with most memory occupation;
statistics of instance objects.
5. The memory analysis method of claim 1, wherein the retrieving the at least one table structure comprises:
receiving an SQL statement for retrieving data from the at least one table structure;
and constructing n-level tasks according to the semantics of the SQL statement, wherein each level of tasks comprises one or more than one task, the same level of tasks are executed in parallel, different levels of tasks are executed in series, the first level of tasks acquire data from the at least one table structure, the second level to the (n-1) th level of tasks receive the output data of the previous level of tasks and output the output data to the next level of tasks, the nth level of tasks summarize the data to be used as the response of the SQL statement, and n is larger than 1.
6. The memory analysis method according to claim 1, wherein the correspondence between the at least one table structure and the at least one type of memory object is established by configuration information.
7. The memory analysis method according to claim 1, wherein the memory snapshot files at different times of the system during operation are obtained, for each memory snapshot file at different time, the attribute information of each memory object included in the memory snapshot file is analyzed into the corresponding table structure, the at least one table structure is retrieved, and the state information of each memory object at the different times is obtained, so as to analyze the change condition of each memory object at the different times.
8. The method of claim 1, wherein the system executes on a virtual machine, and the memory snapshot file is a memory state file of the virtual machine at a certain time.
9. The method according to claim 8, wherein the virtual machine is a JAVA virtual machine, and the memory snapshot file of the virtual machine is obtained through a DUMP command of the JAVA virtual machine, or the memory snapshot file of the heap of the JAVA virtual machine is obtained through a DUMP command of the JAVA virtual machine.
10. The method of claim 1, wherein the memory analysis method is implemented using a data lake system.
11. The method of claim 1, wherein the at least one type comprises one or more of the following types: class, instance, data structure, array, pointer.
12. A memory analysis device, comprising:
the data preparation module is used for acquiring a memory snapshot file during system operation, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification;
the table creating module is used for creating at least one table structure, and each table structure corresponds to one type of memory object;
the data analysis module is used for analyzing the memory snapshot file and analyzing the attribute information of each memory object into a corresponding table structure according to a corresponding data specification;
and the retrieval analysis module is used for retrieving the at least one table structure so as to perform memory analysis according to the retrieval result.
13. The memory analysis device of claim 12, wherein the at least one type comprises a class object and an instance object, and the data parsing module comprises:
analyzing the attribute information of the class object in the memory snapshot file into a class table;
and analyzing the attribute information of the instance object in the memory snapshot file into an instance table.
14. The memory analysis device of claim 13, wherein the search analysis module comprises at least two items selected from the class table, the instance table, and other data structures for performing an associative search.
15. The memory analysis device according to claim 14, wherein the search analysis module obtains the following search results for memory analysis:
the names of several class objects with the most instance objects;
names of class objects corresponding to a plurality of instance objects with most memory occupation;
statistics of all instance objects.
16. The memory analysis device of claim 12, the retrieval analysis module comprising: receiving an SQL statement for retrieving data from the at least one table structure; and constructing n-level tasks according to the semantics of the SQL statement, wherein each level of tasks comprises one or more than one task, the same level of tasks are executed in parallel, different levels of tasks are executed in series, the first level of tasks acquire data from the at least one table structure, the second level to the (n-1) th level of tasks receive the output data of the previous level of tasks and output the output data to the next level of tasks, the nth level of tasks summarize the data to be used as the response of the SQL statement, and n is larger than 1.
17. The memory analysis device of claim 12, further comprising: a configuration module, configured to establish configuration information, where the configuration information indicates a correspondence between the plurality of table structures and the plurality of types of memory objects.
18. The memory analysis device according to claim 12, wherein the data preparation module obtains memory snapshot files at different times when the system is running, the data analysis module analyzes attribute information of each memory object included in each memory snapshot file at different times into a corresponding table structure, and the retrieval and analysis module retrieves the at least one table structure to obtain status information of each memory object at different times, so as to analyze a change condition of each memory object at different times.
19. The memory analysis device according to claim 12, wherein the system executes on a virtual machine, the virtual machine is a JAVA virtual machine, and the data preparation module obtains the memory snapshot file of the virtual machine through a DUMP command of the JAVA virtual machine, or obtains the memory snapshot file of the heap of the JAVA virtual machine through a DUMP command of the JAVA virtual machine.
20. The system of claim 12, wherein the at least one type comprises one or more of the following types: class, instance, data structure, array, pointer.
21. A memory analysis system comprises a plurality of front nodes, a plurality of computing nodes and a plurality of heterogeneous data sources, wherein the front nodes receive various data requests sent by applications and distribute the various data requests to the computing nodes, the computing nodes process the various data requests, the various data requests comprise SQL query requests related to memory analysis, wherein,
at least one of the plurality of compute nodes retrieves at least one table structure in a relational data source according to the SQL query request for memory analysis according to the retrieval results,
at least one computing node of the plurality of computing nodes is further configured to perform:
the method comprises the steps of obtaining a memory snapshot file during system operation and storing the memory snapshot file into an object data source, wherein the memory snapshot file comprises at least one type of memory object, and each type of memory object comprises attribute information organized according to a preset data specification;
creating at least one table structure of the relational data source, wherein each table structure corresponds to one type of memory object;
and analyzing the memory snapshot file, and analyzing the attribute information of each memory object to a corresponding table structure according to the corresponding data specification.
22. An electronic device comprising a memory and a processor, the memory further storing computer instructions executable by the processor, the computer instructions, when executed, implementing a memory analysis method as claimed in any one of claims 1 to 11.
23. A computer readable medium storing computer instructions executable by an electronic device, the computer instructions, when executed, implementing a memory analysis method as claimed in any one of claims 1 to 11.
CN202010222504.3A 2020-03-26 2020-03-26 Memory analysis method, device and system Pending CN113297057A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010222504.3A CN113297057A (en) 2020-03-26 2020-03-26 Memory analysis method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010222504.3A CN113297057A (en) 2020-03-26 2020-03-26 Memory analysis method, device and system

Publications (1)

Publication Number Publication Date
CN113297057A true CN113297057A (en) 2021-08-24

Family

ID=77317935

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202010222504.3A Pending CN113297057A (en) 2020-03-26 2020-03-26 Memory analysis method, device and system

Country Status (1)

Country Link
CN (1) CN113297057A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722324A (en) * 2021-08-30 2021-11-30 平安国际智慧城市科技股份有限公司 Report generation method and device based on artificial intelligence, electronic equipment and medium
CN113722324B (en) * 2021-08-30 2023-08-18 深圳平安智慧医健科技有限公司 Report generation method and device based on artificial intelligence, electronic equipment and medium
CN113791742A (en) * 2021-11-18 2021-12-14 南湖实验室 High-performance data lake system and data storage method
CN113791742B (en) * 2021-11-18 2022-03-25 南湖实验室 High-performance data lake system and data storage method
CN116048735A (en) * 2023-03-23 2023-05-02 阿里云计算有限公司 Information processing method and object sharing method
CN116048735B (en) * 2023-03-23 2023-08-29 阿里云计算有限公司 Information processing method and object sharing method

Similar Documents

Publication Publication Date Title
US6801903B2 (en) Collecting statistics in a database system
EP1240604B1 (en) A method and apparatus for improving the performance of a generated code cache search operation through the use of static key values
US8903841B2 (en) System and method of massively parallel data processing
WO2017019879A1 (en) Multi-query optimization
CN110795455A (en) Dependency relationship analysis method, electronic device, computer device and readable storage medium
CN107038161B (en) Equipment and method for filtering data
GB2508503A (en) Batch evaluation of remote method calls to an object oriented database
CN113297057A (en) Memory analysis method, device and system
US11514009B2 (en) Method and systems for mapping object oriented/functional languages to database languages
CN110990420A (en) Data query method and device
US20160342652A1 (en) Database query cursor management
CN112395333B (en) Method, device, electronic equipment and storage medium for checking data abnormality
US11301469B2 (en) Dynamic rebuilding of query execution trees and reselection of query execution operators
US20230418824A1 (en) Workload-aware column inprints
US11354313B2 (en) Transforming a user-defined table function to a derived table in a database management system
KR102541934B1 (en) Big data intelligent collecting system
CN116483831B (en) Recommendation index generation method for distributed database
CN111221698A (en) Task data acquisition method and device
US20220318314A1 (en) System and method of performing a query processing in a database system using distributed in-memory technique
US11157506B2 (en) Multiform persistence abstraction
CN113986876A (en) Method and device for developing data query management and electronic equipment
EP2990960A1 (en) Data retrieval via a telecommunication network
US11868353B1 (en) Fingerprints for database queries
Eddy et al. Supporting feature location and mining of software repositories on the Amazon EC2
WO2023211625A1 (en) Materialized view generation and provision based on queries having a semantically equivalent or containment relationship

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination