CN116226470B - Management method, system, equipment and medium for ocean space-time data - Google Patents

Publication number
CN116226470B
Authority
CN
China
Prior art keywords
data
vertexes
graph
vertex
marine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310513268.4A
Other languages
Chinese (zh)
Other versions
CN116226470A (en)
Inventor
徐子晨
李江波
梁成林
姜晗健
孔露露
肖欣雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang University
Original Assignee
Nanchang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang University
Priority to CN202310513268.4A
Publication of CN116226470A
Application granted
Publication of CN116226470B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Alarm Systems (AREA)

Abstract

The invention discloses a method, system, device, and medium for managing marine spatiotemporal data. The method comprises: acquiring marine spatiotemporal data and converting it into graph data in an edge list format; converting the graph data in the edge list format into graph data in a D-CSR format; allocating a logical thread to each vertex or each edge of the D-CSR format graph data by means of the new hardware features of a GPU-like accelerator card; and performing graph computation on the D-CSR format graph data with all logical threads, based on the PageRank algorithm, the BFS algorithm, or the Triangle Count algorithm, to obtain a graph computation result. The method shortens data processing time, effectively improves the real-time performance of marine data processing, and enables data authorization and sharing with higher security and lower latency in a distributed marine spatiotemporal data management system.

Description

Management method, system, equipment and medium for ocean space-time data
Technical Field
The invention relates to the technical field of marine spatiotemporal data management, and in particular to a method, system, device, and medium for managing marine spatiotemporal data.
Background
Existing marine information processing methods mainly rely on CPU-based hardware to process massive, complex marine data and to analyze and extract key information from it. However, limited by the computing performance of CPU chips, processing marine data of such volume and complexity on CPUs is very time-consuming and cannot meet the system's requirement for real-time analysis of marine data.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the invention provides a method, system, device, and medium for managing marine spatiotemporal data, which can shorten data processing time and effectively improve the real-time performance of marine data processing.
In a first aspect of the present invention, there is provided a method for managing marine spatiotemporal data, comprising the steps of:
acquiring marine spatiotemporal data and converting it into graph data in an edge list format, wherein the edge list format stores edges as a list in which each element is an edge, specifically ((a, b), (b, c), ..., (y, z)), where (a, b) is an edge pointing from a to b, (b, c) is an edge pointing from b to c, and (y, z) is an edge pointing from y to z;
converting the graph data in the edge list format into graph data in a D-CSR format;
evenly distributing the vertices of the D-CSR format graph data among all preset logical threads by means of the new hardware of a GPU-like accelerator card, and iteratively computing on the vertices with all the logical threads and the PageRank algorithm to obtain the PageRank value after each iteration; comparing the PageRank value of each iteration with a preset threshold, and when the PageRank value is greater than the preset threshold, ending the PageRank algorithm and obtaining the final graph computation result of the marine spatiotemporal data from the PageRank value, wherein in each round each logical thread is responsible for computing the PageRank value of a single corresponding vertex, and the number of executions for each vertex in a round equals the in-degree (number of incoming edges) of that vertex;
and when the PageRank value is less than the preset threshold, updating the current graph computation result according to the PageRank value of the current round, and so on until the PageRank value of the k-th round reaches the preset threshold, and obtaining the final graph computation result of the marine spatiotemporal data from the PageRank value of the k-th round.
According to the embodiment of the invention, at least the following technical effects are achieved:
In this method, marine spatiotemporal data are acquired and converted into graph data in an edge list format, which are then converted into graph data in a D-CSR format. The vertices of the D-CSR format graph data are evenly distributed among all preset logical threads by means of the new hardware of a GPU-like accelerator card, and all logical threads iteratively compute on the vertices with the PageRank algorithm to obtain the PageRank value after each iteration. The new hardware of the GPU-like accelerator card shortens data processing time and effectively improves the real-time performance of marine data processing. The PageRank value of each iteration is compared with the preset threshold: when it is greater than the threshold, the PageRank algorithm ends and the final graph computation result of the marine spatiotemporal data is obtained from the PageRank value; when it is less than the threshold, the current graph computation result is updated according to the PageRank value of the current round, and so on until the PageRank value of the k-th round reaches the threshold, at which point the final graph computation result is obtained from the PageRank value of the k-th round. In each round, each logical thread computes the PageRank value of a single corresponding vertex, and the number of executions for each vertex equals its in-degree. This enables data authorization and sharing with higher security and lower latency in a distributed marine spatiotemporal data management system.
According to some embodiments of the present invention, the calculation formula for obtaining the PageRank value after each iteration by performing iterative calculation with the PageRank algorithm through all the logic threads is:
$$PR_{k}(v) = (1 - d) + d \sum_{u \in N_{in}(v)} \frac{PR_{k-1}(u)}{|N_{out}(u)|}$$
wherein $PR_{k}(v)$ is the PageRank value of vertex $v$ in round $k$, $d$ is a preset constant, $N_{out}(u)$ is the set of out-neighbor vertices of vertex $u$, $N_{in}(v)$ is the set of in-neighbor vertices of vertex $v$ in the graph, and $PR_{k-1}(u)$ is the PageRank value of vertex $u$ in round $k-1$.
According to some embodiments of the invention, after the performing iterative computation with the PageRank algorithm through all the logic threads to obtain the PageRank value after each iteration, the method further includes:
unbinding the logical threads from the vertices, wherein the number of vertices allocated to each logical thread is less than or equal to a preset value.
In a second aspect of the present invention, there is provided a method for managing marine spatiotemporal data, comprising the steps of:
acquiring marine spatiotemporal data and converting it into graph data in an edge list format, wherein the edge list format stores edges as a list in which each element is an edge, specifically ((a, b), (b, c), ..., (y, z)), where (a, b) is an edge pointing from a to b, (b, c) is an edge pointing from b to c, and (y, z) is an edge pointing from y to z;
converting the graph data in the edge list format into graph data in a D-CSR format;
allocating a preset logical thread to each vertex of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card; initializing the distance from the source vertex to itself, the distances from the other vertices to the source vertex, the current traversal level, and the running flag to obtain an initial connected graph, wherein the distance is the distance from any vertex to the source vertex, and the source vertex is the starting vertex when the BFS algorithm begins execution;
determining, according to the running flag of the initial connected graph, whether the initial connected graph contains unvisited vertices; when it contains no unvisited vertices, ending the BFS algorithm and obtaining a graph computation result;
when the initial connected graph contains unvisited vertices, modifying the running flag and finding all boundary vertices according to the distances from the other vertices to the source vertex and the current traversal level; traversing all neighbor vertices of all boundary vertices and determining whether any of them is unvisited; when none of the neighbor vertices is unvisited, ending the BFS algorithm and obtaining a graph computation result, wherein the boundary vertices are the vertices to be visited at the current traversal level;
when an unvisited neighbor vertex exists, updating the distance from the unvisited neighbor vertex to the source vertex to obtain its updated distance, and modifying the running flag; finding all boundary vertices according to the updated distances and the current traversal level, and so on until the BFS algorithm ends, to obtain the graph computation result.
In this method, marine spatiotemporal data are acquired and converted into graph data in an edge list format, which are then converted into graph data in a D-CSR format. A preset logical thread is allocated to each vertex of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card, which shortens data processing time and effectively improves the real-time performance of marine data processing. The distance from the source vertex to itself, the distances from the other vertices to the source vertex, the current traversal level, and the running flag are initialized to obtain an initial connected graph, where the distance is the distance from any vertex to the source vertex and the source vertex is the starting vertex when the BFS algorithm begins execution. Whether the initial connected graph contains unvisited vertices is determined from its running flag; if it contains none, the BFS algorithm ends and a graph computation result is obtained. If unvisited vertices exist, the running flag is modified and all boundary vertices, that is, the vertices to be visited at the current traversal level, are found from the distances to the source vertex and the current traversal level. All neighbor vertices of the boundary vertices are then traversed; if none of them is unvisited, the BFS algorithm ends and a graph computation result is obtained. Otherwise, the distance from each unvisited neighbor vertex to the source vertex is updated, the running flag is modified, and the boundary vertices are found again from the updated distances and the current traversal level, and so on until the BFS algorithm ends and the graph computation result is obtained. This enables data authorization and sharing with higher security and lower latency in a distributed marine spatiotemporal data management system.
According to some embodiments of the invention, updating the distance from the unvisited neighbor vertex to the source vertex to obtain its updated distance includes:
setting the distance from the unvisited neighbor vertex to the source vertex to the distance from the corresponding boundary vertex to the source vertex plus one.
In a third aspect of the present invention, there is provided a method for managing marine spatiotemporal data, comprising the steps of:
acquiring ocean space-time data, and converting the ocean space-time data into graph data in an edge list format;
converting the graph data in the edge list format into graph data in a D-CSR format;
and allocating a preset logical thread to each edge (u, v) in the ordered edge list of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card, and summing the number of triangles generated by each edge (u, v) with the Triangle Count algorithm to obtain a graph computation result.
In this method, marine spatiotemporal data are acquired and converted into graph data in an edge list format, which are then converted into graph data in a D-CSR format. A preset logical thread is allocated to each edge (u, v) in the ordered edge list of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card, which shortens data processing time and effectively improves the real-time performance of marine data processing. The number of triangles generated by each edge (u, v) is summed with the Triangle Count algorithm to obtain a graph computation result, which enables data authorization and sharing with higher security and lower latency in a distributed marine spatiotemporal data management system.
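The per-edge counting described above can be sketched as follows. This is a minimal sequential Python sketch and not the patented accelerator implementation: the names `count_triangles` and `common_count`, and the choice to orient every edge from its lower-numbered endpoint to its higher-numbered endpoint, are assumptions made for illustration. Each iteration of the final loop stands in for the work one logical thread would do for one edge (u, v).

```python
from typing import Dict, Iterable, List, Tuple


def common_count(a: List[int], b: List[int]) -> int:
    """Count the values present in both sorted lists (two-pointer merge)."""
    i = j = n = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            n += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return n


def count_triangles(edges: Iterable[Tuple[int, int]]) -> int:
    """Sum, over every ordered edge (u, v), the triangles closed by that edge."""
    # Ordered edge list: drop self-loops and duplicates, keep u < v, sort.
    ordered = sorted({(min(u, v), max(u, v)) for u, v in edges if u != v})

    # higher[u] = sorted higher-numbered neighbors of u (assumed orientation).
    higher: Dict[int, List[int]] = {}
    for u, v in ordered:
        higher.setdefault(u, []).append(v)

    total = 0
    for u, v in ordered:  # conceptually: one logical thread per edge (u, v)
        total += common_count(higher.get(u, []), higher.get(v, []))
    return total


k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
triangles_in_k4 = count_triangles(k4)
```

With this orientation, a triangle a < b < c is found only at edge (a, b), so summing the per-edge counts counts each triangle once without a final division.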
According to some embodiments of the invention, the acquiring marine spatiotemporal data comprises:
importing local marine spatiotemporal data through a local data interface and/or crawling open-source marine spatiotemporal data with a crawler module.
In a fourth aspect of the present invention, there is provided a management system of marine spatiotemporal data, the management system of marine spatiotemporal data comprising:
a graph storage module for storing marine spatiotemporal data;
a client interface module for receiving a client's operation request on the marine spatiotemporal data and generating an operation instruction based on the operation request;
a graph service module for passing the marine spatiotemporal data to the graph computation module according to the operation instruction;
a graph computation module for executing the method for managing marine spatiotemporal data according to any one of the first to third aspects;
and a metadata service module for managing users' account and permission information, storage and sharding information, and graph space information.
In this system, marine spatiotemporal data are acquired and converted into graph data in an edge list format, and graph computation is performed on the graph data with a PageRank algorithm, a BFS algorithm, or a Triangle Count algorithm running on the new hardware of a GPU-like accelerator card to obtain a graph computation result, which enables data authorization and sharing with higher security and lower latency in a distributed marine spatiotemporal data management system.
In a fifth aspect of the invention, there is provided an electronic device for managing marine spatiotemporal data, comprising at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the above-described method of managing marine spatiotemporal data.
In a sixth aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the above-described marine spatiotemporal data management method.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of a method of managing marine spatiotemporal data in accordance with an embodiment of the invention;
FIG. 2 is a flow chart of the BFS algorithm as applied in a method of managing marine spatiotemporal data according to an embodiment of the present invention;
FIG. 3 is a flow chart of the Triangle Count algorithm as applied in a method of managing marine spatiotemporal data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a management system for marine spatiotemporal data according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In the description of the present invention, the description of first, second, etc. is for the purpose of distinguishing between technical features only and should not be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, it should be understood that the direction or positional relationship indicated with respect to the description of the orientation, such as up, down, etc., is based on the direction or positional relationship shown in the drawings, is merely for convenience of describing the present invention and simplifying the description, and does not indicate or imply that the apparatus or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In the description of the present invention, unless explicitly defined otherwise, terms such as arrangement, installation, connection, etc. should be construed broadly and the specific meaning of the terms in the present invention can be determined reasonably by a person skilled in the art in combination with the specific content of the technical solution.
Existing marine information processing methods mainly rely on CPU-based hardware to process massive, complex marine data and to analyze and extract key information from it. However, limited by the computing performance of CPU chips, processing marine data of such volume and complexity on CPUs is very time-consuming and cannot meet the system's requirement for real-time analysis of marine data.
In order to solve the technical defect, referring to fig. 1, the invention further provides a management method of ocean space-time data, which comprises the following steps:
Step S101, acquiring marine spatiotemporal data and converting it into graph data in an edge list format, wherein the edge list format stores edges as a list in which each element is an edge, specifically ((a, b), (b, c), ..., (y, z)), where (a, b) is an edge pointing from a to b, (b, c) is an edge pointing from b to c, and (y, z) is an edge pointing from y to z;
Step S102, converting the graph data in the edge list format into graph data in the D-CSR format;
Step S103, evenly distributing the vertices of the D-CSR format graph data among all preset logical threads by means of the new hardware of a GPU-like accelerator card, and iteratively computing on the vertices with all logical threads and the PageRank algorithm to obtain the PageRank value after each iteration; comparing the PageRank value of each iteration with a preset threshold, and when the PageRank value is greater than the preset threshold, ending the PageRank algorithm and obtaining the final graph computation result of the marine spatiotemporal data from the PageRank value, wherein in each round each logical thread is responsible for computing the PageRank value of a single corresponding vertex, and the number of executions for each vertex in a round equals the in-degree (number of incoming edges) of that vertex;
Step S104, when the PageRank value is less than the preset threshold, updating the current graph computation result according to the PageRank value of the current round, and so on until the PageRank value of the k-th round reaches the preset threshold, and obtaining the final graph computation result of the marine spatiotemporal data from the PageRank value of the k-th round.
In this method, marine spatiotemporal data are acquired and converted into graph data in an edge list format, which are then converted into graph data in a D-CSR format. The vertices of the D-CSR format graph data are evenly distributed among all preset logical threads by means of the new hardware of a GPU-like accelerator card, and all logical threads iteratively compute on the vertices with the PageRank algorithm to obtain the PageRank value after each iteration. The new hardware of the GPU-like accelerator card shortens data processing time and effectively improves the real-time performance of marine data processing. The PageRank value of each iteration is compared with the preset threshold: when it is greater than the threshold, the PageRank algorithm ends and the final graph computation result of the marine spatiotemporal data is obtained from the PageRank value; when it is less than the threshold, the current graph computation result is updated according to the PageRank value of the current round, and so on until the PageRank value of the k-th round reaches the threshold, at which point the final graph computation result is obtained from the PageRank value of the k-th round. In each round, each logical thread computes the PageRank value of a single corresponding vertex, and the number of executions for each vertex equals its in-degree. This enables data authorization and sharing with higher security and lower latency in a distributed marine spatiotemporal data management system.
In particular, in the field of marine spatiotemporal data, the PageRank algorithm is mainly used to analyze and evaluate the importance and influence of nodes in marine networks. For example, in marine protection and management, the PageRank algorithm can analyze the connections between the nodes of a marine network to determine which nodes are most important and how the nodes relate to one another, which is very useful for understanding and managing marine resources. PageRank can also be used to assess the importance and influence of ports, shipping routes, and the like: by analyzing the links between ports and between routes, it can provide a reference for vessels selecting an optimal route. In addition, the PageRank algorithm can be used to predict the likely location of certain events.
Specifically, the D-CSR format graph data comprises three parts:
The first part is a traditional CSR data structure. CSR stands for Compressed Sparse Row, a compressed storage scheme for sparse matrices. The CSR contains two arrays: the column pointer array, which records the position in the value array of the first non-zero element of each column, and the row index array, which records the row number of each non-zero element.
The second part is a column index array corresponding to the row index array, which records the column number of each non-zero element; together, the row index array and the column index array form an edge list.
The third part is a vertex degree array, which records the degree of each vertex.
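The three parts listed above can be illustrated with a small builder. This is a hedged sketch rather than the patent's actual layout: the class name `DCSR`, the function `build_dcsr`, treating rows as source vertices and columns as destination vertices, and storing out-degrees in the degree array are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class DCSR:
    col_ptr: List[int]  # col_ptr[v]: position of the first entry of column v
    row_idx: List[int]  # row (assumed: source vertex) of each non-zero entry
    col_idx: List[int]  # column (assumed: destination vertex) of each entry
    degree: List[int]   # assumed here to hold the out-degree of each vertex


def build_dcsr(num_vertices: int, edges: Sequence[Tuple[int, int]]) -> DCSR:
    """Build the three-part structure from an edge list of (u, v) pairs."""
    # Group entries by column (destination), then by row (source).
    ordered = sorted(edges, key=lambda e: (e[1], e[0]))

    col_ptr = [0] * (num_vertices + 1)
    degree = [0] * num_vertices
    for u, v in ordered:
        col_ptr[v + 1] += 1  # count entries per column
        degree[u] += 1       # count outgoing edges per vertex
    for v in range(num_vertices):
        col_ptr[v + 1] += col_ptr[v]  # prefix sum gives start offsets

    row_idx = [u for u, _ in ordered]
    col_idx = [v for _, v in ordered]
    return DCSR(col_ptr, row_idx, col_idx, degree)


g = build_dcsr(3, [(0, 1), (1, 2), (0, 2), (2, 0)])
```

Under these assumptions, `row_idx[col_ptr[v]:col_ptr[v + 1]]` lists the vertices that point to `v`, and zipping `row_idx` with `col_idx` recovers the edge list sorted by destination.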
In some embodiments, the PageRank algorithm is iterated through all logic threads, and a calculation formula for obtaining the PageRank value after each iteration is as follows:
$$PR_{k}(v) = (1 - d) + d \sum_{u \in N_{in}(v)} \frac{PR_{k-1}(u)}{|N_{out}(u)|}$$
wherein $PR_{k}(v)$ is the PageRank value of vertex $v$ in round $k$, $d$ is a preset constant, $N_{out}(u)$ is the set of out-neighbor vertices of vertex $u$, $N_{in}(v)$ is the set of in-neighbor vertices of vertex $v$ in the graph, and $PR_{k-1}(u)$ is the PageRank value of vertex $u$ in round $k-1$.
Specifically, in some embodiments, the auxiliary data constructed according to the formula is the result set, whose entries are all initialized to 0.85.
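One round of the per-vertex computation can be sketched as below. This is a minimal sequential sketch under stated assumptions, not the patented implementation: it assumes the common update PR(v) = (1 - d) + d * sum of PR(u) / out_degree(u) over in-neighbors u, it assumes `col_ptr` and `row_idx` list the in-neighbors of each vertex, and it swaps in a conventional stopping rule (largest change below a hypothetical tolerance `tol`) in place of the threshold comparison described in the text. The names `pagerank_round` and `pagerank` are illustrative.

```python
from typing import List


def pagerank_round(col_ptr: List[int], row_idx: List[int],
                   out_degree: List[int], pr: List[float],
                   d: float = 0.85) -> List[float]:
    """Compute one round; each outer iteration is one logical thread's work."""
    n = len(pr)
    new_pr = [0.0] * n
    for v in range(n):
        acc = 0.0
        # The inner loop runs in-degree(v) times, once per incoming edge.
        for i in range(col_ptr[v], col_ptr[v + 1]):
            u = row_idx[i]
            acc += pr[u] / out_degree[u]  # u has at least one outgoing edge
        new_pr[v] = (1.0 - d) + d * acc
    return new_pr


def pagerank(col_ptr: List[int], row_idx: List[int], out_degree: List[int],
             d: float = 0.85, tol: float = 1e-6,
             max_rounds: int = 100) -> List[float]:
    """Iterate rounds until the largest change drops below tol (assumed rule)."""
    pr = [0.85] * (len(col_ptr) - 1)  # result set initialized to 0.85
    for _ in range(max_rounds):
        new_pr = pagerank_round(col_ptr, row_idx, out_degree, pr, d)
        delta = max((abs(a - b) for a, b in zip(new_pr, pr)), default=0.0)
        pr = new_pr
        if delta < tol:
            break
    return pr


# Directed 3-cycle 0 -> 1 -> 2 -> 0: the in-neighbor of 0 is 2, of 1 is 0, of 2 is 1.
cycle_ptr, cycle_rows, cycle_out = [0, 1, 2, 3], [2, 0, 1], [1, 1, 1]
first_round = pagerank_round(cycle_ptr, cycle_rows, cycle_out, [0.85] * 3)
ranks = pagerank(cycle_ptr, cycle_rows, cycle_out)
```

Reading values from the previous round's array and writing to a fresh array keeps the per-vertex computations independent, which is what lets one thread be assigned per vertex.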
In some embodiments, after iteratively computing with the PageRank algorithm on all logical threads to obtain the PageRank value after each iteration, the method further includes:
unbinding the logical threads from the vertices, wherein the number of vertices allocated to each logical thread is less than or equal to a preset value.
Specifically, in some embodiments, the logical threads are unbound from the vertices v: during one round of PageRank value calculation, all vertices v are divided equally among the logical threads, so that each thread executes 8 loop iterations of the computation (the last thread may execute fewer than 8).
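The equal division of vertices among threads can be sketched as follows. The function name `assign_vertices` and the choice of contiguous vertex ranges are assumptions for illustration; the text only states that each thread handles 8 iterations and the last thread may handle fewer.

```python
from typing import List


def assign_vertices(num_vertices: int, per_thread: int = 8) -> List[range]:
    """Split vertex ids 0..num_vertices-1 into chunks of at most per_thread."""
    num_threads = (num_vertices + per_thread - 1) // per_thread  # ceiling division
    return [
        range(t * per_thread, min((t + 1) * per_thread, num_vertices))
        for t in range(num_threads)
    ]


chunks = assign_vertices(20)
```

For 20 vertices this yields three chunks, the last covering only the remaining 4 vertices.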
In addition, referring to fig. 2, an embodiment of the present invention provides a method for managing marine spatiotemporal data, including the steps of:
Step S201, acquiring marine spatiotemporal data and converting it into graph data in an edge list format, wherein the edge list format stores edges as a list in which each element is an edge, specifically ((a, b), (b, c), ..., (y, z)), where (a, b) is an edge pointing from a to b, (b, c) is an edge pointing from b to c, and (y, z) is an edge pointing from y to z;
Step S202, converting the graph data in the edge list format into graph data in the D-CSR format;
Step S203, allocating a preset logical thread to each vertex of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card; initializing the distance from the source vertex to itself, the distances from the other vertices to the source vertex, the current traversal level, and the running flag to obtain an initial connected graph, wherein the distance is the distance from any vertex to the source vertex, and the source vertex is the starting vertex when the BFS algorithm begins execution;
Step S204, determining, according to the running flag of the initial connected graph, whether the initial connected graph contains unvisited vertices; when it contains no unvisited vertices, ending the BFS algorithm and obtaining a graph computation result;
Step S205, when the initial connected graph contains unvisited vertices, modifying the running flag and finding all boundary vertices according to the distances from the other vertices to the source vertex and the current traversal level; traversing all neighbor vertices of all boundary vertices and determining whether any of them is unvisited; when none of the neighbor vertices is unvisited, ending the BFS algorithm and obtaining a graph computation result, wherein the boundary vertices are the vertices to be visited at the current traversal level;
Step S206, when an unvisited neighbor vertex exists, updating the distance from the unvisited neighbor vertex to the source vertex to obtain its updated distance, and modifying the running flag; finding all boundary vertices according to the updated distances and the current traversal level, and so on until the BFS algorithm ends, to obtain the graph computation result.
In this method, marine space-time data is acquired and converted into graph data in the edge list format, which is then converted into graph data in the D-CSR format. A preset logical thread is allocated to each vertex of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card, and the distance from the source vertex to itself, the distances from the other vertices to the source vertex, the current traversal level, and the run flag are initialized to obtain an initial connected graph; here a distance is the distance from any vertex to the source vertex, and the source vertex is the starting vertex when the BFS algorithm begins execution. The new hardware of the GPU-like accelerator card shortens the time consumed by data processing and effectively improves the real-time performance of ocean data processing. Whether the initial connected graph contains unvisited vertices is determined according to its run flag; when it contains none, the BFS algorithm ends and a graph calculation result is obtained. When it does contain unvisited vertices, the run flag is modified and all boundary vertices, that is, the vertices that should be visited at the current traversal level, are found according to the distances from the other vertices to the source vertex and the current traversal level. All neighbor vertices of all boundary vertices are traversed; when none of them is unvisited, the BFS algorithm ends and a graph calculation result is obtained. When unvisited neighbor vertices exist, their distances to the source vertex are updated and the run flag is modified; all boundary vertices are then found again according to the updated distances and the current traversal level, and so on until the BFS algorithm ends and a graph calculation result is obtained, thereby realizing data authorization and sharing with higher security and lower latency in the distributed ocean space-time data management system.
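The level-by-level traversal described above can be illustrated with a minimal sequential sketch. It is not the patented GPU implementation: the function name bfs_levels, the plain adjacency lists, and the sequential loop standing in for one logical thread per vertex are all assumptions made for illustration.

```python
import math


def bfs_levels(num_vertices, edges, source):
    """Level-synchronous BFS over a directed edge list (illustrative sketch)."""
    # Adjacency lists built from the edge list; (u, v) points from u to v.
    neighbors = [[] for _ in range(num_vertices)]
    for u, v in edges:
        neighbors[u].append(v)

    # Initialization: every distance is infinite, the source is at distance 0,
    # the traversal level is 0, and the run flag is set.
    dist = [math.inf] * num_vertices
    dist[source] = 0
    level = 0
    running = True

    while running:
        running = False  # cleared each round, set again if any distance changes
        # On a GPU-like card each vertex would be examined by its own logical thread.
        for v in range(num_vertices):
            if dist[v] != level:  # v is not a boundary vertex of this level
                continue
            for w in neighbors[v]:
                if dist[w] == math.inf:  # unvisited neighbor
                    dist[w] = dist[v] + 1
                    running = True
        level += 1
    return dist
```

Vertices that cannot be reached from the source keep an infinite distance, and the loop stops after the first round in which no distance changes.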
In particular, the BFS (breadth-first search) algorithm is mainly used in the field of marine spatiotemporal data for marine disaster prediction and emergency response. First, the BFS algorithm can help analyze and model the marine environment, for example with ocean current models and storm surge models, so as to better understand how the marine environment changes and develops, predict marine disasters in advance, and take countermeasures. Second, the BFS algorithm can rapidly search for critical information in the marine environment, such as the location of an accident at sea and the affected areas. Such searches can help decision makers make the best decision in the shortest time, such as deploying rescue forces, repairing underwater equipment, or removing pollution sources, so as to reduce the losses caused by disasters. Finally, the BFS algorithm can also help analyze the propagation paths and distribution of marine organisms so that more accurate and effective measures can be formulated to protect the marine ecological environment.
In some embodiments, updating the distance from an unvisited neighbor vertex to the source vertex to obtain the updated distance includes:
setting the distance from the unvisited neighbor vertex to the source vertex to the distance from the corresponding boundary vertex to the source vertex plus one.
In addition, referring to fig. 3, in one embodiment of the present invention, there is provided a method for managing marine spatiotemporal data, including the steps of:
Step S301, acquiring ocean space-time data, and converting the ocean space-time data into graph data in an edge list format;
Step S302, converting the graph data in the edge list format into graph data in the D-CSR format;
Step S303, allocating a preset logical thread to each edge (u, v) in the ordered edge list format graph data of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card, and summing the number of triangles generated by each edge (u, v) through the Triangle Count algorithm to obtain a graph calculation result.
In this method, ocean space-time data is acquired and converted into graph data in the edge list format, which is then converted into graph data in the D-CSR format. A preset logical thread is allocated to each edge (u, v) in the ordered edge list format graph data of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card; the new hardware of the GPU-like accelerator card shortens the time consumed by data processing and effectively improves the real-time performance of ocean data processing. The number of triangles generated by each edge (u, v) is summed through the Triangle Count algorithm to obtain a graph calculation result, thereby realizing data authorization and sharing with higher security and lower latency in the distributed ocean space-time data management system.
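The per-edge summation in step S303 can be sketched as follows. This is a sequential illustration under stated assumptions, not the patented kernel: every edge is oriented from the lower to the higher vertex number, and the triangles attributed to an edge (u, v) are taken to be the common higher-numbered neighbors of u and v, a common edge-centric formulation. The names count_triangles and triangles_on_edge are hypothetical.

```python
def count_triangles(edges):
    """Sum, over every edge (u, v), the triangles that the edge closes."""
    # Orient each edge from the lower to the higher vertex number and deduplicate.
    oriented = sorted({(min(u, v), max(u, v)) for u, v in edges if u != v})

    # Sorted lists of higher-numbered neighbors; sorted because `oriented` is sorted.
    higher = {}
    for u, v in oriented:
        higher.setdefault(u, []).append(v)

    def triangles_on_edge(u, v):
        # Merge-style intersection of two sorted neighbor lists.
        a, b = higher.get(u, []), higher.get(v, [])
        i = j = count = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                count += 1
                i += 1
                j += 1
            elif a[i] < b[j]:
                i += 1
            else:
                j += 1
        return count

    # On a GPU-like card each edge would be handled by its own logical thread.
    return sum(triangles_on_edge(u, v) for u, v in oriented)
```

With this orientation a triangle a < b < c is counted exactly once, at edge (a, b), so the per-edge counts can simply be added.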
In particular, the Triangle Counting algorithm is mainly used in the field of marine spatiotemporal data to analyze complex relationship networks in marine ecosystems. A marine ecosystem is a complex network of many different organisms; by analyzing the relationships among these organisms, their interactions and effects on one another can be understood, and the marine ecosystem can be better protected and managed. In this case, the Triangle Counting algorithm determines the number of triangles formed by nodes in an ecosystem. Because many relationships in an ecosystem are ternary, that is, interactions among three different species, the present invention can use the Triangle Counting algorithm to rapidly analyze these triples and evaluate their number and importance. This helps to better understand the relationships among the different species in the ecosystem and to develop better protection and management policies.
In some embodiments, acquiring marine spatiotemporal data comprises:
importing local ocean space-time data through a local data interface and/or crawling open-source ocean space-time data through a crawler module.
In some embodiments, RocksDB is used as the local storage engine on which the system implements its own key-value (KV) store, and the Raft protocol is used as the underlying distributed consistency protocol.
Specifically, in some embodiments, the logical threads are unbound from the vertices v, and each thread executes a fixed number of multiplication loop iterations during one round of PageRank value calculation.
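One way to picture threads that are unbound from vertices is to split the work by edges into fixed-size chunks. The sketch below is a sequential illustration only. The update rule is the textbook PageRank formula with a damping constant of 0.85, which is an assumption and not necessarily the formula of the patent; the function name pagerank_round, the chunk parameter, and the omission of dangling-vertex handling are likewise illustrative choices.

```python
def pagerank_round(num_vertices, edges, rank, damping=0.85, chunk=4):
    """One PageRank round in which work is divided by edges, not by vertices."""
    out_degree = [0] * num_vertices
    for u, _ in edges:
        out_degree[u] += 1
    # Precomputed reciprocals so that each step is a single multiplication.
    inv_out = [1.0 / d if d else 0.0 for d in out_degree]

    incoming = [0.0] * num_vertices
    # Each fixed-size chunk stands for one logical thread: it performs at most
    # `chunk` multiply-and-add steps, whatever vertices its edges touch.
    for start in range(0, len(edges), chunk):
        for u, v in edges[start:start + chunk]:
            incoming[v] += rank[u] * inv_out[u]  # would be an atomic add on a GPU

    base = (1.0 - damping) / num_vertices
    return [base + damping * s for s in incoming]
```

Because every chunk has the same upper bound on its work, a high-degree vertex no longer makes one thread run much longer than the others.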
In some embodiments, u is one marine entity, v is another marine entity, and relationship is the link that exists between u and v. When ocean data is acquired, all (u, v) tuples having the relationship are retrieved through a relationship query; these tuples constitute the edge list format graph data, in which each edge is one (u, v) tuple.
In some embodiments, during the initialization operation, the distances from all vertices to the source vertex are set to ∞, the current traversal level is set to 0, and the run flag is set to 1; one vertex is then selected as the source vertex, and its distance to the source vertex is set to 0.
Specifically, in some embodiments, converting edge list formatted graph data into D-CSR formatted graph data includes:
calculating the total degree of each vertex in the graph data in the edge list format;
reassigning a vertex number to each vertex according to its total degree, and flipping the invalid edges in the edge list format graph data to obtain renumbered edge list format graph data;
sorting the renumbered edge list format graph data to obtain ordered edge list format graph data; computing a vertex number offset array from the ordered edge list format graph data;
and combining the ordered edge list format graph data and the vertex number offset array to obtain the D-CSR format graph data.
Specifically, in some embodiments, the degree of each vertex in the graph is calculated from the edge list format graph data: the degrees of all vertices are initialized to 0 and the edge list format graph data is traversed; whenever an edge (u, v) appears, the degree of vertex u is increased by 1 and the degree of vertex v is increased by 1. The vertices are then renumbered so that vertices with smaller degree receive smaller numbers and vertices with larger degree receive larger numbers, which yields a vertex number mapping table. The edge list format graph data is updated according to the vertex number mapping table, so that the vertex numbers of every edge are updated. If the edge list format graph data contains an edge (u, v) with u > v, the edge (u, v) is changed to the edge (v, u), which yields the renumbered edge list format graph data.
The sort function in C++ is called to sort the two-tuples of the renumbered edge list format graph data so that the following rule holds: if (u, v) precedes (x, y), then u < x, or u = x and v < y; this yields the ordered edge list format graph data. The vertex number offset array is computed from the ordered edge list format graph data; specifically, the number x of edges that precede the edges having u as their starting vertex is counted, and x is stored as the element at index u of the vertex number offset array. The ordered edge list format graph data and the vertex number offset array are combined to form the D-CSR format graph data.
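The conversion steps above can be sketched in a few lines. This is an illustrative sequential version, not the patented code: the function name to_dcsr is hypothetical, ties between vertices of equal degree are broken by the original identifier, and the offset array carries one extra trailing entry holding the total edge count, as is customary for CSR layouts. Both of the latter choices are assumptions.

```python
def to_dcsr(edges):
    """Convert an edge list into (mapping table, ordered edge list, offset array)."""
    # 1. Total degree of every vertex: each edge (u, v) adds 1 to u and 1 to v.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # 2. Renumber: smaller degree gets the smaller number (ties by original id).
    order = sorted(degree, key=lambda x: (degree[x], x))
    new_id = {old: new for new, old in enumerate(order)}

    # 3. Apply the mapping table and flip every edge with u > v.
    renumbered = []
    for u, v in edges:
        a, b = new_id[u], new_id[v]
        renumbered.append((a, b) if a <= b else (b, a))

    # 4. Sort by starting vertex, then by ending vertex.
    renumbered.sort()

    # 5. offsets[u] = number of edges that precede the edges starting at u.
    n = len(order)
    offsets = [0] * (n + 1)
    for u, _ in renumbered:
        offsets[u + 1] += 1
    for i in range(n):
        offsets[i + 1] += offsets[i]
    return new_id, renumbered, offsets
```

The edges that start at vertex u then occupy positions offsets[u] up to, but not including, offsets[u + 1] of the ordered edge list.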
In some embodiments, external users may operate the high-performance graph system for marine spatiotemporal data directly by using nGQL, or may generate nGQL statements through clients written in languages such as Python or Java, and thereby use the system.
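As a small illustration of generating an nGQL statement from a Python client, the sketch below only builds the statement text; it does not connect to any server. The helper name build_neighbor_query, the vertex identifier "buoy_17", and the edge type observes are hypothetical, and the statement assumes the GO ... OVER ... YIELD form of nGQL.

```python
def build_neighbor_query(vertex_id: str, edge_type: str) -> str:
    """Return an nGQL statement that lists the out-neighbors of one vertex."""
    # Edge types are inserted unquoted, so only plain identifiers are accepted.
    if not edge_type.isidentifier():
        raise ValueError("edge type must be a plain identifier")
    # Escape backslashes and double quotes inside the quoted vertex id.
    escaped = vertex_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'GO FROM "{escaped}" OVER {edge_type} YIELD dst(edge) AS neighbor;'


statement = build_neighbor_query("buoy_17", "observes")
```

The resulting text would then be submitted through whatever client session the deployment provides.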
Unlike the prior art, which aims to solve the cache utilization problem that arises when a GPU-like accelerator card executes graph algorithms, the present method mainly addresses thread divergence during such execution. Thread divergence is very common in parallel computation on GPU-like accelerator cards and reduces their execution efficiency. It arises when threads on the card must take different branches depending on some condition: because different threads may make different choices at different points, threads with short execution times must wait for threads with long execution times, which lowers execution efficiency. When executing on the GPU-like accelerator card, the invention limits the range of data each thread processes, which reduces the difference in workload among threads and thus the occurrence of thread divergence. In addition, threads on the GPU-like accelerator card are grouped into warps, and assigning similar tasks to the same warp improves the execution efficiency of the card, thereby solving the thread divergence problem when graph algorithms are executed.
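The effect of limiting each thread's data range can be illustrated with a deliberately simplified cost model, in which a warp runs for as many lockstep iterations as its busiest thread. The model, the warp size of 32, and the helper names warp_steps and fixed_chunks are assumptions for illustration and do not describe any particular accelerator card.

```python
def warp_steps(work_per_thread, warp_size=32):
    """Total lockstep iterations: each warp lasts as long as its busiest thread."""
    return sum(
        max(work_per_thread[i:i + warp_size])
        for i in range(0, len(work_per_thread), warp_size)
    )


def fixed_chunks(total_work, chunk):
    """Split the same total work into equal chunks (the last may be shorter)."""
    full, rest = divmod(total_work, chunk)
    return [chunk] * full + ([rest] if rest else [])


# One thread per vertex: 31 low-degree vertices share a warp with one hub vertex.
vertex_bound = [1] * 31 + [993]
# The same 1024 units of work, bounded at 32 units per thread.
bounded = fixed_chunks(sum(vertex_bound), 32)
```

Under this model the vertex-bound assignment costs 993 lockstep iterations, because 31 threads wait for the hub vertex, while the bounded assignment costs 32.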
In addition, referring to fig. 4, an embodiment of the present invention provides a management system 300 for marine spatiotemporal data, including a graph storage module 35, a metadata service module 34, a graph service module 33, a graph calculation module 32, and a client interface module 31, wherein:
the graph storage module 35 is used for storing marine spatiotemporal data;
the client interface module 31 is configured to receive an operation request of the client for the marine spatiotemporal data, and generate an operation instruction based on the operation request;
the graph service module 33 is used for supplying the marine spatiotemporal data to the graph calculation module according to the operation instruction;
the graph calculation module 32 is used to perform the management method of marine spatiotemporal data of any of the above embodiments;
the metadata service module 34 is used to manage the user's account and permission information, storage and sharding information, and graph space information.
In this system, marine space-time data is acquired and converted into graph data in the edge list format, and graph computation is performed on the graph data through a PageRank algorithm, a BFS algorithm, or a Triangle Count algorithm, each based on the new hardware of the GPU-like accelerator card, to obtain graph calculation results, thereby realizing data authorization and sharing with higher security and lower latency in the distributed marine space-time data management system.
It should be noted that the system embodiment and the above-mentioned method embodiment are based on the same inventive concept, so the relevant content of the above-mentioned method embodiment also applies to the system embodiment and is not repeated here.
The application also provides an electronic device for managing marine spatiotemporal data, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the management method of marine spatiotemporal data described above.
The processor and the memory may be connected by a bus or other means.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and instructions required to implement the marine spatiotemporal data management method of the above embodiments are stored in the memory; when executed by the processor, they perform the marine spatiotemporal data management method of the above embodiments, for example method steps S101 to S104 in fig. 1 described above.
The present application also provides a computer-readable storage medium storing computer-executable instructions for performing the management method of marine spatiotemporal data described above.
The computer-executable instructions stored on the computer-readable storage medium are executed by a processor or controller, for example by the processor in the above electronic device embodiment, causing the processor to perform the method of managing marine spatiotemporal data in the above embodiments, for example method steps S101 to S104 in fig. 1 described above.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program elements or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program elements or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present invention.

Claims (7)

1. A management method for marine space-time data, characterized by comprising the following steps:
acquiring marine space-time data and converting the marine space-time data into graph data in an edge list format, wherein the edge list format stores edges in a list and each element of the list is one edge; specifically, the edge list has the form ((a, b), (b, c), ..., (y, z)), where (a, b) is an edge pointing from a to b, (b, c) is an edge pointing from b to c, and (y, z) is an edge pointing from y to z;
converting the graph data in the edge list format into graph data in a D-CSR format;
dividing the different vertices of the D-CSR format graph data among all preset logical threads by means of the new hardware of a GPU-like accelerator card, and performing iterative computation on the different vertices through the logical threads and a PageRank algorithm to obtain the PageRank value after each iteration, wherein the PageRank value after each iteration is calculated through the logical threads and the PageRank algorithm according to the following formula:
[formula image not preserved in the source text]
wherein the symbols of the formula denote, in order: the PageRank value of a given vertex in a given round; a preset constant; the set of outflow neighbor vertices of a vertex; the set of inflow neighbor vertices of a vertex in the undirected graph; and the PageRank value of a vertex in a given round;
unbinding the logical threads from the vertices, wherein the number of vertices allocated to each logical thread is equal to or smaller than a preset value;
comparing the PageRank value of each iteration with a preset threshold; when the PageRank value is greater than the preset threshold, ending the PageRank algorithm and obtaining a final graph calculation result of the marine space-time data according to the PageRank value, wherein in one round each logical thread is responsible for calculating the PageRank value of a corresponding single vertex, and the number of executions for each vertex in one round is the in-degree of that vertex;
and when the PageRank value is smaller than the preset threshold, updating the graph calculation result at the current moment according to the PageRank value of the current round, and so on until the PageRank value of the k-th round reaches the preset threshold, and obtaining the final graph calculation result of the marine space-time data according to the PageRank value of the k-th round.
2. A management method for marine space-time data, characterized by comprising the following steps:
acquiring marine space-time data and converting the marine space-time data into graph data in an edge list format, wherein the edge list format stores edges in a list and each element of the list is one edge; specifically, the edge list has the form ((a, b), (b, c), ..., (y, z)), where (a, b) is an edge pointing from a to b, (b, c) is an edge pointing from b to c, and (y, z) is an edge pointing from y to z;
converting the graph data in the edge list format into graph data in a D-CSR format;
allocating a preset logical thread to each vertex of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card; initializing the distance from the source vertex to itself, the distances from the other vertices to the source vertex, the current traversal level, and the run flag, to obtain an initial connected graph; wherein a distance is the distance from any vertex to the source vertex, and the source vertex is the starting vertex when the BFS algorithm begins execution;
determining, according to the run flag of the initial connected graph, whether the initial connected graph contains unvisited vertices; when the initial connected graph contains no unvisited vertices, ending the BFS algorithm and obtaining a graph calculation result;
when the initial connected graph contains unvisited vertices, modifying the run flag and finding all boundary vertices according to the distances from the other vertices to the source vertex and the current traversal level; traversing all neighbor vertices of all boundary vertices and determining whether any of them is unvisited; when none of the neighbor vertices is unvisited, ending the BFS algorithm and obtaining a graph calculation result, wherein the boundary vertices are the vertices that should be visited at the current traversal level;
when unvisited neighbor vertices exist, updating the distance from each unvisited neighbor vertex to the source vertex to obtain the updated distance, specifically:
setting the distance from the unvisited neighbor vertex to the source vertex to the distance from the corresponding boundary vertex to the source vertex plus one;
modifying the run flag;
and finding all boundary vertices according to the updated distances from the unvisited neighbor vertices to the source vertex and the current traversal level, and so on until the BFS algorithm ends, so as to obtain the graph calculation result.
3. A management method for marine space-time data, characterized by comprising the following steps:
acquiring ocean space-time data, and converting the ocean space-time data into graph data in an edge list format;
converting the graph data in the edge list format into graph data in a D-CSR format;
and allocating a preset logical thread to each edge (u, v) in the ordered edge list format graph data of the D-CSR format graph data by means of the new hardware features of the GPU-like accelerator card, and summing the number of triangles generated by each edge (u, v) through a Triangle Count algorithm to obtain a graph calculation result.
4. A method of marine spatiotemporal data management according to claim 3, characterized in that said obtaining marine spatiotemporal data comprises:
importing local ocean space-time data through a local data interface and/or crawling open-source ocean space-time data through a crawler module.
5. A system for managing marine spatiotemporal data, the system comprising:
a graph storage module for storing ocean space-time data;
a client interface module for receiving an operation request of a client for the ocean space-time data and generating an operation instruction based on the operation request;
a graph service module for supplying the ocean space-time data to the graph calculation module according to the operation instruction;
a graph calculation module for performing the method of managing marine spatiotemporal data of any of claims 1 to 4;
and a metadata service module for managing the user's account and permission information, storage and sharding information, and graph space information.
6. A management device for marine spatiotemporal data, comprising at least one control processor and a memory for communicative connection with said at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform a method of marine spatiotemporal data management as claimed in any of claims 1 to 4.
7. A computer-readable storage medium, characterized by: the computer-readable storage medium stores computer-executable instructions for causing a computer to perform a method of managing marine spatiotemporal data as claimed in any of claims 1 to 4.
CN202310513268.4A 2023-05-09 2023-05-09 Management method, system, equipment and medium for ocean space-time data Active CN116226470B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310513268.4A CN116226470B (en) 2023-05-09 2023-05-09 Management method, system, equipment and medium for ocean space-time data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310513268.4A CN116226470B (en) 2023-05-09 2023-05-09 Management method, system, equipment and medium for ocean space-time data

Publications (2)

Publication Number Publication Date
CN116226470A CN116226470A (en) 2023-06-06
CN116226470B true CN116226470B (en) 2023-07-28

Family

ID=86571667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310513268.4A Active CN116226470B (en) 2023-05-09 2023-05-09 Management method, system, equipment and medium for ocean space-time data

Country Status (1)

Country Link
CN (1) CN116226470B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294695A (en) * 2016-08-08 2017-01-04 深圳市网安计算机安全检测技术有限公司 A kind of implementation method towards the biggest data search engine
CN108924938A (en) * 2018-08-27 2018-11-30 南昌大学 A kind of resource allocation methods for wireless charging edge calculations network query function efficiency
CN112131444A (en) * 2020-09-04 2020-12-25 中山大学 Low-space-overhead large-scale triangle counting method and system in graph
CN113742430A (en) * 2021-08-04 2021-12-03 北京大学 Method and system for determining number of triangle structures formed by nodes in graph data
CN115391069A (en) * 2022-10-27 2022-11-25 山东省计算中心(国家超级计算济南中心) Parallel communication method and system based on ocean mode ROMS

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022266608A1 (en) * 2021-06-13 2022-12-22 Artema Labs, Inc Systems and methods for blockchain-based collaborative content generation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Distributed processing of spatiotemporal ocean data:a survey;xiaoyong li等;《World wide web》;全文 *
基于非结构化三角网络的海洋标量数据可视化研究;张雅静等;《万方》;全文 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant