CN118227353A - Inter-core communication method, device, equipment and medium based on multi-core heterogeneous system - Google Patents


Info

Publication number
CN118227353A
Authority
CN
China
Prior art keywords
application
data
core
block
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410427667.3A
Other languages
Chinese (zh)
Inventor
李泳锋
陈戈军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Geely Holding Group Co Ltd
Zhejiang Zeekr Intelligent Technology Co Ltd
Original Assignee
Zhejiang Geely Holding Group Co Ltd
Zhejiang Zeekr Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Geely Holding Group Co Ltd, Zhejiang Zeekr Intelligent Technology Co Ltd filed Critical Zhejiang Geely Holding Group Co Ltd
Priority to CN202410427667.3A priority Critical patent/CN118227353A/en
Publication of CN118227353A publication Critical patent/CN118227353A/en
Pending legal-status Critical Current

Landscapes

  • Multi Processors (AREA)

Abstract

The invention discloses an inter-core communication method, device, equipment and medium based on a multi-core heterogeneous system. The method is suitable for data communication, through a shared memory, between an application program in a main core and a system in a co-core. The shared memory is divided into a plurality of memory blocks, which are configured to be used independently by a plurality of application programs and by the co-core system; the memory blocks used by the application programs are application blocks, and the memory block used by the co-core system is a system block. The method comprises the following steps: if the application program generates data to be processed, the application program writes the data to be processed into the application block it uses and sends the data to be processed in the application block to the system block; the application program sends an interrupt signal to the co-core system through the kernel module of the main core; and the co-core system, in response to the interrupt signal, reads the data to be processed in the system block for processing. With the invention, a plurality of application programs can communicate with the co-core system at the same time through their respective application blocks without interfering with one another, and no additional physical hardware or controller is needed, which greatly reduces the hardware cost, matches the CPU processing speed and greatly improves the communication efficiency.

Description

Inter-core communication method, device, equipment and medium based on multi-core heterogeneous system
Technical Field
The embodiment of the invention relates to the technical field of data communication, in particular to an inter-core communication method, device, equipment and medium based on a multi-core heterogeneous system.
Background
In the related art, software developed on an AMP (Asymmetric Multi-Processing) system generally runs different tasks relatively independently on a plurality of cores: a Linux system runs on the high-performance main core, and a logic program or a real-time operating system (RTOS, Real-Time Operating System) runs on the low-performance co-core. Each functional module (application) in the Linux operating system running on the main core needs to exchange data with the system running on the co-core. The current mainstream schemes are roughly as follows: the first performs data communication through a network socket; the second performs data communication through a bus such as SPI (Serial Peripheral Interface), I2C (Inter-Integrated Circuit) or UART (Universal Asynchronous Receiver/Transmitter); and the third performs data communication through a shared memory.
The scheme of data communication through a network socket allows each application on the Linux system to communicate independently with the system on the co-core, but its cost is high. The scheme requires independent network controllers on both the main core and the co-core, so the hardware cost is high. Socket communication also has to traverse a complete network protocol stack, which consumes a large amount of CPU computing resources and is no small burden for the low-performance co-core, and the communication speed is limited by CPU performance.
The scheme of data communication through a bus such as SPI, I2C or UART requires corresponding bus controller hardware on both the main core and the co-core. Because the scheme relies on a physical bus, at most one Linux program running on the main core can communicate with the system running on the co-core at any one time. Because only a single channel exists, a relatively complex daemon system needs to be deployed in the Linux system of the main core to handle concurrent access by multiple application programs. The communication speed of the scheme depends on the speed of the underlying bus, so it is not high.
The scheme of data communication through a shared memory does not need a dedicated hardware controller, which greatly reduces the hardware cost, and because the communication speed does not depend on an underlying hardware controller, it can directly match the CPU speed of the main core and the co-core. However, this scheme shares a disadvantage with the bus scheme described above: at most one Linux program running on the main core can communicate with the system running on the co-core at any one time. In addition, because only memory is shared, the peer system has to be notified by some other means after the data is updated; otherwise it cannot learn of the update in time. Finally, because only one shared memory region is used, the two parties can only use it in turn, so only half-duplex transmission can be realized.
Disclosure of Invention
The embodiment of the invention provides an inter-core communication method, device, equipment and medium based on a multi-core heterogeneous system, which aim to solve the problem that the data communication of the existing multi-core heterogeneous system can only be carried out by at most one Linux program running on a main core and a system running on a co-core at the same time.
In a first aspect, an embodiment of the present invention provides an inter-core communication method based on a heterogeneous multi-core system, which is suitable for data communication between an application program in a main core and a system in a co-core in a shared memory, where the shared memory is divided into a plurality of memory blocks, and the plurality of memory blocks are configured to be used by the plurality of application programs and the system in the co-core independently, where the memory blocks used by the application programs are application blocks, and the memory blocks used by the system in the co-core are system blocks, and the method includes:
If the application program generates data to be processed, the application program writes the data to be processed into the application block used by the application program, and sends the data to be processed in the application block to the system block;
The application program sends an interrupt signal to the system of the assistant core through the kernel module of the main core;
And the system of the assistant core responds to the interrupt signal to read the data to be processed in the system block for processing.
In a second aspect, an embodiment of the present invention further provides an inter-core communication device based on a multi-core heterogeneous system, including a unit for executing the method described above.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes a memory and a processor, where the memory stores a computer program, and the processor implements the method when executing the computer program.
In a fourth aspect, embodiments of the present invention also provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the above method.
The embodiment of the application provides an inter-core communication method, device, equipment and medium based on a multi-core heterogeneous system. The method is suitable for data communication, through a shared memory, between an application program in a main core and a system in a co-core. The shared memory is divided into a plurality of memory blocks, which are configured to be used independently by a plurality of application programs and by the co-core system; the memory blocks used by the application programs are application blocks, and the memory block used by the co-core system is a system block. The method comprises the following steps: if the application program generates data to be processed, the application program writes the data to be processed into the application block it uses and sends the data to be processed in the application block to the system block; the application program sends an interrupt signal to the co-core system through the kernel module of the main core; and the co-core system, in response to the interrupt signal, reads the data to be processed in the system block for processing. According to the application, the shared memory is divided into a plurality of application blocks and each application program is configured to use one application block, so that a plurality of application programs can communicate with the co-core system at the same time through their respective application blocks, independently and without mutual interference. No additional physical hardware or controller is needed, which greatly reduces the hardware cost, matches the CPU processing speed and greatly improves the communication efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a partition layout of a shared memory according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of steps of an inter-core communication method based on a multi-core heterogeneous system according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of an application program of an inter-core communication method based on a multi-core heterogeneous system according to an embodiment of the present invention to send data to a co-core system;
FIG. 5 is a schematic diagram of a kernel module loading procedure of an inter-kernel communication method based on a heterogeneous multi-kernel system according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an application program startup procedure of an inter-core communication method based on a multi-core heterogeneous system according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating steps of an inter-core communication method based on a heterogeneous multi-core system according to another embodiment of the present invention;
FIG. 8 is a schematic diagram of the co-core system sending data to an application program in an inter-core communication method based on a multi-core heterogeneous system according to an embodiment of the present invention;
Fig. 9 is a schematic diagram of an application scenario of an inter-core communication method based on a multi-core heterogeneous system according to an embodiment of the present invention;
FIG. 10 is a schematic block diagram of an inter-core communication device based on a multi-core heterogeneous system according to an embodiment of the present invention;
Fig. 11 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted as "when", "once", "in response to a determination" or "in response to detection", depending on the context. Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, as "upon determination", "in response to determination", "upon detection of the [described condition or event]" or "in response to detection of the [described condition or event]".
In order to facilitate understanding of the scheme of the embodiment of the present invention, technical meanings are explained first for important terms in the following.
A multi-core heterogeneous system refers to a computer system that is made up of a plurality of different types of processor cores. These processor cores may come from different manufacturers and have different instruction sets, memory architectures, and computing capabilities. The purpose of a multi-core heterogeneous system is to combine the advantages of different processor cores to improve overall performance, energy efficiency and adaptability. In a multi-core heterogeneous system, different processor cores can run simultaneously, and data exchange and synchronization are realized through mechanisms such as shared memory or caches. This may take full advantage of the different processor cores, such as some processors being good at serial processing, while others are good at parallel processing or having specific hardware accelerators, etc.
The shared memory is a technology for realizing inter-process communication in a computer system of a multiprocessor, and allows a plurality of processes to access the same physical memory space, so that data sharing and communication among the processes are realized, and the processes can share certain data and perform read-write operation through the shared memory, thereby avoiding repeated copying and transmission of the data and improving the efficiency of data transmission and the overall performance of the system.
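As a minimal sketch of the idea that several parties read and write the same memory without copying, the following Python fragment maps one anonymous region and accesses it through two views. The region size and the names `writer` and `reader` are illustrative assumptions, and a single process stands in here for the separate processes or cores described in the text.

```python
import mmap

# Anonymous 4 KiB mapping standing in for a shared memory region (size is illustrative).
region = mmap.mmap(-1, 4096)

# Two independent views over the same bytes play the roles of two communicating parties.
writer = memoryview(region)
reader = memoryview(region)

writer[0:5] = b"hello"       # the writer stores data in place
seen = bytes(reader[0:5])    # the reader sees the same bytes; nothing was transmitted
```

Both views refer to one underlying buffer, which is the property the shared-memory scheme relies on.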
A kernel module is code that can be dynamically loaded into the kernel to add or extend system functionality. Kernel modules can be loaded into and unloaded from the kernel without recompiling the entire kernel. The most common use of kernel modules is as device drivers, which support the operation of various hardware devices on the system. When a device driver is built as a kernel module, it can be dynamically loaded into the kernel when needed, enabling the system to identify and operate the hardware device.
The shared library part refers to the shared library files that an application program links against at run time; these library files contain the functions and data the application program needs in order to run. A shared library can be shared by a plurality of application programs, which avoids duplicated code and improves system efficiency. In Linux systems, a shared library typically has the extension .so (shared object).
The user library part refers to the library files that need to be linked when the application program is compiled; these library files contain the functions and data the application program needs at compile time. The user library is installed into the system together with the application program and is used by the application program at compile time. In Linux systems, a user library typically has the extension .a (static) or .so (dynamic).
The interrupt mechanism of the Linux system is the process by which the operating system kernel handles interrupt requests sent by hardware devices. When a hardware device (such as a CPU, memory or hard disk) generates an event (such as completing a task or encountering an error or exception), the hardware sends an interrupt signal to the kernel, and the kernel handles the signal through the interrupt mechanism. The interrupt mechanism enables Linux to respond to and handle various hardware events efficiently.
Referring to fig. 1, the inter-core communication method based on the multi-core heterogeneous system according to the embodiment of the present invention is applied to a multi-core heterogeneous system and uses a shared memory to transmit data efficiently between heterogeneous cores on an AMP (asymmetric multiprocessing) architecture. A Linux application program running on the main core communicates with the co-core system; on the main core side, the scheme consists of a kernel module and an application component library running in user mode.
Specifically, the multi-core heterogeneous system in the embodiment of the invention includes a main core and a co-core. The main core runs a Linux system and the co-core runs an RTOS (Real-Time Operating System). It can be understood that the main core and the co-core may also run other systems, which those skilled in the art can set according to actual requirements. The inter-core communication method based on the multi-core heterogeneous system is suitable for data communication, through a shared memory, between an application program in the main core and the system in the co-core. Referring to fig. 2, the shared memory is divided into a plurality of memory blocks, which are configured to be used independently by a plurality of application programs and by the co-core system; that is, each application program independently uses one memory block, and the co-core system independently uses one memory block. The memory blocks used by the application programs are application blocks, and the memory block used by the co-core system is the system block. For example, application A independently uses application block A for read and write operations, application B independently uses application block B for read and write operations, and application C independently uses application block C for read and write operations. Each application program uses its own corresponding application block, and this division keeps the application programs independent of one another so that they do not interfere with each other.
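The block division described above can be sketched as simple offset arithmetic. The following Python fragment is only an illustration under assumed values: the equal block size, the 64 KiB region, the placement of the system block first, and the names `Block` and `partition` are hypothetical and are not specified by the embodiment.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Block:
    name: str     # owner of the block, e.g. "app_A" or "system"
    offset: int   # byte offset from the start of the shared memory
    size: int     # block size in bytes

def partition(total_size: int, app_names: List[str], block_size: int) -> List[Block]:
    """Split the region into one system block plus one application block per application."""
    owners = ["system"] + ["app_" + name for name in app_names]
    if block_size * len(owners) > total_size:
        raise ValueError("shared memory too small for the requested blocks")
    return [Block(owner, index * block_size, block_size)
            for index, owner in enumerate(owners)]

# Hypothetical sizes: a 64 KiB region, 16 KiB per block, three applications.
layout = partition(total_size=64 * 1024, app_names=["A", "B", "C"], block_size=16 * 1024)
```

Because the blocks never overlap, an application that stays within its own offset range cannot disturb the data of another application.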
Referring to fig. 3, fig. 3 is a flowchart of an inter-core communication method based on a multi-core heterogeneous system according to an embodiment of the present invention. The inter-core communication method based on the multi-core heterogeneous system is described in detail below. As shown in fig. 3, the method comprises the steps of: S110-S130.
S110, if the application program generates data to be processed, the application program writes the data to be processed into the application block used by the application program, and sends the data to be processed in the application block to the system block;
S120, the application program sends an interrupt signal to the system of the assistant core through the kernel module of the main core;
S130, the system of the assistant core responds to the interrupt signal to read the data to be processed in the system block for processing.
In the embodiment of the present invention, the data to be processed is an instruction issued by the application program. The instruction may be triggered by a user, for example a shutdown instruction, or may be an instruction for a computing task that the application program needs in order to run, which is not limited herein. The application block is read and written by the application program and is divided in advance by the kernel module of the main core. The system block is read and written by the co-core system and is divided in advance by the co-core system. Data can be transferred between the application block and the system block in the shared memory, so that the application program and the co-core system can communicate. To learn of data updates in time, an inter-core interrupt is introduced: the application program and the co-core system each notify the peer of a data update through the interrupt mechanism, which spares the peer from polling and saves CPU resources. Specifically, when the application program generates data to be processed, the data is written into the application block in the shared memory that corresponds to the application program and is then sent from the application block to the system block in the shared memory; the kernel module of the main core then triggers an interrupt on the co-core, so that the co-core system reads the data to be processed from the system block.
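The benefit of notification over polling can be illustrated with a small Python model. In this sketch a `threading.Event` stands in for the inter-core interrupt and a thread stands in for the co-core; both are assumptions made for illustration, as are the names `doorbell`, `mailbox` and `handled`.

```python
import threading

mailbox = []                   # stands in for the system block
doorbell = threading.Event()   # stands in for the inter-core interrupt
handled = []                   # data the modelled co-core has processed

def co_core():
    # The receiver blocks here until it is notified; it never spins on the mailbox.
    doorbell.wait(timeout=5)
    handled.extend(mailbox)

worker = threading.Thread(target=co_core)
worker.start()

mailbox.append(b"task")        # the sender updates the data first
doorbell.set()                 # and only then notifies the peer
worker.join()
```

The sender writes before it signals, so the receiver always finds the data in place when it wakes up.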
In a specific example, referring to fig. 4, the process by which an application program sends data to the co-core system is as follows: first, the user library part of the application program hands the data to be processed to the shared library part of the application program; second, the shared library part requests an application block from the kernel module; third, the data to be processed is copied into the requested application block, and the application block sends the data to the system block in the shared memory; fourth, the shared library part notifies the kernel module that the data is ready; fifth, the kernel module sends an interrupt signal to the co-core to trigger an interrupt on the co-core, so that the co-core system reads the data to be processed from the system block, processes it, and returns the data node after processing.
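The sending flow above (S110 to S130) can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the patented implementation: a `bytearray` stands in for the shared memory, the block offsets and the 4-byte length header are hypothetical, and the interrupt is reduced to a direct function call.

```python
import struct

SHM = bytearray(4096)            # stands in for the physical shared memory
APP_BLOCK = (0, 1024)            # hypothetical (offset, size) of the application block
SYS_BLOCK = (1024, 1024)         # hypothetical (offset, size) of the system block
HEADER = struct.Struct("<I")     # assumed framing: 4-byte little-endian payload length

processed = []                   # what the modelled co-core has handled so far

def write_message(block, payload):
    """Store a length-prefixed payload at the start of a block."""
    offset, size = block
    if HEADER.size + len(payload) > size:
        raise ValueError("payload does not fit in block")
    HEADER.pack_into(SHM, offset, len(payload))
    start = offset + HEADER.size
    SHM[start:start + len(payload)] = payload

def read_message(block):
    """Read back the length-prefixed payload stored in a block."""
    offset, _ = block
    (length,) = HEADER.unpack_from(SHM, offset)
    start = offset + HEADER.size
    return bytes(SHM[start:start + length])

def co_core_interrupt_handler():
    # S130: the co-core reads the system block in response to the interrupt.
    processed.append(read_message(SYS_BLOCK))

def kernel_module_notify():
    # S120: the kernel module raises the inter-core interrupt (a plain call in this model).
    co_core_interrupt_handler()

def app_send(payload):
    write_message(APP_BLOCK, payload)                   # S110: write to own application block
    write_message(SYS_BLOCK, read_message(APP_BLOCK))   # S110: forward to the system block
    kernel_module_notify()                              # S120: notify the co-core

app_send(b"shutdown")
```

After `app_send` returns, `processed` holds the payload that the modelled co-core read from the system block.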
It should be noted that when multiple application programs communicate with the co-core system at the same time, each application program writes its own data to be processed into its corresponding application block, and each application block then sends its data to the system block. For example, when two application programs communicate with the co-core system simultaneously, the two application programs use two separate application blocks to send data to the system block. Thus, multiple application programs can communicate with the co-core system at the same time.
According to the embodiment of the invention, because the shared memory is divided into application blocks, each application program on the main core can use its corresponding application block to communicate independently with the co-core system without mutual interference, and the interrupt mechanism notifies the peer of data updates in time so that the corresponding data is read promptly. In summary, by combining shared memory with inter-core interrupts, the embodiment of the invention enables application programs running on the main core system of a multi-core heterogeneous system to communicate with the system on the co-core efficiently and independently, without additional physical hardware or controllers, which greatly reduces hardware cost, matches the CPU processing speed and greatly improves communication efficiency.
In an embodiment, referring to fig. 2, the application block is divided into a plurality of application data channels, the system block is divided into a plurality of system data channels, and the step S110 specifically includes: the application program respectively writes the data to be processed into a plurality of application data channels of the application block used by the application program, and concurrently sends the data to be processed in the application data channels to a plurality of system data channels of the system block.
Specifically, to solve the problem of data concurrency, the application block is further divided into a plurality of application data channels, each of which carries one type of data to be processed, so that data can be sent concurrently. Similarly, the system block is divided into a plurality of system data channels. For example, application A needs to send data to be processed data 01 and data 02, and application block A is divided into application data channel 01 and application data channel 02; application A writes data 01 into application data channel 01 and data 02 into application data channel 02, and the two channels transmit data 01 and data 02 at the same time. An application program can therefore use several channels in its application block for data transmission simultaneously. Because each application block is divided into a plurality of channels, an application program on the main core can send several pieces of data concurrently without interference, and the Linux application side does not need a complex daemon to collect and distribute data.
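The channel division can be sketched in the same way as the block division. The following Python fragment assumes equally sized channels, a 4 KiB application block at offset 4096 and two channels; these values and the helper names `channel_offsets` and `write_channel` are illustrative assumptions only.

```python
def channel_offsets(block_offset, block_size, channel_count):
    """Return (offset, size) of each equally sized data channel inside a block."""
    if channel_count <= 0 or block_size % channel_count:
        raise ValueError("block size must divide evenly into channels")
    size = block_size // channel_count
    return [(block_offset + index * size, size) for index in range(channel_count)]

shm = bytearray(8192)   # stands in for the shared memory
channels = channel_offsets(block_offset=4096, block_size=4096, channel_count=2)

def write_channel(index, payload):
    """Write one type of data into its own channel of the application block."""
    offset, size = channels[index]
    if len(payload) > size:
        raise ValueError("payload larger than channel")
    shm[offset:offset + len(payload)] = payload

write_channel(0, b"data01")   # application data channel 01
write_channel(1, b"data02")   # application data channel 02
```

Each type of data lands in its own address range, so the two writes do not need to be serialised against each other.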
In an embodiment, referring to fig. 2, the application data channel and the system data channel are each divided into a transmit queue and a receive queue, and the step S110 specifically includes: the application program writes the data to be processed into the sending queues of the application data channels of the application block used by the application program, and sends the data to be processed in the sending queues of the application data channels to the receiving queues of the system data channels of the system block simultaneously.
Specifically, to solve the problem that a shared memory can only be used by two parties in turn and therefore only supports half-duplex transmission, each application data channel and each system data channel is further divided into a sending queue and a receiving queue. Queue nodes for holding data are arranged in the sending queue and the receiving queue; for example, 128 nodes are arranged in the sending queue and the receiving queue. Through the sending queue and the receiving queue, a data channel can send and receive data simultaneously and independently, which realizes full-duplex transmission. Specifically, the sending queue of an application data channel of the application block can send data to be processed to the receiving queue of a system data channel of the system block, and likewise the sending queue of a system data channel of the system block can send data to be processed to the receiving queue of an application data channel of the application block. For example, application data channel 01 of application block A needs to send data 01 and application data channel 02 needs to send data 02, while system data channel 01' of the system block f needs to send data 01' and system data channel 02' needs to send data 02'.
Data 01 is sent to the receiving queue of system data channel 01' of the system block through the sending queue of application data channel 01 of application block A, and data 02 is sent to the receiving queue of system data channel 02' of the system block through the sending queue of application data channel 02 of application block A. At the same time, the sending queue of system data channel 01' can send data 01' to the receiving queue of application data channel 01, and the sending queue of system data channel 02' can send data 02' to the receiving queue of application data channel 02. Because the two queues of each channel send and receive independently, a full-duplex effect is achieved, and the division of the shared memory into memory blocks, data channels and sending and receiving queues realizes multi-application, multi-channel, full-duplex data communication.
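The full-duplex arrangement of sending and receiving queues can be modelled as follows. This Python sketch uses `collections.deque` as a stand-in for the queue nodes in shared memory and applies the 128-node figure from the example as a per-queue limit, which is an assumption; the class name `Channel` and the function `transfer` are hypothetical.

```python
from collections import deque

QUEUE_NODES = 128   # node count from the example; treated here as a per-queue limit

class Channel:
    """One data channel with an independent sending queue and receiving queue."""
    def __init__(self):
        self.tx = deque()   # sending queue
        self.rx = deque()   # receiving queue

    def enqueue(self, payload):
        if len(self.tx) >= QUEUE_NODES:
            raise BufferError("sending queue full")
        self.tx.append(payload)

def transfer(src, dst):
    """Move data from src's sending queue to dst's receiving queue, preserving order."""
    moved = 0
    while src.tx and len(dst.rx) < QUEUE_NODES:
        dst.rx.append(src.tx.popleft())
        moved += 1
    return moved

app_ch01, sys_ch01 = Channel(), Channel()
app_ch01.enqueue(b"data01")      # application side has something to send
sys_ch01.enqueue(b"data01'")     # system side has something to send at the same time

transfer(app_ch01, sys_ch01)     # application -> system direction
transfer(sys_ch01, app_ch01)     # system -> application direction
```

The two directions use disjoint queues, so neither side has to wait for the other to finish before sending.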
In an embodiment, referring to fig. 5, before the step S110, the method further includes the steps of: S101-S103.
S101, initializing a kernel module of the main core;
S102, dividing and distributing the shared memory through the kernel module, wherein the shared memory is divided into a plurality of application blocks;
S103, establishing a UIO device node for each application block through the kernel module.
Specifically, steps S101 to S103 are the loading process of the kernel module. The kernel module on the Linux system of the main core serves as the driver for inter-core communication; it is loaded together with the system when the Linux system starts, and at the same time it divides the application blocks, application data channels and sending and receiving queues. In this embodiment, loading the kernel module mainly completes the block division of the shared memory and the creation of the UIO device nodes, where each UIO device node provides access to one corresponding application block. UIO (Userspace I/O) is a mechanism in the Linux kernel that supports direct access to hardware devices from user space. Through UIO, a user space program can access a device as if it were an ordinary file, thereby controlling the hardware device and transferring data. In a specific example, the kernel module is loaded as follows: first, the kernel module is initialized, including hardware drivers such as those for the memory and the interrupt controller; second, the kernel module divides the shared memory into application blocks and establishes the corresponding application routing data structure; third, the kernel module checks whether the co-core has started successfully; fourth, if the check in the third step fails, it waits 1 second and checks again; fifth, after the co-core has started successfully, the kernel module creates the UIO device nodes, which serve as the application program interface.
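The five loading steps can be sketched as a user-space model in Python. A real kernel module would be written against the Linux kernel API; this fragment only illustrates the control flow. The function name `load_kernel_module`, the `co_core_started` callback, the block names and the `/dev/uioN` paths are assumptions for illustration.

```python
import time

def load_kernel_module(app_block_count, co_core_started, retry_delay=1.0, sleep=time.sleep):
    """Model the five loading steps; returns (uio_nodes, routing, checks)."""
    # Step 1: driver initialisation would happen here (omitted in this model).
    # Step 2: divide the shared memory into application blocks and set up routing entries.
    blocks = ["app_block_%d" % i for i in range(app_block_count)]
    routing = {name: None for name in blocks}   # no application registered yet
    # Steps 3 and 4: check the co-core, waiting retry_delay seconds between checks.
    checks = 1
    while not co_core_started():
        sleep(retry_delay)
        checks += 1
    # Step 5: create one UIO device node per application block.
    uio_nodes = {"/dev/uio%d" % i: name for i, name in enumerate(blocks)}
    return uio_nodes, routing, checks

# Usage: the co-core reports "not started" twice; waits are recorded instead of performed.
answers = iter([False, False, True])
waits = []
uio_nodes, routing, checks = load_kernel_module(3, lambda: next(answers), sleep=waits.append)
```

Creating the device nodes only after the co-core is up means an application can never register against a peer that is not yet able to respond.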
In an embodiment, referring to fig. 6, before the step S110, the method further includes the steps of: S104-S106.
S104, if the application program is started, searching all the UIO device nodes to obtain unregistered UIO device nodes, and registering the application program and the unregistered UIO device nodes;
S105, obtaining the layout of the application block corresponding to the UIO equipment node registered by the application program;
S106, establishing a mapping relation between the application program and the obtained layout of the application block corresponding to the UIO device node registered by the application program, and writing the mapping relation into a preset routing table.
Specifically, steps S104 to S106 constitute the application program startup process, which mainly completes the registration of the application program with a UIO device node, the acquisition of the application block's partition layout, the creation of the preset routing table, and the creation of the listening thread. The kernel module creates a UIO device node for each application block; when an application program needs to use an application block, it must first register with the corresponding UIO device node so that it can use that application block independently. After registration, the application program obtains the partition layout of the application block, that is, the application data channels and transmit/receive queues into which the block is divided, so that data can be sent and read. The preset routing table is then established, and the mapping relation between the application program and the application block is stored in it; with this table, the application program can be accurately identified when the co-core communicates. Finally, a listening thread is created to monitor notifications from the kernel module so that the application program learns of arriving data in time, and the application interface information for the application data channels and transmit/receive queues is created so that the application program can process the data.
In a specific example, the application program is started as follows: first, the application program starts; second, it searches for the UIO device nodes created by the kernel module; third, it traverses all UIO device nodes, finds an available node, and opens and registers it; fourth, it obtains the layout of the application block, application data channels, and transmit/receive queues from the UIO device node; fifth, it maps the application block, application data channels, and transmit/receive queues into the application program's memory; sixth, it creates a listening thread to monitor data arrival notifications from the kernel module; seventh, it creates the application interface information for the application data channels and transmit/receive queues; eighth, initialization is complete and communication is ready.
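The registration part of this startup sequence (finding an unregistered node, claiming it, and recording the mapping) might look like the following sketch. The `UioNode` and `Registry` classes, the layout dictionary, and the choice to key the routing table by node path are all hypothetical; the patent does not specify these data structures.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UioNode:
    path: str
    layout: dict                 # channel name -> queue names, as exposed by the node
    owner: Optional[str] = None  # None means the node is still unregistered


@dataclass
class Registry:
    nodes: list
    routing_table: dict = field(default_factory=dict)  # node path -> application name

    def register(self, app_name: str) -> UioNode:
        """Claim the first unregistered node and record the mapping."""
        for node in self.nodes:
            if node.owner is None:
                node.owner = app_name
                self.routing_table[node.path] = app_name
                return node
        raise RuntimeError("no unregistered UIO device node available")
```

In this model a second application can never be handed a node that is already owned, which mirrors the requirement that each application program uses its application block independently.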
In other embodiments, the user library part of the application program, working together with the kernel module, maps the application block that each application program needs to read and write directly into that application program's memory space. The application program then operates on its application block directly, avoiding data copies between kernel mode and user mode and thereby providing efficient data transmission.
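The effect of direct mapping can be illustrated with a small sketch. Here an anonymous `mmap` region stands in for the mapping that would be obtained from an opened UIO device node, and the 4-byte length prefix is an assumed framing rather than the format used by the patent.

```python
import mmap
import struct

BLOCK_SIZE = 4096             # assumed block size
HEADER = struct.Struct("<I")  # assumed framing: 4-byte little-endian length

# Anonymous mapping standing in for mmap() on an opened UIO device node.
shared = mmap.mmap(-1, BLOCK_SIZE)
view = memoryview(shared)     # writes through the view land in the mapping itself


def write_message(buf: memoryview, offset: int, payload: bytes) -> int:
    """Write a length-prefixed payload in place; return the next free offset."""
    end = offset + HEADER.size + len(payload)
    if end > len(buf):
        raise ValueError("payload does not fit in the block")
    HEADER.pack_into(buf, offset, len(payload))
    buf[offset + HEADER.size:end] = payload
    return end


def read_message(buf: memoryview, offset: int) -> bytes:
    """Read a length-prefixed payload starting at offset."""
    (length,) = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    return bytes(buf[start:start + length])
```

The writer stores bytes straight into the mapped region, with no intermediate `read`/`write` system call carrying the payload, which is the point of the direct-mapping embodiment.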
In one embodiment, referring to fig. 7, the inter-core communication method based on the multi-core heterogeneous system further includes the steps of: S210-S250.
S210, if the co-core system generates data to be processed, the co-core system writes the data to be processed into the system block and sends the data to be processed in the system block to the application block;
S220, the co-core system sends an interrupt signal to the kernel module;
S230, the kernel module responds to the interrupt signal and determines a target application program according to a preset routing table, wherein the preset routing table holds the mapping relation between the application block and the application program;
S240, the kernel module sends a data arrival notification to the target application program;
S250, the target application program responds to the data arrival notification and reads the data to be processed in the application block for processing.
In this embodiment of the present invention, steps S210 to S250 are the process by which the co-core system sends data to the application program. This process is similar to the process by which the application program sends data to the co-core system, except that it additionally locates the application program by using the preset routing table. Specifically, the data to be processed in the system block is first sent to the application block, and an interrupt is triggered to the kernel module, prompting the kernel module to query the routing table and find the application program corresponding to the application block. The kernel module then notifies that application program, which, on receiving the notification, reads the data to be processed from the application block and processes it.
The data to be processed in the system block may be sent to the application block in a static manner or in a dynamic manner. In the static manner, the system block sends the data to be processed to one or more fixed application blocks, for example, to application block a, application block b, and application block c. In the dynamic manner, the data is sent to an application block whose UIO device node has been opened; for example, if the UIO device node of application block a is opened, the data to be processed is sent to application block a.
In addition, the routing table is queried as follows: the application blocks are queried one by one to determine whether each has received the data to be processed; the preset routing table is then queried according to the application block that received the data to be processed, the application program corresponding to that application block is identified, and the identified application program is determined to be the target application program. For example, the system block sends the data to be processed to application block a, application block b, and application block c, and the application blocks are queried one by one; specifically, the states of all queue nodes of the receive queues in each application block are queried to check whether data to be processed has been received in the queue nodes. If the queue nodes of application block a and application block b hold no data to be processed and the queue nodes of application block c do, the mapping relation of application block c in the preset routing table is queried, and application program C is the target application program.
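The one-by-one query described above can be sketched as follows. The dictionaries standing in for the receive queues and for the preset routing table, and the function name `find_target_apps`, are hypothetical.

```python
def find_target_apps(recv_queues: dict, routing_table: dict) -> list:
    """Scan application blocks one by one; return the apps whose block holds pending data."""
    targets = []
    for block, queue in recv_queues.items():
        if queue:  # at least one queue node contains data to be processed
            targets.append(routing_table[block])
    return targets


# Mirrors the example in the text: only application block c has pending data.
recv_queues = {"a": [], "b": [], "c": ["data to be processed"]}
routing_table = {"a": "application A", "b": "application B", "c": "application C"}
targets = find_target_apps(recv_queues, routing_table)
```

With the sample data, the scan skips blocks a and b and resolves block c to application C through the routing table.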
In a specific example, referring to fig. 8: first, the co-core system applies for a queue node in the system block and prepares the data; second, the co-core system triggers an interrupt; third, after receiving the interrupt, the kernel module of the main core queries the queue states one by one and locates the data user, that is, the target application program, by querying the routing table; fourth, the kernel module of the main core notifies the target application program that new data has arrived; fifth, the user library part of the application program queries the queue states one by one and notifies the application program; sixth, the user library part of the application program reads the data from the queue; seventh, the user library part of the application program returns the queue node.
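The first and last steps above (applying for a queue node and returning it) suggest that queue nodes are drawn from a fixed pool. A minimal sketch of such a pool follows; the `NodePool` class and its methods are assumptions, since the patent does not describe how nodes are managed.

```python
class NodePool:
    """Fixed pool of queue nodes (hypothetical); nodes are borrowed and returned."""

    def __init__(self, count: int):
        self.free = list(range(count))  # indices of nodes not currently in use
        self.payload = {}               # node index -> data held by that node

    def acquire(self, data):
        """Apply for a free node and place the prepared data in it."""
        if not self.free:
            raise RuntimeError("no free queue node")
        node = self.free.pop()
        self.payload[node] = data
        return node

    def release(self, node):
        """Read the data out of a node and return the node to the pool."""
        data = self.payload.pop(node)
        self.free.append(node)
        return data
```

Returning the node after the read is what lets a bounded region of shared memory carry an unbounded stream of messages.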
In order to further illustrate the embodiments of the present invention, the following description is made with reference to fig. 9 by using an actual application scenario, where the application scenario is that a co-core performs data communication with a power management application of a main core, so as to achieve power shutdown.
First, the co-core sends a prepare-to-shutdown command to the power management application, and the power management application closes all applications after receiving it. The power management application then returns a prepare-to-shutdown-complete message to the co-core, and the co-core sends a system shutdown instruction to the power management application. After receiving the system shutdown instruction, the power management application closes the system services, returns a system-shutdown-preparation-complete message to the co-core, and exits. Finally, after receiving the system-shutdown-preparation-complete message, the co-core turns off the system power supply.
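The shutdown handshake above can be modeled as a small message exchange. The message names, the `PowerManagementApp` class, and the direct function call standing in for inter-core transport are all hypothetical and are chosen only to make the ordering explicit.

```python
from enum import Enum, auto


class Msg(Enum):
    PREPARE_SHUTDOWN = auto()       # co-core -> power management application
    PREPARE_DONE = auto()           # power management application -> co-core
    SYSTEM_SHUTDOWN = auto()        # co-core -> power management application
    SYSTEM_SHUTDOWN_READY = auto()  # power management application -> co-core


class PowerManagementApp:
    def __init__(self):
        self.apps_running = True
        self.services_running = True
        self.exited = False

    def handle(self, msg: Msg) -> Msg:
        if msg is Msg.PREPARE_SHUTDOWN:
            self.apps_running = False      # close all applications
            return Msg.PREPARE_DONE
        if msg is Msg.SYSTEM_SHUTDOWN:
            self.services_running = False  # close system services
            self.exited = True             # the application itself exits
            return Msg.SYSTEM_SHUTDOWN_READY
        raise ValueError(f"unexpected message: {msg}")


def co_core_power_off(app: PowerManagementApp) -> bool:
    """Run the handshake; True means the co-core may now cut system power."""
    if app.handle(Msg.PREPARE_SHUTDOWN) is not Msg.PREPARE_DONE:
        return False
    if app.handle(Msg.SYSTEM_SHUTDOWN) is not Msg.SYSTEM_SHUTDOWN_READY:
        return False
    return True
```

The co-core only proceeds to the next stage after the matching acknowledgement, so power is never cut while applications or services are still running.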
In summary, so that the application programs running on the main core can each communicate with the co-core system independently, efficiently, and without interference, the whole shared memory is divided into a plurality of application blocks, and each application program uses one application block. Each application block consists of a plurality of channels, and the application program can use the plurality of channels in its application block simultaneously for data transmission. Further, to meet the requirement of full-duplex transmission, each channel of an application block consists of two queues, one for transmitting and one for receiving, so as to realize multi-application, multi-channel, full-duplex data communication.
Fig. 10 is a schematic block diagram of an inter-core communication device 300 based on a multi-core heterogeneous system according to an embodiment of the present invention. As shown in fig. 10, the present invention further provides an inter-core communication device 300 based on the multi-core heterogeneous system, corresponding to the above inter-core communication method based on the multi-core heterogeneous system. The inter-core communication apparatus 300 based on the multi-core heterogeneous system includes a unit for performing the above-described inter-core communication method based on the multi-core heterogeneous system, and may be configured in a computer device. Specifically, referring to fig. 10, the inter-core communication device 300 based on the multi-core heterogeneous system includes a first writing unit 301, a first interrupt unit 302, and a first reading unit 303.
The first writing unit 301 is configured to, if the application generates data to be processed, write the data to be processed into the application block used by the application, and send the data to be processed in the application block to the system block; a first interrupt unit 302, configured to send, by the application program, an interrupt signal to the system of the co-core through a kernel module of the main core; the first reading unit 303 is configured to read, by the system of the co-core, the data to be processed in the system block in response to the interrupt signal for processing.
In an embodiment, the first writing unit 301 is further configured to write the plurality of data to be processed into the plurality of application data channels of the application block used by the application program, and send the plurality of data to be processed in the plurality of application data channels to the plurality of system data channels of the system block concurrently.
In an embodiment, the first writing unit 301 is further configured to write the plurality of data to be processed into the transmission queues of the plurality of application data channels of the application block used by the application program, and send the data to be processed in the transmission queues of the plurality of application data channels to the receiving queues of the plurality of system data channels of the system block concurrently.
In one embodiment, the inter-core communication device 300 based on the multi-core heterogeneous system further includes: an initialization unit, a partitioning unit, and a node creation unit.
The initialization unit is used for initializing a kernel module of the main core; the dividing unit is used for dividing and distributing the shared memory through the kernel module, wherein the shared memory is divided into a plurality of memory blocks; and the node creation unit is used for creating the UIO equipment node for each memory block through the kernel module.
In one embodiment, the inter-core communication device 300 based on the multi-core heterogeneous system further includes: registration unit, layout acquisition unit, route establishment unit.
The registration unit is used for searching all the UIO device nodes to acquire the unregistered UIO device nodes if the application program is started, and registering the application program and the unregistered UIO device nodes; a layout obtaining unit, configured to obtain a layout of the application block corresponding to the UIO device node registered by the application program; the route establishing unit is used for establishing a mapping relation between the application program and the obtained layout of the application block corresponding to the UIO equipment node registered by the application program, and writing the mapping relation into a preset route table.
In one embodiment, the inter-core communication device 300 based on the multi-core heterogeneous system further includes: the device comprises a second writing unit, a second interruption unit, a routing unit, a notification unit and a second reading unit.
The second writing unit is configured to, if the co-core system generates the data to be processed, write, by the co-core system, the data to be processed into the system block, and send the data to be processed in the system block to the application block; the second interrupt unit is configured to send, by the co-core system, an interrupt signal to the kernel module; the routing unit is configured to determine, by the kernel module in response to the interrupt signal, a target application program according to a preset routing table, wherein the preset routing table holds the mapping relation between the application block and the application program; the notification unit is configured to send, by the kernel module, a data arrival notification to the target application program; and the second reading unit is configured to read, by the target application program in response to the data arrival notification, the data to be processed in the application block for processing.
In an embodiment, the routing unit comprises: a query unit and a determination unit.
The query unit is used for querying whether the application blocks receive the data to be processed one by one; and the determining unit is used for inquiring the preset routing table according to the application block receiving the data to be processed, acquiring the application program corresponding to the application block receiving the data to be processed, and determining the acquired application program as a target application program.
The inter-core communication apparatus 300 based on the multi-core heterogeneous system described above may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 11.
Referring to fig. 11, fig. 11 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be an electronic device with communication capabilities on an automobile.
With reference to FIG. 11, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform an inter-core communication method based on a multi-core heterogeneous system.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform an inter-core communication method based on a multi-core heterogeneous system.
The network interface 505 is used for network communication with other devices. It will be appreciated by those skilled in the art that the structure shown in FIG. 11 is merely a block diagram of the part of the structure related to the solution of the present invention and does not limit the computer device 500 to which the solution is applied; a particular computer device 500 may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
Wherein the processor 502 is adapted to run a computer program 5032 stored in a memory for implementing the steps of the above method.
It should be appreciated that in embodiments of the present application, the processor 502 may be a central processing unit (Central Processing Unit, CPU), and the processor 502 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Those skilled in the art will appreciate that all or part of the flow in a method embodying the above described embodiments may be accomplished by computer programs instructing the relevant hardware. The computer program comprises program instructions, and the computer program can be stored in a storage medium, which is a computer readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer readable storage medium. The storage medium stores a computer program, wherein the computer program includes program instructions. The program instructions, when executed by a processor, cause the processor to perform the steps of the method described above.
The storage medium may be a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. For example, the division of each unit is only one logic function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be combined, divided and deleted according to actual needs. In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The integrated unit may be stored in a storage medium if implemented in the form of a software functional unit and sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device to perform all or part of the steps of the method according to the embodiments of the present invention.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for portions of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (10)

1. An inter-core communication method based on a multi-core heterogeneous system, suitable for data communication, through a shared memory, between an application program in a main core and a system in a co-core, wherein the shared memory is divided into a plurality of memory blocks, the plurality of memory blocks are configured to be used independently by a plurality of application programs and by the system in the co-core, the memory blocks used by the application programs are application blocks, and the memory blocks used by the system in the co-core are system blocks, the method comprising:
If the application program generates data to be processed, the application program writes the data to be processed into the application block used by the application program, and sends the data to be processed in the application block to the system block;
The application program sends an interrupt signal to the system of the co-core through the kernel module of the main core;
And the system of the co-core responds to the interrupt signal to read the data to be processed in the system block for processing.
2. The method of claim 1, wherein the application block is divided into a plurality of application data channels, the system block is divided into a plurality of system data channels, the step of the application program writing the data to be processed into the application block it uses and sending the data to be processed in the application block to the system block comprises:
The application program respectively writes the data to be processed into a plurality of application data channels of the application block used by the application program, and concurrently sends the data to be processed in the application data channels to a plurality of system data channels of the system block.
3. The method of claim 2, wherein the application data channel and the system data channel are each divided into a transmit queue and a receive queue, and wherein the step of the application program writing a plurality of the data to be processed into a plurality of the application data channels of the application block used by the application program and concurrently transmitting a plurality of the data to be processed in a plurality of the application data channels to a plurality of the system data channels of the system block comprises:
the application program writes the data to be processed into the sending queues of the application data channels of the application block used by the application program, and sends the data to be processed in the sending queues of the application data channels to the receiving queues of the system data channels of the system block simultaneously.
4. A method according to claim 3, wherein before the step of the application program writing the data to be processed into the application block used by the application program and sending the data to be processed in the application block to the system block, the method further comprises:
Initializing a kernel module of the main core;
Dividing and distributing the shared memory through the kernel module, wherein the shared memory is divided into a plurality of application blocks;
And creating a UIO device node for each application block through the kernel module.
5. The method of claim 4, wherein before the step of the application program writing the data to be processed into the application block used by the application program and sending the data to be processed in the application block to the system block, the method further comprises:
If the application program is started, searching all the UIO device nodes to obtain unregistered UIO device nodes, and registering the application program and the unregistered UIO device nodes;
Acquiring the layout of the application block corresponding to the UIO device node registered by the application program;
And establishing a mapping relation between the application program and the obtained layout of the application block corresponding to the UIO device node registered by the application program, and writing the mapping relation into a preset routing table.
6. The method according to any one of claims 1-5, further comprising:
If the system of the co-core generates the data to be processed, the system of the co-core writes the data to be processed into the system block and sends the data to be processed in the system block to the application block;
The system of the co-core sends an interrupt signal to the kernel module;
The kernel module responds to the interrupt signal and determines a target application program according to a preset routing table;
The kernel module sends a data arrival notification to the target application program;
And the target application program responds to the data arrival notification and reads the data to be processed in the application block for processing.
7. The method of claim 6, wherein the step of determining the target application program according to the preset routing table comprises:
Querying, one by one, whether the application blocks have received the data to be processed;
And querying the preset routing table according to the application block that received the data to be processed, identifying the application program corresponding to the application block that received the data to be processed, and determining the identified application program as the target application program.
8. An inter-core communication device based on a multi-core heterogeneous system, comprising means for performing the method of any of the preceding claims 1-7.
9. A computer device, characterized in that it comprises a memory on which a computer program is stored, and a processor which, when executing the computer program, implements the method according to any of claims 1-7.
10. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method according to any of claims 1-7.
CN202410427667.3A 2024-04-10 2024-04-10 Inter-core communication method, device, equipment and medium based on multi-core heterogeneous system Pending CN118227353A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410427667.3A CN118227353A (en) 2024-04-10 2024-04-10 Inter-core communication method, device, equipment and medium based on multi-core heterogeneous system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410427667.3A CN118227353A (en) 2024-04-10 2024-04-10 Inter-core communication method, device, equipment and medium based on multi-core heterogeneous system

Publications (1)

Publication Number Publication Date
CN118227353A true CN118227353A (en) 2024-06-21

Family

ID=91504612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410427667.3A Pending CN118227353A (en) 2024-04-10 2024-04-10 Inter-core communication method, device, equipment and medium based on multi-core heterogeneous system

Country Status (1)

Country Link
CN (1) CN118227353A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination