CN117785726A - Application processor, system on chip and operation method thereof - Google Patents



Publication number
CN117785726A
Authority
CN
China
Prior art keywords
physical address, memory, address, access, sub
Prior art date
Legal status
Pending
Application number
CN202311259788.3A
Other languages
Chinese (zh)
Inventor
柳俊熙
金昇勋
Current Assignee
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date
Filing date
Publication date
Priority claimed from Korean patent application No. 10-2023-0048985 (published as KR 20240045069 A)
Application filed by Samsung Electronics Co Ltd
Publication of CN117785726A

Abstract

An application processor, a system on a chip (SoC), and a method of operating the same are provided. The SoC includes a first processor that outputs a first access address; a system bus configured to transmit the first access address to a memory if the first access address corresponds to a physical address area of the memory, and to transmit the first access address to processing circuitry other than the memory if the first access address corresponds to a shadow physical address area outside the physical address area of the memory; and a sub-processing circuit that receives the first access address from the first processor via the system bus, translates the first access address into a second access address corresponding to the physical address area, and sends the second access address to the system bus to access the memory.

Description

Application processor, system on chip and operation method thereof
Cross Reference to Related Applications
The present application claims priority to Korean patent application No. 10-2022-0124557, filed with the Korean Intellectual Property Office (KIPO) on September 29, 2022, and Korean patent application No. 10-2023-0048985, filed with KIPO on April 13, 2023, the entire disclosures of which are incorporated herein by reference.
Technical Field
The present inventive concept relates generally to a system on a chip (SoC), and more particularly, to a SoC including a sub-processing circuit for supporting applications executed by a processor, to an application processor, and to a method of operating the SoC.
Background
A system on chip (SoC) integrates a complex system having various functions into a single semiconductor chip. Computer, communication, and broadcast technologies are converging, and the use of Application Specific Integrated Circuits (ASICs) and Application Specific Standard Products (ASSPs) is shifting toward SoC technology. In addition, the miniaturization and weight reduction of Information Technology (IT) devices are driving demand for SoC-related technology.
With the growth of mobile applications, processor and memory usage continues to increase. Thus, a new SoC may be needed that supports service software for users while minimizing increases in processor and memory usage within limited power-consumption design specifications.
Disclosure of Invention
Embodiments of the inventive concept may provide a system on a chip (SoC) and/or an application processor for distinguishing and processing memory requests from software.
According to an embodiment of the inventive concept, there is provided a system on a chip (SoC) including: a first processor configured to output a first access address; a system bus configured to: transmitting the first access address to the memory if the first access address corresponds to a physical address region of the memory, and transmitting the first access address to other processing circuitry outside the memory if the first access address corresponds to a shadow (shadow) physical address region outside the physical address region of the memory; and a sub-processing circuit configured to receive the first access address from the first processor via the system bus, translate the first access address into a second access address corresponding to the physical address region, and send the second access address to the system bus to access the memory.
According to an embodiment of the inventive concept, there is provided an application processor including: a main processor configured to convert a first virtual address generated when an application is executed into a first physical address by using a first page table including mapping information between a physical address indicating one of a physical address area and a shadow physical address area of a memory and a virtual address indicating an address area of a virtual memory identified by the application, and output a first access request including the first physical address; a router configured to receive the first access request from the main processor, send the first access request to the memory in response to the first physical address corresponding to the physical address area of the memory, and output the first access request to an intellectual property (IP) core other than the memory in response to the first physical address corresponding to the shadow physical address area of the memory; a sub-processing circuit configured to receive the first access request from the router, process data associated with the first access request, and translate the first physical address into a second virtual address; and a first Memory Management Unit (MMU) configured to translate the second virtual address into a second physical address corresponding to the physical address area of the memory, wherein the router is configured to receive a second access request including the second physical address from the first MMU, and send the second access request to the memory if the second physical address corresponds to the physical address area of the memory.
According to an embodiment of the inventive concept, there is provided a method of operating a system on a chip (SoC), the method including: transmitting, by the processor, a first access request signal including a first physical address to the router; if the first physical address does not correspond to the physical address area of the memory, sending a first access request signal to the sub-processing circuit by the router; converting, by the sub-processing circuit, the first physical address to a second physical address corresponding to a physical address region of the memory; transmitting, by the sub-processing circuit, a second access request signal including a second physical address to the router; and sending, by the router, a second access request signal to the memory.
Drawings
The embodiments will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings in which:
fig. 1 is a block diagram illustrating a system on a chip (SoC) according to an embodiment;
FIGS. 2A and 2B are diagrams illustrating a page table according to an embodiment and address regions according to an embodiment, respectively;
fig. 3 is a block diagram schematically illustrating a SoC according to an embodiment;
fig. 4A and 4B are block diagrams illustrating a path of a processor accessing memory in a SoC according to an embodiment;
FIG. 5 is a flow chart of a method of accessing memory by a processor in a SoC according to an embodiment;
fig. 6 is a block diagram illustrating operation of a SoC according to an embodiment;
fig. 7 is a block diagram illustrating operation of a SoC according to an embodiment;
fig. 8A and 8B are block diagrams illustrating a read operation of a SoC according to an embodiment;
FIG. 9 is a hybrid block diagram illustrating compressed image data stored in a compression buffer according to an embodiment;
FIG. 10 is a hybrid block diagram of the software layers of the SoC according to an embodiment;
FIGS. 11A and 11B are diagrams, the former showing a match between a virtual address space and an effective physical address space according to a comparative example, and the latter showing a result of a memory access according to a comparative example;
FIGS. 12A and 12B are diagrams, the former illustrating address matching between a virtual address space and a physical address space according to an embodiment, and the latter illustrating matching between an Application Programming Interface (API)-based virtual address space and a physical address space according to an embodiment;
FIG. 13 is a block diagram illustrating an application processor according to an embodiment;
fig. 14A and 14B are block diagrams illustrating read and write operations of an application processor according to an embodiment;
FIG. 15 is a table schematic diagram showing an example of an address matching table according to an embodiment;
fig. 16 is a block diagram illustrating a SoC according to an embodiment; and
fig. 17 is a block diagram illustrating an electronic device in which an application processor is installed, according to an embodiment.
Detailed Description
Hereinafter, the inventive concept is described in detail by way of non-limiting examples with reference to the accompanying drawings. The components described by referring to terms such as parts or units, modules, blocks, etc. used in the detailed description and functional blocks shown in the drawings may be implemented in software, hardware, or a combination thereof. For example, the software may be machine code, firmware, embedded code, and/or application software. For example, the hardware may include circuitry, electronic circuitry, processors, computers, integrated circuits, integrated circuit cores, pressure sensors, inertial sensors, microelectromechanical systems (MEMS), passive devices, and/or combinations thereof.
Fig. 1 illustrates a system on chip (SoC) according to an embodiment.
The SoC 100 may be mounted on an electronic device, and for example, the electronic device may include a mobile device such as a smart phone, tablet, personal computer (PC), personal digital assistant (PDA), portable multimedia player (PMP), laptop computer, wearable device, Global Positioning System (GPS) device, electronic book terminal, digital broadcast terminal, MP3 player, digital camera, wearable computer, navigation system, drone, and the like. For example, the electronic devices may also include internet of things (IoT) devices, home appliances, and/or Advanced Driver Assistance Systems (ADAS). The SoC 100 may include a controller and a processor that controls the operation of the electronic device. The SoC 100 may be referred to as an Application Processor (AP), a mobile AP, and/or a control chip.
The SoC 100 may be installed in an electronic device, such as a camera, a smart phone, a wearable device, an internet of things (IoT) device, a home appliance, a tablet, a PDA, a PMP, a navigation system, an unmanned aerial vehicle, an ADAS, or the like.
Further, the SoC 100 may be mounted on electronic devices equipped as components in vehicles, furniture, manufacturing facilities, doors, and various measurement devices.
Referring to fig. 1, soc 100 may include a processor 110, sub-processing circuitry 120, memory 130, an intellectual property core (IP) such as IP1 140 and/or IP2 150, and a system bus 160. The SoC 100 may also include a communication function module, an image sensor module, and the like. Components of SoC 100, such as sub-processing circuit 120, memory 130, and IPs 140 and 150, may send and receive data via system bus 160.
The processor 110 may be a main processor of the SoC 100 and may control the operation of the SoC 100 as a whole. The processor 110 may run an Operating System (OS) and execute various applications (application software) of the electronic device in which the SoC 100 is installed. For example, the processor 110 may process various types of arithmetic and/or logical operations. Processor 110 may include a single processor core (single core) or multiple processor cores (multi-core). Processor 110 may include a cache(s) for each of one or more processor cores to perform various operations. The cache may temporarily store instructions and/or parameter values for the execution of the application by the processor 110.
The processor 110 may store data in the memory 130 or read data from the memory 130 during execution of the operating system and applications. For example, an application may read data from memory 130, process the read data, and store the processed data back into memory 130. To read or write such data, the processor 110 may send the memory 130 an access request for reading or writing, and the access request may include a physical address indicating the area of the memory 130 where the data is stored or is to be written.
The application may read data from or write data to a virtual address space provided in the virtual space, and the processor 110 may translate a virtual address (which may be referred to as a logical address) indicating the virtual address space into a physical address indicating one of a plurality of address areas of an effective physical address space of the memory 130 where the data is or will be stored, for example, by using a page table (e.g., PGTB in fig. 2A).
In an embodiment, the page table may map a virtual address either to the effective physical address space of the memory 130 (i.e., to a physical address representing one of the address regions of the actual physical address space) or to a physical address in an address region outside the effective physical address space (e.g., the shadow physical address space). Hereinafter, a physical address corresponding to the effective physical address space of the memory 130 is referred to as an "effective physical address", and a physical address corresponding to the shadow physical address space is referred to as a "shadow physical address".
Fig. 2A illustrates a page table according to an embodiment, and fig. 2B illustrates an address area according to an embodiment.
Referring to fig. 2A, the page table PGTB may include a virtual address VA and a physical address PA. The virtual address VA may indicate an address region of the virtual address space VAS, and the physical address PA may indicate a corresponding address region of the effective physical address space EPAS or a corresponding address region of the shadow physical address space SPAS. The virtual address space VAS may include an address space provided by virtual memory technology and may be identified by an operating system and applications executing in a processor (110 in fig. 1). The effective physical address space EPAS may have the same size as the system memory (e.g., memory 130), but is not limited thereto. The shadow physical address space SPAS may include an area outside of system memory.
Referring to fig. 2B, when the memory 130 has a capacity of 2 gigabytes (GB) and the SoC 100 is a 32-bit system, the address area may include a first address area AR1 of 2 GB at addresses 0x0000_0000 to 0x7FFF_FFFF, a second address area AR2 of 1 GB at addresses 0x8000_0000 to 0xBFFF_FFFF, and a third address area AR3 of 1 GB at addresses 0xC000_0000 to 0xFFFF_FFFF. The first address area AR1 may be set as the effective physical address space EPAS, and one of the address areas outside the first address area AR1, for example, the third address area AR3, may be set as the shadow physical address space SPAS.
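The three address areas above can be sketched in a few lines. This is an illustrative model only: the region bounds follow fig. 2B, while the `classify` function and its labels are assumptions made here for illustration, not part of the disclosed embodiment.

```python
# Toy model of the 32-bit address map of fig. 2B.
AR1 = range(0x0000_0000, 0x8000_0000)    # 2 GB effective physical address space (EPAS)
AR2 = range(0x8000_0000, 0xC000_0000)    # 1 GB, unused in this example
AR3 = range(0xC000_0000, 0x1_0000_0000)  # 1 GB shadow physical address space (SPAS)

def classify(pa: int) -> str:
    """Classify a physical address as effective, shadow, or neither."""
    if pa in AR1:
        return "effective"
    if pa in AR3:
        return "shadow"
    return "other"
```

For example, `classify(0x7FFF_FFFF)` falls in the effective space, while `classify(0xC000_0000)` falls in the shadow space.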
In an embodiment, the virtual address space VAS may be divided into a plurality of pages (e.g., PN0 to PNn, where n is an integer of 3 or more), and each of the pages PN0 to PNn may be an address region indicated by the virtual address VA. The size of each page PN0 to PNn may be 4KB, but is not limited thereto.
The effective physical address space EPAS and the shadow physical address space SPAS may be divided into frames FN0 to FNn, and the physical address PA corresponding to the virtual address VA may indicate the frames FN0 to FNn. Some of the frames FN0 to FNn may be provided as an address area of the effective physical address space EPAS (hereinafter referred to as effective physical address area), and other frames may be provided as an address area of the shadow physical address space SPAS (hereinafter referred to as shadow physical address area). For example, frames FN1, FN7, and FN8 in FIG. 2A may be effective physical address areas, and frames FNn-1 and FNn may be shadow physical address areas. In physical address PA, the effective physical address may indicate one of the effective physical address areas, such as frames FN1, FN7, or FN8, and the shadow physical address may represent one of the shadow physical address areas, such as frames FNn-1 or FNn.
The virtual address VA of the page table PGTB may sequentially correspond to the pages PN0 to PNn, and the physical address PA mapped to the virtual address VA may sequentially or non-sequentially correspond to the frames FN0 to FNn. The connection between the virtual address VA and the physical address PA may not be permanent and may be broken or adjusted according to various events.
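The page-to-frame mapping of the page table PGTB can be modeled as follows. This is a hedged sketch: the 4 KB page size is from the text, but the particular page-to-frame entries and the `va_to_pa` helper are illustrative assumptions.

```python
PAGE_SIZE = 4 * 1024  # 4 KB pages, as stated above

# Illustrative page table PGTB: page number -> frame number. Following
# fig. 2A, low pages map to effective frames (1, 7, 8), while later pages
# map to shadow frames (modeled here as the two top frames of a 32-bit map).
page_table = {0: 1, 1: 7, 2: 8, 3: 0xFFFFE, 4: 0xFFFFF}

def va_to_pa(va: int) -> int:
    """Split a virtual address into (page, offset) and swap in the frame."""
    page, offset = divmod(va, PAGE_SIZE)
    frame = page_table[page]  # a real MMU would fault on an unmapped page
    return frame * PAGE_SIZE + offset
```

Under this mapping, a virtual address in page 3 translates to a shadow physical address, so an access to it would be routed to the sub-processing circuit rather than directly to memory.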
Referring back to fig. 1, when accessing the memory 130, the processor 110 may generate a valid physical address or a shadow physical address mapped to a virtual address as an access address based on the page table PGTB in fig. 2A, and may output an access request having the access address to the system bus 160. The processor 110 may directly access the memory 130 based on the effective physical address. Further, the processor 110 may indirectly access the memory 130 via the sub-processing circuit 120 based on the shadow physical address. Indirect access to memory 130 may be described in more detail below with reference to FIGS. 4A-15.
The sub-processing circuits 120 may support functions not provided by the processor 110, for example, data processing for data read from the memory 130 or data to be written to the memory 130 by an application. Sub-processing circuits 120 may translate the shadow physical address into an effective physical address indicating a physical address region of memory 130 and read data from memory 130 based on the effective physical address. Sub-processing circuits 120 may process the data and send the processed data to processor 110. In addition, the sub-processing circuit 120 may process data requested by the processor 110 for writing with the shadow physical address, translate the shadow physical address into an effective physical address, and write (store) the processed data into the memory 130 based on the effective physical address.
In an embodiment, the sub-processing circuit 120 may include a compressor and a decompressor, and may compress or decompress received data. When data to be read by an application is stored in the memory 130 in compressed form, the application cannot use the data read from the memory 130 unless the processor 110 provides a decompression function. The sub-processing circuit 120 may read data from the memory 130 based on the shadow physical address, decompress the read data, and transmit the decompressed data to the processor 110. In addition, the sub-processing circuit 120 may compress data provided from the processor 110 together with the shadow physical address and write (store) the compressed data into the memory 130.
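As a rough sketch of this compress-on-write / decompress-on-read behavior: the zlib codec, the `SHADOW_BASE` offset, and the dictionary-backed memory model below are assumptions for illustration, not the disclosed hardware.

```python
import zlib

SHADOW_BASE = 0xC000_0000  # assumed base of the shadow physical area
memory = {}                # toy memory: effective physical address -> bytes

def shadow_write(shadow_addr: int, data: bytes) -> None:
    """Compress the data, translate shadow -> effective, then store."""
    epa = shadow_addr - SHADOW_BASE
    memory[epa] = zlib.compress(data)

def shadow_read(shadow_addr: int) -> bytes:
    """Load via the effective address and hand back decompressed data."""
    epa = shadow_addr - SHADOW_BASE
    return zlib.decompress(memory[epa])
```

The processor sees plain data at the shadow address, while the memory holds the (smaller) compressed form at the corresponding effective address.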
In an embodiment, the sub-processing circuit 120 may include an encoder and a decoder, and may encrypt or decrypt received data. For example, the sub-processing circuit 120 may encrypt data provided from the processor 110, such as data that an application requests to write to the memory 130, before storing the data in the memory 130. By encrypting data before it is stored in the memory 130, the data may be protected from attacks such as cold boot attacks. Since the encryption and decryption performed by the sub-processing circuit 120 are separate from the operation of the processor 110, the security function need not degrade application performance, and power efficiency can be optimized by using dedicated hardware compared with the case where the security function is provided by the processor 110.
In an embodiment, the sub-processing circuit 120 may prefetch data that the processor 110 is expected to access, for example by applying a separate channel of a cache coherence protocol between the system bus 160 and the sub-processing circuit 120.
Memory 130 may temporarily store data being processed or to be processed by the processor 110, the sub-processing circuit 120, and the IPs 140 and 150. The memory 130 may include volatile memory such as Dynamic Random Access Memory (DRAM), Static RAM (SRAM), and Synchronous DRAM (SDRAM), and/or nonvolatile memory such as Phase-change RAM (PRAM), Magnetoresistive RAM (MRAM), and Ferroelectric RAM (FRAM). However, for convenience of description, it is assumed herein that DRAM is used as the memory 130, but the memory 130 is not limited thereto. In fig. 1, the memory 130 is shown as being mounted in the SoC 100, but is not limited thereto. For example, the memory 130 may be implemented as a chip separate from the SoC 100, and may transmit and receive data to and from other components of the SoC 100.
Memory 130 may be a system memory. An Operating System (OS), applications, and/or firmware may be loaded in memory 130 at boot-up (boot-up). For example, when an electronic device equipped with the SoC 100 is booted, an OS image stored in a memory space may be loaded into the memory 130 according to a boot sequence. The overall input/output operations of SoC 100 may be supported by an operating system OS. Further, applications and/or firmware (e.g., related to graphics processing) may be loaded into memory 130 based on user selections and/or basic services.
Each of the IPs 140 and 150 may include a unit module or a combination of unit modules designed to perform a specific function in the SoC 100. IP may be referred to as a functional module or processing circuit. The IPs 140 and 150, for example, the first IP 140 and the second IP 150, may include a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a Digital Signal Processor (DSP), a Power Management Unit (PMU), a Clock Management Unit (CMU), a Universal Serial Bus (USB) controller, a Peripheral Component Interconnect (PCI) controller, a wireless interface, a universal controller, embedded software, a codec, a video module such as a camera interface, a Joint Photographic Experts Group (JPEG) processor, a video processor, a mixer, etc., a three-dimensional graphics core, an audio system, and/or a driver. IP 140 and 150 may be implemented in hardware, software, or firmware, or any combination thereof. In fig. 1, processor 110 and sub-processing circuit 120 are shown in a configuration independent of IPs 140 and 150, but processor 110 and/or sub-processing circuit 120 may also be referred to as an IP.
The system bus 160 may connect components of the SoC 100 to each other, such as the processor 110, the sub-processing circuit 120, the memory 130, and the IPs 140 and 150, and may provide a transmission path for data or signals between the components.
In an embodiment, the system bus 160 may be implemented using a network-on-chip (NoC) method. The NoC method connects processing circuits within a semiconductor chip by applying the packet- or circuit-switched network technology used between general-purpose computers or communication devices. The system bus 160 may include routers and switching circuits to provide transmission paths for data and signals between processing circuits in the SoC, such as the processor 110, the sub-processing circuit 120, the memory 130, and the IPs 140 and 150.
In an embodiment, the system bus 160 may be implemented in the form of a NoC to which a protocol having a preset standard bus specification is applied. For example, the Advanced Microcontroller Bus Architecture (AMBA) protocol of Advanced RISC Machines (ARM) may be applied as a standard bus specification. The bus types of the AMBA protocol may include one or more of the Advanced High-performance Bus (AHB), Advanced Peripheral Bus (APB), Advanced eXtensible Interface (AXI), AXI4, AXI Coherency Extensions (ACE), and the like. Among the above bus types, AXI is an interface protocol between functional blocks that provides multiple outstanding address capability and data interleaving. In addition, other types of protocols, such as uNet of Sonics Inc., CoreConnect of IBM, and/or the Open Core Protocol of OCP-IP, may also be applied to the system bus 160.
The system bus 160 may receive an access request from at least one component of the SoC 100 (e.g., the processor 110, the sub-processing circuit 120, the first IP 140, and the second IP 150) and may send the access request to a component having a corresponding physical address, such as the memory 130, based on the physical address (e.g., the access address) included in the access request. In addition, the system bus 160 may send a response to the access request to the component providing the access request.
In the SoC 100 according to an embodiment, the system bus 160 may send an access request received from the processor 110 to the memory 130 when the physical address in the access request is an effective physical address. Thus, the processor 110 may directly access the memory 130 based on the effective physical address. As used herein, "direct access" to the memory 130 means that the memory access is performed without passing through any processing circuit, and includes access through the system bus 160.
When the physical address is a shadow physical address, the system bus 160 may send an access request to the sub-processing circuit 120. As described above, the sub-processing circuits 120 may translate the shadow physical address into an effective physical address. The system bus 160 may receive an access request including a valid physical address from the sub-processing circuit 120 and send the access request to the memory 130. Thus, the processor 110 may indirectly access the memory 130 via the sub-processing circuits 120 based on the shadow physical address. As used herein, "indirect access" to memory 130 means that the memory access is performed by processing circuitry (such as via sub-processing circuitry 120).
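The routing decision described in the two paragraphs above can be summarized in a short sketch. The shadow address range and the function names are assumptions made for illustration; the actual router is hardware on the system bus 160.

```python
SHADOW = range(0xC000_0000, 0x1_0000_0000)  # assumed shadow physical address area

def route(access_address: int) -> str:
    """Mimic the system bus: a shadow physical address is forwarded to the
    sub-processing circuit; any other address goes directly to memory."""
    if access_address in SHADOW:
        return "sub-processing circuit"
    return "memory"
```

An effective physical address such as `0x0010_0000` is routed to memory, while a shadow physical address such as `0xC000_0000` is routed to the sub-processing circuit, which later re-issues the request with an effective address.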
As described above, in the SoC 100 according to an embodiment, the page table PGTB in fig. 2A may include an effective physical address corresponding to a physical address area and a shadow physical address corresponding to a shadow physical address area, the shadow physical address area being an address space outside the physical address area, and the processor 110 may directly access the memory 130 through the system bus 160 or indirectly access the memory 130 through the system bus 160 and the sub-processing circuit 120. The sub-processing circuit 120 may provide functions not provided by the processor 110, such as, but not limited to, compression/decompression functions and/or encryption/decryption functions, when executing an application. Accordingly, the SoC 100 may support various applications without any modifications to the processor 110 and/or any load on the processor 110. In addition, the bandwidth used by the memory 130 may be reduced by the compression/decompression function, thereby optimizing performance of the SoC 100 and/or minimizing power consumption of the SoC 100.
Fig. 3 illustrates a SoC according to an embodiment.
Referring to fig. 3, the SoC 100a according to an embodiment may include a processor 110, a sub-processing circuit 120, IPs 140 and 150, a system bus 160, and a memory controller 170. The SoC 100a shown in fig. 3 is a modified example of the SoC 100 shown in fig. 1; therefore, substantially repeated descriptions are omitted.
In an embodiment, the memory 130 may be implemented as a separate chip outside of the SoC 100a. The memory 130 may include a system memory, and the various types of memory described for the memory 130 in fig. 1 may likewise be applied here.
Memory controller 170 may receive an access request including an access address from system bus 160 and send the access request to memory 130. In addition, the memory controller 170 may send processing results and/or responses to access requests to the system bus 160.
Fig. 4A and 4B illustrate paths through which a processor accesses memory in the SoC 100b, and fig. 5 illustrates a method of accessing memory by a processor in the SoC 100b, according to an embodiment.
Referring to fig. 4A and 4B, the SoC 100b may include a processor 110, a sub-processing circuit 120, a memory 130, and a router 161. The router 161 may be arranged in the system bus 160 of fig. 1. The SoC 100b may further include the other components described with reference to fig. 1. For example, the memory 130 is shown as being provided in the SoC 100b, but is not limited thereto. As described with reference to fig. 3, the memory 130 may be implemented as a separate chip external to the SoC 100b. In this case, the SoC 100b may further include the memory controller 170 of fig. 3 to provide a communication path to the memory 130.
Referring to fig. 5, in step S110, the processor 110 may generate a first access request signal including a first access address. According to an embodiment, the processor 110 may execute an application and operate a page table (e.g., PGTB of fig. 2A) corresponding to the application. The processor 110 may translate the virtual address VA of fig. 2A issued by the application for accessing the memory 130 into the physical address PA of fig. 2A by using the page table PGTB of fig. 2A. In this case, the physical address PA may be either an effective physical address or a shadow physical address. The processor 110 may generate a first access request signal including an effective physical address or a shadow physical address as the first access address AA1.
In step S120, the processor 110 may transmit a first access request signal (e.g., a write request command or a read request command) for the memory 130 including the first access address AA1 to the system bus 160 of fig. 1. The first access request signal may be sent to the router 161 on the system bus 160.
In step S130, the router 161 may determine whether the first access address AA1 corresponds to a shadow physical address area. The router 161 may include information about a shadow physical address corresponding to the shadow physical address area and determine whether the first access address is a shadow physical address based on the information about the shadow physical address.
When it is determined that the first access address AA1 does not correspond to the shadow physical address area, the router 161 may transmit the first access request signal to the memory via the system bus in step S140. In general, the router 161 forwards a received access address either to the memory 130 or to the sub-processing circuit 120.
When it is determined that the first access address AA1 corresponds to the physical address area of fig. 2A, the router 161 may transmit the first access address AA1 to the memory 130, as shown in fig. 4A. For example, when the first access address AA1 corresponds to a valid physical address, the router 161 may send the first access address AA1 to the memory 130.
When it is determined that the first access address AA1 corresponds to the shadow physical address area of fig. 2A, the router 161 may transmit the first access request signal to the sub-processing circuit 120 in step S150. In this case, the router 161 sends the first access address AA1 to the sub-processing circuit 120 instead of to the memory 130, as shown in fig. 4B. For example, when the first access address AA1 corresponds to a shadow physical address, the router 161 may send the first access address AA1 to the sub-processing circuit 120. The following operations are described with reference to figs. 4B and 5.
In step S160, the sub-processing circuit 120 may convert the first access address AA1 into a second access address AA2 corresponding to the physical address area. For example, the sub-processing circuit 120 may translate the first access address AA1 as a shadow physical address to the second access address AA2 as a valid physical address.
In step S170, the sub-processing circuit 120 may transmit a second access request signal including the second access address AA2 to the system bus 160. For example, the sub-processing circuit 120 may output the second access address AA2 to the router 161.
In step S180, the system bus 160 may transmit a second access request signal to the memory 130. As described above with respect to operation S130, the router 161 may determine whether the received access address corresponds to a shadow physical address area. Here, since the second access address AA2 is an effective physical address, it does not correspond to a shadow physical address area. Thus, the router 161 may send the second access address AA2 to the memory 130.
In the SoC 100b according to the present embodiment, the processor 110 can directly access the memory 130 based on the effective physical address, and can indirectly access the memory 130 via the sub-processing circuit 120 based on the shadow physical address.
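The routing decision described in steps S130 to S150 can be sketched in C as follows. This is a minimal illustration under assumed address ranges; the base addresses, sizes, and function names below are illustrative and do not appear in the patent.

```c
#include <stdint.h>

/* Illustrative address map; the patent does not give concrete ranges. */
#define SHADOW_PA_BASE 0xC0000000u  /* shadow physical address area      */
#define SHADOW_PA_SIZE 0x20000000u  /* outside the memory's own PA area  */

typedef enum { TO_MEMORY, TO_SUB_PROCESSING } route_t;

/* Steps S130/S140/S150: an effective physical address is forwarded to
 * the memory 130, and a shadow physical address to the sub-processing
 * circuit 120. */
static route_t route_access(uint32_t aa)
{
    if (aa >= SHADOW_PA_BASE && aa - SHADOW_PA_BASE < SHADOW_PA_SIZE)
        return TO_SUB_PROCESSING;  /* indirect access path (fig. 4B) */
    return TO_MEMORY;              /* direct access path (fig. 4A)   */
}
```

Because the second access address AA2 produced by the sub-processing circuit is an effective physical address, feeding it back through the same check (step S180) routes it to the memory.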
Fig. 6 illustrates the operation of a SoC 100c according to an embodiment.
Referring to fig. 6, the SoC 100c according to the present embodiment may include a processor 110, a sub-processing circuit 120, a memory 130, a router 161, and a Memory Management Unit (MMU) 180. The SoC 100c of fig. 6 is a modified example of the SoC 100b of figs. 4A and 4B; therefore, further detailed description of the same elements is omitted, and the description of the SoC 100c focuses on differences.
As described above, when the first access address AA1 included in the first access request signal generated by the processor 110 is included in the shadow physical address area, such as when the first access address AA1 is the shadow physical address SPA, the router 161 may transmit the first access request signal to the sub-processing circuit 120.
The sub-processing circuit 120 may translate the first access address AA1 from the shadow physical address SPA to the virtual address VA. In an embodiment, the sub-processing circuit 120 may include an address matching table having mapping information between the shadow physical address SPA and the corresponding virtual address VA. Accordingly, the sub-processing circuit 120 may translate the first access address AA1, that is, the shadow physical address SPA, into the virtual address VA by using the address matching table.
MMU 180 may translate virtual address VA received from sub-processing circuitry 120 into an effective physical address EPA. In an embodiment, MMU 180 may include a page table with mapping information for the effective physical address EPA corresponding to virtual address VA. The page table used by the MMU 180 may be different from the page table used by the processor 110. For example, the MMU 180 may include a system MMU supporting one or more processing circuits of the SoC 100c, and the page table may be the same as another page table used by at least one other processing circuit of the SoC 100c, such as a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), etc., but is not limited thereto.
The MMU 180 may generate a second access request including the effective physical address EPA as the second access address AA2 and send the second access request to the router 161. In turn, the router 161 may send the second access request to the memory 130.
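The two-stage translation of fig. 6 (shadow physical address to virtual address via the address matching table, then virtual address to effective physical address via the MMU's page table) can be sketched as follows. The table layouts, page size, and concrete addresses are illustrative assumptions, not the patent's formats.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* illustrative granularity of one mapping */

typedef struct { uint32_t shadow_pa; uint32_t va;  } amt_entry_t; /* address matching table */
typedef struct { uint32_t va;        uint32_t epa; } pte_t;       /* MMU 180 page table     */

/* Sub-processing circuit 120: shadow physical address -> virtual address */
static uint32_t amt_lookup(const amt_entry_t *amt, size_t n, uint32_t spa)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t base = amt[i].shadow_pa;
        if (spa >= base && spa < base + PAGE_SIZE)
            return amt[i].va + (spa - base);  /* preserve the offset */
    }
    return 0;  /* not mapped */
}

/* MMU 180: virtual address -> effective physical address */
static uint32_t mmu_translate(const pte_t *pt, size_t n, uint32_t va)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t base = pt[i].va;
        if (va >= base && va < base + PAGE_SIZE)
            return pt[i].epa + (va - base);
    }
    return 0;
}
```

A real MMU would use multi-level page tables and fault on unmapped addresses; the linear scans above only illustrate the order of the two translations.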
Fig. 7 illustrates the operation of SoC 100d according to an embodiment.
Referring to fig. 7, the SoC 100d according to an embodiment may include a processor 110, a sub-processing circuit 120, a memory 130, and a router 161. The SoC 100d of fig. 7 is a modified example of the SoC 100b of fig. 4B, and substantially repetitive descriptions are omitted.
In an embodiment, the router 161 may include a cache CC. An additional channel for managing the cache CC may be provided between the sub-processing circuit 120 and the router 161, and, in accordance with a cache coherence protocol, the sub-processing circuit 120 may send a cache management request signal QRC, such as a read command, a flush command, etc., to the cache CC of the router 161.
Here, the router 161 may operate as a cache coherence controller. The router 161 may store data in, or read stored data from, the cache CC in response to the cache management request signal QRC from the sub-processing circuit 120. The router 161 may provide a path for the cache CC to read data from and/or write data to the memory 130. The router 161 may be configured to maintain coherence between the cache CC and at least one cache of the processor 110 (e.g., without limitation, a local cache and/or a shared cache in the processor 110), or between the cache CC and at least one cache of the sub-processing circuit 120 (e.g., without limitation, a local cache and/or a shared cache in the sub-processing circuit 120).
In an embodiment, the cache CC is shown inside the router 161, but is not limited thereto. For example, the cache CC may be located separately outside the router 161, such as in the system bus 160 of fig. 1, or may be a cache shared by one or more processors.
The sub-processing circuit 120 may prefetch data expected to be used by the processor 110 into the cache CC in response to the cache management request signal QRC. The sub-processing circuit 120 may support forms of data prefetching that the processor 110 itself need not support. Thus, performance may be improved when the processor 110 executes an application.
In an embodiment, referring to figs. 6 and 7, the SoC 100d may further include the MMU 180 of fig. 6. In this case, the sub-processing circuit 120 may directly transmit a cache management request signal QRC, such as a read command, a flush command, or the like, to the router 161. The router 161, in turn, may send a response to the cache management request signal QRC, such as a cache hit or miss status and, in response to a read request, the associated data, to the sub-processing circuit 120.
Fig. 8A and 8B illustrate a read operation of the SoC 100e according to an embodiment. The compressed image data may be stored in the memory 130 in fig. 8A and 8B. In an embodiment, the compressed image data may be read out by the sub-processing circuit 120 performing indirect access to the memory 130 in response to an image data read request from the processor 110.
Referring to fig. 8A, the processor 110 may execute applications for performing image processing operations such as, but not limited to, face recognition and correction, image quality improvement, and the like. The application may request that the image data be read from the memory 130 and may generate a first virtual address indicating an area in which the image data is stored.
The processor 110 may translate the first virtual address into the first physical address PA1 based on the first page table PGTB1 set for the application. The first page table PGTB1 may be used to map virtual addresses to valid physical addresses or shadow physical addresses, such as described above with reference to fig. 2A and 2B. The first physical address PA1 may be a shadow physical address.
The memory 130 may include a compression buffer CBUF, and compressed image data CDT, such as in fig. 8B, may be stored in the compression buffer CBUF.
Fig. 9 illustrates compressed image data CDT stored in the compression buffer CBUF according to an embodiment.
An image processing circuit, such as but not limited to an Image Signal Processor (ISP), may compress image data in units of sub-blocks SBL. For example, a sub-block SBL may include 16 pixels arranged in a 4×4 matrix. A packet of the compressed image data CDT may include a header HD and a payload PL. The payload PL may include the compressed sub-blocks SBL. The header HD may include information on the storage order and storage size of the compressed sub-blocks SBL, which may be compressed individually or in groups and arranged in the payload PL. The header information may also include a start address of the payload PL, such as, but not limited to, a start address of the first sub-block.
The compression buffer CBUF may include a payload area PLA in which n payloads PL are stored and a header area HDA in which n headers HD are stored, where n is a positive integer of two or more. A footprint PF of substantially the same size may be set for each payload PL. Thus, the start address and/or end address of each payload PL may be identified from the order of the payloads PL.
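Because every payload occupies a footprint of the same size, a payload's start address follows from its index alone. A small sketch under assumed names; the struct fields and constants are illustrative, not the patent's exact packet format.

```c
#include <stdint.h>

/* Illustrative packet header: the text says the header HD holds the
 * storage order and size of the compressed sub-blocks SBL and a
 * payload start address; concrete field widths are assumptions. */
typedef struct {
    uint32_t payload_start;  /* start address of the payload PL   */
    uint16_t sbl_order;      /* storage order of compressed SBLs  */
    uint16_t sbl_size;       /* storage size of compressed SBLs   */
} header_t;

/* With a fixed footprint PF per payload, the i-th payload in the
 * payload area PLA starts at a directly computable offset. */
static uint32_t payload_addr(uint32_t pla_base, uint32_t footprint,
                             uint32_t i)
{
    return pla_base + i * footprint;
}
```

For example, with a payload area base of 0x1000 and a footprint of 0x200 bytes, the fourth payload (index 3) starts at 0x1600.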
For example, the ISP may compress raw image data received from an image sensor in units of sub-blocks and store the compressed image data CDT in the compression buffer CBUF. The ISP may store the compressed image data CDT at an effective physical address generated based on the second page table PGTB2 set for the ISP, wherein the effective physical address corresponds to an area of the compression buffer CBUF in which the compressed image data CDT is to be stored.
Referring to figs. 8A and 9, the processor 110 may not provide a compression and/or decompression function for the application. In this case, if the processor 110 itself were to read the compressed image data CDT from the memory 130, the application might not be able to use the compressed image data CDT. Accordingly, the processor 110 may control the sub-processing circuit 120 to decompress the compressed image data CDT, and may receive the decompressed image data from the sub-processing circuit 120.
Accordingly, the processor 110 may translate the first virtual address into a shadow physical address instead of the effective physical address by using the first page table PGTB1, and generate the first read request signal RD1 including the first physical address PA1 as the access address corresponding to the shadow physical address.
In this case where the first physical address PA1 received from the processor 110 corresponds to a shadow physical address area instead of the physical address area of the memory 130, the router 161 may transmit the first read request signal RD1 of fig. 8A to the sub-processing circuit 120 instead of the memory 130.
The sub-processing circuit 120 may include an address translation circuit ACC that translates a physical address into a virtual address. In an embodiment, the address translation circuit ACC may translate the physical address into the virtual address by using an address matching table. The address matching table may include virtual addresses corresponding to physical addresses of the payloads PL, and the address translation circuit ACC may include information about the image data, such as, but not limited to, its height, width, format, etc. The sub-processing circuit 120 may translate the first physical address PA1 included in the first read request signal RD1 received from the router 161 into a second virtual address VA2. In an embodiment, the address translation circuit ACC may calculate a virtual address for each header HD and payload PL of the compressed image data CDT corresponding to the first physical address PA1, for example, by using the address matching table.
The sub-processing circuit 120 may further include a compressor COMP. The compressor COMP may compress and/or decompress the received data, as described below with reference to fig. 8B, but is not limited thereto.
The MMU 180 may translate the second virtual address VA2 received from the sub-processing circuit 120 into the second physical address PA2 by using the second page table PGTB2. The MMU 180 may send a second read request signal RD2 including the second physical address PA2 to the router 161. In an embodiment, the second page table PGTB2 may be a page table set for another processing circuit, such as, but not limited to, an ISP, which compresses image data and stores the compressed image data CDT into the memory 130. In addition, the second page table PGTB2 may map virtual addresses to effective physical addresses. The second physical address PA2 may be an effective physical address.
Here, since the second physical address PA2 corresponds to the physical address area of the memory 130, the router 161 may transmit the second read request signal RD2 to the memory 130.
Referring now to fig. 8B and 9, compressed image data CDT may be read from the second physical address PA2 in the compression buffer CBUF of the memory 130. The router 161 may send the compressed image data CDT to the sub-processing circuit 120. The compressor COMP of the sub-processing circuit 120 can decompress the compressed image data CDT. The sub-processing circuit 120 may transmit the decompressed image data DCDT to the processor 110 through the router 161.
The decompressed image data DCDT may include more than the read data requested by the processor 110, which may be, for example, the pixel values of only some pixels of the image data of a single frame. The sub-processing circuit 120 may therefore send, to the processor 110, the portion of the decompressed image data DCDT corresponding to the first physical address PA1 received from the processor 110.
In an embodiment, a cache CC may be provided, such as, but not limited to, the cache CC described in detail with reference to fig. 7. By issuing the cache management request signal QRC, the sub-processing circuit 120 may store the remaining portion of the decompressed image data DCDT, that is, the portion not sent to the processor 110, in the cache CC. The processor 110 is relatively likely to subsequently request consecutive image data of the same frame, but is not limited thereto. Accordingly, when read data requested by the processor 110 is already stored in the cache CC, the sub-processing circuit 120 may send the cached image data to the processor 110 without further accessing the memory 130.
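The caching behavior above can be sketched with a toy direct-mapped cache: decompressed data not returned to the processor is filled into the cache CC, so a later read of a nearby address hits without touching the memory 130. The line count, line size, and function names are illustrative assumptions.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define CC_LINES  8u
#define LINE_SIZE 64u

typedef struct { bool valid; uint32_t tag; uint8_t data[LINE_SIZE]; } cc_line_t;
typedef struct { cc_line_t line[CC_LINES]; } cache_cc_t;

static uint32_t cc_index(uint32_t pa) { return (pa / LINE_SIZE) % CC_LINES; }
static uint32_t cc_tag(uint32_t pa)   { return  pa / (LINE_SIZE * CC_LINES); }

/* Fill one line with the remaining decompressed data DCDT. */
static void cc_fill(cache_cc_t *cc, uint32_t pa, const uint8_t *dcdt)
{
    cc_line_t *l = &cc->line[cc_index(pa)];
    l->valid = true;
    l->tag   = cc_tag(pa);
    memcpy(l->data, dcdt, LINE_SIZE);
}

/* A subsequent consecutive read hits without accessing memory 130. */
static bool cc_hit(const cache_cc_t *cc, uint32_t pa)
{
    const cc_line_t *l = &cc->line[cc_index(pa)];
    return l->valid && l->tag == cc_tag(pa);
}
```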
Fig. 10 illustrates software layers of a SoC, such as but not limited to SoC 100, in accordance with an embodiment. For ease of description, the hardware connected to the SoC 100 may be shown together as hardware (H/W) 1000 hereinafter.
The application 1020 and an OS 1010 may be executed by a processor, such as, but not limited to, the processor 110 of fig. 1. The application 1020 refers to software (S/W) and/or a service implementing a particular function. The user 1030 refers to an entity, such as, but not limited to, a person, an Artificial Intelligence (AI), or another system, that uses the application 1020. The user 1030 may communicate with the application 1020 via a user interface (UI). The application 1020 may be built and/or configured for each desired service and may communicate with the user 1030 via a user interface suitable for that service. The application 1020 may perform operations requested by the user 1030 and may call the Application Programming Interface (API) 1016 and/or the library 1017 as needed.
The API 1016 and/or library 1017 may perform macro operations responsible for specific functions or provide an interface when communication with underlying layers is required. When the application 1020 requests an operation of an underlying layer through the API 1016 and/or library 1017, the API 1016 and/or library 1017 may classify the received request into the fields of security 1013, network 1014, and/or management 1015, and may operate at the particular layer appropriate for the requested field. For example, when the application 1020 requests a function related to the network 1014, the API 1016 may send parameters to the layer of the network 1014 and invoke the related function. In this case, the network 1014 may communicate with the lower layers to perform the requested operation. When no corresponding underlying layer exists, the API 1016 and/or library 1017 may themselves perform the corresponding operations, but are not limited thereto.
For example, the driver 1011 may manage the hardware 1000 of the SoC 100 and check the operation state of the hardware 1000. When a request classified by an upper layer is received, the driver 1011 may deliver the classified request to the corresponding layer of the hardware 1000.
When driver 1011 passes the request to a layer of hardware 1000, firmware 1012 may convert the request into a form acceptable to hardware 1000. Firmware 1012 may be provided in the driver 1011 and/or hardware 1000 for translating requests and sending the translated requests to the hardware 1000.
For example, the SoC 100 of fig. 1 may include an Operating System (OS) 1010 to manage the components of the SoC 100, including the API 1016, the driver 1011, and the firmware 1012. The OS 1010 may be stored in nonvolatile memory in the form of control command codes and/or data.
Hardware 1000 may include a processor 1001, sub-processing circuits 1002, memory 1003, a system Memory Management Unit (MMU) 1004, an Image Signal Processor (ISP) 1005, a Graphics Processing Unit (GPU) 1006, an input/output (I/O) display 1007, and the like. The hardware 1000 may execute requests or commands passed by the driver 1011 and firmware 1012 in order and/or out of order, and store the execution results in the memory 1003, in registers internal to the hardware 1000, or in a memory (such as a Dynamic Random Access Memory (DRAM)) connected to the hardware 1000. The stored execution results may be returned to the driver 1011 and/or the firmware 1012.
The hardware 1000 may generate an interrupt to request a desired operation from an upper layer. When an interrupt is generated, the hardware 1000 may check the interrupt against the management field 1015 of the OS 1010 and may process the interrupt by communicating with a core of the hardware 1000.
In an embodiment, the API 1016 may set up an environment in which the processor 1001 (e.g., without limitation, the processor 110 of fig. 1) may indirectly access the memory 1003 (e.g., without limitation, the memory 130 of fig. 1) through the use of the sub-processing circuit 1002 (e.g., without limitation, the sub-processing circuit 120 of fig. 1). The API 1016 may generate a page table corresponding to the application 1020 using the functions of the sub-processing circuit 1002, such as, but not limited to, the page table PGTB of fig. 2A and/or the first page table PGTB1 of fig. 8A, and set information for the operation of the sub-processing circuit, such as the address matching table and/or the second page table PGTB2 of fig. 8A.
Thus, when the application 1020 is executed by the processor 1001, a page table corresponding to the application 1020 may be provided for the address matching table and the page table of the sub-processing circuit 1002, and the processor 1001 may access the memory 1003 by using the sub-processing circuit 1002 based on the shadow physical address, as described above with reference to fig. 1 to 8B.
In an embodiment, API 1016 is shown as being provided within OS or control software 1010, but is not limited thereto. For example, the API 1016 may be provided by the application 1020 according to different design selection criteria for the various embodiments.
Fig. 11A shows a match between the virtual address space VAS and the effective physical address space EPAS according to a comparative example. Fig. 11B shows the result of memory access according to the comparative example.
Referring to fig. 11A, the virtual address space VAS includes compression buffers comp_buf_o0 and comp_buf_o1 as storage areas for an application. The compression buffers comp_buf_o0 and comp_buf_o1 may be matched to effective physical address areas valid_p0 and valid_p1 of the effective physical address space EPAS. The effective physical address areas valid_p0 and valid_p1 may be areas in a compression buffer of a memory, such as, but not limited to, the compression buffer CBUF of fig. 8A in the memory 130 of fig. 1.
Referring to fig. 11B, the code CD1 shows that compression buffers comp_buf_o0 and comp_buf_o1 are allocated to the virtual address space VAS. For example, compression buffers comp_buf_o0 and comp_buf_o1 may be allocated to the virtual address space VAS for camera processing. The first header of the virtual addresses corresponding to the compression buffers comp_buf_o0 and comp_buf_o1 may indicate a picture size, the first payload may be 0, and the second payload may indicate a size of the payload, but is not limited thereto.
When the application requests data from the compression buffers comp_buf_o0 and comp_buf_o1 through the solution function according to the code CD2, a processor, such as the processor 110 of fig. 1, may access the effective physical address areas valid_p0 and valid_p1 of the memory, such as the memory 130 described with respect to any one of figs. 1 to 8B, to read out the compressed data stored therein, as described with reference to fig. 4A. However, whether data is read out from an empty address area resulting from the compression of the image data or the compressed data itself is read out, the application may be unable to interpret the compressed data because the processor 110 may not provide a decompression function.
FIG. 12A illustrates a match between a virtual address space and a physical address space according to an embodiment. Fig. 12B illustrates a match between an API-based virtual address space and a physical address space, according to an embodiment.
Referring to fig. 12A, the virtual address space VAS may include compression buffers comp_buf_o0 and comp_buf_o1 and corresponding uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1. The compression buffers comp_buf_o0 and comp_buf_o1 may be matched to the effective physical address areas valid_p0 and valid_p1 of the effective physical address space EPAS, and the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 may be matched to shadow physical address areas shadow_p0 and shadow_p1 of the shadow physical address space SPAS. The matches between the virtual address space VAS and the effective physical address space EPAS, and between the virtual address space VAS and the shadow physical address space SPAS, may be set by the API 1016 of fig. 10, and the matching information may be generated as page tables (e.g., PGTB of fig. 2A and PGTB1 of fig. 8A).
Referring to fig. 12B, compression buffers comp_buf_o0 and comp_buf_o1 may be allocated to the virtual address space VAS according to the code CD1, as described above for fig. 12A. Substantially repetitive descriptions may be omitted.
The code CD3 shows that a compression API is applied to allocate uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 in the virtual address space VAS. Thus, the compression API may provide an uncompressed view to the processor 110 of fig. 1. The function c_api of the compression API may generate the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 based on the compression buffers comp_buf_o0 and comp_buf_o1, the image information, and the like. For example, the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 may be set in a virtual address space of an Operating System (OS), or in a temporary virtual address space of the application from which the processor 110 accesses the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1.
When data in the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 is requested through the solution function 'foo_sol' according to the code CD4, a memory function of the compression API may map the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 to the shadow physical address areas shadow_p0 and shadow_p1 and generate a page table, such as described in the code CD5. In addition, the memory function of the compression API may generate a matching table for use in the sub-processing circuit 120 of fig. 1. Here, the shadow physical address areas shadow_p0 and shadow_p1 corresponding to the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 are allocated consecutively in the shadow physical address space SPAS. For example, an address obtained by adding the buffer size, that is, the size of each shadow physical address area, to the start address of the first area 'shadow_p0' may be calculated as the address of the next area, up to the last area such as 'shadow_p1'.
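The consecutive-allocation arithmetic can be sketched as follows; the base address, buffer size, and function name are illustrative values, not the patent's.

```c
#include <stdint.h>

/* Shadow regions are allocated back to back, each buf_size bytes, so
 * the k-th region's start address (shadow_p0, shadow_p1, ...) follows
 * from shadow_p0 by arithmetic alone. */
static uint32_t shadow_region_start(uint32_t shadow_p0,
                                    uint32_t buf_size, uint32_t k)
{
    return shadow_p0 + k * buf_size;
}
```

For example, with shadow_p0 at 0xC0000000 and an 8 MiB buffer size, shadow_p1 starts at 0xC0800000.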
For example, the processor 110 may read decompressed data from the compression buffers CBUF corresponding to the compression buffers comp_buf_o0 and comp_buf_o1 in the memory 130 of fig. 1 through the sub-processing circuit 120 of fig. 1. The application may process the decompressed data.
When the use of the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 is completed, the uncompressed buffers uncomp_buf_v0 and uncomp_buf_v1 may be returned to the Operating System (OS) according to the free function c_free of the code CD 6.
Fig. 13 illustrates an Application Processor (AP) 200 according to an embodiment.
Referring to fig. 13, an AP 200 according to an embodiment may include a Central Processing Unit (CPU) 210, a sub-processing circuit 220, a system interconnection circuit (SICC) 230, a network on chip (NoC) 240, and a Memory Management Unit (MMU) 250. The AP 200 may also include other components, such as an ISP, a GPU, a communication module, a Random Access Memory (RAM), or a Read-Only Memory (ROM). The AP 200 may be implemented as a SoC and may send and receive data and signals among components in the chip (such as the CPU 210, the sub-processing circuit 220, the MMU 250, and other processing function blocks) through the SICC 230 and the NoC 240. The SICC 230 and the NoC 240 may correspond to the router 161 in the above-described embodiments, but are not limited thereto. A Dynamic RAM (DRAM) 300 may be connected to the AP 200 as a memory of the AP 200. The AP 200 may include a DRAM controller and may transmit data to and receive data from the DRAM 300 under the control of the DRAM controller.
The CPU 210 may be a main processor of the AP 200 to control the overall operation of the AP 200, but is not limited thereto, and may include one processor core such as a single core processor, or a plurality of processor cores such as a multi-core processor. The CPU 210 may process or execute programs and/or data stored in RAM, DRAM, and/or ROM.
The CPU 210 may correspond to the processor 110 described with reference to fig. 1. The CPU 210 may execute an operating system and applications, and may translate virtual addresses generated by an application into physical addresses. Based on a first page table PGTB set for the application, the CPU 210 may translate a virtual address corresponding to a virtual address area into an effective physical address corresponding to a physical address area of the DRAM 300, or into a shadow physical address corresponding to a shadow physical address area outside the physical address area of the DRAM 300. The CPU 210 may send an access request signal for accessing the DRAM 300 to the SICC 230, and the access request signal may include a physical address, that is, an effective physical address or a shadow physical address. The CPU 210 may not support compression of data to be stored in the DRAM 300 or decompression of data to be read from the DRAM 300. Thus, when the application requests access to the compression buffer CBUF of the DRAM 300, the CPU 210 may translate the virtual address into a shadow physical address based on the first page table PGTB, and the application may indirectly access the compression buffer CBUF of the DRAM 300 through the sub-processing circuit 220, which supports data compression and/or decompression. Here, the compression buffer CBUF may store compressed data, such as, but not limited to, compressed image data.
When the physical address included in the access request signal received from the CPU 210 is an effective physical address, the SICC 230 may send the access request signal to the DRAM 300, and the CPU 210 may directly access the DRAM 300 through the SICC 230. On the other hand, when the physical address included in the access request signal is a shadow physical address, the SICC 230 may send the access request signal to the sub-processing circuit 220, and the CPU 210 may indirectly access the DRAM 300 via the sub-processing circuit 220.
The SICC 230 may send data read from the DRAM 300 to the corresponding processing circuit, such as the CPU 210 and/or the sub-processing circuit 220 that requested access to the data, in response to the access request signal. The SICC 230 may include a cache CC, such as, but not limited to, a system cache, and may store data read from the DRAM 300 into the cache CC or read data from the cache CC in response to a cache management request signal. In addition, the SICC 230 may act as a cache coherence controller that maintains data coherence among the local caches and/or shared caches of the processing circuits (such as, but not limited to, the CPU 210, the sub-processing circuit 220, and other processing circuits).
The sub-processing circuit 220 may compress or decompress received data and may, for example, temporarily store the data in the DRAM 300. The sub-processing circuit 220 may translate a shadow physical address included in a received access request signal into a virtual address that matches an effective physical address, for example, by using an Address Matching Table (AMT). As described with reference to fig. 12B, the address matching table AMT may be generated by the compression API, but is not limited thereto. Further, the sub-processing circuit 220 may output a cache management request signal to the SICC 230, for example, via the NoC 240.
The MMU 250 may translate a virtual address received from the sub-processing circuit 220 into an effective physical address. The MMU 250 may translate the virtual address into the effective physical address with reference to a second page table PGTB2. In an embodiment, the second page table PGTB2 may be the same as another page table used by other processing circuits (such as an ISP, a GPU, etc.) that may access the compression buffer CBUF of the DRAM 300.
The NoC 240 may send request signals, such as access request signals and/or cache management request signals, output from the sub-processing circuit 220 and the MMU 250 to the SICC 230, and may also serve as a data path for sending and receiving data between the sub-processing circuit 220 and the SICC 230. The NoC 240 may receive an access request signal including an effective physical address from the MMU 250 and send the access request signal to the SICC 230. The NoC 240 may send compressed data, read from the compression buffer CBUF of the DRAM 300 based on the access request signal that the SICC 230 received from the MMU 250, to the sub-processing circuit 220. Further, the NoC 240 may send data decompressed by the sub-processing circuit 220 to the SICC 230, and the SICC 230 may in turn send the decompressed data to the CPU 210.
Fig. 14A and 14B illustrate a read operation and a write operation of the AP 200 according to an embodiment. Fig. 14A shows a process of reading compressed image data from the compression buffer CBUF of the DRAM 300 by the CPU 210 of the AP 200 in the indirect memory access mode, and fig. 14B shows a process of writing compressed image data to the compression buffer CBUF of the DRAM 300 by the CPU 210 of the AP 200 in the indirect memory access mode.
Referring to fig. 14A, in operation (1), the CPU 210 may output a read request signal including a shadow physical address A to SICC 230. The read request signal may request that pixel values corresponding to a portion of the image data, for example, at least one region of the image data in a single frame, be read from the DRAM 300.
SICC 230 may determine whether the physical address included in the received access request signal is a shadow physical address or a valid physical address and, in operation (2), send the read request signal including the shadow physical address A to the sub-processing circuit 220.
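The routing decision described above can be sketched as a simple range check: the interconnect compares the incoming physical address against the memory's physical address region and the shadow physical address region and forwards the request accordingly. The region bases and sizes below are illustrative assumptions, not values from this disclosure.

```python
# Hypothetical sketch of the SICC routing decision: a physical address inside
# the DRAM region goes to memory, one inside the shadow region goes to the
# sub-processing circuit. All region boundaries are assumed for illustration.

DRAM_BASE = 0x8000_0000                 # assumed start of the valid physical address region
DRAM_SIZE = 0x4000_0000                 # assumed 1 GiB of DRAM
SHADOW_BASE = DRAM_BASE + DRAM_SIZE     # shadow region lies outside the DRAM region
SHADOW_SIZE = 0x1000_0000

def route_access(phys_addr: int) -> str:
    """Return the destination of an access request, as the interconnect would."""
    if DRAM_BASE <= phys_addr < DRAM_BASE + DRAM_SIZE:
        return "memory"            # valid physical address: forward to DRAM
    if SHADOW_BASE <= phys_addr < SHADOW_BASE + SHADOW_SIZE:
        return "sub_processing"    # shadow physical address: forward to sub-processing circuit
    return "fault"                 # neither region: signal a bus error
```

A read request carrying a shadow physical address would thus be diverted to the sub-processing circuit instead of reaching the memory controller directly.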
The sub-processing circuit 220 may then translate the shadow physical address A into the virtual address VA in operation (3). Specifically, the sub-processing circuit 220 may convert the shadow physical address A into a virtual address B of the payload, in a packet of compressed image data, that contains the pixels requested to be read, and a virtual address C of the header of that packet. The sub-processing circuit 220 may refer to an address matching table, such as the AMT of fig. 15, to translate the shadow physical address A into the virtual address B of the payload and the virtual address C of the header.
Fig. 15 shows an example of an address matching table according to an embodiment. It can be assumed that the address matching table AMT of fig. 15 is generated based on the mapping relationship between the virtual address space VAS and the shadow physical address space SPAS in fig. 11A and 11B, but is not limited thereto.
Referring to FIG. 15, an address matching table AMT according to an embodiment may include shadow physical addresses SPA_S (i.e., physical addresses of decompressed views) and, corresponding to each shadow physical address SPA_S, a virtual address VA_CB (such as a virtual address of a compression buffer), a header virtual address VA_HD, and information IF_IMG about the image data (e.g., width, height, format, etc. of the image data). In an embodiment, the shadow physical address SPA_S indicates the starting address of a shadow physical address area and may hereinafter be referred to as the starting address SPA_S. For example, the first compression buffer comp_buf_o0 and the first header address comp_buf_o0+img0_size may be mapped to the first shadow physical address shadow_p0 as the virtual address VA_CB and the header virtual address VA_HD, respectively, and the second compression buffer comp_buf_o1 and the second header address comp_buf_o1+img1_size may be mapped to the second shadow physical address shadow_p1 as the virtual address VA_CB and the header virtual address VA_HD, respectively. Here, img0_size in FIG. 15 indicates a first image size of the compressed image data stored in the first compression buffer comp_buf_o0 indicated by the virtual address VA_CB, and img1_size in FIG. 15 indicates a second image size of the compressed image data stored in the second compression buffer comp_buf_o1 indicated by the virtual address VA_CB. In the embodiment shown in FIG. 15, the width, height, and format of the first and second image data (such as uncompressed or decompressed image data) stored at the first shadow physical address shadow_p0 and the second shadow physical address shadow_p1 may be 3840, 2160, and NV12 in the listed order, but are not limited thereto.
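The table of FIG. 15 can be sketched as a mapping keyed by the starting shadow physical address SPA_S, where each entry carries the compression-buffer virtual address VA_CB, the header virtual address VA_HD (VA_CB plus the image size), and the image information. The concrete numeric addresses below are illustrative assumptions only.

```python
# Minimal sketch of the address matching table (AMT) of FIG. 15. Keys are the
# starting shadow physical addresses SPA_S; each entry holds VA_CB, VA_HD, and
# the image information IF_IMG. Address values and sizes are hypothetical.

IMG0_SIZE = 3840 * 2160 * 3 // 2   # assumed size of a 3840x2160 NV12 image
IMG1_SIZE = 3840 * 2160 * 3 // 2

COMP_BUF_O0 = 0x1000_0000          # assumed virtual address of the first compression buffer
COMP_BUF_O1 = 0x2000_0000          # assumed virtual address of the second compression buffer

AMT = {
    # SPA_S (shadow_p0) -> VA_CB, VA_HD = VA_CB + image size, image info
    0xC000_0000: {"va_cb": COMP_BUF_O0, "va_hd": COMP_BUF_O0 + IMG0_SIZE,
                  "width": 3840, "height": 2160, "format": "NV12"},
    # SPA_S (shadow_p1)
    0xD000_0000: {"va_cb": COMP_BUF_O1, "va_hd": COMP_BUF_O1 + IMG1_SIZE,
                  "width": 3840, "height": 2160, "format": "NV12"},
}
```

One entry per shadow region keeps the lookup a single dictionary access once the starting address SPA_S of an incoming shadow physical address has been identified.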
Referring back to FIG. 14A, the sub-processing circuit 220 may identify the starting address SPA_S and the offset α of the shadow physical address A, and find, in the address matching table AMT, the virtual address VA_CB, the size of the image data, and the information about the image data corresponding to the starting address SPA_S of the shadow physical address A. For example, when the shadow physical address A indicates a part of the first shadow physical address area, e.g., a sub-block of the image data, the shadow physical address A may be expressed as a value obtained by adding the offset α to the first shadow physical address shadow_p0. In this case, the offset α may be smaller than the size of the image data, but is not limited thereto. In the address matching table AMT, the sub-processing circuit 220 may find the first compression buffer comp_buf_o0 as the virtual address VA_CB corresponding to the first shadow physical address shadow_p0, the first image size img0_size as the size of the image data, and the width 3840, the height 2160, and the format NV12 as the information about the image data. The sub-processing circuit 220 may calculate coordinates (e.g., an x-coordinate and a y-coordinate) of the image data based on the offset α, the format, and the width of the image data in response to the read request of the CPU 210. The sub-processing circuit 220 may calculate the virtual address B of the payload of the sub-block including the pixels indicated by the coordinates (x, y) of the image data and the virtual address C of the header, based on the virtual address VA_CB (the first compression buffer comp_buf_o0) of the image data and the coordinates (x, y) of the image data.
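The steps above can be sketched as arithmetic: decompose the shadow physical address A into SPA_S plus offset α, derive the (x, y) pixel coordinates from α using the image width, and compute the payload and header virtual addresses B and C. The sub-block geometry, bytes-per-pixel, and per-sub-block header and payload sizes below are assumptions for illustration, not values stated in this disclosure.

```python
# Hedged sketch of operation (3): shadow physical address -> (x, y) coordinates
# -> payload virtual address B and header virtual address C. Sub-block sizes
# and byte layouts are hypothetical.

BYTES_PER_PIXEL = 1       # assumed: luma plane of an NV12 image
SUB_BLOCK_W, SUB_BLOCK_H = 16, 4   # assumed sub-block geometry (pixels)
SUB_BLOCK_BYTES = 64               # assumed payload granule per sub-block
HEADER_BYTES = 4                   # assumed header entry size per sub-block

def translate(shadow_addr, spa_s, va_cb, va_hd, width):
    offset = shadow_addr - spa_s                       # offset α inside the decompressed view
    y, x = divmod(offset // BYTES_PER_PIXEL, width)    # pixel coordinates from α and width
    block_index = (y // SUB_BLOCK_H) * (width // SUB_BLOCK_W) + (x // SUB_BLOCK_W)
    va_b = va_cb + block_index * SUB_BLOCK_BYTES       # virtual address B of the payload
    va_c = va_hd + block_index * HEADER_BYTES          # virtual address C of the header
    return (x, y), va_b, va_c
```

With these assumptions, an address α = 4 rows + 32 pixels into a 3840-pixel-wide image lands in the sub-block at block row 1, block column 2.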
In operation (4), the sub-processing circuit 220 may send the virtual address B of the payload and the virtual address C of the header to the MMU 250.
The MMU 250 may in turn translate the virtual address B of the payload and the virtual address C of the header into a physical address D of the payload and a physical address E of the header with reference to the second page table PGTB2. The physical address D of the payload and the physical address E of the header may be effective physical addresses. In an embodiment, the second page table PGTB2 may include another page table referenced by other processing circuits, such as an ISP, a GPU, etc., for processing image data in the AP 200. In operation (5), the MMU 250 may send a read request signal including the physical address D of the payload and the physical address E of the header to the NoC 240, and the NoC 240 may send the read request signal to SICC 230. Since the read request signal transmitted from the NoC 240 contains a valid physical address, SICC 230 may transmit the read request signal to the DRAM 300.
In operation (6), the header HD and the payload PL of the compressed image data, including the pixel values of the at least one region of the image data requested by the CPU 210, may be read from the compression buffer CBUF of the DRAM 300 and may be supplied to the sub-processing circuit 220 through SICC 230 and the NoC 240.
In operation (7), the sub-processing circuit 220 may decompress the compressed image data. The sub-processing circuit 220 may decompress the compressed image data by decoding the payload PL based on the compression information in the header HD. In operation (8), the sub-processing circuit 220 may transmit the decompressed image data DCDT to the CPU 210 through the NoC 240 and the SICC 230. The decompressed image data DCDT may be used in an application executed by the CPU, but is not limited thereto.
In an embodiment, in operation (9), the sub-processing circuit 220 may send a cache management request signal QRC associated with the decompressed image data to SICC 230. For example, the decompressed image data DCDT may include the pixel values of a single sub-block that contains the pixel values of the at least one region requested by the CPU 210, and the sub-processing circuit 220 may send, to SICC 230, a cache management request signal QRC requesting that the remaining pixel values, other than the pixel values provided to the CPU 210, be stored in the cache. When the CPU 210 subsequently requests to read consecutive pixel values in the single sub-block, a cache hit may occur, and the pixel values already stored in the cache CC may be sent to the CPU 210 without any further access to the DRAM 300. Thus, the hit rate of the cache can be increased.
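The caching behavior described above can be sketched as follows: the first pixel read of a sub-block costs one DRAM access and decompresses the whole sub-block into the cache, after which consecutive reads of the same sub-block hit without touching DRAM. The class name, sub-block size, and pixel representation are assumptions for illustration.

```python
# Illustrative sketch of operation (9): keeping the remaining pixel values of
# a decompressed sub-block in the system cache CC so that consecutive reads
# hit. All names and sizes here are hypothetical.

class SystemCacheSketch:
    def __init__(self):
        self.cc = {}            # cache CC: sub-block id -> decompressed pixel values
        self.dram_reads = 0     # counts accesses to the compression buffer in DRAM

    def read_pixel(self, block_id, pixel_index):
        if block_id not in self.cc:
            # cache miss: read + decompress the whole sub-block from DRAM once
            self.dram_reads += 1
            self.cc[block_id] = [f"px{block_id}:{i}" for i in range(64)]
        return self.cc[block_id][pixel_index]   # subsequent pixels hit in the cache
```

Two consecutive pixel reads within one sub-block then cost a single DRAM access, which is the hit-rate improvement the paragraph describes.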
In an embodiment, the sub-processing circuit 220 may analyze read request signals from the CPU 210, detect a specific pattern in the image data requested by the read request signals, such as a stripe pattern in the image data, and send a cache management request signal QRC to SICC 230, thereby prefetching, from the DRAM 300, image data expected to be requested by the CPU 210.
Referring now to fig. 14B, in operation (1), the CPU 210 may output a write request signal including the shadow physical address A to SICC 230. The write request signal may request that pixel values corresponding to a portion of the image data (e.g., at least one region of the image data in a single frame) be written or stored into the DRAM 300.
In operation (2), SICC 230 may send the write request signal including the shadow physical address A to the sub-processing circuit 220. In operation (3), the sub-processing circuit 220 may translate the shadow physical address A into the virtual address VA. The sub-processing circuit 220 may refer to an address matching table, such as the AMT of fig. 15, to translate the shadow physical address A into a virtual address B of the payload and a virtual address C of the header. In operation (4), the sub-processing circuit 220 may send, to SICC 230, a cache management request signal QRC associated with the sub-block that includes the pixel values requested to be written. In an embodiment, the sub-processing circuit 220 may send a "clean and invalidate" request signal for the sub-block including the pixel values requested to be written to SICC 230. SICC 230 may flush dirty lines from the caches for the CPU 210, such as the L1 cache, the L2 cache, the L3 cache, and the Last Level Cache (LLC), and may invalidate the corresponding lines in the cache for the sub-processing circuit 220, such as the cache CC.
In an embodiment, when the pixel values of the sub-block including the pixel values requested to be written are not present in the cache CC, the sub-processing circuit 220 may read the header HD and the payload PL corresponding to the sub-block from the compression buffer CBUF of the DRAM 300 in operation (6) and decompress the payload PL based on the compression information of the header HD in operation (7), similar to operations (4) to (7) of fig. 14A. The sub-processing circuit 220 may update the sub-block based on the pixel values requested to be written by the CPU 210, compress the updated sub-block in operation (8), and write the compressed image data CDT (such as the header and payload of the sub-block) to the compression buffer CBUF of the DRAM 300 in operation (9).
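The write path above is a read-modify-write cycle on a compressed sub-block. As a minimal sketch, `zlib` stands in for the hardware codec purely for illustration; the real compression format of the sub-blocks is not specified here.

```python
# Hedged sketch of the FIG. 14B write path: read + decompress the sub-block,
# update the requested pixel, recompress, and write back. zlib is a stand-in
# for the hardware compressor/decompressor.
import zlib

def write_pixel(compressed_block: bytes, index: int, value: int) -> bytes:
    pixels = bytearray(zlib.decompress(compressed_block))  # operations (6)-(7): read + decompress
    pixels[index] = value                                  # update the pixel requested by the CPU
    return zlib.compress(bytes(pixels))                    # operations (8)-(9): recompress + write back
```

Because a single pixel write touches the whole sub-block, batching writes to the same sub-block (as the cache management in operation (4) enables) avoids repeating the decompress/recompress cycle.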
Fig. 16 shows a SoC according to an embodiment.
Referring to fig. 16, the SoC 400 according to an embodiment may include a CPU 410, a Random Access Memory (RAM) 420, a multimedia IP core 430, a memory controller 440, a sub-processing circuit 450, a sensor interface 460, and a display controller 470. The SoC 400 may also include other commonly used components, such as a communication module, a Read-Only Memory (ROM), and the like. Components of the SoC 400, such as the CPU 410, the RAM 420, the multimedia IP core 430, the memory controller 440, the sub-processing circuit 450, the sensor interface 460, and the display controller 470, may send and receive data via a bus 480. The Advanced Microcontroller Bus Architecture (AMBA) protocol may be employed as a standard specification for the bus 480, but is not limited thereto. For example, any other suitable protocol, such as uNetwork, CoreConnect, or the Open Core Protocol (OCP) of OCP-IP, may be similarly or additionally employed as the standard specification. In an embodiment, the bus 480 may be implemented in the form of a network-on-chip.
In an embodiment, bus 480 is further configured to receive another access address from at least one Intellectual Property (IP) core and to send the other access address to the memory if the other access address corresponds to a physical address area of the memory and to send the other access address to other processing circuitry outside the memory if the other access address corresponds to a shadow physical address area outside the physical address area of the memory.
The CPU 410 may control the overall operation of the SoC 400 and may correspond to the processor 110 of fig. 1 and/or the CPU 210 of fig. 13, each as described above. The CPU 410 may execute an operating system and/or applications and may translate virtual addresses generated by the applications into physical addresses. In this case, based on the first page table set for an application, the CPU 410 may translate a virtual address into one of an effective physical address corresponding to the physical address area of the memory 445 and a shadow physical address corresponding to a shadow physical address area outside of the physical address area of the memory 445. The CPU 410 may generate an access request signal for accessing the memory 445, such as a read request signal or a write request signal, that includes the effective physical address or the shadow physical address. The access request signal may be sent to the memory controller 440 or the sub-processing circuit 450 via the bus 480.
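The CPU-side translation just described can be sketched as a first page table whose entries map each virtual page either to a valid physical page (direct access) or to a shadow physical page (indirect access via the sub-processing circuit). The page size and table contents below are illustrative assumptions.

```python
# Minimal sketch of the first page table used by the CPU: each virtual page
# number maps to a physical page number tagged as "valid" or "shadow".
# All addresses and the 4 KiB page size are hypothetical.

PAGE = 0x1000

FIRST_PAGE_TABLE = {
    0x10: (0x80000, "valid"),    # maps into the memory's physical address region
    0x11: (0xC0000, "shadow"),   # maps into the shadow physical address region
}

def cpu_translate(virt_addr: int):
    """Translate a virtual address as the CPU's MMU would, keeping the kind tag."""
    ppn, kind = FIRST_PAGE_TABLE[virt_addr // PAGE]
    return ppn * PAGE + virt_addr % PAGE, kind
```

An access request built from a "shadow" result would then be routed by the bus 480 to the sub-processing circuit 450 rather than to the memory controller 440.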
The RAM 420 may be implemented as a volatile memory such as a Dynamic RAM (DRAM) and/or a Static RAM (SRAM), or as a nonvolatile resistive memory such as a PRAM, an MRAM, a ReRAM, a FeRAM, or the like. The RAM 420 may temporarily store programs, data, and/or instructions.
The multimedia IP core 430 may perform image processing on image data such as still images or video. For example, the multimedia IP core 430 may include at least one of an ISP, a GPU, a Video Processing Unit (VPU), a Display Processing Unit (DPU), and/or a neural Network Processing Unit (NPU).
The ISP may change the format of the received image data or correct the image quality of the image data. For example, the ISP may receive RGB image data as input data and convert the RGB image data into YUV image data. Further, the ISP can correct the image quality of the image data by performing image processing, such as adjusting the gamma value and/or brightness of the received image data, widening the Dynamic Range (DR) of the received image data, and/or removing noise from the received image data.
The GPU may compute and generate two-dimensional or three-dimensional graphics. The GPU may be dedicated to processing graphics data and may process graphics data in parallel. Furthermore, the GPU may be used to perform complex operations such as geometric computations, scalar and vector floating-point computations, and the like. The GPU may execute various commands encoded using an API such as, but not limited to, OpenCL, OpenGL, and/or WebGL.
The VPU may correct the quality of received video images, or may record and play back media including the video images, such as audio and video.
The DPU may perform image processing for displaying the received image data on the display device 475. For example, the DPU may change the format of the received image data to a suitable format for display on a display device and/or correct the image data based on gamma values corresponding to the display device.
The NPU may perform image processing on the received image data based on the learning neural network, derive features from the image data, and identify objects, backgrounds, etc. in the image data based on the features. The NPU may be dedicated to the computation of one or more neural networks and may process the image data in parallel.
The memory controller 440 may interface data or commands between the SoC 400 and the memory 445. The memory controller 440 may receive access request signals from the bus 480 and send the access request signals to the memory 445. As described above with reference to fig. 1, the memory 445 may be implemented as a volatile memory such as a DRAM, an SRAM, or an SDRAM, or a nonvolatile memory such as a PRAM, an MRAM, a ReRAM, a FeRAM, or a NAND flash memory, but is not limited thereto. The memory 445 may be implemented as a memory card, such as a multimedia card (MMC), an embedded MMC (eMMC), a Secure Digital (SD) card, a micro SD card, and the like. The memory 445 may include a compression buffer and may store compressed image data, but is not limited thereto.
The multimedia IP core 430 may compress data processed through image processing and store the compressed data in the memory 445. The multimedia IP core 430 may include a Memory Management Unit (MMU), and the MMU may translate virtual addresses at which the compressed data is stored into effective physical addresses corresponding to the physical address region of the memory 445 based on a second page table different from the first page table used by the CPU. In addition, the multimedia IP core 430 may send an access request signal having a valid physical address for accessing the memory 445 to the memory controller 440 via the bus 480.
The sub-processing circuit 450 may support functions that the CPU 410 need not perform itself, such as data processing for data read from the memory 445 or data to be written to the memory 445. The sub-processing circuit 450 may translate a shadow physical address in an access request signal sent from the CPU 410 into an effective physical address indicating the physical address area of the memory 445 and read data from the memory 445 based on the effective physical address. Accordingly, the CPU 410 may indirectly access the memory 445 through the sub-processing circuit 450.
In an embodiment, sub-processing circuit 450 may translate the shadow physical address into a virtual address by using an address matching table generated with the first page table when the first page table is created, and may translate the virtual address into a valid physical address by using a second page table set for multimedia IP core 430. In an embodiment, the system MMU for translating virtual addresses into effective physical addresses by using the second page table may be implemented as a separate circuit from the sub-processing circuit 450. In an embodiment, the MMU of multimedia IP core 430 may be used as a system MMU.
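The two-stage translation performed on behalf of the CPU can be sketched as a chain: stage 1 turns the shadow physical address into a virtual address via the address matching table, and stage 2 turns that virtual address into a valid physical address via the second page table shared with the multimedia IP core. Every address value below is an assumed example, not from this disclosure.

```python
# Sketch of the two-stage indirect translation: shadow PA -> VA (address
# matching table) -> valid PA (second page table). Addresses are hypothetical.

SHADOW_REGION_START = 0xC000_0000           # assumed start of the shadow region
AMT_450 = {SHADOW_REGION_START: 0x1000_0000}            # shadow region start -> VA of buffer
SECOND_PAGE_TABLE = {0x1000_0000 // 0x1000: 0x8800_0000 // 0x1000}  # VPN -> PPN

def indirect_translate(shadow_pa: int) -> int:
    # stage 1: shadow physical address -> virtual address, via the AMT
    va = AMT_450[SHADOW_REGION_START] + (shadow_pa - SHADOW_REGION_START)
    # stage 2: virtual address -> effective physical address, via the second page table
    ppn = SECOND_PAGE_TABLE[va // 0x1000]
    return ppn * 0x1000 + va % 0x1000
```

Because stage 2 walks the same second page table the multimedia IP core uses, both the sub-processing circuit and the IP core resolve the compression buffer to the same effective physical addresses.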
The sub-processing circuit 450 may correspond to the sub-processing circuit 120 of fig. 1 or the sub-processing circuit 220 of fig. 13 as described above, but is not limited thereto. A substantially repetitive description of the sub-processing circuit 450 may be omitted.
The sensor interface 460 may interface data or commands between the SoC 400 and the image sensor 465 and may receive image data from the image sensor 465. The image data received from the image sensor 465 may be processed by at least one processing circuit of the multimedia IP core 430 for image processing or may be processed by an application running on the CPU 410 for such image processing. Image data received from the image sensor 465 and/or image data undergoing image processing may be stored in the memory 445.
The display controller 470 may interface display data, such as image data, for output to the display device 475. The display device 475 may display the display data as an image or video on a display panel, such as a Liquid Crystal Display (LCD) or an Active-Matrix Organic Light-Emitting Diode (AMOLED) display.
Fig. 17 shows an AP-mounted electronic device according to an embodiment.
The electronic device 2000 may include a mobile device, such as a smartphone, a tablet computer, a laptop computer, a wearable device, a GPS device, an electronic book terminal, an MP3 player, a digital camera, a navigation device, a drone, an Internet of Things device, a home appliance, an Advanced Driver Assistance System (ADAS), etc. Further, the electronic device 2000 may be provided as a component in an assembly such as a vehicle, furniture, a manufacturing facility, a door, and various measuring devices, but is not limited thereto.
Referring to fig. 17, an electronic device 2000 may include an Application Processor (AP) 2100, a camera module 2200, a working memory 2300, a storage device 2400, a display device 2500, a communication module 2600, and a user interface 2700. The electronic device 2000 may further include, but is not limited to, other commonly used components.
The AP 2100 may be implemented as a SoC that controls the overall operation of the electronic device 2000 and drives applications, operating systems, and the like. The AP 2100 may perform image processing on image data provided from the camera module 2200, and may store the image data in the storage device 2400 and/or provide the image data to the display device 2500. As described above, the SoC 100 of fig. 1, the SoC 100a of fig. 3, the AP 200 of fig. 13, and the SoC 400 of fig. 16 may be used as the AP 2100, but are not limited thereto.
In an embodiment, the AP 2100 may include a CPU 2101, a system bus 2102, and a sub-processing circuit 2103. When a first access request signal having a physical address for accessing the working memory is received from the CPU 2101, the system bus 2102 may determine whether the physical address is a valid physical address or a shadow physical address. The system bus 2102 may send the first access request signal to the working memory 2300 when the physical address is a valid physical address, and/or the system bus 2102 may send the first access request signal to the sub-processing circuit 2103 when the physical address is a shadow physical address. The sub-processing circuit 2103 may convert the physical address in the first access request signal into a valid physical address and send a second access request signal having the valid physical address to the system bus 2102. The system bus 2102 may transmit the second access request signal from the sub-processing circuit 2103 to the working memory 2300. In an embodiment, the sub-processing circuit 2103 may perform compression and decompression or encryption and decryption on data received from the working memory 2300, in connection with the first access request signal from the CPU 2101 or in response to the second access request signal sent to the system bus 2102, and may prefetch data expected to be accessed by the CPU 2101 from the working memory 2300. The CPU 2101 can indirectly access the working memory 2300 by using the sub-processing circuit 2103, and the sub-processing circuit 2103 can execute functions that the CPU 2101 is not required to support, thereby improving the efficiency of applications running on the CPU 2101.
The camera module 2200 may generate and transmit image data to the AP 2100. The camera module 2200 may include at least one camera, and the camera may include an image sensor and a lens. The image sensor may convert an optical signal received through the lens into image data. In an embodiment, the camera module 2200 may include a plurality of cameras having different perspectives. In an embodiment, the camera module 2200 may generate image data having different exposures and transmit the image data to the AP 2100. The AP 2100 may combine different image data to generate a High Dynamic Range (HDR) image.
The working memory 2300 may be implemented as a volatile memory such as DRAM, SRAM, or the like, or a nonvolatile memory such as FeRAM, RRAM, PRAM, NAND flash memory or the like, but is not limited thereto. An operating program or an application program stored in the storage device 2400 can be loaded into the working memory 2300 and executed in the CPU 2101. Further, operation data generated in the operation of the electronic device 2000 may be temporarily stored in the work memory 2300. The working memory 2300 may store programs and/or data that are processed and/or executed by the AP 2100. For example, the AP 2100 may perform image processing on image data provided from the camera module 2200, compress the image data processed through the image processing, and temporarily store the compressed image data in the working memory 2300.
The memory device 2400 may be implemented as a non-volatile memory such as a NAND flash memory and/or a resistive memory, and may be provided as a memory card such as MMC, eMMC, SD, microSD or the like. The storage device 2400 may store data provided from the AP 2100. For example, the AP 2100 may store image data processed through image processing into the storage device 2400. Further, the storage device 2400 may store an operation program, an application program, and the like of the electronic device 2000.
The communication module 2600 may include a transceiver 2610, a modem 2620, and an antenna 2630. The communication module 2600 may perform wireless communication with one or more external devices and may receive data from and/or transmit data to the one or more external devices.
The user interface 2700 may be implemented with various devices capable of receiving user input, such as a keyboard, a key panel, a touch panel, a fingerprint sensor, a microphone, and so forth. The user interface 2700 may receive user input and provide signals corresponding to the received user input to the AP 2100.
The inventive concept has been described above by way of example with reference to illustrative embodiments. It should be understood that the embodiments shown in the drawings and described in the specification are provided by way of illustration and not of limitation. Specific terms have been used in this specification to effectively convey the technical idea of the present disclosure, but they should not be construed as limiting the scope or spirit of the present disclosure.
One of ordinary skill in the relevant art will appreciate that various modifications and other embodiments based on the foregoing are possible without departing from the technical scope of the present disclosure. While the present inventive concept has been particularly shown and described with reference to illustrative embodiments thereof, it will be understood that various changes in form and details may be made therein without departing from the spirit and scope of the appended claims.

Claims (20)

1. A system on a chip (SoC) comprising:
a first processor configured to output a first access address;
a system bus configured to: transmit the first access address to a memory if the first access address corresponds to a physical address region of the memory, and transmit the first access address to other processing circuitry outside the memory if the first access address corresponds to a shadow physical address region other than the physical address region of the memory; and
a sub-processing circuit configured to receive the first access address from the first processor via a system bus, translate the first access address into a second access address corresponding to the physical address region, and send the second access address to the system bus to access the memory.
2. The SoC of claim 1, wherein the first processor generates, as the first access address, one of a first physical address and a second physical address from a virtual address requested by an application with reference to a first page table including mapping information between virtual addresses and the first and second physical addresses, wherein the first physical address corresponds to the physical address region and the second physical address corresponds to the shadow physical address region.
3. The SoC of claim 1, wherein the first processor executes an application on an Operating System (OS).
4. The SoC of claim 1, further comprising a second processor configured to output a third access address for accessing the memory, wherein the third access address corresponds to a first physical address corresponding to the physical address region.
5. The SoC of claim 1,
wherein the system bus includes a cache configured to cache data stored in a memory,
wherein the system bus is further configured to receive another access address from at least one Intellectual Property (IP) core, to send the other access address to the memory if the other access address corresponds to the physical address area of the memory, and to send the other access address to other processing circuitry outside the memory if the other access address corresponds to the shadow physical address area outside the physical address area of the memory, and
wherein the sub-processing circuit transmits a management request signal regarding the cache to the system bus in association with data stored in an address area of the memory corresponding to the second access address.
6. The SoC of claim 1, wherein the sub-processing circuit comprises a first functional block configured to process data received in response to an access request comprising the first access address.
7. The SoC of claim 6, wherein the first functional block comprises a compressor configured to compress or decompress the data.
8. The SoC of claim 6, wherein the first functional block comprises an encoder configured to encrypt or decrypt the data.
9. The SoC of claim 6, wherein the sub-processing circuit comprises:
a first address translation circuit configured to translate the first access address into a first virtual address, wherein the first virtual address matches a first physical address corresponding to the physical address region; and
a second address translation circuit configured to translate the first virtual address into the first physical address based on a second page table.
10. The SoC of claim 9, wherein the second page table is substantially the same as a page table used by another processor, which performs data processing, stores processed data in the memory, or reads data from the memory, to convert virtual addresses of the memory into physical addresses.
11. An application processor, comprising:
a main processor configured to convert a first virtual address generated when an application is executed into a first physical address by using a first page table including mapping information between a physical address indicating one of a physical address area and a shadow physical address area of a memory and a virtual address indicating an address area of a virtual memory identified by the application, and output a first access request including the first physical address;
a router configured to receive the first access request from the host processor, to send the first access request to the memory in response to a first physical address corresponding to a physical address region of the memory, and to output the first access request to an Intellectual Property (IP) core outside the memory in response to a first physical address corresponding to a shadow physical address region of the memory;
Sub-processing circuitry configured to receive the first access request from the router, process data associated with the first access request, and translate the first physical address into a second virtual address; and
a first Memory Management Unit (MMU) configured to translate the second virtual address into a second physical address corresponding to a physical address region of the memory,
wherein the router is configured to receive a second access request from the first MMU that includes the second physical address and to send the second access request to the memory if the second physical address corresponds to a physical address region of the memory.
12. The application processor of claim 11, wherein the sub-processing circuit includes a compressor configured to compress or decompress the data.
13. The application processor of claim 11, further comprising:
an Image Signal Processor (ISP) configured to perform image processing on image data, compress the processed image data, and generate a third virtual address of a virtual memory in which the compressed image data is to be stored; and
a second MMU configured to translate the third virtual address into a third physical address corresponding to a compression buffer to store the compressed image data in a physical address area of the memory by using a second page table including mapping information between physical addresses indicating physical address areas of the memory and virtual addresses indicating address areas of virtual memory identified by the ISP.
14. The application processor of claim 13, wherein the first MMU is configured to convert the second virtual address into the second physical address with reference to the second page table.
15. The application processor of claim 13,
wherein the router includes a cache configured to cache data stored in the memory, and
wherein the sub-processing circuit is configured to transmit, to the router, a cache management request signal relating to data stored in an area of the compression buffer corresponding to the second physical address.
16. The application processor of claim 15, wherein the sub-processing circuit is configured to transmit a management request signal to the router to prefetch, from the memory into the cache, data expected to be requested by the main processor, based on an access pattern of the data requested by the main processor.
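Claim 16's pattern-based prefetch can be illustrated with a toy stride detector. Everything here is an assumption for illustration: the patent does not specify how the access pattern is recognized, and the history length, prefetch depth, and class name are invented.

```python
from collections import deque

class StridePrefetcher:
    """Toy sketch of claim 16: observe the main processor's accesses and,
    once a constant stride emerges, return the addresses the router should
    prefetch from memory into its cache. Thresholds are illustrative."""

    def __init__(self, history: int = 3, depth: int = 2):
        self.addresses = deque(maxlen=history)  # recent access addresses
        self.depth = depth                      # future addresses to prefetch

    def observe(self, addr: int) -> list[int]:
        self.addresses.append(addr)
        if len(self.addresses) < self.addresses.maxlen:
            return []                           # not enough history yet
        seq = list(self.addresses)
        strides = {b - a for a, b in zip(seq, seq[1:])}
        if len(strides) == 1:                   # constant stride detected
            stride = strides.pop()
            return [addr + stride * i for i in range(1, self.depth + 1)]
        return []                               # irregular pattern: no prefetch
```

The returned addresses would be wrapped in the management request signal of claim 16; the hardware equivalent would track strides per request stream rather than globally.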
17. A method of operating a system on a chip (SoC), the method comprising:
transmitting, by a processor, a first access request signal including a first physical address to a router;
transmitting, by the router, the first access request signal to a sub-processing circuit if the first physical address does not correspond to a physical address area of a memory;
converting, by the sub-processing circuit, the first physical address into a second physical address corresponding to the physical address area of the memory;
transmitting, by the sub-processing circuit, a second access request signal including the second physical address to the router; and
transmitting, by the router, the second access request signal to the memory.
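The five steps of the method of claim 17 can be sketched end to end. The addresses and the assumption that the shadow-to-real conversion is a fixed offset subtraction are invented for illustration; the patent leaves the conversion performed by the sub-processing circuit unspecified.

```python
# End-to-end sketch of claim 17; all constants are illustrative.
DRAM_BASE, DRAM_SIZE = 0x8000_0000, 0x4000_0000
SHADOW_OFFSET = DRAM_SIZE          # assumed: shadow area mirrors the DRAM area

def sub_processing_convert(shadow_pa: int) -> int:
    """Step 3: the sub-processing circuit converts the first (shadow)
    physical address into a second physical address inside the memory."""
    return shadow_pa - SHADOW_OFFSET

def access(memory: dict, first_pa: int):
    """Steps 1-2 and 4-5: the router forwards the request to the
    sub-processing circuit when first_pa is outside the memory's physical
    address area, then sends the converted request to the memory."""
    if DRAM_BASE <= first_pa < DRAM_BASE + DRAM_SIZE:
        return memory[first_pa]                       # direct access
    second_pa = sub_processing_convert(first_pa)      # step 3
    return memory[second_pa]                          # steps 4-5
```

A direct access and a shadow access land on the same memory word, which is the point of the scheme: the processor addresses the sub-processing circuit's service through the shadow window without a special driver path.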
18. The method of claim 17, further comprising:
converting, by the processor, a first virtual address into the first physical address based on a page table,
wherein the page table includes mapping information between virtual addresses and both effective physical addresses and shadow physical addresses, and
wherein the effective physical addresses correspond to the physical address area of the memory and the shadow physical addresses correspond to a shadow physical address area outside the physical address area of the memory.
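The page table of claim 18 maps each virtual page to either an effective or a shadow physical page. A minimal sketch, assuming a 4 KiB page size and invented addresses (the patent specifies neither):

```python
# Minimal page-table sketch for claim 18; page size and entries are invented.
PAGE = 0x1000  # assumed 4 KiB pages
page_table = {
    0x0000_0000: 0x8000_0000,  # virtual page -> effective physical page (DRAM)
    0x0000_1000: 0xC000_0000,  # virtual page -> shadow physical page (outside DRAM)
}

def translate(virtual_address: int) -> int:
    """Claim 18's conversion: look up the page, keep the page offset."""
    base = page_table[virtual_address & ~(PAGE - 1)]
    return base | (virtual_address & (PAGE - 1))
```

Whether a given virtual page resolves to an effective or a shadow physical address is thus decided purely by which entry the OS installed; the processor pipeline itself needs no change.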
19. The method of claim 17, further comprising:
decompressing, by the sub-processing circuit, compressed data output from the memory in response to the second access request signal; and
sending, by the sub-processing circuit, at least some of the decompressed data to the processor via the router.
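Claim 19's read path can be sketched as follows. zlib is only a software stand-in for the hardware compressor of claim 12, and the memory model and address are invented for the example.

```python
import zlib

def read_via_shadow(memory: dict, second_physical_address: int) -> bytes:
    """Claim 19 sketch: the sub-processing circuit fetches compressed data
    from the memory (the second access request) and decompresses it before
    returning it to the processor via the router."""
    compressed = memory[second_physical_address]
    return zlib.decompress(compressed)

# The processor sees plain data even though the memory holds it compressed.
payload = b"image tile " * 64
memory = {0x8000_2000: zlib.compress(payload)}
```

The same shape applies to claim 20 with a cipher in place of the compressor: the sub-processing circuit transforms the raw memory contents before they reach the processor.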
20. The method of claim 17, further comprising:
decrypting, by the sub-processing circuit, encrypted data output from the memory in response to the second access request signal; and
sending, by the sub-processing circuit, at least some of the decrypted data to the processor via the router.
CN202311259788.3A 2022-09-29 2023-09-26 Application processor, system on chip and operation method thereof Pending CN117785726A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
KR10-2022-0124557 2022-09-29
KR1020230048985A KR20240045069A (en) 2022-09-29 2023-04-13 Application processor, system on chip and method of operation thereof
KR10-2023-0048985 2023-04-13

Publications (1)

Publication Number Publication Date
CN117785726A 2024-03-29

Family

ID=90385791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311259788.3A Pending CN117785726A (en) 2022-09-29 2023-09-26 Application processor, system on chip and operation method thereof

Country Status (1)

Country Link
CN (1) CN117785726A (en)

Similar Documents

Publication Publication Date Title
US10043235B2 (en) Method for caching GPU data and data processing system therefor
EP3274841B1 (en) Compaction for memory hierarchies
US9024959B2 (en) Demand-paged textures
US20150339243A1 (en) Electronic devices
US8245011B2 (en) Method and system for geometry-based virtual memory management in a tiled virtual memory
US20180246816A1 (en) Streaming translation lookaside buffer
US20110161675A1 (en) System and method for gpu based encrypted storage access
EP3964949B1 (en) Graphics processing method and apparatus
US20160062659A1 (en) Virtual memory module
US10043234B2 (en) System and method for frame buffer decompression and/or compression
RU2656727C1 (en) Compression control surfaces supported by virtual memory
KR20230041593A (en) Scalable address decoding scheme for cxl type-2 devices with programmable interleave granularity
WO2007135602A1 (en) Electronic device and method for storing and retrieving data
US9779471B2 (en) Transparent pixel format converter
US20180088952A1 (en) Automatic hardware zlw insertion for ipu image streams
EP4345632A1 (en) Application processor, system-on-a-chip and method of operation thereof
WO2019196634A1 (en) Data processing method and apparatus
CN117785726A (en) Application processor, system on chip and operation method thereof
US10133685B2 (en) Bus interface device that merges data request signals, semiconductor integrated circuit device including the same, and method of operating the same
US7861007B2 (en) Method and apparatus for multimedia display in a mobile device
KR20240045069A (en) Application processor, system on chip and method of operation thereof
KR20230138777A (en) A storage device for reorganizing data and operating method of the same

Legal Events

Date Code Title Description
PB01 Publication