WO2019180288A1 - Method and device for parallel processing of program instructions and trace instructions

Method and device for parallel processing of program instructions and trace instructions

Info

Publication number
WO2019180288A1
Authority
WO
WIPO (PCT)
Prior art keywords
trace
instruction
instructions
signal
pipeline
Prior art date
Application number
PCT/ES2019/070176
Other languages
Spanish (es)
French (fr)
Inventor
Antonio Da Silva
Óscar RODRÍGUEZ POLO
Agustín MARTÍNEZ HELLÍN
Pablo Parra Espada
Sebastián SÁNCHEZ PRIETO
Original Assignee
Universidad Politécnica de Madrid
Universidad De Alcalá
Priority date
Filing date
Publication date
Application filed by Universidad Politécnica de Madrid, Universidad De Alcalá
Publication of WO2019180288A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring

Definitions

  • the invention falls, in general, in the Electronics, Information Technology and Telecommunications (ICT) sector, although it has specific application in critical systems typical of the Aerospace, Defense and high reliability sectors.
  • US patent application 5996092 A, "System and method for tracing program execution within a processor before and after a triggering event", makes it possible to start and stop the instruction trace, using a trace processor that works in parallel with the processor that executes the instructions themselves.
  • The trace processor, after detecting the start of the trace by means of a specific instruction, stores in a shared memory information about the entire instruction execution sequence until it detects the trace stop instruction.
  • In the present device, there is no shared memory and no parallel trace processor. The trace is based on instrumenting the code, which adds trace instructions at the specific points where a trace is wanted; these trace instructions uniquely identify the point to be traced, without resorting to the program counter.
  • The trace instruction is executed in parallel with the preceding instruction, which is the one to be traced, by duplicating the necessary parts of the processor pipeline. The result of its execution is a write to an output register, where analysis hardware captures it, so that the specific instant at which the traced instruction was executed is recorded.
  • The present invention takes a different approach, in which the worst-case execution time of each of the system functions can be obtained by selectively instrumenting instructions located anywhere in the code.
  • This type of instrumentation is being used by commercial tools such as RapiTime in critical avionics systems (G. Bernat et al., "Identifying Opportunities for Worst-case Execution Time Reduction in an Avionics System," Ada User Journal, Volume 28, Number 3, 2007, pp. 189-194).
  • This patent therefore does not avoid the overhead of using code instrumentation techniques, which affects the entire system, as the claimed invention does. Instead, it aims to optimize a tracing mechanism based not on instrumentation but on the detection of events that meet predefined conditions. It is therefore concluded that the existing instruction tracing systems make it possible to program specific trace trigger events in order to collect trace information limited to a certain interval before and/or after said event. These methods suffer from a certain rigidity, in that the number of blocks that can be traced in each execution is always limited, and they do not adapt well to the code instrumentation techniques used to characterize the worst-case execution time in critical systems, such as the one used by the aforementioned RapiTime tool. Since applying code instrumentation on current processors introduces execution time overhead, the invention presented, which is aimed at eliminating this overhead, provides an improvement with a specific objective in this area.
  • a parallel processing device of program instructions and trace instructions comprises:
  • controller of the data path which in turn comprises inputs and outputs that control the multiplexers, the load in associated registers of the different stages and the output register for the trace;
  • the data path controller is configured to determine, depending on the state of said controller and the value of the inputs in said controller, the value of the outputs that are sent to the multiplexers of the data path such that a trace instruction is executed in synchronization with the preceding instruction, said execution being effective during the last stage of the trace pipeline.
  • the controller comprises the following instruction sequences S1, S2 and S3:
  • S1 corresponds to Instruction-Trace pairs in which the instructions to be traced are always loaded in the "INSTRUCTION N" element, while the corresponding trace instructions are loaded in the "INSTRUCTION N + 1" element;
  • S2 corresponds to a sequence of instructions that are not traced, so that "INSTRUCTION N" and "INSTRUCTION N + 1" always load instructions (and not traces);
  • S3 corresponds to two Instruction-Trace pairs in which the trace instructions are loaded in successive cycles in the "INSTRUCTION N" element, while the instructions to be traced are loaded in those same cycles in the "INSTRUCTION N + 1" element.
  • The processing device executes the sequence S1 for two clock cycles, T and T + 1, such that the instructions stored at addresses X + 1, X + 3 and X + 5 are the trace instructions of the instructions that precede them, located, respectively, at addresses X, X + 2 and X + 4.
  • The processing device executes the sequence S2, in which no trace instructions are loaded for two cycles: during the first cycle T it is detected that the two instructions loaded in the decoding stage are not trace instructions, and therefore the signals "N_ES_TRAZA" and "N_1_ES_TRAZA" are both "0"; and in the T + 1 cycle the controller is in the state called "PENDING INSTR", in which the pending instruction located in the second decoding unit is directed to stage 3 of the processor instruction pipeline.
  • The processing device executes the sequence S3: in the T cycle the value of the "N_ES_TRAZA" signal is "1", while "N_1_ES_TRAZA" is "0", and the value of the multiplexing signal "SEL_TR_P4" is "2", so that a route is enabled where the trace instruction is synchronized with the execution of the instruction to be traced; in the T + 1 cycle, the trace instruction is located in stage 4 of the trace pipeline.
  • The multiplexing signal "SEL_PIPE_TRAZA" takes the value 0, so that in the T + 1 cycle a zero is found in stage 3 of the trace pipeline.
  • The processing device, during a cycle "T", detects a bubble in stage 3 of the instruction pipeline, so that the controller sets the route that loads a "0" in stages 3 and 4 of the trace pipeline and routes the jump address "Z" to the input register of the search stage; the detection of the bubble in stage 3 corresponds to the setting to "1" of the "BURBUJA_P3" signal and of the "BUBBLE" signal.
  • The controller controls: the route to stage 3 by assigning "0" to the "SEL_PIPE_TRAZA" signal; the route to stage 4 by assigning "0" to the "SEL_TR_P4" signal; and the loading of the input register of the search stage by activating the "LD_DIR" signal and routing the "Z" address to said register by assigning "0" to the "SEL_DIR" signal.
  • The device detects a bubble in stage 4 of the instruction pipeline, such that the controller sets the route that loads a "0" in stages 3, 4 and 5 of the trace pipeline and routes the jump address "Z" to the input register of the search stage; the detection of the bubble in stage 4 corresponds to the setting to "1" of the "BURBUJA_P4" signal and of the "BUBBLE" signal.
  • The controller controls: the route to stage 3 of the pipeline by assigning "0" to the "SEL_PIPE_TRAZA" signal; the route to stage 4 by assigning "0" to the "SEL_TR_P4" signal; the route to stage 5 by assigning "0" to the "SEL_TR_P5" signal; and the loading of the input register of the search stage by activating the "LD_DIR" signal and routing the "Z" address to said register by assigning "0" to the "SEL_DIR" signal.
  • A second aspect of the invention discloses a RISC (Reduced Instruction Set Computer) processor, which comprises a parallel processing device for program instructions and trace instructions according to any one of the above embodiments of the first aspect of the invention.
  • A parallel processing method for program instructions and trace instructions is disclosed which, when executed on a parallel processing device for program instructions and trace instructions as defined in any of the embodiments of the first aspect of the invention, processes an instruction and a trace instruction in parallel.
  • Figure 2. Mealy machine of the data path controller.
  • Figure 12. Table of the outputs of the data path controller relative to stages 3, 4 and 5 of the trace instruction pipeline.
  • Stage 3 input selection multiplexer of the RISC processor instruction pipeline
  • 110 Stage 3 input selection multiplexer of the trace instruction pipeline
  • 125 BURBUJA_P3 Input signal to the controller that monitors the detection of a bubble in stage 3 of the RISC processor instruction pipeline
  • N_1_ES_TRAZA Input signal to the controller that monitors whether the instruction that has been decoded in the "INSTRUCTION N + 1" element is of trace type
  • SEL_PIPE_INSTR Output of the data path controller that controls the input multiplexer to stage 3 of the RISC processor instruction pipeline
  • TR_P5_ES_CERO Signal that monitors whether stage 5 of the trace instruction pipeline stores a zero value
  • 136 LD_TR_OUT Signal that controls the storage of the trace information in the output register. It takes the complementary value of the TR_P5_ES_CERO signal, so the register is only loaded when the trace information is nonzero.
  • 137 LD_DIR Signal that controls the storage, in the input register of the search stage, of the address of the next instruction to be searched.
  • The invention consists of a device equipped with an internal processing structure that eliminates the runtime overhead introduced by the code instrumentation used to measure the worst-case execution time using hybrid analysis.
  • This analysis combines the static analysis of the code with runtime measurements on the deployment platform.
  • The static analysis determines which instructions need to be traced and, by means of instrumentation techniques, adds trace code after each instruction to be traced, so that the instant of execution of said instruction can be captured by means of support hardware and a logic analyzer.
  • The code added after the instruction to be traced makes it possible to uniquely identify the moment of execution of said instruction, but introduces an overhead that this invention eliminates.
  • The device is able to detect the trace instructions and execute them in parallel, in a synchronized way, and conditioned on the complete execution of the preceding instruction. The tracing process is therefore non-intrusive with respect to execution time: because the traces are executed in parallel, introducing them modifies neither the sequence nor the moment of execution of the program under analysis.
  • The device proposed in the invention uses a specific instruction code for the instrumentation, which its internal structure interprets as enabling the trace of the preceding instruction.
  • The main elements of this device, which are shown in Figure 1, are: 1) an instruction search stage (100), which has an instruction address calculation module (101) and an instruction search module with a dual read port (102); 2) a duplicated decoding stage (103); 3) a specific pipeline for the trace instructions (113); 4) an output register for the trace (123); 5) the data path, consisting of a set of multiplexers (109, 110, 116 and 119); 6) the device's data path controller (104), which determines, depending on its state and the value of its inputs (105), the value of the outputs (106) that control the multiplexers (109, 110, 116 and 119), the loading of the registers associated with the different stages (107, 108, 114, 115, 117, 118, 120 and 121), and the output register (123). Both the inputs (105)
  • Thanks to the dual-port search stage (100), the device allows two instructions to be loaded simultaneously into the decoding stage (103) and decoded in parallel.
  • The "WAIT" signal (124) of this stage is used to model possible wait states in said search. It can be activated after a processor reset or as a result of a jump, which causes the injection of bubbles in the processor instruction pipeline (112), monitored by the signals "BURBUJA_P3" (bubble injection in stage 3, 125), "BURBUJA_P4" (bubble injection in stage 4, 126), and their OR function, labeled "BUBBLE" (122).
  • The "WAIT" signal (124) will be deactivated to notify the decoding stage (103) that the instructions are available for loading.
  • The two instructions found in the decoding stage always correspond to instructions stored in consecutive memory words.
  • The element labeled "INSTRUCTION N" (107) receives the first of the two instructions, while the one labeled "INSTRUCTION N + 1" (108) receives the following one.
  • The N and N + 1 values do not correspond to consecutive physical memory addresses; instead, they represent two instructions stored in consecutive memory words, regardless of the processor's word size in bytes, which in the most general case of a 32-bit RISC processor would be 4.
  • The data path controller (104) uses the values of those signals and of the "WAIT" (124) and "BUBBLE" (122) signals, together with the state of the controller itself, to determine the route that the instructions will follow towards the next stages.
  • The controller configures the multiplexers (109, 110, 116 and 119) of the data path to ensure that a trace instruction is executed in synchronization with the preceding instruction. This execution becomes effective during the last stage (121) of the trace pipeline (113), where it is checked that the signal "TR_P5_ES_CERO" (135) is deactivated, in which case the signal "LD_TR_OUT" (136) is activated and the trace value is directed to the output register (123).
  • Figure 2 represents the Mealy machine of the data path controller of this device (200), which is formally specified in the tables of Figures 9, 10, 11 and 12.
  • Figures 3, 4 and 5 are, respectively, examples of how the controller, to effect synchronization, sets the route in the following possible instruction sequences S1, S2, and S3:
  • The sequence S1 corresponds to Instruction-Trace pairs (301-302, 303-304 and 305-306) in which the instructions to be traced (301, 303 and 305) are always loaded in the "INSTRUCTION N" element (107), while the corresponding trace instructions (302, 304 and 306) are loaded in the "INSTRUCTION N + 1" element (115).
  • S2 corresponds to a sequence of instructions (401, 402, 403 and 404) that are not traced, so that "INSTRUCTION N" (114) and "INSTRUCTION N + 1" (115) always load instructions and not traces.
  • The S3 sequence corresponds to two Instruction-Trace pairs (502-503 and 504-505) in which the trace instructions (503 and 505) are loaded in successive cycles (500 and 510) in the "INSTRUCTION N" element (107), while the instructions to be traced (502 and 504) are loaded in those same cycles in the "INSTRUCTION N + 1" element (108).
  • Figure 3 shows the operation of the device during two clock cycles, T (300) and T + 1 (307), in which the processor executes the instruction sequence S1.
  • the instructions stored in the addresses X + 1 (302), X + 3 (304) and X + 5 (306) are the trace instructions of the preceding instructions, located, respectively, in the addresses X (301), X + 2 (303) and X + 4 (305).
  • The diagram shows, for the two cycles T (300) and T + 1 (307), how the trace instructions (302, 304, 306), added as a result of the instrumentation, are directed towards the stages (115 and 118) that belong to the pipeline of the trace instructions, while the instructions to be traced (301, 303 and 305) are directed towards the stages (114 and 117) that belong to the pipeline of the processor instructions.
  • Figure 4 shows the operation of the device in the sequence S2, in which no trace instructions are loaded for two cycles.
  • In cycle T (400), it is detected that the two instructions loaded in the decoding stage (403 and 404) are not trace instructions, and therefore the signals "N_ES_TRAZA" (129) and "N_1_ES_TRAZA" (130) are both 0.
  • As a result, the decoding stage (103) does not load two new instructions at the beginning of the T + 1 cycle (408), since in the T cycle (400) the values of "LD_N" (127) and "LD_N_1" (128), which control said load, are both 0.
  • In the T + 1 cycle (408), the controller is in the state called "PENDING INSTR" (202), in which the pending instruction (404), located in the second decoding unit (108), is directed towards stage 3 of the processor instruction pipeline (114). In both cycles, T (400) and T + 1 (408), the controller loads stage 3 of the trace instruction pipeline (115) with the value 0, so the instructions will not be traced.
  • Figure 5 shows the operation of the device for the sequence S3, which covers the case in which, in cycle T (500), the instruction to be traced (502) is in stage 3 of the instruction pipeline (114), while the trace instruction (503) is loaded in the "INSTRUCTION N" element (107) of the decoding stage (103).
  • In that same cycle, the "INSTRUCTION N + 1" element (108) of the decoding stage (103) contains the following instruction (504) to be executed.
  • In cycle T, the value of the "N_ES_TRAZA" signal (129) is 1, while "N_1_ES_TRAZA" (130) is 0, and the value of the multiplexing signal "SEL_TR_P4" (133) is 2, which enables a route where the trace instruction (503) is synchronized with the execution of the instruction to be traced (502). Synchronization is effective in the T + 1 cycle (510), with the trace instruction (503) located in stage 4 of the trace pipeline (118).
  • The multiplexing signal "SEL_PIPE_TRAZA" (132) takes the value 0, so that in the T + 1 cycle (510) a zero (507) is found in stage 3 of the trace pipeline (115).
  • The sequence S3 also causes the following instruction (504), which is stored in the "INSTRUCTION N + 1" element (108) during cycle T (500), to have its route to stage 3 of the instruction pipeline (114) enabled. To enable this route, the "SEL_PIPE_INSTR" signal (131) takes the value 1 during the T cycle (500).
  • In the T + 1 cycle (510), the device repeats the same configuration of the data path as in the T cycle (500), since the sequence again places a trace instruction (505) in the "INSTRUCTION N" element (107) of the decoding stage (103) and the next instruction to be executed (506) in the "INSTRUCTION N + 1" element (108).
  • Figures 6, 7 and 8 describe the operation of the device when bubbles are detected.
  • Bubbles are inserted in the instruction pipeline of a RISC processor in all situations in which the sequential execution of instructions is interrupted, as is the case with jump instructions, both conditional and unconditional, and with function calls and returns.
  • When an instruction interrupts the sequential order of execution, the processor must discard the execution of the instructions that follow it and start the search for the instruction whose address was determined by executing the instruction that broke the sequence.
  • This address is labeled as address "Z" (600).
  • Figure 6 explains the data path when a bubble occurs in stage 3 of the instruction pipeline (114), while Figure 7 corresponds to a bubble detected in stage 4 (117), which also causes a bubble in stage 3 (114).
  • Figure 8 explains the evolution of the route in the cycles following the detection of a bubble, until the instruction at the jump address (600) is supplied by the search stage (100).
  • Figure 6 shows how, during the T cycle, a bubble is detected only in stage 3 of the instruction pipeline (114), and how the controller sets the route that loads a 0 in stages 3 (115) and 4 (118) of the trace pipeline (113), while the jump address "Z" (600) is routed to the input register of the search stage (139).
  • The detection of the bubble in stage 3 corresponds to the setting to 1 of the "BURBUJA_P3" signal (125) and consequently of the "BUBBLE" signal (122).
  • The route to stage 3 (115) is controlled by assigning 0 to the "SEL_PIPE_TRAZA" signal (132), and the route to stage 4 (118) is controlled by assigning 0 to the "SEL_TR_P4" signal (133).
  • The loading of the input register of the search stage (139) is controlled by activating the "LD_DIR" signal (137) and routing the "Z" address (600) to said register (139) by assigning 0 to the "SEL_DIR" signal (138).
  • Figure 7 shows how, during the T cycle, a bubble is detected in stage 4 of the instruction pipeline (117), and how the controller sets the route that loads a 0 in stages 3 (115), 4 (118) and 5 (121) of the trace pipeline (113), while the jump address "Z" (600) is routed to the input register of the search stage (139).
  • The detection of the bubble in stage 4 corresponds to the setting to 1 of the "BURBUJA_P4" signal (126) and consequently of the "BUBBLE" signal (122).
  • The route to stage 3 of the trace pipeline (115) is controlled by assigning 0 to the "SEL_PIPE_TRAZA" signal (132), the route to stage 4 (118) is controlled by assigning 0 to the "SEL_TR_P4" signal (133), and the route to stage 5 (121) is controlled by assigning 0 to the "SEL_TR_P5" signal (134).
  • The loading of the input register of the search stage (139) is controlled by activating the "LD_DIR" signal (137) and routing the "Z" address (600) to said register (139) by assigning 0 to the "SEL_DIR" signal (138).
  • Figure 8 represents the wait of two cycles (801 and 802) after either of the two bubbles described in Figures 6 and 7, so that in the T + 2 cycle (802) the search stage (100) deactivates the "WAIT" signal (124), indicating that the instructions are available for loading in the decoding stage (103) in the next cycle, and the signals "LD_N" (127) and "LD_N_1" (128) are activated.
  • The preferred physical embodiment will consist of a hardware/firmware implementation of the described functionality, based on a model description of a standard processor architecture to which the aforementioned modifications, which basically affect the design of the pipeline, will be applied.
  • Said architecture description models will allow the manufacturing details of the device to be generated. The device may be implemented on a programmable device such as an FPGA (Field Programmable Gate Array) or on an ASIC (Application Specific Integrated Circuit).
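The behaviour of the trace pipeline described above (zeros injected when a bubble is detected, and the output register loaded only when stage 5 holds a nonzero value) can be sketched in software. The following Python model is only an illustration under simplifying assumptions: the class and method names are hypothetical, traces always enter at stage 3 (the direct route to stage 4 selected by "SEL_TR_P4" in sequence S3 is not modelled), and stage 5 is assumed to keep advancing on a stage 3 bubble. Only the signal names are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class TracePipelineModel:
    """Toy model of stages 3, 4 and 5 of the trace pipeline plus the output register."""
    p3: int = 0   # stage 3 of the trace pipeline (115)
    p4: int = 0   # stage 4 of the trace pipeline (118)
    p5: int = 0   # stage 5 of the trace pipeline (121)
    out: int = 0  # output register for the trace (123)

    def clock(self, incoming=0, burbuja_p3=False, burbuja_p4=False):
        """Advance one cycle; return True if the output register was loaded."""
        # LD_TR_OUT is the complement of TR_P5_ES_CERO.
        tr_p5_es_cero = (self.p5 == 0)
        ld_tr_out = not tr_p5_es_cero
        if ld_tr_out:
            self.out = self.p5
        if burbuja_p4:
            # Bubble in stage 4: zeros are loaded in stages 3, 4 and 5.
            self.p3, self.p4, self.p5 = 0, 0, 0
        elif burbuja_p3:
            # Bubble in stage 3: zeros in stages 3 and 4.
            # Assumption: the trace already in stage 4 still moves to stage 5.
            self.p3, self.p4, self.p5 = 0, 0, self.p4
        else:
            # No bubble: plain shift (simplification of the multiplexed routes).
            self.p3, self.p4, self.p5 = incoming, self.p3, self.p4
        return ld_tr_out


# A trace with identifier 7 reaches the output register after four cycles.
pipe = TracePipelineModel()
pipe.clock(incoming=7)
pipe.clock()
pipe.clock()
loaded = pipe.clock()
```

In this model a trace that is still in stage 3 when "BURBUJA_P3" is set never reaches the output register, which mirrors the idea that a trace is only emitted when the instruction it accompanies completes.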

Abstract

The invention relates to a method and device for synchronisation and parallel execution of trace instructions on a segmented RISC processor. The invention consists of a device of which the internal structure, based on a segmented processor, does away with the execution time overload introduced by the code instrumentation used to measure execution time in the worst case scenario. For this, the device uses a specific instruction code for the instrumentation, which is interpreted as enabling the tracing of the preceding instruction, and which makes it possible to identify unequivocally the time at which said instruction is executed. The proposed device executes each trace instruction in parallel, in a synchronised fashion, with the instruction to be traced that precedes same, and conditions said execution on completion of the execution of the instruction to be traced without it being affected by bubbles.
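As a rough illustration of the instrumentation idea summarised in the abstract, the Python sketch below inserts a trace instruction after each instruction to be traced and compares how many issue slots the program needs with and without parallel trace execution. It is a toy model, not the patented design: the function names, the tuple encoding of trace instructions and the one-slot-per-instruction cost are all assumptions made for the example. Trace identifiers are kept nonzero because the description treats a zero in the trace pipeline as "no trace".

```python
def instrument(program, trace_points):
    """Insert a trace instruction after each instruction to be traced.

    trace_points maps an instruction index to a nonzero trace identifier.
    """
    instrumented = []
    for index, instr in enumerate(program):
        instrumented.append(instr)
        if index in trace_points:
            # Hypothetical encoding of a trace instruction.
            instrumented.append(("TRACE", trace_points[index]))
    return instrumented


def is_trace(instr):
    """Return True for the hypothetical trace instruction encoding."""
    return isinstance(instr, tuple) and instr[0] == "TRACE"


def issue_slots(program, parallel_trace):
    """Count issue slots under a toy one-slot-per-instruction model."""
    if parallel_trace:
        # Traces ride along with the preceding instruction at no extra cost.
        return sum(1 for instr in program if not is_trace(instr))
    # Conventional processor: every trace instruction takes its own slot.
    return len(program)


program = ["add", "load", "branch"]
instrumented = instrument(program, {0: 1, 2: 2})
overhead_conventional = issue_slots(instrumented, parallel_trace=False) - len(program)
overhead_parallel = issue_slots(instrumented, parallel_trace=True) - len(program)
```

Under these assumptions the conventional count grows with the number of trace points, while the parallel count matches the uninstrumented program, which is the non-intrusiveness the abstract claims.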

Description

DESCRIPTION
A METHOD AND A DEVICE FOR PARALLEL PROCESSING OF PROGRAM INSTRUCTIONS AND TRACE INSTRUCTIONS
TECHNICAL FIELD
The invention falls, in general, within the Electronics, Information Technology and Telecommunications (ICT) sector, although it has specific application in critical systems typical of the Aerospace, Defense and high reliability sectors.
BACKGROUND OF THE INVENTION
Different inventions have been identified that propose solutions to facilitate the tracing of instructions but that differ from the present invention.
US patent application 5996092 A, "System and method for tracing program execution within a processor before and after a triggering event", makes it possible to start and stop the instruction trace, using a trace processor that works in parallel with the processor that executes the instructions themselves. The trace processor, after detecting the start of the trace by means of a specific instruction, stores in a shared memory information about the entire instruction execution sequence until it detects the trace stop instruction. In the present device, there is no shared memory and no parallel trace processor, and the trace is based on instrumenting the code, which adds trace instructions at the specific points where a trace is wanted; these trace instructions uniquely identify the point to be traced, without resorting to the program counter. The trace instruction is executed in parallel with the preceding instruction, which is the one to be traced, by duplicating the necessary parts of the processor pipeline, and the result of its execution is a write to an output register, where analysis hardware captures it, so that the specific instant at which the traced instruction was executed is recorded. The present invention takes a different approach, in which the worst-case execution time of each of the system functions can be obtained by selectively instrumenting instructions located anywhere in the code. This type of instrumentation is being used by commercial tools such as RapiTime in critical avionics systems (G. Bernat et al., "Identifying Opportunities for Worst-case Execution Time Reduction in an Avionics System," Ada User Journal, Volume 28, Number 3, 2007, pp. 189-194).
However, its application on commercially available processors has the main disadvantage of the overhead introduced in the execution time, which the invention presented here eliminates.
With respect to US patent application 2017147472 (A1), "Systems and methods for a real time embedded trace", the main difference is that that system traces the jump instructions autonomously. In the invention presented here, the instructions to be traced are defined by selective instrumentation techniques, which insert a trace instruction after each instruction to be traced. What this invention does is execute them in parallel, synchronized with the instruction to be traced. As already described, this instrumentation technique is being used in critical systems and makes it possible to trace blocks of code in order to evaluate their worst-case execution time. It should be noted that the transition between some of these blocks, such as the one between an else block and the block that follows it, does not involve a jump in the execution, so such transitions would not be detected by the solution presented in US patent 2017147472 (A1).
As for US patent 6513134 (B1), "System and method for tracing program execution within a superscalar processor", it presents an improvement over US 5996092 A, allowing work with superscalar processors that run at high frequencies, above 400 MHz. To do so, it uses an encoding of the information to be traced that reduces the space needed to store it in the trace buffer provided as output. As in that patent, blocks of instructions are traced to analyze their execution, but with a more flexible way of triggering the trace and with a trace encoding that saves on the information stored and on the number of pins used. This patent therefore does not avoid the overhead of code instrumentation techniques, which affects the whole system, as the claimed invention does; instead, it aims to optimize a tracing mechanism based not on instrumentation but on the detection of events that meet predefined conditions.
It is therefore concluded that existing instruction tracing systems allow specific trace trigger events to be programmed in order to collect trace information limited to a certain interval before and/or after the event. These methods suffer from a certain rigidity, in that the number of blocks that can be traced in each execution is always limited, and they do not adapt well to the code instrumentation techniques used to characterize worst-case execution time in critical systems, such as the one used by the aforementioned RapiTime tool. Since applying code instrumentation on current processors introduces execution-time overhead, the invention presented, which is intended to eliminate this overhead, provides an improvement with a specific objective within this field.
DESCRIPTION OF THE INVENTION
In a first aspect of the invention, a device for parallel processing of program instructions and trace instructions is disclosed. The device comprises:
• an instruction fetch stage, which in turn comprises:
o an instruction address calculation module; and
o an instruction fetch module with a dual read port;
• a duplicated decoding stage;
• a trace pipeline for processing trace instructions only;
• an output register for the trace;
• a data path, which in turn comprises a set of multiplexers;
• a data path controller, which in turn comprises inputs and outputs that control the multiplexers, the loading of the registers associated with the different stages, and the output register for the trace;
where the data path controller is configured to determine, depending on its state and on the value of its inputs, the value of the outputs sent to the multiplexers of the data path, such that a trace instruction is executed in synchronization with the instruction that precedes it, said execution taking effect during the last stage of the trace pipeline.
In one embodiment of the invention, the controller comprises the following instruction sequences S1, S2 and S3:
S1: corresponds to instruction-trace pairs in which the instructions to be traced are always loaded into the "INSTRUCCIÓN N" element, while the corresponding trace instructions are loaded into the "INSTRUCCIÓN N+1" element;
S2: corresponds to a sequence of instructions that are not traced, so that the "INSTRUCCIÓN N" and "INSTRUCCIÓN N+1" elements are always loaded with ordinary instructions (and not trace instructions);
S3: corresponds to two instruction-trace pairs in which the trace instructions are loaded in successive cycles into the "INSTRUCCIÓN N" element, while the instructions to be traced are loaded in those same cycles into the "INSTRUCCIÓN N+1" element.
The processing device executes the sequence S1 over two clock cycles, T and T+1, such that the instructions stored at addresses X+1, X+3 and X+5 are the trace instructions of the instructions that precede them, located at addresses X, X+2 and X+4, respectively.
The processing device executes the sequence S2, in which no trace instructions are loaded for two cycles. During the first cycle, T, it is detected that neither of the two instructions loaded in the decoding stage is a trace instruction, so the signals "N_ES_TRAZA" and "N_1_ES_TRAZA" are both "0". In cycle T+1 the controller is in the state called "INSTR PENDIENTE", in which the pending instruction held in the second decoding unit is directed to stage 3 of the processor's instruction pipeline.
The processing device executes the sequence S3 as follows. In cycle T the value of the signal "N_ES_TRAZA" is "1", while "N_1_ES_TRAZA" is "0", and the value of the multiplexing signal "SEL_TR_P4" is "2", which enables a route on which the trace instruction is synchronized with the execution of the instruction to be traced; in cycle T+1, the trace instruction is placed in stage 4 of the trace pipeline. In cycle T, in addition, the multiplexing signal "SEL_PIPE_TRAZA" takes the value 0, so that in cycle T+1 stage 3 of the trace pipeline holds a zero.
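As a rough illustration of how the three sequences differ at the decoding stage, the following sketch classifies the pair of decoded instructions from the two flags N_ES_TRAZA and N_1_ES_TRAZA. The function name `classify_pair` and the returned labels are hypothetical, and the case in which both flags are set is not covered by the description, so the sketch reports it as unspecified rather than guessing.

```python
def classify_pair(n_es_traza: int, n_1_es_traza: int) -> str:
    """Classify the two decoded instructions (hypothetical helper).

    n_es_traza   -- 1 if the instruction in "INSTRUCCIÓN N" is a trace instruction
    n_1_es_traza -- 1 if the instruction in "INSTRUCCIÓN N+1" is a trace instruction
    """
    if n_es_traza == 0 and n_1_es_traza == 1:
        return "S1"  # ordinary instruction in N, trace instruction in N+1
    if n_es_traza == 0 and n_1_es_traza == 0:
        return "S2"  # two ordinary instructions, nothing to trace
    if n_es_traza == 1 and n_1_es_traza == 0:
        return "S3"  # trace instruction in N, ordinary instruction in N+1
    return "unspecified"  # both flags set: not described in the text


if __name__ == "__main__":
    for flags in [(0, 1), (0, 0), (1, 0)]:
        print(flags, classify_pair(*flags))
```

The real controller is a state machine whose outputs also depend on its current state and on the ESPERA and BURBUJA inputs; this sketch only shows the combinational part suggested by the flag values quoted above.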
In one embodiment of the invention, the processing device detects, during a cycle "T", a bubble in stage 3 of the instruction pipeline. The controller then sets the route that loads a "0" into stages 3 and 4 of the trace pipeline, and a jump address "Z" is routed to the input register of the fetch stage. The detection of the bubble in stage 3 corresponds to the signals "BURBUJA_P3" and "BURBUJA" being set to "1". The controller controls: the route to stage 3 by assigning "0" to the signal "SEL_PIPE_TRAZA"; the route to stage 4 by assigning "0" to the signal "SEL_TR_P4"; and the loading of the input register of the fetch stage by activating the signal "LD_DIR" and routing the address "Z" to that register by assigning "0" to the signal "SEL_DIR".
In one embodiment of the invention, the processing device detects, during a cycle "T", a bubble in stage 4 of the instruction pipeline. The controller then sets the route that loads a "0" into stages 3, 4 and 5 of the trace pipeline, and the jump address "Z" is routed to the input register of the fetch stage. The detection of the bubble in stage 4 corresponds to the signals "BURBUJA_P4" and "BURBUJA" being set to "1". The controller controls: the route to stage 3 of the trace pipeline by assigning "0" to the signal "SEL_PIPE_TRAZA"; the route to stage 4 by assigning "0" to the signal "SEL_TR_P4"; the route to stage 5 by assigning "0" to the signal "SEL_TR_P5"; and the loading of the input register of the fetch stage by activating the signal "LD_DIR" and routing the address "Z" to that register by assigning "0" to the signal "SEL_DIR".
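The two bubble cases can be summarized in a small sketch that returns the controller outputs named in the text. The function name `bubble_outputs` is hypothetical, "activating" LD_DIR is represented as the value 1 (an assumption about polarity), and the case in which both bubble inputs are set at once is not described, so the sketch simply applies the stage 4 handling to it.

```python
def bubble_outputs(burbuja_p3: int, burbuja_p4: int) -> dict:
    """Return the controller outputs forced by a bubble (illustrative only)."""
    # BURBUJA is the OR of the two per-stage bubble signals.
    burbuja = burbuja_p3 | burbuja_p4
    if not burbuja:
        return {}  # no bubble: outputs are decided by the normal state machine

    outputs = {
        "SEL_PIPE_TRAZA": 0,  # load a 0 into stage 3 of the trace pipeline
        "SEL_TR_P4": 0,       # load a 0 into stage 4 of the trace pipeline
        "LD_DIR": 1,          # load the fetch-stage input register (assumed active high)
        "SEL_DIR": 0,         # route the jump address Z to that register
    }
    if burbuja_p4:
        outputs["SEL_TR_P5"] = 0  # a stage 4 bubble also clears stage 5
    return outputs


if __name__ == "__main__":
    print(bubble_outputs(1, 0))
    print(bubble_outputs(0, 1))
```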
In a second aspect of the invention, a RISC ("Reduced Instruction Set Computer") processor is disclosed, which comprises a device for parallel processing of program instructions and trace instructions according to any one of the above embodiments of the first aspect of the invention.
In a third aspect of the invention, a method for parallel processing of program instructions and trace instructions is disclosed which, executed on a device for parallel processing of program instructions and trace instructions as defined in any of the embodiments of the first aspect of the invention, processes an instruction and a trace instruction in parallel.
BRIEF DESCRIPTION OF THE FIGURES
To complement the description of the invention and to aid a better understanding of its features, a set of drawings is attached as an integral part of the description, in which the following is represented by way of illustration and not limitation:
Figure 1. Structure of the device for synchronization and parallel execution proposed in the invention.
Figure 2. Mealy machine of the data path controller.
Figure 3. Evolution of the data path for the instruction sequence S1.
Figure 4. Evolution of the data path for the instruction sequence S2.
Figure 5. Evolution of the data path for the instruction sequence S3.
Figure 6. Evolution of the data path when there is a bubble in stage 3 of the pipeline.
Figure 7. Evolution of the data path when there is a bubble in stage 4 of the pipeline.
Figure 8. Evolution of the data path after a bubble in the previous cycle.
Figure 9. State transition table of the data path controller.
Figure 10. Table of the data path controller outputs relating to the fetch stage.
Figure 11. Table of the data path controller outputs relating to the decoding stage.
Figure 12. Table of the data path controller outputs relating to stages 3, 4 and 5 of the trace instruction pipeline.
Figure 1 labels the elements of the synchronization and parallel execution device proposed in the invention. These elements are as follows:
100 Stage 1: instruction fetch
101 Module for selecting the address of the next instruction
102 Instruction fetch module with dual read port
103 Stage 2: duplicated decoding
104 Data path controller of the device
105 Inputs to the data path controller
106 Outputs of the data path controller
107 "INSTRUCCIÓN N" decoding module of stage 2
108 "INSTRUCCIÓN N+1" decoding module of stage 2
109 Multiplexer selecting the input of stage 3 of the RISC processor's instruction pipeline
110 Multiplexer selecting the input of stage 3 of the trace instruction pipeline
112 Instruction pipeline of the RISC processor
113 Trace instruction pipeline
114 Stage 3 of the RISC processor's instruction pipeline
115 Stage 3 of the trace instruction pipeline
116 Multiplexer selecting the input of stage 4 of the trace instruction pipeline
117 Stage 4 of the RISC processor's instruction pipeline
118 Stage 4 of the trace instruction pipeline
119 Multiplexer selecting the input of stage 5 of the trace instruction pipeline
120 Stage 5 of the RISC processor's instruction pipeline
121 Stage 5 of the trace instruction pipeline
122 Input to the data path controller that monitors bubble detection
123 Output register for the trace information
124 ESPERA: Input to the data path controller that monitors waiting in the instruction fetch
125 BURBUJA_P3: Input signal to the controller that monitors the detection of a bubble in stage 3 of the RISC processor's instruction pipeline
126 BURBUJA_P4: Input signal to the controller that monitors the detection of a bubble in stage 4 of the RISC processor's instruction pipeline
127 LD_N: Output of the data path controller that controls the loading of the "INSTRUCCIÓN N" decoding module of stage 2
128 LD_N_1: Output of the data path controller that controls the loading of the "INSTRUCCIÓN N+1" decoding module of stage 2
129 N_ES_TRAZA: Input signal to the controller that monitors whether the instruction decoded in the "INSTRUCCIÓN N" element is a trace instruction
130 N_1_ES_TRAZA: Input signal to the controller that monitors whether the instruction decoded in the "INSTRUCCIÓN N+1" element is a trace instruction
131 SEL_PIPE_INSTR: Output of the data path controller that controls the input multiplexer of stage 3 of the RISC processor's instruction pipeline
132 SEL_PIPE_TRAZA: Output of the data path controller that controls the input multiplexer of stage 3 of the trace instruction pipeline
133 SEL_TR_P4: Output of the data path controller that controls the input multiplexer of stage 4 of the trace instruction pipeline
134 SEL_TR_P5: Output of the data path controller that controls the input multiplexer of stage 5 of the trace instruction pipeline
135 TR_P5_ES_CERO: Signal that monitors whether stage 5 of the trace instruction pipeline holds a zero value
136 LD_TR_OUT: Signal that controls storage of the trace information in the output register. It takes the complementary value of the TR_P5_ES_CERO signal, so the register is loaded only when the trace information is nonzero.
137 LD_DIR: Signal that controls storage, in the input register of the fetch stage, of the address of the next instruction to be fetched.
138 SEL_DIR: Output of the data path controller that controls the input multiplexer of the fetch-stage input register that stores the address of the next instruction to be fetched.
139 Input register of the fetch stage, which stores the address of the next instruction to be fetched.
140 Input multiplexer of the fetch-stage input register that stores the address of the next instruction to be fetched.
DESCRIPTION OF AN EMBODIMENT OF THE INVENTION
The invention consists of a device with an internal processing structure that eliminates the execution-time overhead introduced by the code instrumentation used to measure worst-case execution time by means of hybrid analysis. This analysis combines static analysis of the code with execution-time measurements on the deployment platform. The static analysis determines which instructions need to be traced and, by means of instrumentation techniques, adds tracing code after each instruction to be traced, so that the instant at which that instruction executes can be captured using support hardware and a logic analyzer. The code added after the instruction to be traced makes it possible to identify unambiguously the moment at which that instruction executes, but it introduces an overhead that this invention can eliminate. The device is able to detect trace instructions and execute them in parallel, in a synchronized manner and conditional on the complete execution of the preceding instruction. The tracing process is thus non-intrusive with respect to execution time, since neither the sequence nor the timing of the program under analysis is modified by the introduction of the traces, which are executed in parallel.
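The instrumentation step described above can be pictured with a minimal sketch that inserts a trace instruction immediately after each instruction selected for tracing. The function name `instrument`, the list-of-strings program representation and the "TRACE <id>" mnemonic are all hypothetical; the document does not specify the encoding of the trace instruction.

```python
def instrument(program: list[str], to_trace: set[int]) -> list[str]:
    """Insert a trace instruction after every selected instruction.

    program  -- instructions in program order (illustrative strings)
    to_trace -- indices, in the original program, of the instructions to trace
    """
    instrumented = []
    for index, instruction in enumerate(program):
        instrumented.append(instruction)
        if index in to_trace:
            # Hypothetical mnemonic: the trace instruction follows the
            # instruction it traces and carries an identifier for it.
            instrumented.append(f"TRACE {index}")
    return instrumented


if __name__ == "__main__":
    original = ["add", "load", "store"]
    print(instrument(original, {0, 2}))
```

On a conventional processor each inserted trace instruction would occupy its own execution slot, which is the overhead the device aims to remove by executing it in parallel with the instruction it follows.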
The device proposed in the invention uses a specific instruction code, which is used for the instrumentation and which its internal structure interprets as enabling the trace of the preceding instruction. The main elements of this device, shown in Figure 1, are: 1) an instruction fetch stage (100), which has an instruction address calculation module (101) and an instruction fetch module with a dual read port (102); 2) a duplicated decoding stage (103); 3) a dedicated pipeline for trace instructions (113); 4) an output register for the trace (123); 5) the data path, formed by a set of multiplexers (109, 110, 116 and 119); 6) the data path controller of the device (104), which determines, depending on its state and the value of its inputs (105), the value of the outputs (106) that control the multiplexers (109, 110, 116 and 119), the loading of the registers associated with the different stages (107, 108, 114, 115, 117, 118, 120 and 121), and the output register (123). Both the inputs (105) and the outputs (106) are shown in Figure 1 next to the label assigned to each signal.
El dispositivo, gracias a la etapa de búsqueda con doble puerto (100), permite cargar dos instrucciones simultáneamente a la etapa de decodificación (103) para que sean decodificadas en paralelo. La señal “ESPERA” (124) de esta etapa se utiliza para modelar posibles estados de espera en dicha búsqueda, y puede activarse tras un reset del procesador, o como consecuencia de hacerse efectivo un salto, lo que provoca la inyección de burbujas en el pipeline de instrucciones del procesador (112), monitorizadas por las señales “BURBUJA_P3” (inyección de burbuja en la etapa 3, 125), 'BURBUJA_P4” (inyección de burbuja en la etapa 4, 126), y la función OR de éstas, etiquetada como “BURBUJA” (122). La señal “ESPERA” (124) se desactivará para notificar a la etapa de decodificación (103) que las instrucciones están disponibles para ser cargadas. The device, thanks to the double port search stage (100), allows two instructions to be loaded simultaneously to the decoding stage (103) to be decoded in parallel. The “WAIT” signal (124) of this stage is used to model possible waiting states in said search, and can be activated after a reset of the processor, or as a result of a jump, which causes the injection of bubbles in the processor instruction pipeline (112), monitored by the signals “BURBUJA_P3” (bubble injection in step 3, 125), 'BURBUJA_P4 ”(bubble injection in stage 4, 126), and their OR function, labeled "BUBBLE" (122). The “WAIT” signal (124) will be deactivated to Notify the decoding stage (103) that the instructions are available for loading.
The two instructions in the decoding stage always correspond to instructions stored in consecutive memory words. In Figure 1, the element labeled "INSTRUCCIÓN N" (107) receives the first of the two instructions, while the one labeled "INSTRUCCIÓN N+1" (108) receives the next. The values N and N+1 do not denote consecutive physical memory addresses; they represent two instructions stored in consecutive memory words, regardless of the processor's word size in bytes, which in the most general case of a 32-bit RISC processor would be 4.
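The distinction between word positions and byte addresses can be sketched as follows. The base address, the helper names and the example values are hypothetical; only the word size of 4 bytes for a 32-bit RISC processor comes from the text.

```python
WORD_SIZE_BYTES = 4  # 32-bit RISC case mentioned in the text


def byte_address(base: int, word_index: int, word_size: int = WORD_SIZE_BYTES) -> int:
    """Byte address of the word at position word_index, counted from base."""
    return base + word_index * word_size


def fetched_pair(base: int, n: int):
    """Byte addresses of the two consecutive words N and N+1."""
    return byte_address(base, n), byte_address(base, n + 1)


# Words N=0 and N+1=1 are consecutive words, but their byte addresses differ by 4.
print([hex(a) for a in fetched_pair(0x1000, 0)])  # ['0x1000', '0x1004']
```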
In the decoding stage (103), as a result of decoding each instruction, the device determines whether each instruction is a trace instruction or belongs to the rest of the instruction set, computing for this purpose the signals labeled in Figure 1 as "N_ES_TRAZA" (127) and "N_1_ES_TRAZA" (128).
The data path controller (104) uses the values of these signals and of the "ESPERA" (124) and "BURBUJA" (122) signals, together with the controller's own state, to determine the route that the instructions will follow to the subsequent stages. The controller configures the multiplexers (109, 110, 116 and 119) of the data path to ensure that a trace instruction is executed in synchronization with the instruction that precedes it. This execution takes effect during the last stage (121) of the trace pipeline (113), where the signal "TR_P5_ES_CERO" (135) is checked: if it is deactivated, the signal "LD_TR_OUT" (136) is activated and the trace value is directed to the output register (123).
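The behavior of the last trace stage can be sketched in Python as below. This is a simplified model under two assumptions that the text does not state explicitly: that "TR_P5_ES_CERO" is active exactly when the value held in stage 5 of the trace pipeline is zero, and that the output register keeps its previous value when "LD_TR_OUT" is not active. The function and variable names are hypothetical.

```python
from typing import Tuple


def trace_output_stage(tr_p5: int, tr_out: int) -> Tuple[int, int]:
    """Return (LD_TR_OUT, new output register value) for one clock cycle.

    tr_p5  -- value held in the last stage (121) of the trace pipeline
    tr_out -- current content of the output register (123)
    """
    # Assumed meaning of TR_P5_ES_CERO (135): stage 5 holds the value zero.
    tr_p5_es_cero = 1 if tr_p5 == 0 else 0
    # LD_TR_OUT (136) is activated when TR_P5_ES_CERO is deactivated.
    ld_tr_out = 0 if tr_p5_es_cero else 1
    # Assumed: the register holds its old value when it is not loaded.
    new_tr_out = tr_p5 if ld_tr_out else tr_out
    return ld_tr_out, new_tr_out


print(trace_output_stage(0, 7))     # no trace in stage 5: register unchanged
print(trace_output_stage(0x2A, 7))  # trace present: register loaded
```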
Figure 2 shows the Mealy machine of the data path controller of this device (200), which is formally specified in the tables of Figures 9, 10, 11 and 12.
Figures 3, 4 and 5 are, respectively, examples of how the controller sets the route to achieve synchronization for the following possible instruction sequences S1, S2 and S3:
• Sequence S1 corresponds to Instruction-Trace pairs (301-302, 303-304 and 305-306) in which the instructions to be traced (301, 303 and 305) are always loaded into the "INSTRUCCIÓN N" element (107), while the corresponding trace instructions (302, 304 and 306) are loaded into the "INSTRUCCIÓN N+1" element (115).
• Sequence S2 corresponds to a sequence of instructions (401, 402, 403 and 404) that are not traced, so that the "INSTRUCCIÓN N" (114) and "INSTRUCCIÓN N+1" (115) elements are always loaded with instructions and never with traces.
• Sequence S3 corresponds to two Instruction-Trace pairs (502-503 and 504-505) in which the trace instructions (503 and 505) are loaded in successive cycles (500 and 510) into the "INSTRUCCIÓN N +1" element (107), while the instructions to be traced (502 and 504) are loaded in those same cycles into the "INSTRUCCIÓN N+1" element (108).
Figure 3 shows the operation of the device during two clock cycles, T (300) and T+1 (307), in which the processor executes instruction sequence S1. In sequence S1, the instructions stored at addresses X+1 (302), X+3 (304) and X+5 (306) are the trace instructions of the instructions that precede them, located, respectively, at addresses X (301), X+2 (303) and X+4 (305). For both cycles, T (300) and T+1 (307), the diagram shows how the trace instructions (302, 304, 306), added as a result of instrumentation, are directed to the stages (115 and 118) that belong to the trace instruction pipeline, while the instructions to be traced (301, 303 and 305) are directed to the stages (114 and 117) that belong to the processor instruction pipeline. This produces a synchronized execution of the traced instruction and its trace instruction, and it avoids the run-time overhead of inserting trace instructions into a program, since they are executed in parallel.
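The routing for sequence S1 can be illustrated with a small Python model. It is a sketch only: the string labels, the dictionary keys and the function name are hypothetical, and the model covers just the step from the decoding stage to stage 3 of each pipeline.

```python
def run_s1(words):
    """Route an S1 sequence (program instruction, trace instruction, ...) pair by pair.

    Each fetched pair occupies the two decode elements (107 and 108). The program
    instruction is routed to stage 3 of the instruction pipeline (114) and its
    trace instruction to stage 3 of the trace pipeline (115), in the same cycle.
    """
    if len(words) % 2 != 0:
        raise ValueError("S1 is made of complete Instruction-Trace pairs")
    cycles = []
    for i in range(0, len(words), 2):
        instr_n, instr_n1 = words[i], words[i + 1]
        cycles.append({"instr_stage3": instr_n, "trace_stage3": instr_n1})
    return cycles


s1 = ["instr@X", "trace@X+1", "instr@X+2", "trace@X+3", "instr@X+4", "trace@X+5"]
for t, routed in enumerate(run_s1(s1)):
    print(f"pair {t}: {routed}")
```

Six memory words are consumed in three pairs, which is the sense in which the trace instructions add no extra issue cycles in this model.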
Figure 4 shows the operation of the device for sequence S2, in which no trace instructions are loaded for two cycles. In this case, during the first cycle T (400) it is detected that the two instructions loaded in the decoding stage (403 and 404) are not trace instructions, and therefore the signals "N_ES_TRAZA" (129) and "N_1_ES_TRAZA" (130) are both 0. As a result, the decoding stage (103) does not load two new instructions at the start of cycle T+1 (408), since in cycle T (400) the values of "LD_N" (127) and "LD_N_1" (128), which control this load, are both 0. In cycle T+1 (408) the controller is in the state named "INSTR PENDIENTE" (202), in which the pending instruction (404), located in the second decoding unit (108), is directed to stage 3 of the processor instruction pipeline (114). In both cycles, T (400) and T+1 (408), the controller loads stage 3 of the trace pipeline (115) with the value 0, so the instructions are not traced.
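The two cycles of sequence S2 can be summarized with the following Python sketch. It records only values given in the description, plus one inference that is labeled in the code: that the first instruction of the pair is the one routed to stage 3 during cycle T. The dictionary keys and the function name are hypothetical, and the full state machine of Figures 9 to 12 is not reproduced.

```python
def s2_two_cycles(instr_n, instr_n1):
    """Describe cycles T and T+1 when both decoded instructions are non-trace."""
    cycle_t = {
        "N_ES_TRAZA": 0,
        "N_1_ES_TRAZA": 0,
        "LD_N": 0,      # decode stage is not reloaded at the start of T+1
        "LD_N_1": 0,
        # Inferred for this sketch: the first instruction is issued in cycle T.
        "routed_to_instr_stage3": instr_n,
        "routed_to_trace_stage3": 0,  # nothing to trace
    }
    cycle_t1 = {
        "state": "INSTR PENDIENTE",
        # The pending instruction held in the second decode element (108).
        "routed_to_instr_stage3": instr_n1,
        "routed_to_trace_stage3": 0,
    }
    return [cycle_t, cycle_t1]


for name, cycle in zip(("T", "T+1"), s2_two_cycles("instr403", "instr404")):
    print(name, cycle)
```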
Figure 5 shows the operation of the device for sequence S3, which covers the case in which, in cycle T (500), the instruction to be traced (502) is in stage 3 of the instruction pipeline (114), while the trace instruction (503) is loaded in the "INSTRUCCIÓN N" element (107) of the decoding stage (103). In this sequence, the "INSTRUCCIÓN N + 1" element (108) of the decoding stage (103) also contains, in that same cycle, the next instruction (504) to be executed. Accordingly, in cycle T (500) the value of the "N_ES_TRAZA" signal (129) is 1, while "N_1_ES_TRAZA" (130) is 0, and the value of the multiplexing signal "SEL_TR_P4" (133) is 2, which enables a route through which the trace instruction (503) is synchronized with the execution of the instruction to be traced (502). The synchronization takes effect in cycle T+1 (510), when the trace instruction (503) is placed in stage 4 of the trace pipeline (118). In cycle T (500), moreover, the multiplexing signal "SEL_PIPE_TRAZA" (132) takes the value 0, so that in cycle T+1 (510) a zero (507) is found in stage 3 of the trace pipeline (115).
Sequence S3 also means that, during cycle T (500), the next instruction (504), which is stored in the "INSTRUCCIÓN N + 1" element (108), has its route to stage 3 of the instruction pipeline (114) enabled. To enable this route, the "SEL_PIPE_INSTR" signal (131) takes the value 1 during cycle T (500).
In cycle T+1 (510), the device repeats the same data path configuration as in cycle T (500), since the sequence again places a trace instruction (505) in the "INSTRUCCIÓN N" element (107) of the decoding stage (103) and the next instruction to be executed (506) in the "INSTRUCCIÓN N + 1" element (108).
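The multiplexer settings described for sequence S3 can be sketched in Python as follows. Only the selector values named in the description are modeled (SEL_TR_P4 = 2, SEL_PIPE_TRAZA = 0, SEL_PIPE_INSTR = 1); other encodings are left out because they are defined in figures not reproduced here. Function names and dictionary keys are hypothetical.

```python
def s3_control(n_es_traza: int, n_1_es_traza: int) -> dict:
    """Selector values for the S3 decode pattern (trace in N, instruction in N+1)."""
    if (n_es_traza, n_1_es_traza) != (1, 0):
        raise ValueError("only the S3 decode pattern is modeled in this sketch")
    return {"SEL_TR_P4": 2, "SEL_PIPE_TRAZA": 0, "SEL_PIPE_INSTR": 1}


def s3_next_cycle(ctrl: dict, instr_n, instr_n1) -> dict:
    """Register contents at cycle T+1 resulting from the selectors set in cycle T."""
    nxt = {}
    if ctrl["SEL_TR_P4"] == 2:
        nxt["trace_stage4"] = instr_n    # trace joins the instruction it traces
    if ctrl["SEL_PIPE_TRAZA"] == 0:
        nxt["trace_stage3"] = 0          # a zero enters stage 3 of the trace pipeline
    if ctrl["SEL_PIPE_INSTR"] == 1:
        nxt["instr_stage3"] = instr_n1   # next program instruction advances
    return nxt


ctrl = s3_control(n_es_traza=1, n_1_es_traza=0)
print(s3_next_cycle(ctrl, "trace503", "instr504"))
```

In this model the trace instruction skips stage 3 of the trace pipeline and lands directly in stage 4, which is where the instruction it traces will be one cycle later.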
Finally, Figures 6, 7 and 8 describe the operation of the device when bubbles are detected. Bubbles are inserted into the instruction pipeline of a RISC processor whenever the sequential execution of instructions is interrupted, as happens with jump instructions, both conditional and unconditional, and with function calls and returns. When an instruction breaks the sequential order of execution, the processor must discard the instructions that follow it and begin fetching the instruction whose address was determined by executing the instruction that caused the break in sequence. In Figures 6 and 7 this address is labeled as address "Z" (600). Figure 6 explains the data path for a bubble in stage 3 of the instruction pipeline (114), while Figure 7 corresponds to a bubble detected in stage 4 (117), which also causes a bubble in stage 3 (114). Figure 8 explains how the route evolves in the cycles following the detection of a bubble, until the instruction at the jump address (600) is supplied by the fetch stage (100).
Figure 6 shows how, during cycle T, a bubble is detected only in stage 3 of the instruction pipeline (114), and how the controller sets the route that loads a 0 into stages 3 (115) and 4 (118) of the trace pipeline (113), while the jump address "Z" (600) is routed to the input register of the fetch stage (139). Detection of the bubble in stage 3 corresponds to the "BURBUJA_P3" signal (125), and consequently the "BURBUJA" signal (122), being set to 1. The route to stage 3 (115) is controlled by assigning 0 to the "SEL_PIPE_TRAZA" signal (132), and the route to stage 4 (118) is controlled by assigning 0 to the "SEL_TR_P4" signal (133). The loading of the input register of the fetch stage (139) is controlled by activating the "LD_DIR" signal (137), and the address "Z" (600) is routed to that register (139) by assigning 0 to the "SEL_DIR" signal (138).
Figure 7 shows how, during cycle T, a bubble is detected in stage 4 of the instruction pipeline (117), and how the controller sets the route that loads a 0 into stages 3 (115), 4 (118) and 5 (121) of the trace pipeline (113), while the jump address "Z" (600) is routed to the input register of the fetch stage (139). Detection of the bubble in stage 4 corresponds to the "BURBUJA_P4" signal (126), and consequently the "BURBUJA" signal (122), being set to 1. The route to stage 3 of the trace pipeline (115) is controlled by assigning 0 to the "SEL_PIPE_TRAZA" signal (132), the route to stage 4 (118) is controlled by assigning 0 to the "SEL_TR_P4" signal (133), and the route to stage 5 (121) is controlled by assigning 0 to the "SEL_TR_P5" signal (134). The loading of the input register of the fetch stage (139) is controlled by activating the "LD_DIR" signal (137), and the address "Z" (600) is routed to that register (139) by assigning 0 to the "SEL_DIR" signal (138).
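The controller outputs for the two bubble cases can be collected into one Python sketch. The function name, the 0/1 encoding of "activated", and the empty result for the no-bubble case are assumptions of this example; the selector values themselves are the ones given for Figures 6 and 7.

```python
def bubble_outputs(burbuja_p3: int, burbuja_p4: int) -> dict:
    """Controller outputs that flush the trace pipeline and redirect the fetch.

    Models the case of Figure 6 (bubble only in stage 3) and the case of
    Figure 7 (bubble detected in stage 4). Returns an empty dict when no
    bubble is signaled, since that case is outside this sketch.
    """
    if not (burbuja_p3 or burbuja_p4):
        return {}
    outputs = {
        "SEL_PIPE_TRAZA": 0,  # load 0 into stage 3 of the trace pipeline (115)
        "SEL_TR_P4": 0,       # load 0 into stage 4 of the trace pipeline (118)
        "LD_DIR": 1,          # load the fetch stage input register (139)
        "SEL_DIR": 0,         # select the jump address "Z" (600)
    }
    if burbuja_p4:
        outputs["SEL_TR_P5"] = 0  # also load 0 into stage 5 (121)
    return outputs


print(bubble_outputs(1, 0))  # Figure 6 case
print(bubble_outputs(0, 1))  # Figure 7 case
```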
Figure 8 shows the two-cycle wait (801 and 802) that follows either of the two bubbles described in Figures 6 and 7. In cycle T+2 (802), the fetch stage (100) deactivates the "ESPERA" signal (124), indicating that the instructions are available to be loaded into the decoding stage (103) in the next cycle, and the "LD_N" (127) and "LD_N_1" (128) signals are activated.
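The recovery after a bubble can be written out as a short timeline. This Python sketch assumes that "ESPERA" stays active, and the two load signals stay inactive, in every wait cycle before the last one; the description only states the values for cycle T+2. The function name and parameter are hypothetical.

```python
def recovery_timeline(wait_cycles: int = 2):
    """Signal values in the wait cycles that follow a bubble detected in cycle T."""
    if wait_cycles < 1:
        raise ValueError("at least one wait cycle is required")
    timeline = []
    for k in range(1, wait_cycles + 1):
        last = (k == wait_cycles)
        timeline.append({
            "cycle": f"T+{k}",
            # ESPERA (124) is deactivated in the last wait cycle.
            "ESPERA": 0 if last else 1,
            # LD_N (127) and LD_N_1 (128) are activated so that the decode
            # stage loads the new pair in the following cycle.
            "LD_N": 1 if last else 0,
            "LD_N_1": 1 if last else 0,
        })
    return timeline


for entry in recovery_timeline():
    print(entry)
```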
The set of cases presented in Figures 3, 4, 5, 6, 7 and 8 describes how the device proposed in the invention behaves for the different instruction sequences and when bubbles appear. In all cases it can be verified that the device synchronizes the execution of the trace instructions with the traced instructions and also avoids run-time overhead, since the trace instructions are always executed in parallel with the instructions to be traced.
The embodiment of the invention is based on the structural specification of Figure 1 and on the operation defined by the state transition diagram of Figure 2 and the tables in Figures 9, 10, 11 and 12. Figures 3, 4, 5 and 6 complete the details that facilitate implementation.
The preferred physical embodiment consists of a hardware/firmware implementation of the described functionality, starting from a description model of a standard processor architecture to which the aforementioned modifications are applied; these modifications basically affect the design of the pipeline. Such architecture description models make it possible to generate the manufacturing details of the device, which can be realized on a programmable device such as an FPGA (Field Programmable Gate Array) or on an Application Specific Integrated Circuit (ASIC).
There are different embodiment options. All of them start from the VHDL model of an "IP Core" of a pipelined RISC processor, such as ARM or LEON, in which the implementation of the pipeline structure is modified to include the functionality described in this patent. The objective is to generate a new "IP Core", which can be manufactured on an FPGA or an ASIC.

Claims

1. A device for parallel processing of program instructions and trace instructions, characterized in that it comprises:
• an instruction fetch stage (100), which in turn comprises:
o an instruction address calculation module (101); and
o an instruction fetch module with a dual read port (102);
• a duplicated decoding stage (103);
• a trace pipeline (113) for processing trace instructions only;
• an output register (123) for the trace;
• a data path, which in turn comprises a set of multiplexers (109, 110, 116, 119);
• a data path controller (104), which in turn comprises inputs (105) and outputs (106) that control the multiplexers (109, 110, 116, 119), the loading of registers associated with the different stages (107, 108, 114, 115, 117, 118, 120, 121) and the output register (123) for the trace;
wherein the data path controller (104) is configured to determine, as a function of the state of said controller (104) and of the value of the inputs (105) to said controller, the value of the outputs (106) that are sent to the multiplexers (109, 110, 116 and 119) of the data path, such that a trace instruction is executed in synchronization with the instruction that precedes it, said execution taking effect during the last stage (121) of the trace pipeline (113).
2. A device for parallel processing of program instructions and trace instructions according to claim 1, characterized in that the controller comprises the following instruction sequences:
• S1: corresponds to Instruction-Trace pairs (301-302, 303-304 and 305-306) in which the instructions to be traced (301, 303 and 305) are always loaded into the "INSTRUCCIÓN N" element (107), while the corresponding trace instructions (302, 304 and 306) are loaded into the "INSTRUCCIÓN N+1" element (115);
• S2: corresponds to a sequence of instructions (401, 402, 403 and 404) that are not traced, so that the "INSTRUCCIÓN N" (114) and "INSTRUCCIÓN N+1" (115) elements are always loaded with instructions;
• S3: corresponds to two Instruction-Trace pairs (502-503 and 504-505) in which the trace instructions (503 and 505) are loaded in successive cycles (500 and 510) into the "INSTRUCCIÓN N +1" element (107), while the instructions to be traced (502 and 504) are loaded in those same cycles into the "INSTRUCCIÓN N+1" element (108).
3. A device for parallel processing of program instructions and trace instructions according to claim 2, characterized in that the processing device executes sequence S1 during two clock cycles, T (300) and T+1 (307), such that the instructions stored at addresses X+1 (302), X+3 (304) and X+5 (306) are the trace instructions of the instructions that precede them, located, respectively, at addresses X (301), X+2 (303) and X+4 (305).
4. A device for parallel processing of program instructions and trace instructions according to claim 2, characterized in that the processing device executes sequence S2, in which no trace instructions are loaded for two cycles, such that during the first cycle T (400) it is detected that the two instructions loaded in the decoding stage (403 and 404) are not trace instructions, and therefore the signals "N_ES_TRAZA" (129) and "N_1_ES_TRAZA" (130) are both "0"; and, in cycle T+1 (408), the controller is in the state named "INSTR PENDIENTE" (202), in which the pending instruction (404), located in the second decoding unit (108), is directed to "stage 3" of the processor instruction pipeline (114).
5. A device for parallel processing of program instructions and trace instructions according to claim 2, characterized in that the processing device executes sequence S3: in cycle T (500) the value of the "N_ES_TRAZA" signal (129) is "1", while "N_1_ES_TRAZA" (130) is "0", and the value of the multiplexing signal "SEL_TR_P4" (133) is "2", such that a route is enabled through which the trace instruction (503) is synchronized with the execution of the instruction to be traced (502); in cycle T+1 (510), the trace instruction (503) is placed in stage 4 of the trace pipeline (118). In cycle T (500), moreover, the multiplexing signal "SEL_PIPE_TRAZA" (132) takes the value 0, so that in cycle T+1 (510) a zero (507) is found in stage 3 of the trace pipeline (115).
6. A device for parallel processing of program instructions and trace instructions according to claim 1, characterized in that the processing device, during a cycle "T", detects a bubble in stage 3 of the instruction pipeline (114), such that the controller sets the route that loads a "0" into stages 3 (115) and 4 (118) of the trace pipeline (113), and a jump address "Z" (600) is routed to the input register of the fetch stage (139), the detection of the bubble in stage 3 corresponding to the setting to "1" of the signal "BURBUJA_P3" (125) and of the signal "BURBUJA" (122).
7. A device for parallel processing of program instructions and trace instructions according to claim 6, characterized in that the controller controls: the route to stage 3 (115) by assigning "0" to the signal "SEL_PIPE_TRAZA" (132); the route to stage 4 (118) by assigning "0" to the signal "SEL_TR_P4" (133); and the loading of the input register of the fetch stage (139) by activating the signal "LD_DIR" (137) and routing the address "Z" (600) to said register (139) by assigning "0" to the signal "SEL_DIR" (138).
8. A device for parallel processing of program instructions and trace instructions according to claim 1, characterized in that the processing device, during a cycle "T", detects a bubble in stage 4 of the instruction pipeline (117), such that the controller sets the route that loads a "0" into stages 3 (115), 4 (118) and 5 (121) of the trace pipeline (113), and the jump address "Z" (600) is routed to the input register of the fetch stage (139), the detection of the bubble in stage 4 corresponding to the setting to "1" of the signal "BURBUJA_P4" (126) and of the signal "BURBUJA" (122).
9. A device for parallel processing of program instructions and trace instructions according to claim 8, characterized in that the controller controls: the route to stage 3 of the trace pipeline (115) by assigning "0" to the signal "SEL_PIPE_TRAZA" (132); the route to stage 4 (118) by assigning "0" to the signal "SEL_TR_P4" (133); the route to stage 5 (121) by assigning "0" to the signal "SEL_TR_P5" (134); and the loading of the input register of the fetch stage (139) by activating the signal "LD_DIR" (137) and routing the address "Z" (600) to said register (139) by assigning "0" to the signal "SEL_DIR" (138).
10. A RISC ("Reduced Instruction Set Computer") processor, characterized in that it comprises a device for parallel processing of program instructions and trace instructions according to any one of the preceding claims.
11. A method for parallel processing of program instructions and trace instructions which, executed on a device for parallel processing of program instructions and trace instructions as defined in any one of claims 1 to 9, processes an instruction and a trace instruction in parallel.
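The signal assignments recited in claims 4 to 9 can be summarized in a small software model. The following Python sketch is illustrative only and is not the claimed hardware: the function names `bubble_control` and `decode_control` are hypothetical, "LD_DIR" is assumed to be active-high (the claims only say it is "activated"), and the controller's current state, which is defined in claims outside this section, is ignored. When both bubble flags are set, the sketch applies the stage 4 settings, which is an assumption the claims do not address.

```python
from typing import Dict


def bubble_control(burbuja_p3: bool, burbuja_p4: bool) -> Dict[str, int]:
    """Signal values that claims 7 and 9 list for a detected bubble."""
    if not (burbuja_p3 or burbuja_p4):
        return {}  # no bubble: claims 6 to 9 do not cover this case
    signals = {
        "SEL_PIPE_TRAZA": 0,  # load "0" into stage 3 of the trace pipeline (115)
        "SEL_TR_P4": 0,       # load "0" into stage 4 of the trace pipeline (118)
        "LD_DIR": 1,          # assumed active-high: load the fetch input register (139)
        "SEL_DIR": 0,         # route jump address "Z" (600) to that register
    }
    if burbuja_p4:
        signals["SEL_TR_P5"] = 0  # claim 9 only: load "0" into stage 5 (121)
    return signals


def decode_control(n_es_traza: int, n_1_es_traza: int) -> Dict[str, object]:
    """Outcome for the two decode-flag combinations named in claims 4 and 5."""
    flags = (n_es_traza, n_1_es_traza)
    if flags == (0, 0):
        # Sequence S2: neither decoded instruction is a trace instruction,
        # so the controller is in "INSTR PENDIENTE" in the next cycle.
        return {"next_state": "INSTR PENDIENTE"}
    if flags == (1, 0):
        # Sequence S3: trace instruction routed to stage 4, zero to stage 3.
        return {"SEL_TR_P4": 2, "SEL_PIPE_TRAZA": 0}
    raise ValueError("flag combination not covered by claims 4 and 5")
```

For example, `bubble_control(True, False)` yields the four assignments of claim 7, while `bubble_control(False, True)` adds the "SEL_TR_P5" assignment of claim 9.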
PCT/ES2019/070176 2018-03-20 2019-03-18 Method and device for parallel processing of program instructions and trace instructions WO2019180288A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ES201830266A ES2697548B2 (en) 2018-03-20 2018-03-20 A PARALLEL PROCESSING METHOD AND DEVICE FOR PROGRAM INSTRUCTIONS AND TRACE INSTRUCTIONS
ESP201830266 2018-03-20

Publications (1)

Publication Number Publication Date
WO2019180288A1 (en) 2019-09-26

Family

ID=65024202

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/ES2019/070176 WO2019180288A1 (en) 2018-03-20 2019-03-18 Method and device for parallel processing of program instructions and trace instructions

Country Status (2)

Country Link
ES (1) ES2697548B2 (en)
WO (1) WO2019180288A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2802723B2 (en) * 2019-07-12 2021-07-27 Univ Alcala Henares A METHOD FOR SELECTIVE TRACING OF INSTRUCTION EXECUTION, RELATED PROCESSING DEVICE AND PROCESSOR

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6499123B1 (en) * 1989-02-24 2002-12-24 Advanced Micro Devices, Inc. Method and apparatus for debugging an integrated circuit
US5564028A (en) * 1994-01-11 1996-10-08 Texas Instruments Incorporated Pipelined data processing including instruction trace
US5996092A (en) * 1996-12-05 1999-11-30 International Business Machines Corporation System and method for tracing program execution within a processor before and after a triggering event
US5933626A (en) * 1997-06-12 1999-08-03 Advanced Micro Devices, Inc. Apparatus and method for tracing microprocessor instructions
GB2492457A (en) * 2011-06-29 2013-01-02 Ibm Predicting out of order instruction level parallelism of threads in a multi-threaded processor
US20130290640A1 (en) * 2012-04-27 2013-10-31 Nvidia Corporation Branch prediction power reduction

Also Published As

Publication number Publication date
ES2697548A1 (en) 2019-01-24
ES2697548B2 (en) 2020-07-22

Similar Documents

Publication Publication Date Title
ES2588185T3 (en) Operation mode comparison debug circuit of a set of processor instructions
KR101183651B1 (en) System and method of data forwarding within an execution unit
US7043416B1 (en) System and method for state restoration in a diagnostic module for a high-speed microprocessor
US7332929B1 (en) Wide-scan on-chip logic analyzer with global trigger and interleaved SRAM capture buffers
US7870437B2 (en) Trace data timestamping
US20050207521A1 (en) Recovery from errors in a data processing apparatus
JP4076946B2 (en) First-in first-out memory system and method
JPH055136B2 (en)
BR102013015049B1 (en) apparatus and method
KR20110008298A (en) Selectively performing a single cycle write operation with ecc in a data processing system
US10795685B2 (en) Operating a pipeline flattener in order to track instructions for complex
US9361104B2 (en) Systems and methods for determining instruction execution error by comparing an operand of a reference instruction to a result of a subsequent cross-check instruction
US20200201810A1 (en) Identifying processing units in a processor
JP2016035626A (en) Semiconductor device
ES2697548B2 (en) A PARALLEL PROCESSING METHOD AND DEVICE FOR PROGRAM INSTRUCTIONS AND TRACE INSTRUCTIONS
US11625316B2 (en) Checksum generation
US20130097462A1 (en) Embedded logic analyzer
US6934828B2 (en) Decoupling floating point linear address
JP5118069B2 (en) Dual-pass multi-mode sequential storage element
ES2802723B2 (en) A METHOD FOR SELECTIVE TRACING OF INSTRUCTION EXECUTION, RELATED PROCESSING DEVICE AND PROCESSOR
TWI802951B (en) Method, computer system and computer program product for storing state data of finite state machine
JP2006520953A (en) Memory system having high speed and low speed data reading mechanism
US7013256B2 (en) Computer system with debug facility
Bernardi et al. A SBST strategy to test microprocessors' branch target buffer
Lu et al. RaceFree: An efficient multi-threading model for determinism

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19770670

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19770670

Country of ref document: EP

Kind code of ref document: A1