WO2022116750A1 - Instruction execution method, apparatus, electronic device, and storage medium - Google Patents

Instruction execution method, apparatus, electronic device, and storage medium

Info

Publication number
WO2022116750A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
parameter
register
classifier
state
Prior art date
Application number
PCT/CN2021/126841
Inventor
闻军会
田超
贾磊
严小平
Original Assignee
北京百度网讯科技有限公司
Application filed by 北京百度网讯科技有限公司
Publication of WO2022116750A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003: Arrangements for executing specific machine instructions
    • G06F9/3004: Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30047: Prefetch instructions; cache control instructions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098: Register arrangements

Definitions

  • the present disclosure relates to the technical fields of speech, natural language processing, and deep learning in the field of computer technology, and in particular, to an instruction execution method, apparatus, electronic device, and storage medium.
  • LSTM Long Short-Term Memory
  • WaveRNN Wave Recurrent Neural Network
  • the operation program is put into an interrupt to remove the coupling between the operation function and the main program, or an out-of-order execution architecture is used to remove the dependencies between some instructions, but waiting still occurs between parameter configuration and complex operations. As a result, operations are not executed efficiently.
  • An instruction execution method, apparatus, electronic device and storage medium are provided.
  • an instruction execution method, including: an instruction classifier identifies the category of a current input instruction; if the category is a parameter configuration instruction, the instruction classifier writes the corresponding parameter into the corresponding first parameter register in an instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, the instruction classifier writes the calculation instruction into an instruction register in the instruction cache; and if an operation unit detects that the instruction register is not empty, the operation unit fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into a second parameter register in the operation unit for calculation.
  • an instruction execution device, comprising: an instruction cache, which includes a first parameter register and an instruction register; an instruction classifier, used to identify the category of a current input instruction, to write the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction if the category is a parameter configuration instruction, and to write the calculation instruction into the instruction register in the instruction cache if the category is a calculation instruction; and an operation unit, used to fetch the next calculation instruction from the instruction register upon detecting that the instruction register is not empty, fetch the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and update the fetched parameter into the second parameter register in the operation unit for calculation.
  • an electronic device comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the instruction execution method described in the first aspect of the present disclosure.
  • a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the instruction execution method described in the first aspect of the present disclosure.
  • a computer program product including a computer program, which implements the instruction execution method described in the first aspect of the present disclosure when the computer program is executed by a processor.
  • FIG. 1 is a schematic flowchart of an instruction execution method according to a first embodiment of the present disclosure.
  • FIG. 2 is a schematic flowchart of an instruction execution method according to a second embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of an instruction execution method according to a third embodiment of the present disclosure.
  • FIG. 4 is a system architecture diagram of an instruction execution method according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic flowchart of an instruction execution method according to a fourth embodiment of the present disclosure.
  • FIG. 6 is a block diagram of an instruction execution apparatus according to the first embodiment of the present disclosure.
  • FIG. 7 is a block diagram of an instruction execution apparatus according to a second embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device used to implement the instruction execution method of an embodiment of the present disclosure.
  • Speech can include speech recognition, speech interaction and other technical fields, and is an important direction in the field of artificial intelligence.
  • Voice recognition is a technology that allows machines to convert speech signals into corresponding text or commands through the process of recognition and understanding. It mainly includes three aspects: feature extraction technology, pattern matching criteria and model training technology.
  • Voice Interaction is a technology that uses voice as the information carrier to interact, communicate, and exchange information between machines and users. Compared with traditional human-computer interaction, it has the advantages of convenience, speed and high user comfort.
  • Natural Language Processing is a science that studies computer systems that can effectively realize natural language communication, especially software systems, and is an important direction in the field of computer science and artificial intelligence.
  • Deep Learning is a new research direction in the field of Machine Learning (ML). It learns the inherent laws and representation levels of sample data so that machines can analyze and learn like humans and can recognize data such as text, images, and sound. It is widely used in speech and image recognition.
  • ML Machine Learning
  • FIG. 1 is a schematic flowchart of an instruction execution method according to a first embodiment of the present disclosure.
  • the instruction execution method may specifically include the following steps:
  • the instruction classifier identifies the category of the current input instruction.
  • the instruction execution method in the embodiment of the present disclosure may be executed by the instruction execution apparatus provided in the embodiment of the present disclosure, and the instruction execution apparatus may be a hardware device with data processing capability and/or the software necessary to drive the hardware device.
  • the instruction execution device includes an instruction classifier for identifying the category of the current input instruction.
  • This category may specifically include, but is not limited to, parameter configuration instructions, calculation instructions, hardware synchronization instructions, and common Central Processing Unit (CPU) instructions.
  • CPU Central Processing Unit
  • the instruction identification manner may be various existing instruction identification manners, which is not limited in the present disclosure.
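As an illustration of the four categories named above, the following Python sketch models the classifier as a lookup on an opcode. The `Category` enum, the `Instruction` type, and the opcode mnemonics `SETP`, `CALC`, and `SYNC` are hypothetical names chosen for this example; the disclosure does not limit how instructions are identified, so this is only one possible arrangement.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Category(Enum):
    PARAM_CONFIG = auto()   # parameter configuration instruction
    CALC = auto()           # calculation instruction
    HW_SYNC = auto()        # hardware synchronization instruction
    CPU = auto()            # common CPU instruction


@dataclass(frozen=True)
class Instruction:
    opcode: str             # hypothetical mnemonic, not taken from the disclosure
    operands: tuple = ()


# Hypothetical opcode table; anything not listed is treated as a common CPU instruction.
_OPCODE_CATEGORY = {
    "SETP": Category.PARAM_CONFIG,
    "CALC": Category.CALC,
    "SYNC": Category.HW_SYNC,
}


def classify(instr: Instruction) -> Category:
    """Return the category of the current input instruction."""
    return _OPCODE_CATEGORY.get(instr.opcode, Category.CPU)
```

Falling back to `Category.CPU` for unknown opcodes is a design choice of this sketch, reflecting that common CPU instructions are simply forwarded to the central processing unit.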
  • if the category is a parameter configuration instruction, the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  • if the instruction classifier identifies that the category of the current input instruction is a parameter configuration instruction, the parameter corresponding to the instruction is written into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  • the parameter corresponding to the instruction can be carried in the instruction.
  • the instruction cache may be a macro instruction cache.
  • the instruction cache includes multiple entries, and each entry includes a first parameter register group (the first parameter register group includes n first parameter registers) and an instruction register.
  • the first parameter register is used to store the written parameter, and the instruction register is used to store the written calculation instruction.
  • if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  • if the instruction classifier identifies that the category of the current input instruction is a calculation instruction, the calculation instruction is written into the instruction register in the instruction cache.
  • if the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • if the operation unit detects that the instruction register is not empty, that is, there is a calculation instruction to be executed in the instruction register, the next calculation instruction to be executed is fetched from the instruction register, the written parameter is fetched from the first parameter register corresponding to that calculation instruction, the fetched parameter is updated into the second parameter register in the operation unit, and the calculation is performed using the second parameter register.
  • the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, the corresponding parameter is written into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, and if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  • if the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • the use of the instruction cache enables the preprocessing before the operation (including parameter configuration and the writing of the calculation instruction) and the operation in the operation unit to be executed in parallel, that is, asynchronous execution, which improves the operation execution efficiency.
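The decoupling described in this embodiment can be sketched in Python as a small producer/consumer model. This is a simplified software analogy under stated assumptions, not the disclosed hardware: the class names are hypothetical, each cache entry is modeled as a pair of (calculation instruction, parameters written for it), and a calculation instruction is modeled as a Python callable that receives the list of second parameter registers.

```python
from collections import deque


class InstructionCache:
    """Queue of entries; each entry pairs one calculation instruction with its written parameters."""

    def __init__(self):
        self.entries = deque()   # queued (calculation instruction, written parameters) pairs
        self._staged = {}        # first parameter registers written since the last calculation instruction

    def write_param(self, index, value):
        # Classifier side: handle a parameter configuration instruction.
        self._staged[index] = value

    def write_calc(self, instr):
        # Classifier side: handle a calculation instruction and close the current entry.
        self.entries.append((instr, self._staged))
        self._staged = {}

    def is_empty(self):
        return not self.entries


class OperationUnit:
    def __init__(self, n_params):
        self.second_params = [0] * n_params   # second parameter registers

    def step(self, cache):
        """Execute the next queued calculation instruction, or return None if none is queued."""
        if cache.is_empty():
            return None
        instr, params = cache.entries.popleft()
        for index, value in params.items():
            self.second_params[index] = value   # update only the parameters that were written
        return instr(self.second_params)


cache = InstructionCache()
unit = OperationUnit(n_params=2)
cache.write_param(0, 3)
cache.write_param(1, 4)
cache.write_calc(lambda p: p[0] * p[1])
cache.write_param(1, 5)                  # only the changed parameter is rewritten
cache.write_calc(lambda p: p[0] + p[1])
results = [unit.step(cache), unit.step(cache), unit.step(cache)]
```

In this usage both calculation instructions are queued before the operation unit executes either of them, which is the sense in which preprocessing and operation no longer wait on each other.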
  • FIG. 2 is a schematic flowchart of an instruction execution method according to a second embodiment of the present disclosure.
  • the instruction execution method may specifically include the following steps:
  • the instruction classifier identifies the category of the current input instruction.
  • if the category is a parameter configuration instruction, the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  • if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  • if the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • steps S201-S204 are the same as the steps S101-S104 in the foregoing embodiment, and the specific process will not be repeated here.
  • the instruction execution method in the embodiment of the present disclosure may further include the following steps S205-S210.
  • if the category is a hardware synchronization instruction, the instruction classifier sends the first read state instruction to the state detection unit.
  • when the software needs to wait for the calculation result of the operation unit, it inputs a hardware synchronization instruction to the instruction classifier. If the instruction classifier identifies that the category of the current input instruction is a hardware synchronization instruction, it sends the first read state instruction to the state detection unit.
  • the state detection unit detects the state of the instruction register and the state of the operation unit according to the first read state instruction.
  • the state detection unit is used to synchronize the states of software and hardware. After receiving the first read state instruction, the state detection unit detects the state of the instruction register and the state of the operation unit.
  • the state of the instruction register includes empty and not empty.
  • the states of the arithmetic unit include busy and idle. When the second parameter register in the operation unit is in the calculation state, the state of the operation unit is busy, and when the second parameter register in the operation unit is in the non-calculation state, the state of the operation unit is idle.
  • when an operation starts, the busy bit of the operation unit is pulled high, and when the operation ends, the busy bit of the operation unit is pulled low.
  • the state detection unit can determine whether the state of the operation unit is busy or idle according to the level of the busy bit of the operation unit.
  • if the state detection unit detects that the instruction register is not empty or the operation unit is busy, it sends the first waiting notification information to the instruction classifier.
  • that is, when the state detection unit detects at least one of the two conditions (the instruction register is not empty, or the operation unit is busy), it sends the first waiting notification information to the instruction classifier.
  • the instruction classifier stops performing the step of identifying the category of the current input instruction according to the first waiting notification information.
  • the instruction classifier stops performing the step of identifying the category of the current input instruction, and waits for the operation of the operation unit to end.
  • if the state detection unit detects that the instruction register is empty and the operation unit is idle, it sends the first stop waiting notification information to the instruction classifier.
  • that is, when the state detection unit detects that the instruction register is empty and the operation unit is idle at the same time, all operations are complete, and the first stop waiting notification information is sent to the instruction classifier.
  • the instruction classifier resumes the step of identifying the category of the current input instruction according to the first stop waiting notification information.
  • the instruction classifier continues to perform the step of identifying the category of the current input instruction.
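The wait condition for a hardware synchronization instruction reduces to a small Boolean rule, sketched below in Python. The function names and the cycle-level timing model in `stall_cycles` are assumptions made for illustration: the sketch supposes a hypothetical operation unit that takes a fixed number of cycles (at least 1) per queued calculation instruction, which the disclosure does not specify.

```python
def must_wait(instr_register_empty: bool, unit_busy: bool) -> bool:
    """Wait while the instruction register is not empty OR the operation unit is busy."""
    return (not instr_register_empty) or unit_busy


def stall_cycles(queued: int, cycles_per_instr: int) -> int:
    """Count cycles the classifier stalls on a hardware synchronization instruction.

    Assumes cycles_per_instr >= 1 (hypothetical fixed latency per calculation instruction).
    """
    cycles = 0
    remaining = 0                      # cycles left on the instruction being executed
    while must_wait(queued == 0, remaining > 0):
        if remaining == 0:             # unit is idle, so it fetches the next instruction
            queued -= 1
            remaining = cycles_per_instr
        remaining -= 1
        cycles += 1
    return cycles
```

The classifier resumes only when both conditions clear at once, which matches the requirement that the instruction register be empty and the operation unit be idle before the stop waiting notification is sent.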
  • instruction execution method in the embodiment of the present disclosure may further include the following steps S211-S212.
  • if the category is a central processing unit instruction, the instruction classifier sends the central processing unit instruction to the central processing unit.
  • if the instruction classifier identifies that the category of the current input instruction is a central processing unit (CPU) instruction, the CPU instruction is sent to the central processing unit, that is, the CPU arithmetic unit.
  • the central processing unit executes the central processing unit instruction.
  • the CPU arithmetic unit executes the CPU instruction.
  • the instruction execution method of the embodiment of the present disclosure may further include the following step: the instruction cache updates the write address of the instruction register, that is, write address + 1.
  • the instruction execution method of the embodiment of the present disclosure may further include the following step: the instruction cache updates the read address of the instruction register, that is, read address + 1.
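The write-address and read-address updates can be pictured as a ring buffer. The Python sketch below is a minimal model under the assumption that the instruction register is a fixed-depth ring addressed modulo `depth`, and that "not empty" means the two addresses differ. The class and attribute names are hypothetical, and overflow handling is deliberately left out, so the caller must not write more than `depth` instructions ahead of the reader.

```python
class InstructionRegisterRing:
    def __init__(self, depth):
        self.depth = depth
        self.slots = [None] * depth
        self.write_addr = 0    # advanced by the instruction cache after each write
        self.read_addr = 0     # advanced by the instruction cache after each fetch

    def not_empty(self):
        # A calculation instruction is pending whenever the addresses differ.
        return self.write_addr != self.read_addr

    def write(self, instr):
        self.slots[self.write_addr % self.depth] = instr
        self.write_addr += 1   # write address + 1

    def read(self):
        instr = self.slots[self.read_addr % self.depth]
        self.read_addr += 1    # read address + 1
        return instr
```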
  • the instruction execution method of the embodiment of the present disclosure may further include: the instruction cache sets the parameter status indication bit corresponding to the first parameter register into which the parameter is written to a first value.
  • the operation unit fetching the written parameter from the corresponding first parameter register according to the fetched calculation instruction includes: the operation unit fetches, from the corresponding first parameter registers and according to the fetched calculation instruction, the parameters whose parameter status indication bit is the first value.
  • the instruction execution method according to the embodiment of the present disclosure may further include: the instruction cache sets the parameter status indication bit corresponding to the first parameter register from which the parameter was fetched to a second value.
  • a parameter status register is set in the instruction cache, so that only those parameters that need to be changed need to be updated.
  • Each entry in the instruction cache includes an n-bit parameter status register.
  • Each bit in the parameter status register corresponds to a parameter status indication bit, and each parameter status indication bit corresponds to a first parameter register.
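The n-bit parameter status register can be modeled as an integer bitmask, as in the following Python sketch. It assumes that the first value is 1 and the second value is 0, which the text does not state; the `CacheEntry` class and its method names are likewise hypothetical.

```python
class CacheEntry:
    def __init__(self, n):
        self.params = [0] * n   # first parameter register group (n registers)
        self.status = 0         # n-bit parameter status register, bit i tracks params[i]

    def write_param(self, i, value):
        self.params[i] = value
        self.status |= 1 << i   # set the indication bit to the first value (assumed 1)

    def fetch_written(self):
        """Return {index: value} for parameters whose bit is set, clearing each bit as it is fetched."""
        fetched = {}
        for i in range(len(self.params)):
            if self.status & (1 << i):
                fetched[i] = self.params[i]
                self.status &= ~(1 << i)   # set the indication bit to the second value (assumed 0)
        return fetched
```

Because only flagged parameters are returned, an operation unit using this entry would copy just the changed values into its second parameter registers and keep the rest from the previous calculation.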
  • the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, the corresponding parameter is written into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, and if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  • if the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • the use of the instruction cache enables the preprocessing before the operation (including parameter configuration and the writing of the calculation instruction) and the operation in the operation unit to be executed in parallel, that is, asynchronous execution, which improves the operation execution efficiency.
  • the state detection unit is used to synchronize the software and the hardware, which can avoid the extra power consumption caused by the query or interruption of the software, and simplifies the design of the instruction execution device.
  • the instruction execution method of the embodiment of the present disclosure may further include the following steps:
  • the instruction classifier sends a second read state instruction to the state detection unit.
  • if the instruction classifier identifies that the category of the current input instruction is a parameter configuration instruction, the second read state instruction is sent to the state detection unit.
  • the state detection unit detects the state of the first parameter register according to the second read state instruction.
  • the state detection unit detects the state of the first parameter register.
  • the state of the first parameter register includes full and not full.
  • if the state detection unit detects that the first parameter register is full, it sends the second waiting notification information to the instruction classifier.
  • that is, when the state detection unit detects that the first parameter register is full, it sends the second waiting notification information to the instruction classifier.
  • when the first parameter register is full, the instruction cache pulls the full bit of the first parameter register high, and when the first parameter register is not full, the instruction cache pulls the full bit of the first parameter register low.
  • the state detection unit may determine whether the state of the first parameter register is full or not full according to the level of the full bit of the first parameter register.
  • according to the second waiting notification information, the instruction classifier stops performing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  • the instruction classifier stops performing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, that is, the parameter writing step is no longer performed, to avoid overwriting the previously written parameters.
  • the instruction classifier stops executing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, and also stops executing the step of identifying the category of the current input instruction.
  • the state detection unit detects that the first parameter register is not full, and sends the second stop waiting notification information to the instruction classifier.
  • that is, when the state detection unit detects that the first parameter register is not full, it sends the second stop waiting notification information to the instruction classifier.
  • according to the second stop waiting notification information, the instruction classifier resumes the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  • the instruction classifier continues to perform the step of writing the corresponding parameters into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, that is, continues to perform the parameter writing step.
  • the instruction classifier continues to perform the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, and further continues to perform the step of identifying the category of the current input instruction.
  • the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, the corresponding parameter is written into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, and if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  • if the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • the use of the instruction cache enables the preprocessing before the operation (including the parameter configuration and the writing of the calculation instruction) and the operation in the operation unit to be executed in parallel, that is, asynchronous execution, which improves the operation execution efficiency.
  • Using the state detection unit to synchronize the software and the hardware can avoid the extra power consumption overhead caused by the software through query or interruption, and simplifies the design of the instruction execution device.
  • Before the parameters are written, it is detected whether the first parameter register is full, which prevents parameters written later from overwriting previously written parameters.
  • the software does not need to update all parameters, which can further improve the efficiency of operation execution.
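The full-bit check that gates parameter writes can be sketched in Python as follows. The text does not define exactly when the first parameter register counts as full, so this sketch assumes a hypothetical fixed `capacity` of parameter writes that may be pending before the operation unit drains them; the names `FirstParamRegisters` and `configure` are also illustrative only.

```python
class FirstParamRegisters:
    """Bounded store of parameter writes awaiting the operation unit (capacity is an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = []       # (index, value) pairs not yet fetched by the operation unit

    @property
    def full_bit(self):
        # High when no further parameter can be accepted without overwriting one.
        return len(self.pending) >= self.capacity


def configure(regs, index, value):
    """Classifier-side handling of one parameter configuration instruction."""
    if regs.full_bit:
        return "wait"           # second waiting notification: do not write
    regs.pending.append((index, value))
    return "written"            # second stop waiting notification: write proceeds
```

A caller that receives `"wait"` would retry the same instruction once the operation unit has fetched a pending parameter, so earlier writes are never lost.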
  • FIG. 4 is a system architecture diagram of an instruction execution method according to an embodiment of the present disclosure. As shown in FIG. 4 , it includes an instruction classifier, an instruction cache, a state detection unit, an arithmetic unit and a central processing unit. For the specific functions of each component, refer to the relevant descriptions in the embodiment shown in FIG. 5 , which will not be repeated here.
  • the instruction execution method may specifically include the following steps:
  • the instruction classifier identifies the category of the current input instruction.
  • if the category is a parameter configuration instruction, the instruction classifier sends the second read state instruction to the state detection unit.
  • the state detection unit detects the state of the first parameter register according to the second read state instruction. If the first parameter register is not full, steps S504-S506 (the next three steps) are executed. If the first parameter register is full, steps S507-S508 (the two steps after those) are executed.
  • if the state detection unit detects that the first parameter register is not full, it sends the second stop waiting notification information to the instruction classifier.
  • according to the second stop waiting notification information, the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  • the instruction cache sets the parameter status indication bit corresponding to the first parameter register into which the parameter is written to the first value.
  • if the state detection unit detects that the first parameter register is full, it sends the second waiting notification information to the instruction classifier.
  • according to the second waiting notification information, the instruction classifier stops performing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  • if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  • the instruction cache updates the write address of the instruction register.
  • if the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register.
  • the instruction cache updates the read address of the instruction register.
  • the operation unit fetches the parameter whose parameter status indication bit is the first value from the corresponding first parameter register according to the fetched calculation instruction.
  • the instruction cache sets the parameter status indication bit corresponding to the first parameter register from which the parameter was fetched to the second value.
  • the operation unit updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • if the category is a hardware synchronization instruction, the instruction classifier sends the first read state instruction to the state detection unit.
  • the state detection unit detects the state of the instruction register and the state of the operation unit according to the first read state instruction.
  • if the state detection unit detects that the instruction register is not empty or the operation unit is busy, it sends the first waiting notification information to the instruction classifier.
  • the instruction classifier stops performing the step of identifying the category of the current input instruction according to the first waiting notification information.
  • if the state detection unit detects that the instruction register is empty and the operation unit is idle, it sends the first stop waiting notification information to the instruction classifier.
  • the instruction classifier resumes the step of identifying the category of the current input instruction according to the first stop waiting notification information.
  • if the category is a central processing unit instruction, the instruction classifier sends the central processing unit instruction to the central processing unit.
  • the central processing unit executes the central processing unit instruction.
  • FIG. 6 is a block diagram of an instruction execution apparatus according to the first embodiment of the present disclosure.
  • the instruction execution apparatus 600 in the embodiment of the present disclosure may specifically include: an instruction cache 601 , an instruction classifier 602 , and an operation unit 603 .
  • the instruction cache 601 includes a first parameter register 6011 and an instruction register 6012 .
  • the instruction classifier 602 is used to identify the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register 6011 in the instruction cache 601 according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register 6012 in the instruction cache 601.
  • the operation unit 603 is used to fetch the next calculation instruction from the instruction register 6012 upon detecting that the instruction register 6012 is not empty, fetch the written parameter from the corresponding first parameter register 6011 according to the fetched calculation instruction, and update the fetched parameter into the second parameter register in the operation unit 603 for calculation.
  • the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, the corresponding parameter is written into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, and if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  • if the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • the use of the instruction cache enables the preprocessing before the operation (including parameter configuration and the writing of the calculation instruction) and the operation in the operation unit to be executed in parallel, that is, asynchronous execution, which improves the operation execution efficiency.
  • FIG. 7 is a block diagram of an instruction execution apparatus according to a second embodiment of the present disclosure.
  • an instruction execution apparatus 700 includes an instruction cache 701 , an instruction classifier 702 , and an operation unit 703 .
  • the instruction cache 701 includes a first parameter register 7011 and an instruction register 7012 .
  • the instruction cache 701 has the same function and structure as the instruction cache 601 in the foregoing embodiment, and the instruction classifier 702 has the same function and structure as the instruction classifier 602 in the foregoing embodiment.
  • the operation unit 703 has the same function and structure as the operation unit 603 in the above embodiment.
  • the first parameter register 7011 has the same function and structure as the first parameter register 6011 in the above embodiment.
  • the instruction register 7012 has the same function and structure as the instruction register 6012 in the above embodiment.
  • the instruction execution apparatus 700 in the embodiment of the present disclosure may further include a state detection unit 704 .
  • the instruction classifier 702 is further configured to: if the category is a hardware synchronization instruction, send a first read-state instruction to the state detection unit 704; and stop performing the step of identifying the category of the current input instruction according to received first waiting notification information.
  • the state detection unit 704 is configured to: detect the state of the instruction register 7012 and the state of the operation unit 703 according to the first read-state instruction; and, upon detecting that the instruction register 7012 is not empty or the operation unit 703 is busy, send the first waiting notification information to the instruction classifier 702.
  • the state detection unit 704 is further configured to: upon detecting that the instruction register 7012 is empty and the operation unit 703 is idle, send first stop-waiting notification information to the instruction classifier 702; the instruction classifier 702 is further configured to: resume the step of identifying the category of the current input instruction according to the first stop-waiting notification information.
  • the instruction classifier 702 is further configured to: send a second read-state instruction to the state detection unit 704; the state detection unit 704 is configured to: detect the state of the first parameter register 7011 according to the second read-state instruction and, upon detecting that the first parameter register 7011 is full, send second waiting notification information to the instruction classifier 702; the instruction classifier 702 is further configured to: according to the second waiting notification information, stop performing the step of writing the corresponding parameter into the corresponding first parameter register 7011 in the instruction cache 701 according to the parameter configuration instruction.
  • the state detection unit 704 is further configured to: upon detecting that the first parameter register 7011 is not full, send second stop-waiting notification information to the instruction classifier 702; the instruction classifier 702 is further configured to: according to the second stop-waiting notification information, resume the step of writing the corresponding parameter into the corresponding first parameter register 7011 in the instruction cache 701 according to the parameter configuration instruction.
  • the instruction cache 701 is configured to: set the parameter status indication bit corresponding to a first parameter register 7011 into which a parameter has been written to a first value; and set the parameter status indication bit corresponding to a first parameter register 7011 from which a parameter has been fetched to a second value.
  • the operation unit 703 is specifically configured to: fetch, according to the fetched calculation instruction, the parameters whose parameter status indication bit is the first value from the corresponding first parameter registers 7011.
  • the instruction cache 701 is configured to update the write address of the instruction register 7012 after the instruction classifier 702 writes the calculation instruction into the instruction register 7012 in the instruction cache 701.
  • the instruction cache 701 is configured to update the read address of the instruction register 7012 after the operation unit 703 fetches the next calculation instruction from the instruction register 7012.
  • the instruction execution apparatus 700 in the embodiment of the present disclosure may further include a central processing operator 705; the instruction classifier 702 is further configured to: if the category is a central processing unit instruction, send the central processing unit instruction to the central processing operator 705; the central processing operator 705 is configured to execute the central processing unit instruction.
  • the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache.
  • when the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • the use of the instruction cache enables the preprocessing before the operation (including parameter configuration and the writing of the calculation instruction) and the operation in the operation unit to be executed in parallel, that is, asynchronous execution, which improves the operation execution efficiency.
  • using the state detection unit to synchronize software and hardware avoids the extra power consumption that software polling or interrupts would incur, and simplifies the design of the instruction execution apparatus.
  • checking whether the first parameter register is full before a parameter is written prevents a parameter written later from overwriting one written earlier.
  • by setting parameter status indication bits, the software does not need to update all parameters, which can further improve operation execution efficiency.
  • the present disclosure also provides an electronic device and a readable storage medium.
  • FIG. 8 is a block diagram of an electronic device for the instruction execution method according to an embodiment of the present disclosure.
  • Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as intelligent voice interaction devices, personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • the electronic device includes: one or more processors 801, a memory 802, and interfaces for connecting various components, including a high-speed interface and a low-speed interface.
  • the various components are interconnected using different buses and may be mounted on a common motherboard or otherwise as desired.
  • the processor 801 may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface).
  • in other implementations, multiple processors and/or multiple buses may be used together with multiple memories, if desired.
  • likewise, multiple electronic devices may be connected, each providing some of the necessary operations (e.g., as a server array, a group of blade servers, or a multiprocessor system).
  • in FIG. 8, one processor 801 is taken as an example.
  • the memory 802 is the non-transitory computer-readable storage medium provided by the present disclosure.
  • the memory stores instructions executable by at least one processor, so that the at least one processor executes the instruction execution method provided by the present disclosure.
  • the non-transitory computer-readable storage medium of the present disclosure stores computer instructions for causing a computer to execute the instruction execution method provided by the present disclosure.
  • the memory 802, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the instruction execution method in the embodiments of the present disclosure (for example, the instruction cache 601, the instruction classifier 602, and the operation unit 603 shown in FIG. 6).
  • the processor 801 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions, and modules stored in the memory 802, that is, it implements the instruction execution method in the above method embodiments.
  • the memory 802 can include a program storage area and a data storage area, wherein the program storage area can store an operating system and an application program required by at least one function, and the data storage area can store data created through use of the electronic device for the instruction execution method, and the like.
  • the memory 802 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device.
  • the memory 802 may optionally include memory located remotely from the processor 801, and such remote memory may be connected via a network to the electronic device for the instruction execution method. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the electronic device of the instruction execution method may further include: an input device 803 and an output device 804 .
  • the processor 801 , the memory 802 , the input device 803 and the output device 804 may be connected by a bus or in other ways, and the connection by a bus is taken as an example in FIG. 8 .
  • the input device 803 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for the instruction execution method; examples include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick.
  • Output devices 804 may include display devices, auxiliary lighting devices (eg, LEDs), haptic feedback devices (eg, vibration motors), and the like.
  • the display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
  • various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor, which may be special-purpose or general-purpose, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, magnetic disks, optical disks, memories, programmable logic devices (PLDs)) for providing machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals.
  • machine-readable signal refers to any signal used to provide machine instructions and/or data to a programmable processor.
  • to provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (eg, visual feedback, auditory feedback, or tactile feedback); and can be in any form (including acoustic input, voice input, or tactile input) to receive input from the user.
  • the systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., a data server), a computing system that includes middleware components (e.g., an application server), a computing system that includes front-end components (e.g., a user's computer having a graphical user interface or web browser through which the user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (eg, a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the Internet, and blockchain networks.
  • a computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • the server can be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical hosts and VPS ("Virtual Private Server") services.
  • the server can also be a server of a distributed system, or a server combined with a blockchain.
  • the present disclosure also provides a computer program product, including a computer program, wherein, when the computer program is executed by a processor, the instruction execution method of the foregoing embodiments of the present disclosure is implemented.
  • the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache.
  • when the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
  • the use of the instruction cache enables the preprocessing before the operation (including parameter configuration and the writing of the calculation instruction) and the operation in the operation unit to be executed in parallel, that is, asynchronous execution, which improves the operation execution efficiency.
  • it should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above.
  • for example, the steps described in the present disclosure can be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions of the present disclosure can be achieved; no limitation is imposed herein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

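An instruction execution method and apparatus, an electronic device, and a storage medium, relating to the technical fields of speech, natural language processing, and deep learning. The specific implementation is as follows: an instruction classifier identifies the category of a current input instruction; if the category is a parameter configuration instruction, the instruction classifier writes the corresponding parameter into a corresponding first parameter register in an instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, the instruction classifier writes the calculation instruction into an instruction register in the instruction cache; and when an operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into a second parameter register in the operation unit for calculation. The instruction execution method and apparatus, electronic device, and storage medium improve operation execution efficiency.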

Description

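Instruction Execution Method and Apparatus, Electronic Device, and Storage Medium
Cross-Reference to Related Application
The present disclosure claims priority to Chinese Patent Application No. 202011401623.1, filed on December 2, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical fields of speech, natural language processing, and deep learning within the field of computer technology, and in particular to an instruction execution method and apparatus, an electronic device, and a storage medium.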
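Background
Operation processor chips often contain complex operation instructions. For example, algorithms used in speech processing, such as the Long Short-Term Memory (LSTM) network and the Wave Recurrent Neural Network (WaveRNN), involve a large number of operations such as convolution and fully connected activation; such instructions are generally carried out by a dedicated operation unit.
At present, operations are executed either by placing the operation program in an interrupt to decouple the operation function from the main program, or by adopting an out-of-order execution architecture to remove some instruction dependencies. In both cases, however, parameter configuration and the complex operation still have to wait for each other, and operation execution efficiency is therefore low.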
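Summary
An instruction execution method and apparatus, an electronic device, and a storage medium are provided.
According to a first aspect, an instruction execution method is provided, including: an instruction classifier identifying the category of a current input instruction; if the category is a parameter configuration instruction, the instruction classifier writing the corresponding parameter into a corresponding first parameter register in an instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, the instruction classifier writing the calculation instruction into an instruction register in the instruction cache; and an operation unit, upon detecting that the instruction register is not empty, fetching the next calculation instruction from the instruction register, fetching the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updating the fetched parameter into a second parameter register in the operation unit for calculation.
According to a second aspect, an instruction execution apparatus is provided, including: an instruction cache, which includes a first parameter register and an instruction register; an instruction classifier, used to identify the category of a current input instruction, to write, if the category is a parameter configuration instruction, the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, and to write, if the category is a calculation instruction, the calculation instruction into the instruction register in the instruction cache; and an operation unit, used to, upon detecting that the instruction register is not empty, fetch the next calculation instruction from the instruction register, fetch the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and update the fetched parameter into a second parameter register in the operation unit for calculation.
According to a third aspect, an electronic device is provided, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the instruction execution method according to the first aspect of the present disclosure.
According to a fourth aspect, a non-transitory computer-readable storage medium storing computer instructions is provided, the computer instructions being used to cause a computer to perform the instruction execution method according to the first aspect of the present disclosure.
According to a fifth aspect, a computer program product is provided, including a computer program which, when executed by a processor, implements the instruction execution method according to the first aspect of the present disclosure.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.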
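Brief Description of the Drawings
The drawings are provided for a better understanding of the solution and do not limit the present disclosure. In the drawings:
FIG. 1 is a schematic flowchart of an instruction execution method according to a first embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of an instruction execution method according to a second embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of an instruction execution method according to a third embodiment of the present disclosure;
FIG. 4 is a system architecture diagram of an instruction execution method according to an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of an instruction execution method according to a fourth embodiment of the present disclosure;
FIG. 6 is a block diagram of an instruction execution apparatus according to a first embodiment of the present disclosure;
FIG. 7 is a block diagram of an instruction execution apparatus according to a second embodiment of the present disclosure;
FIG. 8 is a block diagram of an electronic device used to implement the instruction execution method of the embodiments of the present disclosure.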
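Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the drawings. They include various details of the embodiments to aid understanding, and these details should be regarded as merely exemplary. Those of ordinary skill in the art should therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted below.
Speech may include technical fields such as speech recognition and voice interaction, and is an important direction in the field of artificial intelligence.
Speech recognition (Voice Recognition) is a technology that enables a machine to convert speech signals into corresponding text or commands through recognition and understanding. It mainly covers three aspects: feature extraction, pattern matching criteria, and model training.
Voice interaction (Voice Interaction) is a technology in which a machine and a user interact, communicate, and exchange information with speech as the information carrier. Compared with traditional human-computer interaction, it is convenient, fast, and comfortable for the user.
Natural language processing (Natural Language Processing, NLP) is the science of computer systems, especially software systems, that can effectively carry out natural language communication. It is an important direction in computer science and artificial intelligence.
Deep learning (Deep Learning, DL) is a new research direction in the field of machine learning (Machine Learning, ML). It is the science of learning the inherent regularities and representation levels of sample data so that machines can analyze and learn as humans do and can recognize data such as text, images, and sound. It is widely applied in speech and image recognition.
The instruction execution method and apparatus, electronic device, and storage medium of the embodiments of the present disclosure are described below with reference to the drawings.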
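FIG. 1 is a schematic flowchart of an instruction execution method according to a first embodiment of the present disclosure.
As shown in FIG. 1, the instruction execution method of the embodiment of the present disclosure may specifically include the following steps:
S101: the instruction classifier identifies the category of the current input instruction.
Specifically, the instruction execution method of the embodiments of the present disclosure may be performed by the instruction execution apparatus provided by the embodiments of the present disclosure, which may be a hardware device with data and information processing capability and/or the software necessary to drive that hardware device.
The instruction execution apparatus includes an instruction classifier used to identify the category of the current input instruction. The categories may specifically include, but are not limited to, parameter configuration instructions, calculation instructions, hardware synchronization instructions, and ordinary central processing unit (Central Processing Unit, CPU) instructions.
It should be noted that the instruction may be identified in any of the various existing ways; the present disclosure does not impose particular limits on this.
S102: if the category is a parameter configuration instruction, the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
Specifically, if the instruction classifier identifies the category of the current input instruction as a parameter configuration instruction, it writes the parameter corresponding to that instruction into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction. The parameter corresponding to the instruction may be carried in the instruction itself.
The instruction cache may specifically be a macro-instruction cache. The instruction cache includes multiple entries, and each entry includes one first parameter register group (a group of n first parameter registers) and one instruction register. The first parameter registers store written parameters, and the instruction register stores written instructions.
S103: if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
Specifically, if the instruction classifier identifies the category of the current input instruction as a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache.
Those skilled in the art will understand that the software must ensure that, when configuring a macro-instruction, the parameters are configured first and the calculation instruction is sent afterwards.
S104: when the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
Specifically, when the operation unit detects that the instruction register is not empty, that is, a calculation instruction is waiting to be executed in the instruction register, it fetches the next pending calculation instruction from the instruction register, fetches the written parameter from the first parameter register corresponding to that calculation instruction, updates the fetched parameter into the second parameter register in the operation unit, and performs the calculation using the second parameter register.
In summary, in the instruction execution method of the embodiment of the present disclosure, the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache. When the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation. Using the instruction cache allows the preprocessing before an operation (parameter configuration and writing of the calculation instruction) and the operation in the operation unit to execute in parallel, that is, asynchronously, which improves operation execution efficiency.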
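Steps S101 to S104 can be sketched as a small software model. This is only an illustration under stated assumptions, not the hardware design: the tuple instruction format, the names `PARAM_CFG`, `CALC`, `classify_and_dispatch`, and the toy `mac` operation are all hypothetical, and a single shared parameter dictionary stands in for the per-entry first parameter register groups.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical instruction encodings; the source does not define a format.
PARAM_CFG = "param_cfg"
CALC = "calc"


@dataclass
class InstructionCache:
    """First parameter registers plus a queue standing in for the instruction register."""
    first_params: dict = field(default_factory=dict)
    instruction_register: deque = field(default_factory=deque)


@dataclass
class OperationUnit:
    """Holds the second parameter registers used during calculation."""
    second_params: dict = field(default_factory=dict)

    def step(self, cache):
        # S104: do nothing while the instruction register is empty.
        if not cache.instruction_register:
            return None
        op = cache.instruction_register.popleft()
        # Copy the written parameters into the second parameter registers.
        self.second_params.update(cache.first_params)
        p = self.second_params
        if op == "mac":  # assumed toy operation: a * b + c
            return p["a"] * p["b"] + p["c"]
        raise ValueError(f"unknown calculation instruction: {op}")


def classify_and_dispatch(instr, cache):
    """S101-S103: identify the category, then write into the instruction cache."""
    kind, payload = instr
    if kind == PARAM_CFG:
        name, value = payload
        cache.first_params[name] = value
    elif kind == CALC:
        cache.instruction_register.append(payload)
    else:
        raise ValueError(f"unsupported category: {kind}")


cache = InstructionCache()
unit = OperationUnit()
# Software configures the parameters first, then sends the calculation instruction.
program = [
    (PARAM_CFG, ("a", 2)),
    (PARAM_CFG, ("b", 3)),
    (PARAM_CFG, ("c", 4)),
    (CALC, "mac"),
]
for instr in program:
    classify_and_dispatch(instr, cache)
result = unit.step(cache)
```

Because the classifier only writes into the cache and the operation unit only reads from it, the two sides could run concurrently in a real design; the model above runs them one after the other purely for simplicity.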
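FIG. 2 is a schematic flowchart of an instruction execution method according to a second embodiment of the present disclosure.
As shown in FIG. 2, the instruction execution method of the embodiment of the present disclosure may specifically include the following steps:
S201: the instruction classifier identifies the category of the current input instruction.
S202: if the category is a parameter configuration instruction, the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
S203: if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
S204: when the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation.
Specifically, steps S201-S204 are the same as steps S101-S104 in the above embodiment, and the details are not repeated here.
The instruction execution method of the embodiment of the present disclosure may further include the following steps S205-S210.
S205: if the category is a hardware synchronization instruction, the instruction classifier sends a first read-state instruction to the state detection unit.
Specifically, when the software needs to wait for the calculation result of the operation unit, it inputs a hardware synchronization instruction to the instruction classifier. If the instruction classifier identifies the category of the current input instruction as a hardware synchronization instruction, it sends the first read-state instruction to the state detection unit.
S206: the state detection unit detects the state of the instruction register and the state of the operation unit according to the first read-state instruction.
Specifically, the state detection unit is used to synchronize the states of software and hardware. After receiving the first read-state instruction, the state detection unit detects the state of the instruction register and the state of the operation unit. The instruction register is either empty or not empty. The operation unit is either busy or idle: it is busy while the second parameter register in the operation unit is in the calculating state, and idle while the second parameter register is not in the calculating state.
As one feasible implementation, the busy bit of the operation unit is pulled high when the second parameter register in the operation unit starts calculating and pulled low when the calculation ends. The state detection unit can determine whether the operation unit is busy or idle from the level of its busy bit.
S207: if the state detection unit detects that the instruction register is not empty or the operation unit is busy, it sends first waiting notification information to the instruction classifier.
Specifically, if the state detection unit detects that the instruction register is not empty or the operation unit is busy, that is, at least one of these two conditions holds, it sends the first waiting notification information to the instruction classifier.
S208: the instruction classifier stops performing the step of identifying the category of the current input instruction according to the first waiting notification information.
Specifically, after receiving the first waiting notification information, the instruction classifier stops performing the step of identifying the category of the current input instruction and waits for the operation unit to finish.
S209: if the state detection unit detects that the instruction register is empty and the operation unit is idle, it sends first stop-waiting notification information to the instruction classifier.
Specifically, if the state detection unit detects that the instruction register is empty and the operation unit is idle, that is, both conditions hold at the same time, all operations are complete, and it sends the first stop-waiting notification information to the instruction classifier.
S210: the instruction classifier resumes the step of identifying the category of the current input instruction according to the first stop-waiting notification information.
Specifically, after receiving the first stop-waiting notification information, the instruction classifier resumes the step of identifying the category of the current input instruction.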
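The hardware synchronization in steps S205 to S210 can be illustrated with a cycle-stepped toy model. It is a sketch under assumptions: the per-instruction cycle counts, the `tick` method, and the `hardware_sync` helper are hypothetical and exist only to show the wait condition (instruction register not empty OR operation unit busy) and the release condition (empty AND idle).

```python
from collections import deque


class OperationUnit:
    def __init__(self):
        self.busy = False      # busy bit: high while calculating
        self.remaining = 0     # hypothetical cycles left for the current instruction

    def tick(self, instruction_register):
        if self.busy:
            self.remaining -= 1
            if self.remaining == 0:
                self.busy = False              # calculation ends: busy bit pulled low
        elif instruction_register:
            # Each queued item is an assumed positive cycle cost.
            self.remaining = instruction_register.popleft()
            self.busy = True                   # calculation starts: busy bit pulled high


class StateDetectionUnit:
    """Answers the first read-state instruction (S206-S209)."""

    def __init__(self, instruction_register, operation_unit):
        self.instruction_register = instruction_register
        self.operation_unit = operation_unit

    def must_wait(self):
        # Wait if the instruction register is not empty OR the unit is busy.
        return bool(self.instruction_register) or self.operation_unit.busy


def hardware_sync(detector, unit, instruction_register):
    """Stall the classifier until every queued calculation has finished."""
    stalled = 0
    while detector.must_wait():        # first waiting notification is in effect
        unit.tick(instruction_register)
        stalled += 1
    return stalled                     # first stop-waiting notification: resume


queue = deque([2, 1])   # two calculation instructions costing 2 and 1 cycles
unit = OperationUnit()
detector = StateDetectionUnit(queue, unit)
stalled_cycles = hardware_sync(detector, unit, queue)
```

In this model the software side never polls a status register itself; it simply blocks inside `hardware_sync`, which mirrors the classifier pausing classification until the state detection unit releases it.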
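Further, the instruction execution method of the embodiment of the present disclosure may also include the following steps S211-S212.
S211: if the category is a central processing unit instruction, the instruction classifier sends the central processing unit instruction to the central processing operator.
Specifically, if the instruction classifier identifies the category of the current input instruction as a CPU instruction, it sends the CPU instruction to the central processing operator, that is, the CPU operator.
S212: the central processing operator executes the central processing unit instruction.
Specifically, after receiving the CPU instruction, the CPU operator executes it.
Further, after "the instruction classifier writes the calculation instruction into the instruction register in the instruction cache" in step S203 above, the instruction execution method of the embodiment of the present disclosure may also include the following step: the instruction cache updates the write address of the instruction register, that is, increments the write address by 1.
Further, after "the operation unit fetches the next calculation instruction from the instruction register" in step S204 above, the instruction execution method of the embodiment of the present disclosure may also include the following step: the instruction cache updates the read address of the instruction register, that is, increments the read address by 1.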
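The read-address and write-address updates can be pictured as a circular buffer. The sketch below is an assumption-laden illustration: the source only states that each address is incremented by 1, so the modulo wrap-around, the `count` field, and the class name `InstructionRegisterFile` are hypothetical choices.

```python
class InstructionRegisterFile:
    """Circular buffer of calculation instructions with read/write addresses."""

    def __init__(self, depth):
        self.depth = depth
        self.slots = [None] * depth
        self.write_addr = 0
        self.read_addr = 0
        self.count = 0

    def is_empty(self):
        return self.count == 0

    def write(self, instr):
        if self.count == self.depth:
            raise RuntimeError("instruction register is full")
        self.slots[self.write_addr] = instr
        # Write address + 1 (wrap-around is an assumption of this sketch).
        self.write_addr = (self.write_addr + 1) % self.depth
        self.count += 1

    def read(self):
        if self.is_empty():
            raise RuntimeError("instruction register is empty")
        instr = self.slots[self.read_addr]
        # Read address + 1 (wrap-around is an assumption of this sketch).
        self.read_addr = (self.read_addr + 1) % self.depth
        self.count -= 1
        return instr


regs = InstructionRegisterFile(depth=2)
regs.write("conv")
regs.write("fc")
first = regs.read()
regs.write("act")      # reuses slot 0 after the first read freed it
order = [first, regs.read(), regs.read()]
```

Keeping separate read and write addresses is what lets the classifier keep queuing calculation instructions while the operation unit is still consuming earlier ones.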
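Further, after "the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction" in step S202 above, the instruction execution method of the embodiment of the present disclosure may also include: the instruction cache sets the parameter status indication bit corresponding to the first parameter register into which the parameter was written to a first value. Correspondingly, "the operation unit fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction" in step S204 above includes: the operation unit fetches, according to the fetched calculation instruction, the parameters whose parameter status indication bit is the first value from the corresponding first parameter registers. Correspondingly, the instruction execution method of the embodiment of the present disclosure may also include: the instruction cache sets the parameter status indication bit corresponding to the first parameter register from which the parameter was fetched to a second value.
Specifically, because a given calculation instruction does not require every parameter to be configured, a parameter status register is provided in the instruction cache so that only the parameters that need to change have to be updated. Each entry in the instruction cache includes an n-bit parameter status register. Each bit of the parameter status register is one parameter status indication bit, and each parameter status indication bit corresponds to one first parameter register. When a parameter is written into a first parameter register, the instruction cache sets the corresponding parameter status indication bit to the first value, for example state=1. According to the fetched calculation instruction, the operation unit fetches from the corresponding first parameter registers the parameters whose parameter status indication bit is the first value, for example state=1. The instruction cache then sets the parameter status indication bit corresponding to each fetched parameter to the second value, for example state=0.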
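The parameter status indication bits can be illustrated as follows. This is a minimal sketch: `CacheEntry`, `write_param`, and `fetch_updated` are hypothetical names, and the values 1 and 0 follow the state=1 / state=0 example in the text.

```python
class CacheEntry:
    """One instruction-cache entry: n first parameter registers plus n status bits."""

    def __init__(self, n):
        self.params = [0] * n
        self.state = [0] * n       # parameter status register, one bit per parameter

    def write_param(self, index, value):
        self.params[index] = value
        self.state[index] = 1      # first value: newly written

    def fetch_updated(self):
        """Return {index: value} for parameters whose bit is 1, then clear those bits."""
        updated = {}
        for i, bit in enumerate(self.state):
            if bit == 1:
                updated[i] = self.params[i]
                self.state[i] = 0  # second value: already fetched
        return updated


entry = CacheEntry(n=4)
second_params = [0, 0, 0, 0]       # second parameter registers in the operation unit

# First calculation: software configures every parameter.
for i, v in enumerate([10, 20, 30, 40]):
    entry.write_param(i, v)
for i, v in entry.fetch_updated().items():
    second_params[i] = v

# Second calculation: only parameter 2 changes, so only it is transferred.
entry.write_param(2, 99)
changed = entry.fetch_updated()
for i, v in changed.items():
    second_params[i] = v
```

The second calculation transfers a single parameter while the other three keep the values already held in the second parameter registers, which is the saving the status bits are meant to provide.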
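In summary, in the instruction execution method of the embodiment of the present disclosure, the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache. When the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation. Using the instruction cache allows the preprocessing before an operation (parameter configuration and writing of the calculation instruction) and the operation in the operation unit to execute in parallel, that is, asynchronously, which improves operation execution efficiency. Using the state detection unit to synchronize software and hardware avoids the extra power consumption that software polling or interrupts would incur, and simplifies the design of the instruction execution apparatus.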
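Further, as shown in FIG. 3, before "the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction" in step S202 above, the instruction execution method of the embodiment of the present disclosure may also include the following steps:
S301: the instruction classifier sends a second read-state instruction to the state detection unit.
Specifically, if the instruction classifier identifies the category of the current input instruction as a parameter configuration instruction, it sends the second read-state instruction to the state detection unit.
S302: the state detection unit detects the state of the first parameter register according to the second read-state instruction.
Specifically, after receiving the second read-state instruction, the state detection unit detects the state of the first parameter register. The first parameter register is either full or not full.
S303: if the state detection unit detects that the first parameter register is full, it sends second waiting notification information to the instruction classifier.
Specifically, if the state detection unit detects that the first parameter register is full, it sends the second waiting notification information to the instruction classifier.
As one feasible implementation, the instruction cache pulls the full bit of the first parameter register high when the first parameter register is full and pulls it low when the first parameter register is not full. The state detection unit can determine whether the first parameter register is full from the level of its full bit.
S304: according to the second waiting notification information, the instruction classifier stops performing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
Specifically, after receiving the second waiting notification information, the instruction classifier stops performing the parameter writing step, which avoids overwriting previously written parameters. Because the parameter writing step is stopped, the step of identifying the category of the current input instruction is stopped as well.
S305: if the state detection unit detects that the first parameter register is not full, it sends second stop-waiting notification information to the instruction classifier.
Specifically, if the state detection unit detects that the first parameter register is not full, it sends the second stop-waiting notification information to the instruction classifier.
S306: according to the second stop-waiting notification information, the instruction classifier resumes the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
Specifically, after receiving the second stop-waiting notification information, the instruction classifier resumes the parameter writing step. Because the parameter writing step resumes, the step of identifying the category of the current input instruction resumes as well.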
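The full-bit handshake in steps S301 to S306 amounts to back-pressure on parameter writes. The following sketch makes several assumptions: a bounded queue stands in for the first parameter registers, and `ParamSlots`, `write_with_backpressure`, and the `drain_one` consumer are hypothetical names used only for illustration.

```python
from collections import deque


class ParamSlots:
    """Bounded store standing in for the first parameter registers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()

    @property
    def full(self):
        # full bit: high when no free first parameter register remains
        return len(self.pending) == self.capacity


def write_with_backpressure(slots, params, drain):
    """Write params in order, stalling while the full bit is high."""
    stalls = 0
    for p in params:
        while slots.full:          # second waiting notification: stop writing
            drain(slots)           # the operation unit consumes a parameter
            stalls += 1
        slots.pending.append(p)    # second stop-waiting notification: resume
    return stalls


consumed = []


def drain_one(slots):
    # Hypothetical consumer: the operation unit takes the oldest parameter.
    consumed.append(slots.pending.popleft())


slots = ParamSlots(capacity=2)
stalls = write_with_backpressure(slots, [1, 2, 3, 4], drain_one)
```

No write is ever issued while the store is full, so a later parameter cannot overwrite an earlier one that has not yet been consumed.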
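In summary, in the instruction execution method of the embodiment of the present disclosure, the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache. When the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation. Using the instruction cache allows the preprocessing before an operation (parameter configuration and writing of the calculation instruction) and the operation in the operation unit to execute in parallel, that is, asynchronously, which improves operation execution efficiency. Using the state detection unit to synchronize software and hardware avoids the extra power consumption that software polling or interrupts would incur, and simplifies the design of the instruction execution apparatus. Checking whether the first parameter register is full before a parameter is written prevents a parameter written later from overwriting one written earlier. By setting parameter status indication bits, the software does not need to update all parameters, which further improves operation execution efficiency.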
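To explain the instruction execution method of the embodiments of the present disclosure clearly, it is described below with reference to FIGS. 4 and 5.
FIG. 4 is a system architecture diagram of the instruction execution method of an embodiment of the present disclosure. As shown in FIG. 4, the system includes an instruction classifier, an instruction cache, a state detection unit, an operation unit, and a central processing operator. For the specific function of each component, see the related description in the embodiment shown in FIG. 5; the details are not repeated here.
As shown in FIG. 5, the instruction execution method of the embodiment of the present disclosure may specifically include the following steps:
S501: the instruction classifier identifies the category of the current input instruction.
S502: if the category is a parameter configuration instruction, the instruction classifier sends the second read-state instruction to the state detection unit.
S503: the state detection unit detects the state of the first parameter register according to the second read-state instruction. If the first parameter register is not full, steps S504-S506 are performed. If the first parameter register is full, steps S507-S508 are performed.
S504: the state detection unit detects that the first parameter register is not full and sends the second stop-waiting notification information to the instruction classifier.
S505: according to the second stop-waiting notification information, the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
S506: the instruction cache sets the parameter status indication bit corresponding to the first parameter register into which the parameter was written to the first value.
S507: the state detection unit detects that the first parameter register is full and sends the second waiting notification information to the instruction classifier.
S508: according to the second waiting notification information, the instruction classifier stops performing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
S509: if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
S510: the instruction cache updates the write address of the instruction register.
S511: when the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register.
S512: the instruction cache updates the read address of the instruction register.
S513: according to the fetched calculation instruction, the operation unit fetches the parameters whose parameter status indication bit is the first value from the corresponding first parameter registers.
S514: the instruction cache sets the parameter status indication bit corresponding to the first parameter register from which the parameter was fetched to the second value.
S515: the operation unit updates the fetched parameters into the second parameter register in the operation unit for calculation.
S516: if the category is a hardware synchronization instruction, the instruction classifier sends the first read-state instruction to the state detection unit.
S517: the state detection unit detects the state of the instruction register and the state of the operation unit according to the first read-state instruction.
S518: if the state detection unit detects that the instruction register is not empty or the operation unit is busy, it sends the first waiting notification information to the instruction classifier.
S519: the instruction classifier stops performing the step of identifying the category of the current input instruction according to the first waiting notification information.
S520: if the state detection unit detects that the instruction register is empty and the operation unit is idle, it sends the first stop-waiting notification information to the instruction classifier.
S521: the instruction classifier resumes the step of identifying the category of the current input instruction according to the first stop-waiting notification information.
S522: if the category is a central processing unit instruction, the instruction classifier sends the central processing unit instruction to the central processing operator.
S523: the central processing operator executes the central processing unit instruction.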
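The four branches of the flow in FIG. 5 can be summarized as a routing table keyed by instruction category. This is only a reading aid under assumptions: the `Category` enum, the `ROUTES` table, and the `route` function are hypothetical names, and each step range lists the steps that the classifier side triggers for that category.

```python
from enum import Enum, auto


class Category(Enum):
    PARAM_CONFIG = auto()
    CALCULATION = auto()
    HARDWARE_SYNC = auto()
    CPU = auto()


# For each category: the unit the classifier contacts first, and the steps that follow.
ROUTES = {
    Category.PARAM_CONFIG: ("state detection unit", "S502-S508"),
    Category.CALCULATION: ("instruction cache", "S509-S510"),
    Category.HARDWARE_SYNC: ("state detection unit", "S516-S521"),
    Category.CPU: ("central processing operator", "S522-S523"),
}


def route(category):
    """S501: map an identified category to its first destination and step range."""
    return ROUTES[category]


trace = [route(c) for c in Category]
```

Steps S511 to S515 do not appear in the table because the operation unit performs them on its own whenever the instruction register is not empty, independently of what the classifier is currently handling.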
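FIG. 6 is a block diagram of an instruction execution apparatus according to a first embodiment of the present disclosure.
As shown in FIG. 6, the instruction execution apparatus 600 of the embodiment of the present disclosure may specifically include: an instruction cache 601, an instruction classifier 602, and an operation unit 603. The instruction cache 601 includes a first parameter register 6011 and an instruction register 6012.
The instruction classifier 602 is used to identify the category of the current input instruction; if the category is a parameter configuration instruction, to write the corresponding parameter into the corresponding first parameter register 6011 in the instruction cache 601 according to the parameter configuration instruction; and if the category is a calculation instruction, to write the calculation instruction into the instruction register 6012 in the instruction cache 601.
The operation unit 603 is used to, upon detecting that the instruction register 6012 is not empty, fetch the next calculation instruction from the instruction register 6012, fetch the written parameter from the corresponding first parameter register 6011 according to the fetched calculation instruction, and update the fetched parameter into the second parameter register in the operation unit 603 for calculation.
It should be noted that the above explanation of the instruction execution method embodiments also applies to the instruction execution apparatus of the embodiment of the present disclosure, and the details are not repeated here.
In summary, in the instruction execution apparatus of the embodiment of the present disclosure, the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache. When the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation. Using the instruction cache allows the preprocessing before an operation (parameter configuration and writing of the calculation instruction) and the operation in the operation unit to execute in parallel, that is, asynchronously, which improves operation execution efficiency.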
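FIG. 7 is a block diagram of an instruction execution apparatus according to a second embodiment of the present disclosure.
As shown in FIG. 7, the instruction execution apparatus 700 of the embodiment of the present disclosure includes: an instruction cache 701, an instruction classifier 702, and an operation unit 703. The instruction cache 701 includes a first parameter register 7011 and an instruction register 7012.
The instruction cache 701 has the same function and structure as the instruction cache 601 in the above embodiment, and the instruction classifier 702 has the same function and structure as the instruction classifier 602 in the above embodiment. The operation unit 703 has the same function and structure as the operation unit 603 in the above embodiment. The first parameter register 7011 has the same function and structure as the first parameter register 6011 in the above embodiment. The instruction register 7012 has the same function and structure as the instruction register 6012 in the above embodiment.
Further, the instruction execution apparatus 700 of the embodiment of the present disclosure may also include a state detection unit 704. The instruction classifier 702 is further used to: if the category is a hardware synchronization instruction, send the first read-state instruction to the state detection unit 704; and stop performing the step of identifying the category of the current input instruction according to the received first waiting notification information. The state detection unit 704 is used to: detect the state of the instruction register 7012 and the state of the operation unit 703 according to the first read-state instruction; and, upon detecting that the instruction register 7012 is not empty or the operation unit 703 is busy, send the first waiting notification information to the instruction classifier 702.
Further, the state detection unit 704 is also used to: upon detecting that the instruction register 7012 is empty and the operation unit 703 is idle, send the first stop-waiting notification information to the instruction classifier 702. The instruction classifier 702 is also used to: resume the step of identifying the category of the current input instruction according to the first stop-waiting notification information.
Further, the instruction classifier 702 is also used to: send the second read-state instruction to the state detection unit 704. The state detection unit 704 is used to: detect the state of the first parameter register 7011 according to the second read-state instruction; and, upon detecting that the first parameter register 7011 is full, send the second waiting notification information to the instruction classifier 702. The instruction classifier 702 is also used to: according to the second waiting notification information, stop performing the step of writing the corresponding parameter into the corresponding first parameter register 7011 in the instruction cache 701 according to the parameter configuration instruction.
Further, the state detection unit 704 is also used to: upon detecting that the first parameter register 7011 is not full, send the second stop-waiting notification information to the instruction classifier 702. The instruction classifier 702 is also used to: according to the second stop-waiting notification information, resume the step of writing the corresponding parameter into the corresponding first parameter register 7011 in the instruction cache 701 according to the parameter configuration instruction.
Further, the instruction cache 701 is used to: set the parameter status indication bit corresponding to the first parameter register 7011 into which a parameter was written to the first value; and set the parameter status indication bit corresponding to the first parameter register 7011 from which a parameter was fetched to the second value. The operation unit 703 is specifically used to: fetch, according to the fetched calculation instruction, the parameters whose parameter status indication bit is the first value from the corresponding first parameter registers 7011.
Further, the instruction cache 701 is used to: update the write address of the instruction register 7012 after the instruction classifier 702 writes the calculation instruction into the instruction register 7012 in the instruction cache 701.
Further, the instruction cache 701 is used to: update the read address of the instruction register 7012 after the operation unit 703 fetches the next calculation instruction from the instruction register 7012.
Further, the instruction execution apparatus 700 of the embodiment of the present disclosure may also include a central processing operator 705. The instruction classifier 702 is also used to: if the category is a central processing unit instruction, send the central processing unit instruction to the central processing operator 705. The central processing operator 705 is used to execute the central processing unit instruction.
It should be noted that the above explanation of the instruction execution method embodiments also applies to the instruction execution apparatus of the embodiment of the present disclosure, and the details are not repeated here.
In summary, in the instruction execution apparatus of the embodiment of the present disclosure, the instruction classifier identifies the category of the current input instruction; if the category is a parameter configuration instruction, it writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, it writes the calculation instruction into the instruction register in the instruction cache. When the operation unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameter into the second parameter register in the operation unit for calculation. Using the instruction cache allows the preprocessing before an operation (parameter configuration and writing of the calculation instruction) and the operation in the operation unit to execute in parallel, that is, asynchronously, which improves operation execution efficiency. Using the state detection unit to synchronize software and hardware avoids the extra power consumption that software polling or interrupts would incur, and simplifies the design of the instruction execution apparatus. Checking whether the first parameter register is full before a parameter is written prevents a parameter written later from overwriting one written earlier. By setting parameter status indication bits, the software does not need to update all parameters, which further improves operation execution efficiency.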
根据本公开的实施例,本公开还提供了一种电子设备和一种可读存储介质。
如图8所示,是根据本公开实施例的指令执行方法的电子设备的框图。电子设备旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,智能语音交互设备、个人数字处理、蜂窝电话、智能电话、可穿戴设备和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本公开的实现。
如图8所示,该电子设备包括:一个或多个处理器801、存储器802,以及用于连接各部件的接口,包括高速接口和低速接口。各个部件利用不同的总线互相连接,并且可以被安装在公共主板上或者根据需要以其它方式安装。处理器801可以对在电子设备内执行的指令进行处理,包括存储在存储器中或者存储器上以在外部输入/输出装置(诸如,耦合至接口的显示设备)上显示GUI的图形信息的指令。在其它实施方式中,若需要,可以将多个处理器和/或多条总线与多个存储器和多个存储器一起使用。同样,可以连接多个电子设备,各个设备提供部分必要的操作(例如,作为服务器阵列、一组刀片式服务器、或者多处理器系统)。图8中以一个处理器801为例。
存储器802即为本公开所提供的非瞬时计算机可读存储介质。其中,存储器存储有可由至少一个处理器执行的指令,以使至少一个处理器执行本公开所提供的指令执行方法。本公开的非瞬时计算机可读存储介质存储计算机指令,该计算机指令用于使计算机执行本公开所提供的指令执行方法。
存储器802作为一种非瞬时计算机可读存储介质,可用于存储非瞬时软件程序、非瞬时计算机可执行程序以及模块,如本公开实施例中的指令执行方法对应的程序指令/模块(例如,附图6所示的指令缓存601、指令分类器602和运算单元603)。处理器801通过运行存储在存储器802中的非瞬时软件程序、指令以及模块,从而执行服务器的各种功能应用以及数据处理,即实现上述方法实施例中的指令执行方法。
存储器802可以包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需要的应用程序;存储数据区可存储根据指令执行方法的电子设 备的使用所创建的数据等。此外,存储器802可以包括高速随机存取存储器,还可以包括非瞬时存储器,例如至少一个磁盘存储器件、闪存器件、或其他非瞬时固态存储器件。在一些实施例中,存储器802可选包括相对于处理器801远程设置的存储器,这些远程存储器可以通过网络连接至指令执行方法的电子设备。上述网络的实例包括但不限于互联网、企业内部网、局域网、移动通信网及其组合。
指令执行方法的电子设备还可以包括:输入装置803和输出装置804。处理器801、存储器802、输入装置803和输出装置804可以通过总线或者其他方式连接,图8中以通过总线连接为例。
输入装置803可接收输入的数字或字符信息,以及产生与指令执行方法的电子设备的用户设置以及功能控制有关的键信号输入,例如触摸屏、小键盘、鼠标、轨迹板、触摸板、指示杆、一个或者多个鼠标按钮、轨迹球、操纵杆等输入装置。输出装置804可以包括显示设备、辅助照明装置(例如,LED)和触觉反馈装置(例如,振动电机)等。该显示设备可以包括但不限于,液晶显示器(LCD)、发光二极管(LED)显示器和等离子体显示器。在一些实施方式中,显示设备可以是触摸屏。
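Various implementations of the systems and techniques described herein may be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that receives data and instructions from a storage system, at least one input apparatus, and at least one output apparatus, and transmits data and instructions to the storage system, the at least one input apparatus, and the at least one output apparatus.
These computer programs (also called programs, software, software applications, or code) include machine instructions for a programmable processor and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer that has a display apparatus for displaying information to the user (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) and a keyboard and pointing apparatus (for example, a mouse or trackball) through which the user can provide input to the computer. Other kinds of apparatus may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual, auditory, or haptic feedback), and input from the user may be received in any form (including acoustic, speech, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes a back-end component (for example, a data server), or a middleware component (for example, an application server), or a front-end component (for example, a user computer with a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), the Internet, and blockchain networks.
A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server combined with a blockchain.
According to embodiments of the present disclosure, the present disclosure further provides a computer program product including a computer program, where the computer program, when executed by a processor, implements the instruction execution method of the foregoing embodiments of the present disclosure.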
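According to the technical solutions of the embodiments of the present disclosure, the instruction classifier identifies the category of the current input instruction. If the category is a parameter configuration instruction, the corresponding parameter is written into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; if the category is a calculation instruction, the instruction classifier writes the calculation instruction into the instruction register in the instruction cache. When the arithmetic unit detects that the instruction register is not empty, it fetches the next calculation instruction from the instruction register, fetches the written parameters from the corresponding first parameter register according to the fetched calculation instruction, and updates the fetched parameters into the second parameter register inside the arithmetic unit for calculation. The instruction cache allows the preprocessing that precedes an operation (parameter configuration and the writing of calculation instructions) and the operation in the arithmetic unit to execute in parallel, that is, asynchronously, which improves execution efficiency.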
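One way to picture the instruction register that decouples the classifier from the arithmetic unit is a fixed-depth ring buffer with separate read and write addresses. The Python sketch below rests on that assumption; the class name, the `depth` parameter, and the explicit `count` field are hypothetical and are not taken from the disclosure.

```python
class InstructionRegister:
    """Fixed-depth queue of calculation instructions with read/write addresses."""

    def __init__(self, depth):
        self.slots = [None] * depth
        self.read_addr = 0    # next slot the arithmetic unit fetches from
        self.write_addr = 0   # next slot the classifier writes to
        self.count = 0        # number of instructions currently queued

    def is_empty(self):
        return self.count == 0

    def is_full(self):
        return self.count == len(self.slots)

    def push(self, instr):
        """Classifier side: store an instruction, then update the write address."""
        if self.is_full():
            raise OverflowError("instruction register is full")
        self.slots[self.write_addr] = instr
        self.write_addr = (self.write_addr + 1) % len(self.slots)
        self.count += 1

    def pop(self):
        """Arithmetic unit side: fetch the next instruction, then update the read address."""
        if self.is_empty():
            raise IndexError("instruction register is empty")
        instr = self.slots[self.read_addr]
        self.read_addr = (self.read_addr + 1) % len(self.slots)
        self.count -= 1
        return instr
```

The producer only touches `write_addr` and the consumer only touches `read_addr`, so in this model the classifier can keep queuing instructions while the arithmetic unit is still working on an earlier one.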
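It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired result of the technical solutions of the present disclosure can be achieved; no limitation is imposed herein.
The specific implementations above do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.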

Claims (21)

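  1. An instruction execution method, comprising:
    identifying, by an instruction classifier, a category of a current input instruction;
    when the category is a parameter configuration instruction, writing, by the instruction classifier according to the parameter configuration instruction, a corresponding parameter into a corresponding first parameter register in an instruction cache;
    when the category is a calculation instruction, writing, by the instruction classifier, the calculation instruction into an instruction register in the instruction cache;
    when an arithmetic unit detects that the instruction register is not empty, fetching, by the arithmetic unit, a next calculation instruction from the instruction register, fetching the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and updating the fetched parameter into a second parameter register inside the arithmetic unit for calculation.
  2. The instruction execution method according to claim 1, further comprising:
    when the category is a hardware synchronization instruction, sending, by the instruction classifier, a first read-status instruction to a state detection unit;
    detecting, by the state detection unit according to the first read-status instruction, a state of the instruction register and a state of the arithmetic unit;
    when the state detection unit detects that the instruction register is not empty or that the arithmetic unit is in a busy state, sending first wait notification information to the instruction classifier;
    stopping, by the instruction classifier according to the first wait notification information, execution of the step of identifying the category of the current input instruction.
  3. The instruction execution method according to claim 2, further comprising:
    when the state detection unit detects that the instruction register is empty and the arithmetic unit is in an idle state, sending first stop-waiting notification information to the instruction classifier;
    resuming, by the instruction classifier according to the first stop-waiting notification information, execution of the step of identifying the category of the current input instruction.
  4. The instruction execution method according to claim 1, wherein before the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, the method further comprises:
    sending, by the instruction classifier, a second read-status instruction to a state detection unit;
    detecting, by the state detection unit according to the second read-status instruction, a state of the first parameter register;
    when the state detection unit detects that the first parameter register is full, sending second wait notification information to the instruction classifier;
    stopping, by the instruction classifier according to the second wait notification information, execution of the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  5. The instruction execution method according to claim 4, further comprising:
    when the state detection unit detects that the first parameter register is not full, sending second stop-waiting notification information to the instruction classifier;
    resuming, by the instruction classifier according to the second stop-waiting notification information, execution of the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  6. The instruction execution method according to claim 1, wherein after the instruction classifier writes the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction, the method further comprises:
    setting, by the instruction cache, a parameter status indicator bit corresponding to the first parameter register into which the parameter was written to a first value;
    wherein fetching, by the arithmetic unit, the written parameter from the corresponding first parameter register according to the fetched calculation instruction comprises:
    fetching, by the arithmetic unit according to the fetched calculation instruction, from the corresponding first parameter register, the parameter whose parameter status indicator bit is the first value;
    and the instruction execution method further comprises:
    setting, by the instruction cache, the parameter status indicator bit corresponding to the first parameter register from which the parameter was fetched to a second value.
  7. The instruction execution method according to claim 1, wherein after the instruction classifier writes the calculation instruction into the instruction register in the instruction cache, the method further comprises:
    updating, by the instruction cache, a write address of the instruction register.
  8. The instruction execution method according to claim 1, wherein after the arithmetic unit fetches the next calculation instruction from the instruction register, the method further comprises:
    updating, by the instruction cache, a read address of the instruction register.
  9. The instruction execution method according to claim 1, further comprising:
    when the category is a central processing unit instruction, sending, by the instruction classifier, the central processing unit instruction to a central processing arithmetic unit;
    executing, by the central processing arithmetic unit, the central processing unit instruction.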
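  10. An instruction execution apparatus, comprising:
    an instruction cache, the instruction cache comprising a first parameter register and an instruction register;
    an instruction classifier, configured to identify a category of a current input instruction; when the category is a parameter configuration instruction, write a corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction; and when the category is a calculation instruction, write the calculation instruction into the instruction register in the instruction cache;
    an arithmetic unit, configured to, upon detecting that the instruction register is not empty, fetch a next calculation instruction from the instruction register, fetch the written parameter from the corresponding first parameter register according to the fetched calculation instruction, and update the fetched parameter into a second parameter register inside the arithmetic unit for calculation.
  11. The instruction execution apparatus according to claim 10, further comprising: a state detection unit;
    wherein the instruction classifier is further configured to: when the category is a hardware synchronization instruction, send a first read-status instruction to the state detection unit; and stop executing the step of identifying the category of the current input instruction according to received first wait notification information;
    the state detection unit is configured to: detect a state of the instruction register and a state of the arithmetic unit according to the first read-status instruction; and upon detecting that the instruction register is not empty or that the arithmetic unit is in a busy state, send the first wait notification information to the instruction classifier.
  12. The instruction execution apparatus according to claim 11, wherein the state detection unit is further configured to: upon detecting that the instruction register is empty and the arithmetic unit is in an idle state, send first stop-waiting notification information to the instruction classifier;
    the instruction classifier is further configured to: resume executing the step of identifying the category of the current input instruction according to the first stop-waiting notification information.
  13. The instruction execution apparatus according to claim 10, further comprising: a state detection unit;
    wherein the instruction classifier is further configured to: send a second read-status instruction to the state detection unit;
    the state detection unit is configured to: detect a state of the first parameter register according to the second read-status instruction; and upon detecting that the first parameter register is full, send second wait notification information to the instruction classifier;
    the instruction classifier is further configured to: stop, according to the second wait notification information, executing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  14. The instruction execution apparatus according to claim 13, wherein the state detection unit is further configured to: upon detecting that the first parameter register is not full, send second stop-waiting notification information to the instruction classifier;
    the instruction classifier is further configured to: resume, according to the second stop-waiting notification information, executing the step of writing the corresponding parameter into the corresponding first parameter register in the instruction cache according to the parameter configuration instruction.
  15. The instruction execution apparatus according to claim 10, wherein the instruction cache is configured to: set a parameter status indicator bit corresponding to the first parameter register into which the parameter was written to a first value; and set the parameter status indicator bit corresponding to the first parameter register from which the parameter was fetched to a second value;
    the arithmetic unit is specifically configured to: fetch, according to the fetched calculation instruction, from the corresponding first parameter register, the parameter whose parameter status indicator bit is the first value.
  16. The instruction execution apparatus according to claim 10, wherein the instruction cache is configured to:
    update a write address of the instruction register after the instruction classifier writes the calculation instruction into the instruction register in the instruction cache.
  17. The instruction execution apparatus according to claim 10, wherein the instruction cache is configured to:
    update a read address of the instruction register after the arithmetic unit fetches the next calculation instruction from the instruction register.
  18. The instruction execution apparatus according to claim 10, further comprising: a central processing arithmetic unit;
    wherein the instruction classifier is further configured to: when the category is a central processing unit instruction, send the central processing unit instruction to the central processing arithmetic unit;
    the central processing arithmetic unit is configured to: execute the central processing unit instruction.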
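  19. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the instruction execution method according to any one of claims 1-9.
  20. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the instruction execution method according to any one of claims 1-9.
  21. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the instruction execution method according to any one of claims 1-9.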
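The hardware synchronization behavior recited in claims 2-3 and 11-12 reduces to one condition: the classifier waits while the instruction register is not empty or the arithmetic unit is busy. The Python sketch below models only that condition. The function names, the string return values, and the one-cycle-per-instruction timing in `run_until_synced` are hypothetical simplifications, not details from the disclosure.

```python
def sync_status(instr_register_empty: bool, unit_busy: bool) -> str:
    """State detection for a hardware synchronization instruction."""
    if not instr_register_empty or unit_busy:
        return "WAIT"        # first wait notification
    return "STOP_WAIT"       # first stop-waiting notification


def run_until_synced(pending: list, busy_cycles: int) -> int:
    """Count the cycles the classifier stalls before it may resume."""
    stalled = 0
    while sync_status(len(pending) == 0, busy_cycles > 0) == "WAIT":
        if busy_cycles > 0:
            busy_cycles -= 1      # the arithmetic unit works on its current instruction
        else:
            pending.pop(0)        # the unit fetches the next queued instruction
            busy_cycles = 1       # assumption: every instruction takes one cycle
        stalled += 1
    return stalled
```

In this model the software side never polls or takes an interrupt; it is simply held at the synchronization instruction until the state detection result changes.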
PCT/CN2021/126841 2020-12-02 2021-10-27 Instruction execution method and apparatus, electronic device, and storage medium WO2022116750A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011401623.1A CN112559040B (zh) 2020-12-02 2020-12-02 Instruction execution method and apparatus, electronic device, and storage medium
CN202011401623.1 2020-12-02

Publications (1)

Publication Number Publication Date
WO2022116750A1 true WO2022116750A1 (zh) 2022-06-09

Family

ID=75048162

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/126841 WO2022116750A1 (zh) 2020-12-02 2021-10-27 Instruction execution method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN112559040B (zh)
WO (1) WO2022116750A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559040B (zh) * 2020-12-02 2021-12-28 北京百度网讯科技有限公司 Instruction execution method and apparatus, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101710272A (zh) * 2009-10-28 2010-05-19 北京龙芯中科技术服务中心有限公司 Instruction scheduling apparatus and method
CN102412965A (zh) * 2011-08-09 2012-04-11 深圳市德卡科技有限公司 Elliptic curve cryptography coprocessor
CN110825440A (zh) * 2018-08-10 2020-02-21 北京百度网讯科技有限公司 Instruction execution method and apparatus
CN111125627A (zh) * 2019-12-17 2020-05-08 中科寒武纪科技股份有限公司 Method for pooling multi-dimensional matrices and related products
US20200341758A1 (en) * 2017-12-29 2020-10-29 Nationz Technologies Inc. Convolutional Neural Network Hardware Acceleration Device, Convolutional Calculation Method, and Storage Medium
CN112559040A (zh) * 2020-12-02 2021-03-26 北京百度网讯科技有限公司 Instruction execution method and apparatus, electronic device, and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100388682C (zh) * 2005-03-21 2008-05-14 北京北方烽火科技有限公司 Method for improving quality of service in an SGSN network processor
CN101997834B (zh) * 2009-08-10 2015-01-07 北京多思科技发展有限公司 Apparatus supporting high-performance security protocols
US10235338B2 (en) * 2014-09-04 2019-03-19 Nvidia Corporation Short stack traversal of tree data structures
CN105785335B (zh) * 2016-03-28 2019-04-05 电子科技大学 cPCI-based automatic test system for digital array receive channel performance
US10560518B2 (en) * 2017-03-21 2020-02-11 Oracle International Corporation Cloud infrastructure optimization through client request classification

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101710272A (zh) * 2009-10-28 2010-05-19 北京龙芯中科技术服务中心有限公司 Instruction scheduling apparatus and method
CN102412965A (zh) * 2011-08-09 2012-04-11 深圳市德卡科技有限公司 Elliptic curve cryptography coprocessor
US20200341758A1 (en) * 2017-12-29 2020-10-29 Nationz Technologies Inc. Convolutional Neural Network Hardware Acceleration Device, Convolutional Calculation Method, and Storage Medium
CN110825440A (zh) * 2018-08-10 2020-02-21 北京百度网讯科技有限公司 Instruction execution method and apparatus
CN111125627A (zh) * 2019-12-17 2020-05-08 中科寒武纪科技股份有限公司 Method for pooling multi-dimensional matrices and related products
CN112559040A (zh) * 2020-12-02 2021-03-26 北京百度网讯科技有限公司 Instruction execution method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN112559040A (zh) 2021-03-26
CN112559040B (zh) 2021-12-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899784

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899784

Country of ref document: EP

Kind code of ref document: A1