CN116244214A - Model parameter acquisition method and device, server device and storage medium - Google Patents

Model parameter acquisition method and device, server device and storage medium

Info

Publication number
CN116244214A
Authority
CN
China
Prior art keywords
parameter
acquired
hash bucket
storage
address index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310125190.9A
Other languages
Chinese (zh)
Inventor
陆游游
谢旻晖
舒继武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202310125190.9A priority Critical patent/CN116244214A/en
Publication of CN116244214A publication Critical patent/CN116244214A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/30003 Arrangements for executing specific machine instructions
    • G06F 9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer And Data Communications (AREA)

Abstract

The application provides a model parameter acquisition method and apparatus, a server device, and a storage medium. The method is applied to a central processing unit (CPU) located in a server device that further comprises a network card and a persistent memory, the model parameters being stored in the persistent memory. The method comprises: acquiring a parameter acquisition request sent by a client device, the parameter acquisition request comprising at least one address index identifier of a parameter to be acquired; determining, according to the address index identifiers of the parameters to be acquired and preset address index information, parameter description information corresponding to the parameter acquisition request, the parameter description information comprising storage-related information of at least one parameter to be acquired; and sending the parameter description information to the network card, so that the network card acquires each parameter to be acquired according to its storage-related information, generates a first parameter acquisition response, and sends the first parameter acquisition response to the client device. The scheme increases the speed at which the server device reads model parameters.

Description

Model parameter acquisition method and device, server device and storage medium
Technical Field
The present disclosure relates to storage technologies, and in particular, to a method and apparatus for obtaining model parameters, a server device, and a storage medium.
Background
With the development of artificial intelligence technology, the number of trainable parameters in machine learning models has grown enormously. Larger models improve accuracy, but they also require more storage space and make storing and reading the model parameters in real time harder.
At present, the model parameters of a machine learning model are stored in, and read in real time from, a parameter server: a large number of model parameters are kept in the parameter server's dynamic random access memory (DRAM), whose low read latency makes real-time reading possible. However, DRAM must be refreshed periodically to replenish the charge it loses over time, which consumes a large amount of power. Moreover, after the parameter server restarts because of a fault, the CPU must reload the model parameters from the persistent memory into the DRAM, which wastes time and may prevent the parameter server from reading the model parameters in real time after the restart.
Disclosure of Invention
The application provides a model parameter acquisition method and apparatus, a server device and a storage medium, which are intended to solve the prior-art problems that realizing real-time reading of model parameters consumes a large amount of power, and that the model parameters may not be readable in real time after the parameter server restarts.
According to a first aspect of the present application, a method for obtaining model parameters is provided, and the method is applied to a central processing unit CPU, where the CPU is located in a server device, and the server device further includes a network card and a persistent memory, where the model parameters are stored in the persistent memory; the method comprises the following steps:
acquiring a parameter acquisition request sent by a client device; the parameter acquisition request comprises at least one address index identifier of a parameter to be acquired;
determining, according to the address index identifiers of the parameters to be acquired and preset address index information, parameter description information corresponding to the parameter acquisition request; the parameter description information comprises storage-related information of at least one parameter to be acquired, and the storage-related information of a parameter to be acquired comprises the storage address and the data length of that parameter;
sending the parameter description information to the network card, so that the network card acquires each parameter to be acquired according to its storage-related information, generates a first parameter acquisition response, and sends the first parameter acquisition response to the client device; the first parameter acquisition response comprises at least one parameter to be acquired.
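The following is a minimal, illustrative sketch in C of the CPU-side flow just described, assuming hypothetical helpers lookup_index() (which maps an address index identifier to a storage address and data length) and nic_post_descriptor() (which hands the parameter description information to the network card). The names, types and signatures are assumptions introduced for illustration only and do not come from the application itself.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct {            /* storage-related information of one parameter */
    uint64_t storage_addr;  /* address of the parameter in persistent memory */
    uint32_t data_len;      /* size of the parameter in bytes */
} param_loc_t;

/* assumed to be provided elsewhere */
bool lookup_index(uint64_t index_id, param_loc_t *out);
void nic_post_descriptor(uint32_t client_id, const param_loc_t *locs, size_t n);

/* handle one parameter acquisition request from a client device */
void handle_get_request(uint32_t client_id,
                        const uint64_t *index_ids, size_t n_ids)
{
    param_loc_t locs[64];   /* parameter description information */
    size_t n_found = 0;

    for (size_t i = 0; i < n_ids && n_found < 64; i++) {
        param_loc_t loc;
        if (lookup_index(index_ids[i], &loc))   /* preset address index information */
            locs[n_found++] = loc;
    }
    /* the CPU never touches the parameter data itself; the network card
     * reads it from persistent memory and replies to the client device */
    nic_post_descriptor(client_id, locs, n_found);
}
```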
Optionally, the preset address index information comprises a mapping relation between model parameter address index identifiers and model parameter storage-related information, where the storage-related information of a model parameter comprises its storage address and data length. Determining the parameter description information corresponding to the parameter acquisition request according to the address index identifiers of the parameters to be acquired and the preset address index information comprises: querying, in the mapping relation, the storage-related information corresponding to each address index identifier of a parameter to be acquired; and in response to every address index identifier in the parameter acquisition request having been queried, determining the storage-related information found for the parameters to be acquired as the parameter description information corresponding to the parameter acquisition request.
Optionally, the mapping relation comprises at least one preset hash bucket, the preset hash buckets are arranged in order, and each preset hash bucket comprises an overflow count identifier and a preset number of key-value pair storage bits; a model parameter address index identifier and its corresponding storage-related information are stored in a key-value pair storage bit in the form of a key-value pair. Querying, in the mapping relation, the storage-related information corresponding to each address index identifier of a parameter to be acquired comprises performing the following operations for each such identifier: calculating, with a preset hash algorithm, the starting hash bucket corresponding to the address index identifier, and taking the starting hash bucket as the hash bucket to be queried; querying, in the hash bucket to be queried, the key-value pair whose key is the address index identifier of the parameter to be acquired; in response to the key-value pair not being found and the overflow count identifier of the hash bucket to be queried not being a first preset value, updating the hash bucket to be queried according to the ordered arrangement of the preset hash buckets and a preset step length; repeating the steps from querying the key-value pair in the hash bucket to be queried to updating the hash bucket to be queried, until the key-value pair is found, or until the key-value pair is not found and the overflow count identifier of the hash bucket to be queried is the first preset value; and in response to the key-value pair being found, determining the value of the key-value pair as the storage-related information corresponding to the address index identifier of the parameter to be acquired.
Optionally, each preset hash bucket further comprises a migration identifier. Querying, in the hash bucket to be queried, the key-value pair whose key is the address index identifier of the parameter to be acquired comprises: if the migration identifier of the hash bucket to be queried indicates that entries have been migrated, updating the hash bucket to be queried to the starting hash bucket corresponding to the address index identifier of the parameter to be acquired.
Optionally, the model parameter acquisition method further comprises: acquiring hot-spot parameter information sent by the client device, the hot-spot parameter information comprising at least one hot-spot parameter address index identifier and a heat value for each hot-spot parameter, where the heat value represents how many times the client device acquired the model parameter within a preset time period and is positively correlated with that number; calculating, with the preset hash algorithm, the starting hash bucket corresponding to each hot-spot parameter address index identifier; and performing the following operations for each hot-spot parameter address index identifier, in descending order of heat value: querying, in the corresponding starting hash bucket, the hot-spot key-value pair whose key is the hot-spot parameter address index identifier; in response to the hot-spot key-value pair not being found in the starting hash bucket, querying the whole mapping relation for the key-value pair whose key is the hot-spot parameter address index identifier; and in response to the hot-spot key-value pair being found in the mapping relation, writing the hot-spot key-value pair into the starting hash bucket corresponding to the hot-spot parameter address index identifier and deleting it from the hash bucket in which it originally resided.
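A minimal sketch of this hot-spot promotion follows, under assumed types: a bucket holds a fixed number of key-value slots, hash_start() returns the starting hash bucket of a key, and find_in_bucket()/find_in_table() locate an entry. All names and the slot count are illustrative assumptions, not definitions taken from the application; the case where the starting hash bucket has no free slot is not covered here.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOTS 12                       /* key-value storage bits per bucket (assumed) */

typedef struct { uint64_t key, addr; uint32_t len; bool used; } slot_t;
typedef struct { uint8_t overflow; slot_t slot[SLOTS]; } bucket_t;

extern bucket_t table[];               /* ordered preset hash buckets */
extern size_t   n_buckets;

size_t  hash_start(uint64_t key);                       /* preset hash algorithm (assumed) */
slot_t *find_in_bucket(bucket_t *b, uint64_t key);      /* assumed helpers */
slot_t *find_in_table(uint64_t key, bucket_t **home);

/* ids[] is sorted by heat value, highest first */
void promote_hotspots(const uint64_t *ids, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bucket_t *start = &table[hash_start(ids[i])];
        if (find_in_bucket(start, ids[i]))
            continue;                          /* already in its starting hash bucket */
        bucket_t *home;
        slot_t *old = find_in_table(ids[i], &home);
        if (!old)
            continue;                          /* not stored in the mapping relation */
        for (int s = 0; s < SLOTS; s++) {      /* copy into a free slot of the starting bucket */
            if (!start->slot[s].used) {
                start->slot[s] = *old;         /* write to the starting bucket first ... */
                old->used = false;             /* ... then delete from the original bucket */
                break;
            }
        }
    }
}
```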
Optionally, deleting the hot-spot key-value pair from the hash bucket in which it originally resided comprises: after the hot-spot key-value pair has been written into the starting hash bucket corresponding to the hot-spot parameter address index identifier, marking the key-value pair storage bit that holds the hot-spot key-value pair in the original hash bucket as to-be-released; acquiring a global thread reading identifier and the private thread reading identifier of each read thread, and determining from them whether an earlier-started read thread still exists, i.e. a read thread that began running before the hot-spot key-value pair was written into its starting hash bucket; and if no such earlier-started read thread exists, releasing the key-value pair storage bit marked as to-be-released in the original hash bucket.
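The deferred release just described can be modelled as simple epoch-based reclamation. In the sketch below, the global and private "thread reading identifiers" are represented by epoch counters; the use of C11 atomics, the value 0 as the idle marker and all names are assumptions made only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_READERS 32
#define IDLE 0                                         /* reader not currently reading */

static _Atomic uint64_t global_epoch = 1;              /* global thread reading identifier */
static _Atomic uint64_t reader_epoch[MAX_READERS];     /* private thread reading identifiers */

void reader_begin(int tid) {   /* a read thread records the epoch in which it started */
    atomic_store(&reader_epoch[tid], atomic_load(&global_epoch));
}
void reader_end(int tid) {
    atomic_store(&reader_epoch[tid], IDLE);
}

/* called right after the hot-spot key-value pair has been written to its
 * starting hash bucket; bumps the epoch and returns the retirement epoch */
uint64_t retire_slot(void) {
    return atomic_fetch_add(&global_epoch, 1);
}

/* the slot marked to-be-released may be freed only when no read thread that
 * started before the retirement epoch is still running */
bool can_release_slot(uint64_t retire_epoch) {
    for (int i = 0; i < MAX_READERS; i++) {
        uint64_t e = atomic_load(&reader_epoch[i]);
        if (e != IDLE && e <= retire_epoch)
            return false;          /* an earlier-started read thread still runs */
    }
    return true;
}
```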
Optionally, the model parameter acquisition method further comprises: acquiring a parameter insertion request sent by the client device, the parameter insertion request comprising at least one parameter to be inserted and its address index identifier; storing each parameter to be inserted into the persistent memory to obtain its storage-related information, the storage-related information of a parameter to be inserted comprising its storage address and data length; and performing the following operations for each address index identifier of a parameter to be inserted: calculating, with the preset hash algorithm, the starting hash bucket corresponding to the address index identifier, and taking the starting hash bucket as the hash bucket to be inserted; determining whether an unoccupied key-value pair storage bit exists in the hash bucket to be inserted; in response to no unoccupied key-value pair storage bit existing, increasing the overflow count identifier of the hash bucket to be inserted by a second preset value, updating the hash bucket to be inserted according to the ordered arrangement of the preset hash buckets and the preset step length, and repeating the step of determining whether an unoccupied key-value pair storage bit exists, until an unoccupied key-value pair storage bit is found, or until no preset hash bucket has an unoccupied key-value pair storage bit; and in response to an unoccupied key-value pair storage bit existing, storing the address index identifier of the parameter to be inserted as the key and its storage-related information as the value into that storage bit.
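A sketch of this insertion path is given below, reusing the assumed bucket types from the earlier sketch; the step length, slot count and the choice of 1 as the second preset value are illustrative assumptions, not values taken from the application.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOTS 12
#define STEP  2                       /* preset step length (assumed) */

typedef struct { uint64_t key, addr; uint32_t len; bool used; } slot_t;
typedef struct { uint8_t overflow; slot_t slot[SLOTS]; } bucket_t;

extern bucket_t table[];              /* ordered preset hash buckets */
extern size_t   n_buckets;
size_t hash_start(uint64_t key);      /* preset hash algorithm (assumed) */

bool insert_param(uint64_t key, uint64_t addr, uint32_t len)
{
    size_t b = hash_start(key);                          /* starting hash bucket */
    for (size_t probes = 0; probes < n_buckets; probes++) {
        bucket_t *bk = &table[b];
        for (int s = 0; s < SLOTS; s++) {
            if (!bk->slot[s].used) {                     /* unoccupied storage bit found */
                bk->slot[s] = (slot_t){ key, addr, len, true };
                return true;
            }
        }
        bk->overflow += 1;                               /* second preset value taken as 1 */
        b = (b + STEP) % n_buckets;                      /* next bucket on the probe path */
    }
    return false;                                        /* no preset hash bucket has room */
}
```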
According to a second aspect of the present application, a model parameter acquisition method is provided, applied to a network card, where the network card is located in a server device, the server device further comprises a CPU and a persistent memory, and model parameters are stored in the persistent memory; the method comprises the following steps:
receiving parameter description information sent by the CPU; the parameter description information comprises storage-related information of at least one parameter to be acquired, and the storage-related information of a parameter to be acquired comprises its storage address and data length;
acquiring each parameter to be acquired according to its storage-related information, generating a first parameter acquisition response, and sending the first parameter acquisition response to the client device; the first parameter acquisition response comprises at least one parameter to be acquired.
Optionally, acquiring each parameter to be acquired according to its storage address comprises: acquiring, by direct memory access (DMA), each parameter to be acquired from its storage address according to its data length.
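The sketch below illustrates the network-card-side gather: each entry of the parameter description information becomes one DMA read from persistent memory, and the results are packed into the first parameter acquisition response. dma_read() and send_to_client() are assumed placeholder primitives, not an actual network-card API.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { uint64_t storage_addr; uint32_t data_len; } param_loc_t;

void dma_read(uint64_t src_addr, void *dst, uint32_t len);         /* assumed */
void send_to_client(uint32_t client_id, const void *buf, size_t len);

void nic_handle_descriptor(uint32_t client_id,
                           const param_loc_t *locs, size_t n)
{
    static uint8_t resp[1 << 20];         /* response buffer on the network card */
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        /* read the parameter directly from persistent memory via DMA,
         * without involving the CPU or occupying its cache */
        dma_read(locs[i].storage_addr, resp + off, locs[i].data_len);
        off += locs[i].data_len;
    }
    send_to_client(client_id, resp, off); /* first parameter acquisition response */
}
```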
According to a third aspect of the present application, there is provided a model parameter obtaining apparatus, applied to a CPU, where the CPU is located in a server device, and the server device further includes a network card and a persistent memory, where model parameters are stored in the persistent memory; the device comprises:
The acquisition module is used for acquiring a parameter acquisition request sent by the client equipment; the parameter acquisition request comprises at least one parameter address index identifier to be acquired;
the determining module is used for determining parameter description information corresponding to the parameter acquisition request according to the parameter address index identifier to be acquired and preset address index information; the parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired;
the sending module is used for sending the parameter description information to the network card so that the network card can acquire the parameters to be acquired according to the relevant information of the parameter storage to be acquired, generate a first parameter acquisition response and send the first parameter acquisition response to the client equipment; the first parameter acquisition response includes at least one parameter to be acquired.
According to a fourth aspect of the present application, there is provided a model parameter obtaining apparatus applied to a network card, where the network card is located in a server device, and the server device further includes a CPU and a persistent memory, where model parameters are stored in the persistent memory; the device comprises:
The receiving module is used for receiving the parameter description information sent by the CPU; the parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired;
the sending module is used for acquiring each parameter to be acquired according to its storage-related information, generating a first parameter acquisition response, and sending the first parameter acquisition response to the client device; the first parameter acquisition response comprises at least one parameter to be acquired.
According to a fifth aspect of the present application, there is provided a server device, comprising: a CPU, a network card, a persistent memory and a memory; the network card comprises a processor;
the CPU, the network card, the persistent memory and the memory are electrically interconnected;
the memory is used for storing first computer-executable instructions and second computer-executable instructions, and the network card is used for receiving and transmitting data;
the persistent memory is used for storing model parameters;
the CPU executes the first computer-executable instructions to implement the method according to the first aspect, and the processor executes the second computer-executable instructions to implement the method according to the second aspect.
According to a sixth aspect of the present application there is provided a computer readable storage medium having stored therein computer executable instructions for carrying out the method according to the first and/or second aspects when executed by a processor.
The method is applied to a central processing unit (CPU) located in the server device; the server device further comprises a network card and a persistent memory, and the model parameters are stored in the persistent memory. A parameter acquisition request sent by a client device is acquired, the parameter acquisition request comprising at least one address index identifier of a parameter to be acquired; parameter description information corresponding to the parameter acquisition request is determined according to the address index identifiers of the parameters to be acquired and preset address index information, the parameter description information comprising storage-related information of at least one parameter to be acquired, which in turn comprises the storage address and data length of that parameter; the parameter description information is sent to the network card, so that the network card acquires each parameter to be acquired according to its storage-related information, generates a first parameter acquisition response containing at least one parameter to be acquired, and sends it to the client device. Because the model parameters are stored in the persistent memory, the data survive a power-off restart, so the server device does not have to spend time rewriting the model parameters after a restart. Meanwhile, the CPU only has to determine the parameter description information corresponding to the parameter acquisition request and send it to the network card; the network card reads the parameters to be acquired from their storage addresses and sends them to the client. This avoids stalling the CPU on accesses to the persistent memory, and the network card's reads do not occupy the CPU cache. The scheme therefore increases the speed at which the server device reads model parameters and ensures that the model parameters can still be read in real time after the server device restarts.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a network architecture diagram corresponding to an application scenario of a model parameter acquisition method according to an embodiment of the present application;
FIG. 2 is a flow chart of a model parameter acquisition method according to an embodiment of the present application;
FIG. 3 is a flow chart of a model parameter acquisition method according to a second embodiment of the present application;
fig. 4 is a flowchart of a model parameter acquisition method according to a third embodiment of the present application;
FIG. 5 is a flowchart of a model parameter acquisition method according to a fourth embodiment of the present application;
FIG. 6 is a schematic flow chart of a model parameter acquisition method in a server device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a model parameter acquiring apparatus provided according to a fifth embodiment of the present application;
fig. 8 is a schematic structural diagram of a model parameter acquiring apparatus according to a sixth embodiment of the present application;
fig. 9 is a schematic structural diagram of a server device according to a seventh embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
The prior art to which the present application relates is described in detail and analyzed below.
With the development of artificial intelligence technology, the wave of large models has swept across the various subfields of artificial intelligence, and the number of trainable parameters in machine learning models has rapidly grown from roughly one hundred million to more than one trillion, an increase of more than four orders of magnitude. The growing number of model parameters improves model accuracy, but it also places higher demands on the storage and real-time reading of the model parameters.
In the prior art, because a trained machine learning model has too many parameters to store locally, the trained model parameters are usually kept in a parameter server, which requires a large amount of storage space. When the machine learning model performs inference on the client device, the required model parameters can be acquired in real time by communicating with the parameter server, which saves storage resources on the client device. The parameter server stores the model parameters in DRAM; because DRAM can be read and written at any time, the model parameters can be read in real time and sent to the client device, meeting the online inference requirements of the machine learning model.
However, in the prior art, because DRAM stores data as charge on capacitors, it must be refreshed periodically to replenish the lost charge, which accounts for a large share of the parameter server's power consumption. Moreover, once DRAM is powered off the charge on the capacitors continues to leak away, so data cannot be retained across a power loss. When the parameter server is powered off and restarted because of a fault or for other reasons, the model parameters stored in DRAM are lost and the parameter server must rewrite them into DRAM from the persistent memory, which takes a long time for a large number of model parameters. During this process the CPU is busy reading parameters from the persistent memory and writing them into DRAM, so it cannot respond in time to parameter acquisition requests from client devices; the parameter server therefore cannot read the model parameters in real time, and a client device that is performing model inference cannot obtain the model parameters in real time.
In summary, the prior art has the problems that a large amount of power resources are consumed to realize the real-time reading of the model parameters, and the real-time reading of the model parameters may not be realized after the parameter server is restarted.
Therefore, to solve the problems in the prior art and save power, the inventors found through creative research that the periodically refreshed DRAM can be dispensed with by storing the model parameters in persistent memory: data stored in persistent memory survive a power-off restart, so they do not need to be rewritten, and the time otherwise spent rewriting data after the parameter server restarts is avoided. However, the read/write speed of persistent memory is lower than that of DRAM, with read latency often three times that of DRAM or more, while the number of model parameters that can be stored in persistent memory is also larger than in DRAM, so the CPU needs more time to read model parameters from persistent memory than from DRAM. This means that, with the model parameters in persistent memory, the CPU must locate the parameters needed by the client device on a slower storage medium and among more parameters, so the parameter server may be unable to read the model parameters in real time and cannot meet the client device's need to obtain them in real time during model inference. Although this could be alleviated by improving CPU performance, the achievable improvement is limited and does not fundamentally solve the problem that the slow read speed of persistent memory prevents the parameter server from obtaining the model parameters in real time.
The inventors therefore propose the scheme of the present application: the CPU only processes the data acquisition request sent by the client and determines the storage addresses of the model parameters the client device needs, without itself fetching the model parameters from the persistent memory; the network card fetches the model parameters needed by the client device according to those storage addresses and then sends them to the client device. This increases the speed at which the parameter server serves model parameters and meets the client device's need to obtain them in real time.
The technical scheme of the application is applied to a central processing unit (CPU) located in a server device; the server device further comprises a network card and a persistent memory, and the model parameters are stored in the persistent memory. A parameter acquisition request sent by a client device is acquired, the parameter acquisition request comprising at least one address index identifier of a parameter to be acquired; parameter description information corresponding to the parameter acquisition request is determined according to the address index identifiers of the parameters to be acquired and preset address index information, the parameter description information comprising storage-related information of at least one parameter to be acquired, which in turn comprises the storage address and data length of that parameter; the parameter description information is sent to the network card, so that the network card acquires each parameter to be acquired according to its storage-related information, generates a first parameter acquisition response containing at least one parameter to be acquired, and sends it to the client device. Because the model parameters are stored in the persistent memory, the data survive a power-off restart, so the server device does not have to spend time rewriting the model parameters after a restart. Meanwhile, the CPU only has to determine the parameter description information corresponding to the parameter acquisition request and send it to the network card; the network card reads the parameters to be acquired from their storage addresses and sends them to the client. This avoids stalling the CPU on accesses to the persistent memory, and the network card's reads do not occupy the CPU cache. The scheme therefore increases the speed at which the server device reads model parameters and ensures that the model parameters can still be read in real time after the server device restarts.
The method, the device, the server device and the storage medium for obtaining the model parameters aim to solve the technical problems in the prior art. The following describes the technical solutions of the present application and how the technical solutions of the present application solve the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
The network architecture and application scenario of the model parameter acquisition method provided in the embodiments of the present application will be described below. When the following description refers to the accompanying drawings, the same data in different drawings represents the same or similar elements, unless otherwise indicated.
Fig. 1 is a network architecture diagram corresponding to an application scenario of a model parameter acquisition method provided in an embodiment of the present application. As shown in fig. 1, a network architecture corresponding to an application scenario provided in an embodiment of the present application includes: a client device 11 and a server device 12. The client device 11 is communicatively connected to the server device 12. Server device 12 includes CPU121, network card 122, and persistent memory 123, with CPU121, network card 122, and persistent memory 123 being electrically interconnected.
The client device 11 is provided with a machine learning model, and model parameters that the machine learning model has trained are stored in persistent memory 123.
Model parameters required by the machine learning model on the client device 11 during the reasoning process are obtained by the client device 11 from the server device 12 in real time. The client device 11 acquires the required model parameters in real time by sending a parameter acquisition request to the server device 12. The parameter acquisition request comprises at least one parameter address index identifier to be acquired.
The server device 12 stores preset address index information, and after the server device 12 receives the parameter acquisition request sent by the client device 11, the CPU121 of the server device 12 acquires the parameter acquisition request sent by the client device 11.
The CPU121 determines parameter description information corresponding to the parameter acquisition request according to the parameter address index identifier to be acquired and preset address index information. The parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired. The CPU121 sends the parameter description information to the network card 122, so that the network card obtains each parameter to be obtained according to the information related to the storage of each parameter to be obtained, generates a first parameter obtaining response, and sends the first parameter obtaining response to the client device. The first parameter acquisition response includes at least one parameter to be acquired.
In the application scenario of the application, the number of the client devices may be multiple, the number of the server devices may also be multiple, the client may obtain different model parameters from multiple server devices at the same time, and the server device may also provide model parameters for multiple client devices at the same time. The server device may be used as a parameter server. The application scenario of the present application is described in fig. 1 by the interaction of a client and a server, which is only for the convenience of understanding of those skilled in the art, and does not limit the scope of the present application.
Embodiments of the present application will be described below with reference to the accompanying drawings. The embodiments described in the examples below do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
Example 1
Fig. 2 is a flowchart of a model parameter obtaining method according to an embodiment of the present application. As shown in fig. 2, the method provided in this embodiment is applied to a CPU, and the execution body is a model parameter obtaining device, where the model parameter obtaining device is located in the CPU, and the CPU is located in the server device. The server device also comprises a network card and a persistent memory, and the model parameters are stored in the persistent memory. The method for obtaining model parameters provided in the present embodiment includes steps 201 to 203.
Step 201, acquiring a parameter acquisition request sent by a client device; the parameter acquisition request comprises at least one parameter address index identifier to be acquired.
In this embodiment, the client device is communicatively connected to the server device. The number of client devices may be one or more. The server side equipment receives a parameter acquisition request sent by the client side equipment through the network card and stores the parameter acquisition request in a cache of the network card. The CPU may obtain a parameter obtaining request sent by the client device from the cache of the network card.
In this embodiment, the parameter address index identifier to be acquired is an address index identifier of a parameter to be acquired, and the address index identifier of the parameter is used to index the parameter storage related information in the preset address index information. The address index identification may be a number, unique code, etc. of the model parameters.
Illustratively, a model parameter γ stored in the persistent memory has the unique code 001, a data length of 4 bytes and the storage address 0000H; its storage-related information then comprises the storage address 0000H and the data length of 4 bytes. The storage address 0000H is the number of the first storage unit occupied by the model parameter. The address index identifier of the model parameter γ is 001. The address index identifier 001 of the model parameter γ and its corresponding storage-related information may be stored in the persistent memory as a key-value pair.
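Purely for illustration, the γ example could be represented as the following key-value entry; the field names and widths are assumptions introduced here and are not definitions from the application.

```c
#include <stdint.h>

typedef struct {
    uint64_t index_id;       /* address index identifier, e.g. 001 */
    uint64_t storage_addr;   /* number of the first storage unit, e.g. 0x0000 */
    uint32_t data_len;       /* data length in bytes, e.g. 4 */
} param_entry_t;

static const param_entry_t gamma_entry = { 1, 0x0000, 4 };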
Step 202, determining parameter description information corresponding to a parameter acquisition request according to a parameter address index identifier to be acquired and preset address index information; the parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired.
In this embodiment, the model parameters are stored in the persistent memory, and the information related to the storage of the model parameters is the location of the model parameters stored in the persistent memory. For example, the storage related information of the model parameter may include a storage address and a data length of the model parameter, where the storage address may be a number of a first storage unit where the model parameter is located. The model parameters can be found and read from the persistent memory by storing the relevant information for the model parameters. The data is stored in a binary manner in the computer, and 8 binary bits are one byte. The data length represents the size of the memory space in which the model parameters are stored. For example, the data length of the model parameter α is 2 bytes, and the number of the first memory location in the persistent memory is 0000H, and the model parameter α can be read from both the 0000H and 0001H memory locations.
In this embodiment, the preset address index information may be a single-layer index, which is used to determine the corresponding model parameter storage address according to the model parameter address index identifier.
In this embodiment, the parameter description information includes information related to parameter storage to be acquired, and the CPU determines information related to parameter storage to be acquired according to the address index identifier of each parameter to be acquired in the parameter acquisition request, so as to generate parameter description information corresponding to the parameter acquisition request, so that the network card can acquire each parameter to be acquired from the persistent memory.
As an alternative embodiment, the preset address index information includes a mapping relationship between the model parameter address index identifier and the model parameter storage related information, the model parameter storage related information includes a storage address and a data length of the model parameter, and step 202 refinement includes steps 301 to 302.
Step 301, inquiring the relevant information of the parameter storage to be acquired corresponding to the address index identifier of each parameter to be acquired in the mapping relation.
In this embodiment, the mapping relationship between the model parameter address index identifier and the information related to the model parameter storage may be a mapping relationship table. The mapping relation table may be stored in a persistent memory, and the CPU may acquire the mapping relation table from the persistent memory and write it into a cache of the CPU after acquiring the parameter acquisition request.
Step 302, in response to the query of each parameter address index to be acquired in the parameter acquisition request, storing related information of the queried parameter to be acquired, and determining the related information as parameter description information corresponding to the parameter acquisition request.
In this embodiment, the CPU queries each parameter address index identifier to be acquired in the parameter acquisition request in the mapping relationship, and if it is queried that the parameter storage related information to be acquired corresponding to the parameter address index identifier to be acquired is stored, puts the parameter address information to be acquired into the parameter description information corresponding to the parameter acquisition request. After the address index identifiers of the parameters to be acquired in the parameter acquisition request are all queried, determining the parameter storage related information of the parameters to be acquired, which are queried to the parameter storage related information, in the parameter acquisition request as parameter description information corresponding to the parameter acquisition request.
For example, suppose the parameter acquisition request contains the address index identifiers "1000" and "1001" of the parameters to be acquired, corresponding to the parameters α and β respectively. The CPU queries the mapping relation with "1000" and "1001" as keys. If the storage address "A1" corresponding to the identifier "1000" is found, "A1" is placed into the parameter description information corresponding to the parameter acquisition request as the storage-related information of the parameter α; if no storage-related information corresponding to the identifier "1001" is found, then after both identifiers "1000" and "1001" have been queried, the storage-related information "A1" of the parameter α, which was successfully found, is determined as the parameter description information corresponding to the parameter acquisition request.
According to the model parameter acquisition method provided by this embodiment, the preset address index information comprises a mapping relation between model parameter address index identifiers and model parameter storage-related information, where the storage-related information of a model parameter comprises its storage address and data length; the storage-related information corresponding to each address index identifier of a parameter to be acquired is queried in the mapping relation; and in response to every address index identifier in the parameter acquisition request having been queried, the storage-related information found for the parameters to be acquired is determined as the parameter description information corresponding to the parameter acquisition request. Because every address index identifier in the parameter acquisition request is queried, the parameter description information contains the storage-related information that was found, including the storage address and data length of each parameter to be acquired, which helps the network card quickly fetch the parameters to be acquired from their storage addresses and send them to the client device.
Step 203, sending the parameter description information to the network card, so that the network card obtains each parameter to be obtained according to the relevant information of the parameter storage to be obtained, generates a first parameter obtaining response, and sends the first parameter obtaining response to the client device; the first parameter acquisition response includes at least one parameter to be acquired.
In this embodiment, after determining the parameter description information corresponding to the parameter acquisition request, the CPU sends the parameter description information to the network card.
After the network card receives the parameter description information sent by the CPU, acquiring each parameter to be acquired from the storage address of each parameter to be acquired, and generating a first parameter acquisition response after the parameters to be acquired corresponding to the parameter storage related information to be acquired are acquired. The first parameter acquisition response comprises at least one parameter to be acquired by the network card.
It may be understood that the parameter obtaining request and the parameter description information corresponding to the parameter obtaining request both include the client device identifier, and the first parameter obtaining response also includes the client device identifier, so that the network card may send the generated first parameter obtaining response to the client device corresponding to the client device identifier.
In this embodiment, the first parameter obtaining response may further include: and the parameter address index identification to be acquired in the parameter description information. The address index identifier of the parameter to be acquired in the first parameter acquisition response is used for notifying the client device that the corresponding parameter to be acquired cannot be acquired in the server device.
Continuing with the description according to the above example, the parameter description information includes: the network card reads the parameter to be acquired from the parameter storage address A1, and generates a corresponding first parameter acquisition response after the parameter to be acquired stored in each parameter storage address in the parameter description information is acquired. In this example, the first parameter acquisition response includes: the parameter α is to be obtained.
As an optional implementation, if, for every address index identifier of a parameter to be acquired in the parameter acquisition request, no corresponding storage-related information is found in the mapping relation, the CPU generates a second parameter acquisition response corresponding to the parameter acquisition request and sends it to the network card, so that the network card sends the second parameter acquisition response to the client device; the second parameter acquisition response comprises at least one address index identifier of a parameter to be acquired.
The model parameter acquisition method provided by this embodiment is applied to a central processing unit (CPU) located in a server device; the server device further comprises a network card and a persistent memory, and the model parameters are stored in the persistent memory. A parameter acquisition request sent by a client device is acquired, the parameter acquisition request comprising at least one address index identifier of a parameter to be acquired; parameter description information corresponding to the parameter acquisition request is determined according to the address index identifiers of the parameters to be acquired and preset address index information, the parameter description information comprising storage-related information of at least one parameter to be acquired, which in turn comprises the storage address and data length of that parameter; the parameter description information is sent to the network card, so that the network card acquires each parameter to be acquired according to its storage-related information, generates a first parameter acquisition response containing at least one parameter to be acquired, and sends it to the client device. Because the model parameters are stored in the persistent memory, the data survive a power-off restart, so the server device does not have to spend time rewriting the model parameters after a restart. Meanwhile, the CPU only has to determine the parameter description information corresponding to the parameter acquisition request and send it to the network card; the network card reads the parameters to be acquired from their storage addresses and sends them to the client. This avoids stalling the CPU on accesses to the persistent memory, and the network card's reads do not occupy the CPU cache. The scheme therefore increases the speed at which the server device reads model parameters and ensures that the model parameters can still be read in real time after the server device restarts.
Example two
Fig. 3 is a flowchart of a model parameter acquisition method according to the second embodiment of the present application. As shown in fig. 3, in the model parameter acquisition method provided by this embodiment, on the basis of the first embodiment, the mapping relation comprises at least one preset hash bucket, the preset hash buckets are arranged in order, and each preset hash bucket comprises an overflow count identifier and a preset number of key-value pair storage bits; a model parameter address index identifier and its corresponding storage-related information are stored in a key-value pair storage bit in the form of a key-value pair. In this embodiment, step 301 is refined into step 401, and step 401 comprises steps 4011 to 4015.
Step 401, for each parameter address index identifier to be acquired, the following operations are performed, including steps 4011 to 4015.
In step 4011, a preset hash algorithm is adopted to calculate a starting hash bucket corresponding to the parameter address index identifier to be obtained, and the starting hash bucket is determined to be the hash bucket to be queried of the parameter address index identifier to be obtained.
In this embodiment, the hash algorithm is also called a digest algorithm, and the hash algorithm may calculate any input data to obtain an output digest with a fixed length. For a hash algorithm, the same output digest is necessarily obtained by inputting the same data, and different output digests are obtained by inputting different data with high probability. Therefore, the hash algorithm can be used for realizing the mapping relation between the index identifier of the parameter address to be acquired and the storage address of the parameter to be acquired. The same output digest obtained by inputting different data is called hash collision.
Because hash collisions exist, different address index identifiers of parameters to be acquired fed into the preset hash algorithm may yield the same output digest, and the number of model parameters is huge. To ensure that model parameters can be obtained quickly and accurately, hash collisions are resolved with a plurality of preset hash buckets arranged in order. Each preset hash bucket comprises a preset number of key-value pair storage bits, and a model parameter address index identifier together with its corresponding storage-related information is stored in a key-value pair storage bit in the form of a key-value pair. The output digest that the preset hash algorithm produces for a model parameter address index identifier is used as the identifier of a preset hash bucket: the address index identifier of the parameter to be acquired is fed into the preset hash algorithm, the corresponding preset hash bucket is calculated, and that preset hash bucket is determined as the starting hash bucket of the address index identifier. When the mapping relation comprises at least one preset hash bucket, the mapping relation may also be called a hash table.
Each preset hash bucket further comprises an overflow count identifier, which records how many model parameters have produced a hash collision on this bucket and overflowed to another bucket. When a hash collision occurs, a model parameter address index identifier and its storage-related information should be placed in the starting hash bucket corresponding to that identifier, but because that starting hash bucket has no spare key-value pair storage bit, the model parameter is placed in another preset hash bucket instead.
In this embodiment, the size of each preset hash bucket is preconfigured, for example to 256 bytes, which is the internal block size of currently commercial persistent memory. The overflow count identifier may occupy one byte of the preset hash bucket.
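One possible in-memory layout of such a 256-byte preset hash bucket is sketched below. The slot count, the presence of an occupancy bitmap and the field widths are assumptions chosen only so that the bucket fits exactly in the 256-byte internal block; they are not taken from the application.

```c
#include <stdint.h>
#include <assert.h>

typedef struct {
    uint64_t key;            /* model parameter address index identifier */
    uint64_t storage_addr;   /* storage address in persistent memory */
    uint32_t data_len;       /* data length of the model parameter */
} __attribute__((packed)) kv_slot_t;            /* 20 bytes per key-value pair */

typedef struct {
    uint8_t  overflow_count; /* overflow count identifier (1 byte) */
    uint8_t  migrated;       /* migration identifier */
    uint16_t used_bitmap;    /* which of the 12 slots are occupied */
    kv_slot_t slot[12];      /* preset number of key-value pair storage bits */
    uint8_t  pad[12];        /* padding up to the 256-byte block size */
} __attribute__((packed)) hash_bucket_t;

static_assert(sizeof(hash_bucket_t) == 256, "bucket must match the block size");
```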
In step 4012, the key-value pair whose key is the address index identifier of the parameter to be acquired is queried in the hash bucket to be queried.
Step 4013, in response to the query failing to obtain the key value pair and the overflow count identifier of the hash bucket to be queried not being the first preset value, updating the hash bucket to be queried according to the ordered arrangement and the preset step length of each preset hash bucket.
In this embodiment, the first preset value is used to indicate whether unoccupied key-value pair storage bits exist in a preset hash bucket. If the overflow count identifier of the preset hash bucket is the first preset value, unoccupied key-value pair storage bits exist in the preset hash bucket; if the overflow count identifier is not the first preset value, all key-value pair storage bits in the preset hash bucket are occupied.
Specifically, the index identifier of the parameter address to be obtained may be used as a keyword, and the key value pair to be obtained may be queried in the hash bucket to be queried. The key of the key value pair to be acquired is the index identifier of the parameter address to be acquired, and the value is the relevant information of the parameter storage to be acquired. It can be appreciated that the first queried hash bucket is the starting hash bucket corresponding to the parameter address index to be acquired. If the parameter address index identifier to be obtained is not inquired in the initial hash bucket, the overflow count identifier of the initial hash bucket needs to be obtained, if the overflow count identifier is a first preset value, the parameter address index identifier to be obtained is indicated to be not inquired in the mapping relation, and at this time, the client device possibly sends the wrong parameter address index identifier to be obtained. Wherein the first preset value may be 0.
If the parameter address index identifier to be acquired is not found in the starting hash bucket and the overflow count identifier of the starting hash bucket is not the first preset value, the parameter address index identifier to be acquired and its corresponding parameter storage related information may be stored in a preset hash bucket other than the starting hash bucket, so the hash bucket to be queried needs to be updated and queried again. Because the preset hash buckets are arranged in order, the hash bucket to be queried can be updated according to the preset step length, so that when the parameter address index identifier to be acquired is never found, all preset hash buckets on the probe path of the starting hash bucket can be traversed.
Step 4014, repeating the steps from querying the key-value pair to be acquired in the hash bucket to be queried to updating the hash bucket to be queried according to the ordered arrangement and the preset step length of the preset hash buckets, until the key-value pair to be acquired is found, or until the key-value pair to be acquired is not found and the overflow count identifier of the hash bucket to be queried is the first preset value.
In this embodiment, in order to ensure that the index identifier of the parameter address to be obtained is sufficiently queried, after updating the hash bucket to be queried, steps 4012 and 4013 need to be repeatedly executed until the key value pair to be obtained is queried in the hash bucket to be queried, or until the key value pair to be obtained is not queried in all preset hash buckets on the probe path where the initial hash bucket is located.
For example, suppose the mapping relationship includes nine preset hash buckets arranged in order (first through ninth), the starting hash bucket corresponding to the parameter address index identifier to be acquired is the second hash bucket, and the preset step length is 2. The probe path of the second hash bucket may then be: the second, fourth, sixth and eighth hash buckets; or, when the path wraps around, the second, fourth, sixth, eighth, first, third, fifth, seventh and ninth hash buckets.
If the key-value pair to be acquired is not found in the second hash bucket and the overflow count identifier of the second hash bucket is not the first preset value, the hash bucket to be queried needs to be updated further. Because the preset step length is 2 and the first to ninth hash buckets are arranged in order, the hash bucket to be queried is updated to the fourth hash bucket. Steps 4012 and 4013 are executed repeatedly, traversing the preset hash buckets on the probe path of the second hash bucket, until the key-value pair to be acquired is found, or until the key-value pair to be acquired has not been found in any preset hash bucket on the probe path of the second hash bucket.
In step 4015, in response to the key-value pair to be acquired being found, determining the value of the key-value pair to be acquired as the parameter storage related information to be acquired corresponding to the parameter address index identifier to be acquired.
In this embodiment, after the key-value pair to be acquired is found, the value in the key-value pair can be read directly and determined as the parameter storage related information to be acquired corresponding to the parameter address index identifier to be acquired.
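Steps 4011 to 4015 can be summarized with a small sketch. The code below assumes the bucket layout sketched earlier, a table of num_buckets ordered buckets, that the first preset value is 0, and that an empty slot is marked by a zero key; the hash function and the handling of the migration identifier (discussed in a later refinement) are simplified.

```cpp
#include <cstddef>
#include <functional>
#include <optional>

// Look up the storage related information for one parameter address index
// identifier. Returns std::nullopt when the key is not present in the mapping
// relationship. "step" is the preset step length.
std::optional<KVSlot> lookup(HashBucket* table, size_t num_buckets,
                             uint64_t key, size_t step) {
    const size_t start = std::hash<uint64_t>{}(key) % num_buckets;  // starting hash bucket
    for (bool restart = true; restart; ) {
        restart = false;
        size_t cur = start;
        for (size_t probed = 0; probed < num_buckets; ++probed) {
            HashBucket& b = table[cur];
            if (b.migrating) { restart = true; break; }  // bucket being migrated: restart probing
            for (const KVSlot& s : b.slots)
                if (s.key == key)          // key-value pair to be acquired found
                    return s;
            if (b.overflow_count == 0)     // first preset value: nothing overflowed from here,
                return std::nullopt;       // so the key cannot lie further along the path
            cur = (cur + step) % num_buckets;   // next preset hash bucket on the probe path
        }
    }
    return std::nullopt;                   // whole probe path traversed without a hit
}
```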
The model parameter acquisition method provided in this embodiment performs the following operations for each parameter address index identifier to be acquired: calculating, with the preset hash algorithm, the starting hash bucket corresponding to the parameter address index identifier to be acquired, and determining the starting hash bucket as the hash bucket to be queried for that identifier; querying, in the hash bucket to be queried, the key-value pair to be acquired whose key is the parameter address index identifier to be acquired; in response to the key-value pair to be acquired not being found and the overflow count identifier of the hash bucket to be queried not being the first preset value, updating the hash bucket to be queried according to the ordered arrangement and the preset step length of the preset hash buckets; repeating the steps from querying the key-value pair whose key is the parameter address index identifier to be acquired in the hash bucket to be queried to updating the hash bucket to be queried according to the ordered arrangement and the preset step length of the preset hash buckets, until the key-value pair to be acquired is found, or until the key-value pair to be acquired is not found and the overflow count identifier of the hash bucket to be queried is the first preset value; and in response to the key-value pair to be acquired being found, determining its value as the parameter storage related information to be acquired corresponding to the parameter address index identifier to be acquired. The query result of the key-value pair to be acquired in the mapping relationship is determined from the overflow count identifier and the query result in the hash bucket to be queried, hash collisions are resolved through the plurality of preset hash buckets, and each preset hash bucket on the probe path can be traversed conveniently by updating the hash bucket to be queried, so that the parameter storage related information corresponding to the parameter address index identifier to be acquired can be obtained quickly and accurately.
Optionally, for a plurality of parameter address index identifiers to be acquired, while the CPU is querying the parameter storage related information corresponding to one parameter address index identifier to be acquired, that is, while it is querying the key-value pair in which that identifier is located, the CPU asynchronously loads the starting hash bucket corresponding to the next parameter address index identifier to be acquired from the persistent memory into the CPU cache through a prefetch instruction. In this way, the parameter storage related information corresponding to the next parameter address index identifier to be acquired can, with high probability, be queried directly in the CPU cache, hiding the high latency of the persistent memory and improving the reading speed.
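As a rough illustration of this optional prefetching, the sketch below assumes the lookup helper above and GCC/Clang's __builtin_prefetch; the embodiment does not name a particular prefetch instruction.

```cpp
#include <vector>

// While looking up keys[i], asynchronously prefetch the starting hash bucket of
// keys[i + 1] from persistent memory into the CPU cache, hiding part of the
// persistent-memory latency (__builtin_prefetch is only a hint to the hardware).
void lookup_batch(HashBucket* table, size_t num_buckets, size_t step,
                  const std::vector<uint64_t>& keys,
                  std::vector<std::optional<KVSlot>>& out) {
    out.resize(keys.size());
    for (size_t i = 0; i < keys.size(); ++i) {
        if (i + 1 < keys.size()) {
            size_t next = std::hash<uint64_t>{}(keys[i + 1]) % num_buckets;
            __builtin_prefetch(&table[next], /*rw=*/0, /*locality=*/1);
        }
        out[i] = lookup(table, num_buckets, keys[i], step);
    }
}
```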
Example III
Fig. 4 is a flowchart of a model parameter acquisition method according to a third embodiment of the present application. As shown in fig. 4, the method for obtaining model parameters provided in the present embodiment further includes steps 501 to 503 on the basis of the second embodiment.
Step 501, acquiring hotspot parameter information sent by the client device; the hotspot parameter information includes at least one hotspot parameter address index identifier and a heat value of each hotspot parameter; the heat value represents, and is positively correlated with, the number of times the client device acquires the model parameter within a preset time period.
In this embodiment, when the server device is in communication connection with a plurality of client devices, the server device may communicate with any one of the preset client devices to obtain the hotspot parameter information because the plurality of client devices have similar data access distribution.
In the client device, the heat value of a model parameter represents the number of times the model parameter is used in the machine learning model inference process, which equals the number of times the client device acquires the model parameter from the server device. The heat value of a model parameter is positively correlated with the number of times the model parameter is used in inference.
Step 502, calculating a starting hash bucket corresponding to each hotspot parameter address index identifier by adopting a preset hash algorithm.
In this embodiment, the preset hash algorithm is the same as the preset hash algorithm in step 4011.
Step 503, performing the following operations for each hotspot parameter address index identifier, including steps 5031 to 5033, in order of the hot value from the higher value to the lower value.
In this embodiment, the key of the hotspot key value pair is a hotspot parameter address index identifier, and the value is the hotspot parameter storage related information corresponding to the hotspot parameter address index identifier. The hot spot parameters are model parameters frequently acquired from the server side equipment by the client side equipment, if the hot spot key value pairs can be stored in the initial hash bucket corresponding to the hot spot parameter address index identification, the CPU can more rapidly inquire hot spot parameter storage related information corresponding to the hot spot parameter address index identification in the mapping relation, and further the client side equipment can more rapidly acquire the hot spot parameters.
In this embodiment, the hotspot parameters are migrated in descending order of their heat values, so that the query paths of the hotspot parameters with higher heat values are shortened as much as possible given that the number of key-value pair storage bits in each preset hash bucket is limited, improving the average speed at which the client device acquires model parameters.
In step 5031, query, in the corresponding starting hash bucket, the hotspot key-value pair whose key is the hotspot parameter address index identifier.
In this embodiment, the hotspot parameter address index identifier may be used as the key to query the hotspot key-value pair in the corresponding starting hash bucket. For a hotspot parameter whose hotspot key-value pair can be found in the starting hash bucket, no parameter migration is needed; for a hotspot parameter whose hotspot key-value pair cannot be found in the starting hash bucket, the hotspot parameter may be migrated to the corresponding starting hash bucket in order to increase the speed at which the client device acquires the model parameter.
In step 5032, in response to the hotspot key-value pair not being found in the corresponding starting hash bucket, query, in the mapping relationship, the hotspot key-value pair whose key is the hotspot parameter address index identifier.
In step 5033, in response to the hotspot key-value pair being found in the mapping relationship, write the hotspot key-value pair into the starting hash bucket corresponding to the hotspot parameter address index identifier, and delete the hotspot key-value pair from the hash bucket in which it was originally located.
In this embodiment, if no hotspot key-value pair whose key is the hotspot parameter address index identifier is found in the starting hash bucket, there are two possibilities. First, because of a hash collision, the hotspot key-value pair is not stored in the starting hash bucket corresponding to the hotspot parameter address index identifier. Second, the hotspot key-value pair does not exist in the mapping relationship at all, that is, the hotspot parameter is not among the model parameters stored in the server device. The two cases can be distinguished by querying, in the mapping relationship, the hotspot key-value pair whose key is the hotspot parameter address index identifier.
If the hot key value pair is inquired in the mapping relation, the hot key value pair is migrated to the initial hash bucket corresponding to the hot parameter address index identifier, and if the hot key value pair is not inquired in the mapping relation, the hot parameter is not stored in the server side equipment.
In this embodiment, because the hotspot key-value pair was not found in the starting hash bucket corresponding to the hotspot parameter address index identifier, every key-value pair storage bit in that starting hash bucket must be occupied, and there is no unoccupied key-value pair storage bit. In this case, a key-value pair whose key is not the hotspot parameter address index identifier may be migrated out of the starting hash bucket, that is, the address index identifier and storage related information of a non-hotspot parameter are moved away to vacate a key-value pair storage bit. For example, the evicted pair may be migrated to the first preset hash bucket with an unoccupied key-value pair storage bit on the probe path of the starting hash bucket, found by following the ordered arrangement of the preset hash buckets and the preset step length. It can be understood that, if every key-value pair storage bit in the starting hash bucket already holds a hotspot key-value pair, the hotspot key-value pair of the hotspot parameter with the smallest heat value can be migrated out, ensuring that the hotspot parameters with higher heat values have shorter query paths in the mapping relationship.
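A minimal sketch of steps 5031 to 5033 for a single hotspot parameter is given below. It reuses the bucket layout and lookup helper sketched earlier, forward-declares the insertion helper sketched in a later embodiment, and picks the eviction victim arbitrarily; all of these simplifications are assumptions.

```cpp
// Forward declaration of the insertion helper sketched in a later embodiment.
bool insert(HashBucket* table, size_t num_buckets, size_t step,
            uint64_t key, uint64_t addr, uint64_t len);

// Migrate one hotspot key-value pair into its starting hash bucket (a sketch).
// The caller processes hotspot parameters in descending order of heat value.
void migrate_hotspot(HashBucket* table, size_t num_buckets, size_t step,
                     uint64_t hot_key) {
    size_t start = std::hash<uint64_t>{}(hot_key) % num_buckets;
    HashBucket& dst = table[start];

    for (const KVSlot& s : dst.slots)
        if (s.key == hot_key) return;          // already in its starting bucket, nothing to do

    std::optional<KVSlot> found = lookup(table, num_buckets, hot_key, step);
    if (!found) return;                        // not stored on the server device at all

    // The starting bucket must be full (otherwise the pair would have been inserted
    // there), so one non-hotspot slot is evicted onto the probe path first.
    // Choosing slots[0] as the victim is purely illustrative.
    KVSlot victim = dst.slots[0];
    (void)insert(table, num_buckets, step, victim.key, victim.addr, victim.len);
    dst.slots[0] = *found;                     // write the hotspot pair into the starting bucket
    // The slot that originally held the hotspot pair is then marked as to-be-released
    // and reclaimed once no earlier-started read thread remains (see the refinement below).
}
```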
According to the model parameter acquisition method provided in this embodiment, hotspot parameter information sent by the client device is acquired; the hotspot parameter information includes at least one hotspot parameter address index identifier and a heat value of each hotspot parameter; the heat value represents, and is positively correlated with, the number of times the client device acquires the model parameter within a preset time period; the starting hash bucket corresponding to each hotspot parameter address index identifier is calculated with the preset hash algorithm; and the following operations are performed for each hotspot parameter address index identifier in descending order of heat value: querying, in the corresponding starting hash bucket, the hotspot key-value pair whose key is the hotspot parameter address index identifier; in response to the hotspot key-value pair not being found in the corresponding starting hash bucket, querying, in the mapping relationship, the hotspot key-value pair whose key is the hotspot parameter address index identifier; and in response to the hotspot key-value pair being found in the mapping relationship, writing it into the starting hash bucket corresponding to the hotspot parameter address index identifier and deleting it from the hash bucket in which it was originally located. Because the hotspot key-value pair is migrated to the starting hash bucket corresponding to the hotspot parameter address index identifier, the query path of the hotspot parameter storage related information is shortened, so the speed at which the server device obtains the parameters to be acquired can be improved.
As an alternative implementation, on the basis of the third embodiment, the step of deleting the hotspot key-value pair from the hash bucket in which it was originally located in step 5033 may be refined into steps 601 to 603.
In step 601, after the hot key value pair is written into the initial hash bucket corresponding to the hot parameter address index identifier, the key value pair storage bit storing the hot key value pair in the initial hash bucket is marked as to be released.
In this embodiment, the original hash bucket is the preset hash bucket in which the hotspot key-value pair was located before migration, as opposed to the starting hash bucket corresponding to the hotspot parameter address index identifier, which is the migration destination. To ensure that the hotspot key-value pair is migrated successfully, the storage space it occupies in the original hash bucket is released only after the hotspot key-value pair has been written into the corresponding starting hash bucket.
Step 602, acquiring the global thread read identifier and the private thread read identifier of each read thread, and determining, from the global thread read identifier and the private thread read identifiers, whether an earlier-started read thread still exists, where an earlier-started read thread is a read thread that began running before the hotspot key-value pair was written into the starting hash bucket corresponding to the hotspot parameter address index identifier.
Step 603, if it is determined that no earlier-started read thread exists, releasing the key-value pair storage bit marked as to-be-released in the original hash bucket.
In this embodiment, multiple threads may run in the CPU at the same time, and these threads may be read threads, write threads, and so on. A read thread is used to read the mapping relationship and obtain from it the parameter storage related information corresponding to a parameter address index identifier to be acquired. A write thread is used to write key-value pairs into the key-value pair storage bits of the preset hash buckets. Migrating a hotspot key-value pair requires a write thread, and determining the parameter description information requires a read thread.
In this embodiment, an earlier-started read thread refers to a read thread that began running before the hotspot key-value pair was written into the starting hash bucket corresponding to the hotspot parameter address index identifier. When a read thread looks for a key-value pair to be acquired, it starts querying from the starting hash bucket corresponding to the parameter address index identifier to be acquired and traverses each preset hash bucket on the query path according to the ordered arrangement of the preset hash buckets and the preset step length. Therefore, to ensure that such a read thread can still obtain the parameter storage related information as soon as possible, the hotspot key-value pair must not be deleted while an earlier-started read thread may still be querying the original hash bucket in which it is located; only after no earlier-started read thread remains is the storage space occupied by the hotspot key-value pair to be deleted in the original hash bucket released.
In this embodiment, whether a thread that is reading the mapping relationship exists may be determined by the global thread reading identifier and the private thread reading identifier in each reading thread. In particular, the global thread read identifier may be a time identifier, such as a timestamp, or may be an increasing sequence number. When each reading thread runs, the private thread reading identification of the reading thread needs to be acquired first.
When the global thread read identifier is a time identifier, it changes as time passes, and the private thread read identifier is the time at which the read thread started running. When the global thread read identifier is an ever-increasing sequence number, the private thread read identifier is determined from it: for example, a read thread may take the current global thread read identifier as its private thread read identifier when it starts running, keep it unchanged while it runs, and delete it when it finishes. Meanwhile, the global thread read identifier is increased by a third preset value, for example 1, each time a new read thread starts running, so that it keeps increasing. The write thread that writes the hotspot key-value pair into the corresponding starting hash bucket obtains the global thread read identifier when it starts running. Then, in this embodiment, as soon as the minimum private thread read identifier among all running read threads is greater than the global thread read identifier obtained by the write thread at the start of its run, it can be determined that every read thread started before the write thread has finished running, and the key-value pair storage bit marked as to-be-released in the original hash bucket can be released.
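A minimal sketch of this quiescence check, assuming an ever-increasing sequence number, a fixed upper bound on the number of read threads and C++ atomics; the concrete synchronization primitives used by the embodiment are not specified.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <limits>

constexpr size_t kMaxReadThreads = 64;                              // assumed upper bound
std::atomic<uint64_t> g_read_id{1};                                 // global thread read identifier
std::array<std::atomic<uint64_t>, kMaxReadThreads> g_private_id{};  // 0 = thread not reading

// Called by a read thread (slot tid) before it probes the hash table.
void reader_enter(size_t tid) { g_private_id[tid] = g_read_id.fetch_add(1); }
// Called by the read thread after it has finished probing.
void reader_exit(size_t tid)  { g_private_id[tid] = 0; }

// Called by the write thread after the hotspot pair has been written into its
// starting bucket. "snapshot" is the global identifier it read when it started;
// the to-be-released slot may be reclaimed only once every read thread that
// started before that point has finished.
bool safe_to_release(uint64_t snapshot) {
    uint64_t min_active = std::numeric_limits<uint64_t>::max();
    for (auto& id : g_private_id) {
        uint64_t v = id.load();
        if (v != 0) min_active = std::min(min_active, v);
    }
    return min_active > snapshot;   // all earlier-started read threads are done
}
```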
Optionally, the preset hash bucket further includes a migration identifier, which indicates whether the preset hash bucket is being migrated; its value is either not migrating or migrating. When a hotspot parameter is migrated, the migration identifier of each preset hash bucket on the probe path of that hotspot parameter is set to migrating, and may be set back to not migrating after the migration is complete. The probe path is determined from the starting hash bucket corresponding to the parameter address index identifier, the ordered arrangement of the preset hash buckets, and the preset step length. The migration identifier may occupy 1 bit in the preset hash bucket, with 0 or 1 on that bit indicating that the bucket is not being migrated or is being migrated.
Optionally, the preset hash bucket further includes a version count identifier, where the version count identifier is used to implement a multi-version concurrency control protocol. The version counter may occupy 4 bytes in the preset hash bucket.
As an optional implementation, on the basis of any of the foregoing embodiments, the preset hash bucket further includes a migration identifier, and step 4012 (querying, in the hash bucket to be queried, the key-value pair to be acquired whose key is the parameter address index identifier to be acquired) may be refined into step 40121.
In step 40121, if it is determined that the migration identifier of the hash bucket to be queried is migrating, the hash bucket to be queried is updated to the initial hash bucket corresponding to the parameter address index identifier to be obtained.
In this embodiment, the migration identifier is either not migrating or migrating, and migrating indicates that the preset hash bucket is undergoing hotspot parameter migration. If, while probing along the probe path, the read thread finds that the migration identifier of the hash bucket to be queried is migrating, probing must restart from the starting hash bucket corresponding to the parameter address index identifier to be acquired, until no hash bucket on the probe path is being migrated.
According to the model parameter acquisition method provided in this embodiment, if the migration identifier of the hash bucket to be queried is determined to be migrating, the hash bucket to be queried is updated to the starting hash bucket corresponding to the parameter address index identifier to be acquired. Restarting the query from the starting hash bucket while a bucket is being migrated ensures that the parameter address index identifier to be acquired is queried sufficiently in the mapping relationship and that a hotspot key-value pair in the middle of migration is not missed.
As an alternative implementation manner, on the basis of any one of the foregoing embodiments, a solution of the present application further includes steps 701 to 703.
Step 701, obtaining a parameter insertion request sent by a client device; the parameter insertion request includes at least one parameter to be inserted and an address index identification thereof.
In this embodiment, the parameters to be inserted are trained model parameters that need to be stored in the server device.
Step 702, storing each parameter to be inserted into the persistent memory to obtain the storage related information of each parameter to be inserted; the storage related information of a parameter to be inserted includes the storage address and data length of that parameter.
In this embodiment, when the CPU obtains the parameters to be inserted, the CPU may obtain the data length of each parameter to be inserted at the same time, so that the CPU may write each parameter to be inserted into the persistent memory of the server device, to obtain the storage related information of the parameter to be inserted. The storage unit to be written with the parameters to be inserted is the storage address of the parameters to be inserted.
Step 703, for each parameter address index identifier to be inserted, performing the following operations, including steps 7031 to 7034.
Step 7031, a preset hash algorithm is adopted to calculate a starting hash bucket corresponding to the parameter address index identifier to be inserted, and the starting hash bucket is determined to be the hash bucket to be inserted of the parameter identifier to be inserted.
In this embodiment, the preset hash algorithm is the same as the preset hash algorithm in step 4011.
Step 7032, determining whether an unoccupied key-value pair storage bit exists in the hash bucket to be inserted.
Step 7033, in response to no unoccupied key-value pair storage bit existing, increasing the overflow count identifier of the hash bucket to be inserted by a second preset value, updating the hash bucket to be inserted according to the ordered arrangement and the preset step length of the preset hash buckets, and repeating the step of determining whether an unoccupied key-value pair storage bit exists in the hash bucket to be inserted, until an unoccupied key-value pair storage bit exists in the hash bucket to be inserted, or until no preset hash bucket has an unoccupied key-value pair storage bit.
Step 7034, in response to the existence of the unoccupied key-value pair storage bit, storing the parameter address index identification to be inserted as a key and the parameter storage related information to be inserted as a value into the unoccupied key-value pair storage bit.
In this embodiment, for a parameter insertion operation, the CPU first locates the starting hash bucket according to the parameter address index identifier to be inserted; if an unoccupied key-value pair storage bit exists in the starting hash bucket, the parameter address index identifier to be inserted and the parameter storage related information to be inserted are stored into it as a key-value pair.
If no unoccupied key-value pair storage bit exists in the starting hash bucket, the overflow count identifier of the starting hash bucket is increased by the second preset value, recording the number of key-value pairs that could not be stored in the starting hash bucket. Then, according to the ordered arrangement and the preset step length of the preset hash buckets, each preset hash bucket on the probe path of the starting hash bucket is checked in turn for an unoccupied key-value pair storage bit, until a preset hash bucket with an unoccupied key-value pair storage bit is found, that is, a preset hash bucket whose overflow count identifier is the first preset value. The parameter address index identifier to be inserted and its storage related information are then stored as a key-value pair into the first preset hash bucket on the probe path of the starting hash bucket that has an unoccupied key-value pair storage bit. Meanwhile, the overflow count identifier of every preset hash bucket passed on the probe path is increased by the second preset value to record the hash collision.
The second preset value may be 1. The preset step length can be adapted according to the model parameter address index identifiers in the mapping relationship, so that runs of consecutively full buckets do not degrade query performance.
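A sketch of the insertion path of steps 7031 to 7034, under the same assumptions as the earlier lookup sketch (a zero key marks an empty slot, first preset value 0, second preset value 1); it also serves as the definition of the insert helper forward-declared in the hotspot migration sketch.

```cpp
// Insert one (parameter address index identifier -> storage related information)
// key-value pair. Returns false when no preset hash bucket on the probe path
// has an unoccupied key-value pair storage bit.
bool insert(HashBucket* table, size_t num_buckets, size_t step,
            uint64_t key, uint64_t addr, uint64_t len) {
    size_t start = std::hash<uint64_t>{}(key) % num_buckets;   // starting hash bucket
    size_t cur = start;
    for (size_t probed = 0; probed < num_buckets; ++probed) {
        HashBucket& b = table[cur];
        for (KVSlot& s : b.slots) {
            if (s.key == 0) {                    // unoccupied key-value pair storage bit
                s = KVSlot{key, addr, len};
                return true;
            }
        }
        b.overflow_count += 1;                   // record the hash collision (second preset value = 1)
        cur = (cur + step) % num_buckets;        // probe the next ordered bucket on the path
    }
    return false;                                // every bucket on the probe path is full
}
```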
For query, update and delete operations on a parameter, the CPU probes each preset hash bucket on the probe path of the starting hash bucket in turn, until the key-value pair in which the parameter address index identifier to be acquired, updated or deleted is located is found, and then performs the query, update or delete operation in that hash bucket; or until a preset hash bucket whose overflow count identifier is the first preset value is encountered. If such a bucket is encountered and the key-value pair to be acquired, updated or deleted has still not been found, the parameter does not exist in the mapping relationship.
In this embodiment, for the operation of deleting the parameter, after deleting the key value pair to be deleted where the parameter to be deleted is located, the detection path needs to be traversed reversely, and the overflow count identifier in each preset hash bucket on the path is subtracted by a second preset value, so as to reduce the number of hash conflicts in each preset hash bucket.
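The delete operation with the reverse adjustment of the overflow counts can be sketched in the same style; decrementing every bucket between the starting bucket and the bucket that held the pair is one reading of the description above.

```cpp
// Delete the key-value pair of one parameter address index identifier and
// subtract the second preset value from the overflow count of every preset
// hash bucket passed on the probe path before the bucket that held the pair.
bool erase(HashBucket* table, size_t num_buckets, size_t step, uint64_t key) {
    size_t start = std::hash<uint64_t>{}(key) % num_buckets;
    size_t cur = start;
    for (size_t probed = 0; probed < num_buckets; ++probed) {
        HashBucket& b = table[cur];
        for (KVSlot& s : b.slots) {
            if (s.key == key) {
                s = KVSlot{};                     // free the key-value pair storage bit
                // Walk the probe path from the starting bucket up to (but not
                // including) the bucket that held the pair, undoing the overflow
                // counts recorded at insertion time.
                for (size_t back = start; back != cur; back = (back + step) % num_buckets)
                    table[back].overflow_count -= 1;
                return true;
            }
        }
        if (b.overflow_count == 0) return false;  // key cannot lie further along the path
        cur = (cur + step) % num_buckets;
    }
    return false;
}
```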
Optionally, in this embodiment, too long probe paths may be avoided by reasonably setting the hash table size. In general, with a hash table loading factor of 0.8, more than 99.99% of the parameters can be found within three probes, with a mathematical expectation of 1.05 for the number of probes. In practice, since the model size of the server device deployment can be predicted, the size of the hash table can be predetermined empirically by considering performance and space tradeoffs.
According to the model parameter acquisition method provided in this embodiment, a parameter insertion request sent by the client device is acquired; the parameter insertion request includes at least one parameter to be inserted and its address index identifier; each parameter to be inserted is stored into the persistent memory to obtain its storage related information, which includes the storage address and data length of the parameter to be inserted; and the following operations are performed for each parameter address index identifier to be inserted: calculating, with the preset hash algorithm, the starting hash bucket corresponding to the parameter address index identifier to be inserted, and determining it as the hash bucket to be inserted for that identifier; determining whether an unoccupied key-value pair storage bit exists in the hash bucket to be inserted; in response to no unoccupied key-value pair storage bit existing, increasing the overflow count identifier of the hash bucket to be inserted by the second preset value, updating the hash bucket to be inserted according to the ordered arrangement and the preset step length of the preset hash buckets, and repeating the step of determining whether an unoccupied key-value pair storage bit exists, until one exists in the hash bucket to be inserted or until no preset hash bucket has one; and in response to an unoccupied key-value pair storage bit existing, storing the parameter address index identifier to be inserted as the key and the parameter storage related information to be inserted as the value into that storage bit. Because the corresponding starting hash bucket is calculated for each parameter address index identifier to be inserted, the identifier and its storage related information are preferentially written into the starting hash bucket, and are written along the probe path only when the starting hash bucket overflows, the storage related information of a parameter can later be found more quickly at query time, improving the speed at which the server device acquires, updates and deletes model parameters.
Example IV
Fig. 5 is a flowchart of a model parameter acquisition method according to a fourth embodiment of the present application. As shown in fig. 5, the method provided in this embodiment is applied to a network card, where the execution body is a model parameter obtaining device, the model parameter obtaining device is located in the network card, and the network card is located in the server device. The server device also includes a CPU and a persistent memory in which the model parameters are stored. The method for obtaining model parameters provided in the present embodiment includes steps 801 to 802.
Step 801, receiving parameter description information sent by a CPU; the parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired.
In this embodiment, the parameter description information may be a descriptor list containing the parameter storage related information to be acquired. The parameter storage related information to be acquired includes the storage address of the parameter to be acquired and the data length of the parameter to be acquired.
The storage address of the parameter to be acquired is the storage address of the parameter to be acquired in the persistent memory. The memory address of the parameter to be obtained may be a memory cell number of the persistent memory.
The data length of the parameter to be acquired is the size of the storage space occupied by the parameter to be acquired in the persistent memory.
The network card can acquire the parameter description information sent by the CPU through the bus connection between the network card and the CPU.
Step 802, acquiring each parameter to be acquired according to the information related to the storage of each parameter to be acquired, generating a first parameter acquisition response, and sending the first parameter acquisition response to the client device; the first parameter acquisition response includes at least one parameter to be acquired.
As an alternative implementation, step 802 may be refined as follows: through direct memory access (DMA), acquiring each parameter to be acquired from its storage address according to its data length.
In this embodiment, the network card includes a direct memory access DMA engine, which provides a Scatter-Gather direct memory access (Scatter-Gather-DMA) function. The network card can gather each parameter to be acquired from the storage address of each parameter to be acquired through the DMA engine, so that the pause of the CPU when accessing the low-speed persistent memory medium is avoided, and the acquisition speed of the parameter to be acquired is improved.
In this embodiment, the first parameter obtaining response may be a DMA message, where the DMA message includes a header and a data payload, the header is an index identifier of a parameter address to be obtained and a storage address of the parameter to be obtained, and the data payload is a specific value of the parameter to be obtained.
It may be appreciated that the parameter acquisition request and the parameter description information corresponding to the parameter acquisition request include the client device identifier, and the first parameter acquisition response also includes the client device identifier, so that the network card may send the generated first parameter acquisition response to the client device corresponding to the client device identifier.
In this embodiment, assume that n model parameters need to be acquired; assembling the DMA message then requires 1+n DMA requests in total. The first pulls the message header initialized in a pre-allocated page-locked DRAM buffer, and the remaining n pull the model parameters directly from their storage addresses in the persistent memory. Through the doorbell batching mechanism provided by the network card, all 1+n DMA requests are signalled to the network card only once. By contrast, when the CPU is used to acquire the model parameters, the parameters in the persistent memory are gathered one by one into a buffer located in DRAM, and the network card then pulls the parameters from that buffer and returns them to the client. Therefore, with the method of acquiring parameters through the network card in this embodiment, the CPU only needs to generate the parameter description information, for example the descriptors of a scatter-gather DMA request, and send it to the network card; the acquisition and aggregation of the parameters are performed asynchronously by the network card, which improves the speed of acquiring parameters from the persistent memory and solves the technical problem addressed by the present application.
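The assembly of the descriptor list can be pictured as follows. The descriptor fields and the commented-out network-card calls are purely illustrative placeholders and do not correspond to any particular vendor API; only the 1+n structure (one header pull plus n parameter pulls, signalled with a single doorbell) follows the description above.

```cpp
#include <cstdint>
#include <vector>

// One scatter-gather element: where to read from and how many bytes (illustrative).
struct SgeDescriptor {
    uint64_t src_addr;   // source address (DRAM for the header, persistent memory for parameters)
    uint32_t length;     // number of bytes to pull
};

// Assemble the 1+n descriptors for a request with n parameters to be acquired.
// header_addr/header_len describe the pre-initialized message header in
// page-locked DRAM; params holds the storage information looked up by the CPU.
std::vector<SgeDescriptor> build_descriptors(uint64_t header_addr, uint32_t header_len,
                                             const std::vector<KVSlot>& params) {
    std::vector<SgeDescriptor> sgl;
    sgl.reserve(1 + params.size());
    sgl.push_back({header_addr, header_len});                   // 1st request: pull the header
    for (const KVSlot& p : params)                              // remaining n: pull parameters
        sgl.push_back({p.addr, static_cast<uint32_t>(p.len)});  // directly from persistent memory
    // The list is then handed to the network card and all 1+n DMA requests are
    // signalled with a single doorbell (doorbell batching); the hypothetical
    // calls below stand in for the vendor-specific API:
    //   nic.post_scatter_gather(sgl);
    //   nic.ring_doorbell();
    return sgl;
}
```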
In this embodiment, acquiring the model parameters through the network card eliminates the CPU stalls caused by a large number of high-latency persistent memory reads, relieves the CPU bottleneck and improves the performance of the server device. At the same time, acquiring model parameters through the network card does not disturb the CPU cache, whereas having the CPU access the model parameters in persistent memory causes cache pollution and degrades, for example, the performance of acquiring the parameter storage addresses: the last-level cache miss rate of the index part can increase by 11%. Moreover, acquiring model parameters through the network card does not depend on special hardware such as FPGAs or smart network cards, and can be deployed directly on existing data center network cards, since scatter-gather DMA is supported as a basic function by the network cards of most vendors.
The model parameter acquisition method provided in this embodiment receives parameter description information sent by the CPU, where the parameter description information includes at least one piece of parameter storage related information to be acquired; acquires each parameter to be acquired according to that storage related information; generates a first parameter acquisition response; and sends the first parameter acquisition response to the client device, where the first parameter acquisition response includes at least one parameter to be acquired. Because the parameter acquisition response is generated by the network card and sent directly to the client device, the CPU stalls caused by a large number of high-latency persistent memory reads are eliminated, and the model parameter acquisition speed is improved.
The model parameter acquisition method of the present application is further described below by way of example. Fig. 6 is a schematic flow chart of a model parameter acquisition method in a server device according to an embodiment of the present application. As shown in fig. 6, the mapping relationship is a hash table 61, which includes a plurality of preset hash buckets 62, the client device 11 sends a parameter acquisition request to the server device 12, the network card 121 of the server device 12 receives the parameter acquisition request, the CPU122 of the server device 12 acquires the parameter acquisition request by communicating with the network card 121, acquires the hash table 61 from the persistent memory 123, and queries, in the hash table 61, information about to-be-acquired parameter storage corresponding to each to-be-acquired parameter address index identifier, where the to-be-acquired parameter storage related information includes a storage address and a data length of the to-be-acquired parameter. The CPU determines the storage related information of each parameter to be acquired in the parameter acquisition request as parameter description information corresponding to the parameter acquisition request, and sends the parameter description information to the network card 121. The network card 121 receives the parameter description information, pulls each parameter to be acquired from the storage address of each parameter to be acquired according to the data length of each parameter to be acquired through the DMA1211, generates a parameter acquisition response, and sends the parameter acquisition response to the client device 11.
Example five
Fig. 7 is a schematic structural diagram of a model parameter acquiring apparatus according to a fifth embodiment of the present application. As shown in fig. 7, the model parameter obtaining apparatus 70 provided in this embodiment is applied to a CPU, where the CPU is located in a server device, and the server device further includes a network card and a persistent memory, where model parameters are stored. The model parameter acquisition means 70 includes: an acquisition module 71, a determination module 72 and a transmission module 73.
An obtaining module 71, configured to obtain a parameter obtaining request sent by a client device; the parameter acquisition request comprises at least one parameter address index identifier to be acquired.
A determining module 72, configured to determine parameter description information corresponding to the parameter acquisition request according to the parameter address index identifier to be acquired and preset address index information; the parameter description information includes at least one piece of parameter storage related information to be acquired, and the parameter storage related information to be acquired includes the storage address and data length of the parameter to be acquired.
The sending module 73 is configured to send the parameter description information to the network card, so that the network card obtains each parameter to be obtained according to the relevant information stored in each parameter to be obtained, generates a first parameter obtaining response, and sends the first parameter obtaining response to the client device; the first parameter acquisition response includes at least one parameter to be acquired.
As an alternative implementation, the preset address index information includes a mapping relationship between model parameter address index identifiers and model parameter storage related information, where the model parameter storage related information includes the storage address and data length of a model parameter. When determining the parameter description information corresponding to the parameter acquisition request according to the parameter address index identifier to be acquired and the preset address index information, the determining module 72 is specifically configured to: query, in the mapping relationship, the parameter storage related information corresponding to each parameter address index identifier to be acquired; and in response to the parameter storage related information corresponding to every parameter address index identifier to be acquired in the parameter acquisition request being found, determine the found parameter storage related information as the parameter description information corresponding to the parameter acquisition request.
As an optional implementation, the mapping relationship includes at least one preset hash bucket, the preset hash buckets are arranged in order, each preset hash bucket includes an overflow count identifier and a preset number of key-value pair storage bits, and a model parameter address index identifier and its corresponding model parameter storage related information are stored in a key-value pair storage bit in the form of a key-value pair. When querying, in the mapping relationship, the parameter storage related information corresponding to each parameter address index identifier to be acquired, the determining module 72 is specifically configured to perform the following operations for each parameter address index identifier to be acquired: calculating, with the preset hash algorithm, the starting hash bucket corresponding to the parameter address index identifier to be acquired, and determining it as the hash bucket to be queried for that identifier; querying, in the hash bucket to be queried, the key-value pair to be acquired whose key is the parameter address index identifier to be acquired; in response to the key-value pair to be acquired not being found and the overflow count identifier of the hash bucket to be queried not being the first preset value, updating the hash bucket to be queried according to the ordered arrangement and the preset step length of the preset hash buckets; repeating the steps from querying the key-value pair whose key is the parameter address index identifier to be acquired to updating the hash bucket to be queried according to the ordered arrangement and the preset step length of the preset hash buckets, until the key-value pair to be acquired is found, or until it is not found and the overflow count identifier of the hash bucket to be queried is the first preset value; and in response to the key-value pair to be acquired being found, determining its value as the parameter storage related information to be acquired corresponding to the parameter address index identifier to be acquired.
As an optional implementation, the preset hash bucket further includes a migration identifier. When querying, in the hash bucket to be queried, the key-value pair to be acquired whose key is the parameter address index identifier to be acquired, the determining module 72 is specifically configured to: if it is determined that the migration identifier of the hash bucket to be queried is migrating, update the hash bucket to be queried to the starting hash bucket corresponding to the parameter address index identifier to be acquired.
As an alternative implementation, the model parameter obtaining apparatus 70 further includes a migration module, where the migration module is configured to: acquire hotspot parameter information sent by the client device, where the hotspot parameter information includes at least one hotspot parameter address index identifier and a heat value of each hotspot parameter, and the heat value represents, and is positively correlated with, the number of times the client device acquires the model parameter within a preset time period; calculate, with the preset hash algorithm, the starting hash bucket corresponding to each hotspot parameter address index identifier; and perform the following operations for each hotspot parameter address index identifier in descending order of heat value: querying, in the corresponding starting hash bucket, the hotspot key-value pair whose key is the hotspot parameter address index identifier; in response to the hotspot key-value pair not being found in the corresponding starting hash bucket, querying, in the mapping relationship, the hotspot key-value pair whose key is the hotspot parameter address index identifier; and in response to the hotspot key-value pair being found in the mapping relationship, writing it into the starting hash bucket corresponding to the hotspot parameter address index identifier and deleting it from the hash bucket in which it was originally located.
As an alternative implementation, when deleting the hotspot key-value pair from the hash bucket in which it was originally located, the migration module is specifically configured to: after the hotspot key-value pair is written into the starting hash bucket corresponding to the hotspot parameter address index identifier, mark the key-value pair storage bit storing the hotspot key-value pair in the original hash bucket as to-be-released; acquire the global thread read identifier and the private thread read identifier of each read thread, and determine from them whether an earlier-started read thread exists, where an earlier-started read thread is a read thread that began running before the hotspot key-value pair was written into the starting hash bucket corresponding to the hotspot parameter address index identifier; and if it is determined that no earlier-started read thread exists, release the key-value pair storage bit marked as to-be-released in the original hash bucket.
As an alternative implementation, the model parameter obtaining apparatus 70 further includes an insertion module, where the insertion module is configured to: acquire a parameter insertion request sent by the client device, where the parameter insertion request includes at least one parameter to be inserted and its address index identifier; store each parameter to be inserted into the persistent memory to obtain its storage related information, which includes the storage address and data length of the parameter to be inserted; and perform the following operations for each parameter address index identifier to be inserted: calculating, with the preset hash algorithm, the starting hash bucket corresponding to the parameter address index identifier to be inserted, and determining it as the hash bucket to be inserted for that identifier; determining whether an unoccupied key-value pair storage bit exists in the hash bucket to be inserted; in response to no unoccupied key-value pair storage bit existing, increasing the overflow count identifier of the hash bucket to be inserted by the second preset value, updating the hash bucket to be inserted according to the ordered arrangement and the preset step length of the preset hash buckets, and repeating the step of determining whether an unoccupied key-value pair storage bit exists, until one exists in the hash bucket to be inserted or until no preset hash bucket has one; and in response to an unoccupied key-value pair storage bit existing, storing the parameter address index identifier to be inserted as the key and the parameter storage related information to be inserted as the value into that storage bit.
The model parameter acquiring device provided in this embodiment may execute any one of the model parameter acquiring methods provided in the first to third embodiments, and the specific implementation manner is similar to the principle, and will not be repeated here.
Example six
Fig. 8 is a schematic structural diagram of a model parameter acquiring apparatus according to a sixth embodiment of the present application. As shown in fig. 8, the model parameter obtaining apparatus 80 provided in this embodiment is applied to a network card, where the network card is located in a server device, and the server device further includes a CPU and a persistent memory, where model parameters are stored. The model parameter acquisition means 80 includes: a receiving module 81 and a transmitting module 82.
A receiving module 81, configured to receive parameter description information sent by the CPU; the parameter description information comprises at least one parameter storage related information to be acquired, wherein the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired;
the sending module 82 is configured to obtain each parameter to be obtained according to the information related to the parameter storage to be obtained, generate a first parameter obtaining response, and send the first parameter obtaining response to the client device; the first parameter acquisition response includes at least one parameter to be acquired.
As an alternative embodiment, the sending module 82 is specifically configured to: and accessing the DMA through a direct memory, and obtaining each parameter to be obtained from the storage address of each parameter to be obtained according to the data length of each parameter to be obtained.
As an alternative embodiment, the sending module 82 is further configured to generate, in response to the parameter description information not including the parameter storage address to be acquired, a second parameter acquisition response, where the second parameter acquisition response includes at least one parameter address index information to be acquired.
The model parameter acquiring device provided in this embodiment may execute any of the model parameter acquiring methods provided in the fourth embodiment, and the specific implementation manner is similar to the principle, and will not be repeated here.
Example seven
Fig. 9 is a schematic structural diagram of a server device according to a seventh embodiment of the present application. As shown in fig. 9, the server device 90 provided in this embodiment includes a CPU 91, a network card 92, a persistent memory 93 and a memory 94. The network card 92 includes a processor 95. The CPU 91, the network card 92, the persistent memory 93 and the memory 94 are electrically interconnected.
The memory 94 is used for storing first computer-executable instructions and second computer-executable instructions. The network card 92 is used for transmitting and receiving data. Persistent memory 93 is used to store model parameters. The CPU91 executes a first computer-executed instruction to implement any of the model parameter acquisition methods provided in the above-described embodiments one to three, and the processor 95 executes a second computer-executed instruction to implement any of the model parameter acquisition methods provided in the above-described embodiment four.
The CPU91, the network card 92, the persistent memory 93 and the storage 94 may be interconnected by a bus. The bus may be an industry standard architecture (Industry Standard Architecture, abbreviated ISA) bus, an external device interconnect (Peripheral Component Interconnect, abbreviated PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, abbreviated EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 9, but not only one bus or one type of bus.
Persistent memory 93 generally refers to any storage medium that provides a byte-addressing interface and data persistence, including but not limited to DRAM with a backup battery, Optane persistent memory, and memory-interface block devices based on novel hardware interconnect protocols, such as byte-addressable memory-type solid state drives.
The memory 94 may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
In an exemplary embodiment, the server device 90 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements, for executing the methods described above.
The embodiments of the present application further provide a computer-readable storage medium in which computer-executable instructions are stored; when executed by a processor, the computer-executable instructions are used to implement the model parameter acquisition method provided in any one of the foregoing embodiments. By way of example, the computer-readable storage medium may be a read-only memory (ROM), a random-access memory (RAM), a magnetic tape, a floppy disk, an optical data storage device, etc.
Embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the model parameter acquisition method provided in any one of the embodiments above.
It should be understood that the above-described device embodiments are merely illustrative and that the device of the present application may be implemented in other ways. For example, the division of the modules in the above embodiment is merely a logic function division, and there may be another division manner when actually implemented. For example, multiple modules may be combined, or may be integrated into another system, or some features may be omitted or not performed.
In addition, each functional module in each embodiment of the present application may be integrated into one module, or each module may exist alone physically, or two or more modules may be integrated together, unless otherwise specified. The integrated modules may be implemented in hardware or in software program modules.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all alternative embodiments, and that the acts and modules referred to are not necessarily required in the present application.
It should be further noted that, although the steps in the flowcharts are shown sequentially as indicated by the arrows, they are not necessarily performed in the order indicated. Unless explicitly stated herein, there is no strict limitation on the order of execution, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include a plurality of sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (13)

1. The model parameter acquisition method is characterized by being applied to a Central Processing Unit (CPU), wherein the CPU is positioned in a server device, the server device further comprises a network card and a persistent memory, and model parameters are stored in the persistent memory; the method comprises the following steps:
acquiring a parameter acquisition request sent by client equipment; the parameter acquisition request comprises at least one parameter address index identifier to be acquired;
determining parameter description information corresponding to the parameter acquisition request according to each parameter address index identifier to be acquired and preset address index information; the parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired;
the parameter description information is sent to a network card, so that the network card can acquire all parameters to be acquired according to the relevant information of the parameter storage to be acquired, generate a first parameter acquisition response and send the first parameter acquisition response to client equipment; the first parameter acquisition response includes at least one parameter to be acquired.
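As an illustrative sketch only (not part of the claims), the CPU-side flow of claim 1 can be pictured as follows in C; lookup() stands for the query over the preset address index information elaborated in claim 3 below, and all structure and function names are assumptions.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one piece of parameter storage related information. */
struct param_desc { uint64_t storage_addr; uint32_t data_len; };

/* Assumed helpers: lookup() returns NULL when the parameter address index
 * identifier is not stored; send_to_network_card() hands the parameter
 * description information to the network card. */
extern const struct param_desc *lookup(uint64_t addr_index_id);
extern void send_to_network_card(const struct param_desc *descs, size_t n);

/* Assemble the parameter description information for one parameter
 * acquisition request carrying n parameter address index identifiers. */
int handle_request(const uint64_t *ids, size_t n, struct param_desc *descs)
{
    for (size_t i = 0; i < n; i++) {
        const struct param_desc *d = lookup(ids[i]);
        if (d == NULL)
            return -1;               /* not every identifier was found */
        descs[i] = *d;               /* storage address and data length */
    }
    send_to_network_card(descs, n);  /* the network card builds the response */
    return 0;
}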
2. The method according to claim 1, wherein the preset address index information includes a mapping relation of a model parameter address index identification and model parameter storage related information, the model parameter storage related information including a storage address and a data length of a model parameter;
the determining parameter description information corresponding to the parameter acquisition request according to the parameter address index identifier to be acquired and preset address index information includes:
inquiring the parameter storage related information to be acquired corresponding to each parameter address index identifier to be acquired in the mapping relation;
and in response to each parameter address index identifier to be acquired in the parameter acquisition request being inquired, determining the inquired parameter storage related information to be acquired as the parameter description information corresponding to the parameter acquisition request.
3. The method according to claim 2, wherein the mapping relationship includes at least one preset hash bucket, each preset hash bucket is arranged in order, the preset hash bucket includes an overflow count identifier and a preset number of key value pair storage bits, and the model parameter address index identifier and the model parameter storage related information which correspond to each other are stored in the key value pair storage bits in the form of key value pairs;
the inquiring the parameter storage related information to be acquired corresponding to each parameter address index identifier to be acquired in the mapping relation comprises:
for each parameter address index identifier to be acquired, the following operations are executed:
calculating a starting hash bucket corresponding to the parameter address index identifier to be acquired by adopting a preset hash algorithm, and determining the starting hash bucket as a hash bucket to be queried of the parameter address index identifier to be acquired;
inquiring, in the hash bucket to be queried, a key value pair to be acquired whose key is the parameter address index identifier to be acquired;
in response to the key value pair to be acquired not being inquired and the overflow count identifier of the hash bucket to be queried not being a first preset value, updating the hash bucket to be queried according to the ordered arrangement of the preset hash buckets and a preset step length;
repeatedly executing the steps from the step of inquiring, in the hash bucket to be queried, the key value pair to be acquired whose key is the parameter address index identifier to be acquired, to the step of updating the hash bucket to be queried according to the ordered arrangement of the preset hash buckets and the preset step length, until the key value pair to be acquired is inquired, or until the key value pair to be acquired is not inquired and the overflow count identifier of the hash bucket to be queried is the first preset value;
and in response to the key value pair to be acquired being inquired, determining the value of the key value pair to be acquired as the parameter storage related information to be acquired corresponding to the parameter address index identifier to be acquired.
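For illustration only (not part of the claims), the lookup recited in claim 3 can be sketched as follows in C; the slot count, the step length of 1, the hash64() function, and taking 0 as the first preset value of the overflow count identifier are all assumptions.

#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_BUCKET 8u      /* assumed preset number of key value pair storage bits */
#define NUM_BUCKETS      (1u << 20)
#define PROBE_STEP       1u      /* assumed preset step length */

struct kv_slot {
    uint64_t key;                /* model parameter address index identifier */
    uint64_t storage_addr;       /* storage address of the model parameter */
    uint32_t data_len;           /* data length of the model parameter */
    uint8_t  used;
};

struct bucket {
    uint32_t overflow_cnt;       /* overflow count identifier; 0 taken as the first preset value */
    struct kv_slot slots[SLOTS_PER_BUCKET];
};

extern uint64_t hash64(uint64_t key);   /* stands for the preset hash algorithm (assumed) */

/* Inquire the parameter storage related information of one parameter address
 * index identifier; returns the matching slot, or NULL if it is absent. */
struct kv_slot *lookup(struct bucket *table, uint64_t key)
{
    uint32_t cur = (uint32_t)(hash64(key) % NUM_BUCKETS);   /* starting hash bucket */
    for (uint32_t probes = 0; probes < NUM_BUCKETS; probes++) {
        struct bucket *b = &table[cur];
        for (uint32_t i = 0; i < SLOTS_PER_BUCKET; i++)
            if (b->slots[i].used && b->slots[i].key == key)
                return &b->slots[i];
        /* Stop when no earlier insert overflowed past this bucket. */
        if (b->overflow_cnt == 0)
            return NULL;
        cur = (cur + PROBE_STEP) % NUM_BUCKETS;   /* next bucket in the ordered arrangement */
    }
    return NULL;
}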
4. The method of claim 3, wherein the preset hash bucket further comprises a migration identification;
the inquiring, in the hash bucket to be queried, a key value pair to be acquired whose key is the parameter address index identifier to be acquired comprises:
in response to determining that the migration identifier of the hash bucket to be queried indicates that migration has occurred, updating the hash bucket to be queried to the starting hash bucket corresponding to the parameter address index identifier to be acquired.
5. The method according to claim 3 or 4, further comprising:
acquiring hot spot parameter information sent by the client device; the hot spot parameter information comprises at least one hot spot parameter address index identifier and a heat value of each hot spot parameter, where the heat value characterizes, and is positively correlated with, the number of times the client device has acquired the model parameter within a preset time period;
calculating, by adopting the preset hash algorithm, a starting hash bucket corresponding to each hot spot parameter address index identifier;
performing the following operations for each hot spot parameter address index identifier in descending order of heat value:
inquiring, in the corresponding starting hash bucket, a hot spot key value pair whose key is the hot spot parameter address index identifier;
in response to the hot spot key value pair not being inquired in the corresponding starting hash bucket, inquiring, in the mapping relation, the hot spot key value pair whose key is the hot spot parameter address index identifier;
and in response to the hot spot key value pair being inquired in the mapping relation, writing the hot spot key value pair into the starting hash bucket corresponding to the hot spot parameter address index identifier, and deleting the hot spot key value pair from the original hash bucket in which it was inquired.
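The hot spot migration of claim 5 can be pictured with the following non-limiting C sketch; a caller would invoke migrate_hotspot() once per hot spot parameter address index identifier, in descending order of heat value. The bucket layout, the find_anywhere() helper and the handling of a full starting hash bucket are assumptions made only for illustration.

#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_BUCKET 8u
#define NUM_BUCKETS      (1u << 20)

struct kv_slot { uint64_t key, storage_addr; uint32_t data_len; uint8_t used; };
struct bucket  { uint32_t overflow_cnt; struct kv_slot slots[SLOTS_PER_BUCKET]; };

extern uint64_t hash64(uint64_t key);                            /* preset hash algorithm (assumed) */
extern struct kv_slot *find_anywhere(struct bucket *t, uint64_t key); /* full lookup of claim 3 (assumed) */

/* If the hot spot key value pair is not already in its starting hash bucket,
 * copy it there and clear the slot in the original hash bucket (the deferred
 * release of claim 6 is simplified here). */
void migrate_hotspot(struct bucket *table, uint64_t hot_key)
{
    struct bucket *start = &table[hash64(hot_key) % NUM_BUCKETS];

    for (uint32_t i = 0; i < SLOTS_PER_BUCKET; i++)      /* already in the starting bucket? */
        if (start->slots[i].used && start->slots[i].key == hot_key)
            return;

    struct kv_slot *src = find_anywhere(table, hot_key);
    if (src == NULL)
        return;                                          /* not stored in the mapping relation */

    for (uint32_t i = 0; i < SLOTS_PER_BUCKET; i++) {
        if (!start->slots[i].used) {                     /* assume a free storage bit exists */
            start->slots[i] = *src;
            src->used = 0;                               /* simplified immediate deletion */
            return;
        }
    }
}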
6. The method according to claim 5, wherein the deleting the hot spot key value pair from the original hash bucket comprises:
after the hot spot key value pair is written into the starting hash bucket corresponding to the hot spot parameter address index identifier, marking the key value pair storage bit storing the hot spot key value pair in the original hash bucket as to-be-released;
acquiring a global thread read identifier and the private thread read identifier of each read thread, and determining, according to the global thread read identifier and each private thread read identifier, whether an earlier-running read thread exists, where an earlier-running read thread is a read thread that started running before the hot spot key value pair was written into the starting hash bucket corresponding to the hot spot parameter address index identifier;
and in response to determining that no earlier-running read thread exists, releasing the key value pair storage bit marked as to-be-released in the original hash bucket.
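The check in claim 6 resembles epoch- or grace-period-based memory reclamation. The following C sketch shows one plausible reading, not the claimed implementation; the identifier scheme and the C11 atomics are assumptions made only for illustration.

#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Global thread read identifier, advanced when a hot spot key value pair
 * is migrated into its starting hash bucket. */
_Atomic uint64_t global_read_id = 1;

struct read_thread {
    _Atomic uint64_t private_read_id;   /* 0 while the read thread is idle */
};

/* A read thread copies the global identifier before it starts inquiring the
 * hash buckets, and clears its private identifier when the read finishes. */
void read_begin(struct read_thread *t) {
    atomic_store(&t->private_read_id, atomic_load(&global_read_id));
}
void read_end(struct read_thread *t) {
    atomic_store(&t->private_read_id, 0);
}

/* The migrating thread advances the global identifier and keeps the new
 * value as write_id for the release check below. */
uint64_t mark_migration(void) { return atomic_fetch_add(&global_read_id, 1) + 1; }

/* The slot marked as to-be-released may be released only when no read thread
 * that started earlier than the migration is still running. */
int can_release(struct read_thread *threads, size_t n, uint64_t write_id)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t id = atomic_load(&threads[i].private_read_id);
        if (id != 0 && id < write_id)
            return 0;   /* an earlier-running read thread may still hold the old slot */
    }
    return 1;
}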
7. The method according to claim 3 or 4, further comprising:
acquiring a parameter insertion request sent by client equipment; the parameter insertion request comprises at least one parameter to be inserted and an address index identifier thereof;
storing each parameter to be inserted into a persistent memory to obtain storage related information of each parameter to be inserted; the relevant information of the parameter storage to be inserted comprises a storage address and a data length of the parameter to be inserted;
for each parameter address index identifier to be inserted, the following operations are performed:
calculating, by adopting the preset hash algorithm, a starting hash bucket corresponding to the parameter address index identifier to be inserted, and determining the starting hash bucket as the hash bucket to be inserted of the parameter address index identifier to be inserted;
determining whether an unoccupied key value pair storage bit exists in the hash bucket to be inserted;
in response to no unoccupied key value pair storage bit existing, increasing the overflow count identifier of the hash bucket to be inserted by a second preset value, updating the hash bucket to be inserted according to the ordered arrangement of the preset hash buckets and the preset step length, and repeatedly executing the step of determining whether an unoccupied key value pair storage bit exists in the hash bucket to be inserted, until an unoccupied key value pair storage bit exists in the hash bucket to be inserted, or until no unoccupied key value pair storage bit exists in any preset hash bucket;
and in response to an unoccupied key value pair storage bit existing, storing the parameter address index identifier to be inserted as a key and the parameter storage related information to be inserted as a value into the unoccupied key value pair storage bit.
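For illustration only (not part of the claims), the insertion of claim 7 can be sketched as follows in C; the bucket layout, the step length of 1, and taking 1 as the second preset value are assumptions.

#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_BUCKET 8u
#define NUM_BUCKETS      (1u << 20)
#define PROBE_STEP       1u

struct kv_slot { uint64_t key, storage_addr; uint32_t data_len; uint8_t used; };
struct bucket  { uint32_t overflow_cnt; struct kv_slot slots[SLOTS_PER_BUCKET]; };

extern uint64_t hash64(uint64_t key);   /* preset hash algorithm (assumed) */

/* Insert one parameter address index identifier together with its storage
 * address and data length. Returns 0 on success, -1 when no unoccupied
 * storage bit exists in any preset hash bucket. */
int insert(struct bucket *table, uint64_t key, uint64_t addr, uint32_t len)
{
    uint32_t cur = (uint32_t)(hash64(key) % NUM_BUCKETS);   /* starting hash bucket */
    for (uint32_t probes = 0; probes < NUM_BUCKETS; probes++) {
        struct bucket *b = &table[cur];
        for (uint32_t i = 0; i < SLOTS_PER_BUCKET; i++) {
            if (!b->slots[i].used) {                         /* unoccupied storage bit */
                b->slots[i].key = key;
                b->slots[i].storage_addr = addr;
                b->slots[i].data_len = len;
                b->slots[i].used = 1;
                return 0;
            }
        }
        b->overflow_cnt += 1;    /* second preset value taken as 1 for illustration */
        cur = (cur + PROBE_STEP) % NUM_BUCKETS;              /* next bucket in the ordered arrangement */
    }
    return -1;
}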
8. The model parameter acquisition method is characterized by being applied to a network card, wherein the network card is positioned in a server device, the server device further comprises a CPU and a persistent memory, and model parameters are stored in the persistent memory; the method comprises the following steps:
receiving parameter description information sent by a central processing unit (CPU); the parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired;
acquiring each parameter to be acquired according to the parameter storage related information to be acquired, generating a first parameter acquisition response, and sending the first parameter acquisition response to the client device; the first parameter acquisition response includes at least one parameter to be acquired.
9. The method according to claim 8, wherein the acquiring each parameter to be acquired according to the parameter storage related information to be acquired comprises:
acquiring, through direct memory access (DMA), each parameter to be acquired from the storage address of each parameter to be acquired according to the data length of each parameter to be acquired.
10. The model parameter acquisition device is characterized by being applied to a CPU, wherein the CPU is positioned in a server device, the server device further comprises a network card and a persistent memory, and model parameters are stored in the persistent memory; the device comprises:
the acquisition module is used for acquiring a parameter acquisition request sent by the client equipment; the parameter acquisition request comprises at least one parameter address index identifier to be acquired;
the determining module is used for determining parameter description information corresponding to the parameter acquisition request according to each parameter address index identifier to be acquired and preset address index information; the parameter description information comprises at least one parameter storage related information to be acquired, wherein the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired;
the sending module is used for sending the parameter description information to the network card so that the network card can acquire the parameters to be acquired according to the relevant information of the parameter storage to be acquired, generate a first parameter acquisition response and send the first parameter acquisition response to the client equipment; the first parameter acquisition response includes at least one parameter to be acquired.
11. The model parameter acquisition device is characterized by being applied to a network card, wherein the network card is positioned in a server device, the server device further comprises a CPU and a persistent memory, and model parameters are stored in the persistent memory; the device comprises:
the receiving module is used for receiving the parameter description information sent by the CPU; the parameter description information comprises at least one parameter storage related information to be acquired, and the parameter storage related information to be acquired comprises a storage address and a data length of the parameter to be acquired;
the sending module is used for acquiring each parameter to be acquired according to the parameter storage related information to be acquired, generating a first parameter acquisition response, and sending the first parameter acquisition response to the client device; the first parameter acquisition response includes at least one parameter to be acquired.
12. A server device, comprising: CPU, network card, persistent memory and storage; the network card comprises a processor;
the CPU, the network card, the persistent memory and the storage are electrically interconnected;
the storage is used for storing first computer-executable instructions and second computer-executable instructions, and the network card is used for receiving and transmitting data;
The persistent memory is used for storing model parameters;
the CPU executes the first computer-executable instructions to implement the method of any one of claims 1-7, and the processor executes the second computer-executable instructions to implement the method of claim 8 or 9.
13. A computer-readable storage medium having computer-executable instructions stored therein, wherein the computer-executable instructions, when executed by a processor, are used to implement the method of any one of claims 1-9.
CN202310125190.9A 2023-02-06 2023-02-06 Model parameter acquisition method and device, server device and storage medium Pending CN116244214A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310125190.9A CN116244214A (en) 2023-02-06 2023-02-06 Model parameter acquisition method and device, server device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310125190.9A CN116244214A (en) 2023-02-06 2023-02-06 Model parameter acquisition method and device, server device and storage medium

Publications (1)

Publication Number Publication Date
CN116244214A true CN116244214A (en) 2023-06-09

Family

ID=86627302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310125190.9A Pending CN116244214A (en) 2023-02-06 2023-02-06 Model parameter acquisition method and device, server device and storage medium

Country Status (1)

Country Link
CN (1) CN116244214A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116700995A (en) * 2023-08-03 2023-09-05 浪潮电子信息产业股份有限公司 Concurrent access method, device, equipment and storage medium for heterogeneous memory pool
CN116700995B (en) * 2023-08-03 2023-11-03 浪潮电子信息产业股份有限公司 Concurrent access method, device, equipment and storage medium for heterogeneous memory pool

Similar Documents

Publication Publication Date Title
CN109300036B (en) Bifurcation regression method and device of block chain network
CN110555001B (en) Data processing method, device, terminal and medium
CN107111452B (en) Data migration method and device applied to computer system and computer system
CN107329704B (en) Cache mirroring method and controller
CN110427386B (en) Data processing method, device and computer storage medium
EP3876106A1 (en) File storage method and deletion method, server, and storage medium
CN107665219B (en) Log management method and device
CN111124270B (en) Method, apparatus and computer program product for cache management
US20200073593A1 (en) Memory controller and associated accessing method and electronic device
CN116244214A (en) Model parameter acquisition method and device, server device and storage medium
WO2023160358A1 (en) Memory scanning method and apparatus
CN114138776A (en) Method, system, apparatus and medium for graph structure and graph attribute separation design
CN105917303A (en) Controller, method for identifying data block stability and storage system
CN116755625A (en) Data processing method, device, equipment and readable storage medium
CN115794669A (en) Method, device and related equipment for expanding memory
CN115904212A (en) Data processing method and device, processor and hybrid memory system
CN103729166A (en) Method, device and system for determining thread relation of program
CN111752941B (en) Data storage and access method and device, server and storage medium
CN111694806A (en) Transaction log caching method, device, equipment and storage medium
CN115277644B (en) Bus data transmission system, method, equipment and storage medium
CN108804571B (en) Data storage method, device and equipment
CN115729955A (en) Thermal data reading method and related device
CN115904211A (en) Storage system, data processing method and related equipment
CN113806389A (en) Data processing method and device, computing equipment and storage medium
CN109828720B (en) Data storage method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination