CN114154644A - Machine learning data processing method and device - Google Patents


Info

Publication number
CN114154644A
CN114154644A
Authority
CN
China
Prior art keywords
accelerator
data processing
machine learning
deployment
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111443501.3A
Other languages
Chinese (zh)
Inventor
杨建磊
雷凡丁
万寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202111443501.3A priority Critical patent/CN114154644A/en
Publication of CN114154644A publication Critical patent/CN114154644A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a machine learning data processing method and device, relating to the field of artificial intelligence and comprising the following steps: receiving a processing request for raw data to be processed, sent by a data processing requester; constructing an accelerator deployment interface and calling the accelerator deployment interface to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code, obtaining an inference execution result corresponding to the processing request; and receiving the inference execution result returned by the AI accelerator and feeding it back to the data processing requester. The method and device can construct the accelerator deployment interface to obtain inference execution results using the AI accelerator, decouple the AI accelerator hardware from the management server software at the system level, and provide excellent device extensibility.

Description

Machine learning data processing method and device
Technical Field
The application relates to the field of artificial intelligence, in particular to a machine learning data processing method and device based on an AI accelerator.
Background
With the development of machine learning technology, the training effect of deep neural networks has improved greatly. At the same time, however, the size of deep neural networks has grown many times over, so conventional computer processors increasingly struggle to perform data processing based on deep neural networks. An AI accelerator is a hardware microprocessor specialized for the field of artificial intelligence that accelerates computation for deep neural networks. The emergence of AI accelerators has improved the inference execution efficiency of deep neural networks and enabled them to be deployed in practice quickly.
However, commercial AI accelerator platforms carry a large amount of functionality and a high cost for machine learning. In addition, from the perspective of extensibility, existing commercial AI accelerator platforms usually support only their own company's hardware products and cannot accommodate AI accelerators produced by other companies; this lack of extensibility limits data processing for deep neural networks.
Disclosure of Invention
Aiming at the problems in the prior art, the application provides a machine learning data processing method and device that can construct an accelerator deployment interface to obtain inference execution results using an AI accelerator.
In order to solve the above technical problem, the application provides the following technical solutions:
in a first aspect, the present application provides a machine learning data processing method, including:
receiving a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises a project file for the raw data to be processed; the project file comprises the raw data to be processed, a machine learning model and inference execution code;
constructing an accelerator deployment interface and calling the accelerator deployment interface to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result corresponding to the processing request;
and receiving the inference execution result returned by the AI accelerator, and feeding the inference execution result back to the data processing requester.
Further, constructing the accelerator deployment interface and calling the accelerator deployment interface to send the project file to the AI accelerator includes:
constructing the accelerator deployment interface according to the type of the AI accelerator;
constructing a deployment configuration file of the AI accelerator according to the processing request;
inputting the deployment configuration file and the project file into the accelerator deployment interface;
selecting a deployment running script corresponding to the AI accelerator by using the deployment configuration file;
and running the deployment running script, calling the accelerator deployment interface to load the project file onto the AI accelerator.
Further, the processing request includes user behavior data, and the machine learning data processing method further comprises the following steps:
analyzing the user behavior data to obtain a behavior analysis result for the data processing requester; the behavior analysis result comprises error-region statistics, error-message statistics, submission-time statistics and submission-frequency statistics for the inference execution process;
and sending the behavior analysis result to a requester database and to the data processing requester.
Further, the machine learning data processing method further includes:
checking whether the processing request is a repeatedly submitted processing request;
if not, checking whether the AI accelerator requested by the processing request is idle;
if so, checking whether the project file in the processing request conforms to the format specification.
Further, the machine learning data processing method further includes:
when an inference execution result analysis request sent by the data processing requester is received, performing inference execution analysis according to the inference execution result;
and returning the obtained inference execution analysis result to the data processing requester.
In a second aspect, the present application provides a machine learning data processing apparatus, including:
a processing request receiving unit, configured to receive a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises a project file for the raw data to be processed; the project file comprises the raw data to be processed, a machine learning model and inference execution code;
a project file sending unit, configured to construct an accelerator deployment interface and call the accelerator deployment interface to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result;
and an execution result returning unit, configured to receive the inference execution result returned by the AI accelerator and feed the inference execution result back to the data processing requester.
Further, the project file sending unit includes:
a deployment interface construction module, configured to construct the accelerator deployment interface according to the type of the AI accelerator;
a configuration file construction module, configured to construct a deployment configuration file of the AI accelerator according to the processing request;
a file input module, configured to input the deployment configuration file and the project file into the accelerator deployment interface;
a running script selection module, configured to select a deployment running script corresponding to the AI accelerator by using the deployment configuration file;
and a project file loading module, configured to run the deployment running script and call the accelerator deployment interface to load the project file onto the AI accelerator.
Further, the machine learning data processing apparatus further includes:
a user behavior analysis unit, configured to analyze the user behavior data to obtain a behavior analysis result for the data processing requester; the behavior analysis result comprises error-region statistics, error-message statistics, submission-time statistics and submission-frequency statistics for the inference execution process;
and an analysis result feedback unit, configured to send the behavior analysis result to the requester database and the data processing requester.
In a third aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the machine learning data processing method when executing the program.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the machine learning data processing method.
Aiming at the problems in the prior art, the machine learning data processing method and device provided by the application aim to simplify use, so that a user can complete the deployment of a neural network model on an AI accelerator with a small number of operations and obtain an inference execution result, and can further perform data analysis on the inference execution process of the neural network model at both the individual-user and user-group levels. By constructing the accelerator deployment interface, the connection between the AI accelerator hardware and the management server software is decoupled at the system level, giving excellent device extensibility.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that a person skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a first architecture diagram of a machine learning data processing method according to an embodiment of the present application;
FIG. 2 is a second architecture diagram of a machine learning data processing method according to an embodiment of the present application;
FIG. 3 is a flowchart of a machine learning data processing method according to an embodiment of the present application;
FIG. 4 is a flowchart of the construction of an accelerator deployment interface in an embodiment of the present application;
FIG. 5 is a second flowchart of a machine learning data processing method according to an embodiment of the present application;
FIG. 6 is a third flowchart of a method for processing machine learning data according to an embodiment of the present application;
FIG. 7 is a fourth flowchart of a machine learning data processing method according to an embodiment of the present application;
FIG. 8 is a block diagram of a machine learning data processing apparatus according to an embodiment of the present application;
fig. 9 is a structural diagram of a project file sending unit in the embodiment of the present application;
FIG. 10 is a second block diagram of a machine learning data processing apparatus according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of an electronic device in an embodiment of the present application;
FIG. 12 is a state transition diagram of the online teaching experiment platform in an embodiment of the present application;
FIG. 13 is a state transition diagram of user management data in an embodiment of the present application;
FIG. 14 is a state transition diagram of task management data in an embodiment of the present application;
FIG. 15 is a state transition diagram of device management data in an embodiment of the present application;
FIG. 16 is a diagram illustrating a user registration sequence in an embodiment of the present application;
FIG. 17 is a diagram illustrating a task management sequence in an embodiment of the present application;
fig. 18 is a schematic diagram of an accelerator deployment interface constructed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. It is obvious that the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Aiming at the problems in the prior art, the application provides a machine learning data processing method and device that enable a user to complete the deployment of a neural network model on an AI accelerator with only a small number of operations and obtain an inference execution result, and that further support data analysis of the inference execution process of the neural network model at both the individual-user and user-group levels. By constructing the accelerator deployment interface, the machine learning data processing method and device decouple the AI accelerator hardware from the management server software at the system level and achieve excellent device extensibility.
In order to better illustrate the technical advantages of the application, the embodiments are illustrated with an online teaching experiment platform as the application scenario. The online teaching experiment platform aims to let students (users) conveniently use AI accelerators to complete the inference execution of the neural network models they need, and to perform data analysis on the inference execution process to complete the teaching experiment.
The architecture of the online teaching experiment platform is shown in fig. 1 and fig. 2. The architecture divides overall into a software layer and a device layer. The software layer comprises the client, the server and the databases; the device layer comprises a variety of real AI accelerator hardware devices, on which run applications capable of executing object code, including user-submitted project files, as described in more detail below. A user, acting as a data processing requester, makes a data processing request to the server through the client, i.e., asks the server to start the AI accelerator the user needs to help perform inference execution of the neural network model. All data from the inference execution process is stored in a result database and a user database (hereinafter also referred to as the requester database); inference execution data and user behavior data are stored separately so that user information can be looked up quickly. It should be noted that the databases may be, but are not limited to being, hosted on the server. Subsequently, the user may request user behavior analysis from the server, which extracts data from the database and transmits it to the client for visual display (i.e., a browser page). The user database stores user information; the result database stores project run results.
In one embodiment, referring to fig. 3, in order to construct an accelerator deployment interface and obtain inference execution results using an AI accelerator, the present application provides a machine learning data processing method, including:
S101: receiving a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises a project file for the raw data to be processed; the project file comprises the raw data to be processed, a machine learning model and inference execution code;
S102: constructing an accelerator deployment interface and calling the accelerator deployment interface to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result;
S103: and receiving the inference execution result returned by the AI accelerator, and feeding the inference execution result back to the data processing requester.
It is understood that, in the above application scenario, a student user, as a data processing requester, may make a data processing request to the server through the client. Such student users have usually already trained their neural network model (i.e., the machine learning model) before making the request. In this case, the processing request includes the project file for the raw data to be processed, and the project file comprises the raw data to be processed, the machine learning model and the inference execution code. The server can then call the AI accelerator to perform inference execution on the raw data to be processed according to the machine learning model and the inference execution code, obtaining an inference execution result.
① Client: the client is configured with a visualization page that the user can access directly. When submitting a data processing request, the user submits a project file on the visualization page and fills in form information, including but not limited to the project running parameters, project type and target AI accelerator. The target AI accelerator is the AI accelerator the user selects according to the characteristics of their machine learning model. The client sends the project file submitted by the user to the server, waits for the project run result (i.e., the inference execution result) returned by the server, and displays it on the visualization page for the user to view.
② Server: the server is responsible for receiving the processing request from the client, constructing the accelerator deployment interface, and calling the accelerator deployment interface to send the project file to the AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code; the server obtains the inference execution result corresponding to the processing request and returns it to the client.
Before sending the project file to the AI accelerator, the server may first determine whether to accept the processing request. The checks include but are not limited to: a request-limit check, a device-resource check and a file-format check. The request-limit check determines whether the user has repeatedly submitted the processing request; the device-resource check determines whether the target AI accelerator requested by the user is idle; the file-format check determines whether the format of the project file submitted by the user meets the specification. In addition, the server can gather statistics on user behavior, including but not limited to error-region statistics, error-message statistics, submission-frequency statistics and submission-time statistics for the user's project execution. The user behavior information obtained is stored in the user database.
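The application does not give an implementation of these three checks, so the following is a minimal Python sketch under stated assumptions: the device status map, the expected archive layout and all names (admit, DEVICE_STATUS, REQUIRED_ENTRIES) are illustrative, not taken from this application.

```python
import zipfile

RECENT_REQUESTS = set()                               # (user_id, file_hash) pairs already seen
DEVICE_STATUS = {"npu-0": "idle", "fpga-0": "busy"}   # assumed device status map
REQUIRED_ENTRIES = ("model", "data", "inference")     # assumed project archive layout

def admit(user_id: str, file_hash: str, target: str, project_path: str) -> bool:
    # 1. Request-limit check: reject a repeatedly submitted request.
    if (user_id, file_hash) in RECENT_REQUESTS:
        return False
    # 2. Device-resource check: the target AI accelerator must be idle.
    if DEVICE_STATUS.get(target) != "idle":
        return False
    # 3. File-format check: the project archive must contain the expected parts.
    with zipfile.ZipFile(project_path) as zf:
        names = zf.namelist()
    if not all(any(req in n for n in names) for req in REQUIRED_ENTRIES):
        return False
    RECENT_REQUESTS.add((user_id, file_hash))
    return True
```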
③ AI accelerator: the platform provides several kinds of AI accelerators, each backed by a number of actual hardware devices. These devices are the final deployment environment for the project files submitted by users. The various AI accelerators communicate with the server through the constructed accelerator deployment interface to receive project files, and the results of execution (the inference execution results) are transmitted back to the server through the same interface, so that the server can store them in the result database.
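Putting the three roles together, here is a hedged end-to-end sketch of steps S101 to S103 on the server side; ProcessingRequest, the deploy_interface.run call and the result-database object are assumptions for illustration, not APIs named in this application.

```python
from dataclasses import dataclass

@dataclass
class ProcessingRequest:
    user_id: str
    target_accelerator: str   # chosen by the user on the client form
    project_file: bytes       # archive: raw data + model + inference code

class Server:
    def __init__(self, deploy_interface, result_db):
        self.deploy_interface = deploy_interface  # accelerator deployment interface
        self.result_db = result_db                # result database

    def handle(self, request: ProcessingRequest) -> bytes:
        # S101: the processing request (with its project file) has been received.
        # S102: call the deployment interface to send the project file to the
        # AI accelerator, which runs inference with the bundled model and code.
        result = self.deploy_interface.run(request.target_accelerator,
                                           request.project_file)
        # S103: persist the inference execution result and feed it back.
        self.result_db.save(request.user_id, result)
        return result
```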
In a specific implementation, the state transition diagram of the online teaching experiment platform is shown in fig. 12 and divides into five states. When the user has not submitted any processing request, the platform is in the idle state. When the user submits a processing request, the server switches to the running state, while also asynchronously monitoring whether other users submit processing requests. In the running state, if the processing request fails the server's checks (see the descriptions of S401 to S403 in fig. 6), the run is terminated, the server releases its resources, enters the run-interrupted state, and then exits. If all devices operate normally, the platform enters the run-finished state after a successful run and then exits. The "running" state corresponds to the process in which the AI accelerator actually performs inference execution. An illustrative encoding of these states is sketched below.
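A minimal sketch of the five states and the transitions just described; the state and event names paraphrase the description and are assumptions.

```python
from enum import Enum, auto

class PlatformState(Enum):
    IDLE = auto()             # no processing request submitted
    RUNNING = auto()          # AI accelerator performing inference execution
    RUN_INTERRUPTED = auto()  # checks failed, resources released
    RUN_FINISHED = auto()     # inference execution completed successfully
    EXITED = auto()

TRANSITIONS = {
    (PlatformState.IDLE, "submit"): PlatformState.RUNNING,
    (PlatformState.RUNNING, "check_failed"): PlatformState.RUN_INTERRUPTED,
    (PlatformState.RUNNING, "success"): PlatformState.RUN_FINISHED,
    (PlatformState.RUN_INTERRUPTED, "exit"): PlatformState.EXITED,
    (PlatformState.RUN_FINISHED, "exit"): PlatformState.EXITED,
}
```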
The management of the online teaching experiment platform divides into three major aspects: user management, task management and device management, which handle users, tasks (i.e., processing requests) and devices (AI accelerators) respectively.
User management: referring to fig. 13, this concerns the management of user information data. The server queries the database for the user's information and returns the corresponding result for the user to view and/or modify.
Task management: referring to fig. 14, this concerns the user's operations on submitted project files. The server gives corresponding feedback according to the user's current authority level and the operation to be performed. For example, for a historical task query, the server not only returns the historical result but also provides a download channel for the corresponding result file, so that the user can compare it with the latest result data.
Device management: referring to fig. 15, this concerns a series of operations the server performs on devices. The user node is not shown in fig. 15, illustrating that these operations are transparent to the user. An administrator can manually restart a faulty device through the server and can check device resource occupancy at any time, for fault analysis or as a basis for cluster expansion.
As can be seen from the above description, the machine learning data processing method provided by the application aims to simplify use, so that a user can complete the deployment of a neural network model on an AI accelerator with only a small number of operations and obtain an inference execution result, and can further perform data analysis on the inference execution process of the neural network model at both the individual-user and user-group levels. By constructing the accelerator deployment interface, the connection between the AI accelerator hardware and the management server software is decoupled at the system level, giving excellent device extensibility.
In an embodiment, referring to fig. 4, constructing the accelerator deployment interface and calling the accelerator deployment interface to send the project file to an AI accelerator includes:
S201: constructing the accelerator deployment interface according to the type of the AI accelerator;
S202: constructing a deployment configuration file of the AI accelerator according to the processing request;
S203: inputting the deployment configuration file and the project file into the accelerator deployment interface;
S204: selecting a deployment running script corresponding to the AI accelerator by using the deployment configuration file;
S205: and running the deployment running script, calling the accelerator deployment interface to load the project file onto the AI accelerator.
It is to be understood that the processing procedure after the user submits the processing request is shown in fig. 17, which takes as an example a task (i.e., a processing request) being submitted by a user, processed and returned. The client builds a request packet from the form information (project running parameters, project type and target AI accelerator platform) and the project file uploaded by the user, and sends it to the server. After receiving the request, the server first performs permission checks (such as the request-limit check, device-resource check and file-format check) and then decides, according to the results, whether to allow the processing request to continue. When the checks pass, the server constructs a configuration file, whose sources may be, but are not limited to: (1) the request form submitted by the client (project running parameters, project type and target AI accelerator platform); (2) the device number assigned to the processing request by the server. After building the configuration file, the server inputs the project file and the configuration file into the deployment interface, uses the deployment interface to execute the deployment running script of the target AI accelerator (provided by the AI accelerator hardware manufacturer), and deploys the project onto the AI accelerator for execution.
It should be noted that the online teaching experiment platform has various kinds of AI accelerator hardware, so the user must clearly identify in the deployment configuration file which AI accelerator hardware they want to use. The server then selects the deployment running script corresponding to that AI accelerator according to this information, runs the script, and uses the accelerator deployment interface to load the project file onto the AI accelerator.
In an embodiment, referring to fig. 18, the accelerator deployment interface is constructed according to the type of the AI accelerator. When the processing request is received, the project file and the configuration file are input into the accelerator deployment interface, and the accelerator deployment interface is called to send the project file to the AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result. The running script selection module in the deployment interface selects the corresponding deployment running script according to the deployment configuration file, and the server deploys the project onto the target AI accelerator by executing that script. The accelerator deployment interface may be built either before or after the server receives the processing request, though more often before.
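As a minimal sketch of steps S201 to S205, the following assumes a JSON deployment configuration file with accelerator_type and device_id fields and a mapping to vendor-provided scripts; the paths, keys and names are illustrative, not specified by this application.

```python
import json
import subprocess

# Assumed mapping from accelerator type to its vendor-provided script.
DEPLOY_SCRIPTS = {
    "vendor_a_npu": "./deploy/vendor_a.sh",
    "vendor_b_fpga": "./deploy/vendor_b.sh",
}

def deploy(config_path: str, project_path: str) -> bytes:
    with open(config_path) as f:          # S203: config + project file go in
        config = json.load(f)
    script = DEPLOY_SCRIPTS[config["accelerator_type"]]   # S204: pick script
    # S205: run the vendor deployment script, loading the project file onto
    # the device identified by the device number the server assigned.
    completed = subprocess.run(
        [script, project_path, str(config["device_id"])],
        check=True, capture_output=True)
    return completed.stdout               # inference execution result
```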
As can be seen from the above description, the machine learning data processing method provided by the present application can construct an accelerator deployment interface and call the accelerator deployment interface to send the project file to the AI accelerator.
In one embodiment, referring to fig. 5, the processing request includes user behavior data, and the machine learning data processing method further comprises the following steps:
S301: analyzing the user behavior data to obtain a behavior analysis result for the data processing requester; the behavior analysis result comprises error-region statistics, error-message statistics, submission-time statistics and submission-frequency statistics for the inference execution process;
S302: and sending the behavior analysis result to a requester database and to the data processing requester.
It will be appreciated that the online teaching experiment platform includes databases that store users' basic information, behavior information, project run results and so on, divided into a user database and a result database. As noted earlier, this separation facilitates fast analysis of inference execution results and fast lookup of user behavior information; keeping the user database apart from the result database prevents an oversized result database from imposing excessive system overhead on every search. The result database and the user database are mapped to each other so that inference execution results can be associated with users. Specifically, the mapping links the entries of the result database and the user database through a user identity field, as sketched below.
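A hedged sketch of this two-database split with a shared identity field; SQLite and the table and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (                 -- requester (user) database
        user_id  TEXT PRIMARY KEY,
        name     TEXT,
        behavior TEXT                    -- aggregated behavior statistics
    );
    CREATE TABLE results (               -- result database
        result_id INTEGER PRIMARY KEY,
        user_id   TEXT REFERENCES users(user_id),  -- shared identity field
        output    BLOB                   -- inference execution result
    );
""")
# Joining on user_id associates each inference execution result with its
# user while keeping routine user lookups free of bulky result rows.
rows = conn.execute(
    "SELECT u.name, r.output FROM users u JOIN results r USING (user_id)"
).fetchall()
```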
The actual data analysis divides into analysis of individual user behavior and analysis of user group behavior. Analysis results can be provided to users as graphs, tables and the like, presenting the corresponding visual statistics. Individual behavior analysis is open to users and system administrators; it includes project submission accuracy, error-message statistics, error-region statistics, submission-frequency statistics, submission-time statistics and so on, so students can adjust their learning, work on weak points and effectively improve learning efficiency. Group behavior analysis is open only to system administrators; it includes student error-region distribution statistics, error-message statistics, submission-frequency statistics, submission-time statistics and so on, revealing students' weak knowledge points so that teaching tasks and teaching plans can be improved in a targeted way, and feeding back into further optimization of the online experiment platform. Analysis results may be stored in the database; a sketch of such an aggregation follows.
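An illustrative aggregation of the four statistics named above; the event schema (user_id, error_region, error_msg, submitted_at as a datetime) is an assumption for the sketch.

```python
from collections import Counter
from typing import Iterable, Optional

def analyze(events: Iterable[dict], user_id: Optional[str] = None) -> dict:
    # A user_id yields an individual analysis; None yields a group analysis.
    selected = [e for e in events if user_id is None or e["user_id"] == user_id]
    return {
        "error_regions": Counter(e["error_region"] for e in selected
                                 if e.get("error_region")),
        "error_messages": Counter(e["error_msg"] for e in selected
                                  if e.get("error_msg")),
        "submission_hours": Counter(e["submitted_at"].hour for e in selected),
        "submission_count": len(selected),
    }
```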
From the above description, the machine learning data processing method provided by the application can analyze user behavior data to obtain a behavior analysis result for the data processing requester, and send the behavior analysis result to the requester database and the data processing requester.
The requester database, also referred to as the user database, may store a variety of user behavior data, including but not limited to user registration data, login data, and the various data produced when inference execution is requested.
In order to better illustrate the method of the present application, the specific workflow of the online platform is described in three aspects: user registration, user login, and task submission and processing.
Firstly, user registration: fig. 16 shows the flow of user registration. When registering, the user enters a user name, password, mailbox, tags of interest and other information. The client first performs format verification on the information entered by the user and then sends it to the server. The server further checks the information (for conflicts, etc.) and, if there is no problem, writes the user information into the user database. The user database, the server and the client then return results in sequence.
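A small sketch of this registration flow under stated assumptions: the format rules, the in-memory USERS store and the return values are all illustrative, and a real system would hash the password before storing it.

```python
import re

USERS = {}   # stands in for the user database

def client_format_check(username: str, password: str, email: str) -> bool:
    # Client-side format verification before anything is sent to the server.
    return (re.fullmatch(r"\w{3,20}", username) is not None
            and len(password) >= 8
            and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email) is not None)

def server_register(username: str, password: str, email: str,
                    tags: list) -> str:
    # Server-side conflict check before writing to the user database.
    if username in USERS:
        return "conflict"
    # Assumption: plain storage for brevity; hash the password in practice.
    USERS[username] = {"password": password, "email": email, "tags": tags}
    return "ok"   # the result propagates database -> server -> client
```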
Secondly, user login: fig. 17 shows the flow of user login. When the user enters the personal information interface, the client sends a processing request to the server. The server checks the client's authority (including whether the client is logged in and whether the logged-in user has permission to obtain the information), reads the personal information from the user database, and returns it to the client. The client renders and displays the information. When the user changes personal information on the client, the changed information is sent to the server; the server checks it again, writes it into the user database if there is no problem, and returns the change status. The client then returns the write result to the user.
Thirdly, task submission and processing: an accelerator deployment interface is constructed and called to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result corresponding to the processing request; the inference execution result returned by the AI accelerator is then received and fed back to the data processing requester.
In an embodiment, referring to fig. 7, the machine learning data processing method further includes:
S501: when an inference execution result analysis request sent by the data processing requester is received, performing inference execution analysis according to the inference execution result;
S502: and returning the obtained inference execution analysis result to the data processing requester.
It is understood that the AI accelerator can return the inference execution result to the server, which on the one hand saves it in the result database and on the other hand returns it to the client. If, after checking the returned inference execution result, the user wants to see the process in more detail, the user sends a result download request (which may be an inference execution result analysis request), and the server returns the downloadable content, which may include inference performance analysis results. It should be noted that the server can process requests asynchronously, so the requests of multiple users can be handled at the same time, as sketched below.
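A minimal sketch of asynchronous handling of concurrent requests; asyncio and the handler names are assumptions, and the sleep stands in for the deploy-and-infer round trip.

```python
import asyncio

async def handle_request(user_id: str, project_file: bytes) -> str:
    # Deployment and inference run as awaitable I/O, so one slow job does
    # not block other users' submissions.
    await asyncio.sleep(0)   # placeholder for the deploy + inference round trip
    return f"result for {user_id}"

async def main() -> None:
    results = await asyncio.gather(
        handle_request("alice", b"..."),
        handle_request("bob", b"..."))
    print(results)

asyncio.run(main())
```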
As can be seen from the above description, the machine learning data processing method provided in the present application can return the obtained inference execution analysis result to the data processing requester.
In summary, the benefits brought by the method described herein include, but are not limited to, the following.
The method and device use a layered software/hardware structure: the software layer and the device layer are separated and interact through a communication interface, which is the unified communication interface between the server and the AI accelerator devices, so a new AI accelerator device can be mounted easily. The teaching-oriented data collection and analysis system can collect information about students' platform usage, such as project submission accuracy and error-message statistics, and provide behavior analysis of both individual students and groups.
The method and device have excellent extensibility and support various AI accelerators, determining the hardware deployment script for project file execution through the configuration file and deploying the project file onto the AI accelerator. The platform divides into a software layer and a device layer that communicate through an interaction interface, decoupling the AI accelerator devices from the platform management software at the system level. Therefore, when a new device is needed, the platform can complete its deployment through a simple interface-docking process.
The method and device adopt load balancing: when resources are scarce but users submit projects frequently, device-level load-balanced scheduling evens out the task pressure among devices, preventing a single device that cannot respond quickly to submitted project files from degrading the user experience. A minimal scheduling sketch follows.
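Assuming queue length as the load metric (the application does not fix one), a least-loaded device picker could look like this:

```python
def pick_device(queue_lengths: dict) -> str:
    # queue_lengths maps device id -> number of queued project files.
    return min(queue_lengths, key=queue_lengths.get)

assert pick_device({"npu-0": 3, "npu-1": 1, "npu-2": 2}) == "npu-1"
```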
The method and device have good robustness: a problem program (for example, one containing an infinite loop) is stopped by a forced emergency interruption from the server scheduling system, preventing unpredictable errors from blocking user requests. A watchdog sketch follows.
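A hedged sketch of such a forced interruption using a subprocess timeout; the time budget and the abort message are assumptions.

```python
import subprocess

def run_with_watchdog(cmd: list, timeout_s: int = 300) -> bytes:
    # Kill a deployment run that exceeds its time budget (e.g. an infinite
    # loop) so it cannot block later requests.
    try:
        return subprocess.run(cmd, check=True, capture_output=True,
                              timeout=timeout_s).stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child when the timeout expires.
        return b"run interrupted: time budget exceeded"
```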
The method and device can serve a teaching data collection and analysis system that fully analyzes students' weak points so that course plans can be reformed in a targeted way. Platform vulnerabilities can also be discovered through data analysis, further improving the platform.
Based on the same inventive concept, an embodiment of the present application further provides a machine learning data processing apparatus that can be used to implement the methods described in the foregoing embodiments, as described below. Because the principle by which the machine learning data processing apparatus solves the problem is similar to that of the machine learning data processing method, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the apparatus described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware, are also possible and contemplated.
In one embodiment, referring to fig. 8, in order to construct an accelerator deployment interface and obtain inference execution results using an AI accelerator, the present application provides a machine learning data processing apparatus comprising a processing request receiving unit 801, a project file sending unit 802 and an execution result returning unit 803.
The processing request receiving unit 801 is configured to receive a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises a project file for the raw data to be processed; the project file comprises the raw data to be processed, a machine learning model and inference execution code.
The project file sending unit 802 is configured to construct an accelerator deployment interface and call the accelerator deployment interface to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result.
The execution result returning unit 803 is configured to receive the inference execution result returned by the AI accelerator and feed the inference execution result back to the data processing requester.
In an embodiment, referring to fig. 9, the project file sending unit 802 includes a deployment interface construction module 901, a configuration file construction module 902, a file input module 903, a running script selection module 904 and a project file loading module 905.
The deployment interface construction module 901 is configured to construct the accelerator deployment interface according to the type of the AI accelerator;
the configuration file construction module 902 is configured to construct a deployment configuration file of the AI accelerator according to the processing request;
the file input module 903 is configured to input the deployment configuration file and the project file into the accelerator deployment interface;
the running script selection module 904 is configured to select a deployment running script corresponding to the AI accelerator by using the deployment configuration file;
and the project file loading module 905 is configured to run the deployment running script and call the accelerator deployment interface to load the project file onto the AI accelerator.
In an embodiment, referring to fig. 10, the machine learning data processing apparatus further includes: a user behavior analysis unit 1001 and an analysis result feedback unit 1002.
The user behavior analysis unit 1001 is configured to analyze the user behavior data to obtain a behavior analysis result for the data processing requester; the behavior analysis result comprises error-region statistics, error-message statistics, submission-time statistics and submission-frequency statistics for the inference execution process;
an analysis result feedback unit 1002, configured to send the behavior analysis result to the requester database and the data processing requester.
From the hardware level, in order to construct an accelerator deployment interface and obtain inference execution results using an AI accelerator, the present application provides an embodiment of an electronic device implementing all or part of the machine learning data processing method. The electronic device specifically includes the following:
a processor, a memory, a communication interface and a bus; the processor, the memory and the communication interface complete mutual communication through the bus; the communication interface is used for information transmission between the machine learning data processing apparatus and related equipment such as a core service system, user terminals and related databases. The electronic device may be a desktop computer, a tablet computer, a mobile terminal, or the like, but the embodiment is not limited thereto. In this embodiment, the electronic device may be implemented with reference to the embodiments of the machine learning data processing method and of the machine learning data processing apparatus, whose contents are incorporated herein; repeated descriptions are omitted.
It is understood that the user terminal may include a smart phone, a tablet electronic device, a network set-top box, a portable computer, a desktop computer, a Personal Digital Assistant (PDA), an in-vehicle device, a smart wearable device, and the like. Wherein, intelligence wearing equipment can include intelligent glasses, intelligent wrist-watch, intelligent bracelet etc..
In practical applications, part of the machine learning data processing method may be executed on the electronic device side as described above, or all operations may be completed in the client device. The choice may be made according to the processing capability of the client device, the limitations of the user's usage scenario, and the like; this application is not limited in this respect. If all operations are completed in the client device, the client device may further include a processor.
The client device may have a communication module (i.e., a communication unit), and may be in communication connection with a remote server to implement data transmission with the server. The server may include a server on the side of the task scheduling center, and in other implementation scenarios, the server may also include a server on an intermediate platform, for example, a server on a third-party server platform that is communicatively linked to the task scheduling center server. The server may include a single computer device, or may include a server cluster formed by a plurality of servers, or a server structure of a distributed apparatus.
Fig. 11 is a schematic block diagram of a system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 11, the electronic device 9600 can include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, this FIG. 11 is exemplary; other types of structures may also be used in addition to or in place of the structure to implement telecommunications or other functions.
In one embodiment, the functions of the machine learning data processing method may be integrated into the central processor 9100. The central processor 9100 may be configured to perform the following control:
S101: receiving a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises a project file for the raw data to be processed; the project file comprises the raw data to be processed, a machine learning model and inference execution code;
S102: constructing an accelerator deployment interface and calling the accelerator deployment interface to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result;
S103: and receiving the inference execution result returned by the AI accelerator, and feeding the inference execution result back to the data processing requester.
As can be seen from the above description, the machine learning data processing method and device provided by the present application aim to simplify use, so that a user can complete the deployment of a neural network model on an AI accelerator with only a small number of operations and obtain an inference execution result, and can further perform data analysis on the inference execution process of the neural network model at both the individual-user and user-group levels. By constructing the accelerator deployment interface, the connection between the AI accelerator hardware and the management server software is decoupled at the system level, giving excellent device extensibility.
In another embodiment, the machine learning data processing apparatus may be configured separately from the central processor 9100; for example, the machine learning data processing apparatus may be configured as a chip connected to the central processor 9100, with the functions of the machine learning data processing method realized under the control of the central processor.
As shown in fig. 11, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 also does not necessarily include all of the components shown in fig. 11; in addition, the electronic device 9600 may further include components not shown in fig. 11, which may be referred to in the prior art.
As shown in fig. 11, a central processor 9100, sometimes referred to as a controller or operational control, can include a microprocessor or other processor device and/or logic device, which central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.
The memory 9140 can be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or other suitable device. It may store information such as failure-related data as well as the programs that operate on that information, and the central processor 9100 can execute the programs stored in the memory 9140 to realize information storage or processing.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. Power supply 9170 is used to provide power to electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.
The memory 9140 can be a solid state memory, e.g., read-only memory (ROM), random access memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off, can be selectively erased and reprogrammed with data, an example of which is sometimes called an EPROM or the like. The memory 9140 could also be some other type of device. The memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer) and an application/function storage portion 9142, the application/function storage portion 9142 being used to store application programs and function programs or the flow for executing the operations of the electronic device 9600 through the central processor 9100.
The memory 9140 can also include a data store 9143, the data store 9143 being used to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers for the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, contact book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless lan module, may be disposed in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling locally stored sounds to be played through the speaker 9131.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps of the machine learning data processing method whose execution subject is the server or the client in the above embodiments. The computer-readable storage medium stores a computer program which, when executed by a processor, implements all the steps of that machine learning data processing method; for example, the processor implements the following steps when executing the program:
S101: receiving a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises a project file for the raw data to be processed; the project file comprises the raw data to be processed, a machine learning model and inference execution code;
S102: constructing an accelerator deployment interface and calling the accelerator deployment interface to send the project file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result;
S103: and receiving the inference execution result returned by the AI accelerator, and feeding the inference execution result back to the data processing requester.
As can be seen from the above description, the machine learning data processing method and device provided by the present application aim to simplify use, so that a user can complete the deployment of a neural network model on an AI accelerator with only a small number of operations and obtain an inference execution result, and can further perform data analysis on the inference execution process of the neural network model at both the individual-user and user-group levels. By constructing the accelerator deployment interface, the connection between the AI accelerator hardware and the management server software is decoupled at the system level, giving excellent device extensibility.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and implementations of the present invention are described herein through specific embodiments, and the description of the embodiments is intended only to help in understanding the method and its core idea; meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A machine learning data processing method, comprising:
receiving a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises the engineering file of the raw data to be processed; the engineering file comprises the raw data to be processed, a machine learning model and an inference execution code;
constructing an accelerator deployment interface, and calling the accelerator deployment interface to send the engineering file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result corresponding to the processing request;
and receiving the inference execution result returned by the AI accelerator, and feeding the inference execution result back to the data processing requester.
2. The machine learning data processing method of claim 1, wherein the constructing an accelerator deployment interface and calling the accelerator deployment interface to send the engineering file to an AI accelerator comprises:
constructing the accelerator deployment interface according to the type of the AI accelerator;
constructing a deployment configuration file of the AI accelerator according to the processing request;
inputting the deployment configuration file and the engineering file into the accelerator deployment interface;
selecting a deployment run script corresponding to the AI accelerator by using the deployment configuration file;
and running the deployment run script, and calling the accelerator deployment interface to load the engineering file onto the AI accelerator.
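As a hedged illustration of the flow recited in claim 2, the following Python sketch builds the deployment configuration file, selects the matching deployment run script, and runs it to load the engineering file onto the AI accelerator; the accelerator types, script paths, and helper names are assumptions of this sketch, not part of the claim.

import json
import subprocess
from pathlib import Path

# Hypothetical mapping from accelerator type to its deployment run script.
DEPLOY_SCRIPTS = {
    "fpga": Path("scripts/deploy_fpga.sh"),
    "npu": Path("scripts/deploy_npu.sh"),
}

def build_deploy_config(request: dict) -> dict:
    # Construct the deployment configuration file contents from the processing request.
    return {
        "accelerator_type": request["accelerator_type"],
        "engineering_file": request["engineering_file_path"],
    }

def deploy(request: dict) -> None:
    config = build_deploy_config(request)
    config_path = Path("deploy_config.json")
    config_path.write_text(json.dumps(config))  # the deployment configuration file
    # Select the deployment run script corresponding to the AI accelerator.
    script = DEPLOY_SCRIPTS[config["accelerator_type"]]
    # Run the script, which loads the engineering file onto the AI accelerator.
    subprocess.run([str(script), str(config_path), config["engineering_file"]], check=True)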
3. The machine learning data processing method of claim 1, wherein the processing request includes user behavior data, and the machine learning data processing method further comprises:
analyzing the user behavior data to obtain a behavior analysis result for the data processing requester; the behavior analysis result comprises error region statistics, error message statistics, submission time statistics and submission frequency statistics for the inference execution process;
and sending the behavior analysis result to the data processing requester.
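Purely as a sketch of the four statistics named in claim 3, and assuming hypothetical per-submission record fields (error_region, error_message, submitted_at) that the claim does not specify:

from collections import Counter

def analyze_user_behavior(records: list) -> dict:
    # Aggregate the four behavior statistics from per-submission records.
    return {
        "error_region_stats": Counter(r["error_region"] for r in records if r.get("error_region")),
        "error_message_stats": Counter(r["error_message"] for r in records if r.get("error_message")),
        "submission_time_stats": sorted(r["submitted_at"] for r in records),
        "submission_frequency": len(records),
    }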
4. The machine learning data processing method of claim 1, further comprising:
checking whether the processing request is a repeated submission;
if it is not, checking whether the AI accelerator requested by the processing request is idle;
and if the AI accelerator is idle, checking whether the engineering file in the processing request conforms to the format specification.
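The three checks of claim 4 chain naturally; a minimal sketch, assuming the three predicate helpers are supplied by the server:

def validate_request(request, is_duplicate, accelerator_is_idle, file_is_well_formed) -> bool:
    # Reject repeated submissions first.
    if is_duplicate(request):
        return False
    # Then require the requested AI accelerator to be idle.
    if not accelerator_is_idle(request):
        return False
    # Finally check the engineering file against the format specification.
    return file_is_well_formed(request)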
5. The machine learning data processing method of claim 1, further comprising:
when an inference execution result analysis request sent by the data processing requester is received, performing inference execution analysis according to the inference execution result;
and transmitting the obtained inference execution analysis result back to the data processing requester.
6. A machine learning data processing apparatus, comprising:
a processing request receiving unit, configured to receive a processing request for raw data to be processed, sent by a data processing requester; the processing request comprises the engineering file of the raw data to be processed; the engineering file comprises the raw data to be processed, a machine learning model and an inference execution code;
an engineering file sending unit, configured to construct an accelerator deployment interface and call the accelerator deployment interface to send the engineering file to an AI accelerator, so that the AI accelerator performs inference execution on the raw data to be processed according to the machine learning model and the inference execution code to obtain an inference execution result corresponding to the processing request;
and an execution result returning unit, configured to receive the inference execution result returned by the AI accelerator and feed the inference execution result back to the data processing requester.
7. The machine learning data processing apparatus according to claim 6, wherein the engineering file sending unit comprises:
a deployment interface construction module, configured to construct the accelerator deployment interface according to the type of the AI accelerator;
a configuration file construction module, configured to construct a deployment configuration file of the AI accelerator according to the processing request;
a file input module, configured to input the deployment configuration file and the engineering file into the accelerator deployment interface;
a run script selection module, configured to select a deployment run script corresponding to the AI accelerator by using the deployment configuration file;
and an engineering file loading module, configured to run the deployment run script and call the accelerator deployment interface to load the engineering file onto the AI accelerator.
8. The machine learning data processing apparatus according to claim 6, wherein the processing request includes user behavior data, and the machine learning data processing apparatus further comprises:
a user behavior analysis unit, configured to analyze the user behavior data to obtain a behavior analysis result for the data processing requester; the behavior analysis result comprises error region statistics, error message statistics, submission time statistics and submission frequency statistics for the inference execution process;
and an analysis result feedback unit, configured to send the behavior analysis result to the data processing requester.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the steps of the machine learning data processing method of any one of claims 1 to 5 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the machine learning data processing method according to any one of claims 1 to 5.
Application CN202111443501.3A, filed 2021-11-30 (priority date 2021-11-30): Machine learning data processing method and device. Published as CN114154644A; legal status: Pending.

Priority Applications (1)

Application Number: CN202111443501.3A; Priority Date: 2021-11-30; Filing Date: 2021-11-30; Title: Machine learning data processing method and device

Publications (1)

Publication Number: CN114154644A; Publication Date: 2022-03-08

Family ID: 80784457


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190138908A1 (en) * 2018-12-28 2019-05-09 Francesc Guim Bernat Artificial intelligence inference architecture with hardware acceleration
CN111461332A (en) * 2020-03-24 2020-07-28 北京五八信息技术有限公司 Deep learning model online reasoning method and device, electronic equipment and storage medium
CN111857734A (en) * 2020-06-19 2020-10-30 苏州浪潮智能科技有限公司 Deployment and use method of distributed deep learning model platform
CN112329945A (en) * 2020-11-24 2021-02-05 广州市网星信息技术有限公司 Model deployment and reasoning method and device
WO2021088964A1 (en) * 2019-11-08 2021-05-14 阿里巴巴集团控股有限公司 Inference system, inference method, electronic device and computer storage medium
CN113204413A (en) * 2020-02-03 2021-08-03 阿里巴巴集团控股有限公司 Task processing method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination