WO2024063437A1

WO2024063437A1 - Method and device for managing artificial intelligence model

Info

Publication number: WO2024063437A1
Application number: PCT/KR2023/013789
Authority: WO
Inventors: 콘다하르샤; 고얄알록; 라마 춘추스리
Original assignee: 쿠팡 주식회사
Priority date: 2022-09-22
Filing date: 2023-09-14
Publication date: 2024-03-28

Abstract

Disclosed is a method for managing an artificial intelligence model in an electronic device. Specifically, the method for managing an artificial intelligence model may comprise the steps of: receiving a request from a client device; identifying information related to the request; identifying at least one artificial intelligence model corresponding to the request on the basis of the information; transmitting control information identified on the basis of the information to the at least one artificial intelligence model; receiving result information according to the control information from the at least one artificial intelligence model; and providing the result information to the client device.

Description

Method and device for managing artificial intelligence models

Embodiments of this specification relate to methods and devices for managing artificial intelligence models. More specifically, they relate to a method and device for managing an artificial intelligence model in which control information identified based on information related to a request is transmitted to an artificial intelligence model, and result information received from the artificial intelligence model is provided to a client device, thereby controlling the interaction between the client device and the artificial intelligence model.

Recently, when providing services in an Internet environment, such as online marketing, business decisions must be made quickly. In the past, these decisions were made manually, but there is now an increasing tendency to automate decision-making by using machine learning models. However, as services expand and high traffic must be handled, there is a concern that decision-making will take a long time. Accordingly, methods and devices for solving this problem are required.

This disclosure is proposed to solve the above-mentioned problems and provides a method and device for managing an artificial intelligence model.

More specifically, the present disclosure aims to provide a method and device for managing an artificial intelligence model that control the interaction between a client device and an artificial intelligence model by identifying control information based on information related to a request, transmitting the identified control information to an artificial intelligence model related to the request, and providing result information received from the artificial intelligence model to the client device.

The technical problems that this embodiment aims to solve are not limited to those described above, and other technical problems can be inferred from the following embodiments.

As a technical means for solving the above-described problem, a method of managing an artificial intelligence model in an electronic device according to the first aspect of the present disclosure includes the steps of: receiving a request from a client device; confirming information related to the request; confirming, based on the information, at least one artificial intelligence model corresponding to the request; transmitting control information identified based on the information to the at least one artificial intelligence model; receiving result information according to the control information from the at least one artificial intelligence model; and providing the result information to the client device.

According to one embodiment, the information may include identification information about the request, information about the at least one artificial intelligence model corresponding to the request, information about input data of the at least one artificial intelligence model, and information about the execution type corresponding to the request.

According to one embodiment, the step of transmitting control information to the at least one artificial intelligence model may include: checking the execution order of the at least one artificial intelligence model based on information about the input-output relationship for the at least one artificial intelligence model; and identifying the control information based on the execution order.

According to one embodiment, when the at least one artificial intelligence model is divided into a plurality of groups, the execution order of the at least one artificial intelligence model may include an execution order between the artificial intelligence models included in each of the plurality of groups, and the plurality of groups may be executed in parallel.

According to one embodiment, information about the execution type may be determined based on the expected time until result information is received.

According to one embodiment, when the execution type is the first type, providing the result information to the client device may include providing the result information to the waiting client device in response to receiving the result information.

According to one embodiment, when the execution type is the second type, providing result information to the client device may include transmitting the result information to a data processing platform.

According to one embodiment, receiving the request may include transmitting identification information to the client device in response to receiving the request.

According to one embodiment, result information stored in the data processing platform may be characterized by being identified by the client device based on identification information.

According to one embodiment, the step of confirming at least one artificial intelligence model may include identifying each of the at least one artificial intelligence model as one of a first type of processor unit-based artificial intelligence model and a second type of processor unit-based artificial intelligence model.

According to one embodiment, a first group, which includes the first type of processor unit-based artificial intelligence models among the at least one artificial intelligence model, may include a keyword detection model, and a second group, which includes the second type of processor unit-based artificial intelligence models among the at least one artificial intelligence model, may include an optical character recognition model.

According to one embodiment, the step of confirming at least one artificial intelligence model may include assigning a first group, which includes the first type of processor unit-based artificial intelligence models among the at least one artificial intelligence model, and a second group, which includes the second type of processor unit-based artificial intelligence models among the at least one artificial intelligence model, to a first type of processor unit server and a second type of processor unit server, respectively.

According to one embodiment, the second type of processor unit server includes a plurality of instances, and the instance corresponding to each artificial intelligence model included in the second group may be determined based on history data about the memory usage of that artificial intelligence model.

An electronic device for managing an artificial intelligence model according to a second aspect of the present disclosure may include: a transceiver; storage for storing one or more instructions; and a processor that receives a request from a client device, confirms information related to the request, identifies, based on the information, at least one artificial intelligence model corresponding to the request, transmits control information identified based on the information to the at least one artificial intelligence model, receives result information according to the control information from the at least one artificial intelligence model, and provides the result information to the client device.

The recording medium according to the third aspect of the present disclosure may be a non-transitory computer-readable recording medium that records a program for executing a method for managing an artificial intelligence model on a computer.

According to an embodiment of the present specification, as an electronic device receives a request from a client device, it can check at least one artificial intelligence model corresponding to the request and control information for efficiently executing the at least one artificial intelligence model. Additionally, the electronic device can receive result information from an artificial intelligence model executed according to control information, and provide the received result information to the client device. As the function that regulates the interaction between client devices and artificial intelligence models is unified in electronic devices, latency is lowered and machine learning models can be managed efficiently.

The effect of the invention is not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description of the claims.

FIG. 1 is a diagram illustrating a system in which a method of managing an artificial intelligence model by an electronic device according to various embodiments can be implemented.

FIG. 2 is a flowchart showing how an electronic device manages an artificial intelligence model.

FIG. 3 is a diagram illustrating an embodiment in which an electronic device manages an artificial intelligence model when the execution type is the first type.

FIG. 4 is a diagram illustrating an embodiment in which an electronic device manages an artificial intelligence model when the execution type is the second type.

FIG. 5 is a diagram illustrating another embodiment in which an electronic device manages an artificial intelligence model when the execution type is the second type.

FIG. 6 is a diagram illustrating an embodiment of changing the execution type.

FIG. 7 is a diagram illustrating an embodiment of dividing artificial intelligence models into GPU-based artificial intelligence models and CPU-based artificial intelligence models and managing them.

FIG. 8 is a diagram illustrating an embodiment of allocating a GPU instance corresponding to an artificial intelligence model based on history data about memory usage of the artificial intelligence model.

FIG. 9 is a block diagram illustrating an electronic device for managing an artificial intelligence model according to an embodiment.

The terms used in the embodiments are general terms that are currently widely used as much as possible while considering the functions in the present disclosure, but this may vary depending on the intention or precedent of a person working in the art, the emergence of new technology, etc. In addition, in certain cases, there are terms arbitrarily selected by the applicant, and in this case, the meaning will be described in detail in the relevant description. Therefore, the terms used in this disclosure should be defined based on the meaning of the term and the overall content of this disclosure, rather than simply the name of the term.

When a part in the entire specification is said to “include” a certain element, this means that it does not exclude other elements but may further include other elements, unless specifically stated to the contrary. In addition, terms such as “...unit” and “...module” used in the specification refer to a unit that processes at least one function or operation, which can be implemented as hardware, software, or a combination of hardware and software.

The expression “at least one of a, b, and c” used throughout the specification may mean ‘a alone’, ‘b alone’, ‘c alone’, ‘a and b’, ‘a and c’, ‘b and c’, or ‘all of a, b, and c’.

The “terminal” mentioned below may be implemented as a computer or portable terminal that can connect to a server or other terminal through a network. Here, the computer includes, for example, a laptop or desktop equipped with a web browser. The portable terminal is, for example, a wireless communication device that guarantees portability and mobility, and may include all types of handheld wireless communication devices, such as communication-based terminals using IMT (International Mobile Telecommunication), CDMA (Code Division Multiple Access), W-CDMA (W-Code Division Multiple Access), or LTE (Long Term Evolution), smartphones, and tablet PCs.

Below, with reference to the attached drawings, embodiments of the present disclosure will be described in detail so that those skilled in the art can easily practice them. However, the present disclosure may be implemented in many different forms and is not limited to the embodiments described herein.

Hereinafter, embodiments of the present invention will be described in detail with reference to the attached drawings.

In describing the embodiments, description of technical content that is well known in the technical field to which the present invention belongs and that is not directly related to the present invention will be omitted. This is to convey the gist of the present invention more clearly without obscuring it by omitting unnecessary explanation.

For the same reason, some components are exaggerated, omitted, or schematically shown in the accompanying drawings. Additionally, the size of each component does not entirely reflect its actual size. In each drawing, identical or corresponding components are assigned the same reference numbers.

The advantages and features of the present invention and methods for achieving them will become clear by referring to the embodiments described in detail below along with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The present embodiments are provided merely to make the disclosure of the present invention complete and to fully inform those with ordinary skill in the technical field to which the present invention pertains of the scope of the invention, and the present invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

At this time, it will be understood that each block of the flowchart diagrams, and combinations of blocks in the flowchart diagrams, can be performed by computer program instructions. These computer program instructions can be loaded onto a processor of a general-purpose computer, special-purpose computer, or other programmable data processing equipment, so that the instructions executed by the processor of the computer or other programmable data processing equipment create means for performing the functions described in the flowchart block(s). These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means that perform the functions described in the flowchart block(s). The computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are performed on the computer or other programmable data processing equipment to create a computer-executed process, and the instructions that run on the computer or other programmable data processing equipment can provide steps for executing the functions described in the flowchart block(s).

Additionally, each block may represent a module, segment, or portion of code that includes one or more executable instructions for executing specified logical function(s). Additionally, it should be noted that in some alternative execution examples it is possible for the functions mentioned in the blocks to occur out of order. For example, it is possible for two blocks shown in succession to be performed substantially at the same time, or it is possible for the blocks to be performed in reverse order depending on the corresponding function.

Referring to FIG. 1, system 10 according to various embodiments may be implemented by various types of devices. For example, system 10 may include electronic device 100, client device 110, server 120, and data processing platform 130. FIG. 1 shows only the components of system 10 that are relevant to this embodiment. Accordingly, those skilled in the art will understand that other general-purpose components may be included in addition to the components shown in FIG. 1.

Electronic device 100, client device 110, server 120, and data processing platform 130 may each include a transceiver, storage, and processor. In addition, the electronic device 100, client device 110, server 120, and data processing platform 130 each refer to a unit that processes at least one function or operation, which can be implemented as hardware, software, or a combination of hardware and software. Meanwhile, throughout the embodiments, the electronic device 100, the client device 110, the server 120, and the data processing platform 130 are referred to as separate devices or servers, but this may be a logical division, and at least some of them may be implemented as separate functions within one device or server.

According to one embodiment, the electronic device 100, the client device 110, the server 120, and the data processing platform 130 may include a plurality of computer systems or computer software implemented as network servers. For example, at least some of the electronic device 100, client device 110, server 120, and data processing platform 130 may refer to a computer system and computer software that is connected to sub-devices capable of communicating with other network servers via a computer network such as an intranet or the Internet, and that receives a request to perform a task, performs the task, and provides the result. In addition, at least some of the electronic device 100, client device 110, server 120, and data processing platform 130 can be understood as a broad concept that includes a series of application programs that can run on a network server and various databases built inside it or on other connected nodes. For example, at least some of the electronic device 100, client device 110, server 120, and data processing platform 130 may be implemented using various network server programs provided for operating systems such as DOS, Windows, Linux, UNIX, or MacOS.

Electronic device 100, client device 110, server 120, and data processing platform 130 may communicate with each other through a network (not shown). The network includes a Local Area Network (LAN), a Wide Area Network (WAN), a Value Added Network (VAN), a mobile radio communication network, a satellite communication network, and combinations thereof. It is a data communication network in a comprehensive sense that allows each network entity shown in FIG. 1 to communicate smoothly with the others, and may include the wired Internet, the wireless Internet, and mobile wireless communication networks. Wireless communication includes, for example, wireless LAN (Wi-Fi), Bluetooth, Bluetooth Low Energy, ZigBee, WFD (Wi-Fi Direct), UWB (ultra-wideband), infrared communication (IrDA, Infrared Data Association), and NFC (Near Field Communication), but is not limited thereto.

According to one embodiment, the electronic device 100 may receive a request from the client device 110, check information related to the request, check at least one artificial intelligence model corresponding to the request based on the information, transmit control information identified based on the information to the at least one artificial intelligence model, receive result information according to the control information from the at least one artificial intelligence model, and provide the result information to the client device 110. Here, the operation of the electronic device 100 transmitting control information to at least one artificial intelligence model may include transmitting the control information to the server 120 that executes the at least one artificial intelligence model. The client device 110 may be a program that can access services on other computer systems, such as the electronic device 100, through a network, and the server 120 may be any processor unit server, such as a GPU server or CPU server. However, the client device 110 and server 120 are not limited thereto.

Additionally, the client device 110 may operate in standby mode while connected to the electronic device 100, or may perform another task while disconnected from the electronic device 100. More specifically, when a time-consuming artificial intelligence model needs to be executed, or when little memory is available on the server 120, it may be more appropriate for the client device 110 to operate in an asynchronous mode, that is, to send the request to the electronic device 100 and then disconnect from it.

When the connection between the electronic device 100 and the client device 110 is disconnected, the electronic device 100 may publish the result information received from the server 120 to the data processing platform 130, and the client device 110 may identify the result information among the data stored in the data processing platform 130 based on the request ID corresponding to the request, and subscribe to or consume the identified result information. Accordingly, the client device 110 may receive the result information from the electronic device 100 indirectly. Here, the result information may include output data of at least one artificial intelligence model, and the output data of each artificial intelligence model may be stored separately in the data processing platform 130. Additionally, the client device 110 may use the request ID to identify the result information, including the output data of the at least one artificial intelligence model, stored in the data processing platform 130. Here, the data processing platform 130 may be a data processing platform for data exchange between servers and electronic devices. For example, the data processing platform 130 may be a platform related to Apache Kafka, an open source message broker project, but is not limited thereto.
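The publish-and-consume exchange keyed by request ID can be illustrated with a minimal sketch. The class below is a hypothetical in-memory stand-in for the data processing platform 130; it is not a Kafka client, and the names `InMemoryPlatform`, `publish`, and `consume` are assumptions introduced only for illustration.

```python
from collections import defaultdict


class InMemoryPlatform:
    """Hypothetical stand-in for the data processing platform 130 (not a Kafka client)."""

    def __init__(self):
        # request ID -> list of (model name, output data), stored per model
        self._records = defaultdict(list)

    def publish(self, request_id, model_name, output):
        # Called by the electronic device once a model's output is available.
        self._records[request_id].append((model_name, output))

    def consume(self, request_id):
        # Called by the client; returns and removes the results for its request ID.
        return self._records.pop(request_id, [])


platform = InMemoryPlatform()
platform.publish("req-1", "ocr", "SALE 50%")
platform.publish("req-2", "keyword_detection", False)
platform.publish("req-1", "logo_detection", ["brand-x"])

# The client that sent "req-1" picks out only its own results.
results = platform.consume("req-1")
```

Because lookups are keyed by request ID, a client that disconnected after sending its request can later retrieve exactly the outputs that belong to it, without holding a connection to the electronic device.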

According to one embodiment, the electronic device 100 may operate as an orchestrator that regulates the interaction between the client device 110, the server 120, and the data processing platform 130, and the electronic device 100 may use non-blocking I/O. With this system 10, the time the client device 110 waits between the electronic device 100 receiving its request and providing the result information can be significantly reduced, and introducing non-blocking I/O can reduce the high cost of input and output. Additionally, with non-blocking I/O, other actions can be taken before output data is received, such as running another artificial intelligence model. In other words, the electronic device 100 can process requests from a large amount of traffic more efficiently.
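As a rough illustration of why non-blocking I/O helps, the following sketch uses Python's `asyncio`. The source does not name a framework, so this choice, the function names, and the delays are all assumptions; `asyncio.sleep` merely stands in for a non-blocking network call to a model server.

```python
import asyncio


async def run_model(name, delay_s):
    # asyncio.sleep stands in for a non-blocking call to a model server.
    await asyncio.sleep(delay_s)
    return f"{name}-output"


async def handle_requests():
    # Both calls are in flight at the same time, so the orchestrator is not
    # blocked on the first model while the second one could already run.
    return await asyncio.gather(
        run_model("keyword_detection", 0.02),
        run_model("ocr", 0.01),
    )


outputs = asyncio.run(handle_requests())
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which one finishes first.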

Additionally, the request transmitted by the client device 110 may be provided to the machine learning model through the electronic device 100 without separate processing through the data processing platform 130. In addition, because the client device 110 can easily check result information based on the request ID, the system for collecting the output data of the artificial intelligence models can be unified in the electronic device 100 and made efficient. Below, a specific example in which the electronic device 100 manages an artificial intelligence model is examined in detail.

Referring to FIG. 2, those skilled in the art to which the present invention pertains will clearly understand that some of the operations by which the electronic device manages an artificial intelligence model may be changed or replaced, or that the order of some operations may be changed.

In step S210, the electronic device 100 may receive a request from a client device.

In step S220, the electronic device 100 may check information related to the request.

According to one embodiment, the electronic device 100 may receive a request from the client device 110 and check information related to the request. When machine learning is used for business decision-making and the like, decisions can be made based on multiple features. In that case, the client device 110 may use at least one artificial intelligence model to determine one of the features used for the business decision. That is, the request received from the client device 110 may be a request to run an artificial intelligence model in order to determine a feature. For example, a decision aimed at improving the quality of items offered in an Internet environment may require a comprehensive judgment based on various features, such as a first feature indicating whether inappropriate keywords are included and a second feature for image verification. In that case, to determine the first feature, that is, whether an inappropriate keyword is included, the client device 110 may transmit a request corresponding to the first feature to the electronic device 100.

According to one embodiment, the request may include identification information about the request, information about at least one artificial intelligence model corresponding to the request, information about input data of the artificial intelligence model, and information about the execution type corresponding to the request. The identification information is information for distinguishing the request from other requests, and may be set by the client device 110. The at least one artificial intelligence model may be the artificial intelligence model or models, among a plurality of artificial intelligence models executable through a server, that are required to determine the first feature. Whether an artificial intelligence model is required to determine the first feature may be decided by the client device 110.

The execution type of the request may be determined as one of the first type, Synchronous, and the second type, Asynchronous. After the electronic device 100 receives a request, whether the client device 110 and the electronic device 100 are connected may vary depending on the execution type of the request. Here, the execution type may be determined based on the expected time to receive result information corresponding to the request.
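The request contents and the two execution types described above could be represented, for illustration, as follows. The class and field names are hypothetical and do not appear in the source; this is a sketch of one possible data layout, not the format used by the disclosed device.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class ExecutionType(Enum):
    SYNCHRONOUS = "synchronous"    # first type
    ASYNCHRONOUS = "asynchronous"  # second type


@dataclass
class ModelRequest:
    request_id: str              # identification information, set by the client
    model_names: List[str]       # models the client needs to determine a feature
    input_data: Dict[str, Any]   # input data keyed by model name
    execution_type: ExecutionType


# Example: a request for the "inappropriate keyword" feature.
request = ModelRequest(
    request_id="req-001",
    model_names=["keyword_detection"],
    input_data={"keyword_detection": "item title text"},
    execution_type=ExecutionType.SYNCHRONOUS,
)
```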

More specifically, when the electronic device 100 confirms at least one artificial intelligence model corresponding to the request, the electronic device 100 may check the expected time to collect all the output data of the at least one artificial intelligence model, based on history data related to the inference of the at least one artificial intelligence model. In addition, the electronic device 100 can check the expected time more accurately based on the amount of memory currently available in the server 120. Based on the expected time, the electronic device 100 may determine whether the execution type of the corresponding request is set appropriately. A related embodiment is examined in detail with reference to FIG. 6.
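One way such a check might look is sketched below. The source states only that history data and available memory are considered; the specific formula (sum of mean past inference times, scaled by the free-memory fraction), the 1-second threshold, and all function names are assumptions made for illustration.

```python
from statistics import mean


def estimate_expected_time(history_s, available_memory_fraction):
    """Estimate seconds needed to collect all outputs.

    history_s maps a model name to its past inference times in seconds.
    Assumption: models run one after another, and less free memory
    lengthens the wait proportionally (floored at 10% to avoid blow-up).
    """
    base = sum(mean(times) for times in history_s.values())
    return base / max(available_memory_fraction, 0.1)


def choose_execution_type(expected_s, threshold_s=1.0):
    # Assumed rule: short waits stay synchronous, long waits go asynchronous.
    return "synchronous" if expected_s <= threshold_s else "asynchronous"


history = {"keyword_detection": [0.1, 0.3], "ocr": [0.5, 0.7]}
expected = estimate_expected_time(history, available_memory_fraction=0.5)
suggested = choose_execution_type(expected)
```

If the suggested type differs from the type stated in the request, the electronic device could treat the request's execution type as inappropriately set.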

The input data of the artificial intelligence model may be data used as input to at least one pre-trained artificial intelligence model. More specifically, the input data may be list-type data including a plurality of images. Additionally, the at least one artificial intelligence model may share a single piece of input data, but this is not a limitation. For example, information related to the request may include information about input data corresponding to each of the at least one artificial intelligence model.

In step S230, the electronic device 100 may check at least one artificial intelligence model corresponding to the request based on the information.

According to one embodiment, the electronic device 100 may check the execution order of at least one artificial intelligence model. The electronic device 100 may check the execution order of at least one artificial intelligence model based on a graph of the execution order of a plurality of artificial intelligence models. More specifically, the electronic device 100 may check a graph of the execution order of a plurality of artificial intelligence models based on information about the input-output relationship for at least one artificial intelligence model. Here, the graph may be information in the form of a tree diagram that includes the execution order of the artificial intelligence model.

For example, the first artificial intelligence model may be an artificial intelligence model that uses data A and data B as input data and output data, respectively, and the second artificial intelligence model may be an artificial intelligence model that uses data B and data C as input data and output data, respectively. In this case, based on information about the input-output relationship between the first artificial intelligence model and the second artificial intelligence model, the electronic device 100 may check a graph of the execution order in which the second artificial intelligence model is executed after the first artificial intelligence model.
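The A-to-B, B-to-C example can be worked through with a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+); the source does not specify an algorithm, so treating the execution-order graph as a dependency graph sorted this way is an assumption, as are the function and variable names.

```python
from graphlib import TopologicalSorter


def execution_order(models):
    """Derive an execution order from input-output relationships.

    models maps a model name to (set of input data names, set of output data names).
    A model depends on whichever model produces one of its inputs.
    """
    producers = {out: name for name, (_, outs) in models.items() for out in outs}
    deps = {
        name: {producers[i] for i in ins if i in producers}
        for name, (ins, _) in models.items()
    }
    # TopologicalSorter takes a mapping of node -> predecessors.
    return list(TopologicalSorter(deps).static_order())


models = {
    "second_model": ({"B"}, {"C"}),  # input B, output C
    "first_model": ({"A"}, {"B"}),   # input A, output B
}
order = execution_order(models)
```

Because the second model consumes data B, which only the first model produces, the first model is placed before the second regardless of the order in which the models are listed.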

Additionally, when the at least one artificial intelligence model is divided into multiple groups, the groups can be executed in parallel. For example, when at least one artificial intelligence model is divided into multiple groups according to the characteristics and functions of the artificial intelligence models, checking the execution order of the artificial intelligence models may include checking the execution order between the artificial intelligence models included in each of the groups. More specifically, dividing at least one artificial intelligence model into a plurality of groups according to the characteristics and functions of the artificial intelligence models may mean separating them into groups that do not depend on each other in their input-output relationships. For example, if the NLP-related artificial intelligence models and the computer vision-related artificial intelligence models do not depend on each other, the two sets of models can be executed in parallel.
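A minimal sketch of this grouping follows, assuming that "groups that do not depend on each other" can be computed as connected components of the dependency graph. The model names (`tokenizer`, `ocr_postprocess`) and the thread-pool dispatch are hypothetical and used only to make the example concrete.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def independent_groups(deps):
    """Split models into groups with no input-output dependency between groups.

    deps maps each model name to the set of model names it depends on.
    Each returned group lists its models in a valid execution order.
    """
    neighbours = {model: set(parents) for model, parents in deps.items()}
    for model, parents in deps.items():
        for parent in parents:
            neighbours.setdefault(parent, set()).add(model)
    global_order = list(TopologicalSorter(deps).static_order())
    seen, groups = set(), []
    for start in global_order:
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:  # collect everything connected to `start`
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(neighbours[node] - members)
        seen |= members
        groups.append([m for m in global_order if m in members])
    return groups


def run_group(group):
    # Placeholder: a real system would call each model server in this order.
    return [f"ran {model}" for model in group]


deps = {
    "tokenizer": set(),
    "keyword_detection": {"tokenizer"},  # NLP-related chain
    "ocr": set(),
    "ocr_postprocess": {"ocr"},          # computer vision-related chain
}
groups = independent_groups(deps)
with ThreadPoolExecutor() as pool:  # groups run in parallel with each other
    results = list(pool.map(run_group, groups))
```

Within a group the models still run in dependency order; only the groups themselves are dispatched concurrently.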

According to one embodiment, the electronic device 100 may determine each of the at least one artificial intelligence model as one of a first type of processor unit-based artificial intelligence model and a second type of processor unit-based artificial intelligence model. For example, the first type of processor unit and the second type of processor unit may correspond to a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU), respectively.

In the present application, an "A processor unit-based artificial intelligence model" refers only to an artificial intelligence model that operates more efficiently on an A processor unit server, and does not mean an artificial intelligence model that operates only on an A processor unit server. More specifically, the first type of processor unit-based artificial intelligence model refers to an artificial intelligence model that operates more efficiently on a first type of processor unit server, but it may also operate on a second type of processor unit server. In other words, even if an artificial intelligence model operates more efficiently on the first type of processor unit, it may be run on the second type of processor unit server if many tasks are being processed on the first type of processor unit server. Likewise, the second type of processor unit-based artificial intelligence model simply means an artificial intelligence model that operates more efficiently on the second type of processor unit server, and it can also operate on the first type of processor unit server.

Whether an artificial intelligence model performs better on the GPU server or on the CPU server may vary from model to model. For example, computer vision-related artificial intelligence models such as an optical character recognition (OCR) model and a logo detection model may be more efficient when run on a GPU server. Conversely, keyword detection models and age detection models, which are examples of artificial intelligence models related to natural language processing, may be more efficient when run on a CPU server. Here, the criteria for an efficient model may include the latency, cost, and prediction accuracy of the artificial intelligence model.

The electronic device 100 may flexibly change the hardware resources allocated to the at least one artificial intelligence model by comprehensively considering the type of task being executed or scheduled to be executed in the server 120, task status information, the type of the at least one artificial intelligence model, and the execution order of the at least one artificial intelligence model. For example, even if an artificial intelligence model operates more efficiently on the first type of processor unit server, if there are many tasks being processed on the first type of processor unit server, the artificial intelligence model may be run on the second type of processor unit server. When hardware resources are dynamically changed, information about the server corresponding to the at least one artificial intelligence model may be included in the control information.
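As a rough illustration of this fallback, the following sketch picks a server type from the preferred type and the current task counts. The function `choose_server`, the `max_queue` threshold, and the use of a simple task count as the load signal are all assumptions; the specification does not state how load is measured.

```python
def choose_server(preferred, queued_tasks, max_queue=10):
    """Return "cpu" or "gpu" for a model whose preferred server type is given.

    queued_tasks maps a server type to the number of tasks it is handling.
    max_queue is a hypothetical threshold for "many tasks".
    """
    fallback = "gpu" if preferred == "cpu" else "cpu"
    preferred_busy = queued_tasks[preferred] > max_queue
    fallback_free = queued_tasks[fallback] <= max_queue
    # Move off the preferred server only when it is busy and the other is not.
    if preferred_busy and fallback_free:
        return fallback
    return preferred
```

For instance, `choose_server("gpu", {"cpu": 2, "gpu": 50})` selects the CPU server, while the same call with a lightly loaded GPU server keeps the preferred assignment.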

In step S240, the electronic device 100 may transmit control information identified based on the information to at least one artificial intelligence model.

Once the server on which each artificial intelligence model runs and the execution order of the artificial intelligence models are determined, the electronic device 100 may transmit control information identified based on the information to at least one artificial intelligence model. Here, the operation of transmitting control information to at least one artificial intelligence model may include transmitting the control information to each server and executing, in each server, the artificial intelligence model on the input data corresponding to the control information. The control information may include information about the server on which each artificial intelligence model runs and the execution order of the artificial intelligence models.

In step S250, the electronic device 100 may receive result information according to control information from at least one artificial intelligence model.

According to one embodiment, the electronic device 100 may receive output data of the artificial intelligence model as execution of the artificial intelligence model is completed. Accordingly, the electronic device 100 can collect result data of at least one artificial intelligence model and check result information. At this time, the result data of the at least one artificial intelligence model may be output at different times; the electronic device 100 may wait until all result data of the at least one artificial intelligence model has been received and then collect the output data of the at least one artificial intelligence model.

In step S260, the electronic device 100 may provide result information to the client device.

According to one embodiment, the electronic device 100 may provide result information to the client device 110 according to different methods, based on whether the execution type corresponding to the request is the first type or the second type. Additionally, the electronic device 100 may identify result information by collecting output data according to the operation of the artificial intelligence model from at least one artificial intelligence model driven in parallel. Accordingly, the electronic device 100 may return the identified result information to the client device 110.

The specific operation of providing result information to the client device 110 when the execution type is the first type (Synchronous) will be examined with reference to FIG. 3, and the specific operation of providing result information to the client device 110 when the execution type is the second type (Asynchronous) will be described with reference to FIGS. 4 and 5.

FIG. 3 is a diagram illustrating an embodiment in which the electronic device 100 manages an artificial intelligence model when the execution type is the first type.

As seen in FIG. 2, the client device 110 may use at least one artificial intelligence model when determining one feature among a plurality of features used for business decision making. For example, when determining features related to content creation of an SDP page, an OCR model, keyword detection model, brand detection model, and logo detection model may be needed. FIG. 3 illustrates an embodiment using an OCR model and a brand detection model. Additionally, when determining features related to the creation of content on other pages, other artificial intelligence models may be used in addition to the OCR model, keyword detection model, brand detection model, and logo detection model.

According to one embodiment, the electronic device 100 may receive a request from the client device 110 and check information 300 related to the request. Referring to Figure 3, the received request may include the following information.

{
  "requestId": "01a-c140-bf972",
  "execution": "sync",
  "detections": ["OCR", "brand_detection"],
  "data": [{"imageurl": "http://urllink1"}]
}

Referring to FIG. 3, 1) the request ID 301, which is identification information for the request, may be '01a-c140-bf972', 2) the execution type 302 may be sync, which is the 'first type', 3) the detection 303, which is information about at least one artificial intelligence model, may be the 'OCR model' and the 'brand detection model', and 4) the input data 304 for the at least one artificial intelligence model may be 'http://urllink1'. Here, 'http://urllink1' may be a dummy URL for explanation.
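A minimal sketch of how a receiver might read these four fields is given below. The `Request` dataclass and `parse_request` function are hypothetical names; only the JSON field names and values come from the example of FIG. 3.

```python
import json
from dataclasses import dataclass


@dataclass
class Request:
    request_id: str
    execution: str
    detections: list
    image_urls: list


def parse_request(raw):
    """Parse a FIG. 3 style request body into a Request (illustrative only)."""
    body = json.loads(raw)
    execution = body["execution"]
    if execution not in ("sync", "async"):
        raise ValueError(f"unknown execution type: {execution}")
    return Request(
        request_id=body["requestId"],
        execution=execution,
        detections=list(body["detections"]),
        image_urls=[item["imageurl"] for item in body["data"]],
    )


RAW = """
{
  "requestId": "01a-c140-bf972",
  "execution": "sync",
  "detections": ["OCR", "brand_detection"],
  "data": [{"imageurl": "http://urllink1"}]
}
"""
REQUEST = parse_request(RAW)
```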

Since only two artificial intelligence models are used to determine the feature corresponding to the request, it may take a relatively short time for the electronic device 100 to receive result information corresponding to the request. Accordingly, it may be appropriate for the execution type according to the embodiment of FIG. 3 to be determined as the first type, Synchronous. Since the execution type 302 is the first type, the client device 110 may operate in standby mode while connected to the electronic device 100 after transmitting the request.

According to one embodiment, the electronic device 100 may determine the execution order of at least one artificial intelligence model. For example, the OCR model may take 'http://urllink1' as input data and output device-readable text data as output data. In addition, the text data, which is the output data of the OCR model, can be used as input data of the brand detection model, so the OCR model and the brand detection model may correspond to dependent artificial intelligence models. Accordingly, the electronic device 100 can confirm the execution order in which the brand detection model is executed after the OCR model, based on information about the input-output relationship between the OCR model and the brand detection model. Additionally, the electronic device 100 may generate control information including the confirmed execution order, the at least one artificial intelligence model to be executed, and 'http://urllink1' as input data 304.
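Deriving an execution order from input-output relationships is a topological sort, which can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The dictionary layout of the control information below is an assumption for illustration; the specification does not define a concrete format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(consumes):
    """Order models so that each runs after the models whose output it consumes.

    consumes maps a model name to the list of models it depends on.
    """
    return list(TopologicalSorter(consumes).static_order())


def build_control_info(consumes, input_data):
    # Hypothetical control-information layout: execution order plus first input.
    return {"execution_order": execution_order(consumes), "input_data": input_data}


CONTROL = build_control_info(
    {"OCR": [], "brand_detection": ["OCR"]},
    "http://urllink1",
)
```

Because the brand detection model consumes the OCR output, the OCR model is placed first in the resulting order.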

According to one embodiment, the electronic device 100 may transmit control information to at least one artificial intelligence model. Accordingly, at least one artificial intelligence model may be executed based on control information. More specifically, 1) the OCR model can generate text data as output data based on 'http://urllink1', which is the input data 304. The generated text data may be transmitted to the electronic device 100. Additionally, the generated text data can be used as input data for a brand detection model. 2) The brand detection model can generate brand data as output data based on text data received as input data. Additionally, the generated brand data may be transmitted to the electronic device 100.

According to one embodiment, the electronic device 100 may collect the text data and brand data, which are the output data, and then transmit result information including all of the output data to the client device 110 operating in standby mode.

When the execution type is the first type, the electronic device 100 provides result information to the client device 110 operating in standby mode, so the request ID, which is identification information for the request, may be unused data. Therefore, when the execution type is the first type, the request ID may be optionally provided information.

According to one embodiment, the electronic device 100 may receive a request from the client device 110 and check information 400 related to the request. Referring to Figure 4, the received request may include the following information.

{
  "requestId": "01a-c140-bf972",
  "execution": "async",
  "requesttype": ["Type A"],
  "data": [{"imageurl": "http://urllink1"}]
}

Referring to FIG. 4, 1) the request ID 401, which is identification information for the request, may be '01a-c140-bf972', 2) the execution type 402 may be async, which is the 'second type', 3) The request type 403, which is information about at least one artificial intelligence model, may be type A. The request type may be a type preset in the client device 110 and the electronic device 100, and the request type may include information about at least one artificial intelligence model corresponding to the request type. Additionally, information about at least one artificial intelligence model included in the request type may be changed by the client device. Referring to FIG. 4, type A according to one embodiment may be a type that uses an OCR model, a keyword detection model, and an age detection model as at least one artificial intelligence model. In addition, the Request Type 403 in FIG. 4 and the Detection 303 in FIG. 3 are both fields for at least one artificial intelligence model, and the Request Type and Detection may be optionally included in the information. 4) The input data 404 for at least one artificial intelligence model may be 'http://urllink1'.
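The relationship between a preset request type and its models can be sketched as a lookup table. The table `REQUEST_TYPES` and the function `resolve_models` are hypothetical; the contents of Type A follow the embodiment described above.

```python
# Hypothetical preset shared by the client device and the electronic device.
REQUEST_TYPES = {
    "Type A": ["OCR", "keyword_detection", "age_detection"],
}


def resolve_models(info):
    """Return the model list for a request that names models either way.

    A request may carry an explicit "detections" list (as in FIG. 3) or a
    "requesttype" list (as in FIG. 4); both fields are optional.
    """
    if "detections" in info:
        return list(info["detections"])
    models = []
    for request_type in info.get("requesttype", []):
        models.extend(REQUEST_TYPES[request_type])
    return models
```

With this table, a request carrying `"requesttype": ["Type A"]` resolves to the OCR, keyword detection, and age detection models.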

The execution type according to the embodiment of FIG. 4 may be the second type, Asynchronous. Accordingly, as the client device 110 receives the request ID from the electronic device 100, the connection between the client device 110 and the electronic device 100 may be disconnected. Accordingly, when the inference of the artificial intelligence model takes time, the client device 110 can perform other tasks.

According to one embodiment, the electronic device 100 may determine the execution order of at least one artificial intelligence model. Similar to the embodiment of FIG. 3, output data of the OCR model may be used as input data of the keyword detection model, and output data of the keyword detection model may be used as input data of the age detection model. Accordingly, the electronic device 100 can check the execution order in which the OCR model, keyword detection model, and age detection model are executed. The electronic device 100 may then generate control information including the confirmed execution order, the at least one artificial intelligence model to be executed, and 'http://urllink1', which is the input data 404 of the OCR model.

According to one embodiment, the electronic device 100 may transmit control information to a server to run at least one artificial intelligence model. Accordingly, at least one artificial intelligence model may be executed based on control information. More specifically, 1) the OCR model can generate text data as output data based on 'http://urllink1', which is the input data 404. The generated text data may be transmitted to the electronic device 100. Additionally, the generated text data can be used as input data for a keyword detection model. 2) The keyword detection model can use the received text data as input data to generate keyword data as output data. Additionally, the generated keyword data may be transmitted to the electronic device 100. 3) The age detection model can use the received keyword data as input data to generate data about age as output data. Additionally, the generated data on age can be transmitted to the electronic device 100.

Since the execution type of the request is the second type, Async, the electronic device 100 may not directly provide result information including output data of at least one artificial intelligence model to the client device 110. At this time, the electronic device 100 may publish the result information to the data processing platform 130. The data processing platform 130 is not limited to Apache Kafka, but a specific example of transmitting result information to a client device using Apache Kafka will be described below.

According to one embodiment, the electronic device 100 may set the result information as a Kafka message and transmit it to the data processing platform 130. If the data processing platform 130 is a platform for Apache Kafka, the data processing platform 130 may include a Kafka cluster. At this time, the data processing platform 130 may store the result information received from the electronic device 100 in a Kafka topic of the Kafka cluster. More specifically, the result information may be stored in a Kafka topic of the data processing platform 130 with the request ID as the name of the Kafka topic. Accordingly, the client device can easily identify the Kafka topic corresponding to the request by using the name of the Kafka topic, thereby filtering out irrelevant requests. Additionally, a Kafka topic corresponding to each client device 110 may be pre-designated, but is not limited to this.

According to one embodiment, Kafka topics may be divided into multiple partitions. Accordingly, the data processing platform 130 may store the output data of at least one artificial intelligence model in different partitions of the Kafka topic of the data processing platform 130 whose name is the request ID, but is not limited thereto.

The client device 110 may consume the result information stored in the data processing platform 130. More specifically, the client device 110 can use the request ID to identify the Kafka topic corresponding to the request among a plurality of topics. Accordingly, the client device 110 can receive result information stored in the Kafka topic corresponding to the request ID. In relation to this, the client device 110 may periodically check whether a Kafka topic named after the request ID exists in the data processing platform 130.
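The topic-per-request lookup can be sketched without a real broker. The `InMemoryPlatform` class below is a stand-in written only for illustration; it is not the Apache Kafka client API, and the polling parameters are assumptions.

```python
import time


class InMemoryPlatform:
    """Illustrative stand-in for the data processing platform (not Kafka)."""

    def __init__(self):
        self._topics = {}

    def publish(self, topic, message):
        # The topic is created on first publish, named after the request ID.
        self._topics.setdefault(topic, []).append(message)

    def has_topic(self, topic):
        return topic in self._topics

    def read(self, topic):
        return list(self._topics[topic])


def poll_for_result(platform, request_id, attempts, interval_s):
    """Check periodically for a topic named after request_id.

    Returns the stored messages, or None if the topic never appeared.
    """
    for _ in range(attempts):
        if platform.has_topic(request_id):
            return platform.read(request_id)
        time.sleep(interval_s)
    return None
```

Because the topic name is the request ID, the client never has to read or filter results that belong to other requests.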

However, it may be inefficient for the client device 110 to periodically check, at arbitrary times, whether a Kafka topic named after the request ID exists in the data processing platform 130. In this regard, the electronic device 100 according to one embodiment may predict the expected time until the result information is transmitted to the data processing platform 130, based on at least one of information about the execution state of at least one artificial intelligence model at the time of receiving the request, history data about the execution of at least one artificial intelligence model, and information about the amount of available memory of the server.

According to one embodiment, when the electronic device 100 predicts the expected time, upon receiving a request, the electronic device 100 may transmit information about the request ID and the expected time to the client device 110. That is, based on the information about the expected time, the client device 110 can check whether a Kafka topic named after the request ID exists in the data processing platform 130 and receive the result information. Accordingly, the overall cost of system 10 can be greatly reduced.
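The specification does not give a formula for the expected time, so the following is only one possible sketch. It assumes the models run one after another, uses the mean of past run times as the history-based estimate, and adds a fixed penalty per pending task; the function name and the `per_task_penalty` parameter are hypothetical.

```python
from statistics import mean


def expected_seconds(order, history, pending_tasks=0, per_task_penalty=0.5):
    """Estimate the time until result information is published.

    order: models executed sequentially.
    history: maps a model name to its past run times in seconds.
    pending_tasks, per_task_penalty: assumed stand-ins for execution state.
    """
    base = sum(mean(history[model]) for model in order)
    return base + pending_tasks * per_task_penalty


HISTORY = {"OCR": [2.0, 4.0], "keyword_detection": [1.0]}
ESTIMATE = expected_seconds(["OCR", "keyword_detection"], HISTORY, pending_tasks=2)
```

A client that receives such an estimate together with the request ID could delay its first check by that amount instead of polling from the start.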

According to one embodiment, the electronic device 100 may receive a request from the client device 110 and check information 500 related to the request. Referring to Figure 5, the received request may include the following information.

{
  "requestId": "01a-c140-bf972",
  "execution": "async",
  "data": [
    {"imageurl": "http://urllink1",
     "detection": ["OCR", "keyword_detection", "age_detection"]},
    {"imageurl": "http://urllink2",
     "detections": "Certification"}
  ]
}

Referring to FIG. 5, 1) the request ID 501, which is identification information for the request, may be '01a-c140-bf972', and 2) the execution type 502 may be Async, which is the 'second type'. 3) Additionally, the data 503 may include a plurality of input data and information about an artificial intelligence model corresponding to each of the plurality of input data. More specifically, referring to FIG. 5, at least one artificial intelligence model corresponding to http://urllink1, which is the first input data, may include an OCR model, a keyword detection model, and an age detection model. Additionally, at least one artificial intelligence model corresponding to http://urllink2, which is the second input data, may include an authentication-related model. Accordingly, when there is a plurality of input data corresponding to a request, requesttype and detection are not separately described as shown in FIGS. 3 and 4, but the detection field or requesttype field may be written together in the data field. Additionally, http://urllink1 and http://urllink2 may be dummy URLs for explanation purposes.
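Reading the per-input model lists out of the data field can be sketched as follows. The function `models_per_input` is a hypothetical helper; it accepts both the `detection` and `detections` spellings and both list and single-string values, since the example of FIG. 5 uses each form once.

```python
def models_per_input(data):
    """Map each input URL to the models requested for it (illustrative)."""
    plan = {}
    for item in data:
        value = item.get("detections", item.get("detection", []))
        # A single model may be given as a bare string rather than a list.
        plan[item["imageurl"]] = [value] if isinstance(value, str) else list(value)
    return plan


DATA = [
    {"imageurl": "http://urllink1",
     "detection": ["OCR", "keyword_detection", "age_detection"]},
    {"imageurl": "http://urllink2", "detections": "Certification"},
]
PLAN = models_per_input(DATA)
```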

According to one embodiment, the electronic device 100 may determine the execution order of at least one artificial intelligence model. More specifically, when the artificial intelligence models corresponding to the request are divided into a plurality of groups, the execution order of the at least one artificial intelligence model may include the execution order among the artificial intelligence models included in each of the plurality of groups, and the plurality of groups may be executed in parallel.

Referring to FIG. 5, the OCR model, keyword detection model, age detection model, and authentication-related model can be divided into a first group related to text analysis through NLP and a second group related to image analysis through computer vision. For example, the first group may include an OCR model, a keyword detection model, and an age detection model, and the second group may include an authentication-related model. Since the authentication-related model does not depend on computer-readable text data through OCR, the authentication-related model can be executed in parallel with the artificial intelligence model included in the first group. Accordingly, the electronic device 100 may check the execution order of the OCR model, keyword detection model, and age detection model included in the first group, as well as the execution order that allows the first group and the second group to be executed in parallel. Additionally, information related to the request received from the client device 110 may include information regarding the execution order and parallel execution of at least one artificial intelligence model. Accordingly, it may be possible for the client device 110 to specify the execution order of at least one artificial intelligence model.
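A minimal sketch of running the two groups in parallel while keeping the order inside each group is shown below. The lambda functions are placeholders standing in for real models, and a thread pool is only one possible way to obtain parallelism; neither is prescribed by the specification.

```python
from concurrent.futures import ThreadPoolExecutor


def run_group(chain, first_input, models):
    """Run one group sequentially, feeding each output to the next model."""
    outputs, current = {}, first_input
    for name in chain:
        current = models[name](current)
        outputs[name] = current
    return outputs


def run_groups_in_parallel(groups, models):
    """Run independent groups concurrently and wait for all of them."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        futures = [pool.submit(run_group, chain, first_input, models)
                   for chain, first_input in groups]
        for future in futures:
            results.update(future.result())  # blocks until that group is done
    return results


# Placeholder models that only label their input (hypothetical).
MODELS = {
    "OCR": lambda url: f"text({url})",
    "keyword_detection": lambda text: f"keywords({text})",
    "age_detection": lambda keywords: f"age({keywords})",
    "Certification": lambda url: f"certification({url})",
}
GROUPS = [
    (["OCR", "keyword_detection", "age_detection"], "http://urllink1"),
    (["Certification"], "http://urllink2"),
]
RESULTS = run_groups_in_parallel(GROUPS, MODELS)
```

The collected dictionary contains one entry per model only after every group has finished, which mirrors waiting for all output data before identifying the result information.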

Additionally, the electronic device 100 may transmit control information including the confirmed execution order to the artificial intelligence model. More specifically, the electronic device 100 may transmit control information to the server 120, and the server 120 may 1) provide the first input data, "http://urllink1", as input data for the OCR model and 2) provide the second input data, "http://urllink2", as input data for the authentication-related model. A detailed description of the operation of the artificial intelligence models will be omitted, as it is similar to that of FIG. 4.

The electronic device 100 may receive output data including text data, keyword data, age data, and authentication data from the artificial intelligence model. Additionally, the electronic device 100 can collect the received output data and check the result information. The electronic device 100 may provide the request ID and result information to the data processing platform 130 so that the result information is stored in a Kafka topic of the data processing platform 130 using the request ID as the name of the Kafka topic. Accordingly, the client device 110 can receive result information from a topic that has the name of the topic corresponding to the request ID.

According to one embodiment, the electronic device 100 may receive a request from the client device 110 and check information 600 related to the request. Referring to Figure 6, the received request may include the following information.

{
  "requestId": "01a-c140-bf972",
  "execution": "sync",
  "data": [
    {"imageurl": "http://urllink1",
     "detection": ["OCR", "keyword_detection", "age_detection"]},
    {"imageurl": "http://urllink2",
     "detections": "Certification"}
  ]
}

Referring to FIG. 6, the request ID 601 may be "01a-c140-bf972", the execution type 602 may be the first type, and the data 603 may include a plurality of input data and information about the artificial intelligence model corresponding to each of the plurality of input data.

A total of four artificial intelligence models correspond to the request, so there is a concern that inference through the artificial intelligence models will take a long time. The same concern arises when the server 120 is currently processing many tasks or when a large number of other client devices are currently connected to the electronic device 100. However, since the execution type is the first type and is synchronous, the client device 110 may operate in standby mode while connected to the electronic device 100 until result information is received. Accordingly, the client device 110 may not be able to perform other functions and may wait for an excessive amount of time.

According to one embodiment, the electronic device 100 may check the execution order of at least one artificial intelligence model corresponding to the request and then predict the expected time until result information is received, based on information about the execution state of the at least one artificial intelligence model. For example, if the expected time is longer than a set time, the electronic device 100 may determine that the execution type is improperly set. Accordingly, the electronic device 100 may transmit to the client device 110 a request asking whether to change the execution type from the first type to the second type.

At this time, if 1) the electronic device 100 receives a response from the client device 110 indicating that the execution type will not be changed, or 2) the electronic device 100 does not receive a response from the client device 110 within the set time, the electronic device 100 may provide result information to the client device 110 with the execution type corresponding to the request kept as the first type. In this case, the electronic device 100 may not publish the result information, which includes text data, keyword data, data about age, and data about authentication as output data, to the data processing platform 130; instead, the electronic device 100 may directly transmit the result information to the waiting client device 110.

Conversely, as in the embodiment of FIG. 6, the electronic device 100 may receive a response changing the execution type from the client device 110 within the set time. Accordingly, the electronic device 100 publishes result information including text data, keyword data, age data, and authentication data as output data to the data processing platform 130, and the client device 110 can use the request ID to consume the result information stored in the Kafka topic corresponding to the request. At this time, since the operation of the electronic device 100 in FIG. 6 is the same as the operation of the electronic device 100 in FIG. 5, a detailed description will be omitted.
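The decision described for FIG. 6 can be condensed into a small function. The name `resolve_execution_type` and the encoding of the client response as "change", "keep", or `None` (no reply within the set time) are assumptions made for this sketch.

```python
def resolve_execution_type(requested, expected_s, threshold_s, client_response):
    """Return the execution type ("sync" or "async") to apply to a request.

    client_response is "change", "keep", or None when no reply arrived
    within the set time (hypothetical encoding).
    """
    # Only a synchronous request with a long expected time is questioned.
    if requested != "sync" or expected_s <= threshold_s:
        return requested
    # Switch only on an explicit agreement; otherwise stay synchronous.
    return "async" if client_response == "change" else "sync"
```

Under these assumptions a refusal and a timeout both leave the request synchronous, so the result information is returned directly to the waiting client rather than published.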

According to one embodiment, the electronic device 100 may identify each of the at least one artificial intelligence model as one of a first type of processor unit-based artificial intelligence model and a second type of processor unit-based artificial intelligence model. Here, the first type of processor unit and the second type of processor unit may correspond to a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU), respectively.

Additionally, the electronic device 100 may allocate the first type of processor unit-based artificial intelligence model and the second type of processor unit-based artificial intelligence model to the first type of processor unit server and the second type of processor unit server, respectively. However, as seen above, the first type of processor unit-based artificial intelligence model and the second type of processor unit-based artificial intelligence model are not necessarily run only on the first type of processor unit server and the second type of processor unit server, respectively. For example, even if an artificial intelligence model operates more efficiently on the first type of processor unit server, if there are many tasks currently being processed on the first type of processor unit server, the model may be run on the second type of processor unit server. Referring to FIG. 7, the first type of processor unit server and the second type of processor unit server may correspond to the CPU server 720 and GPU server 710 of FIG. 7, respectively.

According to one embodiment, artificial intelligence models can be divided into GPU-based artificial intelligence models and CPU-based artificial intelligence models. Accordingly, the electronic device 100 may classify the artificial intelligence model corresponding to the request into one of a CPU-based artificial intelligence model and a GPU-based artificial intelligence model. Specifically, among artificial intelligence models, computer vision-related artificial intelligence models, such as optical character recognition (OCR) models and logo detection models, may be more efficient models when based on GPUs. More specifically, the optical character recognition model and logo detection model can run on both CPU-based devices and GPU-based devices, but may be an artificial intelligence model that runs more efficiently on GPU-based devices. Conversely, among artificial intelligence models, keyword detection models and age detection models, which are examples of artificial intelligence models related to natural language processing, may be more efficient models when based on CPU. More specifically, the keyword detection model and age detection model can be run on both CPU-based devices and GPU-based devices, but may be an artificial intelligence model that runs more efficiently on CPU-based devices.

According to one embodiment, the server 120 of FIG. 7 may include a GPU server 710 and a CPU server 720. Using the GPU server 710 for inference of an artificial intelligence model allows the artificial intelligence model to be managed more efficiently in many aspects than using only the CPU server 720. For example, by using a GPU instance, the efficiency of the system 10 can be increased in terms of processing speed for providing result information according to a request and cost considering the latency of the client device. Here, the GPU instance is an element that constitutes the GPU server 710, and may be a sub-server of the GPU server used in artificial intelligence, etc. in frameworks and applications.

According to one embodiment, the electronic device 100 may allocate a first group including CPU-based artificial intelligence models to the CPU server 720 and a second group including GPU-based artificial intelligence models to the GPU server 710. In this regard, information about servers corresponding to a plurality of artificial intelligence models may be set in the electronic device 100. That is, the electronic device 100 may determine a server corresponding to an artificial intelligence model based on information about servers corresponding to a plurality of artificial intelligence models. The operation of determining a server may include determining a GPU instance included in the GPU server 710 or a container included in the CPU server 720.

According to one embodiment, the electronic device 100 may allocate an artificial intelligence model included in the first group to one of a plurality of containers included in the CPU server 720. Here, the CPU server 720 may be based on the Kubernetes open source platform, but is not limited thereto. The electronic device 100 may allocate an artificial intelligence model included in the second group to one of a plurality of instances included in the GPU server 710. In other words, even if artificial intelligence models belong to the same group, the instances or containers on which they are actually executed may be different. For example, the OCR model may be assigned to a first instance included in the GPU server 710, and the keyword detection model may be assigned to a first container included in the CPU server 720. GPU instances and containers can be connected to artificial intelligence models through a web server.

The server address operated by each artificial intelligence model may be included in the control information. More specifically, the endpoint related to the server address for each artificial intelligence model may be included in the control information. The electronic device 100 may call an endpoint corresponding to each artificial intelligence model. Here, the endpoint may be a point such as a URL that allows access to resources such as output data on the server.

Additionally, adding a new artificial intelligence model under the system 10 can be easily performed using the endpoint. Specifically, by setting 1) the input data of the new artificial intelligence model and 2) the endpoint of the new artificial intelligence model, the electronic device 100 can easily manage the new artificial intelligence model. For example, when adding a chart detection model, 1) first data may be set as the input data and 2) a first URL may be set as the endpoint, and the electronic device 100 can obtain the output data of the chart detection model based on the first data and the first URL.
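Endpoint-based registration can be sketched as a small registry. The `ModelRegistry` class, its method names, and the example URL are hypothetical; the sketch only prepares the call description and does not perform any network request.

```python
class ModelRegistry:
    """Illustrative registry mapping a model name to its endpoint URL."""

    def __init__(self):
        self._endpoints = {}

    def register(self, name, endpoint):
        # Adding a model needs nothing more than its name and endpoint.
        self._endpoints[name] = endpoint

    def build_call(self, name, input_data):
        """Describe the call to make for a model (no request is sent here)."""
        if name not in self._endpoints:
            raise KeyError(f"no endpoint registered for {name}")
        return {"endpoint": self._endpoints[name], "input": input_data}


REGISTRY = ModelRegistry()
# Hypothetical endpoint for a newly added chart detection model.
REGISTRY.register("chart_detection", "http://gpu-server.example/chart_detection")
CALL = REGISTRY.build_call("chart_detection", "http://urllink1")
```

Nothing else in the sketch has to change when a model is added, which is the property the endpoint-based design is meant to provide.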

Regarding GPU instances of GPU server 710, GPU server 710 may include spot instances in addition to on-demand instances. Here, a spot instance can be called an instance that can be stopped arbitrarily by the administrator, but is efficient in terms of cost. However, since it can be stopped arbitrarily, it is necessary to appropriately adjust the amount of memory allocated to spot instances. Accordingly, when the spot instance is interrupted, the electronic device 100 may run an artificial intelligence model based on another GPU instance based on a preset replacement rule. Additionally, each GPU instance may be characterized as being connected to the corresponding machine learning model through a single thread.
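The preset replacement rule is not detailed in the specification. One possible reading, sketched below, is an ordered list of substitute instances per spot instance; the function `pick_instance` and the instance names are hypothetical.

```python
def pick_instance(assigned, running, replacement_rule):
    """Return the instance to use, falling back when the assigned one stopped.

    running: set of instance names currently available.
    replacement_rule: maps an instance to its ordered substitutes (assumed).
    """
    if assigned in running:
        return assigned
    for candidate in replacement_rule.get(assigned, []):
        if candidate in running:
            return candidate
    raise RuntimeError(f"no running replacement for {assigned}")


RULE = {"spot-1": ["spot-2", "on-demand-1"]}
```

For example, if both spot instances have been interrupted, `pick_instance("spot-1", {"on-demand-1"}, RULE)` falls back to the on-demand instance.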

FIG. 8 shows an example in which the electronic device 100 allocates an OCR model, a keyword detection model, a gender detection model, an age detection model, and an authentication-related model to the GPU server 810 and the CPU server 820 included in the server 120.

According to one embodiment, the electronic device 100 may determine the GPU instance corresponding to at least one artificial intelligence model based on history data about memory usage of the artificial intelligence models included in the second group. For example, each GPU instance may be allocated 16GB of memory. Additionally, the history data on memory usage may show that the OCR model, gender detection model, and age detection model use 12GB, 6GB, and 6GB of memory on average, respectively. In this case, if the OCR model and another artificial intelligence model are allocated to the same GPU instance, the total (12GB + 6GB = 18GB) risks exceeding the amount of memory allocated to the GPU instance. Accordingly, based on the history data about memory usage, the electronic device 100 may 1) assign the OCR model to the first instance and 2) assign the gender detection model and the age detection model to the second instance.
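The allocation in this example can be reproduced with a simple packing heuristic. The Python sketch below uses first-fit decreasing, which is a choice made for illustration; the specification does not name a particular algorithm, and the function and model names are hypothetical.

```python
def assign_instances(avg_usage_gb, capacity_gb=16):
    """Group models onto GPU instances so that average usage fits the capacity.

    Uses first-fit decreasing: place the largest models first, and open a new
    instance only when no existing instance has enough memory left.
    """
    instances = []  # each item: {"models": [...], "used": number}
    for model, usage in sorted(avg_usage_gb.items(), key=lambda kv: -kv[1]):
        if usage > capacity_gb:
            raise ValueError(f"{model} does not fit on a single instance")
        for inst in instances:
            if inst["used"] + usage <= capacity_gb:
                inst["models"].append(model)
                inst["used"] += usage
                break
        else:
            # No existing instance had room, so add a new one.
            instances.append({"models": [model], "used": usage})
    return [inst["models"] for inst in instances]


history = {"ocr": 12, "gender_detection": 6, "age_detection": 6}
plan = assign_instances(history)
# plan == [["ocr"], ["gender_detection", "age_detection"]]
```

With 16GB per instance, the OCR model occupies the first instance alone, and the two 6GB models share the second, matching the allocation described above.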

Additionally, as discussed above, the electronic device 100 may allocate a new instance to an artificial intelligence model included in the second group. For example, by adding a third instance, the electronic device 100 may assign the second instance and the third instance to the gender detection model and the age detection model, respectively.

This process is similarly applicable to the operation of determining a container corresponding to an artificial intelligence model included in the first group. Specifically, according to historical data on memory usage of artificial intelligence models, the memory usage of keyword detection models and authentication-related models can be large. Accordingly, the electronic device 100 may 1) allocate a keyword detection model to the first container and 2) allocate an authentication-related model to the second container based on history data about memory usage of the artificial intelligence model.

Additionally, GPU instances within a GPU server and containers within a CPU server may be connected to each other. Here, connection means that an artificial intelligence model running in a container can be executed before or after an artificial intelligence model running on a GPU instance. For example, referring to FIG. 8, the first instance and the first container may be connected, and the first container and the second instance may also be connected. Specifically, text data, which is the output data of the OCR model, can be used as input data of the keyword detection model. In this case, text data generated by executing the OCR model through the first instance of the GPU server 810 may be transmitted to the first container of the CPU server 820. Additionally, keyword data generated by executing the keyword detection model through the first container of the CPU server 820 may be transmitted to the second instance of the GPU server 810.
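The hand-off between instances and containers can be sketched as a chain in which each stage consumes the previous stage's output. In the Python sketch below, the three model functions are trivial stand-ins, not real OCR or keyword detection, and the model on the second instance is given the hypothetical name `downstream_model` because the text does not say which model receives the keyword data.

```python
def run_chain(stages, data):
    """Run stages in order, feeding each output into the next stage."""
    trace = []
    for location, model_name, fn in stages:
        data = fn(data)
        trace.append((location, model_name))
    return data, trace


def ocr(image):
    # Stand-in for the OCR model: returns the text "found" in the image.
    return image["text"]


def keyword_detection(text):
    # Stand-in for keyword detection: treats upper-case words as keywords.
    return [word for word in text.split() if word.isupper()]


def downstream_model(keywords):
    # Stand-in for whichever model on the second instance consumes keywords.
    return {"keywords": keywords, "count": len(keywords)}


stages = [
    ("gpu:instance-1", "ocr", ocr),
    ("cpu:container-1", "keyword_detection", keyword_detection),
    ("gpu:instance-2", "downstream_model", downstream_model),
]
output, trace = run_chain(stages, {"text": "BIG sale on NEW shoes"})
```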

Additionally, the process of determining the instance or container corresponding to each artificial intelligence model is not fixed and can be flexibly adjusted based on the memory usage history data of the artificial intelligence models. For example, if the OCR model, gender detection model, and age detection model consume 4GB, 4GB, and 4GB of memory on average, respectively, all three models can be assigned to one GPU instance. Conversely, if, as the capacity of the models and input data increases, the OCR model, gender detection model, and age detection model use 12GB, 10GB, and 10GB of memory on average, respectively, they can be assigned to different GPU instances. In addition, as seen above, even for a GPU-based artificial intelligence model, if the first instance and the second instance in the GPU server 810 have many tasks, the GPU-based artificial intelligence model may be operated through a container of the CPU server 820.
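The fallback from a busy GPU server to a CPU container could look like the following Python sketch. How "many tasks" is measured is not specified, so the pending-task counts, the threshold of 8, and the returned labels are all assumptions for illustration.

```python
def choose_target(gpu_pending_tasks, busy_threshold=8):
    """Pick the least-loaded GPU instance, or fall back to a CPU container.

    gpu_pending_tasks maps an instance name to its number of pending tasks.
    If even the least-loaded instance is at or above the threshold, the
    GPU-based model is routed to a container of the CPU server instead.
    """
    name, pending = min(gpu_pending_tasks.items(), key=lambda kv: kv[1])
    if pending >= busy_threshold:
        return "cpu-container"
    return name


busy = choose_target({"instance-1": 10, "instance-2": 9})
free = choose_target({"instance-1": 10, "instance-2": 2})
```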

The electronic device 900 of FIG. 9 may correspond to the electronic device 100 of the present specification.

The electronic device 900 of the present disclosure may include a transceiver 910, a storage 920, and a processor 930, according to an embodiment. As anyone with ordinary knowledge in the technical field related to the embodiment will understand, the components shown in FIG. 9 are not essential for implementing the electronic device, so the electronic device 900 described herein may have more or fewer components than those listed above. Meanwhile, in an embodiment, the processor 930 may include at least one processor.

The transceiver 910 can communicate with an external device using wired or wireless communication technology. The external device can be a client device, terminal, open source platform, or server. In addition, communication technologies used by the transceiver 910 include GSM (Global System for Mobile communication), CDMA (Code Division Multiple Access), LTE (Long Term Evolution), 5G, WLAN (Wireless LAN), Wi-Fi (Wireless-Fidelity), Bluetooth, RFID (Radio Frequency Identification), IrDA (Infrared Data Association), ZigBee, NFC (Near Field Communication), etc., but are not limited thereto.

According to one embodiment, the transceiver 910 may receive a request from the client device 110 and transmit control information to at least one artificial intelligence model. Additionally, the transceiver 910 may receive result information according to control information from at least one artificial intelligence model. If the execution type is the first type, the electronic device 100 may directly transmit result information to the client device 110. If the execution type is the second type, the electronic device 100 may transmit result information to the data processing platform 130.
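The two delivery paths can be sketched in Python as follows. The string labels `"first"` and `"second"`, the list standing in for the waiting client, and the dictionary standing in for the data processing platform 130 are hypothetical simplifications.

```python
def deliver(execution_type, request_id, result, client_inbox, platform_store):
    """Route result information according to the execution type."""
    if execution_type == "first":
        # First type: the client is waiting, so reply to it directly.
        client_inbox.append((request_id, result))
    elif execution_type == "second":
        # Second type: store the result on the data processing platform,
        # keyed by the request's identification information.
        platform_store[request_id] = result
    else:
        raise ValueError(f"unknown execution type: {execution_type!r}")


client_inbox, platform_store = [], {}
deliver("first", "req-1", {"text": "hello"}, client_inbox, platform_store)
deliver("second", "req-2", {"text": "world"}, client_inbox, platform_store)
```

Keying the stored result by the identification information lets the client look it up later, once the longer-running request has finished.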

The storage 920 may store information for performing at least one method described above with reference to FIGS. 1 to 9. The storage 920 may be referred to as memory and may be volatile memory or non-volatile memory. Additionally, the storage 920 may store one or more instructions required to perform the operation of the processor 930, and may temporarily store data stored on the platform or in an external memory. For example, the storage 920 may store information about input-output relationships for at least one artificial intelligence model and history data about memory usage of the artificial intelligence model.
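One way stored input-output relationships could be turned into an execution order is a topological sort, sketched below in Python with the standard-library `graphlib` module. The dictionary layout of `io_info` and the data names (`image`, `text`, `keywords`, `gender`) are assumptions for this example, not the stored format described in the specification.

```python
from graphlib import TopologicalSorter


def execution_order(io_info):
    """Order models so that each one runs after the models producing its inputs."""
    # Map each output name to the model that produces it.
    producers = {spec["output"]: name for name, spec in io_info.items()}
    # For each model, collect the models whose output it consumes.
    graph = {
        name: {producers[i] for i in spec["inputs"] if i in producers}
        for name, spec in io_info.items()
    }
    return list(TopologicalSorter(graph).static_order())


io_info = {
    "keyword_detection": {"inputs": ["text"], "output": "keywords"},
    "ocr": {"inputs": ["image"], "output": "text"},
    "gender_detection": {"inputs": ["image"], "output": "gender"},
}
order = execution_order(io_info)
```

Here the OCR model always precedes the keyword detection model, while the gender detection model has no dependency on either and could be run in parallel with them.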

The processor 930 can control the overall operation of the electronic device 900 and process data and signals. The processor 930 may perform one of the methods described above with reference to FIGS. 1 to 9. The processor 930 may control embodiments performed by the electronic device 900 through interaction with the transceiver 910, the storage 920, and any further components that the electronic device 900 may include. According to one embodiment, the processor 930 may receive a request from the client device 110, confirm information related to the request, confirm, based on the information, at least one artificial intelligence model corresponding to the request, transmit control information identified based on the information to the at least one artificial intelligence model, receive result information according to the control information from the at least one artificial intelligence model, and provide the result information to the client device 110.

Meanwhile, the specification and drawings disclose preferred embodiments of the present invention. Although specific terms are used, they are used in a general sense to easily explain the technical content of the present invention and to aid understanding of the present invention, and are not intended to limit the scope of the invention. It is obvious to those skilled in the art that, in addition to the embodiments disclosed herein, other modifications based on the technical idea of the present invention can be implemented.

The electronic device or terminal according to the above-described embodiments may include a processor, memory for storing and executing program data, permanent storage such as a disk drive, a communication port for communicating with an external device, and user interface devices such as a touch panel, keys, and icons. Methods implemented as software modules or algorithms may be stored on a computer-readable recording medium as computer-readable codes or program instructions executable on the processor. Here, computer-readable recording media include magnetic storage media (e.g., ROM (read-only memory), RAM (random-access memory), floppy disk, hard disk, etc.) and optical read media (e.g., CD-ROM, DVD (Digital Versatile Disc), etc.). The computer-readable recording medium may be distributed among computer systems connected to a network, so that computer-readable code can be stored and executed in a distributed manner. The media may be readable by a computer, stored in memory, and executed by a processor.

This embodiment can be represented by functional block configurations and various processing steps. These functional blocks may be implemented in various numbers of hardware and/or software configurations that execute specific functions. For example, embodiments may employ integrated circuit configurations such as memory, processing, logic, look-up tables, etc. that can execute various functions under the control of one or more microprocessors or other control devices. Similar to how the components can be implemented as software programming or software elements, this embodiment may be implemented in a programming or scripting language such as C, C++, Java, assembler, Python, etc., including various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms running on one or more processors. Additionally, this embodiment may employ conventional technologies for electronic environment settings, signal processing, and/or data processing. Terms such as “mechanism,” “element,” “means,” and “composition” can be used broadly and are not limited to mechanical and physical components. These terms may include the meaning of a series of software routines in connection with a processor, etc.

The above-described embodiments are merely examples and other embodiments may be implemented within the scope of the claims described below.

Claims

In a method of managing an artificial intelligence model in an electronic device,

Receiving a request from a client device;

confirming information related to the request;

Based on the information, confirming at least one artificial intelligence model corresponding to the request;

Transmitting control information identified based on the information to the at least one artificial intelligence model;

Receiving result information according to the control information from the at least one artificial intelligence model; and

A method of managing an artificial intelligence model including providing the result information to the client device.
According to claim 1,

A method of managing an artificial intelligence model, wherein the information includes identification information about the request, information about the at least one artificial intelligence model corresponding to the request, information about input data of the at least one artificial intelligence model, and information about the execution type corresponding to the request.
According to claim 1,

The step of transmitting the control information to the at least one artificial intelligence model includes:

Confirming an execution order of the at least one artificial intelligence model based on information about the input-output relationship for the at least one artificial intelligence model; and

A method of managing an artificial intelligence model including identifying the control information based on the execution order.
According to claim 3,

When the at least one artificial intelligence model is divided into a plurality of groups, the execution order of the at least one artificial intelligence model includes an execution order between the artificial intelligence models included in each of the plurality of groups, and

A method for managing an artificial intelligence model, wherein the plurality of groups are executed in parallel.
According to claim 2,

A method for managing an artificial intelligence model, characterized in that the information about the execution type is determined based on the expected time until receiving the result information.
According to claim 2,

When the execution type is the first type, providing the result information to the client device includes:

A method of managing an artificial intelligence model comprising providing the result information to the waiting client device in response to receiving the result information.
According to claim 2,

When the execution type is the second type, providing the result information to the client device includes:

A method of managing an artificial intelligence model including transmitting the result information to a data processing platform.
According to claim 7,

The step of receiving the request includes:

A method for managing an artificial intelligence model, comprising transmitting the identification information to the client device in response to receiving the request.
According to claim 8,

A method for managing an artificial intelligence model, wherein the result information stored in the data processing platform is identified by the client device based on the identification information.
According to claim 1,

The step of confirming the at least one artificial intelligence model is,

A method for managing an artificial intelligence model, comprising the step of identifying each of the at least one artificial intelligence model as one of a first type of processor unit-based artificial intelligence model and a second type of processor unit-based artificial intelligence model.
According to claim 10,

Among the at least one artificial intelligence model, a first group including the first type of processor unit-based artificial intelligence model includes a keyword detection model, and

A method for managing an artificial intelligence model, wherein a second group including the second type of processor unit-based artificial intelligence model among the at least one artificial intelligence model includes an optical character recognition model.
According to claim 10,

The step of confirming the at least one artificial intelligence model is,

A method for managing an artificial intelligence model, comprising assigning a first group including the first type of processor unit-based artificial intelligence model among the at least one artificial intelligence model and a second group including the second type of processor unit-based artificial intelligence model among the at least one artificial intelligence model to a first type of processor unit server and a second type of processor unit server, respectively.
According to claim 12,

The second type of processor unit server includes a plurality of instances, and

A method for managing an artificial intelligence model, characterized in that the instance corresponding to each artificial intelligence model included in the second group is determined based on history data about memory usage of the artificial intelligence model included in the second group.
In an electronic device for managing an artificial intelligence model,

transceiver;

Storage for storing one or more instructions; and

Receive a request from a client device,

Check information related to the request,

Based on the information, identify at least one artificial intelligence model corresponding to the request,

Transmitting control information identified based on the information to the at least one artificial intelligence model,

Receive result information according to the control information from the at least one artificial intelligence model, and

An electronic device for managing an artificial intelligence model, including a processor that provides the result information to the client device.
A non-transitory computer-readable recording medium that records a program for executing the method of claim 1 on a computer.