CN114298313A - Artificial intelligence computer vision reasoning method - Google Patents

Artificial intelligence computer vision reasoning method

Info

Publication number
CN114298313A
CN114298313A (application CN202111605642.0A)
Authority
CN
China
Prior art keywords
algorithm
reasoning
artificial intelligence
computer vision
algorithm engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111605642.0A
Other languages
Chinese (zh)
Inventor
范亮
汤坚
张磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Zhongke Zhi Tour Technology Co ltd
Original Assignee
Guangzhou Zhongke Zhi Tour Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Zhongke Zhi Tour Technology Co ltd filed Critical Guangzhou Zhongke Zhi Tour Technology Co ltd
Priority to CN202111605642.0A priority Critical patent/CN114298313A/en
Publication of CN114298313A publication Critical patent/CN114298313A/en
Pending legal-status Critical Current

Landscapes

  • Stored Programmes (AREA)

Abstract

The invention discloses an artificial intelligence computer vision reasoning method that uses a k8s (Kubernetes, hereinafter k8s) cluster to manage the containers running a plurality of subtasks. The method has the advantage of easy maintenance, allows operation and maintenance personnel to be trained quickly, and achieves the purposes of improving the efficiency of the overall bottom-layer computing resources and of rapid deployment, operation, and maintenance. The method comprises the following steps: configuring an algorithm engine, wherein the algorithm engine is developed based on C#/.NET Core; the system uses k8s to schedule and manage resources to generate corresponding program containers, and uses the k8s command line tool to create corresponding running container agents according to the number of GPUs applied for by the user; the running container agent listens on port 5000 to receive algorithm inference task information sent by a Web Api request; and the algorithm engine creates a script according to the algorithm inference task and executes the algorithm process.

Description

Artificial intelligence computer vision reasoning method
Technical Field
The invention relates to the technical field of artificial intelligence, such as computer vision and deep learning, and in particular to an artificial intelligence computer vision reasoning method.
Background
A conventional GPU computing-resource scheduling scheme is a GPU resource scheduler for a shared-memory mode: combined with a cluster job-management system, the user passes in the number of GPU devices needed, the required GPU device resources are automatically allocated to the user's job, and usage conflicts among multiple GPU devices are avoided. In another GPU resource scheduling method and device, for the MESOS containerized platform, when a target service is created on a graphical interface, the GPU video-memory utilization rate for the target service is set so that a corresponding target host can be assigned to the service; the application container engine is run to call the CUDA driver and GPU driver of the target host, and the target host's GPU resources are used to run the target service in a container.
After the advent of artificial intelligence computer vision technology, more and more tedious manual operations were replaced, freeing up human hands. Computer vision technology simulates the human visual process with a computer, giving it the ability to perceive the environment together with the visual function of humans. Existing intelligent computer vision inference management systems are mainly applied to GPU resource sharing across multiple servers. A GPU resource scheduler for the shared-memory mode suffers from insufficient stability, poor confidentiality, and similar problems: different users and objects share memory and server environments, may conflict with and affect one another, and synchronous use by multiple users and multiple instances is hindered.
In the GPU resource scheduling method for the MESOS containerized platform, Apache MESOS is an open-source cluster manager acting as a distributed-system kernel. Its architecture is relatively loosely coupled, programs must use MESOS's own API, and program development, operation, and maintenance are complex. Apache MESOS is a mixed-environment configuration tool containing both container and non-container applications; although MESOS is stable, it is relatively difficult for its users to learn and apply quickly, and operation and maintenance personnel need extremely rich experience and ability, which hinders adoption.
With the rapid progress of artificial intelligence technology, many excellent models have been industrially deployed into practical projects. In practical applications, however, efficiency is greatly affected by problems such as resource-scheduling conflicts and model-call crashes, so models and resources need to be used and scheduled more stably, safely, and conveniently.
Disclosure of Invention
The invention provides an artificial intelligence computer vision reasoning method that uses a k8s cluster to manage the containers running a plurality of subtasks, has the advantage of easy maintenance, allows operation and maintenance personnel to be trained quickly, and achieves the purposes of improving the efficiency of the overall bottom-layer computing resources and of rapid deployment, operation, and maintenance.
The invention provides an artificial intelligence computer vision reasoning method, which comprises the following steps:
configuring an algorithm engine, wherein the algorithm engine provides support for the operation of an algorithm, the algorithm engine is an agent program, and the algorithm comprises a model reasoning task;
the algorithm engine receives inference task information sent by the Web Api;
and the algorithm engine creates a script according to the reasoning task and executes an algorithm process.
Optionally,
the configuring of the algorithm engine uses the k8s command line tool to create a corresponding running container agent according to the number of GPUs applied for by the user, and the running agent object starts a Web Server that listens on port 5000.
Optionally,
the algorithm engine creating a script according to the inference task and executing the algorithm process comprises the following steps:
the algorithm engine starts a working machine;
and the algorithm engine schedules an identification file to a working machine for identification and outputs an identification result, wherein the identification file comprises a trained model weight file, an index file and a network structure file.
Optionally,
after the algorithm engine receives the inference task information sent by the Web Api and before the algorithm engine creates a script according to the inference task and executes the algorithm process, the method further includes:
determining that the algorithm is an inference task, obtaining a determination result.
Optionally,
the algorithm engine creating a script according to the inference task and executing the algorithm process comprises the following step:
when the algorithm is an inference task, invoking an inference script to carry out algorithm inference.
Optionally,
before the algorithm engine receives the inference task information sent by the Web Api, the method further comprises the following step:
the algorithm engine initiates a request to receive inference task information to the address configured in the configuration file.
Optionally,
after the algorithm engine initiates the request to receive inference task information to the address configured in the configuration file and before the algorithm engine receives the inference task information sent by the Web Api, the method further includes: if no inference task information is received, waiting 'PollInterval' milliseconds before trying again.
Optionally,
the algorithm is determined to be an algorithm inference task according to the following step:
determining that a request to 'http://{server}/test/{modelId}?dataPath={dataPath}' initiates the inference algorithm.
Optionally,
after the algorithm engine receives the inference task information sent by the Web Api, the method further comprises the following step: recording a data path parameter, an algorithm path parameter, and a callback address parameter.
Optionally,
the algorithm engine creating a script according to the inference task and executing the algorithm process comprises the following step:
the running container agent, in the process of executing the inference task, reports in-process log and result information through the callback-address Web Api.
Compared with the prior art, the application has the following beneficial effects:
the technical scheme includes that an algorithm engine is configured, the algorithm engine provides support for operation of an algorithm, and the algorithm engine is k8 s; and the algorithm engine receives inference task information sent by the Web Api, creates a script according to the inference task and executes an algorithm process. Since k8s is used as a container application on multiple hosts in the management cloud platform, k8s can package the application into a container image, and fully utilize server resources in the form of a container. The k8s cluster is used for managing the containers for the plurality of running subtasks, the method has the advantage of easiness in maintenance, operation and maintenance personnel can be quickly trained, and the purposes of improving the efficiency of the whole bottom-layer computing resources and quickly deploying, operating and maintaining the resources are achieved. Compared with a GPU resource scheduling method for the MESOS containerized platform, the framework is relatively loosely coupled, the MESOS uses an API of the MESOS, and the program development and operation and maintenance are complex. Apache MESOS users have relative difficulty in fast learning and application, and operation and maintenance personnel need extremely rich experience and capability and are not beneficial to application. The k8s management program provided by the application can be rapidly deployed and applied in a manner of deploying containerized applications on the cluster, deploying the cluster in a large scale, updating the versions of the containerized applications and debugging the containerized applications.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a first embodiment of a method for artificial intelligence computer vision reasoning in accordance with the present invention;
FIG. 2 is a flow chart of a second embodiment of the artificial intelligence computer vision reasoning method of the present invention;
FIG. 3 is a flow chart of a third embodiment of the artificial intelligence computer vision reasoning method of the present invention;
FIG. 4 is a flow chart of a fourth embodiment of the artificial intelligence computer vision reasoning method of the present invention;
FIG. 5 is a flowchart illustrating a fifth embodiment of an artificial intelligence computer vision reasoning method according to the present invention.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. Furthermore, the terms "comprises", "comprising", and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
The terms appearing hereinafter include the following explanations:
the distribution method divides a problem which needs huge computing power to solve into a plurality of small parts, then distributes the parts to a plurality of computers for processing, and finally integrates the computing results to obtain a final result.
A server cluster is a solution for improving the overall computing power of servers: multiple computers compute in parallel for higher speed, or back each other up. Colloquially, a cluster concentrates several servers on the same service, forming a parallel or distributed system. A server cluster can carry the load of big-data analysis and provide computing or application services to other clients in the network (terminals such as PCs, smartphones, and ATMs, and even large equipment such as train systems). Because the servers in the cluster run the same computing task, from the perspective of an external client the cluster appears as a single server providing a uniform service.
The Web Api is a Web application program interface. The network application can realize the capabilities of storage service, message service, computing service and the like through the API interface, and can develop powerful web applications by utilizing the capabilities.
Kubernetes, k8s for short (the 8 replaces the eight characters "ubernete"), is an open-source container orchestration engine that supports automated deployment, large-scale scaling, and containerized application management. When an application is deployed in a production environment, multiple instances are typically deployed to load-balance requests to containerized applications on multiple hosts. Kubernetes aims to make deploying containerized applications simple and powerful, and provides mechanisms for application deployment, planning, updating, and maintenance. Containers occupy few resources and deploy quickly. Each application can be packaged into a container image; the one-to-one relationship between application and container also gives containers great advantages, because a container image can be created for the application at the build or release stage. Each application does not need to be combined with the rest of the application stack and does not depend on the production-environment infrastructure, so a consistent environment is available from development through inference and production. Containers are also lighter-weight and more "transparent" than virtual machines, which makes them easier to monitor and manage.
Referring to fig. 1, a first embodiment of an artificial intelligence computer vision reasoning method according to the present invention includes:
101. Configuring an algorithm engine.
In this embodiment, an algorithm engine that provides runtime support for algorithm inference is provided in the server cluster. The algorithm engine is developed based on C#/.NET Core. The system uses k8s to schedule and manage resources and generate the corresponding program containers; each user instance and scheduling unit runs independently, avoiding program exceptions caused by directly sharing environment configuration.
According to the number of GPUs applied for by the user, the k8s command line tool is used to create corresponding running container agents, each listening on port 5000. The agent only supports the service startup mode: without distinguishing input parameters, the agent starts a Web Server listening on port 5000 (which may be mapped to another port by configuration when running in a container), and requests can be sent to this Web Server to perform inference. The system starts some running container agents in advance; after receiving an identification task, it calls the specific model to run and outputs the result, and when the task completes, neither the container nor the identification model's supporting files need to be destroyed, which improves efficiency.
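The per-GPU agent creation described above can be sketched as building one Kubernetes Pod manifest per GPU the user applied for, each exposing the agent's port 5000 and requesting a single GPU via the standard NVIDIA device-plugin resource name. This is a minimal illustrative sketch: the image name, labels, and naming scheme are assumptions, not the patent's actual tooling.

```python
def agent_pod_manifest(user_id: str, gpu_index: int, image: str = "algo-agent:latest"):
    """Minimal Pod manifest for one running-container agent (names assumed)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"{user_id}-agent-{gpu_index}",
            "labels": {"app": "algo-agent", "user": user_id},
        },
        "spec": {
            "containers": [{
                "name": "agent",
                "image": image,
                # The agent's Web Server listens on 5000 inside the container.
                "ports": [{"containerPort": 5000}],
                # One GPU per agent, via the standard device-plugin resource.
                "resources": {"limits": {"nvidia.com/gpu": 1}},
            }]
        },
    }

def agent_manifests(user_id: str, gpu_count: int):
    """One agent Pod per GPU the user applied for."""
    return [agent_pod_manifest(user_id, i) for i in range(gpu_count)]
```

In practice such manifests would be submitted with the k8s command line tool (e.g. `kubectl apply`); generating them as data keeps the sketch self-contained.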
102. The algorithm engine receives inference task information sent by the Web Api;
In this embodiment, the algorithm engine receives, in an agreed-upon mode, the inference task information sent back by the Web Api for algorithm inference, dynamically creates a script, and executes the inference process.
103. The algorithm engine creates a script according to the inference task and executes the algorithm process.
In this embodiment, the algorithm engine creates a script according to the inference task and executes the algorithm process, using different Api contents to execute the inference algorithm independently. It loads the trained model weight file, index file, network structure file, and so on, and schedules the identification task to the working machine to start execution; the algorithm engine's functions are thus widely and efficiently applicable. How the algorithm engine specifically creates a script according to the inference task and executes the algorithm process is described in the following embodiments.
In this embodiment, the system first configures an algorithm engine that provides support for the operation of an algorithm; resources are scheduled with k8s, and the algorithm includes an inference algorithm. The algorithm engine receives inference task information sent by the Web Api, creates a script according to the inference task, and executes the algorithm process. Because k8s manages container applications across multiple hosts in a cloud platform, applications can be packaged into container images and server resources fully utilized in container form. Using a k8s cluster to manage the containers for the running subtasks is easy to maintain, operation and maintenance personnel can be trained quickly, and the efficiency of the overall bottom-layer computing resources is improved while deployment, operation, and maintenance are rapid.
Compared with conventional inference management systems, which are mainly applied to GPU resource sharing across multiple servers, the GPU resource scheduler for the shared-memory mode suffers from insufficient stability and poor confidentiality: different users and objects share memory and server environments, may conflict with one another, synchronous use by multiple users and multiple instances is hindered, and users cannot configure development and inference environments themselves, so the extensibility of the algorithm development space is poor and users' habits are not accommodated. Compared with the GPU resource scheduling method for the MESOS containerized platform, whose architecture is relatively loosely coupled, which requires MESOS's own API, and whose development, operation, and maintenance are complex and hard to learn quickly, the k8s management program provided by the system can be rapidly deployed and applied by deploying containerized applications on the cluster, scaling the cluster, updating containerized-application versions, and debugging containerized applications.
At the service-management end outside the containers, the user's operation requests (including requests from the application's API on the UI) are received and placed into a work queue, and a corresponding work server is dispatched to respond to each work-task request. When resources are sufficient, more containers are started to participate in the work so that tasks complete as soon as possible; when no more resources are available, the work queue becomes a buffer pool between the background work cluster and foreground user requests. The main functions (the work queue, task management, the agent program in the container, and work-state synchronization) cooperate to support system operation. The system stores and distributes data in a distributed manner and uses a CDN to accelerate the distribution of large amounts of data, which is efficient and fast.
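The queue-as-buffer behavior described above can be sketched as follows: a request is dispatched immediately while worker capacity remains, and buffered otherwise. All class and method names here are illustrative assumptions, not the patent's implementation.

```python
import queue

class WorkScheduler:
    """Sketch of the service-management layer: dispatch while resources
    remain, otherwise buffer requests in a work queue (assumed design)."""

    def __init__(self, max_workers: int):
        self.max_workers = max_workers
        self.busy = 0
        self.pending = queue.Queue()

    def submit(self, task):
        if self.busy < self.max_workers:
            self.busy += 1          # enough resources: start another worker
            return "dispatched"
        self.pending.put(task)      # no free resources: queue acts as buffer
        return "queued"

    def task_done(self):
        """A worker finished; hand it the next buffered task, if any."""
        if not self.pending.empty():
            return self.pending.get()   # reuse the freed worker immediately
        self.busy -= 1
        return None
```

A usage pass: with one worker, the first request dispatches, the second queues, and completing the first drains the queue.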
The second embodiment of the artificial intelligence computer vision reasoning method provided by the invention differs from the first in that the step in which the algorithm engine creates a script according to the inference task and executes the algorithm process further comprises the following steps:
the algorithm engine starts a working machine;
and the algorithm engine schedules an identification file to a working machine for identification and outputs an identification result, wherein the identification file comprises a trained model weight file, an index file and a network structure file.
In this embodiment, the algorithm engine starts the working machine at an appropriate time, loads the trained model weight file, index file, network structure file, and so on, schedules the recognition task to the working machine to start execution, and finally outputs the recognition result and, according to the convention, calls back or updates the state information of the task.
Since steps 201 to 202 of this embodiment are similar to steps 101 to 102 of the first embodiment, they are not described herein again. Referring to fig. 2, the following describes a flow of this embodiment by taking an example in practical application, and the flow includes the following steps.
201. Configuring an algorithm engine;
202. the algorithm engine receives inference task information sent by the Web Api;
203. the algorithm engine starts a working machine;
204. The algorithm engine schedules the identification file to the working machine for identification and outputs an identification result.
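The second embodiment's flow (start a working machine, load the three identification files, run identification, output a result) can be sketched minimally. The file names and the stubbed `identify` call are illustrative assumptions; a real engine would deserialize the trained weights, index, and network structure and run the model.

```python
from pathlib import Path

class WorkerMachine:
    """Minimal sketch of the working machine (names and file set assumed)."""

    # Assumed names for the trained weight, index, and network-structure files.
    REQUIRED = ("model.weights", "model.index", "network.json")

    def __init__(self, model_dir: str):
        self.model_dir = Path(model_dir)
        self.loaded = []

    def load(self):
        for name in self.REQUIRED:
            # A real engine would deserialize each file; we only record it.
            self.loaded.append(name)
        return self

    def identify(self, data_path: str):
        # Placeholder for model inference: return a recognition-result record.
        return {"data": data_path, "files_used": len(self.loaded)}
```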
The third embodiment of the artificial intelligence computer vision reasoning method provided by the invention differs from the first in that, after the step of receiving the inference task information sent by the Web Api, the method further comprises: determining whether the algorithm is an inference algorithm, obtaining a determination result; and the algorithm engine creating a script according to the inference task and executing the algorithm process comprises: when the algorithm is an inference algorithm, invoking an inference script.
since steps 301 to 302 of this embodiment are similar to steps 101 to 102 of the first embodiment, they are not described herein again. Referring to fig. 3, the following describes a flow of this embodiment by taking an example in practical application, and the flow includes the following steps.
301. Configuring an algorithm engine;
302. the algorithm engine receives inference task information sent by the Web Api;
303. Determining whether the algorithm is an inference algorithm, obtaining a determination result;
in this embodiment, how the system determines the algorithm to be an inference algorithm is explained in the following embodiments.
304. When the algorithm is an inference algorithm, an inference script is invoked.
Invoking the inference script includes:
The agent directly calls 'test.py' of the algorithm model; the configured model directory and task directory are passed in through parameters.
According to the convention, the model path is passed in through '-model' and the task path through '-task'; both are absolute paths obtained after the program processes the related configuration templates. If relative paths were used, it would be easy to confuse whether the path is relative to the agent or to the py file, so the 'TestModelPathBaseTemplate' and 'TestTaskPathBaseTemplate' configuration items should be set as absolute paths to avoid problems.
The invocation is similar to the following:
python {TestTaskPathBaseTemplate instantiation}/test.py -model {TestModelPathBaseTemplate instantiation} -task {TestTaskPathBaseTemplate instantiation}.
In this embodiment, when the algorithm is an inference algorithm, an inference script is called. When initiating the inference task, the agent executes the inference script by internally starting a python process. The algorithm's inference script is a python script, which makes it convenient to connect the latest algorithm results; once packaged, the inference script receives the passed-in data path and stores process and result information at a designated local location, making it convenient for the agent to read and report the results of algorithm inference.
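The agent's internal launch of the python process can be sketched as assembling the argv for 'test.py' with the '-model' and '-task' absolute paths. The script's location under the task path base and the exact flag spelling are assumptions reconstructed from the convention above; only the argv is built here, not executed.

```python
def build_inference_argv(test_task_path_base: str, test_model_path_base: str):
    """Assemble the argv the agent would pass when starting the python
    process for 'test.py' (script location and flags are assumptions)."""
    return [
        "python",
        f"{test_task_path_base}/test.py",
        "-model", test_model_path_base,   # absolute model path, per convention
        "-task", test_task_path_base,     # absolute task path, per convention
    ]

# The agent would then run it, e.g. subprocess.run(argv, check=True),
# and read the process/result files the script writes locally.
```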
The fourth embodiment of the artificial intelligence computer vision reasoning method provided by the invention differs from the third in that, before the algorithm engine receives the inference task information sent by the Web Api, the method further comprises: the algorithm engine initiating a request to receive inference task information to the address configured in the configuration file; and the algorithm is determined to be an inference algorithm as follows: a 'POST' request to 'http://{server}/test/{modelId}?dataPath={dataPath}' is determined to initiate the inference algorithm. Referring to fig. 4, the flow of the embodiment includes the following steps:
401. configuring an algorithm engine;
step 401 is similar to step 301, and will not be described herein.
402. The algorithm engine initiates a reasoning task information receiving request to the address configured in the configuration file;
In this embodiment, after the agent is started, it automatically initiates a 'get' request to the address configured by 'PollTaskUrl' in the configuration file and executes the obtained tasks in sequence ('train' and 'test' tasks are treated alike). If no task is obtained, it waits 'PollInterval' milliseconds before trying again.
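The startup polling loop just described can be sketched as follows. The fetch call stands in for the 'get' against the 'PollTaskUrl' address, and the `max_attempts` bound is added only so the sketch terminates; both are assumptions.

```python
import time

def poll_for_tasks(fetch, poll_interval_ms: int, max_attempts: int):
    """Poll for tasks; on an empty result, sleep 'PollInterval' milliseconds
    before retrying (fetch is injected in place of the real HTTP 'get')."""
    for _ in range(max_attempts):
        tasks = fetch()
        if tasks:                      # 'train' and 'test' tasks treated alike
            return tasks
        time.sleep(poll_interval_ms / 1000.0)
    return []
```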
403. A request to 'http://{server}/test/{modelId}?dataPath={dataPath}' is determined to initiate the inference algorithm.
In this embodiment, a 'POST' request to 'http://{server}/test/{modelId}?dataPath={dataPath}' initiates the inference task. Note that each call generates a new inference path to store the process files, and the agent ensures that all inference tasks are serialized.
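The determination rule above can be sketched as a small request matcher: only a POST whose path matches /test/{modelId} counts as initiating inference, and the dataPath comes from the query string. The function name and return shape are assumptions.

```python
import re
from urllib.parse import urlparse, parse_qs

INFERENCE_PATTERN = re.compile(r"^/test/(?P<model_id>[^/]+)$")

def parse_inference_request(method: str, url: str):
    """Return (modelId, dataPath) when the request initiates inference per
    the convention http://{server}/test/{modelId}?dataPath=..., else None."""
    if method.upper() != "POST":
        return None
    parts = urlparse(url)
    m = INFERENCE_PATTERN.match(parts.path)
    if not m:
        return None
    data_path = parse_qs(parts.query).get("dataPath", [None])[0]
    return m.group("model_id"), data_path
```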
The fifth embodiment of the artificial intelligence computer vision reasoning method provided by the invention is different from the previous embodiments in that after receiving reasoning task information sent by the Web Api, the algorithm engine further comprises: and recording data path parameters, algorithm path parameters and callback address parameters. The algorithm engine creates a script according to the reasoning task and executes an algorithm process, and the algorithm process comprises the following steps: and the operation container agent reports the log and result information in the process through the callback address Web Api in the process of executing the reasoning task.
Referring to fig. 5, an actual example of the present embodiment includes the following steps:
501. configuring an algorithm engine;
502. the algorithm engine receives inference task information sent by the Web Api;
steps 501-502 are similar to steps 101-102 in the first embodiment, and will not be described in detail.
503. Recording data path parameters, algorithm path parameters and callback address parameters;
504. and the operation container agent reports the log and result information in the process through the callback address Web Api in the process of executing the reasoning task.
When initiating the inference task, the agent executes the inference script by internally starting a python process. The algorithm's inference script is a python script, which makes it convenient to connect the latest algorithm results; once packaged, the inference script receives the passed-in data path and stores process and result information at a designated local location, making it convenient for the agent to read and report the results of algorithm inference.
In the course of executing the reasoning task, the agent reports various in-process log and result information by calling back the Web Api; if no Web Api is provided, the agent cannot report this information correctly.
The callback Web Api is provided by the management platform; the agent accesses it through the 'serviceUrl' entry configured in 'appsettings.json'. The agent will 'POST' Json content to this url as the situation requires. The 'MIME type' of the 'POST' request is 'application/json'.
POST http://localhost:5000/api/info. The content of each POST is an array, in preparation for sending multiple pieces of data per request in the future. When the amount of data is large, all backlogged information may later be sent in timed batches.
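The timed-batch idea can be sketched as follows. The CallbackBatcher class is hypothetical (the real agent's internals are not disclosed), and the injected send function stands in for the actual HTTP POST of the JSON array:

```python
import json
import threading


class CallbackBatcher:
    """Accumulate callback items and flush them as one JSON array per
    request, matching the note that backlogged information may be sent
    in timed batches."""

    def __init__(self, send, interval_s: float = 1.0):
        self._send = send          # callable taking the JSON string body
        self._items = []
        self._lock = threading.Lock()
        self._interval = interval_s  # how often a timer would call flush()

    def add(self, item: dict) -> None:
        """Queue one callback item (e.g. a 'log' or 'end' report)."""
        with self._lock:
            self._items.append(item)

    def flush(self) -> None:
        """Send everything queued so far as a single array; no-op if empty.
        A periodic timer firing every interval_s seconds would call this."""
        with self._lock:
            batch, self._items = self._items, []
        if batch:
            self._send(json.dumps(batch))  # one array per POST
```

Even a single report is wrapped in an array, which keeps the receiving endpoint's contract uniform whether batching is enabled or not.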
The parameters in the Json are as follows:
taskType: distinguishes whether the current task is inference
action: represents the current operation, and takes one of the following values:
1. log: ordinary log output, reporting python's output log; the log content is in rawLog
2. error: a python execution error; details are in rawLog
3. end: the processing flow has ended. Triggered after the python process finishes executing; the end of the .py file's execution alone does not trigger it. Success or failure can be seen from the ExitCode reported in rawLog
modelId: the model Id
taskId: the task Id [occurrence condition: taskType = test]
rawLog: the original underlying log content
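Under the parameter list above, one callback element might be assembled as follows. This is a sketch: the field names and the taskId occurrence condition follow the description, while the helper function itself and its argument order are assumptions.

```python
import json


def make_callback_item(task_type: str, action: str, model_id: str,
                       raw_log: str, task_id=None) -> dict:
    """Assemble one element of the array the agent POSTs to /api/info.

    taskId is only included when taskType is 'test', per the stated
    occurrence condition; action is one of 'log', 'error', 'end'.
    """
    item = {
        "taskType": task_type,
        "action": action,
        "modelId": model_id,
        "rawLog": raw_log,
    }
    if task_type == "test" and task_id is not None:
        item["taskId"] = task_id
    return item


# The body of each POST is an array, so even a single report is wrapped:
payload = json.dumps([make_callback_item("test", "log", "m42",
                                         "epoch 1 done", "t7")])
```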
When the agent reports inference task results, it accesses the location configured by 'testResultServiceUrlTemplate' in 'appsettings.json', in which '{modelId}' and '{taskId}' are replaced by the corresponding model and task ids. The agent collects all 'txt' result files output by 'test'.
The 'MIME type' of the 'POST' request is 'application/json'.
The Json content is shown below, with detailed descriptions annotated: it is an array representing the results generated for a collection of identified files, where each element represents one file's identification result:
1. file: the filename with the extension removed; for example, if the recognized image is "857.jpg", the file value in the recognition result is "857"
2. the identified object or objects, as an array. One picture may contain multiple objects to be identified. If no object is identified, this is an empty array [].
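A result entry of this shape might be built as follows. The 'file' key comes from the description above; the 'result' key name and the fields of each recognized object (label, score) are assumptions for illustration.

```python
import json
import os


def make_result_entry(image_filename: str, objects: list) -> dict:
    """One element of the result array the agent POSTs back:
    'file' is the filename with its extension removed, and the object
    list is empty when nothing was recognized."""
    stem, _ext = os.path.splitext(image_filename)
    return {"file": stem, "result": objects}


results = [
    # Hypothetical object fields; the patent does not specify them.
    make_result_entry("857.jpg", [{"label": "tower", "score": 0.91}]),
    make_result_entry("858.jpg", []),  # nothing recognized in this image
]
body = json.dumps(results)
```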
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special- or general-purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this application, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Artificial intelligence is the discipline that studies the use of computers to simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, planning, etc.), with technologies at both the hardware level and the software level. Artificial intelligence hardware technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, and big data processing; artificial intelligence software technologies mainly include computer vision technology, speech recognition technology, natural language processing technology, machine learning/deep learning, big data processing technology, knowledge graph technology, and the like.
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders; this is not limited herein as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. An artificial intelligence computer vision reasoning method, comprising:
configuring an algorithm engine, wherein the algorithm engine provides support for the operation of an algorithm and is developed based on the C# .net core; the system uses k8s to schedule and manage resources to generate a corresponding program container, and uses the k8s command line tool to create a corresponding operation container agent according to the number of GPUs applied for by the user, wherein the algorithm comprises an inference algorithm;
the algorithm engine receives inference task information sent by the Web Api;
and the algorithm engine creates a script according to the reasoning task and executes an algorithm process.
2. The artificial intelligence computer vision reasoning method of claim 1, wherein:
the configuring of the algorithm engine comprises: using k8s to schedule and manage resources according to the number of GPUs applied for by the user to generate a corresponding program container, and using the k8s command line tool to create a corresponding operation container agent, wherein the operation container agent starts a Web Server listening on port 5000.
3. The artificial intelligence computer vision reasoning method of claim 1, wherein:
the step in which the algorithm engine creates a script according to the reasoning task and executes the algorithm process comprises:
the algorithm engine starts a working machine;
the algorithm engine schedules the identification file to the working machine for identification and outputs the identification result, wherein the identification file comprises a trained model weight file, an index file and a network structure file.
4. The artificial intelligence computer vision reasoning method of claim 1, wherein:
after the algorithm engine receives the inference task information sent by the Web Api and before the algorithm engine creates a script according to the inference task and executes the algorithm process, the method further comprises:
determining that the algorithm is a reasoning algorithm to obtain a determination result.
5. The artificial intelligence computer vision reasoning method of claim 4, wherein:
the step in which the algorithm engine creates a script according to the reasoning task and executes the algorithm process comprises:
determining that the task is an algorithm reasoning task as follows:
determining that a request to "http://{server}/test/{modelId}?dataPath={dataPath}" initiates the inference algorithm.
6. The artificial intelligence computer vision reasoning method of claim 4, wherein:
before the algorithm engine receives inference task information sent by the Web Api, the method further comprises the following steps:
the algorithm engine initiates a request to receive inference task information to the address configured in the configuration file.
7. The artificial intelligence computer vision reasoning method of claim 6, wherein:
after the algorithm engine initiates the request to receive inference task information to the address configured in the configuration file and before the algorithm engine receives the inference task information sent by the Web Api, the method further comprises: if no inference task information is received, waiting for 'PollInterval' milliseconds before trying again.
8. The artificial intelligence computer vision reasoning method of claim 6, wherein:
determining that a request to "http://{server}/test/{modelId}?dataPath={dataPath}" initiates the model inference task.
9. The artificial intelligence computer vision reasoning method of claim 1, wherein:
after the algorithm engine receives the inference task information sent by the Web Api, the method further comprises: recording the data path parameter, the algorithm path parameter and the callback address parameter.
10. The artificial intelligence computer vision reasoning method of claim 2, wherein:
the algorithm engine creates a script according to the reasoning task and executes an algorithm process, and the algorithm process comprises the following steps:
running the agent program, wherein the agent reports the in-process log and result information through the callback address Web Api in the course of executing the model reasoning task.
CN202111605642.0A 2021-12-25 2021-12-25 Artificial intelligence computer vision reasoning method Pending CN114298313A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111605642.0A CN114298313A (en) 2021-12-25 2021-12-25 Artificial intelligence computer vision reasoning method


Publications (1)

Publication Number Publication Date
CN114298313A true CN114298313A (en) 2022-04-08

Family

ID=80969789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111605642.0A Pending CN114298313A (en) 2021-12-25 2021-12-25 Artificial intelligence computer vision reasoning method

Country Status (1)

Country Link
CN (1) CN114298313A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117057527A (en) * 2023-06-30 2023-11-14 东风设备制造有限公司 Intelligent operation and maintenance method and system for industrial Internet of things of automobile manufacturing equipment
CN117057527B (en) * 2023-06-30 2024-05-14 东风设备制造有限公司 Intelligent operation and maintenance method and system for industrial Internet of things of automobile manufacturing equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination