CN117725271A - Service processing method, device, electronic equipment and storage medium - Google Patents

Service processing method, device, electronic equipment and storage medium Download PDF

Info

Publication number
CN117725271A
CN117725271A
Authority
CN
China
Prior art keywords
vector
target
node
attribute information
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311729520.1A
Other languages
Chinese (zh)
Inventor
陈京来
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd filed Critical Netease Hangzhou Network Co Ltd
Priority to CN202311729520.1A priority Critical patent/CN117725271A/en
Publication of CN117725271A publication Critical patent/CN117725271A/en
Pending legal-status Critical Current

Links

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a service processing method, a device, electronic equipment and a storage medium, relating to the technical field of search engines. In the scheme, a user can customize the attribute information of a target vector service according to actual service requirements; a target node in a vector engine is then scheduled according to the defined attribute information and the data scale, and the target node is started so that the target vector service is executed through it. In other words, the scheme provides a plug-in architecture in which the vector engine can be extended in a customized manner, so that the user can realize vector data retrieval based on the designed task-oriented vector engine. The user can thus quickly use the task-oriented vector engine provided by the scheme to meet service requirements without becoming mired in operation and maintenance, which solves the problem that the mainstream vector engines provided in the prior art cannot meet the actual requirements of various service types.

Description

Service processing method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of search engine technologies, and in particular, to a service processing method, a device, an electronic device, and a storage medium.
Background
Vectors play an important role in large language models, knowledge base interaction and computation. Unstructured data such as text, images, audio and video can be represented as vector embeddings by intelligent tools and algorithms, thereby implementing functions such as text similarity calculation, knowledge base retrieval and reasoning. Vectors provide a convenient and effective representation for semantic understanding and its applications. The main role of a vector engine is to store and process vector data and to provide efficient vector retrieval.
In the prior art, a plurality of vector engines are provided; however, these vector engines do not provide matching schemes for different service types and cannot meet the actual requirements of various service types.
Disclosure of Invention
The invention aims to provide a service processing method, a device, an electronic device and a storage medium to remedy the defects in the prior art, namely the problem that the mainstream vector engines provided in the prior art cannot meet the actual requirements of various service types.
In order to achieve the above purpose, the technical solution adopted in the embodiment of the present application is as follows:
in a first aspect, an embodiment of the present application provides a service processing method, where the method includes:
acquiring attribute information of a target vector service to be processed;
determining a target node in a vector engine according to the attribute information and a predefined data scale; wherein the data scale is the maximum amount of vector data that the vector engine can serve;
and starting the target node to execute the target vector service through the target node.
In a second aspect, an embodiment of the present application further provides a service processing apparatus, where the apparatus includes:
the acquisition module is used for acquiring attribute information of the target vector service to be processed;
the determining module is used for determining a target node in the vector engine according to the attribute information and the predefined data scale; wherein the data scale is the maximum amount of vector data that the vector engine can serve;
and the starting module is used for starting the target node so as to execute the target vector service through the target node.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor; when the electronic device is operating, the processor and the storage medium communicate over the bus, and the processor executes the machine-readable instructions to perform the service processing method as provided in the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs a service processing method as provided in the first aspect.
The beneficial effects of this application are:
the embodiment of the application provides a service processing method, a device, electronic equipment and a storage medium. In the scheme, a user can customize the attribute information of a target vector service according to actual service requirements; a target node in a vector engine is then scheduled according to the defined attribute information and the data scale, and the target node is started so that the target vector service is executed through it. In other words, the scheme provides a plug-in architecture in which the user can extend the vector engine in a customized manner, so that the user can realize vector data retrieval based on the designed task-oriented vector engine. The user can thus quickly use the task-oriented vector engine provided by the scheme to meet service requirements without becoming mired in operation and maintenance, which solves the problem that the mainstream vector engines provided in the prior art cannot meet the actual requirements of various service types.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a service processing method provided in an embodiment of the present application;
fig. 2 is a second schematic flow chart of a service processing method provided in an embodiment of the present application;
fig. 3 is a third schematic flow chart of a service processing method provided in an embodiment of the present application;
fig. 4 is a fourth schematic flow chart of a service processing method provided in an embodiment of the present application;
fig. 5 is a schematic structural diagram of a service processing device provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments will be described clearly and completely below with reference to the accompanying drawings. It should be understood that the accompanying drawings are for illustration and description only and are not intended to limit the protection scope of the present application; it should also be understood that the schematic drawings are not drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments. It should be understood that the operations of the flowcharts may be performed out of the order shown, and steps without logical dependency may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to, and one or more operations removed from, the flowcharts by those skilled in the art.
In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that the term "comprising" will be used in the embodiments of the present application to indicate the presence of the features stated hereinafter, but not to exclude the addition of other features.
First, terms related to the present application will be explained.
(1) Vector engine: vectors play an important role in large language models, knowledge base interactions and computation. The AI-based tools and algorithms can represent unstructured data, such as text, images, audio, video, etc., as vector embeddings (vector embeddings), thereby implementing text similarity calculation, knowledge base retrieval, reasoning, etc. The vector provides a convenient and effective representation method for semantic understanding and application. The main role of the vector engine is to store and process vector data and to provide efficient vector retrieval functions.
(2) Embedding technology: high-dimensional data (e.g., text, pictures, audio) is mapped to a low-dimensional space using an embedding technique, i.e., pictures, sound and text are converted into vectors for representation. Embeddings are typically produced by a neural network model.
(3) Vector similarity: a similarity calculation method is the basis of vector retrieval and is used for measuring the similarity between vector data; if two embedding vectors are very similar, their original data sources are likely similar as well.
(4) Vector index: a vector index is a data structure built over vector data according to some mathematical model that is more efficient in time and space. It can efficiently find the vectors most similar to a target vector.
(5) Kubernetes: an open-source system for managing containerized applications across multiple hosts, providing basic mechanisms for the deployment, maintenance and scaling of applications. Kubernetes acts as a container orchestration engine that supports automated deployment, large-scale scalability and containerized application management.
(6) Faiss: an efficient algorithm library for dense-vector similarity search and clustering, open-sourced by Meta in April 2017. Developed in C++, it provides a complete Python/NumPy interface.
(7) Milvus: a cloud-native vector database open-sourced by a Chinese startup in October 2019. Developed in Golang, it has a storage-compute separation architecture, is highly available, high-performance and easily scalable, and is used for real-time recall over massive vector data.
Before addressing the solution of the present application, the currently provided mainstream vector engines are introduced:
Faiss: an efficient algorithm library for dense-vector similarity search and clustering, open-sourced by Meta in April 2017. Developed in C++, it provides a complete Python/NumPy interface.
Milvus: the cloud primary vector database which is open in 10 months in 2019 of the domestic startup company is developed by Golang, has a storage and calculation separation framework, has the characteristics of high availability, high performance and easy expansion, and is used for real-time recall of massive vector data.
Weaves: the netherlands SeMI Technologies open-source cloud native database. Client access in GraphQL, REST and various languages is supported.
pgvector: a PostgreSQL extension (supported on PostgreSQL 9.6 and above) that provides a powerful set of functions for users to efficiently store, query and process vector data.
RediSearch: a plug-in module of Redis that provides memory-based vector data storage, indexing and similarity retrieval capabilities.
However, none of these vector engines provides a matching scheme for different service types, so they cannot meet the actual requirements of various service types.
In view of the above problems, the embodiment of the application provides a service processing method that defines corresponding task types according to the characteristics of different task types, where the attribute information in a task type determines the task scheduling strategy and the task execution logic. In other words, the scheme provides a plug-in architecture in which a user can extend the vector engine in a customized manner, so that the user can realize vector data retrieval based on the designed task-oriented vector engine, solving the problem that the mainstream vector engines provided in the prior art cannot meet the actual requirements of various service types.
Specific implementations of the service processing method of the present application will be described in detail below through a plurality of embodiments.
Fig. 1 is a schematic flow chart of a service processing method according to an embodiment of the present application, and it should be noted that the service processing method according to the present application is not limited to the specific order shown in fig. 1 and described below.
It should be understood that, in other embodiments, the sequence of part of the steps in the service processing method provided in the present application may be interchanged according to actual needs, or part of the steps may be omitted or deleted. As shown in fig. 1, the method includes:
s101, acquiring attribute information of a target vector service to be processed.
For example, vector services may include: image-text retrieval, commodity recommendation, image-text duplicate checking and the like.
The attribute information of a vector service is used to characterize the indices of the vector service and mainly includes: recall accuracy, latency sensitivity, offline task/online service, quality of service class (Quality of Service, QoS for short), data release policy, etc.
It should be appreciated that different vector services place different emphasis on recall accuracy, latency sensitivity, offline/online, QoS, data release policy, etc. For example, the attribute information of image-text retrieval is: high recall accuracy, high latency sensitivity, online, and the like.
Recall accuracy corresponds to the selection of the vector index; latency refers to the query time of a vector and is an index that needs particular consideration when the data volume is large.
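The attribute information described above can be sketched as a simple record. This is only an illustration: the field names, the enum levels and the example values are assumptions for exposition, not a schema defined by the patent.

```python
from dataclasses import dataclass
from enum import Enum

class Level(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@dataclass
class VectorServiceAttributes:
    """Illustrative record of the indices a vector service declares."""
    recall_accuracy: Level      # drives the selection of the vector index
    latency_sensitivity: Level  # query-time requirement
    online: bool                # online service vs. offline task
    qos_level: Level            # quality-of-service class
    data_release_policy: str    # how/when vector data is released

# Example: image-text retrieval, per the text: high recall accuracy,
# high latency sensitivity, online (remaining fields are assumed)
image_text_retrieval = VectorServiceAttributes(
    recall_accuracy=Level.HIGH,
    latency_sensitivity=Level.HIGH,
    online=True,
    qos_level=Level.HIGH,
    data_release_policy="manual",
)
```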
S102, determining a target node in the vector engine according to the attribute information and the predefined data scale.
Wherein the data scale is the maximum amount of vector data that the vector engine can serve. For example, a vector engine may provide retrieval over 1 million pictures, each of which may be embedded into a 768-dimensional vector according to a model; the 1 million pictures thus correspond to 1 million 768-dimensional vectors.
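As a rough illustration of what this data scale implies for the memory a node must supply, assuming each vector component is stored as a 4-byte float32 (a common but here assumed choice):

```python
num_vectors = 1_000_000   # 1 million pictures
dim = 768                 # dimensions per embedding vector
bytes_per_float = 4       # float32

raw_bytes = num_vectors * dim * bytes_per_float
# Raw vector payload only; index structures add further overhead.
print(f"raw vector data: {raw_bytes / 2**30:.2f} GiB")  # about 2.86 GiB
```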
The heterogeneous resources of the Kubernetes multi-cluster mainly comprise nodes of different types, such as GPU, CPU and SSD nodes.
The task-oriented vector engine provided by the scheme is implemented on a Kubernetes multi-cluster architecture, i.e., the Kubernetes clusters run the vector engine. A Kubernetes Zone is a node group of a Kubernetes cluster divided by labels; for example, GPU nodes may be divided into one Zone, and Zones may also be divided by service.
Specifically, in this embodiment, a Zone of the corresponding Kubernetes cluster may be scheduled according to the attribute information (recall accuracy, latency sensitivity, offline/online, QoS, etc.) of the above-defined target vector service and the data scale; the selected Zone is the target node.
For example, if the attribute information of image-text retrieval is high recall accuracy, high latency sensitivity and online, then according to this attribute information and the 1 million 768-dimensional vectors of data, a high-performance GPU Zone of the corresponding Kubernetes cluster that can meet the requirements may be scheduled; i.e., the GPU Zone is a node that can simultaneously satisfy high recall accuracy, high latency sensitivity, online operation and the memory occupied by the data scale.
S103, starting the target node to execute the target vector service through the target node.
In this embodiment, for example, the task-oriented vector engine provided by the present solution is designed based on the open-source Milvus and Faiss. Therefore, before executing the target vector service, the vector retrieval image provided by Milvus or Faiss needs to be pulled, and then the Kubernetes cluster starts the target node based on the vector retrieval image so as to execute the target vector service through the target node, thereby realizing the processing of the target vector service.
The scheme mainly designs a task-oriented vector engine, and a user can rapidly use the vector engine to meet service requirements without becoming mired in operation and maintenance.
In summary, in this embodiment, a user may customize the attribute information of a target vector service according to actual service requirements; a target node in the vector engine is then scheduled according to the defined attribute information and the data scale, and the target node is started to execute the target vector service through it. In other words, this embodiment provides a plug-in architecture in which the user may extend the vector engine in a customized manner, so that the user may implement vector data retrieval based on the designed task-oriented vector engine. The user can thus quickly use the task-oriented vector engine provided in this embodiment to satisfy service requirements without becoming mired in operation and maintenance, thereby solving the problem that the mainstream vector engines provided in the prior art cannot satisfy the actual requirements of various service types.
Optionally, the step S101 includes:
and acquiring index information input by a user, and taking the index information as attribute information of the target vector service.
In one implementation, for example, custom index information entered by a user for the target vector task may be received, where the index information entered by the user may include one or more items.
Illustratively, recall accuracy includes three levels (high, medium and low), and latency sensitivity likewise includes three levels (high, medium and low). For example, if the index information input by the user for commodity recommendation 1 includes "recall accuracy: high", then "recall accuracy: high" may be taken as the attribute information of commodity recommendation 1.
For another example, the index information input by the user for commodity recommendation 2 includes: recall accuracy medium, latency sensitivity low, offline, etc.; then "recall accuracy: medium, latency sensitivity: low, offline" is taken as the attribute information of commodity recommendation 2. In this way, the attribute information of a vector service can be determined according to the user-defined index information input for the vector task, which improves the extensibility of the task-oriented vector engine.
Optionally, referring to fig. 2, the step S101 includes:
s201, acquiring a target task type selected by a user from a plurality of preset task types.
Wherein each task type corresponds to at least one attribute information.
For example, the attribute information of task type 1 includes: high recall accuracy, high latency sensitivity, online and high QoS level; the attribute information of task type 2 includes: high recall accuracy and online. That is, a plurality of attribute information items can be combined to obtain a task type.
S202, taking the attribute information of the target task type as the attribute information of the target vector service.
In general, in practice, different vector services have different characteristics and place different emphasis on indices such as recall accuracy, latency sensitivity, offline/online, QoS and data release policy. Several indices are usually interrelated and also related to the data scale, etc.
Therefore, in the scheme, the attribute information of multiple task types can be pre-configured according to characteristics shared among the indices of different vector services. When the attribute information of image-text retrieval needs to be acquired, a task type matching the target attribute information of image-text retrieval can be selected directly from the configured task types; after comparison, task type 1 is selected as the target task type, and the attribute information of task type 1 is used as the attribute information of image-text retrieval. With this method, a target task type matching the attribute information of the target vector service is selected from the plurality of pre-configured task types, and its attribute information is used as the attribute information of the target vector service, which improves the efficiency of defining different vector services.
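The pre-configured task types and the selection step (S201/S202) can be sketched as a simple lookup. The type names and attribute combinations below are illustrative assumptions, not values specified by the patent.

```python
# Hypothetical preset task types, each a combination of attribute values
TASK_TYPES = {
    "task_type_1": {"recall": "high", "latency": "high", "online": True,  "qos": "high"},
    "task_type_2": {"recall": "high", "latency": "low",  "online": True,  "qos": "low"},
    "task_type_3": {"recall": "medium", "latency": "low", "online": False, "qos": "low"},
}

def attributes_for(target_task_type: str) -> dict:
    """S202: take the attribute information of the user-selected task type
    as the attribute information of the target vector service."""
    return TASK_TYPES[target_task_type]

# Image-text retrieval matches task type 1's attribute combination
attrs = attributes_for("task_type_1")
```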
Optionally, referring to fig. 3, the step S102 includes:
s301, determining at least one node to be selected in the vector engine according to the attribute information, the resource types of the nodes of the vector engine and the resource use state.
The resource usage state includes an idle state and an occupied state. For example, only the GPU nodes in a Kubernetes cluster may be idle, while the SSD nodes and CPU nodes are both occupied.
In this embodiment, the task scheduling strategy and task execution logic may be determined according to the attribute information of the task type. Specifically, the attribute information of the task type is used as a query condition, and at least one candidate node matching it is searched for among the nodes of the vector engine. For example, if the attribute information of the task type is high recall accuracy, high latency sensitivity, online and high QoS level, the resource type of an SSD node supports online image-text retrieval with high recall accuracy and high latency sensitivity, and the resource usage state of the SSD node is idle, then the SSD node can be used as a candidate node.
S302, screening out the target node from the at least one candidate node according to the available resources of each candidate node.
The available resources of a candidate node refer to the node memory, i.e. the remaining memory space in the node.
For example, if the node memory of the SSD node is 200 GB and the memory space occupied by the data scale is 100 GB, i.e. the SSD node can meet the memory requirement of the data scale, then the SSD node can be used as the target node. This ensures that the memory of the target node fully satisfies the service requirement of occupying a large amount of storage resources when the target vector service is executed, avoiding the problem of insufficient memory space.
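The screening step (S302) can be sketched as follows; the node fields and the selection of the first sufficient node are assumptions made for illustration.

```python
def screen_target_node(candidates, required_memory_gb):
    """S302 sketch: pick the first candidate node whose free memory
    covers the memory footprint of the data scale."""
    for node in candidates:
        if node["free_memory_gb"] >= required_memory_gb:
            return node
    return None  # no candidate can hold the data; scheduling fails

candidates = [
    {"name": "ssd-node-1", "free_memory_gb": 64},
    {"name": "ssd-node-2", "free_memory_gb": 200},
]
# Data scale occupies 100 GB, as in the example above
target = screen_target_node(candidates, required_memory_gb=100)
```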
Optionally, the step S301 includes:
traversing each node in the vector engine; for the traversed current node, if the resource type of the current node matches the attribute information of the target vector service and the resource usage state of the current node is idle, the current node is taken as a candidate node.
The matching of the resource type of the current node with the attribute information of the target vector service means that the current node can meet the vector retrieval requirements of the target vector service; i.e., scheduling the current node ensures, with high performance, the stability and reliability of the execution of the target vector service, meets its service requirements, and avoids the situation in which the target vector service cannot be executed.
Specifically, taking task type 1 as an example, its attribute information includes: high recall accuracy, high latency sensitivity, online and high QoS level. Each node of the Kubernetes cluster is traversed according to the attribute information of task type 1; suppose that after traversal the resource types of node 2, node 3 and node 4 are found to match the attribute information of task type 1, where the resource usage states of node 2 and node 3 are idle but that of node 4 is occupied. Node 2 and node 3 of the Kubernetes cluster can therefore both be used as candidate nodes.
It should be noted that if, after traversing all the nodes in the vector engine, no node matches the attribute information of the target vector service, the vector engine does not support processing of the target vector service.
In addition, if the resource type of the current node matches the attribute information of the target vector service but the resource usage state of the current node is occupied, the target vector service remains in a waiting state until the resources of the current node become idle, after which the current node processes the target vector service.
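The traversal described above (S301) can be sketched as follows. For simplicity the "matching" of a node's resource type against the service's attribute information is reduced here to an equality check on an assumed `resource_type` field; the real scheme compares the full attribute combination.

```python
IDLE, OCCUPIED = "idle", "occupied"

def candidate_nodes(nodes, service_attrs):
    """S301 sketch: traverse every node; keep those whose resource type
    matches the service's attribute information and that are idle."""
    selected = []
    for node in nodes:
        matches = node["resource_type"] == service_attrs["resource_type"]
        if matches and node["state"] == IDLE:
            selected.append(node)
    return selected

nodes = [
    {"name": "node-2", "resource_type": "gpu", "state": IDLE},
    {"name": "node-3", "resource_type": "gpu", "state": IDLE},
    {"name": "node-4", "resource_type": "gpu", "state": OCCUPIED},  # skipped
    {"name": "node-5", "resource_type": "cpu", "state": IDLE},      # wrong type
]
matches = candidate_nodes(nodes, {"resource_type": "gpu"})
```

This mirrors the example in the text: node 2 and node 3 become candidates, while the occupied node 4 is skipped.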
Optionally, referring to fig. 4, the step S103 includes:
s401, acquiring a pre-generated vector engine mirror image.
The vector engine image is mainly used for calculating the similarity of feature vectors.
S402, starting the target node based on the vector engine image.
For example, a pre-built vector engine image may be obtained via a pull instruction; the target node is then started from the vector engine image to realize retrieval for the target vector service.
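In a Kubernetes deployment, starting the target node from the pulled image could amount to submitting a pod spec pinned to that node. The manifest below is a hand-built illustration: the pod/image names (including the public Milvus image tag) and the use of `spec.nodeName` for pinning are assumptions, not an API the patent specifies.

```python
def build_pod_manifest(target_node: str, image: str) -> dict:
    """Sketch of a pod spec pinning the vector engine to the target node."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "vector-engine"},
        "spec": {
            "nodeName": target_node,  # schedule directly onto the target node
            "containers": [{
                "name": "vector-engine",
                "image": image,       # e.g. a Milvus or Faiss retrieval image
            }],
        },
    }

manifest = build_pod_manifest("gpu-zone-node-1", "milvusdb/milvus:latest")
```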
Optionally, after performing the step S103, the method further includes:
and releasing the target node after the target node executes the target vector service.
Optionally, after the target node has executed the target vector service, the target node needs to be released; that is, the resource usage state of the target node is updated to idle and part or all of the memory space of the target node is emptied, so as to realize management of the target node's life cycle.
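The release step can be sketched as a small state transition; the node fields are the same illustrative assumptions used above.

```python
def release_node(node: dict) -> None:
    """After the target vector service finishes: mark the node idle
    and reclaim its occupied memory (life-cycle management)."""
    node["state"] = "idle"
    node["used_memory_gb"] = 0  # empty part or all of the occupied memory

node = {"name": "ssd-node-2", "state": "occupied", "used_memory_gb": 100}
release_node(node)  # node is now schedulable for the next vector service
```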
Based on the same inventive concept, the embodiment of the present application further provides a service processing device corresponding to the service processing method, and since the principle of the device in the embodiment of the present application for solving the problem is similar to that of the service processing method in the embodiment of the present application, the implementation of the device may refer to the implementation of the method, and the repetition is omitted.
Fig. 5 is a schematic structural diagram of a service processing apparatus according to an embodiment of the present application, and referring to fig. 5, the apparatus includes:
an obtaining module 501, configured to obtain attribute information of a target vector service to be processed;
a determining module 502, configured to determine a target node in a vector engine according to the attribute information and a predefined data scale; wherein the data scale is the maximum amount of vector data that the vector engine can serve;
a starting module 503, configured to start the target node, so as to execute the target vector service through the target node.
Optionally, the obtaining module 501 is further configured to:
and acquiring index information input by a user, and taking the index information as attribute information of the target vector service.
Optionally, the obtaining module 501 is further configured to:
acquiring a target task type selected by a user from a plurality of preset task types, wherein each task type corresponds to at least one attribute information;
and taking the attribute information of the target task type as the attribute information of the target vector service.
Optionally, the determining module 502 is further configured to:
determining at least one candidate node in the vector engine according to the attribute information and the resource type and resource usage state of each node of the vector engine, the resource usage state including: an idle state and an occupied state;
and screening the target node from the at least one candidate node according to the available resources of each candidate node.
Optionally, the determining module 502 is further configured to:
traversing each node in the vector engine; for the traversed current node, if the resource type of the current node matches the attribute information of the target vector service and the resource usage state of the current node is idle, the current node is taken as a candidate node.
Optionally, the starting module 503 is further configured to:
acquiring a pre-generated vector engine mirror image;
the target node is started based on the vector engine image.
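If the pre-generated vector engine image is a container image (one plausible reading of "mirror image"), starting the target node could amount to launching a container from that image. The image tag and container naming below are assumptions, not details taken from the patent.

```python
import subprocess

IMAGE = "vector-engine:latest"   # hypothetical pre-generated image tag

def build_start_command(node_name, image=IMAGE):
    # Launch a detached container named after the target node.
    return ["docker", "run", "-d", "--name", node_name, image]

def start_target_node(node_name):
    # Start the target node based on the vector engine image.
    subprocess.run(build_start_command(node_name), check=True)
```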
Optionally, the apparatus further comprises:
and the releasing module is used for releasing the target node after the target node executes the target vector service.
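The acquire/execute/release lifecycle that the releasing module completes can be sketched with a try/finally block, so the node returns to the pool even if the service raises; the pool class and its fields are illustrative assumptions.

```python
class NodePool:
    def __init__(self, names):
        self.idle = set(names)
        self.busy = set()

    def acquire(self):
        node = self.idle.pop()       # raises KeyError if no idle node
        self.busy.add(node)
        return node

    def release(self, node):
        # Releasing module: return the node once the service has run.
        self.busy.discard(node)
        self.idle.add(node)

def execute_service(pool, service):
    node = pool.acquire()
    try:
        return service(node)         # execute the target vector service
    finally:
        pool.release(node)           # release the target node afterwards
```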
The foregoing apparatus is used for executing the method provided in the foregoing embodiment, and its implementation principle and technical effects are similar, and are not described herein again.
The above modules may be one or more integrated circuits configured to implement the above methods, for example: one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), or the like. For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU), or another processor capable of invoking the program code. For another example, the modules may be integrated together and implemented in the form of a System on Chip (SoC).
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device includes: a processor 601, a storage medium 602, and a bus 603, the storage medium 602 storing machine-readable instructions executable by the processor 601, the processor 601 in communication with the storage medium 602 via the bus 603 when the electronic device is running, the processor 601 executing the machine-readable instructions to perform the steps of:
acquiring attribute information of a target vector service to be processed;
determining a target node in a vector engine according to the attribute information and a predefined data scale; wherein the data scale is the maximum size of vector data that the vector engine can provide;
and starting the target node to execute the target vector service through the target node.
Optionally, the processor 601 is configured to, when executing the obtaining attribute information of the target vector service to be processed, specifically:
and acquiring index information input by a user, and taking the index information as attribute information of the target vector service.
Optionally, the processor 601 is configured to, when executing the obtaining attribute information of the target vector service to be processed, specifically:
acquiring a target task type selected by a user from a plurality of preset task types, wherein each task type corresponds to at least one attribute information;
and taking the attribute information of the target task type as the attribute information of the target vector service.
Optionally, the processor 601 is configured to determine the target node in the vector engine according to the attribute information and the predefined data size, specifically configured to:
according to the attribute information, the resource type and the resource use state of each node of the vector engine, at least one node to be selected in the vector engine is determined, and the resource use state comprises: an idle state, an occupied state;
and screening the target node from the at least one node to be selected according to the available resources of each node to be selected.
Optionally, the processor 601 determines at least one node to be selected in the vector engine according to the attribute information and the resource type and the resource usage state of each node of the vector engine, and specifically is configured to:
traversing each node in the vector engine, and regarding the traversed current node, if the resource type of the current node is matched with the attribute information of the target vector service and the resource use state of the current node is in an idle state, taking the current node as a node to be selected.
Optionally, the processor 601 is configured to, when executing the enabling the target node, specifically:
acquiring a pre-generated vector engine mirror image;
the target node is started based on the vector engine image.
Optionally, after executing the enabling the target node to execute the target vector service by the target node, the processor 601 is further configured to:
and releasing the target node after the target node executes the target vector service.
Optionally, the present invention also provides a program product, such as a computer readable storage medium, comprising a program which when executed by a processor is adapted to perform the steps of:
acquiring attribute information of a target vector service to be processed;
determining a target node in a vector engine according to the attribute information and a predefined data scale; wherein the data scale is the maximum size of vector data that the vector engine can provide;
and starting the target node to execute the target vector service through the target node.
Optionally, the processor is configured to, when executing the obtaining attribute information of the target vector service to be processed, specifically:
and acquiring index information input by a user, and taking the index information as attribute information of the target vector service.
Optionally, the processor is configured to, when executing the obtaining attribute information of the target vector service to be processed, specifically:
acquiring a target task type selected by a user from a plurality of preset task types, wherein each task type corresponds to at least one attribute information;
and taking the attribute information of the target task type as the attribute information of the target vector service.
Optionally, the processor is configured to determine the target node in the vector engine according to the attribute information and the predefined data size, specifically for:
according to the attribute information, the resource type and the resource use state of each node of the vector engine, at least one node to be selected in the vector engine is determined, and the resource use state comprises: an idle state, an occupied state;
and screening the target node from the at least one node to be selected according to the available resources of each node to be selected.
Optionally, the processor is configured to determine at least one node to be selected in the vector engine according to the attribute information and the resource type and the resource usage state of each node of the vector engine, and specifically configured to:
traversing each node in the vector engine, and regarding the traversed current node, if the resource type of the current node is matched with the attribute information of the target vector service and the resource use state of the current node is in an idle state, taking the current node as a node to be selected.
Optionally, the processor is configured to, when executing the starting the target node, specifically:
acquiring a pre-generated vector engine mirror image;
the target node is started based on the vector engine image.
Optionally, the processor is further configured to, after executing the enabling the target node to execute the target vector service by the target node:
release the target node after the target node executes the target vector service.
in the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.
The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. The software functional units are stored in a storage medium and include several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods according to the embodiments of the invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.

Claims (10)

1. A method of service processing, the method comprising:
acquiring attribute information of a target vector service to be processed;
determining a target node in a vector engine according to the attribute information and a predefined data scale; wherein the data scale is the maximum size of vector data that the vector engine can provide;
and starting the target node to execute the target vector service through the target node.
2. The method according to claim 1, wherein the obtaining attribute information of the target vector service to be processed includes:
and acquiring index information input by a user, and taking the index information as attribute information of the target vector service.
3. The method according to claim 1, wherein the obtaining attribute information of the target vector service to be processed includes:
acquiring a target task type selected by a user from a plurality of preset task types, wherein each task type corresponds to at least one attribute information;
and taking the attribute information of the target task type as the attribute information of the target vector service.
4. The method of claim 1, wherein determining the target node in the vector engine based on the attribute information and a predefined data size comprises:
according to the attribute information, the resource type and the resource use state of each node of the vector engine, at least one node to be selected in the vector engine is determined, and the resource use state comprises: an idle state, an occupied state;
and screening the target node from the at least one node to be selected according to the available resources of each node to be selected.
5. The method of claim 4, wherein determining at least one candidate node in the vector engine according to the attribute information and the resource type and the resource usage status of each node of the vector engine comprises:
traversing each node in the vector engine, and regarding the traversed current node, if the resource type of the current node is matched with the attribute information of the target vector service and the resource use state of the current node is in an idle state, taking the current node as a node to be selected.
6. The method of claim 1, wherein the initiating the target node comprises:
acquiring a pre-generated vector engine mirror image;
the target node is started based on the vector engine image.
7. The method of claim 1, wherein after said initiating said target node to perform said target vector traffic by said target node, further comprising:
and releasing the target node after the target node executes the target vector service.
8. A service processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring attribute information of the target vector service to be processed;
the determining module is used for determining a target node in the vector engine according to the attribute information and the predefined data scale; wherein the data scale is the maximum size of vector data that the vector engine can provide;
and the starting module is used for starting the target node so as to execute the target vector service through the target node.
9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the method of any one of claims 1-7.
10. A computer readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when executed by a processor, performs the method according to any of claims 1-7.
CN202311729520.1A 2023-12-14 2023-12-14 Service processing method, device, electronic equipment and storage medium Pending CN117725271A (en)

Publications (1)

Publication Number CN117725271A, Publication Date 2024-03-19



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination