CN113791904A - Method, apparatus, device and readable storage medium for processing query input - Google Patents


Info

Publication number
CN113791904A
CN113791904A (application CN202111067443.9A)
Authority
CN
China
Prior art keywords: indication, servers, server, query input, value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111067443.9A
Other languages
Chinese (zh)
Other versions
CN113791904B (en)
Inventor
甄真 (Zhen Zhen)
尹劲草 (Yin Jincao)
陈佳捷 (Chen Jiajie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111067443.9A priority Critical patent/CN113791904B/en
Publication of CN113791904A publication Critical patent/CN113791904A/en
Application granted granted Critical
Publication of CN113791904B publication Critical patent/CN113791904B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU], to service a request
    • G06F9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method, an apparatus, a device and a readable storage medium for processing query input, relating to the technical field of data processing, and in particular to the fields of intelligent search and deep learning. The specific implementation scheme is as follows: determining a first indication indicating the computational load of a current query input; obtaining a second indication indicating the performance of a plurality of servers available to process query input and a third indication indicating whether the plurality of servers have been assigned a query input; and selecting, based on the first indication, the second indication and the third indication, a target server that has not been assigned a query input from the plurality of servers to process the current query input. By this approach, query latency is reduced and user experience improved; the approach also generalizes across search engines and tolerates server migration.

Description

Method, apparatus, device and readable storage medium for processing query input
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular to a method, an apparatus, a device, and a readable storage medium for processing query input in the fields of intelligent search and deep learning.
Background
With the development of science and technology, the amount of knowledge and information is growing rapidly. To help users quickly obtain the information they need, search engines are commonly used. As the volume of information to be processed grows, search engines also improve rapidly in order to deliver a variety of information to users in a timely manner. The search engines in common use today include Google's and Baidu's. These conventional search engines retrieve the relevant information for the user from a vast pool of resources. Although search technology is widely deployed, many problems remain to be solved when searching for data or information.
Disclosure of Invention
The present disclosure provides a method, apparatus, device, and readable storage medium for processing query input.
According to an aspect of the present disclosure, a method of processing query input is provided. The method includes determining a first indication indicating the computational load of a current query input; obtaining a second indication indicating the performance of a plurality of servers available to process query input and a third indication indicating whether the plurality of servers have been assigned a query input; and selecting, based on the first indication, the second indication and the third indication, a target server that has not been assigned a query input from the plurality of servers to process the current query input.
According to a second aspect of the present disclosure, an apparatus for processing query input is provided. The apparatus includes a first indication determination module configured to determine a first indication indicating the computational load of a current query input; an indication obtaining module configured to obtain a second indication indicating the performance of a plurality of servers available to process query input and a third indication indicating whether the plurality of servers have been assigned a query input; and a query input processing module configured to select, based on the first indication, the second indication, and the third indication, a target server that has not been assigned a query input from the plurality of servers to process the current query input.
According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to the first aspect of the disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method according to the first aspect of the present disclosure.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method according to the first aspect of the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 illustrates a schematic diagram of an environment 100 in which embodiments of the present disclosure can be implemented;
FIG. 2 illustrates a flow diagram of a method 200 for processing query input in accordance with some embodiments of the present disclosure;
FIG. 3A illustrates a schematic diagram of an example 300A of query input latency versus query input computation volume, in accordance with some embodiments of the present disclosure;
FIG. 3B illustrates a schematic diagram of an example 300B of query input latency versus query input computation load, in accordance with some embodiments of the present disclosure;
FIG. 4 shows a schematic diagram of a process 400 of query input assignment, in accordance with some embodiments of the present disclosure;
FIG. 5 illustrates a schematic diagram of an example system 500 for processing query inputs, in accordance with some embodiments of the present disclosure;
FIG. 6 illustrates a schematic diagram of an example system 600 for processing query inputs, in accordance with some embodiments of the present disclosure;
FIG. 7 illustrates a block diagram of an apparatus 700 for processing query input in accordance with some embodiments of the present disclosure; and
FIG. 8 illustrates a block diagram of a device 800 capable of implementing multiple embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In describing embodiments of the present disclosure, the term "include" and its derivatives should be interpreted as open-ended, i.e., "including but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first," "second," and the like may refer to different or the same objects. Other explicit and implicit definitions may also be included below.
In a search engine, query inputs suffer from a latency long tail: a small fraction of queries have a much longer response time than the rest. This degrades the user experience and can drive users away. Optimizing the latency long tail of query inputs is therefore very helpful for improving the user experience.
There are many methods for optimizing the latency long tail of query inputs, and they fall into three major categories. The first is performance optimization: reducing the cost of each query input through general optimization means, such as increasing CPU parallelism, using more advanced hardware, aggressively truncating recall results, or querying a partial rather than a full index. The second consists of after-the-fact (a posteriori) remedies that try to keep long-tail cases from surfacing. The third is load balancing.
Load balancing methods can be further subdivided into round-robin algorithms, random load balancing algorithms, weight scheduling algorithms based on back-end latency statistics, and static weight scheduling algorithms. Round-robin and random load balancing do scatter the traffic, but a computationally heavy query input may still be scheduled onto a poorly performing server, producing a long tail; they therefore cannot be fully optimized. Weight scheduling based on back-end latency statistics is only suitable for high-throughput links, where the interval between two successive interactions of a specific client with a specific server is short; it does not apply to links where both the client and the server have a very large number of replicas, so it lacks generality. Static weight scheduling is distorted under a cloud-native architecture, where server instances may be migrated; it does not tolerate server migration.
To address at least the above issues, an improved scheme for processing query input is proposed according to embodiments of the present disclosure. In this scheme, a computing device determines a first indication indicating the computational load of a current query input, and obtains a second indication indicating the performance of a plurality of servers available to process query input and a third indication indicating whether the plurality of servers have been assigned a query input. The computing device then selects, based on the first indication, the second indication, and the third indication, a target server that has not been assigned a query input from the plurality of servers to process the current query input. By this approach, query latency is reduced and user experience improved; the approach also generalizes across search engines and tolerates server migration.
Fig. 1 illustrates a schematic diagram of an environment 100 in which various embodiments of the present disclosure can be implemented. The example environment 100 includes a computing device 104. Also included in the example environment 100 are a plurality of servers 110-1, 110-2, …, 110-N, where N is a positive integer; for ease of description they are also referred to collectively as servers 110.
The computing device 104 is operable to receive the current query input 102 and determine to which server 110 to send the current query input 102. Computing devices 104 include, but are not limited to, personal computers, server computers, hand-held or laptop devices, mobile devices (such as mobile phones, Personal Digital Assistants (PDAs), media players, and the like), multiprocessor systems, consumer electronics, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The computing device 104 utilizes a computation prediction model to predict the computational load of the current query input. The computing device 104 also obtains performance indications 106 (also referred to as second indications for ease of description) for the plurality of servers available to process query input. In some embodiments, the performance indication for a server 110 is determined by collecting the CPU model of the machine on which the server is located. In some embodiments, it is determined by collecting the number of cores in that machine's CPU. The above examples illustrate the present disclosure and do not limit it; the skilled person may determine server performance in any suitable way as required.
The computing device 104 also obtains a status indication 108 (also referred to as a third indication for ease of description) of whether each server 110 has been assigned a query input. Alternatively or additionally, the status indication is tracked per cycle. "One cycle" in this disclosure refers to the time in which one query input is assigned to each of all the servers available to process query input; a server is assigned only one query input during a cycle. In some embodiments, the status indication is a bit vector in which each bit corresponds to a server: when the bit is 0, the server has not been assigned a query input; when the bit is 1, it has. In some embodiments, other information may be used to distinguish whether a server has been assigned a query input. The above examples illustrate the present disclosure and do not limit it.
Based on the computational load of the query input, the performance of the servers, and whether each server has already been assigned a query task, the computing device 104 selects from the servers 110 a server that has not yet been assigned a query input to process the query input 102.
The server 110 may be a physical server device or may be a virtual server deployed on a physical computing device. The server 110 is used to process query inputs received from the computing device 104. The processing results are then returned to the user who initiated the query input.
By this approach, query latency is reduced and user experience improved; the approach also generalizes across search engines and tolerates server migration.
An environment 100 in which various embodiments of the present disclosure can be implemented is described above in connection with FIG. 1. A flow diagram of a method 200 for processing query inputs in accordance with some embodiments of the present disclosure is described below in conjunction with fig. 2. Method 200 in fig. 2 may be performed by computing device 104 in fig. 1 or any suitable computing device.
At block 202, a first indication indicating the computational load of a current query input is determined. For example, the computing device 104 in FIG. 1 determines a first indication indicating the computational load of the current query input 102.
In some embodiments, the computing device 104 receives the current query input. The computing device then applies the current query input to the computational prediction model to obtain a first indication. The computational prediction model is derived based on historical query inputs and corresponding delays. In this way, the computational burden of the current query input can be quickly determined.
In some embodiments, the computation prediction model is trained using a query semantic model together with query inputs and their corresponding delay durations. The query semantic model may be Baidu's ERNIE model.
In some embodiments, historical query inputs and their corresponding delays may be obtained, and the computational load derived from those delays. For example, the delay itself may be used as the computation indication, or the maximum delay may be chosen as a reference and the ratio of each delay to that reference used as the calculated quantity indication value. When the current query input is received, the delay or calculated quantity indication value of the matching historical query input is looked up, or, failing that, determined from similar historical queries. The above examples illustrate the present disclosure and do not limit it.
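For illustration only, a minimal Python sketch of the ratio heuristic just described follows; the in-memory history table, the fallback default, and all names are assumptions for exposition, not part of the disclosed implementation.

```python
from typing import Dict

def build_indication_table(history: Dict[str, float]) -> Dict[str, float]:
    """Normalize each historical delay by the maximum delay, giving a
    calculated quantity indication value in [0, 1] per query."""
    max_delay = max(history.values())
    return {query: delay / max_delay for query, delay in history.items()}

def indication_for(query: str, table: Dict[str, float],
                   default: float = 0.5) -> float:
    """Look up a query's indication value; the neutral default stands in
    for the 'similar historical query' step, which is not specified here."""
    return table.get(query, default)

history = {"weather beijing": 12.0, "deep learning survey": 96.0}
table = build_indication_table(history)
print(indication_for("weather beijing", table))   # 0.125
print(indication_for("unseen query", table))      # 0.5 (fallback)
```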
At block 204, a second indication indicating performance of a plurality of servers available to process the query input and a third indication indicating whether the plurality of servers have been assigned the query input are obtained. For example, the computing device in FIG. 1 obtains a second indication indicating performance of a plurality of servers available to process the query input and a third indication indicating whether the plurality of servers have been assigned the query input.
In some embodiments, the computing device 104 obtains resource configuration information for each of a plurality of servers. The computing device then determines a second indication of performance for the plurality of servers based on the resource configuration information. By the method, the performance of the server can be determined quickly.
In some embodiments, the computing device 104 collects information about the server's central processing unit (CPU) and determines the server's performance from the CPU model, for example by assigning each CPU model a score. The score is derived from the delay measured when the same benchmark query input is run on the machines hosting the different servers. This scoring may be performed periodically.
In some embodiments, the computing device 104 collects information about the server's CPU and determines the server's performance from the number of CPU cores: the higher the core count, the higher the performance. The above examples illustrate the present disclosure and do not limit it.
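As a hedged illustration of the two performance signals above, the following sketch combines a score table keyed by CPU model, as would be derived offline from benchmark-query latency, with a core-count fallback; the table contents and the normalization cap are invented for exposition.

```python
# Invented score table: values would come from running the same benchmark
# query on each machine type and comparing delays (see the text above).
CPU_MODEL_SCORES = {
    "cpu-model-a": 1.0,   # fastest benchmark latency observed
    "cpu-model-b": 0.6,
}

def performance_indication(cpu_model: str, core_count: int) -> float:
    """Prefer the benchmarked per-model score; otherwise fall back to a
    core-count heuristic (more cores, higher performance), capped at 1.0.
    The 64-core normalization cap is an assumption."""
    if cpu_model in CPU_MODEL_SCORES:
        return CPU_MODEL_SCORES[cpu_model]
    return min(core_count / 64.0, 1.0)

print(performance_indication("cpu-model-a", 48))  # 1.0
print(performance_indication("unknown-cpu", 32))  # 0.5
```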
In some embodiments, the computing device 104 also collects state information for the servers. Each of the plurality of servers that process query inputs has a corresponding indicator value indicating whether it has been assigned a query input. In one example, the indicator value is set to 1 if a query input has been assigned and to 0 otherwise. In another example, other identifying information indicates whether a server has been assigned a query input. The above examples illustrate the present disclosure and do not limit it.
At block 206, a target server for processing the current query input is determined from the plurality of servers based on the first indication, the second indication, and the third indication, the target server not having been assigned a previous query input.
In some embodiments, the computing device 104 determines the first indication as a calculated quantity indication value for the current query input. The computing device 104 then uses the third indication to determine, from the plurality of servers, a set of servers that have not been assigned a query input. Next, the computing device 104 determines a performance indication value for each server in the set based on the second indication. Finally, a target server is determined from the set of servers based on the calculated quantity indication value and the performance indication value. By this method, the server for processing the query input can be determined rapidly.
In some embodiments, the computing device 104 determines how close the combination of the calculated quantity indication value and the performance indication value is to a predetermined value; for example, the proximity of their sum to the predetermined value is determined. A target server is then determined from the set of servers based on this proximity. In this manner, a server suited to processing the query input may be selected.
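One plausible reading of this proximity rule is sketched below. For the sum-close-to-target rule to pair heavy queries with fast machines, the per-server value must grow with slowness, so the sketch uses a normalized benchmark latency (larger meaning slower); both that orientation and the target value 1.0 are assumptions for illustration, not the disclosed formula.

```python
from typing import Dict

def pick_by_proximity(computation: float,
                      latency_score: Dict[str, float],
                      target: float = 1.0) -> str:
    """Return the server id minimizing |computation + latency_score - target|."""
    return min(latency_score,
               key=lambda sid: abs(computation + latency_score[sid] - target))

# latency_score in [0, 1]; larger = slower machine (assumption, see lead-in)
unassigned = {"fast-server": 0.1, "slow-server": 0.8}
print(pick_by_proximity(0.9, unassigned))  # fast-server: heavy query
print(pick_by_proximity(0.1, unassigned))  # slow-server: light query
```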
In some embodiments, the computing device 104 determines a performance indication value for each of the plurality of servers from the second indication. The computing device 104 then compares the performance indication value with a first threshold to divide the plurality of servers into a first server subgroup and a second server subgroup, the performance of the servers in the first server subgroup being better than the performance of the servers in the second server subgroup. The computing device 104 also determines a calculated quantity indication value for the current query input and compares it with a second threshold. Finally, based on the comparison result, a server that has not been assigned a query task is selected from the first or second server subgroup as the target server. By this method, the target server can be determined quickly.
In some embodiments, if the calculated quantity indication value is less than or equal to the second threshold, a server that has not been assigned a query task is selected from the second server subgroup as the target server; if the calculated quantity indication value is greater than the second threshold, such a server is selected from the first server subgroup. In this way, a server suited to the query input's cost can be found. Alternatively or additionally, if no unassigned server is found in the designated subgroup, an available server is looked up in the other subgroup as the target server.
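The two-threshold selection, including the fallback to the other subgroup, might look as follows; the threshold values and the fallback order are assumptions consistent with the text above, not the exact disclosed procedure.

```python
from typing import Dict, Optional, Set

def select_target(computation: float,
                  performance: Dict[str, float],
                  assigned: Set[str],
                  perf_threshold: float = 0.5,
                  comp_threshold: float = 0.5) -> Optional[str]:
    """Higher performance scores mean faster servers (assumption)."""
    fast = [sid for sid, p in performance.items() if p > perf_threshold]
    slow = [sid for sid, p in performance.items() if p <= perf_threshold]
    # Heavy queries prefer the fast subgroup; light queries the slow one.
    preferred, other = (fast, slow) if computation > comp_threshold else (slow, fast)
    for group in (preferred, other):   # fall back if the preferred group is full
        for sid in group:
            if sid not in assigned:
                return sid
    return None                        # every server already holds a query

servers = {"a": 0.9, "b": 0.2}
print(select_target(0.8, servers, assigned=set()))   # 'a': heavy -> fast
print(select_target(0.1, servers, assigned={"b"}))   # 'a': slow group is taken
```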
In some embodiments, after the target server is assigned the current query input, the indication value corresponding to the target server in the third indication is adjusted to mark the target server as having been assigned a query input. In this way, whether a server has been assigned a query input can be accurately tracked.
In some embodiments, the computing device 104 determines whether the third indication indicates that every one of the plurality of servers has been assigned a query task. If so, all servers have received a query input in the current cycle. The third indication is then reset to indicate that none of the servers has been assigned a query task, so that all servers accept the next cycle of query input assignments from the beginning. In this way, subsequent query inputs remain balanced.
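A minimal sketch of this per-cycle bookkeeping is given below: one flag per server, set on assignment and cleared for all servers once every server has received a query input; the class name and layout are illustrative assumptions.

```python
class CycleTracker:
    """One flag per server; a cycle ends when every server has been
    assigned a query input, after which all flags are cleared."""

    def __init__(self, server_ids):
        self.assigned = {sid: False for sid in server_ids}

    def unassigned(self):
        return [sid for sid, done in self.assigned.items() if not done]

    def mark(self, server_id):
        self.assigned[server_id] = True
        if all(self.assigned.values()):        # cycle complete:
            for sid in self.assigned:          # reset so the next cycle
                self.assigned[sid] = False     # starts from the beginning

tracker = CycleTracker(["server-a", "server-b"])
tracker.mark("server-a")
print(tracker.unassigned())   # ['server-b']
tracker.mark("server-b")
print(tracker.unassigned())   # ['server-a', 'server-b'] -- a new cycle
```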
By this approach, query latency is reduced and user experience improved; the approach also generalizes across search engines and tolerates server migration.
A flow diagram of a method 200 for processing query input, in accordance with some embodiments of the present disclosure, was described above in connection with FIG. 2. The principle of assigning query inputs is described below in conjunction with FIGS. 3A and 3B, where FIG. 3A illustrates a schematic diagram of an example 300A of query latency versus query computational load, and FIG. 3B illustrates a schematic diagram of an example 300B of query latency versus query computational load, in accordance with some embodiments of the present disclosure.
The left graph in FIG. 3A classifies the total query input latency along two dimensions: the computational load of the query input and the performance of the machine hosting the server. If the query input's computational load is small and the machine hosting the server that processes it performs well, the delay is short. If the load is large and the machine performs poorly, the delay is long. If the load is small and the machine performs poorly, or the load is large and the machine performs well, the delay is moderate.
Under scattered scheduling of query inputs, the four combinations occur in consistent proportions. The right graph in FIG. 3A shows the resulting distribution of total query input latency over a period of time.
For long-tail optimization, the latency long tail caused by hot-spot instances is the more serious problem, so scattered scheduling of query inputs is a precondition that must be preserved. Without breaking that scattered scheduling, if computationally heavy query inputs are scheduled onto well-performing servers as far as possible, and light ones onto poorly performing servers, the overall latency distribution becomes more uniform and the long tail shrinks. As shown in FIG. 3B, after the adjustment depicted in the left graph, the latency of the query inputs lies largely in the medium-delay band.
The distribution of query inputs over servers is described above in connection with FIGS. 3A and 3B. A process 400 of query input assignment is described below in conjunction with FIG. 4. As shown in FIG. 4, there are N servers that process query inputs, N being a positive integer. When the first N query inputs arrive, they are distributed across the N servers, one query input per server, at which point the first cycle ends. The second cycle then begins, and ends after each server has again been assigned one query input. If the query input now being accepted is the m-th, it lies in cycle ⌈m/N⌉. Consequently, only the status indication of whether each server has been assigned a query input in the current cycle needs to be tracked.
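The cycle arithmetic just described can be stated compactly; the helper below is an illustrative assumption, with queries 1 through N falling in cycle 1, N+1 through 2N in cycle 2, and so on.

```python
import math

def cycle_of(m: int, n_servers: int) -> int:
    """Cycle index of the m-th query among n_servers servers (1-based)."""
    return math.ceil(m / n_servers)

print(cycle_of(3, 3))   # 1: queries 1..3 form the first cycle
print(cycle_of(7, 3))   # 3: queries 7..9 form the third cycle
```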
A schematic diagram of an example system 500 for processing query inputs in accordance with some embodiments of the present disclosure is described below in conjunction with FIG. 5. As shown in FIG. 5, a query input is first received at block 502. A computation metric is then evaluated at block 504 using a metric function S(). At block 506, a score for the query input is obtained. The input of S() is the query input, and its output is the computational load the query input requires: the score s = S(query input), with values in the interval [0, 1]; the larger the value, the greater the computational load. The function S() can be implemented with Baidu's open-source query semantic model ERNIE. The model's training samples are the actual delays of the respective query inputs on the different servers, and this training is also performed periodically and routinely.
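Since the exact ERNIE fine-tuning interface is not given in this document, the following sketch stands in a generic text regressor for S(), trained on normalized (query input, delay) pairs as described; every name and data value is an illustrative assumption, not the patented implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

queries = ["weather beijing today", "survey of deep learning for ranking"]
delays = [12.0, 96.0]                          # measured delays (assumed, ms)
scores = [d / max(delays) for d in delays]     # normalize into [0, 1]

vectorizer = TfidfVectorizer()
model = Ridge().fit(vectorizer.fit_transform(queries), scores)

def S(query: str) -> float:
    """Predict the computation score for a query, clamped to [0, 1]."""
    raw = float(model.predict(vectorizer.transform([query]))[0])
    return min(max(raw, 0.0), 1.0)

print(S("weather shanghai today"))
```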
At block 510, a performance metric function P() maps the server instances 508-1, …, 508-i, …, 508-N to corresponding performance indications 512-1, …, 512-i, …, 512-N. The decision module 518 obtains, in addition to the performance indications and the query input's computation score, the status indications 514-1, …, 514-i, …, 514-N of the server instances, where d = 0 indicates that no query input has been assigned and d = 1 that one has. The decision module then determines that the query input is assigned to server instance i. After the assignment, among the server instances 516-1, …, 516-i, …, 516-N, the d of server instance 516-i is adjusted to 1.
By this approach, query latency is reduced and user experience improved; the approach also generalizes across search engines and tolerates server migration.
FIG. 6 illustrates a schematic diagram of an example system 600 for processing query inputs, in accordance with some embodiments of the present disclosure. As shown in FIG. 6, system 600 has an instance scoring service 604, a log storage service 618, a model training service 616, a model service 612, and a traffic scheduling framework 608.
The instance scoring service 604 collects, on an hourly or daily basis, the CPU models of the machines on which the respective server instances 602-1, 602-2, …, 602-N are located and performs performance scoring; how many points each model receives may be specified by predetermined rules. The scoring results are pushed as a dictionary 606 into the traffic scheduling framework. This service implements the performance metric function P().
The server instances 602-1, 602-2, …, 602-N push logs to the log storage service 618 on an hourly basis. The processing latency of each query is recorded in the log. These latency records measure the server instance's own processing time and exclude network latencies of any kind, so they reflect the computational load the query input imposes on that server. The set of (query input, latency) pairs in these logs forms the training set 620 for the model.
The model training service 616 may train a Baidu ERNIE model by fetching the complete set of (query input, latency) pairs from the log storage service on a daily or weekly basis, and then pushes the trained model to the model service.
The model service 612 receives query input from a client instance 610, such as the computing device 104, makes a computation prediction using the ERNIE model 614, and returns the predicted score to the client instance 610. This service implements the computation metric function S().
The traffic scheduling framework 608 obtains the score of the current query from the client instance 610, tracks the assignment state of each server instance in the current cycle, and dispatches the query input to the target server instance according to the decision function F().
By this approach, query latency is reduced and user experience improved; the approach also generalizes across search engines and tolerates server migration.
FIG. 7 shows a schematic block diagram of an apparatus 700 for processing query input in accordance with an embodiment of the present disclosure. As shown in FIG. 7, the apparatus 700 includes a first indication determination module 702 configured to determine a first indication indicating the computational load of a current query input; an indication obtaining module 704 configured to obtain a second indication indicating the performance of a plurality of servers available to process query input and a third indication indicating whether the plurality of servers have been assigned a query input; and a query input processing module 706 configured to determine, based on the first indication, the second indication, and the third indication, a target server from the plurality of servers for processing the current query input, the target server not having been assigned a previous query input.
In some embodiments, wherein the first indication determining module 702 comprises: a receiving module configured to receive a current query input; and an application module configured to apply the current query input to a computational prediction model to obtain a first indication, the computational prediction model being derived based on historical query inputs and corresponding delays.
In some embodiments, wherein the indication obtaining module 704 comprises: a resource configuration information acquisition module configured to acquire resource configuration information of each of a plurality of servers; and a second indication determination module configured to determine a second indication of performance for the plurality of servers based on the resource configuration information.
In some embodiments, wherein the query input processing module comprises: a first calculated amount indicating value determining module configured to determine a first indication as a calculated amount indicating value for a current query input; a set of servers determination module configured to determine, based on the third indication, a set of servers from the plurality of servers to which no query input is assigned; a first performance indicator value determination module configured to determine a performance indicator value for each server in a set of servers based on the second indication; and a target server determination module configured to determine a target server from a set of servers based on the calculated quantity indication value and the performance indication value.
In some embodiments, the target server determination module comprises: a proximity determination module configured to determine a proximity of both the calculated quantity indicator value and the performance indicator value to a predetermined value; and a proximity-based server determination module configured to determine a target server from a set of servers based on the proximity.
In some embodiments, the query input processing module 706 comprises: a second performance indicator value determination module configured to determine a performance indication value for each of the plurality of servers based on the second indication; a first comparison module configured to compare the performance indication value with a first threshold to divide the plurality of servers into a first server subgroup and a second server subgroup, the performance of the servers in the first server subgroup being better than the performance of the servers in the second server subgroup; a second calculated quantity indication value determination module configured to determine the first indication as a calculated quantity indication value for the current query input; a second comparison module configured to compare the calculated quantity indication value with a second threshold; and a result processing module configured to select, based on the result of the comparison, a server that has not been assigned a query task from the first server subgroup or the second server subgroup as the target server.
In some embodiments, wherein the result processing module comprises: a first selection module configured to select a server to which the query task is not allocated as a target server from the second server subgroup if it is determined that the calculation amount indicating value is equal to or less than a second threshold value; and a second selection module configured to select a server to which the query task is not allocated as a target server from the first server subgroup if it is determined that the calculation amount indicating value is greater than the second threshold value.
In some embodiments, the apparatus 700 further comprises: a first adjustment module configured to adjust an indication value in the third indication corresponding to the target server if it is determined that the target server is assigned the current query input.
In some embodiments, the apparatus 700 further comprises: an assigned query task determination module configured to determine whether the third indication indicates that each of the plurality of servers has been assigned a query task; and a second adjustment module configured to adjust the third indication to indicate that the plurality of servers are not assigned the query task if it is determined that each of the plurality of servers has been assigned the query task.
In the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of the users involved all comply with the provisions of relevant laws and regulations, and do not violate public order or good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. This example electronic device 800 may be used to implement the computing device 104 in FIG. 1. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant as examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
Computing unit 801 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 performs the various methods and processes described above, such as the method 200. For example, in some embodiments, the method 200 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto device 800 via ROM 802 and/or communications unit 809. When loaded into RAM 803 and executed by computing unit 801, may perform one or more of the steps of method 200 described above. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the method 200 in any other suitable manner (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (21)

1. A method for processing query input, comprising:
determining a first indication indicating a computational load of a current query input;
obtaining a second indication indicating performance of a plurality of servers available to process query input and a third indication indicating whether the plurality of servers have been assigned previous query input; and
determining, based on the first indication, the second indication, and the third indication, a target server from the plurality of servers for processing the current query input, the target server not being assigned the previous query input.
2. The method of claim 1, wherein determining the first indication comprises:
receiving the current query input; and
applying the current query input to a computational prediction model to obtain the first indication, the computational prediction model being derived based on historical query inputs and corresponding delays.
3. The method of claim 1, wherein obtaining the second indication and the third indication comprises:
acquiring resource configuration information of each server in the plurality of servers; and
determining a second indication of performance for the plurality of servers based on the resource configuration information.
4. The method of claim 1, wherein determining the target server comprises:
determining a first indication as a calculated quantity indication value for the current query input;
determining, based on the third indication, a set of servers from the plurality of servers that are not assigned the previous query input;
determining, based on the second indication, a performance indication value for each server in the set of servers; and
determining the target server from the set of servers based on the calculated quantity indicator value and the performance indicator value.
5. The method of claim 4, wherein determining the target server from the set of servers comprises:
determining a proximity of both the calculated amount indicator value and the performance indicator value to a predetermined value; and
determining the target server from the set of servers based on the proximity.
6. The method of claim 1, wherein determining the target server comprises:
determining a performance indicator value for each of the plurality of servers based on the second indication;
comparing the performance indication value to a first threshold to divide the plurality of servers into a first subset of servers and a second subset of servers, the performance of the servers in the first subset of servers being better than the performance of the servers in the second subset of servers;
determining a first indication as a calculated quantity indication value for the current query input;
comparing the calculated quantity indication value with a second threshold value; and
selecting a server from the first server subgroup or the second server subgroup as the target server, to which a query task is not assigned, based on a result of the comparison.
7. The method of claim 6, wherein selecting a target server that is not assigned a query task comprises:
if the calculation amount indicating value is determined to be smaller than or equal to the second threshold value, selecting a server which is not allocated with the query task from the second server subgroup as the target server; and
selecting a server from the first subset of servers to which a query task is not assigned as the target server if it is determined that the calculated amount indicating value is greater than the second threshold.
8. The method of claim 1, further comprising:
adjusting an indication value in the third indication corresponding to the target server if it is determined that the target server is assigned the current query input.
9. The method of claim 8, further comprising:
determining whether the third indication indicates that each of the plurality of servers has been assigned a query task; and
adjusting the third indication to indicate that the plurality of servers are not assigned a query task if it is determined that each of the plurality of servers has been assigned a query task.
10. An apparatus for processing query input, comprising:
a first indication determination module configured to determine a first indication indicating a computational load of a current query input;
an indication obtaining module configured to obtain a second indication indicating performance of a plurality of servers available to process a query input and a third indication indicating whether the plurality of servers have been allocated the query input; and
a query input processing module configured to determine a target server from the plurality of servers for processing the current query input based on the first indication, the second indication, and the third indication, the target server not being assigned the previous query input.
11. The apparatus of claim 10, wherein the first indication determining module comprises:
a receiving module configured to receive the current query input; and
an application module configured to apply the current query input to a computational prediction model to obtain the first indication, the computational prediction model being derived based on historical query inputs and corresponding delays.
12. The apparatus of claim 10, wherein the indication acquisition module comprises:
a resource configuration information acquisition module configured to acquire resource configuration information of each of the plurality of servers; and
a second indication determination module configured to determine a second indication of performance for the plurality of servers based on the resource configuration information.
13. The apparatus of claim 10, wherein the query input processing module comprises:
a first calculated amount indication value determination module configured to determine a first indication as a calculated amount indication value for the current query input;
a set of servers determination module configured to determine, based on the third indication, a set of servers from the plurality of servers to which the previous query input is not assigned;
a first performance indicator value determination module configured to determine a performance indicator value for each server in the set of servers based on the second indication; and
a target server determination module configured to determine the target server from the set of servers based on the calculated quantity indicator value and the performance indicator value.
14. The apparatus of claim 13, wherein the target server determination module comprises:
a proximity determination module configured to determine a proximity of both the calculated amount indicator value and the performance indicator value to a predetermined value; and
a proximity-based server determination module configured to determine the target server from the set of servers based on the proximity.
15. The apparatus of claim 10, wherein the query input processing module comprises:
a second performance indicator value determination module configured to determine a performance indicator value for each of the plurality of servers based on the second indication;
a first comparison module configured to compare the performance indication value to a first threshold value to divide the plurality of servers into a first subset of servers and a second subset of servers, a performance of a server in the first subset of servers being better than a performance of a server in the second subset of servers;
a second calculated indication value determination module configured to determine the first indication as a calculated indication value for the current query input;
a second comparison module configured to compare the calculated quantity indication value with a second threshold value; and
a result processing module configured to select a server from the first server subgroup or the second server subgroup as the target server to which a query task is not assigned based on a result of the comparison.
16. The apparatus of claim 15, wherein the result processing module comprises:
a first selection module configured to select a server to which a query task is not assigned as the target server from the second server subgroup if it is determined that the calculation amount indication value is equal to or less than the second threshold value; and
a second selection module configured to select a server to which a query task is not assigned as the target server from the first subset of servers if it is determined that the calculation amount indication value is greater than the second threshold value.
17. The apparatus of claim 10, further comprising:
a first adjustment module configured to adjust an indication value in the third indication corresponding to the target server if it is determined that the target server has been assigned the current query input.
18. The apparatus of claim 17, further comprising:
an assigned query task determination module configured to determine whether the third indication indicates that each of the plurality of servers has been assigned a query task; and
a second adjustment module configured to adjust the third indication to indicate that none of the plurality of servers has been assigned a previous query task if it is determined that each of the plurality of servers has been assigned a query task.
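A short sketch of the bookkeeping in claims 17-18, assuming the third indication is a dict of per-server boolean flags: after dispatch the target's flag is set, and once every server holds a query task all flags are cleared so the next allocation round starts fresh.

```python
def mark_assigned(assigned: dict[str, bool], target: str) -> None:
    assigned[target] = True           # claim 17: adjust the target's flag
    if all(assigned.values()):        # claim 18: every server now has a task
        for server in assigned:
            assigned[server] = False  # reset: no server assigned a prior task
```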
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
20. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-9.
21. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-9.
CN202111067443.9A 2021-09-13 2021-09-13 Method, apparatus, device and readable storage medium for processing query input Active CN113791904B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111067443.9A CN113791904B (en) 2021-09-13 2021-09-13 Method, apparatus, device and readable storage medium for processing query input

Publications (2)

Publication Number Publication Date
CN113791904A 2021-12-14
CN113791904B CN113791904B (en) 2022-11-08

Family

ID=78879965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111067443.9A Active CN113791904B (en) 2021-09-13 2021-09-13 Method, apparatus, device and readable storage medium for processing query input

Country Status (1)

Country Link
CN (1) CN113791904B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130325927A1 (en) * 2010-02-22 2013-12-05 Data Accelerator Limited Method of optimizing the interaction between a software application and a database server or other kind of remote data source
CN104123329A (en) * 2013-04-25 2014-10-29 北京千橡网景科技发展有限公司 Search method and device
CN104239141A (en) * 2014-09-05 2014-12-24 北京邮电大学 Task optimized-scheduling method in data center on basis of critical paths of workflow
CN106708826A (en) * 2015-07-30 2017-05-24 中兴通讯股份有限公司 Data processing method and apparatus, and data query method and apparatus
CN108811513A (en) * 2017-02-27 2018-11-13 谷歌有限责任公司 Content searching engine
CN108509501A (en) * 2018-02-28 2018-09-07 努比亚技术有限公司 A kind of inquiry processing method, server and computer readable storage medium
US20210034612A1 (en) * 2018-04-19 2021-02-04 Amadeus S.A.S. Controlling generation of multi-input search results
CN110008050A (en) * 2019-04-11 2019-07-12 北京百度网讯科技有限公司 Method and apparatus for handling information
CN110489447A (en) * 2019-07-16 2019-11-22 招联消费金融有限公司 Data query method, apparatus, computer equipment and storage medium
CN111831450A (en) * 2020-07-20 2020-10-27 北京百度网讯科技有限公司 Method, device, electronic equipment and storage medium for allocating server resources
CN113254208A (en) * 2021-05-26 2021-08-13 深圳市轱辘车联数据技术有限公司 Load balancing method and device for server, server and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YAQIANG ZHANG et al.: "QoE-Constrained Concurrent Request Optimization Through Collaboration of Edge Servers", IEEE Internet of Things Journal *
王圣元 et al.: "Research on Selection Methods of Authoritative DNS Servers" (DNS权威服务器选择方式研究), Intelligent Computer and Applications (智能计算机与应用) *

Also Published As

Publication number Publication date
CN113791904B (en) 2022-11-08

Similar Documents

Publication Title
CN108446210B (en) System performance measurement method, storage medium and server
JP4815459B2 (en) Load balancing control server, load balancing control method, and computer program
WO2021036936A1 (en) Method and apparatus for allocating resources and tasks in distributed system, and system
US20140108828A1 (en) Semi-static power and performance optimization of data centers
CN108647329B (en) User behavior data processing method and device and computer readable storage medium
CN115150471B (en) Data processing method, apparatus, device, storage medium, and program product
CN114861039B (en) Parameter configuration method, device, equipment and storage medium of search engine
CN114065864A (en) Federal learning method, federal learning device, electronic device, and storage medium
US20130080463A1 (en) Searching apparatus, searching method, and recording medium storing searching program
CN115202847A (en) Task scheduling method and device
CN113760640A (en) Monitoring log processing method, device, equipment and storage medium
CN113791904B (en) Method, apparatus, device and readable storage medium for processing query input
CN115563310A (en) Method, device, equipment and medium for determining key service node
JP5793259B1 (en) Information processing apparatus, flow control parameter calculation method, and program
CN114996930A (en) Modeling method and device, electronic equipment and storage medium
CN114676177A (en) Financial index determination method, device, equipment, medium and product
CN114328047A (en) System test method, device, electronic equipment and storage medium
CN113934894A (en) Data display method based on index tree and terminal equipment
CN112148491B (en) Data processing method and device
JP7200299B2 (en) METHOD, APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND PROGRAM FOR OPTIMIZING SEARCH SYSTEM
CN112800315A (en) Data processing method, device, equipment and storage medium
CN114598705A (en) Message load balancing method, device, equipment and medium
CN117215589A (en) Cloud primary state evaluation method, device, equipment and storage medium
CN115758142A (en) Deep learning model training method, data processing method and device
CN114626546A (en) Atmospheric pollution source data analysis method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant