CN111598137A - Method and device for providing reasoning service and electronic equipment - Google Patents

Method and device for providing reasoning service and electronic equipment

Info

Publication number
CN111598137A
CN111598137A
Authority
CN
China
Prior art keywords
picture
cache
picture data
inference
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010331927.9A
Other languages
Chinese (zh)
Inventor
窦宏辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd filed Critical Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202010331927.9A priority Critical patent/CN111598137A/en
Publication of CN111598137A publication Critical patent/CN111598137A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the disclosure provides a method and an apparatus for providing an inference service, and an electronic device, relating to the field of cloud computing. The method comprises the following steps: receiving a picture inference request from a client, wherein the picture inference request comprises at least picture data and a key value, the key value being a resource identifier that uniquely identifies a video resource, and the video resource comprising a plurality of pieces of picture data; obtaining, from a cache system, at least one piece of cached picture data corresponding to the key value, wherein at least one piece of cached picture data associated with the key value and the corresponding inference result are stored in the cache system in advance, having been computed by an inference model for historical picture inference requests; obtaining the inference result corresponding to the cached picture data whose similarity to the picture data meets a preset condition; and returning that inference result to the client. According to the embodiment of the disclosure, the resource consumption of providing the inference service can be reduced and the service efficiency improved.

Description

Method and device for providing reasoning service and electronic equipment
Technical Field
The present disclosure relates to the field of machine learning technologies, and more particularly, to a method for providing inference services, an apparatus for providing inference services, an electronic device, and a computer-readable storage medium.
Background
An AI (Artificial Intelligence) inference service performs an inference operation on each input picture, typically on a Graphics Processing Unit (GPU), based on a visual inference model, to obtain a corresponding inference result, such as a picture classification result, a picture detection result, or a picture segmentation result.
In real-world content-review scenarios, such as live-broadcast auditing, a large number of identical or similar pictures often appear within a given time window. Each of these pictures would otherwise require its own inference pass through the visual inference model, consuming substantial computing resources on redundant work. There is therefore a need for improvement in view of this drawback.
Disclosure of Invention
It is an object of the disclosed embodiments to provide a new technical solution for providing inference services.
According to a first aspect of the present disclosure, there is provided a method for providing inference services, the method comprising:
receiving a picture inference request from a client, wherein the picture inference request comprises picture data and a key value; the key value is a resource identifier that uniquely identifies a video resource, and the video resource comprises a plurality of pieces of picture data;
obtaining at least one cache picture data corresponding to the key value from a cache system;
at least one cache picture data associated with the key value and a corresponding inference result are stored in the cache system in advance, and the at least one cache picture data and the corresponding inference result are obtained by calculating through an inference model according to a historical picture inference request;
acquiring an inference result corresponding to cached picture data of which the similarity with the picture data meets a preset condition;
and returning the inference result to the client as the inference result of the picture data.
Optionally, each of the cached picture data includes a corresponding first picture feature;
the obtaining of the inference result corresponding to the cached picture whose similarity with the picture data meets the preset condition includes:
calculating a second picture characteristic corresponding to the picture data;
calculating the similarity between the second picture characteristics and at least one first picture characteristic respectively;
and ranking the similarities that meet a similarity threshold, and determining the inference result corresponding to the first picture feature with the highest similarity as the inference result of the picture inference request.
Optionally, after the calculating the similarity between the second picture feature and at least one of the first picture features, the method further includes:
and if the similarity meeting the similarity threshold does not exist, inputting the picture data into a reasoning model for calculation to obtain the corresponding reasoning result.
Optionally, after obtaining the corresponding inference result, the method further includes:
and caching the key value, the picture data and the corresponding reasoning result into the cache system.
Optionally, the caching the key value, the picture data, and the corresponding inference result in the caching system includes:
obtaining a cache result corresponding to the key value from the cache system; the cache result comprises a preset number of cache pictures and a corresponding reasoning result;
judging whether the picture data is the same as each cached picture or not;
if the picture data is the same as one of the cache pictures, associating an inference result corresponding to the picture data as a new inference result to the cache pictures for caching;
and if the picture data is different from each cached picture, associating the picture data and the corresponding inference result to the key value for caching.
Optionally, the determining whether the picture data is the same as each of the cached pictures includes:
and comparing whether the MD5 value of the picture data is the same as the MD5 value of each cache picture.
Optionally, after the associating the picture data and the corresponding inference result to the key value for caching, the method further includes:
judging whether the number of the cache pictures associated with the key value exceeds a preset number threshold value or not;
and if it does, deleting the cached picture with the largest timestamp, or deleting the cached picture whose inference result has been returned the fewest times.
According to a second aspect of the present disclosure, there is provided an apparatus for providing inference services, comprising:
the system comprises a receiving module, a processing module and a display module, wherein the receiving module is used for receiving a picture reasoning request from a client, and the picture reasoning request at least comprises picture data and key values; the key value is a resource identifier for uniquely identifying a video resource, and the video resource comprises a plurality of picture data;
the acquisition module is used for acquiring at least one cache image data corresponding to the key value from a cache system; at least one cache picture data associated with the key value and a corresponding inference result are stored in the cache system in advance, and the at least one cache picture data and the corresponding inference result are obtained by calculating through an inference model according to a historical picture inference request; acquiring an inference result corresponding to the cached picture data with the similarity of the picture data meeting a preset condition;
and the sending module is used for returning the inference result to the client as the inference result of the picture data.
Optionally, wherein each of the cached picture data includes a corresponding first picture feature;
the acquisition module is specifically configured to:
calculating a second picture characteristic corresponding to the picture data;
calculating the similarity between the second picture characteristic and at least one first picture characteristic;
and ranking the similarities that meet a similarity threshold, and determining the inference result corresponding to the first picture feature with the highest similarity as the inference result of the picture inference request.
Optionally, wherein the apparatus further comprises:
and the calculation module is used for inputting the picture data into a reasoning model for calculation to obtain the corresponding reasoning result when the similarity meeting the similarity threshold does not exist.
Optionally, wherein the apparatus further comprises:
and the cache module is used for caching the key value, the picture data and the corresponding reasoning result into the cache system.
Optionally, the cache module is specifically configured to:
obtaining a cache result corresponding to the key value from the cache system; the cache result comprises a preset number of cache pictures and a corresponding reasoning result; judging whether the picture data is the same as each cached picture or not; if the picture data is the same as one of the cache pictures, associating an inference result corresponding to the picture data as a new inference result to the cache pictures for caching; and if the picture data is different from each cached picture, associating the picture data and the corresponding inference result to the key value for caching.
Optionally, the cache module is specifically configured to: and comparing whether the MD5 value of the picture data is the same as the MD5 value of each cache picture.
Optionally, the cache module is further configured to:
judging whether the number of the cache pictures associated with the key value exceeds a preset number threshold value or not;
and if it does, delete the cached picture with the largest timestamp, or delete the cached picture whose inference result has been returned the fewest times.
According to a third aspect of the present disclosure, there is also provided an electronic device including the apparatus for providing inference service according to any one of the second aspects of the present disclosure, or the electronic device including:
a memory for storing executable commands;
a processor for executing the method for providing inference services as claimed in any one of the first aspect of the present disclosure under control of said executable commands.
According to a fourth aspect of the present disclosure, there is also provided a computer-readable storage medium storing executable instructions that, when executed by a processor, perform the method for providing inference services as set forth in any one of the first aspects of the present disclosure.
According to one embodiment of the disclosure, a picture inference request is received from a client, the request comprising at least picture data and a key value; the key value is a resource identifier that uniquely identifies a video resource, and the video resource comprises a plurality of pieces of picture data. At least one piece of cached picture data corresponding to the key value is obtained from a cache system, in which cached picture data associated with the key value and the corresponding inference results, computed by an inference model for historical picture inference requests, are stored in advance. The inference result corresponding to the cached picture data whose similarity to the picture data meets a preset condition is obtained and returned to the client. Because the inference results of similar pictures are pre-stored in the cache system, when a picture inference request for a similar picture is received, the corresponding inference result can be obtained directly from the cache system, reducing the resource consumption of providing the inference service and improving the service efficiency.
Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic structural diagram of a server to which a method for providing inference services according to an embodiment of the present disclosure may be applied;
FIG. 2 is a schematic flow chart diagram of a method for providing inference services of an embodiment of the present disclosure;
FIG. 3 is a system architecture diagram of one example of a method for providing inference services to which embodiments of the present disclosure may be applied;
FIG. 4 is a system flow diagram of the system architecture shown in FIG. 3 as applied to an embodiment of the present disclosure;
FIG. 5 is a flow diagram of a fetch cache of an example of an embodiment of the present disclosure;
FIG. 6 is a flow diagram of updating a cache of an example of an embodiment of the present disclosure;
FIG. 7 is a functional block diagram of an apparatus for providing inference services in accordance with an embodiment of the present disclosure;
fig. 8 is a functional block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
< hardware configuration >
Fig. 1 is a schematic structural diagram of a server to which the method for providing inference service according to the embodiment of the present disclosure may be applied.
As shown in fig. 1, the server 1000 may be, for example, a blade server or the like. In one example, server 1000 may be a computer.
In another example, the server 1000 may be as shown in fig. 1, including a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an input device 1600. Although the server may also include speakers, microphones, and the like, these components are not relevant to the present disclosure and are omitted herein.
The processor 1100 may be, for example, a central processing unit CPU, a microprocessor MCU, or the like. The memory 1200 includes, for example, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. The interface device 1300 includes, for example, a USB interface, a serial interface, and the like. Communication device 1400 is capable of wired or wireless communication, for example. The display device 1500 is, for example, a liquid crystal display panel. The input device 1600 may include, for example, a touch screen, a keyboard, and the like.
The server shown in fig. 1 is merely illustrative and in no way limits the present disclosure, its application, or uses. In an embodiment of the present disclosure, the memory 1200 of the server 1000 stores instructions for controlling the processor 1100 to operate so as to execute any of the methods for providing inference services provided by the embodiments of the present disclosure.
Those skilled in the art will appreciate that although multiple devices are shown for the server 1000 in fig. 1, the present disclosure may involve only some of them, for example only the processor 1100 and the memory 1200 of the server 1000. The skilled person can design the instructions according to the solution disclosed herein. How instructions control the operation of a processor is well known in the art and is not described in detail here.
< method examples >
The present embodiment provides a method for providing inference services, which may be performed, for example, by the server 1000 shown in fig. 1.
As shown in fig. 2, the method includes steps 2100 to 2400:
In step 2100, a picture inference request is received from a client, where the picture inference request includes at least picture data and a key value.
The key value is a resource identifier for uniquely identifying a video resource, and the video resource comprises a plurality of picture data.
Step 2200, obtaining at least one cache image data corresponding to the key value from a cache system.
At least one cache picture data associated with the key value and a corresponding inference result are stored in the cache system in advance, and the at least one cache picture data and the corresponding inference result are obtained by calculating through an inference model according to historical picture inference requests.
Each of the cached picture data includes a corresponding first picture characteristic.
Step 2300, obtaining an inference result corresponding to the cached picture data whose similarity with the picture data meets a preset condition.
And 2400, returning the inference result to the client as the inference result of the picture data.
In this embodiment, since the inference result of the similar picture is pre-stored in the cache system, when the picture inference request of the similar picture is received, the corresponding inference result can be directly obtained from the cache system, thereby reducing the consumption of computing resources for providing the inference service.
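The cache-first lookup of steps 2100 to 2400 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names (`handle_inference_request`, `extract_features`, `similarity`) and the threshold value are hypothetical stand-ins, and a real deployment would use CNN feature embeddings and a GPU-backed inference model.

```python
SIMILARITY_THRESHOLD = 0.9  # assumed stand-in for the "preset condition"

def extract_features(picture):
    # Placeholder: a real system would compute a CNN feature embedding.
    return picture

def similarity(a, b):
    # Toy similarity for equal-length numeric vectors: 1.0 means identical.
    diff = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 / (1.0 + diff)

def handle_inference_request(cache, model, key, picture):
    """Steps 2100-2400: serve a picture inference request, cache first."""
    features = extract_features(picture)                 # second picture feature
    cached = cache.get(key, [])                          # [(first_feature, result), ...]
    best = max(cached, key=lambda c: similarity(features, c[0]), default=None)
    if best is not None and similarity(features, best[0]) >= SIMILARITY_THRESHOLD:
        return best[1]                                   # cache hit: skip the GPU
    result = model(picture)                              # cache miss: run the inference model
    cache.setdefault(key, []).append((features, result)) # update the cache
    return result
```

A second request carrying the same or a near-identical frame of the same video key is then answered from the cache without invoking the model.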
In an example, in step 2300, when obtaining the inference result corresponding to the cached picture data whose similarity to the picture data meets the preset condition, the server 1000 may first calculate the second picture feature corresponding to the picture data; then calculate the similarity between the second picture feature and each of the at least one first picture features; and finally rank the similarities that meet the similarity threshold and determine the inference result corresponding to the first picture feature with the highest similarity as the inference result of the picture inference request.
Specifically, if only one first picture feature has a similarity with the second picture feature that meets the similarity threshold, the inference result corresponding to that first picture feature is determined as the inference result of the picture inference request. If multiple first picture features have similarities that meet the threshold, the similarities are ranked, and the inference result corresponding to the first picture feature with the highest similarity is chosen.
If the server 1000 finds that no similarity meets the similarity threshold, no cached picture data similar to the picture data exists in the cache system, i.e., a cache miss; the server 1000 then inputs the picture data into the inference model for calculation to obtain the corresponding inference result.
In one example, the server 1000 may further cache the inference result after calculating the corresponding inference result. Namely, the key value, the picture data and the corresponding inference result are cached in the cache system.
Specifically, during caching, the server 1000 may first obtain a cache result corresponding to the key value from a caching system; the cache result comprises a preset number of cache pictures and a corresponding reasoning result; and judging whether the picture data is the same as each cached picture.
If the server 1000 determines that the picture data is the same as one of the cached pictures, this indicates that the picture data was previously cached after obtaining an inference result from a different inference model; the server then associates the inference result obtained by the current computation with that cached picture as an additional inference result. If the picture data differs from every cached picture, the picture data and its corresponding inference result are being cached for the first time, and the server 1000 directly associates them with the key value for caching.
For example, if the inference result currently associated with a cached picture is a classification result obtained from a classification model, and the current computation produces a detection result from a detection model, the detection result is also associated with the cached picture. When a picture detection request for picture data similar to the cached picture is subsequently received, the corresponding detection result can be obtained directly from the cache and returned to the client, without repeating the detection-model computation, thereby reducing the consumption of computing resources.
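This per-picture, per-model association can be sketched as follows; the dictionary layout and the function name are illustrative assumptions, not the patent's data format.

```python
def add_model_result(cache, key, picture_md5, model_name, result):
    """Attach a model's inference result to a cached picture under a video key.

    If the picture (identified by its MD5) already has cached results from
    other models, the new result is stored alongside them; otherwise a new
    entry is created for the picture.
    """
    pictures = cache.setdefault(key, {})   # md5 -> {model_name: result}
    pictures.setdefault(picture_md5, {})[model_name] = result
```

With this layout, a classification result and a later detection result for the same frame share one cache entry, so either kind of request can be answered from the cache.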
In one example, the server 1000 may specifically determine whether the picture data is the same as each of the cached pictures by comparing whether the MD5 value of the picture data is the same as the MD5 value of each of the cached pictures. It is understood that if the value of MD5 is the same, the picture data is the same as the cached picture, and if the value of MD5 is different, the picture data is not the same as the cached picture.
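The MD5-based exact-duplicate check can be illustrated with Python's standard `hashlib`. This is a sketch; in practice the hashed bytes would be the encoded image data.

```python
import hashlib

def md5_of(picture_bytes: bytes) -> str:
    """Fingerprint a picture by the MD5 digest of its bytes."""
    return hashlib.md5(picture_bytes).hexdigest()

def is_cached_duplicate(picture_bytes: bytes, cached_md5s) -> bool:
    """True if the picture is byte-identical to some cached picture."""
    return md5_of(picture_bytes) in cached_md5s
```

Note that MD5 equality only catches byte-identical pictures; near-duplicates are handled by the feature-similarity path described above.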
Further, in consideration of the limited cache space, in one example a threshold may be set on the number of cached pictures. Specifically, when caching new picture data and the corresponding inference result, the server 1000 may determine whether the number of cached pictures associated with the key value exceeds a preset number threshold; if it does, the server deletes the cached picture with the largest timestamp, or deletes the cached picture whose inference result has been returned the fewest times.
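The two eviction policies described above can be sketched as follows; the entry layout and the function name are assumptions made for illustration.

```python
def evict_if_needed(cached_pictures, max_count, by="timestamp"):
    """Drop one cached picture when the per-key cache exceeds max_count.

    Each entry is a dict: {"md5": ..., "timestamp": ..., "hits": ...}.
    The two policies from the text: remove the entry with the largest
    timestamp, or the entry whose result was returned the fewest times.
    """
    if len(cached_pictures) <= max_count:
        return cached_pictures                 # under the threshold: keep all
    if by == "timestamp":
        victim = max(cached_pictures, key=lambda e: e["timestamp"])
    else:  # by == "hits"
        victim = min(cached_pictures, key=lambda e: e["hits"])
    cached_pictures.remove(victim)
    return cached_pictures
```

Either policy bounds the per-key cache size; which one is preferable depends on whether stale frames or rarely-hit frames waste more space in a given workload.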
< example >
Fig. 3 is a system architecture diagram of a method for providing inference services to which embodiments of the present disclosure may be applied.
As shown in fig. 3, the server communicates with a client through an OpenAPI interface, and the client is configured to provide a model service 1 and a model service 2.
The Server is, for example, a Cache Server (Cache Server), which may include an MD5 Cache, a timing Cache, and a feature Cache. The MD5 Cache, the timing Cache, and the feature Cache may be divided into devices independent of each other, or their functions may be combined on one physical machine to implement, which is not specifically limited in this embodiment.
The MD5 Cache is implemented with database (DB) indexes; the time-series Cache uses the hash table and sorted-set data structures of Remote Dictionary Server (Redis), via the ZADD/ZREM/ZPOPMIN commands; and the feature Cache stores the first picture features, with feature retrieval in the cache implemented by a retrieval system.
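The sorted-set commands named above can be mimicked by a tiny in-memory class. This is a stand-in for illustration only; a real deployment would issue these commands to a Redis server through a client library.

```python
class TinySortedSet:
    """In-memory imitation of the Redis sorted-set commands named in the text."""

    def __init__(self):
        self._scores = {}  # member -> score (e.g. a timestamp)

    def zadd(self, member, score):
        """Insert or update a member with the given score."""
        self._scores[member] = score

    def zrem(self, member):
        """Remove a member if present."""
        self._scores.pop(member, None)

    def zpopmin(self):
        """Remove and return the member with the lowest score (Redis ZPOPMIN)."""
        if not self._scores:
            return None
        member = min(self._scores, key=self._scores.get)
        return member, self._scores.pop(member)
```

Scoring members by timestamp makes ZPOPMIN pop the oldest entry, which is the kind of time-ordered bookkeeping a time-series cache needs.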
The client can interact with the cache server through the OpenAPI interface to provide model service, obtain cached reasoning results from the cache server, or update the reasoning results in the cache.
Fig. 4 is a system flow diagram applied to the system architecture shown in fig. 3 in an embodiment of the present disclosure. Specifically, as shown in fig. 4, in this example the OpenAPI receives a picture inference request from a Gateway (GW). The OpenAPI first requests an inference result from the cache server and then receives the cache server's feedback, which takes one of two forms: either there is no corresponding inference result, or the corresponding inference result is returned.
For the first case, the OpenAPI requests data from a model server, which performs the calculation on a Graphics Processing Unit (GPU). After the model server returns an inference result to the OpenAPI, the OpenAPI caches the inference result of the picture in the cache server and receives an update-success message fed back by the cache server. Meanwhile, the OpenAPI returns the GPU-computed inference result to the gateway.
For the second case, the cache server directly feeds back the inference result to the OpenAPI, and the OpenAPI directly feeds back the inference result to the gateway without GPU calculation.
Fig. 5 is a flow diagram of obtaining a cache according to an example of the disclosure. As shown in fig. 5, in this example, when obtaining the inference result, the cache server may first send the picture data and the corresponding key value (keyname, kn) to a cache-hit component (CacheHitter), and the CacheHitter sends the key value to Redis to request at least one piece of cached picture data corresponding to the key value, where the cached picture data includes the corresponding first picture feature. Redis returns the first picture features corresponding to the key value to the CacheHitter; in one example they are returned as a list, and a maximum length N may be set for the list, i.e., the maximum number of pictures cached for one video sequence, so as to save cache space.
After receiving the picture feature list, the CacheHitter calculates the second picture feature corresponding to the picture data and computes its similarity with each of the N first picture features in the list. The CacheHitter ranks the similarities that meet the similarity threshold and returns the inference result corresponding to the first picture feature with the highest similarity to the cache server as the inference result of the picture inference request. If no similarity meets the threshold, the CacheHitter returns a no-corresponding-inference-result message to the cache server.
Further, the cache server updates the cache after obtaining the inference result of the picture data sent by the OpenAPI. Fig. 6 is a flow diagram illustrating updating a cache according to an example of the present disclosure. As shown in fig. 6, when the cache is updated, the cache server may send a key value, picture data, and an inference result to the CacheHitter, the CacheHitter requests a corresponding cache result from the Redis according to the key value, and the Redis returns all cache results corresponding to the key value to the CacheHitter.
The CacheHitter then makes a judgment based on the cache result. If the cache result is empty, a new key value and its corresponding cache result are added to the cache. If the cache result corresponding to the key value contains pictures, the MD5 value of the picture data is compared with that of each picture in the cache result: if an MD5 value is the same, a new inference result is added for the key value; if the MD5 values all differ, a new picture and its corresponding inference result are added.
Further, the CacheHitter can also determine whether the number of pictures in the cache result corresponding to the key value exceeds a preset number threshold (N-PICS); if so, it deletes the cached picture with the largest timestamp, or deletes the cached picture whose inference result has been returned the fewest times.
After the CacheHitter completes these operations, it writes the adjusted cache result for the key value back to Redis; after receiving an update-success message from Redis, it returns the update-success message to the cache server, completing the cache update.
The method for providing inference services of the present embodiment has been described above with reference to the accompanying drawings. The method receives a picture inference request from a client, the request comprising at least picture data and a key value; the key value is a resource identifier that uniquely identifies a video resource, and the video resource comprises a plurality of pieces of picture data. At least one piece of cached picture data corresponding to the key value is obtained from a cache system, in which cached picture data associated with the key value and the corresponding inference results, computed by an inference model for historical picture inference requests, are stored in advance. The inference result corresponding to the cached picture data whose similarity to the picture data meets a preset condition is obtained and returned to the client. Because the inference results of similar pictures are pre-stored in the cache, when a picture inference request for a similar picture is received, the corresponding inference result can be obtained directly from the cache, reducing the resource consumption of providing the inference service and improving service efficiency.
< apparatus embodiment >
This embodiment provides an apparatus for providing an inference service, for example the apparatus 7000 for providing inference service shown in fig. 7, which is applied to the first client.
Specifically, the apparatus 7000 for providing inference service may include: a receiving module 7100, an obtaining module 7200 and a sending module 7300.
The receiving module 7100 is configured to receive a picture inference request from a client, where the picture inference request includes at least picture data and a key value. The key value is a resource identifier for uniquely identifying a video resource, and the video resource comprises a plurality of picture data.
An obtaining module 7200, configured to obtain, from a cache system, at least one piece of cached picture data corresponding to the key value; at least one piece of cached picture data associated with the key value and a corresponding inference result are stored in the cache system in advance, both computed by an inference model in response to historical picture inference requests; and to obtain the inference result corresponding to the cached picture data whose similarity with the picture data meets the preset condition.
A sending module 7300, configured to return the inference result to the client as the inference result of the picture data.
Each piece of cached picture data comprises a corresponding first picture feature. Specifically, the obtaining module 7200 may be configured to: calculate a second picture feature corresponding to the picture data; calculate the similarity between the second picture feature and each of the at least one first picture feature; and sort the similarities that meet the similarity threshold, determining the inference result corresponding to the first picture feature with the greatest similarity as the inference result of the picture inference request.
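As a concrete illustration of this matching step, the sketch below uses cosine similarity over feature vectors; the patent does not specify the feature type, similarity metric, or threshold, so all three are assumptions:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def best_match_index(second_feature, first_features, threshold=0.8):
    """Sort the cached first features by similarity to the request's second
    feature; return the index of the most similar one above the threshold,
    or None when nothing qualifies (the model must then be invoked)."""
    if not first_features:
        return None
    scored = sorted(
        ((cosine_similarity(second_feature, f), i)
         for i, f in enumerate(first_features)),
        reverse=True,
    )
    top_score, top_index = scored[0]
    return top_index if top_score >= threshold else None
```

Returning `None` here corresponds to the fallback path described next, where the picture data is fed to the inference model directly.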
Optionally, the apparatus 7000 for providing inference service may further include a calculating module, configured to input the picture data into an inference model for calculation to obtain the corresponding inference result when there is no similarity meeting the similarity threshold.
Optionally, the apparatus 7000 for providing inference service may further include a caching module, configured to cache the key value, the picture data, and the corresponding inference result in the caching system.
Specifically, the cache module may be configured to: obtain the cache result corresponding to the key value from the cache system, where the cache result comprises a preset number of cached pictures and their corresponding inference results; determine whether the picture data is identical to any of the cached pictures; if the picture data is identical to one of the cached pictures, associate the inference result corresponding to the picture data with that cached picture as its new inference result; and if the picture data differs from every cached picture, associate the picture data and its corresponding inference result with the key value for caching.
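A sketch of this insert-or-update step, using an MD5 digest of the picture bytes as the identity check (as this embodiment does); the entry schema is an assumption:

```python
import hashlib

def cache_picture(cache_result, picture_bytes, inference_result):
    """Insert-or-update step of the cache module (entry schema assumed).

    If an identical picture (same MD5) is already cached, refresh its
    inference result; otherwise append a new entry under the key value.
    """
    digest = hashlib.md5(picture_bytes).hexdigest()
    for entry in cache_result:
        if entry["md5"] == digest:
            entry["result"] = inference_result   # same picture: new result
            return cache_result
    cache_result.append({"md5": digest, "result": inference_result})
    return cache_result
```

Note that MD5 only detects byte-identical pictures; near-duplicates are handled by the feature-similarity path above, not by this check.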
When determining whether the picture data is identical to a cached picture, the cache module may compare the MD5 value of the picture data with the MD5 value of each cached picture.
Optionally, the cache module may be further configured to: determine whether the number of cached pictures associated with the key value exceeds a preset number threshold; and if it does, delete the cached picture with the largest timestamp, or delete the cached picture whose inference result has been returned the fewest times.
The apparatus for providing inference service of this embodiment may be configured to implement the technical solutions of the foregoing method embodiments, and the implementation principles and technical effects thereof are similar, and are not described herein again.
< electronic device embodiment >
In this embodiment, an electronic device is also provided, where the electronic device includes the apparatus 7000 for providing inference service described in the apparatus embodiment of the present disclosure; alternatively, the electronic device is the electronic device 8000 shown in fig. 8, and includes:
a memory 8100 for storing executable commands.
A processor 8200 for performing the methods described in any of the method embodiments of the present disclosure under the control of executable commands stored by the memory 8100.
The entity that executes the method embodiments in the electronic device may be a server or a terminal device.
< computer-readable storage medium embodiment >
The present embodiments provide a computer-readable storage medium having stored therein an executable command that, when executed by a processor, performs a method described in any of the method embodiments of the present disclosure.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions to implement aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are equivalent.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present disclosure is defined by the appended claims.

Claims (16)

1. A method for providing inference services, the method comprising:
receiving a picture inference request from a client, wherein the picture inference request comprises at least picture data and a key value; the key value is a resource identifier that uniquely identifies a video resource, and the video resource comprises a plurality of picture data;
obtaining at least one cache picture data corresponding to the key value from a cache system;
at least one cache picture data associated with the key value and a corresponding inference result are stored in the cache system in advance, and the at least one cache picture data and the corresponding inference result are obtained by calculating through an inference model according to a historical picture inference request;
acquiring an inference result corresponding to cached picture data of which the similarity with the picture data meets a preset condition;
and returning the inference result to the client as the inference result of the picture data.
2. The method of claim 1, wherein each of the cached picture data comprises a corresponding first picture characteristic;
the obtaining of the inference result corresponding to the cached picture whose similarity with the picture data meets the preset condition includes:
calculating a second picture characteristic corresponding to the picture data;
calculating the similarity between the second picture characteristics and at least one first picture characteristic respectively;
and sorting the similarities that meet the similarity threshold, and determining the inference result corresponding to the first picture feature with the greatest similarity as the inference result of the picture inference request.
3. The method of claim 2, wherein after the calculating the similarity between the second picture feature and at least one of the first picture features, the method further comprises:
and if the similarity meeting the similarity threshold does not exist, inputting the picture data into a reasoning model for calculation to obtain the corresponding reasoning result.
4. The method of claim 3, wherein after said obtaining the corresponding inference result, the method further comprises:
and caching the key value, the picture data and the corresponding reasoning result into the cache system.
5. The method of claim 4, wherein the caching the key value, the picture data, and the corresponding inference result into the caching system comprises:
obtaining a cache result corresponding to the key value from the cache system; the cache result comprises a preset number of cache pictures and a corresponding reasoning result;
judging whether the picture data is the same as each cached picture or not;
if the picture data is the same as one of the cache pictures, associating an inference result corresponding to the picture data as a new inference result to the cache pictures for caching;
and if the picture data is different from each cached picture, associating the picture data and the corresponding inference result to the key value for caching.
6. The method of claim 5, wherein the determining whether the picture data is the same as each of the cached pictures comprises:
and comparing whether the MD5 value of the picture data is the same as the MD5 value of each cache picture.
7. The method of claim 5, wherein after the associating the picture data and the corresponding inference result to the key value for caching, the method further comprises:
judging whether the number of the cache pictures associated with the key value exceeds a preset number threshold value or not;
if the number exceeds the threshold, deleting the cache picture corresponding to the largest timestamp; or deleting the cache picture whose inference result has been returned the fewest times.
8. An apparatus for providing inference services, comprising:
a receiving module, configured to receive a picture inference request from a client, wherein the picture inference request comprises at least picture data and a key value; the key value is a resource identifier that uniquely identifies a video resource, and the video resource comprises a plurality of picture data;
the acquisition module is used for acquiring at least one cache picture data corresponding to the key value from a cache system; at least one cache picture data associated with the key value and a corresponding inference result are stored in the cache system in advance, and the at least one cache picture data and the corresponding inference result are obtained by calculation through an inference model according to a historical picture inference request; and for acquiring an inference result corresponding to the cached picture data whose similarity with the picture data meets a preset condition;
and the sending module is used for returning the inference result to the client as the inference result of the picture data.
9. The apparatus of claim 8, wherein each of the cached picture data comprises a corresponding first picture characteristic;
the acquisition module is specifically configured to:
calculating a second picture characteristic corresponding to the picture data;
calculating the similarity between the second picture characteristics and at least one first picture characteristic respectively;
and sorting the similarities that meet the similarity threshold, and determining the inference result corresponding to the first picture feature with the greatest similarity as the inference result of the picture inference request.
10. The apparatus of claim 9, wherein the apparatus further comprises:
and the calculation module is used for inputting the picture data into a reasoning model for calculation to obtain the corresponding reasoning result when the similarity meeting the similarity threshold does not exist.
11. The apparatus of claim 10, wherein the apparatus further comprises:
and the cache module is used for caching the key value, the picture data and the corresponding reasoning result into the cache system.
12. The apparatus of claim 11, wherein the caching module is specifically configured to:
obtaining a cache result corresponding to the key value from the cache system; the cache result comprises a preset number of cache pictures and a corresponding reasoning result; judging whether the picture data is the same as each cached picture or not; if the picture data is the same as one of the cache pictures, associating an inference result corresponding to the picture data as a new inference result to the cache pictures for caching; and if the picture data is different from each cached picture, associating the picture data and the corresponding inference result to the key value for caching.
13. The apparatus of claim 12, wherein the caching module is specifically configured to: and comparing whether the MD5 value of the picture data is the same as the MD5 value of each cache picture.
14. The apparatus of claim 12, wherein the caching module is further configured to:
judging whether the number of the cache pictures associated with the key value exceeds a preset number threshold value or not;
if the number exceeds the threshold, deleting the cache picture corresponding to the largest timestamp; or deleting the cache picture whose inference result has been returned the fewest times.
15. An electronic device comprising an apparatus for providing inference services according to any of claims 8-14, or comprising:
a memory for storing executable commands;
a processor for executing the method for providing inference services according to any of claims 1-7 under the control of said executable commands.
16. A computer-readable storage medium storing executable instructions that, when executed by a processor, perform the method for providing inference services according to any of claims 1-7.
CN202010331927.9A 2020-04-24 2020-04-24 Method and device for providing reasoning service and electronic equipment Pending CN111598137A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010331927.9A CN111598137A (en) 2020-04-24 2020-04-24 Method and device for providing reasoning service and electronic equipment


Publications (1)

Publication Number Publication Date
CN111598137A 2020-08-28

Family

ID=72189017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010331927.9A Pending CN111598137A (en) 2020-04-24 2020-04-24 Method and device for providing reasoning service and electronic equipment

Country Status (1)

Country Link
CN (1) CN111598137A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487033A (en) * 2021-07-30 2021-10-08 上海壁仞智能科技有限公司 Inference method and device with graphic processor as execution core

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106959662A (en) * 2017-05-10 2017-07-18 东北大学 A kind of electric melting magnesium furnace unusual service condition identification and control method
CN108304431A (en) * 2017-06-14 2018-07-20 腾讯科技(深圳)有限公司 A kind of image search method and device, equipment, storage medium
CN109284894A (en) * 2018-08-10 2019-01-29 广州虎牙信息科技有限公司 Picture examination method, apparatus, storage medium and computer equipment
CN109543579A (en) * 2018-11-14 2019-03-29 杭州登虹科技有限公司 Recognition methods, device and the storage medium of target object in a kind of image
CN109740547A (en) * 2019-01-04 2019-05-10 平安科技(深圳)有限公司 A kind of image processing method, equipment and computer readable storage medium
CN109857893A (en) * 2019-01-16 2019-06-07 平安科技(深圳)有限公司 Picture retrieval method, device, computer equipment and storage medium
CN110971928A (en) * 2018-09-30 2020-04-07 武汉斗鱼网络科技有限公司 Picture identification method and related device
CN110971939A (en) * 2018-09-30 2020-04-07 武汉斗鱼网络科技有限公司 Illegal picture identification method and related device



Similar Documents

Publication Publication Date Title
EP3916630A1 (en) Method and apparatus for identifying video
CN109300179B (en) Animation production method, device, terminal and medium
CN111524166B (en) Video frame processing method and device
JP2018536920A (en) Text information processing method and device
CN111680517B (en) Method, apparatus, device and storage medium for training model
CN109873756B (en) Method and apparatus for transmitting information
CN111752960B (en) Data processing method and device
CN110633717A (en) Training method and device for target detection model
CN110198473B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN109409419B (en) Method and apparatus for processing data
CN111598137A (en) Method and device for providing reasoning service and electronic equipment
CN110188273B (en) Information content notification method, device, server and readable medium
CN110727666A (en) Cache assembly, method, equipment and storage medium for industrial internet platform
CN115563134A (en) Interaction method, interaction device, electronic equipment and computer readable medium
CN113590447B (en) Buried point processing method and device
CN112308016B (en) Expression image acquisition method and device, electronic equipment and storage medium
CN115329864A (en) Method and device for training recommendation model and electronic equipment
CN111340222B (en) Neural network model searching method and device and electronic equipment
CN111143740B (en) Information processing method and device and electronic equipment
CN114048863A (en) Data processing method, data processing device, electronic equipment and storage medium
CN114219046A (en) Model training method, matching method, device, system, electronic device and medium
CN116501732A (en) Method, electronic device and computer program product for managing training data
CN114090911A (en) Interface processing method and device, computer equipment and computer readable storage medium
CN111131354B (en) Method and apparatus for generating information
CN113223121A (en) Video generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination