CN112037796A

CN112037796A - Data processing method, device, equipment and medium

Info

Publication number: CN112037796A
Application number: CN202010918464.6A
Authority: CN
Inventors: 王锁平; 周登宇; 张伟坤
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2020-09-08
Filing date: 2020-09-08
Publication date: 2020-12-04
Anticipated expiration: 2040-09-08
Also published as: CN112037796B; WO2021159745A1

Abstract

The embodiments of the present application disclose a data processing method, apparatus, device, and medium, which relate to voice processing technology in artificial intelligence and can be applied to a blockchain network. The method comprises: acquiring first multimedia data about a first service from a terminal; recognizing the first multimedia data to obtain first service attribute information; determining, from a shared recognition engine set, a recognition engine matching the first service attribute information as a target recognition engine; outputting prompt information about processing the first service; acquiring, from the terminal, second multimedia data sent in response to the prompt information; and sending the second multimedia data to a first service platform, so that the first service platform uses the target recognition engine to recognize the second multimedia data and process the first service. The embodiments of the present application can avoid resource waste and reduce cost.

Description

Data processing method, device, equipment and medium

Technical Field

The present application relates to voice processing technologies in artificial intelligence, and in particular, to a data processing method, apparatus, device, and medium.

Background

At present, video robot calls are used in many industries, for example for business consultation and business handling in the service industry. Video robot calls are gradually replacing manual work and allow business to be handled at any time and in any place. When a user calls a video robot, a different recognition engine is usually docked depending on the service the user needs to handle, and the service is processed through that recognition engine. Because different services need to be processed by different servers, the video robot has to carry many service attributes in order to dock with the different recognition engines, and a separate recognition engine has to be custom-developed for each service, which wastes a large amount of resources and leads to high cost.

Disclosure of Invention

The embodiment of the application provides a data processing method, a data processing device, data processing equipment and a data processing medium, which can avoid resource waste and reduce cost.

An embodiment of the present application provides a data processing method, including:

acquiring first multimedia data about a first service from a terminal;

recognizing the first multimedia data to obtain first service attribute information, wherein the first service attribute information comprises at least one of a service level of the first service or a service revenue of the first service;

determining a recognition engine matched with the first service attribute information from the shared recognition engine set as a target recognition engine;

outputting prompt information about processing the first service;

acquiring second multimedia data sent aiming at the prompt information from the terminal;

and sending the second multimedia data to a first service platform so that the first service platform adopts the target recognition engine to recognize the second multimedia data and process the first service.

Optionally, the first multimedia data includes first voice data, and the recognizing the first multimedia data to obtain the first service attribute information includes: performing voice recognition on the first voice data to obtain a first keyword associated with a service in the first voice data, and determining the first service attribute information according to the first keyword; or, converting the first voice data to obtain first text data corresponding to the first voice data; extracting keywords from the first text data to obtain a second keyword associated with the service in the first text data; and determining the first service attribute information according to the second keyword.

Optionally, the first service attribute information includes a service level of the first service; the determining, from the shared recognition engine set, a recognition engine matched with the first service attribute information as a target recognition engine includes: acquiring the recognition level of each recognition engine in the shared recognition engine set, wherein the recognition level of a recognition engine reflects the accuracy with which the recognition engine recognizes multimedia data; and determining the recognition engine in the shared recognition engine set whose recognition level matches the service level of the first service as the target recognition engine.

Optionally, the first service attribute information includes a service revenue of the first service; the determining, from the shared recognition engine set, a recognition engine matched with the first service attribute information as a target recognition engine includes: acquiring the recognition cost of each recognition engine in the shared recognition engine set; and determining the recognition engine in the shared recognition engine set whose recognition cost matches the service revenue of the first service as the target recognition engine.

Optionally, the second multimedia data includes first video data and second voice data; the sending the second multimedia data to a first service platform, so that the first service platform uses the target recognition engine to recognize the second multimedia data and process the first service, includes: acquiring a first image of a user corresponding to the terminal according to the first video data; and sending the first image, the first video data and the second voice data to the first service platform, so that the first service platform verifies the legality of the terminal according to the first image and, when the terminal is legal, recognizes the first video data and the second voice data using the target recognition engine to process the first service.

Optionally, the method further includes: if the warning information which is sent by the first service platform and used for indicating that the terminal does not have legality is obtained, outputting adjustment information used for indicating the user to perform posture adjustment; acquiring third multimedia data sent by the terminal aiming at the adjustment information, wherein the third multimedia data comprises third video data; acquiring a second image of the user according to the third video data; and sending the second image to the first service platform so that the first service platform verifies the legality of the terminal according to the second image.

An embodiment of the present application provides a data processing apparatus, including:

a first obtaining module, configured to obtain first multimedia data related to a first service from a terminal;

a data recognition module, configured to recognize the first multimedia data to obtain first service attribute information, where the first service attribute information includes at least one of a service level of the first service or a service revenue of the first service;

the engine determining module is used for determining a recognition engine matched with the first service attribute information from the shared recognition engine set to serve as a target recognition engine;

the information output module is used for outputting prompt information about processing the first service;

a second obtaining module, configured to obtain, from the terminal, second multimedia data sent for the prompt information;

and the service processing module is used for sending the second multimedia data to the first service platform so that the first service platform adopts the target recognition engine to recognize the second multimedia data and process the first service.

Optionally, the information output module is configured to: determine, according to the identifier of the first service, the first service platform that processes the first service; acquire prompt information about processing the first service from the first service platform; and output the prompt information.

Optionally, the first multimedia data includes first voice data, and the data recognition module is specifically configured to perform voice recognition on the first voice data to obtain a first keyword associated with a service in the first voice data, and determine the first service attribute information according to the first keyword; or, converting the first voice data to obtain first text data corresponding to the first voice data; extracting keywords from the first text data to obtain second keywords associated with the service in the first text data; and determining the first service attribute information according to the second keyword.

Optionally, the first service attribute information includes a service level of the first service; the engine determining module is specifically configured to acquire the recognition level of each recognition engine in the shared recognition engine set, where the recognition level of a recognition engine reflects the accuracy with which the recognition engine recognizes multimedia data; and to determine the recognition engine in the shared recognition engine set whose recognition level matches the service level of the first service as the target recognition engine.

Optionally, the first service attribute information includes a service revenue of the first service; the engine determining module is specifically configured to acquire the recognition cost of each recognition engine in the shared recognition engine set; and to determine the recognition engine in the shared recognition engine set whose recognition cost matches the service revenue of the first service as the target recognition engine.

Optionally, the second multimedia data includes first video data and second voice data; the service processing module is specifically configured to acquire a first image of a user corresponding to the terminal according to the first video data; and to send the first image, the first video data and the second voice data to the first service platform, so that the first service platform verifies the legality of the terminal according to the first image and, when the terminal is legal, recognizes the first video data and the second voice data using the target recognition engine to process the first service.

Optionally, the apparatus further comprises: the adjusting module is used for outputting adjusting information for indicating the user to perform posture adjustment if warning information which is sent by the first service platform and used for indicating that the terminal does not have legality is acquired; acquiring third multimedia data sent by the terminal aiming at the adjustment information, wherein the third multimedia data comprises third video data; acquiring a second image of the user according to the third video data; and sending the second image to the first service platform so that the first service platform verifies the legality of the terminal according to the second image.

One aspect of the present application provides a computer device, comprising: a processor, a memory, a network interface;

the processor is connected to the memory and the network interface, wherein the network interface is used for providing a data communication function, the memory is used for storing a computer program, and the processor is used for calling the computer program to execute the method in the above aspect of the embodiments of the present application.

An aspect of the embodiments of the present application provides a computer-readable storage medium, in which a computer program is stored, where the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the data processing method of the above aspect.

In the embodiments of the present application, the first service attribute information corresponding to the first service can be obtained by recognizing the first multimedia data. Once the target recognition engine corresponding to the first service has been determined, the target recognition engine is used for recognition when the first service is subsequently processed. Because the shared recognition engine set comprises a plurality of recognition engines, that is, the plurality of recognition engines are centralized in the shared recognition engine set, different services can share the recognition engines in the set and recognition engines do not need to be customized for different services, which avoids resource waste, saves investment in hardware resources, and reduces cost. Further, prompt information about processing the first service is output, so that the terminal can collect the second multimedia data produced by the user's reply to the prompt information, and the second multimedia data sent in response to the prompt information is acquired from the terminal. The second multimedia data is sent to the first service platform, so that the first service platform uses the target recognition engine to recognize the second multimedia data and process the first service. When a user needs to handle a service, it is only necessary to obtain the service attribute information in the multimedia data in order to determine the corresponding service and the recognition engine corresponding to that service; the multimedia data corresponding to the first service is then sent to the first service platform, which can use the corresponding recognition engine to recognize it and process the first service. The method separates the two flows of determining the recognition engine and processing the service, and enables quick docking to the service processing platform for service processing.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a data processing method provided in an embodiment of the present application;

Fig. 2 is a schematic flowchart of a data processing method according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The basic infrastructure of artificial intelligence generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, voice processing, natural language processing, and machine learning/deep learning.

The key technologies of voice processing technology (Speech Technology) are automatic speech recognition (ASR), speech synthesis (text-to-speech, TTS), and voiceprint recognition. Enabling computers to listen, see, speak, and feel is the development direction of future human-computer interaction, and voice is expected to become one of the most promising modes of human-computer interaction.

The present application relates to voice processing technology in artificial intelligence. Specifically, voice processing technology is used to recognize first multimedia data related to a first service to obtain first service attribute information, a target recognition engine matching the first service attribute information is determined from a shared recognition engine set, and second multimedia data is sent to a first service platform, so that the first service platform uses the target recognition engine to recognize the second multimedia data and process the first service. Because different services can share the recognition engines in the set, recognition engines do not need to be customized for different services, which avoids resource waste, saves investment in hardware resources, and thereby reduces cost. The present application is applicable to fields such as smart government and smart education, and is conducive to promoting the construction of smart cities.

The technical solution of the present application is suitable for recognizing multimedia data sent by a terminal, so that the corresponding service is processed according to the service attribute information in the multimedia data. For example, the technical solution is suitable for scenarios such as remote auditing, video return visits, and remote account opening. First multimedia data related to a first service is acquired from the terminal; the first multimedia data is recognized to obtain attribute information of the first service; a target recognition engine matching the attribute information is determined; and prompt information about processing the first service is output, so that the terminal sends second multimedia data in response to the prompt information. The second multimedia data is then sent to the service platform corresponding to the first service, so that the service platform uses the target recognition engine to recognize the second multimedia data and process the first service. By recognizing multimedia data that contains a service, the service attribute information in the multimedia data can be determined, so that the corresponding service is handled according to the service attribute information.

Referring to Fig. 1, Fig. 1 is a schematic flowchart of a data processing method provided in an embodiment of the present application. The method may be applied to a computer device. The computer device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a smart speaker, a mobile internet device (MID), a point of sale (POS) machine, a wearable device (e.g., a smart watch or a smart bracelet), or the like; the computer device may also be an independent server, a server cluster consisting of a plurality of servers, or a cloud computing center. As shown in Fig. 1, the method includes:

S101, acquiring first multimedia data about a first service from a terminal.

Here, the terminal may refer to a terminal used by a user for service processing. The terminal may include a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a smart speaker, a mobile internet device (MID), a point of sale (POS) machine, a wearable device (e.g., a smart watch or a smart bracelet), and the like. The first service may include a service that the user needs to transact, such as an XX purchase, a bank loan, a bank card transaction, a credit card transaction, and so on. Alternatively, the first service may include a service required by the user, such as a bank card balance inquiry or a credit line inquiry. The first multimedia data may be of a voice data type, a video data type, or the like.

In a specific implementation, a user may send a call request through a terminal, a computer device obtains the call request, establishes a call connection with the terminal according to the call request, and obtains first multimedia data related to a first service from the terminal through the call connection. Here, the call connection may include a video connection, a voice connection, and the like. The video connection is used for acquiring video data sent by a terminal connected with the computer equipment, and the voice connection is used for acquiring voice data sent by the terminal connected with the computer equipment.

S102, recognizing the first multimedia data to obtain first service attribute information.

Here, the first multimedia data includes a keyword corresponding to the first service. The computer device may recognize the first multimedia data and, if it contains a keyword corresponding to the first service, use that keyword as the first service attribute information. For example, if the first multimedia data is "I want to transact a credit card", the recognized keywords include "transact" and "credit card", and the first service attribute information includes "transact" and "credit card".
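By way of illustration only, the keyword-based determination of the first service attribute information may be sketched as follows. The sketch assumes that the first voice data has already been converted to text, and the keyword vocabulary and the function name `extract_service_keywords` are hypothetical; the application does not specify them.

```python
from typing import List, Sequence

# Hypothetical vocabulary of service-related keywords (an assumption for
# illustration; the application does not define a concrete vocabulary).
SERVICE_KEYWORDS = ("transact", "credit card", "bank card", "loan", "balance inquiry")


def extract_service_keywords(
    text: str, vocabulary: Sequence[str] = SERVICE_KEYWORDS
) -> List[str]:
    """Return every vocabulary keyword that occurs in the text, in vocabulary order."""
    lowered = text.lower()
    return [keyword for keyword in vocabulary if keyword in lowered]


# Text assumed to be the transcription of the first voice data.
first_text = "I want to transact a credit card"
first_service_attribute_info = extract_service_keywords(first_text)
```

With the example utterance from the description, the extracted keywords are "transact" and "credit card", which together serve as the first service attribute information.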

S103, determining a recognition engine matched with the first service attribute information from the shared recognition engine set as the target recognition engine.

Here, a recognition engine is used to recognize multimedia data. The shared recognition engine set comprises at least one recognition engine and may comprise recognition engines for recognizing the multimedia data corresponding to a plurality of services. One service may correspond to a plurality of recognition engines, for example a voice data recognition engine, a text data recognition engine, a facial data recognition engine, and so on. One recognition engine may also serve a plurality of services. The recognition engine matched with the first service attribute information is a recognition engine that can recognize the multimedia data corresponding to the first service. For example, if the first service attribute information is "transact credit card", the first service may be "transact credit card", and the matching recognition engine is one capable of recognizing the multimedia data corresponding to "transact credit card", that is, the text information, voice data, and the like provided by a user transacting a credit card. For example, when the user needs to transact the first service, the user sends the voice data and text data required for transacting the first service to the computer device through the terminal, and the target recognition engine is a recognition engine capable of recognizing the voice data and the text data.
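A shared recognition engine set of this kind may be pictured with the following minimal sketch. It assumes that each engine declares the kinds of data it can recognize and that each service declares the kinds of data it requires; the engine names, the service names, and the mapping `SERVICE_DATA_REQUIREMENTS` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class RecognitionEngine:
    name: str
    data_kinds: FrozenSet[str]  # kinds of multimedia data the engine can recognize


# Hypothetical shared set: one pool of engines used by every service.
SHARED_ENGINE_SET: List[RecognitionEngine] = [
    RecognitionEngine("voice-engine", frozenset({"voice"})),
    RecognitionEngine("text-engine", frozenset({"text"})),
    RecognitionEngine("face-engine", frozenset({"face"})),
]

# Hypothetical mapping from a service to the kinds of data it needs recognized.
SERVICE_DATA_REQUIREMENTS: Dict[str, Set[str]] = {
    "transact credit card": {"voice", "text"},
    "bank loan": {"voice", "text", "face"},
}


def match_engines(service: str) -> List[RecognitionEngine]:
    """Return the engines in the shared set that recognize data the service requires."""
    required = SERVICE_DATA_REQUIREMENTS[service]
    return [engine for engine in SHARED_ENGINE_SET if engine.data_kinds & required]


credit_card_engines = match_engines("transact credit card")
loan_engines = match_engines("bank loan")
```

In this sketch the voice and text engines are returned for both services, which reflects the idea that one engine in the set can be shared by several services instead of being custom-built for each.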

Optionally, the first service attribute information may include at least one of a service level of the first service or a service revenue of the first service, and the recognition engine corresponding to the first service may be determined according to the first service attribute information. The service level of the first service refers to the level of the identification data that must be acquired to process the first service, and the identification data may include at least one of voice data, fingerprint data, and face data. For example, the recognition level of face data is higher than that of fingerprint data, the recognition level of fingerprint data is higher than that of voice data, and so on. A lower recognition level indicates lower recognition complexity, and a higher recognition level indicates higher recognition complexity. That is, if the identification data required by the first service includes only voice data, the service level of the first service is lower; if the identification data required by the first service includes face data, the service level of the first service is higher. When the service level of the first service is lower, a recognition engine with a lower recognition cost can be used, which saves the cost of service processing while the recognition result still meets the recognition requirement of the service. When the service level of the first service is higher, a recognition engine with higher recognition accuracy can be used, which improves the recognition accuracy.
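The level-based selection may be sketched as below. The numeric levels, the engine list, and the matching rule are assumptions made for illustration: the application only states the ordering face > fingerprint > voice and that the recognition level should "match" the service level, so this sketch takes the service level to be the highest level among the required identification data and picks the lowest-level, cheapest engine that reaches it.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumed numeric levels that preserve the stated ordering face > fingerprint > voice.
DATA_LEVELS = {"voice": 1, "fingerprint": 2, "face": 3}


@dataclass(frozen=True)
class LeveledEngine:
    name: str
    recognition_level: int  # higher means more accurate recognition
    cost: float             # hypothetical cost per recognition


# Hypothetical engines in the shared set.
ENGINES: List[LeveledEngine] = [
    LeveledEngine("basic", 1, 1.0),
    LeveledEngine("standard", 2, 2.5),
    LeveledEngine("premium", 3, 6.0),
]


def service_level(required_data: Iterable[str]) -> int:
    """Service level = highest level among the identification data the service needs."""
    return max(DATA_LEVELS[kind] for kind in required_data)


def select_by_level(engines: List[LeveledEngine], level: int) -> LeveledEngine:
    """Pick the lowest-level, then cheapest, engine whose level reaches the service level."""
    candidates = [e for e in engines if e.recognition_level >= level]
    if not candidates:
        raise LookupError("no engine in the shared set reaches the service level")
    return min(candidates, key=lambda e: (e.recognition_level, e.cost))


voice_only_engine = select_by_level(ENGINES, service_level(["voice"]))
face_engine = select_by_level(ENGINES, service_level(["voice", "face"]))
```

Under these assumptions a voice-only service is routed to the low-cost engine, while a service that requires face data is routed to the most accurate engine.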

The service revenue of the first service may be the expected revenue of the first service. For example, the lower the cost of the recognition engine, the higher the service revenue of the first service; the higher the cost of the recognition engine, the lower the service revenue of the first service.
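One possible reading of cost-to-revenue matching is sketched below. The application does not define the matching rule, so the rule here is an assumption: an engine is affordable when its cost does not exceed a fixed fraction of the expected revenue, and the costliest affordable engine is chosen on the further assumption that higher cost corresponds to higher accuracy. The cost table and the parameter `max_cost_ratio` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical recognition cost of each engine in the shared set.
ENGINE_COSTS: Dict[str, float] = {"basic": 1.0, "standard": 2.5, "premium": 6.0}


def select_by_revenue(
    engine_costs: Dict[str, float],
    expected_revenue: float,
    max_cost_ratio: float = 0.1,  # assumed share of revenue that recognition may consume
) -> Optional[str]:
    """Return the costliest engine whose cost fits the budget, or None if none fits."""
    budget = expected_revenue * max_cost_ratio
    affordable = {name: cost for name, cost in engine_costs.items() if cost <= budget}
    if not affordable:
        return None
    return max(affordable, key=lambda name: affordable[name])


mid_revenue_choice = select_by_revenue(ENGINE_COSTS, expected_revenue=30.0)
high_revenue_choice = select_by_revenue(ENGINE_COSTS, expected_revenue=100.0)
low_revenue_choice = select_by_revenue(ENGINE_COSTS, expected_revenue=5.0)
```

Capping the engine cost relative to the expected revenue keeps the recognition cost from eroding the revenue of low-value services; other matching rules are equally compatible with the description.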

S104, outputting prompt information about processing the first service.

Here, the prompt information of the first service refers to flow information for processing the first service. For example, if the flow for processing the first service includes acquiring user identity information and acquiring user face data, the prompt information of the first service may include "please fill in the identity information currently displayed", "please face the camera", "please blink", "please turn your face left and right", and so on. By outputting the prompt information about processing the first service, the user can reply accordingly, for example by filling in identity information or facing the camera, and the terminal collects the user's reply to the prompt information of the first service to obtain the second multimedia data. Here, the second multimedia data may be of a voice data type, a video data type, or the like. If the second multimedia data is of a voice data type, the terminal records the voice with which the user replies to the prompt information of the first service to obtain voice data, namely the second multimedia data; if the second multimedia data is of a video data type, the terminal records the video with which the user replies to the prompt information of the first service to obtain video data, namely the second multimedia data.

S105, acquiring, from the terminal, the second multimedia data sent in response to the prompt information.

Here, since the terminal collects the second multimedia data replied by the user according to the prompt message of the first service in the above step, the terminal can send the second multimedia data to the computer device, and then the computer device obtains the second multimedia data sent for the prompt message.

S106, sending the second multimedia data to the first service platform, so that the first service platform uses the target recognition engine to recognize the second multimedia data and process the first service.

Here, the first service platform may refer to a platform that processes the first service. For example, if the first service is the handling of a credit card, the first service platform is a banking platform. And after the computer equipment sends the second multimedia data to the first service platform, the first service platform adopts the target recognition engine to recognize the second multimedia data and process the first service.

Specifically, the first service platform may use the target recognition engine to recognize the second multimedia data and determine its authenticity. If the second multimedia data is authentic, the first service is processed; if the second multimedia data is not authentic, processing of the first service ends. For example, if the second multimedia data includes facial information of the user, recognizing the second multimedia data using the target recognition engine may include: determining whether the facial information of the user included in the second multimedia data matches the facial information of the user stored by the first service platform; if so, the second multimedia data is considered authentic; if not, the second multimedia data is considered not authentic. The facial information of the user stored by the first service platform may be facial information that the user provided when transacting a historical service on the first service platform. For example, if the user has transacted a bank card on the first service platform, the stored facial information may be the facial information that the user left on record when transacting the bank card on the first service platform. If the user has not transacted any historical service on the first service platform, or did not store facial information when doing so, the facial information of the user may be obtained from other platforms that store it, for example from platforms corresponding to institutions such as the public security department or the civil affairs department.
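The authenticity check and the fallback to another platform may be sketched as follows. Everything concrete here is assumed for illustration: facial information is reduced to an opaque string, the stores are plain dictionaries, and the comparison function `exact_match` stands in for whatever comparison the target recognition engine performs. Treating a user with no reference on record as not authentic is also an assumption; the description does not cover that case.

```python
from typing import Callable, Dict, Optional


def exact_match(submitted: str, reference: str) -> bool:
    """Toy stand-in for the face comparison performed by the target recognition engine."""
    return submitted == reference


def verify_and_process(
    user_id: str,
    submitted_face: str,
    platform_faces: Dict[str, str],   # facial information kept by the first service platform
    external_faces: Dict[str, str],   # facial information kept by other platforms
    same_person: Callable[[str, str], bool] = exact_match,
) -> str:
    """Process the first service only if the submitted facial information is authentic."""
    reference: Optional[str] = platform_faces.get(user_id)
    if reference is None:
        # No facial information from a historical service: fall back to another platform.
        reference = external_faces.get(user_id)
    if reference is not None and same_person(submitted_face, reference):
        return "process first service"
    # Assumption: a missing reference is handled like a failed comparison.
    return "end processing"


platform_store = {"alice": "face-A"}
external_store = {"bob": "face-B"}

known_user = verify_and_process("alice", "face-A", platform_store, external_store)
fallback_user = verify_and_process("bob", "face-B", platform_store, external_store)
mismatch = verify_and_process("alice", "face-X", platform_store, external_store)
```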

Optionally, after the first service is processed, further multimedia data sent by the terminal may be acquired; second service attribute information is determined by recognizing that multimedia data; a recognition engine matching the second service attribute information is determined from the shared recognition engine set as a second recognition engine; and prompt information about processing the second service is output. The multimedia data sent in response to the prompt information of the second service is then acquired from the terminal and sent to a second service platform so that the second service is processed. That is to say, because the shared recognition engine set includes at least one recognition engine and different recognition engines correspond to different services, a plurality of recognition engines can be concentrated in one set so that different services share the recognition engines in the set, and recognition engines do not need to be customized for different services, which saves cost. The method also integrates various services, which makes it convenient to dock quickly to the service platforms: even if a user needs to handle several different services, the corresponding recognition engine can be docked and the corresponding service processed by recognizing the service attribute information in the multimedia data, which improves service processing efficiency.

Optionally, the computer device in the present application may be any node device in a blockchain. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point (P2P) transmission, consensus mechanisms, and encryption algorithms, and is essentially a decentralized database. A blockchain may consist of a series of transaction records (also called blocks) that are linked together and protected by cryptography; the distributed ledger formed by these linked blocks allows multiple parties to record transactions effectively and to check them permanently (they cannot be tampered with). The consensus mechanism is a mathematical algorithm for establishing trust and obtaining rights and interests among different nodes in the blockchain network; that is, it is a mathematical algorithm jointly recognized by the network nodes in the blockchain. By using the consensus mechanism of the blockchain, the present application enables multiple services to share the recognition engines in the recognition engine set, which avoids resource waste and saves cost.
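The cryptographic linking of blocks mentioned above can be illustrated with a minimal sketch. It is not the implementation of the present application: the block layout, the choice of SHA-256, and the names `block_hash`, `append_block`, and `is_valid` are assumptions made for illustration, and consensus among nodes is not modeled.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(index, prev_hash, records):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "records": records},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, records):
    """Append a block that is linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev_hash": prev_hash,
            "records": records,
            "hash": block_hash(index, prev_hash, records),
        }
    )


def is_valid(chain):
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["records"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous block's hash, changing a record in an earlier block invalidates that block and every block after it, which is the tamper-evidence property the paragraph describes.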

In an embodiment, the first service attribute information includes an identifier of the first service, and step S104 may include the following steps s11 to s13.

s11, determining, according to the identifier of the first service, the first service platform for processing the first service.

s12, acquiring prompt information about processing the first service from the first service platform.

s13, outputting the prompt information.

In steps s11 to s13, the identifier of the first service uniquely indicates the first service. For example, the identifier may be the name of the first service, a short form of the name, the pinyin of the name, an abbreviation of the pinyin of the name, or a number that represents the first service. The first service platform is a platform capable of processing the first service; for example, if the identifier of the first service is a secure bank card, the first service platform is a secure bank platform. After determining the first service platform, the computer device may acquire from it the flow information for handling the first service (for example, acquiring the user identity information and acquiring the user face data), derive the prompt information of the first service from that flow information, and output the prompt information to the terminal. The user can view the prompt information on the terminal and reply accordingly in order to handle the service.
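Steps s11 to s13 can be sketched as a lookup followed by prompt generation. This is only an illustration under stated assumptions: the registry `SERVICE_PLATFORMS`, the identifier `"bank_card"`, the flow entries, and all function names are hypothetical and do not come from the present application.

```python
# Hypothetical registry: service identifier -> platform name and handling flow.
SERVICE_PLATFORMS = {
    "bank_card": {
        "platform": "bank platform",
        "flow": ["provide identity information", "provide face data"],
    },
}


def determine_platform(service_id):
    """s11: look up the platform that can process the identified service."""
    entry = SERVICE_PLATFORMS.get(service_id)
    if entry is None:
        raise KeyError(f"no platform registered for service {service_id!r}")
    return entry


def get_prompt_info(service_id):
    """s12: turn the platform's flow information into prompt information."""
    entry = determine_platform(service_id)
    return [f"Please {step}." for step in entry["flow"]]


def output_prompts(service_id, send=print):
    """s13: output each prompt to the terminal (modeled here as a callable)."""
    prompts = get_prompt_info(service_id)
    for prompt in prompts:
        send(prompt)
    return prompts
```

Passing `send` as a parameter keeps the sketch independent of how the terminal is actually reached, for example a video call channel.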

In one embodiment, the first multimedia data includes first voice data, and step S102 may include the following step s21, or alternatively the following steps s22 to s23.

s21, performing voice recognition on the first voice data to obtain a first keyword associated with the service in the first voice data, and determining the first service attribute information according to the first keyword.

Here, the first voice data is data obtained by capturing the user's speech. The first keyword associated with the service may be, for example, the name of the service, an abbreviation of the name, or a number representing the service. The computer device performs voice recognition on the first voice data to obtain the first keyword, such as the name of the service, and determines the first service attribute information according to it. For example, if the first voice data is "I want to handle a bank card", the first keyword is "bank card", and the first service attribute information may be determined from the words before and after the first keyword, for example as including "handle a bank card".

In a specific implementation, the computer device may use automatic speech recognition (ASR) or another speech recognition technology to recognize the first voice data, obtain the first keyword associated with the service in the first voice data, and determine the first service attribute information according to the first keyword.

s22, converting the first voice data to obtain first text data corresponding to the first voice data.

Here, since the first voice data is voice-type data, the voice-type data can be converted into text-type data to obtain the first text data.

s23, extracting keywords from the first text data to obtain second keywords associated with the service in the first text data, and determining the first service attribute information according to the second keywords.

Here, the first keyword and the second keyword may be the same or different. The computer device converts the first voice data into the first text data, extracts keywords from the first text data to obtain the second keyword associated with the service, and determines the first service attribute information according to the second keyword.

In a specific implementation, the computer device first performs word segmentation on the first text data to divide it into at least one segmented word. It then acquires a stop word set, which includes at least one word irrelevant to any service, searches the segmented words for target words that match a stop word in the set, and deletes those target words. Finally, it extracts keywords from the segmented words that remain to obtain the second keyword, and determines the first service attribute information according to the second keyword.

For example, if the first text data is "I want to handle a bank card", word segmentation yields 4 segmented words: "I", "want", "handle", and "bank card". Each of the 4 segmented words is then matched against the stop words in the stop word set. If "I" and "want" match, those 2 segmented words are deleted, leaving "handle a bank card". Keyword extraction on the remaining words yields the second keyword "bank card", and the first service attribute information is determined according to the second keyword.
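The stop word filtering and keyword extraction just described can be sketched as follows. The sketch assumes the text has already been segmented (a real system would use a word segmentation tool), and the stop word set, the service vocabulary, and the function names are hypothetical values chosen to mirror the example above.

```python
def remove_stop_words(segments, stop_words):
    """Delete every segmented word that matches a word in the stop word set."""
    return [s for s in segments if s.lower() not in stop_words]


def extract_keywords(segments, service_vocabulary):
    """Keep only segmented words that are known service-related keywords."""
    return [s for s in segments if s.lower() in service_vocabulary]


# Assumed output of word segmentation for "I want to handle a bank card".
segments = ["I", "want", "handle", "bank card"]
stop_words = {"i", "want"}          # hypothetical stop word set
service_vocabulary = {"bank card"}  # hypothetical service keyword list

remaining = remove_stop_words(segments, stop_words)
keywords = extract_keywords(remaining, service_vocabulary)
```

Here `remaining` holds the words left after the stop words are deleted, and `keywords` holds the second keyword from which the first service attribute information would be determined.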

In a specific implementation, either approach may be chosen according to specific requirements: performing voice recognition directly on the first voice data, or converting the first voice data into text data and extracting keywords from it. For example, voice recognition costs less and may be used when saving cost is the priority; converting the voice data into text data and extracting keywords is more accurate and may be used when recognition accuracy is the priority.

The first service attribute information can thus be obtained either by performing voice recognition on the first voice data or by converting the first voice data into text data and extracting keywords from the text data, so that the biometric recognition engine and the first service platform can be determined according to the first service attribute information and the corresponding service can then be processed.

In an embodiment, the first service attribute information includes a service level of the first service, and step S103 may include the following steps s31 to s32.

s31, acquiring the recognition level of each recognition engine in the shared recognition engine set, where the recognition level of a recognition engine reflects the accuracy with which the recognition engine recognizes multimedia data.

s32, determining the recognition engine in the shared recognition engine set whose recognition level matches the service level of the first service as the target recognition engine.

In steps s31 to s32, the higher the recognition level of a recognition engine, the more accurately it recognizes multimedia data; the lower the recognition level, the less accurately it does so. Likewise, the higher the service level of the first service, the higher the recognition level required for the identification data that must be acquired to process the first service; the lower the service level, the lower the required recognition level. For example, if the identification data that must be acquired to process the first service is voice data, the service level of the first service is relatively low, and the recognition engine matching that service level has a relatively low recognition level; if the identification data is face data, the service level of the first service is relatively high, and the matching recognition engine has a relatively high recognition level.

Optionally, when the identification data that must be acquired to process the first service includes at least two of voice data, fingerprint data, and face data, the service level of the first service may be determined according to the types of identification data. For example, when the identification data includes voice data, fingerprint data, and face data, the service level of the first service is higher; when it includes only voice data and fingerprint data, the service level is lower. As a further example, suppose that processing first service 1 to first service 4 requires identification data 1 to identification data 4 respectively, where identification data 1 includes voice data and fingerprint data, identification data 2 includes voice data and face data, identification data 3 includes fingerprint data and face data, and identification data 4 includes voice data, fingerprint data, and face data. Then the service level of first service 1 is lower than that of first service 2, the service level of first service 2 is lower than that of first service 3, and the service level of first service 3 is lower than that of first service 4.

By acquiring the recognition level of each recognition engine in the shared recognition engine set and determining the recognition engine whose recognition level matches the service level of the first service as the target recognition engine, a recognition engine with a lower recognition level can be used when the service level of the first service is lower, which saves cost, and a recognition engine with a higher recognition level can be used when the service level is higher, which improves the accuracy of recognizing the multimedia data.
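One way to make the level matching concrete is sketched below. The text does not define how a service level is computed or what "matches" means, so both are assumptions here: the per-type weights in `DATA_TYPE_WEIGHTS` are chosen only because their sums reproduce the ordering given for first service 1 to first service 4, and the matching rule picks the lowest recognition level that is still at least the service level. The `Engine` class and engine names are hypothetical.

```python
from dataclasses import dataclass

# Assumed weights; voice < fingerprint < face reproduces the ordering in the text.
DATA_TYPE_WEIGHTS = {"voice": 1, "fingerprint": 2, "face": 3}


def service_level(required_types):
    """Derive a service level from the identification data types required."""
    return sum(DATA_TYPE_WEIGHTS[t] for t in required_types)


@dataclass(frozen=True)
class Engine:
    name: str
    recognition_level: int


def select_engine(engines, level):
    """Assumed rule: the cheapest sufficient engine, i.e. the lowest
    recognition level that is not below the service level."""
    candidates = [e for e in engines if e.recognition_level >= level]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.recognition_level)


shared_engines = [Engine("basic", 3), Engine("standard", 5), Engine("premium", 6)]
target = select_engine(shared_engines, service_level(["voice", "face"]))
```

With these assumed weights, a service requiring voice and face data has level 4, so `target` is the "standard" engine: a lower-level engine would be insufficient and a higher-level one would cost more than needed.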

In an embodiment, the first service attribute information includes a service revenue of the first service, and step S103 may include the following steps s41 to s42.

s41, acquiring the recognition cost of each recognition engine in the shared recognition engine set.

s42, determining the recognition engine in the shared recognition engine set whose recognition cost matches the service revenue of the first service as the target recognition engine.

In steps s41 to s42, the service revenue of the first service may be the revenue expected from the first service, and the recognition cost of a recognition engine is the amount that must be paid to purchase or use that recognition engine. The lower the recognition cost of the recognition engine, the higher the service revenue of the first service; the higher the recognition cost, the lower the service revenue. The computer device acquires the recognition cost of each recognition engine in the shared recognition engine set and determines the recognition engine whose recognition cost matches the service revenue of the first service as the target recognition engine. When the service revenue of the first service is high, a recognition engine with a low recognition cost is used to recognize the multimedia data, so that the recognition cost is reduced and the service revenue of the first service is improved.

Optionally, refer to fig. 2, which is a schematic flowchart of a data processing method according to an embodiment of the present application. The method is applied to a computer device; as shown in fig. 2, the method includes:

S201, acquiring first multimedia data about a first service from a terminal.

S202, identifying the first multimedia data to obtain first service attribute information.

S203, determining, from the shared recognition engine set, a recognition engine that matches the first service attribute information as the target recognition engine.

S204, outputting prompt information about processing the first service.

S205, acquiring, from the terminal, second multimedia data sent in response to the prompt information.

Here, the second multimedia data includes first video data and second voice data. For the specific implementation of steps S201 to S205, refer to the description of steps S101 to S105 in the embodiment corresponding to fig. 1; details are not repeated here.

S206, acquiring a first image of the user corresponding to the terminal according to the first video data.

Here, the first video data is video data that the terminal captures while the user replies to the prompt information about processing the first service. The first video data includes a facial image of the user.

The computer device may capture frames from the first video data at a preset interval to obtain first images that contain the user's face, thereby obtaining the first image of the user corresponding to the terminal. For example, a frame may be captured from the first video data every 0.5 second to obtain a first image; if the duration of the first video data is 2 seconds, 4 first images are obtained.
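The arithmetic of the sampling example can be checked with a short sketch. The function name `sample_timestamps` is hypothetical, and the choice to sample at the start of each interval (0.0 s, 0.5 s, and so on) is an assumption, since the text does not say where within each interval the frame is taken. Decoding the actual frames would require a video library and is not shown.

```python
def sample_timestamps(duration_s, interval_s=0.5):
    """Return the timestamps (in seconds) at which frames would be captured.

    Assumes one frame at the start of every full interval within the video.
    """
    if duration_s < 0 or interval_s <= 0:
        raise ValueError("duration must be >= 0 and interval must be > 0")
    # Small epsilon guards against floating-point round-off in the division.
    count = int(duration_s / interval_s + 1e-9)
    return [i * interval_s for i in range(count)]


timestamps = sample_timestamps(2.0, 0.5)
```

For a 2-second video sampled every 0.5 second, `timestamps` contains 4 entries, matching the 4 first images in the example.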

S207, sending the first image, the first video data, and the second voice data to the first service platform, so that the first service platform verifies the legitimacy of the terminal according to the first image and, when the terminal is legitimate, recognizes the first video data and the second voice data using the target recognition engine to process the first service.

Here, the second voice data is voice data that the terminal captures while the user replies to the prompt information about processing the first service. The computer device sends the first image, the first video data, and the second voice data to the first service platform so that the first service platform can verify the legitimacy of the terminal according to the first image and, when the terminal is legitimate, recognize the first video data and the second voice data using the target recognition engine to process the first service.

In a specific implementation, after the first service platform acquires the first image, the first video data, and the second voice data, it may recognize the first image using the target recognition engine and determine whether the facial image of the user in the first image and the user image stored by the first service platform are facial images of the same user. If so, the terminal is determined to be legitimate, and the first video data and the second voice data are recognized using the target recognition engine to process the first service. If not, the terminal is determined not to be legitimate, and warning information indicating that the terminal is not legitimate is generated so that the user can adjust posture according to the warning information.

In a possible implementation, when the first service platform recognizes the first video data and the second voice data using the target recognition engine, it may obtain from the first video data a third image of the user corresponding to the second voice data, that is, an image of the user captured while the user answers a question posed by the prompt information of the first service. The third image includes a facial image of the user. Micro-expression recognition is performed on the third image so that the authenticity of the user's answer can be assessed from the user's micro-expressions while answering. If micro-expression recognition indicates that the authenticity of the user's answer is high, the first service is processed. If it indicates that the authenticity is low, indication information for secondary verification of the user's identity is sent, or the question at which the user's micro-expression was abnormal is output again. If the secondary verification passes, or the user's expression when answering the question again indicates high authenticity, the first service is processed. If the secondary verification fails, or the user's expression when answering again indicates low authenticity, information is output instructing the user to handle the service through the manual service handling process corresponding to the first service platform, and handling of the first service ends.
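The branching in this implementation can be summarized as a small decision function. This is a sketch under assumptions: the micro-expression recognition itself is abstracted into a boolean input, and the function name and the returned action labels are hypothetical.

```python
def next_action(answer_credible, already_rechecked):
    """Decide what to do after one round of micro-expression recognition.

    answer_credible: True if recognition indicates high authenticity.
    already_rechecked: True if secondary verification or a repeated
    question has already been attempted once.
    """
    if answer_credible:
        return "process_service"
    if not already_rechecked:
        # Low authenticity on the first pass: verify identity again
        # or output the question again.
        return "recheck"
    # Low authenticity again: hand over to manual service handling.
    return "manual_handling"
```

For example, a first low-authenticity result gives `next_action(False, False) == "recheck"`, and a second one gives `next_action(False, True) == "manual_handling"`.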

By acquiring the first image from the first video data and sending it to the first service platform for verification, the reliability of user identity verification can be improved. By performing micro-expression recognition on the third image from the first video data, the first service platform can assess the authenticity of the user's answers, which provides a secondary verification of the user's identity information and improves the accuracy of service handling.

In one embodiment, the above method may further include the following steps s51 to s54.

s51, if warning information sent by the first service platform indicating that the terminal is not legitimate is acquired, outputting adjustment information instructing the user to adjust posture.

s52, acquiring third multimedia data sent by the terminal in response to the adjustment information, where the third multimedia data includes third video data.

s53, acquiring a second image of the user according to the third video data.

s54, sending the second image to the first service platform, so that the first service platform verifies the legitimacy of the terminal according to the second image.

In steps s51 to s54, if the computer device acquires warning information sent by the first service platform indicating that the terminal is not legitimate, it outputs adjustment information instructing the user to adjust posture, and the user adjusts accordingly. For example, if the user's face was not aligned with the camera of the terminal, the user's face is aligned with the camera after the adjustment; or, if the camera's field of view contained both user A and user B, and user A is the user who needs to handle the first service, the field of view contains only user A after the adjustment.

The computer device acquires the third multimedia data sent by the terminal in response to the adjustment information, where the third multimedia data includes third video data; acquires a second image of the user according to the third video data; and sends the second image to the first service platform so that the first service platform verifies the legitimacy of the terminal according to the second image. The second image includes a facial image of the user. If the facial image in the second image and the facial image of the user stored by the first service platform belong to the same user, the terminal is legitimate and the first service is processed. If they do not belong to the same user, the terminal is not legitimate; processing of the first service ends, and the terminal instructs the user to handle the first service through the manual service handling department corresponding to the first service platform. By outputting adjustment information that prompts the user to adjust posture when the terminal fails verification, the legitimacy of the terminal can be verified again, which improves the reliability of user identity verification.
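The verify, adjust, and re-verify flow of steps s51 to s54 can be sketched with the platform check and the image capture passed in as callables. All names here are hypothetical, and the face comparison performed by the first service platform is abstracted into the `verify` callable.

```python
def verify_with_adjustment(verify, first_image, capture_second_image, notify=print):
    """Run the first verification and, on failure, one adjusted retry.

    verify: callable(image) -> bool, standing in for the platform's check.
    capture_second_image: callable() -> image taken after posture adjustment.
    notify: callable(str), standing in for output to the terminal.
    """
    if verify(first_image):
        return "process_service"
    # s51: the platform reported the terminal as not legitimate.
    notify("Please adjust your posture and face the camera.")
    # s52 and s53: obtain the second image from the new video data.
    second_image = capture_second_image()
    # s54: the platform verifies again using the second image.
    if verify(second_image):
        return "process_service"
    notify("Please handle this service through manual service.")
    return "manual_handling"
```

Only one retry is modeled, which follows the text: a failed second verification ends processing and directs the user to manual handling.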

The method of the embodiments of the present application is described above, and the apparatus of the embodiments of the present application is described below.

Referring to fig. 3, fig. 3 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing apparatus may be a computer program (including program code) running in a computer device; for example, the data processing apparatus is application software. The apparatus may be used to perform the corresponding steps in the methods provided by the embodiments of the present application. The apparatus 30 includes:

a first obtaining module 301, configured to obtain first multimedia data about a first service from a terminal;

a data identification module 302, configured to identify the first multimedia data to obtain the first service attribute information, where the first service attribute information includes at least one of a service level of the first service or a service revenue of the first service;

an engine determining module 303, configured to determine, from the shared recognition engine set, a recognition engine that matches the first service attribute information as a target recognition engine;

an information output module 304, configured to output prompt information about processing the first service;

a second obtaining module 305, configured to obtain, from the terminal, second multimedia data sent for the prompt message;

a service processing module 306, configured to send the second multimedia data to the first service platform, so that the first service platform recognizes the second multimedia data using the target recognition engine and processes the first service.

Optionally, the information output module 304 is configured to:

determining, according to the identifier of the first service, the first service platform for processing the first service;

acquiring prompt information about processing the first service from the first service platform;

and outputting the prompt information.

Optionally, the first multimedia data includes first voice data, and the data identification module 302 is specifically configured to:

performing voice recognition on the first voice data to obtain a first keyword associated with a service in the first voice data, and determining the first service attribute information according to the first keyword;

or, converting the first voice data to obtain first text data corresponding to the first voice data;

extracting keywords from the first text data to obtain second keywords associated with the service in the first text data; and determining the first service attribute information according to the second keyword.

Optionally, the first service attribute information includes a service level of the first service; the engine determining module 303 is specifically configured to:

acquiring the recognition level of a recognition engine in the shared recognition engine set, wherein the recognition level of the recognition engine is used for reflecting the accuracy of the recognition engine for recognizing the multimedia data;

and determining the recognition engine with the recognition level matched with the service level of the first service in the shared recognition engine set as the target recognition engine.

Optionally, the first service attribute information includes a service revenue of the first service; the engine determining module 303 is specifically configured to:

acquiring the recognition cost of the recognition engines in the shared recognition engine set;

and determining the recognition engine with the recognition cost matched with the service revenue of the first service in the shared recognition engine set as the target recognition engine.

Optionally, the second multimedia data includes first video data and second voice data; the service processing module 306 is specifically configured to:

acquiring a first image of a user corresponding to the terminal according to the first video data;

and sending the first image, the first video data and the second voice data to the first service platform so that the first service platform verifies the legitimacy of the terminal according to the first image, and when the terminal is legitimate, recognizing the first video data and the second voice data by adopting the target recognition engine to process the first service.

Optionally, the apparatus further comprises: an adjusting module 307, configured to:

if warning information sent by the first service platform indicating that the terminal is not legitimate is acquired, outputting adjustment information instructing the user to adjust posture;

acquiring third multimedia data sent by the terminal in response to the adjustment information, wherein the third multimedia data comprises third video data;

acquiring a second image of the user according to the third video data;

and sending the second image to the first service platform so that the first service platform verifies the legitimacy of the terminal according to the second image.

It should be noted that, for the content that is not mentioned in the embodiment corresponding to fig. 3, reference may be made to the description of the method embodiment, and details are not described here again.

Referring to fig. 4, fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 4, the computer device 40 may include a processor 401, a network interface 404, and a memory 405, and may further include a user interface 403 and at least one communication bus 402. The communication bus 402 is used to enable connection and communication between these components. The user interface 403 may include a display and a keyboard; optionally, the user interface 403 may also include a standard wired interface and a standard wireless interface. The network interface 404 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 405 may be a high-speed RAM or a non-volatile memory (e.g., at least one disk memory). Optionally, the memory 405 may also be at least one storage device located remotely from the processor 401. As shown in fig. 4, the memory 405, which is a type of computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.

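In the computer device 40 shown in fig. 4, the network interface 404 may provide network communication functions; the user interface 403 is primarily an interface for receiving input from a user; and the processor 401 may be used to invoke the device control application program stored in the memory 405 to implement:

acquiring first multimedia data about a first service from a terminal;

identifying the first multimedia data to obtain first service attribute information;

determining a recognition engine matched with the first service attribute information from the shared recognition engine set as a target recognition engine;

outputting prompt information about processing the first service;

acquiring second multimedia data sent aiming at the prompt information from the terminal;

and sending the second multimedia data to the first service platform so that the first service platform adopts the target recognition engine to recognize the second multimedia data and process the first service.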

It should be understood that the computer device 40 described in this embodiment may perform the data processing method described in the embodiments corresponding to fig. 1 and fig. 2, and may also implement the data processing apparatus described in the embodiment corresponding to fig. 3; details are not repeated here. The beneficial effects, which are the same as those of the method, are likewise not repeated.

Embodiments of the present application also provide a computer-readable storage medium storing a computer program. The computer program comprises program instructions which, when executed by a computer, cause the computer to perform the method according to the foregoing embodiments; the computer may be a part of the above-mentioned computer device, such as the processor 401 described above. By way of example, the program instructions may be executed on one computer device, or on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network, which may comprise a blockchain network.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above disclosure merely illustrates preferred embodiments of the present application and is not to be construed as limiting its scope; equivalent variations and modifications made to the present application therefore still fall within the scope of the present application.

Claims

1. A data processing method, comprising:

acquiring first multimedia data about a first service from a terminal;

determining a recognition engine matched with the first service attribute information from a shared recognition engine set as a target recognition engine;

outputting prompt information about processing the first service;

2. The method of claim 1, wherein the first service attribute information further comprises an identification of the first service, and wherein outputting the prompt information for processing the first service comprises:

and outputting the first prompt message.

3. The method of claim 1, wherein the first multimedia data comprises first voice data, and wherein the identifying the first multimedia data to obtain the first service attribute information comprises:

performing voice recognition on the first voice data to obtain a first keyword associated with a service in the first voice data, and determining the first service attribute information according to the first keyword; or,

converting the first voice data to obtain first text data corresponding to the first voice data;

extracting keywords from the first text data to obtain second keywords which are related to services in the first text data; and determining the first service attribute information according to the second keyword.

4. The method of claim 1, wherein the first service attribute information comprises a service level of the first service;

the determining, as a target recognition engine, a recognition engine from the shared recognition engine set that matches the first service attribute information includes:

acquiring the recognition level of a recognition engine in the shared recognition engine set, wherein the recognition level reflects the accuracy with which the recognition engine recognizes multimedia data;

5. The method of claim 1, wherein the first service attribute information comprises a service benefit of the first service, and wherein determining a recognition engine from the shared recognition engine set that matches the first service attribute information as a target recognition engine comprises:

determining a recognition engine in the shared recognition engine set whose recognition cost matches the service benefit of the first service as the target recognition engine.
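One possible reading of matching a recognition cost against what the first service earns is sketched below. The claim does not define the matching rule, so the rule used here (pick the most expensive engine whose cost stays within a fixed fraction of the benefit) is an assumption, as are the `RecognitionEngine` type, the `max_cost_ratio` parameter, and the example values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RecognitionEngine:
    """Hypothetical record for one engine in the shared recognition engine set."""
    name: str
    recognition_cost: float  # cost per use, in an arbitrary unit


def select_engine_by_benefit(
    engines: List[RecognitionEngine],
    service_benefit: float,
    max_cost_ratio: float = 0.1,
) -> Optional[RecognitionEngine]:
    """Pick the costliest engine whose cost fits within the service's budget.

    The budget is assumed to be service_benefit * max_cost_ratio. Returns None
    when no engine in the set is affordable.
    """
    budget = service_benefit * max_cost_ratio
    affordable = [e for e in engines if e.recognition_cost <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda e: e.recognition_cost)
```

Under this rule a low-benefit service is routed to a cheap engine and a high-benefit service to a more expensive one, which is how sharing a single engine set could avoid paying for a premium engine on every call.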

6. The method of claim 1, wherein the second multimedia data comprises first video data and second voice data; the sending the second multimedia data to a first service platform so that the first service platform adopts the target recognition engine to recognize the second multimedia data and process the first service includes:

sending the first image, the first video data and the second voice data to the first service platform, so that the first service platform verifies the legality of the terminal according to the first image and, when the terminal is legal, recognizes the first video data and the second voice data with the target recognition engine to process the first service.
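The platform-side order of operations in claim 6 (verify the terminal from the first image, then recognize the video and voice data only if verification passes) can be sketched as below. The function name and both callables are hypothetical placeholders: the claim does not specify how legality is verified or what the target recognition engine returns.

```python
from typing import Callable, Optional


def process_first_service(
    first_image: bytes,
    first_video: bytes,
    second_voice: bytes,
    is_terminal_legal: Callable[[bytes], bool],
    target_engine: Callable[[bytes, bytes], str],
) -> Optional[str]:
    """Run recognition only after the terminal passes the legality check.

    is_terminal_legal stands in for the platform's image-based verification;
    target_engine stands in for the selected recognition engine. Returns the
    recognition result, or None when the terminal is not legal.
    """
    if not is_terminal_legal(first_image):
        # Recognition is skipped entirely for an unverified terminal.
        return None
    return target_engine(first_video, second_voice)
```

Passing the checker and the engine in as callables keeps the gate independent of any particular verification method or engine.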

7. The method of claim 6, further comprising:

acquiring a second image of the user according to the third video data;

8. A data processing apparatus, comprising:

a data identification module, configured to identify the first multimedia data to obtain the first service attribute information, where the first service attribute information includes at least one of a service level of the first service or a service benefit of the first service;

an engine determination module, configured to determine, from a shared recognition engine set, a recognition engine that matches the first service attribute information as a target recognition engine;

a service processing module, configured to send the second multimedia data to a first service platform, so that the first service platform adopts the target recognition engine to recognize the second multimedia data and process the first service.

9. A computer device, comprising: a processor, a memory, and a network interface;

the processor is connected to the memory and the network interface, wherein the network interface is configured to provide data communication functions, the memory is configured to store program code, and the processor is configured to call the program code to perform the method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-7.