CN111462783A

CN111462783A - Audio and video recording guiding method and device, computer equipment and storage medium

Info

Publication number: CN111462783A
Application number: CN202010147531.9A
Authority: CN
Inventors: 郭锦宏
Original assignee: OneConnect Financial Technology Co Ltd Shanghai
Current assignee: OneConnect Smart Technology Co Ltd; OneConnect Financial Technology Co Ltd Shanghai
Priority date: 2020-03-05
Filing date: 2020-03-05
Publication date: 2020-07-28
Also published as: WO2021175019A1

Abstract

The invention discloses an audio and video recording guiding method and device, computer equipment and a storage medium. The method comprises: receiving a service signing request sent by a client and acquiring a target service identifier from the service signing request; acquiring, from a preset rule base, a target double recording link corresponding to the target service identifier, wherein the target double recording link comprises at least one basic link; generating AI voice information based on a double recording rule and, through the AI voice information, guiding each basic link in the target double recording link to perform double recording processing in the order of the sequence IDs, to obtain double recording data corresponding to each basic link; and finally summarizing the double recording data to obtain target double recording information. Guiding by AI voice information avoids flow errors, and dividing the process into a plurality of basic links reduces the time cost of re-recording when an audio and video recording is non-compliant, which improves the efficiency of audio and video recording.

Description

Audio and video recording guiding method and device, computer equipment and storage medium

Technical Field

The invention relates to the technical field of computers, in particular to an audio and video recording guiding method and device, computer equipment and a storage medium.

Background

At present, in some service signing scenarios with strict requirements, double recording, that is, audio and video recording, is required during the signing process. Double recording mainly refers to a practice in which the service party collects audio-visual data and electronic data by audio and video recording and other technical means, and records and stores the key links of the signing process, so that signing behavior can be played back, important information can be queried, responsibility for problems can be confirmed, and non-compliance can be avoided.

With social and economic development, individuals and organizations are involved in more and more services, and most of these services have high security requirements, so audio and video must be recorded during signing. Currently, people are mainly guided through service signing and double recording manually, and the double recording videos are reviewed afterwards to inspect the quality of the signing process. With this approach, however, when a quality problem occurs during recording, the audio and video must be recorded again, so recording efficiency is low.

Disclosure of Invention

The embodiment of the invention provides an audio and video recording guiding method, an audio and video recording guiding device, computer equipment and a storage medium, and aims to improve the current audio and video recording efficiency.

In order to solve the above technical problem, an embodiment of the present application provides an audio and video recording guidance method, including:

receiving a service signing request sent by a client, and acquiring a target service identifier from the service signing request;

acquiring a target double recording link corresponding to the target service identifier from a preset rule base, wherein the target double recording link comprises at least one basic link, and each basic link corresponds to a sequence ID and a double recording rule;

generating AI voice information based on the double recording rule, and guiding each basic link in the target double recording link to carry out double recording processing according to the sequence of the sequence ID through the AI voice information to obtain double recording data corresponding to each basic link in the target double recording link;

and summarizing each double-recording data to obtain target double-recording information.

Optionally, the step of guiding, by the AI voice information, each basic link in the target double recording link to perform double recording processing according to the sequence of the sequence ID, and obtaining double recording data corresponding to each basic link in the target double recording link includes:

when the starting of the basic link is detected, recording a starting time point, and acquiring an entry mode corresponding to the basic link;

performing voice-guided double recording according to the entry mode to obtain temporary data, and recording an ending time point of the entry;

performing AI quality inspection on the temporary data to obtain a quality inspection result;

and when the quality inspection result is that the quality inspection is passed, taking the temporary data as double-record data corresponding to the basic link, and determining time range information corresponding to the double-record data according to the starting time point and the ending time point.

Optionally, the quality inspection method of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:

acquiring voice information in the temporary data, and performing voice recognition on the voice information to obtain text information corresponding to the voice information;

performing semantic recognition on the text information to obtain a semantic recognition result;

and determining whether the voice information in the temporary data is qualified or not according to the semantic recognition result and a preset judgment mode, if so, determining that the voice quality inspection is passed, and if not, determining that the voice quality inspection is failed.
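
As an illustration only, the judgment step of the voice quality inspection could be sketched as follows. The sketch assumes that speech recognition has already produced a transcript and that the preset judgment mode is a simple required-phrase check. The function name `voice_quality_check` and the phrase list are hypothetical; the application itself leaves the semantic recognition and the judgment mode open.

```python
def voice_quality_check(transcript, required_phrases):
    """Return (passed, missing_phrases) for a recognized transcript.

    Assumption: the "preset judgment mode" is modeled as a case-insensitive
    check that every required phrase occurs in the transcript.
    """
    # Collapse whitespace and lowercase so spacing and case do not matter.
    normalized = " ".join(transcript.lower().split())
    missing = [p for p in required_phrases if p.lower() not in normalized]
    return (not missing, missing)
```

A failed check returns the missing phrases, which could serve as the failure reason when new voice guidance is generated for re-entry.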

Optionally, the quality inspection method of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:

extracting video information in the temporary data, and extracting video frame images from the video information according to a preset interval;

carrying out face recognition on each video frame image, and taking the video frame image containing the face image as a target image;

and performing identity authentication on the target image, confirming identity information corresponding to the target image, performing consistency proofreading on the identity information and identity information in the service information to obtain a proofreading result, and determining the quality inspection result according to the proofreading result.
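
A minimal sketch of two parts of this behavioral quality inspection is given below: choosing frames at a preset interval and checking identity consistency. Face recognition and identity authentication themselves are out of scope here and are assumed to yield a list of recognized identity strings. The function names and the rule that every recognized identity must match are assumptions made for illustration.

```python
def sample_frame_indices(total_frames, fps, interval_s):
    """Indices of frames taken every `interval_s` seconds, starting at frame 0."""
    if total_frames < 0 or fps <= 0 or interval_s <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval_s must be > 0")
    # Convert the time interval into a whole number of frames (at least 1).
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))


def identities_consistent(recognized_ids, expected_id):
    """Pass only if at least one face was recognized and every one matches."""
    return bool(recognized_ids) and all(i == expected_id for i in recognized_ids)
```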

Optionally, the quality inspection method of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:

acquiring a picture file in the temporary data, and analyzing the picture file by OCR recognition to obtain certificate information contained in the picture file;

and checking the certificate information and the service information, and determining the certificate quality inspection result according to the checking result.
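
The checking step could look roughly like the sketch below, which assumes that OCR has already produced a dictionary of certificate fields. The field names `name` and `id_number` and the function `check_certificate` are hypothetical; the application does not specify which fields are compared.

```python
def check_certificate(ocr_fields, service_info, keys=("name", "id_number")):
    """Compare selected OCR fields with the service information.

    Returns (passed, mismatched_keys). A field that is missing on either
    side counts as a mismatch.
    """
    mismatched = []
    for key in keys:
        seen = ocr_fields.get(key)
        expected = service_info.get(key)
        if seen is None or expected is None or seen.strip() != expected.strip():
            mismatched.append(key)
    return (not mismatched, mismatched)
```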

Optionally, after performing AI quality inspection on the temporary data to obtain a quality inspection result, and when the quality inspection result is that the quality inspection passes, taking the temporary data as the double-recording data corresponding to the basic link, and determining time range information corresponding to the double-recording data according to the start time point and the end time point, the audio/video recording guidance method further includes:

if the quality inspection result is quality inspection failure, generating corresponding voice guide information according to the reason of the quality inspection failure, and generating an updated starting time point;

and playing the voice guide information to enable the user to re-enter according to the voice guide information to obtain updated temporary data, generating an updated end time point, returning to the step of performing AI quality inspection on the temporary data to obtain a quality inspection result, and continuing to execute the step until the obtained quality inspection result is passed.
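
The re-entry loop described above can be sketched as follows. The callables `record`, `inspect`, `speak` and `clock` are hypothetical stand-ins for the recording, AI quality inspection, voice playback and timing components. The `max_attempts` limit is an added assumption so that the sketch always terminates; the text above simply repeats until the quality inspection passes.

```python
def record_until_passed(record, inspect, speak, clock, max_attempts=3):
    """Record one basic link, re-recording until quality inspection passes.

    Returns (data, (start, end)) for the attempt that passed.
    """
    start = clock()                       # starting time point
    for _ in range(max_attempts):
        data = record()                   # temporary data
        end = clock()                     # ending time point
        passed, reason = inspect(data)    # AI quality inspection
        if passed:
            return data, (start, end)
        # Guidance is derived from the failure reason, then the start is updated.
        speak(f"Please try again: {reason}")
        start = clock()
    raise RuntimeError("quality inspection did not pass within the attempt limit")
```

Because the start and end time points are refreshed on every attempt, the returned time range always describes the attempt that passed.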

Optionally, after the summarizing is performed on each double-recording data to obtain target double-recording information, the audio and video recording guiding method further includes:

if a spot check request sent by a management end is received, acquiring the preset key links corresponding to the service;

acquiring time range information corresponding to each preset key link as a target spot check time;

and extracting data corresponding to the target spot check time from the target double recording information as information to be spot checked, and sending the information to be spot checked to the management end.
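
One possible reading of this extraction step is sketched below. It assumes the recorded material is available as timestamped items and that each link's time range is stored as a `(start, end)` pair; both representations and the function name `extract_for_spot_check` are assumptions for illustration.

```python
def extract_for_spot_check(time_ranges, key_links, items):
    """Select the recorded items that fall inside each key link's time range.

    time_ranges: {link_id: (start, end)}
    key_links:   link ids configured as key links for the service
    items:       list of (timestamp, payload) pairs from the full recording
    """
    selected = {}
    for link_id in key_links:
        start, end = time_ranges[link_id]
        selected[link_id] = [payload for t, payload in items if start <= t <= end]
    return selected
```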

In order to solve the above technical problem, an embodiment of the present application further provides an audio/video recording guiding device, including:

the request receiving module is used for receiving a service signing request sent by a client and acquiring a target service identifier from the service signing request;

a link obtaining module, configured to obtain, from a preset rule base, a target double recording link corresponding to the target service identifier, where the target double recording link includes at least one basic link, and each basic link corresponds to one sequence ID and one double recording rule;

the double recording module is used for generating AI voice information based on the double recording rule, and guiding each basic link in the target double recording link to carry out double recording processing according to the sequence of the sequence ID through the AI voice information to obtain double recording data corresponding to each basic link in the target double recording link;

and the summarizing module is used for summarizing each double-recording data to obtain target double-recording information.

Optionally, the dual recording module includes:

the starting recording unit is used for recording a starting time point when the starting of the basic link is detected, and acquiring an entry mode corresponding to the basic link;

the recording ending unit is used for carrying out voice-guided double recording according to the entry mode to obtain temporary data, and recording an ending time point of the entry;

the quality inspection unit is used for carrying out AI quality inspection on the temporary data to obtain a quality inspection result;

and the data determining unit is used for taking the temporary data as double-record data corresponding to the basic link when the quality inspection result is that the quality inspection passes, and determining time range information corresponding to the double-record data according to the starting time point and the ending time point.

Optionally, the quality inspection method of the AI quality inspection is voice quality inspection, and the quality inspection unit includes:

the voice recognition subunit is configured to acquire voice information in the temporary data, perform voice recognition on the voice information, and obtain text information corresponding to the voice information;

the semantic recognition subunit is used for carrying out semantic recognition on the text information to obtain a semantic recognition result;

and the result judging subunit is used for determining whether the voice information in the temporary data is qualified or not according to the semantic recognition result and a preset judging mode, if so, confirming that the voice quality inspection is passed, and if not, confirming that the voice quality inspection is failed.

Optionally, the quality inspection method of the AI quality inspection is behavioral quality inspection, and the quality inspection unit includes:

the image extraction subunit is used for extracting the video information in the temporary data and extracting video frame images from the video information according to a preset interval;

the face recognition subunit is used for carrying out face recognition on each video frame image and taking the video frame image containing the face image as a target image;

and the identity verification subunit is used for performing identity authentication on the target image, confirming identity information corresponding to the target image, performing consistency verification on the identity information and identity information in the service information to obtain a verification result, and determining the quality inspection result according to the verification result.

Optionally, the quality inspection method of the AI quality inspection is certificate quality inspection, and the quality inspection unit includes:

the image analysis subunit is configured to acquire a picture file in the temporary data, and analyze the picture file by OCR recognition to obtain certificate information contained in the picture file;

and the certificate checking subunit is used for checking the certificate information and the service information and determining the certificate quality inspection result according to the checking result.

Optionally, the audio/video recording guiding apparatus further includes:

the guide information regeneration module is used for generating corresponding voice guide information according to the reason of quality inspection failure and generating an updated starting time point if the quality inspection result is that the quality inspection fails;

and the voice guide module is used for playing the voice guide information so that the user re-enters according to the voice guide information to obtain updated temporary data, generating an updated end time point, returning to the step of performing AI quality inspection on the temporary data to obtain a quality inspection result, and continuing to execute the step until the obtained quality inspection result is passed.

Optionally, the audio/video recording guiding apparatus further includes:

the spot check link acquisition module is used for acquiring the preset key links corresponding to the service if a spot check request sent by a management end is received;

the spot check time determining module is used for acquiring time range information corresponding to each preset key link and taking the time range information as a target spot check time;

and the spot check information determining module is used for extracting data corresponding to the target spot check time from the target double recording information as information to be spot checked, and sending the information to be spot checked to the management end.

In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program that is stored in the memory and can be run on the processor, where the processor implements the steps of the above audio/video recording guidance method when executing the computer program.

In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the audio/video recording and guiding method are implemented.

In the embodiments of the present invention, a service signing request sent by a client is received and a target service identifier is acquired from it; a target double recording link corresponding to the target service identifier is acquired from a preset rule base, wherein the target double recording link comprises at least one basic link and each basic link corresponds to a sequence ID and a double recording rule; AI voice information is then generated based on the double recording rule, and each basic link in the target double recording link is guided through double recording processing in the order of the sequence IDs to obtain double recording data corresponding to each basic link; finally, the double recording data are summarized to obtain target double recording information. Guiding by AI voice information is more efficient than the traditional manual approach. In addition, because the double recording link is divided into a plurality of basic links, the time cost of re-recording when an error occurs in the audio and video recording is reduced, which improves the efficiency of audio and video recording.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a flowchart of an embodiment of an audio and video recording guiding method according to the present application;

FIG. 3 is a schematic structural diagram of an embodiment of an audio and video recording guiding device according to the present application;

FIG. 4 is a schematic block diagram of one embodiment of a computer device according to the present application.

Detailed Description

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to FIG. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages and the like.

The terminal devices 101, 102, 103 may be various electronic devices having display screens and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.

The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.

It should be noted that the audio and video recording guidance method provided by the embodiment of the present application is executed by a server, and accordingly, the audio and video recording guidance device is disposed in the server.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation needs, and the terminal devices 101, 102 and 103 in this embodiment may specifically correspond to an application system in actual production.

Referring to FIG. 2, which shows an audio and video recording guiding method provided by an embodiment of the present invention, the method is described by taking its application to the server in FIG. 1 as an example, and is detailed as follows:

S201: Receiving a service signing request sent by the client, and acquiring a target service identifier from the service signing request.

Specifically, an intelligent double recording system for service signing is deployed at the client, and the system includes an intelligent double recording task for service signing. The signing party, through service personnel, logs in to the intelligent double recording system at the client using a personal account and selects the intelligent double recording task in the system.

The client is provided with a camera device for recording audio and video images of the service signing party in the service signing process.

The service identifier is a symbol used for uniquely identifying a service, and may specifically be one of a Chinese character, a letter, a number, and a symbol, or a combination of multiple types, and the target service identifier is a service identifier included in the service signing request.

It should be noted that, in this embodiment, each service identifier corresponds to at least one service requirement, and the double recording rule corresponding to the service identifier is configured in advance according to multiple dimensions contained in the service requirements, such as institution, product type and age. The specific settings of the double recording rule may be chosen according to actual needs and are not limited here.

For example, in one embodiment, a service is "new user guided registration", its service identifier is denoted as "Register _ new user", and the corresponding double recording rules are face authentication, registration information check and certificate authentication.

S202: Acquiring a target double recording link corresponding to the target service identifier from a preset rule base, wherein the target double recording link comprises at least one basic link, and each basic link corresponds to a sequence ID and a double recording rule.

Specifically, each service identifier corresponds to a double recording link in a rule base preset by the server, and after the target service identifier is obtained, the target double recording link corresponding to the target service identifier is selected from the rule base, so that a double recording process is performed according to the target double recording link subsequently.

The target double recording link comprises at least one basic link. Each basic link has its own double recording rule and comprises an independent double recording scene and double recording task. For example, in a face authentication link, a face image to be authenticated is acquired through the camera device and transmitted to the server, which executes face verification processing.

It is easy to understand that each basic link has a unique link identifier. In this embodiment, because the target double recording link includes a plurality of basic links, a sequence ID is set in advance for each basic link in the double recording link corresponding to each service identifier.

For example, in one embodiment, the basic links included in the target double recording link of a contract signing service are, in order of sequence ID: basic information identification, face verification, service signing video acquisition, certificate confirmation, signature video acquisition and the like.
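
A minimal sketch of such a rule base lookup is shown below, assuming an in-memory dictionary keyed by service identifier. The `BasicLink` type, the `RULE_BASE` contents, the identifier `contract_signing` and the function `get_target_links` are all hypothetical; the application does not prescribe how the rule base is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicLink:
    link_id: str      # unique link identifier
    sequence_id: int  # position of the link in the double recording flow
    rule: str         # double recording rule, modeled here as plain text


# Hypothetical rule base: service identifier -> basic links (in any order).
RULE_BASE = {
    "contract_signing": [
        BasicLink("face_verification", 2, "Verify the signer's face."),
        BasicLink("basic_info", 1, "Confirm the signer's basic information."),
        BasicLink("certificate_confirmation", 3, "Show the certificate to the camera."),
    ],
}


def get_target_links(service_id):
    """Return the basic links for a service in ascending sequence ID order."""
    links = RULE_BASE.get(service_id)
    if not links:
        raise KeyError(f"no double recording link configured for {service_id!r}")
    return sorted(links, key=lambda link: link.sequence_id)
```

Sorting at lookup time means the stored order of links does not matter; only the sequence IDs determine the guidance order.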

S203: Generating AI voice information based on the double recording rule, and guiding each basic link in the target double recording link to carry out double recording processing in the order of the sequence IDs through the AI voice information, to obtain double recording data corresponding to each basic link in the target double recording link.

Specifically, voice guidance information corresponding to each basic link is generated according to the double recording rule of that link, and the voice guidance information is assembled according to the sequence IDs of the basic links to generate the AI voice information. Through the AI voice information, each basic link is then guided through double recording processing in ascending order of sequence ID, so that double recording data corresponding to each basic link are obtained.

In this embodiment, generating the AI voice information based on the double recording rule means parsing the double recording rule to obtain its text semantics, converting the text to speech to obtain the voice guidance information, and then generating the AI voice information according to the sequence IDs.

The AI voice information is used for guiding the user in a voice broadcasting mode, and the next link is entered after the user completes the current double recording link.
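
The ordering and hand-over behaviour described here can be sketched as follows. Text-to-speech is not implemented: `speak` is a hypothetical stand-in for playback, and `run_link` is a hypothetical callable assumed to block until the current basic link is complete. The prompt format is likewise an assumption.

```python
def build_voice_script(rules_by_sequence_id):
    """Turn {sequence_id: rule_text} into prompts in ascending sequence order."""
    return [f"Step {seq}. {text}" for seq, text in sorted(rules_by_sequence_id.items())]


def guide(prompts, speak, run_link):
    """Play each prompt, then run its link before moving to the next one."""
    for index, prompt in enumerate(prompts):
        speak(prompt)    # stand-in for TTS playback of the guidance
        run_link(index)  # assumed to return only when the link is complete
```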

Preferably, in this embodiment, text is converted to speech using Text-To-Speech (TTS), also called voice broadcast, a technology that converts text content into audio content and plays it. With the support of an embedded chip and a neural network design, TTS intelligently converts text into a natural speech stream. TTS is one type of speech synthesis application: it converts files stored in a computer, such as help files or web pages, into natural speech output. It is widely used to help visually impaired people read information on a computer and in scenarios where obtaining information visually is impractical, and it can also increase the readability of text documents.

In this embodiment, the signing party is guided through service signing by AI voice information, and the specified flow is double recorded, that is, recorded in audio and video, which effectively improves the accuracy of double recording.

S204: Summarizing each double recording data to obtain target double recording information.

Specifically, the double recording data are summarized in order of sequence ID according to the starting time point and ending time point of each piece of double recording data, and the starting and ending time points of each link are marked, so that a single link can be quickly located during subsequent manual spot checks.
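
A minimal sketch of this summarizing step is given below, assuming each basic link yields one record carrying its sequence ID, time points and data. The `LinkRecording` type, the dictionary layout of the result and the function `summarize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkRecording:
    link_id: str
    sequence_id: int
    start: float  # starting time point, e.g. seconds from session start
    end: float    # ending time point
    data: bytes   # double recording data that passed quality inspection


def summarize(recordings):
    """Order recordings by sequence ID and index each link's time range."""
    ordered = sorted(recordings, key=lambda r: r.sequence_id)
    time_ranges = {r.link_id: (r.start, r.end) for r in ordered}
    return {"segments": ordered, "time_ranges": time_ranges}
```

Keeping a per-link time range index alongside the ordered segments is what allows a later spot check to jump directly to one link.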

In this embodiment, a service signing request sent by a client is received and a target service identifier is acquired from it; a target double recording link corresponding to the target service identifier is acquired from a preset rule base, where the target double recording link includes at least one basic link and each basic link corresponds to a sequence ID and a double recording rule; AI voice information is then generated based on the double recording rule, and each basic link in the target double recording link is guided through double recording processing in the order of the sequence IDs to obtain double recording data corresponding to each basic link; finally, the double recording data are summarized to obtain target double recording information. Guiding by AI voice information is more efficient than the traditional manual approach. In addition, because double recording is divided into multiple basic links, the time cost of re-recording when an error occurs is reduced, which improves the efficiency of double recording.

In some optional implementation manners of this embodiment, in step S203, the step of guiding, through the AI voice information and according to the sequence of the sequence ID, each basic link in the target double recording link to perform double recording processing, and obtaining double recording data corresponding to each basic link in the target double recording link includes:

when detecting that the basic link is started, recording a starting time point, and acquiring an entry mode corresponding to the basic link;

according to the entry mode, carrying out voice-guided double recording to obtain temporary data, and recording an ending time point of the entry;

performing AI quality inspection on the temporary data to obtain a quality inspection result;

and when the quality inspection result is that the quality inspection passes, taking the temporary data as the double-record data corresponding to the basic link, and determining the time range information corresponding to the double-record data according to the starting time point and the ending time point.

Specifically, when each basic link is started, a starting time point is recorded, so that after quality inspection is subsequently performed on the basic link, if double recording of the basic link is unqualified, the starting position of the basic link is determined according to the starting time point, and double recording is performed again.

The entry mode refers to specific entry matters, including but not limited to face entry, behavior entry, information entry, certificate entry, and the like.

Furthermore, each basic link corresponds to a list of items to be recorded, intelligent guiding double recording is carried out according to the recording mode and the voice guide information, whether the items to be recorded are recorded or not is judged according to the received picture, voice and video signals in the double recording process, the recording ending time point is recorded after the recording is finished, and temporary data are obtained.

The passive verification is carried out by setting a buried point mode according to received picture, voice and video signals to judge whether the entry of the items needing to be entered is finished.

For example, when the face information of both parties needs to be entered, a confirmation message is generated after the face information of the first party is entered, and the second party is then guided by voice to enter face information. After the face information of the second party is also entered successfully and its confirmation message is generated, a message stating that the face information of both parties has been entered is generated and broadcast by voice. After the broadcast is finished, the entry of the next item is executed.

For another example, when the certificate information of a party needs to be entered, fast image recognition is used to monitor whether a certificate image appears in the video image. When a certificate image is present, the corresponding picture is captured, a message stating that the certificate information has been entered is generated and broadcast by voice, and after the broadcast is finished, the entry of the next item is executed.

It is easy to understand that the recording end time point and the recording start time point have the same function, and both are to quickly locate the link when the quality inspection fails, so as to re-record the temporary data of the link.

Further, in this embodiment, a corresponding AI quality inspection mode is further selected according to the entry mode, and quality inspection is performed on the temporary data to obtain a quality inspection result, where the AI quality inspection mode includes, but is not limited to, voice quality inspection, behavior quality inspection, certificate quality inspection, and the like.

In this embodiment, the target double recording link is decomposed into a plurality of basic links to perform double recording of audio and video, and quality inspection is performed after the double recording of each basic link is completed, so that the effectiveness of the double recording of the basic link is ensured, and the efficiency of audio and video recording is improved.
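The per-link flow described above (record the starting time point, record under voice guidance, record the ending time point, choose a quality inspection mode from the entry mode, and keep the temporary data only if inspection passes) can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `LinkRecord`, `process_basic_link`, `record`, `inspectors` and `clock` are hypothetical, and the recorder, inspectors and clock are injected as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class LinkRecord:
    """Double recording data of one basic link plus its time range."""
    link_id: str
    data: str
    time_range: Tuple[float, float]


def process_basic_link(
    link_id: str,
    entry_mode: str,
    record: Callable[[str], str],
    inspectors: Dict[str, Callable[[str], bool]],
    clock: Callable[[], float],
) -> Optional[LinkRecord]:
    """Run one basic link once; return None when quality inspection fails."""
    start = clock()                      # starting time point
    temporary_data = record(entry_mode)  # voice-guided recording (assumed callable)
    end = clock()                        # ending time point
    inspect = inspectors[entry_mode]     # inspection mode chosen by entry mode
    if inspect(temporary_data):
        return LinkRecord(link_id, temporary_data, (start, end))
    return None
```

A caller that receives `None` still knows which link failed, so only that link needs to be recorded again rather than the whole session.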

In some optional implementation manners of this embodiment, the quality inspection manner of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:

Specifically, in this embodiment, the entered voice information is converted into text information, and the text information is subjected to semantic recognition, so as to determine whether the will of the party meets the business needs according to the recognized semantics, thereby realizing fast and intelligent voice quality inspection and improving quality inspection efficiency.

For example, in a specific embodiment, the voice guidance information is "Have you carefully read the content of the contract, and do you agree with the terms of the contract?". If the semantic recognition result obtained from the party's voice information is "I have read it and agree", the quality inspection is considered to pass.

The speech recognition may be performed with a third-party speech recognition tool or with a speech recognition algorithm. Common third-party speech recognition tools include, but are not limited to, IBM Watson, boomerang, AVST, and the like. Commonly used speech recognition algorithms include, but are not limited to, the Connectionist Temporal Classification (CTC) algorithm, automatic speech recognition (ASR) algorithms, algorithms based on Dynamic Time Warping (DTW), and the like.

The semantic recognition of the text may specifically be performed by means of Natural Language Processing (NLP).

The preset determination mode may be set according to actual needs, and is not limited herein.

In the embodiment, the voice information is converted into the text information, the text information is subjected to semantic recognition, the obtained semantics are compared with the preset judgment mode, when the obtained semantics accord with the preset judgment mode, the quality inspection is confirmed to pass, the voice information in the data is intelligently inspected, and the quality inspection efficiency of a basic link is improved.
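The comparison of recognized semantics against the preset determination mode can be sketched as below. The speech-to-text step is assumed to have already produced `text`. The phrase lists and the functions `recognize_intent` and `voice_quality_check` are hypothetical, and simple phrase matching stands in for the NLP-based semantic recognition mentioned in the text.

```python
import re

# Hypothetical phrase lists; a production system would use an NLP model.
NEGATIVE_PHRASES = ("no", "disagree", "do not agree", "don't agree", "refuse")
AFFIRMATIVE_PHRASES = ("yes", "agree", "i have read", "confirm")


def recognize_intent(text: str) -> str:
    """Map transcribed text to 'consent', 'refuse' or 'unknown'."""
    normalized = re.sub(r"[^a-z' ]+", " ", text.lower())
    padded = " " + " ".join(normalized.split()) + " "
    # Negative phrases are checked first so "do not agree" is not read as "agree".
    if any(f" {phrase} " in padded for phrase in NEGATIVE_PHRASES):
        return "refuse"
    if any(f" {phrase} " in padded for phrase in AFFIRMATIVE_PHRASES):
        return "consent"
    return "unknown"


def voice_quality_check(text: str, expected_intent: str = "consent") -> bool:
    """Pass when the recognized intent matches the preset determination mode."""
    return recognize_intent(text) == expected_intent
```

Matching on whole, space-delimited phrases avoids false hits such as "no" inside "know"; an answer that matches neither list is treated as a failed inspection rather than as consent.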

In some optional implementation manners of this embodiment, the quality inspection manner of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:

and performing identity authentication on the target image, confirming identity information corresponding to the target image, performing consistency proofreading on the identity information and identity information in the service information to obtain a proofreading result, and determining a quality inspection result according to the proofreading result.

Specifically, the server side extracts video information in the temporary data and extracts video frame images from the video information according to a preset interval; and carrying out face recognition on each video frame image, taking the video frame image containing the face image as a target image, checking the identity of the service signing party according to the target image, and determining a quality check result according to a check result.
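Extracting frames at a preset interval and keeping only those that contain a face can be sketched as follows. The function names are hypothetical, the frame rate and interval values are assumptions, and the face detector is passed in as a callable because the text does not name a specific face recognition method.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def sample_frame_indices(total_frames: int, fps: float, interval_seconds: float) -> List[int]:
    """Return the indices of frames taken every `interval_seconds`."""
    if total_frames < 0 or fps <= 0 or interval_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval must be > 0")
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))


def select_target_images(
    frames: Sequence[Frame],
    indices: Sequence[int],
    has_face: Callable[[Frame], bool],
) -> List[Frame]:
    """Keep the sampled frames in which the (assumed) detector finds a face."""
    return [frames[i] for i in indices if has_face(frames[i])]
```

Sampling first and detecting second keeps the number of face recognition calls proportional to the recording length divided by the interval, not to the total frame count.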

The identity consistency check specifically includes but is not limited to: verification of personal information, identification of face images, and verification of video images answering questionnaire questions.

When all three items (the personal information, the face image and the video image) are verified successfully, the check result is that the identity of the service signing party is valid; otherwise, when at least one item fails verification, the check result is that the identity of the service signing party is invalid.

Further, in order to improve the security of service signing and to ensure that the parties act voluntarily, this embodiment also uses micro-expressions to confirm the intention of each party.

In a specific embodiment, the video image of the business signing party is subjected to micro-expression recognition, the emotion of the business signing party is determined according to the recognition result, and if the emotion meets the preset emotion requirement, the successful verification of the video image is confirmed.

For example, in one embodiment, basic information such as personal identification information, company information and business-related information is queried through preset questionnaire questions. Facial micro-expressions are captured from the video image of the service signing party answering the questionnaire questions, the captured micro-expressions are compared against an existing facial action coding system, the emotion conveyed by the micro-expressions of the service signing party is determined, and whether the service signing party exhibits abnormal behavior is determined based on that emotion. For example, if the emotion conveyed by the micro-expressions of the service signing party is anxiety or nervousness, it is determined that the service signing party exhibits abnormal signing behavior, and the verification fails.

In this embodiment, the face image is extracted from the video information of the temporary data and the identity is checked for consistency, so that the validity of the identities of both parties to the service signing and the voluntariness of their behavior during the signing process are ensured.
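The decision logic of the behavioral quality inspection (all three verifications must succeed, and the video image verification fails when an abnormal emotion is detected) can be sketched as follows. The emotion labels are assumed to come from an upstream micro-expression recognizer that is not shown; the set `ABNORMAL_EMOTIONS` and the function names are hypothetical.

```python
from typing import Iterable

# Assumed preset emotion requirement: these labels count as abnormal behavior.
ABNORMAL_EMOTIONS = frozenset({"anxiety", "nervous"})


def video_image_verified(emotions: Iterable[str]) -> bool:
    """Video image check passes only if no abnormal emotion was recognized."""
    return not any(emotion in ABNORMAL_EMOTIONS for emotion in emotions)


def identity_check(personal_info_ok: bool, face_ok: bool, emotions: Iterable[str]) -> bool:
    """Identity is valid only when all three verifications succeed."""
    return personal_info_ok and face_ok and video_image_verified(emotions)
```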

In some optional implementation manners of this embodiment, the quality inspection manner of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:

Specifically, when the quality inspection mode of the AI quality inspection is certificate quality inspection, the image file is obtained from the temporary data, the image file is analyzed by OCR to obtain the certificate information contained in it, and the certificate information is then checked against the service information to determine the quality inspection result.

Analyzing the image file by OCR to obtain the certificate information contained in the image file specifically includes: preprocessing the image; performing edge detection on the preprocessed image, and taking each region that meets a preset condition as a candidate region; and judging whether the image in the candidate region is a certificate image and, if so, analyzing the certificate image to obtain the certificate information contained in it.

In this embodiment, when the quality inspection mode is certificate quality inspection, the certificate image is determined from the picture file in the temporary data, and then the certificate image is analyzed to obtain certificate information, and the certificate information and the service information are checked for consistency, so as to determine the certificate quality inspection result, which is beneficial to improving the efficiency of quality inspection.
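The consistency check between the OCR result and the service information can be sketched as below. The OCR step itself is assumed to have produced a field dictionary; the field names `name` and `id_number` and the normalization rule (strip whitespace, ignore case) are assumptions, not taken from the source.

```python
from typing import Dict, List, Sequence, Tuple


def normalize(value: str) -> str:
    """Remove all whitespace and ignore case so OCR spacing noise is tolerated."""
    return "".join(value.split()).upper()


def certificate_quality_check(
    certificate_info: Dict[str, str],
    service_info: Dict[str, str],
    fields: Sequence[str] = ("name", "id_number"),
) -> Tuple[bool, List[str]]:
    """Return (passed, mismatched_fields) for the compared fields."""
    mismatched = []
    for field in fields:
        ocr_value = normalize(certificate_info.get(field, ""))
        expected = normalize(service_info.get(field, ""))
        # A field missing from the OCR result counts as a mismatch.
        if not ocr_value or ocr_value != expected:
            mismatched.append(field)
    return (not mismatched, mismatched)
```

Returning the mismatched field names, rather than only a boolean, gives a later step something concrete to announce when asking the party to show the certificate again.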

In some optional implementation manners of this embodiment, after performing AI quality inspection on the temporary data to obtain a quality inspection result, and when the quality inspection result is that the quality inspection passes, taking the temporary data as the double-recording data corresponding to the basic link, and determining time range information corresponding to the double-recording data according to the start time point and the end time point, the audio/video recording guidance method further includes:

Specifically, reasons for quality inspection failure are preset at the server side. The server side converts the applicable reason into voice guidance information by TTS, regenerates an updated starting time point, and plays the voice guidance information, so that the user at the client can perform the double recording processing of the basic link again according to the voice guidance information, until the quality inspection result is a pass.

In this embodiment, for a basic link in which quality inspection fails, voice guidance information is generated to guide a client user to perform double recording processing on the basic link again, so that irregular double recording operation can be corrected in time, correction after double recording of all basic links is completed is avoided, and double recording efficiency is improved.
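The re-recording loop can be sketched as follows. The prompt texts, the reason codes and the `speak` callable (standing in for TTS playback) are hypothetical. The source repeats until inspection passes; the `max_attempts` cap is an added assumption so that the sketch always terminates.

```python
from typing import Callable, Optional, Tuple

# Hypothetical preset failure reasons mapped to guidance text for TTS.
FAILURE_PROMPTS = {
    "voice": "Your answer was not clear. Please answer the question again.",
    "certificate": "The certificate could not be read. Please show it again.",
}
DEFAULT_PROMPT = "This step did not pass inspection. Please repeat it."


def rerecord_until_pass(
    record: Callable[[], str],
    inspect: Callable[[str], Optional[str]],
    speak: Callable[[str], None],
    clock: Callable[[], float],
    max_attempts: int = 3,
) -> Optional[Tuple[str, Tuple[float, float], int]]:
    """Repeat one basic link; return (data, time_range, attempts) or None."""
    for attempt in range(1, max_attempts + 1):
        start = clock()            # starting time point is regenerated each attempt
        temporary_data = record()
        end = clock()
        reason = inspect(temporary_data)   # None means the inspection passed
        if reason is None:
            return temporary_data, (start, end), attempt
        speak(FAILURE_PROMPTS.get(reason, DEFAULT_PROMPT))
    return None
```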

In some optional implementation manners of this embodiment, after step S204, the method for guiding audio and video recording further includes:

if a sampling inspection request sent by a management end is received, acquiring a preset key link corresponding to a service;

and extracting data information corresponding to the target sampling inspection time from the target double-record data to serve as information to be sampled and inspected, and sending the information to be sampled and inspected to a management terminal.

Specifically, the key link corresponding to the target service identifier may be configured in advance according to actual needs. After the double recording is completed, the management end performs sampling inspection on the key link according to the configured information. When sampling inspection is performed on the key link, the time range information corresponding to the key link is obtained and used as the target sampling inspection time; the data information corresponding to the target sampling inspection time is then extracted from the target double recording data as the information to be sampled and inspected, and this information is sent to the management end.

The key link may be a link that is more important or more prone to error in the business double-recording link, and may be specifically determined according to an actual situation, which is not limited herein.

In this embodiment, when a sampling inspection request sent by the management end is received, the preset key links corresponding to the service are obtained, and the data information corresponding to those key links in the target double recording data is taken as the information to be sampled and inspected. The double recording information of the service can then be inspected through this information alone, which improves the efficiency of sampling inspection of the target double recording data.
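Extracting the data of the key links by their time ranges can be sketched as below. The recording is modeled, as an assumption, as a list of timestamped chunks, and `time_ranges` holds the time range information stored for each basic link; all names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Chunk = Tuple[float, str]  # (timestamp in seconds, payload)


def extract_for_sampling(
    recording: Sequence[Chunk],
    time_ranges: Dict[str, Tuple[float, float]],
    key_links: Sequence[str],
) -> Dict[str, List[Chunk]]:
    """Return, per key link, the chunks that fall inside its time range."""
    extracted = {}
    for link_id in key_links:
        start, end = time_ranges[link_id]  # target sampling inspection time
        extracted[link_id] = [chunk for chunk in recording if start <= chunk[0] <= end]
    return extracted
```

Because each basic link stores its own start and end points, the management end receives only the segments it asked for instead of the full recording.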

It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.

Fig. 3 shows a schematic block diagram of an audio/video recording guidance device in one-to-one correspondence with the audio/video recording guidance method of the foregoing embodiment. As shown in fig. 3, the audio/video recording guidance device includes a request receiving module 31, a link obtaining module 32, a double recording module 33, and a summarizing module 34. The functional modules are explained in detail as follows:

a request receiving module 31, configured to receive a service signing request sent by a client, and obtain a target service identifier from the service signing request;

a link obtaining module 32, configured to obtain, from a preset rule base, a target double recording link corresponding to a target service identifier, where the target double recording link includes at least one basic link, and each basic link corresponds to one sequence ID and one double recording rule;

the double recording module 33 is configured to generate AI voice information based on a double recording rule, and guide each basic link in the target double recording link to perform double recording processing according to the sequence of the sequence ID through the AI voice information to obtain double recording data corresponding to each basic link in the target double recording link;

and the summarizing module 34 is configured to summarize each double-recording data to obtain target double-recording information.
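How the four modules hand data to one another can be sketched as follows. The rule base contents, the request shape and all function names are hypothetical, and `guide_and_record` stands in for the voice-guided recording of one basic link.

```python
from typing import Callable, Dict, List

# Hypothetical preset rule base: service identifier -> basic links.
RULE_BASE = {
    "loan_signing": [
        {"sequence_id": 2, "rule": "Show your ID card to the camera."},
        {"sequence_id": 1, "rule": "State your full name."},
    ]
}


def receive_request(request: Dict[str, str]) -> str:
    """Request receiving module: pull the target service identifier."""
    return request["service_id"]


def get_target_links(rule_base: Dict[str, List[dict]], service_id: str) -> List[dict]:
    """Link obtaining module: basic links ordered by sequence ID."""
    return sorted(rule_base[service_id], key=lambda link: link["sequence_id"])


def double_record(links: List[dict], guide_and_record: Callable[[str], str]) -> Dict[int, str]:
    """Double recording module: one piece of data per basic link."""
    return {link["sequence_id"]: guide_and_record(link["rule"]) for link in links}


def summarize(per_link_data: Dict[int, str]) -> List[str]:
    """Summarizing module: collect per-link data in sequence order."""
    return [per_link_data[seq] for seq in sorted(per_link_data)]


def run(request: Dict[str, str], rule_base: Dict[str, List[dict]],
        guide_and_record: Callable[[str], str]) -> List[str]:
    links = get_target_links(rule_base, receive_request(request))
    return summarize(double_record(links, guide_and_record))
```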

Optionally, the double recording module 33 includes:

and the data determining unit is used for taking the temporary data as the double-record data corresponding to the basic link when the quality inspection result is that the quality inspection passes, and determining the time range information corresponding to the double-record data according to the starting time point and the ending time point.

the voice recognition subunit is used for acquiring the voice information in the temporary data and performing voice recognition on the voice information to obtain text information corresponding to the voice information;

and the identity verification subunit is used for performing identity authentication on the target image, confirming identity information corresponding to the target image, performing consistency verification on the identity information and identity information in the service information to obtain a verification result, and determining a quality inspection result according to the verification result.

the image analysis subunit is used for acquiring the image file in the temporary data, and analyzing the image file by OCR to obtain certificate information contained in the image file;

and the certificate checking subunit is used for checking the certificate information and the service information and determining the certificate quality checking result according to the checking result.

Optionally, the audio/video recording guidance device further includes:

the sampling inspection link acquisition module is used for acquiring a preset key link corresponding to the service if a sampling inspection request sent by the management end is received;

and the sampling inspection information determining module is used for extracting data information corresponding to the target sampling inspection time from the target double-record data as the information to be sampled and inspected, and sending the information to be sampled and inspected to the management end.

For specific limitations of the audio/video recording guidance device, reference may be made to the above limitations of the audio/video recording guidance method, which are not repeated here. All or part of the modules in the audio/video recording guidance device may be implemented by software, hardware, or a combination thereof. The modules may be embedded in, or be independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.

In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.

The computer device 4 comprises a memory 41, a processor 42 and a network interface 43, which are communicatively connected to each other via a system bus. It is noted that only a computer device 4 having the components memory 41, processor 42 and network interface 43 is shown, but it is understood that not all of the shown components are required, and that more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.

The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.

The memory 41 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and the like, provided on the computer device 4. Of course, the memory 41 may also include both internal and external storage devices of the computer device 4. In this embodiment, the memory 41 is generally used for storing an operating system installed in the computer device 4 and various types of application software, such as program codes for controlling electronic files. Further, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.

The processor 42 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the program code stored in the memory 41 or process data, such as program code for executing control of an electronic file.

The network interface 43 may comprise a wireless network interface or a wired network interface, and the network interface 43 is generally used for establishing communication connection between the computer device 4 and other electronic devices.

The present application further provides another embodiment, which is to provide a computer-readable storage medium, where the computer-readable storage medium stores an interface display program, where the interface display program is executable by at least one processor, so as to cause the at least one processor to execute the steps of the audio/video recording guidance method as described above.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.

It is to be understood that the above-described embodiments are merely some, and not all, of the embodiments of the present application, and that the appended drawings illustrate preferred embodiments of the application without limiting its scope. This application can be embodied in many different forms; the embodiments are provided so that the disclosure of the application will be thoroughly understood. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to one skilled in the art that the technical solutions described in the foregoing embodiments may still be modified, or some of their features may be replaced by equivalents. All equivalent structures made by using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, fall within the protection scope of the present application.

Claims

1. An audio and video recording guiding method is characterized by comprising the following steps:

2. The method for guiding audio/video recording according to claim 1, wherein the guiding each basic link in the target double recording link to perform double recording processing according to the sequence of the sequence ID through the AI voice information to obtain double recording data corresponding to each basic link in the target double recording link comprises:

3. The method for guiding audio/video recording according to claim 2, wherein the AI quality inspection is performed in a voice quality inspection mode, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:

4. The method for guiding audio/video recording according to claim 2, wherein the AI quality inspection is performed in a behavioral quality inspection mode, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:

5. The method for guiding audio/video recording according to claim 2, wherein the AI quality inspection is performed in a certificate quality inspection mode, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:

6. The method for guiding audio and video recording according to claim 2, wherein after the AI quality inspection is performed on the temporary data to obtain a quality inspection result, and when the quality inspection result is that the quality inspection passes, the temporary data is used as double-recording data corresponding to the basic link, and after time range information corresponding to the double-recording data is determined according to the start time point and the end time point, the method further comprises:

7. The audio-video recording guiding method according to any one of claims 1 to 6, wherein after the summarizing of each of the double-recording data to obtain target double-recording information, the audio-video recording guiding method further comprises:

8. An audio and video recording guide device, characterized in that the audio and video recording guide device comprises:

9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the audio-video recording guidance method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the audio/video recording guidance method according to any one of claims 1 to 7.