CN108391141A

CN108391141A - Method and apparatus for outputting information

Info

Publication number: CN108391141A
Application number: CN201810226177.1A
Authority: CN
Inventors: 谢俊; 莫玮
Original assignee: Beijing Jingdong Financial Technology Holding Co Ltd
Current assignee: JD Digital Technology Holdings Co Ltd; Jingdong Technology Holding Co Ltd
Priority date: 2018-03-19
Filing date: 2018-03-19
Publication date: 2018-08-10
Anticipated expiration: 2038-03-19
Also published as: CN108391141B

Abstract

The embodiments of the present application disclose a method and apparatus for outputting information. One specific implementation of the method includes: in response to receiving a user's operation request for a target product, filming the user, selecting an unplayed preset voice file from a preset voice file set as the current preset voice file, and executing a multimedia file generation step, which includes: playing the current preset voice file and obtaining video data and audio data; and encoding the video data and the audio data to generate a video file and an audio file; in response to determining that the audio data includes voice data input by the user and that all preset voice files in the preset voice file set have been played, merging the video file and the audio file to generate a target multimedia file; and sending an authentication reference file to a server so that the server authenticates the user based on the authentication reference file, where the authentication reference file includes the target multimedia file. This implementation improves the efficiency of authenticating the user.

Description

Method and apparatus for outputting information

Technical field

The present application relates to the field of computer technology, and more particularly to a method and apparatus for outputting information.

Background

When a user intends to perform an operation that involves personal privacy, for example purchasing a product (such as a financial product) or handling a service (such as opening an account), the identity of the user generally needs to be verified.

Existing methods typically verify the user's identity face to face, that is, the user needs to go to a designated place to complete the authentication.

Summary of the invention

The embodiments of the present application propose a method and apparatus for outputting information.

In a first aspect, an embodiment of the present application provides a method for outputting information, the method including: in response to receiving a user's operation request for a target product, filming the user, selecting an unplayed preset voice file from a preset voice file set as the current preset voice file, and executing a multimedia file generation step, the multimedia file generation step including: playing the current preset voice file to obtain video data and audio data for the played preset voice file; and encoding the obtained video data and audio data respectively to generate a video file and an audio file. The method further includes: in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, merging the video file and the audio file to generate a target multimedia file; and sending an authentication reference file to a server so that the server authenticates the user based on the authentication reference file, where the authentication reference file includes the target multimedia file.

In some embodiments, the authentication reference file further includes a target image, and the target image is generated by the following steps: displaying product information preset for the target product; in response to receiving the user's information input request for the product information, obtaining text information input by the user, where the text information includes the user's signature; and adding the obtained text information to a preset image to generate the target image, where the preset image includes the product information.

In some embodiments, the preset image includes a preset image region, where the preset image region is different from the image region occupied by the product information on the preset image; and adding the obtained text information to the preset image includes: adding the obtained text information to the preset image region.

In some embodiments, playing the current preset voice file includes: playing the current preset voice file and displaying text information preset for the played preset voice file.

In some embodiments, the method further includes: determining whether the audio data does not include voice data input by the user for the current preset voice file; and in response to determining that the audio data does not include voice data input by the user for the current preset voice file, executing the multimedia file generation step.

In some embodiments, after determining whether the audio data does not include voice data input by the user for the current preset voice file, the method further includes: in response to determining that the audio data includes voice data input by the user for the current preset voice file, determining whether all preset voice files in the preset voice file set have been played; and in response to determining that not all preset voice files in the preset voice file set have been played, selecting an unplayed preset voice file from the preset voice file set as the current preset voice file, and executing the multimedia file generation step.

In a second aspect, an embodiment of the present application provides an apparatus for outputting information, the apparatus including: a first execution unit configured to, in response to receiving a user's operation request for a target product, film the user, select an unplayed preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step, the multimedia file generation step including: playing the current preset voice file to obtain video data and audio data for the played preset voice file; and encoding the obtained video data and audio data respectively to generate a video file and an audio file. The apparatus further includes: a merging unit configured to, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, merge the video file and the audio file to generate a target multimedia file; and a sending unit configured to send an authentication reference file to a server so that the server authenticates the user based on the authentication reference file, where the authentication reference file includes the target multimedia file.

In some embodiments, the preset image includes a preset image region, where the preset image region is different from the image region occupied by the product information on the preset image; and the adding unit is further configured to add the obtained text information to the preset image region.

In some embodiments, the apparatus further includes: a determination unit configured to determine whether the audio data does not include voice data input by the user for the current preset voice file; and a second execution unit configured to execute the multimedia file generation step in response to determining that the audio data does not include voice data input by the user for the current preset voice file.

In some embodiments, the second execution unit is further configured to: in response to determining that the audio data includes voice data input by the user for the current preset voice file, determine whether all preset voice files in the preset voice file set have been played; and in response to determining that not all preset voice files in the preset voice file set have been played, select an unplayed preset voice file from the preset voice file set as the current preset voice file and execute the multimedia file generation step.

In a third aspect, an embodiment of the present application provides a terminal, including: one or more processors; and a storage device for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the above method for outputting information.

In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a computer program, where the program, when executed by a processor, implements the method of any embodiment of the above method for outputting information.

The method and apparatus for outputting information provided by the embodiments of the present application film the user in response to receiving the user's operation request for a target product, select an unplayed preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step that includes playing the current preset voice file to obtain video data and audio data for the played preset voice file, and encoding the obtained video data and audio data respectively to generate a video file and an audio file. Then, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, the video file and the audio file are merged to generate a target multimedia file. Finally, an authentication reference file that includes the target multimedia file is sent to a server so that the server authenticates the user based on the authentication reference file. The user is thus authenticated using the target multimedia file generated by filming, which avoids face-to-face authentication and improves the efficiency of authenticating the user.

Description of the drawings

Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:

Fig. 1 is a diagram of an exemplary system architecture to which the present application can be applied;

Fig. 2 is a flowchart of one embodiment of the method for outputting information according to the present application;

Fig. 3 is a schematic diagram of an application scenario of the method for outputting information according to the present application;

Fig. 4 is a flowchart of another embodiment of the method for outputting information according to the present application;

Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for outputting information according to the present application;

Fig. 6 is a structural schematic diagram of a computer system suitable for implementing the terminal device of the embodiments of the present application.

Detailed description

The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the related invention.

It should be noted that the embodiments of the present application and the features in the embodiments may be combined with one another where no conflict arises. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for outputting information or the apparatus for outputting information of the present application can be applied.

As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 in order to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, search applications, instant messaging tools, and video processing software.

The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with a camera, a loudspeaker, and a microphone, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented either as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.

The server 105 may be a server that provides various services, for example an information processing server that processes the multimedia files sent by the terminal devices 101, 102, 103. The information processing server may analyze and otherwise process the received data, such as multimedia files, and feed the processing result (for example, an authentication result) back to the terminal device.

It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.

It should be noted that the method for outputting information provided by the embodiments of the present application is generally executed by the terminal devices 101, 102, 103; accordingly, the apparatus for outputting information is generally provided in the terminal devices 101, 102, 103.

It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, depending on implementation needs.

With continued reference to Fig. 2, a flow 200 of one embodiment of the method for outputting information according to the present application is shown. The method for outputting information includes the following steps:

Step 201: in response to receiving a user's operation request for a target product, film the user, select an unplayed preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step.

In this embodiment, the execution body on which the method for outputting information runs (for example, the terminal devices 101, 102, 103 shown in Fig. 1) may, in response to receiving a user's operation request for a target product, film the user, select an unplayed preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step. The target product may be the product on which the user intends to operate; specifically, it may be a virtual product (such as a financial product) or a physical product (such as a bank card). The operation request may be any of various user-initiated requests related to the target product (for example, a purchase request, an investment request, or a service handling request). The preset voice file set corresponds to the target product and may include at least one preset voice file. A preset voice file may include speech prerecorded for the target product for authentication purposes, for example speech prerecorded for the target product that poses questions to the user. It should be noted that the preset voice file set may include unplayed preset voice files. Here, an unplayed preset voice file may carry a preset unplayed mark, so that the execution body can select an unplayed preset voice file from the preset voice file set as the current preset voice file. Alternatively, the execution body may identify and select an unplayed preset voice file by querying a playback record, and then use the selected unplayed preset voice file as the current preset voice file.
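The selection by unplayed mark described above can be illustrated with a minimal Python sketch. The `PresetVoiceFile` class, its `played` field, and the file names are hypothetical and are introduced only for illustration; the application does not prescribe any particular data structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PresetVoiceFile:
    path: str             # hypothetical location of the prerecorded prompt
    played: bool = False  # the "unplayed mark" is modeled here as played == False


def choose_unplayed(files: List[PresetVoiceFile]) -> Optional[PresetVoiceFile]:
    """Return the first preset voice file that has not been played, or None if all were played."""
    for f in files:
        if not f.played:
            return f
    return None


prompts = [PresetVoiceFile("q1.wav", played=True), PresetVoiceFile("q2.wav")]
current = choose_unplayed(prompts)  # the second prompt is selected as the current file
```

A `None` result in this sketch corresponds to the condition that all preset voice files in the set have been played.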

In this embodiment, the multimedia file generation step may include:

Step 2011: play the current preset voice file to obtain video data and audio data for the played preset voice file.

The video data may be the video obtained by filming the user, and the audio data may be the audio obtained by filming the user. Here, the audio data may include the spoken response that the user makes after hearing the played preset voice file, for example an answer to a question posed in the preset voice file.

In some optional implementations of this embodiment, the execution body may play the current preset voice file and display text information preset for the played preset voice file. The preset text information may be prompt information that a technician has set in advance for the played preset voice file. It will be appreciated that displaying this text information makes it easier for the user to obtain the information and then respond by voice.

Step 2012: encode the obtained video data and audio data respectively to generate a video file and an audio file.

The video file may be a video that conforms to a preset video format, and the audio file may be audio that conforms to a preset audio format. Here, video in the preset video format and audio in the preset audio format can be merged into a multimedia file.

As an example, the execution body may encode the obtained video data according to the H.264 standard to generate a video file in H.264 format, and encode the obtained audio data according to the AAC standard to generate an audio file in AAC format. A video file in H.264 format and an audio file in AAC format can be merged into a multimedia file.

It should be noted that video encoding and audio encoding are well-known techniques that are widely studied and applied at present, and are not described further here.

Step 202: in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, merge the video file and the audio file to generate a target multimedia file.

In this embodiment, the execution body (for example, the terminal devices 101, 102, 103 shown in Fig. 1) may, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, merge the video file and the audio file to generate a target multimedia file. The target multimedia file is the multimedia file to be output and is used to authenticate the user.

It will be appreciated that, as the condition for merging the video file and the audio file into the target multimedia file, the execution body needs to determine whether the audio data includes voice data input by the user for the current preset voice file, and whether all preset voice files in the preset voice file set have been played.

In this embodiment, the execution body may determine in various ways whether the audio data includes voice data input by the user for the current preset voice file. As an example, when the influence of environmental noise is small, the execution body may compare the sound signal of the preset voice file with the sound signal of the audio data to determine whether the audio data includes voice data input by the user. Specifically, if the sound signal of the audio data is identical or close to the sound signal of the preset voice file, it can be determined that the audio data does not include voice data input by the user.
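The signal comparison described above might be sketched as follows. The application does not specify how "identical or close" is measured, so the cosine similarity, the threshold of 0.95, the equal-length assumption, and the function names are all illustrative assumptions.

```python
import math
from typing import Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length sample sequences (0.0 if either is silent)."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def contains_user_speech(recorded: Sequence[float], prompt: Sequence[float],
                         threshold: float = 0.95) -> bool:
    """Assume user speech is present when the recording differs enough from the prompt."""
    if not any(recorded):
        return False  # a silent recording cannot contain user speech
    return similarity(recorded, prompt) < threshold
```

A practical implementation would also need to align the two signals in time and tolerate noise, which this sketch deliberately leaves out.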

Optionally, the execution body may recognize the audio data as text data and, based on the recognized text data, determine whether the audio data includes voice data input by the user. As an example, suppose the audio data includes the speech in the preset voice file, and preset text has been recognized in advance for that speech. After recognizing the audio data as text data, the execution body may determine whether the recognized text data includes text other than the preset text; if so, it can be determined that the audio data includes voice data input by the user. It will also be appreciated that, when the audio data does not include the speech in the preset voice file, whether the audio data includes voice data input by the user can be determined simply by checking whether any text data is recognized from the audio data (specifically, if text data is recognized, it can be determined that the audio data includes voice data input by the user).
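The text-based check above can be sketched in Python as follows. The sketch assumes that speech recognition has already produced a transcript string; the function name and the simple substring removal are illustrative assumptions rather than the method of the application.

```python
from typing import Iterable


def has_extra_text(recognized: str, preset_texts: Iterable[str]) -> bool:
    """Return True if the transcript contains text other than the preset prompt text."""
    remainder = recognized
    for preset in preset_texts:
        remainder = remainder.replace(preset, "")
    return bool(remainder.strip())
```

When the audio data does not include the prompt speech at all, the same function can be called with an empty list of preset texts, in which case any non-blank transcript counts as user speech.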

It should be noted that speech recognition is a well-known technique that is widely studied and applied at present, and is not described further here.

In this embodiment, the execution body may determine in various ways whether all preset voice files in the preset voice file set have been played. As an example, the execution body may determine whether playback is complete by checking whether the preset voice file set still includes a preset voice file carrying the unplayed mark; alternatively, the execution body may determine whether playback is complete by looking up the playback record.

Here, the execution body may merge the video file and the audio file into the target multimedia file in various ways, which are not limited here. As an example, the execution body may use preinstalled software (such as FFmpeg) to merge the video file and the audio file into the target multimedia file. It should be noted that techniques for combining video and audio are well known and widely studied and applied at present, and are not described further here.
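As one possible illustration of the FFmpeg-based merge mentioned above, the following Python sketch builds an FFmpeg command line that combines one video stream and one audio stream without re-encoding. The file names and the choice of an MP4 container are assumptions; the application does not state which options are used.

```python
import shlex
from typing import List


def build_mux_command(video_path: str, audio_path: str, output_path: str) -> List[str]:
    """Build an FFmpeg argument list that muxes one video and one audio stream by stream copy."""
    return [
        "ffmpeg",
        "-i", video_path,  # first input: the encoded video file
        "-i", audio_path,  # second input: the encoded audio file
        "-map", "0:v:0",   # take the video stream from the first input
        "-map", "1:a:0",   # take the audio stream from the second input
        "-c", "copy",      # copy both streams without re-encoding
        output_path,
    ]


cmd = build_mux_command("user.h264", "user.aac", "target.mp4")
print(shlex.join(cmd))  # shows the command that would be run
```

On a machine where FFmpeg is installed, the list could be passed to `subprocess.run(cmd, check=True)`; the sketch only constructs the command so that it can be inspected first.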

In some optional implementations of this embodiment, the execution body may also determine whether the audio data does not include voice data input by the user for the current preset voice file, and, in response to determining that the audio data does not include voice data input by the user for the current preset voice file, execute the multimedia file generation step again.

In some optional implementations of this embodiment, after determining whether the audio data does not include voice data input by the user for the current preset voice file, the execution body may also perform the following steps: in response to determining that the audio data includes voice data input by the user for the current preset voice file, determine whether all preset voice files in the preset voice file set have been played; and in response to determining that not all preset voice files in the preset voice file set have been played, select an unplayed preset voice file from the preset voice file set as the current preset voice file and execute the multimedia file generation step.
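The control flow of these optional implementations (repeat the current prompt when no user speech is detected, otherwise move on to the next unplayed prompt) can be sketched as a loop. The callables `play_and_record` and `has_user_speech`, the `Recording` tuple, and the `max_attempts` limit are hypothetical; in particular, the application does not mention any retry limit, which is added here only so that the loop terminates.

```python
from typing import Callable, List, Tuple

Recording = Tuple[str, str]  # (video file, audio file) for one prompt; hypothetical representation


def collect_recordings(
    prompts: List[str],
    play_and_record: Callable[[str], Recording],
    has_user_speech: Callable[[Recording], bool],
    max_attempts: int = 3,
) -> List[Recording]:
    """Play each prompt in order and repeat a prompt until the user answers it."""
    accepted: List[Recording] = []
    for prompt in prompts:                       # select the next unplayed prompt
        for _ in range(max_attempts):
            recording = play_and_record(prompt)  # multimedia file generation step
            if has_user_speech(recording):
                accepted.append(recording)
                break                            # move on to the next prompt
        else:
            raise RuntimeError(f"no answer recorded for {prompt!r}")
    return accepted                              # every prompt was played and answered
```

Once the function returns, every prompt has been played and answered, which corresponds to the condition under which the video and audio files are merged.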

Step 203: send an authentication reference file to a server so that the server authenticates the user based on the authentication reference file.

In this embodiment, based on the target multimedia file obtained in step 202, the execution body may send an authentication reference file to a server (for example, the server 105 shown in Fig. 1) so that the server authenticates the user based on the authentication reference file, where the authentication reference file may include the target multimedia file. Specifically, as an example, the server may play the target multimedia file for an auditor to review, and obtain the authentication result entered by the auditor.

With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for outputting information according to this embodiment. In the application scenario of Fig. 3, the terminal device 301 may, in response to receiving an operation request (a service handling request) 303 from the user 302 for a target product (a bank card), film the user 302, select an unplayed preset voice file from the preset voice file set as the current preset voice file, and execute the multimedia file generation step. Specifically, the terminal device 301 may play the current preset voice file and film the user 302 to obtain video data 305 and audio data 306 for the preset voice file, and encode the obtained video data 305 and audio data 306 respectively to generate a video file 307 and an audio file 308. Then, in response to determining that the audio data 306 includes voice data input by the user 302 for the current preset voice file and that all preset voice files in the preset voice file set have been played, the terminal device 301 may merge the video file 307 and the audio file 308 to generate a target multimedia file 309. Finally, the terminal device 301 may send an authentication reference file 311 to the server 310 so that the server authenticates the user based on the authentication reference file 311, where the authentication reference file 311 includes the target multimedia file 309.

The method provided by the above embodiment of the present application films the user in response to receiving the user's operation request for a target product, selects an unplayed preset voice file from a preset voice file set as the current preset voice file, and executes a multimedia file generation step that includes playing the current preset voice file to obtain video data and audio data for the played preset voice file, and encoding the obtained video data and audio data respectively to generate a video file and an audio file. Then, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, the video file and the audio file are merged to generate a target multimedia file. Finally, an authentication reference file that includes the target multimedia file is sent to a server so that the server authenticates the user based on the authentication reference file. The user is thus authenticated using the target multimedia file generated by filming, which avoids face-to-face authentication and improves the efficiency of authenticating the user.

With further reference to Fig. 4, a flow 400 of another embodiment of the method for outputting information is shown. The flow 400 of the method for outputting information includes the following steps:

Step 401: in response to receiving a user's operation request for a target product, film the user, select an unplayed preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step.

In this embodiment, the execution body on which the method for outputting information runs (for example, the terminal devices 101, 102, 103 shown in Fig. 1) may, in response to receiving a user's operation request for a target product, film the user, select an unplayed preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step.

Step 402: in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, merge the video file and the audio file to generate a target multimedia file.

In this embodiment, the execution body (for example, the terminal devices 101, 102, 103 shown in Fig. 1) may, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that all preset voice files in the preset voice file set have been played, merge the video file and the audio file to generate a target multimedia file.

Steps 401 and 402 are consistent with steps 201 and 202 of the previous embodiment, respectively. The descriptions of steps 201 and 202 above also apply to steps 401 and 402 and are not repeated here.

Step 403, display is directed to the pre-set product information of target product.

In the present embodiment, it can show for the executive agent of the method for output information operation thereon and be produced for target The pre-set product information of product.Wherein, shown product information can be used for checking for user.Product information can be used for The attribute of target product is characterized, product information can include but is not limited at least one of following：Word, number, symbol, image. It should be noted that herein, executive agent can be shown with various forms such as webpage, pictures and be pre-set for target product Product information.

Step 404, it is asked for the information input of product information in response to receiving user, obtains word input by user Information.

In the present embodiment, above-mentioned executive agent can be defeated for the information of the said goods information in response to receiving user Enter request, obtains text information input by user, wherein text information may include the signature of user.Specifically, above-mentioned execution Main body can prestore the image for being useful for receiving word input by user, and in turn, above-mentioned executive agent can obtain above-mentioned Text information on image.

Step 405, acquired text information is added on pre-set image, generates target image.

In the present embodiment, based on the text information in step 404, above-mentioned executive agent can believe acquired word Breath is added on pre-set image, generates target image, wherein pre-set image can include the said goods information.Target image can Think image that is to be output, being audited to it for related technical personnel.

Herein, acquired text information can be added to pre-set image by above-mentioned executive agent by various methods On.As an example, above-mentioned executive agent can identify and obtain the image-region for including above-mentioned text information, to the image-region Sectional drawing is carried out, and then the image that sectional drawing obtains is added on pre-set image by image fusion technology.

Optionally, the execution body may identify the pixels that make up the text information and generate, in a preset blank image, pixels with the same pixel values as the identified pixels (that is, reproduce the text information), thereby obtaining a to-be-added image containing the text information. The to-be-added image is then added to the preset image using an image fusion technique.

It should be noted that image fusion is a well-known technique that is widely studied and applied, and is not described in detail here.

In some optional implementations of this embodiment, the preset image may include a preset image region, where the preset image region is different from the image region occupied by the product information on the preset image; and the execution body may add the obtained text information to the preset image region.
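The pixel-copy variant and the preset image region described above can be illustrated with a minimal sketch. It assumes grayscale images stored as lists of rows, a white background value of 255, and a darkness threshold for deciding which pixels belong to the signature. The names `extract_text_layer`, `overlay`, `BLANK`, and `threshold` are hypothetical, and the simple paste stands in for whatever image fusion technique an implementation actually uses.

```python
BLANK = 255  # assumed background value of the preset blank image (grayscale)


def extract_text_layer(input_img, threshold=128):
    """Reproduce only the text pixels of input_img on a blank image of equal size.

    A pixel is treated as text when it is darker than `threshold` (assumption).
    """
    return [[px if px < threshold else BLANK for px in row] for row in input_img]


def overlay(preset_img, layer, top, left):
    """Return a copy of preset_img with the non-blank pixels of `layer`
    pasted so that the layer's top-left corner lands at (top, left),
    i.e. inside the preset image region reserved for the signature."""
    out = [row[:] for row in preset_img]  # leave the preset image untouched
    for r, row in enumerate(layer):
        for c, px in enumerate(row):
            if px != BLANK:
                out[top + r][left + c] = px
    return out
```

Choosing `top` and `left` so that the pasted layer does not overlap the product information corresponds to the preset image region being different from the product-information region.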

Step 406: send an authentication reference file to a server, so that the server authenticates the user based on the authentication reference file.

In this embodiment, based on the target multimedia file obtained in step 402 and the target image obtained in step 405, the execution body may send an authentication reference file to a server (for example, the server 105 shown in Fig. 1), so that the server authenticates the user based on the authentication reference file, where the authentication reference file may include the target multimedia file and the target image.
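The text does not specify how the authentication reference file is packaged. As one possible illustration only, the sketch below bundles the target multimedia file and the target image into a single in-memory ZIP archive before upload. The function name `build_auth_reference` and the entry names `target_multimedia.mp4` and `target_image.png` are assumptions, not part of the described method.

```python
import io
import zipfile


def build_auth_reference(multimedia_bytes, image_bytes=None):
    """Pack the target multimedia file (and, optionally, the target image)
    into one ZIP archive and return the archive as bytes.

    Entry names are hypothetical; the source does not define a format.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("target_multimedia.mp4", multimedia_bytes)
        if image_bytes is not None:
            zf.writestr("target_image.png", image_bytes)
    return buf.getvalue()
```

The resulting bytes could then be sent to the server over whatever transport the terminal already uses.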

As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the method for outputting information in this embodiment highlights the steps of obtaining the text information input by the user, generating the target image, and outputting the target image as part of the authentication reference file. The scheme described in this embodiment can therefore introduce more data associated with the operation request sent by the user, achieving more comprehensive information output.

With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for outputting information. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied in various electronic devices.

As shown in Fig. 5, the apparatus 500 for outputting information of this embodiment includes a first execution unit 501, a combining unit 502, and a transmission unit 503. The first execution unit 501 is configured to, in response to receiving an operation request from a user for a target product, film the user, select a not-yet-played preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step. The multimedia file generation step includes: playing the current preset voice file to obtain video data and audio data for the played preset voice file; and encoding the obtained video data and audio data respectively to generate a video file and an audio file. The combining unit 502 is configured to, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that the preset voice files in the preset voice file set have all finished playing, merge the video file and the audio file to generate a target multimedia file. The transmission unit 503 is configured to send an authentication reference file to a server, so that the server authenticates the user based on the authentication reference file, where the authentication reference file may include the target multimedia file.

In this embodiment, the first execution unit 501 of the apparatus 500 for outputting information may, in response to receiving an operation request from a user for a target product, film the user, select a not-yet-played preset voice file from the preset voice file set as the current preset voice file, and execute the multimedia file generation step. The target product is the product on which the user intends to operate; specifically, it may be a virtual product (for example, a financial product) or a physical product (for example, a bank card). The operation request may be any of various user-initiated requests related to the target product (for example, a purchase request, an investment request, or a request to handle a service). The preset voice file set corresponds to the target product and may include at least one preset voice file; a preset voice file may include speech recorded in advance for the target product and used for authentication. It should be noted that the preset voice file set may include preset voice files that have not been played. Here, a "not played" mark may be set in advance for each preset voice file that has not been played, so that the first execution unit 501 can select a not-yet-played preset voice file from the preset voice file set as the current preset voice file. Alternatively, the first execution unit 501 may determine and select a not-yet-played preset voice file by querying a play record, and use the selected file as the current preset voice file.

Step 5011: play the current preset voice file to obtain video data and audio data for the played preset voice file.

Step 5012: encode the obtained video data and audio data respectively to generate a video file and an audio file.
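The mark-based selection of a not-yet-played preset voice file described above can be sketched as follows. The class `PresetVoiceFile`, its `played` flag (standing in for the "not played" mark), and the function `choose_current` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PresetVoiceFile:
    path: str
    played: bool = False  # played == False represents the "not played" mark


def choose_current(preset_files):
    """Return the first preset voice file still marked as not played,
    or None when every file in the set has already been played."""
    for voice_file in preset_files:
        if not voice_file.played:
            return voice_file
    return None
```

After a file has been played and the user's answer captured, a caller would set `played = True` so that the next call advances to the next prompt.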

In this embodiment, the combining unit 502 of the apparatus 500 for outputting information may, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that the preset voice files in the preset voice file set have all finished playing, merge the video file and the audio file to generate a target multimedia file. The target multimedia file is the multimedia file to be output and used for authenticating the user.

It can be understood that, as a condition for merging the video file and the audio file into the target multimedia file, it is necessary to determine whether the audio data includes voice data input by the user for the current preset voice file and whether the preset voice files in the preset voice file set have all finished playing.

In this embodiment, the combining unit 502 may determine in various ways whether the audio data includes voice data input by the user for the current preset voice file. As an example, when the influence of environmental noise is small, the combining unit 502 may compare the sound signal of the preset voice file with the sound signal of the audio data to determine whether the audio data includes voice data input by the user. Specifically, if the sound signal of the audio data is identical or close to the sound signal of the preset voice file, it can be determined that the audio data does not include voice data input by the user.
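The signal comparison above is not tied to any particular similarity measure. The sketch below uses cosine similarity between sample sequences as one assumed choice. It presumes the captured audio and the preset voice file are time-aligned, sampled at the same rate, and recorded with little noise. The names `similarity`, `contains_user_voice`, and the 0.95 threshold are hypothetical.

```python
import math


def similarity(a, b):
    """Cosine similarity of two sample sequences over their common length."""
    n = min(len(a), len(b))
    dot = sum(a[i] * b[i] for i in range(n))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(x * x for x in b[:n]))
    if norm_a == 0 or norm_b == 0:
        # Two silent signals count as identical; silence vs. sound does not.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def contains_user_voice(captured, preset, threshold=0.95):
    """If the captured audio is the same as or close to the preset prompt,
    conclude that the user said nothing; otherwise assume user speech."""
    return similarity(captured, preset) < threshold
```

A production system would more likely compare spectral features or use voice activity detection; the sketch only mirrors the "same or close means no user voice" rule.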

Optionally, the combining unit 502 may recognize the audio data as text data and determine, based on the recognized text data, whether the audio data includes voice data input by the user. As an example, suppose the audio data includes the speech from the preset voice file, and a preset text has been recognized in advance for that speech. After the audio data has been recognized as text data, the combining unit 502 may determine whether the recognized text data includes text other than the preset text; if so, it can be determined that the audio data includes voice data input by the user. In addition, it can be understood that when the audio data does not include the speech from the preset voice file, whether the audio data includes voice data input by the user can be determined simply by checking whether any text data can be recognized from the audio data (specifically, if text data is recognized, it can be determined that the audio data includes voice data input by the user).
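The text-based check can be sketched as below, assuming a speech recognizer has already produced the recognized text and that words are separated by whitespace (which would not hold for unsegmented Chinese text). The functions `extra_words` and `contains_user_voice` are hypothetical names.

```python
from collections import Counter


def extra_words(recognized, preset_text=""):
    """Return the recognized words that are not accounted for by the preset text."""
    remaining = Counter(recognized.lower().split()) - Counter(preset_text.lower().split())
    return sorted(remaining.elements())


def contains_user_voice(recognized, preset_text=""):
    """True when the recognized text holds anything beyond the preset text.

    With an empty preset_text this reduces to the simpler case in which the
    captured audio does not contain the prompt: any recognized text at all
    indicates user speech.
    """
    return bool(extra_words(recognized, preset_text))
```

For example, if the prompt is "do you agree" and the recognizer returns "do you agree yes i agree", the leftover words indicate that the user answered.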

In this embodiment, the combining unit 502 may determine in various ways whether the preset voice files in the preset voice file set have all finished playing. As an example, the combining unit 502 may make this determination by checking whether the preset voice file set still includes a preset voice file carrying a "not played" mark; alternatively, the combining unit 502 may make this determination by looking up the play record.
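Both ways of checking completion can be sketched in a few lines. The data shapes are assumptions: a dictionary mapping each file identifier to a played flag for the mark-based variant, and a list of played identifiers for the play-record variant. The function names are hypothetical.

```python
def all_played_by_mark(played_marks):
    """played_marks maps a preset voice file id to True once it has been played.
    Playback is complete when no file is still marked as not played."""
    return all(played_marks.values())


def all_played_by_record(preset_ids, play_record):
    """Playback is complete when every preset voice file id appears
    at least once in the play record."""
    return set(preset_ids).issubset(play_record)
```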

Here, the combining unit 502 may merge the video file and the audio file into the target multimedia file in various ways, which are not limited here. It should be noted that techniques for combining video and audio are well known and widely studied and applied, and are not described in detail here.
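Since no merging tool is named in the text, the following sketch assumes the widely used `ffmpeg` command-line program is installed and muxes the two encoded files without re-encoding. The helper names `build_merge_command` and `merge` are hypothetical, and stream copy only works when the output container supports the codecs of both inputs.

```python
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that takes the video stream from the first
    input and the audio stream from the second, copying both as-is."""
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", video_path,
        "-i", audio_path,
        "-map", "0:v:0",       # first video stream of input 0
        "-map", "1:a:0",       # first audio stream of input 1
        "-c", "copy",          # no re-encoding
        output_path,
    ]


def merge(video_path, audio_path, output_path):
    """Run the merge; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_merge_command(video_path, audio_path, output_path), check=True)
```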

In this embodiment, based on the target multimedia file obtained by the combining unit 502, the transmission unit 503 may send an authentication reference file to a server (for example, the server 105 shown in Fig. 1), so that the server authenticates the user based on the authentication reference file, where the authentication reference file may include the target multimedia file.

In some optional implementations of this embodiment, the authentication reference file may further include a target image, and the target image may be generated by the following steps: displaying product information preset for the target product; in response to receiving an information input request from the user for the product information, obtaining the text information input by the user, where the text information includes the user's signature; and adding the obtained text information to a preset image to generate the target image, where the preset image includes the product information.

In some optional implementations of this embodiment, the preset image may include a preset image region, where the preset image region is different from the image region occupied by the product information on the preset image; and the adding unit may be further configured to add the obtained text information to the preset image region.

In some optional implementations of this embodiment, playing the current preset voice file includes: playing the current preset voice file and displaying text information preset for the played preset voice file.

In some optional implementations of this embodiment, the apparatus 500 for outputting information may further include: a determination unit (not shown in the figure), configured to determine whether the audio data does not include voice data input by the user for the current preset voice file; and a second execution unit (not shown in the figure), configured to execute the multimedia file generation step in response to determining that the audio data does not include voice data input by the user for the current preset voice file.

In some optional implementations of this embodiment, the second execution unit may be further configured to: in response to determining that the audio data includes voice data input by the user for the current preset voice file, determine whether the preset voice files in the preset voice file set have all finished playing; and in response to determining that the preset voice files in the preset voice file set have not all finished playing, select a not-yet-played preset voice file from the preset voice file set as the current preset voice file and execute the multimedia file generation step.
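The interplay of the determination unit and the second execution unit amounts to a retry-and-advance loop: repeat the current prompt until the user answers, then move to the next unplayed prompt, and stop once all prompts are done. A minimal sketch follows. The callables `capture` and `has_user_voice` are injected placeholders for playback, recording, and detection, and the `max_attempts` limit is an added assumption, since the text itself places no bound on retries.

```python
def run_session(preset_files, capture, has_user_voice, max_attempts=3):
    """Drive the multimedia file generation step over all preset voice files.

    capture(f)               -> (video_file, audio_file) after playing f
    has_user_voice(audio, f) -> True if the audio contains the user's answer to f
    Returns the (video_file, audio_file) pairs, one per prompt, ready for merging.
    """
    segments = []
    for voice_file in preset_files:
        for _ in range(max_attempts):
            video, audio = capture(voice_file)
            if has_user_voice(audio, voice_file):
                segments.append((video, audio))
                break  # answer captured: advance to the next unplayed prompt
            # no user voice: execute the generation step again for this prompt
        else:
            raise RuntimeError(f"no user answer captured for {voice_file!r}")
    return segments
```

Once `run_session` returns, every preset voice file has been played and answered, which is the condition under which the video and audio files are merged into the target multimedia file.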

In the apparatus 500 for outputting information provided by the above embodiment of the present application, the first execution unit 501, in response to receiving an operation request from a user for a target product, films the user, selects a not-yet-played preset voice file from the preset voice file set as the current preset voice file, and executes the multimedia file generation step, which includes: playing the current preset voice file to obtain video data and audio data for the played preset voice file; and encoding the obtained video data and audio data respectively to generate a video file and an audio file. The combining unit 502 then, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that the preset voice files in the preset voice file set have all finished playing, merges the video file and the audio file to generate a target multimedia file. Finally, the transmission unit 503 sends an authentication reference file, which includes the target multimedia file, to a server, so that the server authenticates the user based on the authentication reference file. The user is thus authenticated using the target multimedia file generated by filming, which avoids face-to-face authentication and improves the efficiency of authenticating the user.

Referring now to Fig. 6, a schematic structural diagram of a computer system 600 suitable for implementing a terminal device of the embodiments of the present application is shown. The terminal device shown in Fig. 6 is merely an example and should not impose any limitation on the function or scope of use of the embodiments of the present application.

As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including buttons, a microphone, and the like; an output section 607 including a display screen, a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. It should be noted that when the terminal device is a tablet computer, a laptop computer, a desktop computer, or the like, the computer system 600 may also include a removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory. The removable medium 611 may be mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.

In particular, according to embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example (but not limited to), an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in connection with an instruction execution system, apparatus, or device. Also in the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.

The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, this may be described as: a processor including an execution unit, a combining unit, and a transmission unit. The names of these units do not, in some cases, limit the units themselves; for example, the execution unit may also be described as "a unit for executing the multimedia file generation step".

In another aspect, the present application further provides a computer-readable medium. The computer-readable medium may be included in the apparatus described in the above embodiments, or it may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: in response to receiving an operation request from a user for a target product, film the user, select a not-yet-played preset voice file from a preset voice file set as the current preset voice file, and execute a multimedia file generation step, the multimedia file generation step including: playing the current preset voice file to obtain video data and audio data for the played preset voice file; and encoding the obtained video data and audio data respectively to generate a video file and an audio file. The programs further cause the apparatus to: in response to determining that the audio data includes voice data input by the user for the current preset voice file and that the preset voice files in the preset voice file set have all finished playing, merge the video file and the audio file to generate a target multimedia file; and send an authentication reference file to a server, so that the server authenticates the user based on the authentication reference file, where the authentication reference file includes the target multimedia file.

The above description covers only the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims

1. A method for outputting information, comprising:

in response to receiving an operation request from a user for a target product, filming the user, selecting a not-yet-played preset voice file from a preset voice file set as a current preset voice file, and executing a multimedia file generation step, the multimedia file generation step comprising: playing the current preset voice file to obtain video data and audio data for the played preset voice file; and encoding the obtained video data and audio data respectively to generate a video file and an audio file; the method further comprising:

in response to determining that the audio data includes voice data input by the user for the current preset voice file and that the preset voice files in the preset voice file set have all finished playing, merging the video file and the audio file to generate a target multimedia file; and

sending an authentication reference file to a server, so that the server authenticates the user based on the authentication reference file, wherein the authentication reference file includes the target multimedia file.

2. The method according to claim 1, wherein the authentication reference file further includes a target image, the target image being generated by the following steps:

displaying product information preset for the target product;

in response to receiving an information input request from the user for the product information, obtaining text information input by the user, wherein the text information includes a signature of the user; and

adding the obtained text information to a preset image to generate the target image, wherein the preset image includes the product information.

3. The method according to claim 2, wherein the preset image includes a preset image region, the preset image region being different from the image region occupied by the product information on the preset image; and

the adding the obtained text information to a preset image comprises:

adding the obtained text information to the preset image region.

4. The method according to claim 1, wherein the playing the current preset voice file comprises:

playing the current preset voice file and displaying text information preset for the played preset voice file.

5. The method according to any one of claims 1-4, wherein the method further comprises:

determining whether the audio data does not include voice data input by the user for the current preset voice file; and

in response to determining that the audio data does not include voice data input by the user for the current preset voice file, executing the multimedia file generation step.

6. The method according to claim 5, wherein after the determining whether the audio data does not include voice data input by the user for the current preset voice file, the method further comprises:

in response to determining that the audio data includes voice data input by the user for the current preset voice file, determining whether the preset voice files in the preset voice file set have all finished playing; and

in response to determining that the preset voice files in the preset voice file set have not all finished playing, selecting a not-yet-played preset voice file from the preset voice file set as the current preset voice file and executing the multimedia file generation step.

7. An apparatus for outputting information, comprising:

a first execution unit, configured to, in response to receiving an operation request from a user for a target product, film the user, select a not-yet-played preset voice file from a preset voice file set as a current preset voice file, and execute a multimedia file generation step, the multimedia file generation step comprising: playing the current preset voice file to obtain video data and audio data for the played preset voice file; and encoding the obtained video data and audio data respectively to generate a video file and an audio file;

the apparatus further comprising:

a combining unit, configured to, in response to determining that the audio data includes voice data input by the user for the current preset voice file and that the preset voice files in the preset voice file set have all finished playing, merge the video file and the audio file to generate a target multimedia file; and

a transmission unit, configured to send an authentication reference file to a server, so that the server authenticates the user based on the authentication reference file, wherein the authentication reference file includes the target multimedia file.

8. The apparatus according to claim 7, wherein the authentication reference file further includes a target image, the target image being generated by the following steps:

displaying product information preset for the target product;

9. The apparatus according to claim 8, wherein the preset image includes a preset image region, the preset image region being different from the image region occupied by the product information on the preset image; and

the adding unit is further configured to add the obtained text information to the preset image region.

10. The apparatus according to claim 7, wherein the playing the current preset voice file comprises:

11. The apparatus according to any one of claims 7-10, wherein the apparatus further comprises:

a determination unit, configured to determine whether the audio data does not include voice data input by the user for the current preset voice file; and

a second execution unit, configured to execute the multimedia file generation step in response to determining that the audio data does not include voice data input by the user for the current preset voice file.

12. The apparatus according to claim 11, wherein the second execution unit is further configured to:

13. A terminal, comprising:

one or more processors; and

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-6.

14. A computer storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1-6.