
CN111724795A - Photo album playing method and device

Info

Publication number: CN111724795A
Application number: CN202010556474.XA
Authority: CN
Inventors: 侯方
Original assignee: Haier Uplus Intelligent Technology Beijing Co Ltd
Current assignee: Haier Uplus Intelligent Technology Beijing Co Ltd
Priority date: 2020-06-17
Filing date: 2020-06-17
Publication date: 2020-09-29

Abstract

An embodiment of the invention provides an album playing method and device. The method comprises: collecting first voice information for awakening an album, and extracting a target voiceprint feature of the first voice information; awakening the album if the target voiceprint feature matches one of one or more previously stored voiceprint features; receiving second voice information and extracting an operation instruction corresponding to the second voice information; and playing the album according to the operation instruction. This solves the problem in the related art that, because album playback is controlled by finger touch, the album cannot be played when the user is too far from the device to touch it, and it improves the user experience by controlling album playback by voice.

Description

Photo album playing method and device

Technical Field

The invention relates to the field of communication, in particular to an album playing method and an album playing device.

Background

Most album players currently on the market use a traditional touch-selection mode: the user must be in contact with the device and select the required function by finger touch, which is very inconvenient when the user is far from the device.

No solution has yet been proposed for the problem in the related art that, because album playback is controlled by finger touch, the album cannot be played when the user is too far from the device to touch it.

Disclosure of Invention

Embodiments of the invention provide an album playing method and device, which at least solve the problem in the related art that, because album playback is controlled by finger touch, the album cannot be played when the user is too far from the device to touch it.

According to an embodiment of the present invention, there is provided an album playing method including:

collecting first voice information for awakening an album, and extracting target voiceprint characteristics of the first voice information;

awakening the album if the target voiceprint feature matches one of one or more previously stored voiceprint features;

receiving second voice information and extracting an operation instruction corresponding to the second voice information;

and playing the photo album according to the operation instruction.

Optionally, playing the album according to the operation instruction includes:

determining a target user identifier corresponding to the target voiceprint feature according to a pre-stored correspondence between the voiceprint feature and the user identifier;

playing the photo album according to a playing mode preset for the target user identifier, wherein the playing mode comprises at least one of the following: a playing target, a playing sequence and a playing speed, wherein the playing target corresponds to a picture category.

Optionally, after the album is played according to a playing mode set for the target user identifier in advance, the method further includes:

receiving third voice information;

identifying an adjusting instruction for adjusting the playing mode corresponding to the third voice information;

adjusting the playing mode according to the adjusting instruction;

and playing the photo album according to the adjusted playing mode.

Optionally, before collecting the first voice message for waking up the photo album, the method further includes:

collecting one or more voice unlocking instructions for triggering the photo album;

extracting voiceprint characteristics of the one or more voice unlocking instructions;

setting one or more voiceprint features as a voiceprint lock of the album, and storing the one or more voiceprint features.

Optionally, after extracting the voiceprint features of the one or more voice unlocking instructions, the method further comprises:

prompting, through a display interface, for a user identifier to be set for each voiceprint feature;

determining a user identifier corresponding to each voiceprint feature according to the interactive operation on the display interface;

and storing the corresponding relation between the voiceprint characteristics and the user identification.

Optionally, the method further comprises:

extracting picture information in the photo album, wherein the picture information comprises at least one of the following: shooting time, shooting place, picture tag, album name;

classifying the pictures in the photo album according to the picture information to obtain a plurality of picture categories;

and setting a playing mode for the user identification according to the plurality of picture categories.

Optionally, after the album is played according to the operation instruction, the method further includes:

receiving fourth voice information;

identifying a closing instruction for closing the photo album corresponding to the fourth voice information;

and closing the photo album according to the closing instruction.

According to another embodiment of the present invention, there is also provided an album playing apparatus including:

the first acquisition module is used for acquiring first voice information for awakening an album and extracting target voiceprint characteristics of the first voice information;

the awakening module is used for awakening the photo album under the condition that the target voiceprint characteristics are matched with one of one or more prestored voiceprint characteristics;

the first extraction module is used for receiving second voice information and extracting an operation instruction corresponding to the second voice information;

and the first playing module is used for playing the photo album according to the operation instruction.

Optionally, the first playing module includes:

the determining submodule is used for determining a target user identifier corresponding to the target voiceprint feature according to the corresponding relation between the prestored voiceprint feature and the user identifier;

the playing submodule is used for playing the album according to a playing mode preset for the target user identifier, wherein the playing mode comprises at least one of the following: a playing target, a playing sequence and a playing speed, wherein the playing target corresponds to a picture category.

Optionally, the apparatus further comprises:

the first receiving module is used for receiving third voice information;

the first identification module is used for identifying an adjustment instruction for adjusting the playing mode corresponding to the third voice information;

the adjusting module is used for adjusting the playing mode according to the adjusting instruction;

and the second playing module is used for playing the photo album according to the adjusted playing mode.

Optionally, the apparatus further comprises:

the second acquisition module is used for acquiring one or more voice unlocking instructions for triggering the photo album;

the second extraction module is used for extracting the voiceprint characteristics of the one or more voice unlocking instructions;

the first storage module is used for setting one or more voiceprint characteristics as a voiceprint lock of the photo album and storing the one or more voiceprint characteristics.

Optionally, the apparatus further comprises:

the first setting module is used for setting the user identification of each voiceprint feature through a display interface prompt;

the determining module is used for determining the user identification corresponding to each voiceprint feature according to the interactive operation on the display interface;

and the second storage module is used for storing the corresponding relation between the voiceprint characteristics and the user identification.

Optionally, the apparatus further comprises:

the third extraction module is used for extracting the picture information in the album, wherein the picture information comprises at least one of the following: shooting time, shooting place, picture tag, album name;

the classification module is used for classifying the pictures in the photo album according to the picture information to obtain a plurality of picture categories;

and the second setting module is used for respectively setting a playing mode for the user identifier according to the plurality of picture categories.

Optionally, the apparatus further comprises:

the second receiving module is used for receiving fourth voice information;

the second identification module is used for identifying a closing instruction for closing the photo album corresponding to the fourth voice information;

and the closing module is used for closing the photo album according to the closing instruction.

According to a further embodiment of the present invention, a computer-readable storage medium is also provided, in which a computer program is stored, wherein the computer program is configured to perform the steps of any of the above-described method embodiments when executed.

According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.

According to the invention, first voice information for awakening the photo album is collected, and the target voiceprint feature of the first voice information is extracted; the album is awakened if the target voiceprint feature matches one of one or more previously stored voiceprint features; second voice information is received and an operation instruction corresponding to the second voice information is extracted; and the photo album is played according to the operation instruction. This solves the problem in the related art that, because album playback is controlled by finger touch, the album cannot be played when the user is too far from the device to touch it, and it improves the user experience by controlling album playback by voice.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a block diagram of a hardware configuration of a mobile terminal of an album playing method according to an embodiment of the present invention;

FIG. 2 is a flowchart of an album playing method according to an embodiment of the present invention;

FIG. 3 is a flow chart of a voice-controlled smart photo album playing according to an embodiment of the present invention;

FIG. 4 is a block diagram of an album playing apparatus according to an embodiment of the present invention.

Detailed Description

The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

Example 1

The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking a mobile terminal as an example, FIG. 1 is a block diagram of a hardware structure of the mobile terminal of the album playing method according to the embodiment of the present invention. As shown in FIG. 1, the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microcontroller unit (MCU) or a field-programmable gate array (FPGA)) and a memory 104 for storing data. Optionally, the mobile terminal may further include a transmission device 106 for a communication function and an input/output device 108. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only an illustration and does not limit the structure of the mobile terminal. For example, the mobile terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.

The memory 104 may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the album playing method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a network interface controller (NIC) that can be connected to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 can be a radio frequency (RF) module, which is used to communicate with the internet wirelessly.

Based on the above mobile terminal or network architecture, this embodiment provides an album playing method. FIG. 2 is a flowchart of the album playing method according to the embodiment of the present invention. As shown in FIG. 2, the flow includes the following steps:

Step S202, collecting first voice information for awakening the photo album, and extracting target voiceprint characteristics of the first voice information;

In the embodiment of the invention, the photo album is awakened through the first voice information, which acts as a voiceprint lock. The voiceprint lock is built on voiceprint recognition technology and is a specific application of it. Voice biometric recognition, also called speaker recognition and commonly known as voiceprint recognition, is a biometric method that automatically identifies a speaker from the physiological and behavioral characteristics of the speaker's pronunciation. The security provided by voiceprint recognition can be as high as that of other biometric technologies (e.g., fingerprint, palm print, and iris). Because voice signals are easy to transmit and acquire remotely, voiceprint recognition is particularly well suited to telecommunication and network-based identity recognition applications.

A voiceprint is a sound spectrum, displayed by an electroacoustic instrument, that carries speech information, and the voiceprint spectra of any two persons are different. Voiceprint recognition (VPR), also known as speaker recognition, falls into two categories: speaker identification and speaker verification. The former determines which of several people spoke a given segment of speech, which is a one-of-many selection problem; the latter confirms whether a given segment of speech was spoken by a specified person, which is a one-to-one decision problem. Whether for identification or verification, the speaker's voiceprint must first be modeled, which is the so-called "training" or "learning" process.

The main tasks of voiceprint recognition include: voice signal processing, voiceprint feature extraction, voiceprint modeling, voiceprint comparison, decision discrimination and the like.

Step S204, awakening the photo album under the condition that the target voiceprint feature is matched with one of one or more prestored voiceprint features;

Step S206, receiving second voice information and extracting an operation instruction corresponding to the second voice information;

Speech recognition in embodiments of the present invention refers to a technique by which a machine converts speech signals into corresponding text or commands through a process of recognition and understanding. Speech recognition is based on pattern recognition, and the process can be summarized as: preprocessing, feature extraction, pattern matching against a speech model library, language processing against a language model library, and completion of recognition.

Step S208, playing the photo album according to the operation instruction.

Further, the step S208 may specifically include: determining a target user identifier corresponding to the target voiceprint feature according to a pre-stored correspondence between the voiceprint feature and the user identifier; and playing the photo album according to a playing mode preset for the target user identifier, wherein the playing mode comprises at least one of the following: a playing target, a playing sequence and a playing speed, wherein the playing target corresponds to a picture category.

Through the steps S202 to S208, first voice information for awakening the photo album is collected, and the target voiceprint feature of the first voice information is extracted; the album is awakened if the target voiceprint feature matches one of one or more previously stored voiceprint features; second voice information is received and an operation instruction corresponding to the second voice information is extracted; and the photo album is played according to the operation instruction. This solves the problem in the related art that, because album playback is controlled by finger touch, the album cannot be played when the user is too far from the device to touch it, and it improves the user experience by controlling album playback by voice.
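The matching condition of step S204 can be illustrated with a minimal sketch. The source does not say how two voiceprint features are compared, so the sketch assumes, purely for illustration, that a feature is a fixed-length numeric vector and that a match means a cosine similarity at or above a hypothetical threshold. The names `cosine_similarity`, `match_voiceprint`, `should_wake_album`, and `MATCH_THRESHOLD` are not from the source.

```python
import math
from typing import Optional, Sequence

# Hypothetical threshold; the source does not specify how a match is decided.
MATCH_THRESHOLD = 0.8


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_voiceprint(
    target: Sequence[float],
    stored: Sequence[Sequence[float]],
    threshold: float = MATCH_THRESHOLD,
) -> Optional[int]:
    """Return the index of the best stored feature at or above the threshold, else None."""
    best_index, best_score = None, threshold
    for index, feature in enumerate(stored):
        score = cosine_similarity(target, feature)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


def should_wake_album(
    target: Sequence[float], stored: Sequence[Sequence[float]]
) -> bool:
    """Step S204: wake the album only if the target matches a stored feature."""
    return match_voiceprint(target, stored) is not None
```

Returning the index of the matched feature, rather than only a yes/no result, lets a later step look up which enrolled user spoke.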

When a user wants to open the album by voice, the user says 'open album' to unlock it. After the album is unlocked, the user can speak instructions such as 'play the next one', 'play the previous one', 'play a bit faster', and 'play a bit slower' to the device to control how it plays; the user can also say 'play people', 'play photos from February 19', 'play favorite photos', and so on to control what the device plays. In addition, the device can by default intelligently display the photos each user prefers, according to that user's playing habits. Finally, after the album has been played, the user can say 'close the album' to put the device into standby, where it waits to be called again.
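As an illustration of extracting an operation instruction from recognized speech, the following sketch maps already-transcribed text to an operation. It assumes that a speech recognizer has produced the text; the phrase table, the `Operation` enum, and `parse_instruction` are hypothetical names and English renderings chosen for the example, not part of the source.

```python
from enum import Enum
from typing import Optional, Tuple


class Operation(Enum):
    NEXT = "next"
    PREVIOUS = "previous"
    FASTER = "faster"
    SLOWER = "slower"
    PLAY_CATEGORY = "play_category"
    CLOSE = "close"


# Hypothetical phrase table; real utterances would come from a speech recognizer.
FIXED_COMMANDS = {
    "play the next one": Operation.NEXT,
    "play the previous one": Operation.PREVIOUS,
    "play a bit faster": Operation.FASTER,
    "play a bit slower": Operation.SLOWER,
    "close the album": Operation.CLOSE,
}


def parse_instruction(text: str) -> Optional[Tuple[Operation, Optional[str]]]:
    """Map recognized text to (operation, argument); None if not understood."""
    # Lowercase and collapse whitespace so minor transcription noise is tolerated.
    normalized = " ".join(text.lower().split())
    if normalized in FIXED_COMMANDS:
        return FIXED_COMMANDS[normalized], None
    if normalized.startswith("play "):
        # Anything else after "play" is treated as a content request, e.g. "people".
        return Operation.PLAY_CATEGORY, normalized[len("play "):]
    return None
```

Fixed phrases are checked before the generic "play ..." rule so that 'play the next one' is not mistaken for a request to play a category named 'the next one'.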

In the embodiment of the invention, the user can also adjust the playing mode, and after the album is played according to the playing mode preset for the target user identifier, third voice information is received; identifying an adjusting instruction for adjusting the playing mode corresponding to the third voice information; adjusting the playing mode according to the adjusting instruction; and playing the photo album according to the adjusted playing mode.
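The adjustment of a playing mode by a third voice instruction can be sketched as follows. The playing mode is modeled with the three elements named in the text (playing target, playing sequence, playing speed). The instruction strings ("faster", "slower", "target:...", "order:..."), the halving and doubling of the display time, and the one-second floor are assumptions made only for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PlayMode:
    target: str               # picture category to play
    order: str                # playing sequence, e.g. "chronological" (assumed value)
    seconds_per_photo: float  # playing speed, expressed as display time per picture


def adjust_play_mode(mode: PlayMode, instruction: str) -> PlayMode:
    """Return a new play mode after applying one adjustment instruction."""
    if instruction == "faster":
        # Halve the display time, but never go below one second (assumed floor).
        return replace(mode, seconds_per_photo=max(1.0, mode.seconds_per_photo / 2))
    if instruction == "slower":
        return replace(mode, seconds_per_photo=mode.seconds_per_photo * 2)
    if instruction.startswith("target:"):
        return replace(mode, target=instruction[len("target:"):])
    if instruction.startswith("order:"):
        return replace(mode, order=instruction[len("order:"):])
    return mode  # unknown instructions leave the mode unchanged
```

Keeping `PlayMode` immutable means the preset mode stored for a user is not altered by a one-off adjustment; the adjusted copy is what gets played.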

In the embodiment of the invention, before the first voice information for awakening the photo album is collected, one or more voice unlocking instructions for triggering the photo album are collected; voiceprint features of the one or more voice unlocking instructions are extracted; and the one or more voiceprint features are set as a voiceprint lock of the photo album and stored. When a user subsequently tries to wake up the album, the user's voiceprint feature can then be compared against the stored ones, and only if the match succeeds is the album woken up and its playback made controllable. In other words, the voiceprint lock is enrolled first, and the device uses the user's voice as the unlocking password. A voice unlocking instruction, 'open album', can be recorded for each family member; the voice characteristics of that member are obtained and used as the voiceprint lock, and an identity such as dad or son can be assigned to each voiceprint lock.
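The enrollment described above (collect unlocking instructions, extract features, store them as the voiceprint lock) might be sketched as below. The feature extractor is injected as a function because the source does not name one; `toy_extract` is a placeholder that computes two simple statistics and is not a real voiceprint feature. `VoiceprintLock` and its methods are hypothetical names.

```python
from typing import Callable, Iterable, List, Sequence

Feature = List[float]


class VoiceprintLock:
    """Stores the enrolled voiceprint features that may unlock the album."""

    def __init__(self, extract: Callable[[Sequence[float]], Feature]) -> None:
        self._extract = extract
        self._features: List[Feature] = []

    def enroll(self, recordings: Iterable[Sequence[float]]) -> int:
        """Extract and store one feature per recording; return the total stored."""
        for samples in recordings:
            self._features.append(self._extract(samples))
        return len(self._features)

    @property
    def features(self) -> List[Feature]:
        """A copy of the stored features, so callers cannot modify the lock."""
        return [list(feature) for feature in self._features]


def toy_extract(samples: Sequence[float]) -> Feature:
    """Placeholder extractor (mean and peak amplitude); not a real voiceprint."""
    if not samples:
        raise ValueError("empty recording")
    return [sum(samples) / len(samples), max(abs(s) for s in samples)]
```

In a real system the extractor would be replaced by a speaker-embedding model; the storage logic would stay the same.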

In an optional embodiment, after the voiceprint features of the one or more voice unlocking instructions are extracted, the display interface prompts for a user identifier to be set for each of the one or more voiceprint features, that is, the user sets corresponding identifiers such as dad, mom, or baby; the user identifier corresponding to each voiceprint feature is determined according to the interactive operation on the display interface; and the correspondence between the voiceprint features and the user identifiers is stored.
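The stored correspondence between voiceprint features and user identifiers, and its use for selecting a preset playing mode, can be sketched with two dictionaries. This is only one possible representation: it assumes the stored feature that matched is available as a key, and it represents a playing mode as a plain dictionary. `UserRegistry` and its method names are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

FeatureKey = Tuple[float, ...]


class UserRegistry:
    """Correspondence between stored voiceprint features and user identifiers."""

    def __init__(self) -> None:
        self._user_by_feature: Dict[FeatureKey, str] = {}
        self._mode_by_user: Dict[str, Dict[str, str]] = {}

    def bind(self, feature: Sequence[float], user_id: str) -> None:
        """Record that a stored feature belongs to the given user identifier."""
        self._user_by_feature[tuple(feature)] = user_id

    def set_play_mode(self, user_id: str, mode: Dict[str, str]) -> None:
        """Preset a playing mode for a user identifier."""
        self._mode_by_user[user_id] = dict(mode)

    def user_for(self, stored_feature: Sequence[float]) -> Optional[str]:
        """Look up the user identifier bound to a stored feature, if any."""
        return self._user_by_feature.get(tuple(stored_feature))

    def play_mode_for(
        self, stored_feature: Sequence[float], default: Dict[str, str]
    ) -> Dict[str, str]:
        """Return the user's preset mode, or a copy of the default if none is set."""
        user_id = self.user_for(stored_feature)
        if user_id is None:
            return dict(default)
        return dict(self._mode_by_user.get(user_id, default))
```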

In another optional embodiment, the pictures in the photo album may be classified and corresponding playing modes set for different users. Specifically, picture information in the photo album is extracted, where the picture information includes at least one of the following: shooting time, shooting place, picture tag, and album name. Picture tags can be set manually, or assigned automatically after the picture content is identified by image recognition; a tag may be, for example, people or scenery. The pictures in the photo album are classified according to the picture information to obtain a plurality of picture categories. For example, the first category is pictures taken at a first place, the second category is pictures taken at a second place, the third category is pictures of people, the fourth category is landscapes, and the fifth category is pictures from 2018. A picture category can also be an album name created by the user in advance, and the same picture can belong to one category or to several. Playing modes are then set for the user identifiers according to the plurality of picture categories, and different playing modes can be set for different user identifiers. That is, the device can classify the pictures and play them intelligently by category: the shooting time and place are extracted from the picture's shooting information so that pictures can be played by date and place; people and scenery can be recognized so that pictures can be played by type; and albums can be given custom names so that they can be played by album name.
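A minimal sketch of the classification step follows. It groups pictures by the four kinds of picture information named above and allows one picture to fall into several categories. The `Picture` fields, the `classify` function, and the category key format ("year:", "place:", "tag:", "album:") are assumptions for illustration; grouping shooting time by year is one choice among many.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Picture:
    name: str
    taken_on: Optional[date] = None                 # shooting time
    place: Optional[str] = None                     # shooting place
    tags: List[str] = field(default_factory=list)   # picture tags
    album: Optional[str] = None                     # album name


def classify(pictures: List[Picture]) -> Dict[str, List[str]]:
    """Group picture names by category; one picture may land in several."""
    categories: Dict[str, List[str]] = defaultdict(list)
    for pic in pictures:
        if pic.taken_on is not None:
            categories[f"year:{pic.taken_on.year}"].append(pic.name)
        if pic.place:
            categories[f"place:{pic.place}"].append(pic.name)
        for tag in pic.tags:
            categories[f"tag:{tag}"].append(pic.name)
        if pic.album:
            categories[f"album:{pic.album}"].append(pic.name)
    return dict(categories)
```

A playing target in a user's playing mode could then simply be one of these category keys.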

In the embodiment of the present invention, the played album or the album being played can be closed through voice control, and specifically, after the album is played according to the operation instruction, fourth voice information is received; identifying a closing instruction for closing the photo album corresponding to the fourth voice information; and closing the photo album according to the closing instruction.

Fig. 3 is a flowchart of playing a voice-controlled smart album according to an embodiment of the present invention, as shown in fig. 3, including:

Step S301, voice unlocking instructions of one or more users are recorded, and the users' voiceprint features are extracted to obtain voiceprint locks. Only when an enrolled user says 'open album' can the device's album playing function be invoked; if anyone else says 'open album', the device is not woken up.

Step S302, user information is created by entering details such as the user's name, sex, age, identity, and role.

Step S303, awakening the photo album by voice, and awakening the equipment when the user says 'open photo album';

Step S304, judging whether a voice instruction triggering playing is received within a preset time; if so, executing step S305; otherwise, entering standby;

Step S305, pictures that the user may like are played according to the user's playing instruction. At this point the device can identify which user is speaking from the user's voice, and it records the user's playing preferences each time. After the photo album is awakened, default playback can begin, with the default content recommended from the user's past playing records.

Step S306, judging whether an adjusting voice command for adjusting the playing mode is received, if so, executing step S307, otherwise, executing step S309;

Step S307, adjusting the playing mode according to the adjustment voice instruction; that is, if the user issues a voice instruction, playback follows it. When the user does not like the default content, the user can control what the album plays by voice, for example by saying 'play the photos taken while traveling in Japan', 'play my favorites', 'play the previous one', or 'pause';

Step S308, continuing to play according to the default playing mode; that is, if the user issues no voice instruction, the album keeps playing the default content;

Step S309, the device enters standby mode. Specifically, the device may enter standby directly after the user has finished viewing the photo album, or after receiving the user's 'close the album' instruction, and then waits to be awakened again.
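The wake, play, adjust, and standby portion of the flow of FIG. 3 can be read as a small state machine. The sketch below is a simplified interpretation, not the flowchart itself: the three states and the event names ("wake_matched", "play", "timeout", "adjust", "close", "finished") are hypothetical, and the enrollment steps S301 and S302 are omitted.

```python
from enum import Enum, auto
from typing import Iterable


class State(Enum):
    STANDBY = auto()
    AWAKE = auto()
    PLAYING = auto()


def next_state(state: State, event: str) -> State:
    """Transition function for a simplified reading of steps S303 to S309."""
    if state is State.STANDBY and event == "wake_matched":
        return State.AWAKE      # S303: an enrolled user said the wake phrase
    if state is State.AWAKE and event == "play":
        return State.PLAYING    # S304 to S305: play instruction arrived in time
    if state is State.AWAKE and event == "timeout":
        return State.STANDBY    # S304: nothing received within the preset time
    if state is State.PLAYING and event == "adjust":
        return State.PLAYING    # S306 to S307: keep playing with the adjusted mode
    if state is State.PLAYING and event in ("close", "finished"):
        return State.STANDBY    # S309: back to standby
    return state                # any other event is ignored


def run(events: Iterable[str]) -> State:
    """Feed a sequence of events to the machine, starting from standby."""
    state = State.STANDBY
    for event in events:
        state = next_state(state, event)
    return state
```

For instance, a wake attempt by an unenrolled speaker produces no "wake_matched" event, so the machine stays in standby.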

According to the embodiment of the invention, the user issues instructions to play the photo album through speech recognition, so that the user can control album playback remotely, and the album can be locked by voiceprint to protect privacy. When the user is some distance from the album playing device, the user does not need to get up, approach the device, and control the album by finger touch, which makes the device more convenient and improves the user experience. Changing album control from finger touch to voice control is simple and convenient, and lets the user control the album while doing other things, which improves efficiency.

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

Example 2

In this embodiment, an album playing apparatus is further provided, and the apparatus is used to implement the foregoing embodiments and preferred embodiments, which have already been described and are not described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.

Fig. 4 is a block diagram of an album playing apparatus according to an embodiment of the present invention, as shown in fig. 4, including:

the first acquisition module 42 is configured to acquire first voice information for waking up an album, and extract a target voiceprint feature of the first voice information;

a wake-up module 44 configured to wake up the album if the target voiceprint feature matches one of one or more previously stored voiceprint features;

a first extracting module 46, configured to receive the second voice information and extract an operation instruction corresponding to the second voice information;

and the first playing module 48 is used for playing the photo album according to the operation instruction.

Optionally, the first playing module 48 includes:

Optionally, the apparatus further comprises:

the first receiving module is used for receiving third voice information;

Optionally, the apparatus further comprises:

and the second storage module is used for storing the corresponding relation between the one or more voiceprint characteristics and the user identification.

Optionally, the apparatus further comprises:

the second receiving module is used for receiving fourth voice information;

It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
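When the modules are implemented in software, the four modules of FIG. 4 could, for example, be composed as injected functions. The sketch below is an assumption about one possible structure, not the apparatus as claimed: the field names, the `handle` method, and the simplification that the second voice information arrives as already-recognized text are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class AlbumPlayingApparatus:
    # Each field stands in for one module of FIG. 4 (reference numerals in comments).
    acquire: Callable[[Sequence[float]], List[float]]     # first acquisition module 42
    wake: Callable[[List[float]], bool]                   # wake-up module 44
    extract_instruction: Callable[[str], Optional[str]]   # first extraction module 46
    play: Callable[[str], str]                            # first playing module 48

    def handle(self, first_voice: Sequence[float], second_voice_text: str) -> Optional[str]:
        """Run the four modules in order; return None if any stage declines."""
        feature = self.acquire(first_voice)
        if not self.wake(feature):
            return None  # voiceprint did not match, so the album stays locked
        instruction = self.extract_instruction(second_voice_text)
        if instruction is None:
            return None  # no operation instruction could be extracted
        return self.play(instruction)
```

Because each module is a plain callable, the same composition works whether the modules run in one processor or are dispatched to different ones.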

Example 3

Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.

Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:

s1, collecting first voice information for awakening the photo album, and extracting target voiceprint characteristics of the first voice information;

s2, when the target voiceprint feature is matched with one of one or more prestored voiceprint features, waking up the photo album;

s3, receiving second voice information and extracting an operation instruction corresponding to the second voice information;

and S4, playing the photo album according to the operation instruction.

Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk, each of which can store computer programs.

Example 4

Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.

Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.

Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:

and S4, playing the photo album according to the operation instruction.

Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.

It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims

1. An album playing method, comprising:

and playing the photo album according to the operation instruction.

2. The method of claim 1, wherein playing the album according to the operation instruction comprises:

3. The method according to claim 2, wherein after playing the album according to a playing mode set for the target user identifier in advance, the method further comprises:

receiving third voice information;

adjusting the playing mode according to the adjusting instruction;

and playing the photo album according to the adjusted playing mode.

4. The method of claim 1, wherein prior to collecting the first voice message for waking up the photo album, the method further comprises:

5. The method of claim 4, wherein after extracting voiceprint features of the one or more voice unlock instructions, the method further comprises:

6. The method of claim 5, further comprising:

7. The method according to any one of claims 1 to 6, wherein after playing the album according to the operation instruction, the method further comprises:

receiving fourth voice information;

and closing the photo album according to the closing instruction.

8. An album playing apparatus comprising:

9. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to carry out the method of any one of claims 1 to 7 when executed.

10. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 7.