CN113012693B - Voice-based local media screening and playing method and device, terminal equipment and medium

Info

Publication number: CN113012693B
Application number: CN202110187922.8A
Authority: CN
Inventors: 晋晓琼; 王玉斌
Original assignee: Nanjing Skyworth Institute Of Information Technology Co ltd; Shenzhen Skyworth RGB Electronics Co Ltd
Current assignee: Nanjing Skyworth Institute Of Information Technology Co ltd; Shenzhen Skyworth RGB Electronics Co Ltd
Priority date: 2021-02-18
Filing date: 2021-02-18
Publication date: 2024-04-30
Anticipated expiration: 2041-02-18
Also published as: CN113012693A

Abstract

The invention discloses a voice-based local media screening and playing method, a device, a terminal device and a medium. The method comprises the following steps: acquiring a voice signal, recognizing the voice signal, and determining the corresponding voice instruction; screening local media resources according to the voice instruction to select the media resources corresponding to it; and playing the selected media resources. The invention solves the problem that sound resources in a local storage device are inconvenient to screen: the user can search for and play the local sound resource he or she wants to hear through a voice instruction. This makes screening local media resources on an intelligent terminal more convenient for users.

Description

Voice-based local media screening and playing method and device, terminal equipment and medium

Technical Field

The present invention relates to the field of audio playing technologies, and in particular, to a local media screening playing method and apparatus based on voice, a terminal device, and a storage medium.

Background

With the development of technology and the continuous improvement of living standards, intelligent terminals of all kinds are increasingly widespread and have become an indispensable communication tool in people's lives. Most intelligent terminals have a media playing function and either store some local media resources or read the media resources of an externally connected mobile storage device; in the prior art, however, screening local media on such terminals is inconvenient. At present, many mobile storage devices preloaded with a large amount of music or other sound resources are sold on the market. These resources can be played on a host without a network connection, and because the devices are cheap and the bundled resources are abundant, they are popular with many people; a search on Taobao shows past sales of such devices exceeding 100,000 units. However, because the amount of content is so large, screening the resources is difficult, and using them is sometimes inconvenient for users.

Accordingly, there is a need for improvement and advancement in the art.

Disclosure of Invention

In view of the defects of the prior art, the invention provides a voice-based local media screening and playing method, a device, a terminal device and a storage medium, which solve the problem that sound resources in a local storage device are inconvenient to screen: the user can search for and play the local sound resource he or she wants to hear through a voice instruction. This makes screening local media resources on an intelligent terminal more convenient for users.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows:

a local media screening and playing method based on voice includes:

acquiring a voice signal, identifying the voice signal, and determining a corresponding voice instruction;

the local media resources are screened according to the voice command, and the media resources corresponding to the voice command are screened;

and playing the screened media resources.

The method for screening and playing the local media based on the voice, wherein the steps of acquiring the voice signal, identifying the voice signal and determining the corresponding voice instruction comprise the following steps:

Presetting a database table of a preset size for storing all parsed file information, wherein files with the same file size and the same MD5 values of the file head and tail are regarded as the same file; wherein the file information includes: file name, album name, category, singer name, duration, file size.

acquiring a voice signal through a microphone, and identifying the voice signal;

A corresponding voice command is identified.

The method for screening and playing the local media based on the voice, wherein the step of screening the local media resources according to the voice instruction comprises the following steps:

scanning a local media file or a media file in newly accessed mobile storage equipment in advance;

reading file information of the media file according to the scanned resource path, and storing the file information and the resource path together into a corresponding local database;

the MD5 values for the header and trailer of the file are stored in the database at the same time.

The method for screening and playing the local media based on the voice, wherein the step of reading file information of the media file according to the scanned resource path and storing the file information and the resource path together into a corresponding local database further comprises the following steps:

when a media file with incomplete information is found while scanning a local media file or a media file in a newly accessed mobile storage device, searching the database table of the preset size and filling in the missing information;

when the related file information is not found in the database table of the preset size, obtaining it from network resources through a crawler algorithm and storing it in the corresponding local database.

Extracting corresponding voice keywords according to the voice command;

scanning a local media file or a media file in a newly accessed mobile storage device according to the extracted voice keyword;

scanning and searching media files corresponding to and matched with the voice keywords;

and screening and outputting the media files correspondingly matched with the voice keywords.

The method for screening and playing the local media based on the voice, wherein the step of playing the screened media resources comprises the following steps:

Acquiring a media file which is screened and output;

and playing the screened media resources according to a preset sequence.

A voice-based local media screening playback apparatus, wherein the apparatus comprises:

the acquisition module is used for acquiring a voice signal, identifying the voice signal and determining a corresponding voice instruction;

The screening module is used for screening the local media resources according to the voice command and screening media resources corresponding to the voice command;

the play control module is used for playing the screened media resources;

the presetting module is used for presetting a database table with a preset size and storing all the parsed file information, wherein the file size and the MD5 value of the head and tail of the file are the same and are regarded as the same file; wherein the file information includes: file name, album name, category, singer name, duration, file size;

The pre-scanning module is used for scanning the local media file or the media file in the newly accessed mobile storage device in advance; reading file information of the media file according to the scanned resource path, and storing the file information and the resource path together into a corresponding local database; the MD5 values for the header and trailer of the file are stored in the database at the same time.

A terminal device, wherein the terminal device comprises a memory, a processor and a voice-based local media screening playback program stored on the memory and executable on the processor, the processor implementing the steps of any one of the voice-based local media screening playback methods when executing the voice-based local media screening playback program.

A computer readable storage medium having stored thereon a voice-based local media screening playback program which, when executed by a processor, implements the steps of any of the voice-based local media screening playback methods.

The beneficial effects are as follows: compared with the prior art, the invention provides a voice-based local media screening and playing method that realizes the voice function using the microphone of the host and rapidly screens the content of the mobile storage device using an algorithm, so that the user can quickly hear the desired content through a voice instruction. For example, if the user wants to hear rock songs, all rock songs in the storage device can be screened out and played, which is convenient for the user.

Drawings

Fig. 1 is a flowchart of a specific implementation of a voice-based local media screening and playing method according to embodiment 1 of the present invention.

Fig. 2 is a flowchart of speech recognition in the local media screening and playing method based on speech provided in embodiment 1 of the present invention.

Fig. 3 is a flowchart of media file scanning and screening in the voice-based local media screening and playing method according to embodiment 1 of the present invention.

Fig. 4 is an interaction flow chart of a local media screening and playing method based on voice provided in embodiment 2 of the present invention.

Fig. 5 is a data updating flow chart of the voice-based local media screening and playing method according to embodiment 2 of the present invention.

Fig. 6 is a schematic block diagram of a voice-based local media screening and playing device according to an embodiment of the present invention.

Fig. 7 is a schematic block diagram of an internal structure of a terminal device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and effects of the present invention clearer and more specific, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

It should be noted that, if directional indications (such as up, down, left, right, front, rear, and so on) are included in the embodiments of the present invention, the directional indications are merely used to explain the relative positional relationship, movement conditions, etc. between the components in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indications are correspondingly changed.

In addition, if there is a description of "first", "second", etc. in the embodiments of the present invention, the description of "first", "second", etc. is for descriptive purposes only and is not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, it should be considered not to exist and not to fall within the scope of protection claimed by the invention.

In today's technology-assisted life, people are constantly surrounded by intelligent terminals such as mobile phones, tablets, computers and televisions. Intelligent terminals have gradually penetrated every corner of people's lives.

With the development of technology and the continuous improvement of living standards, intelligent terminals of all kinds are increasingly widespread and have become an indispensable communication tool in people's lives. Most intelligent terminals have a media playing function and either store some local media resources or read the media resources of an externally connected mobile storage device; in the prior art, however, screening local media on such terminals is inconvenient. At present, many mobile storage devices preloaded with a large amount of music or other sound resources are sold on the market. These resources can be played on a host without a network connection, and because the devices are cheap and the bundled resources are abundant, they are popular with many people; a search on Taobao shows past sales of such devices exceeding 100,000 units. However, because the amount of content is so large, screening the resources becomes a difficult task, and using and operating them is sometimes inconvenient for users.

In order to solve the problems in the prior art, this embodiment provides a voice-based local media screening and playing method, which includes: acquiring a voice signal, recognizing the voice signal, and determining the corresponding voice instruction; screening local media resources according to the voice instruction to select the media resources corresponding to it; and playing the selected media resources. The invention solves the problem that sound resources in a local storage device are inconvenient to screen: the user can search for and play the local sound resource he or she wants to hear through a voice instruction. This makes screening local media resources on an intelligent terminal more convenient for users.

Exemplary method

The local media screening and playing method based on voice of the present embodiment may be applied to a terminal device, as shown in fig. 1, and includes the following steps:

Step S100, a voice signal is acquired, the voice signal is identified, and a corresponding voice instruction is determined.

In the prior art, if a user has downloaded many favorite music files into a music folder, or has bought a USB flash drive preloaded with many music files online, and wants to play a specific song such as "Ice Rain" on an intelligent playing terminal, the user has to page through the files one by one, which makes the song troublesome to find.

The embodiment of the invention is explained taking an intelligent playing terminal as an example. When the user wants to play a song, the user can simply ask for it by voice; for example, to listen to "Ice Rain" the user can say "please open Liu Dehua's Ice Rain". The intelligent playing terminal then acquires the voice signal, recognizes it, and determines the corresponding voice instruction.

Specifically, as shown in fig. 2, the step S100 specifically includes:

step S101, a voice signal is obtained through a microphone, and the voice signal is identified;

step S102, identifying a corresponding voice instruction.

Specifically, for example, if the user wants to hear "Thank You Love 1999", the user can say "please open Xie Tingfeng's Thank You Love 1999". The intelligent playing terminal of the invention then obtains the voice signal through the microphone, recognizes it, and identifies the corresponding voice instruction as "open Xie Tingfeng's Thank You Love 1999".

Step S200, screening local media resources according to the voice command, and screening media resources corresponding to the voice command;

In the embodiment of the invention, a database table of a preset size is set up in advance, that is, a large database table, for example one 50% larger than the database of local media resources. This table stores all parsed file information, and files with the same size and the same MD5 values at the file head and tail are regarded as the same file. The file information includes: file name, album name, category, singer name, duration, file size.
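The identity rule above (same size, same head and tail MD5) can be illustrated with a minimal Python sketch. The source does not say how many bytes are hashed at each end, so the `CHUNK` value and the function names `file_fingerprint` and `same_file` are assumptions made for illustration only.

```python
import hashlib
import os

# Assumed number of bytes hashed at each end of the file; the source gives no figure.
CHUNK = 64 * 1024


def file_fingerprint(path, chunk=CHUNK):
    """Return (size, head_md5, tail_md5) for the file at `path`."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(chunk)
        # Jump to the last `chunk` bytes (or to the start, for short files).
        f.seek(max(size - chunk, 0))
        tail = f.read(chunk)
    return size, hashlib.md5(head).hexdigest(), hashlib.md5(tail).hexdigest()


def same_file(path_a, path_b):
    """Two files count as the same when size and both digests agree."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

Hashing only the two ends keeps the check cheap on large audio files, at the cost of not noticing differences confined to the middle of a file.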

In addition, when implemented, the invention scans and catalogs the local media files and the media files of a newly connected USB flash drive, and establishes a corresponding database relation table.

Specifically, the local media files or the media files in a newly connected mobile storage device are scanned in advance; that is, every media file newly stored locally or in a newly connected mobile storage device is scanned. The file information of each media file is read according to the scanned resource path, and the file information and the resource path are stored together in the corresponding local database; the MD5 values of the file head and tail are stored in the database at the same time. Scanning and cataloging new media files in this way establishes the corresponding database relation table and facilitates subsequent voice searches.
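As a rough illustration of this pre-scan, the following Python sketch walks a directory and records each audio file with its resource path in an SQLite table. The table layout, the names `open_db` and `scan`, and the extension list are assumptions; the tag fields and the two MD5 columns are declared but left unfilled here, since reading audio metadata would require a tag parser that the source does not name.

```python
import os
import sqlite3

# Audio formats mentioned in the description; treated here as the scan filter.
AUDIO_EXTS = {".mp3", ".wav", ".flac", ".aac"}


def open_db(db_path=":memory:"):
    """Create (if needed) the hypothetical `media` table and return a connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS media (
               path TEXT PRIMARY KEY,
               file_name TEXT, album TEXT, category TEXT, singer TEXT,
               duration REAL, size INTEGER, head_md5 TEXT, tail_md5 TEXT)"""
    )
    return conn


def scan(root, conn):
    """Walk `root` and store every audio file together with its resource path."""
    count = 0
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() not in AUDIO_EXTS:
                continue
            path = os.path.join(folder, name)
            # Album, category, singer and duration stay NULL in this sketch;
            # a real scanner would read them from the file's metadata.
            conn.execute(
                "INSERT OR REPLACE INTO media (path, file_name, size) VALUES (?, ?, ?)",
                (path, name, os.path.getsize(path)),
            )
            count += 1
    conn.commit()
    return count
```

Using the resource path as the primary key means that rescanning the same device updates existing rows instead of duplicating them.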

The step of reading file information of the media file according to the scanned resource path and storing the file information and the resource path together in the corresponding local database further includes the following (that is, in the embodiment of the invention, scanning the media files specifically includes):

When a media file with incomplete information is found while scanning local media files or media files in a newly connected mobile storage device, the database table of the preset size is searched and the missing information is filled in (this large database table can store in advance the file name, album name, category, singer name, duration and file size of a large number of media files);

when the related file information is not found in the database table of the preset size, it is obtained from network resources through a crawler algorithm and stored in the corresponding local database. That is, when file information related to the current media file can be found neither in the small local database nor in the large database table (the database table of the preset size, preloaded with complete media file information), the information is acquired from the network by the crawler algorithm and then stored in the corresponding local database.
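The lookup order described above (large table first, crawler only as a fallback) might be sketched as follows. The dict-based table keyed by file name, the function name `complete_record`, and the `crawl` callable standing in for the crawler are all hypothetical; the source does not describe the crawler's interface.

```python
FIELDS = ("file_name", "album", "category", "singer", "duration", "size")


def complete_record(record, large_table, crawl=None):
    """Fill empty fields of `record`, first from `large_table`, then via `crawl`.

    large_table: dict mapping file_name -> known record (assumed layout).
    crawl: optional callable(file_name) -> dict, a stand-in for the crawler.
    """
    filled = dict(record)
    missing = [f for f in FIELDS if filled.get(f) is None]
    if not missing:
        return filled  # nothing to complete, so no lookup is needed
    known = large_table.get(filled.get("file_name"))
    if known is None and crawl is not None:
        # Network lookup only when the large table has no entry at all.
        known = crawl(filled["file_name"])
    for field in missing:
        if known and known.get(field) is not None:
            filled[field] = known[field]
    return filled
```

The function returns a new dict rather than mutating its input, so the caller decides when the completed record is written to the local database.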

In one embodiment, as shown in fig. 3, the step S200 specifically includes:

Step S201, extracting corresponding voice keywords according to the voice command;

Step S202, scanning a local media file or a media file in a newly accessed mobile storage device according to the extracted voice keyword;

Step S203, scanning and searching media files corresponding to and matched with the voice keywords;

and step S204, screening and outputting the media files correspondingly matched with the voice keywords.

That is, in the embodiment of the invention, the corresponding voice keywords are extracted according to the voice instruction; the local media files or the media files in a newly connected mobile storage device are scanned according to the extracted keywords; the media files matching the keywords are searched for; and the matching media files are screened and output. For example, when a user wants to listen to the song "Ice Rain", the user can say "please open Liu Dehua's Ice Rain". The invention extracts the voice keywords from "open Liu Dehua's Ice Rain", scans the local media files or the media files in the newly connected mobile storage device according to those keywords, finds the song "Ice Rain" by Liu Dehua that matches them, and outputs the matching media file. The method then proceeds to step S300.
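The source does not specify how keywords are extracted from the recognized text. One simple possibility, shown here purely as an assumption, is to look for known singer names, titles and categories inside the command string and keep the records that satisfy every kind of keyword found. The record layout and the names `extract_keywords` and `screen` are hypothetical.

```python
def extract_keywords(command, records):
    """Return the known singers, titles and categories mentioned in `command`."""
    text = command.lower()
    found = {"singer": set(), "title": set(), "category": set()}
    for rec in records:
        for field in found:
            value = rec.get(field)
            # Plain substring match; a real system would need proper tokenization.
            if value and value.lower() in text:
                found[field].add(value)
    return found


def screen(command, records):
    """Keep records that satisfy every kind of keyword found in the command."""
    keywords = extract_keywords(command, records)
    active = {f: vals for f, vals in keywords.items() if vals}
    if not active:
        return []  # no recognizable keyword, so nothing is selected
    return [r for r in records if all(r.get(f) in vals for f, vals in active.items())]
```

With this sketch, a command naming both a singer and a title narrows the result to that one song, while a command naming only a category returns every record in that category.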

And step S300, playing the screened media resources.

In this step, the media files that were screened and output are obtained, and the screened media resources are played in a preset order. For example, once the song "Ice Rain" by Liu Dehua has been selected, it is played according to the voice instruction.

In a further embodiment, as shown in fig. 4 and fig. 5, a local media screening playing method based on voice in this embodiment includes the following steps:

Take as an example an intelligent playing device that is a vehicle-mounted playing terminal 10. The vehicle-mounted playing terminal 10 is provided with a microphone 11 and a loudspeaker 12; a mobile storage device 30, such as a USB flash drive storing songs, may be externally connected to its processor, and the terminal is connected to the server 20 through a network. The vehicle-mounted playing terminal 10 uses the host microphone to realize the voice function, scans the resources in the storage device, and uploads the resource directory and paths to the cloud server. The (cloud) server 20 implements tag classification and screening of the resources. The terminal also provides an audio player function that supports common audio formats such as mp3, wav, flac and aac.

Specifically, as shown in fig. 5, the data updating process in the embodiment of the present invention specifically includes the following steps:

S11, reading file information and storing it in the corresponding database, then proceeding to S12;

S12, judging whether the file information of the scanned media file has an empty field, that is, whether some file information of the pre-scanned media file was not found; if yes, proceeding to step S13, and if not, proceeding to step S15;

S13, searching the large database, then proceeding to S14;

S14, writing the information back to the application database;

S15, judging whether the file information of a new media file does not exist in the large database; if so, proceeding to step S16, and if not, proceeding to step S17;

S16, updating the large database table, then proceeding to step S17;

S17, ending.
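The branching in steps S11 to S17 can be traced with a short Python sketch. The two databases are modeled as plain dicts keyed by a file fingerprint, which is an assumed layout, and the sketch also assumes that S14 continues to S15, which the step list does not state explicitly. The function name `update_flow` is hypothetical.

```python
def update_flow(record, app_db, large_db):
    """Run one scanned record through steps S11-S17 and return the steps taken.

    app_db and large_db are dicts keyed by record["fingerprint"] (assumed layout).
    """
    steps = ["S11"]
    key = record["fingerprint"]
    app_db[key] = dict(record)                      # S11: store what was read
    steps.append("S12")
    has_empty = any(v is None for v in record.values())
    if has_empty:                                   # S12: any empty field?
        steps.append("S13")
        known = large_db.get(key, {})               # S13: search the large table
        steps.append("S14")
        for field, value in app_db[key].items():    # S14: write back what was found
            if value is None and known.get(field) is not None:
                app_db[key][field] = known[field]
    steps.append("S15")
    if key not in large_db:                         # S15: new to the large table?
        steps.append("S16")
        large_db[key] = dict(app_db[key])           # S16: update the large table
    steps.append("S17")                             # S17: end
    return steps
```

Returning the list of visited steps makes it easy to check a given record against the flowchart of fig. 5.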

That is, in the specific embodiment of the invention, the server side processes the resources using its own policy algorithm, and an app built according to the method of this embodiment is installed in the vehicle-mounted playing terminal 10. First, the information of each audio file is read according to the resource directory and paths uploaded by the app, including the file name, album name, category, singer name, duration, file size and so on, and is stored in the database corresponding to the storage device (a field with no information is simply left empty); at the same time, the MD5 values of the file head and tail are stored in the database. Meanwhile, the server side maintains a large database table that stores all parsed file information and regards files with the same size and the same MD5 values as the same file. If a file with incomplete information is encountered, the large database table is searched (step S13 of fig. 5) and the relevant information is filled in. If the large database table has no relevant information, the server may obtain it from network resources using a crawler algorithm. For example, if a file lacks category information, the large database table can be searched by file name for a file with the same name, and the category found there is filled into the database corresponding to the storage device.

In the embodiment of the invention, after the voice instruction is recognized, the result is transmitted to the cloud server; the server side screens the corresponding database according to the instruction and returns the result to the app on the vehicle-mounted playing terminal 10, and the app passes the returned resources to the player for playing. For example, when the recognized voice instruction indicates that the user wants to listen to rock songs, the server selects the songs whose category is rock from the corresponding database, forms a list and returns it to the app, and the app plays the songs according to the list.
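The rock-song example can be illustrated as a category query against an SQLite table. The table and column names, the sample rows, and the choice of ordering the list by file name are assumptions; the source only says that the songs are formed into a list and played in a preset order.

```python
import sqlite3


def playlist_for_category(conn, category):
    """Return resource paths of all entries in `category`, ordered by file name."""
    rows = conn.execute(
        "SELECT path FROM media WHERE category = ? ORDER BY file_name",
        (category,),
    )
    return [path for (path,) in rows]


# Small in-memory table for demonstration only; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (path TEXT, file_name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO media VALUES (?, ?, ?)",
    [("/usb/b.mp3", "b.mp3", "rock"),
     ("/usb/a.mp3", "a.mp3", "rock"),
     ("/usb/c.mp3", "c.mp3", "folk")],
)
print(playlist_for_category(conn, "rock"))  # ['/usb/a.mp3', '/usb/b.mp3']
```

Passing the category as a bound parameter, rather than formatting it into the SQL string, keeps recognized speech text from being interpreted as SQL.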

The invention utilizes the existing hardware equipment, realizes the classified screening of a large amount of local resources on the software level, and utilizes the voice technology to facilitate the use of users.

Exemplary apparatus

As shown in fig. 6, an embodiment of the present invention provides a local media screening and playing device based on voice, which includes:

the acquisition module 10 is used for acquiring a voice signal, identifying the voice signal and determining a corresponding voice instruction;

the screening module 20 is configured to screen local media resources according to the voice command, and screen media resources corresponding to the voice command;

A play control module 30, configured to play the screened media resources;

A preset module 40, configured to preset a database table with a predetermined size, and store all the parsed file information, where the file size and the MD5 value of the header and the tail are the same and considered as the same file; wherein the file information includes: file name, album name, category, singer name, duration, file size;

a pre-scanning module 50, configured to scan in advance a local media file or a media file in a newly accessed mobile storage device; reading file information of the media file according to the scanned resource path, and storing the file information and the resource path together into a corresponding local database; while storing the MD5 values for the header and trailer of the file in the database, as described in detail above.

Based on the above embodiment, the present invention also provides a terminal device, and a functional block diagram thereof may be shown in fig. 7. The terminal equipment comprises a processor, a memory, a network interface, a display screen and a voice recognition module which are connected through a system bus. Wherein the processor of the terminal device is adapted to provide computing and control capabilities. The memory of the terminal device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the terminal device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a voice-based local media screening playback method. The display screen of the terminal equipment can be a liquid crystal display screen or an electronic ink display screen, and the voice recognition module of the terminal equipment is preset in the terminal equipment.

It will be appreciated by those skilled in the art that the functional block diagram shown in fig. 7 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the terminal device to which the present inventive arrangements are applied, and that a particular terminal device may include more or less components than those shown, or may combine some of the components, or have a different arrangement of components.

In one embodiment, a terminal device is provided, the terminal device including a memory, a processor, and a voice-based local media screening playback program stored on the memory and executable on the processor, the processor implementing the following operation instructions when executing the voice-based local media screening playback program:

And playing the screened media resources, wherein the media resources are specifically described above.

The step of obtaining the voice signal, identifying the voice signal and determining the corresponding voice instruction comprises the following steps:

A corresponding voice command is identified.

The step of screening the media resources corresponding to the voice command comprises the following steps:

The step of reading file information of the media file according to the scanned resource path and storing the file information and the resource path together in a corresponding local database further comprises the steps of:

Extracting corresponding voice keywords according to the voice command;

The step of playing the screened media resources comprises the following steps:

Acquiring a media file which is screened and output;

and playing the screened media resources according to a preset sequence.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, which, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

In summary, the invention discloses a voice-based local media screening and playing method, a device, a terminal device and a storage medium. The method comprises the following steps: acquiring a voice signal, recognizing the voice signal, and determining the corresponding voice instruction; screening local media resources according to the voice instruction to select the media resources corresponding to it; and playing the selected media resources. The invention solves the problem that sound resources in a local storage device are inconvenient to screen: the user can search for and play the local sound resource he or she wants to hear through a voice instruction. This makes screening local media resources on an intelligent terminal more convenient for users.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A local media screening and playing method based on voice, comprising the steps of:

the step of screening the local media resources according to the voice command comprises the following steps:

Scanning and profiling a local media file or a media file in newly accessed mobile storage equipment in advance, reading file information of the media file according to a resource path obtained by scanning, storing the file information and the resource path together into a corresponding local database, storing MD5 values of the head part and the tail part of the file into the database, and establishing a corresponding database relation table;

The step of reading file information of the media file according to the scanned resource path and storing the file information and the resource path together in a corresponding local database further comprises the following steps:

When the related file information is not searched from the database table with the preset size, controlling to acquire the related file information from the network resource through a crawler algorithm and storing the related file information into a corresponding local database;

and playing the screened media resources.

2. The method for voice-based local media screening and playing according to claim 1, wherein the step of obtaining a voice signal, identifying the voice signal, and determining the corresponding voice command is preceded by:

3. The method for voice-based local media screening and playing according to claim 1, wherein the step of obtaining a voice signal, identifying the voice signal, and determining the corresponding voice command comprises:

A corresponding voice command is identified.

4. The method for voice-based local media screening and playing according to claim 1, wherein the step of screening local media resources according to the voice command, and screening media resources corresponding to the voice command comprises:

Extracting corresponding voice keywords according to the voice command;

5. The method of claim 1, wherein the step of playing the media assets that are screened out comprises:

Acquiring a media file which is screened and output;

and playing the screened media resources according to a preset sequence.

6. A voice-based local media screening playback apparatus, the apparatus comprising:

Before the step of screening the local media resources according to the voice command and selecting the media resources corresponding to the voice command, the method specifically comprises the following steps:

And the play control module is used for playing the screened media resources.

7. A terminal device comprising a memory, a processor and a voice-based local media screening playback program stored on the memory and executable on the processor, the processor implementing the steps of the voice-based local media screening playback method of any one of claims 1-5 when the voice-based local media screening playback program is executed by the processor.

8. A computer readable storage medium, having stored thereon a voice-based local media screening playback program which, when executed by a processor, implements the steps of the voice-based local media screening playback method of any one of claims 1-5.