CN108052525B - Method and device for acquiring audio information, storage medium and electronic equipment

Info

Publication number: CN108052525B
Application number: CN201711087240.XA
Authority: CN
Inventors: 王君龙
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2017-11-07
Filing date: 2017-11-07
Publication date: 2020-10-09
Anticipated expiration: 2037-11-07
Also published as: CN108052525A

Abstract

The application discloses a method, a device, a storage medium and an electronic device for acquiring audio information. The method comprises: when it is recognized that the information loss of a preset image exceeds a preset recognition threshold, extracting keyword information and target pattern information from the preset image; taking one of the keyword information and the target pattern information as a first parameter and the other as a second parameter; selecting, according to the second parameter, corresponding target link data from the audio link set corresponding to the first parameter; and acquiring audio information corresponding to the target link data. Because the audio link set is obtained from one of the keyword information and the target pattern information and is then screened by the other to obtain the target link data, the recognition rate for preset images that carry little information is improved.

Description

Method and device for acquiring audio information, storage medium and electronic equipment

Technical Field

The present application belongs to the field of communications technologies, and in particular, to a method, an apparatus, a storage medium, and an electronic device for acquiring audio information.

Background

Existing point-reading devices identify and locate the object to be read by its code, by its coordinates, or by similar means. In the coding approach, codes are printed in a book; a reading pen recognizes a code to determine the content, and the corresponding audio is then played. In the coordinate approach, the user taps a position on the point-reading machine with the reading pen to determine the coordinates, and the corresponding audio is played based on the coordinates and the current page of the point-reading machine. In other words, identifying and locating the object to be read requires a dedicated reading pen or point-reading machine.

Disclosure of Invention

The application provides a method, a device, a storage medium and an electronic device for acquiring audio information, which can improve the recognition rate of images and further acquire accurate audio information.

In a first aspect, an embodiment of the present application provides a method for acquiring audio information, which is applied to an electronic device, and the method includes:

when it is recognized that the information loss of a preset image exceeds a preset recognition threshold, extracting keyword information and target pattern information from the preset image;

taking one of the keyword information and the target pattern information as a first parameter and the other information as a second parameter;

selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter;

and acquiring audio information corresponding to the target link data.

In a second aspect, an embodiment of the present application provides an apparatus for acquiring audio information, where the apparatus includes:

a character pattern acquisition unit, configured to extract keyword information and target pattern information from a preset image when it is recognized that the information loss of the preset image exceeds a preset recognition threshold;

a parameter setting unit, configured to set one of the keyword information and the target pattern information as a first parameter and the other as a second parameter;

a target link acquisition unit, configured to select, according to the second parameter, corresponding target link data from the audio link set corresponding to the first parameter; and

an audio acquisition unit, configured to acquire the audio information corresponding to the target link data.

In a third aspect, embodiments of the present application provide a storage medium having a computer program stored thereon, which, when running on a computer, causes the computer to execute the above-mentioned method for acquiring audio information.

In a fourth aspect, an embodiment of the present application provides an electronic device, which includes a processor and a memory, where the memory has a computer program, and the processor is configured to execute the method for acquiring audio information by calling the computer program.

According to the method, the device, the storage medium and the electronic equipment for acquiring audio information provided by the embodiments of the present application, when it is recognized that the information loss of the preset image exceeds the preset recognition threshold, the keyword information and the target pattern information are extracted from the preset image; one of the keyword information and the target pattern information is used as a first parameter, and the other is used as a second parameter; corresponding target link data is selected, according to the second parameter, from the audio link set corresponding to the first parameter; and audio information corresponding to the target link data is acquired. When the electronic equipment is used for point reading, the audio link set is obtained from one of the keyword information and the target pattern information and is then screened by the other to obtain the target link data, which improves the recognition rate for preset images that carry little information.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.

Fig. 1 is a schematic view of an application scenario of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 2 is a schematic view of another application scenario of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 3 is a schematic flowchart of a method for acquiring audio information according to an embodiment of the present application;

Fig. 4 is a schematic flowchart of acquiring pattern information of a preset image according to an embodiment of the present application;

Fig. 5 is a schematic flowchart of acquiring pattern information according to a background color according to an embodiment of the present application;

Fig. 6 is a schematic flowchart of selecting a first parameter according to an embodiment of the present application;

Fig. 7 is a schematic flowchart of acquiring an audio link set according to an embodiment of the present application;

Fig. 8 is a schematic flowchart of selecting corresponding target link data according to an embodiment of the present application;

Fig. 9 is a schematic diagram of a first structure of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 10 is a schematic diagram of a second structure of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 11 is a schematic diagram of a third structure of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 12 is a schematic diagram of a fourth structure of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 13 is a schematic diagram of a fifth structure of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 14 is a schematic diagram of a sixth structure of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 15 is a schematic diagram of a seventh structure of an apparatus for acquiring audio information according to an embodiment of the present application;

Fig. 16 is a schematic structural diagram of an electronic device according to an embodiment of the present application;

Fig. 17 is another schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Referring to the drawings, wherein like reference numbers refer to like elements, the principles of the present application are illustrated as being implemented in a suitable computing environment. The following description is in terms of particular embodiments of the application illustrated and should not be taken as limiting the application to other embodiments not described in detail herein.

In the description that follows, specific embodiments of the present application are described with reference to steps and symbolic representations of operations performed by one or more computers, unless otherwise indicated. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by the computer's processing unit of electronic signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular characteristics defined by the data format. However, while the principles of the application are described in the foregoing terms, this is not meant to be limiting, and those of ordinary skill in the art will recognize that various of the steps and operations described below may also be implemented in hardware.

The term "unit" as used herein may be considered a software object executing on the computing system. The various components, units, engines, and services described herein may be viewed as objects of implementation on the computing system. The apparatus and method described herein may be implemented in software, but may also be implemented in hardware, and are within the scope of the present application.

The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but rather, some embodiments may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

In one embodiment of the application, point reading (click-to-read) can be implemented on the electronic device. The electronic device may first obtain a click-to-read sample, which may be a picture, a photograph, or a code. The electronic device may be a mobile terminal, such as a mobile phone, a tablet computer, or a notebook computer, which is not limited in this application.

After the click-to-read sample is obtained, the electronic device may obtain a multimedia sample, which may be an audio file, a video file, or the like. The electronic device may then associate the obtained click-to-read sample with the multimedia sample. For example, the electronic device associates photo A with audio A, photo B with audio B, and photo C with video C. After associating the click-to-read sample with the multimedia sample, the electronic device may save the sample data in a preset database.

It is to be understood that, in some embodiments, the click-to-read sample and the multimedia sample may be in a one-to-one correspondence relationship, a one-to-many relationship, a many-to-one relationship, or the like, which is not specifically limited in this embodiment.
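The association described above can be pictured as a simple mapping from samples to multimedia files. The following is a minimal sketch only: the class name `SampleDatabase`, its methods, and the file names are hypothetical, and the application does not specify how the preset database is stored.

```python
from collections import defaultdict


class SampleDatabase:
    """Hypothetical in-memory stand-in for the 'preset database'."""

    def __init__(self):
        # click-to-read sample id -> list of associated multimedia files
        self._links = defaultdict(list)

    def associate(self, sample_id, media_file):
        """Associate one click-to-read sample with one multimedia sample."""
        if media_file not in self._links[sample_id]:
            self._links[sample_id].append(media_file)

    def lookup(self, sample_id):
        """Return the multimedia files associated with a sample (may be empty)."""
        return list(self._links.get(sample_id, []))


db = SampleDatabase()
db.associate("photo_A", "audio_A.mp3")
db.associate("photo_B", "audio_B.mp3")
db.associate("photo_C", "video_C.mp4")
db.associate("photo_A", "audio_A2.mp3")  # a one-to-many association
```

A list per sample keeps the sketch open to the one-to-many case; a many-to-one relationship simply stores the same file under several sample ids.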

When using the click-to-read function of the electronic device, a user may first take a picture with the electronic device, select a picture from an album, or scan a code. The picture taken, the picture selected, or the code scanned is the object to be read. After acquiring the object to be read, the electronic device can search the preset database for a photo or code that matches it. If such a photo or code exists in the preset database, the electronic device can look up the multimedia file associated with it and play that file.

In one embodiment, for example, the sample is a photo A containing three text segments, each of which is associated with a segment of audio. When the photo taken by the user matches photo A in the preset database, the user can choose a particular text segment in photo A to play the corresponding audio. Alternatively, if the user makes no specific selection, the electronic device may play the audio corresponding to the three text segments in sequence.

Referring to fig. 1, for example, a user takes a photo X with the electronic device, and the electronic device finds that photo X matches photo A in the preset database. Photo A contains three text segments: the first is associated with audio A, the second with audio B, and the third with audio C. If the user circles the area of the second text segment on the screen, the electronic device plays audio B accordingly.
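The selection behaviour in this scenario can be sketched as a hit test against the text segments. This is an illustration under stated assumptions: `TextSegment`, `audio_to_play`, the rectangular bounding boxes, and the fallback of playing everything when the touch hits no segment are all hypothetical and not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class TextSegment:
    """A text segment of the matched photo with its bounding box (pixels)."""
    left: int
    top: int
    right: int
    bottom: int
    audio: str

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def audio_to_play(segments, touch=None):
    """Return the audio for the touched segment, or all audio in order."""
    if touch is not None:
        x, y = touch
        for segment in segments:
            if segment.contains(x, y):
                return [segment.audio]
    # No selection (or, as an assumption, a touch outside every segment):
    # play the audio of all segments in sequence.
    return [segment.audio for segment in segments]


photo_a = [
    TextSegment(0, 0, 100, 99, "audio_A"),
    TextSegment(0, 100, 100, 199, "audio_B"),
    TextSegment(0, 200, 100, 299, "audio_C"),
]
```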

It can be understood that the embodiment can realize the function of reading the text in a certain object (such as a photo) on the electronic device, and the mode has the advantages of low cost, good convenience and the like.

Referring to fig. 2, fig. 2 is a schematic view of another application scenario of the apparatus for acquiring audio information according to an embodiment of the present application. For example, the apparatus acquires a preset image (e.g., an object to be read) and recognizes it. When it is recognized that the information loss of the preset image exceeds a preset recognition threshold, the apparatus extracts keyword information and target pattern information from the preset image; uses one of the keyword information and the target pattern information as a first parameter and the other as a second parameter; selects, according to the second parameter, corresponding target link data from the audio link set corresponding to the first parameter; acquires audio information corresponding to the target link data; and finally plays the target audio information.

An execution main body of the method for acquiring audio information may be the apparatus for acquiring audio information provided in the embodiment of the present application, or an electronic device integrated with the apparatus for acquiring audio information, where the apparatus for acquiring audio information may be implemented in a hardware or software manner.

Embodiments of the present application will be described from the perspective of an apparatus for acquiring audio information, which may be integrated in an electronic device. The method for acquiring audio information comprises: when it is recognized that the information loss of the preset image exceeds a preset recognition threshold, extracting keyword information and target pattern information from the preset image; using one of the keyword information and the target pattern information as a first parameter and the other as a second parameter; selecting, according to the second parameter, corresponding target link data from the audio link set corresponding to the first parameter; and acquiring audio information corresponding to the target link data.

Referring to fig. 3, fig. 3 is a flowchart illustrating a method for acquiring audio information according to an embodiment of the present disclosure. The method for acquiring the audio information provided by the embodiment of the application is applied to the electronic equipment, and the specific process can be as follows:

Step 101, when it is recognized that the information loss of the preset image exceeds a preset recognition threshold, extracting keyword information and target pattern information from the preset image.

The preset image may be captured by a camera of the electronic device or received from another device. It may also be stored in the electronic device in advance and displayed when called by application software.

The preset image is recognized to obtain its information. When the missing information exceeds a preset recognition threshold, the corresponding audio information cannot be found quickly. For example, the acquired preset image may be incomplete: the camera captures only part of a textbook page, say 60% of it, while the preset recognition threshold is set to 70% of the whole page. Alternatively, the acquired preset image may be unclear: when a textbook page is photographed, hand shake blurs the image so that some characters or patterns cannot be recognized, and the recognized information is not enough to find the corresponding audio information quickly. In this case the preset recognition threshold may be a character recognition rate of 60%, that is, the recognized characters account for 60% of all characters. When the preset image has quick identification information such as digital codes and two-dimensional codes, the information loss is directly judged to exceed the preset recognition threshold.
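The two examples above can be read as a simple check. The sketch below assumes one particular interpretation: the information loss "exceeds the threshold" when the captured share of the page falls below 70% or the character recognition rate falls below 60%. The function name and parameters are hypothetical, and the handling of quick identification information is deliberately left out.

```python
def information_loss_exceeds_threshold(page_fraction, char_rate,
                                       min_page_fraction=0.70,
                                       min_char_rate=0.60):
    """Return True when too much of the preset image's information is missing.

    page_fraction: share of the page captured in the image (0.0 to 1.0).
    char_rate: recognized characters divided by total characters (0.0 to 1.0).
    Both limits are the example values from the text; the comparison itself
    is an assumed reading of "information loss exceeds the threshold".
    """
    return page_fraction < min_page_fraction or char_rate < min_char_rate


# Only 60% of the page was captured, below the 70% example threshold.
incomplete_page = information_loss_exceeds_threshold(0.60, 0.95)
# The page is complete but blurred: only half of the characters were recognized.
blurred_page = information_loss_exceeds_threshold(1.00, 0.50)
# A complete, sharp page.
good_page = information_loss_exceeds_threshold(0.95, 0.90)
```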

When it is recognized that the information loss of the preset image exceeds the preset recognition threshold, the keyword information and the target pattern information are extracted from the preset image.

Before the keyword information and the target pattern information are extracted, text information and pattern information are first extracted from the preset image.

Referring to fig. 4, fig. 4 is a schematic flow chart illustrating obtaining pattern information of a preset image according to an embodiment of the present disclosure. In this embodiment, the method for acquiring pattern information of a preset image includes the following steps:

Step 201, dividing the preset image into a plurality of areas. The preset image is divided into a sufficient number of areas, and may be divided into areas of equal proportion.

Step 202, setting a plurality of sampling points in each area, and acquiring a plurality of color samples at the sampling points. Sampling points are set in every area, and color samples are acquired at those points. The more sampling points there are, the more reliable the result.

Step 203, determining the background color according to the number of color samples of each color.

The acquired color samples cover a number of colors. If the samples of one color make up more than a certain proportion of all samples, for example 50%, that color is determined to be the background color. If the preset image is a color picture that uses many gradient colors, a color range is determined to be the background color when the samples falling within that range make up more than a certain proportion, for example 60%. The background color may also be determined from both the proportion and the position of the color samples: for example, if one color accounts for more than a certain proportion, such as 55%, of the samples in the upper part of the preset image, and another color accounts for more than a certain proportion, such as 55%, of the samples in the lower part, the preset image is determined to include two background colors.
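Steps 201 to 203 can be sketched as follows for the single-background case. This is a simplified illustration: the image is assumed to be a 2D list of hashable color values, the grid size and sampling layout are arbitrary choices, and the function names are hypothetical. Gradient colors and position-dependent backgrounds are not handled.

```python
from collections import Counter


def sample_region_colors(image, grid=4, points_per_side=2):
    """Divide the image into grid x grid areas and sample points in each one."""
    height, width = len(image), len(image[0])
    samples = []
    for gy in range(grid):
        for gx in range(grid):
            top, bottom = gy * height // grid, (gy + 1) * height // grid
            left, right = gx * width // grid, (gx + 1) * width // grid
            for py in range(points_per_side):
                for px in range(points_per_side):
                    # Spread the sampling points evenly inside the area.
                    y = top + (2 * py + 1) * (bottom - top) // (2 * points_per_side)
                    x = left + (2 * px + 1) * (right - left) // (2 * points_per_side)
                    samples.append(image[y][x])
    return samples


def dominant_background(samples, min_share=0.50):
    """Return the color whose share of the samples exceeds min_share, else None."""
    if not samples:
        return None
    color, count = Counter(samples).most_common(1)[0]
    return color if count / len(samples) > min_share else None


# An 8 x 8 white page with a small black mark in the top-left corner.
image = [["white"] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        image[y][x] = "black"
background = dominant_background(sample_region_colors(image))

# A page split evenly between two colors has no single dominant color.
split_page = [["red"] * 4 + ["blue"] * 4 for _ in range(8)]
no_background = dominant_background(sample_region_colors(split_page))
```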

Step 204, acquiring pattern information according to the background color.

The pattern information may include the background color and the position of the background color. The background color regions are removed from the preset image, and the remaining region is divided into a plurality of sub-regions according to the gaps between them; each sub-region is one piece of pattern information.

Step 205, extracting target pattern information from the pattern information.

The pattern information is compared with the pre-stored patterns in the database; if the pattern information is the same as a pre-stored pattern, the pattern information is set as a target pattern.

The pre-stored patterns are stored in the database in advance. Whether two patterns are "the same" may be judged by various criteria; for example, the pattern information and a pre-stored pattern are considered the same when their similarity reaches 70%.
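The comparison with pre-stored patterns can be illustrated with a simple similarity test. The application does not say how similarity is measured; the sketch below swaps in the Jaccard index over the filled cells of equally sized binary grids purely as an assumption, and the function names are hypothetical.

```python
def jaccard_similarity(pattern_a, pattern_b):
    """Jaccard index of the filled cells of two binary grids (assumed measure)."""
    cells_a = {(y, x) for y, row in enumerate(pattern_a)
               for x, value in enumerate(row) if value}
    cells_b = {(y, x) for y, row in enumerate(pattern_b)
               for x, value in enumerate(row) if value}
    union = cells_a | cells_b
    return len(cells_a & cells_b) / len(union) if union else 1.0


def find_target_patterns(patterns, prestored, min_similarity=0.70):
    """Return the names of patterns that match any pre-stored pattern."""
    return [name for name, grid in patterns.items()
            if any(jaccard_similarity(grid, stored) >= min_similarity
                   for stored in prestored)]


prestored = [[[1, 1], [1, 1]]]          # one pre-stored pattern: a filled square
patterns = {
    "almost_square": [[1, 1], [1, 0]],  # 3 of 4 cells shared: similarity 0.75
    "single_dot": [[1, 0], [0, 0]],     # 1 of 4 cells shared: similarity 0.25
}
targets = find_target_patterns(patterns, prestored)
```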

Referring to fig. 5, fig. 5 is a schematic flow chart illustrating obtaining pattern information according to a background color according to an embodiment of the present disclosure. In the present embodiment, acquiring pattern information according to the background color includes the steps of:

Step 2041, if the proportion of background-color samples among the color samples acquired at the sampling points in a region is smaller than a preset proportion threshold, determining the region to be an image-taking region.

In a region, if the sampling points whose color sample is the background color make up more than a certain proportion of the region's sampling points, for example 90%, the region is determined to be a background color region, and it is skipped when pattern information is subsequently acquired.

In a region, if the sampling points whose color sample is the background color make up less than the preset proportion threshold, for example 60%, the region is determined to be an image-taking region, and pattern information is subsequently acquired in that region.
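The region test in step 2041 can be sketched with the two example proportions from the text. The function name and return labels are hypothetical. The text does not say what happens when the share of background samples lies between 60% and 90%, so the sketch reports that case as "unspecified" rather than guessing.

```python
def classify_region(region_samples, background,
                    background_min=0.90, imaging_max=0.60):
    """Classify one region from the colors of its sampling points."""
    share = sum(1 for color in region_samples if color == background) / len(region_samples)
    if share > background_min:
        return "background"   # skipped when pattern information is acquired
    if share < imaging_max:
        return "imaging"      # pattern information is acquired here
    return "unspecified"      # not covered by the text (assumption)


mostly_white = ["white"] * 10
half_white = ["white"] * 5 + ["black"] * 5
in_between = ["white"] * 8 + ["black"] * 2
```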

Step 2042, the color of each display point is obtained in the image-taking area, and a matrix data table is correspondingly formed.

The color of each display point in the image-taking region is acquired; the display points may be pixels. Each display point is then mapped to one color data value, and the color data values form a matrix data table.

Step 2043, in the matrix data table, setting the color data value of each display point whose color is the background color to first data, and setting the color data value of each display point whose color is not the background color to second data, so as to obtain a matrix data table to be selected.

In the matrix data table, each color data value corresponds to the color of one display point. If the color is the background color, the color data value is set to the first data, for example a value corresponding to white; if the color is not the background color, the color data value is set to the second data, for example a value corresponding to black. This yields the matrix data table to be selected. The image is thereby converted into a black-and-white image, which removes much interference. At this point, some edge data may also be removed from the matrix data table to be selected: for example, if the first data outlines a figure along the edge and that figure contains some second data which does not belong to the image needed from the preset image, that second data may be removed.
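Step 2043 amounts to binarizing the matrix data table against the background color. A minimal sketch, assuming the first data is encoded as 0 and the second data as 1 (the encoding and the function name are illustrative choices, not taken from the application):

```python
def to_candidate_matrix(color_matrix, background, first_data=0, second_data=1):
    """Map background-colored points to first_data and all others to second_data."""
    return [[first_data if color == background else second_data for color in row]
            for row in color_matrix]


color_matrix = [
    ["white", "black", "white"],
    ["white", "red", "white"],
]
candidate = to_candidate_matrix(color_matrix, background="white")
```

Any non-background color, here both black and red, collapses to the same second data value, which is what removes color interference before the pattern areas are divided.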

Step 2044, in the matrix data table to be selected, dividing the matrix data table to be selected into a plurality of pattern areas according to the first data, and acquiring pattern information according to the position of the second data in the pattern areas.

The matrix data table to be selected is divided into a plurality of pattern areas by the first data, because the image corresponding to the first data is the background color. If a run of first data has second data on both sides and the run exceeds a certain width, for example 5% of the width of the whole area in the same direction, the run is treated as a boundary between pattern areas. The matrix data table to be selected is divided into pattern areas in this way. Each pattern area is then converted into a pattern by connecting the points corresponding to the second data, and this pattern is the pattern information.
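The division into pattern areas can be sketched in one direction. The code below is a simplification under stated assumptions: it splits only along columns, treats a column as background when every cell in it is first data, and uses 0 as the first data value. The function name is hypothetical.

```python
def split_pattern_columns(matrix, first_data=0, min_gap_share=0.05):
    """Return (start_col, end_col) spans of pattern areas, split at wide gaps."""
    width = len(matrix[0])
    # A column is occupied when it holds at least one non-background cell.
    occupied = [any(row[x] != first_data for row in matrix) for x in range(width)]
    areas = []
    start = last = None
    for x, filled in enumerate(occupied):
        if not filled:
            continue
        if start is None:
            start = last = x
        elif x - last - 1 > width * min_gap_share:
            # The background run between two occupied columns is wide enough
            # to act as a boundary between pattern areas.
            areas.append((start, last))
            start = last = x
        else:
            last = x
    if start is not None:
        areas.append((start, last))
    return areas


row = [0] * 20
for x in (0, 1, 2, 3, 5, 6, 10, 11, 12):
    row[x] = 1
areas = split_pattern_columns([row])
```

With a width of 20, the 5% limit equals one column, so the single empty column at index 4 does not split the first area, while the three empty columns at indices 7 to 9 do.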

Keyword information is extracted from the text information as follows. The text information, the font size of each character, and the position of each character in the preset image are obtained from the preset image. If the characters have at least two font sizes, the characters with the largest font are set as preset keywords. If the number of preset keywords does not exceed a preset word count, the preset keywords are directly set as the keyword information; if it exceeds the preset word count, the preset keywords are screened according to their position information to obtain the keyword information. For example, a preset keyword located at the edge of the preset image is eliminated, and a preset keyword in the upper part of the preset image is preferred over one in the lower part.
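The font-size and position rules above can be sketched as follows. The `Word` type, the normalized coordinates, the edge margin, and the choice to return nothing when only one font size is present are all assumptions made for illustration; the text does not define them.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    font_size: float
    x: float  # horizontal centre, 0.0 (left edge) to 1.0 (right edge)
    y: float  # vertical centre, 0.0 (top edge) to 1.0 (bottom edge)


def extract_keywords(words, max_words=3, edge_margin=0.05):
    """Pick the largest-font words, screening by position if there are too many."""
    sizes = {word.font_size for word in words}
    if len(sizes) < 2:
        return []  # case not described in the text (assumption)
    largest = max(sizes)
    candidates = [word for word in words if word.font_size == largest]
    if len(candidates) <= max_words:
        return [word.text for word in candidates]
    # Too many candidates: drop those at the image edge, then prefer the upper ones.
    inner = [word for word in candidates
             if edge_margin <= word.x <= 1 - edge_margin
             and edge_margin <= word.y <= 1 - edge_margin]
    inner.sort(key=lambda word: word.y)
    return [word.text for word in inner[:max_words]]


page = [
    Word("Unit", 30, 0.50, 0.10),
    Word("Footer", 30, 0.50, 0.90),
    Word("Margin", 30, 0.01, 0.50),
    Word("Farm", 30, 0.50, 0.50),
    Word("body", 12, 0.50, 0.60),
]
keywords = extract_keywords(page, max_words=2)
```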

The keyword information may further include page number information. Page numbers are usually printed at known positions, such as the bottom middle, the bottom outer side, or the middle of the outer side of the page; if a number is recognized in one of these areas, it is set as keyword information representing the page number.

In some embodiments, the keyword information is extracted by comparing the text in the text information with pre-stored text in the database; text that is the same as the pre-stored text is set as keyword information. The pre-stored text is stored in the database in advance, or may be obtained through a network, and serves to narrow the content corresponding to the preset image to a certain range, for example "language", "mathematics", "first volume", or "storybook".

In some embodiments, a quick-search keyword set and a quick-search pattern set are stored in the database. The stored keywords and patterns allow the corresponding audio link set to be found quickly, and the resulting audio link set contains little data. Examples of such keywords and patterns include book titles, author names, prices, car logos, and two-dimensional codes.

Step 102, using one of the keyword information and the target pattern information as a first parameter, and using the other information as a second parameter.

Referring to fig. 6, fig. 6 is a schematic flow chart illustrating the selection of the first parameter according to the embodiment of the present disclosure. In this embodiment, selecting the first parameter includes the following steps:

Step 3021, acquiring the historical duration taken to obtain the corresponding link information set from the keyword information, and the historical duration taken to obtain the corresponding link information set from the target pattern information;

Step 3022, using whichever of the keyword information and the target pattern information has the shorter historical duration as the first parameter.

The historical durations of obtaining the corresponding link information sets are acquired for the keyword information and for the target pattern information, and one of the two is selected as the first parameter accordingly. For example, corresponding audio links are acquired according to the keyword information to form an audio link set, and likewise corresponding audio links are acquired according to the target pattern information to form an audio link set. The historical durations of acquiring the two audio link sets are then compared; one of the keyword information and the target pattern information is set as the first parameter and the other as the second parameter.

The keyword information may include a plurality of keywords. The probability of each keyword may be stored in advance and used to calculate the time needed to search with the plurality of keywords. Similarly, the target pattern information may include a plurality of patterns; the probability of each pattern may be stored in advance and used to calculate the time needed to search with the plurality of target patterns. One of the keyword information and the target pattern information is then set as the first parameter.

Alternatively, whichever of the keyword information and the target pattern information has its data set found first is set as the first parameter, and the other is set as the second parameter.
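The choice made in steps 3021 and 3022 can be sketched as a comparison of two recorded durations. The function name and the tie-breaking rule (keyword information wins a tie) are assumptions; how the historical durations are measured and stored is not specified in the text.

```python
def assign_parameters(keyword_info, pattern_info,
                      keyword_duration, pattern_duration):
    """Return (first_parameter, second_parameter).

    The information whose link set was historically obtained faster becomes
    the first parameter. On a tie the keyword information is used first,
    which is an arbitrary choice made for this sketch.
    """
    if keyword_duration <= pattern_duration:
        return keyword_info, pattern_info
    return pattern_info, keyword_info


# Historical durations in seconds (illustrative values only).
first, second = assign_parameters("mathematics", "triangle_logo", 0.8, 0.3)
```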

Step 103, selecting, according to the second parameter, corresponding target link data from the audio link set corresponding to the first parameter.

The corresponding audio link set is first acquired according to the first parameter, and the audio link set is then screened according to the second parameter to obtain the corresponding target link data, so the second parameter does not need to be searched in the whole database.
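The two-stage selection in step 103 can be sketched as one lookup followed by a filter. The data layout is an assumption made for illustration: `index` maps a first parameter to its audio link set, and each link record carries a hypothetical `tags` set against which the second parameter is checked.

```python
def select_target_links(index, first_param, second_param):
    """Look up the link set for first_param, then screen it by second_param."""
    audio_link_set = index.get(first_param, [])
    # Only the (small) link set is screened; the full database is not searched again.
    return [link for link in audio_link_set if second_param in link["tags"]]


index = {
    "mathematics": [
        {"url": "audio/math_unit1.mp3", "tags": {"triangle_logo", "page_3"}},
        {"url": "audio/math_unit2.mp3", "tags": {"circle_logo", "page_9"}},
    ],
}
target_links = select_target_links(index, "mathematics", "triangle_logo")
```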

Referring to fig. 7, fig. 7 is a schematic flowchart illustrating a process of acquiring an audio link set according to an embodiment of the present application. In this embodiment, the first parameter includes a plurality of sub-parameters, and the obtaining of the audio link set corresponding to the first parameter includes the following steps:

Step 301, obtaining the search efficiency value of each sub-parameter, and setting the search order of the plurality of sub-parameters according to the search efficiency values.

The search efficiency value may include the number of audio links obtained by searching the database with the sub-parameter, the search time, and the like. For example, if the first parameter is keyword information consisting of a plurality of keywords, a search efficiency value is obtained for each keyword; the search efficiency value may be the number of corresponding audio links obtained by searching the database. A search order is then set for the plurality of keywords according to the number of corresponding audio links obtained.

Step 302, searching the database to obtain corresponding audio links by using the sub-parameters ranked above the preset rank in the search order, and forming an initial link set.

For example, if the preset rank is the third place, the sub-parameters ranked first and second in the search order are used to search the database for corresponding audio links, and the results form the initial link set.

Step 303, screening the initial link set by using the sub-parameters ranked below the preset rank in the search order to obtain an audio link set.

And then screening in the initial link set by using other sub-parameters, and obtaining the audio link set through screening, wherein the data volume of the initial link set is obviously less than that of the database, so that the searching efficiency can be improved.
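Steps 301 to 303 can be illustrated with the following Python sketch. It is written under several assumptions that are not stated in the description: the database is modelled as a dictionary mapping each audio link to the set of sub-parameters it matches, a smaller search efficiency value ranks earlier, several sub-parameters are combined with a logical AND, and `lead_count` (the number of sub-parameters used for the database search) stands in for the preset rank. All names are hypothetical.

```python
def build_audio_link_set(sub_params, database, efficiency, lead_count):
    """Two-stage lookup: search the database with the leading
    sub-parameters, then screen only the smaller initial set.

    database:   dict mapping audio link -> set of sub-parameters it matches
    efficiency: dict mapping sub-parameter -> search efficiency value
                (assumed here: smaller value ranks earlier)
    lead_count: number of top-ranked sub-parameters used in the search
    """
    # Step 301: order the sub-parameters by their search efficiency value.
    order = sorted(sub_params, key=lambda p: efficiency[p])
    leading, remaining = order[:lead_count], order[lead_count:]

    # Step 302: search the whole database with the leading sub-parameters.
    initial = {link for link, tags in database.items()
               if all(p in tags for p in leading)}

    # Step 303: screen the initial set only, using the other sub-parameters.
    return {link for link in initial
            if all(p in database[link] for p in remaining)}
```

The point of the split is that only the first stage touches the full database; the second stage iterates over the initial link set, which is expected to be much smaller.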

Referring to fig. 8, fig. 8 is a schematic flow chart illustrating selection of corresponding target link data according to an embodiment of the present disclosure. In this embodiment, selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter includes the following steps:

Step 311, screening in the audio link set according to the second parameter to obtain a plurality of data to be selected.

Step 312, displaying a plurality of data to be selected.

Step 313, according to the confirmation instruction, determining target link data corresponding to the confirmation instruction from the multiple data to be selected.

Screening the audio link set according to the second parameter may not yield the target link data exactly; instead, a plurality of relatively close pieces of data to be selected may be obtained. The data to be selected are then displayed, for example on a display screen, and the target link data corresponding to a confirmation instruction is determined from the data to be selected. For example, if a click instruction from the user is received and the click coordinate is the coordinate of one of the pieces of data to be selected, that piece of data is confirmed as the target link data, and the two are associated.

In some embodiments, displaying the plurality of data to be selected specifically includes: if the number of data to be selected exceeds a preset number, obtaining a weight value for each piece of data to be selected according to the weight of the keyword information and the similarity of the target pattern information in that piece of data; and selecting, from the plurality of data to be selected, the preset number of pieces with the largest weight values for display.

For example, the keyword information includes keyword information of different font sizes. The weight of keyword information with a large font size is greater than that of keyword information with a small font size, the weight of keyword information in the middle of the preset image is greater than that of keyword information at the side of the preset image, and, for vertically adjacent keyword information, the weight of the upper keyword information is greater than that of the lower keyword information. The weight value of each piece of data to be selected is then formed from the weight of its keyword information and the similarity of its target pattern information in a certain ratio, for example 50% each. Finally, the preset number, for example 5, of pieces of data to be selected with the largest weight values are selected from the plurality of data to be selected and displayed.
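The weighting and selection described above can be sketched in Python as follows. The sketch assumes that the keyword weight and the pattern similarity of each piece of data to be selected are already available as numbers between 0 and 1; the tuple layout and the names `top_candidates`, `preset_number` and `keyword_share` are hypothetical.

```python
def top_candidates(candidates, preset_number=5, keyword_share=0.5):
    """Pick the candidates to display.

    candidates: list of (link, keyword_weight, pattern_similarity),
                both scores assumed to lie in [0, 1]
    keyword_share: share of the keyword weight in the combined weight
                   value (0.5 corresponds to the 50% / 50% example)
    """
    # When there are few enough candidates, all of them are displayed.
    if len(candidates) <= preset_number:
        return [link for link, _, _ in candidates]

    def weight_value(candidate):
        _, keyword_weight, similarity = candidate
        return keyword_share * keyword_weight + (1 - keyword_share) * similarity

    # Otherwise keep only the preset number with the largest weight values.
    ranked = sorted(candidates, key=weight_value, reverse=True)
    return [link for link, _, _ in ranked[:preset_number]]
```

How the keyword weight itself is derived from font size and position is left open here, since the description gives only the relative ordering of the weights and not numeric values.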

Step 104, acquiring target audio information according to the target link data.

The target link data is the address corresponding to the target audio information. The storage location of the target audio information is obtained according to the address, and the target audio information is then played by calling playing software. The audio information corresponding to the target link data may include a plurality of audios; when the preset image includes a plurality of click-to-read regions, the plurality of audios correspond to the click-to-read regions one to one. For example, a card may carry text information and picture information, and may also carry quick identification information such as a unique digital code, a barcode, or a two-dimensional code. The card is photographed by an electronic device such as a smart phone to obtain a card image, the audio information matching the card is found in a database according to the quick identification information, and when the text information or the picture information is clicked, the corresponding audio is played. When a child photographs the card with the smart phone, the quick identification information may be missing or unrecognizable because of the shooting angle, hand shake, and the like. In that case, the keyword information and the target pattern information of the card image are obtained and used to search the database for the audio information corresponding to the card. The audio information includes audio corresponding to a plurality of click regions, and the audio corresponding to the clicked region is determined from the plurality of audios according to the click position.
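The last step, choosing the audio that corresponds to the clicked region, can be sketched in Python as follows. The sketch assumes that each click-to-read region is an axis-aligned rectangle given in image coordinates; the data layout and the name `audio_for_click` are hypothetical and are not taken from the description.

```python
def audio_for_click(regions, click):
    """Return the audio associated with the clicked region, or None.

    regions: list of ((left, top, right, bottom), audio_path) pairs,
             one pair per click-to-read region (assumed rectangular)
    click:   (x, y) coordinate of the click in the same coordinate system
    """
    x, y = click
    for (left, top, right, bottom), audio_path in regions:
        # Right and bottom edges are treated as exclusive.
        if left <= x < right and top <= y < bottom:
            return audio_path
    return None
```

A click that falls outside every region returns None, in which case no audio would be played.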

According to the method for acquiring the audio information, when the fact that the information loss of the preset image exceeds the preset identification threshold value is identified, keyword information and target pattern information are extracted from the preset image; one of the keyword information and the target pattern information is used as a first parameter, and the other information is used as a second parameter; selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter; and acquiring audio information corresponding to the target link data. The audio link set is obtained through one of the keyword information or the target pattern information, then the target link data is obtained through screening according to the other one, the recognition rate of the preset image with less information is improved, and the recognition efficiency is improved.

In order to better implement the method for acquiring audio information provided by the embodiment of the present application, an embodiment of the present application further provides an apparatus for acquiring audio information. The terms have the same meanings as in the above method for acquiring audio information, and implementation details can be found in the description of the method embodiment.

Referring to fig. 9, fig. 9 is a schematic view illustrating a first structure of an apparatus for acquiring audio information according to an embodiment of the present disclosure. The apparatus 500 for acquiring audio information is applied to an electronic device, and includes a text pattern acquiring unit 501, a parameter setting unit 502, a target link acquiring unit 503, and an audio acquiring unit 504. Wherein:

a text pattern obtaining unit 501, configured to extract keyword information and target pattern information from a preset image when it is recognized that information loss of the preset image exceeds a preset recognition threshold.

The preset image can be shot by a camera of the electronic equipment and also can be obtained by receiving the preset image transmitted by other equipment. The preset image transmitted by other equipment is received and acquired in a wireless mode or a wired mode. The wireless mode comprises modes such as Bluetooth, NFC, WIFI network and mobile network, and the mobile network comprises networks such as 2G, 3G and 4G. The wired mode includes connecting other devices through a data line to acquire a preset image. The preset image can also be stored in the electronic equipment in advance and then displayed through calling of application software.

Referring to fig. 10, fig. 10 is a schematic view illustrating a second structure of an apparatus for acquiring audio information according to an embodiment of the present disclosure. In the present embodiment, the text pattern acquisition unit 501 includes a dividing subunit 5011, a sampling subunit 5012, a background color acquisition subunit 5013, a pattern information acquisition subunit 5014, and a target pattern information acquisition subunit 5015. Wherein:

the dividing subunit 5011 is configured to divide the preset image into a plurality of areas. The preset image is divided into enough areas, and the preset image can be divided into a plurality of areas in equal proportion.

The sampling subunit 5012 is configured to set a plurality of sampling points in each area, and obtain a plurality of color samples according to the plurality of sampling points. A plurality of sampling points are set in all the areas, and then color samples are acquired through the sampling points. The more sampling points, the better the effect.

A background color obtaining subunit 5013 configured to determine a background color according to the number of color samples of each color;

the pattern information acquisition subunit 5014 is configured to acquire pattern information based on the background color.

The pattern information may include a background color and a position of the background color. The background-color area is removed from the preset image, and the remaining area is divided into a plurality of sub-areas according to the spacing between its parts, each sub-area being one piece of pattern information.

The target pattern information acquisition subunit 5015 is configured to extract target pattern information from the pattern information.
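The division, sampling and background color determination performed by subunits 5011 to 5013 can be sketched in Python as follows. The sketch makes assumptions that go beyond the description: the image is a two-dimensional list of color values, the areas form an equal grid, sampling points lie on a regular grid inside each area, and the background color is taken to be the most frequent sampled color. All function and parameter names are hypothetical.

```python
from collections import Counter

def sample_area(image, top, left, height, width, step):
    """Collect color samples on a regular grid inside one area."""
    return [image[r][c]
            for r in range(top, top + height, step)
            for c in range(left, left + width, step)]

def background_color(image, rows=2, cols=2, step=1):
    """Divide the image into rows x cols equal areas, sample each area,
    and return the most frequent sampled color (assumed to be the
    background color). Pixels left over when the size is not evenly
    divisible are ignored in this sketch.
    """
    height, width = len(image), len(image[0])
    area_h, area_w = height // rows, width // cols
    counts = Counter()
    for i in range(rows):
        for j in range(cols):
            counts.update(
                sample_area(image, i * area_h, j * area_w, area_h, area_w, step))
    return counts.most_common(1)[0][0]
```

A smaller `step` gives more sampling points per area, in line with the remark that more sampling points give a better result, at the cost of more work per image.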

Referring to fig. 11, fig. 11 is a schematic structural diagram of a third apparatus for acquiring audio information according to an embodiment of the present disclosure. In the present embodiment, the pattern information acquisition subunit 5014 includes a drawing area determination module 541, a first matrix formation module 542, a second matrix formation module 543, and a pattern information acquisition module 544. Wherein:

the drawing area determination module 541 is configured to determine that an area is a drawing area if the proportion of sampling points in the area whose color samples are the background color is smaller than a preset proportion threshold.

The first matrix forming module 542 is configured to obtain the color of each display point in the drawing area, and correspondingly form a matrix data table.

The second matrix forming module 543 is configured to set, in the matrix data table, the audio link corresponding to the display point with the background color as the first data, and set the audio link corresponding to the display point with the color not being the background color as the second data, so as to obtain a to-be-selected matrix data table.

The pattern information obtaining module 544 is configured to, in the candidate matrix data table, divide the candidate matrix data table into a plurality of pattern areas according to the range of the first data, and obtain the pattern information according to the position of the second data in the pattern area.
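The work of modules 541 to 544 can be illustrated with the following Python sketch. It is one possible reading only: the first data and the second data are represented as 0 and 1, the threshold value 0.8 is arbitrary, and the pattern position is reduced to a single bounding box of the non-background entries. These choices and all names are assumptions made for illustration.

```python
def is_drawing_area(samples, background, ratio_threshold=0.8):
    """An area is a drawing area when the proportion of its samples
    that have the background color is below the threshold."""
    if not samples:
        return False
    ratio = sum(1 for s in samples if s == background) / len(samples)
    return ratio < ratio_threshold

def to_matrix_table(area_pixels, background):
    """Mark background display points as 0 (first data) and all other
    display points as 1 (second data)."""
    return [[0 if px == background else 1 for px in row] for row in area_pixels]

def pattern_bounds(table):
    """Return (top, left, bottom, right) of the entries marked 1,
    or None when the table contains no such entry."""
    cells = [(r, c) for r, row in enumerate(table)
             for c, value in enumerate(row) if value == 1]
    if not cells:
        return None
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return (min(rows), min(cols), max(rows), max(cols))
```

A fuller implementation would split the table into several pattern areas wherever runs of first data separate groups of second data; the single bounding box here only shows how positions of the second data can be turned into pattern information.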

Referring to fig. 12, fig. 12 is a schematic diagram illustrating a fourth structure of an apparatus for acquiring audio information according to an embodiment of the present application. In the present embodiment, the text pattern acquisition unit 501 includes a text information acquisition sub-unit 5016, a preset keyword acquisition sub-unit 5017, and a keyword information screening sub-unit 5018. Wherein:

a text information obtaining subunit 5016, configured to obtain text information from the preset image, and a font size and position information of each text in the text information in the preset image;

the preset keyword acquisition subunit 5017 is configured to set, if the font sizes corresponding to the respective characters include at least two types, the character with the largest font as the preset keyword;

the keyword information screening subunit 5018 is configured to, if the number of the preset keywords exceeds the preset word number, perform screening according to the position information of the preset keywords to obtain keyword information.

For example, a preset keyword located at the edge of the preset image is eliminated, and a preset keyword in the upper part of the preset image is preferred over one in the lower part.

The keyword information may further include page number information. Based on the usual locations of a page number, such as the middle of the bottom, the outer side of the bottom, or the middle of the outer side, a number recognized in one of these areas is set as keyword information, namely the keyword information of the page number.
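The keyword screening performed by subunits 5016 to 5018 can be sketched in Python as follows. The sketch assumes that text recognition has already produced, for each piece of text, its font size and centre position; the edge margin of 10% and the behaviour when all text has the same font size (every piece is kept as a candidate) are assumptions, and all names are hypothetical.

```python
def select_keywords(words, image_width, image_height, max_words=3, edge_margin=0.1):
    """Select keyword information from recognized text.

    words: list of (text, font_size, x, y), with (x, y) the centre of
           the text in image coordinates and y increasing downwards
    """
    sizes = {size for _, size, _, _ in words}
    if len(sizes) >= 2:
        # At least two font sizes: the largest text is the preset keyword.
        largest = max(sizes)
        candidates = [w for w in words if w[1] == largest]
    else:
        candidates = list(words)  # assumption: keep everything

    if len(candidates) <= max_words:
        return [w[0] for w in candidates]

    margin_x = edge_margin * image_width
    margin_y = edge_margin * image_height

    def near_edge(word):
        _, _, x, y = word
        return (x < margin_x or x > image_width - margin_x
                or y < margin_y or y > image_height - margin_y)

    # Too many candidates: drop those at the edge, then prefer upper ones.
    inner = [w for w in candidates if not near_edge(w)]
    inner.sort(key=lambda w: w[3])
    return [w[0] for w in inner[:max_words]]
```

Position-based screening only runs when the number of preset keywords exceeds the preset word number, so a page with a single large title is returned unchanged.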

In some embodiments, the text pattern obtaining unit 501 may further compare the text in the text information with pre-stored words in a database, and set the text as keyword information if the two are the same. The pre-stored words are stored in the database in advance and narrow the content corresponding to the preset image to a certain range, such as language, mathematics, a first book, a storybook, and the like. The pre-stored words can also be obtained through a network.

In some embodiments, the text pattern obtaining unit 501 may further compare the pattern information with pre-stored patterns in a database, and if the pattern information is the same, set the pattern information as the target pattern.

In some embodiments, the text pattern obtaining unit 501 is further configured to set a fast search keyword information set and a fast search pattern set in a database. The keyword information and patterns stored in these sets can be used to quickly find a corresponding audio link set that contains a small amount of data; examples include book names, author names, prices, car logo patterns, two-dimensional codes, and the like.

A parameter setting unit 502, configured to set one of the keyword information and the target pattern information as a first parameter and set the other information as a second parameter.

Referring to fig. 13, fig. 13 is a schematic diagram illustrating a fifth structure of an apparatus for acquiring audio information according to an embodiment of the present application. In the present embodiment, the parameter setting unit 502 includes a history duration acquiring subunit 5021 and a parameter setting subunit 5022. Wherein:

a history duration obtaining subunit 5021, configured to acquire, for the keyword information and the target pattern information respectively, the historical duration of obtaining the corresponding link information set;

a parameter setting subunit 5022 is configured to take the information with shorter history duration in the keyword information and the target pattern information as the first parameter.

The historical duration of obtaining the corresponding link information set is acquired for the keyword information and for the target pattern information, and one of the keyword information and the target pattern information is selected as the first parameter accordingly.

Similarly, corresponding audio links are acquired according to the target pattern information to form an audio link set, and the durations of acquiring the two data sets are then obtained; one of the keyword information and the target pattern information is set as the first parameter, and the other is set as the second parameter. For example, the corresponding data sets are searched for according to the keyword information and the target pattern information at the same time; whichever data set is found first has its keyword information or target pattern information set as the first parameter, and the other is set as the second parameter.

The target link obtaining unit 503 is configured to select corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter.

Referring to fig. 14, fig. 14 is a schematic view of a sixth structure of an apparatus for acquiring audio information according to an embodiment of the present application. In this embodiment, the target link obtaining unit 503 includes an order setting sub-unit 5031, a searching sub-unit 5032, and a first screening sub-unit 5033. Wherein:

an order setting sub-unit 5031 configured to obtain search efficiency values of a plurality of sub-parameters included in the first parameter, and set a search order of the plurality of sub-parameters according to the search efficiency values.

The searching sub-unit 5032 is configured to search the database for the corresponding audio link by using the sub-parameter with the rank exceeding the preset rank in the search order, and form an initial link set.

The first screening sub-unit 5033 is configured to perform screening in the initial link set by using the sub-parameters ranked below the preset ranking in the search order, so as to obtain an audio link set.

Referring to fig. 15, fig. 15 is a schematic view of a seventh structure of an apparatus for acquiring audio information according to an embodiment of the present application. In this embodiment, the target link obtaining unit 503 includes a second screening sub-unit 5034, a presentation sub-unit 5035, and a determination sub-unit 5036. Wherein:

a second screening subunit 5034, configured to perform screening according to the second parameter in the audio link set to obtain multiple pieces of data to be selected;

a display subunit 5035, configured to display a plurality of data to be selected;

the determining subunit 5036 is configured to determine, according to the confirmation instruction, target link data corresponding to the confirmation instruction from the multiple pieces of data to be selected.

In some embodiments, the display subunit is further configured to, if the number of the plurality of pieces of data to be selected exceeds a preset number, obtain a weight value of each piece of data to be selected according to the weight of the keyword information and the similarity of the target pattern information in that piece of data, and select, from the plurality of data to be selected, the preset number of pieces with the largest weight values for display.

An audio obtaining unit 504, configured to obtain the target audio information according to the target link data.

According to the device for acquiring the audio information, when the fact that the information loss of the preset image exceeds the preset identification threshold value is identified, the keyword information and the target pattern information are extracted from the preset image; one of the keyword information and the target pattern information is used as a first parameter, and the other information is used as a second parameter; selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter; and acquiring audio information corresponding to the target link data. The audio link set is obtained through one of the keyword information or the target pattern information, then the target link data is obtained through screening according to the other one, the recognition rate of the preset image with less information is improved, and the recognition efficiency is improved.

In specific implementation, the above modules may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and specific implementation of the above modules may refer to the foregoing method embodiments, which are not described herein again.

In the embodiment of the present application, the apparatus for acquiring audio information and the method for acquiring audio information in the above embodiments belong to the same concept, and any method provided in the embodiment of the method for acquiring audio information may be run on the apparatus for acquiring audio information, and a specific implementation process thereof is described in detail in the embodiment of the method for acquiring audio information, and is not described herein again.

The embodiment of the application also provides the electronic equipment. Referring to fig. 16, the electronic device 600 includes a processor 601 and a memory 602. The processor 601 is electrically connected to the memory 602.

The processor 601 is a control center of the electronic device 600, connects various parts of the entire electronic device using various interfaces and lines, performs various functions of the electronic device 600 by running or loading a computer program stored in the memory 602, and calls data stored in the memory 602, and processes the data, thereby performing overall monitoring of the electronic device 600.

The memory 602 may be used for storing software programs and units, and the processor 601 executes various functional applications and data processing by running the computer programs and units stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, a computer program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 602 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 601 with access to the memory 602.

In the embodiment of the present application, the processor 601 in the electronic device 600 loads instructions corresponding to one or more processes of the computer program into the memory 602 according to the following steps, and the processor 601 runs the computer program stored in the memory 602, thereby implementing various functions as follows:

when the fact that the information loss of the preset image exceeds a preset recognition threshold value is recognized, extracting keyword information and target pattern information from the preset image;

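one of the keyword information and the target pattern information is used as a first parameter, and the other information is used as a second parameter;

selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter;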

and acquiring audio information corresponding to the target link data.

In some embodiments, the processor 601 is further configured to perform the following steps:

acquiring, for the keyword information and the target pattern information respectively, the historical duration of obtaining the corresponding link information set;

and using the information with shorter history time in the keyword information and the target pattern information as a first parameter.

In some embodiments, the first parameter comprises a plurality of sub-parameters, and the processor 601 is further configured to perform the following steps:

obtaining the search efficiency value of each sub-parameter, and setting a search sequence for the sub-parameters according to the search efficiency value;

searching in a database to obtain corresponding audio links by utilizing the sub-parameters with the rank exceeding the preset rank in the searching sequence, and forming an initial link set;

and screening in the initial link set by using the sub-parameters ranked below the preset rank in the search sequence to obtain the audio link set.

In some embodiments, the processor 601 is further configured to perform the following steps:

screening is carried out in the audio link set according to the second parameters to obtain a plurality of data to be selected;

displaying a plurality of data to be selected;

and determining target link data corresponding to the confirmation instruction from the plurality of data to be selected according to the confirmation instruction.

In some embodiments, the processor 601 is further configured to perform the following steps:

dividing a preset image into a plurality of areas;

setting a plurality of sampling points on each area, and acquiring a plurality of color samples according to the plurality of sampling points;

determining a background color according to the number of color samples of each color;

acquiring pattern information according to the background color;

target pattern information is extracted from the pattern information.

In some embodiments, the processor 601 is further configured to perform the following steps:

if the proportion of sampling points in an area whose color samples are the background color is smaller than a preset proportion threshold, determining the area as an image-taking area;

acquiring the color of each display point in the image taking area, and correspondingly forming a matrix data table;

in the matrix data table, setting audio links corresponding to display points with background colors as first data, and setting audio links corresponding to display points with colors not being background colors as second data to obtain a matrix data table to be selected;

in the matrix data table to be selected, the matrix data table to be selected is divided into a plurality of pattern areas according to first data, and pattern information is obtained according to the positions of second data in the pattern areas.

In some embodiments, the processor 601 is further configured to perform the following steps:

acquiring character information from a preset image, and font size and position information of each character in the character information in the preset image;

if the font sizes corresponding to the characters comprise at least two types, setting the character with the largest font as a preset keyword;

and if the number of the preset keywords exceeds the preset word number, screening according to the position information of the preset keywords to obtain keyword information.

As can be seen from the above, in the electronic device provided in the embodiment of the present application, when it is recognized that the information loss of the preset image exceeds the preset recognition threshold, the keyword information and the target pattern information are extracted from the preset image; one of the keyword information and the target pattern information is used as a first parameter, and the other information is used as a second parameter; selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter; and acquiring audio information corresponding to the target link data. The audio link set is obtained through one of the keyword information or the target pattern information, then the target link data is obtained through screening according to the other one, and the recognition rate of the preset image with less information is improved.

Referring also to fig. 17, in some embodiments, the electronic device 600 may further include: a display 603, a radio frequency circuit 604, an audio circuit 605, and a power supply 606. The display 603, the rf circuit 604, the audio circuit 605 and the power supply 606 are electrically connected to the processor 601, respectively.

The display 603 may be used to display information entered by or provided to the user as well as various graphical user interfaces, which may be made up of graphics, text, icons, video, and any combination thereof. The Display 603 may include a Display panel, and in some embodiments, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.

The rf circuit 604 may be used for transceiving rf signals to establish wireless communication with a network device or other electronic devices through wireless communication, and for transceiving signals with the network device or other electronic devices.

The audio circuit 605 may be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.

The power supply 606 may be used to power various components of the electronic device 600. In some embodiments, the power supply 606 may be logically connected to the processor 601 through a power management system, so as to implement functions such as charging management, discharging management, and power consumption management through the power management system.

Although not shown in fig. 17, the electronic device 600 may further include a camera, a bluetooth unit, and the like, which are not described in detail herein.

An embodiment of the present application further provides a storage medium, where the storage medium stores a computer program, and when the computer program runs on a computer, the computer is caused to execute the method for acquiring audio information in any one of the above embodiments, for example: when the fact that the information loss of the preset image exceeds a preset recognition threshold value is recognized, extracting keyword information and target pattern information from the preset image; one of the keyword information and the target pattern information is used as a first parameter, and the other information is used as a second parameter; selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter; and acquiring audio information corresponding to the target link data.

In the embodiment of the present application, the storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.

In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

It should be noted that, for the method for acquiring audio information in the embodiment of the present application, it can be understood by a person skilled in the art that all or part of the process for implementing the method for acquiring audio information in the embodiment of the present application may be completed by controlling the relevant hardware through a computer program, where the computer program may be stored in a computer-readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device, and during the execution process, the process may include, for example, the process of the embodiment of the method for acquiring audio information. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, etc.

In the apparatus for acquiring audio information according to the embodiment of the present application, each functional unit may be integrated in one processing chip, or each unit may exist alone physically, or two or more units are integrated in one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may also be stored in a computer readable storage medium, such as a read-only memory, a magnetic or optical disk, or the like.

The method, the apparatus, the storage medium, and the electronic device for acquiring audio information provided by the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principle and the implementation of the present application, and the description of the above embodiments is only intended to help understand the method and the core idea of the present application. Meanwhile, for those skilled in the art, the specific embodiments and the scope of application may vary according to the idea of the present application. In summary, the content of this specification should not be construed as a limitation to the present application.

Claims

1. A method for acquiring audio information, applied to an electronic device, characterized by comprising the following steps:

acquiring audio information corresponding to the target link data;

the first parameter includes a plurality of sub-parameters, and the step of obtaining the audio link set corresponding to the first parameter includes:

searching in a database to obtain corresponding audio links by using the sub-parameters with the ranking exceeding the preset ranking in the searching sequence, and forming an initial link set;

and screening in the initial link set by using the sub-parameters ranked below a preset rank in the search sequence to obtain the audio link set.
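The two-stage search recited above can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the database is modelled as a hypothetical dictionary from sub-parameter to a set of link identifiers, the search sequence is assumed to run from the highest search efficiency value to the lowest, and the links found by the leading sub-parameters are assumed to be combined by union.

```python
def build_audio_link_set(database, sub_params, efficiency, preset_rank):
    """Search with the leading sub-parameters, then screen with the rest.

    database    -- hypothetical dict: sub-parameter -> set of link ids
    efficiency  -- hypothetical dict: sub-parameter -> search efficiency value
    preset_rank -- number of leading sub-parameters used for the database search
    """
    # Assumption: a higher efficiency value means an earlier place in the sequence.
    order = sorted(sub_params, key=lambda p: efficiency[p], reverse=True)
    leading, trailing = order[:preset_rank], order[preset_rank:]

    # Stage 1: query the database with the leading sub-parameters
    # to form the initial link set (assumed here to be a union).
    initial = set()
    for param in leading:
        initial |= database.get(param, set())

    # Stage 2: screen the initial set with each remaining sub-parameter.
    result = initial
    for param in trailing:
        matches = database.get(param, set())
        result = {link for link in result if link in matches}
    return result


database = {
    "lion": {"a", "b", "c"},
    "king": {"b", "c", "d"},
    "song": {"c", "e"},
}
efficiency = {"lion": 0.9, "king": 0.5, "song": 0.2}
link_set = build_audio_link_set(database, ["song", "lion", "king"], efficiency, 1)
```

Only the leading sub-parameters touch the database; the later ones filter a set that is already in memory, which is the apparent motivation for ordering by search efficiency.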

2. The method of claim 1, wherein the step of using one of the keyword information and the target pattern information as the first parameter comprises:

acquiring, for the keyword information and the target pattern information respectively, the historical duration for acquiring the corresponding link information set;

and taking whichever of the keyword information and the target pattern information has the shorter historical duration as the first parameter.

3. The method according to claim 1, wherein the step of selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter comprises:

displaying the plurality of data to be selected;

4. The method of claim 1, wherein the step of extracting the target pattern information from the preset image comprises:

dividing the preset image into a plurality of areas;

setting a plurality of sampling points on each region, and acquiring a plurality of color samples according to the plurality of sampling points;

determining a background color according to the number of the color samples of each color;

acquiring pattern information according to the background color;

target pattern information is extracted from the pattern information.
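The region division, sampling, and background-color steps of this claim can be sketched as below. The sketch makes several assumptions that the claim does not state: the image is a hypothetical 2D list of color values, regions form a regular grid, sampling points lie on a fixed stride given by `step`, and the background color is taken to be the most frequent sampled color.

```python
from collections import Counter


def estimate_background(image, rows, cols, step):
    """Divide the image into rows x cols regions, sample each region on a
    regular stride, and return the most frequently sampled color."""
    height, width = len(image), len(image[0])
    region_h, region_w = height // rows, width // cols
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            # Sampling points inside region (r, c); pixels left over when the
            # image size is not divisible by the grid are ignored in this sketch.
            for y in range(r * region_h, (r + 1) * region_h, step):
                for x in range(c * region_w, (c + 1) * region_w, step):
                    counts[image[y][x]] += 1
    # Assumption: the color with the largest sample count is the background.
    return counts.most_common(1)[0][0]


image = [
    ["white", "white", "white", "white"],
    ["white", "red", "red", "white"],
    ["white", "red", "white", "white"],
    ["white", "white", "white", "white"],
]
background = estimate_background(image, rows=2, cols=2, step=1)
```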

5. The method of claim 4, wherein the step of obtaining pattern information according to the background color comprises:

if the proportion of the color samples acquired at the sampling points in the region that are of the background color is smaller than a preset proportion threshold, determining the region to be an image-taking region;

acquiring the color of each display point in the image-taking region, and correspondingly forming a matrix data table;

in the matrix data table, setting the audio link corresponding to the display point with the color of the background color as first data, and setting the audio link corresponding to the display point with the color not being the background color as second data to obtain a matrix data table to be selected;

in the matrix data table to be selected, dividing the matrix data table to be selected into a plurality of pattern areas according to the first data, and acquiring pattern information according to the positions of the second data in the pattern areas.
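A rough sketch of the steps in this claim follows. It is only one possible reading: the first data and second data are represented by the hypothetical values 0 and 1 stored per display point, and the division into pattern areas is assumed to cut the table at columns that contain only first data. The function names are illustrative and do not come from the present application.

```python
def is_image_region(samples, background, ratio_threshold):
    """A region qualifies when the share of background-colored samples
    is smaller than the preset proportion threshold."""
    share = sum(1 for s in samples if s == background) / len(samples)
    return share < ratio_threshold


def to_matrix_table(region, background):
    """Mark background display points as 0 (first data) and all other
    display points as 1 (second data)."""
    return [[0 if px == background else 1 for px in row] for row in region]


def split_pattern_areas(table):
    """Return (start, end) column ranges separated by all-background columns."""
    width = len(table[0])
    areas, start = [], None
    for x in range(width):
        has_pattern = any(row[x] == 1 for row in table)
        if has_pattern and start is None:
            start = x  # a pattern area begins at this column
        elif not has_pattern and start is not None:
            areas.append((start, x))  # a background column closes the area
            start = None
    if start is not None:
        areas.append((start, width))
    return areas


region = [
    ["w", "r", "w", "b"],
    ["w", "r", "w", "w"],
]
table = to_matrix_table(region, "w")
areas = split_pattern_areas(table)
```

The positions of the second data inside each returned column range would then serve as the pattern information for that area.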

6. The method for acquiring audio information according to claim 1, wherein the step of extracting keyword information from the preset image comprises:

acquiring character information from the preset image, and the font size and the position information of each character in the character information in the preset image;

7. An apparatus for obtaining audio information, the apparatus comprising:

the target link acquisition unit is used for selecting corresponding target link data from the audio link set corresponding to the first parameter according to the second parameter;

the audio acquisition unit is used for acquiring audio information corresponding to the target link data;

wherein the target link acquiring unit includes:

the sequence setting subunit is used for acquiring search efficiency values of a plurality of sub-parameters and setting a search sequence of the plurality of sub-parameters according to the search efficiency values, wherein the sub-parameters are the sub-parameters included in the first parameter;

the searching subunit is used for searching a database to obtain corresponding audio links by utilizing the subparameters with the ranks exceeding the preset ranks in the searching sequence and forming an initial link set;

and the first screening subunit is used for screening the initial link set by using the sub-parameters ranked below a preset rank in the search sequence to obtain the audio link set.

8. The apparatus for acquiring audio information according to claim 7, wherein the parameter setting unit comprises:

a history duration obtaining subunit, configured to obtain, for the keyword information and the target pattern information respectively, the historical duration for acquiring the corresponding link information set;

and the parameter setting subunit is used for taking whichever of the keyword information and the target pattern information has the shorter historical duration as the first parameter.

9. The apparatus for acquiring audio information according to claim 7, wherein the target link acquiring unit includes:

the second screening subunit is used for screening in the audio link set according to the second parameter to obtain a plurality of data to be selected;

the display subunit is used for displaying the plurality of data to be selected;

and the determining subunit is used for determining target link data corresponding to the confirmation instruction from the multiple data to be selected according to the confirmation instruction.

10. The apparatus for acquiring audio information according to claim 7, wherein the text pattern acquiring unit comprises:

a dividing subunit, configured to divide the preset image into a plurality of regions;

the sampling subunit is used for setting a plurality of sampling points on each region and acquiring a plurality of color samples according to the plurality of sampling points;

a background color obtaining subunit, configured to determine a background color according to the number of the color samples of each color;

the pattern information acquisition subunit is used for acquiring pattern information according to the background color;

and the target pattern information acquisition subunit is used for extracting the target pattern information from the pattern information.

11. The apparatus for acquiring audio information according to claim 10, wherein the pattern information acquiring subunit includes:

the image taking region determining module is used for determining the region as an image taking region if the proportion of the color samples obtained by the sampling points in the region to the background color is smaller than a preset proportion threshold;

the first matrix forming module is used for acquiring the color of each display point in the image taking area and correspondingly forming a matrix data table;

a second matrix forming module, configured to set, in the matrix data table, an audio link corresponding to the display point with the color of the background color as first data, and set, in the matrix data table, an audio link corresponding to the display point with the color not being the background color as second data, so as to obtain a to-be-selected matrix data table;

and the pattern information acquisition module is used for dividing the matrix data table to be selected into a plurality of pattern areas according to the first data in the matrix data table to be selected and acquiring pattern information according to the positions of the second data in the pattern areas.

12. The apparatus for acquiring audio information according to claim 7, wherein the text pattern acquiring unit comprises:

the character information acquiring subunit is used for acquiring character information from the preset image, and the font size of each character in the character information and the position information of each character in the preset image;

a preset keyword obtaining subunit, configured to set, if the font size corresponding to each character includes at least two types, the character with the largest font as a preset keyword;

and the keyword information screening subunit is configured to, if the number of the preset keywords exceeds the preset word number, screen according to the position information of the preset keywords to obtain keyword information.
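The keyword selection performed by these subunits can be sketched as follows. The `Char` record and `extract_keywords` function are hypothetical. The claim says only that screening uses the position information, so preferring characters nearest the top-left corner is an assumption, as is returning an empty list when all characters share one font size.

```python
from dataclasses import dataclass


@dataclass
class Char:
    """Hypothetical record for one recognized character."""
    text: str
    font_size: float
    x: float
    y: float


def extract_keywords(chars, max_words):
    """Pick the largest-font characters as preset keywords and, if there are
    more than max_words of them, screen them by position."""
    sizes = {c.font_size for c in chars}
    if len(sizes) < 2:
        # The claim covers only the case of at least two font sizes;
        # this sketch returns nothing otherwise (assumption).
        return []
    largest = max(sizes)
    preset = [c for c in chars if c.font_size == largest]
    if len(preset) > max_words:
        # Assumption: keep the characters closest to the top-left corner.
        preset = sorted(preset, key=lambda c: (c.y, c.x))[:max_words]
    return [c.text for c in preset]


chars = [
    Char("L", 24, 0, 0),
    Char("I", 24, 10, 0),
    Char("O", 24, 20, 0),
    Char("n", 12, 0, 20),
]
keywords = extract_keywords(chars, max_words=2)
```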

13. A storage medium having stored thereon a computer program for causing a computer to perform the method of acquiring audio information of any one of claims 1 to 6 when the computer program runs on the computer.

14. An electronic device comprising a processor and a memory, said memory storing a computer program, wherein said processor is adapted to perform the method of acquiring audio information of any one of claims 1 to 6 by invoking said computer program.