CN110727854B - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents



Publication number
CN110727854B
Authority
CN
China
Prior art keywords
interface, character, pinyin, preset, characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910775864.3A
Other languages
Chinese (zh)
Other versions
CN110727854A (en)
Inventor
王彦芳 (Wang Yanfang)
Current Assignee
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201910775864.3A priority Critical patent/CN110727854B/en
Publication of CN110727854A publication Critical patent/CN110727854A/en
Application granted granted Critical
Publication of CN110727854B publication Critical patent/CN110727854B/en

Classifications

    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/3343 Query execution using phonetics
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/9538 Presentation of query results
    • G06F3/0483 Interaction with page-structured environments, e.g. book metaphor
    • G06F3/0485 Scrolling or panning

Abstract

The invention relates to a data processing method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: receiving an interface image of an application interface sent by a client, the interface image being captured when a preset sliding operation is detected on the client interface; determining, in a preset character template library, a character template matching the interface image; determining an ordered pinyin set corresponding to the character template according to the character-to-pinyin correspondence in a preset pinyin library; and generating output audio based on the ordered pinyin set and returning the output audio to the client. Embodiments of the invention enhance the appeal of recommended content to the user, guiding the user to click and watch it, strengthening the interactivity between the APP and the user, and increasing the APP's click-through volume.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
In current children's APPs (taking Qibabu as an example), when a user browses the APP independently and slides the page horizontally, the content recommended by the APP is aggregated by three-level classification or by data characteristics, and each type of content is shown in a different functional block, such as "Sweet Princess Dreams" for girls or "Positive Hero Dreams" for boys.
This way of presentation has a drawback: young children have a limited reading vocabulary. When they see a recommended functional block and the data content inside it, they do not recognize the Chinese characters of the block theme or of the picture-book or film titles, and so may miss content they would otherwise have been interested in. The current presentation mode of text plus cover image therefore appeals to young children, and to other users with reading difficulties, through vision alone; its interactivity and attraction are very limited. Moreover, the APP provider loses click-through volume on the recommended data.
Disclosure of Invention
In order to solve the technical problems or at least partially solve the technical problems, the invention provides a data processing method, a data processing device, an electronic device and a computer-readable storage medium.
In a first aspect, the present invention provides a data processing method, including:
receiving an interface image of an application interface sent by a client, wherein the interface image is captured when a preset sliding operation is detected on the client interface;
determining a character template matched with the interface image in a preset character template library;
determining an ordered pinyin set corresponding to the character template according to the corresponding relation between characters and pinyin in a preset pinyin library;
and generating output audio based on the ordered pinyin set, and returning the output audio to the client.
Optionally, determining a character template matched with the interface image in a preset character template library includes:
extracting characters to be recognized from the interface image;
respectively calculating the similarity between the characters to be recognized and each character template in a preset character template library;
and determining the character template with the highest similarity as the character template matched with the interface image.
Optionally, extracting the characters to be recognized from the interface image includes:
carrying out image processing on the interface image to obtain an intermediate image;
performing image edge search on the intermediate image to obtain an edge search result;
performing pixel neighborhood calculation on the edge search result to obtain a plurality of pixel connected regions;
and extracting the characters to be recognized from the plurality of pixel connected regions.
Optionally, establishing the preset character template library includes:
acquiring first interface contents of a plurality of preset application interfaces in a database;
performing word segmentation and duplication removal on the first interface content to obtain an interface character set;
aiming at each interface character in the interface character set, respectively carrying out display setting according to a plurality of preset character display forms to obtain a plurality of character templates;
and storing a plurality of character templates corresponding to each interface character into the character template library.
Optionally, establishing the preset character template library further includes:
if the newly added second interface content exists in the database, performing word segmentation on the second interface content;
removing duplication of the second interface content after word segmentation and the interface character set to obtain new interface characters;
and storing the newly added interface characters into the interface character set, and executing the step of displaying and setting each interface character in the interface character set according to a plurality of preset character display forms.
Optionally, determining an ordered pinyin set corresponding to the character template according to the correspondence between characters and pinyin in a preset pinyin library includes:
determining characters and character sequences corresponding to the character templates;
determining the pinyin corresponding to the characters according to the corresponding relation between the characters and the pinyin in a preset pinyin library;
and sequencing the pinyin according to the character sequence to obtain the ordered pinyin set.
Optionally, generating output audio based on the ordered pinyin set includes:
searching an audio segment corresponding to the pinyin in a preset voice corpus to obtain an ordered audio set;
and coding and splicing the audio segments in the ordered audio set to obtain the output audio.
Optionally, establishing the preset speech corpus includes:
constructing a pinyin table, wherein the pinyin table comprises a plurality of standard pinyins;
recording an audio segment for each standard pinyin in the pinyin table;
and correspondingly storing each standard pinyin and the corresponding audio segment thereof to obtain a voice corpus.
In a second aspect, the present invention provides a data processing apparatus comprising:
the receiving module is used for receiving an interface image of the application interface sent by the client, wherein the interface image is captured when a preset sliding operation is detected on the client interface;
the determining module is used for determining a character template matched with the interface image in a preset character template library;
the first building module is used for determining an ordered pinyin set corresponding to the character template according to the corresponding relation between characters and pinyin in a preset pinyin library;
and the generating module is used for generating output audio based on the ordered pinyin set and returning the output audio to the client.
Optionally, the determining module includes:
the extraction unit is used for extracting characters to be recognized from the interface image;
the calculation unit is used for respectively calculating the similarity between the characters to be recognized and each character template in a preset character template library;
and the first determining unit is used for determining the character template with the highest similarity as the character template matched with the interface image.
Optionally, the extracting unit is further configured to:
carrying out image processing on the interface image to obtain an intermediate image;
performing image edge search on the intermediate image to obtain an edge search result;
performing pixel neighborhood calculation on the edge search result to obtain a plurality of pixel connected regions;
and extracting the characters to be recognized from the plurality of pixel connected regions.
Optionally, the apparatus further comprises:
the acquisition module is used for acquiring first interface contents of a plurality of preset application interfaces in a database;
the word segmentation and duplication removal module is used for carrying out word segmentation and duplication removal on the first interface content to obtain an interface character set;
the setting module is used for carrying out display setting on each interface character in the interface character set according to a plurality of preset character display forms to obtain a plurality of character templates;
and the first storage module is used for storing a plurality of character templates corresponding to each interface character into the character template library.
Optionally, the apparatus further comprises:
the word segmentation module is used for segmenting the second interface content if the newly added second interface content exists in the database;
the duplication elimination module is used for eliminating duplication of the second interface content after word segmentation and the interface character set to obtain newly added interface characters;
and the second storage module is used for storing the newly-added interface characters into the interface character set and executing the step of display setting according to a plurality of preset character display forms for each interface character in the interface character set.
Optionally, the first building module includes:
the second determining unit is used for determining characters and character sequences corresponding to the character templates;
a third determining unit, configured to determine pinyin corresponding to the characters according to a correspondence between the characters and the pinyin in a preset pinyin library;
and the sequencing unit is used for sequencing the pinyin according to the character sequence to obtain the ordered pinyin set.
Optionally, the generating module includes:
the searching unit is used for searching the audio segments corresponding to the pinyin in a preset voice corpus to obtain an ordered audio set;
and the splicing unit is used for coding and splicing the audio segments in the ordered audio set to obtain the output audio.
Optionally, the apparatus further comprises:
a second construction module configured to construct a pinyin table, the pinyin table including a plurality of standard pinyins;
a recording module, configured to record an audio segment for each standard pinyin in the pinyin table;
and the third storage module is used for correspondingly storing each standard pinyin and the corresponding audio segment thereof to obtain a voice corpus.
In a third aspect, the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another via the communication bus;
a memory for storing a computer program;
a processor for implementing the data processing method of any one of the first aspect when executing the program stored in the memory.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon a data processing method program that, when executed by a processor, implements the steps of the data processing method of any one of the first aspects.
Compared with the prior art, the technical scheme provided by the embodiment of the invention has the following advantages:
after receiving an interface image of an application interface which is sent by a client and captured by triggering through a preset sliding operation, the embodiment of the invention can determine a character template matched with the interface image in a preset character template library, then determine an ordered pinyin set corresponding to the character template according to the corresponding relation between characters and pinyin in the preset pinyin library, and finally generate an output audio based on the ordered pinyin set and return the output audio to the client.
The embodiment of the invention can output the characters exposed in the interface image in a voice broadcasting mode when the preset sliding operation is detected on the application interface, namely, the characters are displayed on the application interface, and simultaneously the embodiment of the invention can output the characters in the voice broadcasting mode, so that the interest of a user on the recommended content is grasped from the visual and auditory aspects, the attraction of the recommended content to the user is enhanced, the user is guided to click and watch the recommended content, the interactivity between the APP and the user is enhanced, and the click arrival quantity of the APP can be increased.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a data processing method according to another embodiment of the present invention;
fig. 3 is a schematic flow chart of creating a text template library according to another embodiment of the present invention;
fig. 4 is a schematic flowchart illustrating a process of creating the predetermined speech corpus according to another embodiment of the present invention;
fig. 5 is a schematic structural diagram of a data processing apparatus according to another embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to yet another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
To solve the above technical problem, embodiments of the present invention provide a data processing method, an apparatus, an electronic device, and a computer-readable storage medium. As shown in fig. 1, the data processing method may be applied to a server and may include the following steps:
step S101, a server receives an interface image of an application interface sent by a client, wherein the interface image is captured when a preset sliding operation is detected on a client interface;
in practical application, a user opens an APP installed on a client, when an application interface of the APP is switched through a preset sliding operation, the APP sends an information acquisition request to a server interface, names, brief introduction and contents of all functional blocks in the application interface are acquired from a server through the information acquisition request, at the moment, the functional blocks only render the exposed part of a first screen, and the unexposed part is not rendered.
The preset sliding operation may refer to the user sliding a finger on the application interface so that the interface slides along with the finger and reveals unrendered functional blocks. For example, it may be a slow slide of the finger (e.g., a sliding speed of less than 10 cm/s) that makes the application interface follow the finger slowly, so that the functional blocks in the application interface are revealed from right to left.
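The patent does not specify how the client measures sliding speed; the sketch below is one hypothetical way a client could sample touch positions over time and test the "slow slide" threshold described above (all names and the 160 dpi assumption are illustrative, not from the patent):

```python
def is_slow_slide(samples, dpi=160, threshold_cm_per_s=10.0):
    """samples: ordered list of (timestamp_s, x_px) touch points.

    Returns True when the horizontal slide speed is below the preset
    threshold (10 cm/s in the example above), i.e. a "slow slide".
    """
    if len(samples) < 2:
        return False
    (t0, x0), (t1, x1) = samples[0], samples[-1]
    if t1 <= t0:
        return False
    pixels_per_cm = dpi / 2.54            # convert device pixels to cm
    distance_cm = abs(x1 - x0) / pixels_per_cm
    speed = distance_cm / (t1 - t0)       # cm per second
    return 0 < speed < threshold_cm_per_s
```

On detecting such a slide, the client would capture the interface screenshot and send it to the server as in step S101.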
Step S102, the server determines a character template matched with the interface image in a preset character template library;
as shown in fig. 2, the step S102 may include the steps of:
step S201, extracting characters to be recognized from the interface image;
In this step, image processing may first be performed on the interface image to obtain an intermediate image. For example, the image processing may consist of cropping the interface image, converting the cropped image to grayscale, and applying filter processing to the grayscale image.
Then, carrying out image edge search on the intermediate image to obtain an edge search result; illustratively, multiple rounds of edge finding may be performed on the intermediate image using the Canny edge detection algorithm.
Next, pixel-neighborhood calculation is performed on the edge search result, yielding a number of pixel connected regions of various sizes distributed across the image.
Finally, the characters to be recognized can be extracted from the plurality of pixel connected regions, for example by an Optical Character Recognition (OCR) algorithm, which analyzes and recognizes the image regions containing text in order to extract the characters.
Through image processing, edge searching, and pixel-neighborhood calculation, the pixel connected regions containing the characters to be recognized can be located accurately. Extracting the characters from these regions requires less computation than extracting them from the whole picture, saving system resources.
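The pixel-neighborhood step can be illustrated with a minimal, self-contained sketch: given a binarized edge map (1 = edge pixel), group 4-connected pixels into regions. A real implementation would run this after the grayscale/filter processing and Canny edge detection described above; the function below is an assumption-laden toy, not the patent's algorithm:

```python
def connected_regions(bitmap):
    """bitmap: list of rows of 0/1. Returns a list of regions (sets of (r, c))."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] == 1 and (r, c) not in seen:
                stack, region = [(r, c)], set()
                while stack:                      # iterative flood fill
                    y, x = stack.pop()
                    if (y, x) in seen or not (0 <= y < rows and 0 <= x < cols):
                        continue
                    if bitmap[y][x] != 1:
                        continue
                    seen.add((y, x))
                    region.add((y, x))
                    # 4-neighborhood: up, down, left, right
                    stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
                regions.append(region)
    return regions
```

Each resulting region would then be cropped out and handed to the OCR step.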
Step S202, respectively calculating similarity between the characters to be recognized and each character template in a preset character template library;
in the embodiment of the invention, the text template library is a text template set which is established in advance and comprises a plurality of text templates.
And step S203, determining the character template with the highest similarity as the character template matched with the interface image.
By calculating the similarity between the characters to be recognized, extracted from the interface image, and each character template in the preset character template library, and selecting the template with the highest similarity as the match, the accuracy of the determined character template can be improved.
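Steps S202 and S203 can be sketched as follows. The patent does not fix a similarity measure; a simple string-ratio comparison (Python's `difflib.SequenceMatcher`) stands in for it here, so treat this as an illustrative assumption rather than the claimed method:

```python
from difflib import SequenceMatcher

def best_template(recognised, template_library):
    """Score the recognised text against every template (S202) and
    return the template with the highest similarity (S203)."""
    def similarity(a, b):
        return SequenceMatcher(None, a, b).ratio()
    return max(template_library, key=lambda t: similarity(recognised, t))
```

Even with an OCR error in one character, the closest template still wins, which is the point of matching against a library rather than trusting raw OCR output.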
In the embodiment of the present invention, as shown in fig. 3, the pre-established character template library covers two cases: a template library built from text already existing in the APP, and one built from text newly added to the APP (for example, when the APP adds a new functional module). Since the existing text of the APP is known to have been presented on the application interface, it can be given special treatment to improve accuracy. Building the character template library from the existing text of the APP can be implemented by the following steps S301 to S304:
step S301, acquiring first interface contents of a plurality of preset application interfaces in a database;
for example, the first interface content may refer to all the three-level classifications and the function block topics, and in this step, all the three-level classifications and the function block topics may be obtained in the database.
Step S302, performing word segmentation and duplicate removal on the first interface content to obtain an interface character set containing non-repeated Chinese characters;
step S303, aiming at each interface character in the interface character set, respectively carrying out display setting according to a plurality of preset character display forms to obtain a plurality of character templates;
the preset text display forms, for example, may refer to song style, clerical script, microsoft mazzo black, and the like. In this step, the interface texts may be set to be fonts such as song style, clerical script, microsoft elegant black, and the like, so that each interface text can correspond to obtain a plurality of text templates.
And step S304, storing a plurality of character templates corresponding to each interface character into the character template library.
Through steps S301 to S304, the first interface content of the application interfaces (i.e., the text already existing in the APP) is obtained from the database, and the character template library is built from it. Because the characters to be recognized are extracted from an interface image while the templates are built from the interface text itself, a fully identical or highly similar template is easier to find during matching, which improves the accuracy of the similarity calculation between the characters to be recognized and each template in the preset library.
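Steps S301 to S304 can be sketched under two simplifying assumptions: "word segmentation" is treated as character-level splitting, and the display forms are plain font names. The font list and function names are illustrative, not from the patent:

```python
PRESET_FONTS = ["SimSun", "LiShu", "Microsoft YaHei"]   # assumed display forms

def build_template_library(interface_contents):
    """interface_contents: first interface content strings from the database.

    Segments and de-duplicates them into an interface character set (S302),
    then produces one template per (character, display form) pair (S303),
    ready to be stored in the template library (S304)."""
    charset = set()
    for content in interface_contents:
        charset.update(ch for ch in content if not ch.isspace())
    return {(ch, font) for ch in charset for font in PRESET_FONTS}
```

In a real system each template would be a rendered glyph image rather than a `(character, font)` tuple, but the bookkeeping is the same.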
Newly added text in the APP may result from adding a functional module or from other text newly introduced into the application interface. Extending the preset character template library with the APP's newly added text may include the following steps:
step S305, if the newly added second interface content exists in the database, performing word segmentation on the second interface content;
similar to the first interface content, the second interface content may refer to all three levels of classification and function block themes, and so on.
Step S306, duplicate removal is carried out on the second interface content after word segmentation and the interface character set to obtain newly added interface characters;
step S307, storing the newly added interface characters into the interface character set, and executing the step S303 to display and set each interface character in the interface character set according to a plurality of preset character display forms.
Through steps S305 to S307, the second interface content newly added to the application interface (i.e., the newly added text in the APP) is obtained from the database, segmented, de-duplicated against the interface character set, and set in the preset character display forms; the resulting character templates are then stored, so that the template library is extended with the APP's newly added text.
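The incremental update in steps S305 to S307 can be sketched as follows, again assuming character-level segmentation (an illustrative simplification):

```python
def merge_new_content(charset, new_content):
    """Segment newly added interface content (S305), de-duplicate it
    against the existing interface character set (S306), and store the
    genuinely new characters into the set (S307, first half).

    Returns the newly added characters, for which new templates would
    then be rendered as in step S303."""
    new_chars = {ch for ch in new_content if not ch.isspace()} - charset
    charset |= new_chars
    return new_chars
```

Only the returned characters need new templates; everything already in the set is skipped, which keeps the library update cheap.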
After the above step S102 is executed, step S103 is executed.
Step S103, the server determines an ordered pinyin set corresponding to the character template according to the correspondence between characters and pinyin in a preset pinyin library;
in this step, the characters and the character sequence corresponding to the character template may be determined first, then the pinyin corresponding to the characters may be determined according to the correspondence between the characters and the pinyin in a preset pinyin library, and finally the pinyins may be sorted according to the character sequence to obtain the ordered pinyin set.
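Step S103 reduces to an ordered table lookup. The tiny character-to-pinyin table below is a hypothetical stand-in for the preset pinyin library (the tone-digit notation is also an assumption):

```python
# Hypothetical fragment of the preset pinyin library; a real one would
# cover the full dictionary.
PINYIN_LIBRARY = {"甜": "tian2", "美": "mei3", "梦": "meng4"}

def ordered_pinyin(characters):
    """Map each character of the matched template to its pinyin,
    preserving the character order, to produce the ordered pinyin set."""
    return [PINYIN_LIBRARY[ch] for ch in characters if ch in PINYIN_LIBRARY]
```

Because the list comprehension walks the characters in order, the resulting pinyin sequence already carries the ordering required by step S104.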
And step S104, the server generates output audio based on the ordered pinyin set and returns the output audio to the client.
In this step, the audio segments corresponding to the pinyin can be searched in a preset speech corpus to obtain an ordered audio set, and then the audio segments in the ordered audio set are encoded and spliced to obtain the output audio.
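Step S104 can be sketched as a lookup-and-concatenate over the speech corpus. Raw byte concatenation stands in for real audio encoding and splicing, and the segment contents below are placeholders, not actual recordings:

```python
# Hypothetical speech corpus: standard pinyin -> pre-recorded audio bytes.
SPEECH_CORPUS = {"tian2": b"\x01\x02", "mei3": b"\x03", "meng4": b"\x04\x05"}

def generate_output_audio(pinyin_set):
    """Look up each pinyin's audio segment (the ordered audio set),
    then splice the segments in order to form the output audio."""
    segments = [SPEECH_CORPUS[p] for p in pinyin_set]
    return b"".join(segments)
```

In practice the splice would re-encode the segments into one playable stream (e.g., a single WAV or AAC file) before returning it to the client.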
The server returns the output audio to the client, and the APP of the client calls the operating system playing interface to play the output audio.
The speech corpus is a pre-established set containing a number of standard pinyins paired with their corresponding audio segments. As shown in fig. 4, establishing the preset speech corpus may include the following steps:
step S401, constructing a phonetic table, wherein the phonetic table comprises: a plurality of standard pinyins;
because there are many homophones in Chinese characters, it is only necessary to take the minimum set of pinyin to traverse the pronunciation of each character, the main key of the pinyin list is self-increasing id, and the content includes initial consonant, vowel and tone. For example, the initial consonant is sh, the final is an, and the tone is one sound, which can form a uniquely determined pinyin. Thus, the pronunciation table may have hundreds of elements, which can cover all the pronunciations of the characters in the dictionary.
Step S402, recording an audio segment for each standard pinyin in the pinyin table;
In this step, only (number of initials) × (number of finals) × 4 tones, i.e. about several hundred recordings, need to be made.
Step S403, storing each standard pinyin together with its corresponding audio segment to obtain the speech corpus.
Each audio segment recorded in the speech corpus corresponds one-to-one with a standard pinyin in the pinyin table.
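The enumeration in step S401 can be sketched with a Cartesian product. Note the simplifying assumption that every initial/final/tone combination is valid; real Mandarin admits only a subset of combinations, so an actual table would filter to legal syllables:

```python
from itertools import product

def build_pinyin_table(initials, finals, tones=(1, 2, 3, 4)):
    """Enumerate (id, initial, final, tone) rows, with a
    self-incrementing primary key as described above."""
    return [(i + 1, ini, fin, tone)
            for i, (ini, fin, tone) in enumerate(product(initials, finals, tones))]
```

With roughly 21 initials and 39 finals, the unfiltered product stays in the low thousands, and the legal subset in the hundreds, matching the "several hundred recordings" figure in step S402.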
Through steps S401 to S403, a speech corpus containing a number of standard-pinyin/audio-segment pairs is obtained. The server can use it conveniently when generating output audio from the ordered pinyin set, and because the audio segments are recorded against standard pinyins, the output audio sounds more accurate and is easier for the user to understand.
After receiving, from a client, an interface image of an application interface whose capture was triggered by a preset sliding operation, the embodiment of the invention can determine a character template matching the interface image in a preset character template library, then determine an ordered pinyin set corresponding to the character template according to the correspondence between characters and pinyins in a preset pinyin library, and finally generate output audio based on the ordered pinyin set and return the output audio to the client.
When the preset sliding operation is detected on the application interface, the embodiment of the invention can output the characters exposed in the interface image by voice broadcast; that is, while the characters are displayed on the application interface, they are simultaneously broadcast by voice. This is convenient for children who cannot yet read and for other users with reading difficulties, captures the user's interest in the recommended content from both the visual and the auditory angle, enhances the attraction of the recommended content to the user, guides the user to click and watch the recommended content, strengthens the interactivity between the APP and the user, and can increase the click-through volume of the APP.
In practical application, when a child or another user with reading difficulties uses the APP and slowly slides the page so that a functional block scrolls into view from right to left, the theme of the functional block can be broadcast by voice at that moment, for example: "Sweet princess, a girl's dream." If the user hears keywords such as "sweet" and "princess" while simultaneously seeing the cover images of the 6-7 items of content under the functional block, the combination of sound and picture can help arouse the user's interest in those programs, prompting the user to click and watch them.
In still another embodiment of the present invention, there is also provided a data processing apparatus, as shown in fig. 5, including:
the receiving module 11 is configured to receive an interface image of an application interface sent by a client, where the interface image is captured when a preset sliding operation is detected on a client interface;
the determining module 12 is configured to determine a text template matched with the interface image in a preset text template library;
the first building module 13 is configured to determine an ordered pinyin set corresponding to the character template according to the correspondence between characters and pinyins in a preset pinyin library;
and the generating module 14 is configured to generate output audio based on the ordered pinyin set, and return the output audio to the client.
In another embodiment of the present invention, the determining module includes:
the extraction unit is used for extracting characters to be recognized from the interface image;
the calculation unit is used for respectively calculating the similarity between the characters to be recognized and each character template in a preset character template library;
and the first determining unit is used for determining the character template with the highest similarity as the character template matched with the interface image.
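The calculation and first determining units described above can be sketched as follows. The patent does not prescribe a specific similarity metric, so the fraction-of-matching-pixels measure below is one simple illustrative choice, and the tiny binary-grid templates are assumptions.

```python
# Sketch of the matching step: compare a binarized character image against each
# template in the library and keep the template with the highest similarity.

def similarity(char_pixels, template_pixels):
    """Fraction of positions where two equally sized binary grids agree."""
    total = len(char_pixels)
    matches = sum(1 for a, b in zip(char_pixels, template_pixels) if a == b)
    return matches / total

def best_template(char_pixels, template_library):
    """Return (template_name, score) of the most similar template."""
    scored = {name: similarity(char_pixels, tpl)
              for name, tpl in template_library.items()}
    name = max(scored, key=scored.get)
    return name, scored[name]
```

For example, with `template_library = {"a": [1, 1, 0, 1], "b": [0, 0, 1, 0]}` and the extracted character `[1, 1, 0, 0]`, template "a" agrees at three of four positions and wins with a score of 0.75.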
In another embodiment of the present invention, the extracting unit is further configured to:
carrying out image processing on the interface image to obtain an intermediate image;
performing image edge searching on the intermediate image to obtain an edge searching result;
performing pixel neighborhood calculation on the edge search result to obtain a plurality of pixel connected regions;
and extracting the characters to be recognized from the plurality of pixel connected regions.
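The "pixel neighborhood calculation" step above can be illustrated in pure Python: given a binary edge map (1 denotes an edge pixel), group pixels into 4-connected regions by flood fill; each region is a candidate character area. A real implementation would typically use an image library (e.g. OpenCV's connected-component functions); this sketch shows only the idea.

```python
# Group edge pixels into 4-connected regions via breadth-first flood fill.
from collections import deque

def connected_regions(grid):
    """Return a list of regions; each region is a set of (row, col) pixels."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 and not seen[r][c]:
                region, queue = set(), deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    region.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions
```

On the edge map `[[1, 1, 0], [0, 0, 0], [0, 0, 1]]` this yields two regions: one containing the two adjacent pixels in the first row and one containing the isolated pixel in the corner.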
In yet another embodiment of the present invention, the apparatus further comprises:
the acquisition module is used for acquiring first interface contents of a plurality of preset application interfaces in a database;
the word segmentation and duplication removal module is used for carrying out word segmentation and duplication removal on the first interface content to obtain an interface character set;
the setting module is used for carrying out display setting on each interface character in the interface character set according to a plurality of preset character display forms to obtain a plurality of character templates;
and the first storage module is used for storing a plurality of character templates corresponding to each interface character into the character template library.
In yet another embodiment of the present invention, the apparatus further comprises:
the word segmentation module is used for segmenting the second interface content if the newly added second interface content exists in the database;
the duplication removing module is used for removing duplication of the second interface content after word segmentation and the interface character set to obtain newly added interface characters;
and the second storage module is used for storing the newly added interface characters into the interface character set and executing, for each interface character in the interface character set, the step of display setting according to the plurality of preset character display forms.
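The library-maintenance flow of the word segmentation, duplication removing, and second storage modules can be sketched as follows. The whitespace split below is a placeholder for a real Chinese word-segmentation step, which the patent does not specify.

```python
# Segment newly added interface content, drop words already in the interface
# character set, and store the remainder (placeholder segmentation by spaces).

def update_interface_set(interface_set, new_content):
    """Return the list of newly added interface words after deduplication."""
    words = new_content.split()   # stands in for real word segmentation
    new_words = [w for w in words if w not in interface_set]
    interface_set.update(new_words)
    return new_words
```

For example, if the set already contains "princess" and the new content is "sweet princess dream", only "sweet" and "dream" are added and then passed on to display setting.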
In another embodiment of the present invention, the building module includes:
the second determining unit is used for determining characters and character sequences corresponding to the character templates;
a third determining unit, configured to determine pinyin corresponding to the characters according to a correspondence between the characters and the pinyin in a preset pinyin library;
and the sequencing unit is used for sequencing the pinyin according to the character sequence to obtain the ordered pinyin set.
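The second and third determining units and the sequencing unit together amount to an ordered lookup, which can be sketched as follows. The two-entry pinyin library here is illustrative only; its (initial, final, tone) tuples are assumptions matching the pinyin-table format described earlier.

```python
# Look up each character of the matched template in a preset pinyin library,
# keeping the results in the characters' display order.

PINYIN_LIBRARY = {            # hypothetical preset pinyin library
    "甜": ("t", "ian", 2),    # tian, 2nd tone
    "心": ("x", "in", 1),     # xin, 1st tone
}

def ordered_pinyin_set(characters):
    """Return the pinyins of `characters`, ordered by character order."""
    return [PINYIN_LIBRARY[ch] for ch in characters]
```

Calling `ordered_pinyin_set("甜心")` returns the two pinyins in character order, ready for audio lookup.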
In yet another embodiment of the present invention, the generating module includes:
the searching unit is used for searching the audio segments corresponding to the pinyin in a preset voice corpus to obtain an ordered audio set;
and the splicing unit is used for coding and splicing the audio segments in the ordered audio set to obtain the output audio.
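The searching and splicing units can be sketched as below. Raw byte strings stand in for encoded audio segments; real splicing of encoded audio must also handle container headers (for example rewriting a WAV header), a detail omitted here.

```python
# Look up the audio segment for each pinyin in the speech corpus, then splice
# the ordered segments into one output stream.

SPEECH_CORPUS = {                 # hypothetical corpus: pinyin -> audio bytes
    ("t", "ian", 2): b"TIAN2",
    ("x", "in", 1): b"XIN1",
}

def generate_output_audio(ordered_pinyins):
    """Fetch each pinyin's segment in order and concatenate them."""
    ordered_audio = [SPEECH_CORPUS[p] for p in ordered_pinyins]
    return b"".join(ordered_audio)
```

The resulting byte stream is what the server would return to the client for playback.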
In yet another embodiment of the present invention, the apparatus further comprises:
a second construction module configured to construct a pinyin table, the pinyin table including: a plurality of standard pinyins;
the recording module is used for recording an audio segment for each standard pinyin in the pinyin table;
and the third storage module is used for storing each standard pinyin in correspondence with its audio segment to obtain the voice corpus.
In another embodiment of the present invention, an electronic device is further provided, which includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and the processor is configured to implement the data processing method of the foregoing method embodiments when executing the program stored in the memory.
In yet another embodiment of the present invention, a computer-readable storage medium is also provided, on which a data processing method program is stored, which when executed by a processor implements the steps of the data processing method described in the aforementioned method embodiment.
According to the electronic device provided by the embodiment of the invention, the processor, by executing the program stored in the memory, implements the data processing method of the foregoing method embodiments, so that the characters displayed on an application interface can additionally be output by voice broadcast.
The communication bus 1140 mentioned in the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 1140 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 6, but this is not intended to represent only one bus or type of bus.
The communication interface 1120 is used for communication between the electronic device and other devices.
The memory 1130 may include a Random Access Memory (RAM), and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor 1110 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the invention are all or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (ssd)), among others.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (18)

1. A data processing method, comprising:
receiving an interface image of an application interface sent by a client, wherein the interface image is intercepted when the client interface detects a preset sliding operation;
determining a character template matched with the interface image in a preset character template library, wherein each interface character in the character template library corresponds to a plurality of character templates with different character display forms;
determining an ordered pinyin set corresponding to the character template according to the corresponding relation between characters and pinyin in a preset pinyin library;
and generating output audio based on the ordered pinyin set, and returning the output audio to the client.
2. The data processing method of claim 1, wherein determining a text template matching the interface image in a preset text template library comprises:
extracting characters to be recognized from the interface image;
respectively calculating similarity between the characters to be recognized and each character template in a preset character template library;
and determining the character template with the highest similarity as the character template matched with the interface image.
3. The data processing method of claim 2, wherein extracting the text to be recognized in the interface image comprises:
carrying out image processing on the interface image to obtain an intermediate image;
performing image edge searching on the intermediate image to obtain an edge searching result;
performing pixel neighborhood calculation on the edge search result to obtain a plurality of pixel connected regions;
and extracting the characters to be recognized from the plurality of pixel connected regions.
4. The data processing method of claim 1, wherein establishing the preset text template library comprises:
acquiring first interface contents of a plurality of preset application interfaces in a database;
performing word segmentation and duplicate removal on the first interface content to obtain an interface character set;
aiming at each interface character in the interface character set, respectively performing display setting according to a plurality of preset character display forms to obtain a plurality of character templates;
and storing a plurality of character templates corresponding to each interface character into the character template library.
5. The data processing method of claim 4, wherein establishing the preset text template library further comprises:
if the newly added second interface content exists in the database, performing word segmentation on the second interface content;
removing duplication of the second interface content after word segmentation and the interface character set to obtain new interface characters;
and storing the newly added interface characters into the interface character set, and executing the step of displaying and setting each interface character in the interface character set according to a plurality of preset character display forms.
6. The data processing method of claim 1, wherein determining the ordered pinyin set corresponding to the text template according to correspondence between the text and the pinyin in a preset pinyin library, comprises:
determining characters and character sequences corresponding to the character templates;
determining the pinyin corresponding to the characters according to the corresponding relation between the characters and the pinyin in a preset pinyin library;
and sequencing the pinyin according to the character sequence to obtain the ordered pinyin set.
7. The data processing method of claim 1, wherein generating output audio based on the ordered pinyin-collection includes:
searching an audio segment corresponding to the pinyin in a preset voice corpus to obtain an ordered audio set;
and coding and splicing the audio segments in the ordered audio set to obtain the output audio.
8. The data processing method of claim 7, wherein establishing the preset voice corpus comprises:
constructing a pinyin table, wherein the pinyin table comprises: a plurality of standard pinyins;
recording an audio segment for each standard pinyin in the pinyin table;
and storing each standard pinyin in correspondence with its audio segment to obtain the voice corpus.
9. A data processing apparatus, comprising:
the receiving module is used for receiving an interface image of the application interface sent by the client, wherein the interface image is intercepted when the client interface detects a preset sliding operation;
the determining module is used for determining a character template matched with the interface image in a preset character template library, and each interface character in the character template library corresponds to a plurality of character templates with different character display forms;
the first building module is used for determining an ordered pinyin set corresponding to the character template according to the corresponding relation between characters and pinyin in a preset pinyin library;
and the generating module is used for generating output audio based on the ordered pinyin set and returning the output audio to the client.
10. The data processing apparatus of claim 9, wherein the determining module comprises:
the extraction unit is used for extracting characters to be recognized from the interface image;
the calculation unit is used for respectively calculating the similarity between the characters to be recognized and each character template in a preset character template library;
and the first determining unit is used for determining the character template with the highest similarity as the character template matched with the interface image.
11. The data processing apparatus of claim 10, wherein the extraction unit is further configured to:
carrying out image processing on the interface image to obtain an intermediate image;
performing image edge search on the intermediate image to obtain an edge search result;
performing pixel neighborhood calculation on the edge search result to obtain a plurality of pixel connected regions;
and extracting the characters to be recognized from the plurality of pixel connected regions.
12. The data processing apparatus of claim 9, wherein the apparatus further comprises:
the acquisition module is used for acquiring first interface contents of a plurality of preset application interfaces in a database;
the word segmentation and duplication removal module is used for carrying out word segmentation and duplication removal on the first interface content to obtain an interface character set;
the setting module is used for carrying out display setting on each interface character in the interface character set according to a plurality of preset character display forms to obtain a plurality of character templates;
and the first storage module is used for storing a plurality of character templates corresponding to each interface character into the character template library.
13. The data processing apparatus of claim 12, wherein the apparatus further comprises:
the word segmentation module is used for segmenting the second interface content if the newly added second interface content exists in the database;
the duplication removing module is used for removing duplication of the second interface content after word segmentation and the interface character set to obtain newly added interface characters;
and the second storage module is used for storing the newly-added interface characters into the interface character set and executing the step of display setting according to a plurality of preset character display forms for each interface character in the interface character set.
14. The data processing apparatus of claim 9, wherein the building module comprises:
the second determining unit is used for determining characters and character sequences corresponding to the character templates;
a third determining unit, configured to determine pinyin corresponding to the characters according to a correspondence between the characters and the pinyin in a preset pinyin library;
and the sequencing unit is used for sequencing the pinyin according to the character sequence to obtain the ordered pinyin set.
15. The data processing apparatus of claim 9, wherein the generating module comprises:
the searching unit is used for searching the audio segments corresponding to the pinyin in a preset voice corpus to obtain an ordered audio set;
and the splicing unit is used for coding and splicing the audio segments in the ordered audio set to obtain the output audio.
16. The data processing apparatus of claim 9, wherein the apparatus further comprises:
a second construction module configured to construct a pinyin table, the pinyin table including: a plurality of standard pinyins;
the recording module is used for recording an audio segment for each standard pinyin in the pinyin table;
and the third storage module is used for storing each standard pinyin in correspondence with its audio segment to obtain the voice corpus.
17. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the data processing method of any one of claims 1 to 8 when executing the program stored in the memory.
18. A computer-readable storage medium, characterized in that a data-processing method program is stored on the computer-readable storage medium, which when executed by a processor implements the steps of the data-processing method of any one of claims 1 to 8.
CN201910775864.3A 2019-08-21 2019-08-21 Data processing method and device, electronic equipment and computer readable storage medium Active CN110727854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910775864.3A CN110727854B (en) 2019-08-21 2019-08-21 Data processing method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110727854A CN110727854A (en) 2020-01-24
CN110727854B true CN110727854B (en) 2022-07-12

Family

ID=69217126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910775864.3A Active CN110727854B (en) 2019-08-21 2019-08-21 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110727854B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101118541A (en) * 2006-08-03 2008-02-06 苗玉水 Chinese-voice-code voice recognizing method
CN103297710A (en) * 2013-06-19 2013-09-11 江苏华音信息科技有限公司 Audio and video recorded broadcast device capable of marking Chinese and foreign language subtitles automatically in real time for Chinese
CN104698998A (en) * 2013-12-05 2015-06-10 上海能感物联网有限公司 Robot system under Chinese speech field control
CN108346427A (en) * 2018-02-05 2018-07-31 广东小天才科技有限公司 A kind of audio recognition method, device, equipment and storage medium
CN110060524A (en) * 2019-04-30 2019-07-26 广东小天才科技有限公司 The method and reading machine people that a kind of robot assisted is read

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10909584B2 (en) * 2006-11-30 2021-02-02 NEXRF Corp. Content relevance weighting system
CN101354748A (en) * 2007-07-23 2009-01-28 英华达(上海)电子有限公司 Device, method and mobile terminal for recognizing character
CN102163284B (en) * 2011-04-11 2013-02-27 西安电子科技大学 Chinese environment-oriented complex scene text positioning method
CN105956588A (en) * 2016-04-21 2016-09-21 深圳前海勇艺达机器人有限公司 Method of intelligent scanning and text reading and robot device
CN107608618B (en) * 2017-09-18 2020-10-09 广东小天才科技有限公司 Interaction method and device for wearable equipment and wearable equipment
CN108847066A (en) * 2018-05-31 2018-11-20 上海与德科技有限公司 A kind of content of courses reminding method, device, server and storage medium
CN109300347B (en) * 2018-12-12 2021-01-26 广东小天才科技有限公司 Dictation auxiliary method based on image recognition and family education equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
OCR Based Image Text to Speech Conversion Using MATLAB;Sneha.C. Madre et al.;《2018 Second International Conference on Intelligent Computing and Control Systems》;20190311;1-5 *

Also Published As

Publication number Publication date
CN110727854A (en) 2020-01-24

Similar Documents

Publication Publication Date Title
EP3579140A1 (en) Method and apparatus for processing video
CN109543058B (en) Method, electronic device, and computer-readable medium for detecting image
CN111161739B (en) Speech recognition method and related product
CN109979450B (en) Information processing method and device and electronic equipment
CN112733654B (en) Method and device for splitting video
CN110347866B (en) Information processing method, information processing device, storage medium and electronic equipment
CN109558513A (en) A kind of content recommendation method, device, terminal and storage medium
CN110750996B (en) Method and device for generating multimedia information and readable storage medium
CN114143479B (en) Video abstract generation method, device, equipment and storage medium
CN111178056A (en) Deep learning based file generation method and device and electronic equipment
WO2022228235A1 (en) Method and apparatus for generating video corpus, and related device
CN114095749A (en) Recommendation and live interface display method, computer storage medium and program product
CN113035199A (en) Audio processing method, device, equipment and readable storage medium
CN111538830A (en) French retrieval method, French retrieval device, computer equipment and storage medium
CN113038175B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN113407775B (en) Video searching method and device and electronic equipment
CN116567351B (en) Video processing method, device, equipment and medium
CN110263135B (en) Data exchange matching method, device, medium and electronic equipment
CN110727854B (en) Data processing method and device, electronic equipment and computer readable storage medium
CN112509581B (en) Error correction method and device for text after voice recognition, readable medium and electronic equipment
CN110428668B (en) Data extraction method and device, computer system and readable storage medium
CN111259181B (en) Method and device for displaying information and providing information
CN112699687A (en) Content cataloging method and device and electronic equipment
CN115618873A (en) Data processing method and device, computer equipment and storage medium
CN111368553A (en) Intelligent word cloud picture data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant