WO2021042234A1 - Application introduction method, mobile terminal and server - Google Patents

Application introduction method, mobile terminal and server

Info

Publication number
WO2021042234A1
Authority
WO
WIPO (PCT)
Prior art keywords
keywords
application
information
introduction
mobile terminal
Prior art date
Application number
PCT/CN2019/104000
Other languages
English (en)
Chinese (zh)
Inventor
艾静雅
柳彤
朱大卫
汤慧秀
Original Assignee
深圳海付移通科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳海付移通科技有限公司
Priority to CN201980010315.5A (CN111801673A)
Priority to PCT/CN2019/104000 (WO2021042234A1)
Publication of WO2021042234A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/08Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10Transforming into visible information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/44Receiver circuitry for the reception of television signals according to analogue transmission standards
    • H04N5/60Receiver circuitry for the reception of television signals according to analogue transmission standards for the sound signals

Definitions

  • This application relates to the technical field of application programs, and in particular to an application introduction method, a mobile terminal, and a server.
  • This application provides an application introduction method, a mobile terminal, and a server. On the one hand, the method can adapt to different user groups so that the application meets the needs of more users; on the other hand, it introduces the application in the form of an animation, which makes the introduction more personalized and more engaging and thus improves the user experience.
  • The first technical solution adopted by this application is an application introduction method, including: obtaining introduction requirement information about the application, where the introduction requirement information indicates the requirement for introducing the application; extracting keywords from the introduction requirement information; obtaining associated images and voice based on the keywords; and processing the images and voice to form a video for introducing the application.
  • When the introduction requirement information is audio information, extracting keywords from it includes: performing speech recognition on the audio information to obtain text information, and then performing keyword extraction on the text information to obtain the keywords.
  • Performing keyword extraction on the text information includes: performing semantic segmentation on the text information and obtaining the keywords based on the segmentation result.
  • Performing semantic segmentation on the text information includes: inputting the text information into a convolutional neural network for deep learning, so as to segment the text semantically and obtain the keywords.
  • When the introduction requirement information is text information, extracting keywords from it includes: performing semantic segmentation on the text information and obtaining the keywords based on the segmentation result.
  • Obtaining the associated images and voice based on the keywords includes: sending the keywords to the server so that the server generates the associated images and voice based on the keywords, and then obtaining the images and voice sent by the server.
  • Processing the images and voice to form a video for introducing the application includes: performing image segmentation on the multiple corresponding images and extracting feature information from them; combining the feature information to generate multiple image frames; forming the image frames into an animation; and merging the animation with the voice to form the video.
  • The method further includes: obtaining background music sent by the server, where the background music is generated by the server based on the keywords, and adding the background music to the video.
  • The second technical solution adopted by this application is an application introduction method, including: acquiring keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application and the introduction requirement information expresses the requirement for introducing the application; generating associated images and voice based on the keywords; and sending the images and voice to the mobile terminal so that the mobile terminal can process them into a video for introducing the application.
  • Generating associated images and voice based on the keywords includes: applying deep learning to the keywords to obtain associated images from a preset image library.
  • Generating associated images and voice based on the keywords also includes: applying deep learning to the keywords to generate text information that fits the keyword scene, and converting the text information into voice.
  • Another technical solution adopted by this application is a mobile terminal, which includes a processor and a memory connected to the processor; the memory stores program data, and the processor executes the program data to implement the first solution described above.
  • Another technical solution adopted by this application is a server, which includes a processor and a memory connected to the processor; the memory stores program data, and the processor executes the program data to implement the second solution described above.
  • Another technical solution adopted by the present application is a computer storage medium for storing program data; when the program data is executed by a processor, it implements any of the methods provided in the above solutions.
  • Another technical solution adopted by this application is a mobile terminal, including: an acquisition module for acquiring introduction requirement information about an application, where the introduction requirement information indicates the requirement for introducing the application; an extraction module for extracting keywords from the introduction requirement information; the acquisition module being further used to obtain associated images and voice based on the keywords; and a processing module for processing the images and voice to form a video for introducing the application.
  • Another technical solution adopted by this application is a server, including: an acquisition module for acquiring keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application and the introduction requirement information expresses the requirement for introducing the application; a processing module for generating associated images and voice based on the keywords; and a sending module for sending the images and voice to the mobile terminal so that the mobile terminal can process them to form a video for introducing the application.
  • The application introduction method of this application thus includes: obtaining introduction requirement information about the application, where the introduction requirement information indicates the requirement for introducing the application; extracting the keywords from the introduction requirement information; obtaining associated images and voice based on the keywords; and processing the images and voice to form a video for introducing the application.
  • FIG. 1 is a schematic flowchart of a first embodiment of the application introduction method provided by the present application.
  • FIG. 2 is a schematic flowchart of a second embodiment of the application introduction method provided by the present application.
  • FIG. 3 is a schematic flowchart of a third embodiment of the application introduction method provided by the present application.
  • FIG. 4 is a schematic flowchart of a fourth embodiment of the application introduction method provided by the present application.
  • FIG. 5 is a schematic flowchart of a fifth embodiment of the application introduction method provided by the present application.
  • FIG. 6 is a schematic flowchart of a sixth embodiment of the application introduction method provided by the present application.
  • FIG. 7 is a schematic flowchart of a seventh embodiment of the application introduction method provided by the present application.
  • FIG. 8 is a schematic structural diagram of a first embodiment of a mobile terminal provided by the present application.
  • FIG. 9 is a schematic structural diagram of a first embodiment of a server provided by the present application.
  • FIG. 10 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application.
  • FIG. 11 is a schematic structural diagram of a second embodiment of a mobile terminal provided by the present application.
  • FIG. 12 is a schematic structural diagram of a second embodiment of a server provided by the present application.
  • FIG. 1 is a schematic flowchart of a first embodiment of the application introduction method provided by the present application. The method is implemented on a mobile terminal and includes:
  • Step 11: Obtain introduction requirement information about the application, where the introduction requirement information indicates the requirement for introducing the application.
  • After the mobile terminal downloads the application in response to the user and the installation completes, it obtains the introduction requirement information about the application.
  • The introduction requirement information may be audio information or text information.
  • Audio information is collected by the microphone of the mobile terminal; text information can be entered manually, or keywords prompted by the application can be selected as the text information.
  • The introduction requirement information indicates the requirement for introducing the application. For example, when a user wants to know about the income of a financial management application, the user can use "income" as the introduction requirement information.
  • Step 12: Extract the keywords from the introduction requirement information.
  • The mobile terminal performs keyword extraction on the content of the introduction requirement information.
  • For example, suppose the obtained introduction requirement information is audio information, and the text parsed from the audio is "How does this application keep payment secure?"; the extracted keyword is then "secure payment".
  • The keyword extraction method may be a keyword extraction algorithm based on statistical features.
  • Such an algorithm uses statistical information about the words in a document to extract the document's keywords.
  • The text is preprocessed to obtain a set of candidate words, and keywords are then selected from the candidate set by quantifying feature values.
  • Feature-value quantification methods include quantification based on word weight, quantification based on a word's position in the document, and quantification based on word-relatedness information.
  • Quantification based on word weight mainly considers part of speech, word frequency, inverse document frequency, relative word frequency, word length, and the like. Quantification based on document position rests on the assumption that sentences at different positions in a text differ in importance to the document.
  • Words among the first N words, the last N words, the beginning or end of a paragraph, the title, or the introduction of a text are usually representative.
  • Word-relatedness information measures the degree of relevance between words, or between words and the document, including mutual information, HITS value, contribution degree, dependency degree, TF-IDF value, and so on. A minimal TF-IDF sketch follows below.
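  • As an illustration of the statistical approach above, the following minimal sketch ranks candidate keywords by TF-IDF weight. scikit-learn is an assumed dependency, and the sample corpus is invented for illustration; the patent itself does not prescribe a specific implementation.

```python
# Minimal TF-IDF keyword ranking sketch (assumed dependency: scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "how do I open an account for the first time",
    "how do I invest for high return and low risk",
    "is the payment in this application secure",
]
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(corpus)        # rows: documents, cols: vocabulary

terms = vectorizer.get_feature_names_out()
weights = tfidf[2].toarray().ravel()            # TF-IDF weights of the third query
top = weights.argsort()[::-1][:3]
print([terms[i] for i in top])                  # highest-weighted candidate keywords
```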
  • Keyword extraction can also be based on deep learning methods.
  • Step 13: Obtain associated images and voice based on the keywords.
  • To acquire the associated images, the mobile terminal may send the keywords to the server, and the server performs image retrieval in a preset image library to obtain multiple images.
  • To acquire the associated voice, the mobile terminal may send the keywords to the server; the server generates multiple passages of text that fit the application scenario from the keywords and sends them to the mobile terminal, which then converts the text information into voice information.
  • Alternatively, the mobile terminal may perform image retrieval in a local preset image library to obtain multiple images.
  • Likewise, the mobile terminal may itself generate multiple passages of text that fit the application scenario from the keywords and then convert the text information into voice information.
  • Step 14: Process the images and voice to form a video for introducing the application.
  • For example, suppose the keywords are "bird" and "tree": one image contains the feature information of a tree and another contains the feature information of a bird. These two pieces of feature information can be extracted and combined according to the scene to form an image of a bird perched on a tree. After a series of complete images is composed, the images are smoothed and otherwise enhanced so that the content looks more natural.
  • The voice information and the image information are then merged to form a video for introducing the application.
  • For example, the application prompts the user to say what he or she wants to know.
  • Suppose the audio information collected by the mobile terminal is: "This is my first time opening an account. How should I invest to get high returns with low risk? And how do I pay?"
  • The keywords extracted by the mobile terminal are "first-time account opening", "investment", "high return", "low risk", and "how to pay". Corresponding images are then retrieved from the preset image library based on these keywords: for example, "first-time account opening" retrieves the account-opening screen and images of animated characters, while "high return" and "low risk" retrieve warnings and recommended products.
  • The retrieved images are then assembled into a segment in which an animated character explains how best to invest while keeping risk low, followed by another segment pointing out that payment security is also very important.
  • Finally, text information fitting the scene is generated and converted into voice, and the voice information and image information are merged to form a video that addresses the user's needs.
  • Background music can also be added to the video.
  • In this way, the mobile terminal can generate an application introduction with matching sound and images from a piece of speech; it can even tell a story to a child by voice.
  • The child can describe the type of story he or she likes to listen to, and a short illustrated story is generated through machine learning. This makes children more interested, and lets children who cannot yet read acquire the corresponding knowledge through animation.
  • Depending on the application and on different user requirements, the mobile terminal generates an illustrated video corresponding to the application for the user to watch.
  • In summary, the application introduction method of this application includes: obtaining introduction requirement information about the application, where the introduction requirement information indicates the requirement for introducing the application; extracting the keywords from the introduction requirement information; obtaining associated images and voice based on the keywords; and processing the images and voice to form a video for introducing the application.
  • FIG. 2 is a schematic flowchart of a second embodiment of the application introduction method provided by the present application. The method is implemented on a mobile terminal and includes:
  • Step 21: Acquire audio information about the application, where the audio information indicates the requirement for introducing the application.
  • The user's audio information is collected to express what the user wants introduced.
  • The audio information may concern anything about the application that the user wants to learn.
  • Alternatively, after the application starts, it may display text prompting the user about what can be learned, so that the user can quickly speak the corresponding keyword information.
  • Step 22: Perform speech recognition on the audio information to obtain text information.
  • Speech recognition converts a piece of audio information into the corresponding text information.
  • A speech recognition system mainly comprises four parts: feature extraction, an acoustic model, a language model, and dictionary-based decoding.
  • The collected audio information first undergoes preprocessing such as filtering and framing, so that the audio to be analyzed is properly extracted from the original signal. Feature extraction then converts the audio from the time domain to the frequency domain, providing suitable feature vectors for the acoustic model. The acoustic model scores each feature vector against the acoustic characteristics, while the language model computes the probability of the candidate phrase sequences corresponding to the sound signal according to linguistic theory. Finally, the phrase sequence is decoded against the existing dictionary to obtain the most likely text representation.
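  • The following is a minimal sketch of this speech-to-text step using the third-party SpeechRecognition package, an assumed dependency that wraps the feature-extraction and decoding pipeline described above rather than exposing each stage; the file name and language code are illustrative assumptions.

```python
# Minimal speech-to-text sketch (assumed dependency: SpeechRecognition).
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("introduction_request.wav") as source:
    audio = recognizer.record(source)           # load the whole recording

# recognize_google() performs the acoustic- and language-model decoding
# remotely; any other backend offered by the package could be used instead.
text = recognizer.recognize_google(audio, language="zh-CN")
print(text)                                     # the user's spoken request as text
```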
  • Step 23: Input the text information into a convolutional neural network for deep learning, so as to perform semantic segmentation on the text information and obtain the keywords.
  • A large amount of data is trained in advance through the convolutional neural network to generate a corresponding semantic segmentation model.
  • Once the semantic segmentation model receives the text information, it can output the keywords.
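  • One plausible form of such a model is a convolutional network over word embeddings that labels each token as keyword or non-keyword. The sketch below uses PyTorch as an assumed dependency; the vocabulary size, dimensions, and two-label scheme are illustrative assumptions, and a real model would first be trained on labeled data as described above.

```python
# Minimal token-labeling CNN sketch (assumed dependency: PyTorch).
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # 1-D convolutions over the token axis capture local n-gram features.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, 64, kernel_size=k, padding=k // 2)
             for k in (3, 5)]
        )
        # Per-token head: is this token part of a keyword?
        self.head = nn.Linear(64 * 2, num_labels)

    def forward(self, token_ids):                     # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)     # (batch, embed, seq)
        feats = [torch.relu(conv(x)) for conv in self.convs]
        x = torch.cat(feats, dim=1).transpose(1, 2)   # (batch, seq, 128)
        return self.head(x)                           # per-token logits

logits = TextCNN()(torch.randint(0, 10000, (1, 12)))
print(logits.shape)   # torch.Size([1, 12, 2]): keyword/non-keyword per token
```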
  • Step 24: Obtain associated images and voice based on the keywords.
  • Step 25: Process the images and voice to form a video for introducing the application.
  • Steps 24–25 are the same as or similar to the corresponding steps of the foregoing embodiment and will not be repeated here.
  • FIG. 3 is a schematic flowchart of a third embodiment of the application introduction method provided by the present application.
  • The method is implemented on a mobile terminal and includes:
  • Step 31: Acquire text information about the application, where the text information indicates the requirement for introducing the application.
  • The text information may be entered manually by the user, or generated from the user's selection among multiple passages of text prompted by the application.
  • Step 32: Perform semantic segmentation on the text information.
  • Step 33: Obtain keywords based on the result of the semantic segmentation.
  • Steps 32–33 may specifically adopt any of the following extraction methods (a graph-based sketch follows below):
  • TF-IDF (term frequency–inverse document frequency, scored as TF × IDF), a common weighting technique in information retrieval and data mining;
  • TextRank, a general graph-based ranking algorithm for natural language processing;
  • RAKE (Rapid Automatic Keyword Extraction);
  • topic models.
  • Alternatively, a semantic segmentation model may be established in advance through neural-network deep learning to extract keywords quickly.
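  • As an illustration of the graph-based option above, the following minimal TextRank sketch runs PageRank over a word co-occurrence graph. networkx is an assumed dependency, and the whitespace tokenizer is a stand-in for a real word segmenter with part-of-speech filtering.

```python
# Minimal TextRank sketch (assumed dependency: networkx).
import networkx as nx

def textrank_keywords(tokens, window=2, top_k=5):
    graph = nx.Graph()
    # Words that co-occur within `window` positions share an edge.
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if word != other:
                graph.add_edge(word, other)
    scores = nx.pagerank(graph)   # TextRank is PageRank on the word graph
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

tokens = "first time account opening invest high return low risk payment".split()
print(textrank_keywords(tokens))
```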
  • Step 34: Obtain associated images and voice based on the keywords.
  • Step 35: Process the images and voice to form a video for introducing the application.
  • Steps 34–35 are the same as or similar to the corresponding steps of the foregoing embodiment and will not be repeated here.
  • FIG. 4 is a schematic flowchart of a fourth embodiment of the application introduction method provided by the present application.
  • The method is implemented on a mobile terminal and includes:
  • Step 41: Obtain introduction requirement information about the application, where the introduction requirement information indicates the requirement for introducing the application.
  • Step 42: Extract keywords from the introduction requirement information.
  • Steps 41–42 are the same as or similar to the corresponding steps of the foregoing embodiments and will not be repeated here.
  • Step 43: Send the keywords to the server, so that the server generates associated images and voice based on the keywords.
  • Deep learning based on a convolutional neural network can obtain the images and voice associated with the keywords.
  • Alternatively, the images may be obtained by the server while the voice is produced by the mobile terminal, which processes the keywords to generate passages of text matching the scene and converts them into voice.
  • Step 44: Obtain the images and voice sent by the server.
  • Step 45: Process the images and voice to form a video for introducing the application.
  • Steps 44–45 are the same as or similar to the corresponding steps of the foregoing embodiments and will not be repeated here.
  • FIG. 5 is a schematic flowchart of a fifth embodiment of the application introduction method provided by the present application. The method includes:
  • Step 51: Obtain introduction requirement information about the application, where the introduction requirement information indicates the requirement for introducing the application.
  • Step 52: Extract keywords from the introduction requirement information.
  • Step 53: Send the keywords to the server, so that the server generates associated images and voice based on the keywords.
  • Step 54: Obtain the images and voice sent by the server.
  • Steps 51–54 are the same as or similar to the corresponding steps of the foregoing embodiments and will not be repeated here.
  • Step 55: Perform image segmentation on the multiple corresponding images and extract feature information from them.
  • Image segmentation is the technique and process of dividing an image into regions with distinctive properties and extracting the objects of interest. It is a key step on the way from image processing to image analysis.
  • Existing image segmentation methods fall mainly into the following categories: threshold-based methods, region-based methods, edge-based methods, and methods based on specific theories.
  • Image segmentation is the process of dividing a digital image into disjoint regions.
  • It is also a labeling process: pixels belonging to the same region are assigned the same number.
  • The threshold-based method is a region-oriented segmentation technique whose principle is to divide the image pixels into several classes by gray level.
  • Thresholding is one of the most commonly used traditional segmentation methods. Because it is simple to implement, computationally cheap, and stable, it has become the most basic and most widely used segmentation technique. It is especially suitable for images in which the target and the background occupy different gray-level ranges. It not only greatly compresses the amount of data but also greatly simplifies analysis and processing, so in many cases it is a necessary preprocessing step before image analysis, feature extraction, and pattern recognition.
  • The purpose of thresholding is to partition the pixel set by gray level: each resulting subset forms a region corresponding to the real scene, with consistent attributes within a region and differing attributes between adjacent regions. Such a partition can be achieved by selecting one or more thresholds over the gray levels, as in the sketch below.
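  • A minimal thresholding sketch, with OpenCV as an assumed dependency and Otsu's method selecting the gray-level threshold automatically; the file names are illustrative assumptions.

```python
# Minimal threshold-based segmentation sketch (assumed dependency: OpenCV).
import cv2

gray = cv2.imread("bird.png", cv2.IMREAD_GRAYSCALE)
# Otsu's method picks the threshold from the gray-level histogram; pixels
# above it become foreground (255) and the rest background (0).
threshold, mask = cv2.threshold(gray, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("chosen threshold:", threshold)
cv2.imwrite("bird_mask.png", mask)
```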
  • The region-based method is a segmentation technique that searches for the regions directly.
  • Specific algorithms include region growing and region splitting-and-merging.
  • Region growing starts from single pixels and gradually merges them to form the required region; splitting-and-merging starts from the whole image and gradually splits it down to the required region.
  • Edge-based segmentation mainly includes point detection, line detection, and edge detection.
  • Segmentation methods based on specific theories include cluster analysis, fuzzy set theory, genetic coding, wavelet transforms, and other approaches.
  • Feature extraction is then performed based on the keywords and the scene, after which step 56 is carried out.
  • Step 56: Combine the feature information to generate multiple image frames.
  • Step 57: Form the multiple image frames into an animation.
  • Steps 55–57 may specifically be implemented as follows:
  • deep learning is performed in advance through a convolutional neural network to establish an image model, so that the corresponding feature information generates multiple image frames, which are then formed into an animation.
  • Step 58: Merge the animation and the voice to form a video for introducing the application (a minimal assembly sketch follows below).
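  • A minimal sketch of this assembly step, writing the frames as a silent animation and then merging in the voice track; OpenCV and moviepy are assumed dependencies, and the file names, frame rate, and frame count are illustrative assumptions.

```python
# Minimal animation + voice assembly sketch (assumed: OpenCV, moviepy).
import cv2
from moviepy.editor import AudioFileClip, VideoFileClip

frames = [cv2.imread(f"frame_{i:03d}.png") for i in range(24)]
height, width = frames[0].shape[:2]

# Write the silent animation at 12 frames per second.
writer = cv2.VideoWriter("animation.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"), 12, (width, height))
for frame in frames:
    writer.write(frame)
writer.release()

# Merge the synthesized voice with the animation to form the final video.
clip = VideoFileClip("animation.mp4").set_audio(AudioFileClip("voice.mp3"))
clip.write_videofile("introduction.mp4", codec="libx264", audio_codec="aac")
```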
  • FIG. 6 is a schematic flowchart of a sixth embodiment of the application introduction method provided by the present application.
  • The method is implemented on a server and includes:
  • Step 61: Acquire keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application, and the introduction requirement information indicates the requirement for introducing the application.
  • After the mobile terminal obtains the introduction requirement information about the application, it extracts the keywords and sends them to the server.
  • Step 62: Generate associated images and voice based on the keywords.
  • The server trains models on the relevant content of the application in advance, so that when keywords arrive from the mobile terminal it can respond quickly with the images and voice associated with them.
  • Step 63: Send the images and voice to the mobile terminal, so that the mobile terminal processes them to form a video for introducing the application.
  • The generated images and voice are sent to the mobile terminal, which extracts the feature information of the images and combines that feature information to generate multiple image frames.
  • The mobile terminal then forms the image frames into an animation and merges it with the voice to form the video for introducing the application.
  • In summary, the server-side application introduction method of this application includes: acquiring keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application and the introduction requirement information expresses the requirement for introducing the application; generating associated images and voice based on the keywords; and sending the images and voice to the mobile terminal so that the mobile terminal can process them to form a video for introducing the application.
  • FIG. 7 is a schematic flowchart of a seventh embodiment of the application introduction method provided by the present application. The method includes:
  • Step 71: Acquire keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application, and the introduction requirement information indicates the requirement for introducing the application.
  • Step 72: Apply deep learning to the keywords to obtain associated images from the preset image library.
  • Deep learning models include convolutional neural networks (CNN), deep belief networks (DBN), and stacked auto-encoder networks.
  • Convolutional neural networks are inspired by the structure of the visual system.
  • The first convolutional computation model was proposed in the neocognitron: exploiting local connections between neurons and hierarchically organized image transformations, neurons with the same parameters are applied at different positions of the previous layer, yielding a translation-invariant network structure. Later, building on this idea, convolutional neural networks were designed and trained with error gradients, achieving superior performance on some pattern recognition tasks.
  • A DBN can be interpreted as a Bayesian probabilistic generative model composed of multiple layers of random latent variables.
  • The top two layers have undirected symmetric connections; each lower layer receives top-down directed connections from the layer above, and the state of the lowest-layer units is the visible input data vector.
  • The DBN is composed of a stack of structural units, and the structural unit is typically an RBM (Restricted Boltzmann Machine).
  • The input samples train the first-layer RBM, whose output trains the second-layer RBM, and so on; stacking RBMs improves model performance as layers are added.
  • In the unsupervised pre-training process, after the DBN encodes the input up to the top RBM, the state of the top layer is decoded back down to the units of the bottom layer, reconstructing the input.
  • The RBMs, as structural units, share parameters with the corresponding layers of the DBN.
  • The structure of a stacked auto-encoder network is similar to that of a DBN, consisting of a stack of structural units; the difference is that the structural unit is an auto-encoder rather than an RBM.
  • An auto-encoder is a two-layer neural network: the first layer is called the coding layer and the second the decoding layer.
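  • A minimal sketch of this two-layer structural unit, with PyTorch as an assumed dependency and illustrative dimensions; training minimizes the reconstruction error, matching the reconstruction role described above.

```python
# Minimal auto-encoder sketch (assumed dependency: PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)   # coding layer
        self.decoder = nn.Linear(hidden_dim, input_dim)   # decoding layer

    def forward(self, x):
        code = torch.relu(self.encoder(x))
        return torch.sigmoid(self.decoder(code))          # reconstructed input

model = AutoEncoder()
x = torch.rand(8, 784)                 # a batch of illustrative inputs in [0, 1]
loss = F.mse_loss(model(x), x)         # reconstruction error to be minimized
print(loss.item())
```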
  • The server needs to predict the corresponding scene from the characteristics of the application and of the keywords, and retrieve the corresponding images according to that scene.
  • In other embodiments, the server may also search for images on the Internet.
  • Step 73: Apply deep learning to the keywords to generate text information that fits the keyword scene.
  • Step 74: Convert the text information into voice (a minimal text-to-speech sketch follows below).
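  • A minimal sketch of the text-to-speech conversion, using the offline pyttsx3 package as an assumed dependency; the sentence and file name are illustrative assumptions.

```python
# Minimal text-to-speech sketch (assumed dependency: pyttsx3).
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 160)   # speaking speed, in words per minute
engine.save_to_file(
    "To start, open an account and then choose a low-risk product.",
    "voice.mp3",
)
engine.runAndWait()               # blocks until the audio file is written
```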
  • Step 75: Send the images and voice to the mobile terminal, so that the mobile terminal processes them to form a video for introducing the application.
  • In other embodiments, the server first retrieves a large number of images and sends them to the mobile terminal; the mobile terminal segments the images according to the keywords, combines them according to the scene to form an animation, and then fuses the animation with the voice to form a video for introducing the application.
  • FIG. 8 is a schematic structural diagram of a first embodiment of a mobile terminal provided by the present application.
  • The mobile terminal 80 includes a processor 81 and a memory 82 connected to the processor 81; the memory 82 stores program data, and the processor 81 executes the program data to implement the following method:
  • obtaining introduction requirement information about the application, where the introduction requirement information expresses the requirement for introducing the application; extracting the keywords from the introduction requirement information; obtaining associated images and voice based on the keywords; and processing the images and voice to form a video for introducing the application.
  • The processor 81 is further configured to execute the program data to implement the following: performing speech recognition on the audio information to obtain text information, and performing keyword extraction on the text information to obtain the keywords.
  • The processor 81 is further configured to implement: performing semantic segmentation on the text information and obtaining the keywords based on the segmentation result.
  • The processor 81 is further configured to implement: inputting the text information into a convolutional neural network for deep learning, so as to perform semantic segmentation on the text information and obtain the keywords.
  • The processor 81 is further configured to implement: sending the keywords to the server so that the server generates associated images and voice based on the keywords, and acquiring the images and voice sent by the server.
  • The processor 81 is further configured to implement: performing image segmentation on the multiple corresponding images and extracting feature information from them; combining the feature information to generate multiple image frames; forming the image frames into an animation; and merging the animation with the voice to form a video for introducing the application.
  • The processor 81 is further configured to implement: acquiring background music sent by the server, where the background music is generated by the server based on the keywords, and adding the background music to the video.
  • FIG. 9 is a schematic structural diagram of a first embodiment of a server provided by the present application.
  • The server 90 includes a processor 91 and a memory 92 connected to the processor 91; the memory 92 stores program data, and the processor 91 executes the program data to implement the following method:
  • acquiring keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application and the introduction requirement information indicates the requirement for introducing the application; generating associated images and voice based on the keywords; and sending the images and voice to the mobile terminal so that the mobile terminal can process them to form a video for introducing the application.
  • The processor 91 is further configured to execute the program data to implement: applying deep learning to the keywords to obtain associated images from a preset image library.
  • The processor 91 is further configured to implement: applying deep learning to the keywords to generate text information that fits the keyword scene, and converting the text information into voice.
  • FIG. 10 is a schematic structural diagram of an embodiment of a computer storage medium provided by the present application.
  • The computer storage medium 100 is used to store program data 101.
  • When the program data 101 is executed by a processor, it implements the following methods:
  • obtaining introduction requirement information about the application, where the introduction requirement information expresses the requirement for introducing the application; extracting the keywords from the introduction requirement information; obtaining associated images and voice based on the keywords; and processing the images and voice to form a video for introducing the application;
  • or: acquiring keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application and the introduction requirement information indicates the requirement for introducing the application; generating associated images and voice based on the keywords; and sending the images and voice to the mobile terminal so that the mobile terminal can process them to form a video for introducing the application.
  • The computer storage medium can be applied to the above-mentioned mobile terminal or server to implement the method of any one of the above embodiments.
  • FIG. 11 is a schematic structural diagram of a second embodiment of a mobile terminal provided by the present application.
  • The mobile terminal 110 includes an acquisition module 111, an extraction module 112, and a processing module 113.
  • The acquisition module 111 is used to obtain introduction requirement information about the application, where the introduction requirement information indicates the requirement for introducing the application.
  • The extraction module 112 is used to extract keywords from the introduction requirement information.
  • The acquisition module 111 is also used to obtain associated images and voice based on the keywords.
  • The processing module 113 is used to process the images and voice to form a video for introducing the application.
  • FIG. 12 is a schematic structural diagram of a second embodiment of a server provided by the present application.
  • The server 120 includes an acquisition module 121, a processing module 122, and a sending module 123.
  • The acquisition module 121 is used to acquire keywords sent by the mobile terminal, where the keywords are extracted by the mobile terminal based on acquired introduction requirement information about the application, and the introduction requirement information indicates the requirement for introducing the application.
  • The processing module 122 is configured to generate associated images and voice based on the keywords.
  • The sending module 123 is configured to send the images and voice to the mobile terminal, so that the mobile terminal processes them to form a video for introducing the application.
  • The disclosed method and device may be implemented in other ways.
  • The device implementations described above are only illustrative.
  • The division into modules or units is only a division by logical function; other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • The functional units in the various embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
  • The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, or in whole or in part, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include media that can store program code, such as USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, and optical discs.

Abstract

The present invention relates to an application introduction method, a mobile terminal, and a server. The method comprises the steps of: obtaining introduction requirement information about an application, the introduction requirement information being used to indicate a requirement for introducing the application (11); extracting a keyword from the introduction requirement information (12); obtaining an associated image and voice on the basis of the keyword (13); and processing the image and voice to form a video for introducing the application (14). On the one hand, the method can adapt to different user groups, so that the application meets the requirements of more user groups. On the other hand, the application is introduced in the form of an animation, making the introduction more personalized and more interesting, thereby improving the user experience.
PCT/CN2019/104000 2019-09-02 2019-09-02 Application introduction method, mobile terminal and server WO2021042234A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980010315.5A CN111801673A (zh) 2019-09-02 2019-09-02 应用程序的介绍方法、移动终端及服务器
PCT/CN2019/104000 WO2021042234A1 (fr) 2019-09-02 2019-09-02 Procédé d'introduction d'application, terminal mobile et serveur

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/104000 WO2021042234A1 (fr) 2019-09-02 2019-09-02 Procédé d'introduction d'application, terminal mobile et serveur

Publications (1)

Publication Number Publication Date
WO2021042234A1 true WO2021042234A1 (fr) 2021-03-11

Family

ID=72805590

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/104000 WO2021042234A1 (fr) 2019-09-02 2019-09-02 Procédé d'introduction d'application, terminal mobile et serveur

Country Status (2)

Country Link
CN (1) CN111801673A (fr)
WO (1) WO2021042234A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117041627B (zh) * 2023-09-25 2024-03-19 宁波均联智行科技股份有限公司 Vlog视频生成方法及电子设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731959A (zh) * 2015-04-03 2015-06-24 北京威扬科技有限公司 基于文本的网页内容生成视频摘要的方法、装置及系统
CN106547748A (zh) * 2015-09-16 2017-03-29 中国移动通信集团公司 一种app索引库的创建方法及装置、搜索app的方法及装置
CN108965737A (zh) * 2017-05-22 2018-12-07 腾讯科技(深圳)有限公司 媒体数据处理方法、装置及存储介质
CN109145152A (zh) * 2018-06-28 2019-01-04 中山大学 一种基于查询词的自适应智能生成图文视频缩略图方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104820546B (zh) * 2014-12-30 2018-02-13 广州酷狗计算机科技有限公司 功能信息展示方法和装置
CN106648675A (zh) * 2016-12-28 2017-05-10 乐蜜科技有限公司 应用程序使用信息的展示方法、装置和电子设备
CN106919317A (zh) * 2017-02-27 2017-07-04 珠海市魅族科技有限公司 一种信息展示方法及系统

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731959A (zh) * 2015-04-03 2015-06-24 北京威扬科技有限公司 基于文本的网页内容生成视频摘要的方法、装置及系统
CN106547748A (zh) * 2015-09-16 2017-03-29 中国移动通信集团公司 一种app索引库的创建方法及装置、搜索app的方法及装置
CN108965737A (zh) * 2017-05-22 2018-12-07 腾讯科技(深圳)有限公司 媒体数据处理方法、装置及存储介质
CN109145152A (zh) * 2018-06-28 2019-01-04 中山大学 一种基于查询词的自适应智能生成图文视频缩略图方法

Also Published As

Publication number Publication date
CN111801673A (zh) 2020-10-20

Similar Documents

Publication Publication Date Title
CN110427617B (zh) 推送信息的生成方法及装置
CN108197111B (zh) 一种基于融合语义聚类的文本自动摘要方法
CN109582952B (zh) 诗歌生成方法、装置、计算机设备和介质
CN108538286A (zh) 一种语音识别的方法以及计算机
US20130262114A1 (en) Crowdsourced, Grounded Language for Intent Modeling in Conversational Interfaces
CN111160452A (zh) 一种基于预训练语言模型的多模态网络谣言检测方法
CN110347790B (zh) 基于注意力机制的文本查重方法、装置、设备及存储介质
CN110287314B (zh) 基于无监督聚类的长文本可信度评估方法及系统
CN110472043B (zh) 一种针对评论文本的聚类方法及装置
CN110807324A (zh) 一种基于IDCNN-crf与知识图谱的影视实体识别方法
CN111506794A (zh) 一种基于机器学习的谣言管理方法和装置
CN110493612B (zh) 弹幕信息的处理方法、服务器及计算机可读存储介质
CN112883731A (zh) 内容分类方法和装置
EP4060548A1 (fr) Procédé et dispositif de présentation d'informations d'invite et support d'informations
CN115470344A (zh) 一种基于文本聚类的视频弹幕与评论主题融合的方法
CN109635303B (zh) 特定领域意义改变词的识别方法
CN113407842B (zh) 模型训练方法、主题推荐理由的获取方法及系统、电子设备
WO2021042234A1 (fr) Procédé d'introduction d'application, terminal mobile et serveur
CN110781327B (zh) 图像搜索方法、装置、终端设备及存储介质
CN115188376A (zh) 一种个性化语音交互方法及系统
CN114428852A (zh) 基于bert预训练模型的中文文本摘要抽取方法及装置
CN114328910A (zh) 文本聚类方法以及相关装置
KR20220143229A (ko) 한국어 언어 모델에 기반한 핵심문장 추출장치 및 그 방법
CN112632229A (zh) 文本聚类方法及装置
CN114722267A (zh) 信息推送方法、装置及服务器

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19943978

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19943978

Country of ref document: EP

Kind code of ref document: A1