CN104731960B - Method, apparatus and system for generating a video summary based on e-commerce webpage content - Google Patents


Info

Publication number
CN104731960B
CN104731960B CN201510156125.8A CN201510156125A CN104731960B CN 104731960 B CN104731960 B CN 104731960B CN 201510156125 A CN201510156125 A CN 201510156125A CN 104731960 B CN104731960 B CN 104731960B
Authority
CN
China
Prior art keywords
keyword
text
word
webpage
ontology
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510156125.8A
Other languages
Chinese (zh)
Other versions
CN104731960A (en
Inventor
李国祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wei Yang Science And Technology Ltd
Original Assignee
Beijing Wei Yang Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wei Yang Science And Technology Ltd
Priority to CN201510156125.8A
Publication of CN104731960A
Application granted
Publication of CN104731960B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings

Abstract

The present invention relates to the field of video generation, and more particularly to a method, apparatus and system for generating a video summary based on e-commerce webpage content. The method, apparatus and system can generate a video summary from the text content of a target e-commerce webpage and display the video summary on that webpage. When browsing the e-commerce webpage, a user can obtain buyer's guide information by watching the video summary. Compared with existing e-commerce websites, which introduce products through pictures and text, this saves the user the time cost of reading the buyer's guide on the e-commerce webpage.

Description

Method, apparatus and system for generating a video summary based on e-commerce webpage content
Technical field
The present invention relates to the field of video generation, and in particular to a method, apparatus and system for generating a video summary based on e-commerce webpage content.
Background art
An e-commerce website is a website established on the Internet by an enterprise, an institution or an individual. It is the infrastructure and information platform with which the enterprise, institution or individual conducts e-commerce, the interactive window through which e-commerce is carried out, and one of the means of engaging in e-commerce.
Existing e-commerce websites generally present products through text and pictures. Users now obtain buyer's guide content through a growing number of channels, such as mobile phones, tablet computers and televisions. Existing e-commerce applications that obtain the buyer's guide from e-commerce websites are likewise mostly based on text and pictures.
The shortcoming of the buyer's guide on existing e-commerce websites is that, in an age of fast content consumption, the time cost of reading text is relatively high for users, which makes it harder for e-commerce websites to introduce products to users through text.
Summary of the invention
The object of the present invention is to provide a method, apparatus and system for generating a video summary based on e-commerce webpage content, which introduce the products on a webpage to the user by generating a video summary, so as to save the user the time cost of reading the buyer's guide on the e-commerce webpage.
In a first aspect, an embodiment of the present invention provides a method for generating a video summary based on e-commerce webpage content, comprising: extracting a text summary of the text content of a target e-commerce webpage; parsing the text summary to obtain the keywords in the text summary; performing semantic analysis on the keywords to obtain the ontology descriptions of the keywords; based on the ontology descriptions of the keywords, retrieving corresponding pictures or videos from the Internet to form background images; based on the ontology descriptions of the keywords, obtaining animation templates corresponding to the keywords from a pre-established grammar database; converting the text summary into audio data; and, according to preset rendering rules, compositing and rendering the background images, the animation templates and the audio data into a video file.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, wherein extracting the text summary of the text content of the target e-commerce webpage comprises: obtaining the e-commerce webpage based on a web link; removing the additional information in the e-commerce webpage, wherein the additional information includes one or more of the following: advertisements, pictures, videos, frames and charts; extracting the text content of the e-commerce webpage after the additional information has been removed; and extracting key sentences from the text content to form the text summary.
With reference to the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, wherein extracting key sentences from the text content to form the text summary comprises: calculating, in turn, the similarity between every two sentences in the text content; classifying the sentences in the text content according to the calculated similarities; according to the result of the classification, extracting one sentence from each class and combining the extracted sentences to obtain candidate summaries; and choosing, from the candidate summaries, the candidate whose length differs least from a preset text summary length as the text summary of the e-commerce webpage, wherein the preset text summary length is determined from the length of the video to be generated and a preset read-aloud speed for the text summary.
With reference to the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, wherein parsing the text summary to obtain the keywords in the text summary comprises: segmenting the text summary into words; comparing the words obtained by segmentation with the word templates in the grammar database to determine the part of speech of each word; and, according to the part-of-speech results, choosing the nouns and numerals among the segmented words as the keywords of the text summary.
With reference to the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein performing semantic analysis on the keywords to obtain the ontology descriptions of the keywords comprises: looking up each keyword in the grammar database to obtain all ontology descriptions related to the keyword; and using the Web Ontology Language (OWL) to determine, from all ontology descriptions of the keyword, the ontology description of the keyword in the current context.
With reference to the first aspect, an embodiment of the present invention provides a sixth possible implementation of the first aspect, wherein compositing and rendering the background images, the animation templates and the audio data into a video file according to preset rendering rules comprises: setting a mapping between each keyword in the audio data and the background image and animation template corresponding to that keyword; and compositing and rendering the background images, the animation templates and the audio data according to the mapping.
In a second aspect, an embodiment of the present invention further provides an apparatus for generating a video summary based on e-commerce webpage content, comprising: an extraction module, for extracting a text summary of the text content of a target e-commerce webpage; a keyword acquisition module, for parsing the text summary to obtain the keywords in the text summary; a semantic analysis module, for performing semantic analysis on the keywords to obtain the ontology descriptions of the keywords; a background image forming module, for retrieving corresponding pictures or videos from the Internet based on the ontology descriptions of the keywords to form background images; an animation template acquisition module, for obtaining animation templates corresponding to the keywords from a preset grammar database based on the ontology descriptions of the keywords; an audio conversion module, for converting the text summary into audio data; and a video composition module, for compositing and rendering the background images, the animation templates and the audio data into a video file according to preset rendering rules.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation of the second aspect, wherein the keyword acquisition module comprises: a word segmentation unit, for segmenting the text summary into words; a part-of-speech determining unit, for comparing the words obtained by segmentation with the word templates in the grammar database to determine the part of speech of each word; and a keyword choosing unit, for choosing, according to the part-of-speech results, the nouns and numerals among the segmented words as the keywords of the text summary.
In a third aspect, an embodiment of the present invention further provides a system for generating a video summary based on e-commerce webpage content, comprising: a user terminal and an e-commerce server connected to the user terminal via the Internet; the e-commerce server includes the apparatus for generating a video summary based on e-commerce webpage content described in the second aspect and in the first possible implementation of the second aspect.
The method, apparatus and system for generating a video summary based on e-commerce webpage content provided by the embodiments of the present invention can generate a video summary from the text content of a target e-commerce webpage and display the video summary on that webpage. When browsing the e-commerce webpage, a user can obtain buyer's guide information by watching the video summary. Compared with existing e-commerce websites, which introduce products through pictures and text, this saves the user the time cost of reading the buyer's guide on the e-commerce webpage.
To make the above objects, features and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present invention and should therefore not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows a schematic flowchart of the method for generating a video summary based on e-commerce webpage content provided by Embodiment 1 of the present invention;
Fig. 2 shows a schematic flowchart of the method for generating a video summary based on e-commerce webpage content provided by Embodiment 2 of the present invention;
Fig. 3 shows a schematic flowchart of the method for generating a video summary based on e-commerce webpage content provided by Embodiment 3 of the present invention;
Fig. 4 shows a schematic flowchart of the method for generating a video summary based on e-commerce webpage content provided by Embodiment 4 of the present invention;
Fig. 5 shows a schematic flowchart of the method for generating a video summary based on e-commerce webpage content provided by Embodiment 5 of the present invention;
Fig. 6 shows a schematic structural diagram of the apparatus for generating a video summary based on e-commerce webpage content provided by Embodiment 6 of the present invention;
Fig. 7 shows a schematic structural diagram of the keyword acquisition module in the apparatus for generating a video summary based on e-commerce webpage content provided by Embodiment 7 of the present invention;
Fig. 8 shows a connection diagram of the system for generating a video summary based on e-commerce webpage content provided by Embodiment 8 of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Embodiment 1:
Embodiment 1 provides a method for generating a video summary based on e-commerce webpage content. Its flowchart is shown in Fig. 1, and the main processing steps include:
Step S101: extract the text summary of the text content of the target e-commerce webpage.
The text that introduces product information on an e-commerce webpage may not be concise enough, so the user needs more time to obtain the buyer's guide information on the webpage. Extracting a text summary of the webpage's text content makes it possible to introduce the products on the e-commerce webpage to the user more concisely while remaining relatively complete.
In addition to the text that introduces product information, an e-commerce webpage usually contains other additional information, such as advertisements, pictures, videos, frames and/or charts. This additional information is not effective content of the product introduction, so it can be removed from the e-commerce webpage before the text summary of the webpage's text content is extracted.
Step S102: parse the text summary to obtain the keywords in the text summary.
The keywords in the text summary carry the key information of the buyer's guide, and extracting them is a simple way to obtain that key information. This step obtains the keywords of the buyer's guide and provides keyword information for the subsequent steps.
Step S103: perform semantic analysis on the keywords to obtain the ontology descriptions of the keywords.
An ontology is an explicit specification of a conceptualization. It gives the basic terms and relations that make up the vocabulary of a domain, together with the rules for combining these terms and relations to define extensions of the vocabulary. An ontology description provides a basic description of a product; for example, the ontology description of "shirt" is "apparel". A word may have several ontology descriptions; for example, the ontology description of "apple" can be "fruit" or "company". It is therefore necessary to determine the ontology description of the keyword in the current context. This step performs semantic analysis on the keywords to obtain their ontology descriptions, so that the subsequent steps can correctly generate the video summary from the e-commerce webpage content.
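The disambiguation idea for "apple" can be sketched as follows. This is a minimal illustration that assumes a simple heuristic: the candidate description whose cue words overlap most with the surrounding words wins. The `ONTOLOGY` table, its cue words and the function name are hypothetical and stand in for whatever the grammar database actually stores.

```python
# Hypothetical ontology table: keyword -> {ontology description: cue words}.
ONTOLOGY = {
    "apple": {
        "fruit": {"fresh", "sweet", "juicy", "orchard"},
        "company": {"phone", "laptop", "screen", "battery"},
    },
    "shirt": {"apparel": set()},
}


def ontology_description(keyword, context_words):
    """Return the candidate description whose cue words overlap the context most."""
    candidates = ONTOLOGY.get(keyword)
    if not candidates:
        return None
    context = set(context_words)
    # max() keeps the first candidate on ties, so table order is the fallback.
    return max(candidates, key=lambda desc: len(candidates[desc] & context))


print(ontology_description("apple", ["sweet", "juicy", "apple"]))
print(ontology_description("apple", ["apple", "phone", "battery"]))
print(ontology_description("shirt", ["cotton", "shirt"]))
```

A keyword with a single description, such as "shirt", needs no context at all, which is why its cue-word set can stay empty.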
Step S104: based on the ontology descriptions of the keywords, retrieve corresponding pictures or videos from the Internet to form background images.
Synthesizing a video summary requires material. According to the ontology description of each keyword, corresponding pictures or videos are retrieved with an Internet search engine to form background images, which serve as material for synthesizing the video summary in the subsequent steps.
Step S105: based on the ontology descriptions of the keywords, obtain the animation templates corresponding to the keywords from the pre-established grammar database.
The grammar database stores the animation templates corresponding to the ontology descriptions of different words. According to the ontology description of a keyword, the corresponding animation template is obtained from the grammar database. Combining the templates corresponding to the different keywords yields the way in which the complete video summary is composed.
Step S106: convert the text summary into audio data.
The text summary is converted into audio data using suitable software, and this audio data is used as the audio material of the video summary. In the video summary the products are introduced to the user through audio, which is easier to take in than a text introduction and saves the user time.
Step S107: according to preset rendering rules, composite and render the background images, the animation templates and the audio data into a video file.
Rendering the video file relies on suitable software and rendering rules. According to the preset rendering rules, the background images, the animation templates and the audio data are composited and rendered into a video file. During compositing, a mapping is set between each keyword in the audio data and the background image and animation template corresponding to that keyword, and the background images, animation templates and audio data are composited and rendered according to this mapping. For example, if keyword 1 occurs at the 3rd second of the audio and the next keyword occurs at the 5th second, the background image corresponding to keyword 1 is displayed according to its animation template between the 3rd and the 5th second. In this way the audio and the images of the video summary match, and the products are introduced to the user more effectively.
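The keyword-to-interval mapping in this example can be sketched as below. The function name and the dictionary layout are hypothetical, and closing the last interval at the end of the audio is an assumption; the text only describes the interval between two consecutive keywords.

```python
def build_timeline(keyword_times, audio_length):
    """Map each keyword to the interval from its onset to the next keyword's onset.

    keyword_times: list of (keyword, onset_seconds) pairs, in any order.
    audio_length: total audio length in seconds; assumed to close the last interval.
    """
    ordered = sorted(keyword_times, key=lambda item: item[1])
    timeline = []
    for i, (keyword, start) in enumerate(ordered):
        end = ordered[i + 1][1] if i + 1 < len(ordered) else audio_length
        timeline.append({"keyword": keyword, "start": start, "end": end})
    return timeline


for segment in build_timeline([("keyword 1", 3.0), ("keyword 2", 5.0)], audio_length=8.0):
    print(segment)
```

Each entry would then tell the renderer which background image and animation template to show during that interval.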
The method for generating a video summary based on e-commerce webpage content provided by Embodiment 1 can generate a video summary from the text content of a target e-commerce webpage and display the video summary on that webpage. When browsing the e-commerce webpage, a user can obtain buyer's guide information by watching the video summary. Compared with existing e-commerce websites, which introduce products through pictures and text, this saves the user the time cost of reading the buyer's guide on the e-commerce webpage.
Embodiment 2:
On the basis of Embodiment 1, Embodiment 2 provides a preferred method for generating a video summary based on e-commerce webpage content. Its flowchart is shown in Fig. 2, and the main steps include:
Step S201: obtain the e-commerce webpage based on a web link.
The web link address may be the address of the current e-commerce webpage, sent to the e-commerce server when the user visits that webpage. It may also be the address of an e-commerce webpage describing a product, obtained by the e-commerce server scanning the corresponding e-commerce website. Based on the obtained web link, the e-commerce server obtains the corresponding e-commerce webpage information.
Step S202: remove the additional information in the e-commerce webpage, wherein the additional information includes one or more of the following: advertisements, pictures, videos, frames and charts.
Besides the text introduction of the product, the e-commerce webpage obtained by the e-commerce server may contain other, unrelated additional information, such as advertisements, pictures, videos, frames and charts. This additional information does not help in understanding the product information, so step S202 removes it from the e-commerce webpage.
Step S203: extract the text content of the e-commerce webpage after the additional information has been removed.
After the additional information on the e-commerce webpage has been removed, the e-commerce server obtains the text that introduces the product, so that the subsequent steps can generate the video summary from the corresponding text on the e-commerce webpage.
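Steps S202 and S203 can be sketched with Python's standard `html.parser` module. Which tags count as additional information, and the `ad` class used to mark advertisements, are assumptions made for this illustration; a real page would need its own rules. The sketch also assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Tags whose content is treated as additional information (assumption).
SKIP_TAGS = {"script", "style", "iframe", "video", "figure", "svg", "canvas"}
# Elements without a closing tag; they must not change the nesting depth.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link", "source", "embed"}


class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or tag in SKIP_TAGS or "ad" in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text fragments that remain after the skipped elements are dropped."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


PAGE = """
<html><body>
  <div class="ad"><p>Buy now, 50% off!</p></div>
  <h1>Cotton shirt</h1>
  <img src="shirt.jpg">
  <p>Made of 100% cotton.</p>
  <iframe src="promo.html"></iframe>
  <p>Available in 3 colors.</p>
</body></html>
"""
print(extract_text(PAGE))
```

The advertisement block, the picture and the frame are dropped, and only the three text fragments that describe the product remain.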
Step S204: extract key sentences from the text content to form the text summary.
The buyer's guide on an e-commerce webpage may not be concise enough and may contain a lot of text, so the user needs more time to obtain the buyer's guide information on the webpage. Key sentences are therefore extracted from the text content to form a text summary, which introduces the products on the e-commerce webpage to the user more concisely and saves the user the time cost of obtaining product information.
Step S205: parse the text summary to obtain the keywords in the text summary.
This step obtains the keywords of the buyer's guide and provides keyword information for the subsequent steps.
Step S206: perform semantic analysis on the keywords to obtain the ontology descriptions of the keywords.
This step performs semantic analysis on the keywords to obtain their ontology descriptions, so that the subsequent steps can correctly generate the video summary from the e-commerce webpage content.
Step S207: based on the ontology descriptions of the keywords, retrieve corresponding pictures or videos from the Internet to form background images.
This step obtains the background images, which serve as material for synthesizing the video summary in the subsequent steps.
Step S208: based on the ontology descriptions of the keywords, obtain the animation templates corresponding to the keywords from the pre-established grammar database.
This step obtains the way in which the video summary is composed.
Step S209: convert the text summary into audio data.
This step converts the text summary into audio data, which is used as the audio material of the video summary.
Step S210: according to preset rendering rules, composite and render the background images, the animation templates and the audio data into a video file.
This step renders and generates the video file.
Compared with the method provided by Embodiment 1, the method for generating a video summary based on e-commerce webpage content provided by Embodiment 2 works in the same way and has the same advantages, which are not repeated here.
Embodiment 3:
On the basis of Embodiment 2, Embodiment 3 provides a preferred method for generating a video summary based on e-commerce webpage content. Its flowchart is shown in Fig. 3, and the main steps include:
Step S301: obtain the e-commerce webpage based on a web link.
This step obtains the corresponding e-commerce webpage.
Step S302: remove the additional information in the e-commerce webpage, wherein the additional information includes one or more of the following: advertisements, pictures, videos, frames and charts.
This step removes the additional information on the e-commerce webpage.
Step S303: extract the text content of the e-commerce webpage after the additional information has been removed.
This step obtains the text that introduces the product.
Step S304: calculate, in turn, the similarity between every two sentences in the text content.
Similar sentences generally contain similar information. To introduce the products on the e-commerce webpage to the user concisely and completely, the sentences in the text content can be classified by similarity and one sentence extracted from each class.
Specifically, the similarity between two sentences is calculated as follows:
First, count the number of words shared by the current two sentences.
That is, count all words that appear in both of the current two sentences. The larger this count, the greater the similarity between the two sentences is considered to be.
Second, divide this count by the average length of the current two sentences to obtain their similarity.
In this method, the length of a sentence is defined as the number of characters in the sentence. Dividing the number of words that appear in both sentences by the average character count of the two sentences gives the similarity of the two sentences: the more words two sentences share and the shorter their average length, the greater their similarity is considered to be. This method makes it easy to obtain the similarity between two sentences. For example, suppose two sentences in the text content are sentence 1 and sentence 2. Sentence 1 contains 4 words, each 2 characters long: word 1, word 2, word 3 and word 4. Sentence 2 contains 6 words, each 2 characters long: word 3, word 4, word 5, word 6, word 7 and word 8. Sentence 1 and sentence 2 share 2 words, word 3 and word 4. The length of sentence 1 is 8 characters and the length of sentence 2 is 12 characters, so their average length is 10 characters. The similarity of sentence 1 and sentence 2 is therefore 2 / 10 = 0.2.
With the above method, the similarity between every two sentences in the text content can be calculated.
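The similarity measure and the worked example can be reproduced with a short sketch. Sentences are represented as lists of already segmented words, and the two-character placeholder words `w1` to `w8` stand in for word 1 to word 8; the function name is hypothetical.

```python
def sentence_similarity(words_a, words_b):
    """Shared word count divided by the mean sentence length in characters."""
    shared = set(words_a) & set(words_b)
    len_a = sum(len(word) for word in words_a)
    len_b = sum(len(word) for word in words_b)
    mean_len = (len_a + len_b) / 2
    return len(shared) / mean_len if mean_len else 0.0


sentence_1 = ["w1", "w2", "w3", "w4"]              # 4 words, 8 characters
sentence_2 = ["w3", "w4", "w5", "w6", "w7", "w8"]  # 6 words, 12 characters
print(sentence_similarity(sentence_1, sentence_2))  # 0.2
```

Because the measure divides a word count by a character count, its values are only meaningful relative to one another, which is all the classification in the next step needs.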
Step S305: classify the sentences in the text content according to the calculated similarities.
All sentences are classified according to the results calculated in step S304. For example, if the similarity between sentence 1 and sentence 2 is greater than the similarity between sentence 1 and every other sentence, and is also greater than the average similarity between sentences, then sentence 1 and sentence 2 are placed in the same class; otherwise, sentence 1 and sentence 2 are placed in different classes. After classification, the sentences in the same class can be regarded as expressing the same meaning. Extracting one sentence from every class makes it possible to present the product summary information on the e-commerce webpage to the user completely and concisely, saving the user the time cost of obtaining the buyer's guide.
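One way to read this rule is sketched below: each sentence is merged with its single most similar sentence when that similarity also exceeds the mean pairwise similarity. Merging classes transitively with a union-find structure is an interpretation made for this sketch, not something the text specifies, and the function names are hypothetical.

```python
from itertools import combinations


def sentence_similarity(words_a, words_b):
    """Shared word count divided by the mean sentence length in characters."""
    mean_len = (sum(map(len, words_a)) + sum(map(len, words_b))) / 2
    return len(set(words_a) & set(words_b)) / mean_len if mean_len else 0.0


def classify(sentences):
    """Group sentence indices; sentences are lists of segmented words."""
    n = len(sentences)
    sim = {pair: sentence_similarity(sentences[pair[0]], sentences[pair[1]])
           for pair in combinations(range(n), 2)}
    mean_sim = sum(sim.values()) / len(sim) if sim else 0.0

    def s(i, j):
        return sim[(min(i, j), max(i, j))]

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        others = [j for j in range(n) if j != i]
        if not others:
            continue
        best = max(others, key=lambda j: s(i, j))
        # Strictly greater than every other similarity of sentence i ...
        unique_best = all(s(i, best) > s(i, j) for j in others if j != best)
        # ... and greater than the mean similarity over all sentence pairs.
        if unique_best and s(i, best) > mean_sim:
            parent[find(i)] = find(best)

    classes = {}
    for i in range(n):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())


sentences = [
    ["cotton", "shirt", "soft"],
    ["shirt", "cotton", "breathable"],
    ["free", "shipping", "today"],
    ["shipping", "free", "worldwide"],
]
print(classify(sentences))
```

In the sample, the two sentences about the shirt form one class and the two about shipping form another.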
Step S306: according to the result of the classification, extract one sentence from each class and combine the extracted sentences to obtain candidate summaries.
The text content obtained earlier can be classified according to the similarity between sentences, which divides its sentences into several classes. Each class may contain more than one sentence with similar content. If no summary were extracted from them, the product introduction would be cumbersome. Extracting one sentence from each class yields a candidate summary that introduces the products on the e-commerce webpage to the user completely and concisely. Because a class may contain several sentences, there are several possible candidate summaries, and a suitable one has to be selected in the subsequent step.
Step S307: choose, from the candidate summaries, the candidate whose length differs least from the preset text summary length as the text summary of the e-commerce webpage, wherein the preset text summary length is determined from the length of the video to be generated and the preset read-aloud speed for the text summary.
The preset text summary length is determined from the length of the video to be generated and the preset read-aloud speed for the text summary. For example, if the video length is set to 1 minute and the read-aloud speed to 120 characters per minute, the preset text summary length is 120 characters. A suitable text summary has to be selected from the candidate summaries obtained in step S306. The candidate whose length differs least from the preset text summary length is chosen as the text summary of the e-commerce webpage. When several candidates have the same length, the candidate that extracts the earliest sentences is used. For example, suppose the sentences fall into two classes, with sentence 1 and sentence 3 in one class and sentence 2 and sentence 4 in the other. If the character count of sentence 1 plus sentence 2 differs least from the preset text summary length, and the length of sentence 1 plus sentence 2 equals the length of sentence 3 plus sentence 4, then, because sentence 1 is the sentence that appears first in the text, the text summary consists of sentence 1 and sentence 2. The text summary obtained in this step introduces the products on the target e-commerce webpage to the user completely and concisely.
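Steps S306 and S307 can be sketched together: enumerate one sentence per class, then keep the combination whose total length is closest to the preset length, preferring earlier sentences on ties. The function names are hypothetical, and enumerating every combination is a simplification that only suits small inputs.

```python
from itertools import product


def preset_summary_length(video_seconds, chars_per_minute):
    """Preset length in characters, e.g. 60 s at 120 characters/minute gives 120."""
    return round(video_seconds / 60 * chars_per_minute)


def choose_summary(sentences, classes, target_length):
    """sentences: strings in document order; classes: lists of sentence indices."""
    best_key, best_pick = None, []
    for combo in product(*classes):      # one sentence index from each class
        picked = sorted(combo)           # restore document order
        length = sum(len(sentences[i]) for i in picked)
        # Smaller difference first; on ties, the earlier sentence indices win.
        key = (abs(length - target_length), picked)
        if best_key is None or key < best_key:
            best_key, best_pick = key, picked
    return [sentences[i] for i in best_pick]


# Sentences 1 and 3 form one class, sentences 2 and 4 the other (0-based indices).
sentences = ["aaaa", "bbbbbb", "cccccc", "dddd"]
classes = [[0, 2], [1, 3]]
print(preset_summary_length(60, 120))
print(choose_summary(sentences, classes, target_length=10))
```

In the sample, sentences 1 and 2 and sentences 3 and 4 both total 10 characters, so the tie is resolved in favour of sentences 1 and 2, matching the example in the text.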
Step S308: parse the text summary to obtain the keywords in the text summary.
This step obtains the keywords of the buyer's guide and provides keyword information for the subsequent steps.
Step S309: perform semantic analysis on the keywords to obtain the ontology descriptions of the keywords.
This step performs semantic analysis on the keywords to obtain their ontology descriptions, so that the subsequent steps can correctly generate the video summary from the e-commerce webpage content.
Step S310: based on the ontology descriptions of the keywords, retrieve corresponding pictures or videos from the Internet to form background images.
This step obtains the background images, which serve as material for synthesizing the video summary in the subsequent steps.
Step S311: based on the ontology descriptions of the keywords, obtain the animation templates corresponding to the keywords from the pre-established grammar database.
This step obtains the way in which the video summary is composed.
Step S312: convert the text summary into audio data.
This step converts the text summary into audio data, which is used as the audio material of the video summary.
Step S313: according to preset rendering rules, composite and render the background images, the animation templates and the audio data into a video file.
This step renders and generates the video file.
Embodiment 4:
On the basis of embodiment 1, embodiment 4 provides a preferred method for generating a video summary based on e-commerce webpage content. Its flowchart is shown in Fig. 4, and the main steps include:
Step S401, extract the text summary of the text content of the target e-commerce webpage.
This step obtains the text summary, which introduces the commodity on the target e-commerce webpage to the user completely and concisely.
Step S402, segment the text summary into words;
Chinese sentences use the Chinese character as the basic unit and carry no word-boundary information of the kind found in English sentences, so the text summary is first segmented to obtain the words it contains.
Step S403, compare the words obtained from segmentation with the word templates in a preset grammar database to determine the part of speech of each word;
Word templates are stored in the grammar database. Comparing each word obtained from segmentation with the word templates in the preset grammar database determines its part of speech, that is, whether the word is a noun, verb, numeral, measure word, pronoun, adjective, adverb, preposition, conjunction, auxiliary word, onomatopoeia or interjection. Function-like words such as adverbs, prepositions, conjunctions, auxiliary words, onomatopoeia and interjections generally carry no key information, so determining the part of speech in this way allows the keywords to be obtained more quickly.
Step S404, according to the part-of-speech result, select the nouns and numerals from the segmented words as the keywords of the text summary.
In an e-commerce webpage, the keywords of a commodity introduction are nouns and numerals: nouns describe the name and category of the commodity, and numerals describe its size, weight and price. Extracting the nouns and numerals from the text summary therefore yields the key information of the commodity introduction.
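Steps S403 and S404 can be sketched as a dictionary lookup followed by a part-of-speech filter. The sketch below assumes segmentation (step S402) has already produced a list of tokens; the `WORD_TEMPLATES` table is a hypothetical stand-in for the word templates in the grammar database, and its entries are invented for illustration.

```python
# Hypothetical word templates: word -> part of speech.
# In the method described above these would come from the grammar database.
WORD_TEMPLATES = {
    "shirt": "noun",
    "cotton": "noun",
    "199": "numeral",
    "yuan": "measure word",
    "very": "adverb",
    "soft": "adjective",
    "is": "verb",
}

# Parts of speech kept as keywords (step S404).
KEYWORD_POS = {"noun", "numeral"}


def extract_keywords(tokens, templates=WORD_TEMPLATES):
    """Return nouns and numerals from tokens, in order, without repeats."""
    keywords = []
    for token in tokens:
        pos = templates.get(token)  # step S403: look up the part of speech
        if pos in KEYWORD_POS and token not in keywords:
            keywords.append(token)
    return keywords
```

Words missing from the template table are simply skipped here; a real system would need a fallback for unknown words, which the source does not describe.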
Step S405, perform semantic analysis on the keywords to obtain the keyword ontology semantics.
This step obtains the ontology semantics of the keywords through semantic analysis, so that the video summary can be generated correctly from the e-commerce webpage content in the subsequent steps.
Step S406, based on the keyword ontology semantics, retrieve the corresponding pictures or videos from the internet to form a background image;
This step obtains the background image, which serves as material for compositing the video summary in the subsequent steps.
Step S407, based on the keyword ontology semantics, obtain the animation template corresponding to the keywords from a pre-established grammar database;
This step obtains the animation template used to generate the video summary.
Step S408, convert the text summary into audio data;
This step converts the text summary into audio data, which serves as the audio material for compositing the video summary.
Step S409, according to preset rendering rules, composite and render the background image, the animation template and the audio data into a video file.
This step renders and generates the video file.
Embodiment 5:
On the basis of embodiment 1, embodiment 5 provides a preferred method for generating a video summary based on e-commerce webpage content. Its flowchart is shown in Fig. 5, and the main steps include:
Step S501, extract the text summary of the text content of the target e-commerce webpage.
This step obtains the text summary, which introduces the commodity on the target e-commerce webpage to the user completely and concisely.
Step S502, parse the text summary to obtain the keywords in the text summary.
This step obtains the keywords of the commodity introduction and provides keyword information for the subsequent steps.
Step S503, retrieve the keywords in a preset grammar database to obtain all ontology descriptions related to the keywords;
The preset grammar database stores the ontology description corresponding to each word, so retrieving a keyword in the grammar database yields all ontology descriptions related to that keyword. For example, retrieving the grammar database gives "apparel" as the ontology description of "shirt".
Step S504, use the Web Ontology Language OWL to determine, from all ontology descriptions of the keyword, the keyword ontology semantics in the current context.
A keyword may have multiple ontology descriptions. For example, "apple" may be a "fruit" or a "company". In this case OWL is used to determine the keyword ontology semantics in the current context and obtain the correct description of the keyword, so that the video summary can be generated correctly from the e-commerce webpage content in the subsequent steps.
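The lookup and disambiguation in steps S503 and S504 can be illustrated with a small sketch. It does not use OWL: as a plainly simpler stand-in, it picks the ontology description whose hypothetical cue words overlap most with the surrounding context. Both tables and the function name are assumptions made for illustration only.

```python
# Hypothetical ontology descriptions per keyword (step S503 lookup).
ONTOLOGY_DESCRIPTIONS = {
    "shirt": ["apparel"],
    "apple": ["fruit", "company"],
}

# Hypothetical context cues for each ontology description.
CONTEXT_CUES = {
    "fruit": {"fresh", "sweet", "kg", "juice"},
    "company": {"phone", "laptop", "screen", "charger"},
    "apparel": {"cotton", "size", "sleeve"},
}


def resolve_ontology(keyword, context_words):
    """Choose the description that best matches the context (step S504).

    Stand-in heuristic, not OWL reasoning: score each candidate by the
    number of cue words it shares with the context.
    """
    candidates = ONTOLOGY_DESCRIPTIONS.get(keyword, [])
    if not candidates:
        return None
    context = set(context_words)
    # max() returns the first best candidate, so ties fall back to
    # the order in which descriptions are listed.
    return max(candidates,
               key=lambda c: len(CONTEXT_CUES.get(c, set()) & context))
```

For a summary mentioning "phone" and "screen", the keyword "apple" resolves to "company"; for one mentioning "fresh" and "kg", it resolves to "fruit".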
Step S505, based on the keyword ontology semantics, retrieve the corresponding pictures or videos from the internet to form a background image;
This step obtains the background image, which serves as material for compositing the video summary in the subsequent steps.
Step S506, based on the keyword ontology semantics, obtain the animation template corresponding to the keywords from a pre-established grammar database;
This step obtains the animation template used to generate the video summary.
Step S507, convert the text summary into audio data;
This step converts the text summary into audio data, which serves as the audio material for compositing the video summary.
Step S508, according to preset rendering rules, composite and render the background image, the animation template and the audio data into a video file.
This step renders and generates the video file.
Embodiment 6:
Embodiment 6 provides a device for generating a video summary based on e-commerce webpage content. Its structure is shown in Fig. 6 and includes:
Extraction module 21, for extracting the text summary of the text content of the target e-commerce webpage;
Keyword acquisition module 22, for parsing the text summary to obtain the keywords in the text summary;
Semantic analysis module 23, for performing semantic analysis on the keywords to obtain the keyword ontology semantics;
Background image forming module 24, for retrieving, based on the keyword ontology semantics, the corresponding pictures or videos from the internet to form a background image;
Animation template acquisition module 25, for obtaining, based on the keyword ontology semantics, the animation template corresponding to the keywords from a pre-established grammar database;
Audio conversion module 26, for converting the text summary into audio data;
Video composition module 27, for compositing and rendering the background image, the animation template and the audio data into a video file according to preset rendering rules.
In the device provided by embodiment 6, the extraction module 21 extracts the text summary of the text content of the target e-commerce webpage; the keyword acquisition module 22 then parses the extracted text summary to obtain its keywords; next, the semantic analysis module 23 analyzes the keywords to obtain the keyword ontology semantics; the background image forming module 24 then retrieves, based on the keyword ontology semantics, the corresponding pictures or videos from the internet to form a background image; the animation template acquisition module 25 obtains, based on the keyword ontology semantics, the animation template corresponding to the keywords from the pre-established grammar database; the audio conversion module 26 converts the text summary into audio data; finally, the video composition module 27 composites and renders the background image, the animation template and the audio data into a video file according to the preset rendering rules. When a user visits the e-commerce webpage, the user can see on the webpage the video summary generated from the content of that webpage.
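The data flow between modules 21 to 27 can be sketched as a simple composition of callables. This is only an illustration of the order in which the modules hand data to each other; the class name, field names and type signatures are assumptions, and each field is a placeholder for a module whose internals the sketch does not implement.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VideoSummaryDevice:
    """Hypothetical wiring of modules 21-27; each field is one module."""
    extract: Callable[[str], str]                       # extraction module 21
    get_keywords: Callable[[str], List[str]]            # keyword module 22
    analyze: Callable[[List[str]], List[str]]           # semantic module 23
    form_background: Callable[[List[str]], List[str]]   # background module 24
    get_templates: Callable[[List[str]], List[str]]     # template module 25
    to_audio: Callable[[str], bytes]                    # audio module 26
    compose: Callable[[List[str], List[str], bytes], str]  # composition module 27

    def run(self, page_text: str) -> str:
        summary = self.extract(page_text)
        keywords = self.get_keywords(summary)
        semantics = self.analyze(keywords)
        # Modules 24 and 25 both consume the keyword ontology semantics.
        backgrounds = self.form_background(semantics)
        templates = self.get_templates(semantics)
        # Module 26 works on the summary text, not on the keywords.
        audio = self.to_audio(summary)
        return self.compose(backgrounds, templates, audio)
```

Passing the modules in as callables keeps the sketch independent of any particular summarizer, retrieval service, speech engine or renderer, none of which the source names.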
The device provided by embodiment 6 can generate a video summary from the text content of the target e-commerce webpage and display the video summary on that webpage. When browsing the corresponding e-commerce webpage, the user can obtain the commodity introduction by watching the video summary. Compared with existing e-commerce websites, which introduce commodities through pictures and text, this saves the time the user would spend reading the commodity introduction on the e-commerce webpage.
Embodiment 7:
On the basis of embodiment 6, embodiment 7 provides a device for generating a video summary based on e-commerce webpage content, in which the structure of the keyword acquisition module 22 is shown in Fig. 7 and includes:
Word segmentation unit 22a, for segmenting the text summary into words;
Part-of-speech determining unit 22b, for comparing the words obtained from segmentation with the word templates in the preset grammar database to determine the part of speech of each word;
Keyword selection unit 22c, for selecting, according to the part-of-speech result, the nouns and numerals from the segmented words as the keywords of the text summary.
Embodiment 8:
Embodiment 8 provides a system for generating a video summary based on e-commerce webpage content, including: a user terminal 31 and an e-commerce server 32. The user terminal 31 is connected to the e-commerce server 32 through the internet, as shown in the connection diagram of Fig. 8.
The e-commerce server 32 includes the device for generating a video summary based on e-commerce webpage content provided by embodiment 6 or 7.
The e-commerce server 32 generates the video summary from the e-commerce webpage content. When a user visits the e-commerce webpage through the user terminal 31, the user can see on the webpage the video summary generated from the content of that webpage.
The system provided by embodiment 8 can generate a video summary from the text content of the target e-commerce webpage and display the video summary on that webpage. When browsing the corresponding e-commerce webpage, the user can obtain the commodity introduction by watching the video summary. Compared with existing e-commerce websites, which introduce commodities through pictures and text, this saves the time the user would spend reading the commodity introduction on the e-commerce webpage.
In this embodiment, the user terminal 31 may be any one of: an iPhone application, an iPad tablet application, an Android phone application, an Android tablet application, a TV set-top box application, a Windows platform application, a Mac platform application, an IE browser plug-in, a Chrome browser plug-in and a Firefox browser plug-in.
On the e-commerce website side, the server 32 may be any one of: a WordPress plug-in, a Drupal plug-in, a Joomla plug-in, a MediaWiki plug-in, a Discuz plug-in, a PHPWind plug-in and a webpage JavaScript script.
The implementation principles and technical effects of the devices and modules provided by the embodiments of the present invention are the same as those of the foregoing method embodiments. For brevity, where a device embodiment omits a detail, refer to the corresponding content of the foregoing method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are only schematic. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through communication interfaces, devices or units, and may be electrical, mechanical or in other forms.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be defined by the scope of the claims.

Claims (8)

  1. A method for generating a video summary based on e-commerce webpage content, characterized by comprising:
    obtaining an e-commerce webpage based on a webpage link; removing additional information from the e-commerce webpage, wherein the additional information includes one or more of: advertisements, pictures, videos, frames and charts; extracting the text content of the e-commerce webpage after the additional information is removed; calculating in turn the similarity between every two sentences in the text content; classifying the sentences in the text content according to the calculated similarity; extracting sentences from each class according to the classification result and combining them to obtain candidate summaries; and selecting, from the candidate summaries, the candidate whose length differs least from a preset text summary length as the text summary of the e-commerce webpage, wherein the preset text summary length is determined from the length of the video to be generated and a preset reading-aloud rate for the text summary;
    parsing the text summary to obtain the keywords in the text summary;
    performing semantic analysis on the keywords to obtain the keyword ontology semantics;
    retrieving, based on the keyword ontology semantics, the corresponding pictures or videos from the internet to form a background image;
    obtaining, based on the keyword ontology semantics, the animation template corresponding to the keywords from a pre-established grammar database;
    converting the text summary into audio data;
    compositing and rendering the background image, the animation template and the audio data into a video file according to preset rendering rules.
  2. The method according to claim 1, characterized in that calculating in turn the similarity between every two sentences in the text content comprises:
    calculating the total number of words shared by the current two sentences;
    dividing the total number by the average length of the current two sentences to obtain the similarity of the current two sentences;
    calculating the similarity between every two sentences in the text content by the same method used to obtain the similarity of the current two sentences.
  3. The method according to claim 1, characterized in that parsing the text summary to obtain the keywords in the text summary comprises:
    segmenting the text summary into words;
    comparing the words obtained from the segmentation with the word templates in the grammar database to determine the part of speech of each word obtained from the segmentation;
    selecting, according to the part-of-speech result, the nouns and numerals from the segmented words as the keywords of the text summary.
  4. The method according to claim 1, characterized in that performing semantic analysis on the keywords to obtain the keyword ontology semantics comprises:
    retrieving the keywords in the grammar database to obtain all ontology descriptions related to the keywords;
    using the Web Ontology Language OWL to determine, from all ontology descriptions of the keywords, the keyword ontology semantics in the current context.
  5. The method according to claim 1, characterized in that compositing and rendering the background image, the animation template and the audio data into a video file according to preset rendering rules comprises:
    setting a mapping relationship among the keywords in the audio data and the background image and animation template corresponding to the keywords;
    compositing and rendering the background image, the animation template and the audio data according to the mapping relationship.
  6. A device for generating a video summary based on e-commerce webpage content, characterized by comprising:
    an extraction module, for obtaining an e-commerce webpage based on a webpage link; removing additional information from the e-commerce webpage, wherein the additional information includes one or more of: advertisements, pictures, videos, frames and charts; extracting the text content of the e-commerce webpage after the additional information is removed; calculating in turn the similarity between every two sentences in the text content; classifying the sentences in the text content according to the calculated similarity; extracting sentences from each class according to the classification result and combining them to obtain candidate summaries; and selecting, from the candidate summaries, the candidate whose length differs least from a preset text summary length as the text summary of the e-commerce webpage, wherein the preset text summary length is determined from the length of the video to be generated and a preset reading-aloud rate for the text summary;
    a keyword acquisition module, for parsing the text summary to obtain the keywords in the text summary;
    a semantic analysis module, for performing semantic analysis on the keywords to obtain the keyword ontology semantics;
    a background image forming module, for retrieving, based on the keyword ontology semantics, the corresponding pictures or videos from the internet to form a background image;
    an animation template acquisition module, for obtaining, based on the keyword ontology semantics, the animation template corresponding to the keywords from a preset grammar database;
    an audio conversion module, for converting the text summary into audio data;
    a video composition module, for compositing and rendering the background image, the animation template and the audio data into a video file according to preset rendering rules.
  7. The device according to claim 6, characterized in that the keyword acquisition module comprises:
    a word segmentation unit, for segmenting the text summary into words;
    a part-of-speech determining unit, for comparing the words obtained from the segmentation with the word templates in the grammar database to determine the part of speech of each word obtained from the segmentation;
    a keyword selection unit, for selecting, according to the part-of-speech result, the nouns and numerals from the segmented words as the keywords of the text summary.
  8. A system for generating a video summary based on e-commerce webpage content, characterized by comprising: a user terminal, and an e-commerce server connected to the user terminal through the internet;
    the e-commerce server includes the device for generating a video summary based on e-commerce webpage content according to claim 6 or 7.
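As an illustration of the similarity measure recited in claim 2, the following Python sketch divides the number of shared words by the average sentence length. The function names are hypothetical, and two readings are assumed because the claim does not settle them: shared words are counted as distinct words, and sentence length is measured in words after segmentation.

```python
def sentence_similarity(words_a, words_b):
    """Shared-word count divided by the average length of the two sentences.

    Assumptions: sentences are already segmented into word lists, shared
    words are counted once each, and length is the number of words.
    """
    shared = set(words_a) & set(words_b)
    mean_len = (len(words_a) + len(words_b)) / 2
    return len(shared) / mean_len if mean_len else 0.0


def pairwise_similarity(sentences):
    """Apply the same measure to every pair of sentences in the text."""
    n = len(sentences)
    return {
        (i, j): sentence_similarity(sentences[i], sentences[j])
        for i in range(n)
        for j in range(i + 1, n)
    }
```

The resulting pairwise scores are what the classification step in claim 1 would consume; how sentences are grouped from these scores is not specified in this section and is left out of the sketch.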
CN201510156125.8A 2015-04-03 2015-04-03 Method, apparatus and system based on ecommerce webpage content generation video frequency abstract Expired - Fee Related CN104731960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510156125.8A CN104731960B (en) 2015-04-03 2015-04-03 Method, apparatus and system based on ecommerce webpage content generation video frequency abstract

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510156125.8A CN104731960B (en) 2015-04-03 2015-04-03 Method, apparatus and system based on ecommerce webpage content generation video frequency abstract

Publications (2)

Publication Number Publication Date
CN104731960A CN104731960A (en) 2015-06-24
CN104731960B true CN104731960B (en) 2018-03-09

Family

ID=53455847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510156125.8A Expired - Fee Related CN104731960B (en) 2015-04-03 2015-04-03 Method, apparatus and system based on ecommerce webpage content generation video frequency abstract

Country Status (1)

Country Link
CN (1) CN104731960B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106504304B (en) * 2016-09-14 2019-09-24 厦门黑镜科技有限公司 A kind of method and device of animation compound
CN108965737B (en) * 2017-05-22 2022-03-29 腾讯科技(深圳)有限公司 Media data processing method, device and storage medium
CN107832382A (en) * 2017-10-30 2018-03-23 百度在线网络技术(北京)有限公司 Method, apparatus, equipment and storage medium based on word generation video
CN108470036A (en) * 2018-02-06 2018-08-31 北京奇虎科技有限公司 A kind of method and apparatus that video is generated based on story text
CN110309351A (en) * 2018-02-14 2019-10-08 阿里巴巴集团控股有限公司 Video image generation, device and the computer system of data object
CN109325135B (en) * 2018-10-26 2023-08-08 平安科技(深圳)有限公司 Text-based video generation method, device, computer equipment and storage medium
CN111294640A (en) * 2018-12-07 2020-06-16 北京京东尚科信息技术有限公司 Information display method, information selling method, information display device, information selling device, storage medium and electronic equipment
CN109949078B (en) * 2019-03-01 2020-11-03 北京金堤科技有限公司 Promotion information processing method and device
CN111784431A (en) * 2019-11-18 2020-10-16 北京沃东天骏信息技术有限公司 Video generation method, device, terminal and storage medium
CN112287168A (en) * 2020-10-30 2021-01-29 北京有竹居网络技术有限公司 Method and apparatus for generating video
CN113905254B (en) * 2021-09-03 2024-03-29 前海人寿保险股份有限公司 Video synthesis method, device, system and readable storage medium
CN114363701A (en) * 2021-12-29 2022-04-15 四川启睿克科技有限公司 Method for converting web page into short video
CN114786069A (en) * 2022-04-22 2022-07-22 北京有竹居网络技术有限公司 Video generation method, device, medium and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324760A (en) * 2013-07-11 2013-09-25 中国农业大学 Method and system for automatically generating nutrition health education video through commentary file
CN103559214A (en) * 2013-10-11 2014-02-05 中国农业大学 Method and device for automatically generating video

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100306232A1 (en) * 2009-05-28 2010-12-02 Harris Corporation Multimedia system providing database of shared text comment data indexed to video source data and related methods

Also Published As

Publication number Publication date
CN104731960A (en) 2015-06-24

Similar Documents

Publication Publication Date Title
CN104731960B (en) Method, apparatus and system based on ecommerce webpage content generation video frequency abstract
CN104731959B (en) The method of text based web page contents generation video frequency abstract, apparatus and system
US10277946B2 (en) Methods and systems for aggregation and organization of multimedia data acquired from a plurality of sources
CN110543574A (en) knowledge graph construction method, device, equipment and medium
US10394886B2 (en) Electronic device, computer-implemented method and computer program
CN111368548A (en) Semantic recognition method and device, electronic equipment and computer-readable storage medium
CN108885617B (en) Sentence analysis system and program
CN104166462A (en) Input method and system for characters
CN105843796A (en) Microblog emotional tendency analysis method and device
CN107402912A (en) Parse semantic method and apparatus
CN111178056A (en) Deep learning based file generation method and device and electronic equipment
CN108038200A (en) Method and apparatus for storing data
CN103150331A (en) Method and device for providing search engine tags
CN107798622A (en) A kind of method and apparatus for identifying user view
CN113806588A (en) Method and device for searching video
EP3001327A1 (en) Method and system of enhancing online contents value
CN113038175B (en) Video processing method and device, electronic equipment and computer readable storage medium
JP2017182646A (en) Information processing device, program and information processing method
KR20120071194A (en) Apparatus of recommending contents using user reviews and method thereof
CN116484872A (en) Multi-modal aspect emotion judging method and system based on pre-training and attention
CN106959945B (en) Method and device for generating short titles for news based on artificial intelligence
CN104615654A (en) Text summarization obtaining method and device
Yamane et al. Tag Line Generating System Using Information on the Web.
CN107329953A (en) The processing method and electronic equipment of natural language corpus data
JP6506839B2 (en) Dissatisfied information processing device and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180309