CN105786834A - Method and system for generating structured abstract of social webpage - Google Patents

Publication number
CN105786834A
Authority
CN
China
Prior art keywords
structured
type
content
webpage
source code
Application number
CN201410806618.7A
Other languages
Chinese (zh)
Inventor
董毅
张前川
陈营营
张川
魏文华
Original Assignee
北京奇虎科技有限公司
奇智软件(北京)有限公司
Application filed by 北京奇虎科技有限公司, 奇智软件(北京)有限公司 filed Critical 北京奇虎科技有限公司
Priority to CN201410806618.7A priority Critical patent/CN105786834A/en
Publication of CN105786834A publication Critical patent/CN105786834A/en

Abstract

The invention relates to a method and a system for generating a structured abstract of a social webpage. The method comprises the following steps: querying whether the source code header of the social webpage contains one or more character feature description tags; if one or more such tags exist, extracting the corresponding content field and type identifier from each description tag; and arranging the content fields according to their type identifiers to generate the structured abstract of the social webpage. By querying for character feature description tags, the method identifies social webpages, and it uses the extracted type fields, which are user-defined by third parties, to build a structured abstract that suits social webpages and is displayed in the search result list, so that a user can grasp the key information of a social webpage directly from the search result list.

Description

A method and system for generating a structured summary of a social web page

Technical field

The present invention relates to the field of Internet search technology, and in particular to a method and system for generating a structured summary of a social web page.

Background art

With the rapid development of Internet technology in China, finding the information one needs quickly and accurately has become a key problem in Internet search.

Existing search result pages are usually ranked either by how well the query keyword matches the keywords stored on the server, or by the historical click-through rate of the web pages associated with that keyword. However, if a result page ranked in this way presents only the search result items and a brief introduction to each item, the user has to read the introductions one by one to find the content of interest. This created the demand for displaying web page summaries in the search result list.

Existing summary generation methods fall into two main kinds. The first is static summary generation, which centers on the topic of the document, that is, a "full-text summary". Although a summary generated in this way expresses the meaning of the document well, the information the user is querying for is not necessarily the topic of the document, yet it may be the most valuable information to that user, so static summary generation has difficulty meeting the user's query needs. The second is dynamic summary generation, which extracts from the document the parts most relevant to the user's query keywords. However, query keywords often do not directly express the user's query needs, so dynamic summary generation also has difficulty meeting them, and when the query keywords are scattered across several paragraphs of the document, the extracted summary is even less likely to contain the information the user needs.

How to provide a method for generating summaries of search results, such that the summaries better meet the user's query needs and contain more of the relevant information the user actually needs, has become a problem that urgently requires a solution. In short, the methods described above for generating or displaying search result pages cannot provide users with the content they want quickly and effectively.

Summary of the invention

The technical problem to be solved by the present invention is that prior-art methods for generating or displaying search result pages cannot provide users with the content they want quickly and effectively.

To this end, the present invention proposes a method for generating a structured summary of a social web page, comprising the following steps:

querying whether the source code head of the social web page contains one or more character feature description tags;

if one or more character feature description tags exist, extracting the corresponding content field and type identifier from each description tag;

building the arrangement of the content fields according to their type identifiers, thereby generating the structured summary of the social web page.

Optionally, before the step of querying whether the source code head of the social web page contains one or more character feature description tags, the method further comprises: adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the web page source code, the structured summary field comprising at least one content field and a corresponding type identifier.

Optionally, querying whether the source code head of the social web page contains one or more character feature description tags further comprises: extracting the structured summary field according to the predetermined auxiliary tag; parsing the structured summary field; and judging, according to the parsing result, whether one or more character feature description tags exist in the structured summary field of the web page.

Optionally, the step of adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the web page source code further comprises: adding, by the third party, to the head portion of the web page source code a person keyword type field, a person description type field, a person level type field, a user visit count type field and/or a corresponding mobile web page URL type field;

the step of building the arrangement of the content fields according to their type identifiers, thereby generating the structured summary of the social web page, is further: using the type identifier in each type field to embed the data content of that type field according to a predetermined layout, thereby generating the structured summary.

Optionally, the step of adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the web page source code further comprises: adding, by the third party, to the head portion of the web page source code a person keyword type field, a person description type field, a person level type field, a user visit count type field and/or a corresponding mobile web page URL type field; and determining the importance of the added fields;

the step of building the arrangement of the content fields according to their type identifiers, thereby generating the structured summary of the social web page, further comprises: using the type identifier in each type field to embed the data content of that type field according to the predetermined layout and in order of importance, thereby generating the structured summary.

Optionally, after the step of building the arrangement of the content fields according to their type identifiers, thereby generating the structured summary of the social web page, the method further comprises:

displaying the structured summary in a predetermined area of the search result list.

Optionally, the step of displaying the structured summary in a predetermined area of the search result list further comprises: displaying a picture of the person on the left or upper left of the predetermined area of the search result list.

Optionally, the third party is a website administrator or a web page provider.

The present invention also provides a system for generating a structured summary of a social web page, the system comprising:

a character feature description tag querier, configured to query whether the source code head of the social web page contains one or more character feature description tags;

a content field and type identifier extractor, configured to extract the corresponding content field and type identifier from each description tag when one or more character feature description tags exist;

a structured summary generator, configured to build the arrangement of the content fields according to their type identifiers, thereby generating the structured summary of the social web page.

Optionally, the system further comprises: a structured summary field adder, configured for a third party to add a structured summary field identified by a predetermined auxiliary tag to the head portion of the web page source code; the structured summary field comprises at least one content field and a corresponding type identifier.

Optionally, the character feature description tag querier further comprises:

a structured summary field extraction unit, configured to extract the structured summary field according to the predetermined auxiliary tag;

a structured summary field parsing unit, configured to parse the structured summary field;

a character feature description tag judging unit, configured to judge, according to the parsing result, whether one or more character feature description tags exist in the structured summary field of the web page.

Optionally, the structured summary field adder further comprises at least one type field adding unit, configured for a third party to add, to the head portion of the web page source code, a person keyword type field, a person description type field, a person level type field, a user visit count type field and/or a corresponding mobile web page URL type field;

the structured summary generator uses the type identifier in each type field to embed the data content of that type field according to a predetermined layout, thereby generating the structured summary.

Optionally, the structured summary field adder further comprises:

at least one type field adding unit, configured for a third party to add, to the head portion of the web page source code, a person keyword type field, a person description type field, a person level type field, a user visit count type field and/or a corresponding mobile web page URL type field;

a field importance determining unit, configured to determine the importance of each type field added;

the structured summary generator uses the type identifier in each type field to embed the data content of that type field according to a preset layout and in order of importance, thereby generating the structured summary.

Optionally, the system further comprises a structured summary display, configured to display the structured summary in a predetermined area of the search result list.

Optionally, the structured summary display further comprises a person picture display unit, configured to display a picture of the person on the left or upper left of the predetermined area of the search result list.

Optionally, the third party is a website administrator or a web page provider.

By querying for character feature description tags, the present invention identifies social web pages, and it uses the extracted type fields defined by each third party to construct a structured summary that suits social web pages and is displayed in the search result list, so that the user can learn the key information of a social web page directly from the search result list.

Brief description of the drawings

The features and advantages of the present invention will be understood more clearly with reference to the accompanying drawings, which are schematic and should not be construed as limiting the invention in any way. In the drawings:

Fig. 1 shows a flow chart of the method for generating a structured summary of a social web page according to embodiment one of the present invention;

Fig. 2 shows a flow chart of the method for generating a structured summary of a social web page according to embodiment three of the present invention;

Fig. 3 shows a structure diagram of the system for generating a structured summary of a social web page according to embodiment one of the present invention;

Fig. 4 shows a structure diagram of the system for generating a structured summary of a social web page according to embodiment three of the present invention.

Detailed description of the embodiments

The present invention is implemented mainly by adding, in the head region of the web page source code, a structured field that a website user or web page maintainer can customize or write. When the search engine matches a web page and needs to display its information in the search result list, the client can generate a structured summary from the structured field and push the generated summary to the user in the search result list, so that the user can learn the key information of the page without opening its link. Moreover, because the structured field is defined by the website user or web page maintainer, that is, by the website administrator or web page provider, the invention takes advantage of the fact that these parties understand the page more precisely than anyone else and have strong preferences about how its summary should look. The resulting custom fields are accurate and comprehensive, and they greatly reduce the resources the search engine would otherwise spend processing web page data to generate summary data for the search result list.

Before the specific embodiments are introduced, the web page source code data used by the invention is explained first. meta is an auxiliary tag in the head region of an HTML document. The meta tag has two attributes of interest here, http-equiv and name; each takes different parameter values, and these parameter values implement different web page functions. The name attribute is mainly used to describe the web page, and its corresponding attribute value is given in content; the text in content mainly makes it easier for search engine robots to find and classify information. The syntax of the name attribute of the meta tag is: <meta name="parameter" content="specific parameter value">.

The name attribute mainly takes the following parameters:

A. keywords (keywords)

Description: keywords tells search engines what the keywords of your web page are.

Example: <meta name="keywords" content="science, education, culture, politics, economics, relationships, entertainment, human">

B. description (description of the website content)

Description: description tells search engines the main content of your website.

Example: <meta name="description" content="This page is about the meaning of science, education, culture.">

C. author (author)

Description: identifies the author of the web page.

Example: <meta name="author" content="bluebox, web@7dspace.com">
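
As a minimal sketch of how a crawler might read these name/content pairs, the following Python uses only the standard-library html.parser module. The names MetaNameParser and read_meta_fields are illustrative assumptions and do not come from the patent; the sample values are shortened from the examples above.

```python
from html.parser import HTMLParser


class MetaNameParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name and attr.get("content") is not None:
            self.fields[name.lower()] = attr["content"]


def read_meta_fields(html):
    """Return a dict mapping each meta name parameter to its content value."""
    parser = MetaNameParser()
    parser.feed(html)
    return parser.fields


sample = """<head>
<meta name="keywords" content="science, education, culture">
<meta name="description" content="This page is about the meaning of science.">
<meta name="author" content="bluebox">
</head>"""

fields = read_meta_fields(sample)
print(fields["keywords"])
```

Because the attributes are read into a dictionary, the sketch does not depend on whether name or content is written first in the tag.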

On the basis of the source code background above, in the present invention a third party adds the following custom structured fields, identified by a predetermined auxiliary tag, to the head region of the source code:

<head>

<title>name_social network title</title>

<meta name="keywords" content="..."/>

<meta name="description" content="..."/>

Of course, the present invention is not limited to the meta tag. Those skilled in the art may also use other tags or field identifiers to implement the technical solution of generating a structured summary, and such variations also fall within the spirit and scope of the present invention.

The custom structured fields above are used to generate a structured summary in the search result list, in order to solve the problem of incomplete summary data in the search result list.

This embodiment discloses a method for generating a structured summary of a social web page. Social web pages include social websites or social pages such as blogs, microblogs, friend circles, Renren, Facebook and Twitter. As shown in Fig. 1, the method comprises the following steps:

S1. querying whether the source code head of the social web page contains one or more character feature description tags;

S2. if one or more character feature description tags exist, extracting the corresponding content field and type identifier from each description tag;

S3. building the arrangement of the content fields according to their type identifiers, thereby generating the structured summary of the social web page.
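
Steps S1 to S3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the meta name tags stand in for the character feature description tags, the name value is treated as the type identifier, and SOCIAL_LAYOUT, HeadTagCollector and generate_structured_summary are hypothetical names, not taken from the patent.

```python
from html.parser import HTMLParser

# Assumed layout: the order in which type identifiers appear in the summary.
SOCIAL_LAYOUT = ["keywords", "description"]


class HeadTagCollector(HTMLParser):
    """Gather (type identifier, content field) pairs from <meta> tags in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr = dict(attrs)
            if attr.get("name") and attr.get("content"):
                self.pairs.append((attr["name"].lower(), attr["content"]))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def generate_structured_summary(html):
    collector = HeadTagCollector()
    collector.feed(html)                 # S1: query the source code head
    if not collector.pairs:
        return None                      # no description tags were found
    fields = dict(collector.pairs)       # S2: type identifier -> content field
    # S3: arrange the content fields in the order given by the layout
    lines = [f"{t}: {fields[t]}" for t in SOCIAL_LAYOUT if t in fields]
    return "\n".join(lines)


page = (
    '<html><head><title>t</title>'
    '<meta name="description" content="Personal blog"/>'
    '<meta name="keywords" content="Xiao Ming, blog"/>'
    '</head><body><meta name="keywords" content="ignored"></body></html>'
)
summary = generate_structured_summary(page)
print(summary)
```

Only tags inside the head are considered, and the output order follows the layout rather than the order in which the tags appear in the source.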

Further, an importance identification field is added to the different character feature description tag fields in the web page.

Further, the position at which the corresponding structured summary data is embedded in the search result list is determined from the relevance of character feature description tag fields of different importance to the user's search query. If a high-importance character feature description tag field is relevant to the user's search term, the structured summary data as a whole is deemed highly relevant to the search term and is embedded at a preferred position in the search result list; if a low-importance character feature description tag field is relevant to the user's search term, the structured summary data as a whole is deemed to have low relevance to the search term and is embedded at a non-preferred position in the search result list.
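
The placement rule above might be sketched as follows. The patent does not say how importance is encoded or how relevance is measured, so the numeric importance values, the high_importance threshold and the substring match are all assumptions made for illustration; summary_placement is a hypothetical name.

```python
def summary_placement(tag_fields, query, high_importance=2):
    """Decide where a structured summary goes in the search result list.

    tag_fields: list of (content, importance) pairs; a larger number means
    a more important field (an assumed encoding of the importance identifier).
    Relevance is approximated by a case-insensitive substring match.
    """
    q = query.lower()
    matched = [imp for content, imp in tag_fields if q in content.lower()]
    if not matched:
        return "none"
    return "preferred" if max(matched) >= high_importance else "non-preferred"


blog_fields = [
    ("Xiao Ming, host, talk show", 3),            # high-importance field
    ("interviews about film and business", 1),    # low-importance field
]

print(summary_placement(blog_fields, "host"))
print(summary_placement(blog_fields, "film"))
```

A real search engine would replace the substring match with its own relevance score; the point of the sketch is only that the importance of the matching field, not the mere existence of a match, decides the position.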

Further, when the user enters a search query through a mobile terminal, it is checked whether structured summary data of a web page matching the search query exists; if so, the structured summary data is preferentially returned to the mobile terminal.

Further, preferentially returning the structured summary data to the mobile terminal comprises: drawing a full-screen interactive page from the structured summary data according to the screen size of the mobile terminal, and returning the page to the mobile terminal for display.

Further, the full-screen interactive page returned to the mobile terminal not only provides the summary data, but also provides one or more operation objects with which the user can interact.

Further, the operation objects are also identified by custom tags, and a script control capable of invoking the operation objects is embedded when the structured summary is generated. This strengthens the interactive functions that the structured summary provides, making the summary data more flexible, effective and intelligent.

For example, the following fields are added to the head region of the source code of a celebrity's Sina blog page:

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

<title>Celebrity name_Sina blog</title>

<meta name="keywords" content="Celebrity name_Sina blog, celebrity name, tittle-tattle"/>

<meta name="description" content="Celebrity name_Sina blog, celebrity name, 'blog post title (reposted by the celebrity)' author, 'blog post title (reposted by the celebrity)' author, the postscript written for 'a certain book', 'blog post title (about a certain knowledgeable person)' author, 'blog post title (about a certain knowledgeable person)' author, the new preface that so-and-so wrote for the book I am about to publish, shared here so that everyone can read it first."/>

The head region of the source code is queried for the person description tag meta name. If the tag is found, the page is determined to be a social web page. The field type keywords identified by meta name is extracted together with its corresponding content "Celebrity name_Sina blog, celebrity name, tittle-tattle", and the field type description identified by meta name is extracted together with its corresponding content, namely the description text given in the listing above. Next, the structured summary is generated from the extracted field contents and field types. Users focus on different things for different types of web content: for a news page, users care more about the publication time, the headline and a brief introduction to the news content, whereas for a social page they care more about the blogger and the blogger's post entries. Different types of web page therefore need to present different field types and field contents. This requires the search engine to preset, for each type of web page, a different arrangement template suited to displaying each field's content according to its field type identifier. When the structured summary is generated, each extracted field content is embedded at the position set for its type in the arrangement template.
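
The idea of one preset arrangement template per page type can be sketched as below. The template contents, the page type names and the function embed_fields are assumptions chosen for illustration; the patent only states that such templates are preset by the search engine.

```python
# Assumed templates: each page type lists its field types in display order.
ARRANGEMENT_TEMPLATES = {
    "news": ["headline", "published", "description"],
    "social": ["keywords", "description"],
}


def embed_fields(page_type, extracted):
    """Place each extracted field content at the slot set for its type."""
    template = ARRANGEMENT_TEMPLATES[page_type]
    return [(slot, extracted[slot]) for slot in template if slot in extracted]


blog = {
    "description": "Recent posts by the blogger",
    "keywords": "Celebrity name, Sina blog",
    "viewport": "initial-scale=1",   # has no slot in the template, so it is dropped
}
rows = embed_fields("social", blog)
print(rows)
```

Fields that have no slot in the chosen template are simply left out, which keeps page-level settings such as viewport out of the summary.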

As an example of a microblog page, the following fields are added to the head region of the source code of a person's microblog page:

<head>

<meta charset="utf-8">

<meta content="IT Engineer-Liu, IT Engineer-Liu's microblog, microblog, Sina Weibo, weibo" name="keywords"/>

<meta content="IT Engineer-Liu, willow computer, the computer expert at your side, solving all difficult computer problems! IT Engineer-Liu's microblog homepage, personal information, photo albums, ** Technology Company or Group. Sina Weibo, share the new things around you anytime, anywhere." name="description"/>

<meta name="viewport" content="initial-scale=1, minimum-scale=1"/>

The head region of the source code is queried for the person description tag meta content. If the tag is found, the page is determined to be a social web page. The field type keywords identified by meta content is extracted together with its corresponding content "IT Engineer-Liu, IT Engineer-Liu's microblog, microblog, Sina Weibo, weibo", and the field type description identified by meta content is extracted together with its corresponding content, namely the description text given in the listing above. Next, the structured summary is generated from the extracted field contents and field types. Users focus on different things for different types of web content: for a news page, users care more about the publication time, the headline and a brief introduction to the news content, whereas for a social page they care more about personal information such as the blogger's occupation. Different types of web page therefore need to present different field types and field contents. This requires the search engine to preset, for each type of web page, a different arrangement template suited to displaying each field's content according to its field type identifier. When the structured summary is generated, each extracted field content is embedded at the position set for its type in the arrangement template.

The two examples above show that the order of the content field and the name field identified by the meta tag is unrestricted, and that the description tag is not limited to the meta tag or to meta attribute tags but may be another tag, which gives the structured fields more flexible room for customization.

As a second embodiment, on the basis of the embodiment above, before the step of querying whether the source code head of the social web page contains one or more character feature description tags, the method further comprises: adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the web page source code, the structured summary field comprising at least one content field and a corresponding type identifier. This gives website users or web page users full scope to decide how the page is recommended and displayed: based on their own grasp of the page, they can define content in the name type field and also in the description type field. When the summary of the page is displayed in the search result list, the structured summary is shown according to this content. For example, the following is the source code head region of another celebrity's blog:

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

<title>Celebrity name_Sina blog</title>

<meta name="keywords" content="Xiao Ming_Sina blog, Xiao Ming, the World Bank, World Bank, Ebola, epidemic situation, Xiao Ming, amusement, old three, aesthetics of violence, wife, film, amusement, Li Si, youngster, Spring, five, high by six, Singapore, financial crisis, the dispirited phase, go into business seven, the youth, Xiao Ming, ** Satellite TV, ** Media Group, Xiao Ming, host, holt, Professor **, ** program, host, performer, ** Satellite TV, thing, feelings, girder, slight snow, Hai Chang, welcome guest, little Hua, ** University, Xiao Ming, ** Satellite TV, little Hua"/>

<meta name="description" content="Xiao Ming_Sina blog, Xiao Ming, celebrity interview record: creating common prosperity in an unequal world, celebrity interview record: giving love to film, 'Celebrity Interview Record' the game between business and art, 'Celebrity Interview Record' passing on reform, ** program [5] love is a required course at university, ** special series [4] visiting Chinese and American young people: who is more innovative?, ** special series [3] exploring how to make youthful creation most possible, ** special series [2] exploration in youth, ** special series celebrity sharing session, special show at * University, celebrity exclusive interview with the president of * country"/>

<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7"/>

<meta http-equiv="mobile-agent" content="format=html5; url=http://blog.sina.cn/dpool/blog/u/1198920804">

<meta http-equiv="mobile-agent" content="format=wml; url=http://blog.sina.cn/dpool/blog/ArtList.php?Uid=1198920804&vt=1">

It can be seen that, for some people, the content defined in the keywords field identified by the name type may be very extensive, covering the directions, fields and people they are involved with, so the person keyword field provides very rich content in the structured summary shown in the search result list; by comparison, the blog mentioned earlier defines much sparser content in its keywords field. Moreover, because the content of the keywords field supports the search engine in matching query terms against keywords, richly defined content also helps raise the search hit rate of the blog page and thus its click volume. In the description field identified by the name type, the celebrity defines the topics she covers, so that the structured summary generated in the search result list directly reflects the topics discussed in the talk show she hosts.
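
The two mobile-agent tags in the listing above each pair a format with a URL, which is what a corresponding mobile web page URL type field would carry. A minimal sketch of splitting such a content value follows; parse_mobile_agent is a hypothetical helper, and the "key=value; key=value" shape is inferred from the listing rather than from any specification.

```python
def parse_mobile_agent(content):
    """Split a value such as 'format=html5; url=http://...' into a dict."""
    result = {}
    for part in content.split(";"):
        # partition splits on the first "=" only, so "=" inside the URL survives
        key, sep, value = part.strip().partition("=")
        if sep:
            result[key.strip().lower()] = value.strip()
    return result


mobile = parse_mobile_agent(
    "format=html5; url=http://blog.sina.cn/dpool/blog/u/1198920804"
)
print(mobile["format"], mobile["url"])
```

The sketch assumes the URL itself contains no semicolon; a value that does would need a stricter parser.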

As a third embodiment, as shown in Fig. 2, on the basis of the second embodiment, querying whether the source code head of the social web page contains one or more character feature description tags further comprises: S11. extracting the structured summary field according to the predetermined auxiliary tag; S12. parsing the structured summary field; S13. judging, according to the parsing result, whether one or more character feature description tags exist in the structured summary field of the web page. Specifically, in the source code of the two blogs mentioned above, the auxiliary tag meta can be used to extract the structured summary field; after the structured summary field is extracted, the character feature description tag meta name in it is parsed out, and the keywords field type and the description field type are then extracted according to the character feature description tag meta name.

As a fourth embodiment, on the basis of the second or third embodiment, the step of adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the web page source code further comprises: adding, by the third party, to the head portion of the web page source code a person name keyword type field, a person name description type field, a blogger level field, a blog visit count field, a follower popularity field and/or a corresponding mobile web page URL type field;

the step of building the arrangement of the content fields according to their type identifiers, thereby generating the structured summary of the social web page, is further: using the type identifier in each type field to embed the data content of that type field according to a predetermined layout, thereby generating the structured summary. Specifically, for example, the person's photo is displayed on the left or upper left of the predetermined area, and the blogger's name, blog level, blog visit count, follower popularity, blog content keywords and list of blog posts are displayed in turn on the right.
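
The layout in this embodiment, with a photo on the left and an ordered list of fields on the right, might be represented as below. The field keys in RIGHT_COLUMN_ORDER and the function build_layout are hypothetical names; only the display order comes from the paragraph above.

```python
# Display order of the right-hand column, following the order in the text;
# the key names themselves are assumptions.
RIGHT_COLUMN_ORDER = ["name", "level", "visits", "followers", "keywords", "posts"]


def build_layout(fields):
    """Return the photo for the left side and ordered rows for the right side."""
    return {
        "left": fields.get("photo"),
        "right": [(key, fields[key]) for key in RIGHT_COLUMN_ORDER if key in fields],
    }


layout = build_layout({
    "posts": ["First post", "Second post"],
    "name": "Xiao Ming",
    "photo": "photo.jpg",
    "visits": 120000,
})
print(layout["left"], [key for key, _ in layout["right"]])
```

Fields that the third party did not supply are skipped, so the right-hand column closes up rather than showing empty rows.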

As a fifth embodiment, on the basis of the second or third embodiment, the step of adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the webpage source code further includes: adding, by the third party, to the head portion of the webpage source code a person-name keyword type field, a person-name description type field and/or a corresponding mobile webpage URL type field; and determining the importance of each added field;

The step of building the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage, further includes: using the type identifier in each type field to embed the data content of that field into the preset layout according to its importance, generating the structured summary. In particular, when the predetermined area in the search result list is limited and the content of the fields cannot all be shown, a subset of the fields is selected for display according to importance.
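The selection by importance can be sketched as below. The text does not say how importance is represented or determined, so the numeric scores, the max_slots parameter, and the field names are assumptions made purely for illustration.

```python
def select_fields(fields, importance, max_slots):
    """Keep at most max_slots fields, preferring those with higher importance.

    Assumption: importance is a mapping from type identifier to a number,
    and a field without a score is treated as least important (score 0).
    """
    ranked = sorted(
        fields,
        key=lambda type_id: importance.get(type_id, 0),
        reverse=True,
    )
    return {type_id: fields[type_id] for type_id in ranked[:max_slots]}


# Hypothetical fields and scores.
all_fields = {
    "mobile_url": "https://m.example.com/blog",
    "name": "Example Blogger",
    "description": "Personal blog of Example Blogger",
}
scores = {"name": 3, "description": 2, "mobile_url": 1}

shown = select_fields(all_fields, scores, max_slots=2)
```

With room for only two fields, the name and description are kept and the mobile URL is dropped; with enough room, every field is shown.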

As a sixth embodiment, on the basis of the above embodiments, after the step of building the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage, the method further includes:

displaying the structured summary in a predetermined area of the search result list.

As a seventh embodiment, on the basis of all the above embodiments, the step of displaying the structured summary in a predetermined area of the search result list further includes: displaying a person picture on the left or upper left of the predetermined area in the search result list.

As an eighth embodiment, the present invention provides a system for generating a structured summary of a social webpage. As shown in FIG. 3, the system includes:

a character feature description tag querier 100, configured to query whether the source code header of the social webpage contains one or more character feature description tags;

a content field and type identifier extractor 200, configured to extract the corresponding content field and type identifier from each description tag when one or more character feature description tags exist;

a structured summary generator 300, configured to build the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage.

Optionally, the system further includes a structured summary field adder, configured for a third party to add a structured summary field identified by a predetermined auxiliary tag to the head portion of the webpage source code; the structured summary field includes at least one content field and its corresponding type identifier.

Optionally, as shown in FIG. 4, the character feature description tag querier 100 further includes:

a structured summary field extraction unit 101, configured to extract the structured summary field according to the predetermined auxiliary tag;

a structured summary field parsing unit 102, configured to parse the structured summary field;

a character feature description tag judging unit 103, configured to judge, according to the parsing result, whether one or more character feature description tags exist in the structured summary field of the webpage.

Optionally, the structured summary field adder further includes at least one type field adding unit, configured for the third party to add, to the head portion of the webpage source code, a person-name keyword type field, a person-name description type field and/or a corresponding mobile webpage URL type field;

the structured summary generator uses the type identifier in each type field to embed the data content of that field into a preset layout, generating the structured summary.

Optionally, the structured summary field adder further includes:

at least one type field adding unit, configured for the third party to add, to the head portion of the webpage source code, a person-name keyword type field, a person-name description type field and/or a corresponding mobile webpage URL type field;

a field importance determining unit, configured to determine the importance of each added type field;

the structured summary generator uses the type identifier in each type field to embed the data content of that field into the preset layout according to its importance, generating the structured summary.

Optionally, the system further includes a structured summary display, configured to display the structured summary in a predetermined area of the search result list.

Optionally, the structured summary display further includes a person picture display unit, configured to display a person picture on the left or upper left of the predetermined area in the search result list.

The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the system for generating a structured summary of a social webpage according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

For example, the system for generating a structured summary of a social webpage proposed by the present invention may be a search engine server. The server includes a processor and a computer program product or computer-readable medium in the form of a memory. The memory may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. The memory has storage space for program code for performing any of the method steps described above. For example, the storage space for program code may include individual program codes, each implementing one of the steps of the above method. These program codes may be read from, or written to, one or more computer program products. These computer program products include program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such a computer program product is usually a portable or fixed storage unit. The storage unit may have storage segments, storage space and the like arranged similarly to the memory described above. The program code may, for example, be compressed in a suitable form. Typically, the storage unit includes computer-readable code, that is, code that can be read by, for example, a processor; when the search engine program is run by the server, this code causes the server to perform the steps of the method described above.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Note also that instances of the phrase "in an embodiment" herein do not necessarily all refer to the same embodiment.

Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.

It should be noted that the above embodiments illustrate rather than limit the present invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.

Furthermore, it should be noted that the language used in this specification has been selected primarily for readability and instructional purposes, and not to delineate or circumscribe the subject matter of the present invention. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With regard to the scope of the present invention, the disclosure herein is illustrative rather than restrictive, and the scope of the present invention is defined by the appended claims.

By querying for character feature description tags, the present invention identifies social webpages, and uses the extracted type fields, which each third party defines itself, to build a structured summary suited to social webpages for display in the search result list, so that users can learn the key information of a social webpage directly from the search result list.

The above embodiments merely illustrate the technical concept and features of the present invention. Their purpose is to enable those skilled in the art to understand and implement the present invention; they do not limit its scope of protection. All equivalent changes or modifications made according to the spirit of the present invention shall fall within the scope of protection of the present invention.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the present invention, and such modifications and variations all fall within the scope defined by the appended claims.

Claims (10)

1. A method for generating a structured summary of a social webpage, comprising:
querying whether the source code header of the social webpage contains one or more character feature description tags;
if one or more character feature description tags exist, extracting the corresponding content field and type identifier from each description tag; and
building the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage.
2. The method according to claim 1, further comprising, before the step of querying whether the source code header of the social webpage contains one or more character feature description tags: adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the webpage source code, the structured summary field including at least one content field and its corresponding type identifier.
3. The method according to any one of claims 1-2, wherein querying whether the source code header of the social webpage contains one or more character feature description tags further includes: extracting the structured summary field according to the predetermined auxiliary tag; parsing the structured summary field; and judging, according to the parsing result, whether one or more character feature description tags exist in the structured summary field of the webpage.
4. The method according to any one of claims 1-3, wherein the step of adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the webpage source code further includes: adding, by the third party, to the head portion of the webpage source code a person keyword type field, a person description type field, a person rating type field, a user visit volume type field and/or a corresponding mobile webpage URL type field;
and the step of building the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage, further comprises: using the type identifier in each type field to embed the data content of that field into a preset layout, generating the structured summary.
5. The method according to any one of claims 1-4, wherein the step of adding, by a third party, a structured summary field identified by a predetermined auxiliary tag to the head portion of the webpage source code further includes: adding, by the third party, to the head portion of the webpage source code a person keyword type field, a person description type field, a person rating type field, a user visit volume type field and/or a corresponding mobile webpage URL type field; and determining the importance of each added field;
and the step of building the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage, further includes: using the type identifier in each type field to embed the data content of that field into the preset layout according to its importance, generating the structured summary.
6. The method according to any one of claims 1-5, further comprising, after the step of building the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage:
displaying the structured summary in a predetermined area of the search result list.
7. The method according to any one of claims 1-6, wherein the step of displaying the structured summary in a predetermined area of the search result list further includes: displaying a person picture on the left or upper left of the predetermined area in the search result list.
8. The method according to any one of claims 1-7, wherein the third party is a website administrator or a webpage provider.
9. A system for generating a structured summary of a social webpage, the system comprising:
a character feature description tag querier, configured to query whether the source code header of the social webpage contains one or more character feature description tags;
a content field and type identifier extractor, configured to extract the corresponding content field and type identifier from each description tag when one or more character feature description tags exist; and
a structured summary generator, configured to build the arrangement of the content fields according to the type identifiers, thereby generating the structured summary of the social webpage.
10. The system according to claim 9, further comprising a structured summary field adder, configured for a third party to add a structured summary field identified by a predetermined auxiliary tag to the head portion of the webpage source code; the structured summary field includes at least one content field and its corresponding type identifier.
CN201410806618.7A 2014-12-22 2014-12-22 Method and system for generating structured abstract of social webpage CN105786834A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410806618.7A CN105786834A (en) 2014-12-22 2014-12-22 Method and system for generating structured abstract of social webpage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410806618.7A CN105786834A (en) 2014-12-22 2014-12-22 Method and system for generating structured abstract of social webpage

Publications (1)

Publication Number Publication Date
CN105786834A true CN105786834A (en) 2016-07-20

Family

ID=56385274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410806618.7A CN105786834A (en) 2014-12-22 2014-12-22 Method and system for generating structured abstract of social webpage

Country Status (1)

Country Link
CN (1) CN105786834A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8271517B2 (en) * 2008-12-09 2012-09-18 International Business Machines Corporation Presentation of websites to a computer user
CN103324622A (en) * 2012-03-21 2013-09-25 北京百度网讯科技有限公司 Method and device for automatic generating of front page abstract
CN103324665A (en) * 2013-05-14 2013-09-25 亿赞普(北京)科技有限公司 Hot spot information extraction method and device based on micro-blog
CN104156452A (en) * 2014-08-18 2014-11-19 中国人民解放军国防科学技术大学 Method and device for generating webpage text summarization

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250421A (en) * 2016-07-25 2016-12-21 深圳市金立通信设备有限公司 A kind of method shooting process and terminal
CN107784056A (en) * 2017-02-20 2018-03-09 平安科技(深圳)有限公司 Page data lookup method and device
CN107784056B (en) * 2017-02-20 2020-03-06 平安科技(深圳)有限公司 Page data searching method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20160720