CN104462083B - Method, apparatus and information processing system for content comparison - Google Patents

Info

Publication number
CN104462083B
Authority
CN
China
Prior art keywords
project
candidate
centering
compared
pair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310416233.5A
Other languages
Chinese (zh)
Other versions
CN104462083A (en
Inventor
黄耀海
胡钦谙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc
Priority to CN201310416233.5A
Publication of CN104462083A
Application granted
Publication of CN104462083B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This disclosure relates to a method, an apparatus and an information processing system for content comparison. The method includes: identifying items contained in at least two objects to be compared, the items including at least one of a phrase, a sentence, a paragraph, a table and an image; pairing the identified items to generate candidate item pairs, each candidate item pair including at least two items that come from different objects to be compared; determining features of each of the generated candidate item pairs based on at least one of a first predefined rule, first user history behavior and the text content of the objects to be compared; and, based on the determined features, determining at least one of the generated candidate item pairs to be a comparable item pair, wherein each item contained in a comparable item pair is a comparable item. With this scheme, the comparable items in the objects to be compared can be identified automatically and efficiently.

Description

Method, apparatus and information processing system for content comparison
Technical field
The present invention relates to the field of data processing, and more particularly to a method, an apparatus and an information processing system for content comparison in the field of data processing.
Background art
Content that appears in different objects sometimes needs to be compared to help a user select the object he or she prefers. Such content comparison usually requires a person to find the comparable items manually, for example comparable phrases, sentences, paragraphs, tables and pictures, from the text, pictures and so on recorded in the objects.
In the related art, determining comparable items for content comparison usually requires human participation and guidance. For example, when a wedding planner introduces wedding plans to a couple using paper media such as leaflets and picture posters on which the plans are recorded, the planner compares the content recorded on these different paper media in order to tell the couple how the plans differ. Such a comparison is usually performed manually. Specifically, the couple asks the wedding planner about the specific content they are interested in (for example, cost, time required and stage arrangement), and the planner then explains the differences by marking the corresponding content on the different leaflets with a pen. This process is repeated many times to compare the content of the different leaflets in more detail.
However, content comparison carried out in this manual question-and-answer or heuristic manner tends to miss important content that needs to be compared, so the content found for comparison is not comprehensive, and a summary or overview of the compared content may also be lacking.
It can be seen that, because a large amount of human effort is needed to find the content that can be compared, the process takes a long time, and, because human effort is limited, it may not be possible to find all the comparable items. Content comparison therefore cannot be performed effectively in the related art.
Summary of the invention
Embodiments of the present invention provide a method, an apparatus and an information processing system for content comparison that can automatically identify items that can be compared, so that content comparison can be performed effectively.
According to one aspect of the present invention, a method for content comparison is provided. The method includes: identifying items contained in at least two objects to be compared, the items including at least one of a phrase, a sentence, a paragraph, a table and an image; pairing the identified items to generate candidate item pairs, each candidate item pair including at least two items that come from different objects to be compared; determining features of each of the generated candidate item pairs based on at least one of a first predefined rule, first user history behavior and the text content of the objects to be compared; and, based on the determined features, determining at least one of the generated candidate item pairs to be a comparable item pair, wherein each item contained in a comparable item pair is a comparable item.
Further, according to an embodiment of the present invention, identifying the items contained in the at least two objects to be compared includes: identifying the items contained in the at least two objects to be compared in response to detecting a predetermined user behavior; or identifying the items contained in the at least two objects to be compared in response to determining that the relative positional relationship of the at least two objects to be compared satisfies a predetermined relationship.
Further, according to an embodiment of the present invention, the predetermined user behavior includes at least one of the following: a user utterance indicating that the user wishes to perform content comparison; selection, from among multiple options, of the option for performing a content comparison operation; an action of dragging the objects to be compared to a specific region for comparison; an action of placing the objects to be compared in a specific region for comparison; an action of placing one object to be compared on top of another; an action of placing the objects to be compared in alignment; and an action of placing the objects to be compared so that they partially overlap. In addition, the predetermined relationship includes at least one of the following: the objects to be compared are located in a specific region for comparison; the objects to be compared are aligned; and the objects to be compared partially overlap.
Further, according to an embodiment of the present invention, identifying the items contained in the at least two objects to be compared includes: recognizing the content of the objects to be compared in at least one of the following ways: recognizing the content of an object to be compared by optical character recognition; recognizing the content of an object to be compared by scanning the object; and reading a barcode that is included in an object to be compared and stores information related to that object, and obtaining, according to the barcode information read, the content stored in a database in association with the object; and extracting at least one of a phrase, a sentence, a paragraph, a table and an image from the recognized content as the items.
Further, according to an embodiment of the present invention, determining the features of each of the generated candidate item pairs based on the first predefined rule includes: determining a feature value of a candidate item pair by judging whether the items contained in the pair are defined as comparable items in a database; and/or determining a feature value of the candidate item pair by judging whether the items contained in the pair have a similar layout within their respective objects to be compared.
Further, according to an embodiment of the present invention, determining the features of each of the generated candidate item pairs based on the first user history behavior includes: calculating the co-occurrence count of the items in a candidate item pair based on whether the user's speech, as determined by speech recognition, contains items that fit a comparative syntactic or pragmatic template, the candidate item pair being formed from those items; and/or calculating the co-occurrence count of the items contained in a candidate item pair based on whether the user's eyes are detected moving alternately between the items contained in the pair; and/or calculating the co-occurrence count of the items contained in a candidate item pair based on whether a pointer is detected pointing alternately between the items contained in the pair; and/or calculating the co-occurrence count of the items contained in a candidate item pair based on whether the items contained in the pair are detected being placed side by side.
Further, according to an embodiment of the present invention, determining the features of each of the generated candidate item pairs based on the text content of the objects to be compared includes: calculating the co-occurrence count of the items contained in a candidate item pair based on whether those items fit a comparative syntactic or pragmatic template within their respective objects to be compared; and/or calculating the co-occurrence count of the items in a candidate item pair based on whether retrieval results contain items that fit a comparative syntactic or pragmatic template, the candidate item pair being formed from those items, wherein the retrieval results are obtained by searching a database with the items contained in the candidate item pair.
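The template-based co-occurrence count described above can be illustrated with a minimal sketch. The patent text does not specify the comparative templates, so the three regular-expression templates below, the function name `cooccurrence_count` and the example sentences are all assumptions chosen for illustration only.

```python
import re

# Hypothetical comparative templates (assumptions, not taken from the patent).
# {a} and {b} are replaced by the two items of a candidate item pair.
TEMPLATES = [
    r"{a}\s+(?:vs\.?|versus)\s+{b}",
    r"{a}\s+is\s+\w+\s+than\s+{b}",
    r"compare\s+{a}\s+(?:with|to|and)\s+{b}",
]


def cooccurrence_count(item_a, item_b, texts):
    """Count how often the two items fit a comparative template in the texts.

    Both orders (a, b) and (b, a) are tried, because a comparison may
    mention either item first.
    """
    count = 0
    for a, b in ((item_a, item_b), (item_b, item_a)):
        for template in TEMPLATES:
            pattern = re.compile(
                template.format(a=re.escape(a), b=re.escape(b)),
                re.IGNORECASE,
            )
            for text in texts:
                count += len(pattern.findall(text))
    return count
```

The `texts` argument could hold the text content of the objects to be compared, transcribed user speech, or retrieval results; the resulting count would then serve as one feature value of the candidate item pair.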
Further, according to an embodiment of the present invention, determining at least one of the generated candidate item pairs to be a comparable item pair based on the determined features includes: for each candidate item pair, calculating a total score of the pair from its determined features; and determining the candidate item pairs whose total score exceeds a predetermined threshold to be comparable item pairs.
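The score-and-threshold step can be sketched as follows. The text only says that a total score is calculated from the features; the weighted sum, the feature names and the function name `select_comparable_pairs` used here are assumptions.

```python
def select_comparable_pairs(pair_features, weights, threshold):
    """Return (pair, total_score) for pairs whose total score exceeds the threshold.

    pair_features maps a candidate item pair to a dict of feature values.
    weights maps a feature name to its weight (assumed; missing names weigh 1.0).
    """
    selected = []
    for pair, features in pair_features.items():
        # Weighted sum of feature values: one possible way to form a total score.
        total = sum(weights.get(name, 1.0) * value
                    for name, value in features.items())
        if total > threshold:
            selected.append((pair, total))
    return selected
```

For example, a pair with a rule feature of 1.0 (weight 2.0) and a co-occurrence feature of 3.0 (weight 1.0) scores 5.0 and passes a threshold of 2.0, while a pair scoring 1.0 is discarded.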
Further, according to an embodiment of the present invention, determining at least one of the generated candidate item pairs to be a comparable item pair based on the determined features includes: for each candidate item pair, judging, based on the determined features of the pair and according to a comparable item pair model, whether the pair is a comparable item pair, wherein the comparable item pair model is obtained by a machine learning algorithm that learns from a large number of known comparable item pairs and their features under at least one of the first predefined rule, the first user history behavior and the text content of the objects to be compared.
Further, according to an embodiment of the present invention, the method for content comparison further includes: filtering the generated candidate item pairs based on a second predefined rule; wherein the features of each of the filtered candidate item pairs are determined based on at least one of the first predefined rule, the first user history behavior and the text content of the objects to be compared.
Further, according to an embodiment of the present invention, filtering the generated candidate item pairs based on the second predefined rule includes: removing candidate item pairs that contain an item longer than a predetermined length; and/or removing candidate item pairs each of whose items contains the corresponding item of another candidate item pair; and/or removing candidate item pairs that contain an item outside the range to be compared, wherein the range to be compared is determined by a user selection behavior in the objects to be compared and/or by the relative positional relationship of the objects to be compared satisfying a predetermined relationship.
Further, according to an embodiment of the present invention, the user selection behavior includes at least one of the following: input of a user utterance indicating the range to be compared, and an action of specifying the range to be compared with a pointer. In addition, the predetermined relationship includes at least one of the following: the aligned portions of the objects to be compared are the range to be compared, and the overlapping portions of the objects to be compared are the range to be compared.
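The first two filtering rules above (length limit and containment) can be sketched as follows, assuming items are plain strings and a candidate item pair is a tuple of strings. The function names and the use of character length as the length measure are assumptions; the range-based rule is omitted because it depends on how the range to be compared is captured.

```python
def filter_candidate_pairs(pairs, max_length):
    """Apply two of the filtering rules to a list of candidate item pairs."""
    # Rule 1: drop pairs containing an item longer than the predetermined length.
    short = [p for p in pairs if all(len(item) <= max_length for item in p)]

    def subsumes(outer, inner):
        # True if every item of `outer` contains the corresponding item of `inner`.
        return outer != inner and all(i in o for o, i in zip(outer, inner))

    # Rule 2: drop pairs whose items each contain the corresponding item
    # of another candidate item pair.
    return [p for p in short if not any(subsumes(p, q) for q in short)]
```

With a maximum length of 12, the pair `("total cost", "total price")` is removed because each of its items contains the corresponding item of `("cost", "price")`, which is kept.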
Further, according to an embodiment of the present invention, the method for content comparison further includes: sorting the determined comparable item pairs so that they are displayed to the user in sorted order.
Further, according to an embodiment of the present invention, the determined comparable item pairs are sorted according to at least one of the following: the total score of the features of the comparable item pairs; mixed reality system settings; user preferences based on a user profile or user social network information; the order in which the items of the comparable item pairs appear in the objects to be compared; the similarity between the items within a comparable item pair; the workflow order among the comparable item pairs; and the chronological order among the comparable item pairs.
Further, according to an embodiment of the present invention, the method for content comparison further includes: displaying the determined comparable item pairs in the form of a table, wherein the items of each comparable item pair are arranged along a first direction, different comparable item pairs are arranged along a second direction, the first direction is one of the row direction and the column direction, and the second direction is the other of the row direction and the column direction.
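The two possible table layouts can be sketched as follows. The function name `build_comparison_table`, the flag name and the example values are assumptions; the table is represented simply as a list of rows.

```python
def build_comparison_table(comparable_pairs, first_direction_is_row=True):
    """Lay out comparable item pairs as a list of table rows.

    If the first direction is the row direction, the items of one pair share
    a row and different pairs stack downwards. Otherwise the layout is
    transposed: the items of one pair share a column.
    """
    rows = [list(pair) for pair in comparable_pairs]
    if first_direction_is_row:
        return rows
    # Transpose so that each pair occupies one column.
    return [list(column) for column in zip(*rows)]
```

For two pairs `("$5000", "$8000")` and `("3 hours", "5 hours")`, the row layout gives one pair per row, while the transposed layout gives one object to be compared per row.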
Further, according to an embodiment of the present invention, displaying the determined comparable item pairs in the form of a table includes: determining the items of the comparable item pair with the highest feature-based total score to be the table entry names along the first direction.
Further, according to an embodiment of the present invention, displaying the determined comparable item pairs in the form of a table includes: determining candidate names for the table entries along the second direction by performing intent mining with the items of the comparable item pair having the highest feature-based total score as query terms; pairing the candidate names with the comparable item pairs to form candidate name-item pairs, each candidate name-item pair including one candidate name and one comparable item pair; determining features of each of the generated candidate name-item pairs based on at least one of a third predefined rule, second user history behavior and the text content of the objects to be compared; and, based on the determined features, associating the candidate name in each of at least one of the generated candidate name-item pairs with the comparable item pair in that candidate name-item pair, thereby determining the table entry names along the second direction.
Further, according to an embodiment of the present invention, determining the features of each of the generated candidate name-item pairs based on the third predefined rule includes: determining a feature value of a candidate name-item pair by judging whether the candidate name contained in the pair and at least one item of the comparable item pair are defined in a database as a name and an item that are associated with each other.
Further, according to an embodiment of the present invention, determining the features of each of the generated candidate name-item pairs based on the second user history behavior includes: calculating the co-occurrence count of the candidate name and the comparable item pair contained in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair occur in succession within a predetermined time window in the user's speech as determined by speech recognition; and/or calculating the co-occurrence count of the candidate name and the comparable item pair contained in a candidate name-item pair based on whether the user's eyes are detected moving in succession between the candidate name and at least one item of the comparable item pair; and/or calculating the co-occurrence count of the candidate name and the comparable item pair contained in a candidate name-item pair based on whether a pointer is detected pointing in succession between the candidate name and at least one item of the comparable item pair.
Further, according to an embodiment of the present invention, determining the features of each of the generated candidate name-item pairs based on the text content of the objects to be compared includes: calculating the co-occurrence count of the candidate name and the comparable item pair contained in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair occur in succession within a predetermined spatial window in the text content of one of the objects to be compared; and/or calculating the co-occurrence count of the candidate name and the comparable item pair contained in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair occur in succession within a predetermined spatial window in the text content contained in retrieval results, wherein the retrieval results are obtained by searching a database with the candidate name and at least one item of the comparable item pair contained in the candidate name-item pair.
Further, according to an embodiment of the present invention, associating, based on the determined features, the candidate name in each of at least one of the generated candidate name-item pairs with the comparable item pair includes: for each candidate name-item pair, calculating a total score of the pair from its determined features; and associating the candidate name in each candidate name-item pair whose total score exceeds a predetermined threshold with the comparable item pair in that candidate name-item pair, so that the candidate name is used as the table entry name corresponding to the comparable item pair.
Further, according to an embodiment of the present invention, associating, based on the determined features, the candidate name in each of at least one of the generated candidate name-item pairs with the comparable item pair includes: for each candidate name-item pair, judging, based on the determined features of the pair and according to a name-item pair model, whether the candidate name in the pair is associated with the comparable item pair in the pair, wherein the name-item pair model is obtained by a machine learning algorithm that learns from a large number of known name-item pairs and their features under at least one of the third predefined rule, the second user history behavior and the text content of the objects to be compared.
According to another aspect of the present invention, an apparatus for content comparison is provided, including: an identification unit configured to identify items contained in at least two objects to be compared, the items including at least one of a phrase, a sentence, a paragraph, a table and an image; a pairing unit configured to pair the identified items to generate candidate item pairs, each candidate item pair including at least two items that come from different objects to be compared; a feature determination unit configured to determine features of each of the generated candidate item pairs based on at least one of a first predefined rule, first user history behavior and the text content of the objects to be compared; and a comparable item determination unit configured to determine, based on the determined features, at least one of the generated candidate item pairs to be a comparable item pair, wherein each item contained in a comparable item pair is a comparable item.
According to yet another aspect of the present invention, an information processing system is provided, including: the apparatus for content comparison described above; and a display device configured to display the comparable items determined by the apparatus for content comparison.
According to the above technical solution, the features of the candidate item pairs generated by pairing the identified items are determined based on at least one of the first predefined rule, the first user history behavior and the text content of the objects to be compared, and one or more candidate item pairs can then be determined from these features such that the paired items they contain are comparable items. The comparable items in the objects to be compared can thus be identified automatically. Because the comparable items are identified automatically, the content omissions and the large time cost caused by manual identification of comparable items can be avoided, and the manual question-and-answer or heuristic manner of the related art is unnecessary, so content comparison can be performed effectively.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a block diagram of an exemplary hardware configuration of a computer system in which embodiments of the present invention can be implemented.
Fig. 2 is a flowchart of a method for content comparison according to an embodiment of the present invention.
Fig. 3 is a flowchart of a method for determining comparable item pairs based on features according to an embodiment of the present invention.
Fig. 4 is a flowchart of another method for content comparison according to an embodiment of the present invention.
Fig. 5 shows an example of the intermediate results obtained by processing example objects to be compared using the method shown in Fig. 2 according to an embodiment of the present invention.
Fig. 6 shows an example in which the comparable item pairs obtained in Fig. 5 are displayed in the form of a table.
Fig. 7 shows an example of determining table entry names for the table in Fig. 6.
Fig. 8 is a flowchart of a method for determining the table entry names along the second direction according to an embodiment of the present invention.
Fig. 9 shows an example of the intermediate results obtained by processing the comparable item pairs obtained in Fig. 5 using the method of Fig. 8 according to an embodiment of the present invention.
Fig. 10 is a flowchart of a method for determining the table entry names along the second direction based on features according to an embodiment of the present invention.
Fig. 11 is a structural block diagram of an apparatus for content comparison according to an embodiment of the present invention.
Fig. 12 is a structural block diagram of another apparatus for content comparison according to an embodiment of the present invention.
Fig. 13 is a structural block diagram of an information processing system according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Note that similar reference numerals and letters refer to similar items in the figures, so once an item is defined in one figure, it need not be discussed again for subsequent figures. In this disclosure, terms such as "first" and "second" are used only to distinguish elements or steps, and are not intended to indicate chronological order, preference or importance.
Fig. 1 is a block diagram showing the hardware configuration of a computer system 1000 in which embodiments of the present invention can be implemented.
As shown in Fig. 1, the computer system includes a computer 1110. The computer 1110 includes a processing unit 1120, a system memory 1130, a fixed non-volatile memory interface 1140, a removable non-volatile memory interface 1150, a user input interface 1160, a network interface 1170, a video interface 1190 and a peripheral interface 1195, which are connected via a system bus 1121.
The system memory 1130 includes a ROM (read-only memory) 1131 and a RAM (random access memory) 1132. A BIOS (basic input/output system) 1133 resides in the ROM 1131. An operating system 1134, application programs 1135, other program modules 1136 and certain program data 1137 reside in the RAM 1132.
A fixed non-volatile memory 1141, such as a hard disk, is connected to the fixed non-volatile memory interface 1140. The fixed non-volatile memory 1141 can store, for example, an operating system 1144, application programs 1145, other program modules 1146 and certain program data 1147.
Removable non-volatile memories, such as a floppy disk drive 1151 and a CD-ROM drive 1155, are connected to the removable non-volatile memory interface 1150. For example, a floppy disk 1152 can be inserted into the floppy disk drive 1151, and a CD (compact disc) 1156 can be inserted into the CD-ROM drive 1155.
Input devices such as a mouse 1161 and a keyboard 1162 are connected to the user input interface 1160.
The computer 1110 can be connected to a remote computer 1180 through the network interface 1170. For example, the network interface 1170 can be connected to the remote computer 1180 via a local area network 1171. Alternatively, the network interface 1170 can be connected to a modem (modulator-demodulator) 1172, and the modem 1172 is connected to the remote computer 1180 via a wide area network 1173.
The remote computer 1180 may include a memory 1181, such as a hard disk, that stores remote application programs 1185.
The video interface 1190 is connected to a monitor 1191.
The peripheral interface 1195 is connected to a printer 1196 and a loudspeaker 1197.
The computer system shown in Fig. 1 is merely illustrative and is in no way intended to limit the invention, its application, or its uses.
The computer system shown in Fig. 1 may be incorporated in any embodiment, either as a stand-alone computer or as a processing system within a device. One or more unnecessary components may be removed from it, and one or more additional components may be added to it.
Next, a method 200 for content comparison according to an embodiment of the present invention is described with reference to Fig. 2.
As shown in Fig. 2, method 200 includes: in S210, identifying items contained in at least two objects to be compared, the items including at least one of a phrase, a sentence, a paragraph, a table and an image; in S220, pairing the identified items to generate candidate item pairs, each candidate item pair including at least two items that come from different objects to be compared; in S230, determining features of each of the generated candidate item pairs based on at least one of a first predefined rule, first user history behavior and the text content of the objects to be compared; and in S240, determining, based on the determined features, at least one of the generated candidate item pairs to be a comparable item pair, wherein each item contained in a comparable item pair is a comparable item.
Method 200 may be implemented by a user terminal, jointly by a user terminal and a network-side processing device, or by a network-side processing device alone. Based on the first predefined rule, the first user history behavior and/or the text content of the objects to be compared, method 200 extracts the features of each candidate item pair and thereby determines which candidate item pairs contain items that can serve as comparable items. The comparable items in the objects to be compared are thus identified automatically, which avoids both the participation of a large amount of human effort and the omission of comparable items, and so improves the accuracy and efficiency of comparable item identification.
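The pairing step S220 can be illustrated with a minimal sketch that pairs every item of one object with every item of each other object, so that the two items of a pair always come from different objects. The exhaustive pairing, the function name `generate_candidate_pairs` and the dict-based input are assumptions; the text does not prescribe how pairs are enumerated.

```python
from itertools import combinations, product


def generate_candidate_pairs(objects):
    """Pair items that come from different objects to be compared.

    objects maps an object identifier to the list of items identified in it.
    Each returned pair is ((object_id_a, item_a), (object_id_b, item_b)).
    """
    pairs = []
    # Take every unordered combination of two distinct objects...
    for (id_a, items_a), (id_b, items_b) in combinations(objects.items(), 2):
        # ...and pair each item of the first with each item of the second.
        for item_a, item_b in product(items_a, items_b):
            pairs.append(((id_a, item_a), (id_b, item_b)))
    return pairs
```

Because the number of pairs grows with the product of the item counts, a filtering step before feature extraction can keep the candidate set manageable.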
Specifically, in S210, the project in object to be compared can be identified in several ways.For example, can be first First the content of each object to be compared is identified, phrase, sentence, paragraph, table are then extracted from the content identified With at least one in image as the project in the object to be compared.
The content of each object to be compared is identified and may be used such as under type.For example, existing light can be passed through The content that character recognition technologies treat comparison other is identified.For another example can will wait comparing by scanning object to be compared Be changed into electronic form from printing form with the real world compared with object, then by the data to electronic form be identified come pair The content of object to be compared is identified.According to one embodiment of present invention, depositing of including in object to be compared can be read Contain the bar code of the object-related information to be compared, and according to the bar code information of reading obtain storage in the database with The associated content of object to be compared.Specifically, can in advance store contents of object in the database, and number will be directed toward According to the addressing information of such as address, identifier etc of the position that the contents of object is stored in library be recorded in bar code as pair It is included in the object as relevant information, and by the bar code.In this way, when reading bar code information, storage can be retrieved Contents of object in the database, to identify the content for including in object.
After the content of an object to be compared has been recognized, phrases, sentences, paragraphs, tables, and/or images may be extracted from the recognized content as items of that object that might be compared with content in the other objects to be compared. Specifically, phrases, sentences, and the like that have specific meanings may be extracted from the recognized content according to existing semantic recognition methods. Phrases, sentences, and the like may also be extracted according to known phrases, sentences, and the like stored in advance in a database. Paragraphs, tables, images, and the like may also be captured from the recognized content according to the specific formats they have.
Through the steps described in detail below, it can be determined which of the identified items can be compared with one another. Note that the term "item" used herein may denote content that has a specific meaning understandable to the general public and that can be semantically distinguished from other content. The term "phrase" may denote a single word, a phrase composed of multiple words, and so on. The term "sentence" may denote the part of an object that appears between two adjacent punctuation marks, which may be the same symbol (for example, both periods) or different symbols (for example, one a period and the other a comma). The term "paragraph" may denote the part between two adjacent line breaks (for example, entered with the Enter key). The term "table" may denote content having a tabular style. The term "image" may denote a static or dynamic image, and may be part or all of a complete image.
When at least two objects require content comparison, if a predetermined user behavior is detected, the objects are considered to need comparison, and the items included in each of the objects to be compared are then identified. The predetermined user behavior may include at least one of the following: user speech indicating that the user wishes to perform content comparison; selection, among multiple options, of the option for performing a content comparison operation; an action of dragging the objects to be compared into a specific region used for comparison; an action of placing the objects to be compared in the specific region used for comparison; an action of placing one of the objects to be compared on top of another; an action of placing the objects to be compared in alignment; and an action of placing the objects to be compared so that they partially overlap.
Specifically, for example, if speech recognition technology determines that the user has said the words "content comparison", it is determined that the user intends to compare content, and identification of the items included in the objects to be compared begins. As another example, if the user selects "content comparison" with a finger on a touchscreen, or with a cursor, on a user terminal such as a mobile phone or tablet computer, it is determined that the user intends to compare content, and identification of the items begins. As a further example, if the user moves the objects to be compared into a designated comparison region in a virtual reality system, or places the objects in alignment in a region that a video camera or scanner can capture, it is determined that the user intends to compare content, and identification of the items begins. Besides the manners listed above, those skilled in the art will appreciate that other user actions, gestures, and the like can also indicate the user's intention to perform content comparison, and recognition of the content of the objects to be compared begins when such an intention is detected.
Alternatively, besides using a predetermined user behavior to judge whether the user intends to compare content, this intention may also be judged from the relative positional relationship of the objects to be compared. According to one embodiment of the present invention, if it is determined that the relative positional relationship of at least two objects to be compared satisfies a predetermined relationship, the items included in each of the at least two objects are identified. The predetermined relationship may include at least one of the following: the objects to be compared are located in a specific region used for comparison; the objects to be compared are aligned; and the objects to be compared partially overlap.
Specifically, for example, if at least two objects to be compared are detected in the specific region used for comparison (for example, a region that a video camera can capture, a region in which content is scanned into electronic information, or a pre-designated region), it is determined that the user intends to compare content, and identification of the items begins. As another example, if the objects to be compared are detected to be substantially aligned with one another (allowing a certain margin of error, for example 10%) or to have an overlapping region, it is determined that the user intends to compare content, and identification of the items begins.
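The positional trigger can be illustrated with a minimal sketch. The `Box` type, the choice of comparing top edges against a fraction of the object height, and the function names are all assumptions made for illustration; the description only requires "substantial alignment" within some error margin (for example 10%) or the existence of an overlapping region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned bounding box of a detected object."""
    x: float
    y: float
    w: float
    h: float


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share a region of non-zero area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def aligned(a: Box, b: Box, tolerance: float = 0.10) -> bool:
    """Assumed alignment test: top edges differ by at most
    `tolerance` times the larger object height."""
    return abs(a.y - b.y) <= tolerance * max(a.h, b.h)


def should_trigger_identification(a: Box, b: Box) -> bool:
    """Start item identification when the objects are aligned or overlap."""
    return aligned(a, b) or overlaps(a, b)
```

Other alignment measures (edge distance, rotation angle, and so on) would fit the description equally well.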
Of course, the conditions that trigger identification of the items in the objects to be compared are not limited to predetermined user behavior and/or the relative positional relationship of the objects. Those skilled in the art can readily conceive of other ways to trigger item identification, for example when the user presses a switch indicating that the user wishes to perform content comparison, or, where the device itself is dedicated to content comparison, when the device is in its working state.
After the items included in each object to be compared have been identified, in S220 the items may be paired to generate candidate item pairs. For example, suppose there are two objects to be compared, A and B, with items A1 to An identified in object A and items B1 to Bm identified in object B. Pairing one item of object A with one item of object B generates one candidate item pair. For objects A and B, a total of n × m candidate item pairs can therefore be generated: {A1, B1}, {A1, B2}, {A1, B3}, ..., {An, Bm-1}, and {An, Bm}. Each "{ }" shown here is one candidate item pair. Similarly, for more than two objects to be compared, each candidate item pair includes one item from each of the different objects. For example, for four objects to be compared, each candidate item pair includes four items, one from each of the four objects. It should be noted that, in pairing, every item in object A may be paired with every item in object B; some items in object A may be paired with every item in object B; or some items in object A may be paired only with some items in object B. The present invention does not limit the specific manner of pairing, as long as the items in each candidate item pair come from different objects to be compared and exactly one item comes from each of those objects.
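The full pairing case of S220 is a Cartesian product, and can be sketched as follows. The function name `generate_candidate_pairs` and the sample item labels are hypothetical; partial pairing would simply pass subsets of the item lists.

```python
from itertools import product


def generate_candidate_pairs(*objects):
    """Return every tuple that takes exactly one item from each object.

    Each argument is the list of items identified in one object to be
    compared. With two objects of n and m items this yields n * m pairs.
    """
    return list(product(*objects))


items_a = ["A1", "A2", "A3"]   # items identified in object A (n = 3)
items_b = ["B1", "B2"]         # items identified in object B (m = 2)
pairs = generate_candidate_pairs(items_a, items_b)
```

With more than two objects, each returned tuple simply grows by one element per object, matching the four-object case described above.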
Then, through the processing of S230 and S240, it can be determined which of the candidate item pairs generated in S220 contain comparable items. Each candidate item pair so determined may be referred to as a comparable item pair, and the items it includes are comparable items. For example, when the processing of S230 and S240 determines that, among the above n × m candidate item pairs, {A2, B6} and {An, B1} are comparable item pairs, A2 and B6 are comparable items, and An and B1 are comparable items.
The term "comparable items" used herein may denote at least two items that can be compared semantically. "Can be compared semantically" means that, in human cognition, the items can be compared, for example because they have the same attribute or describe aspects of things that have the same attribute. For example, "interesting" and "boring" are comparable items because both relate to human perception. "600 to 1000 dollars" and "600 to 800 euros" are comparable items because both relate to price. Although some items can be compared from a mathematical point of view, they are not "comparable items" as described herein. For example, although "600 to 1000 dollars" and "365 days" both contain numbers that can be compared, their semantics differ, so these two items are not "comparable items" as described herein.
Next, how the feature of a candidate item pair is determined, and how comparable item pairs are determined from the feature, are described in detail.
In S230, the feature of a candidate item pair may be determined based on the first predefined rule, the first user history behavior, and/or the text content of the objects to be compared. The feature characterizes how likely it is that the items in the candidate item pair can be compared. The feature may be represented by a vector, each component of which indicates the value obtained in a different case (for example, when the feature is determined based on the first predefined rule, the first user history behavior, or the text content of the objects to be compared). For example, when the feature is determined based on the first predefined rule, a feature value indicating whether the items of the candidate item pair are comparable (for example, a feature value of 1 indicates that the items are comparable and a feature value of 0 indicates that they are not) may be included in the feature. When the feature is determined based on the first user history behavior or the text content of the objects to be compared, the number of co-occurrences (which may also be called the co-occurrence frequency) of the items of the candidate item pair may be included in the feature.
According to an embodiment of the present invention, when the feature of a candidate item pair is determined based on the first predefined rule, the feature value determined based on the first predefined rule may be used as the feature of the candidate item pair under the first predefined rule.
Specifically, the feature value of a candidate item pair may be determined by judging whether the items it includes are defined as comparable items in a database. Known comparable items may be stored in the database in advance, for example comparable items entered manually by administrators, or comparable items obtained beforehand by machine learning over a large amount of content. When the items included in the candidate item pair are judged to be defined as comparable items in the database, the feature value of the candidate item pair is set to a first value (for example, 1); otherwise it is set to a second value (for example, 0). For example, for candidate item pair {A1, B1} (where A1 and B1 come from objects to be compared A and B, respectively), if A1 and B1 can be found stored in the database in the form of comparable items, the feature value of {A1, B1} is set to the first value. Comparable items may, for instance, be stored in the same row of a table in the database, in which case A1 and B1 are considered comparable items if both can be found in one row. Conversely, if A1 and B1 cannot be found stored in the database in the form of comparable items, the feature value of {A1, B1} is set to the second value.
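The same-row lookup can be sketched in a few lines. Here the database is replaced by an in-memory list of sets, one set per table row; that stand-in, the sample rows, and the name `rule_feature_value` are assumptions for illustration only.

```python
# Hypothetical stand-in for the database table: each set is one row
# holding items that are known to be comparable with one another.
KNOWN_COMPARABLE_ROWS = [
    {"interesting", "boring"},
    {"price", "cost", "fee"},
]


def rule_feature_value(pair, rows=KNOWN_COMPARABLE_ROWS,
                       first_value=1, second_value=0):
    """Return first_value if all items of the pair occur in one row."""
    items = set(pair)
    if any(items <= row for row in rows):
        return first_value
    return second_value
```

A real deployment would presumably query a database rather than scan a list, but the decision rule is the same: both items found in one row gives the first value, anything else the second.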
Furthermore, the feature value of a candidate item pair may be determined by judging whether the items it includes have a similar layout in their respective objects to be compared. A similar layout may be determined from the linguistic structure of the context in which an item appears, the formatting characteristics of the item, and so on. When items from different objects have a similar layout in their respective objects, the feature value of the candidate item pair formed by those items may be determined to be a first value, and otherwise a second value, where the first value differs from the second value and may be greater than it. For example, if A1 of candidate item pair {A1, B1} is bold text in object A and B1 is also bold text in object B, A1 and B1 may be considered to have a similar layout in their respective objects, and the feature value of {A1, B1} is set to the first value. Conversely, when A1 and B1 are not both bold, the feature value of {A1, B1} may be set to the second value. As another example, if A1 appears in a table in object A and B1 also appears in a table in object B, the feature value of {A1, B1} may be determined to be the first value. Conversely, when only one of A1 and B1 appears in a table, the feature value of {A1, B1} may be determined to be the second value.
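A minimal sketch of the layout test, assuming each item carries formatting flags extracted earlier. The `Item` type, its `bold` and `in_table` fields, and the decision to treat either shared property as sufficient are illustrative assumptions; the description leaves open how several layout cues are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical item record with formatting flags."""
    text: str
    bold: bool = False
    in_table: bool = False


def layout_feature_value(a: Item, b: Item, first_value=1, second_value=0):
    """First value if both items are bold or both appear in a table."""
    both_bold = a.bold and b.bold
    both_in_table = a.in_table and b.in_table
    return first_value if (both_bold or both_in_table) else second_value
```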
According to an embodiment of the present invention, when the feature of a candidate item pair is determined based on the first user history behavior, the item co-occurrence count determined based on the first user history behavior may be used as the feature of the candidate item pair under the first user history behavior. The first user history behavior may include speech, gestures, eye movements, operation behaviors, and the like performed by the user now and/or in the past. By recognizing user behavior, it can be determined whether to change the co-occurrence count of items.
Recognition of user behavior is described in many documents, and user behavior has attracted increasing attention as a form of information input. For example, in mixed reality technology, capturing user behavior with a camera makes it possible to obtain input information in various forms and then take a suitable action. For example, in U.S. Patent Application Publication No. US2013147836A1, entitled "Make Static Printed Contents to be Dynamic Using Virtual Data", mixed reality technology is used to identify the content selected by a user from the user's physical input behavior, and the selected content is projected as virtual data at the place the user indicates with a finger. In addition, in U.S. Publication No. 20130054622A1, entitled "Method and system of scoring documents based on attributes obtained from a digital document by eye-tracking data analysis", content retrieval is performed by tracking the movement of the user's eyes. However, the existing techniques that use user behavior neither implement nor teach content comparison through user behavior. As a novel input mode that assists content comparison, user behavior not only provides valuable and targeted information input, but also gives user behavior that already occurs in reality a new meaning as input information for content comparison. User behavior is thereby used efficiently, and additional information input is saved when comparable items are determined.
Specifically, the co-occurrence count of the items of a candidate item pair may be calculated based on whether the user's utterance, as determined by speech recognition technology, contains items that fit a comparable syntactic or pragmatic template. For example, when existing speech recognition technology recognizes that the user says "now let us look at the difference between A1 and B1", the utterance fits a comparable syntactic or pragmatic template such as "the difference between ... and ...", so the co-occurrence count corresponding to candidate item pair {A1, B1}, formed by the A1 and B1 contained in the utterance, can be increased. If no items fitting a comparable syntactic or pragmatic template are detected in the user's utterance, the corresponding features of all candidate item pairs may remain unchanged.
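As a rough sketch of the speech-based update, assume the speech recognizer has already produced a text transcript. The two English templates, the function names, and the plain substring-style matching are all illustrative assumptions; an actual system would use whatever template set and language it is configured for.

```python
import re
from collections import Counter

# Hypothetical comparison templates; {a} and {b} are filled with item text.
COMPARISON_TEMPLATES = [
    r"difference between {a} and {b}",
    r"compare {a} (?:with|to|and) {b}",
]


def fits_template(utterance, first, second):
    """True if the utterance matches any template for (first, second)."""
    text = utterance.lower()
    for template in COMPARISON_TEMPLATES:
        pattern = template.format(a=re.escape(first.lower()),
                                  b=re.escape(second.lower()))
        if re.search(pattern, text):
            return True
    return False


def update_counts_from_speech(utterance, candidate_pairs, counts):
    """Increase the co-occurrence count of each pair named in the utterance."""
    for a, b in candidate_pairs:
        # Either order of mention counts as a co-occurrence of the pair.
        if fits_template(utterance, a, b) or fits_template(utterance, b, a):
            counts[(a, b)] += 1
    return counts
```

Pairs that are not mentioned keep their previous counts, which corresponds to the features remaining unchanged.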
Furthermore, the co-occurrence count of the items of a candidate item pair may be calculated based on whether the user's eyes are detected to move alternately between the items included in the candidate item pair. For example, when the user gazes at A1 in object A and then at B1 in object B, or vice versa, the co-occurrence count corresponding to candidate item pair {A1, B1} can be increased. If the user's eyes move back and forth repeatedly between A1 in object A and B1 in object B, the more alternations there are, the larger the increase in the co-occurrence count may be. The co-occurrence count may also be increased only once the number of alternating eye movements between A1 and B1 reaches a predetermined number (for example, 3). If the user's eyes are not detected to focus alternately on items of different objects, the corresponding features of all candidate item pairs may remain unchanged.
Furthermore, the co-occurrence count of the items of a candidate item pair may be calculated based on whether a pointing component is detected to indicate alternately between the items included in the candidate item pair. Here, the pointing component may be a user input tool such as a mouse or stylus, the user's finger, or any other pointing component that can point to an item in an object to indicate that the user selects that item. For example, when the user points with a hand (or a stylus, etc.) at A1 in object A and then at B1 in object B, or vice versa, the co-occurrence count corresponding to candidate item pair {A1, B1} can be increased. If the user's hand (or stylus, etc.) indicates alternately between A1 in object A and B1 in object B repeatedly, the more alternations there are, the larger the increase in the co-occurrence count may be. The co-occurrence count may also be increased only once the number of alternating indications between A1 and B1 reaches a predetermined number (for example, 3). As another example, the user may use a mouse to select A1 by highlighting and then select B1 by highlighting, in which case the co-occurrence count corresponding to candidate item pair {A1, B1} can be increased. If the pointing component does not indicate items alternately between different objects, the corresponding features of all candidate item pairs may remain unchanged.
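Alternation counting works the same way for gaze fixations and for pointer indications, so one sketch covers both. It assumes an upstream tracker that outputs a time-ordered list of item labels; that input format, the function name, and the `min_alternations` parameter (standing in for the predetermined number, for example 3) are assumptions for illustration.

```python
from collections import Counter


def count_alternations(sequence, candidate_pairs, min_alternations=1):
    """Count direct switches between the two items of each candidate pair.

    `sequence` is the time-ordered list of items the user looked at or
    pointed to. Pairs with fewer than `min_alternations` switches are
    dropped, so their co-occurrence count stays unchanged.
    """
    lookup = {}
    for a, b in candidate_pairs:
        lookup[(a, b)] = (a, b)
        lookup[(b, a)] = (a, b)   # a switch in either direction counts

    raw = Counter()
    for previous, current in zip(sequence, sequence[1:]):
        pair = lookup.get((previous, current))
        if pair is not None:
            raw[pair] += 1
    return Counter({p: n for p, n in raw.items() if n >= min_alternations})
```

Here more alternations translate directly into a larger count; capping or scaling the increase would be an equally valid reading of the description.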
Furthermore, the co-occurrence count of the items of a candidate item pair may be calculated based on whether the items included in the candidate item pair are detected to be placed side by side. For example, if the user places A1 of object A and B1 of object B side by side in the same row, A1 and B1 can be considered to need comparison, and the co-occurrence count corresponding to candidate item pair {A1, B1} is increased. If no side-by-side placement of content is detected, or the content placed side by side does not correspond to any candidate item pair, the corresponding features of all candidate item pairs may remain unchanged.
According to an embodiment of the present invention, when the feature of a candidate item pair is determined based on the text content of the objects to be compared, the item co-occurrence count determined based on that text content may be used as the feature of the candidate item pair under the text content of the objects to be compared.
Specifically, the co-occurrence count of the items of a candidate item pair may be calculated based on whether the items included in the candidate item pair fit a comparable syntactic or pragmatic template in their respective objects to be compared. For example, when A1 of candidate item pair {A1, B1} appears in content of object A that fits the structure "... is larger than ..." and B1 appears in content of object B that fits the structure "... is smaller than ...", the co-occurrence count corresponding to {A1, B1} can be increased. Conversely, if the content in which A1 and B1 appear does not fit a comparable syntactic or pragmatic template, the corresponding feature of {A1, B1} may remain unchanged.
Furthermore, the co-occurrence count of the items of a candidate item pair may be calculated based on whether a retrieval result contains items that fit a comparable syntactic or pragmatic template, where the retrieval result is obtained by searching a database with the items included in the candidate item pair. For example, for candidate item pair {A1, B1}, if the content retrieved from the database using A1 and B1 as search keywords indicates that A1 and B1 are comparable (for example, A1 and B1 both appear in the retrieval result and fit a comparable syntactic or pragmatic template), A1 and B1 are considered comparable items, and the co-occurrence count corresponding to {A1, B1} is increased. Conversely, if the result returned for A1 and B1 as search keywords does not involve a comparable syntactic or pragmatic template, the corresponding feature of {A1, B1} may remain unchanged.
Here, a comparable syntactic or pragmatic template may denote a structure indicating that the content it covers is comparable. For example, when multiple records are displayed in tabular form, the content under the corresponding entry of each record is comparable, and such a form is then a comparable syntactic or pragmatic template. As another example, a text structure expressing comparison, such as "... than ...", "compared with ..., ...", "greater than", "less than", "better than", or "worse than", may be a comparable syntactic or pragmatic template.
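The text-content check can be sketched with a small set of regular expressions standing in for the comparison templates. The English patterns, the sentence-level matching, and the function names are illustrative assumptions; they are not the template set of any particular implementation.

```python
import re

# Hypothetical comparison templates expressed as regular expressions.
COMPARATIVE_PATTERNS = [
    re.compile(r"\b(?:more|less|greater|larger|smaller|better|worse)\s+than\b", re.I),
    re.compile(r"\bcompared\s+(?:with|to)\b", re.I),
    re.compile(r"\b(?:superior|inferior)\s+to\b", re.I),
]


def in_comparative_context(item, sentences):
    """True if some sentence mentions the item and fits a template."""
    return any(
        item in sentence and any(p.search(sentence) for p in COMPARATIVE_PATTERNS)
        for sentence in sentences
    )


def text_cooccurrence_increment(pair, sentences_a, sentences_b):
    """1 if both items sit in comparative contexts in their own objects."""
    a, b = pair
    if in_comparative_context(a, sentences_a) and in_comparative_context(b, sentences_b):
        return 1
    return 0
```

The returned increment would be added to the pair's co-occurrence count; a result of 0 leaves the feature unchanged.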
One or more of the above ways of determining feature values and item co-occurrence counts may be computed together and jointly form the feature of a candidate item pair. For example, when the feature of a candidate item pair is determined, the feature values determined based on the first predefined rule may be placed in the first and second components of the feature, the co-occurrence counts determined based on the first user history behavior may be placed in the third to sixth components, and the co-occurrence counts determined based on the text content of the objects to be compared may be placed in the seventh and eighth components, so as to jointly form the feature of the candidate item pair. This way of determining the feature is only an example; those skilled in the art will be able to conceive of other ways of determining the feature of a candidate item pair to indicate how likely it is that its items are comparable.
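The eight-component layout can be sketched as a small record type. The grouping (components 1 and 2 from the rule, 3 to 6 from user behavior, 7 and 8 from text content) follows the example above, but the assignment of a specific cue to each slot and all field names are assumptions for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass
class PairFeature:
    """Hypothetical 8-component feature of one candidate item pair."""
    rule_database: int = 0    # 1: defined as comparable in the database
    rule_layout: int = 0      # 2: similar layout in both objects
    speech_count: int = 0     # 3: co-occurrences in user speech
    gaze_count: int = 0       # 4: alternating eye movements
    pointer_count: int = 0    # 5: alternating pointer indications
    side_by_side: int = 0     # 6: side-by-side placements
    text_template: int = 0    # 7: template matches in the object text
    retrieval: int = 0        # 8: template matches in retrieval results

    def as_vector(self):
        """Return the components in order as a plain list."""
        return list(astuple(self))
```

For instance, `PairFeature(rule_database=1, gaze_count=3).as_vector()` gives a vector with 1 in the first slot and 3 in the fourth.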
After the feature of each candidate item pair has been determined, in S240 comparable item pairs may be determined from the candidate item pairs based on the features, according to an unsupervised algorithm or a supervised algorithm.
According to an embodiment of the present invention, method 300 shown in Fig. 3 may be used to execute the unsupervised algorithm for determining comparable item pairs.
In S310, for each candidate item pair, the total score of the candidate item pair is calculated based on its determined feature.
The total score may be calculated in many ways. For example, the co-occurrence counts and/or feature values determined for a candidate item pair under the different judgment conditions may be combined in a weighted sum that serves as the total score of the candidate item pair. For example, the highest weight may be assigned to the feature value determined based on the first predefined rule, the second-highest weight to the co-occurrence count determined based on the first user history behavior, and the lowest weight to the co-occurrence count determined based on the text content of the objects to be compared. Among the judgment conditions that involve the first user history behavior, different weights may also be assigned to, for example, the case based on the user's eye movement, the case based on indication by a pointing component, and the case based on recognized user speech. Of course, those skilled in the art will readily appreciate that the total score may also be calculated by substituting the co-occurrence counts and/or feature values determined under the different judgment conditions into a known expression for a total score. The present invention does not particularly limit the specific form in which the total score is determined, as long as the total score reflects how likely it is that the items of the candidate item pair are comparable.
In S320, candidate item pairs whose total score exceeds a predetermined threshold are determined to be comparable item pairs.
The predetermined threshold may be set according to the situation. If the comparable items found should be as accurate as possible, the predetermined threshold may be set to a higher value. If as many comparable items as possible should be found, the predetermined threshold may be set to a lower value.
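S310 and S320 can be sketched as a weighted sum followed by a threshold test. The three-component features, the weight values, and the threshold used in the usage example are hypothetical; only the ordering of the weights (rule highest, user behavior next, text content lowest) follows the example given above.

```python
def total_score(feature, weights):
    """S310: weighted sum of the feature components."""
    if len(feature) != len(weights):
        raise ValueError("feature and weights must have the same length")
    return sum(w * x for w, x in zip(weights, feature))


def select_comparable_pairs(features, weights, threshold):
    """S320: keep the pairs whose total score exceeds the threshold."""
    return [pair for pair, feature in features.items()
            if total_score(feature, weights) > threshold]


# Hypothetical features: (rule value, behavior count, text count).
features = {
    ("A1", "B1"): (1, 2, 1),
    ("A1", "B2"): (0, 1, 0),
}
weights = (3.0, 2.0, 1.0)   # assumed: rule > user behavior > text content
comparable = select_comparable_pairs(features, weights, threshold=5.0)
```

Raising `threshold` trades coverage for precision, and lowering it does the opposite, which is the tuning choice described above.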
According to an embodiment of the present invention, method 400 shown in Fig. 4 may be used to execute the supervised algorithm. The supervised algorithm can be realized by learning in advance from a large number of training samples.
S410 to S440 in Fig. 4 are substantially the same as S210 to S240 in Fig. 2. In particular, the execution of S440 relies on a comparable item pair model.
The comparable item pair model is obtained by using a machine learning algorithm to train on a large number of known comparable item pairs and their features under at least one of the first predefined rule, the first user history behavior, and the text content of the objects to be compared. In Fig. 4, the comparable item pair model can be generated by inputting a large number of known comparable item pairs, together with the corresponding features of each under at least one of the first predefined rule, the first user history behavior, and the text content of the objects to be compared, into a machine learning module that executes the machine learning algorithm. The model may be generated with existing machine learning or training methods, such as the naive Bayes algorithm or the support vector machine (SVM) algorithm. Through machine learning, it can be judged from an input feature whether the items corresponding to that feature are comparable. Thus, when the feature of a candidate item pair determined in S430 is input into the learned comparable item pair model, whether the candidate item pair is a comparable item pair, that is, whether its items are comparable, can be determined from the output of the model.
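As one possible instance of the supervised approach, the sketch below implements a small Bernoulli naive Bayes classifier using only the standard library. It makes several assumptions not stated in the description: features are binarized (a component counts as present when it is greater than zero), Laplace smoothing is applied, and the training set also contains pairs labeled as not comparable. The class name and the toy data are hypothetical.

```python
import math


class BernoulliNaiveBayes:
    """Minimal naive Bayes over binarized feature components."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.prob = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Laplace-smoothed probability that each component is > 0.
            self.prob[c] = [
                (sum(1 for r in rows if r[j] > 0) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict(self, x):
        def log_posterior(c):
            score = self.log_prior[c]
            for p, value in zip(self.prob[c], x):
                score += math.log(p if value > 0 else 1.0 - p)
            return score
        return max(self.classes, key=log_posterior)


# Toy training data: label 1 = comparable pair, 0 = not comparable.
X_train = [(1, 1, 1), (1, 0, 1), (0, 1, 1), (0, 0, 0), (0, 0, 1), (1, 0, 0)]
y_train = [1, 1, 1, 0, 0, 0]
model = BernoulliNaiveBayes().fit(X_train, y_train)
```

In S440, `model.predict(feature)` returning 1 would mark the candidate pair as a comparable item pair. An SVM or any other classifier trained on the same features could be substituted.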
In S440, if the feature determined in S430 yields a "comparable" result after being processed by the comparable item pair model, the candidate item pair corresponding to that feature is determined to be a comparable item pair. Here, as with other machine learning methods, the feature extraction used to build the comparable item pair model is the same as the feature extraction used in S430.
In this way, whether through an unsupervised algorithm or a supervised algorithm, comparable item pairs can be determined from the candidate item pairs according to their features, so that comparable items are identified automatically, heavy manual effort and content omissions are avoided, and the accuracy and efficiency of content comparison are improved.
Next, the implementation of the method in Fig. 2 is described in detail with reference to the example in Fig. 5.
Fig. 5 shows hard copy A and hard copy B as the objects to be compared; they present an underwater wedding plan and a makeup wedding plan, respectively. Identification of the items they contain can be started by placing hard copy A and hard copy B side by side or overlapping.
In the execution of S210, the items "underwater wedding", "5000-7000 dollars", and "one-week trip" can be identified from hard copy A, and the items "makeup wedding", "2000-4000 dollars", "one-day trip", and "low risk" can be identified from hard copy B.
In the execution of S220, each item in hard copy A can be paired with each item in hard copy B to form multiple candidate item pairs. Box 220 of Fig. 5 shows the result of step S220.
In the execution of S230, the feature of each candidate item pair is determined here based on user history behavior. More specifically, in this example, the number of times the user's eyes alternate between the items of each candidate item pair and the number of times the user's finger alternately indicates the items of each candidate item pair are detected to determine the co-occurrence counts of the items in the corresponding candidate item pair, which serve as the features under the eye-movement condition and the finger-indication condition, respectively. Box 230 of Fig. 5 shows the result of step S230, where each number is a detected alternation count serving as the co-occurrence count of the corresponding items.
In the execution of S240, in this example, the total score of each candidate item pair is calculated from the co-occurrence count obtained from eye movement and the co-occurrence count obtained from finger indication, using a predetermined weighting scheme or another calculation that reflects the likelihood of comparability. Then, candidate item pairs whose total score exceeds a predetermined threshold (for example, 0.90) are determined to be comparable item pairs, and the comparable items are thereby found. Box 240 of Fig. 5 shows the result of step S240. It can be seen that, through the processing of S210 to S240, "underwater wedding" and "makeup wedding", "5000-7000 dollars" and "2000-4000 dollars", and "one-week trip" and "one-day trip" are respectively determined to be comparable items, so that automatic and effective content comparison is achieved.
Although the example of Fig. 5 compares two hard copies, the present invention is not limited to this. In embodiments of the present invention, the objects to be compared may be not only hard copies, food packaging, and other paper or printed matter bearing content that exist in real life, but also text information, multimedia material, and the like that exist in electronic form and contain content.
For example, in one example, a hard copy can be compared with file content stored in a mobile phone. In this example, the user can photograph the hard copy with the phone's camera, and the comparison between the hard copy and the stored file content is carried out after the user selects the function button for content comparison. The picture of the hard copy and the file content stored in the phone can be sent to a server, which finds the comparable items by executing the method of Fig. 2 and sends the comparable items found back to the phone for display.
In another example, the contents of two files on a user terminal (for example, a computer, mobile phone, personal digital assistant, or tablet computer) can be compared. The comparable items can be identified by the method of Fig. 2. The identification can be executed on the user terminal, or, as in the previous example, the objects to be compared can be sent to a server, which executes the identification and sends the comparable items found back to the user terminal for display. In addition, in some examples, one item of some of the displayed comparable item pairs may be allowed to be empty.
In another example, the comparable item pairs returned by the server or determined by the user terminal can be displayed in a form such as a table, a curve graph, or a bar chart, or projected for display in a mixed reality system. Moreover, when all or some of the items in a comparable item pair are identical, these items can be merged for display, to show that the objects to be compared have the same content in this respect.
In addition, the objects to be compared can also be product descriptions of commodities, diagnosis reports of different patients, different parts of a text document, and so on. Whatever the original form of the objects to be compared, once they are converted into pictures, text, files, or the like in electronic form, automatic content comparison can be achieved by the method of Fig. 2.
According to an embodiment of the invention, the generated candidate item pairs can be filtered based on a second predefined rule. In this case, features are determined only for the candidate item pairs that remain after filtering. Filtering out some of the candidate item pairs reduces the number of pairs whose features must be determined, which speeds up content comparison and saves system resources.
Candidate item pairs that include an item longer than a predetermined length can be removed. For example, if it is specified that a comparable item may not exceed 10 Chinese characters or 20 characters, then when some item of a candidate item pair exceeds the specified maximum length, that candidate item pair is removed and no feature is determined for it in S230.
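The length rule can be sketched as follows. This is an illustrative sketch: the function name and the sample pairs are assumptions, and `len` counts Unicode code points, which only approximates a character limit.

```python
MAX_ITEM_LENGTH = 20  # maximum item length in characters, from the example above


def filter_by_length(candidate_pairs, max_length=MAX_ITEM_LENGTH):
    """Drop every candidate item pair that contains an over-long item."""
    return [pair for pair in candidate_pairs
            if all(len(item) <= max_length for item in pair)]


candidate_pairs = [
    ("one-week trip", "one-day trip"),
    ("this sentence is far longer than twenty characters", "risk"),
]
kept = filter_by_length(candidate_pairs)
```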
Candidate item pairs in which each item contains the corresponding item of another candidate item pair can also be removed. For example, given candidate item pair 1 {A1, B1}, candidate item pair 2 {A2 is achieved in the case of A1, B2 is achieved in the case of B1}, and candidate item pair 3 {A2, B2}, candidate item pair 2 contains the content of both candidate item pairs 1 and 3, so candidate item pair 2 can be removed.
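The containment rule can be sketched as follows, assuming that "contains" means plain substring containment, item by item. That reading, and the function name, are assumptions; the source does not say how containment is tested.

```python
def filter_containing_pairs(candidate_pairs):
    """Remove a pair when each of its items strictly contains the
    corresponding item of some other candidate item pair."""
    kept = []
    for i, pair in enumerate(candidate_pairs):
        contains_other = any(
            i != j and all(inner in outer and inner != outer
                           for inner, outer in zip(other, pair))
            for j, other in enumerate(candidate_pairs)
        )
        if not contains_other:
            kept.append(pair)
    return kept


pairs = [
    ("A1", "B1"),
    ("A2 is achieved in the case of A1", "B2 is achieved in the case of B1"),
    ("A2", "B2"),
]
remaining = filter_containing_pairs(pairs)
```

The strict inequality keeps two identical pairs from removing each other.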
Furthermore, candidate item pairs that include an item outside the range to be compared can be removed, where the range to be compared is determined by a user selection behavior in the objects to be compared and/or by the relative positions of the objects to be compared satisfying a predetermined relationship. For example, the user selection behavior may include at least one of the following: a user voice input that indicates the range to be compared, and an action that specifies the range to be compared with an indicating component. The predetermined relationship may include at least one of the following: the aligned parts of the objects to be compared are the range to be compared, and the overlapping parts of the objects to be compared are the range to be compared. Specifically, for example, when the user selects part of the content in each object to be compared with an indicating component, that part of the content is the range to be compared. If a candidate item pair includes at least one item located outside the range to be compared, that candidate item pair is removed.
According to an embodiment of the invention, after the comparable item pairs are determined, they can also be sorted so that they are displayed to the user in the sorted order. This helps to give the user more valuable information according to user preference or the importance of the comparable item pairs. For example, the determined comparable item pairs can be sorted according to at least one of the following: the total score based on the features of the comparable item pair, the mixed reality system settings, user preferences based on the user profile or the user's social network information, the order in which the items of the comparable item pairs appear in the objects to be compared, the similarity between the items of a comparable item pair, the workflow order between comparable item pairs, and the temporal order between comparable item pairs. Specifically, the feature-based total score can be calculated as described above. According to the mixed reality system settings, the comparable item pairs can be arranged in a predetermined order (for example, dictionary order). Comparable item pairs whose items are more similar can be displayed nearer the front. If one comparable item pair occurs before another in a workflow, the pair that occurs first is placed before the pair that occurs later. If multiple comparable item pairs have an absolute temporal relationship (for example, spring, summer, autumn, and winter), they can be sorted in that absolute temporal order.
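Three of the sorting criteria above can be sketched as follows. The `ComparablePair` record, its `first_position` field, and the scores are hypothetical and serve only to illustrate the ordering.

```python
from dataclasses import dataclass


@dataclass
class ComparablePair:
    items: tuple          # the comparable items of the pair
    total_score: float    # feature-based total score (hypothetical values below)
    first_position: int   # hypothetical offset of the pair's first appearance


def sort_pairs(pairs, key="score"):
    """Sort comparable item pairs by one of the criteria named in the text."""
    if key == "score":       # highest total score first
        return sorted(pairs, key=lambda p: p.total_score, reverse=True)
    if key == "appearance":  # order of appearance in the objects to be compared
        return sorted(pairs, key=lambda p: p.first_position)
    if key == "dictionary":  # predetermined (dictionary) order
        return sorted(pairs, key=lambda p: p.items)
    raise ValueError(f"unknown sort key: {key}")


pairs = [
    ComparablePair(("one-week trip", "one-day trip"), 0.91, 30),
    ComparablePair(("underwater wedding", "makeup wedding"), 0.97, 0),
    ComparablePair(("5000-7000 dollars", "2000-4000 dollars"), 0.93, 12),
]
by_score = [p.items[0] for p in sort_pairs(pairs, "score")]
```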
According to an embodiment of the invention, the determined comparable item pairs can be displayed in a table in which the items of each comparable item pair are arranged in a first direction and different comparable item pairs are arranged in a second direction, the first direction being one of the row direction and the column direction and the second direction being the other. For example, the three comparable item pairs determined in Fig. 5, namely "underwater wedding" and "makeup wedding", "5000-7000 dollars" and "2000-4000 dollars", and "one-week trip" and "one-day trip", can be displayed in the form shown in Fig. 6. In Fig. 6, the first direction corresponds to the row direction and the second direction to the column direction. Of course, the table in Fig. 6 can also be displayed rotated 90 degrees counterclockwise, in which case the first direction corresponds to the column direction and the second direction to the row direction.
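The two table layouts can be sketched as follows. The helper names are assumptions, and a plain transpose stands in for the rotated display: it swaps the roles of rows and columns but does not reproduce the reversed ordering that a true 90-degree rotation would give.

```python
def build_table(pairs):
    """One row per comparable item pair; the items of a pair run along the row."""
    return [list(pair) for pair in pairs]


def transpose(table):
    """Swap the first and second directions of the table."""
    return [list(column) for column in zip(*table)]


table = build_table([
    ("underwater wedding", "makeup wedding"),
    ("5000-7000 dollars", "2000-4000 dollars"),
    ("one-week trip", "one-day trip"),
])
swapped = transpose(table)
```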
When the comparable item pairs are displayed in a table, table entry names can be determined for the row direction and/or the column direction of the table, so that the user can understand the comparable items better and the user experience is improved. The following describes how table entry names are determined, taking the display of comparable item pairs in Fig. 6 as an example. On reading this description, those skilled in the art can readily work out how to determine table entry names when the display in Fig. 6 is rotated 90 degrees counterclockwise.
In a table of the form shown in Fig. 6, the items of the comparable item pair with the highest feature-based total score, as described above, can be determined as the table entry names in the first direction (the row direction in Fig. 6, which here corresponds to the horizontal direction). Specifically, as shown in Fig. 5, the comparable item pair "underwater wedding" and "makeup wedding" has the highest total score, so "underwater wedding" and "makeup wedding" can be used as the table entry names in the row direction, each summarizing its column, as shown in Fig. 7. In Fig. 7, each entry of the first row summarizes the corresponding column.
In addition, to determine the table entry names in the second direction in Fig. 6 (the column direction in Fig. 6, which here corresponds to the vertical direction), the method 800 shown in Fig. 8 can be used.
In S810, intent mining is performed with the items of the comparable item pair with the highest feature-based total score as query terms, to determine candidate names for the table entries in the second direction.
Intent mining is already used in many fields, for example precision marketing to consumers and the suggestions that appear automatically during a search. Embodiments of the invention apply these intent mining techniques to determining table entry names. Specifically, in the example shown in Fig. 5, at least one of "underwater wedding" and "makeup wedding" can be entered as a search keyword in the search box of a web page such as Baidu or Google. The drop-down list of the search box may then show several additional fields, which are obtained by analyzing a large amount of user data and indicate the content most relevant to the entered keyword. These additional fields can serve as candidate names for the table entries in the second direction. Of course, those skilled in the art will appreciate other intent mining approaches for finding the content most relevant to the items of the comparable item pair with the highest feature-based total score and using that content as candidate names. A candidate name is also referred to herein as a facet, which characterizes an attribute of a comparable item pair.
For example, in the example shown in Fig. 5, after "underwater wedding" is entered in the search box, the additions "cost", "duration", and "risk" appear automatically in the drop-down list, so these three additions are used as the candidate names for the table entries in the second direction.
In S820, the candidate names are paired with the comparable item pairs to form candidate name-item pairs, each of which includes one candidate name and one comparable item pair.
For example, for the two comparable item pairs found in Fig. 5 other than the one used as table entry names, together with the candidate names "cost", "duration", and "risk" above, six candidate name-item pairs are obtained, as shown in block 920 of Fig. 9, where the elements of each candidate name-item pair are shown in the same row. Of course, if the comparable item pair {underwater wedding, makeup wedding} is not used as table entry names, it also takes part in forming candidate name-item pairs, and nine candidate name-item pairs are obtained in total.
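The pairing in S820 amounts to a Cartesian product of candidate names and comparable item pairs, which the following sketch illustrates with the values from the example. The variable names are chosen for illustration.

```python
from itertools import product

candidate_names = ["cost", "duration", "risk"]
header_pair = ("underwater wedding", "makeup wedding")  # used as row-direction names
other_pairs = [
    ("5000-7000 dollars", "2000-4000 dollars"),
    ("one-week trip", "one-day trip"),
]

# Each candidate name-item pair holds one candidate name and one comparable item pair.
name_item_pairs = list(product(candidate_names, other_pairs))

# If the header pair is not used for table entry names, it is paired as well.
all_name_item_pairs = list(product(candidate_names, [header_pair] + other_pairs))
```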
In S830, the feature of each of the generated candidate name-item pairs is determined based on at least one of a third predefined rule, second user history behavior, and the text content of the objects to be compared.
The feature of a candidate name-item pair characterizes how likely it is that the candidate name and the comparable item pair it contains are associated (in other words, how likely it is that the candidate name correctly reflects an attribute of the comparable item pair). Like the feature of a candidate item pair described above, the feature of a candidate name-item pair can be represented as a vector, each component of which is the value obtained in a different situation (for example, a feature determined based on the third predefined rule, on the second user history behavior, or on the text content of the objects to be compared). For example, when the feature is determined based on the third predefined rule, a feature value that indicates whether the candidate name and the comparable item pair are associated (for example, 1 if they are associated and 0 if they are not) can be included in the feature of the candidate name-item pair. When the feature is determined based on the second user history behavior or the text content of the objects to be compared, the co-occurrence count of the candidate name and the comparable item pair can be included in the feature of the candidate name-item pair.
According to an embodiment of the invention, when the feature of a candidate name-item pair is determined based on the third predefined rule, the feature value determined based on that rule can be used as the feature of the candidate name-item pair under the third predefined rule.
Specifically, the feature value of a candidate name-item pair can be determined by judging whether its candidate name and at least one item of its comparable item pair are defined in a database as an associated name and item. For example, names and items known to be associated can be stored in the database in advance (the name characterizes an attribute of the item or the category to which it belongs), such as associated names and items entered manually by administrators, or obtained beforehand by machine learning over a large body of content. When the candidate name and at least one item of the comparable item pair are judged to be defined as associated in the database, the feature value of the candidate name-item pair is set to a first value (for example, 1); otherwise it is set to a second value (for example, 0). Note that the multiple first values mentioned in different parts of this document are not necessarily set to the same value, nor are the multiple second values; the terms only distinguish the feature values obtained in different situations.
For example, for the candidate name-item pair {name 1, comparable item pair {A1, B1}}, if the database shows that name 1 is associated with A1, or with B1, or with both A1 and B1, the feature value of the candidate name-item pair can be set to the first value. Conversely, if the database does not show that name 1 is associated with either A1 or B1, the feature value of the candidate name-item pair is set to the second value.
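The rule-based feature value can be sketched as a lookup against stored associations. Here the "database" is modeled as an in-memory set of (name, item) tuples, which is an assumption made for illustration, as are the function name and the sample associations.

```python
def rule_feature(name, pair, associations, first_value=1, second_value=0):
    """Return first_value if the name is associated with at least one item
    of the comparable item pair, otherwise second_value."""
    if any((name, item) in associations for item in pair):
        return first_value
    return second_value


# Hypothetical pre-stored associations between names and items.
associations = {("cost", "5000-7000 dollars"), ("duration", "one-week trip")}

money_pair = ("5000-7000 dollars", "2000-4000 dollars")
cost_value = rule_feature("cost", money_pair, associations)
risk_value = rule_feature("risk", money_pair, associations)
```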
According to an embodiment of the invention, when the feature of a candidate name-item pair is determined based on the second user history behavior, the co-occurrence count of the candidate name and the comparable item pair determined from that behavior can be used as the feature of the candidate name-item pair under the second user history behavior. The second user history behavior may include the user's current and/or past speech, gestures, eye movements, operation behaviors, and so on. By recognizing user behavior, it can be determined whether to change the co-occurrence count of the candidate name and the comparable item pair.
Specifically, the co-occurrence count of the candidate name and the comparable item pair of a candidate name-item pair can be calculated based on whether the candidate name and at least one item of the comparable item pair occur in succession within a predetermined time window in the user's speech, as determined by speech recognition. The predetermined time window is a period set in advance (for example, 10 seconds or 20 seconds), reflecting that items that occur in close succession may be associated. For example, suppose speech recognition determines that the user says "the cost of this plan is probably 300 dollars". The recognized "cost" and "300 dollars" belong to the same candidate name-item pair, as its candidate name and as one item of its comparable item pair, and the interval between them does not exceed the length of the predetermined time window, so the co-occurrence count of that candidate name-item pair can be increased. If the user's speech contains no content that belongs to the same candidate name-item pair and occurs at a sufficiently short interval, the corresponding features of all candidate name-item pairs remain unchanged.
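The time-window rule can be sketched as follows. The input format, a list of (timestamp, recognized phrase) tuples produced by some speech recognizer, is an assumption, as is counting a co-occurrence regardless of which of the two is spoken first.

```python
def speech_cooccurrence(recognized, name, pair, window_seconds=10.0):
    """Count how often the candidate name and an item of the comparable
    item pair are spoken within window_seconds of each other.

    recognized: list of (timestamp_in_seconds, phrase) tuples.
    """
    name_times = [t for t, phrase in recognized if phrase == name]
    item_times = [t for t, phrase in recognized if phrase in pair]
    return sum(1 for tn in name_times for ti in item_times
               if abs(ti - tn) <= window_seconds)


# Hypothetical recognizer output for the sentence in the example above,
# followed by a later, unrelated mention of "cost".
recognized = [(0.0, "cost"), (2.5, "300 dollars"), (60.0, "cost")]
count = speech_cooccurrence(recognized, "cost", ("300 dollars", "500 dollars"))
```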
Furthermore, the co-occurrence count of the candidate name and the comparable item pair of a candidate name-item pair can be calculated based on whether the user's eyes are detected to move in succession between the candidate name and at least one item of the comparable item pair. For example, for the candidate name-item pair {name 1, comparable item pair {A1, B1}}, if the user's eyes are detected to gaze at name 1 and then at either A1 or B1, the co-occurrence count of the candidate name-item pair can be increased. If the user's eyes gaze at name 1 and A1 (or B1) in succession repeatedly, the more repetitions there are, the larger the increase in the co-occurrence count can be. Alternatively, the co-occurrence count can be increased only once the number of times the eyes gaze at name 1 and A1 (or B1) in succession reaches a predetermined number (for example, 3). If the user's eyes are not detected to gaze at a candidate name and an item in succession, the corresponding features of all candidate name-item pairs remain unchanged.
Furthermore, the co-occurrence count of the candidate name and the comparable item pair of a candidate name-item pair can be calculated based on whether an indicating component is detected to indicate the candidate name and at least one item of the comparable item pair in succession. As described above, the indicating component includes a tool held by the user, a finger, and the like. For example, for the candidate name-item pair {name 1, comparable item pair {A1, B1}}, when the user emphasizes name 1 with the indicating component (by pointing at it, underlining it, or selecting it) and then immediately emphasizes either A1 or B1, the co-occurrence count of the candidate name-item pair can be increased. If the user indicates name 1 and A1 (or B1) in succession repeatedly, the more repetitions there are, the larger the increase in the co-occurrence count can be. Alternatively, the co-occurrence count can be increased only once the number of times name 1 and A1 (or B1) are indicated in succession reaches a predetermined number (for example, 3). If the indicating component does not indicate a candidate name and an item in succession, the corresponding features of all candidate name-item pairs remain unchanged.
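The gaze rule and the pointing rule share one counting scheme, sketched below. The input is assumed to be the time-ordered sequence of targets that the eyes fixated or the indicating component pointed at; that input format, the function name, and the `min_repeats` parameter name are assumptions.

```python
def succession_cooccurrence(targets, name, pair, min_repeats=1):
    """Count transitions from the candidate name to an item of the
    comparable item pair in a sequence of gaze or pointing targets.

    With min_repeats > 1 the count is reported only once the number of
    such transitions reaches the predetermined number; otherwise it is 0.
    """
    transitions = sum(1 for prev, cur in zip(targets, targets[1:])
                      if prev == name and cur in pair)
    return transitions if transitions >= min_repeats else 0


# Hypothetical sequence of fixated (or pointed-at) targets.
targets = ["name 1", "A1", "name 1", "B1", "X", "A1"]
count = succession_cooccurrence(targets, "name 1", ("A1", "B1"))
gated = succession_cooccurrence(targets, "name 1", ("A1", "B1"), min_repeats=3)
```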
According to an embodiment of the invention, when the feature of a candidate name-item pair is determined based on the text content of the objects to be compared, the co-occurrence count of the candidate name and the comparable item pair determined from that text content can be used as the feature of the candidate name-item pair under the text content of the objects to be compared.
Specifically, the co-occurrence count of the candidate name and the comparable item pair of a candidate name-item pair can be calculated based on whether the candidate name and at least one item of the comparable item pair appear in succession within a predetermined spatial window in the text content of one of the objects to be compared. The predetermined spatial window is a distance set in advance (for example, a length of 20 or 30 characters), reflecting that items that appear close together may be associated. For example, for the candidate name-item pair {name 1, comparable item pair {A1, B1}}, when name 1 and A1 both appear in the object in which A1 is located and 16 characters separate them, name 1 and A1 are considered likely to be associated, and the co-occurrence count of the candidate name-item pair is increased. If no candidate name and item are found close together in the text of any object to be compared, the features of all candidate name-item pairs remain unchanged.
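The spatial-window rule can be sketched as follows. Measuring the distance as the number of characters between the end of one term and the start of the other, and accepting either order, are assumptions; the helper names are illustrative.

```python
import re


def positions(text, term):
    """Return the (start, end) spans of every occurrence of term in text."""
    return [(m.start(), m.end()) for m in re.finditer(re.escape(term), text)]


def text_cooccurrence(text, name, pair, window=20):
    """Count occurrences where the candidate name and an item of the
    comparable item pair are separated by at most `window` characters."""
    count = 0
    for n_start, n_end in positions(text, name):
        for item in pair:
            for i_start, i_end in positions(text, item):
                # Gap between the two spans, whichever comes first.
                gap = i_start - n_end if i_start >= n_end else n_start - i_end
                if 0 <= gap <= window:
                    count += 1
    return count


money_pair = ("5000-7000 dollars", "2000-4000 dollars")
near = text_cooccurrence("cost: 5000-7000 dollars", "cost", money_pair)
far = text_cooccurrence("cost" + " " * 40 + "5000-7000 dollars", "cost", money_pair)
```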
Furthermore, the co-occurrence count of the candidate name and the comparable item pair of a candidate name-item pair can be calculated based on whether the candidate name and at least one item of the comparable item pair appear in succession within a predetermined spatial window in the text content of a retrieval result, where the retrieval result is obtained by searching a database with the candidate name and at least one item of the comparable item pair. For example, for the candidate name-item pair {name 1, comparable item pair {A1, B1}}, if the content retrieved from the database using name 1 and at least one of A1 and B1 as search keywords indicates that name 1 is associated with at least one of A1 and B1 (for example, name 1 and at least one of A1 and B1 appear close together in the same text of the retrieval result), the co-occurrence count of the candidate name-item pair can be increased. Otherwise, the co-occurrence count of the candidate name-item pair is not changed.
One or more of the above ways of determining feature values and co-occurrence counts can be computed together and jointly constitute the feature of a candidate name-item pair. The way the feature is composed can be similar to the way the feature of a candidate item pair is composed, as described above, and is not repeated here.
Continuing the example of Fig. 9, two counts are detected for each candidate name-item pair: the number of times the user's eyes gaze at the candidate name and an item in succession, and the number of times the user's finger indicates the candidate name and an item in succession. These counts give the co-occurrence counts of the candidate name and the comparable item pair, which serve as the features under the eye-movement condition and the finger-indication condition, respectively. The execution result of step S830 in this example is shown in block 930 of Fig. 9, where each number is a detected co-occurrence count.
In S840, based on the determined features, the candidate name of each of at least one of the generated candidate name-item pairs is associated with the comparable item pair of that candidate name-item pair, thereby determining the table entry names in the second direction.
According to an embodiment of the invention, an unsupervised algorithm or a supervised algorithm can be used to determine, based on the features determined in S830, the candidate name-item pairs whose candidate name and comparable item pair can be associated with each other, thereby determining the table entry names in the second direction (the column direction in Fig. 6).
For example, the method 1000 shown in Fig. 10 can be used as the unsupervised algorithm for determining the table entry names in the second direction.
In S1010, for each candidate name-item pair, the total score of the pair is calculated based on its determined feature.
The total score is calculated in a way similar to that described above in connection with S310, for example by a weighted sum or by another predefined way of obtaining a total score. The invention does not particularly limit how the total score is determined, as long as the total score reflects how likely it is that the candidate name of a candidate name-item pair is correctly associated with its comparable item pair.
In S1020, for each candidate name-item pair whose total score exceeds a predetermined threshold, the candidate name is associated with the comparable item pair of that candidate name-item pair, so that the candidate name becomes the table entry name corresponding to that comparable item pair.
Here, the predetermined threshold can be set to a relatively large value, so that the candidate names associated with the comparable item pairs are found as accurately as possible.
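Steps S1010 and S1020 can be sketched as follows, assuming that the total scores have already been calculated. The scores below are hypothetical; the 0.85 threshold is the example value used later in this description for Fig. 9.

```python
def assign_entry_names(scores, threshold=0.85):
    """Map each comparable item pair to the candidate names whose
    candidate name-item pair scores above the threshold.

    scores: dict mapping (candidate_name, comparable_item_pair) to total score.
    """
    names = {}
    for (name, pair), score in scores.items():
        if score > threshold:
            names.setdefault(pair, []).append(name)
    return names


money = ("5000-7000 dollars", "2000-4000 dollars")
trip = ("one-week trip", "one-day trip")

# Hypothetical total scores for the six candidate name-item pairs.
scores = {
    ("cost", money): 0.92, ("duration", money): 0.10, ("risk", money): 0.05,
    ("cost", trip): 0.08, ("duration", trip): 0.90, ("risk", trip): 0.12,
}
entry_names = assign_entry_names(scores)
```

A comparable item pair with no candidate name above the threshold simply receives no table entry name in this sketch.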
According to an embodiment of the invention, the supervised algorithm for determining the table entry names in the second direction can be realized by learning in advance from a large number of training samples.
Specifically, in step S840 above, a name-item pair model is used to judge, for each candidate name-item pair, whether its candidate name and comparable item pair are associated. The name-item pair model is learned by a machine learning algorithm from a large number of known name-item pairs and their features under at least one of the third predefined rule, the second user history behavior, and the text content of the objects to be compared. The model can be generated with existing machine learning or training methods, which are not described here. After learning, the model can judge from an input feature whether the candidate name and the comparable item pair corresponding to that feature are associated. Thus, when the feature of a candidate name-item pair determined in S830 is input to the learned name-item pair model, the output of the model determines whether the candidate name and the comparable item pair of that candidate name-item pair are associated.
If the name-item pair model outputs "associated" for a feature determined in S830, the candidate name of the candidate name-item pair corresponding to that feature is associated with its comparable item pair. Here, as with other machine learning methods, the feature extraction used to build the name-item pair model is the same as the feature extraction used in S830.
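The supervised variant can be illustrated with a very small stand-in model. The source does not name a learning algorithm, so the perceptron below is a substitute chosen only because it fits in a few lines; the training features (scaled eye and finger co-occurrence counts) and labels are hypothetical.

```python
def train_perceptron(samples, labels, epochs=20, learning_rate=0.1):
    """Learn weights and a bias that separate associated (1) from
    unassociated (0) name-item pairs, given their feature vectors."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            error = label - (1 if activation > 0 else 0)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


def is_associated(features, weights, bias):
    """Apply the learned model to the feature of a candidate name-item pair."""
    return sum(w * x for w, x in zip(weights, features)) + bias > 0


# Hypothetical training data: (scaled eye count, scaled finger count) -> label.
samples = [(0.9, 0.8), (1.0, 0.7), (0.1, 0.2), (0.0, 0.1)]
labels = [1, 1, 0, 0]
weights, bias = train_perceptron(samples, labels)
```

The same feature extraction must be used for training and for prediction, which mirrors the requirement stated above for S830.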
In this way, whether an unsupervised or a supervised algorithm is used, the associated candidate names and comparable item pairs can be determined from the candidate name-item pairs according to their features, so that the table entry names in the second direction are identified automatically. The comparable item pairs can thus be presented to the user more effectively, which improves the user experience.
Continuing the example of Fig. 9, the total score of each candidate name-item pair is calculated from the eye-movement co-occurrence count and the finger-indication co-occurrence count, using a predetermined weighting scheme or another calculation that reflects how likely the association is. The candidate name and the comparable item pair of each candidate name-item pair whose total score exceeds a predetermined threshold (for example, 0.85) are then determined to be associated, which yields the table entry names in the second direction. The execution result of step S840 in this example is shown in block 940 of Fig. 9.
As can be seen, through the processing of S810 to S840, the table entry name corresponding to "5000-7000 dollars" and "2000-4000 dollars" is "cost", and the table entry name corresponding to "one-week trip" and "one-day trip" is "duration", so that table entry names are determined automatically and effectively. Furthermore, because "underwater wedding" and "makeup wedding" are determined above to be the table entry names in the row direction, no specific column-direction table entry name needs to be set for them in the first row in this example. Of course, a name such as "comparison object" can also be set for "underwater wedding" and "makeup wedding", which themselves serve as table entry names. As in the example of Fig. 9, the table entry name of the first row can also default to a name such as "aspect", to show that the first column displays the property or category to which each comparable item pair belongs.
The column-direction table entry names determined in Fig. 9 are filled automatically into the table of Fig. 7. In this way, the comparable item pairs are presented to the user in a table with table entry names in both the row direction and the column direction, which makes content comparison more vivid and intuitive for the user.
The method for content comparison according to embodiments of the invention has been described above. Next, the devices and the system for content comparison according to the invention are described with reference to Figs. 11 to 13.
As shown in Fig. 11, the device 1100 for content comparison includes a recognition unit 1110, a pairing unit 1120, a feature determination unit 1130, and a comparable item determination unit 1140. The recognition unit 1110 can be configured to identify the items included in at least two objects to be compared, the items including at least one of a phrase, a sentence, a paragraph, a table, and an image. The pairing unit 1120 can be configured to pair the identified items to generate candidate item pairs, each candidate item pair including at least two items that come from different objects to be compared. The feature determination unit 1130 can be configured to determine the feature of each of the generated candidate item pairs based on at least one of a first predefined rule, first user history behavior, and the text content of the objects to be compared. The comparable item determination unit 1140 can be configured to determine, based on the determined features, at least one of the generated candidate item pairs to be a comparable item pair, where the items included in each comparable item pair are comparable items.
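The division of labor among the four units can be sketched as a small pipeline. Everything here is a simplification made for illustration: items are taken to be the non-empty lines of a text, the feature is reduced to a single score supplied by a caller-provided function, and the toy `same_last_word` scorer is a hypothetical stand-in for the rule-based and behavior-based features described earlier.

```python
from itertools import product


def recognize_items(object_text):
    """Recognition unit: treat each non-empty line as one item."""
    return [line.strip() for line in object_text.splitlines() if line.strip()]


def pair_items(items_a, items_b):
    """Pairing unit: pair every item of one object with every item of the other."""
    return list(product(items_a, items_b))


def compare(object_a, object_b, feature_fn, threshold):
    """Feature determination and comparable item determination units:
    score each candidate item pair and keep those above the threshold."""
    candidates = pair_items(recognize_items(object_a), recognize_items(object_b))
    return [pair for pair in candidates if feature_fn(pair) > threshold]


def same_last_word(pair):
    """Toy scorer: 1.0 when both items end with the same word, else 0.0."""
    return 1.0 if pair[0].split()[-1] == pair[1].split()[-1] else 0.0


comparable = compare(
    "underwater wedding\n5000-7000 dollars",
    "makeup wedding\n2000-4000 dollars",
    same_last_word,
    threshold=0.90,
)
```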
For the above and other operations and functions of the recognition unit 1110, the pairing unit 1120, the feature determination unit 1130, and the comparable item determination unit 1140, refer to the description of steps S210 to S240 above; to avoid repetition, they are not described again here.
The device for content comparison provided by the embodiment of the invention can extract the feature of each candidate item pair based on the first predefined rule, the first user history behavior, and/or the text content of the objects to be compared, and thereby determine which candidate item pairs include comparable items. The comparable items in the objects to be compared can therefore be identified automatically, which avoids extensive manual work and the omission of comparable items, and so improves the efficiency of identifying comparable items.
Fig. 12 shows a structural block diagram of another device 1200 for content comparison according to an embodiment of the invention. The recognition unit 1210, pairing unit 1220, feature determination unit 1230, and comparable item determination unit 1240 in the device 1200 are essentially the same as the recognition unit 1110, pairing unit 1120, feature determination unit 1130, and comparable item determination unit 1140 in the device 1100, respectively.
According to an embodiment of the present invention, the recognition unit 1210 may be specifically configured to identify the items included in the at least two objects to be compared in response to detecting a predetermined user behavior, or in response to determining that the relative positional relationship of the at least two objects to be compared satisfies a predetermined relationship.
According to an embodiment of the present invention, the predetermined user behavior may include at least one of the following: a user utterance indicating that the user wishes to perform content comparison; selection, among multiple options, of the option for performing a content comparison operation; an action of dragging the objects to be compared into a specific region used for comparison; an action of placing the objects to be compared in the specific region used for comparison; an action of placing one object to be compared on top of another; an action of placing the objects to be compared in alignment; and an action of placing the objects to be compared so that they partially overlap. The predetermined relationship may include at least one of the following: the objects to be compared are located in the specific region used for comparison; the objects to be compared are aligned; the objects to be compared partially overlap.
According to an embodiment of the present invention, the recognition unit 1210 may include a content recognition subunit 1212 and an extraction subunit 1214. The content recognition subunit 1212 may be configured to recognize the content of the objects to be compared in at least one of the following ways: recognizing the content of the objects to be compared by optical character recognition; recognizing the content of the objects to be compared by scanning them; and reading a barcode that is included in an object to be compared and stores information related to that object, then obtaining, according to the barcode information read, the content associated with that object that is stored in a database. The extraction subunit 1214 may be configured to extract, from the recognized content, at least one of the phrase, sentence, paragraph, table, and image as the items.
According to an embodiment of the present invention, the feature determination unit 1230 may include at least one of a first subunit 1231 to an eighth subunit 1238. Specifically, the first subunit 1231 may be configured to determine the feature value of a candidate item pair by judging whether the items included in the candidate item pair are defined as comparable items in a database. The second subunit 1232 may be configured to determine the feature value of a candidate item pair by judging whether the items included in the candidate item pair have a similar arrangement in their respective objects to be compared. The third subunit 1233 may be configured to calculate, based on whether the user's speech as determined by speech recognition contains items that conform to a comparable syntactic or pragmatic template, the co-occurrence count of the items in the candidate item pair formed by those items. The fourth subunit 1234 may be configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether the user's eyes are detected to move alternately between the items included in the candidate item pair. The fifth subunit 1235 may be configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether a pointer is detected to point alternately between the items included in the candidate item pair. The sixth subunit 1236 may be configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether the items included in the candidate item pair are detected to be placed side by side. The seventh subunit 1237 may be configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether the items included in the candidate item pair conform to a comparable syntactic or pragmatic template in their respective objects to be compared. The eighth subunit 1238 may be configured to calculate, based on whether items conforming to a comparable syntactic or pragmatic template exist in a retrieval result, the co-occurrence count of the items in the candidate item pair formed by those items, where the retrieval result is obtained by searching a database with the items included in the candidate item pair.
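As an illustration of a behavior-based co-occurrence count such as the one attributed to the fourth subunit, the following Python sketch counts how often a gaze sequence switches directly between the two items of a candidate item pair. It assumes that eye tracking output has already been mapped to a sequence of item identifiers; that mapping, the function name `count_alternations`, and the exact counting rule are assumptions made for the example.

```python
def count_alternations(gaze_sequence, item_a, item_b):
    """Count direct gaze switches between item_a and item_b.

    Assumption: gaze_sequence lists the identifier of the item the user
    looked at for each fixation, in temporal order.
    """
    count = 0
    for previous, current in zip(gaze_sequence, gaze_sequence[1:]):
        # A switch counts when consecutive fixations cover both items.
        if {previous, current} == {item_a, item_b}:
            count += 1
    return count


gaze = ["price_a", "price_b", "price_a", "title_a", "price_b"]
co_occurrence = count_alternations(gaze, "price_a", "price_b")
```

The same counting rule could be applied to a sequence of pointer targets to approximate the behavior described for the fifth subunit.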
According to an embodiment of the present invention, the comparable item determination unit 1240 may include a calculation subunit 1242 and a determination subunit 1244. The calculation subunit 1242 may be configured to calculate, for each candidate item pair, a total score of the candidate item pair based on the determined features of that candidate item pair. The determination subunit 1244 may be configured to determine the candidate item pairs whose total score exceeds a predetermined threshold as comparable item pairs.
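The scoring and thresholding described above can be sketched as follows. The embodiment does not state how the features are combined into a total score, so the weighted sum used here, the feature names, and the function names are assumptions for illustration only.

```python
def total_score(features, weights=None):
    """Combine feature values into one score (assumed: weighted sum)."""
    if weights is None:
        # Assumption: every feature contributes equally.
        weights = {name: 1.0 for name in features}
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def select_comparable_pairs(pair_features, threshold):
    """Keep the candidate pairs whose total score exceeds the threshold."""
    return [
        pair
        for pair, features in pair_features.items()
        if total_score(features) > threshold
    ]


pair_features = {
    ("battery life", "battery life"): {"rule": 1, "gaze_co_occurrence": 3},
    ("battery life", "price"): {"rule": 0, "gaze_co_occurrence": 1},
}
comparable_pairs = select_comparable_pairs(pair_features, threshold=2)
```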
According to an embodiment of the present invention, the comparable item determination unit 1240 may be specifically configured to judge, for each candidate item pair, whether the candidate item pair is a comparable item pair according to a comparable item pair model, based on the determined features of that candidate item pair. The comparable item pair model is learned, using a machine learning algorithm, from a large number of known comparable item pairs and their features under at least one of the first predefined rule, the first user history behavior, and the text content of objects to be compared.
According to an embodiment of the present invention, the device 1200 may further include a filtering unit 1250. The filtering unit 1250 may be configured to filter the generated candidate item pairs based on a second predefined rule. In this case, the feature determination unit 1230 may be configured to determine the feature of each of the filtered candidate item pairs based on at least one of the first predefined rule, the first user history behavior, and the text content of the objects to be compared.
According to an embodiment of the present invention, the filtering unit 1250 may include at least one of a first removal subunit 1252, a second removal subunit 1254, and a third removal subunit 1256. Specifically, the first removal subunit 1252 may be configured to remove candidate item pairs that contain an item whose length exceeds a predetermined length. The second removal subunit 1254 may be configured to remove candidate item pairs in which each item contains the corresponding item of another candidate item pair. The third removal subunit 1256 may be configured to remove candidate item pairs that contain an item outside the range to be compared, where the range to be compared is determined by a user selection behavior on the objects to be compared and/or by the relative positional relationship of the objects to be compared satisfying a predetermined relationship.
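A rough Python sketch of the first two removal rules is given below. It assumes that items are strings, that item length is measured in characters, and that "contains" means substring containment; all three are assumptions, as is the function name `filter_candidate_pairs`.

```python
def filter_candidate_pairs(pairs, max_length):
    """Apply a length rule and a containment rule to candidate pairs.

    Assumptions: items are strings, length is counted in characters, and
    one item "contains" another when the other is a substring of it.
    """
    # Rule 1: drop pairs that contain an item longer than max_length.
    short_enough = [
        pair for pair in pairs if all(len(item) <= max_length for item in pair)
    ]

    # Rule 2: drop pairs whose items each contain the corresponding item
    # of some other remaining pair.
    kept = []
    for pair in short_enough:
        contains_other = any(
            other != pair
            and all(inner in outer for outer, inner in zip(pair, other))
            for other in short_enough
        )
        if not contains_other:
            kept.append(pair)
    return kept


pairs = [
    ("battery life", "battery life"),
    ("battery life 10 hours", "battery life 8 hours"),
    ("x" * 50, "y"),
]
filtered = filter_candidate_pairs(pairs, max_length=30)
```

In this sketch the containment rule is evaluated only against pairs that survived the length rule; checking against all generated pairs would be an equally plausible reading.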
According to an embodiment of the present invention, the user selection behavior may include at least one of the following: an input of user speech indicating the range to be compared, and an action of specifying the range to be compared with a pointer. The predetermined relationship may include at least one of the following: the part where the objects to be compared are aligned is the range to be compared, and the part where the objects to be compared overlap is the range to be compared.
According to an embodiment of the present invention, the device 1200 may further include a sorting unit 1260. The sorting unit 1260 may be configured to sort the determined comparable item pairs so that they are displayed to the user in the sorted order.
According to an embodiment of the present invention, the sorting unit 1260 may be specifically configured to sort the determined comparable item pairs according to at least one of the following: the total score based on the features of the comparable item pairs, the settings of a mixed reality system, user preferences based on a user profile or user social network information, the order in which the items of the comparable item pairs appear in the objects to be compared, the similarity between the items of a comparable item pair, the workflow order among comparable item pairs, and the temporal order among comparable item pairs.
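The following sketch shows one way two of the listed criteria could be combined: pairs are ordered by descending total score, and ties are broken by the order of appearance in the object to be compared. The dictionary keys `total_score` and `position` and the tie-breaking policy are assumptions chosen for the example.

```python
def sort_comparable_pairs(pairs):
    """Sort by total score (descending), then by position (ascending).

    Assumption: each pair is a dict with a numeric 'total_score' and a
    'position' giving where its first item appears in its object.
    """
    return sorted(pairs, key=lambda pair: (-pair["total_score"], pair["position"]))


pairs = [
    {"items": ("weight", "weight"), "total_score": 3.0, "position": 2},
    {"items": ("price", "price"), "total_score": 5.0, "position": 4},
    {"items": ("battery life", "battery life"), "total_score": 3.0, "position": 1},
]
ordered = [pair["items"][0] for pair in sort_comparable_pairs(pairs)]
```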
According to an embodiment of the present invention, the device 1200 may further include a display unit 1270. The display unit 1270 may be configured to display the determined comparable item pairs in the form of a table, where the items of each comparable item pair are arranged in a first direction and different comparable item pairs are arranged in a second direction. The first direction is one of the row direction and the column direction, and the second direction is the other of the row direction and the column direction.
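The two table orientations can be sketched in Python as follows, with the table represented as a list of rows. The function name `build_table` and the boolean flag are hypothetical; the sketch only demonstrates that switching the first and second directions amounts to transposing the table.

```python
def build_table(comparable_pairs, pairs_as_rows=True):
    """Lay out comparable item pairs as a list of table rows.

    With pairs_as_rows=True the items of one pair lie along a row and
    different pairs are stacked along the column direction. With
    pairs_as_rows=False the table is transposed.
    """
    rows = [list(pair) for pair in comparable_pairs]
    if pairs_as_rows:
        return rows
    # zip(*rows) regroups the cells so that each pair occupies a column.
    return [list(column) for column in zip(*rows)]


pairs = [("10 hours", "8 hours"), ("1.2 kg", "1.5 kg")]
row_layout = build_table(pairs)
column_layout = build_table(pairs, pairs_as_rows=False)
```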
According to an embodiment of the present invention, the display unit 1270 may be specifically configured to determine the items of the comparable item pair with the highest feature-based total score as the table entry names in the first direction.
According to an embodiment of the present invention, the display unit 1270 may include a candidate name determination subunit 1272, a pairing subunit 1274, a feature determination subunit 1276, and a table entry name determination subunit 1278. The candidate name determination subunit 1272 may be configured to determine candidate names for the table entries in the second direction by performing intent mining with the items of the comparable item pair that has the highest feature-based total score as query terms. The pairing subunit 1274 may be configured to pair the candidate names with the comparable item pairs to form candidate name-item pairs, each candidate name-item pair including one candidate name and one comparable item pair. The feature determination subunit 1276 may be configured to determine the feature of each of the generated candidate name-item pairs based on at least one of a third predefined rule, a second user history behavior, and the text content of the objects to be compared. The table entry name determination subunit 1278 may be configured to associate, based on the determined features, the candidate name in each of at least one of the generated candidate name-item pairs with the comparable item pair in that candidate name-item pair, thereby determining the table entry names in the second direction.
According to an embodiment of the present invention, the feature determination subunit 1276 may include at least one of a first component 1281 to a sixth component 1286. Specifically, the first component 1281 may be configured to determine the feature value of a candidate name-item pair by judging whether the candidate name included in the candidate name-item pair and at least one item of the comparable item pair are defined in a database as a name and an item that are associated with each other. The second component 1282 may be configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair occur successively within a predetermined time window in the user's speech as determined by speech recognition. The third component 1283 may be configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the user's eyes are detected to move successively between the candidate name and at least one item of the comparable item pair. The fourth component 1284 may be configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether a pointer is detected to point successively to the candidate name and at least one item of the comparable item pair. The fifth component 1285 may be configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair appear successively within a predetermined spatial window in the text content of one of the objects to be compared. The sixth component 1286 may be configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair appear successively within a predetermined spatial window in the text content included in a retrieval result, where the retrieval result is obtained by searching a database with the candidate name and at least one item of the comparable item pair included in the candidate name-item pair.
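A text-window co-occurrence count of the kind attributed to the fifth component might look like the sketch below. It assumes that the text content has been tokenized, that the candidate name and the item are each a single token, and that the spatial window is measured in tokens on either side of the name; none of these details is specified in the embodiment.

```python
def window_co_occurrence(tokens, name, item, window):
    """Count occurrences of `name` that have `item` within `window` tokens.

    Assumptions: tokens is a list of strings, name and item are single
    tokens, and the window extends `window` tokens to each side.
    """
    count = 0
    for index, token in enumerate(tokens):
        if token != name:
            continue
        start = max(0, index - window)
        neighborhood = tokens[start:index + window + 1]
        if item in neighborhood:
            count += 1
    return count


text = "price of camera A is 100 while price of camera B is 120"
tokens = text.split()
nearby = window_co_occurrence(tokens, "price", "100", window=5)
```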
According to an embodiment of the present invention, the table entry name determination subunit 1278 may include a calculation component 1292 and a table entry name determining component 1294. The calculation component 1292 may be configured to calculate, for each candidate name-item pair, a total score of the candidate name-item pair based on the determined features of that candidate name-item pair. The table entry name determining component 1294 may be configured to associate the candidate name in a candidate name-item pair whose total score exceeds a predetermined threshold with the comparable item pair in that candidate name-item pair, so that the candidate name serves as the table entry name corresponding to that comparable item pair.
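The association step can be sketched as follows. The input is assumed to be a mapping from (candidate name, comparable item pair) to an already computed total score. Keeping only the highest-scoring name when several names exceed the threshold for the same pair is an assumption of this sketch, not something the embodiment states.

```python
def assign_entry_names(name_pair_scores, threshold):
    """Associate each comparable item pair with a candidate name.

    Assumption: when several names exceed the threshold for one pair,
    the name with the highest total score is kept.
    """
    best = {}
    for (name, pair), score in name_pair_scores.items():
        if score <= threshold:
            continue
        if pair not in best or score > best[pair][1]:
            best[pair] = (name, score)
    return {pair: name for pair, (name, _) in best.items()}


scores = {
    ("Battery", ("10 hours", "8 hours")): 4.0,
    ("Weight", ("10 hours", "8 hours")): 2.5,
    ("Weight", ("1.2 kg", "1.5 kg")): 3.0,
    ("Battery", ("1.2 kg", "1.5 kg")): 0.5,
}
entry_names = assign_entry_names(scores, threshold=2.0)
```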
According to an embodiment of the present invention, the table entry name determination subunit 1278 may be specifically configured to judge, for each candidate name-item pair, whether the candidate name in the candidate name-item pair is associated with the comparable item pair in that candidate name-item pair according to a name-item pair model, based on the determined features of that candidate name-item pair. The name-item pair model is learned, using a machine learning algorithm, from a large number of known name-item pairs and their features under at least one of the third predefined rule, the second user history behavior, and the text content of objects to be compared.
For the above and/or other operations and functions of the units, subunits, and components described above, reference may be made to the detailed description given in connection with FIG. 2 to FIG. 10; to avoid repetition, they are not described again here.
FIG. 13 shows a structural block diagram of an information processing system 1300 according to an embodiment of the present invention. The information processing system 1300 includes a device 1310 for content comparison. The device 1310 may be the device 1100 or the device 1200 described above. The information processing system 1300 further includes a display device 1320, which may be configured to display the comparable items determined by the device 1310 for content comparison. The display device 1320 may be any type of display, projection device, or the like that is coupled to the device 1310 and can display the comparable items determined by the device 1310 to the user. For example, the display device 1320 may display the comparable items in the form of a table, or project the comparable items onto a display area through a mixed reality system.
The method and device of the present invention may be implemented in many ways, for example by software, hardware, firmware, or any combination thereof. The order of the method steps described above is merely illustrative, and the method steps of the present invention are not limited to the order specifically described above unless otherwise expressly stated. In addition, in some embodiments, the present invention may also be implemented as a program recorded on a recording medium, the program including machine-readable instructions for implementing the method according to the present invention. Thus, the present invention also covers a recording medium storing a program for implementing the method according to the present invention.
Although some specific embodiments of the present invention have been described in detail by way of example, those skilled in the art should understand that the above examples are intended to be illustrative only and do not limit the scope of the present invention. Those skilled in the art should also understand that the above embodiments may be modified without departing from the scope and spirit of the present invention. The scope of the present invention is defined by the appended claims.

Claims (43)

1. A device for content comparison, comprising:
a recognition unit configured to identify items included in at least two objects to be compared, the items including at least one of a phrase, a sentence, a paragraph, a table, and an image;
a pairing unit configured to pair the identified items to generate candidate item pairs, each candidate item pair including at least two items that come from different objects to be compared;
a feature determination unit configured to determine a feature of each of the generated candidate item pairs by determining at least one of the following values: a feature value determined based on a first predefined rule, an item co-occurrence count determined based on a first user history behavior, and an item co-occurrence count determined based on the text content of the objects to be compared; and
a comparable item determination unit configured to determine, based on the determined features, at least one of the generated candidate item pairs as a comparable item pair, wherein the items included in each comparable item pair are comparable items;
wherein, in a case where the feature value is determined based on the first predefined rule, the feature determination unit comprises:
a first subunit configured to determine the feature value of a candidate item pair by judging whether the items included in the candidate item pair are defined as comparable items in a database; and/or
a second subunit configured to determine the feature value of a candidate item pair by judging whether the items included in the candidate item pair have a similar arrangement in their respective objects to be compared.
2. The device according to claim 1, wherein the recognition unit is configured to identify the items included in the at least two objects to be compared in response to detecting a predetermined user behavior, or in response to determining that the relative positional relationship of the at least two objects to be compared satisfies a predetermined relationship.
3. The device according to claim 2, wherein
the predetermined user behavior includes at least one of the following: a user utterance indicating that the user wishes to perform content comparison, selection among multiple options of the option for performing a content comparison operation, an action of dragging the objects to be compared into a specific region used for comparison, an action of placing the objects to be compared in the specific region used for comparison, an action of placing one object to be compared on top of another, an action of placing the objects to be compared in alignment, and an action of placing the objects to be compared so that they partially overlap,
the predetermined relationship includes at least one of the following: the objects to be compared are located in the specific region used for comparison, the objects to be compared are aligned, and the objects to be compared partially overlap.
4. The device according to any one of claims 1 to 3, wherein the recognition unit comprises:
a content recognition subunit configured to recognize the content of the objects to be compared in at least one of the following ways:
recognizing the content of the objects to be compared by optical character recognition,
recognizing the content of the objects to be compared by scanning the objects to be compared,
reading a barcode that is included in an object to be compared and stores information related to that object, and obtaining, according to the barcode information read, the content associated with that object that is stored in a database; and
an extraction subunit configured to extract, from the recognized content, at least one of the phrase, sentence, paragraph, table, and image as the items.
5. The device according to claim 1, wherein, in a case where the item co-occurrence count is determined based on the first user history behavior, the feature determination unit further comprises:
a third subunit configured to calculate, based on whether the user's speech as determined by speech recognition contains items that conform to a comparable syntactic or pragmatic template, the co-occurrence count of the items in the candidate item pair formed by those items; and/or
a fourth subunit configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether the user's eyes are detected to move alternately between the items included in the candidate item pair; and/or
a fifth subunit configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether a pointer is detected to point alternately between the items included in the candidate item pair; and/or
a sixth subunit configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether the items included in the candidate item pair are detected to be placed side by side.
6. The device according to claim 1, wherein, in a case where the item co-occurrence count is determined based on the text content of the objects to be compared, the feature determination unit further comprises:
a seventh subunit configured to calculate the co-occurrence count of the items included in a candidate item pair based on whether the items included in the candidate item pair conform to a comparable syntactic or pragmatic template in their respective objects to be compared; and/or
an eighth subunit configured to calculate, based on whether items conforming to a comparable syntactic or pragmatic template exist in a retrieval result, the co-occurrence count of the items in the candidate item pair formed by those items, wherein the retrieval result is obtained by searching a database with the items included in the candidate item pair.
7. The device according to claim 1, wherein the comparable item determination unit comprises:
a calculation subunit configured to calculate, for each candidate item pair, a total score of the candidate item pair based on the determined features of that candidate item pair; and
a determination subunit configured to determine the candidate item pairs whose total score exceeds a predetermined threshold as comparable item pairs.
8. The device according to claim 1, wherein the comparable item determination unit is configured to judge, for each candidate item pair, whether the candidate item pair is a comparable item pair according to a comparable item pair model, based on the determined features of that candidate item pair,
wherein the comparable item pair model is learned, using a machine learning algorithm, from a large number of known comparable item pairs and their features under at least one of the first predefined rule, the first user history behavior, and the text content of objects to be compared.
9. The device according to claim 1, further comprising:
a filtering unit configured to filter the generated candidate item pairs based on a second predefined rule;
wherein the feature determination unit is configured to determine the feature of each of the filtered candidate item pairs based on at least one of the first predefined rule, the first user history behavior, and the text content of the objects to be compared.
10. The device according to claim 9, wherein the filtering unit comprises:
a first removal subunit configured to remove candidate item pairs that contain an item whose length exceeds a predetermined length; and/or
a second removal subunit configured to remove candidate item pairs in which each item contains the corresponding item of another candidate item pair; and/or
a third removal subunit configured to remove candidate item pairs that contain an item outside the range to be compared, wherein the range to be compared is determined by a user selection behavior on the objects to be compared and/or by the relative positional relationship of the objects to be compared satisfying a predetermined relationship.
11. The device according to claim 10, wherein
the user selection behavior includes at least one of the following: an input of user speech indicating the range to be compared, and an action of specifying the range to be compared with a pointer,
the predetermined relationship includes at least one of the following: the part where the objects to be compared are aligned is the range to be compared, and the part where the objects to be compared overlap is the range to be compared.
12. The device according to claim 1, further comprising:
a sorting unit configured to sort the determined comparable item pairs so that they are displayed to the user in the sorted order.
13. The device according to claim 12, wherein the sorting unit is configured to sort the determined comparable item pairs according to at least one of the following: the total score based on the features of the comparable item pairs, the settings of a mixed reality system, user preferences based on a user profile or user social network information, the order in which the items of the comparable item pairs appear in the objects to be compared, the similarity between the items of a comparable item pair, the workflow order among comparable item pairs, and the temporal order among comparable item pairs.
14. The device according to claim 1, further comprising:
a display unit configured to display the determined comparable item pairs in the form of a table, wherein the items of each comparable item pair are arranged in a first direction, different comparable item pairs are arranged in a second direction, the first direction is one of the row direction and the column direction, and the second direction is the other of the row direction and the column direction.
15. The device according to claim 14, wherein the display unit is configured to determine the items of the comparable item pair with the highest feature-based total score as the table entry names in the first direction.
16. The device according to claim 14 or 15, wherein the display unit comprises:
a candidate name determination subunit configured to determine candidate names for the table entries in the second direction by performing intent mining with the items of the comparable item pair that has the highest feature-based total score as query terms;
a pairing subunit configured to pair the candidate names with the comparable item pairs to form candidate name-item pairs, each candidate name-item pair including one candidate name and one comparable item pair;
a feature determination subunit configured to determine a feature of each of the generated candidate name-item pairs by determining at least one of the following values: a feature value determined based on a third predefined rule, a co-occurrence count of the candidate name and the comparable item pair determined based on a second user history behavior, and a co-occurrence count of the candidate name and the comparable item pair determined based on the text content of the objects to be compared; and
a table entry name determination subunit configured to associate, based on the determined features, the candidate name in each of at least one of the generated candidate name-item pairs with the comparable item pair in that candidate name-item pair, thereby determining the table entry names in the second direction.
17. The device according to claim 16, wherein, in a case where the feature value is determined based on the third predefined rule, the feature determination subunit comprises:
a first component configured to determine the feature value of a candidate name-item pair by judging whether the candidate name included in the candidate name-item pair and at least one item of the comparable item pair are defined in a database as a name and an item that are associated with each other.
18. The device according to claim 16, wherein, in a case where the co-occurrence count is determined based on the second user history behavior, the feature determination subunit comprises:
a second component configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair occur successively within a predetermined time window in the user's speech as determined by speech recognition; and/or
a third component configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the user's eyes are detected to move successively between the candidate name and at least one item of the comparable item pair; and/or
a fourth component configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether a pointer is detected to point successively to the candidate name and at least one item of the comparable item pair.
19. The device according to claim 16, wherein, in a case where the co-occurrence count is determined based on the text content of the objects to be compared, the feature determination subunit comprises:
a fifth component configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair appear successively within a predetermined spatial window in the text content of one of the objects to be compared; and/or
a sixth component configured to calculate the co-occurrence count of the candidate name and the comparable item pair included in a candidate name-item pair based on whether the candidate name and at least one item of the comparable item pair appear successively within a predetermined spatial window in the text content included in a retrieval result, wherein the retrieval result is obtained by searching a database with the candidate name and at least one item of the comparable item pair included in the candidate name-item pair.
20. The device according to claim 16, wherein the list item title determination subelement includes:
A calculating component configured to calculate, for each candidate name project pair, a total score of the candidate name project pair based on the determined features of that pair; and
A list item title determining component configured to associate the candidate name of each candidate name project pair whose total score exceeds a predetermined threshold with the comparable project pair of that candidate name project pair, so that the candidate name serves as the list item title corresponding to that comparable project pair.
21. The device according to claim 16, wherein the list item title determination subelement is configured to judge, for each candidate name project pair, based on the determined features of the candidate name project pair and according to a name project pair model, whether the candidate name of the candidate name project pair is associated with the comparable project pair of that candidate name project pair,
Wherein the name project pair model is learned, using a machine learning algorithm, from a large number of known name project pairs and their features under at least one of the third predefined rule, the second user historical behavior, and the text content of the objects to be compared.
22. A method for content comparison, including:
Identifying the projects included in at least two objects to be compared, the projects including at least one of a phrase, a sentence, a paragraph, a table, and an image;
Pairing the identified projects to generate candidate item pairs, each candidate item pair including at least two projects that come from different objects to be compared;
Determining a feature of each of the generated candidate item pairs by determining at least one of the following values: a characteristic value determined based on a first predefined rule, a project co-occurrence number determined based on a first user historical behavior, and a project co-occurrence number determined based on the text content of the objects to be compared; and
Based on the determined features, determining at least one of the generated candidate item pairs to be a comparable project pair, wherein each project included in a comparable project pair is a comparable project;
Wherein determining the characteristic value based on the first predefined rule includes:
Determining the characteristic value of a candidate item pair by judging whether the projects included in the candidate item pair are defined as comparable projects in a database; and/or
Determining the characteristic value of a candidate item pair by judging whether the projects included in the candidate item pair have similar arrangements in their respective objects to be compared.
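The pairing step of claim 22 can be illustrated with a minimal Python sketch. It assumes each object to be compared has already been reduced to a list of text projects; the function name `generate_candidate_pairs` and the sample data are hypothetical and do not come from the patent. It pairs two projects at a time, although the claim allows more than two per pair.

```python
from itertools import combinations, product


def generate_candidate_pairs(objects):
    """Pair every project of one object with every project of another object.

    `objects` maps a (hypothetical) object id to its list of identified projects.
    Projects from the same object are never paired with each other.
    """
    pairs = []
    for (id_a, projects_a), (id_b, projects_b) in combinations(objects.items(), 2):
        for a, b in product(projects_a, projects_b):
            pairs.append(((id_a, a), (id_b, b)))
    return pairs


# Illustrative data only.
objects = {
    "doc_A": ["price", "battery life"],
    "doc_B": ["cost", "weight"],
}
candidate_pairs = generate_candidate_pairs(objects)
```

Exhaustive pairing grows with the product of the project counts, which is presumably one reason the later claims add a filtering step before features are computed.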
23. The method according to claim 22, wherein identifying the projects included in at least two objects to be compared includes:
In response to detecting a predetermined user behavior, identifying the projects included in the at least two objects to be compared; or
In response to determining that the relative positional relationship of the at least two objects to be compared satisfies a predetermined relationship, identifying the projects included in the at least two objects to be compared.
24. The method according to claim 23, wherein
The predetermined user behavior includes at least one of the following: a user speech indicating that the user wishes to perform content comparison, selection of a content comparison option among multiple options, an action of dragging an object to be compared into a specific region for comparison, an action of placing an object to be compared in the specific region for comparison, an action of placing one object to be compared on top of another, an action of placing the objects to be compared in alignment, and an action of placing the objects to be compared so that they partially overlap,
The predetermined relationship includes at least one of the following: the objects to be compared are in the specific region for comparison, the objects to be compared are aligned, and the objects to be compared partially overlap.
25. The method according to any one of claims 22 to 24, wherein identifying the projects included in at least two objects to be compared includes:
Recognizing the content of the objects to be compared in at least one of the following ways:
Recognizing the content of an object to be compared by optical character recognition,
Recognizing the content of an object to be compared by scanning the object,
Reading a bar code that is included in an object to be compared and stores information related to that object, and obtaining, according to the bar code information read, content stored in a database in association with that object; and
Extracting, from the recognized content, at least one of the phrase, sentence, paragraph, table, and image as the project.
26. The method according to claim 22, wherein determining the project co-occurrence number based on the first user historical behavior includes:
Based on whether the user's speech, as determined by speech recognition, includes projects that fit a comparative syntactic or pragmatic template, calculating the co-occurrence number of the projects in the candidate item pair formed by those projects; and/or
Based on whether the user's eyes are detected moving alternately between the projects included in a candidate item pair, calculating the co-occurrence number of the projects included in the candidate item pair; and/or
Based on whether a pointing indicator is detected pointing alternately at the projects included in a candidate item pair, calculating the co-occurrence number of the projects included in the candidate item pair; and/or
Based on whether the projects included in a candidate item pair are detected being placed side by side, calculating the co-occurrence number of the projects included in the candidate item pair.
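As an illustration of the gaze-based option in claim 26, the following Python sketch counts how often attention switches directly between the two projects of a candidate pair. It assumes an eye tracker has already produced an ordered list of fixated projects; that input format, the function name `count_alternations`, and the rule of counting one co-occurrence per direct switch are assumptions made for the example, not details given in the claims. The same counting would apply to a pointing indicator.

```python
def count_alternations(gaze_sequence, project_a, project_b):
    """Count direct switches of attention between project_a and project_b.

    `gaze_sequence` is an ordered list of the projects the user looked at
    (hypothetical output of an eye tracker, already mapped to projects).
    """
    count = 0
    for previous, current in zip(gaze_sequence, gaze_sequence[1:]):
        if {previous, current} == {project_a, project_b}:
            count += 1
    return count


# Illustrative fixation sequence only.
gaze = ["price", "cost", "price", "weight", "cost"]
alternations = count_alternations(gaze, "price", "cost")
```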
27. The method according to claim 22, wherein determining the project co-occurrence number based on the text content of the objects to be compared includes:
Based on whether the projects included in a candidate item pair fit a comparative syntactic or pragmatic template in their respective objects to be compared, calculating the co-occurrence number of the projects included in the candidate item pair; and/or
Based on whether projects that fit a comparative syntactic or pragmatic template exist in a retrieval result, calculating the co-occurrence number of the projects in the candidate item pair formed by those projects, wherein the retrieval result is obtained by searching a database using the projects included in the candidate item pair.
28. The method according to claim 22, wherein determining at least one of the generated candidate item pairs to be a comparable project pair based on the determined features includes:
For each candidate item pair, calculating a total score of the candidate item pair based on the determined features of that pair; and
Determining each candidate item pair whose total score exceeds a predetermined threshold to be a comparable project pair.
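The scoring and thresholding of claim 28 can be sketched in Python as follows. The claims do not say how the features are combined into a total score, so the weighted sum used here is an assumption, as are the feature names, weights, and threshold value.

```python
def total_score(features, weights):
    """Combine feature values into one score (weighted sum; an assumed scheme)."""
    return sum(weights[name] * value for name, value in features.items())


def select_comparable_pairs(pair_features, weights, threshold):
    """Keep the candidate pairs whose total score exceeds the threshold."""
    return [
        pair
        for pair, features in pair_features.items()
        if total_score(features, weights) > threshold
    ]


# Illustrative values only.
weights = {"rule": 2.0, "behavior_cooccurrence": 1.0, "text_cooccurrence": 0.5}
pair_features = {
    ("price", "cost"): {"rule": 1, "behavior_cooccurrence": 2, "text_cooccurrence": 4},
    ("price", "weight"): {"rule": 0, "behavior_cooccurrence": 0, "text_cooccurrence": 1},
}
comparable_pairs = select_comparable_pairs(pair_features, weights, threshold=3.0)
```

Claim 29 replaces the fixed threshold with a learned model, in which case the weights would be fitted from known comparable project pairs rather than set by hand.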
29. The method according to claim 22, wherein determining at least one of the generated candidate item pairs to be a comparable project pair based on the determined features includes:
For each candidate item pair, judging, based on the determined features of the candidate item pair and according to a comparable project pair model, whether the candidate item pair is a comparable project pair,
Wherein the comparable project pair model is learned, using a machine learning algorithm, from a large number of known comparable project pairs and their features under at least one of the first predefined rule, the first user historical behavior, and the text content of the objects to be compared.
30. The method according to claim 22, further including:
Filtering the generated candidate item pairs based on a second predefined rule;
Wherein the feature of each of the filtered candidate item pairs is determined based on at least one of the first predefined rule, the first user historical behavior, and the text content of the objects to be compared.
31. The method according to claim 30, wherein filtering the generated candidate item pairs based on the second predefined rule includes:
Removing candidate item pairs that include a project whose length exceeds a predetermined length; and/or
Removing candidate item pairs in which each project contains the corresponding project of another candidate item pair; and/or
Removing candidate item pairs that include a project outside the range to be compared, wherein the range to be compared is determined by a user selection behavior in the objects to be compared and/or by the relative positional relationship of the objects to be compared satisfying a predetermined relationship.
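A minimal Python sketch of the three filtering rules in claim 31 follows. It assumes projects are plain strings, that length is measured in characters, and that "contains" means substring containment; the function name `filter_pairs` and the `in_range` predicate are hypothetical.

```python
def filter_pairs(pairs, max_length, in_range=lambda project: True):
    """Apply the three removal rules to a list of two-project candidate pairs.

    `in_range` is a hypothetical predicate telling whether a project lies
    inside the range to be compared; by default everything is in range.
    """
    kept = []
    for pair in pairs:
        a, b = pair
        # Rule 1: drop pairs that include an over-long project.
        if len(a) > max_length or len(b) > max_length:
            continue
        # Rule 3: drop pairs that include a project outside the range.
        if not (in_range(a) and in_range(b)):
            continue
        # Rule 2: drop pairs whose projects each contain the corresponding
        # project of another candidate pair, keeping the finer-grained pair.
        contains_other = any(
            other != pair and other[0] in a and other[1] in b for other in pairs
        )
        if contains_other:
            continue
        kept.append(pair)
    return kept


# Illustrative data only.
pairs = [
    ("price", "cost"),
    ("price: 300 USD", "cost: 280 USD"),
    ("a very long paragraph of text about pricing", "cost"),
]
filtered = filter_pairs(pairs, max_length=20)
```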
32. The method according to claim 31, wherein
The user selection behavior includes at least one of the following: input of a user speech indicating the range to be compared, and an action of specifying the range to be compared with a pointing indicator,
The predetermined relationship includes at least one of the following: the aligned parts of the objects to be compared are the range to be compared, and the overlapping parts of the objects to be compared are the range to be compared.
33. The method according to claim 22, further including:
Sorting the determined comparable project pairs so that they are displayed to the user in the sorted order.
34. The method according to claim 33, wherein the determined comparable project pairs are sorted according to at least one of the following: a total score based on the features of the comparable project pairs, a setting of a mixed reality system, a user preference based on a user profile or user social network information, the order in which the projects of the comparable project pairs appear in the objects to be compared, the similarity between the projects within a comparable project pair, the workflow order among comparable project pairs, and the chronological order among comparable project pairs.
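Two of the sorting criteria listed in claim 34 can be sketched in Python as below. The dictionary layout, the field names `total_score` and `appearance_order`, and the choice to sort scores in descending order are assumptions made for the example.

```python
def rank_pairs(pairs, criterion="total_score"):
    """Sort comparable pairs by one criterion.

    Total scores are sorted from highest to lowest; any other criterion
    (for example the order of appearance) is sorted in ascending order.
    """
    descending = criterion == "total_score"
    return sorted(pairs, key=lambda pair: pair[criterion], reverse=descending)


# Illustrative data only.
pairs = [
    {"projects": ("weight", "mass"), "total_score": 3.5, "appearance_order": 0},
    {"projects": ("price", "cost"), "total_score": 6.0, "appearance_order": 1},
]
by_score = rank_pairs(pairs)
by_appearance = rank_pairs(pairs, criterion="appearance_order")
```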
35. The method according to claim 22, further including:
Displaying the determined comparable project pairs in the form of a table, wherein the projects of each comparable project pair are arranged in a first direction and different comparable project pairs are arranged in a second direction, the first direction being one of the row direction and the column direction and the second direction being the other.
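The table layout of claim 35 amounts to choosing between a grid and its transpose. A minimal Python sketch, assuming the table is represented as a list of rows and using the hypothetical function name `build_table`:

```python
def build_table(comparable_pairs, first_direction_is_row=True):
    """Lay out comparable pairs as a list of rows.

    If the first direction is the row direction, the projects of one pair
    share a row and different pairs occupy different rows. Otherwise the
    grid is transposed, so each pair occupies one column.
    """
    rows = [list(pair) for pair in comparable_pairs]
    if first_direction_is_row:
        return rows
    return [list(column) for column in zip(*rows)]


# Illustrative data only.
comparable_pairs = [("price", "cost"), ("weight", "mass")]
row_table = build_table(comparable_pairs)
column_table = build_table(comparable_pairs, first_direction_is_row=False)
```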
36. The method according to claim 35, wherein displaying the determined comparable project pairs in the form of a table includes:
Determining the projects of the comparable project pair with the highest feature-based total score to be the list item titles in the first direction.
37. The method according to claim 35 or 36, wherein displaying the determined comparable project pairs in the form of a table includes:
Determining candidate names for the list items in the second direction by performing intent mining with the projects of the comparable project pair having the highest feature-based total score as query terms;
Pairing the candidate names with the comparable project pairs to form candidate name project pairs, each candidate name project pair including one candidate name and one comparable project pair;
Determining a feature of each of the generated candidate name project pairs by determining at least one of the following values: a characteristic value determined based on a third predefined rule, a co-occurrence number of the candidate name and the comparable project pair determined based on a second user historical behavior, and a co-occurrence number of the candidate name and the comparable project pair determined based on the text content of the objects to be compared; and
Based on the determined features, associating the candidate name of each of at least one of the generated candidate name project pairs with its comparable project pair, thereby determining the list item titles in the second direction.
38. The method according to claim 37, wherein determining the characteristic value based on the third predefined rule includes:
Determining the characteristic value of a candidate name project pair by judging whether the candidate name included in the candidate name project pair and at least one project of the comparable project pair are defined in a database as a mutually associated title and project.
39. The method according to claim 37, wherein determining the co-occurrence number based on the second user historical behavior includes:
Based on whether the candidate name included in a candidate name project pair and at least one project of the comparable project pair occur in succession within a predetermined time window in the user's speech as determined by speech recognition, calculating the co-occurrence number of the candidate name included in the candidate name project pair and the comparable project pair; and/or
Based on whether the user's eyes are detected moving in succession between the candidate name included in a candidate name project pair and at least one project of the comparable project pair, calculating the co-occurrence number of the candidate name included in the candidate name project pair and the comparable project pair; and/or
Based on whether a pointing indicator is detected pointing in succession at the candidate name included in a candidate name project pair and at least one project of the comparable project pair, calculating the co-occurrence number of the candidate name included in the candidate name project pair and the comparable project pair.
40. The method according to claim 37, wherein determining the co-occurrence number based on the text content of the objects to be compared includes:
Based on whether the candidate name included in a candidate name project pair and at least one project of the comparable project pair appear in succession within a predetermined spatial window in the text content of one of the objects to be compared, calculating the co-occurrence number of the candidate name included in the candidate name project pair and the comparable project pair; and/or
Based on whether the candidate name included in a candidate name project pair and at least one project of the comparable project pair appear in succession within a predetermined spatial window in the text content included in a retrieval result, calculating the co-occurrence number of the candidate name included in the candidate name project pair and the comparable project pair, wherein the retrieval result is obtained by searching a database using the candidate name and at least one project of the comparable project pair.
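The spatial-window co-occurrence count of claim 40 can be illustrated with a short Python sketch. It assumes whitespace tokenization, single-token candidate names and projects, a window measured in tokens, and order-independent matching; none of these details are specified in the claims, and the function name `cooccurrence_count` is hypothetical.

```python
def cooccurrence_count(text, candidate_name, project, window):
    """Count occurrences of the name and the project within `window` tokens.

    Tokenization is a plain lowercase whitespace split (an assumption);
    every name/project position pair inside the window counts once.
    """
    tokens = text.lower().split()
    name_positions = [i for i, token in enumerate(tokens) if token == candidate_name]
    project_positions = [i for i, token in enumerate(tokens) if token == project]
    return sum(
        1
        for i in name_positions
        for j in project_positions
        if 0 < abs(i - j) <= window
    )


# Illustrative text only.
text = "The price of model A is 300 while the price of model B is 280"
count_wide = cooccurrence_count(text, "price", "300", window=5)
count_narrow = cooccurrence_count(text, "price", "300", window=4)
```

The same function could be applied to the text of a retrieval result instead of the text of an object to be compared.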
41. The method according to claim 37, wherein associating, based on the determined features, the candidate name of each of at least one of the generated candidate name project pairs with its comparable project pair includes:
For each candidate name project pair, calculating a total score of the candidate name project pair based on the determined features of that pair; and
Associating the candidate name of each candidate name project pair whose total score exceeds a predetermined threshold with the comparable project pair of that candidate name project pair, so that the candidate name serves as the list item title corresponding to that comparable project pair.
42. The method according to claim 37, wherein associating, based on the determined features, the candidate name of each of at least one of the generated candidate name project pairs with its comparable project pair includes:
For each candidate name project pair, judging, based on the determined features of the candidate name project pair and according to a name project pair model, whether the candidate name of the candidate name project pair is associated with the comparable project pair of that candidate name project pair,
Wherein the name project pair model is learned, using a machine learning algorithm, from a large number of known name project pairs and their features under at least one of the third predefined rule, the second user historical behavior, and the text content of the objects to be compared.
43. An information processing system, including:
The device for content comparison according to any one of claims 1 to 21; and
A display device configured to display the comparable project pairs determined by the device for content comparison.
CN201310416233.5A 2013-09-13 2013-09-13 The method, apparatus and information processing system compared for content Active CN104462083B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310416233.5A CN104462083B (en) 2013-09-13 2013-09-13 The method, apparatus and information processing system compared for content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310416233.5A CN104462083B (en) 2013-09-13 2013-09-13 The method, apparatus and information processing system compared for content

Publications (2)

Publication Number Publication Date
CN104462083A CN104462083A (en) 2015-03-25
CN104462083B true CN104462083B (en) 2018-11-02

Family

ID=52908149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310416233.5A Active CN104462083B (en) 2013-09-13 2013-09-13 The method, apparatus and information processing system compared for content

Country Status (1)

Country Link
CN (1) CN104462083B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103500282A (en) * 2013-09-30 2014-01-08 北京智谷睿拓技术服务有限公司 Auxiliary observing method and auxiliary observing device
US10628505B2 (en) * 2016-03-30 2020-04-21 Microsoft Technology Licensing, Llc Using gesture selection to obtain contextually relevant information
TWI621952B (en) * 2016-12-02 2018-04-21 財團法人資訊工業策進會 Comparison table automatic generation method, device and computer program product of the same
CN108846081B (en) * 2018-06-08 2020-10-30 四川科库科技有限公司 Commodity tracing information query method and system
CN110716681A (en) * 2018-07-11 2020-01-21 阿里巴巴集团控股有限公司 Method and device for comparing display objects of display interface

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1581170A (en) * 2003-08-15 2005-02-16 国际商业机器公司 Method and system for comparing files of two computers
CN101517572A (en) * 2006-07-18 2009-08-26 甲骨文国际公司 Semantic aware processing of XML documents
CN101533346A (en) * 2008-03-13 2009-09-16 中兴通讯股份有限公司 Source file comparing unit and method thereof
CN101765857A (en) * 2007-06-20 2010-06-30 阿玛得斯两合公司 System and method for integrating and displaying travel advices gathered from a plurality of reliable sources

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004110161A (en) * 2002-09-13 2004-04-08 Fuji Xerox Co Ltd Text sentence comparing device
CN102193764B (en) * 2010-03-11 2016-04-20 英华达(上海)电子有限公司 Show and process electronic system and the method for multiple document

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
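A Chinese document comparison method based on feature vectors; Wang Lin et al.; Journal of Intelligence (《情报杂志》); 2005-11-30 (No. 11); full text *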

Also Published As

Publication number Publication date
CN104462083A (en) 2015-03-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant