CA2650381A1 - System and method for topical document searching - Google Patents

System and method for topical document searching

Info

Publication number
CA2650381A1
Authority
CA
Canada
Prior art keywords
document
documents
subset
collection
additional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002650381A
Other languages
French (fr)
Inventor
Richard Douglas Kemp
Philippe Grenet
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bureau of National Affairs Inc
Original Assignee
Bloomberg Finance L.P.
Richard Douglas Kemp
Philippe Grenet
Bloomberg Lp
Bloomberg Finance Holdings L.P.
Bloomberg (Gp) Finance Llc
Bloomberg L.P.
Bloomberg Inc.
Bloomberg-Bna Holdings Inc.
The Bureau Of National Affairs, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bloomberg Finance L.P., Richard Douglas Kemp, Philippe Grenet, Bloomberg Lp, Bloomberg Finance Holdings L.P., Bloomberg (Gp) Finance Llc, Bloomberg L.P., Bloomberg Inc., Bloomberg-Bna Holdings Inc., The Bureau Of National Affairs, Inc.


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/382 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using citations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

Systems and methods are provided for searching for documents within topically-defined clusters. A search space is defined, starting with one or more source documents, by examining references from one document to another and following the networks of references to some level of indirection. Depending on the embodiment, references may be followed from a document containing a reference to a referred-to document, or from a referred-to document to a document containing a reference, or both. Once a search space has been defined, a query is executed, and documents within the search space that satisfy the query parameters are identified. In certain embodiments of the invention, the documents primarily relate to legal materials, and one or more source documents are associated with one or more topics within a topic directory. In such embodiments, a search query may be limited to one or more selected topics by executing the search query within a search space defined using the associated document or documents as the source.

Description

SYSTEM AND METHOD FOR TOPICAL DOCUMENT SEARCHING

COPYRIGHT NOTICE

[0001] A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyrights whatsoever.

BACKGROUND
[0002] The invention disclosed herein relates to computerized searching for electronic data within collections of stored data, such as, e.g., [...].

[0003] Electronically stored data, or information, is available in immense quantities, and the rate of growth is accelerating. Such data may be stored as logical units of various kinds, such as, e.g., documents, files, records, etc.

[0004] A simple example is the World Wide Web (the "Web"), a global information space comprising hyperlinked documents that are requested and distributed via the Internet. From its origin [...], the Web has grown such that a study conducted in January 2005 concluded that it comprised at least 11.5 billion indexable Web pages alone.

[0005] Specialized databases continue to grow as well. For example, electronic legal databases store, e.g., statutes, regulations, judicial opinions, and secondary sources, and are constantly updated and expanded. Examples of such legal databases are described in commonly-owned U.S. patent application serial no. 10/045,586, filed on January 11, 2002 and titled "DYNAMIC LEGAL DATABASE PROVIDING CURRENT AND HISTORICAL VERSIONS OF BODIES OF LAW," and commonly-owned U.S. patent application serial no. 10/603,[...], filed on June 25, 2003, and titled "[...] OF LEGAL INFORMATION," both of which are hereby incorporated by reference herein in their entirety.

[0006] Data stores of these sizes would not be useful without tools for searching for and retrieving desired information. Various types of search tools (commonly referred to as "search engines") are well known. A search engine accepts a query from a user and then tries to identify any data that in some way corresponds to the query. A list of some or all logical units that contain data responsive to the query is provided to the user, who may then be able to retrieve some or all of the logical units.

[0007] The utility of a search tool, however, often depends on how well the user can formulate a query. For example, one query related to a topic may return little or no useful information, while a slightly different query may return hundreds, or even thousands, of hits, which may be far too many to examine. Users may waste considerable time in trial and error before stumbling upon a query that leads to a manageable number of relevant hits. In practice, a user may settle for a query known to be overinclusive, and then waste additional time mining the results. Against this background, a user may wish to reduce the set of searchable data, and one approach is to limit the search to a subset comprising documents that are related to one another by topic.

BRIEF SUMMARY OF THE INVENTION

[0008] The invention provides for limiting a search of a data set, e.g., a collection of logical units such as documents, to a subset thereof, which may be referred to as a "search space." Membership in the subset may be determined by one or more relationships between and/or among the data, such as, e.g., relationship to one or more common topics. Embodiments of the invention provide for defining a topically-related data subset in which a search may be conducted for data responsive to a query.

[0009] According to embodiments of the invention, where the data set contains references to other data in the data set, originating data is selected within the data set, and references to and/or from the originating data are used to define a subset of the data set within which a search may be conducted. In an embodiment of the invention, the subset comprises the originating data and data referred to by and/or referring to the originating data. In embodiments of the invention, the subset may comprise further data that is identifiable by iteratively following references to and/or from the originating data. For example, considering the data to be nodes in a graph, and further considering references to be edges in that graph, the originating data is represented by one or more originating nodes, the first iteration adds nodes at a distance of one from an originating node, the second adds nodes at a distance of two, and so on.
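The graph view described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation disclosed here: the function name `expand_search_space`, the dictionary-of-sets representation of references, and the document identifiers are all assumptions made for the example.

```python
from typing import Dict, Iterable, Set


def expand_search_space(
    references: Dict[str, Set[str]],
    originating: Iterable[str],
    levels: int,
) -> Set[str]:
    """Return the originating documents plus every document reachable
    by following at most `levels` references outward from them."""
    subset = set(originating)
    frontier = set(subset)  # documents added in the preceding iteration
    for _ in range(levels):
        next_frontier = set()
        for doc in frontier:
            for target in references.get(doc, set()):
                if target not in subset:
                    next_frontier.add(target)
        if not next_frontier:
            break  # nothing new was found, so further iterations add nothing
        subset |= next_frontier
        frontier = next_frontier
    return subset


# Hypothetical collection: S refers to A and B, A refers to C, C refers to D.
refs = {"S": {"A", "B"}, "A": {"C"}, "B": set(), "C": {"D"}, "D": set()}
one_level = expand_search_space(refs, ["S"], 1)   # distance <= 1 from S
two_levels = expand_search_space(refs, ["S"], 2)  # distance <= 2 from S
```

Each pass of the loop corresponds to one iteration in the paragraph above: only documents added in the preceding pass are examined, so the n-th pass adds the nodes at distance n.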

[0010] In an embodiment of the invention, a method is provided for defining a subset of a searchable data set, the method comprising defining the subset to include originating data and, for at least one iteration, further defining the subset to include the union of the data currently defined to be in the subset and all data to which any data currently defined to be in the subset contains at least one reference. In an embodiment of the invention, a method is provided for identifying data within a searchable data set, the method comprising defining a subset of a searchable data set, as before, and identifying data within the subset that satisfies one or more criteria of a specified search query. In a further embodiment of the invention, the originating data is associated with one or more topics, and the originating data is specified by selecting one or more topical areas for searching.

[0011] In an embodiment of the invention, a method is provided for defining a subset of a searchable data set, the method comprising defining the subset to include originating data and, for at least one iteration, further defining the subset to include the union of the data currently defined to be in the subset and all data known to contain a reference to any data currently defined to be in the subset. In an embodiment of the invention, a method is provided for identifying data within a searchable data set, the method comprising defining a subset of a searchable data set, as before, and identifying data within the subset that satisfies one or more criteria of a specified search query. In a further embodiment of the invention, the originating data is associated with one or more topics, and the originating data is specified by selecting one or more topical areas for searching.
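As a rough illustration of the citing-direction variant, the following Python sketch grows a subset by repeatedly taking the union of the current subset with every document known to refer to a member of it, and then runs a query only inside that subset. The helper names, the sample documents, and the simple substring match standing in for "criteria of a search query" are assumptions for this example.

```python
from collections import defaultdict


def invert_references(references):
    """Map each cited document to the set of documents known to cite it."""
    cited_by = defaultdict(set)
    for citing, cited_docs in references.items():
        for cited in cited_docs:
            cited_by[cited].add(citing)
    return cited_by


def citing_search_space(references, originating, iterations):
    """Start from the originating data; on each iteration take the union of
    the current subset and all documents that refer to any member of it."""
    cited_by = invert_references(references)
    subset = set(originating)
    for _ in range(iterations):
        citing = set()
        for doc in subset:
            citing |= cited_by.get(doc, set())
        subset = subset | citing
    return subset


def search(subset, texts, term):
    """Identify documents in the subset whose text contains the term."""
    return sorted(d for d in subset if term.lower() in texts.get(d, "").lower())


# Hypothetical data: D and J cite A, J also cites B, and K cites J.
refs = {"D": {"A"}, "J": {"A", "B"}, "K": {"J"}}
texts = {
    "A": "Statute on secured lending",
    "D": "Opinion applying the lending statute",
    "J": "Opinion on lending and bankruptcy",
    "K": "Article on bankruptcy",
}
space = citing_search_space(refs, ["A"], 1)
hits = search(space, texts, "bankruptcy")
```

Document K mentions the search term but is two citing steps away from A, so with a single iteration it falls outside the subset and is not returned.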

[0012] Embodiments of the invention involve the use of references from one document to another to identify sets of documents likely to relate to one or more common topics and to limit a search of a larger data set to such a topical search space. Embodiments of the invention are described in more detail herein with respect to data in units arranged as documents. But as indicated above, the invention and the various embodiments thereof described herein apply equally to many other arrangements of data, including, for example, data arranged into logical units such as, e.g., files, records, etc. Also, embodiments of the invention are discussed in connection with databases of legal materials, but the invention and the various embodiments thereof disclosed herein apply generally to any sort of collection of data that may contain references to other data within the collection.

[0013] In an embodiment of the invention, a computerized system is provided for identifying one or more electronic documents within a collection of electronic documents, the system comprising at least one interface; and at least one processor coupled to the at least one interface and programmed to (1) accept a search query through one of the interfaces, (2) obtain a definition of a subset of a collection of electronic documents that comprises a plurality of electronic documents, (3) execute the search query within the subset, thereby obtaining any results responsive to the search query, and (4) provide any result or results through one of the interfaces. In such an embodiment, obtaining a definition of a subset comprises defining a subset to comprise (1) at least one source document within the collection, each of the source documents comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, and (2) further additional documents identifiable by, for some number of iterations, for each additional document added to the subset in the immediately preceding iteration (a) retrieving the additional document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not in the definition of the subset, to the definition of the subset. In a further embodiment of the invention, the subset comprises every electronic document within the collection that any electronic document within the subset comprises a reference to.
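The further embodiment, in which the subset holds every document that any member of the subset refers to, amounts to following references until nothing new turns up. A minimal Python sketch under assumed names (`exhaustive_search_space`, a plain dictionary of references, made-up document identifiers) might look like this:

```python
def exhaustive_search_space(references, sources):
    """Follow references from the source documents until no document in the
    subset refers to anything outside the subset."""
    subset = set(sources)
    pending = list(subset)  # documents whose references are not yet examined
    while pending:
        doc = pending.pop()
        for target in references.get(doc, ()):
            if target not in subset:
                subset.add(target)
                pending.append(target)
    return subset


# Hypothetical collection with a cycle (S -> A -> B -> S) and an unrelated
# document X that refers to S but is not referred to by anything in the subset.
refs = {"S": ["A"], "A": ["B"], "B": ["S"], "X": ["S"]}
closure = exhaustive_search_space(refs, ["S"])
```

Because a document is queued only when it is first added, the loop terminates even when references form a cycle.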

[0014] In an embodiment of the invention, a computerized system is provided for identifying one or more electronic documents within a collection of electronic documents, the system comprising at least one interface; and at least one processor coupled to the at least one interface and programmed to (1) accept a search query through one of the interfaces, (2) obtain a definition of a subset of a collection of electronic documents that comprises a plurality of electronic documents, (3) execute the search query within the subset, thereby obtaining any results responsive to the search query, and (4) provide at least one of the results through one of the interfaces. In such an embodiment, obtaining a definition of a subset comprises defining a subset to comprise (1) at least one source document within the collection, (2) additional citing documents identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration: (a) finding one or more additional citing documents in the collection, each comprising at least one reference to the document, and (b) adding each additional citing document, not already in the subset, to the subset. In a further embodiment of the invention, the subset comprises every electronic document within the collection that is known to comprise a reference to any electronic document within the subset.

[0015] In an embodiment of the invention, a method is provided for identifying one or more documents within a collection of documents, the method comprising defining a subset of a collection of documents, the collection of documents comprising a plurality of documents, and the subset comprising (1) at least one source document within the collection of documents, each source document comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, (2) additional documents identifiable by, for some number of iterations, for each document added to the search space in the immediately preceding iteration: (a) retrieving the document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not already in the definition of the search space, to the definition of the search space; accepting a search query comprising one or more criteria; and identifying one or more documents within the subset that satisfy the one or more criteria comprised by the search query. In a further embodiment of the invention, the subset comprises every electronic document within the collection that any electronic document within the subset comprises a reference to.

[0016] In an embodiment of the invention, a method is provided for identifying one or more documents within a collection of documents, the method comprising defining a subset within a collection of documents, the collection of documents comprising a plurality of documents, and the subset comprising (1) at least one document within the collection of documents, (2) additional citing documents identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration: (a) finding one or more additional citing documents each comprising at least one reference to the document, and (b) adding each additional citing document, not already in the subset, to the subset; accepting a search query comprising one or more criteria; and identifying one or more documents within the subset that satisfy the one or more criteria comprised by the search query. In a further embodiment of the invention, the subset comprises every electronic document within the collection that is known to comprise a reference to any electronic document within the subset.

[0017] In an embodiment of the invention, a method is provided for defining a topical subset of a collection of documents, the method comprising defining a subset of a collection of documents to comprise at least one source document within the collection of documents, each source document comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, wherein this defining comprises a first iteration, and defining the subset to comprise additional documents identifiable by, for some number of iterations, for each document added to the search space in the immediately preceding iteration: (a) retrieving the document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not already in the definition of the search space, to the definition of the search space. In a further embodiment of the invention, the subset comprises every document within the collection that any document within the subset comprises a reference to.

[0018] In an embodiment of the invention, a method is provided for defining a topical subset of a collection of documents, the method comprising defining a subset of a collection of documents to comprise at least one source document within the collection of documents, such defining constituting a first iteration; defining the subset to comprise at least one additional document identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration: (a) finding one or more additional citing documents within the collection of documents, each additional citing document comprising one or more references to the document, and (b) adding each of the additional citing documents to the subset. In a further embodiment of the invention, the subset comprises every document within the collection that is known to comprise a reference to any document within the subset.

[0019] In some embodiments of the invention, at least one of the source documents pertains to at least one area of law. In some further embodiments of the invention, the provided method comprises selecting at least one topic from a directory of areas of law, wherein each source document pertains to at least one of the selected topics.
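One way to picture the topic directory is as a mapping from each area of law to the source documents associated with it; selecting topics then yields the source documents from which a search space is built. The sketch below is purely illustrative: the topic names, the document identifiers, and the function `sources_for_topics` are invented for the example.

```python
# Hypothetical directory: each topic maps to its associated source documents.
topic_directory = {
    "bankruptcy": {"stat-101", "case-17"},
    "securities": {"stat-202"},
    "antitrust": {"case-33", "stat-202"},
}


def sources_for_topics(directory, selected_topics):
    """Collect the source documents associated with any selected topic."""
    sources = set()
    for topic in selected_topics:
        sources |= directory.get(topic, set())
    return sources


selected = sources_for_topics(topic_directory, ["securities", "antitrust"])
```

The resulting set would serve as the source documents for whichever reference-following procedure a given embodiment uses.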

[0020] The invention is illustrated in the figures of the accompanying drawings, which are meant to be exemplary and not limiting, and in which like references are intended to refer to like or corresponding things.

[0021] Fig. 1 is an excerpt from a printed document that contains citations according to the prior art.

[0022] Figs. 1a and 1b are additional views of citations that appear in Fig. 1.

[0023] Fig. 2 is an excerpt from an HTML document that contains hyperlinks according to the prior art.

[0024] Fig. 3 depicts possible relationships between documents according to the prior art.

[0025] Fig. 4 depicts the relationships of Fig. 3 as a directed graph.

[0026] Fig. 5 depicts creating an N-level search space according to an embodiment of the invention.

[0027] Fig. 6 depicts creating an exhaustive search space according to an embodiment of the invention.

[0028] Fig. 7 depicts a topical search according to an embodiment of the invention.

[0029] Fig. 8 depicts a user interface display from which a user may select a topic.

[0030] Fig. 9 depicts a user interface display from which a user may execute a search query.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0031] Embodiments of the invention relate to searching for electronic documents according to one or more criteria. Many aspects of the invention and of particular embodiments of the invention are discussed herein in connection with legal documents, including, e.g., judicial opinions, statutes, and secondary sources such as legal treatises and case annotations. The discussion of legal documents is provided purely as an example, however; it does not limit the scope of the invention. It will be recognized by those skilled in the relevant arts that the subject matter of the invention is applicable to documents of widely varying types.

[0032] A document may be considered a container of data that may be indexed and/or retrieved as a unit in a data management system. Although it is common for a document to be a single computer file, this need not be the case. It is well known for a document to comprise several files: for example, a page on the World Wide Web may be considered a single document, but it can comprise resources stored in several files. Conversely, it is also well known for a file to comprise several documents: for example, in a database management system, multiple documents may be stored as records within a single file.

[0033] An electronic document may comprise digital data representing one or more types of information. (As a shorthand, it may be said herein that the document contains or comprises the information.) For instance, documents commonly comprise human-readable text, but may also comprise recorded sound, still and/or moving pictures, and/or other types of data, in addition to or instead of text. A document may also comprise data intended for automatic processing, such as, e.g., formatting codes and/or markup in a language such as, e.g., XML or [...].

[0034] A document may refer to one or more other documents. Such a reference may take the form of identifying the referred-to document with sufficient precision that a human reader is able to identify the referred-to document unambiguously. Such a reference may sometimes be called a "citation." A document containing a citation may be referred to as a "citing document," and the document specified by the citation may be referred to as a "cited document."

[0035] Standard forms exist [...]. Well-known standards are published by the Modern Language Association, the American Psychological Association, and the University of Chicago Press. Legal documents often adhere to the conventions described in The Bluebook: A Uniform System of Citation (Columbia Law Review Ass'n et al. eds., 17th ed. 2000) or Association of Legal Writing Directors, ALWD Citation Manual: A Professional System of Citation (Darby Dickerson ed., 2d ed. 2003).
[0036] Fig. 1 contains a fragment 100 of a document containing legal citations 105, 106 in a standard style. A case citation 105 refers to an opinion or other document issued by a court. Fig. 1a depicts the parts of the case citation 105. As depicted, the case citation 105 comprises a title 110, which often comprises the names of one or more parties to the action, an abbreviation 111 identifying the series of volumes (often called a "reporter") in which the opinion is published, the number 112 of the specific volume in which the opinion appears, and the first page 113 of the opinion within the volume. A court opinion is often cited as authority for a particular proposition of law, and a case citation 105 may therefore also comprise a "jump cite" 114 that indicates the page or pages on which the opinion provides that authority. A case citation 105 commonly also comprises the date 115 on which the court rendered the opinion.
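To make the parts of a case citation concrete, the following Python sketch splits one simplified long-form citation into title, volume, reporter, first page, optional jump cite, and date. It is an assumption-laden illustration, not a general citation parser: the regular expression, the group names, and the function `parse_case_citation` are invented here, and the pattern handles only citations shaped like the sample, with a bare four-digit year in parentheses.

```python
import re

# Simplified pattern: "Title, Volume Reporter FirstPage[, JumpCite] (Year)".
CASE_CITATION = re.compile(
    r"(?P<title>.+?),\s+"                     # party names, e.g. "X v. Y"
    r"(?P<volume>\d+)\s+"                     # volume number
    r"(?P<reporter>[A-Za-z0-9.\s]+?)\s+"      # reporter abbreviation
    r"(?P<first_page>\d+)"                    # first page of the opinion
    r"(?:,\s+(?P<jump_cite>\d+(?:-\d+)?))?"   # optional jump cite
    r"\s+\((?P<date>\d{4})\)"                 # year the opinion was rendered
)


def parse_case_citation(text):
    """Return the citation's parts as a dict, or None if it does not match."""
    match = CASE_CITATION.fullmatch(text.strip())
    return match.groupdict() if match else None


parts = parse_case_citation("Brown v. Board of Education, 347 U.S. 483, 495 (1954)")
```

Real citations vary far more than this pattern allows (court names, parallel citations, abbreviated forms), so a production recognizer would need a much richer grammar.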

[0037] Other information may appear in a case citation 105 in addition to and/or instead of the information depicted in Fig. 1a. For example, when not clear from the identity of the reporter, the case citation 105 may indicate the court that rendered the cited opinion. For example, if an opinion is published in more than one reporter, the case citation may indicate the multiple publications by including "parallel citations" (not pictured).

[0038] The case citation 105 depicted in Fig. 1a is sometimes referred to as a "long form" citation. Abbreviated forms of citation (not pictured) exist and sometimes follow the long form citation when the same document is cited more than once.

[0039] A statutory citation 106 identifies one or more sections of law. As depicted in Fig. 1b, a statutory citation 106 identifies a code of laws 120 into which the particular section has been [...], the title 121 [...] contains the cited section [...] or parts of sections 122 within that title. Other information may appear in a statutory citation 106 in addition to and/or instead of the information depicted in Fig. 1b, such as, e.g., the year in which the code was published, the name of the Act containing the section, etc.

[0040] Standard forms exist for citing many other types of documents. Style manuals commonly prescribe long forms and abbreviated forms for citing nearly every type of document that a legal document might refer to.

[0041] The citation forms depicted in Figs. 1, 1a, and 1b were developed for use by human readers. Although computers can be and have been programmed to recognize these forms in electronic documents, other kinds of references are often used when intended for electronic processing. Fig. 2 depicts the use of one type of machine-readable reference, called a "hyperlink," that may be used in connection with, e.g., the World Wide Web.
~.
3~~
~~
el.,me.F3;.S and may #F d]cate .g:, the structure of tlte doUumieÃ'Et 130, t;ie, ,?aew7.F't3g or S1gn1fie,afF:.'.G
of one or more ptartsonsoz tl-i~'_ document 130, and:'c?a t('+~~ appea.ranez:
andUor layout of eI:ments t'1G1:Ã a,i'e to be pF'esC.:.:1 : d: to ai.FS:>r bva u'aS:r taggene. such as, i,. c.~.;e,, a Web browto-ei', ,100431 In ITIML; a Fefierence to any resource (vdhich mxy be an~~lier .11-ITIM;:. document, but n4e;I
no A : ~ rr?ay' t)e ioinpF-~sed by a~i clett'zet. w called an "ani:hC3r" 136.
F-acii an;_ >~,?r is d..Ii.n::eil by a :it<`aCi t{?'g 137wiZ~: anL'ndtcig 1 3 8. Thestartuag 1 3i com~5Ã aSeR <3n.
lL:i7:f'P' I i<3F aze 139 (wh:c 1:, for an anchor, is ~~lway, "a") and, as depicted in Fig. 2, an "h,ref'at~~bmt: 141, whicii is used to ~~ecÃfy` the target of the reieren:c~~ The a=aiu~ 142 of ihe href s.ttri buto 141 is a Umton-n Resource LocatuÃ' ("URL") which adety=.ifies the iaxtJe-4 for ~~ectrr~ift retrieval and which implicitly or e-,plicg~ly specifies the prot'c9col by whia".ii the resource -may be r`a.tr7cõvi d. the server ironi. wla:eh the resowrcc may be rc:.;fieved, and the patli to ,h4 re~o-urc c on xhw serxwe;. A sSw, t tag I ) 7 for ari anchor elenwnt 1 ;~ rnay comprise ottier attributes (not pictured) i>?.
adda#.acir, to w?d,"or iznstead. of the bref attribute 141.

100441 B'~.`tyS?eOn t1L. start tag 1?7 m3d th-C.' e3.1d tag i 38of an anchor element 136 is the body 143 of"the element 136. A Web browser uriÃl typically display only the body 143 of Llie element, bi,i conarno:aly will highlight or oeher%d:~e, alter the appearance of text andiz?r otffier the body 143 to Fndici:~'i.i ihckt it ss a hyperlink. When ii,' user se>ec.4 2Ã`Ic hYpÃ.`,Ss31`k. :.~ÃT., ~7~, i ^ ~la.ev< ~~ c~
t, :i~.._ the p~~ii~~~ to the displayed 'hody and then clicking the button on the ~~~~ntmg desric;e, thz? Web browser aF.2ompssLo r`~~.trie-ve 11ic hyperlinked resource iatid present it to the tis\.a.

[0045] As depicted in Fig. 2, the body 143 of the anchor element 136 is a citation. This kind of relationship is common in hyperlinked documents such as, e.g., Web pages. The human-readable text indicates that another document is being referred to, while the enclosing markup provides information that an automatic system can use to retrieve the document for presentation to the user. This kind of relationship is merely a conventional usage, however, and it is well known in the art that the body 143 of an anchor element 136 need not be a human-readable reference or citation, but may be any text, image, and/or other content.

[0046] An HTML document contains other tags besides those 137, 138 associated with anchor elements 136. As depicted in Fig. 2, for example, the HTML fragment includes tags 135 defining the beginning and end of a paragraph and other tags 140 indicating the beginning and end of a range of text that is to be presented to the user in an italic typeface.

[0047] Fig. 3 depicts a small collection of documents, some of which refer to other documents. In connection with this example, the references may be considered machine-readable references, human-readable citations, or both. For example, as depicted in Fig. 3, document A 155 cites no other documents. In the legal context, the text of statutes, for example, will often contain no citations. In contrast, document D 156 comprises a citation 157 to document A 155, and document E 158 comprises a citation 159 to document A 155 and another citation 160 to document B 161.

[0048] Documents and their references may be considered elements of a directed graph, as that term is used in computer science, in which each node represents a document, and a reference is represented by an edge from the node representing the citing document to the node representing the cited document. Fig. 4 depicts the documents and relationships of Fig. 3 as a directed graph corresponding to a hypothetical collection of legal documents that cite each other. The documents may comprise any kind of legal source material, including, e.g., judicial opinions, articles in legal journals, etc.
[0049] The graph of Fig. 4 has no cycles. In other words, if one starts at any document in Fig. 4, one cannot follow the trail of citations back to the original document. This property may hold if all documents are, for example, judicial opinions, because an opinion is written at a specific time and, typically, cites only earlier-published opinions. This property need not hold for all types of documents, however. For example, secondary legal sources often cite one another. For another example, a Web page may comprise a hyperlink to a second page that in turn comprises a hyperlink back to the first page. Such cycles are common, e.g., within a Web site, when multiple pages link to one another as an aid to navigating the site.
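
As an illustration only, the relationships described for Fig. 3 can be written as an adjacency mapping and checked for cycles. This is a minimal Python sketch, not part of the described system: the names `CITES` and `has_cycle` are hypothetical, and the empty reference lists for documents B and F are assumptions, since the text does not say what those documents cite.

```python
# Citing document -> documents it cites, following the text's description of
# Fig. 3. The entries for B and F are assumptions made for this example.
CITES = {
    "A": [],
    "B": [],               # assumption: references of B are not described
    "D": ["A"],
    "E": ["A", "B"],
    "F": [],               # assumption: references of F are not described
    "G": ["D", "E", "F"],
    "H": ["E", "B"],
}


def has_cycle(graph):
    """Return True if some trail of references leads back to its start."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current trail, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for cited in graph.get(node, []):
            state = color.get(cited, WHITE)
            if state == GRAY:
                return True        # reached a document already on the trail
            if state == WHITE and visit(cited):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)
```

Under these assumptions, `has_cycle(CITES)` reports no cycle, while two pages that link to each other, such as `{"P": ["Q"], "Q": ["P"]}`, do form one.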

[0050] A reference from one document to another may suggest that the documents concern the same or related subjects. For example, as a matter of common practice, a hyperlink is inserted into a Web page to provide a link to a document that provides further information about something included within the referring page. For another example, in the legal field, an opinion or a treatise, for example, will typically cite at least one authority for every legal point it makes. Such an authority may be mandatory, such as a statute or a binding decision of an appellate court, or it may be persuasive, such as an opinion of a court in another jurisdiction. As a matter of good practice, though, the author of a document will typically cite the strongest, most relevant authority available for any particular point of law.

[0051] Accordingly, in embodiments of the invention, references between documents are used to identify clusters of documents that are likely to concern one or more related topics. A search space may then be constructed that is limited, in whole or in part, to one or more such clusters. (A collection of one or more references may be considered a definition of the search space comprising the referred-to documents.)

[0052] In an embodiment of the invention, a search space is defined comprising a source document and all documents cited in the source document. For example, in connection with Fig. 3, if document E 158 is used as the source document, then the search space will comprise document E 158, document A 155, and document B 161. If a search of the collection 150 is limited to this search space, then the results will comprise matching documents only from this subset, assuming any such matching documents exist.
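
The single-level case just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the substring-matching "search", and the document texts in `TEXTS` are all invented for the example and do not come from the specification.

```python
# References of documents E and D as the text describes them for Fig. 3.
CITES = {"E": ["A", "B"], "D": ["A"], "A": []}

# Hypothetical document texts, invented for this example.
TEXTS = {
    "A": "statute on securities fraud",
    "B": "opinion on bond indentures",
    "D": "opinion on securities fraud",
    "E": "article on fraud in bond trading",
}


def level_one_space(source, cites):
    """Source document plus every document it cites directly."""
    return {source} | set(cites.get(source, []))


def search(query, space, texts):
    """Return the documents in `space` whose text contains `query`."""
    needle = query.lower()
    return sorted(d for d in space if needle in texts.get(d, "").lower())


space = level_one_space("E", CITES)       # E, A and B
matches = search("fraud", space, TEXTS)   # D matches the query but is outside the space
```

Document D contains the query term but is not cited by E, so it is left out of the results.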

[0053] A search space can be defined iteratively, retrieving further cites from one or more cited documents. Starting with, for example, document G 163, that document refers to document D 156, document E 158, and document F 162, and these four documents may be regarded as the level 1 search space. Document D 156 in turn refers to document A 155, and document E 158 refers to document B 161, and therefore the level 2 search space consists of the documents in the level 1 search space plus document A 155 and document B 161.

[0054] A document may be encountered multiple times, and possibly on multiple levels, as the search space is constructed. For example, starting with document H 164, the level 1 search space comprises document E 158 and document B 161. Document E 158 refers to document B 161, however, so document B 161 may be encountered again while constructing the level 2 search space from the level 1 search space. But since document B 161 has already been included in the level 1 search space, the further encounter is not significant and may be disregarded in an embodiment of the invention.

[0055] In the general case, according to an embodiment of the invention, a search space of level N will consist of a source document and all documents that can be reached from it by following a chain of N or fewer references. To use the language of graph theory, a search space of level N will consist of the source document and all documents corresponding to nodes at a distance of N or less from the node corresponding to the source document, where each edge in the graph is directed and corresponds to a reference from one document to another.
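
The graph-theoretic definition above corresponds to a depth-limited breadth-first traversal. The following Python sketch is one possible reading, under stated assumptions: `search_space` and `CITES` are hypothetical names, and the empty reference lists for B and F are assumed because the text does not describe them.

```python
from collections import deque

# Fig. 3 relationships as described in the text; B and F are assumptions.
CITES = {
    "A": [], "B": [], "D": ["A"], "E": ["A", "B"],
    "F": [], "G": ["D", "E", "F"], "H": ["E", "B"],
}


def search_space(source, cites, max_level):
    """Map each reachable document to the length of the shortest chain of
    references from `source`, keeping only chains of at most `max_level`."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        doc = queue.popleft()
        if level[doc] == max_level:
            continue                      # do not follow references any deeper
        for cited in cites.get(doc, []):
            if cited not in level:        # repeat encounters are disregarded
                level[cited] = level[doc] + 1
                queue.append(cited)
    return level
```

With document G as the source, level 1 yields G, D, E and F, and level 2 adds A and B, matching the example in the text. With document H as the source, B is recorded at level 1 and its later encounter through E is ignored.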

[0056] Fig. 5 depicts generating 180 a level N search space according to an embodiment of the invention. In block 185, all references in a source document are identified. The way in which this is done may vary substantially depending on the nature and/or representation of the document and/or the design of the system that carries out this identification.

[0057] In an embodiment of the invention, a source document may comprise references solely in the form of human-readable citations, such as, e.g., the document 100 depicted in Fig. 1. In such a case, identifying references 185 may comprise use of software programmed to recognize patterns of characters corresponding to known citation forms, possibly including commonly used standard and/or nonstandard variants. Such software may take a document as input and then store the found citations, e.g., in a data structure.
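
A pattern-based recognizer of this kind might look like the Python sketch below. It is only an illustration: the regular expression covers a handful of "volume reporter page" forms chosen for the example, the names `CITATION` and `find_citations` are hypothetical, and the sample sentence and its citations are invented.

```python
import re

# Assumed citation shape: volume, reporter abbreviation, first page.
# Only a few reporter abbreviations are covered, with optional inner spaces.
CITATION = re.compile(
    r"\b(\d{1,4})\s+(U\.\s?S\.|F\.\s?[23]d|S\.\s?Ct\.)\s+(\d{1,5})\b"
)


def find_citations(text):
    """Return (volume, reporter, page) tuples in order of appearance."""
    return [match.groups() for match in CITATION.finditer(text)]


# Invented sample text; the citations do not refer to real cases.
sample = "See Smith v. Jones, 123 U.S. 456 (1887); accord 12 F. 3d 345."
found = find_citations(sample)
```

A production recognizer would need a far larger set of citation forms; the point here is only that the found citations end up in an ordinary data structure (a list of tuples).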

[0058] In an embodiment of the invention, a source document may comprise computer-readable references, such as, e.g., those comprised by anchor elements 136 (Fig. 2) in an HTML document 130 (Fig. 2). XML may also be used to embed computer-readable references in a document in connection with an embodiment of the invention. Many other ways to encode computer-readable references are possible and will be apparent to one skilled in the art. In an embodiment of the invention, identifying references 185 may comprise use of software to read the computer-readable references embedded in a document and then store the found references, e.g., in a data structure.
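
For HTML input, reading the embedded references can be sketched with the Python standard library's `html.parser` module. The class and function names and the sample markup are hypothetical; the sketch collects only the `href` value of each anchor start tag and ignores anchors that carry other attributes instead.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects the href value of every anchor start tag it is fed."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.references.append(href)


def extract_references(html):
    collector = AnchorCollector()
    collector.feed(html)
    return collector.references


# Invented fragment: one anchor with an href, one without.
fragment = (
    '<p>See <a href="http://example.com/doc-a"><i>Doc A</i></a> '
    'and <a name="x">a named anchor</a>.</p>'
)
refs = extract_references(fragment)
```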

[0059] In an embodiment of the invention, identifying references 185 in a document may take place prior to use of that document in constructing a search space. For example, in an embodiment of the invention, the references in a document may be identified automatically when the document is first introduced to the system, when the document is indexed for use with a search engine, and/or when the document is revised, among other possibilities. In an embodiment of the invention, one or more references in a document may be identified by a human editor who copies the reference or references into, e.g., a data structure.

[0060] In an embodiment of the invention, some or all references may be normalized. Such normalization may, e.g., identify variant forms of references to the same document and thereby treat any occurrence of any variant as identical to any other occurrence of any variant. Normalizing references in this way may improve the efficiency of constructing the search space by limiting redundant processing of documents and/or references.

[0061] The references so identified may be stored persistently, e.g., in a database management system, and retrieved in connection with constructing 180 a search space.
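
One simple way to treat variant forms as identical is to reduce each citation to a canonical string before storing or comparing it. The Python sketch below assumes the "volume reporter page" shape and a normalization rule (drop spaces and periods in the reporter, upper-case it) that are choices made for this example, not rules stated in the specification.

```python
import re


def normalize(citation):
    """Return a canonical form for a 'volume reporter page' citation,
    or None if the text does not have that shape."""
    match = re.fullmatch(r"\s*(\d+)\s+(.+?)\s+(\d+)\s*", citation)
    if match is None:
        return None
    volume, reporter, page = match.groups()
    # Assumed rule: spacing, periods and case in the reporter are not significant.
    reporter = re.sub(r"[\s.]", "", reporter).upper()
    return f"{int(volume)} {reporter} {int(page)}"


# Invented variants of one citation.
variants = ["12 F.3d 345", "12 F. 3d 345", "12  f3d  345"]
canonical = {normalize(v) for v in variants}
```

Because all three variants map to the same key, a document cited in different styles is retrieved and processed only once.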

[0062] In an embodiment of the invention, a main iterative process is used to construct the search space, with the number of iterations being N, the desired level of the search space. In block 186, a counter is set to indicate the current iteration. Retrieving the references from the source document may be considered the first iteration, so, as depicted in Fig. 5, the counter n is given the initial value of 1 in block 186. The termination condition, that n equals N, is checked for in block 187.

[0063] In the depicted embodiment of the invention, constructing the level n + 1 search space begins with the references defining the level n search space. This may be implemented by, for example, copying a data structure holding the references that define the level n search space to a new data structure that will hold the references that define the level n + 1 search space, as in block 188.

[0064] Blocks 189, 190, and 191 represent a subsidiary iterative process inside the main one. In block 189, a check takes place that determines whether the references have been retrieved for every document in the level n search space. If not, the references from the next unexamined document are retrieved in block 190. In an embodiment of the invention, retrieval of references in block 190 may be similar to retrieval of references from the source document in block 185.

[0065] In block 191, each reference that has been retrieved from the current document but is not part of the definition of the level n + 1 search space is added to that definition. Flow then returns to block 189, where a check for additional unexamined documents takes place. In the embodiment of the invention, this subsidiary process ends when the references have been retrieved for every document in the level n search space.

[0066] Once references have been retrieved from all documents comprised by the definition of the level n search space, the counter n is incremented in block 192, completing a single pass through the main iterative process. The check in block 187 is then repeated to determine if additional passes are necessary. If no additional passes are necessary, the definition of the level N search space is made available in block 193 for use in a search.
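
The flow of Fig. 5 as described in the preceding paragraphs can be sketched in Python, with comments keyed to the block numbers. This is a hedged reading rather than a definitive implementation: `build_search_space` and `get_references` are hypothetical names, the source document is included in the definition by assumption, N is assumed to be at least 1, and the Fig. 3 entries for B and F are assumed to be empty.

```python
# Fig. 3 relationships as described in the text; B and F are assumptions.
CITES = {
    "A": [], "B": [], "D": ["A"], "E": ["A", "B"],
    "F": [], "G": ["D", "E", "F"], "H": ["E", "B"],
}


def build_search_space(source, get_references, N):
    """Construct the definition of a level-N search space (N >= 1)."""
    space = {source, *get_references(source)}   # block 185: refs of the source
    n = 1                                       # block 186: first iteration done
    while n < N:                                # block 187: stop once n equals N
        next_space = set(space)                 # block 188: copy level n
        for doc in space:                       # block 189: any unexamined docs?
            for ref in get_references(doc):     # block 190: retrieve references
                if ref not in next_space:       # block 191: add only new ones
                    next_space.add(ref)
        space = next_space
        n += 1                                  # block 192: one pass completed
    return space                                # block 193: definition available


def fig3_references(doc):
    return CITES.get(doc, [])
```

As in the described embodiment, each pass examines every document of the level n space, including ones examined in earlier passes; an implementation that examines only the documents added in the previous pass would produce the same result with less work.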

[0067] In alternative embodiments of the invention, some references may be excluded from the definition of a search space, and/or some documents may not be examined for references. In some cases, for example, a reference may not be added because it is not recognized or because it refers to a document that is not indexed for searching. References may not be retrieved from a document because, for example, the document itself cannot be located or, once retrieved, is in a form that cannot be examined for citations.

[0068] In an embodiment of the invention, it may be possible to specify one or more classes of documents and/or references to be excluded from the process. For example, when used to construct a search space of legal documents, statutes and/or judicial opinions from foreign jurisdictions and/or references to such documents may be ignored in an embodiment of the invention.

[0069] In an embodiment of the invention, information may be recorded about some or all relationships between and/or among documents in a search space, and such information may be used in connection with searches of the space. For example, the number of references in a first document to a second document may be recorded if it is believed that the number of references correlates with the degree of relevance between the two documents. In connection with a search space of level 2 or more, the level at which a document was added to the search space may be considered an indicator of the relevance of the document. Other properties of the documents and/or the references may also be recorded, and, in an embodiment of the invention, some or all of the factors may be used to order the results of a search of any search space.
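
How such recorded factors might be combined to order results is left open by the text. The Python sketch below shows one assumed scheme, chosen purely for illustration: documents added at a lower level come first, and ties are broken by how many times the document was referenced. All names and data are hypothetical.

```python
from collections import Counter

# Hypothetical recorded information about a search space.
level_added = {"A": 1, "B": 1, "C": 2}            # level at which each doc was added
reference_counts = Counter(["A", "B", "A"])       # A was referenced twice, B once


def rank_results(results, level_added, reference_counts):
    """Order results by level added (ascending), then by reference count
    (descending), then by identifier for a stable, repeatable order."""
    return sorted(
        results,
        key=lambda d: (
            level_added.get(d, float("inf")),
            -reference_counts.get(d, 0),
            d,
        ),
    )


ranked = rank_results(["C", "B", "A"], level_added, reference_counts)
```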

[0070] Depending on, e.g., the nature of the documents, the size of the collection, and the ways in which documents in the collection refer to one another, it may be practical to define an exhaustive search space comprising every document directly or indirectly referred to in a source document. Fig. 6 depicts defining 200 an exhaustive search space according to an embodiment of the invention.

[0071] As before, defining 200 the exhaustive search space may in an embodiment of the invention be an iterative process that begins in block 201 with retrieving the references found in a source document. Because the search is exhaustive, it ends when no references can be found besides those already in the definition of the search space. This termination condition is checked for in block 202.

[0072] In blocks 203, 204, and 205, the references from each of the documents added to the search space in the previous pass are added to the definition of the search space. The determination is made in block 203 whether any documents remain for which the references have yet to be added. If so, the references from the next document are retrieved in block 204 and added to the definition of the search space in block 205. The check in block 203 is then repeated.

[0073] Once references have been retrieved from each document added during the previous pass, the check in block 202 is repeated. If the latest pass did not increase the size of the search space, then the exhaustive search space has been successfully defined and, in block 206, its definition is made available for use in searching.

[0074] In an embodiment of the invention, the definition 200 of the exhaustive search space may be interrupted if the process exceeds a preset running time and/or if the search space exceeds a preset size.
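
The exhaustive construction of Fig. 6, including the optional interruption, might be sketched as follows. The function name, the `(space, complete)` return value and the parameter names `max_size` and `max_seconds` are assumptions introduced for this Python example.

```python
import time


def exhaustive_space(source, get_references, max_size=None, max_seconds=None):
    """Return (space, complete). `complete` is False if a limit cut the
    process short before every reachable document was found."""
    start = time.monotonic()
    space = {source}
    previous_pass = [source]                  # block 201: start from the source
    while previous_pass:                      # block 202: did the last pass add anything?
        added = []
        for doc in previous_pass:             # block 203: documents still to examine
            for ref in get_references(doc):   # block 204: retrieve references
                if ref not in space:          # block 205: add new references
                    space.add(ref)
                    added.append(ref)
            if max_size is not None and len(space) > max_size:
                return space, False           # preset size exceeded
            if max_seconds is not None and time.monotonic() - start > max_seconds:
                return space, False           # preset running time exceeded
        previous_pass = added
    return space, True                        # block 206: definition available


# Invented graph containing a cycle between P and Q.
LINKS = {"P": ["Q"], "Q": ["P", "R"], "R": []}


def links(doc):
    return LINKS.get(doc, [])
```

Because documents already in the space are never re-added, the loop terminates even when the references form cycles.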

[0075] So far, the invention has been discussed in connection with an embodiment that builds a search space by moving from referring documents to referenced documents, but the invention is not limited to such embodiments. In embodiments of the invention, a collection of documents may be indexed so that references between documents are stored, e.g., in a database. In such an embodiment, retrieving the references from a document may comprise submitting a query to the database that requests all references comprised by the source document.

[0076] In another such embodiment, however, it is possible to traverse the graph in the opposite direction, e.g., by submitting a database query that identifies all documents that refer to a source document. Referring to Fig. 3 and taking, for example, document B 161 as the source document, the level 1 search space comprises document B 161, document E 158, and document H 164. The level 2 search space additionally comprises document G 163.
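
Traversal in the citing direction can be sketched by first inverting the stored references and then expanding level by level. In this Python sketch the in-memory dictionary stands in for the database query mentioned in the text; `invert`, `citing_space` and the empty reference lists for B and F are assumptions.

```python
# Fig. 3 relationships as described in the text; B and F are assumptions.
CITES = {
    "A": [], "B": [], "D": ["A"], "E": ["A", "B"],
    "F": [], "G": ["D", "E", "F"], "H": ["E", "B"],
}


def invert(cites):
    """Build a cited -> {citing documents} index from a citing -> cited map."""
    cited_by = {}
    for citing, refs in cites.items():
        for cited in refs:
            cited_by.setdefault(cited, set()).add(citing)
    return cited_by


def citing_space(source, cited_by, levels):
    """Source document plus documents that refer to it through a chain of
    at most `levels` references."""
    space = {source}
    frontier = {source}
    for _ in range(levels):
        frontier = {c for doc in frontier for c in cited_by.get(doc, ())} - space
        space |= frontier
    return space


CITED_BY = invert(CITES)
```

With document B as the source, one level yields B, E and H, and a second level adds G, in line with the example in the text.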

[0077] Depending on the embodiment of the invention and the nature of the searchable data collection, it may be impossible, impractical, or undesirable to identify all documents that hold references to a particular document. For example, identifying all Web pages that link to any particular page may not be practical, but such references may be recorded as individual pages are indexed for search. Such indexing of Web references is done by, for example, the Google Internet search engine.

[0078] In a legal context, a search space so constructed may comprise, e.g., lines of decisions and/or analyses based on a significant statute or judicial opinion.

[0079] In an embodiment of the invention, a search space may be defined by traversing the graph in either or both directions. In some further embodiments of the invention, a user may choose the type of search space desired.

[0080] It will be appreciated by those skilled in the art that there are many ways to implement searching within a search space such as is built in an embodiment of the invention. For example, tools for searching within the contents of textual databases are available from commercial suppliers. One such tool is Oracle Text, which supports text searching and database searching in a single SQL statement.
[0081] Thus, in an embodiment of the invention, a relational database management system comprises one or more documents stored so as to allow each document to be associated with metadata. After a search space is constructed, the metadata associated with each document in the search space is modified to indicate such membership. (In an embodiment of the invention, a document can be included in multiple search spaces simultaneously.) When a search query is received for a given search space, the query is transformed into a SQL statement that specifies a text search, limited to documents already identified in their associated metadata as part of the search space.
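
The metadata-based approach can be illustrated with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the table and column names are invented, membership is modeled as a separate table so that one document can belong to several search spaces, and a plain `LIKE` match stands in for the text-search facility of a product such as Oracle Text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT);
    -- Metadata recording which search spaces each document belongs to.
    CREATE TABLE space_membership (
        space_id TEXT,
        doc_id   TEXT,
        PRIMARY KEY (space_id, doc_id)
    );
""")
# Invented documents and search spaces.
conn.executemany("INSERT INTO documents VALUES (?, ?)", [
    ("A", "insider trading of bonds"),
    ("B", "bond indentures"),
    ("C", "insider trading of stock"),
])
conn.executemany("INSERT INTO space_membership VALUES (?, ?)", [
    ("securities", "A"), ("securities", "B"),
    ("other", "A"), ("other", "C"),      # document A is in two spaces at once
])


def search_in_space(conn, space_id, term):
    """Text search and membership restriction in a single SQL statement."""
    rows = conn.execute(
        "SELECT d.id FROM documents d "
        "JOIN space_membership m ON m.doc_id = d.id "
        "WHERE m.space_id = ? AND d.body LIKE ? "
        "ORDER BY d.id",
        (space_id, f"%{term}%"),
    ).fetchall()
    return [row[0] for row in rows]
```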

[0082] In an alternative embodiment of the invention, a search engine conducts a search as if no topical search space as described herein has been specified, and then uses the definition of the desired search space to filter the results. Filtering search results according to such an embodiment of the invention is depicted in Fig. 7. In block 220, a user selects a topical area for the search. The nature of this selection will depend on the embodiment of the invention. In one embodiment, however, a user is presented with a hierarchical directory of topics and subtopics. By appropriate navigation, the user may select one or more topics relevant to the search.

[0083] In block 221, the user specifies the criteria for the search by entering a query. Many types of search queries are well known in the relevant arts, and include, for example, Boolean queries and natural-language-based queries. In block 222, a search is performed across the entire document collection, producing a list of results.


[0084] In block 223, the definition of the search space is retrieved, and in block 224, this definition is used to filter the results of the search from block 222. In an embodiment of the invention, the definition of the search space comprises normalized citations to every document comprised by the search space. Filtering the results in block 224 may comprise, for example, checking every document in the list of results against the definition of the search space and removing from the results any documents not found in that definition. Finally, in block 225, the filtered results of the search are presented to the user, who may then, in an embodiment of the invention, be able to retrieve and work with some or all of the documents comprised by the results.
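
The search-then-filter flow of Fig. 7 can be sketched in Python as follows. The substring-matching search, the document identifiers used as stand-ins for normalized citations, and all names and texts are assumptions made for this example.

```python
# Hypothetical collection, invented for this example.
TEXTS = {
    "A": "statute on securities fraud",
    "B": "opinion on bond indentures",
    "D": "opinion on securities fraud",
    "E": "article on fraud in bond trading",
}

# Definition of the search space; identifiers stand in for normalized citations.
SPACE_DEFINITION = {"A", "B", "E"}


def search_collection(query, texts):
    """Block 222: search the entire collection, ignoring any search space."""
    needle = query.lower()
    return [doc for doc in sorted(texts) if needle in texts[doc].lower()]


def filter_to_space(results, space_definition):
    """Block 224: drop every result not found in the search space definition."""
    return [doc for doc in results if doc in space_definition]


all_results = search_collection("fraud", TEXTS)               # block 222
filtered = filter_to_space(all_results, SPACE_DEFINITION)     # blocks 223 and 224
```

Compared with restricting the query itself, this variant lets an unmodified search engine be reused, at the cost of computing results that are then discarded.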

[0085] The contents of the search space may be expected to depend heavily on the contents of the source document used to generate it. In an embodiment of the invention, a set of topical source documents is used as the basis for generating topical search spaces. Such a topical document (which may be referred to as a "law note") may, for example, be prepared by an expert in the field and may cite those sources believed to be most relevant to the topic.

[0086] For example, a particular law note may be a memorandum or an article discussing, e.g., the application of principles of securities fraud to trading of debt securities by persons having material nonpublic information. Another law note may, for example, comprise a compilation of citations to laws and regulations regarding a particular topic, e.g., securities regulation. Another law note may be a directory of authorities particularly relevant to, e.g., securities regulation. Each such document may be written and maintained by one or more experts in securities law, and may cite the relevant statutory provisions and regulations and/or leading case law. Depending on the purpose for which the document was written, it may also cite many court opinions reflecting application of the principles discussed in the other authorities. In an embodiment of the invention, this article may be used as the source article in the generation of a search space when a user is looking for information about, e.g., insider trading of corporate bonds.

[0087] In an embodiment of the invention, a user may select a topic for searching from a screen display such as is depicted in Fig. 8. A caption 255 may be present, indicating, e.g., the function of the display, the collection of documents, and/or other information. The display may comprise, e.g., one or more menus 256 and/or other means for selecting one or more commands and/or additional functions.

[0088] One or more high-level topics may be displayed, such as, e.g., Environmental Law 257, Admiralty Law 258, and Insurance Law. The organization of topics may be hierarchical, with one or more topics associated with one or more sub-topics, which may in turn be associated with one or more further sub-topics, and so on. (For ease of description, "topic" comprises "sub-topic" as used herein.) For example, as depicted in Fig. 8, the topic Securities Law 260 is associated with the topic Securities Regulation 261, which is in turn associated with the topics Federal Regulation of Securities 262, State Regulation of Securities 263, Exchange & SRO Regulation 264, and International Regulation of Securities 265. In an embodiment of the invention, a topic may be associated with more than one higher-level topic.
[00891 A tree 270) may be use; it) preseit th~. hierarchically-organized topi::.s. An icon 271 may indicate the presence of oiie or morc 7ubs~niec~ bv a topic or topic. Icons ?7'~? may also 'oc:_ provided that, wbeii 5C`.le:a ~ed, cause all topics below a pfirtiLulc1Ã' topic to be si:id.den, 1,100901 SeI:.cÃion 273 of one or :....; 4 topics ard,`o: s4Ãb-tc}pÃcs mm, be T-eflected in the tree 270 b~~~~
's:nd,'f)Ã` :#.tl`~.'S 274 of one or moi`s:.'dZtCw-nC;i"3tS aaxd/o:i"
directories of documents nai'.i.;' be presented :Fi d.~ tbob-=urneat a?VW.2 275. TA).. an i.+11ab1Fdiin41{t oj`ihe inienAion, o1-i?,= `ik- 21iohe iaw52ote.:32Gi[LY bfir identified in the ~.~ocu~~~~~t area 275.

[00911 I: is pc~s4.:`Ii' .,I: u.tA eI71b?t.l~Ãm+M:l':tt of the i,Ti4''+
.:.iItiOn nw?P' a F1.5er to select multiple topics aÃzdr`o.~"
su'~~opics to be searched ~imWtaÃ~~~~~~ly. ~~~~ suciA an embodfinenÃ, :he se=iA.:-cas sp~.~~~ ~nay hc the unit>r of the searcb spaces corresponding to ~ach of the selected :~~icsa For exansple, s~~ect~ng the topic Securities Rega`atioai 261 would result in a search Space con~~p, Ãsin- the union of the sc;ar~~~ ~paces ccrirospondang to FwderalRegulatioÃ~ of Se::urixies 262.
St:.>e Rvp'Wat:~~~ of S.cur1rfCs '2 i.~;, ExGhxs~,qG & SRO Ri.:gi.F.:a tif5i1 264, and f~teTn`r3tii;n-l! P llc.~~`z3+`.3n L?s Se;uF''it:.6 2-65.
In ar of the 3n ve.r. tion, multipled:sc;r.:e ateiiis x~~~xv be selected at the swrle ÃAme;
~. g , GcÃ-.s raJ P~actioe 259 wid Staie: Regu?atÃo ; of So..irities ?6-5, 00921 A search func;ion -m-ay be ~~ovÃded, aind Fi g. 9 depicts a ~.Sp&:y 290f~o. m which a User aa~~iy eiiÃer a warc.i quc~~ in an enibodinie.it of the :m'eiatian. A wxt area 291 may b~ present;
displa-~yiÃ.g a CiuG'> y as it is entered and/or L.dI .eb.`~, One or more controls 292 may alsi? r~e present, Ca:ow~iig the ]ear1v~`~'~" to be restr{(:3.ed, e..g. 5 by date, dCrcui('..ent {t Yrpt?v .J i.lA ~iSd.csion a3n`_i,/I S, l. .`1CIr 'F.r~teZ ia, 00931 A cor~~rol'~~~~ ~~~~aya~~~~~~~ the u:. '- i(,i select the scope ~~~the search. For example, the si'taxch ciiay bQ liiTl`t<`o, to the text of one, or more ~~cume}'fts pres~.':'Aed I;: the d4acumt`.nZ ta.:'ea 275 F;g: 8). Azternat>vely, a se~.:c}.~ may bc. spOcÃfÃ4d o3 a search &-fined a-naccopdance with the i;:iven.i~~, as desÃ:i-ibe~.-~ a'bove. In at3 emboc~lmem o.: Ãhe invo:xian. ~lie source document used to co~~stri3.Fl:.' thl''lL' .`"'ach-sa)i?.S:'e i3`ay bC' f)ei`h. .
(?:"3'Iei3r~.'~. l`r1-w` Tl:)tG e Wls3.t tbi 5el:(`>ctcd t".~Zp;~' or :op c:s and's~r any si:batc~~~ca c i:l; ._.;{.:.1t-lic c~~ieor rnt?x~ sub--top:cs.

0094; In~~ ~mbo(iiment of the ;nventa(ir., th.. ~~se, may be pz~~ ideci with one ornaore eciÃ~irols (no. pic.tu~ed} allowing s;,~ecAficatia~~~ of th4' of search space to be used.
In tui.oÃhei LmbodEmenLciJ ~hei~_venLiVn.4h\. leT74S S.Lõ3socaw'..~d.~ ~vi2ua some or all topics and/or source c~ocu~~ents may be lld.Ycd. For exumple. it inay~ be de}ea~mined the. vdhen a source docur3~en: is a lw:~~3otc~ tl~aà comprises compiled references to statutes and r4g:uA tions, a .level- ~ search space provides optimum coverage of t.he topic, which may meaii, e,g., that such a search space is most likely tC) incorporate relevant material without tJv'4''riil4..'hi41o1'3. (The E3VSr' of a lw.L7e!--') search space bi this oxampleis meant to be purely emen-ip1ary, and should not be taken to `hA3., t.~t~
even -when the sov.scc document is a ~aw, giotu as described iierein.) 100951 The ~nventiionhas be:~;-o des<:Ã-~beci a:;~ov~ in connection with certain preferred ennbodinie:its. This description is prirely illustrative and not ~Umitingx afad other emboc~iment:~ ~f th.e invenÃior, will be apparent ÃGr :iiose skilled zr, t;ie relevant arts.


Claims (15)

1. A computerized system for identifying one or more electronic documents within a collection of electronic documents, comprising:

at least one interface; and at least one processor coupled to the at least one interface and programmed to (1) accept a search query through one of the interfaces, (2) obtain a definition of a subset of a collection of electronic documents that comprises a plurality of electronic documents, (3) execute the search query within the subset, thereby obtaining at least one result, and (4) provide at least one of the results through one of the interfaces;
wherein obtaining a definition of a subset comprises defining a subset to comprise (1) at least one source document within the collection, each of the source documents comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, and (2) further additional documents identifiable by, for some number of iterations, for each additional document added to the subset in the immediately preceding iteration: (a) retrieving the additional document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not in the definition of the subset, to the definition of the subset.
2. The system of claim 1, wherein the subset comprises every electronic document within the collection that any electronic document within the subset comprises a reference to.
3. A computerized system for identifying one or more electronic documents within a collection of electronic documents, comprising:
at least one interface; and at least one processor coupled to the at least one interface and programmed to (1) accept a search query through one of the interfaces, (2) obtain a definition of a subset of a collection of electronic documents that comprises a plurality of electronic documents, (3) execute the search query within the subset, thereby obtaining at least one result, and (4) provide at least one of the results through one of the interfaces;
wherein obtaining a definition of a subset comprises defining a subset to comprise (1) at least one source document within the collection, (2) additional citing documents identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration:
(a) finding one or more additional citing documents in the collection, each comprising at least one reference to the document, and (b) adding each additional citing document, not already in the subset, to the subset.
4. The system of claim 3, wherein the subset comprises every electronic document within the collection that is known to comprise a reference to any electronic document within the subset.
5. A method of identifying one or more documents within a collection of documents, comprising:
defining a subset of a collection of documents, the collection of documents comprising a plurality of documents, and the subset comprising (1) at least one source document within the collection of documents, each, source document comprising at least one reference that identifies additional document within the collection of documents, distinct from the source document, (2) additional documents identifiable by, for some number of iterations, for each document added to the search space in the immediately preceding iteration: (a) retrieving the document, (b) finding in the retrieved document one or references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not already in the definition of the search space, to the definition of the search space;
accepting a search query comprising one or more criteria; and identifying one or more documents within the subset that satisfy the one or more criteria comprised by the search query.
6. The method of claim 5, wherein the subset comprises every electronic document within the collection that any electronic document within the subset comprises a reference to.
7. A method of identifying one or more documents within a collection of documents, comprising:
defining a subset within a collection of documents, the collection of documents comprising a plurality of documents, and the subset comprising (1) at least one source document within the collection of documents, (2) additional citing documents identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration:

(a) finding one or more additional citing documents each comprising at least one reference to the document, and (b) adding each additional citing document, not already in the subset, to the subset;

accepting a search query comprising one or more criteria; and identifying one or more documents within the subset that satisfy the one or more criteria comprised by the search query.
8. The method of claim 7, wherein the subset comprises every electronic document within the collection that is known to comprise a reference to any electronic document within the subset.
9. The method of claim 5, 6, 7, or 8, wherein at least one of the source documents pertains to at least one area of law.
10. The method of claim 9, comprising selecting at least one topic from a directory of areas of law, wherein each source document pertains to at least one of the selected topics.
11. A method of defining a topical subset of a collection of documents, comprising:
defining a subset of a collection of documents to comprise at least one source document within the collection of documents, each source document comprising at least one reference that identifies an additional document within the collection of documents, distinct from the source document, wherein this defining comprises a first iteration, and defining the subset to comprise additional documents identifiable by, for some number of iterations, for each document added to the search space in the immediately preceding iteration: (a) retrieving the document, (b) finding in the retrieved document one or more references, each of the one or more references identifying an additional document, and (c) adding each of the found references, not already in the definition of the search space, to the definition of the search space.
12. The method of claim 11, wherein the subset comprises every document within the collection that any document within the subset comprises a reference to.
13. A method of defining a topical subset of a collection of documents, comprising:

defining a subset of a collection of documents to comprise at least one source document within the collection of documents, wherein this defining comprises a first iteration;

defining the subset to comprise at least one additional document identifiable by, for some number of iterations, for each document added to the subset in the immediately preceding iteration: (a) finding one or more additional citing documents within the collection of documents, each additional citing document comprising one or more references to the document, and (b) adding each of the additional documents to the subset.
14. The method of claim 13, wherein the subset comprises every document within the collection that is known to comprise a reference to any document within the subset.
15. The method of claim 11, 12, 13, or 14, wherein the content of the source document pertains to at least one area of law.
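
The iterative subset definition recited in the claims above can be illustrated with a short Python sketch. It is not taken from the patent and is not the applicants' implementation; it is a minimal reading of the claim language under stated assumptions. The toy collection, the document identifiers, and the names expand_subset, references_of, citers_of, and search are all hypothetical, and the search query is simplified to a single substring test.

```python
from collections import defaultdict

# Hypothetical toy collection: document id -> (text, ids of documents it cites).
COLLECTION = {
    "A": ("securities fraud statute", ["B", "C"]),
    "B": ("insider trading opinion", ["D"]),
    "C": ("disclosure rule", []),
    "D": ("tender offer regulation", []),
    "E": ("commentary on insider trading", ["B"]),
    "F": ("unrelated maritime case", []),
}


def references_of(doc_id):
    """Documents that doc_id refers to (the direction used in claims 5 and 11)."""
    return COLLECTION[doc_id][1]


def build_citing_index(collection):
    """Invert the reference lists: document id -> ids of documents citing it."""
    index = defaultdict(list)
    for doc_id, (_text, refs) in collection.items():
        for ref in refs:
            index[ref].append(doc_id)
    return index


CITING_INDEX = build_citing_index(COLLECTION)


def citers_of(doc_id):
    """Documents that refer to doc_id (the direction used in claims 3, 7 and 13)."""
    return CITING_INDEX.get(doc_id, [])


def expand_subset(sources, neighbors, max_iterations=None):
    """Grow a subset from the source documents, one citation hop per iteration.

    Each pass looks only at documents added in the immediately preceding
    iteration. With max_iterations=None the loop runs until nothing new is
    found, which corresponds to the closure described in the dependent claims.
    """
    subset = set(sources)
    frontier = set(sources)  # documents added in the preceding iteration
    iteration = 0
    while frontier and (max_iterations is None or iteration < max_iterations):
        added = set()
        for doc_id in frontier:
            for other in neighbors(doc_id):
                if other not in subset:  # skip documents already in the subset
                    subset.add(other)
                    added.add(other)
        frontier = added
        iteration += 1
    return subset


def search(subset, term):
    """Simplified query: ids of subset documents whose text contains term."""
    return sorted(d for d in subset if term in COLLECTION[d][0])


if __name__ == "__main__":
    cited = expand_subset(["A"], references_of)
    citing = expand_subset(["D"], citers_of)
    print(sorted(cited), search(cited, "insider"))
    print(sorted(citing), search(citing, "insider"))
```

Passing the direction of traversal as the neighbors argument lets one loop cover both families of claims: references_of follows references outward from the source documents, while citers_of collects the documents that cite them. Because the query runs only over the returned subset, document E matches "insider" in the second search but not in the first, where it was never added to the subset.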
CA002650381A 2006-04-26 2007-03-29 System and method for topical document searching Abandoned CA2650381A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US11/412,315 2006-04-26
US11/412,315 US9529903B2 (en) 2006-04-26 2006-04-26 System and method for topical document searching
PCT/US2007/065621 WO2007127579A2 (en) 2006-04-26 2007-03-29 System and method for topical document searching

Publications (1)

Publication Number Publication Date
CA2650381A1 (en) 2007-11-08

Family

ID=38649508

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002650381A Abandoned CA2650381A1 (en) 2006-04-26 2007-03-29 System and method for topical document searching

Country Status (5)

Country Link
US (2) US9529903B2 (en)
EP (1) EP2013785A4 (en)
AU (1) AU2007243024A1 (en)
CA (1) CA2650381A1 (en)
WO (1) WO2007127579A2 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8065277B1 (en) 2003-01-17 2011-11-22 Daniel John Gardner System and method for a data extraction and backup database
US8943024B1 (en) 2003-01-17 2015-01-27 Daniel John Gardner System and method for data de-duplication
US8375008B1 (en) 2003-01-17 2013-02-12 Robert Gomes Method and system for enterprise-wide retention of digital or electronic data
US8630984B1 (en) 2003-01-17 2014-01-14 Renew Data Corp. System and method for data extraction from email files
US8069151B1 (en) 2004-12-08 2011-11-29 Chris Crafford System and method for detecting incongruous or incorrect media in a data recovery process
US8527468B1 (en) 2005-02-08 2013-09-03 Renew Data Corp. System and method for management of retention periods for content in a computing system
US8150827B2 (en) * 2006-06-07 2012-04-03 Renew Data Corp. Methods for enhancing efficiency and cost effectiveness of first pass review of documents
US20080189273A1 (en) * 2006-06-07 2008-08-07 Digital Mandate, Llc System and method for utilizing advanced search and highlighting techniques for isolating subsets of relevant content data
AR062635A1 (en) * 2006-09-01 2008-11-19 Thomson Global Resources SYSTEM, METHODS, SOFTWARE AND INTERFACES TO FORMAT APPOINTMENTS OF LEGISLATION
US8332384B2 (en) * 2007-11-29 2012-12-11 Bloomberg Finance Lp Creation and maintenance of a synopsis of a body of knowledge using normalized terminology
US10296528B2 (en) * 2007-12-31 2019-05-21 Thomson Reuters Global Resources Unlimited Company Systems, methods and software for evaluating user queries
US8615490B1 (en) 2008-01-31 2013-12-24 Renew Data Corp. Method and system for restoring information from backup storage media
US9135331B2 (en) 2008-04-07 2015-09-15 Philip J. Rosenthal Interface including graphic representation of relationships between search results
US8392706B2 (en) * 2008-11-26 2013-03-05 Perlustro, L.P. Method and system for searching for, and collecting, electronically-stored information
WO2011075610A1 (en) 2009-12-16 2011-06-23 Renew Data Corp. System and method for creating a de-duplicated data set
US20120215738A1 (en) * 2011-02-23 2012-08-23 Tologix Software Inc. System and methods for the management, structuring, and access to a legal documents database
US20120290612A1 (en) * 2011-05-10 2012-11-15 Ritoe Rajan V N-dimensional data searching and display
US8527863B2 (en) * 2011-06-08 2013-09-03 International Business Machines Corporation Navigating through cross-referenced documents
US9122666B2 (en) * 2011-07-07 2015-09-01 Lexisnexis, A Division Of Reed Elsevier Inc. Systems and methods for creating an annotation from a document
US9589051B2 (en) * 2012-02-01 2017-03-07 University Of Washington Through Its Center For Commercialization Systems and methods for data analysis
US9201969B2 (en) * 2013-01-31 2015-12-01 Lexisnexis, A Division Of Reed Elsevier Inc. Systems and methods for identifying documents based on citation history
US11144994B1 (en) 2014-08-18 2021-10-12 Street Diligence, Inc. Computer-implemented apparatus and method for providing information concerning a financial instrument
US10474702B1 (en) 2014-08-18 2019-11-12 Street Diligence, Inc. Computer-implemented apparatus and method for providing information concerning a financial instrument
US10977284B2 (en) 2016-01-29 2021-04-13 Micro Focus Llc Text search of database with one-pass indexing including filtering
US20180121502A1 (en) * 2016-10-28 2018-05-03 The Bureau Of National Affairs, Inc. User Search Query Processing
US10832000B2 (en) 2016-11-14 2020-11-10 International Business Machines Corporation Identification of textual similarity with references
US20240086482A1 (en) * 2022-09-13 2024-03-14 International Business Machines Corporation Cross application meta history link

Family Cites Families (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5619709A (en) * 1993-09-20 1997-04-08 Hnc, Inc. System and method of context vector generation and retrieval
US6339767B1 (en) * 1997-06-02 2002-01-15 Aurigin Systems, Inc. Using hyperbolic trees to visualize data generated by patent-centric and group-oriented data processing
US5758257A (en) * 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US6460036B1 (en) * 1994-11-29 2002-10-01 Pinpoint Incorporated System and method for providing customized electronic newspapers and target advertisements
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US5717914A (en) * 1995-09-15 1998-02-10 Infonautics Corporation Method for categorizing documents into subjects using relevance normalization for documents retrieved from an information retrieval system in response to a query
US5640553A (en) * 1995-09-15 1997-06-17 Infonautics Corporation Relevance normalization for documents retrieved from an information retrieval system in response to a query
US5675788A (en) * 1995-09-15 1997-10-07 Infonautics Corp. Method and apparatus for generating a composite document on a selected topic from a plurality of information sources
CA2245913C (en) * 1996-04-10 2002-06-11 At&T Corp. A system and method for finding information in a distributed information system using query learning and meta search
US5842213A (en) * 1997-01-28 1998-11-24 Odom; Paul S. Method for modeling, storing, and transferring data in neutral form
US6122635A (en) * 1998-02-13 2000-09-19 Newriver Investor Communications, Inc. Mapping compliance information into useable format
US7181438B1 (en) * 1999-07-21 2007-02-20 Alberti Anemometer, Llc Database access system
US6519586B2 (en) * 1999-08-06 2003-02-11 Compaq Computer Corporation Method and apparatus for automatic construction of faceted terminological feedback for document retrieval
US6859800B1 (en) * 2000-04-26 2005-02-22 Global Information Research And Technologies Llc System for fulfilling an information need
US6636848B1 (en) * 2000-05-31 2003-10-21 International Business Machines Corporation Information search using knowledge agents
US7490092B2 (en) * 2000-07-06 2009-02-10 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US6804662B1 (en) * 2000-10-27 2004-10-12 Plumtree Software, Inc. Method and apparatus for query and analysis
US7080076B1 (en) * 2000-11-28 2006-07-18 Attenex Corporation System and method for efficiently drafting a legal document using an authenticated clause table
US6584470B2 (en) 2001-03-01 2003-06-24 Intelliseek, Inc. Multi-layered semiotic mechanism for answering natural language questions using document retrieval combined with information extraction
EP1402408A1 (en) * 2001-07-04 2004-03-31 Cogisum Intermedia AG Category based, extensible and interactive system for document retrieval
US7206778B2 (en) * 2001-12-17 2007-04-17 Knova Software Inc. Text search ordered along one or more dimensions
US7519589B2 (en) * 2003-02-04 2009-04-14 Cataphora, Inc. Method and apparatus for sociological data analysis
JP2004054631A (en) * 2002-07-19 2004-02-19 Internatl Business Mach Corp <Ibm> Information retrieval system, information retrieval method, structural analysis method of html document, and program
US8543564B2 (en) * 2002-12-23 2013-09-24 West Publishing Company Information retrieval systems with database-selection aids
US20050114130A1 (en) * 2003-11-20 2005-05-26 Nec Laboratories America, Inc. Systems and methods for improving feature ranking using phrasal compensation and acronym detection
US7287012B2 (en) * 2004-01-09 2007-10-23 Microsoft Corporation Machine-learned approach to determining document relevance for search over large electronic collections of documents
EP2662784A1 (en) * 2004-03-15 2013-11-13 Yahoo! Inc. Search systems and methods with integration of user annotations
JP4814239B2 (en) 2004-08-23 2011-11-16 レクシスネクシス ア ディヴィジョン オブ リード エルザヴィア インコーポレイテッド Indexed case identification system and method
US7577671B2 (en) * 2005-04-15 2009-08-18 Sap Ag Using attribute inheritance to identify crawl paths
US7814102B2 (en) * 2005-12-07 2010-10-12 Lexisnexis, A Division Of Reed Elsevier Inc. Method and system for linking documents with multiple topics to related documents
WO2007076453A1 (en) * 2005-12-21 2007-07-05 Decernis, Llc Document validation system and method
US7761436B2 (en) * 2006-01-03 2010-07-20 Yahoo! Inc. Apparatus and method for controlling content access based on shared annotations for annotated users in a folksonomy scheme

Also Published As

Publication number Publication date
EP2013785A4 (en) 2009-09-09
US9529903B2 (en) 2016-12-27
WO2007127579A3 (en) 2008-10-09
WO2007127579A2 (en) 2007-11-08
EP2013785A2 (en) 2009-01-14
US20100257161A1 (en) 2010-10-07
US20070255686A1 (en) 2007-11-01
US9519707B2 (en) 2016-12-13
AU2007243024A1 (en) 2007-11-08
WO2007127579A8 (en) 2008-11-27

Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead

Effective date: 20150331