BR102016022812A2

BR102016022812A2 - LEGAL COGNITION METHOD

Info

Publication number: BR102016022812A2
Application number: BR102016022812-3A
Authority: BR
Inventors: Pinto Medeiros Richerland; Fernando Marar João; Tadeu Rondina Mandaliti Renato; Buchina Armando; Paulucci Caio; Gustavo Gutierrez Luiz; Souza Jonathan; Inoue Felipe
Original assignee: Tadeu Rondina Mandaliti Renato
Priority date: 2016-09-30
Filing date: 2016-09-30
Publication date: 2018-05-02
Also published as: WO2018058223A1

Abstract

The present invention relates to a legal cognition method comprising the following steps: a. acquiring (10) data from courts and legal publications; b. structuring (20) the data by means of taxonomic variables; c. selecting the taxonomic variables; and d. creating (30) a predictive model.

Description

(54) Title: LEGAL COGNITION METHOD
(51) Int. Cl.: G06Q 50/18; G06F 17/27; G06F 17/20; G06F 17/30
(52) CPC: G06Q 50/18, G06F 17/27, G06F 17/20, G06F 17/30
(73) Owner(s): RENATO TADEU RONDINA MANDALITI, RODRIGO TADEU RONDINA MANDALITI, REINALDO LUIS TADEU RONDINA MANDALITI
(72) Inventor(s): RICHERLAND PINTO MEDEIROS; JOÃO FERNANDO MARAR; RENATO TADEU RONDINA MANDALITI; ARMANDO BUCHINA; CAIO PAULUCCI; LUIZ GUSTAVO GUTIERREZ; JONATHAN SOUZA; FELIPE INOUE

Fig. 1

(74) Attorney(s): ANA PAULA SANTOS CELIDONIO

“LEGAL COGNITION METHOD”

Field of the Invention

[001] The present invention relates to a legal cognition method.

Background of the Invention

[002] Brazil has a large number of legal proceedings in progress, and the volume of pending cases grows every year because more cases are opened annually than are closed. This volume of cases creates an even larger volume of files, containing data that, when analyzed, can generate great value for legal practitioners. The vast majority of data in the legal universe is textual, with no predefined structure, which makes it extremely difficult to obtain information from it. In addition, despite a series of attempts to centralize case management, Brazil has more than 67 systems distributed across its territory.

[003] Therefore, the state of the art lacks systemic standards, and the data are, for the most part, expressed in plain natural language.

Objectives and Description of the Invention

[004] The objective is to create a structure for centralizing information, together with a methodology for structuring and using data that transforms them into strategic information for legal practitioners. The project is based on cognitive computing techniques and on the distributed processing of large volumes of information.

Brief Description of the Drawings

[005] The objectives, technical effects and advantages of the present invention will be apparent to those skilled in the art from the detailed description below, which refers to the attached figures illustrating exemplary, but non-limiting, embodiments of the invention:

- Figure 1 is a flowchart of the legal cognition method, object of the present invention;

- Figure 2 is a flowchart of the legal taxonomy, object of the present invention;

- Figure 3 is a flowchart of the legal ontology, object of the present invention;

- Figure 4 is a flowchart of the predictive modeling, object of the present invention; and

- Figure 5 shows the user interface of the legal cognition method, object of the present invention.

Description of Embodiments of the Invention

[006] Initially, it should be noted that the legal cognition method that is the object of the present invention will be described below according to particular, but non-limiting, embodiments, since it can be implemented in different forms and variations, according to the application desired by the person skilled in the art.

[007] The method belongs to the field of machine learning and to the subfield of natural language processing, whose techniques are applied to the tasks of named entity extraction, text classification, text clustering and text summarization. The term “feature” (característica), as used in this context, refers to any detail found in texts that may, alone or in combination, represent evidence of statistical linguistic phenomena, enabling the extraction, classification or use of information. Also in this context, the term corpus refers to the set of texts used to create models, when annotated, or simply to the texts being analyzed with the models.

[008] The overall process is based on the use of crawlers (a crawler is a robot used by search engines to find and index the pages of a website; it captures information from the pages and records the links it finds, which makes it possible to reach other pages and keep its database up to date) that “know” the structure of all the courts of justice, civil and labor, in the country; any change in how information is made available is monitored for eventual handling. Through these crawling mechanisms, legal information is extracted by queries on names or numbers, automatically through the random generation of CNJ numbers, or directly through one-off requests. The final objective is to obtain all lawsuits from all over Brazil, with dynamic, daily updating of the database.
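The random generation of CNJ numbers mentioned above can be sketched as follows. This is only an illustration, not the crawler described in the patent: it assumes the unified CNJ numbering layout NNNNNNN-DD.AAAA.J.TR.OOOO with two mod-97 check digits, and the function names, the segment code "8" and the court code "26" are hypothetical choices made for the example.

```python
import random

def cnj_check_digits(seq: str, year: str, segment: str, court: str, origin: str) -> str:
    """Two check digits computed mod 97 over the other 18 digits (assumed rule)."""
    base = int(seq + year + segment + court + origin)
    return f"{98 - (base * 100) % 97:02d}"

def format_cnj(seq: str, year: str, segment: str, court: str, origin: str) -> str:
    """Assemble a number in the assumed NNNNNNN-DD.AAAA.J.TR.OOOO layout."""
    dd = cnj_check_digits(seq, year, segment, court, origin)
    return f"{seq}-{dd}.{year}.{segment}.{court}.{origin}"

def is_valid_cnj(number: str) -> bool:
    """A number is consistent when the digits, with DD moved to the end, are 1 mod 97."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) != 20:
        return False
    seq, dd, rest = digits[:7], digits[7:9], digits[9:]
    return int(seq + rest + dd) % 97 == 1

def random_cnj(rng: random.Random, year: int, segment: str, court: str) -> str:
    """Draw a random sequence number and origin unit for a fixed year and court."""
    seq = f"{rng.randrange(1, 10_000_000):07d}"
    origin = f"{rng.randrange(0, 10_000):04d}"
    return format_cnj(seq, f"{year:04d}", segment, court, origin)

if __name__ == "__main__":
    rng = random.Random(0)
    candidate = random_cnj(rng, 2016, "8", "26")  # hypothetical segment/court codes
    print(candidate, is_valid_cnj(candidate))
```

Generating only numbers whose check digits are consistent avoids querying a court for numbers that cannot exist; whether a consistent number corresponds to a real lawsuit can still only be learned from the court itself.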

[009] Each court has its own nomenclature for its information, within the legal areas (civil or labor), the procedural instances and the procedural rites. The tool has a base knowledge matrix, called the legal taxonomy, in which a legal thesaurus (a type of controlled vocabulary used by people who share the same language in a given area of knowledge; it is a terminological control tool whose aim is to standardize information), supported by hierarchical structures, assists in the process of normalizing information. The taxonomy is dynamic and can be operated by individuals or through the feedback phase.
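A minimal sketch of thesaurus-based normalization, assuming the thesaurus is a simple lookup from court-specific labels to one canonical term, with a parent link giving the hierarchy. The entries, the names `THESAURUS` and `PARENT`, and the accent-folding step are illustrative assumptions, not contents of the actual taxonomy.

```python
import unicodedata

# Hypothetical thesaurus: court-specific label (folded) -> canonical taxonomy term.
THESAURUS = {
    "acao de indenizacao por dano moral": "dano moral",
    "indenizacao por danos morais": "dano moral",
    "reclamacao trabalhista": "reclamacao trabalhista",
    "reclamatoria trabalhista": "reclamacao trabalhista",
}

# Hypothetical hierarchy: canonical term -> legal area.
PARENT = {"dano moral": "civel", "reclamacao trabalhista": "trabalhista"}

def fold(text: str) -> str:
    """Lowercase, strip accents and collapse whitespace so label variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())

def normalize(label: str):
    """Return (legal area, canonical term), or None when the label is unknown."""
    term = THESAURUS.get(fold(label))
    if term is None:
        return None  # candidate for manual curation or the feedback phase
    return (PARENT.get(term), term)

if __name__ == "__main__":
    print(normalize("Indenização por Danos Morais"))
    print(normalize("Reclamatória  Trabalhista"))
```

Labels that return `None` are the natural input for the human or feedback-driven updates that keep the taxonomy dynamic.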

[010] The taxonomic structure is not used only to perform a comparison or a “from-to” mapping; it serves to conceptualize the information contained in each variable extracted from the courts and to compose the initial “knowledge” of the X-Gracco legal platform.

[011] Given the homogeneity of information achieved through the legal taxonomy, the platform persists all information inherent to each lawsuit obtained in a distributed structure, so that large volumes of data can be processed. The structured data are linked to the documents belonging to the lawsuit (when available digitally). Each file obtained is kept in its original form (extension, format, etc.), but a new matrix-based representation is created, in which each paragraph, sentence, expression or word is analyzed on the basis of the document that contains it, its syntactic and lexical structure, and sets of documents. The platform works with the concept of multiple information domains within the legal universe; that is, from textual features the tool is able to classify a document under analysis by legal area (civil, labor, etc.), by instance, or even by type of filing or rite.

[012] When classifying legal documents, the platform can use either the information hierarchy given by the traditional legal structure or the hierarchy obtained by discovering textual features contained in the documents. The purpose of this procedure is to guarantee continuous learning and the use of staged learning. Each document receives one index based on its CNJ number and another based on its internal membership classifications.

[013] After correct classification and indexing, the platform has the data available for observations, which can be carried out interactively with human operators or through the feedback mechanisms. An observation can be defined as the answer to a legal question, given a set of lawsuits/files/texts, and can be obtained from the primary variables or by extracting information from the documents. Initially, the primary variables are all those found in the courts' meta-information; however, with time and use of the platform, if another variable is found by the models across the entire universe of analysis, it becomes a primary variable.

[014] With each new observation, the platform “learns” new variables or refines its knowledge of variables it already knows, based on the creation of models.

[015] When classifying legal documents, the platform can use either the information hierarchy given by the traditional legal structure or the hierarchy obtained by discovering textual features contained in the documents. The purpose of this procedure is to guarantee continuous learning and the use of staged learning. Each document receives one index based on its CNJ number and another based on its internal membership classifications.

[016] After correct classification and indexing, the platform has the data available for observations, which can be carried out interactively with human operators or through the feedback mechanisms. An observation can be defined as the answer to a legal question, given a set of lawsuits/files/texts, and can be obtained from the primary variables or by extracting information from the documents. Initially, the primary variables are all those found in the courts' meta-information; however, with time and use of the platform, if another variable is found by the models across the entire universe of analysis, it becomes a primary variable.

[017] With each new observation, the platform “learns” new variables or refines its knowledge of variables it already knows, based on the creation of models.

[018] The information inherent in legal data is closely tied to the variables handled in its context, namely: document types, areas, rites, plaintiffs, judges, defendants, judicial districts, courts, claims, arguments, causes, evidence, etc. The platform uses the concept of entities to realize these variables, enabling the analysis not only of primary variables with a well-defined core, such as judicial districts, judges and courts, but also of argumentative variables, such as claims, decisions, arguments, causes, evidence, etc.

[019] The primary mechanism for identifying these variables is the natural language processing task of named entity extraction, using machine learning techniques supported by the legal taxonomy and by a legal ontology (an ontology is a format for representing information based on direct relationships between its components).

[020] Besides the entities known a priori from the taxonomy or the ontology, the platform needs to know how information behaves within the context of the documents analyzed. The platform uses supervised machine learning to annotate the corpus, based on the primary variables or on custom variables derived from business rules or from the preferences of whoever uses the information in legal proceedings or documents.

[021] All the technology involved in the process aims to maximize the chance of reducing uncertainty about what is being analyzed. An example application is the extraction of the decision and court entities from an appellate judgment (acórdão). The platform knows the terms decision and judge, but before a model is created it does not know how such information can be identified; when the platform is asked to locate or extract these variables, it can interact with a human operator in order to “receive knowledge”. The primary means of acquiring knowledge is the variable annotation panel, shown in Figure 6.

[022] The demand to observe variables beyond the primary ones, combined with training by a human operator, creates a model that is used for the observation in question and also for new discovery processes, sustaining the continuous learning cycle of the whole platform.

[023] An observation is complete when the model is able to tabulate correctly the data contained in legal documents. From that point it is possible to create predictive views based on the questions that guided the model, for example: “what is the chance of winning a moral damages lawsuit in Minas Gerais?”.

[024] Every new observation, besides producing knowledge of new variables (entities), also produces knowledge of new features and of how those features relate to each other, providing evidence of statistical and linguistic patterns and enabling the platform to perform semantic inference. Semantics is not yet a fully elucidated science; however, when studied in a restricted domain, within predefined concepts, it can be inferred. In addition, given the platform's fundamental characteristic of being embedded in the legal context, the platform gains pragmatic analysis, that is, the meaning of words and expressions within a given context.

[025] The method uses a concept called entropy maximization, combined with a legal taxonomy (taxonomy is a science devoted to the study of the classification, nomenclature and identification of living beings, things or processes within a given area of knowledge) and a legal ontology (which aims to understand and explain the essence of law, its particularities and how law relates to human beings). This technique is used because it rests on a probability distribution built from the features of the texts analyzed, using a multinomial logistic regression model (logistic regression is a statistical technique that produces, from a set of observations, a model for predicting the values taken by a categorical variable, often binary, from a series of continuous and/or binary explanatory variables) to detect membership functions for passages, expressions and words. The technique was chosen for the ease with which it uses recursive features, which is exactly where the taxonomy comes in.
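A maximum entropy classifier over indicator features is equivalent to multinomial logistic regression, so the idea can be sketched with a small softmax model trained by stochastic gradient ascent on the log-likelihood. The feature names, labels, training samples and hyperparameters below are invented for illustration and are not taken from the platform.

```python
import math
from collections import defaultdict

def predict(weights, labels, features):
    """Softmax over per-label scores; each score sums the weights of active features."""
    scores = {y: sum(weights[(f, y)] for f in features) for y in labels}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}

def train(samples, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    labels = sorted({gold for _, gold in samples})
    weights = defaultdict(float)
    for _ in range(epochs):
        for features, gold in samples:
            probs = predict(weights, labels, features)
            for y in labels:
                # Gradient for an active feature: observed indicator minus expected.
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in features:
                    weights[(f, y)] += lr * grad
    return weights, labels

# Invented toy samples: context features of a token -> entity label.
SAMPLES = [
    (["prev=juiz", "capitalized"], "JUIZ"),
    (["prev=juiza", "capitalized"], "JUIZ"),
    (["prev=comarca_de", "capitalized"], "COMARCA"),
    (["prev=foro_de", "capitalized"], "COMARCA"),
    (["prev=o", "lowercase"], "OTHER"),
    (["prev=de", "lowercase"], "OTHER"),
]

if __name__ == "__main__":
    weights, labels = train(SAMPLES)
    probs = predict(weights, labels, ["prev=juiz", "capitalized"])
    print(max(probs, key=probs.get), probs)
```

The output of `predict` is a probability per label, which is one way to read the membership functions for passages, expressions and words mentioned above.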

[026] Automatic tabulation is performed by a model created with entropy maximization. The model is built from annotated example texts similar to those to be observed. The annotations mark the selected variables within the example texts. The tool creates a probability distribution based on the window of information around each annotation. Example: “Judge <JUIZ>Paulo Araujo</JUIZ> went to court to declare his decision”

[027] The example above annotates Paulo Araujo as a judge (JUIZ); the information window consists of the words surrounding the annotated information. Besides the words or expressions themselves, the tool can also use their features, such as syntactic information.
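The annotation format in the example can be read with a short parser that returns each annotated span together with its information window. This is a sketch under assumptions: the window size of two words per side and the names `TAG` and `annotated_windows` are choices made for the example, not details given in the text.

```python
import re

# Matches <LABEL>value</LABEL>, tolerating spaces just inside the tags.
TAG = re.compile(r"<(?P<label>[A-Z_]+)>\s*(?P<value>.*?)\s*</(?P=label)>")
ANY_TAG = re.compile(r"<[^>]+>")

def annotated_windows(text: str, size: int = 2):
    """Return each annotation with up to `size` words of context on each side."""
    results = []
    for match in TAG.finditer(text):
        # Drop other tags from the context so only plain words remain.
        left = ANY_TAG.sub(" ", text[:match.start()]).split()
        right = ANY_TAG.sub(" ", text[match.end():]).split()
        results.append({
            "label": match.group("label"),
            "value": match.group("value"),
            "left": left[-size:],
            "right": right[:size],
        })
    return results

if __name__ == "__main__":
    sample = "Judge <JUIZ>Paulo Araujo</JUIZ> went to court to declare his decision"
    for item in annotated_windows(sample):
        print(item)
```

The `left` and `right` lists are the raw material for context features such as the previous or next word; syntactic features would be added on top of them.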

[028] The concept maps are simplified as a legal taxonomy and a legal ontology. They operate on a recursive view. Example: suppose one wants to extract an appellate judge (desembargador) from a given text. The ontology tells the tool that an appellate judge is an entity of the supertype person, and the taxonomy gives the hierarchical relationship:

PERSON

Judge + Appellate Judge (Desembargador)

[029] The search for information/models and training is always based on the entity's most primitive form, because the more primitive the form, the greater the amount of training data available. The aim of this approach is dimensionality reduction.
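The recursive lookup of an entity's primitive form can be sketched as a walk up a supertype map. The map contents and the names `SUPERTYPE`, `primitive_form` and `training_pool` are hypothetical, chosen to mirror the person / judge / appellate judge example above.

```python
# Hypothetical supertype links mirroring the example hierarchy.
SUPERTYPE = {
    "judge": "person",
    "appellate_judge": "person",
    "person": None,
}

def primitive_form(entity: str) -> str:
    """Follow supertype links until an entity with no supertype is reached."""
    seen = set()
    current = entity
    while SUPERTYPE.get(current) is not None:
        if current in seen:
            raise ValueError(f"cycle in hierarchy at {current!r}")
        seen.add(current)
        current = SUPERTYPE[current]
    return current

def training_pool(examples, entity: str):
    """Collect the examples of every entity sharing the same primitive form."""
    root = primitive_form(entity)
    return [text for label, text in examples if primitive_form(label) == root]

if __name__ == "__main__":
    examples = [
        ("judge", "Judge Paulo Araujo"),
        ("appellate_judge", "Appellate Judge Maria Silva"),
        ("court", "Court of Justice of Minas Gerais"),
    ]
    print(primitive_form("appellate_judge"))
    print(training_pool(examples, "appellate_judge"))
```

Pooling examples at the primitive level gives the rarer subtype more training material and keeps the number of distinct labels small, which is the dimensionality reduction the paragraph refers to.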

[030] The platform is considered cognitive because it has basic mechanisms similar to those of a cognitive process, namely: perception, memory, reasoning and judgment.

[031] Insights is one of the possible applications, and can be understood as pattern discovery. By definition, pattern discovery is the discovery of functions given a set of data. In this case, discovering the functions that describe statistical phenomena supports the discovery of legal information.

[032] After the data contained in legal documents has been structured, the files become rows of a table and the variables become columns. The relationship is established by obtaining probabilities of occurrence aligned with the characteristics of the analyzed document.

[033] To illustrate, suppose the problem is: what is the chance of winning a moral damages lawsuit against a bank, based on the listing of a name in Serasa?

I. The files to be analyzed and the scope are selected: in this example, all rulings in all lawsuits in which banks are defendants;

II. The variables are selected, for example: judge, district, decision and lawyer;

III. The variables are extracted and each ruling (ruling file) becomes a row of the table, which has the columns: judge, district, decision and lawyer;

IV. A predictive model is created based on the observation of this table;


V. In addition to the selected variables, in the case of winning fragments, the argumentation variable is selected, which aims to identify arguments;

VI. Considering the table and the model created, it is possible to identify the winning lawsuits with the highest accuracy rate and to list which arguments are found in them.
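The steps above can be sketched as follows, assuming the simplest possible predictive model: the relative frequency of a favorable decision for each value of a chosen variable. The rows, names such as "Judge A", the `arguments` column and the helper functions are all hypothetical; the source does not state which model or data layout is used.

```python
from collections import Counter, defaultdict

# Hypothetical table: one row per ruling, columns as in step III,
# plus the argumentation variable from step V.
ROWS = [
    {"judge": "Judge A", "district": "District X", "lawyer": "Lawyer P",
     "decision": "won", "arguments": ["undue listing", "no prior notice"]},
    {"judge": "Judge A", "district": "District X", "lawyer": "Lawyer Q",
     "decision": "lost", "arguments": ["undue listing"]},
    {"judge": "Judge A", "district": "District Y", "lawyer": "Lawyer P",
     "decision": "won", "arguments": ["no prior notice"]},
    {"judge": "Judge B", "district": "District X", "lawyer": "Lawyer P",
     "decision": "lost", "arguments": []},
]


def win_rate_by(rows, variable):
    """Estimate P(decision == 'won' | variable == value) by relative frequency."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        value = row[variable]
        totals[value] += 1
        if row["decision"] == "won":
            wins[value] += 1
    return {value: wins[value] / totals[value] for value in totals}


def arguments_in_wins(rows):
    """Count how often each argument appears in rulings that were won (step VI)."""
    counts = Counter()
    for row in rows:
        if row["decision"] == "won":
            counts.update(row["arguments"])
    return counts


rates_by_judge = win_rate_by(ROWS, "judge")
winning_arguments = arguments_in_wins(ROWS)
```

A frequency table like this only illustrates the idea of "probabilities of occurrence"; a real system would need far more rows and some handling of values that appear only a few times.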

[034] Information extraction is performed by statistical machine learning algorithms after a crawler/scraper has collected all the legal documents, based on named entities geared to the legal domain, such as: judge, district, state, subject matter, decision, etc. A legal correlation console with semantic indexing may also be used.

[035] In this context, some machine learning techniques could be applied, such as, but not limited to: entropy maximization, conditional random fields, bigrams and trigrams from the legal context as features of the algorithm, and the use of dynamic windows for each represented entity.
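A minimal sketch of the feature side of [035]: bigrams and trigrams taken from a window of tokens around a candidate entity. Reading "dynamic windows" as a window size that depends on the entity type is an assumption, as are the names `ngrams`, `window_features` and `WINDOW_BY_ENTITY` and the sample sentence. The classifier itself (maximum entropy or conditional random fields) is not shown.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def window_features(tokens, index, window):
    """Bigram and trigram features from tokens within `window` positions of tokens[index]."""
    start = max(0, index - window)
    end = min(len(tokens), index + window + 1)
    context = tokens[start:end]
    return {"bigrams": ngrams(context, 2), "trigrams": ngrams(context, 3)}


# Assumption: the window size is chosen per entity type.
WINDOW_BY_ENTITY = {"judge": 2, "district": 1}

# Hypothetical sentence; the token at position 4 is the candidate entity.
tokens = "ruling issued by judge Maria Silva of the district of Bauru".split()
features = window_features(tokens, 4, WINDOW_BY_ENTITY["judge"])
```

With a window of 2 the context is `by judge Maria Silva of`, so the cue word `judge` immediately before the name becomes part of the features passed to the learner.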

[036] In addition, the tool would also be able to generate petition templates with some information already filled in, in order to facilitate the work of the lawyer or legal service provider.
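One way to picture such a partially filled template, as a sketch only: the standard-library `string.Template` class fills in the fields that are already known and leaves the remaining placeholders for the lawyer. The template text, the field names and the function `prefill` are hypothetical and are not taken from the source.

```python
from string import Template

# Hypothetical petition skeleton; the field names are illustrative only.
PETITION_TEMPLATE = Template(
    "To the Court of $district\n"
    "Plaintiff: $plaintiff\n"
    "Defendant: $defendant\n"
    "Subject: $subject\n"
    "Facts: $facts\n"
)


def prefill(known_fields):
    """Fill in known fields; safe_substitute keeps unknown placeholders untouched."""
    return PETITION_TEMPLATE.safe_substitute(known_fields)


draft = prefill({
    "district": "District X",
    "defendant": "Bank Z",
    "subject": "moral damages",
})
```

The resulting draft still contains `$plaintiff` and `$facts`, which mark the parts that remain to be completed by hand.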

[037] Notwithstanding the description of the particular embodiments above, the present invention may be carried out differently and may be modified in its form of implementation, so that the scope of protection of the invention is limited solely by the content of the appended claims, including all possible equivalent variations.


Claims

1. LEGAL COGNITION METHOD, characterized by the fact that it comprises the following steps:

a. acquire (10) data from courts and legal publications;

b. structure (20) the data by means of taxonomic variables;

c. select taxonomic variables; and

d. create (30) a predictive model.

2. LEGAL COGNITION METHOD, according to claim 1, characterized by the fact that the data is acquired by means of web crawlers.

3. LEGAL COGNITION METHOD, according to claim 1 or 2, characterized by the fact that the data are indexed and stored in a database.

4. LEGAL COGNITION METHOD, according to claim 1, characterized by the fact that the data are structured by means of taxonomic variables and semantic inference.

5. LEGAL COGNITION METHOD, according to any one of claims 1 to 4, characterized by the fact that the taxonomic variables are legal taxonomies.

6. LEGAL COGNITION METHOD, according to claim 4, characterized by the fact that semantic inference is based on legal ontologies.

7. LEGAL COGNITION METHOD, according to any one of claims 1 to 6, characterized by the fact that predictive models are created by means of probabilities of occurrence of the selected taxonomic variables in the acquired data.


8. LEGAL COGNITION METHOD, according to any of claims 1 to 7, characterized by the fact that new taxonomic variables are entered manually by a user.

Drawings, sheet 1/3. Block labels:

1 - Acquisition of Metadata for Legal Proceedings from All States

2 - Acquisition of Legal Documents

3 - Indexing and Distributed Storage

4 - Extraction of Primary Information

5 - Demand for Artificial Intelligence Models

6 - Information Tab

7 - Semantic Indexing

8 - Discovery of Patterns

9 - Predictive and Prescriptive Analyses

10 - Feedback

11 - Human User Interface