CN115544211A - Method for external trade and external law indexing and industry risk assessment - Google Patents

Method for external trade and external law indexing and industry risk assessment

Info

Publication number
CN115544211A
Authority
CN
China
Prior art keywords
legal
foreign
risk
involved
law
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211335205.6A
Other languages
Chinese (zh)
Inventor
车流畅
徐祎涵
裴兆斌
冉令博
韩炎津
刘亚芳
张菁芃
张睿涵
王旭
韩雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Normal University
Original Assignee
Shenyang Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Normal University
Priority to CN202211335205.6A
Publication of CN115544211A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • General Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Probability & Statistics with Applications (AREA)
  • Development Economics (AREA)
  • Technology Law (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for external trade foreign-related law indexing and industry risk assessment, comprising two parts: external trade foreign-related law indexing, and industry risk assessment. The law indexing part includes: collecting foreign-related legal document data, preprocessing the data, constructing an index similarity network of the foreign-related legal documents, and ranking the documents by citation similarity to return a list of relevant legal documents and cases that meet the indexing requirements. The industry risk assessment part includes: identifying the factors that influence legal risk, selecting several of them as legal risk factors, establishing a hierarchical legal risk factor set, constructing a judgment matrix from that set, determining evaluation weights from the judgment matrix, establishing a fuzzy comprehensive evaluation matrix, and evaluating the legal risk level. The method helps foreign trade enterprises quickly identify the causes of legal risk and evaluate their legal risk.

Description

Method for external trade and external law indexing and industry risk assessment
Technical Field
The invention relates to the field of computers, in particular to a method and a system for indexing data, and more particularly relates to a method for external trade foreign law indexing and industry risk assessment.
Background
Foreign trade connects the domestic and international markets and plays an important role in building a new development pattern. China has continuously promoted innovation in its foreign trade system: trade in goods has grown by leaps, the trade structure has been continuously optimized, and international markets have continued to expand, making an important contribution to economic and social development. The most important factors in foreign trade are cost and security, so enterprises cannot go abroad profitably without understanding foreign-related law. When foreign trade enterprises fail to consider problems from a legal perspective, or neglect overseas intellectual property management, organizational management, and human resource management, legal risks readily arise. When disputes arise over investments, or when problems such as environmental protection, intellectual property, labor, and contract management arise during operations, it is particularly important to be able to quickly look up the relevant legal documents and cases of the country concerned. In the Anglo-American legal system in particular, legal issues are resolved by reference to precedent cases, so legal reasoning and decision-making depend heavily on information stored in text documents. Although the foreign trade legal services industry has a long history, it is still at an early stage: it largely follows the traditional model in which enterprises retain professional lawyers as legal counsel, with little change or development, and the industry faces problems and risks such as providers each forming their own systems and fragmented scale.
Many online legal databases provide convenient access to such legal documents. These databases let users search by legal terms, but such search options require queries to be formulated very precisely using domain-specific terminology. With the construction of online legal databases, legal information retrieval has become the core of many statute and case lookup processes. A large portion of this online legal data is unstructured text. The legal domain is considered highly complex, and the query process relies heavily on legal experts' interpretation of knowledge. The legal field stores a huge amount of information in the form of texts and documents. Legal information can be categorized under different headings, such as court notes, decisions, and statements. These documents are a valuable repository of information about legal interpretation, to which foreign-related law for foreign trade must refer. Because of the complexity of the legal knowledge contained in legal documents, traditional document lookup is of limited effectiveness. It is necessary to establish the relevance and similarity between two cases as explained in various legal documents. It is therefore necessary to improve foreign trade enterprises' capability to manage foreign-related legal risk: to look up the relevant statutes and cases, identify the factors that influence legal risk, and then assess that risk.
Disclosure of Invention
To solve the above technical problems, the invention provides a method for external trade foreign-related law indexing and industry risk assessment, realized by the following technical scheme.
The invention discloses a method for external trade foreign-related law indexing and industry risk assessment, comprising: external trade foreign-related law indexing S1 and industry risk assessment S2, wherein
the step of external trade foreign-related law indexing S1 comprises:
S11, collecting foreign-related legal document data;
S12, preprocessing the foreign-related legal document data;
S13, constructing a vector space model based on the preprocessed foreign-related legal document data;
S14, constructing an index similarity network of the foreign-related legal documents;
S15, returning a list of relevant legal documents and cases that meet the indexing requirements, ranked by citation similarity;
the step of industry legal risk assessment S2 comprises:
S21, identifying the factors that influence legal risk;
S22, selecting several factors as legal risk factors and establishing a hierarchical legal risk factor set;
S23, constructing a judgment matrix from the established legal risk factor set;
S24, determining evaluation weights from the constructed judgment matrix;
S25, establishing a fuzzy comprehensive evaluation matrix and evaluating the legal risk level.
Further, the foreign-related legal document data in step S11 includes a query set and a corresponding document set, each of which is subdivided into separate file sets.
Further, the preprocessing techniques in step S12 include tokenization, stop-word removal, punctuation removal, and stemming.
Further, the legal risk factors include internal and external influencing factors.
Further, the survey is completed by k experts, and the relative importance of each legal risk factor index is rated using a ratio scale method.
Further, the overall legal risk level is evaluated based on the maximum of all elements in the evaluation matrix.
The method for external trade foreign-related law indexing and industry risk assessment is based on the observation that citation similarity is closer to human experts' assessment of the similarity of legal documents. One-to-one links are not treated as the only indicator of similarity; whether a path exists from one node to another is also taken into account. Citation network analysis can be used effectively to estimate the similarity index, the relationships among legal concepts can be understood through citation links, and the network can be analyzed further by applying link analysis algorithms. In addition, the method addresses the multi-criteria evaluation problem in a fuzzy environment: the evaluation method identifies the legal risks of foreign trade enterprises, helps them quickly find the causes of those risks, and evaluates their legal risk.
Drawings
FIG. 1 is a flow chart of a method for external trade foreign law indexing and industry risk assessment according to the present invention.
FIG. 2 is a schematic diagram of a structure of a target document vector for external trade foreign law indexing according to the present invention.
FIG. 3 is a risk factor diagram of a method for external trade foreign law indexing and industry risk assessment according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
To solve the above technical problems, the invention provides a method for external trade foreign-related law indexing and industry risk assessment, realized by the following technical scheme.
A method for external trade foreign-related law indexing and industry risk assessment comprises: external trade foreign-related law indexing S1 and industry risk assessment S2, wherein
the step of external trade foreign-related law indexing S1 comprises:
S11, collecting foreign-related legal document data:
Case decisions and legal documents of the countries concerned form a data corpus that comprises a query set and a corresponding document set, each of which is subdivided into separate file sets.
S12, preprocessing the foreign-related legal document data:
The foreign-related legal document data set consists of a query set and a document set. The query file is split into individual queries so that similarity can be measured for each query separately.
Foreign-related legal documents are completely unstructured, so language preprocessing is required to convert the unstructured data into suitably structured information. The preprocessing techniques comprise tokenization, stop-word removal, punctuation removal, and stemming; case decisions are stemmed and normalized. A term-document matrix is then constructed from the resulting scores. The citation data of each case decision is recorded separately.
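The preprocessing steps named above can be illustrated with a minimal sketch. The stop-word list, the suffix list, and the function names `naive_stem` and `preprocess` are hypothetical and chosen only for illustration; the text does not specify which tokenizer or stemmer is used, and a real system would likely rely on an established stemmer.

```python
import string

# Hypothetical, tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was", "by"}

# Hypothetical suffix list for a deliberately naive stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token: str) -> str:
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop punctuation and stop words, then stem."""
    # Replace every punctuation character with a space before splitting.
    table = str.maketrans({ch: " " for ch in string.punctuation})
    tokens = text.lower().translate(table).split()
    return [naive_stem(tok) for tok in tokens if tok not in STOP_WORDS]


tokens = preprocess("The court dismissed the claims, citing the contract.")
print(tokens)
```

The naive stemmer over-truncates some words (for example "citing" becomes "cit"), which is typical of suffix stripping and is acceptable here because only consistent term matching is needed.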
Backed by a robust database, the system provides users with self-service retrieval of relevant information such as judicial cases, policies and regulations, and legal provisions. It can integrate the legal information relevant to the foreign trade industry, so that information collection is accurate and comprehensive and retrieval is efficient.
S13, constructing a vector space model based on the preprocessed foreign-related legal document data:
When unstructured documents and query techniques are used to search foreign-related legal document information, the documents most similar to a given user query are retrieved and returned in order of similarity ranking. In the vector space model, the text of foreign-related legal documents and of queries is converted into numeric vectors.
The vector space model comprises three model types: the bag-of-words model, the document vector model, and the word vector model. The bag-of-words model is a numerical statistic that reflects how important certain words in the corpus are to a foreign-related legal document. Document indexing in the vector space takes the form of multi-label learning: the input space is a feature space $\mathcal{X}$ of the foreign-related legal documents $D$, and the output space is the power set $2^{\gamma}$ of a finite set $\gamma$ of $n$ numeric vectors. Given a training corpus
Figure BDA0003915086760000041
a function is learned that predicts the numeric vectors of unseen documents:
Figure BDA0003915086760000042
Word embedding is a dense representation of words as numeric vectors that reveals many hidden relationships between words.
In the term frequency-inverse document frequency (TF-IDF) approach, each term or phrase found in the vocabulary or corpus is represented by a distinct, orthogonal dimension. The frequency of a single term in a text is measured and multiplied by the logarithmic inverse document frequency of that term in the entire corpus. Plain TF-IDF is hard to update, for example when new documents that introduce additional dimensions are added. To address this, an improved TF-IDF method is used: an index is added, and the weights are adjusted according to how long a word has been in use along the corpus time axis.
The term frequency tf(t, d) is the number of times a term t from the vocabulary appears in a foreign-related legal document d; it is defined as a counting function:
$$\mathrm{tf}(t, d) = \sum_{x \in d} \mathrm{fr}(x, t)$$
where $\mathrm{fr}(x, t)$ is defined as:
$$\mathrm{fr}(x, t) = \begin{cases} 1, & \text{if } x = t \\ 0, & \text{otherwise} \end{cases}$$
Because foreign-related legal documents differ in length, the term frequency is normalized by dividing it by the document length, i.e. the total number of terms in the document:
$$\mathrm{TF}(t, d) = \frac{C_t}{D_l}$$
where $C_t$ is the term frequency and $D_l$ is the length of the foreign-related legal document.
The inverse document frequency $\mathrm{IDF}(t, D)$ measures how much information a term carries across the set of foreign-related legal documents $D$, that is, whether the term is rare or common in the corpus. It is the logarithm of the inverse fraction of documents that contain the term:
$$\mathrm{IDF}(t, D) = \log_e \frac{C_d}{C_{dt}}$$
where $C_d$ is the number of foreign-related legal documents and $C_{dt}$ is the number of those documents that contain $t$. To avoid division by zero, the denominator is incremented by 1:
$$\mathrm{idf}(t, D) = \log \frac{|D|}{1 + |\{d : t \in d\}|}$$
where $|\{d : t \in d\}|$ is the number of foreign-related legal documents containing $t$.
Thus:
$$\text{tf-idf}(t) = \mathrm{tf}(t, d) \times \mathrm{idf}(t, D)$$
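As a worked illustration of the formulas above, the following sketch computes the length-normalized term frequency, the inverse document frequency with 1 added to the denominator, and their product. The toy corpus and the function names are assumptions made for this example only.

```python
import math


def tf(term: str, doc: list[str]) -> float:
    """Length-normalized term frequency: count of `term` / document length."""
    return doc.count(term) / len(doc) if doc else 0.0


def idf(term: str, corpus: list[list[str]]) -> float:
    """Natural-log inverse document frequency with +1 in the denominator."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / (1 + containing))


def tf_idf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    """Product of the two quantities above."""
    return tf(term, doc) * idf(term, corpus)


# Hypothetical toy corpus of already-preprocessed documents.
corpus = [
    ["contract", "breach", "damages"],
    ["patent", "infringement", "damages"],
    ["labor", "dispute", "contract"],
    ["customs", "tariff", "dispute"],
]
score = tf_idf("patent", corpus[1], corpus)
print(round(score, 4))
```

One consequence of adding 1 to the denominator is that a term occurring in every document receives a slightly negative weight, since the ratio inside the logarithm then falls below 1.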
Foreign-related legal documents are represented by a set of words that form a term-document matrix. Preprocessing of the documents also includes stemming and tokenization to form terms, and each foreign-related legal document is modeled by counting the occurrences of each term. The bag-of-words model represents the text without regard to word order or syntax.
In the term-document matrix, each row represents a term and each column represents a foreign-related legal document. The entry $w_{ij}$ is the number of times term $i$ appears in document $j$. For example, $w_{3,11} = 29$ means that the term with index 3 appears 29 times in the 11th foreign-related legal document of the set. If the input is a collection of $n$ foreign-related legal documents containing $w$ terms, the bag-of-words representation is a $w \times n$ matrix.
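A small sketch of this matrix, with rows for terms and columns for documents as described above. The sample documents and the function name `term_document_matrix` are hypothetical.

```python
def term_document_matrix(docs: list[list[str]]):
    """Return (vocabulary, matrix) with rows = terms and columns = documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    # matrix[i][j] is the number of times term i occurs in document j.
    matrix = [[doc.count(term) for doc in docs] for term in vocab]
    return vocab, matrix


docs = [
    ["contract", "breach", "contract"],
    ["tariff", "dispute"],
    ["contract", "dispute", "dispute"],
]
vocab, matrix = term_document_matrix(docs)
for term, row in zip(vocab, matrix):
    print(term, row)
```

Because only counts are stored, two documents with the same words in a different order produce identical columns, which is the order-insensitivity the text attributes to this representation.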
The word vector model is a group of related models for generating word embeddings; a neural network is used to reconstruct the linguistic context of words. The model takes a large text corpus as input and produces a vector space of considerable dimensionality, in which each distinct term in the corpus is assigned a corresponding vector. The word vectors are arranged in the space so that words sharing a common context in the corpus lie close to each other. The main purpose of the word vector model is to learn a distributed representation of each target term from its specified context. Each embedding dimension represents a latent feature of the word, and cosine similarity can be used to compare vectors. When the model is initialized, a minimum count is applied to the input words: for example, words with a frequency lower than 20 are discarded.
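The minimum-count filter mentioned above can be sketched as a vocabulary-building step. The function name `build_vocab` and the toy sentences are assumptions; the default of 20 follows the example threshold in the text, while the usage below lowers it so that the tiny sample is not filtered out entirely.

```python
from collections import Counter


def build_vocab(sentences: list[list[str]], min_count: int = 20) -> dict[str, int]:
    """Map each sufficiently frequent word to an integer index."""
    counts = Counter(tok for sent in sentences for tok in sent)
    kept = sorted(tok for tok, c in counts.items() if c >= min_count)
    return {tok: idx for idx, tok in enumerate(kept)}


sentences = [["contract", "breach"], ["contract", "tariff"], ["contract", "breach"]]
vocab = build_vocab(sentences, min_count=2)
print(vocab)
```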
Because the continuous bag-of-words (CBOW) algorithm is suited to larger data sets, the model is trained with it. CBOW works as follows: given a context, specified as a single word or a group of words, it predicts the probability of a word. The model predicts the target word for the given context words within a sliding window. It consists of an input layer, a hidden layer, and an output layer. The input to the neural network is a one-hot encoded vector: for a given input context word, represented by $\{x_1, \ldots, x_V\}$, exactly one component is 1 and all others are 0. $W$ denotes the $V \times N$ weight matrix between the input layer and the hidden layer. Each row of $W$ is the vector $v_w$ of the associated input-layer word; row $i$ of $W$ is denoted
$$v_w^T$$
Thus, given a context word, assume $x_k = 1$ and $x_{k'} = 0$ for $k' \neq k$; then:
$$h = W^T x = W_{(k,\cdot)}^T := v_{w_I}^T$$
where $w_I$ is the input word, represented by the vector
$$v_{w_I}$$
In effect, the $k$-th row of the matrix $W$ is copied to $h$. When the model has a scalar bias, the weighted sum of the input layer plus the bias is passed to the hidden layer.
From the hidden layer to the output layer there is a different weight matrix $W' = \{w'_{ij}\}$, which is an $N \times V$ matrix. $N$ is the dimensionality of the word vectors; it is a freely chosen hyperparameter of the neural network, equal to the number of neurons in the hidden layer. In the word vector model no nonlinear activation function is applied between layers; the hidden-layer output is used directly as the hidden activation $h$. The dot product of $h$ with the hidden-to-output weights
$$v'_{w_j}$$
(the $j$-th column of $W'$) gives a score $u_j$ for each word in the training corpus:
$$u_j = {v'_{w_j}}^T h$$
Next, the output of the output layer is calculated. The output $y_j$ is obtained by passing $u_j$ through the softmax function:
$$p(w_j \mid w_I) = y_j = \frac{\exp(u_j)}{\sum_{j'=1}^{V} \exp(u_{j'})}$$
Combining the formulas above gives:
$$p(w_j \mid w_I) = \frac{\exp\left({v'_{w_j}}^T v_{w_I}\right)}{\sum_{j'=1}^{V} \exp\left({v'_{w_{j'}}}^T v_{w_I}\right)}$$
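The forward pass described above (one-hot input, hidden layer as a copied row of the input weights, dot-product scores, softmax output) can be sketched in plain Python. The toy sizes, the random initialization, and the names `cbow_forward` and `W_out` are assumptions for illustration; only the single-context-word case is shown, and no training step is included.

```python
import math
import random


def cbow_forward(W, W_out, k):
    """Forward pass for a single one-hot context word with index k.

    W     : V x N input-to-hidden weights (row k is the input vector of word k)
    W_out : N x V hidden-to-output weights (column j is the output vector of word j)
    """
    h = W[k][:]  # h = W^T x copies row k of W because x is one-hot
    V = len(W_out[0])
    # u_j is the dot product of column j of W_out with h.
    u = [sum(W_out[i][j] * h[i] for i in range(len(h))) for j in range(V)]
    # Softmax, shifted by max(u) for numerical stability.
    m = max(u)
    exps = [math.exp(x - m) for x in u]
    total = sum(exps)
    y = [e / total for e in exps]
    return h, u, y


random.seed(0)
V, N = 5, 3  # toy vocabulary size and embedding dimension (hypothetical)
W = [[random.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(V)]
W_out = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(N)]

h, u, y = cbow_forward(W, W_out, k=2)
loss = -math.log(y[4])  # negative log probability if word 4 were the target
print(y, loss)
```

The last line computes the negative log probability for one hypothetical target word, which is the quantity a cross-entropy style training step would try to reduce.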
The steps above constitute forward propagation. They are followed by backpropagation, in which the loss function is computed and the weight matrices are learned. The weights in all layers, i.e. the input layer, the hidden layer, and the output layer, are updated by computing the errors and repeatedly readjusting the weights. For each word pair, the loss is minimized by maximum likelihood estimation in the form of a cross-entropy. The continuous bag-of-words (CBOW) algorithm minimizes the average negative log probability, as follows:
Figure BDA0003915086760000071
Specifically, the dimension of the feature vectors is set to 200. After the vocabulary and the training input data have been constructed, the learned word vector representation is applied to the test-set documents.
The document vector model is an unsupervised algorithm that generates vectors for foreign-related legal documents; it is a variant of the word vector model, which creates vectors for words. The vectors generated by the document vector model are used to find the similarity between foreign-related legal documents. The model randomly samples consecutive words from a paragraph and predicts the center word from the sampled set, taking the context words and the paragraph id as input. Similar foreign-related legal documents can thus be identified in the vector space. The learning objective of the document vector model is simply to maximize the probability of predicting the target word, given the context words and the foreign-related legal document as context.
Figure BDA0003915086760000072
where $W = [w_1, w_2, w_3, \ldots, w_T]$ is the sequence of training words and $T$ is the size of the training vocabulary. Accordingly, $D = [d_1, d_2, d_3, \ldots, d_T]$ is the sequence of documents. $w_t$ is the target foreign-related legal document vector corresponding to $x_{i+3}$ in FIG. 2, i.e. $w_t := x_{i+3}$.
Figure BDA0003915086760000073
The goal of training a model with the continuous bag-of-words (CBOW) algorithm is to minimize the loss function associated with a classifier, with respect to the word embeddings and the classifier parameters, so that neighboring words can be predicted from one another. The model in FIG. 2 minimizes the following average negative log probability:
Figure BDA0003915086760000074
The model trained with the continuous bag-of-words (CBOW) algorithm minimizes this objective function. The loss function is estimated by noise-contrastive estimation, and a logistic regression classifier is used to distinguish the target words from noise samples. A sample is drawn from the true distribution, which consists of the true class and some other noise class labels. Noise-contrastive estimation is conditioned on the input word set $w_I$, and the aim is to predict the output word $w_u$. Given samples of N other words drawn from the noise distribution Q,
Figure BDA0003915086760000081
denotes Q.
The loss function is:
Figure BDA0003915086760000082
To classify locality-sensitive foreign-related legal documents in the vector space, the document (or paragraph) vectors are trained on words using differential training. To generate a locality-agnostic numeric vector for a foreign-related legal document, sets of words from both a specific context and a general context are used for training. Both general words (index words that do not describe the nature of the document) and specific words (index words that do describe the nature of the document) are considered. This joint objective is expressed by the following formula:
Figure BDA0003915086760000083
The probability of each foreign-related legal document feature vector is then generated using the tuning parameters determined by multithreaded gradient computation, with weight updates performed in a critical section:
Figure BDA0003915086760000084
With differential training, words extracted from the specific context and from the general context are given different emphasis. Locality-sensitive foreign-related legal documents that are very similar to one another are then grouped using measures such as cosine similarity, which yields a vector space classification scheme. The model is extended by training and adding the index words related to each foreign-related legal document feature vector, so that a user can view the index words associated with each document vector in the vector space.
S14, constructing an index similarity network of the foreign-involved legal documents:
The cosine similarity and the Jaccard similarity between the set of foreign-related legal documents and the set of queries are measured. The cosine similarity is the cosine of the angle between the query vector and the document vector:
$$\cos(q, d) = \frac{q \cdot d}{\lVert q \rVert \, \lVert d \rVert}$$
The numerator is the dot product of the query vector $q$ and the document vector $d$, and the denominator is the product of their Euclidean lengths.
The Jaccard coefficient is the size of the intersection of the foreign-related legal document and the query divided by the size of their union:
$$J(q, d) = \frac{|q \cap d|}{|q \cup d|}$$
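Both similarity measures, and their use for ranking documents against a query, can be sketched as follows. The vectors, term sets, and document names are hypothetical; cosine similarity is applied here to numeric vectors and the Jaccard coefficient to sets of terms, which is one common way to apply them.

```python
import math


def cosine_similarity(q: list[float], d: list[float]) -> float:
    """Dot product divided by the product of the Euclidean lengths."""
    dot = sum(a * b for a, b in zip(q, d))
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in d))
    return dot / norm if norm else 0.0


def jaccard_similarity(q_terms: set[str], d_terms: set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = q_terms | d_terms
    return len(q_terms & d_terms) / len(union) if union else 0.0


# Hypothetical query and document vectors over a three-term vocabulary.
query_vec = [1.0, 1.0, 0.0]
doc_vecs = {
    "case_A": [1.0, 1.0, 0.0],
    "case_B": [1.0, 0.0, 1.0],
    "case_C": [0.0, 0.0, 1.0],
}
ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
overlap = jaccard_similarity({"contract", "breach"}, {"contract", "damages"})
print(ranked, overlap)
```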
The set of foreign-related legal documents is constructed as a network, with the information obtained during preprocessing represented by appropriate nodes and edges. Each node represents a legal document or a case, and an edge between two nodes represents the correlation between them. Edge weights are very important in citation network analysis, so each edge is assigned a weight as a measure of correlation; the similarity values are used as the edge weights when the network is constructed. In the resulting network, the existence of a direct link or a path from one node to another indicates similarity.
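A minimal sketch of such a network is given below: similarity values become edge weights, and a breadth-first search checks whether any path connects two nodes. The undirected edges, the 0.3 threshold, and the names `build_network` and `has_path` are assumptions for this example; the text does not state a cut-off or whether the edges are directed.

```python
from collections import deque


def build_network(similarities, threshold=0.3):
    """Build an undirected weighted graph from pairwise similarity scores.

    `similarities` maps (doc_a, doc_b) pairs to a similarity value; the
    threshold is a hypothetical cut-off, not a value given in the text.
    """
    graph = {}
    for (a, b), weight in similarities.items():
        graph.setdefault(a, {})
        graph.setdefault(b, {})
        if weight >= threshold:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph


def has_path(graph, start, goal):
    """Breadth-first search: True if `goal` is reachable from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, {}):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


sims = {("A", "B"): 0.8, ("B", "C"): 0.5, ("A", "C"): 0.1, ("C", "D"): 0.05}
net = build_network(sims)
print(has_path(net, "A", "C"), has_path(net, "A", "D"))
```

In this toy data, A and C have no direct edge above the threshold but are still connected through B, which illustrates why path existence is considered in addition to direct links.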
S15, returning a list of relevant legal documents and cases that meet the indexing requirements, ranked by citation similarity.
The industry legal risk assessment S2 comprises establishing a hierarchical legal risk assessment model with a comprehensive evaluation index system:
S21, identifying the factors that influence legal risk;
S22, selecting several factors as legal risk factors and establishing a hierarchical legal risk factor set;
S23, constructing a judgment matrix from the established legal risk factor set;
S24, determining evaluation weights from the constructed judgment matrix;
S25, establishing a fuzzy comprehensive evaluation matrix and evaluating the legal risk level.
A database provides data support, and an artificial intelligence system performs intelligent risk assessment and offers risk avoidance suggestions. After completing the data entry, the user enters the risk evaluation interface, where the system automatically analyzes the risk and presents intuitive risk exit and avoidance suggestions for reference.
In the risk assessment stage, the external trade enterprise first identifies the influence factors of legal risk. There is no uniform standard for measuring legal risk, and different measurement methods can be adopted for different purposes.
In the legal risk assessment process, literature review and theoretical analysis can be used, and experts can be invited to complete a legal risk survey, from which a plurality of factors are selected as the legal risk factor set. Suppose B = {B1, B2} is the legal risk evaluation set for an external trade enterprise, where B1 denotes the external influencing factors and B2 the internal influencing factors. B1 = {C1, C2} is the set of external influencing factors, comprising the foreign local industry environment C1 and the law and regulation environment C2. B2 = {C3, C4, C5} is the set of internal influencing factors, including intellectual property management C3 and personnel management C4.
Here C1 = {D1, D2, D3, D4}, C2 = {D5, D6, D7, D8}, C3 = {D9, D10, D11, D12}, and C4 = {D13, D14, D15, D16}; these sets cover relevant factors such as inter-enterprise competition, the judicial environment, the supervision mechanism, the human resource management system, and the establishment and implementation of the intellectual property system.
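One way to hold this three-level factor set in code is a nested dictionary keyed by the symbols used in the text. This is only a sketch: the variable and function names are hypothetical, and C5 is left out because the text lists it in B2 without giving its sub-factors.

```python
# Hierarchy B -> C -> D, using the symbols from the text as keys.
# C5 is named in B2 but its sub-factors are not given, so it is omitted here.
RISK_HIERARCHY = {
    "B1": {
        "C1": ["D1", "D2", "D3", "D4"],
        "C2": ["D5", "D6", "D7", "D8"],
    },
    "B2": {
        "C3": ["D9", "D10", "D11", "D12"],
        "C4": ["D13", "D14", "D15", "D16"],
    },
}


def leaf_factors(hierarchy):
    """Flatten the hierarchy into the list of lowest-level factors (D1..D16)."""
    return [
        leaf
        for criteria in hierarchy.values()
        for leaves in criteria.values()
        for leaf in leaves
    ]


def parent_of(hierarchy, leaf):
    """Return the (B, C) pair that a lowest-level factor belongs to, or None."""
    for group, criteria in hierarchy.items():
        for criterion, leaves in criteria.items():
            if leaf in leaves:
                return group, criterion
    return None
```

Keeping the hierarchy explicit makes it straightforward to build one judgment matrix per level, since each matrix compares only the indices that share a parent.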
Based on the established legal risk factor set, the steps are as follows:
1) Constructing a judgment matrix: the judgment matrix expresses the relative importance of two indices under the constraint of the level above, and it is used to determine the weights. Assume the judgment matrix is $Q = (\alpha_{ij})_{n \times n}$, where $\alpha_{ij} > 0$, $\alpha_{ij} = 1/\alpha_{ji}$, and n is the number of indices at the same level.
The judgment matrix may be constructed as:
$$Q = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{pmatrix}$$
According to the actual situation and the evaluation requirements, k experts complete the survey, and the relative importance of each index is marked using a proportional scaling method.
2) Determining the evaluation weights: if scores are given by k experts, let $(\alpha_{ij})_t$ denote the score that the t-th expert (t = 1, ..., k) assigns to $\alpha_{ij}$. The geometric mean of the scores for each index may be calculated as:
$$\alpha_{ij}' = \left( \prod_{t=1}^{k} (\alpha_{ij})_t \right)^{1/k}$$
From the geometric mean values $\alpha_{ij}'$, the weight of each index can be described as:
Figure BDA0003915086760000103
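The aggregation and weighting step can be sketched as follows. The source's weight formula is an image that is not reproduced in this text, so the sketch assumes the common row geometric-mean normalization used in the analytic hierarchy process; the function names are hypothetical.

```python
import math


def aggregate_judgments(expert_matrices):
    """Element-wise geometric mean of the k experts' judgment matrices."""
    k = len(expert_matrices)
    n = len(expert_matrices[0])
    return [
        [math.prod(m[i][j] for m in expert_matrices) ** (1.0 / k) for j in range(n)]
        for i in range(n)
    ]


def geometric_mean_weights(matrix):
    """Weights from normalized row geometric means (an assumed, common AHP choice)."""
    n = len(matrix)
    row_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(row_means)
    return [r / total for r in row_means]
```

For example, if two experts score index 1 against index 2 as 2 and 8, the aggregated entry is their geometric mean, 4, and the reciprocal entry becomes 0.25, so the aggregated matrix stays reciprocal.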
The eigenvector of the judgment matrix is:
$W = (w_1, w_2, \ldots, w_n)$
The weights are verified by a consistency check based on matrix theory. Here $\lambda_{\max}$ is the maximum eigenvalue, $(AW)_i$ is the i-th component of $AW$ with A denoting the judgment matrix, and CI is the consistency index. The consistency check of the fuzzy judgment matrix is described as follows:
$$\lambda_{\max} = \frac{1}{n} \sum_{i=1}^{n} \frac{(AW)_i}{w_i}$$
$$CI = \frac{\lambda_{\max} - n}{n - 1}$$
Random consistency differs with the order of the matrix, so a consistency ratio CR is introduced as the consistency evaluation index:
$$CR = \frac{CI}{RI}$$
Here RI is the random consistency index of the judgment matrix. When CR < 0.1, the consistency of the judgment matrix is considered acceptable. If CR ≥ 0.1, the judgment matrix needs to be adjusted until acceptable consistency is achieved.
The random consistency index RI of the judgment matrix takes the following values:
n    1     2     3     4     5     6     7     8     9
RI   0     0     0.58  0.90  1.12  1.24  1.32  1.41  1.45
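The consistency check can be sketched as below, using the RI values from the table above. The formulas for the maximum eigenvalue estimate and CI are assumed to be the standard analytic hierarchy process ones, and the function name is hypothetical.

```python
# Random consistency index RI by matrix order n, taken from the table in the text.
RI_TABLE = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
            6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def consistency_ratio(matrix, weights):
    """Return (lambda_max, CI, CR) for a judgment matrix and its weight vector."""
    n = len(matrix)
    # i-th component of the product of the judgment matrix and the weight vector.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    if n <= 2:
        # Assumption: RI is 0 for n <= 2, so CI and CR are reported as 0.
        return lambda_max, 0.0, 0.0
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RI_TABLE[n]
    return lambda_max, ci, cr
```

A matrix whose entries are exact ratios of the weights gives a maximum eigenvalue equal to n and therefore CR = 0; a judgment matrix would be accepted when the returned CR is below 0.1.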
3) Establishing a fuzzy comprehensive evaluation matrix: an evaluation set V is established for judging the risk level of each index, expressed as follows:
$V = \{v_1, v_2, v_3, v_4, v_5\}$
$v_1$, $v_2$, $v_3$, $v_4$, and $v_5$ denote low risk, relatively low risk, medium risk, relatively high risk, and high risk, respectively. The risk level of each index is judged with expert input, and the results form the fuzzy membership matrix R.
A fuzzy comprehensive evaluation matrix U is established, described as follows:
U = W * R.
The overall legal risk level is assessed based on the maximum of all elements in the evaluation matrix.
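The final evaluation step can be sketched as follows. The sketch assumes that `*` in U = W * R is the ordinary weighted-sum (matrix product) operator; other fuzzy composition operators exist and the source does not say which one is intended. The function names and the level labels are hypothetical.

```python
LEVELS = ["V1", "V2", "V3", "V4", "V5"]  # labels for the five risk levels


def fuzzy_evaluate(weights, membership):
    """Compute U = W * R as a weighted sum over the rows of the membership matrix R.

    weights: one weight per index; membership: one row per index, one column per level.
    """
    n_levels = len(membership[0])
    return [
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(n_levels)
    ]


def overall_level(u, levels=LEVELS):
    """Pick the level with the largest element of U (maximum membership)."""
    best = max(range(len(u)), key=lambda j: u[j])
    return levels[best]
```

With weights (0.8, 0.2) and membership rows (0.1, 0.2, 0.4, 0.2, 0.1) and (0.0, 0.1, 0.2, 0.4, 0.3), the largest element of U falls on the third level, so the overall result would be V3.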
The method for external trade foreign-involved law indexing and industry risk assessment rests on the observation that citation-based similarity is closer to human expert evaluation of the similarity of legal documents. Rather than taking one-to-one links as the only indicator of similarity, it also considers whether a path exists from one node to another. Citation network analysis can therefore be used effectively to estimate similarity, the relationships among legal concepts can be understood through citation links, and the network can be analyzed further by applying link analysis algorithms. The method also addresses the multi-criteria evaluation problem in a fuzzy environment: the evaluation method identifies the legal risks of foreign trade enterprises, helps them quickly find the causes of those risks, and evaluates their legal risk.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (6)

1. A method for external trade foreign law indexing and industry risk assessment, characterized by comprising the following steps: external trade foreign law indexing S1 and industry risk assessment S2; wherein,
the step of external trade foreign law indexing S1 comprises the following steps:
S11, collecting data of the foreign legal documents;
S12, preprocessing the data of the foreign legal documents;
S13, constructing a vector space model based on the preprocessed foreign-involved legal document data;
S14, constructing an index similarity network of the foreign-involved legal documents;
S15, giving out related legal documents and case lists meeting the index requirements based on the citation similarity sorting;
the step of industry legal risk assessment S2 includes:
S21, identifying influence factors of legal risks;
S22, selecting a plurality of factors as legal risk factors, and establishing a legal risk factor set in a hierarchical manner;
S23, constructing a judgment matrix according to the established legal risk factor set;
S24, determining an evaluation weight according to the constructed judgment matrix;
and S25, establishing a fuzzy comprehensive evaluation matrix and evaluating the legal risk level.
2. The method for external trade foreign law indexing and industry risk assessment according to claim 1, wherein the foreign-involved legal document data in step S11 comprises query sets and corresponding document sets, and the query sets and the document sets are subdivided into separate document sets.
3. The method for external trade foreign law indexing and industry risk assessment according to claim 1, wherein the preprocessing technique in step S12 includes: tokenization, stop word deletion, punctuation deletion, and wordbreak.
4. The method for external trade foreign law indexing and industry risk assessment according to claim 1, wherein legal risk factors include internal and external influence factors.
5. The method for external trade foreign law indexing and industry risk assessment according to claim 1, wherein k experts complete the survey and mark the relative importance of each legal risk factor index using the proportional scaling method.
6. The method for external trade foreign law indexing and industry risk assessment according to claim 1, wherein the overall legal risk level is assessed based on the maximum value of all elements in the evaluation matrix.
CN202211335205.6A 2022-10-28 2022-10-28 Method for external trade and external law indexing and industry risk assessment Pending CN115544211A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211335205.6A CN115544211A (en) 2022-10-28 2022-10-28 Method for external trade and external law indexing and industry risk assessment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211335205.6A CN115544211A (en) 2022-10-28 2022-10-28 Method for external trade and external law indexing and industry risk assessment

Publications (1)

Publication Number Publication Date
CN115544211A true CN115544211A (en) 2022-12-30

Family

ID=84718477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211335205.6A Pending CN115544211A (en) 2022-10-28 2022-10-28 Method for external trade and external law indexing and industry risk assessment

Country Status (1)

Country Link
CN (1) CN115544211A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237114A (en) * 2023-11-10 2023-12-15 深圳市迪博企业风险管理技术有限公司 Financing trade compliance detection method based on twin evolution
CN117237114B (en) * 2023-11-10 2024-03-08 深圳市迪博企业风险管理技术有限公司 Financing trade compliance detection method based on twin evolution

Similar Documents

Publication Publication Date Title
CN106649260B (en) Product characteristic structure tree construction method based on comment text mining
CN110825877A (en) Semantic similarity analysis method based on text clustering
CN113239181A (en) Scientific and technological literature citation recommendation method based on deep learning
CN108733748B (en) Cross-border product quality risk fuzzy prediction method based on commodity comment public sentiment
CN107688870B (en) Text stream input-based hierarchical factor visualization analysis method and device for deep neural network
CN107315738A (en) A kind of innovation degree appraisal procedure of text message
CN107844533A (en) A kind of intelligent Answer System and analysis method
Kaur Incorporating sentimental analysis into development of a hybrid classification model: A comprehensive study
CN111158641B (en) Automatic recognition method for transaction function points based on semantic analysis and text mining
KR102161666B1 (en) Similar patent document recommendation system and method using LDA topic modeling and Word2vec
CN111190968A (en) Data preprocessing and content recommendation method based on knowledge graph
CN115952292B (en) Multi-label classification method, apparatus and computer readable medium
CN114254201A (en) Recommendation method for science and technology project review experts
CN113779264A (en) Trade recommendation method based on patent supply and demand knowledge graph
CN114265935A (en) Science and technology project establishment management auxiliary decision-making method and system based on text mining
CN116304020A (en) Industrial text entity extraction method based on semantic source analysis and span characteristics
CN115544211A (en) Method for external trade and external law indexing and industry risk assessment
CN114942974A (en) E-commerce platform commodity user evaluation emotional tendency classification method
CN112784049B (en) Text data-oriented online social platform multi-element knowledge acquisition method
CN114239828A (en) Supply chain affair map construction method based on causal relationship
CN116342167B (en) Intelligent cost measurement method and device based on sequence labeling named entity recognition
Li et al. Evaluating the rationality of judicial decision with LSTM-based case modeling
CN114595324A (en) Method, device, terminal and non-transitory storage medium for power grid service data domain division
CN112580691A (en) Term matching method, matching system and storage medium of metadata field
Medina et al. Classification of legal documents in portuguese language based on summarization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination