CN112766263A - Identification method for multi-layer stock control relation share graph - Google Patents


Info

Publication number
CN112766263A
Authority
CN
China
Prior art keywords: arrow, company, coordinates, share, many
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110083415.XA
Other languages
Chinese (zh)
Other versions
CN112766263B (en)
Inventor
张贝贝
仵晨伟
郭仲穗
郑浩然
魏嵬
Current Assignee
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date
Filing date
Publication date
Application filed by Xian University of Technology filed Critical Xian University of Technology
Priority to CN202110083415.XA priority Critical patent/CN112766263B/en
Publication of CN112766263A publication Critical patent/CN112766263A/en
Application granted granted Critical
Publication of CN112766263B publication Critical patent/CN112766263B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/08 - Learning methods
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition


Abstract



The invention discloses a method for identifying a multi-layer holding-relationship share graph. The steps are: step 1, input the share graph to be identified; step 2, use a Faster R-CNN network to extract the coordinates of the companies (individuals), arrows, arrows with lines and percentages; step 3, following the divide-and-conquer idea, divide the share graph to be identified into multiple single-layer one-to-many or many-to-one share graphs; step 4, for each one-to-many or many-to-one share graph, determine the corner coordinates from the arrow coordinates and the arrow's direction from its corner coordinates, divide the companies (individuals) into pointing objects and pointed objects, and bind each percentage to whichever of the pointing and pointed objects has more members; recognize the text of each company (individual) with an OCR method; step 5, construct a directed weighted graph of the holding process from the pointing relations. The invention solves the problem in the prior art that the original share graph cannot intuitively reflect a company's shareholding.


Description

Identification method for multi-layer stock control relation share graph
Technical Field
The invention belongs to the technical field of image recognition, and relates to a recognition method for a multilayer stock control relationship share graph.
Background
With the development of Internet technology, the field of artificial intelligence has grown vigorously, and related technologies and products occupy an increasing share of people's daily lives. Image recognition is an important field of artificial intelligence and the basis of many practical technologies, such as stereoscopic vision, motion analysis and data fusion, with important applications in navigation, weather forecasting, natural resource analysis, environmental monitoring, physiological lesion research and other fields. Recognition and analysis of complex images is an important branch of artificial intelligence, and target recognition in images is already mature for features such as license plates, faces and pedestrians; researchers therefore hope to recognize and analyze more complex relationship images (such as share graphs), freeing practitioners from the traditional manual method of share analysis so that equity distribution can be grasped efficiently and accurately and work efficiency improved.
However, most existing share graphs come from companies' published annual or quarterly reports and related software (such as Tianyancha). The pictures are complicated, the shareholding architecture of a company is difficult to grasp intuitively, and the analysis is rarely limited to one graph or one company, so the work is time-consuming, labor-intensive and hard to keep straight. In addition, there is currently no research at home or abroad on identifying share graphs with image recognition technology, nor on analyzing share relationship graphs.
Disclosure of Invention
The invention aims to provide a method for identifying a multi-layer stock control relation stock graph, which solves the problem that the original stock graph in the prior art cannot visually reflect the stocks of a company.
The technical scheme adopted by the invention is as follows.
a method for identifying a multi-layer stock control relationship share graph comprises the following specific steps:
step 1, inputting a share graph to be identified of a multilayer stock control relationship;
step 2, extracting the coordinates of the companies (individuals), arrows, arrows with lines and percentages in the picture by a Faster R-CNN network;
step 3, according to the divide-and-conquer idea, dividing the share graph to be identified into a plurality of single-layer one-to-many or many-to-one share graphs by using the coordinates of the arrows with lines;
step 4, for each single-layer one-to-many or many-to-one share graph, determining the corner coordinates according to the arrow coordinates, and determining the direction of the arrow according to the arrow corner coordinates; dividing the companies (individuals) into pointing objects and pointed objects according to the direction of the arrow, and then binding each percentage one-to-one with whichever of the pointing objects and pointed objects has more members; finally, recognizing the characters in the pointing objects and pointed objects by an OCR recognition method;
and step 5, constructing the "object-arrow-percentage-pointed object" holding-flow directed weighted graph according to the pointing relations obtained in step 4.
The invention is also characterized in that:
the step 2 comprises the following steps:
step 2.1, taking a large number of share graphs and manually annotating the companies (individuals), arrows, arrows with lines and percentages in them as a data set; wherein each share graph is manually divided into a plurality of single-layer one-to-many or many-to-one share graphs, and an arrow exceeding a single-layer one-to-many or many-to-one share graph is defined as an arrow with lines;
step 2.2, building a VGG-16 network model, wherein the VGG-16 comprises 13 convolution layers, 3 fully connected layers and 5 pooling layers;
step 2.3, training the VGG-16 network model on the data set;
and step 2.4, detecting the share graph to be identified with the trained VGG-16 network model and outputting the detection result, namely the coordinates of the companies (individuals), arrows and percentages.
In step 2, the convolution kernels of the 13 convolution layers are all 3x3, with stride = 1, padding = same, and a ReLU activation function after each convolution layer; positive anchors and the corresponding bounding-box regression offsets are generated, and proposals are then computed;
the pooling kernels of the pooling layers are all 2x2, with stride = 2 in max-pooling mode; the proposals of the convolution layers are used to extract proposal features from the feature maps, which are sent to the subsequent fully connected and softmax network for classification (i.e., which object each proposal is).
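The 13-convolution / 5-pooling layout cited above is the standard VGG-16 backbone. As a minimal sketch (not the patent's training code), it can be written as the usual configuration list; the fully connected head supplies the remaining 3 layers:

```python
# VGG-16 backbone configuration: integers are conv output channels
# (3x3 kernels, stride 1, same padding, ReLU after each), 'M' is a
# 2x2 max pool with stride 2.
VGG16_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']

def layer_counts(cfg):
    """Count convolution and pooling layers in a VGG-style config."""
    convs = sum(1 for v in cfg if isinstance(v, int))
    pools = cfg.count('M')
    return convs, pools

convs, pools = layer_counts(VGG16_CFG)
print(convs, pools)  # prints: 13 5
```

Together with the 3 fully connected layers this matches the 13 + 3 + 5 layer structure stated in step 2.2.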
The step 3 is:
step 3.1, based on the coordinates of a certain arrow with lines obtained in step 2, setting the upper, lower, left and right bounds of the region of that arrow as U, D, L and R, and then searching for and expanding the coordinates of company (individual) names against the four bounds in turn, as follows:
Expanding the upper bound U: when the absolute value of the difference between the upper bound U of the arrow region and the lower bound D' of some company (individual) name region is within the error μ, expand U to the upper bound U' of that name region. Expanding the lower bound D: when the absolute value of the difference between the lower bound D of the arrow region and the upper bound U' of some company (individual) name region is within the error μ, expand D to the lower bound D' of that name region. Expanding L: find the group of company (individual) names whose lower bound differs from U within the error μ, compute the difference between the left boundary of each of these names and L, and expand L to the left boundary L' of the name region with the smallest difference. Expanding R: find the group of company (individual) names whose upper bound differs from D within the error μ, compute the difference between the right boundary of each of these names and R, and expand R to the right boundary R' of the name region with the smallest difference. The region formed by the upper bound U', lower bound D', left bound L' and right bound R' is the final expanded target range of this arrow's coordinates;
and step 3.2, traversing the coordinates of all arrows with lines in the share graph and repeating step 3.1 until the overall coordinates along each arrow's course are fully expanded, finally dividing the share graph to be identified into a plurality of single-layer one-to-many or many-to-one share graphs.
The error μ takes a value in the range of 10 to 30 pixels.
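The bound-expansion rule of step 3.1 can be sketched as follows. This is an illustrative reading, not the patent's code: boxes are assumed to be (left, upper, right, lower) tuples in pixel coordinates with y growing downward, and the helper name `expand_bounds` is invented here:

```python
MU = 15  # pixel tolerance; the patent allows 10 to 30

def expand_bounds(arrow_box, name_boxes, mu=MU):
    """Expand an arrow-with-lines region toward nearby name regions
    (sketch of step 3.1; box format and names are assumptions)."""
    L, U, R, D = arrow_box
    for l2, u2, r2, d2 in name_boxes:
        if abs(U - d2) <= mu:   # name just above the arrow: lift U to its top
            U = u2
        if abs(D - u2) <= mu:   # name just below the arrow: drop D to its bottom
            D = d2
    # L: among names touching the original upper bound, take the left
    # edge with the smallest difference from L
    top = [b for b in name_boxes if abs(b[3] - arrow_box[1]) <= mu]
    if top:
        L = min(top, key=lambda b: abs(b[0] - L))[0]
    # R: among names touching the original lower bound, take the right
    # edge with the smallest difference from R
    bottom = [b for b in name_boxes if abs(b[1] - arrow_box[3]) <= mu]
    if bottom:
        R = min(bottom, key=lambda b: abs(b[2] - R))[2]
    return (L, U, R, D)

# Example: an arrow region between a parent name above and two
# subsidiary names below.
arrow = (100, 50, 200, 120)
names = [(90, 20, 210, 45),    # parent, just above the arrow
         (40, 125, 150, 150),  # subsidiary, lower left
         (160, 125, 230, 150)] # subsidiary, lower right
print(expand_bounds(arrow, names))  # (90, 20, 230, 150)
```

Repeating this for every arrow with lines (step 3.2) carves the full graph into single-layer sub-graphs.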
Step 4 comprises the following operations for each single-layer one-to-many or many-to-one share graph:
step 4.1, for a single-layer one-to-many or many-to-one share graph, determining the corner coordinates according to the arrow coordinates, and determining the direction of the arrow according to the arrow corner coordinates:
Let the three corner points of a certain arrow be A(x1, y1), B(x2, y2), C(x3, y3). If the difference between y1 and y2 is less than a given threshold e1, corners A and B are considered to lie on one horizontal line; then compare y3 with y1: if y3 > y1 the arrow is considered to point downward, and if y3 < y1 it is considered to point upward. Traverse all arrow corner coordinates and judge the directions one by one;
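The direction test above can be sketched directly (assumed image coordinates with y growing downward; the threshold value and the function name are illustrative, not from the patent):

```python
E1 = 5  # horizontal-alignment threshold in pixels (assumed value)

def arrow_direction(a, b, c, e1=E1):
    """Judge an arrowhead's direction from its three corner points
    A, B, C (sketch of step 4.1)."""
    (x1, y1), (x2, y2), (x3, y3) = a, b, c
    if abs(y1 - y2) >= e1:
        return None  # A and B not on one horizontal line: not handled here
    if y3 > y1:
        return "down"  # tip below the A-B base line
    if y3 < y1:
        return "up"    # tip above the A-B base line
    return None

print(arrow_direction((10, 40), (20, 41), (15, 60)))  # down
print(arrow_direction((10, 40), (20, 41), (15, 20)))  # up
```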
step 4.2, dividing the company (individual) names into pointing objects and pointed objects according to the direction of the arrow, and then binding each percentage one-to-one with whichever of the pointing objects and pointed objects has more members. Since the input share graphs are all single-layer, the names can be divided into two groups by the ordinate of the company name. According to the arrow direction obtained in step 4.1: if the arrow points upward, the group with the largest ordinate among the company (individual) coordinates is the pointed object; if the arrow points downward, the group with the smallest ordinate is the pointed object. Then bind each percentage one-to-one with the side having more members: let the minimum and maximum abscissas among the coordinates of the four points of some object on that side be (xmin, xmax), find the percentage whose abscissa lies within (xmin, xmax), and bind the two in a specific data structure (such as a dictionary); traverse the remaining objects of that side and bind them one-to-one with percentages in the same way;
and 4.3, recognizing characters in the coordinates of the pointing object and the pointed object by utilizing an OCR technology.
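The horizontal-overlap binding of step 4.2 can be sketched like this (a simplified reading: names, box format and the one-match-per-object rule are assumptions for illustration):

```python
def bind_percentages(object_boxes, percent_boxes):
    """Bind each percentage label to the object whose horizontal span
    contains its abscissa (sketch of step 4.2's dictionary binding).

    object_boxes:  {name: (xmin, xmax)} for the side with more members
    percent_boxes: list of (percent_text, x_center)
    """
    bound = {}
    for name, (xmin, xmax) in object_boxes.items():
        for text, x in percent_boxes:
            if xmin <= x <= xmax:
                bound[name] = text
                break
    return bound

objects = {"Subsidiary A": (40, 150), "Subsidiary B": (160, 260)}
percents = [("51%", 100), ("30%", 200)]
print(bind_percentages(objects, percents))
# {'Subsidiary A': '51%', 'Subsidiary B': '30%'}
```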
The step 5 comprises the following steps:
step 5.1, establishing an empty directed graph G, and adding the company (individual) names obtained in step 4.3 as nodes in turn, giving a basic directed graph G' that stores only nodes;
step 5.2, on the basis of the directed graph G' of step 5.1, converting each pointing relation of step 4.2 into a triple [u, v, w], wherein u is the starting point and represents the pointing object, v is the end point and represents the pointed object, and w is the weight and represents the percentage of shares held; the converted triples are added to the directed graph G' as parameters, finally forming the holding-flow directed weighted graph G''.
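The embodiment builds this graph with the NetworkX library; as a dependency-free sketch of the same triple-based construction, nodes and weighted edges can be stored as a dict of successor dicts:

```python
def build_holding_graph(triples):
    """Build a directed weighted holding graph from [u, v, w] triples
    (sketch of step 5; the patent's embodiment uses NetworkX instead)."""
    graph = {}
    for u, v, w in triples:
        graph.setdefault(u, {})
        graph.setdefault(v, {})  # ensure every name appears as a node
        graph[u][v] = w          # edge u -> v weighted by share percent
    return graph

# Hypothetical triples as produced by steps 4-5
triples = [["Holding Co", "Subsidiary A", 0.51],
           ["Holding Co", "Subsidiary B", 0.30],
           ["Subsidiary A", "Grandchild Co", 1.00]]
g = build_holding_graph(triples)
print(g["Holding Co"])  # {'Subsidiary A': 0.51, 'Subsidiary B': 0.3}
```

With NetworkX the same triples would be fed to a `DiGraph` via weighted-edge insertion; the dict form above keeps the sketch self-contained.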
The invention has the following advantages:
The share graph is identified and analyzed with the Faster R-CNN technique and the image recognition technology of a deep learning framework, which overcomes the time-consuming, labor-intensive and hard-to-understand character of manual share analysis for individuals or companies, fills the gap in domestic and foreign research on this topic, and provides an efficient and accurate method.
Drawings
FIG. 1 is a schematic diagram of a single-layer one-to-many or many-to-one graph recognition and analysis method according to the present invention;
FIG. 2 is a schematic view of the VGG-16 network structure of Faster R-CNN in the single-layer one-to-many or many-to-one graph identification and analysis method of the present invention;
FIG. 3 is a diagram of the shares input in embodiment 1 of the method for identifying and parsing a single-layer one-to-many or many-to-one share diagram of the present invention;
FIG. 4 is a graph of the results obtained after performing step 3 in embodiment 1 of the method for identifying and parsing a single-layer one-to-many or many-to-one share graph according to the present invention;
fig. 5 is a complex network diagram finally obtained in embodiment 1 of the method for identifying and resolving a single-layer one-to-many or many-to-one share diagram according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, a method for identifying a multi-layer stock control relationship share graph includes the following specific steps:
step 1, inputting a share graph to be identified of a multilayer stock control relationship;
step 2, extracting coordinates of companies (individuals), arrows with lines and percentages in the picture by adopting a Faster R-CNN network;
step 3, dividing the share graph to be identified into a plurality of single-layer one-to-many or many-to-one share graphs by using the coordinates of the arrowheads with lines according to the dividing and treating thought;
step 4, determining the corner coordinates of each single-layer one-to-many or many-to-one stock image according to the arrow coordinates, and determining the direction of the arrow according to the arrow corner coordinates; dividing a company (person) into a pointing object and a pointed object according to the direction of an arrow, and then binding the pointed object and the pointed object with more parties and percentages in a one-to-one mode; finally, recognizing characters in the pointing object and the pointed object by using an OCR recognition method;
and 5, constructing an object-arrow-percentage-pointed object-oriented stock control flow directional weighting graph according to the pointing relation obtained in the step 3.
In the step 1, for a share graph to be identified of a multilayer stock control relation, the share graph needs to be scaled to a fixed size;
the step 2 comprises the following steps:
step 2.1, a large number of share graphs are adopted, and companies (individuals), arrows with lines and percentages in the graphs are manually marked to serve as a data set; wherein the share graph is manually divided into a plurality of single-layer one-to-many or many-to-one share graphs, and an arrow exceeding the single-layer one-to-many or many-to-one share graphs is defined as a strip line arrow;
step 2.2, building a VGG-16 network model, wherein the VGG-16 comprises 13 convolution layers, 3 full connection layers and 5 pooling layers;
step 2.3, training a data set by the VGG-16 network model;
and 2.4, detecting the share graph to be recognized by adopting the trained VGG-16 network model, and outputting a detection result, wherein the detection result is coordinates of a company (individual), an arrow and a percentage.
In step 2, the sizes of convolution kernels adopted by the 13 convolution layers are 3x3 convolution, stride 1 is adopted, padding is same as same, and each convolution layer uses a relu activation function; respectively generating positive anchors and corresponding bounding box regression offsets, and then calculating prosages;
the adopted pooling nuclear parameters of the pooling layer are all 2 multiplied by 2, stride is 2, max pooling mode; the pro-usals of the convolutional layer are used to extract the pro-visual feature from the feature maps and send it to the subsequent full-connection and softmax network for classification (i.e. what object the pro-visual is).
The step 3 is:
step 3.1, setting the upper bound, the lower bound, the left bound and the right bound of the area with the line arrow as U, D, L, R based on the coordinate of the certain line arrow obtained in step 2, and further sequentially searching and expanding the coordinate of the company (personal) name according to the four bounds, wherein the specific operation is as follows:
and expanding the upper bound U: when the absolute value of the difference between the upper bound U of the arrowed area and the lower bound D 'of the company (individual) name area is within the error mu, the upper bound U of the arrowed area is expanded to the upper bound U' of the company (individual) name area; expanding the lower bound D: when the absolute value of the difference between the lower bound D of the arrowed area and the upper bound U 'of the company (individual) name area is within the error mu, the lower bound D of the arrowed area is expanded to the lower bound D' of the company (individual) name area; and (3) expanding L: finding a group of company (personal) names under the condition that the difference between the lower boundary of the company (personal) name area and U is within the error mu range, then finding the difference between the left boundary of the group of company (personal) names and L, and expanding L into the left boundary L' of the company (personal) name area with the minimum difference; expanding R: finding a group of company (personal) names under the condition that the difference between the upper bound of the company (personal) name area and D is within the error mu range, then finding the difference between the right boundary of the group of company (personal) names and R, and expanding R into the right boundary R' of the company (personal) name area with the minimum difference; the area formed by the upper boundary U ', the lower boundary D', the left boundary L 'and the right boundary R' is the final expanded target range of the arrow coordinate with the line;
and 3.2, traversing the coordinates of the arrowheads with lines of all the stock images, repeatedly executing the step 3.1 until the whole coordinates of the trend of each arrowhead are completely expanded, and finally dividing the stock image to be identified into a plurality of single-layer one-to-many or many-to-one stock images.
The error mu is within a range of 10-30 pixels.
Step 4 comprises the following operations for each single-layer one-to-many or many-to-one strand graph:
step 4.1, determining corner point coordinates according to arrow coordinates for a single-layer one-to-many or many-to-one stock picture, and determining the direction of an arrow according to the arrow corner point coordinates:
three corner points of a certain arrow are set as (A (x)1,y1),B(x1,y1),C(x3,y3)): suppose y1,y2Is less than a given threshold e1Then, the two points of the angular points A and B are considered to be on a horizontal line, and then y is judged3And y1If y is3>y1The arrow is considered to be downward if y3<y1The arrow is considered to be up; traversing all the coordinates of the arrow corner points, and judging the directions one by one;
step 4.2, dividing the company (personal) name into a pointed object and a pointed object according to the pointing direction of an arrow, binding the pointed object and the pointed object with a larger number of the pointed objects and the pointed objects one by one with a percentage, dividing the stock drawings into two groups according to the size of the ordinate of the company name because the input stock drawings are all single-layer, and if the arrow points upwards, dividing the company (personal) name into one group with the largest ordinate in the company (personal) coordinatesThe group is the pointed object, if the arrow points downwards, the group with the smallest ordinate in the company (personal) coordinates is the pointed object; and then one-to-one binding the pointed object and the more part of the pointed object with the percentage is carried out: the minimum abscissa and the maximum abscissa of the coordinates of four points of one of the pointed object and the pointed object are set to (x)min,xmax) Then find the abscissa of the percentage in (x)min,xmax) Then binding the two in a specific data structure (such as a dictionary), traversing the remaining objects of one party with a large number, and carrying out one-to-one binding with the percentage;
and 4.3, recognizing characters in the coordinates of the pointing object and the pointed object by utilizing an OCR technology.
The step 5 comprises the following steps:
step 5.1, establishing an empty directed graph G, and using the company (individual) names obtained in the step 3.3 as nodes to be sequentially added into the directed graph G to obtain a basic directed graph G' only storing the nodes;
step 5.2, on the basis of the directed graph G' in the step 5.1, converting the pointing relationship in the step 3.2 into a triple [ u, v, w ], wherein u is a starting point and represents a pointing object; v is an end point and represents a pointed object, w is a weight and represents the percentage of stock occupation, the converted triple is used as a parameter and is added into a directed graph G ', and finally, a stock control flow directed weighted graph G' is formed.
Example 1
Step 1 is executed, and the share graph to be identified is input as shown in FIG. 3;
Step 2 is executed. The data sets mainly come from the China Bidding website and the Juchao (cninfo) information website and total more than 100 GB. Because a single equity image contains features of multiple target classes, the 3200 original images were augmented, for example by flipping with the OpenCV library, to 11000 images, with more than 60000 target instances per category. The OCR step calls an existing mature OCR interface (such as the Baidu OCR API) to improve the recognition rate;
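The flip-based augmentation mentioned above can be sketched without OpenCV (`cv2.flip` would be the embodiment's route); here numpy stands in, treating an image as an H x W array. This is an assumed illustration, not the patent's pipeline:

```python
import numpy as np

def flip_augment(image):
    """Return horizontal and vertical flips of an image array, the kind
    of augmentation the embodiment performs with OpenCV (sketch only)."""
    return [np.fliplr(image), np.flipud(image)]

img = np.array([[1, 2],
                [3, 4]])
h, v = flip_augment(img)
print(h.tolist())  # [[2, 1], [4, 3]]
print(v.tolist())  # [[3, 4], [1, 2]]
```

Each flip yields a new labeled sample whose bounding boxes can be mirrored accordingly, which is how 3200 originals grow toward 11000 images.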
Step 3 is executed with μ = 15; the output is shown in FIG. 4. As can be seen, each frame in the figure is a single-layer one-to-many or many-to-one share graph;
Steps 4 and 5 are executed. The complex network expressing the pointing relations is a visual network built on graph theory with the complex-network modeling tool NetworkX; the finally obtained holding-flow directed weighted graph is shown in FIG. 5.

Claims (7)

1. An identification method for a multi-layer holding-relationship share graph, characterized in that the specific steps are:
step 1, inputting the share graph of the multi-layer holding relationship to be identified;
step 2, extracting the coordinates of the companies (individuals), arrows, arrows with lines and percentages in the picture by a Faster R-CNN network;
step 3, according to the divide-and-conquer idea, dividing the share graph to be identified into a plurality of single-layer one-to-many or many-to-one share graphs by using the coordinates of the arrows with lines;
step 4, for each single-layer one-to-many or many-to-one share graph, determining the corner coordinates according to the arrow coordinates, and the direction of the arrow according to the arrow corner coordinates; dividing the companies (individuals) into pointing objects and pointed objects according to the direction of the arrow, and then binding each percentage one-to-one with whichever of the pointing objects and pointed objects has more members; finally, recognizing the characters in the pointing objects and pointed objects by an OCR recognition method;
step 5, constructing the "object-arrow-percentage-pointed object" holding-flow directed weighted graph according to the pointing relations obtained in step 4.

2. The identification method for a multi-layer holding-relationship share graph according to claim 1, characterized in that step 2 comprises:
step 2.1, taking a large number of share graphs and manually annotating the companies (individuals), arrows, arrows with lines and percentages in them as a data set, wherein each share graph is manually divided into a plurality of single-layer one-to-many or many-to-one share graphs, and an arrow exceeding a single-layer one-to-many or many-to-one share graph is defined as an arrow with lines;
step 2.2, building a VGG-16 network model, the VGG-16 comprising 13 convolution layers, 3 fully connected layers and 5 pooling layers;
step 2.3, training the VGG-16 network model on the data set;
step 2.4, detecting the share graph to be identified with the trained VGG-16 network model and outputting the detection result, the detection result being the coordinates of the companies (individuals), arrows and percentages.

3. The identification method for a multi-layer holding-relationship share graph according to claim 1, characterized in that in step 2 the convolution kernels of the 13 convolution layers are all 3x3, with stride = 1, padding = same, and a ReLU activation function after each convolution layer; positive anchors and the corresponding bounding-box regression offsets are generated, and proposals are then computed; the pooling kernels of the pooling layers are all 2x2, with stride = 2 in max-pooling mode; the proposals of the convolution layers are used to extract proposal features from the feature maps, which are sent to the subsequent fully connected and softmax network for classification (i.e., which object each proposal is).

4. The identification method for a multi-layer holding-relationship share graph according to claim 1, characterized in that step 3 is:
step 3.1, based on the coordinates of a certain arrow with lines obtained in step 2, setting the upper, lower, left and right bounds of the region of that arrow as U, D, L and R, and then searching for and expanding the coordinates of company (individual) names against the four bounds in turn, as follows: expanding the upper bound U: when the absolute value of the difference between the upper bound U of the arrow region and the lower bound D' of some company (individual) name region is within the error μ, expanding U to the upper bound U' of that name region; expanding the lower bound D: when the absolute value of the difference between the lower bound D of the arrow region and the upper bound U' of some company (individual) name region is within the error μ, expanding D to the lower bound D' of that name region; expanding L: finding the group of company (individual) names whose lower bound differs from U within the error μ, computing the difference between the left boundary of each of these names and L, and expanding L to the left boundary L' of the name region with the smallest difference; expanding R: finding the group of company (individual) names whose upper bound differs from D within the error μ, computing the difference between the right boundary of each of these names and R, and expanding R to the right boundary R' of the name region with the smallest difference; the region formed by the upper bound U', lower bound D', left bound L' and right bound R' being the final expanded target range of this arrow's coordinates;
step 3.2, traversing the coordinates of all arrows with lines in the share graph and repeating step 3.1 until the overall coordinates along each arrow's course are fully expanded, finally dividing the share graph to be identified into a plurality of single-layer one-to-many or many-to-one share graphs.

5. The identification method for a multi-layer holding-relationship share graph according to claim 4, characterized in that the error μ takes a value in the range of 10 to 30 pixels.

6. The identification method for a multi-layer holding-relationship share graph according to claim 1, characterized in that step 4 comprises performing the following operations for each single-layer one-to-many or many-to-one share graph:
step 4.1, for a certain single-layer one-to-many or many-to-one share graph, determining the corner coordinates according to the arrow coordinates, and the direction of the arrow according to the arrow corner coordinates: let the three corner points of a certain arrow be A(x1, y1), B(x2, y2), C(x3, y3); if the difference between y1 and y2 is less than a given threshold e1, corners A and B are considered to lie on one horizontal line; then compare y3 with y1: if y3 > y1 the arrow is considered to point downward, and if y3 < y1 to point upward; traverse all arrow corner coordinates and judge the directions one by one;
step 4.2, dividing the company (individual) names into pointing objects and pointed objects according to the direction of the arrow, and then binding each percentage one-to-one with whichever of the pointing objects and pointed objects has more members; since the input share graphs are all single-layer, the names can be divided into two groups by the ordinate of the company name; according to the arrow direction obtained in step 4.1, if the arrow points upward, the group with the largest ordinate among the company (individual) coordinates is the pointed object, and if the arrow points downward, the group with the smallest ordinate is the pointed object; then binding each percentage one-to-one with the side having more members: letting the minimum and maximum abscissas among the coordinates of the four points of some object on the side with more members be (xmin, xmax), finding the percentage whose abscissa lies within (xmin, xmax) and binding the two in a specific data structure (such as a dictionary), and traversing the remaining objects of the side with more members and binding them one-to-one with percentages.
According to the direction of the arrow obtained in step 3.1, if it points upward, the group with the largest ordinate in the company (personal) coordinates is pointed. Object, if the arrow points downward, the group with the smallest vertical coordinate in the company (personal) coordinates is the pointed object; then bind the pointed object and the one with the largest number of pointed objects to the percentage one-to-one: Assuming that the pointed object and the one with the largest number of pointed objects, the smallest abscissa and the largest abscissa among the coordinates of the four points of a certain object are (x min , x max ), then find the abscissa of the percentage in (x min , x max ), then bind the two in a specific data structure (such as a dictionary), traverse the remaining objects of the party with the largest number, and bind one-to-one with the percentage; 步骤4.3,利用OCR技术对指向对象和被指向对象的坐标中文字进行识别。Step 4.3, using OCR technology to identify the characters in the coordinates of the pointing object and the pointed object. 7.如权利要求1所述的一种针对多层控股关系股份图的识别方法,其特征在于,所述步骤5包括:7. 
a kind of identification method for multi-layer holding relationship share chart as claimed in claim 1, is characterized in that, described step 5 comprises: 步骤5.1,建立一个空的有向图G,利用步骤3.3中得到的公司(个人)名称,使其作为节点依次添加进有向图G中,得到基础的仅存节点的有向图G′;Step 5.1, create an empty directed graph G, use the company (person) name obtained in step 3.3 to add it as a node into the directed graph G in turn, and obtain the basic directed graph G' with only existing nodes; 步骤5.2,在步骤5.1中有向图G′的基础上,将步骤3.2中的指向关系转化为三元组[u,v,w],其中u为起点,代表指向对象;v为终点,代表被指向对象,w为权重代表占股百分比,利用转化成的三元组作为参数,添加进有向图G′中,最终形成控股流程有向加权图G″。Step 5.2, on the basis of the directed graph G' in step 5.1, convert the pointing relationship in step 3.2 into a triple [u, v, w], where u is the starting point, representing the pointing object; v is the end point, representing the The pointed object, w is the weight representing the percentage of shares, and the converted triplet is used as a parameter to add it to the directed graph G', and finally form a directed weighted graph G" of the holding process.
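Claim 2 specifies a VGG-16 backbone with 13 convolutional layers, 3 fully connected layers, and 5 pooling layers. The claims do not give the per-layer channel widths; the list below is the standard VGG-16 configuration (configuration "D" in the original VGG paper), shown only to make the claimed layer counts concrete:

```python
# Standard VGG-16 configuration (an assumption, not taken from the patent):
# integers are the output channels of 3x3 conv layers (stride=1, padding=same,
# each followed by ReLU); "M" marks a 2x2 max-pooling layer with stride=2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
FC_LAYERS = 3  # the three fully connected layers at the end

def layer_counts(cfg):
    """Count convolutional and pooling layers in a VGG-style config list."""
    convs = sum(1 for c in cfg if isinstance(c, int))
    pools = sum(1 for c in cfg if c == "M")
    return convs, pools

convs, pools = layer_counts(VGG16_CFG)
print(convs, pools, FC_LAYERS)  # prints: 13 5 3
```

The counts match the claim exactly: 13 convolutional, 5 pooling, 3 fully connected layers.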
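The bound expansion of step 3.1 can be sketched as follows. This is a minimal illustration under the assumption that boxes are (left, upper, right, lower) tuples in pixel coordinates; the function name and data layout are hypothetical, not the patent's implementation:

```python
def expand_arrow_region(arrow_box, name_boxes, mu=20):
    """Sketch of step 3.1: grow an arrow-with-line region toward the
    company/individual name boxes at its two ends.

    Boxes are (left, upper, right, lower) in pixels; mu is the matching
    tolerance of claim 5 (10-30 px). Illustrative only.
    """
    L, U, R, D = arrow_box
    for nl, nu, nr, nd in name_boxes:
        if abs(U - nd) <= mu:   # a name box sits just above the arrow
            U = nu              # expand upper bound to that box's upper bound
        if abs(D - nu) <= mu:   # a name box sits just below the arrow
            D = nd              # expand lower bound to that box's lower bound
    # widen left/right toward the closest matching name box at each end
    top = [b for b in name_boxes if abs(b[3] - arrow_box[1]) <= mu]
    if top:
        L = min(top, key=lambda b: abs(b[0] - L))[0]
    bottom = [b for b in name_boxes if abs(b[1] - arrow_box[3]) <= mu]
    if bottom:
        R = min(bottom, key=lambda b: abs(b[2] - R))[2]
    return (L, U, R, D)
```

For an arrow box (100, 50, 120, 150) with one name box just above, (80, 20, 160, 45), and one just below, (90, 155, 140, 180), the region expands to (80, 20, 140, 180), i.e. it now covers both endpoints of the shareholding link.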
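The direction test of step 4.1 is a simple comparison of corner ordinates. A minimal sketch, assuming image coordinates where y grows downward (so y3 > y1 places the tip below the base, matching the claim's "downward" case); the function and threshold names are illustrative:

```python
def arrow_direction(corners, e1=5):
    """Sketch of step 4.1: infer an arrow's direction from its three
    corner points A, B (base) and C (tip).

    Returns "down" if y3 > y1, "up" if y3 < y1, and None when the base
    corners are not on one horizontal line (|y1 - y2| >= e1).
    """
    (x1, y1), (x2, y2), (x3, y3) = corners
    if abs(y1 - y2) >= e1:
        return None  # base corners A, B not horizontally aligned
    return "down" if y3 > y1 else "up"
```

For example, corners [(0, 10), (20, 11), (10, 40)] give "down", while [(0, 40), (20, 41), (10, 5)] give "up".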
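The one-to-one binding of step 4.2 matches each box in the larger group to the percentage label whose abscissa falls inside the box's (xmin, xmax) interval. A hedged sketch with hypothetical data shapes (the patent only says "a specific data structure such as a dictionary"):

```python
def bind_percentages(objects, percentages):
    """Sketch of the binding in step 4.2.

    `objects` maps a company/individual name to its (xmin, xmax) abscissa
    interval; `percentages` is a list of (x_centre, text) pairs. Each
    percentage is consumed at most once, giving a one-to-one binding.
    """
    bound = {}
    remaining = list(percentages)
    for name, (xmin, xmax) in objects.items():
        for i, (px, text) in enumerate(remaining):
            if xmin < px < xmax:
                bound[name] = text
                remaining.pop(i)  # one-to-one: each percentage used once
                break
    return bound
```

For example, with objects {"A": (0, 50), "B": (60, 120)} and percentages [(80, "30%"), (20, "70%")], the binding is {"A": "70%", "B": "30%"}: each label is attached to the box whose horizontal span contains it.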
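Step 5 builds the final weighted digraph from [u, v, w] triples. The patent does not prescribe a library (networkx's `DiGraph.add_weighted_edges_from` would serve equally well); this sketch uses a plain adjacency dict so it is dependency-free, with hypothetical company names:

```python
def build_holding_graph(names, relations):
    """Sketch of step 5: nodes first (G'), then weighted edges (G'').

    `relations` holds [u, v, w] triples meaning: u holds w percent of v.
    """
    graph = {name: {} for name in names}  # G': nodes only
    for u, v, w in relations:             # G'': add weighted directed edges
        graph[u][v] = w
    return graph

g = build_holding_graph(
    ["Holding Co", "Subsidiary A", "Subsidiary B"],
    [["Holding Co", "Subsidiary A", 60.0],
     ["Subsidiary A", "Subsidiary B", 51.0]],
)
print(g["Holding Co"]["Subsidiary A"])  # prints: 60.0
```

Multi-layer control then falls out of the graph structure: "Holding Co" controls "Subsidiary B" indirectly through the weighted path via "Subsidiary A".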
CN202110083415.XA 2021-01-21 2021-01-21 Identification method for multi-layer control stock relationship share graphs Active CN112766263B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110083415.XA CN112766263B (en) 2021-01-21 2021-01-21 Identification method for multi-layer control stock relationship share graphs

Publications (2)

Publication Number Publication Date
CN112766263A true CN112766263A (en) 2021-05-07
CN112766263B CN112766263B (en) 2024-02-02

Family

ID=75703627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110083415.XA Active CN112766263B (en) 2021-01-21 2021-01-21 Identification method for multi-layer control stock relationship share graphs

Country Status (1)

Country Link
CN (1) CN112766263B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114219329A (en) * 2021-12-20 2022-03-22 中国建设银行股份有限公司 A method and device for determining enterprise level

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009548A (en) * 2018-01-09 2018-05-08 贵州大学 A kind of Intelligent road sign recognition methods and system
WO2019192397A1 (en) * 2018-04-04 2019-10-10 华中科技大学 End-to-end recognition method for scene text in any shape
CN111626292A (en) * 2020-05-09 2020-09-04 北京邮电大学 Character recognition method of building indication mark based on deep learning technology
CN111782772A (en) * 2020-07-24 2020-10-16 平安银行股份有限公司 Text automatic generation method, device, equipment and medium based on OCR technology

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
REN Ming; XU Guang; WANG Wenxiang: "Research on Entity Relation Extraction from Genealogy Texts", Journal of Chinese Information Processing, no. 06 *
ZHANG Xinfeng, SHEN Lansun: "Pattern Recognition and Its Applications in Image Processing", Measurement & Control Technology, no. 05 *
DU Enyu; ZHANG Ning; LI Yandi: "Multi-Classification Method for Lane Guide Arrows Based on Adaptive Block-Coded SVM", Acta Optica Sinica, no. 10 *
MEI Jixia; LI Wei: "Game Analysis of the Relationship between Controlling Shareholders and Minority Shareholders", Journal of Shihezi University (Philosophy and Social Sciences), no. 04 *

Also Published As

Publication number Publication date
CN112766263B (en) 2024-02-02

Similar Documents

Publication Publication Date Title
Zhao et al. Reconstructing BIM from 2D structural drawings for existing buildings
CN104966104B (en) A kind of video classification methods based on three-dimensional convolutional neural network
CN108427738B (en) Rapid image retrieval method based on deep learning
CN110084131A (en) A kind of semi-supervised pedestrian detection method based on depth convolutional network
CN110516539A (en) Method, system, storage medium and equipment for extracting buildings from remote sensing images based on confrontation network
CN110827398B (en) Automatic semantic segmentation method for indoor three-dimensional point cloud based on deep neural network
CN114092697B (en) Building facade semantic segmentation method with attention fused with global and local depth features
CN113312501A (en) Construction method and device of safety knowledge self-service query system based on knowledge graph
CN114694038A (en) High-resolution remote sensing image classification method and system based on deep learning
Qiao et al. A weakly supervised semantic segmentation approach for damaged building extraction from postearthquake high-resolution remote-sensing images
CN111191664A (en) Training method of label identification network, label identification device/method and equipment
CN112163447B (en) Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet
CN109360179A (en) Image fusion method, device and readable storage medium
CN111783543B (en) Facial activity unit detection method based on multitask learning
CN113220878A (en) Knowledge graph-based OCR recognition result classification method
CN107392463B (en) A method, module, device and storage device for identifying urban functional area
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
CN107341440A (en) Indoor RGB D scene image recognition methods based on multitask measurement Multiple Kernel Learning
CN110334719A (en) A method and system for extracting building images from remote sensing images
CN113516379B (en) Work order scoring method for intelligent quality inspection
CN113807347A (en) Kitchen waste impurity identification method based on target detection technology
CN117710975A (en) Indoor building structure point cloud semantic segmentation method and system based on deep learning
CN111027634B (en) Regularization method and system based on class activation mapping image guidance
CN117975090A (en) A method for detecting human interaction based on intelligent perception
CN111339967B (en) Pedestrian detection method based on multi-view graph convolution network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant