CN111405585B

CN111405585B - Neighbor relation prediction method based on convolutional neural network

Info

Publication number: CN111405585B
Application number: CN202010194728.8A
Authority: CN
Inventors: 骆曦; 李克; 刘子巍
Original assignee: Beijing Union University
Current assignee: Beijing Union University
Priority date: 2020-03-19
Filing date: 2020-03-19
Publication date: 2023-10-03
Anticipated expiration: 2040-03-19
Also published as: CN111405585A

Abstract

The invention discloses a neighbor relation prediction method based on a convolutional neural network (CNN), belonging to the technical field of base station information management. The method comprises data preprocessing, CNN network construction, CNN model training, neighbor relation prediction, and post-processing. The wireless network knowledge graph involved in the invention is non-Euclidean domain data; the graph structure information, namely the attributes and relations of each entity, is converted into features that a CNN can process, and convolution kernels of several heights are used to extract the implicit relationship features between multiple sampling point pairs and the cells. The method can supplement or even replace existing manual drive tests and ANR technology, guides the operator's network operation and maintenance department in configuring and managing base station neighbor relations more efficiently, promptly, and conveniently, and thereby supports a higher inter-cell handover success rate, better service continuity, and a good service experience.

Description

Neighbor relation prediction method based on convolutional neural network

Technical Field

The invention relates to a neighbor relation prediction method based on a convolutional neural network, and belongs to the technical field of base station information management.

Background

In recent years, knowledge graph technology has attracted extensive attention and research, and has gradually expanded from general-purpose knowledge graphs to various specialized fields. In mobile communications, the wireless network knowledge graph is receiving growing attention as a new method of base station information management. Drawing on the strengths of knowledge reasoning in information mining, it helps raise the intelligence level of mobile network operation and maintenance and improves the efficiency of that work. In the automatic construction of a knowledge graph, extracting and calibrating entity relations are very important tasks. In particular, automatically identifying neighbor relations between cell entities from massive data, and calibrating the existing neighbor cell list (NCL, Neighbor Cell List), is one of the important tasks in building a wireless network knowledge graph.

In conventional mobile network planning and optimization, maintaining neighbor relations also occupies a very important position. Its main purpose is to let user equipment camped at the edge of a cell's service area hand over promptly and successfully to the neighboring cell with the best signal, keeping the communication service uninterrupted and thereby ensuring call quality and overall network performance. A handover can be completed when a user moves from one cell into another only after the neighbor relations of each cell have been configured at the base station (called eNB in the 4G LTE system and gNB in the 5G system). Accurate neighbor relation configuration is therefore a fundamental requirement for guaranteeing mobile network performance.

If the neighbor relation list contains too many neighbor cells, the complexity and redundancy of the network increase and the accuracy of measurement reports decreases; if it contains too few neighbor cells or they are misconfigured, coverage gaps or blind areas exist between cells, which lowers the handover success rate and causes many dropped calls. During network operation, the network scale, user distribution, and interference environment change continuously, so neighbor relations must be checked and optimized regularly to maintain network performance.

There are two main existing approaches:

(1) Manual drive test: traditional network planning and optimization is a technically complex task that requires professionals to carry equipment on regular drive tests for analysis. However, drive tests consume a great deal of time, manpower, and material resources, with long cycles and high costs. Because drive test data are only samples taken in a certain area at a certain time and lack network-wide statistics, their value in guiding neighbor optimization is limited. From the operator's perspective, lower networking and operating costs are needed in order to offer users lower-priced network services and win market share.

(2) ANR (Automatic Neighbor Relation): a method for automatically configuring and optimizing neighbor relations. It is an important component of self-organizing network (SON) technology and is highly automated. However, live networks generally do not rely on this function entirely; in a new network it is usually enabled for a period of time and then turned off, after which optimization continues manually. The main reasons include: (a) ANR does not consider the actual coverage capability of the base station and does not fully use the coverage quality information of each base station, so too many neighbor cells are configured, introducing many unnecessary neighbor relations (that is, redundant or overly distant neighbor relations); (b) it can impose a heavy burden of terminal measurements and air-interface signaling; (c) the policy parameters are not configured uniformly across the network, and unreasonable configuration can cause neighbor cells to be added and deleted repeatedly, an unstable number of neighbor cells, and similar problems. Improvements have also been proposed: for example, when the mobile terminal detects that the RSRP of a cell outside the list is higher than the RSRP of the current serving cell and the difference exceeds a set value, the mobile terminal reports the measurement result to the serving eNB. The ANR function includes three modules: a neighbor cell detection module, a neighbor cell deletion module, and a neighbor relation list management module. The main working steps are:

(a) The eNB sends ANR-related measurement configuration to the UE, which may cover intra-frequency or inter-frequency measurement within the same technology (intra-RAT) or inter-technology (inter-RAT) measurement. After receiving the measurement configuration, the UE measures the PCI (Physical Cell Identity) of neighbor cells and reports the measured neighbor PCI information to the eNB in the measurement report format;

(b) After receiving the PCI information of a neighbor cell, the eNB selects a specific UE and sends it a measurement configuration for reporting the CGI (Cell Global Identity). After receiving this configuration, the UE reads the broadcast information of the neighbor cell and obtains its CGI and other information;

(c) After receiving the CGI and other information of the neighbor cell reported by the UE, the eNB reports it to the O&M (operation and maintenance) system, and the O&M system decides whether to add the neighbor cell.

Note: the eNB does not use the measurement values reported by the mobile terminals, only which cells are reported. It counts the number of times each cell is reported and adds the cells whose report count exceeds a preset threshold to the neighbor relation list.
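The counting rule in the note above can be illustrated with a minimal sketch. The function name `select_neighbors` and the input format (a flat list with one cell identifier per UE report) are assumptions made for illustration only and do not come from any ANR specification.

```python
from collections import Counter

def select_neighbors(reported_cells, threshold):
    """Return the cells whose report count exceeds the threshold.

    reported_cells: iterable of cell identifiers, one entry per UE report
    (hypothetical input format). Measured values are ignored; only the
    number of times each cell is reported matters.
    """
    counts = Counter(reported_cells)
    return sorted(cell for cell, n in counts.items() if n > threshold)

# Hypothetical reports: A is reported 3 times, B twice, C once.
reports = ["A", "B", "A", "C", "A", "B"]
neighbors = select_neighbors(reports, threshold=1)
```

With a threshold of 1, cell C is left out because it was reported only once, regardless of how strong its measured signal was.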

The main drawbacks of the existing approaches are:

(1) Manual drive test: it consumes a great deal of time, manpower, and material resources, and it cannot traverse all coverage areas and service times or provide network-wide statistics, so its value in guiding neighbor optimization is limited.

(2) ANR (Automatic Neighbor Relation): because the actual coverage capability of the base station is not considered, too many neighbor cells are generally configured, producing many redundant neighbor relations, or overly distant neighbor relations caused by overshooting coverage. These bring a heavy burden of terminal measurements and air-interface signaling, or cause frequent handovers that affect service continuity and can even lead to disconnections and dropped calls. In addition, the policy parameters are not configured uniformly across the network, and unreasonable configuration can cause neighbor cells to be added and deleted repeatedly, an unstable number of neighbor cells, and similar problems. This function is therefore not relied on entirely in live 4G networks: typically the ANR function of the base station is enabled only for an initial period in a new network, then turned off, with further optimization done by manual drive tests and other means.

Therefore, new technical means are needed to accurately discover and calibrate the neighbor relations between cell entities.

Disclosure of Invention

The invention aims to solve the problem of automatically extracting the neighbor relations among cell entities from massive terminal-perceived coverage data, base station measurement report data, and preliminarily constructed wireless network knowledge graph data, correcting wrongly configured or redundant neighbor cells in the current neighbor relation list, and supplementing missed neighbor relations, thereby achieving more efficient and intelligent neighbor relation management.

To solve the above technical problem, the invention adopts a neighbor relation prediction method based on a convolutional neural network, whose specific steps are as follows:

Step 1: Data preprocessing

(1) Constructing an initial training sample set S': select a target area whose NCL has been manually labeled, traverse the NCL of that area, and add all cell pairs whose "mark" field is 1, that is, pairs confirmed to have a neighbor relation, to the training set as positive samples; add all cell pairs that are not listed in the NCL and satisfy the following condition to the training set as negative samples:

where $(x_1, y_1)$ and $(x_2, y_2)$ are the longitudes and latitudes of the base stations to which the two cells belong, obtained from the "site longitude" and "site latitude" attributes of the two cell entities in the preliminarily constructed wireless network knowledge graph of the target area; Rc = 6378137 is a constant; maxDis is a preset maximum neighbor distance.
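The distance condition itself is not reproduced in this text, so the following is only a sketch of how the distance between two sites could be computed from the quantities defined above. It assumes that Rc is the Earth radius in metres (6378137 m is the WGS 84 equatorial radius) and uses the haversine formula, which is a swapped-in technique and may differ from the patent's own expression. How the result is compared with maxDis is left open here.

```python
import math

RC = 6378137.0  # constant from the text; assumed here to be the Earth radius in metres

def great_circle_distance(lon1, lat1, lon2, lat2, radius=RC):
    """Haversine distance between two (longitude, latitude) points given in degrees.

    This formula is an assumption; the patent's own expression is not
    reproduced in the text.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding slightly above 1 before asin.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))

# Two hypothetical sites one degree of longitude apart on the equator.
d = great_circle_distance(116.0, 0.0, 117.0, 0.0)
```

For this example the result equals Rc multiplied by one degree expressed in radians, roughly 111.3 km.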

(2) Traverse the samples in the initial training sample set S', convert each cell pair into a link sequence matrix by the following steps, and assign positive and negative class labels (all positive samples are labeled 1 and all negative samples 0), obtaining the labeled sequence matrix training set S:

(a) For the cell pair u and v, traverse the sampling point entities in the regional wireless network knowledge graph, extract all sampling points of cell u and cell v according to the "cell ID" attribute field, and store them in the sets N(u) and N(v), respectively.

(b) Extract from the sets N(u) and N(v) all sampling point pairs $(e_u, e_v)$ that satisfy the condition $e_u \in N(u)$, $e_v \in N(v)$, and the sampling points $e_u$ and $e_v$ have the same "terminal ID" attribute value.

(c) Extract the reference signal received power of the sampling points from the MCS or MR data set and sort the sampling point pairs, first by terminal ID in ascending order and then by the sum of the reference signal received powers of the two sampling points $e_u$ and $e_v$ in descending order, obtaining the following n ordered sampling point pairs: $(e_{u1}, e_{v1}), (e_{u2}, e_{v2}), \ldots, (e_{un}, e_{vn})$.

(d) Construct the node link sequence matrix A[(3n+1) × d], where n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension. Extract each entity from the wireless network knowledge graph database and store its attribute information in the matrix. Because the number of sampling point pairs differs from one cell pair to another, the length of the link sequence matrix A also differs; a maximum input node sequence length of 3n+1 is therefore specified, and shorter sequences are padded with 0 so that the link sequence matrix A has the same length for every cell pair. In addition, the wireless network knowledge graph is a heterogeneous network, that is, it contains nodes of different types with different numbers of attributes. The attributes of the two node types, cell and sampling point, are merged to generate a feature dictionary; d is the dimension of this feature dictionary, and feature items missing from a node are filled with 0.
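Steps (b) and (c) can be sketched as follows. The dictionary keys `terminal_id` and `rsrp` are hypothetical field names standing in for the "terminal ID" and reference signal received power attributes, and truncating the result to at most n pairs is an assumption based on n being the maximum number of input pairs.

```python
def ordered_pairs(points_u, points_v, n):
    """Pair sampling points of cells u and v that share a terminal ID,
    sort by terminal ID ascending and then by summed RSRP descending,
    and keep at most n pairs (the truncation is an assumption)."""
    pairs = [
        (eu, ev)
        for eu in points_u
        for ev in points_v
        if eu["terminal_id"] == ev["terminal_id"]
    ]
    pairs.sort(key=lambda p: (p[0]["terminal_id"], -(p[0]["rsrp"] + p[1]["rsrp"])))
    return pairs[:n]

# Hypothetical sampling points for N(u) and N(v); RSRP values in dBm.
N_u = [
    {"terminal_id": "t2", "rsrp": -90},
    {"terminal_id": "t1", "rsrp": -100},
    {"terminal_id": "t1", "rsrp": -80},
]
N_v = [
    {"terminal_id": "t1", "rsrp": -95},
    {"terminal_id": "t3", "rsrp": -70},
    {"terminal_id": "t2", "rsrp": -85},
]
result = ordered_pairs(N_u, N_v, n=5)
```

Terminal t3 appears only under cell v, so it contributes no pair; the two t1 pairs come first, with the stronger summed RSRP ahead of the weaker one.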

Step 2: construction of CNN networks

(1) Input: the node link sequence matrix A[(3n+1) × d].

(2) Convolution layer: the weight matrix of a convolution kernel is $w \in \mathbb{R}^{h \times d}$, where $\mathbb{R}$ denotes the set of real numbers, h is the kernel height, and the kernel width equals the feature dimension d; several kernels of different heights are set by varying h. The kernel w slides from top to bottom over the sequence matrix A with stride 3, and the sequence feature $c_i$ is obtained by convolving the kernel w with the matrix region $x_{3i-2:3i+h-3}$:

$c_i = f(w \cdot x_{3i-2:3i+h-3} + b)$

where f is a nonlinear activation function such as the hyperbolic tangent and $b \in \mathbb{R}$ is a bias term. Convolving the kernel w with each region $\{x_{1:h}, x_{4:h+3}, x_{7:h+6}, \ldots, x_{3n-h+2:3n+1}\}$ of the sequence matrix A yields a single-column sequence feature $C = [c_1, c_2, \ldots, c_{(3n-h+4)/3}]$. The kernel height h is set to 4, 7, 10, …, 3m+1, where m is the preset number of convolution kernels; this yields m sequence features, which represent the relationship features between cells u and v and one or more sampling point pairs, respectively.
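The stride-3 convolution described above can be sketched in plain Python. This is a minimal illustration, assuming tanh as the activation f and representing A and w as lists of rows; the function name `conv_stride3` and the toy values are not from the original.

```python
import math

def conv_stride3(A, w, b, stride=3):
    """Slide an h x d kernel w down the rows of A with the given stride
    and apply tanh, returning the sequence feature C as a list."""
    h, d = len(w), len(w[0])
    C = []
    # Window i (1-based) covers rows 3i-2 .. 3i+h-3, i.e. 0-based top = 3(i-1).
    for top in range(0, len(A) - h + 1, stride):
        s = sum(w[r][c] * A[top + r][c] for r in range(h) for c in range(d))
        C.append(math.tanh(s + b))
    return C

# Toy input with n = 3 (so 3n+1 = 10 rows) and d = 2; values are illustrative.
A = [[1.0, 1.0] for _ in range(10)]
w = [[0.1, 0.1] for _ in range(4)]  # kernel height h = 4
C = conv_stride3(A, w, b=0.0)
```

For n = 3 and h = 4 the kernel visits rows 1-4, 4-7, and 7-10, giving (3n-h+4)/3 = 3 outputs, which matches the index range stated in the text.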

(4) Pooling layer: the sequence features are downsampled using average pooling; pooling each sequence feature yields a feature $z_i$, finally giving m features $z = \{z_1, z_2, \ldots, z_m\}$.

(5) Fully connected layer: the input is the N×1 sequence feature z and the output is a T×1 vector y, where T is the number of classes; predicting whether a neighbor relation exists is treated as a binary classification problem, so T = 2:

$y = W_{dense} \cdot (z \circ r) + b_{dense}$

where $W_{dense}$ is a 2×N weight matrix; r introduces the Dropout operation and is a randomly generated vector of 0s and 1s with probability p, where p is 0.5; $b_{dense}$ is an N×1 bias vector.

(6) Softmax layer: the input is y and the output is the 2×1 vector $prob = [p_1, p_2] = \mathrm{softmax}(y)$, where $p_1$ and $p_2$ are the probabilities that the two cells have and do not have a neighbor relation, respectively.
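The pooling, fully connected, and softmax layers can be sketched together as a forward pass. Several details here are assumptions rather than statements of the patent: p is treated as the probability that a mask entry is 1, no mask is applied at inference, and the bias is given length T = 2 so that the shapes of $W_{dense} \cdot (z \circ r)$ and the bias agree. All numeric values are illustrative.

```python
import math
import random

def average_pool(C):
    """Average pooling: reduce one sequence feature to a single value z_i."""
    return sum(C) / len(C)

def dense_softmax(z, W, b, p=0.5, train=False, rng=random):
    """Apply a Dropout mask r to z, then the dense layer, then softmax."""
    if train:
        r = [1.0 if rng.random() < p else 0.0 for _ in z]  # p = P(entry is 1), assumed
    else:
        r = [1.0] * len(z)  # no masking at inference (assumption)
    masked = [zi * ri for zi, ri in zip(z, r)]
    y = [sum(wij * mj for wij, mj in zip(row, masked)) + bi for row, bi in zip(W, b)]
    top = max(y)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

# Three hypothetical sequence features from kernels of three heights.
features = [[0.2, 0.4], [0.6], [-0.1, 0.1]]
z = [average_pool(c) for c in features]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 2 x N weights with N = 3
b = [0.0, 0.0]
prob = dense_softmax(z, W, b)  # [p1, p2]
```

Subtracting the maximum before exponentiating leaves the softmax result unchanged and avoids overflow for large inputs.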

Step 3: CNN model training

The CNN model is trained using a labeled sequence matrix training set S.

(1) Initialize the convolution kernels and bias terms of the convolution layer, and the weight matrix and bias vector of the fully connected layer;

(2) Define the hyperparameters of the training process: maximum number of iterations, stop-iteration threshold, and learning rate; the loss function uses cross entropy:

$Loss = -(t \cdot \log(p_1) + (1-t) \cdot \log(1-p_1))$

where t is the true value, that is, the label of the sample, and $p_1$ is the probability that a neighbor relation exists, obtained in Step 2;

(3) Input all sequence matrices in the labeled sequence matrix training set S into the model in turn, compute the loss value Loss from the true labels of the training samples and the model's predictions, back-propagate it layer by layer, and iteratively update the parameters of each layer using stochastic gradient descent;

(4) End the training of the CNN model when the change in the loss value between iterations is smaller than the set threshold or the maximum number of iterations is reached.
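The stopping rule in (4) can be sketched independently of the model. Here `step` is a hypothetical callable that performs one parameter update and returns the current loss; the function name `train` and the example loss values are illustrative only.

```python
def train(step, max_iters, tol):
    """Call step() repeatedly until the change in loss between two
    consecutive iterations is below tol or max_iters is reached.
    Returns the list of observed loss values."""
    history = []
    prev = None
    for _ in range(max_iters):
        loss = step()
        history.append(loss)
        if prev is not None and abs(prev - loss) < tol:
            break
        prev = loss
    return history

# A stand-in for real training: losses are read from a fixed sequence.
losses = iter([0.9, 0.5, 0.3, 0.2999, 0.1])
history = train(lambda: next(losses), max_iters=10, tol=1e-3)
```

In this example training stops after the fourth iteration, because the loss changed by only about 0.0001 between the third and fourth steps.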

Step 4: neighbor relation prediction

Select a target area whose NCL has not been manually labeled, convert the cell pairs to be predicted into sequence matrices according to Step 1 (2)(a)-(d), input them into the CNN model trained in Step 3, and judge from the output probability whether each cell pair has a neighbor relation; mark cell pairs that have a neighbor relation as Y and those that do not as N.
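The labeling step can be sketched as below. The text only says the decision is made "according to the output probability", so the rule used here (Y when $p_1 > p_2$) is an assumption, and the cell identifiers are hypothetical.

```python
def label_pairs(probabilities):
    """Map each cell pair's softmax output [p1, p2] to 'Y' or 'N'.

    The rule p1 > p2 is an assumed decision rule, not taken from the text.
    """
    return {
        pair: ("Y" if p1 > p2 else "N")
        for pair, (p1, p2) in probabilities.items()
    }

# Hypothetical model outputs for two cell pairs.
outputs = {("c1", "c2"): (0.83, 0.17), ("c1", "c3"): (0.10, 0.90)}
labels = label_pairs(outputs)
```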

Step 5: post-treatment

Update the NCL of the target area according to the neighbor prediction results: for a cell pair marked Y that already appears in the NCL, the correct neighbor relation has already been configured, so its "mark" field is set to 1; for a cell pair marked Y that does not appear in the NCL, the pair is a missed neighbor relation, so it is added to the NCL with its "mark" field set to 1; for a cell pair marked N that appears in the NCL, the pair is a wrongly configured or redundant neighbor relation, so it is deleted from the NCL; for a cell pair marked N that does not appear in the NCL, no operation is needed.
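The four cases above can be sketched as a single update function. Representing the NCL as a dictionary from (head cell ID, tail cell ID) to the "mark" value is an assumption made for brevity, and the cell identifiers are hypothetical.

```python
def update_ncl(ncl, predictions):
    """Return an updated copy of the NCL.

    ncl: dict mapping (head_cell, tail_cell) -> mark (1 or None).
    predictions: dict mapping (head_cell, tail_cell) -> 'Y' or 'N'.
    """
    updated = dict(ncl)
    for pair, label in predictions.items():
        if label == "Y":
            # Confirms an existing entry or adds a missed neighbor relation.
            updated[pair] = 1
        elif pair in updated:
            # Predicted N but listed: wrongly configured or redundant neighbor.
            del updated[pair]
        # Predicted N and not listed: nothing to do.
    return updated

ncl = {("c1", "c2"): None, ("c1", "c3"): None}
predictions = {
    ("c1", "c2"): "Y",  # listed and confirmed
    ("c1", "c3"): "N",  # listed but rejected
    ("c1", "c4"): "Y",  # missed neighbor relation
    ("c1", "c5"): "N",  # not listed, no operation
}
new_ncl = update_ncl(ncl, predictions)
```

Returning a copy rather than editing in place keeps the original list available for comparison or rollback.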

Further, the input data of Step 1 (2)(c) is an MCS or MR coverage sampling data set: MCS data consist of data collected from a large number of user terminals, and MR data consist of the measurement information reported by terminal devices under each base station, collected from the base station equipment.

Further, the wireless network knowledge graph of the target area preliminarily constructed in Step 1 includes at least base station entities, cell entities, terminal entities, and sampling point entities with their attributes, as well as membership, residence, and association relations.

Further, the base station entity includes at least the attribute feature fields base station ID, city, operator, network type, large area ID, administrative district, site longitude, site latitude, number of cells, and base station type; the cell entity includes at least the attribute feature fields cell ID, city, operator, network type, large area ID, base station ID, administrative district, site longitude, site latitude, base station type, azimuth, downtilt, physical cell ID, frequency number, coverage rate, and coverage radius; the terminal entity includes at least the attribute feature fields terminal ID, terminal brand, terminal model, operator, and network type; the sampling point entity includes at least the attribute feature fields sampling point ID, terminal ID, sampling date, sampling time, longitude, latitude, city, administrative district, operator, network type, large area ID, base station ID, cell ID, physical cell ID, frequency number, reference signal received power, reference signal received quality, signal-to-interference-plus-noise ratio, terminal brand, and terminal model. The knowledge graph data are stored in a graph database.

Further, the current actual neighbor cell list (NCL) of each base station in the target area has fields including at least NCL ID, head cell ID, tail cell ID, and mark; the "mark" field records the manually labeled correct neighbor relations between cells in a target area: 1 indicates that a correct neighbor relation is confirmed to exist, and blank indicates that it is undetermined whether a correct neighbor relation exists.

Compared with the prior art, the invention can effectively ensure the validity, completeness, and timeliness of neighbor relation information, and is an important step in automatically constructing the wireless network knowledge graph. The method can supplement or even replace existing manual drive tests and ANR technology, guides the operator's network operation and maintenance department in configuring and managing base station neighbor relations more efficiently, promptly, and conveniently, and thereby supports a higher inter-cell handover success rate, better service continuity, and a good service experience.

Compared with ANR: the data source is different, no ANR function upgrade is required on the eNB, and no air-interface measurement signaling burden is added; the coverage distribution characteristics and timestamp information under actual coverage are used to find the best neighbor cells, which avoids the validity problem in ANR, where cells are paired merely because a certain cell was measured.

Predicting neighbor relations based on CNN: the contribution lies mainly in the data preprocessing method and the design of the convolution layer. Deep learning methods such as CNN handle the structural features of Euclidean-domain data such as images and video well, but cannot effectively extract the structural features of nodes when processing non-Euclidean-domain data.

Drawings

FIG. 1 is a complete algorithm flow chart of the present invention.

FIG. 2 shows the node link sequence matrix A[(3n+1) × d].

FIG. 3 is the CNN network architecture diagram.

FIG. 4 shows the convolution operation (stride 3, convolution kernel height h = 4).

Detailed Description

The present invention will be described in detail below with reference to the drawings and examples.

The invention provides a neighbor relation prediction method based on a convolutional neural network.

An important technology in the field of knowledge graphs is knowledge reasoning, which infers new knowledge or identifies erroneous knowledge from the knowledge already in the graph; it mainly covers two aspects, knowledge graph completion and denoising. Knowledge reasoning is an important means and a key link in knowledge graph construction. Using a convolutional neural network, the invention analyzes and reasons over the entities, attributes, and entity relations already present in the preliminarily constructed wireless network knowledge graph (not including neighbor relations between cell entities), predicts the neighbor relations between cell entities, and calibrates the existing NCL maintained in the system accordingly (adding new neighbor relations and deleting redundant or wrongly configured ones).

The specific steps are described in detail as follows:

Input data:

(1) MCS or MR coverage sampling data set: MCS data consist of data collected from a large number of user terminals, and MR data consist of the measurement information reported by terminal devices under each base station, collected from the base station equipment. Both types of data contain the following feature fields: terminal ID (for MR data, generally the IMSI), sampling date, sampling time, longitude, latitude, operator, network type, large area ID (TAC, the tracking area code, in 4G and 5G networks), base station ID (eNBID in 4G networks, gNBID in 5G networks), cell ID (cellID), physical cell ID (PCI), frequency number (EARFCN), reference signal received power (RSRP in 4G networks, CSI-RSRP in 5G networks), reference signal received quality (RSRQ in 4G networks, CSI-RSRQ in 5G networks), signal-to-interference-plus-noise ratio (SINR in 4G networks, CSI-SINR in 5G networks), and neighbor cell information list (including the network type, cell ID, base station ID, cell ID, physical cell ID, frequency number, and reference signal received power of each neighbor cell measured by the terminal). In some cases some fields are missing.

(2) Preliminarily constructed wireless network knowledge graph of the target area: it includes at least base station entities, cell entities, terminal entities, and sampling point entities (a special entity) with their attributes, as well as membership relations (cell-base station), residence relations (terminal-cell), and association relations (sampling point-cell, sampling point-terminal). The base station entity includes at least the attribute feature fields base station ID, city, operator, network type, large area ID, administrative district, site longitude, site latitude, number of cells, and base station type; the cell entity includes at least the attribute feature fields cell ID, city, operator, network type, large area ID, base station ID, administrative district, site longitude, site latitude, base station type, azimuth, downtilt, physical cell ID, frequency number, coverage rate, and coverage radius; the terminal entity includes at least the attribute feature fields terminal ID, terminal brand, terminal model, operator, and network type; the sampling point entity includes at least the attribute feature fields sampling point ID, terminal ID, sampling date, sampling time, longitude, latitude, city, administrative district, operator, network type, large area ID, base station ID, cell ID, physical cell ID, frequency number, reference signal received power, reference signal received quality, signal-to-interference-plus-noise ratio, terminal brand, and terminal model. The knowledge graph data are stored in a graph database.

(3) Current actual neighbor cell list (NCL) of each base station in the target area: the fields include at least NCL ID, head cell ID, tail cell ID, and mark. The "mark" field records the manually labeled correct neighbor relations between cells in a target area (within a certain geographic area or under a certain TAC): 1 indicates that a correct neighbor relation is confirmed to exist, and null indicates that it is undetermined whether a correct neighbor relation exists. Neighbor relations that do not exist in the target area are not listed in the NCL.

Step 1: data preprocessing

(c) Extract the reference signal received power of the sampling points from the MCS or MR data set and sort the sampling point pairs, first by terminal ID in ascending order and then by the sum of the reference signal received powers (RSRP) of the two sampling points $e_u$ and $e_v$ in descending order, obtaining the following n ordered sampling point pairs: $(e_{u1}, e_{v1}), (e_{u2}, e_{v2}), \ldots, (e_{un}, e_{vn})$.

(d) Construct the node link sequence matrix A[(3n+1) × d] as shown in FIG. 2, where n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension. Extract each entity from the wireless network knowledge graph database and store its attribute information in the matrix. Because the number of sampling point pairs differs from one cell pair to another, the length of the link sequence matrix A also differs; a maximum input node sequence length of 3n+1 is therefore specified, and shorter sequences are padded with 0 so that the link sequence matrix A has the same length for every cell pair. In addition, the wireless network knowledge graph is a heterogeneous network, that is, it contains nodes of different types with different numbers of attributes. The attributes of the two node types, cell and sampling point, are merged to generate a feature dictionary; d is the dimension of this feature dictionary, and feature items missing from a node are filled with 0.
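The zero-padding described in (d) can be sketched as below. The exact row order of the matrix is given by the figure, which is not reproduced in this text, so the layout used here is an assumption chosen only to be consistent with the 3n+1 row count: one row for cell u, then for each sampling point pair a row for $e_u$, a row for $e_v$, and a closing cell row that alternates between v and u. The feature names are hypothetical.

```python
def build_link_matrix(cell_u, cell_v, pairs, n, feature_names):
    """Build a (3n+1) x d matrix of node features, padded with zeros.

    feature_names plays the role of the merged feature dictionary (d entries).
    The row order is an assumed layout, not taken from the patent figure.
    """
    def row(node):
        # Features a node does not have are filled with 0.
        return [float(node.get(name, 0.0)) for name in feature_names]

    rows = [row(cell_u)]
    for k, (eu, ev) in enumerate(pairs[:n]):
        closing_cell = cell_v if k % 2 == 0 else cell_u
        rows += [row(eu), row(ev), row(closing_cell)]
    d = len(feature_names)
    while len(rows) < 3 * n + 1:
        rows.append([0.0] * d)  # pad short sequences to the fixed length
    return rows

feature_names = ["azimuth", "rsrp"]  # hypothetical merged feature dictionary
cell_u, cell_v = {"azimuth": 120}, {"azimuth": 240}
pairs = [({"rsrp": -80}, {"rsrp": -95})]
A = build_link_matrix(cell_u, cell_v, pairs, n=3, feature_names=feature_names)
```

With one available pair and n = 3, the first four rows carry data and the remaining six rows are zero padding.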

Step 2: construction of CNN networks

Construct the CNN network shown in FIG. 3:

(1) Input: the node link sequence matrix A[(3n+1) × d], where n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension.

(2) Convolution layer: the weight matrix of a convolution kernel is $w \in \mathbb{R}^{h \times d}$, where $\mathbb{R}$ denotes the set of real numbers, h is the kernel height, and the kernel width equals the feature dimension d; several kernels of different heights can be set by varying h. The kernel w slides from top to bottom over the sequence matrix A with stride 3, and the sequence feature $c_i$ is obtained by convolving the kernel w with the matrix region $x_{3i-2:3i+h-3}$:

$c_i = f(w \cdot x_{3i-2:3i+h-3} + b)$

where f is a nonlinear activation function such as the hyperbolic tangent and $b \in \mathbb{R}$ is a bias term. Convolving the kernel w with each region $\{x_{1:h}, x_{4:h+3}, x_{7:h+6}, \ldots, x_{3n-h+2:3n+1}\}$ of the sequence matrix A yields a single-column sequence feature $C = [c_1, c_2, \ldots, c_{(3n-h+4)/3}]$. FIG. 4 shows the convolution operation with stride 3 and kernel height h = 4. The kernel height h may be set to 4, 7, 10, …, 3m+1, where m is the preset number of convolution kernels; this yields m sequence features, which represent the relationship features between cells u and v and one or more sampling point pairs, respectively.
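As a worked check of the index arithmetic, the length (3n-h+4)/3 of each sequence feature can be tabulated for the kernel heights 4, 7, …, 3m+1. The values n = 5 and m = 3 below are arbitrary illustrative choices.

```python
def feature_length(n, h):
    """Number of kernel positions on a (3n+1)-row matrix for height h and stride 3."""
    if (h - 1) % 3 != 0:
        raise ValueError("kernel heights are expected to have the form 3k+1")
    return (3 * n - h + 4) // 3

n, m = 5, 3
heights = [3 * k + 1 for k in range(1, m + 1)]  # 4, 7, 10
lengths = {h: feature_length(n, h) for h in heights}
```

A kernel of height 3k+1 produces n-k+1 outputs, so each taller kernel yields one fewer value before pooling: here 5, 4, and 3.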

(4) Pooling layer: the sequence features are downsampled using average pooling; pooling each sequence feature yields a feature $z_i$, finally giving m features $z = \{z_1, z_2, \ldots, z_m\}$.

(5) Fully connected layer: the input is the N×1 sequence feature z and the output is a T×1 vector y (T is the number of classes; predicting whether a neighbor relation exists is treated as a binary classification problem, so T = 2):

$y = W_{dense} \cdot (z \circ r) + b_{dense}$

where $W_{dense}$ is a 2×N weight matrix; r introduces the Dropout operation and is a randomly generated vector of 0s and 1s with probability p, where p is 0.5; $b_{dense}$ is an N×1 bias vector.

Step 3: CNN model training

The CNN model is trained using a labeled sequence matrix training set S.

(2) Define the hyperparameters of the training process: maximum number of iterations, stop-iteration threshold, learning rate, and so on; the loss function may use cross entropy (comparing the output neighbor relation probability with the true neighbor relation):

$Loss = -(t \cdot \log(p_1) + (1-t) \cdot \log(1-p_1))$
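The cross-entropy loss above can be evaluated directly. In this sketch the clamping constant `eps` is an added safeguard against log(0) and is not part of the original formula; the sample probabilities are illustrative.

```python
import math

def cross_entropy(t, p1, eps=1e-12):
    """Binary cross entropy for label t (0 or 1) and predicted probability p1."""
    p1 = min(max(p1, eps), 1.0 - eps)  # keep log() finite (added safeguard)
    return -(t * math.log(p1) + (1 - t) * math.log(1.0 - p1))

loss_confident_right = cross_entropy(1, 0.9)  # true neighbor, predicted 0.9
loss_confident_wrong = cross_entropy(0, 0.9)  # not a neighbor, predicted 0.9
```

A confident correct prediction gives a loss of about 0.105, while the same confidence on a wrong prediction gives about 2.303, which is what drives the parameter updates toward correct labels.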

Step 4: neighbor relation prediction

Step 5: post-treatment

Claims

1. A neighbor relation prediction method based on a convolutional neural network, characterized in that the method comprises the following specific steps:

Step 1: Data preprocessing

(1) Constructing an initial training sample set S': select a target area whose NCL has been manually labeled, traverse the NCL of that area, and add all cell pairs whose "mark" field is 1, that is, pairs confirmed to have a neighbor relation, to the training set as positive samples; add all cell pairs that are not listed in the NCL and satisfy the following condition to the training set as negative samples:

where $(x_1, y_1)$ and $(x_2, y_2)$ are the longitudes and latitudes of the base stations to which the two cells belong, obtained from the 'site longitude' and 'site latitude' attributes of the two cell entities in the preliminarily constructed wireless network knowledge graph of the target area; Rc = 6378137; maxDis is the preset maximum neighbor distance;
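The condition itself is not reproduced in this text, so the following is only a hedged sketch of how a distance between two sites could be computed from the quantities named above. It assumes that Rc is the Earth's radius in metres and uses the haversine formula, which may differ from the formula of the original condition; the function name is hypothetical, and how the result is compared with maxDis is deliberately left open.

```python
import math

RC = 6378137.0  # Rc from the text, assumed here to be the Earth's radius in metres

def site_distance(x1, y1, x2, y2, rc=RC):
    """Great-circle distance between two (longitude, latitude) points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (x1, y1, x2, y2))
    # Haversine formula (an assumed choice, not taken from the source).
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * rc * math.asin(math.sqrt(a))

# Two hypothetical sites one degree of latitude apart.
dist_m = site_distance(116.0, 39.0, 116.0, 40.0)
```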

(a) For the cell pair u and v, traversing the sampling point entities in the regional wireless network knowledge graph, extracting all sampling points of cell u and cell v according to the 'cell ID' attribute field, and storing them in the sets N(u) and N(v), respectively;

(b) Extracting from the sets N(u) and N(v) all sampling point pairs $(e_u, e_v)$ that satisfy the condition, where $e_u \in N(u)$, $e_v \in N(v)$, and the 'terminal ID' attribute values of sampling points $e_u$ and $e_v$ are the same;

(c) Extracting the reference signal received power of the sampling points from the MCS or MR data set and sorting the sampling point pairs, first by terminal ID in ascending order and then by the sum of the reference signal received powers of the two sampling points $e_u$ and $e_v$ in descending order, to obtain the following n ordered sampling point pairs: $(e_{u1}, e_{v1}), (e_{u2}, e_{v2}), \ldots, (e_{un}, e_{vn})$;

(d) Constructing a node link sequence matrix A[(3n+1)×d], where n is the maximum number of sampling point pairs that can be input and d is the maximum feature dimension;
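Sub-steps (a) to (c) above can be sketched as follows. This is a minimal illustration under assumptions: the SamplePoint record and its field names are hypothetical stand-ins for sampling point entities of the knowledge graph, and truncating the sorted list to at most n pairs is an assumed reading of "n ordered sampling point pairs". The construction of matrix A in (d) is not sketched because its row layout is not given in this passage.

```python
from dataclasses import dataclass

@dataclass
class SamplePoint:
    """Hypothetical minimal view of a sampling point entity."""
    cell_id: str
    terminal_id: int
    rsrp: float  # reference signal received power

def ordered_pairs(points, u, v, n):
    # (a) collect the sampling points of each cell by 'cell ID'
    N_u = [e for e in points if e.cell_id == u]
    N_v = [e for e in points if e.cell_id == v]
    # (b) keep pairs whose 'terminal ID' values are the same
    pairs = [(eu, ev) for eu in N_u for ev in N_v if eu.terminal_id == ev.terminal_id]
    # (c) sort by terminal ID ascending, then by summed RSRP descending
    pairs.sort(key=lambda p: (p[0].terminal_id, -(p[0].rsrp + p[1].rsrp)))
    return pairs[:n]  # assumption: keep at most n pairs

points = [
    SamplePoint("u", 2, -90.0), SamplePoint("u", 1, -100.0), SamplePoint("u", 1, -80.0),
    SamplePoint("v", 1, -95.0), SamplePoint("v", 2, -85.0), SamplePoint("w", 1, -70.0),
]
result = ordered_pairs(points, "u", "v", n=3)
```

In this toy data set, terminal 1 contributes two pairs (ordered by the larger power sum first) and terminal 2 contributes one; the sampling point of cell "w" is ignored.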

step 2: construction of CNN networks

(1) Input: the node link sequence matrix A[(3n+1)×d];

(2) Convolution layer: the weight matrix of a convolution kernel is $w \in \mathbb{R}^{h \times d}$, where $\mathbb{R}$ denotes the set of real numbers and h is the convolution kernel height; a plurality of convolution kernels with different heights are set by modifying h, and the convolution kernel width is the feature dimension d; the convolution kernel w slides from top to bottom over the sequence matrix A with a stride of 3, and the sequence feature $c_i$ is obtained by convolving the kernel w with the matrix region $x_{3i-2:3i+h-3}$:

$c_i = f(w \cdot x_{3i-2:3i+h-3} + b)$

where f is a nonlinear activation function such as the hyperbolic tangent, and $b \in \mathbb{R}$ is a bias term;

(4) Pooling layer: the sequence features are downsampled using average pooling; pooling each sequence feature yields a feature $z_i$, so that finally m features $z = \{z_1, z_2, \ldots, z_m\}$ are obtained;

$y = W_{dense} \cdot (z \circ r) + b_{dense}$

where $W_{dense}$ is a 2×N weight matrix; r introduces the Dropout operation and is a vector of 0s and 1s generated randomly with probability p, where p is 0.5; $b_{dense}$ is an N×1 bias vector;

(6) Softmax layer: the input is y and the output is the 2×1 vector $prob = [p_1, p_2] = \mathrm{softmax}(y)$, where $p_1$ and $p_2$ respectively represent the probabilities that the two cells have and do not have a neighbor relation;
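The softmax layer in (6) can be illustrated with a short sketch. The function below is a standard softmax; subtracting the maximum before exponentiating is a common numerical safeguard that is not mentioned in the text, and the input values are made up for illustration.

```python
import math

def softmax(y):
    """Map a vector of scores to probabilities that sum to 1."""
    m = max(y)                              # shift by the max for numerical stability
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output y of the fully connected layer.
p1, p2 = softmax([0.21, 0.16])
```

Because the first score is the larger one, p1 (neighbor relation exists) comes out slightly above p2.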

step 3: CNN model training

Training a CNN model by using a sequence matrix training set S with labels;

(2) Defining the hyperparameters of the training process: maximum number of iterations, stop-iteration threshold and learning rate, with cross entropy used as the loss function;

(3) Sequentially inputting all sequence matrices in the labeled sequence matrix training set S into the model, calculating the loss value Loss from the true labels of the training samples and the model predictions, performing layer-by-layer back propagation, and iteratively updating the parameters of each layer using stochastic gradient descent;

(4) When the change in the loss value between iterations is smaller than the set threshold, or the maximum number of iterations is reached, the training of the CNN model is finished;
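The stopping rule of step 3 can be sketched with a deliberately small stand-in model. The example below is an assumption-laden toy, not the CNN of the method: a single weight and bias with a sigmoid output replace the network, so there is only one layer to update, and the function name, learning rate, threshold and training data are all hypothetical. It shows per-sample stochastic gradient descent with cross-entropy loss, stopping when the loss change falls below the threshold or the maximum number of iterations is reached.

```python
import math

def train(samples, lr=0.1, max_iters=1000, threshold=1e-6):
    """Toy SGD loop; a one-weight sigmoid model stands in for the CNN."""
    w, b = 0.0, 0.0
    prev_loss = None
    for it in range(1, max_iters + 1):
        loss = 0.0
        for x, t in samples:                      # feed the samples one at a time
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            loss += -(t * math.log(p) + (1 - t) * math.log(1 - p))
            grad = p - t                          # gradient of the loss w.r.t. the logit
            w -= lr * grad * x
            b -= lr * grad
        loss /= len(samples)
        if prev_loss is not None and abs(prev_loss - loss) < threshold:
            break                                 # loss change below the set threshold
        prev_loss = loss
    return w, b, loss, it

samples = [(-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1)]   # made-up labeled data
w, b, loss, it = train(samples)
```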

step 4: neighbor relation prediction

Selecting a target area whose NCL has not been manually labeled, converting the cell pairs to be predicted into sequence matrices according to (a)-(d) of step 1, inputting each sequence matrix into the CNN model trained in step 3, judging from the output probability whether each cell pair has a neighbor relation, and marking cell pairs with a neighbor relation as Y and cell pairs without a neighbor relation as N;
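The labeling in step 4 can be illustrated as follows. The text only states that the judgment is made according to the output probability, so the rule used here (mark Y when the neighbor probability is the larger of the two) is an assumption, and the cell names and probabilities are hypothetical.

```python
def label_pair(p1, p2):
    """Return 'Y' if the neighbor-relation probability p1 exceeds p2, else 'N'."""
    # Assumed decision rule; the source does not state a threshold.
    return "Y" if p1 > p2 else "N"

# Hypothetical model outputs [p1, p2] for two cell pairs.
predictions = {("cellA", "cellB"): (0.83, 0.17), ("cellA", "cellC"): (0.08, 0.92)}
labels = {pair: label_pair(*probs) for pair, probs in predictions.items()}
```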

step 5: post-processing

Updating the NCL of the target area according to the neighbor relation prediction result.

2. The neighbor relation prediction method based on the convolutional neural network as set forth in claim 1, wherein: the input data in (c) of step 1 (2) is an MCS or MR coverage sampling data set: the MCS data consists of data collected from a large number of user terminals, and the MR data consists of measurement information, collected from the base station equipment, that is reported by each terminal device under the base station.

3. The neighbor relation prediction method based on the convolutional neural network as set forth in claim 1, wherein: the preliminarily constructed wireless network knowledge graph of the target area in step 1 comprises at least base station entities, cell entities, terminal entities and sampling point entities together with their attributes, as well as membership, residence and association relations.

4. The neighbor relation prediction method based on the convolutional neural network as set forth in claim 1, wherein: the base station entity comprises at least the attribute feature fields base station ID, city, operator, network system, large area ID, administrative district, site longitude, site latitude, number of cells and base station type; the cell entity comprises at least the attribute feature fields cell ID, city, operator, network system, large area ID, base station ID, administrative district, site longitude, site latitude, base station type, direction angle, dip angle, physical cell ID, frequency point number, coverage rate and coverage radius; the terminal entity comprises at least the attribute feature fields terminal ID, terminal brand, terminal model, operator and network system; the sampling point entity comprises at least the attribute feature fields sampling point ID, terminal ID, sampling date, sampling time, longitude, latitude, city, administrative district, operator, network system, large area ID, base station ID, cell ID, physical cell ID, frequency point number, reference signal received power, reference signal received quality, signal-to-interference-plus-noise ratio, terminal brand and terminal model; the knowledge graph data is stored in a graph database.

5. The neighbor relation prediction method based on the convolutional neural network as set forth in claim 1, wherein: the fields of the current actual neighbor cell list (NCL) of each base station in the target area comprise at least NCLID, head cell ID, tail cell ID and mark; the 'mark' field records the correct neighbor relations between the cells in the target area after manual labeling, where 1 indicates that a correct neighbor relation exists and blank indicates that it has not been determined whether a correct neighbor relation exists.