CN114863093A - Neural network training method based on eye movement technology and building design method and system - Google Patents

Neural network training method based on eye movement technology and building design method and system Download PDF

Info

Publication number
CN114863093A
CN114863093A (application CN202210603543.7A; granted publication CN114863093B)
Authority
CN
China
Prior art keywords
building
data
neural network
design
eye movement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210603543.7A
Other languages
Chinese (zh)
Other versions
CN114863093B (en)
Inventor
邱鲤鲤
刘佳桐
王珍珍
陈兆其
李梅
李君楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202210603543.7A priority Critical patent/CN114863093B/en
Publication of CN114863093A publication Critical patent/CN114863093A/en
Application granted granted Critical
Publication of CN114863093B publication Critical patent/CN114863093B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a neural network training method based on eye movement technology, together with a building design method and system. The scheme takes building elevation maps and the eye movement data corresponding to different demographic information as training data; the first neural network obtained by training can be used to predict the attention hot spots, visual focuses and eye movement tracks of people with different demographic information on a building elevation map. These data can be used to assist building design and help locate the focus of attention, so that building designers can design the areas people pay most attention to in finer detail. The scheme also trains a second neural network with a second training data packet; the second neural network can be further applied to the design of building elevation maps that contain blank areas, providing humanized and flexible assistance for building design. The scheme is reliable to implement, the data sources for neural network training are wide, and the model, once trained to convergence, has good prospects for application and popularization in building design assistance.

Description

Neural network training method based on eye movement technology and building design method and system
Technical Field
The invention relates to the fields of vision technology, architectural design and neural-network-aided design, and in particular to a neural network training method based on eye movement technology, and an architectural design method and system.
Background
With the wide application of neural network (AI) technology in the field of building design, some aided-design software has introduced AI algorithms to provide reference suggestions for designers, mainly by automatically generating building configuration schemes. When architects arrange buildings, they design according to industry knowledge, and a set of logic rules underlies that design. AI technology can be introduced to learn this set of logic rules; once the model is trained to convergence, it can be used to suggest automatically generated building configuration schemes. Specifically, during training most AI neural networks map information from the real world into quantitative data, find the connections between them, and summarize and apply rules, forming neural networks for different scenarios based on the training data.
At present, in the era of artificial intelligence and big data, there are increasing literature reports in which a machine learning model is built so that a computer can fit design rules, and the trained model is then applied to new designs. Although computers can learn from and analyze a large number of construction drawings and also take economic, scientific, comfort and other indexes into account, they still lack a perceptual understanding of building design; mainly, they lack one layer of understanding of human perception. As building design practice becomes more humanized and refined, architects and planners urgently need more means and methods to deeply understand how people perceive the environment and how the environment affects people, so as to guide their designs. Human perception is mainly reflected in visual behavior, and eye movement experiments can accurately record a person's subtle eye movement behavior. Combining research on human visual behavior through eye movement technology, which represents the degree to which people perceive a place, with new algorithms in the field of data analysis makes it possible to deeply analyze the interrelation of environmental elements in the built environment and brings new possibilities to the design development, scheme evaluation and other aspects of urban design.
If eye movement technology can be introduced into the neural network training of building aided design, and the accuracy and reliability of the neural network are achieved by integrating data corresponding to different demographic information, the help to building design assistance will be significant, and the intervention of AI in building design will gain positive practical significance.
Disclosure of Invention
In view of the above, the present invention is directed to a neural network training method based on eye movement technology, and a building design method and system, which are reliable to implement, flexible to operate, efficient in response and humanized.
In order to achieve the technical purpose, the technical scheme adopted by the invention is as follows:
a neural network training method based on eye movement technology, comprising:
S01, inputting a building elevation map, and setting the building elevation map as first training data;
S02, displaying the first training data in the sight of the tester, recording the eye movement condition when the tester watches the first training data, and generating eye movement data;
S03, acquiring eye movement data, generating attention hot spot data, visual focus data and eye movement track data according to the eye movement data, and setting the attention hot spot data, the visual focus data and the eye movement track data as analysis data in a correlation mode;
S04, acquiring demographic information of the tester corresponding to the analysis data, and associating the analysis data, the first training data and the demographic information of the tester to generate a first training data packet;
S05, acquiring a first training data packet, inputting the first training data packet into a neural network for training, and acquiring a trained neural network;
and S06, inputting the test data into the trained neural network to obtain an output result, and converging the model when the output result meets a preset condition to finish the training of the neural network to obtain a first neural network.
As a possible implementation, in S03 of this scheme, the eye movement data are analyzed by the BeGaze analysis software to generate the attention hot spot data, the visual focus data and the eye movement trajectory data.
As a possible implementation, in S04 of this scheme, the demographic information includes one or more of the age, education background, occupation and ethnicity of the tester.
As a possible implementation, in step S06, the test data is a test data packet extracted from a plurality of first training data packets or created separately; the test data packet contains a building elevation map, demographic information data, and analysis data corresponding one to one to the demographic information data, and the test data takes the building elevation map and the demographic information data as input items and the analysis data as the reference output item;
after the input items are input into the trained neural network, an output result is obtained and matched against the reference output item;
when the matching value meets the preset value, the model converges and the neural network training is completed, yielding the first neural network;
when the matching value does not meet the preset value, the method returns to S05.
Regarding the acquisition of eye movement data, this scheme relies on prior-art eye movement data acquisition equipment, whose general working principle, operating procedure and functions are briefly introduced as follows:
In the eye movement experiment, a German SMI eye tracker is used to collect eye movement data. Data acquisition is divided into eight steps: 1. connect the eye tracker to the recorder; 2. start the recorder; 3. create a new experiment task; 4. set the conventional parameters; 5. enter the demographic information; 6. wear the equipment correctly; 7. perform three-point calibration; 8. acquire the data. After these eight steps, the eye movement information of one subject has been collected. Since subjects differ in age, educational background and so on, different subjects need to be recruited during the experiment.
After the eye movement data are collected, they are analyzed with the BeGaze analysis software, which yields a series of analysis plots, for example visual focus maps, attention hot-spot maps and eye movement trajectory maps. In eye movement application studies there are several commonly used indicators: 1. visual indicators, namely the analysis charts; 2. statistical analysis indicators, comprising basic indicators (fixations, saccades and the like) and composite indicators.
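To make the basic indicators mentioned above concrete, the following is a minimal sketch of a dispersion-threshold (I-DT) fixation filter applied to raw gaze samples. It only illustrates how fixations can be derived from gaze coordinates; it is not the BeGaze implementation, and the sample format and threshold values are assumptions.

```python
# Minimal dispersion-threshold (I-DT) fixation detection sketch.
# NOT the SMI BeGaze implementation; the sample format and the thresholds
# (100 ms minimum duration, 35 px dispersion) are illustrative assumptions.
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (timestamp_s, x_px, y_px)

def detect_fixations(samples: List[Sample],
                     max_dispersion_px: float = 35.0,
                     min_duration_s: float = 0.10) -> List[dict]:
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Grow a window until it covers the minimum fixation duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration_s:
            j += 1
        if j >= n:
            break
        xs = [s[1] for s in samples[i:j + 1]]
        ys = [s[2] for s in samples[i:j + 1]]
        if (max(xs) - min(xs)) + (max(ys) - min(ys)) <= max_dispersion_px:
            # Extend the window while the dispersion stays below the threshold.
            while j + 1 < n:
                xs.append(samples[j + 1][1]); ys.append(samples[j + 1][2])
                if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion_px:
                    xs.pop(); ys.pop()
                    break
                j += 1
            fixations.append({"start": samples[i][0], "end": samples[j][0],
                              "x": sum(xs) / len(xs), "y": sum(ys) / len(ys)})
            i = j + 1
        else:
            i += 1  # slide the window start and try again
    return fixations
```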
Principle and mechanism of visual attention: when the human brain and the visual nervous system process scene or image information, they do not treat all information equally but assign more visual attention to certain areas or objects. The visual attention mechanism is intended to mimic the way humans observe. Generally, when a person looks at a picture, besides grasping it as a whole, the person pays more attention to some of its local information. Concentrating the limited visual processing capacity on the region of interest improves observation efficiency.
From a physiological perspective, a person's capacity to process information is limited. In the visual field, the horizontal coverage of the human eyes is about 120 degrees, but only about 2 degrees belong to the clear foveal field, and the image outside the foveal region becomes blurred. That is, our gaze tends to select some objects and ignore others; this is visual attention.
Visual attention is not just a physiological concept; human vision is often related to what the mind is thinking about. The visual process of directing vision to specific individuals or locations in a scene while ignoring others is referred to as visual attention. The human visual attention mechanism comprises two basic mechanisms: bottom-up and top-down. The bottom-up attention mechanism is driven by external stimuli and features and is responsible for rapid, automatic and involuntary shifts of attention and gaze. The top-down mechanism is task driven, relies on experience and memory, and differs from person to person. Therefore, the information selection strategy of the human visual system uses the visual attention mechanism to guide people toward salient areas in massive data and to allocate resources to processing important information.
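As a toy illustration of the bottom-up mechanism just described, the sketch below computes a simple center-surround intensity contrast map, loosely in the spirit of classical saliency models. The Gaussian scales are assumptions chosen for illustration, not values taken from this patent.

```python
# Minimal bottom-up saliency sketch: center-surround intensity contrast.
# The sigma values are illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def intensity_saliency(gray: np.ndarray) -> np.ndarray:
    """gray: 2-D float array in [0, 1]; returns a saliency map in [0, 1]."""
    center = gaussian_filter(gray, sigma=2)     # fine scale ("center")
    surround = gaussian_filter(gray, sigma=16)  # coarse scale ("surround")
    sal = np.abs(center - surround)             # center-surround contrast
    sal -= sal.min()
    if sal.max() > 0:
        sal /= sal.max()                        # normalise to [0, 1]
    return sal

# Usage: peaks of intensity_saliency(elevation_gray) mark candidate
# bottom-up attention regions on a facade image.
```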
Through eye movement experiments, human visual attention can finally be calculated with a precise scientific method, so that we can learn how people find regions of interest and determine what they attend to, and thereby explain how they perceive the external environment. This helps us further clarify the relationship between visual quality improvement and spatial distribution.
Principle of the eye tracker: human eye movement is very subtle, and a dedicated scientific instrument, the eye tracker, is needed to collect eye movement data. The eye tracker has three components. The first is a scene camera, located between the two spectacle frames, which records the experimental scene from the subject's point of view. The second is a near-infrared light source that emits light which is reflected off the eyes. The third is an eye movement sensor that records the retinal and corneal reflections, calculates the location of the gaze, and superimposes it onto the video captured by the scene camera. With these three components, the infrared emitter shines infrared light into the eyes; the light reflected by the cornea stays constant while the light reflected by the pupil varies, so the subject's gaze position is recorded from the variation of the angle between the corneal reflection and the pupil reflection.
Specifically, in the pupil-corneal reflection method an eye camera captures an image of the eye and the pupil center position is obtained by image processing. The corneal reflection point (glint) is then used as a base point for the relative position of the eye camera and the eyeball, as shown in fig. 2, and the line-of-sight vector coordinates can be obtained from the pupil center obtained by image processing, so that the fixation point of the human eye is determined. On this basis, a mapping function between the vector formed by the pupil and the corneal reflection point and the fixation point on the screen is found through calibration procedures, and the point of interest the person gazes at on the screen is then tracked in real time by detecting the variation of the pupil-corneal vector, yielding the eye movement track and the interest results.
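The mapping function mentioned above can be illustrated with a small least-squares fit from pupil-corneal-reflection vectors to screen coordinates. This is a generic sketch under an assumed second-order polynomial model and assumed data shapes, not the calibration routine of any particular eye tracker; a six-term model needs at least six calibration points.

```python
# Sketch: fit a mapping from pupil-glint vectors to screen gaze points
# with a second-order polynomial and least squares (assumed model).
import numpy as np

def fit_gaze_mapping(pg_vectors: np.ndarray, screen_points: np.ndarray) -> np.ndarray:
    """
    pg_vectors:    (N, 2) pupil-corneal-reflection difference vectors (vx, vy)
    screen_points: (N, 2) known calibration targets (sx, sy), N >= 6
    Returns a (6, 2) coefficient matrix for the basis [1, vx, vy, vx*vy, vx^2, vy^2].
    """
    vx, vy = pg_vectors[:, 0], pg_vectors[:, 1]
    A = np.column_stack([np.ones_like(vx), vx, vy, vx * vy, vx ** 2, vy ** 2])
    coeffs, *_ = np.linalg.lstsq(A, screen_points, rcond=None)
    return coeffs

def map_gaze(coeffs: np.ndarray, pg_vector) -> np.ndarray:
    vx, vy = pg_vector
    basis = np.array([1.0, vx, vy, vx * vy, vx ** 2, vy ** 2])
    return basis @ coeffs  # estimated (sx, sy) gaze point on the screen
```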
Three-point calibration in data acquisition: taking the SMI ETG eye tracker as an example, 3-point calibration is the process of matching the gaze point collected by the ETG with the subject's actual gaze point. It requires 3 calibration points, which must form a triangle and must not lie on the same straight line. The subject is told the exact positions of the 3 calibration points. The subject then gazes at the first calibration point, the operator clicks the screen and moves the cross mark (the fixation point collected by the eye tracker) to the calibration point (the position the subject is actually gazing at), and finally the 3-point calibration is completed for the remaining points in sequence.
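For the 3-point case specifically, three non-collinear points are exactly enough to determine an affine correction between the gaze points reported by the tracker and the points the subject actually fixated; the sketch below solves that minimal system directly. The numeric points in the usage comment are made up for illustration.

```python
# Sketch of the minimal geometry behind 3-point calibration: three non-collinear
# calibration points determine an affine map from tracker coordinates to the
# positions the subject actually fixated.
import numpy as np

def fit_affine_3pt(tracker_pts: np.ndarray, true_pts: np.ndarray) -> np.ndarray:
    """Both inputs are (3, 2); returns a (3, 2) matrix M so that
    [x, y, 1] @ M approximates the true gaze position."""
    A = np.column_stack([tracker_pts, np.ones(3)])  # (3, 3) design matrix
    return np.linalg.solve(A, true_pts)             # exact for 3 non-collinear points

# Usage (illustrative values):
# tracker = np.array([[310, 205], [705, 215], [500, 600]], dtype=float)
# actual  = np.array([[320, 200], [700, 210], [505, 595]], dtype=float)
# M = fit_affine_3pt(tracker, actual)
# corrected = np.array([512.0, 400.0, 1.0]) @ M
```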
Two deep learning techniques:
1. image study based on attention model
The visual attention mechanism refers to humans automatically processing regions of interest, referred to as salient regions, while selectively ignoring regions of non-interest when facing a scene. In the field of computer vision, research related to "attention" is roughly divided into two directions according to purpose: saliency detection, which aims purely at finding saliency, and the visual attention model (also called the focusing model), which uses the attention mechanism for some further task. Both take the simulation of human eye attention as their core research content and aim to make the model achieve targeted "focusing", paying different amounts of attention to different positions in the input scene. The visual attention model is a core module built around the attention mechanism and is used to locate the significantly different areas that can represent different objects. For example, given a test image (see fig. 1) whose left part is the original image, the attention-focused region shown in the right part can be predicted by image learning based on the attention model.
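To make the idea of targeted "focusing" concrete, the following is a minimal spatial-attention block in the style of common attention modules (for example, the spatial branch of CBAM), written with PyTorch. It is an illustrative example of the general mechanism, not the specific network architecture of this patent.

```python
# Minimal spatial-attention block sketch (CBAM-style spatial branch).
# Illustrative only; not the architecture claimed in the patent.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from a backbone network
        avg_map = x.mean(dim=1, keepdim=True)    # channel-wise average
        max_map, _ = x.max(dim=1, keepdim=True)  # channel-wise maximum
        attn = self.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                          # re-weight salient regions

# Usage: attended = SpatialAttention()(backbone_features)
```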
Based on the above scheme, the present invention further provides a building design method, which uses the neural network training method based on eye movement technology described above and comprises:
A01, acquiring a building elevation and demographic information to be processed to generate data to be processed;
A02, inputting data to be processed into a first neural network for eye movement data prediction to obtain a prediction result;
A03, according to the prediction result, obtaining attention hot spot data, visual focus data and eye movement track data of the building elevation to be processed under the corresponding demographic information;
and A04, outputting the building design auxiliary information according to the attention point data, the visual focus data and the eye movement track data.
As a possible implementation, the present solution further includes:
B01, building a building design database, a design learning database and a demographic information database, wherein the building design database stores building design drawings of different styles and specifications, the design learning database stores a plurality of building elevation drawings formed by various building arrangements, and the demographic information database stores a plurality of pieces of demographic information;
B02, importing the building elevation map and preset demographic information in the design learning database into a first neural network for eye movement data prediction to obtain a prediction result, wherein the prediction result comprises attention point data, visual focus data and eye movement track data; then according to the attention hot spot data, the visual focus data and the eye movement track data, positioning a corresponding area in the building elevation map and extracting image features of a preset area range;
B03, importing the extracted image features into a detection neural network to identify buildings and building styles in the image, obtaining building detection results, then matching the building detection results with visual focus data to obtain focus buildings and setoff buildings, and associating the focus buildings and the setoff buildings;
B04, acquiring the specifications of the focus building and the setoff building, respectively associating the specifications to generate focus building data and setoff building data, and then associating the focus building data, the setoff building data and corresponding demographic information to generate a second training data packet;
B05, acquiring a second training data packet, inputting the second training data packet into the neural network for training, and acquiring a trained neural network;
and B06, inputting the test data into the trained neural network to obtain an output result, converging the model when the output result meets a preset condition, finishing the training of the neural network, and obtaining a second neural network, wherein the second neural network is used for outputting the focus building suggestion information according to the setoff building or outputting the setoff building suggestion information according to the focus building.
As a possible implementation, the present solution further includes:
B07, importing the architectural design elevation with the blank areas and carrying out area marking on the architectural design elevation to generate a to-be-processed architectural design elevation and a to-be-processed area on the to-be-processed architectural design elevation;
B08, extracting areas in preset adjacent ranges of areas to be processed on the building design elevation to be processed, then importing the extraction results into a detection neural network, and outputting building information by the detection neural network;
and B09, importing the building information and the preset demographic information into a second neural network, acquiring data output by the second neural network, and setting the data as the suggested building information of the to-be-processed area.
As a possible implementation, further, the suggested building information in B09 of this scheme is focus building suggestion information or setoff building suggestion information;
in addition, the building design database stores the building information pointed to by the focus building suggestion information or the setoff building suggestion information.
The scheme of the invention provides an automatic building facade design method based on the combination of eye movement technology and AI: it identifies the visual attention hot spots of users in the building facade environment through eye movement technology, analyzes the attention hot spot maps of different users, extracts features with a deep learning model and summarizes the rules. Combined with existing AI technology, the resulting automatic building facade design system becomes more effective and meets the requirements of actual building design.
Based on the above scheme, the present invention also provides a building design system, which includes:
the system comprises a database unit, a design learning database and a demographic information database, wherein the database unit is used for constructing a building design database, the design learning database and the demographic information database, building design drawings with different styles and specifications are stored in the building design database, a plurality of building elevation drawings formed by various building arrangements are stored in the design learning database, and a plurality of pieces of demographic information are stored in the demographic information database;
the first neural network unit is used for predicting eye movement data of the building elevation map and preset demographic information in the imported design learning database to obtain a prediction result, and the prediction result comprises attention point data, visual focus data and eye movement track data;
the feature extraction unit is used for positioning a corresponding area in the building elevation map and extracting image features of a preset area range according to the attention hot spot data, the visual focus data and the eye movement track data output by the first neural network unit, and is also used for extracting an area in a preset adjacent range of an area to be processed on the building design elevation map to be processed;
the data scheduling unit is used for importing the building elevation map and preset demographic information in the design learning database into a first neural network, and is also used for importing the image features extracted by the feature extraction unit into a detection neural network to identify the building and the building style in the image;
the detection neural network unit is used for identifying the buildings and the architectural styles in the images of the image features extracted by the feature extraction unit to obtain building detection results, and is also used for detecting areas in preset adjacent ranges of areas to be processed on the design elevation of the building to be processed and outputting building information;
the data association unit is used for matching the building detection result output by the detection neural network unit with the visual focus data to obtain a focus building and a setoff building and associating the focus building and the setoff building; the system is also used for acquiring the specifications of the focus building and the setoff building, respectively associating the specifications with the focus building and the setoff building to generate focus building data and setoff building data, and then associating the focus building data, the setoff building data and corresponding demographic information to generate a second training data packet;
and a second neural network unit, which is obtained by training with the second training data packet and is used for outputting the suggested building information for the area to be processed on the building design elevation to be processed according to the building information and the preset demographic information.
Based on the foregoing solution, the present invention further provides a computer-readable storage medium, where at least one instruction, at least one program, a code set, or an instruction set is stored in the storage medium, and the at least one instruction, at least one program, a code set, or an instruction set is loaded by a processor and executed to implement the building design method.
By adopting the technical scheme, compared with the prior art, the invention has the following beneficial effects. The scheme ingeniously obtains a first neural network by importing building elevation maps and the eye movement data corresponding to different demographic information into a neural network for training. The first neural network can be used to predict the attention hot spot data, visual focus data and eye movement track data of people with different demographic information on a building elevation map, so it can be applied to building design assistance: it helps a building designer design for people or styles of different demographic information and assists in locating the center of attention, so that the designer can design the areas people focus on in finer detail. With the help of the first neural network, part of the data in the second training data packet can be located automatically, and after feature extraction and feature detection are combined, the second training data packet can be obtained. The second neural network obtained by training with the second training data packet can be used to assist building design and to push suggestions of focus buildings or setoff buildings to designers for the blank areas of a building elevation map.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the eye movement technique for extracting attention hot spot data according to the present embodiment;
FIG. 2 is a schematic flow chart of a neural network training method based on the eye movement technology according to the embodiment of the present invention;
FIG. 3 is a schematic view of an operation flow of acquiring eye movement data of a tester by an eye movement technique and importing the data into neural network training according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of an embodiment of the present invention, in which a building design method outputs building design assistance information through a first neural network;
FIG. 5 is a schematic flow chart of briefly guiding the implementation of the blank part of the building design elevation after the first neural network and the second neural network are used in combination, according to an embodiment of the invention;
fig. 6 is a schematic diagram of the connection of the elements of the architectural design system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It should be noted that the following examples only illustrate the present invention and do not limit its scope. Likewise, the following examples are only some, not all, of the possible examples, and all other examples obtained by those skilled in the art without inventive work fall within the scope of the present invention.
As shown in fig. 2, the embodiment of the present invention provides a neural network training method based on an eye movement technology, which includes:
S01, inputting a building elevation map, and setting the building elevation map as first training data;
S02, displaying the first training data in the sight of the tester, recording the eye movement condition when the tester watches the first training data, and generating eye movement data;
S03, acquiring eye movement data, generating attention hot spot data, visual focus data and eye movement track data according to the eye movement data, and setting the attention hot spot data, the visual focus data and the eye movement track data as analysis data in a correlation mode;
S04, acquiring demographic information of the testers corresponding to the analysis data, and associating the analysis data, the first training data and the demographic information of the testers to generate a first training data packet;
S05, acquiring a first training data packet, inputting the first training data packet into a neural network for training, and acquiring a trained neural network;
and S06, inputting the test data into the trained neural network to obtain an output result, and converging the model when the output result meets a preset condition to finish the training of the neural network to obtain a first neural network.
The first neural network obtained through training can be used for conducting visual attention prediction on the building elevation, namely, possible attention focuses, eye movement tracks and attention hot spot data of people with different demographic information on the building elevation are predicted.
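One way to organise the first training data packet described in S01 to S04 is sketched below; the field names and types are assumptions introduced for illustration and are not prescribed by the patent.

```python
# Sketch of a "first training data packet": an elevation image, the tester's
# demographic information and the associated analysis data. Field names are
# illustrative assumptions.
from dataclasses import dataclass
import numpy as np

@dataclass
class Demographics:
    age: int
    education: str
    occupation: str
    ethnicity: str

@dataclass
class AnalysisData:
    attention_heatmap: np.ndarray  # (H, W) attention hot-spot map
    visual_focus: list             # [(x, y), ...] visual focus points
    scanpath: list                 # [(x, y, duration_s), ...] eye movement track

@dataclass
class FirstTrainingPacket:
    elevation_image: np.ndarray    # building elevation map, e.g. (H, W, 3)
    demographics: Demographics
    analysis: AnalysisData         # reference output used during training
```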
In the aspect of acquiring the eye movement data, as shown in fig. 3, in step S03 the eye movement data are analyzed with the BeGaze analysis software to generate the attention hot spot data, the visual focus data and the eye movement trajectory data.
For the demographic information mentioned in this scheme, in S04 the demographic information includes one or more of the age, education background, occupation and ethnicity of the tester.
In order to facilitate data extraction, in S06 of this scheme the test data is extracted from a plurality of first training data packets; each test data packet includes a building elevation map, demographic information data, and analysis data corresponding one to one to the demographic information data, and the test data takes the building elevation map and the demographic information data as input items and the analysis data as the reference output item;
after the input items are input into the trained neural network, an output result is obtained and matched against the reference output item;
when the matching value meets the preset value, the model converges and the neural network training is completed, yielding the first neural network;
when the matching value does not meet the preset value, the method returns to S05.
The test data of the present scheme is not limited to be extracted from the first training data packet, but may be a separately established test data packet.
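The matching step in S06 can be pictured as comparing predicted and reference attention heat maps and checking the average matching value against the preset value, as in the sketch below. The Pearson-correlation metric and the 0.8 threshold are illustrative assumptions; the patent does not fix a particular matching metric.

```python
# Sketch of the S06 convergence check: match predicted heat maps against the
# reference analysis data. Metric and threshold are assumed for illustration.
import numpy as np

def heatmap_match(pred: np.ndarray, ref: np.ndarray) -> float:
    """Pearson correlation between flattened heat maps, in [-1, 1]."""
    p, r = pred.ravel().astype(float), ref.ravel().astype(float)
    p = (p - p.mean()) / (p.std() + 1e-8)
    r = (r - r.mean()) / (r.std() + 1e-8)
    return float((p * r).mean())

def check_convergence(pred_maps, ref_maps, preset_value: float = 0.8) -> bool:
    scores = [heatmap_match(p, r) for p, r in zip(pred_maps, ref_maps)]
    return sum(scores) / len(scores) >= preset_value  # if False, return to S05
```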
Further referring to fig. 4, based on the above scheme, the present embodiment further provides a building design method, which uses the above neural network training method based on the eye movement technology and includes:
A01, acquiring a building elevation and demographic information to be processed to generate data to be processed;
A02, inputting data to be processed into a first neural network for eye movement data prediction to obtain a prediction result;
A03, according to the prediction result, obtaining attention hot spot data, visual focus data and eye movement track data of the building elevation to be processed under the corresponding demographic information;
and A04, outputting the building design auxiliary information according to the attention point data, the visual focus data and the eye movement track data.
The building design auxiliary information output by this scheme helps the designer locate the center of attention when designing buildings for people or styles of different demographic information, so that the building designer can design the areas people pay attention to in finer detail.
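A compact sketch of the A01 to A04 flow is given below: the elevation image and the encoded demographics are fed to the trained first network, and the three predictions are returned as building design auxiliary information. The first_nn.predict interface and the demographic encoding are assumptions made for illustration.

```python
# Sketch of A01-A04: predict attention data for an elevation and demographics.
# The model interface and the demographic encoding are assumed.
import numpy as np

def predict_design_assistance(first_nn, elevation_img: np.ndarray, demographics: dict) -> dict:
    # A01: assemble the data to be processed (illustrative encoding).
    demo_vec = np.array([
        demographics.get("age", 30) / 100.0,
        1.0 if demographics.get("education") == "university" else 0.0,
    ], dtype=np.float32)
    # A02: assumed interface returning the three predictions.
    heatmap, focus_points, scanpath = first_nn.predict(elevation_img, demo_vec)
    # A03/A04: package the predictions as design auxiliary information.
    return {
        "attention_hotspots": heatmap,
        "visual_focus": focus_points,
        "eye_movement_track": scanpath,
    }
```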
In addition to the above applications, as shown in fig. 5, the embodiment further includes:
B01, building a building design database, a design learning database and a demographic information database, wherein the building design database stores building design drawings of different styles and specifications, the design learning database stores a plurality of building elevation drawings formed by various building arrangements, and the demographic information database stores a plurality of pieces of demographic information;
B02, importing the building elevation map and preset demographic information in the design learning database into a first neural network for eye movement data prediction to obtain a prediction result, wherein the prediction result comprises attention point data, visual focus data and eye movement track data; then according to the attention hot spot data, the visual focus data and the eye movement track data, positioning a corresponding area in the building elevation map and extracting image features of a preset area range;
B03, importing the extracted image features into a detection neural network to identify buildings and building styles in the image, obtaining building detection results, then matching the building detection results with visual focus data to obtain focus buildings and setoff buildings, and associating the focus buildings and the setoff buildings;
B04, acquiring the specifications of the focus building and the setoff building, respectively associating the specifications to generate focus building data and setoff building data, and then associating the focus building data, the setoff building data and corresponding demographic information to generate a second training data packet;
B05, acquiring a second training data packet, inputting the second training data packet into the neural network for training, and acquiring a trained neural network;
B06, inputting the test data into the trained neural network to obtain an output result, converging the model when the output result meets a preset condition, finishing the training of the neural network, and obtaining a second neural network, wherein the second neural network is used for outputting the focus building suggestion information according to the setoff building or outputting the setoff building suggestion information according to the focus building;
B07, importing the architectural design elevation with the blank areas and carrying out area marking on the architectural design elevation to generate a to-be-processed architectural design elevation and a to-be-processed area on the to-be-processed architectural design elevation;
B08, extracting areas in preset adjacent ranges of areas to be processed on the building design elevation to be processed, then importing the extraction results into a detection neural network, and outputting building information by the detection neural network;
and B09, importing the building information and the preset demographic information into a second neural network, acquiring data output by the second neural network, and setting the data as the suggested building information of the to-be-processed area.
In this embodiment, building information data such as the suggested building information, the focus building information and the setoff building information may be stored in the database in advance and encoded to facilitate retrieval.
The suggested building information in B09 of this scheme is focus building suggestion information or setoff building suggestion information; in addition, the building design database stores the building information pointed to by the focus building suggestion information or the setoff building suggestion information.
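The records behind the second training data packet (B03 and B04) and the suggestion lookup in B09 could be organised as sketched below; the identifiers, fields and the second_nn.predict interface are assumptions, and the design database is modelled as a plain dictionary keyed by building code.

```python
# Sketch of second-training-packet records and the B09 suggestion lookup.
# Field names, codes and model interface are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class BuildingRecord:
    building_id: str   # code used to retrieve the design drawing from the database
    style: str         # style label produced by the detection network
    spec: dict         # specifications (height, width, materials, ...)

@dataclass
class SecondTrainingPacket:
    focus_building: BuildingRecord   # the building that draws the visual focus
    setoff_buildings: list           # surrounding (setoff) BuildingRecord items
    demographics: dict               # demographic information the packet is tied to

def suggest_for_blank_area(second_nn, neighbour_info: list, demographics: dict,
                           design_db: dict) -> BuildingRecord:
    """B09: ask the trained second network for a suggested building code,
    then resolve it against the building design database."""
    suggested_id = second_nn.predict(neighbour_info, demographics)  # assumed interface
    return design_db[suggested_id]
```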
As shown in fig. 6, based on the above solution, the present embodiment further provides a building design system, which includes:
the system comprises a database unit, used for constructing a building design database, a design learning database and a demographic information database, wherein building design drawings of different styles and specifications are stored in the building design database, a plurality of building elevation drawings formed by various building arrangements are stored in the design learning database, and a plurality of pieces of demographic information are stored in the demographic information database;
the first neural network unit is used for predicting eye movement data of the building elevation map and preset demographic information in the imported design learning database to obtain a prediction result, and the prediction result comprises attention point data, visual focus data and eye movement track data;
the feature extraction unit is used for positioning a corresponding area in the building elevation map and extracting image features of a preset area range according to the attention hot spot data, the visual focus data and the eye movement track data output by the first neural network unit, and is also used for extracting an area in a preset adjacent range of an area to be processed on the building design elevation map to be processed;
the data scheduling unit is used for importing the building elevation map and preset demographic information in the design learning database into a first neural network, and is also used for importing the image features extracted by the feature extraction unit into a detection neural network to identify the building and the building style in the image;
the detection neural network unit is used for identifying the buildings and the architectural styles in the images of the image features extracted by the feature extraction unit to obtain building detection results, and is also used for detecting areas in preset adjacent ranges of the areas to be processed on the design elevation of the building to be processed and outputting building information;
the data association unit is used for matching the building detection result output by the detection neural network unit with the visual focus data to obtain a focus building and a setoff building and associating the focus building and the setoff building; the system is also used for acquiring the specifications of the focus building and the setoff building, respectively associating the specifications with the focus building and the setoff building to generate focus building data and setoff building data, and then associating the focus building data, the setoff building data and corresponding demographic information to generate a second training data packet;
and a second neural network unit, which is obtained by training with the second training data packet and is used for outputting the suggested building information for the area to be processed on the building design elevation to be processed according to the building information and the preset demographic information.
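Finally, a minimal sketch of how these units might be wired together in code is given below; the class and method names (for example detector.detect, second_nn.predict and the crop helpers) are assumptions used to show the data flow between units, not interfaces defined by the patent.

```python
# Minimal sketch of wiring the system's units together; all helper names
# (detector.detect, second_nn.predict, elevation.crop, ...) are assumed.
class BuildingDesignSystem:
    def __init__(self, design_db, first_nn, detector, second_nn):
        self.design_db = design_db   # database unit (building design database)
        self.first_nn = first_nn     # first neural network unit
        self.detector = detector     # detection neural network unit
        self.second_nn = second_nn   # second neural network unit

    def fill_blank_area(self, elevation, blank_region, demographics):
        # Feature extraction unit: crop the preset neighbourhood of the blank region.
        neighbourhood = elevation.crop(blank_region.expanded())  # assumed helpers
        # Detection unit: identify neighbouring buildings and their styles.
        neighbour_info = self.detector.detect(neighbourhood)
        # Second network unit: propose building information for the blank area.
        suggested_code = self.second_nn.predict(neighbour_info, demographics)
        # Database unit: resolve the suggestion to a stored design drawing.
        return self.design_db[suggested_code]
```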
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be substantially or partially implemented in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a part of the embodiments of the present invention, and not intended to limit the scope of the present invention, and all equivalent devices or equivalent processes performed by the present invention through the contents of the specification and the drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A neural network training method based on an eye movement technology is characterized by comprising the following steps:
S01, inputting a building elevation map, and setting the building elevation map as first training data;
S02, displaying the first training data in the sight of the tester, recording the eye movement condition when the tester watches the first training data, and generating eye movement data;
S03, acquiring eye movement data, generating attention hot spot data, visual focus data and eye movement track data according to the eye movement data, and setting the attention hot spot data, the visual focus data and the eye movement track data as analysis data in a correlation mode;
S04, acquiring demographic information of the testers corresponding to the analysis data, and associating the analysis data, the first training data and the demographic information of the testers to generate a first training data packet;
S05, acquiring a first training data packet, inputting the first training data packet into a neural network for training, and acquiring a trained neural network;
and S06, inputting the test data into the trained neural network to obtain an output result, and converging the model when the output result meets the preset condition to finish the training of the neural network to obtain a first neural network.
2. The eye movement technology-based neural network training method of claim 1, wherein in S03, the eye movement data is analyzed by the BeGaze analysis software to generate the attention point data, the visual focus data and the eye movement trajectory data.
3. The eye movement technology-based neural network training method of claim 1, wherein in S04, the demographic information includes more than one of age, education background, occupation, and ethnicity of the tester.
4. An eye movement technology-based neural network training method according to any one of claims 1 to 3, wherein in S06, the test data is a test data packet extracted from or separately created from a plurality of first training data packets, the test data packet has a building elevation map, demographic information data, and analysis data corresponding to the demographic information data one to one, the test data has the building elevation map and the demographic information data as input items, and the analysis data as a reference output item;
after the input items are input into the trained neural network, an output result is obtained, the output result is matched with the reference output item,
when the matching value accords with a preset value, the model converges, and the neural network training is completed to obtain a first neural network;
and returning to S05 when the matching value does not accord with the preset value.
5. A building design method comprising the eye movement technology-based neural network training method of any one of claims 1 to 4, comprising:
A01, acquiring a building elevation and demographic information to be processed to generate data to be processed;
A02, inputting data to be processed into a first neural network for eye movement data prediction to obtain a prediction result;
A03, according to the prediction result, obtaining attention hot spot data, visual focus data and eye movement track data of the building elevation to be processed under the corresponding demographic information;
and A04, outputting the building design auxiliary information according to the attention point data, the visual focus data and the eye movement track data.
6. The building design method of claim 5, further comprising:
B01, building a building design database, a design learning database and a demographic information database, wherein the building design database stores building design drawings of different styles and specifications, the design learning database stores a plurality of building elevation drawings formed by various building arrangements, and the demographic information database stores a plurality of pieces of demographic information;
B02, importing the building elevation map and preset demographic information in the design learning database into a first neural network for eye movement data prediction to obtain a prediction result, wherein the prediction result comprises attention point data, visual focus data and eye movement track data; then according to the attention hot spot data, the visual focus data and the eye movement track data, positioning a corresponding area in the building elevation map and extracting image features of a preset area range;
B03, importing the extracted image features into a detection neural network to identify buildings and building styles in the image, obtaining building detection results, then matching the building detection results with visual focus data to obtain focus buildings and setoff buildings, and associating the focus buildings and the setoff buildings;
B04, acquiring the specifications of the focus building and the setoff building, respectively associating the specifications to generate focus building data and setoff building data, and then associating the focus building data, the setoff building data and corresponding demographic information to generate a second training data packet;
B05, acquiring a second training data packet, inputting the second training data packet into the neural network for training, and acquiring a trained neural network;
and B06, inputting the test data into the trained neural network to obtain an output result, converging the model when the output result meets a preset condition, finishing the training of the neural network, and obtaining a second neural network, wherein the second neural network is used for outputting the focus building suggestion information according to the setoff building or outputting the setoff building suggestion information according to the focus building.
7. A building design method according to claim 6, further comprising:
B07, importing the architectural design elevation with the blank areas and carrying out area marking on the architectural design elevation to generate an architectural design elevation to be processed and an area to be processed on the architectural design elevation to be processed;
B08, extracting areas in preset adjacent ranges of areas to be processed on the building design elevation to be processed, then importing the extraction results into a detection neural network, and outputting building information by the detection neural network;
and B09, importing the building information and the preset demographic information into a second neural network, acquiring data output by the second neural network, and setting the data as the suggested building information of the to-be-processed area.
8. A building design method according to claim 7, wherein said recommended building information in B09 is focused building recommended information or setoff building recommended information;
in addition, the building design database stores the focus building suggestion information or the building information pointed by the setoff building suggestion information.
9. A building design system, characterized in that it comprises:
the system comprises a database unit, a design learning database and a demographic information database, wherein the database unit is used for constructing a building design database, the design learning database and the demographic information database, building design drawings with different styles and specifications are stored in the building design database, a plurality of building elevation drawings formed by various building arrangements are stored in the design learning database, and a plurality of pieces of demographic information are stored in the demographic information database;
the first neural network unit is used for predicting eye movement data of the building elevation map and preset demographic information in the imported design learning database to obtain a prediction result, and the prediction result comprises attention point data, visual focus data and eye movement track data;
the feature extraction unit is used for positioning a corresponding area in the building elevation map and extracting image features of a preset area range according to the attention hot spot data, the visual focus data and the eye movement track data output by the first neural network unit, and is also used for extracting an area in a preset adjacent range of an area to be processed on the building design elevation map to be processed;
the data scheduling unit is used for importing the building elevation map and preset demographic information in the design learning database into a first neural network, and is also used for importing the image features extracted by the feature extraction unit into a detection neural network to identify the building and the building style in the image;
the detection neural network unit is used for identifying the buildings and the architectural styles in the images of the image features extracted by the feature extraction unit to obtain building detection results, and is also used for detecting areas in preset adjacent ranges of areas to be processed on the design elevation of the building to be processed and outputting building information;
the data association unit is used for matching the building detection result output by the detection neural network unit with the visual focus data to obtain a focus building and a setoff building and associating the focus building and the setoff building; the system is also used for acquiring the specifications of the focus building and the setoff building, respectively associating the specifications with the focus building and the setoff building to generate focus building data and setoff building data, and then associating the focus building data, the setoff building data and corresponding demographic information to generate a second training data packet;
and the second neural network unit, which is obtained by training with the second training data packet and is used for outputting the suggested building information of the area to be processed on the building design elevation to be processed according to the building information and the preset demographic information (an illustrative sketch of this pipeline is given after the claims).
10. A computer-readable storage medium, characterized in that at least one instruction, at least one program, a code set or an instruction set is stored in the storage medium, and is loaded and executed by a processor to implement the building design method according to any one of claims 4 to 8.
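The claims above describe a two-stage inference: a detection neural network turns a cropped elevation region into building information (step B08, the detection neural network unit), and a second neural network combines that building information with demographic information to produce suggested building information (step B09, the second neural network unit). The following is a minimal sketch of that data flow in Python/PyTorch, assuming placeholder architectures; the class names DetectionNet and SuggestionNet, the tensor shapes, and the CNN/MLP stand-ins are hypothetical, since the patent does not disclose concrete network structures or input encodings.

import torch
import torch.nn as nn


class DetectionNet(nn.Module):
    """Hypothetical stand-in for the detection neural network: maps a cropped
    elevation region to building-information logits (e.g. building-style scores)."""

    def __init__(self, num_styles: int = 8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(16, num_styles)

    def forward(self, region: torch.Tensor) -> torch.Tensor:
        # region: (batch, 3, H, W) crop taken around the area to be processed (B08)
        return self.head(self.backbone(region))


class SuggestionNet(nn.Module):
    """Hypothetical stand-in for the second neural network: combines detected
    building information with demographic features and outputs suggested
    building information (e.g. focus-building vs. setoff-building suggestion)."""

    def __init__(self, num_styles: int = 8, demo_dim: int = 4, num_suggestions: int = 2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_styles + demo_dim, 32),
            nn.ReLU(),
            nn.Linear(32, num_suggestions),
        )

    def forward(self, building_info: torch.Tensor, demographics: torch.Tensor) -> torch.Tensor:
        # Concatenate building information with demographic information (B09)
        return self.mlp(torch.cat([building_info, demographics], dim=-1))


def suggest_for_region(region, demographics, detector, suggester):
    """B08-B09: detect building information in the neighbourhood of the area to
    be processed, then combine it with demographics to obtain a suggestion."""
    building_info = detector(region)                # B08: detection network output
    return suggester(building_info, demographics)   # B09: suggested building information


if __name__ == "__main__":
    region = torch.randn(1, 3, 64, 64)       # placeholder cropped elevation region
    demographics = torch.randn(1, 4)          # placeholder demographic feature vector
    logits = suggest_for_region(region, demographics, DetectionNet(), SuggestionNet())
    print(logits.shape)                       # torch.Size([1, 2])

In the patented system these modules would correspond to the trained detection neural network and the network trained on the second training data packet of claim 9; the randomly initialised stand-ins above only illustrate how building information and demographic information are combined, not trained behaviour.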
CN202210603543.7A 2022-05-30 2022-05-30 Neural network training method based on eye movement technology and building design method and system Active CN114863093B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210603543.7A CN114863093B (en) 2022-05-30 2022-05-30 Neural network training method based on eye movement technology and building design method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210603543.7A CN114863093B (en) 2022-05-30 2022-05-30 Neural network training method based on eye movement technology and building design method and system

Publications (2)

Publication Number Publication Date
CN114863093A true CN114863093A (en) 2022-08-05
CN114863093B CN114863093B (en) 2024-05-31

Family

ID=82640608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210603543.7A Active CN114863093B (en) 2022-05-30 2022-05-30 Neural network training method based on eye movement technology and building design method and system

Country Status (1)

Country Link
CN (1) CN114863093B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9775512B1 (en) * 2014-03-19 2017-10-03 Christopher W. Tyler Binocular eye tracking from video frame sequences
CN110070624A * 2019-04-26 2019-07-30 Xiamen University Method for recognizing cityscape features based on VR combined with eye tracking
CN110460837A * 2018-05-07 2019-11-15 Apple Inc. Electronic device with foveated display and gaze prediction
CN110737339A (en) * 2019-10-28 2020-01-31 福州大学 Visual-tactile interaction model construction method based on deep learning
CN111209811A (en) * 2019-12-26 2020-05-29 的卢技术有限公司 Method and system for detecting eyeball attention position in real time
CN111631736A (en) * 2020-06-03 2020-09-08 安徽建筑大学 Building indoor learning efficiency detection method and system
WO2020186883A1 (en) * 2019-03-18 2020-09-24 北京市商汤科技开发有限公司 Methods, devices and apparatuses for gaze area detection and neural network training
CN112890815A (en) * 2019-12-04 2021-06-04 中国科学院深圳先进技术研究院 Autism auxiliary evaluation system and method based on deep learning
CN113610145A (en) * 2021-08-03 2021-11-05 上海联影智能医疗科技有限公司 Model training method, image prediction method, training system, and storage medium
US20220027705A1 (en) * 2021-02-09 2022-01-27 Beijing Baidu Netcom Science Technology Co., Ltd. Building positioning method, electronic device, storage medium and terminal device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI NA; ZHAO XINBO: "A visual attention model integrating semantic object features", Journal of Harbin Institute of Technology, no. 05, 10 May 2020 (2020-05-10) *
QIU LILI et al.: "Research on Jiageng-style architecture based on eye-movement experiments: the case of Jiannan Hall, Xiamen University", China Real Estate (Late-Month Edition), 10 November 2021 (2021-11-10), pages 73-79 *

Also Published As

Publication number Publication date
CN114863093B (en) 2024-05-31

Similar Documents

Publication Publication Date Title
US11366517B2 (en) Human-computer interface using high-speed and accurate tracking of user interactions
JPH11175246A (en) Sight line detector and method therefor
Sundstedt et al. Visual attention and gaze behavior in games: An object-based approach
Wang et al. Automated student engagement monitoring and evaluation during learning in the wild
Emerson et al. Early prediction of visitor engagement in science museums with multimodal learning analytics
Panetta et al. Software architecture for automating cognitive science eye-tracking data analysis and object annotation
Kar et al. Gestatten: Estimation of User's attention in Mobile MOOCs from eye gaze and gaze gesture tracking
CN114090862A (en) Information processing method and device and electronic equipment
CN117576771B (en) Visual attention assessment method, device, medium and equipment
CN109634407B (en) Control method based on multi-mode man-machine sensing information synchronous acquisition and fusion
Li et al. Image understanding from experts' eyes by modeling perceptual skill of diagnostic reasoning processes
KR20200134751A (en) Learning performance prediction method based on learner's scan pattern in video learning environment
Chen et al. Gaze Gestures and Their Applications in human-computer interaction with a head-mounted display
Jeelani et al. Real-world mapping of gaze fixations using instance segmentation for road construction safety applications
Cazzato et al. Real-time gaze estimation via pupil center tracking
Wimmer et al. Facial expression recognition for human-robot interaction–a prototype
Zhang et al. Eye gaze estimation and its applications
Nadella Eyes detection and tracking and eye gaze estimation
Xu et al. Analyzing students' attention by gaze tracking and object detection in classroom teaching
CN114863093B (en) Neural network training method based on eye movement technology and building design method and system
Gutstein et al. Optical flow, positioning, and eye coordination: automating the annotation of physician-patient interactions
Lochbihler et al. Assessing Driver Gaze Location in a Dynamic Vehicle Environment
Vranceanu et al. A computer vision approach for the eye accesing cue model used in neuro-linguistic programming
CN113805704A (en) Vision treatment method and system based on VR technology
Bennett et al. Looking at faces: autonomous perspective invariant facial gaze analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant