WO2024079903A1 - Degree-of-importance assessment system, degree-of-importance assessment device, and degree-of-importance assessment method - Google Patents

Degree-of-importance assessment system, degree-of-importance assessment device, and degree-of-importance assessment method

Info

Publication number
WO2024079903A1
Authority
WO
WIPO (PCT)
Prior art keywords
region
importance
regions
feature
calculates
Prior art date
Application number
PCT/JP2022/038458
Other languages
French (fr)
Japanese (ja)
Inventor
勇人 逸身
浩一 二瓶
昌治 森本
フロリアン バイエ
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to PCT/JP2022/038458
Publication of WO2024079903A1

Definitions

  • The present invention relates to an importance determination system, an importance determination device, and an importance determination method.
  • Patent Document 1 discloses an image analysis device including a partial image division unit that reprojects the input image in a number of different directions and divides it into a number of partial images, a feature extraction unit that extracts features from each of the partial images, an importance calculation unit that calculates the importance of each position in the input image based on a predetermined regression model from the extracted features, an attention point likelihood distribution calculation unit that calculates the likelihood distribution of attention points based on a predetermined regression model from the calculated importance, and an attention point calculation unit that calculates attention points based on the likelihood distribution of the attention points.
  • Patent Document 1 describes calculating the importance of each position in an input image based on a specified regression model, but it would be useful to provide a technology that determines the importance with greater accuracy.
  • One aspect of the present invention has been made in consideration of the above problems, and one of its objectives is to provide an importance determination system, an importance determination device, and an importance determination method that can determine importance with high accuracy.
  • The importance determination system is an importance determination system that determines the importance of multiple regions in one or more input images, and includes an identification means for identifying multiple regions in the one or more input images, a feature calculation means for calculating features of each region, and a determination means for generating relationship information indicating the relationship between the features of each region based on the features of each region, and determining the importance of each region based on the relationship information.
  • The importance determination device is an importance determination device that determines the importance of multiple regions in one or more input images, and includes an identification unit that identifies multiple regions in the one or more input images, a feature amount calculation unit that calculates the feature amount of each region, and a determination unit that generates relationship information indicating the relationship between the feature amounts of each region based on the feature amounts of each region, and determines the importance of each region based on the relationship information.
  • The importance determination method is a method for determining the importance of multiple regions in one or more input images, comprising: identifying multiple regions in the one or more input images; calculating a feature amount for each region; generating relationship information indicating the relationship between the feature amounts of each region based on the feature amounts of each region; and determining the importance of each region based on the relationship information.
  • FIG. 1 is a block diagram showing an example of the configuration of an importance determination system according to the first embodiment.
  • FIG. 2 is a flowchart showing an example of the flow of an importance determination method according to the first embodiment.
  • FIG. 3 is a block diagram showing an example of the configuration of an importance determination device according to the first embodiment.
  • FIG. 4 is a block diagram showing an example of the configuration of an importance determination system and a processing system according to the second embodiment.
  • FIG. 5 is a schematic diagram showing an example of an area identified by the identification means in the second embodiment.
  • FIG. 6 is a schematic diagram illustrating an example of a self-attention model.
  • FIG. 7 is a flow diagram illustrating an example of a learning method for generating a trained model.
  • FIG. 8 is a block diagram showing an example of the configuration of an importance determination system for executing the learning method.
  • FIG. 9 is a schematic diagram showing an example of an area identified by the identification means in the third embodiment.
  • FIG. 10 is a block diagram illustrating an example of the configuration of a computer.
  • Fig. 1 is a block diagram showing an example of the configuration of an importance determination system 100 according to a first embodiment.
  • The importance determination system 100 includes an identification means 101, a feature amount calculation means 102, and a determination means 103, and determines the importance of multiple regions in one or more input images.
  • The input image may be captured by a camera connected to the importance determination system 100, or may be transmitted to the importance determination system 100 via a network. Furthermore, if the input image is captured by a camera, the number of cameras may be one or more. Furthermore, the camera may be a spherical camera, a panoramic camera, etc.
  • Importance is an index used for a predetermined process based on an input image; for example, the manner of the process may be changed based on the importance, or the flow of data being processed may be changed based on the importance.
  • The predetermined process based on an input image is not particularly limited, but may be, for example, a process of analyzing an analysis target shown in the input image.
  • The analysis target is not particularly limited, but may be, for example, a worker (person) working at a construction site, work equipment (object), and the behavior (movement) of the worker and work equipment.
  • In this specification, "analysis" means detecting the occurrence of an event to be detected in the subject of analysis. For example, when the subject of analysis is a worker (person) or work equipment (object) working at a construction site, or the behavior (movement) of the worker or work equipment, the analysis results may include the detection of the occurrence of events such as inefficient work, procedural errors, and dangerous behavior.
  • When the importance is an index used for analyzing an input image, the importance may mean the necessity of performing the analysis.
  • In this case, a region in which an event to be detected is likely to occur or an object to be detected is likely to be present may be determined to have high importance.
  • Such "importance" may also be expressed as "attention level," "necessity of attention," "danger level," etc.
  • Examples of events with high importance include, but are not limited to, actions following a process, actions that deviate from a process, and actions with high danger.
  • Objects with high importance include, but are not limited to, people and heavy machinery. The importance may also be determined based on whether or not the target can be detected.
  • For example, the importance of a person or object that appears very small in the image and is difficult to detect may be reduced.
  • The method of expressing the importance is not particularly limited; for example, it may be expressed as a binary value of "0" (low importance) and "1" (high importance), as a multi-value of three or more levels (e.g., high, medium, low), or as a continuous numerical value.
  • The identification means 101 identifies multiple areas within one or more input images input to the importance determination system 100.
  • The method of identifying multiple regions by the identification means 101 is not particularly limited: it may identify regions corresponding to parts of the input image for which a preset importance is to be determined, regions surrounding objects detected by object detection processing of the input image, or regions obtained by dividing the input image at equal intervals.
  • The feature amount calculation means 102 calculates the feature amount of each area identified by the identification means 101.
  • The method for calculating the feature amount is not particularly limited, and various known algorithms can be used.
  • The determination means 103 generates relationship information indicating the relationship between the feature amounts of each region based on the feature amounts of each region calculated by the feature amount calculation means 102, and determines the importance of each region based on the relationship information.
  • In one aspect, the relationship information indicates the degree to which other regions are related to the importance of each region.
  • In other words, the relationship information indicates the relationship between regions such that the relationship is large for regions necessary for determining the importance of a given region, and small for regions not necessary for determining the importance of that region.
  • An example of such relationship information is the attention weight used in attention mechanisms such as a self-attention mechanism.
  • As described above, the importance determination system 100 can determine importance with high accuracy.
  • In particular, the importance can be determined using relationship information generated according to the input image, which allows the importance determination system 100 to determine importance with high accuracy.
  • (Flow of the importance determination method) The flow of the importance determination method S100 according to this embodiment will be described with reference to Fig. 2.
  • Fig. 2 is a flow diagram showing an example of the flow of the importance determination method S100 according to the first embodiment.
  • An importance determination system 100 executes the importance determination method S100.
  • In step S101, the identification means 101 identifies multiple regions in one or more input images.
  • In step S102, the feature amount calculation means 102 calculates the feature amount of each region.
  • In step S103, the determination means 103 generates relationship information indicating the relationship between the feature amounts of each region based on the feature amounts of each region, and determines the importance of each region based on the relationship information, as sketched below.
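  • As a non-limiting reference, the following Python sketch mirrors the three-step flow S101 to S103. All function names and return values are illustrative placeholders and are not defined in this disclosure.

        from typing import List, Tuple

        Region = Tuple[int, int, int, int]  # (x, y, width, height)

        def identify_regions(image) -> List[Region]:
            # Step S101: e.g., preset grid cells, detected objects, or equal splits.
            return [(0, 0, 32, 32), (32, 0, 32, 32)]

        def calc_features(image, regions: List[Region]) -> List[List[float]]:
            # Step S102: one feature vector per region (placeholder values).
            return [[0.0] * 4 for _ in regions]

        def determine_importance(features: List[List[float]]) -> List[float]:
            # Step S103: relationship information (e.g., attention weights) would
            # be generated from the features here; this stub returns uniform scores.
            return [1.0 / len(features)] * len(features)

        regions = identify_regions(image=None)
        features = calc_features(None, regions)
        print(determine_importance(features))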
  • In one aspect, the importance of each region is determined using a trained model that generates relationship information based on the features of the input image. This makes it possible to determine the importance using relationship information according to the input image, and therefore makes it possible to determine the importance with high accuracy.
  • Fig. 3 is a block diagram showing the configuration of the importance determination device 200 according to the first embodiment.
  • The importance determination device 200 includes an identification unit 201, a feature amount calculation unit 202, and a determination unit 203, and determines the importance of multiple regions in one or more input images.
  • The identification unit 201 has a function equivalent to the identification means 101, and identifies multiple regions in one or more input images.
  • The feature amount calculation unit 202 has a function equivalent to the feature amount calculation means 102, and calculates the feature amount of each region.
  • The determination unit 203 has a function equivalent to the determination means 103; it generates relationship information indicating the relationship between the feature amounts of each region based on the feature amounts of each region, and determines the importance of each region based on the relationship information.
  • The identification unit 201, the feature amount calculation unit 202, and the determination unit 203 may be realized by a computer device in which processing is performed by a processor executing a program stored in a memory.
  • The identification unit 201, the feature amount calculation unit 202, and the determination unit 203 may be realized by a single computer device, by a computer device group in which multiple computer devices operate in cooperation with each other, or by a server device group in which multiple server devices operate in cooperation with each other.
  • With this configuration, the importance determination device 200 can achieve the same effects as the importance determination system 100.
  • Furthermore, some functions may be distributed to a cloud server.
  • FIG. 4 is a block diagram showing an example of the configuration of an importance determination system 100 and a processing system 1 according to the second embodiment.
  • The importance determination system 100 according to the second embodiment includes an identification means 101, a feature amount calculation means 102, a determination means 103, and an analysis method control means 104.
  • The identification means 101 identifies a plurality of regions whose positions within the input image are preset.
  • FIG. 5 is a schematic diagram showing an example of a region identified by the identification means 101 in this embodiment.
  • The identification means 101 divides one frame F of the input image into predefined regions and identifies a plurality of regions R.
  • The regions R do not need to be of uniform size, and it is not necessary to identify regions from the entire frame F.
  • The identification means 101 does not need to identify as a region a portion that is known in advance not to include the analysis target (e.g., the sky or a building). For example, as shown in FIG. 5, the identification means 101 may identify a region R from a portion other than the portion A corresponding to the sky.
  • The identification means 101 may change the size of the identified region depending on the characteristics of the input image or the analysis target. For example, as shown in FIG. 5, the identification means 101 may make the size of the lower region R(2) showing the front side larger than the size of the upper region R(1) showing the back side in accordance with the camera's angle of view. Further, for example, the identified region may be larger for areas where a large analysis target may exist, and smaller for areas where a small analysis target may exist, as sketched below.
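  • As a non-limiting reference, the following Python sketch illustrates identifying preset regions whose size grows toward the bottom (front side) of the frame while skipping a band assumed to be sky, as in FIG. 5. The cell sizes, the sky band, and the function name are illustrative assumptions.

        from typing import List, Tuple

        Region = Tuple[int, int, int, int]  # (x, y, width, height)

        def preset_regions(width: int, height: int, sky_band: int = 100) -> List[Region]:
            regions: List[Region] = []
            y = sky_band  # skip the portion known in advance to be sky
            while y < height:
                # Rows lower in the frame (the front side) get larger cells,
                # like R(2) versus R(1) in FIG. 5.
                cell = 40 if y < height // 2 else 80
                for x in range(0, width, cell):
                    regions.append((x, y, min(cell, width - x), min(cell, height - y)))
                y += cell
            return regions

        print(len(preset_regions(640, 480)))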
  • The feature amount calculation means 102 calculates the feature amount of each region.
  • The method of calculating the feature amount is not particularly limited, but in one aspect, the feature amount calculation means 102 may include, in the feature amount of each region, an estimation result of the type of object contained in the region.
  • The type of object indicates, for example, whether the object is a person or a machine (a vehicle, heavy machinery, etc.).
  • The feature amount calculation means 102 may also include, in the feature amount of each region, the position of the region within the input image.
  • The representation format of the feature amount is not particularly limited, but can be, for example, a fixed-length vector.
  • For example, the feature amount of each region can be a fixed-length vector that combines position information indicating the position of the region within the input image and class information indicating the estimated type of object contained in the region.
  • The position information may be any information that indicates the position of the region within the input image, and may be calculated based on the pixel position, for example, with the top left corner of the input image being (0,0) and the bottom right corner being (1,1).
  • The position information may also include the size (width and height) of the region.
  • The class information may be any information that indicates the estimation result of the type of object contained in the region, for example, the result of identifying each region using an object identification model (class classification).
  • As the object identification model, for example, an object identification model trained using training data such as ImageNet can be used.
  • The representation format of the class information is not particularly limited, but may be, for example, a vector indicating the reliability that each identifiable type of object is contained in the region. For example, in the case of region R(2) in FIG. 5, it may be (car: 0.4, truck: 0.1, crane truck: 0.5, ..., person: 0).
  • The feature amount of each region calculated by the feature amount calculation means 102 is not limited to those described above.
  • For example, features calculated using a learning model with a convolutional layer, such as an auto-encoder, may be used. A sketch of the position-plus-class feature vector described above follows below.
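  • As a non-limiting reference, the following Python sketch builds the fixed-length feature vector described above by concatenating position information and class information. The class list, the stub classifier, and its confidence values are illustrative assumptions.

        import numpy as np

        CLASSES = ["car", "truck", "crane truck", "person"]

        def classify(crop) -> np.ndarray:
            # Stand-in for an object identification model (e.g., one trained on
            # ImageNet); returns one confidence per identifiable class.
            return np.array([0.4, 0.1, 0.5, 0.0])

        def region_feature(box, image_w: int, image_h: int, crop=None) -> np.ndarray:
            x, y, w, h = box
            position = np.array([x / image_w, y / image_h,   # top-left in [0, 1]
                                 w / image_w, h / image_h])  # size of the region
            return np.concatenate([position, classify(crop)])  # fixed-length vector

        print(region_feature((320, 240, 80, 60), 640, 480))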
  • The determination means 103 determines the importance of each region using a trained model M.
  • In one aspect, the trained model M is a trained model that calculates a first matrix from input data combining the features of each region and a first parameter trained by machine learning in advance, calculates a second matrix from the input data and a second parameter trained by machine learning in advance, calculates relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
  • In other words, the trained model M includes one or more layers that calculate a first matrix from input data combining the features of each region and a first parameter trained by machine learning in advance, one or more layers that calculate a second matrix from the input data and a second parameter trained by machine learning in advance, one or more layers that calculate relationship information based on the first matrix and the second matrix, and one or more layers that calculate the importance of each region based on the relationship information.
  • The trained model M is not limited to this, but in one aspect, a self-attention model may be used.
  • FIG. 6 is a schematic diagram outlining an example of a self-attention model.
  • The example shown in FIG. 6 uses 9 regions, a feature dimension of 1000, and key (first matrix) and query (second matrix) dimensions of 4, but neither the number of regions nor the numbers of dimensions are limited to these.
  • X is the input data that combines the features of each region. It is expressed as a (9, 1000) matrix in which the features of each of the nine regions, each with a dimension of 1000, are combined.
  • The input data X is multiplied by the attention parameter Wk (first parameter) and the attention parameter Wq (second parameter), respectively, to obtain the key XWk (first matrix) and the transposed query (XWq)^T (second matrix).
  • The attention parameter Wk (first parameter) and the attention parameter Wq (second parameter) are machine-learned parameters as described below, and are each expressed as a (1000, 4) matrix.
  • The obtained key XWk is a (9, 4) matrix, and the transposed query (XWq)^T is a (4, 9) matrix.
  • An attention weight A is generated as A = softmax((XWk)(XWq)^T / √d_in), where d_in indicates the number of dimensions of the feature.
  • The attention weight A indicates the degree to which the feature of each region is related to the importance of each region, and corresponds to the relationship information.
  • The importance determination result B(1, 9) is calculated by taking the sum of the attention weights A in the column direction.
  • Each column of the importance determination result B(1, 9) indicates the importance of the corresponding area, as sketched in the reference code below.
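  • As a non-limiting reference, the following NumPy sketch reproduces the computation described above for the FIG. 6 example. Random values stand in for the learned parameters Wk and Wq, and a row-wise softmax is assumed, since the disclosure does not fix these details.

        import numpy as np

        rng = np.random.default_rng(0)

        n_regions, d_in, d_k = 9, 1000, 4
        X = rng.normal(size=(n_regions, d_in))   # input data: features of 9 regions
        W_k = rng.normal(size=(d_in, d_k))       # attention parameter Wk (learned)
        W_q = rng.normal(size=(d_in, d_k))       # attention parameter Wq (learned)

        key = X @ W_k                            # (9, 4) first matrix
        query_t = (X @ W_q).T                    # (4, 9) second matrix

        scores = key @ query_t / np.sqrt(d_in)   # scaled scores, (9, 9)
        A = np.exp(scores - scores.max(axis=-1, keepdims=True))
        A = A / A.sum(axis=-1, keepdims=True)    # softmax: attention weights

        B = A.sum(axis=0, keepdims=True)         # (1, 9): column-direction sum
        print(B)  # each column is the importance of the corresponding region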
  • FIG. 7 is a flowchart showing an example of a learning method for generating a trained model M.
  • Machine learning for generating the trained model M can be performed using an importance determination system 300 as shown in FIG. 8.
  • The importance determination system 300 includes an identification means 101, a feature amount calculation means 102, a determination means 103, a learning means 105, and an analysis engine 106.
  • In step S1, learning data is input to the learning means 105.
  • As the learning data, an image with a label indicating the analysis result can be used.
  • The label may further include a reward value used in reinforcement learning.
  • Multiple labels indicating the analysis result may be attached to one piece of learning data.
  • For example, images such as those shown in FIG. 5 labeled with "heavy machinery approaching (10)" and "transportation work (1)" can be used as training data for generating a trained model M to be applied to a construction site.
  • Similarly, images labeled with "packaging work (1)," "installation work (5)," and "screw tightening work (10)" can be used as training data for generating a trained model M to be applied to factory work.
  • A high reward value may be set for events with a high priority to be detected by the analysis.
  • For example, images containing multiple people can be used as learning data, with human detection as the subject of analysis, and the reward value can be set so that if an area containing people is selected, a high reward is given (for example, +1 depending on the number of people contained), and if an empty area is selected, the reward is set to 0.
  • In step S2, the learning means 105 initializes the parameters of the machine learning model M' (for example, the attention parameter Wk (first parameter) and the attention parameter Wq (second parameter)).
  • In step S3, the learning means 105 determines whether there is any learning data to be applied next, and ends the learning if there is none.
  • In step S4, the identification means 101, the feature amount calculation means 102, and the determination means 103 perform importance determination in the same manner as the importance determination system 100, except that they use the learning data instead of the input image and the machine learning model M' instead of the trained model M.
  • In step S5, the learning means 105 processes the learning data based on the obtained importance of each area, extracting only the areas of high importance, or giving the areas of high importance high image quality and the other areas low image quality.
  • In step S6, the learning means 105 inputs the learning data processed in step S5 to the analysis engine 106 and identifies the analysis result. Note that images of multiple frames may be input to the analysis engine.
  • The learning means 105 then calculates a reward value according to the obtained analysis result. In other words, if the analysis result indicated by the label attached to the learning data is obtained, the learning means 105 may add the reward value set for that label.
  • In step S7, the learning means 105 performs reinforcement learning by updating the parameters (the attention parameter Wk (first parameter) and the attention parameter Wq (second parameter)) based on the reward value calculated in step S6.
  • The parameter update method may conform to a known reinforcement learning method.
  • In this way, a trained model M can be generated. A toy sketch of this loop follows below.
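  • As a non-limiting reference, the following self-contained Python toy mirrors the control flow of steps S1 to S7. The stand-in model, the threshold in step S5, and the reward-following update in step S7 are illustrative assumptions; the disclosure does not fix a specific reinforcement learning algorithm.

        from dataclasses import dataclass, field
        from typing import Dict, List, Tuple

        @dataclass
        class ToyModel:
            # Stand-in for the machine learning model M'; step S2 initializes
            # the attention parameters Wk and Wq (to zero here).
            params: Dict[str, float] = field(
                default_factory=lambda: {"Wk": 0.0, "Wq": 0.0})

            def determine_importance(self, regions: List[str]) -> List[float]:
                # Step S4: one importance score per region (toy computation).
                return [self.params["Wk"] + self.params["Wq"]] * len(regions)

        def degrade_low_importance(regions: List[str], scores: List[float]) -> List[str]:
            # Step S5: keep only high-importance regions (threshold is illustrative).
            return [r for r, s in zip(regions, scores) if s >= 0.0]

        def analysis_engine(regions: List[str]) -> List[str]:
            # Step S6: placeholder analysis that "detects" pre-labeled events.
            return [r for r in regions if r.startswith("event:")]

        def train(model: ToyModel,
                  data: List[Tuple[List[str], Dict[str, float]]],
                  lr: float = 0.1) -> ToyModel:
            for regions, rewards in data:                       # step S3
                scores = model.determine_importance(regions)    # step S4
                kept = degrade_low_importance(regions, scores)  # step S5
                detected = analysis_engine(kept)                # step S6
                reward = sum(rewards.get(d, 0.0) for d in detected)
                for k in model.params:                          # step S7 (toy update)
                    model.params[k] += lr * reward
            return model

        data = [(["event:heavy_machinery", "sky"], {"event:heavy_machinery": 10.0})]
        print(train(ToyModel(), data).params)  # parameters move with the reward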
  • Multiple types of trained models M may be generated. For example, different trained models M may be generated and used depending on the scene (construction site, factory work, etc.) or the time of day (morning, afternoon, night), etc.
  • The analysis method control means 104 controls the method of analysis of each area of the input image according to the importance of that area determined by the determination means 103.
  • Next, the processing system 1 that performs the analysis processing of the input image will be described.
  • The processing system 1 includes one or more first processing units 20 and a second processing unit 30.
  • FIG. 4 illustrates a configuration with one first processing unit 20, but there may be multiple first processing units 20.
  • Each of the first processing units 20 is connected to, for example, a camera or a sensor such as LiDAR (Light Detection and Ranging), and acquires one or more input images from the camera or sensor. It is sufficient for the input image to include the analysis target within the field of view of the image.
  • The analysis target is, for example, a worker (person) working at a construction site, work equipment (object), and the behavior (movement) of the worker and work equipment.
  • The first processing unit 20 may also be connected to multiple cameras, sensors, etc., and acquire multiple input images.
  • The first processing unit 20 may also acquire multiple input images from a single camera, etc.
  • The first processing unit 20 and the second processing unit 30 may each be configured with one or more computers.
  • The first processing unit 20 and the second processing unit 30 are capable of communicating via a network NW, and share the analysis processing of the input image.
  • The network NW may be wireless or wired; if wireless, it may be a wireless communication system such as Wi-Fi, LTE, 4G, or 5G.
  • In one aspect, the first processing unit 20 may be an edge processing unit, and the second processing unit 30 may be a cloud processing unit. Here, "edge" refers to a place where data is collected.
  • The first processing unit 20, which is an edge processing unit, is an information processing device (computer) or a group of information processing devices installed at or around the location where the analysis target is present (e.g., a construction site, a factory, etc.), and acquires an input image from a camera, a sensor, etc. installed at the location where the analysis target is present.
  • The first processing unit 20 may be integrated with a camera, a sensor, etc.
  • "Cloud" refers to a place where data is processed, stored, etc.
  • The second processing unit 30, which is a cloud processing unit, may be an information processing device (computer) or a group of information processing devices installed at a location that can provide large computational resources, such as a data center or a server farm.
  • Alternatively, the second processing unit 30 may be a processing unit located at a location connected to the first processing unit 20 via a network, such as a computational resource connected to a base station such as 5G (e.g., MEC (Multi-access Edge Computing)) or a server installed in an office at the site (on-premises server).
  • In one aspect, the first processing unit 20 may perform an analysis process on at least a portion of the one or more acquired input images to generate an analysis result.
  • The first processing unit 20 may also calculate features for at least a portion of the one or more acquired input images and transmit the calculated features to the second processing unit 30 via the network NW.
  • The first processing unit 20 may also transmit at least a portion of the one or more acquired input images to the second processing unit 30 via the network NW.
  • When the first processing unit 20 transmits the features or the input image to the second processing unit 30, it may compress or encrypt them before transmission, or it may transmit them without compression or encryption.
  • The second processing unit 30 receives the features or input image sent from the first processing unit 20, performs restoration processing as necessary, and performs analysis processing.
  • The analysis process is, for example, detection, identification, tracking, and time-series analysis of the analysis target (object, person) based on the input image.
  • A learning model may be used for this analysis process.
  • One or both of the first processing unit 20 and the second processing unit 30 may use the learning model.
  • In one aspect, the analysis method control means 104 controls the processing system 1 (i.e., the first processing unit 20 and the second processing unit 30) as follows.
  • The analysis method control means 104 may control the first processing unit 20 to cut out areas of high importance from the input image, deliver them to the second processing unit 30, and discard the rest. This makes it possible to reduce the bit rate by delivering only the parts of high importance when, for example, the communication bandwidth between the first processing unit 20 and the second processing unit 30 is reduced.
  • The analysis method control means 104 may control the first processing unit 20 to cut out areas of high importance from the input image and deliver them to the second processing unit 30, and to analyze the remainder in the first processing unit 20. This allows the second processing unit 30 to analyze the areas of high importance using a high-precision model, and the first processing unit 20 to analyze the remaining areas using a low-precision model.
  • The analysis method control means 104 may control the first processing unit 20 or the second processing unit 30, or both, to analyze only areas of high importance in the input image and discard the remaining areas. This makes it possible to focus the analysis on only the important parts when it is difficult to analyze all areas due to the computational load.
  • The analysis method control means 104 may control the first processing unit 20 or the second processing unit 30, or both, to analyze only areas of high importance in the input image with a high-precision model, and to analyze the remaining areas with a low-precision model. In this way, when it is difficult to analyze all areas with a high-precision model due to the computational load, only the areas of high importance are analyzed with the high-precision model, and the other areas are analyzed with a low-precision model. A sketch of such importance-based routing follows below.
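  • As a non-limiting reference, the following Python sketch illustrates one such control policy: routing high-importance regions to the second (cloud-side) processing unit and keeping the rest on the first (edge-side) processing unit. The region type and the 0.5 threshold are illustrative assumptions.

        from typing import Dict, List, Tuple

        Box = Tuple[int, int, int, int]  # (x, y, width, height)

        def route_regions(regions: List[Box],
                          importance: List[float],
                          threshold: float = 0.5) -> Dict[str, List[Box]]:
            # High-importance crops go to the second (cloud) processing unit for
            # high-precision analysis; the rest stay on the first (edge) unit.
            plan: Dict[str, List[Box]] = {"to_second_unit": [], "at_first_unit": []}
            for box, score in zip(regions, importance):
                key = "to_second_unit" if score >= threshold else "at_first_unit"
                plan[key].append(box)
            return plan

        print(route_regions([(0, 0, 64, 64), (64, 0, 64, 64)], [0.9, 0.2]))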
  • In this embodiment, the area for determining importance is not limited to a fixed size, and importance can be determined for any size depending on the characteristics of the input image or the subject of analysis.
  • Furthermore, the importance of each region is determined using a trained model that generates relationship information based on the features of the input image, so the importance can be determined using relationship information according to the input image, making it possible to determine the importance with high accuracy.
  • In one aspect, the analysis method control means 104 may control the processing system 1 (i.e., the first processing unit 20 and the second processing unit 30) to divide the analysis of the data to be analyzed between the first processing unit 20 and the second processing unit 30. Note that the analysis method control means 104 need not cause the processing system 1 to analyze data to be analyzed that is determined not to require analysis.
  • The analysis of the data to be analyzed can be shared between the first processing unit 20 and the second processing unit 30 in various ways: for example, the first processing unit 20 that acquires the data to be analyzed may perform all of the analysis; the first processing unit 20 may perform a certain amount of analysis and the second processing unit 30 the remaining analysis; or the first processing unit 20 may perform only the minimum necessary processing, such as compression, and the second processing unit 30 all of the analysis of the data to be analyzed.
  • The sharing method for the analysis of the data to be analyzed may be selected according to the computing power of the first processing unit 20, from among: a first sharing method in which the first processing unit 20 generates the analysis results of the data to be analyzed; a second sharing method in which the first processing unit 20 calculates the feature values of the data to be analyzed and transmits them to the second processing unit 30, and the second processing unit 30 generates the analysis results from the feature values; and a third sharing method in which the first processing unit 20 transmits the data to be analyzed to the second processing unit 30, and the second processing unit 30 generates the analysis results from the data to be analyzed.
  • The criteria used to select the sharing method may also include the computational cost, the importance of the data to be analyzed, the risk level indicated by the data to be analyzed, the compression efficiency of each piece of data to be analyzed, communication quality, etc. A sketch of such a selection follows below.
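  • As a non-limiting reference, the following Python sketch selects among the three sharing methods based on the computing power of the first processing unit 20 and the communication bandwidth. The thresholds and enum names are illustrative assumptions; the disclosure only names the selection criteria.

        from enum import Enum

        class Sharing(Enum):
            FIRST = 1   # first method: edge unit generates the analysis results
            SECOND = 2  # second method: edge sends features, cloud analyzes them
            THIRD = 3   # third method: edge sends the data, cloud analyzes it

        def select_sharing(edge_compute: float, bandwidth_mbps: float) -> Sharing:
            if edge_compute >= 1.0:    # enough computing power at the edge
                return Sharing.FIRST
            if bandwidth_mbps < 10.0:  # features are cheaper to send than raw data
                return Sharing.SECOND
            return Sharing.THIRD

        print(select_sharing(edge_compute=0.3, bandwidth_mbps=50.0))  # Sharing.THIRD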
  • In one aspect, the analysis method control means 104 may select a sharing method for each of the one or more input images acquired by each first processing unit 20, depending on the importance of each input image. For example, an input image with a high importance may be analyzed quickly by the first processing unit 20 that acquired it, or analyzed with high accuracy by the second processing unit 30.
  • The analysis method control means 104 may select a sharing method that switches between the first processing unit 20 and the second processing unit 30 to analyze the analysis target data, based on a prediction of the processing load of the analysis target data in the first processing unit 20 and a prediction of the communication bandwidth between the first processing unit 20 and the second processing unit 30.
  • The analysis method control means 104 may also determine the portion of the analysis target data to be discarded, based on the predicted communication bandwidth.
  • The analysis method control means 104 may also cause the first processing unit 20 and the second processing unit 30 to complement frames that were processed before the switching in the unit frame set when switching from a state in which the analysis target data is not being processed to a state in which it is being processed.
  • The analysis method control means 104 may also cause the first processing unit 20 or the second processing unit 30 that is not analyzing the analysis target data to buffer the analysis target data, and, when the processing unit that is not processing the analysis target data is switched to processing it, to analyze the analysis target data using the buffered data.
  • The analysis method control means 104 may execute the above-mentioned discard processing, complement processing, and buffering processing based on the importance of the data to be analyzed, the reliability of the processing of the data to be analyzed, the communication bandwidth allocated for transmitting the data to be analyzed, etc.
  • Here, the reliability is an index indicating the degree of confidence in the predicted analysis result, and may be, for example, a confidence value output from the trained model that performed the analysis.
  • Although the importance determination system 100 has been described as independent of each of the first processing units 20 and the second processing unit 30, this embodiment is not limited to this.
  • A part or all of the importance determination system 100 may be provided in each of the first processing units 20, in the second processing unit 30, or distributed across the first processing units 20 and the second processing unit 30.
  • The second embodiment has been described above as the importance determination system 100, but the importance determination system 100 according to the second embodiment may be mounted on a single device as an importance determination device. Furthermore, the operation of the importance determination system 100 according to the second embodiment may be regarded as the importance determination method according to the second embodiment.
  • In the third embodiment, the identification means 101 detects multiple objects included in one or more input images, and identifies the multiple regions by identifying the regions corresponding to each of the detected objects.
  • FIG. 9 is a schematic diagram showing an example of the regions identified by the identification means 101 in this embodiment.
  • The identification means 101 identifies the multiple regions T1 and T2 by identifying the regions corresponding to the objects detected using an object detection model for one frame F of the input image.
  • The regions corresponding to the objects are, for example, the regions surrounding the objects.
  • The feature amount calculation means 102 calculates the feature amount of each region in the same manner as the feature amount calculation means 102 according to the second embodiment.
  • The feature amount calculation means 102 may use, as the class information, the identification result obtained by identifying each region using an object identification model, as in the second embodiment.
  • Alternatively, the feature amount calculation means 102 may use, as the class information, the identification result obtained when the identification means 101 detects an object in the input image by object detection.
  • In this embodiment, the area in which importance is determined can be an area of any size in which an object is detected, which allows the importance to be determined efficiently. A sketch of detection-based region identification follows below.
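  • As a non-limiting reference, the following Python sketch illustrates region identification via object detection in this embodiment. The stand-in detector and its outputs are illustrative assumptions; the disclosure does not name a specific object detection model.

        from typing import List, Tuple

        Box = Tuple[int, int, int, int]  # (x, y, width, height)

        def detect_objects(frame) -> List[Tuple[Box, str, float]]:
            # Stand-in detector returning (surrounding box, class label, confidence).
            return [((10, 20, 50, 80), "person", 0.9),
                    ((120, 40, 200, 150), "crane truck", 0.8)]

        def identify_regions(frame) -> List[Box]:
            # Each detected object's surrounding box becomes one region
            # (T1, T2, ...) whose importance is then determined.
            return [box for box, _, _ in detect_objects(frame)]

        print(identify_regions(frame=None))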
  • The third embodiment has been described above as the importance determination system 100, but the importance determination system 100 according to the third embodiment may be mounted on a single device as an importance determination device. Furthermore, the operation of the importance determination system 100 according to the third embodiment may be regarded as the importance determination method according to the third embodiment.
  • In the fourth embodiment, the identification means 101 identifies multiple regions by identifying one or more regions within multiple input images input from different cameras.
  • The feature amount calculation means 102 calculates the feature amount of the regions in each input image.
  • The determination means 103 then inputs input data that combines the feature amounts of the regions in the multiple input images into the trained model M, thereby determining the importance of each of the regions in the multiple input images, as sketched below.
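  • As a non-limiting reference, the following NumPy sketch stacks the features of regions from two camera images into one input matrix for the trained model M. The region counts and random feature values are illustrative placeholders.

        import numpy as np

        rng = np.random.default_rng(1)
        features_cam1 = rng.normal(size=(4, 1000))  # 4 regions from camera 1
        features_cam2 = rng.normal(size=(5, 1000))  # 5 regions from camera 2

        # Input data X combining the features of regions from both images;
        # the trained model M then determines importance for all 9 regions.
        X = np.vstack([features_cam1, features_cam2])
        print(X.shape)  # (9, 1000)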
  • The fourth embodiment has been described above as the importance determination system 100, but the importance determination system 100 according to the fourth embodiment may be mounted on a single device to form an importance determination device. Furthermore, the operation of the importance determination system 100 according to the fourth embodiment may be regarded as the importance determination method according to the fourth embodiment.
  • Each of the configurations according to the first to fourth embodiments may be realized by (1) one or more pieces of hardware, (2) one or more pieces of software, or (3) a combination of hardware and software.
  • Each device, function, and process may be realized by at least one computer having at least one processor and at least one memory.
  • An example of such a computer (hereinafter referred to as computer C) is shown in FIG. 10.
  • Each of the functions described in the first to fourth embodiments may be realized by storing a program P for implementing the importance determination method described in the first to fourth embodiments in the memory C2, and having the processor C1 read and execute the program P stored in the memory C2.
  • The program P includes a set of instructions that, when loaded into the computer C, cause the computer C to execute one or more of the functions described in the first to fourth embodiments.
  • The program P is stored in the memory C2.
  • The processor C1 can be, for example, a CPU (Central Processing Unit).
  • The memory C2 can be, for example, a Read Only Memory (ROM), a Random Access Memory (RAM), a flash memory, a Solid State Drive (SSD), etc.
  • The program P can also be recorded on a non-transitory, tangible recording medium M that can be read by the computer C.
  • Such a recording medium M can be, for example, a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit.
  • The computer C can obtain the program P via such a recording medium M.
  • The program P can also be transmitted via a transmission medium.
  • Such a transmission medium can be, for example, a communications network or broadcast waves.
  • The computer C can also obtain the program P via such a transmission medium.
  • Appendix 1 An importance determination system for determining importance of a plurality of regions in one or more input images, comprising: an identification means for identifying a plurality of regions within the one or more input images; a feature amount calculation means for calculating a feature amount of each region; and a determination means for generating relationship information indicating a relationship between the feature amounts of each region based on the feature amounts of each region, and determining the importance of each region based on the relationship information.
  • Appendix 2 The importance determination system described in Appendix 1, wherein the determination means calculates a first matrix from input data combining features of each region and a first parameter that has been machine-learned in advance, calculates a second matrix from the input data and a second parameter that has been machine-learned in advance, calculates the relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
  • The identification means detects a plurality of objects included in the one or more input images, and identifies the plurality of regions by identifying a region corresponding to each of the detected objects.
  • Appendix 9 An importance determination device for determining importance of a plurality of regions in one or more input images, comprising: an identification unit for identifying a plurality of regions within the one or more input images; a feature amount calculation unit that calculates a feature amount of each region; and a determination unit that generates relationship information indicating a relationship between the feature amounts of each region based on the feature amounts of each region, and determines the importance of each region based on the relationship information.
  • Appendix 10 The importance determination device described in Appendix 9, wherein the determination unit calculates a first matrix from input data combining features of each region and a first parameter that has been machine-learned in advance, calculates a second matrix from the input data and a second parameter that has been machine-learned in advance, calculates the relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
  • Appendix 14 The importance determination device according to any one of appendices 9 to 12, wherein the identification unit detects a plurality of objects included in the one or more input images, and identifies the plurality of regions by identifying a region corresponding to each of the detected objects.
  • The importance determination device described in Appendix 9, further comprising an analysis method control unit that controls a method of analysis of each area according to the importance of the area.
  • Appendix 17 An importance determination method for determining importance of a plurality of regions in one or more input images, comprising: identifying a plurality of regions within the one or more input images; calculating a feature amount of each region; generating relationship information indicating a relationship between the feature amounts of each region based on the feature amounts of each region; and determining the importance of each region based on the relationship information.
  • Appendix 18 An importance determination method as described in Appendix 17, which calculates a first matrix from input data combining features of each region and a first parameter that has been machine-learned in advance, calculates a second matrix from the input data and a second parameter that has been machine-learned in advance, calculates the relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
  • Appendix 21 The importance determination method according to any one of Appendices 17 to 20, further comprising identifying the plurality of regions having at least two or more sizes, the positions of which within the one or more input images are preset.
  • An importance determination system for determining importance of a plurality of regions in one or more input images, comprising at least one processor, the processor executing: an identification process for identifying a plurality of regions within the one or more input images; a feature amount calculation process for calculating a feature amount of each region; and a determination process for generating relationship information indicating a relationship between the feature amounts of each region based on the feature amounts of each region, and determining the importance of each region based on the relationship information.
  • The importance determination system may further include at least one memory, and this memory may store a program for causing the processor to execute the identification process, the feature amount calculation process, and the determination process.
  • The program may also be recorded on a computer-readable, non-transitory, tangible recording medium.
  • An importance determination device for determining importance of a plurality of regions in one or more input images, comprising at least one processor, the processor executing: an identification process for identifying a plurality of regions within the one or more input images; a feature amount calculation process for calculating a feature amount of each region; and a determination process for generating relationship information indicating a relationship between the feature amounts of each region based on the feature amounts of each region, and determining the importance of each region based on the relationship information.
  • The importance determination device may further include at least one memory, and this memory may store a program for causing the processor to execute the identification process, the feature amount calculation process, and the determination process.
  • The program may also be recorded on a computer-readable, non-transitory, tangible recording medium.

Abstract

A degree-of-importance assessment system (100) comprises: an identification means (101) for identifying a plurality of regions in one or more input images; a feature-amount calculation means (102) for calculating a feature amount for each region; and an assessment means (103) for generating, on the basis of the feature amount of each region, relationship information indicative of the relationship between the feature amounts of the regions, and assessing the degree of importance of each region on the basis of the relationship information.

Description

Importance determination system, importance determination device, and importance determination method
The present invention relates to an importance determination system, an importance determination device, and an importance determination method.
A technique is known for dividing and processing an input image when performing image analysis of the input image. For example, Patent Document 1 discloses an image analysis device including a partial image division unit that reprojects the input image in a number of different directions and divides it into a number of partial images, a feature extraction unit that extracts features from each of the partial images, an importance calculation unit that calculates the importance of each position in the input image based on a predetermined regression model from the extracted features, an attention point likelihood distribution calculation unit that calculates the likelihood distribution of attention points based on a predetermined regression model from the calculated importance, and an attention point calculation unit that calculates attention points based on the likelihood distribution of the attention points.
Japanese Patent Publication No. 2018-22360
Patent Document 1 describes calculating the importance of each position in an input image based on a specified regression model, but it would be useful to provide a technology that determines the importance with greater accuracy.
One aspect of the present invention has been made in consideration of the above problems, and one of its objectives is to provide an importance determination system, an importance determination device, and an importance determination method that can determine importance with high accuracy.
The importance determination system according to one aspect of the present invention is an importance determination system that determines the importance of multiple regions in one or more input images, and includes an identification means for identifying multiple regions in the one or more input images, a feature calculation means for calculating features of each region, and a determination means for generating relationship information indicating the relationship between the features of each region based on the features of each region, and determining the importance of each region based on the relationship information.
An importance determination device according to one aspect of the present invention is an importance determination device that determines the importance of multiple regions in one or more input images, and includes an identification unit that identifies multiple regions in the one or more input images, a feature amount calculation unit that calculates the feature amount of each region, and a determination unit that generates relationship information indicating the relationship between the feature amounts of each region based on the feature amounts of each region, and determines the importance of each region based on the relationship information.
The importance determination method according to one aspect of the present invention is a method for determining the importance of multiple regions in one or more input images, comprising: identifying multiple regions in the one or more input images; calculating a feature amount for each region; generating relationship information indicating the relationship between the feature amounts of each region based on the feature amounts of each region; and determining the importance of each region based on the relationship information.
According to one aspect of the present invention, importance can be determined with high accuracy.
FIG. 1 is a block diagram showing an example of the configuration of an importance determination system according to the first embodiment.
FIG. 2 is a flow diagram showing an example of the flow of an importance determination method according to the first embodiment.
FIG. 3 is a block diagram showing an example of the configuration of an importance determination device according to the first embodiment.
FIG. 4 is a block diagram showing an example of the configuration of an importance determination system and a processing system according to the second embodiment.
FIG. 5 is a schematic diagram showing an example of an area identified by the identification means in the second embodiment.
FIG. 6 is a schematic diagram illustrating an example of a self-attention model.
FIG. 7 is a flow diagram illustrating an example of a learning method for generating a trained model.
FIG. 8 is a block diagram showing an example of the configuration of an importance determination system for executing the learning method.
FIG. 9 is a schematic diagram showing an example of an area identified by the identification means in the third embodiment.
FIG. 10 is a block diagram illustrating an example of the configuration of a computer.
[First Embodiment]
A first embodiment of the present invention will be described in detail with reference to the drawings. This embodiment is a basic form of the embodiments described below.
(Configuration of the Importance Determination System)
The configuration of the importance determination system according to this embodiment will be described with reference to FIG. 1. FIG. 1 is a block diagram showing an example of the configuration of an importance determination system 100 according to the first embodiment. The importance determination system 100 includes an identification means 101, a feature amount calculation means 102, and a determination means 103, and determines the importance of multiple regions in one or more input images.
 入力画像は、重要度判定システム100に接続されたカメラが撮像したものであってもよいし、ネットワークを介して重要度判定システム100に送信されたものであってもよい。また、入力画像が、カメラが撮像したものである場合、カメラは単数であってもよいし複数であってもよい。また、カメラは、全天球カメラ、パノラマカメラ等であってもよい。 The input image may be captured by a camera connected to the importance determination system 100, or may be transmitted to the importance determination system 100 via a network. Furthermore, if the input image is captured by a camera, the number of cameras may be one or more. Furthermore, the camera may be a spherical camera, a panoramic camera, etc.
 重要度は、入力画像に基づく所定の処理のために使用される指標であり、例えば、重要度に基づいて処理の態様を変化させてもよいし、重要度に基づいて処理中のデータの流れを変化させてもよい。入力画像に基づく所定の処理は、特に限定されないが、例えば、入力画像に映った分析対象を分析する処理であってよい。分析対象は、特に限定されないが、例えば、工事現場で作業する作業者(人)、作業装置(物体)、および作業者、作業装置の挙動(動作)である。 Importance is an index used for a predetermined process based on an input image, and for example, the manner of the process may be changed based on the importance, or the flow of data being processed may be changed based on the importance. The predetermined process based on an input image is not particularly limited, but may be, for example, a process of analyzing an analysis target shown in the input image. The analysis target is not particularly limited, but may be, for example, a worker (person) working at a construction site, work equipment (object), and the behavior (movement) of the worker and work equipment.
 本明細書において、「分析」とは、分析対象において、検知対象となる事象が生じていることを検知することを意味する。例えば、分析対象が工事現場で作業する作業者(人)、作業装置(物体)、および作業者、作業装置の挙動(動作)などである場合には、分析結果としては、効率の悪い作業や、手順のミス、危険な行動などの事象が生じていることの検知結果が挙げられる。 In this specification, "analysis" means detecting the occurrence of an event to be detected in the subject of analysis. For example, if the subject of analysis is a worker (person), work equipment (object), or the behavior (movement) of the worker or work equipment working at a construction site, the analysis results may include the detection of the occurrence of events such as inefficient work, procedural errors, and dangerous behavior.
In one aspect, when the importance is an index used for analyzing the input image, the importance may mean the necessity of performing the analysis. In this case, a region in which an event to be detected by the analysis is likely to occur, or in which an object to be detected is likely to exist, may be determined to have high importance. Such "importance" may also be expressed as "attention level", "necessity of attention", "danger level", or the like. Events of high importance include, but are not limited to, actions that follow a process, actions that deviate from a process, and highly dangerous actions. Objects of high importance include, but are not limited to, people and heavy machinery. The importance may also be determined based on detectability; for example, the importance of a person or object that appears very small in the video and is difficult to detect may be lowered. The method of expressing the importance is not particularly limited; for example, it may be expressed as a binary value of "0" (low importance) and "1" (high importance), as three or more levels (e.g., high, medium, low), or as a continuous numerical value.
The identification means 101 identifies multiple regions within one or more input images input to the importance determination system 100.
The method by which the identification means 101 identifies the multiple regions is not particularly limited; it may identify regions corresponding to preset parts of the input image for which the importance is to be determined, regions surrounding objects detected by object detection processing on the input image, or regions obtained by dividing the input image at equal intervals.
The feature amount calculation means 102 calculates the feature amount of each region identified by the identification means 101. The method of calculating the feature amount is not particularly limited, and various known algorithms can be used.
The determination means 103 generates relationship information indicating the relationship between the feature amounts of the regions based on the feature amounts calculated by the feature amount calculation means 102, and determines the importance of each region based on the relationship information. In one aspect, the relationship information indicates the degree to which regions other than a given region are related to the importance of that region. In other words, the relationship information indicates the relationship between regions such that the relationship is large for regions necessary for determining the importance of a given region and small for regions not necessary for determining the importance of that region. An example of such relationship information is the attention weight used in attention mechanisms such as a self-attention mechanism.
Thus, the importance determination system 100 according to this embodiment can determine importance with high accuracy. That is, since the relationship information is generated based on the feature amounts of the input image, the importance can be determined using relationship information that reflects the input image. This allows the importance determination system 100 to determine importance with high accuracy.
(Flow of Importance Determination Method)

The flow of the importance determination method S100 according to this embodiment will be described with reference to Fig. 2. Fig. 2 is a flow diagram showing an example of the flow of the importance determination method S100 according to the first embodiment. In the example shown in Fig. 2, the importance determination system 100 executes the importance determination method S100.
In step S101, the identification means 101 identifies multiple regions in one or more input images. In step S102, the feature amount calculation means 102 calculates the feature amount of each region. In step S103, the determination means 103 generates relationship information indicating the relationship between the feature amounts of the regions based on the feature amounts of the regions, and determines the importance of each region based on the relationship information.
As described above, in the importance determination method S100 according to this embodiment, the importance of each region is determined using a trained model that generates relationship information based on the feature amounts of the input image. This makes it possible to determine the importance using relationship information that reflects the input image, and thus to determine the importance with high accuracy.
(Configuration of Importance Determination Device)

The configuration of the importance determination device 200 according to this embodiment will be described with reference to Fig. 3. Fig. 3 is a block diagram showing an example of the configuration of the importance determination device 200 according to the first embodiment. The importance determination device 200 includes an identification unit 201, a feature amount calculation unit 202, and a determination unit 203, and determines the importance of multiple regions in one or more input images.
The identification unit 201 has a function equivalent to that of the identification means 101 and identifies multiple regions in one or more input images. The feature amount calculation unit 202 has a function equivalent to that of the feature amount calculation means 102 and calculates the feature amount of each region. The determination unit 203 has a function equivalent to that of the determination means 103; it generates relationship information indicating the relationship between the feature amounts of the regions based on the feature amounts of the regions, and determines the importance of each region based on the relationship information.
The identification unit 201, the feature amount calculation unit 202, and the determination unit 203 may be implemented on a computer device in which processing is performed by a processor executing a program stored in a memory. For example, they may be implemented on a single computer device, on a group of computer devices operating in cooperation, or on a group of server devices operating in cooperation. The importance determination device 200 can achieve the same effects as the importance determination system 100. Some of the functions may also be distributed across cloud servers.
Second Embodiment

A second embodiment of the present invention will be described in detail with reference to the drawings. Note that components having the same functions as those described in the first embodiment are given the same reference numerals, and their descriptions are omitted as appropriate.
Fig. 4 is a block diagram showing an example of the configuration of the importance determination system 100 and the processing system 1 according to the second embodiment. The importance determination system 100 according to this embodiment includes an identification means 101, a feature amount calculation means 102, a determination means 103, and an analysis method control means 104.
In this embodiment, the identification means 101 identifies multiple regions whose positions within the input image are preset. Fig. 5 is a schematic diagram showing an example of the regions identified by the identification means 101 in this embodiment. The identification means 101 divides one frame F of the input image into predefined regions, thereby identifying multiple regions R. Here, the regions R need not be of uniform size, and regions need not be identified over the entire frame F.
For example, the identification means 101 need not identify as regions those parts that are known in advance not to include the analysis target (e.g., the sky or buildings). For example, as shown in Fig. 5, the identification means 101 may identify the regions R from parts other than the part A corresponding to the sky.
Also, for example, the identification means 101 may change the size of the identified regions according to the characteristics of the input image or the analysis target. For example, as shown in Fig. 5, the identification means 101 may, in accordance with the camera's angle of view, make the size of the lower regions R(2), which show the near side, larger than the size of the upper regions R(1), which show the far side. Also, for example, larger regions may be identified in parts where a large analysis target may exist, and smaller regions in parts where a small analysis target may exist.
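As a minimal illustration of such preset, non-uniform regions, the following Python sketch defines a region list by hand. The frame size, the coordinates, the two region sizes, and the excluded sky band are hypothetical values chosen for illustration only, not values fixed by this embodiment.

```python
# A minimal sketch of preset, non-uniform regions (all values hypothetical).
# Each region is (x, y, width, height) in pixels within a 1920x1080 frame.

FRAME_W, FRAME_H = 1920, 1080
SKY_BOTTOM = 270  # assume the top quarter of the frame is sky and is skipped

def preset_regions():
    regions = []
    # Upper band (far side): smaller 160x160 regions just below the sky area.
    for x in range(0, FRAME_W, 160):
        regions.append((x, SKY_BOTTOM, 160, 160))
    # Lower band (near side): larger 320x320 regions.
    for y in range(SKY_BOTTOM + 160, FRAME_H - 320 + 1, 320):
        for x in range(0, FRAME_W, 320):
            regions.append((x, y, 320, 320))
    return regions

regions = preset_regions()
print(len(regions), "regions, e.g.", regions[0])
```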
In this embodiment, the feature amount calculation means 102 calculates the feature amount of each region. The method of calculating the feature amount is not particularly limited, but in one aspect, the feature amount calculation means 102 may include, in the feature amount of each region, an estimation result of the type of object contained in the region. The type of object indicates, for example, whether the object is a person, a machine, a vehicle, heavy machinery, or the like. In one aspect, the feature amount calculation means 102 may also include, in the feature amount of each region, the position of the region within the input image. The representation format of the feature amount is not particularly limited, but can be, for example, a fixed-length vector.
For example, the feature amount of each region can be a fixed-length vector obtained by concatenating position information indicating the position of the region within the input image and class information indicating the estimation result of the type of object contained in the region.
The position information may be anything that indicates the position of the region within the input image; for example, it may be calculated from the pixel position with the upper-left corner of the input image as (0,0) and the lower-right corner as (1,1). The position information may further include the size (width and height) of the region.
The class information may be anything that indicates the estimation result of the type of object contained in the region; for example, it indicates the identification result (class classification) obtained by identifying each region using an object identification model. As the object identification model, for example, an object identification model trained using training data such as ImageNet can be used. The representation format of the class information is not particularly limited, but may be, for example, a vector indicating, for each identifiable type of object, the confidence that an object of that type is contained in the region. For example, for the region R(2) in Fig. 5, it may be (car: 0.4, truck: 0.1, crane truck: 0.5, ..., person: 0).
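The following sketch illustrates one way such a fixed-length feature vector could be assembled. The normalization scheme, the class list, and the function classify_region are assumptions introduced for illustration; classify_region stands in for an arbitrary object identification model.

```python
import numpy as np

# Hypothetical class list; a real object identification model defines its own.
CLASSES = ["car", "truck", "crane", "person"]

def classify_region(patch):
    # Placeholder for an object identification model; returns per-class confidences.
    return np.full(len(CLASSES), 1.0 / len(CLASSES))

def region_feature(region, frame_w, frame_h, patch):
    x, y, w, h = region
    # Position information: normalized so the top-left of the frame is (0,0)
    # and the bottom-right is (1,1); the region size is included as well.
    position = np.array([x / frame_w, y / frame_h, w / frame_w, h / frame_h])
    # Class information: confidence vector from the identification model.
    class_info = classify_region(patch)
    # Fixed-length feature vector: position and class information concatenated.
    return np.concatenate([position, class_info])

feature = region_feature((320, 430, 160, 160), 1920, 1080, patch=None)
print(feature.shape)  # (4 + number of classes,)
```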
The feature amount of each region calculated by the feature amount calculation means 102 is not limited to the above; for example, a feature amount calculated using a learning model having convolutional layers, such as an auto-encoder, may be used.
In this embodiment, the determination means 103 determines the importance of each region using a trained model M. In one aspect, the trained model M calculates a first matrix from input data obtained by combining the feature amounts of the regions and a first parameter obtained in advance by machine learning, calculates a second matrix from the input data and a second parameter obtained in advance by machine learning, calculates relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information. For example, the trained model M includes one or more layers that calculate the first matrix from the input data and the first parameter, one or more layers that calculate the second matrix from the input data and the second parameter, one or more layers that calculate the relationship information based on the first matrix and the second matrix, and one or more layers that calculate the importance of each region based on the relationship information. The trained model M is not limited to this; in one aspect, a self-attention model may be used.
Fig. 6 is a schematic diagram schematically showing an example of a self-attention model. The example shown in Fig. 6 is for a case where the number of regions is 9, the number of dimensions of the feature amount is 1000, and the number of dimensions of the key (first matrix) and the query (second matrix) is 4; however, neither the number of regions nor the numbers of dimensions are limited to these.
In the example shown in Fig. 6, X is the input data obtained by combining the feature amounts of the regions. It is expressed as a (9, 1000) matrix in which the 1000-dimensional feature amounts of the nine regions are combined.
First, in the trained model M, the input data X is multiplied by the attention parameter W_k (first parameter) and the attention parameter W_q (second parameter) to obtain the key XW_k (first matrix) and the query (XW_q)^T (second matrix), respectively. The attention parameter W_k (first parameter) and the attention parameter W_q (second parameter) are parameters obtained by machine learning as described later, and are each expressed as a (1000, 4) matrix. The obtained key XW_k is a (9, 4) matrix, and the query (XW_q)^T is a (4, 9) matrix.
Next, in the trained model M, the attention weight A is generated based on the following formula, where d_in denotes the number of dimensions of the feature amount. The attention weight A indicates which regions' feature amounts are related to the importance of each region, and corresponds to the relationship information.
\[ A = \mathrm{softmax}\!\left(\frac{(XW_k)\,(XW_q)^{\top}}{\sqrt{d_{\mathrm{in}}}}\right) \]
Then, the importance determination result B (1, 9) is calculated by taking the column-wise sum of the attention weight A. Each column of the importance determination result B (1, 9) indicates the importance of the corresponding region.
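As a rough illustration of this computation, the following NumPy sketch reproduces the matrix shapes described above. The softmax normalization is an assumption based on standard self-attention formulations, and the random inputs and parameters are placeholders for the actual features and machine-learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

n_regions, d_in, d_attn = 9, 1000, 4
X = rng.standard_normal((n_regions, d_in))  # input data: combined region features
W_k = rng.standard_normal((d_in, d_attn))   # first parameter (machine-learned in practice)
W_q = rng.standard_normal((d_in, d_attn))   # second parameter (machine-learned in practice)

key = X @ W_k            # first matrix, shape (9, 4)
query_t = (X @ W_q).T    # second matrix, shape (4, 9)

# Attention weight A (relationship information); softmax normalization assumed.
A = softmax(key @ query_t / np.sqrt(d_in), axis=1)  # shape (9, 9)

# Column-wise sum of A gives the importance determination result B (1, 9).
B = A.sum(axis=0, keepdims=True)
print(B.shape)  # (1, 9)
```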
As described above, the attention parameter W_k (first parameter) and the attention parameter W_q (second parameter) are parameters obtained by machine learning. Fig. 7 is a flowchart showing an example of a learning method for generating the trained model M.
In one example, the machine learning for generating the trained model M can be performed using an importance determination system 300 as shown in Fig. 8. The importance determination system 300 includes an identification means 101, a feature amount calculation means 102, a determination means 103, a learning means 105, and an analysis engine 106.
In step S1, learning data is input to the learning means 105. As the learning data, images labeled with analysis results can be used. A reward value used in reinforcement learning may further be set for each label. Multiple labels indicating analysis results may also be attached to a single piece of learning data.
For example, an image such as the one shown in Fig. 5 labeled "heavy machinery approaching (10)" and "transport work (1)" can be used as learning data for generating a trained model M to be applied to construction sites. Also, for example, images labeled "packing work (1)", "attachment work (5)", and "screw-fastening work (10)" can be used as learning data for generating a trained model M to be applied to factory work. As for the reward values, a high reward value may be set for events that have a high priority for detection by the analysis.
Alternatively, images containing multiple people may be used as learning data, with detection of people as the analysis target, and the reward values may be set such that selecting a region containing people yields a high reward (e.g., +1 per person contained) and selecting an empty region yields a reward of 0.
In step S2, the learning means 105 initializes the parameters of the machine learning model M' (e.g., the attention parameter W_k (first parameter) and the attention parameter W_q (second parameter)).
In step S3, the learning means 105 determines whether there is learning data to be applied next, and ends the learning if there is none.
In step S4, the identification means 101, the feature amount calculation means 102, and the determination means 103 perform importance determination in the same manner as in the importance determination system 100, except that the learning data is used instead of the input image and the machine learning model M' is used instead of the trained model M.
In step S5, based on the obtained importance of each region, the learning means 105 processes the learning data by, for example, cutting out only the regions of high importance, or keeping only the regions of high importance at high image quality while reducing the other regions to low image quality.
In step S6, the learning means 105 inputs the learning data processed in step S5 to the analysis engine 106 and obtains the analysis result. Note that images of multiple frames may be input to the analysis engine. The learning means 105 then calculates a reward value according to the obtained analysis result. That is, if the analysis result indicated by a label attached to the learning data is obtained, the learning means 105 may add the reward value set for that label.
In step S7, the learning means 105 performs reinforcement learning by updating the parameters (the attention parameter W_k (first parameter) and the attention parameter W_q (second parameter)) based on the reward value calculated in step S6. The method of updating the parameters may follow a known reinforcement learning method.
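A toy end-to-end sketch of this loop (steps S2 to S7) is shown below. The random region features, the stand-in labels, the top-k region selection, the reward function, and especially the perturbation-style update rule are all placeholders for the unspecified learning data, analysis engine, and known reinforcement learning method; they are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def importance(X, W_k, W_q):
    # Self-attention importance as in the Fig. 6 example (see the earlier sketch).
    s = (X @ W_k) @ (X @ W_q).T / np.sqrt(X.shape[1])
    s = s - s.max(axis=1, keepdims=True)
    A = np.exp(s) / np.exp(s).sum(axis=1, keepdims=True)
    return A.sum(axis=0)

def analysis_reward(selected_regions, labeled_regions):
    # Placeholder reward: +1 for each selected region whose labeled event is detected.
    return len(selected_regions & labeled_regions)

d_in, d_attn, n_regions = 1000, 4, 9
W_k = rng.standard_normal((d_in, d_attn)) * 0.01  # Step S2: initialization
W_q = rng.standard_normal((d_in, d_attn)) * 0.01

for _ in range(100):                               # Steps S3-S7 over learning data
    X = rng.standard_normal((n_regions, d_in))     # stand-in for region features
    labeled = set(rng.choice(n_regions, 3, replace=False))  # stand-in labels
    B = importance(X, W_k, W_q)                    # Step S4: importance determination
    selected = set(np.argsort(B)[-3:])             # Step S5: keep top-importance regions
    reward = analysis_reward(selected, labeled)    # Step S6: analyze and compute reward
    # Step S7: stand-in update; a known reinforcement learning method would go here.
    W_k += 1e-3 * reward * rng.standard_normal(W_k.shape)
    W_q += 1e-3 * reward * rng.standard_normal(W_q.shape)
```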
As described above, the trained model M can be generated by updating the parameters of the machine learning model M'. Note that multiple types of trained models M may be generated; for example, different trained models M may be generated and used selectively depending on the scene (construction site, factory work), the time of day (morning, daytime, night), and so on.
Next, the analysis method control means 104 will be described. The analysis method control means 104 controls the method of analyzing each region of the input image according to the importance of that region determined by the determination means 103.
Here, the processing system 1 that performs the analysis processing of the input image will be described. As shown in Fig. 4, the processing system 1 includes one or more first processing units 20 and a second processing unit 30. For readability, Fig. 4 shows a configuration with a single first processing unit 20, but there may be multiple first processing units 20.
Each first processing unit 20 is connected to, for example, a camera or a sensor such as LiDAR (Light Detection and Ranging), and acquires one or more input images from the camera, sensor, or the like. It suffices that the analysis target is included within the angle of view of the input image. The analysis target is, for example, a worker (person) working at a construction site, work equipment (object), and the behavior (movement) of the worker and the work equipment.
The first processing unit 20 may also be connected to multiple cameras, sensors, or the like and acquire multiple input images. The first processing unit 20 may also acquire multiple input images from a single camera or the like.
The first processing unit 20 and the second processing unit 30 may each be configured with one or more computers. The first processing unit 20 and the second processing unit 30 can communicate via a network NW and share the analysis processing of the input image. The network NW may be wireless or wired; if wireless, it may be a wireless communication system such as Wi-Fi, LTE, 4G, or 5G.
In one aspect, the first processing unit 20 may be an edge processing unit and the second processing unit 30 may be a cloud processing unit. In this specification, an "edge" is a place where data is collected. The first processing unit 20, which is an edge processing unit, is an information processing device (computer) or a group of information processing devices installed at or around the location where the analysis target is present (e.g., a construction site or a factory), and acquires input images from cameras, sensors, or the like installed at that location. The first processing unit 20 may be integrated with the camera, sensor, or the like. In this specification, a "cloud" is a place where data is processed, stored, and so on. The second processing unit 30, which is a cloud processing unit, may be an information processing device (computer) or a group of information processing devices installed at a location that can provide large computational resources, such as a data center or a server farm. Note that the second processing unit 30 may be any processing unit located at a place connected to the first processing unit 20 via a network; it may be a computational resource connected to a base station such as 5G (e.g., MEC (Multi-access Edge Computing)) or a server installed at an on-site office or the like (an on-premises server).
The first processing unit 20 may perform analysis processing on at least some of the acquired one or more input images to generate analysis results. The first processing unit 20 may also calculate feature amounts for at least some of the acquired one or more input images and transmit the calculated feature amounts to the second processing unit 30 via the network NW. The first processing unit 20 may also transmit at least some of the acquired one or more input images to the second processing unit 30 via the network NW. When the first processing unit 20 transmits the feature amounts or the input images to the second processing unit 30, it may do so with or without applying compression processing or encryption processing to them.
The second processing unit 30 receives the feature amounts or the input images transmitted from the first processing unit 20, performs restoration processing as necessary, and performs the analysis processing.
The analysis processing is, for example, detection, identification, tracking, and time-series analysis of the analysis target (object, person) based on the input image. A learning model may be used for this analysis processing, by one or both of the first processing unit 20 and the second processing unit 30.
The analysis method control means 104 controls the processing system 1 (i.e., the first processing unit 20 and the second processing unit 30) as follows.
In one example, the analysis method control means 104 may control the first processing unit 20 so as to cut out the regions of high importance from the input image, deliver them to the second processing unit 30, and discard the rest. In this way, when, for example, the communication bandwidth between the first processing unit 20 and the second processing unit 30 drops, the bit rate can be reduced by delivering only the parts of high importance.
Also, in one example, the analysis method control means 104 may control the first processing unit 20 so as to cut out the regions of high importance from the input image, deliver them to the second processing unit 30, and analyze the rest in the first processing unit 20. In this way, the regions of high importance in the input image can be analyzed with a high-accuracy model in the second processing unit 30, while the remaining regions are analyzed with a low-accuracy model in the first processing unit 20.
Also, in one example, the analysis method control means 104 may control the first processing unit 20 and/or the second processing unit 30 so as to analyze only the regions of high importance in the input image and discard the remaining regions. In this way, when analyzing all regions is difficult in terms of computational load, the analysis can be narrowed down to only the important parts.
Also, in one example, the analysis method control means 104 may control the first processing unit 20 and/or the second processing unit 30 so as to analyze only the regions of high importance in the input image with a high-accuracy model and analyze the remaining regions with a low-accuracy model. In this way, when analyzing all regions with a high-accuracy model is difficult in terms of computational load, only the regions of high importance are analyzed with the high-accuracy model and the other regions with the low-accuracy model.
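The following sketch shows one hypothetical way the routing rules above could be expressed in code. The threshold, the mode names, and the function signature are illustrative assumptions, not an interface defined by this embodiment.

```python
# A hypothetical routing rule for the controls described above (all names illustrative).

HIGH_IMPORTANCE = 0.5  # assumed threshold

def route_regions(regions, importance, mode):
    """Split regions into (sent to second unit, analyzed at first unit, discarded)."""
    high = [r for r, b in zip(regions, importance) if b >= HIGH_IMPORTANCE]
    low = [r for r, b in zip(regions, importance) if b < HIGH_IMPORTANCE]
    if mode == "deliver_high_discard_rest":
        return high, [], low
    if mode == "deliver_high_analyze_rest_locally":
        return high, low, []
    if mode == "analyze_high_only":
        return [], high, low
    raise ValueError(mode)

sent, local, dropped = route_regions(["R1", "R2", "R3"], [0.9, 0.2, 0.7],
                                     "deliver_high_analyze_rest_locally")
print(sent, local, dropped)  # ['R1', 'R3'] ['R2'] []
```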
As described above, according to this embodiment, the regions for which importance is determined are not limited to a fixed size, and the importance can be determined for regions of arbitrary size according to the characteristics of the input image or the analysis target.
Also, since the importance of each region is determined using a trained model that generates relationship information based on the feature amounts of the input image, the importance can be determined using relationship information that reflects the input image, and thus with high accuracy.
Also, efficient analysis can be performed by controlling the method of analyzing each region according to the determined importance of that region. In particular, when the camera is a spherical camera, a panoramic camera, or the like and the amount of data is very large, controlling the system so that only the regions of high importance are transmitted from the first processing unit 20 to the second processing unit 30 makes it possible to cope with drops in network bandwidth.
Also, in one example, the analysis method control means 104 may control the processing system 1 (i.e., the first processing unit 20 and the second processing unit 30) so as to share the analysis of the analysis target data between the first processing unit 20 and the second processing unit 30. Note that the analysis method control means 104 need not have the processing system 1 analyze analysis target data for which it has determined that analysis is unnecessary.
The sharing of the analysis of the analysis target data between the first processing unit 20 and the second processing unit 30 can take various forms. Examples include a form in which the first processing unit 20 that acquired the analysis target data performs all of its analysis processing, a form in which the first processing unit 20 that acquired the analysis target data performs part of the analysis processing and the second processing unit 30 performs the rest, and a form in which the first processing unit 20 performs only the minimum necessary processing such as compression and the second processing unit 30 performs all of the analysis processing. For example, the sharing scheme for the analysis of the analysis target data may be selected, according to the computing capability of the first processing unit 20 and the like, from among a first sharing scheme in which the first processing unit 20 generates the analysis result of the analysis target data; a second sharing scheme in which the first processing unit 20 calculates the feature amounts of the analysis target data, transmits the feature amounts to the second processing unit 30, and the second processing unit 30 generates the analysis result from the feature amounts; and a third sharing scheme in which the first processing unit 20 transmits the analysis target data to the second processing unit 30 and the second processing unit 30 generates the analysis result from the analysis target data. The criteria used for selecting the sharing scheme may include, in addition to computing capability, computational cost, the importance of the analysis target data, the degree of danger indicated by the analysis target data, the compression efficiency of each piece of analysis target data, communication quality, and the like. By using these sharing schemes selectively, the analysis processing can be performed efficiently according to the situation.
Here, the analysis method control means 104 may select the sharing scheme for each of the one or more input images acquired by each first processing unit 20 according to the importance of the input image. For example, an input image of high importance may be analyzed quickly by the first processing unit 20 that acquired it, or may be analyzed with high accuracy by the second processing unit 30.
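As a toy illustration of such scheme selection, the following sketch picks one of the three sharing schemes from a few of the criteria mentioned above. The thresholds and the decision order are hypothetical; an actual selection could weigh any combination of the criteria listed.

```python
# Hypothetical selection among the three sharing schemes (thresholds are assumptions).

def select_sharing_scheme(edge_capacity, bandwidth, importance):
    """Return 1, 2, or 3: which sharing scheme to use for one piece of data."""
    if edge_capacity > 0.7:   # ample edge compute: analyze at the edge
        return 1              # first scheme: edge generates the analysis result
    if bandwidth < 0.3:       # scarce bandwidth: send compact features only
        return 2              # second scheme: edge sends features, cloud analyzes
    if importance > 0.8:      # important data: full-accuracy cloud analysis
        return 3              # third scheme: edge forwards the data to the cloud
    return 2

print(select_sharing_scheme(edge_capacity=0.2, bandwidth=0.9, importance=0.9))  # 3
```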
In another aspect, the analysis method control means 104 may select the sharing scheme so as to switch which of the first processing unit 20 and the second processing unit 30 analyzes the analysis target data, based on a prediction of the processing load of the analysis target data in the first processing unit 20 and a prediction of the communication bandwidth between the first processing unit 20 and the second processing unit 30. The analysis method control means 104 may also determine, based on the predicted communication bandwidth, which portion of the analysis target data is to be discarded. The analysis method control means 104 may also have the first processing unit 20 and the second processing unit 30, when switching from a state of not processing the analysis target data to a state of processing it, complement the frames of the unit frame set that were processed before the switch. The analysis method control means 104 may also have the analysis target data buffered in whichever of the first processing unit 20 and the second processing unit 30 is not analyzing it, and, when the processing unit that was not processing the analysis target data switches to processing it, have that processing unit analyze the analysis target data using the buffered data. Note that the analysis method control means 104 may execute the above-described discard processing, complement processing, and buffering processing based on the importance of the analysis target data, the reliability of the processing of the analysis target data, the communication bandwidth allocated for transmitting the analysis target data, and the like. The reliability is an index indicating the degree of confidence in the predicted analysis result, and may be, for example, a confidence value output from the trained model that performed the analysis.
Note that although the above describes a configuration in which the importance determination system 100 is independent of the first processing units 20 and the second processing unit 30, this embodiment is not limited to this. For example, part or all of the importance determination system 100 may be provided in each first processing unit 20, in the second processing unit 30, or distributed across the first processing units 20 and the second processing unit 30.
The second embodiment has been described above as the importance determination system 100; however, the importance determination system 100 according to the second embodiment may also be implemented as an importance determination device mounted in a single device. The operation of the importance determination system 100 according to the second embodiment may also constitute an importance determination method according to the second embodiment.
Third Embodiment

A third embodiment of the present invention will be described in detail with reference to the drawings. Note that components having the same functions as those described in the first and second embodiments are given the same reference numerals, and their descriptions are omitted as appropriate.
In this embodiment, the identification means 101 detects multiple objects included in one or more input images and identifies multiple regions by identifying a region corresponding to each detected object. Fig. 9 is a schematic diagram showing an example of the regions identified by the identification means 101 in this embodiment. For one frame F of the input image, the identification means 101 identifies multiple regions T1 and T2 by identifying the regions corresponding to the objects detected using an object detection model. A region corresponding to an object is, for example, a region enclosing the object.
In this embodiment, the feature amount calculation means 102 calculates the feature amount of each region in the same manner as the feature amount calculation means 102 according to the second embodiment. At this time, as class information, the feature amount calculation means 102 may use the identification result obtained by identifying each region using an object identification model, as in the second embodiment. Alternatively, the feature amount calculation means 102 may use, as class information, the identification result obtained when the identification means 101 detected the objects in the input image by object detection.
As described above, according to this embodiment, the regions for which importance is determined can be regions of arbitrary size in which objects have been detected. This makes it possible to determine the importance efficiently.
The third embodiment has been described above as the importance determination system 100; however, the importance determination system 100 according to the third embodiment may also be implemented as an importance determination device mounted in a single device. The operation of the importance determination system 100 according to the third embodiment may also constitute an importance determination method according to the third embodiment.
Fourth Embodiment

A fourth embodiment of the present invention will be described in detail with reference to the drawings. Note that components having the same functions as those described in the first, second, and third embodiments are given the same reference numerals, and their descriptions are omitted as appropriate.
In this embodiment, the identification means 101 identifies multiple regions by identifying one or more regions in each of multiple input images input from different cameras.
In this embodiment, the feature amount calculation means 102 calculates the feature amounts of the regions in each input image. The determination means 103 can then determine the importance of each of the regions in the multiple input images by inputting, to the trained model M, input data obtained by combining the feature amounts of the regions in the multiple input images.
In this way, by inputting to the trained model input data obtained by combining the feature amounts of the regions in multiple input images, rather than inputting the feature amounts of each input image separately, importance determination can be performed across multiple cameras.
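A minimal sketch of this combination step is shown below, reusing the shapes from the Fig. 6 example. The per-camera region counts and the random features are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in = 1000

# Hypothetical per-camera region features: 4 regions from camera 1, 5 from camera 2.
X_cam1 = rng.standard_normal((4, d_in))
X_cam2 = rng.standard_normal((5, d_in))

# Combine the regions of all cameras into one input data matrix before feeding
# the trained model, so the attention spans regions of different cameras.
X = np.vstack([X_cam1, X_cam2])  # shape (9, d_in)
print(X.shape)
```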
The fourth embodiment has been described above as the importance determination system 100; however, the importance determination system 100 according to the fourth embodiment may also be implemented as an importance determination device mounted in a single device. The operation of the importance determination system 100 according to the fourth embodiment may also constitute an importance determination method according to the fourth embodiment.
The present disclosure is not limited to the embodiments described above, and various modifications are possible; embodiments obtained by appropriately combining the configurations, operations, and processes disclosed in the different embodiments are also included in the technical scope of the present disclosure. Embodiments in which the order of the operations and processes disclosed in the different embodiments is appropriately changed are also included in the technical scope of the present disclosure.
Each configuration according to the first to fourth embodiments may be realized by (1) one or more pieces of hardware, (2) one or more pieces of software, or (3) a combination of hardware and software. Each device, function, and process may be realized by at least one computer having at least one processor and at least one memory. An example of such a computer (hereinafter referred to as computer C) is shown in Fig. 10. For example, a program for implementing the importance determination method described in the first to fourth embodiments may be stored in a memory C2, and each function described in the first to fourth embodiments may be realized by a processor C1 reading and executing the program P stored in the memory C2.
The program P includes a group of instructions that, when loaded into the computer C, cause the computer C to execute one or more of the functions described in the first to fourth embodiments. The program P is stored in the memory C2. As the processor C1, for example, a CPU (Central Processing Unit) can be used. As the memory C2, for example, a Read Only Memory (ROM), a Random Access Memory (RAM), a flash memory, a Solid State Drive (SSD), or the like can be used.
The program P can also be recorded on a non-transitory tangible recording medium M readable by the computer C. As such a recording medium M, for example, a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. The computer C can acquire the program P via such a recording medium M. The program P can also be transmitted via a transmission medium; as such a transmission medium, for example, a communication network or broadcast waves can be used. The computer C can also acquire the program P via such a transmission medium.
The present disclosure is not limited to the embodiments described above. That is, various aspects that a person skilled in the art can understand may be applied to the present invention within the scope of the present disclosure. Note that some or all of the embodiments described above may also be described as in the following supplementary notes; however, the present invention is not limited to the aspects described below.
(Appendix 1)
An importance determination system for determining the importance of multiple regions in one or more input images, comprising:
an identification means for identifying multiple regions in the one or more input images;
a feature amount calculation means for calculating a feature amount of each region; and
a determination means for generating, based on the feature amount of each region, relationship information indicating the relationship between the feature amounts of the regions, and determining the importance of each region based on the relationship information.
(Appendix 2)
The importance determination system according to Appendix 1, wherein the determination means calculates a first matrix from input data obtained by combining the feature amounts of the regions and a first parameter obtained in advance by machine learning, calculates a second matrix from the input data and a second parameter obtained in advance by machine learning, calculates the relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
(Appendix 3)
The importance determination system according to Appendix 1 or 2, wherein the feature amount calculation means includes, in the feature amount of each region, an estimation result of the type of object contained in the region.
(Appendix 4)
The importance determination system according to any one of Appendices 1 to 3, wherein the feature amount calculation means includes, in the feature amount of each region, the position of the region within the input image.
(Appendix 5)
The importance determination system according to any one of Appendices 1 to 4, wherein the identification means identifies the multiple regions, which have at least two or more different sizes and whose positions within the one or more input images are preset.
(Appendix 6)
The importance determination system according to any one of Appendices 1 to 4, wherein the identification means detects multiple objects included in the one or more input images and identifies the multiple regions by identifying a region corresponding to each detected object.
(Appendix 7)
The importance determination system according to any one of Appendices 1 to 6, wherein the identification means identifies the multiple regions by identifying one or more regions in each of multiple input images input from different cameras.
(Appendix 8)
The importance determination system according to any one of Appendices 1 to 7, further comprising an analysis method control means for controlling the method of analyzing each region according to the importance of the region.
(Appendix 9)
An importance determination device for determining the importance of multiple regions in one or more input images, comprising:
an identification unit for identifying multiple regions in the one or more input images;
a feature amount calculation unit for calculating a feature amount of each region; and
a determination unit for generating, based on the feature amount of each region, relationship information indicating the relationship between the feature amounts of the regions, and determining the importance of each region based on the relationship information.
(Appendix 10)
The importance determination device according to Appendix 9, wherein the determination unit calculates a first matrix from input data obtained by combining the feature amounts of the regions and a first parameter obtained in advance by machine learning, calculates a second matrix from the input data and a second parameter obtained in advance by machine learning, calculates the relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
(Appendix 11)
The importance determination device according to Appendix 9 or 10, wherein the feature amount calculation unit includes, in the feature amount of each region, an estimation result of the type of object contained in the region.
(Appendix 12)
The importance determination device according to any one of Appendices 9 to 11, wherein the feature amount calculation unit includes, in the feature amount of each region, the position of the region within the input image.
(Appendix 13)
The importance determination device according to any one of Appendices 9 to 12, wherein the identification unit identifies the multiple regions, which have at least two or more different sizes and whose positions within the one or more input images are preset.
(Appendix 14)
The importance determination device according to any one of Appendices 9 to 12, wherein the identification unit detects a plurality of objects included in the one or more input images and identifies the plurality of regions by identifying a region corresponding to each of the detected objects.
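Appendices 13 and 14 describe two ways the identification unit may obtain regions: preset positions with at least two different sizes, or one region per detected object. A brief sketch of both follows; the tiling scheme and the detection tuple format are assumptions.

```python
def preset_regions(image_w, image_h, sizes=((64, 64), (128, 128))):
    """Appendix 13: regions at preset positions with at least two sizes."""
    regions = []
    for w, h in sizes:
        for y in range(0, image_h - h + 1, h):
            for x in range(0, image_w - w + 1, w):
                regions.append((x, y, w, h))
    return regions

def detected_regions(detections):
    """Appendix 14: one region per detected object.
    `detections` is assumed to be an iterable of (box, score, label)."""
    return [box for box, score, label in detections]
```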
(Appendix 15)
The importance determination device according to any one of Appendices 9 to 14, wherein the identification unit identifies the plurality of regions by identifying one or more regions in each of a plurality of input images input from different cameras.
(Appendix 16)
The importance determination device according to any one of Appendices 9 to 15, further comprising an analysis method control unit that controls a method of analyzing each region in accordance with the importance of the region.
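A natural reading of the analysis method control in Appendices 8 and 16 is to route high-importance regions to heavier analysis. The threshold and the two method labels in this sketch are illustrative assumptions:

```python
def control_analysis(regions, importances, threshold=0.5):
    """Route each region to an analysis method according to its importance."""
    plan = []
    for region, score in zip(regions, importances):
        method = "detailed_analysis" if score >= threshold else "lightweight_analysis"
        plan.append((region, method))
    return plan
```

The same routing could equally select a compression quality or an edge-versus-server execution path; the appendix leaves the analysis method itself open.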
(Appendix 17)
An importance determination method for determining the importance of a plurality of regions in one or more input images, the method comprising:
identifying a plurality of regions in the one or more input images;
calculating a feature amount of each region; and
generating relationship information indicating relationships between the feature amounts of the regions based on the feature amounts of the regions, and determining the importance of each region based on the relationship information.
(Appendix 18)
The importance determination method according to Appendix 17, comprising: calculating a first matrix from input data combining the feature amounts of the regions and a first parameter machine-learned in advance; calculating a second matrix from the input data and a second parameter machine-learned in advance; calculating the relationship information based on the first matrix and the second matrix; and calculating the importance of each region based on the relationship information.
(Appendix 19)
The importance determination method according to Appendix 17 or 18, wherein the feature amount of each region includes an estimation result of the type of object included in the region.
(Appendix 20)
The importance determination method according to any one of Appendices 17 to 19, wherein the feature amount of each region includes the position of the region within the input image.
(Appendix 21)
The importance determination method according to any one of Appendices 17 to 20, comprising identifying the plurality of regions, which have at least two different sizes and whose positions within the one or more input images are preset.
(Appendix 22)
The importance determination method according to any one of Appendices 17 to 20, comprising detecting a plurality of objects included in the one or more input images and identifying the plurality of regions by identifying a region corresponding to each of the detected objects.
(Appendix 23)
The importance determination method according to any one of Appendices 17 to 22, wherein the plurality of regions are identified by identifying one or more regions in each of a plurality of input images input from different cameras.
(Appendix 24)
The importance determination method according to any one of Appendices 17 to 23, wherein a method of analyzing each region is controlled in accordance with the importance of the region.
(Appendix 25)
The above-described importance determination system can also be expressed as follows.
An importance determination system for determining the importance of a plurality of regions in one or more input images, comprising at least one processor, the processor executing:
an identification process of identifying a plurality of regions in the one or more input images;
a feature amount calculation process of calculating a feature amount of each region; and
a determination process of generating relationship information indicating relationships between the feature amounts of the regions based on the feature amounts of the regions, and determining the importance of each region based on the relationship information.
This importance determination system may further include at least one memory, and the memory may store a program for causing the processor to execute the identification process, the feature amount calculation process, and the determination process. The program may be recorded on a computer-readable, non-transitory, tangible recording medium.
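Chaining the three processes of Appendix 25 end to end, a self-contained sketch might look as follows; the fixed tiling, mean-color appearance feature, and attention-style readout repeat the illustrative assumptions used above and are not the claimed implementation.

```python
import numpy as np

def determine_importance(image, W_q, W_k, size=64):
    """One pass of the Appendix 25 pipeline on an (H, W, 3) image:
    identification -> feature amount calculation -> determination."""
    h, w = image.shape[:2]
    # identification process: fixed-size tiles at preset positions
    regions = [(x, y, size, size)
               for y in range(0, h - size + 1, size)
               for x in range(0, w - size + 1, size)]
    # feature amount calculation process: mean color + normalized position
    feats = np.stack([
        np.concatenate([
            image[y:y + size, x:x + size].reshape(-1, 3).mean(axis=0),
            [x / w, y / h, size / w, size / h],
        ])
        for x, y, _, _ in regions
    ])
    # determination process: relationship information from learned projections
    Q, K = feats @ W_q, feats @ W_k
    scores = Q @ K.T / np.sqrt(W_q.shape[1])
    A = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)
    return regions, A.mean(axis=0)
```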
(Appendix 26)
The above-described importance determination device can also be expressed as follows.
An importance determination device for determining the importance of a plurality of regions in one or more input images, comprising at least one processor, the processor executing:
an identification process of identifying a plurality of regions in the one or more input images;
a feature amount calculation process of calculating a feature amount of each region; and
a determination process of generating relationship information indicating relationships between the feature amounts of the regions based on the feature amounts of the regions, and determining the importance of each region based on the relationship information.
This importance determination device may further include at least one memory, and the memory may store a program for causing the processor to execute the identification process, the feature amount calculation process, and the determination process. The program may be recorded on a computer-readable, non-transitory, tangible recording medium.
REFERENCE SIGNS LIST
1 Processing system
10 Camera
20 First processing unit
30 Second processing unit
100 Importance determination system
101 Identification means
102 Feature amount calculation means
103 Determination means
104 Analysis method control means
M Trained model

Claims (20)

1. An importance determination system for determining the importance of a plurality of regions in one or more input images, comprising:
identification means for identifying a plurality of regions in the one or more input images;
feature amount calculation means for calculating a feature amount of each region; and
determination means for generating relationship information indicating relationships between the feature amounts of the regions based on the feature amounts of the regions, and determining the importance of each region based on the relationship information.
2. The importance determination system according to claim 1, wherein the determination means calculates a first matrix from input data combining the feature amounts of the regions and a first parameter machine-learned in advance, calculates a second matrix from the input data and a second parameter machine-learned in advance, calculates the relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
3. The importance determination system according to claim 1 or 2, wherein the feature amount calculation means includes, in the feature amount of each region, an estimation result of the type of object included in the region.
4. The importance determination system according to any one of claims 1 to 3, wherein the feature amount calculation means includes, in the feature amount of each region, the position of the region within the input image.
5. The importance determination system according to any one of claims 1 to 4, wherein the identification means identifies the plurality of regions, which have at least two different sizes and whose positions within the one or more input images are preset.
6. The importance determination system according to any one of claims 1 to 4, wherein the identification means detects a plurality of objects included in the one or more input images, and identifies the plurality of regions by identifying a region corresponding to each of the detected objects.
7. The importance determination system according to any one of claims 1 to 6, wherein the identification means identifies the plurality of regions by identifying one or more regions in each of a plurality of input images input from different cameras.
8. An importance determination device for determining the importance of a plurality of regions in one or more input images, comprising:
an identification unit that identifies a plurality of regions in the one or more input images;
a feature amount calculation unit that calculates a feature amount of each region; and
a determination unit that generates relationship information indicating relationships between the feature amounts of the regions based on the feature amounts of the regions, and determines the importance of each region based on the relationship information and the feature amounts of the regions.
9. The importance determination device according to claim 8, wherein the determination unit calculates a first matrix from input data combining the feature amounts of the regions and a first parameter machine-learned in advance, calculates a second matrix from the input data and a second parameter machine-learned in advance, calculates the relationship information based on the first matrix and the second matrix, and calculates the importance of each region based on the relationship information.
10. The importance determination device according to claim 8 or 9, wherein the feature amount calculation unit includes, in the feature amount of each region, an estimation result of the type of object included in the region.
11. The importance determination device according to any one of claims 8 to 10, wherein the feature amount calculation unit includes, in the feature amount of each region, the position of the region within the input image.
12. The importance determination device according to any one of claims 8 to 11, wherein the identification unit identifies the plurality of regions, which have at least two different sizes and whose positions within the one or more input images are preset.
13. The importance determination device according to any one of claims 8 to 11, wherein the identification unit detects a plurality of objects included in the one or more input images, and identifies the plurality of regions by acquiring a region corresponding to each of the detected objects.
14. An importance determination method for determining the importance of a plurality of regions in one or more input images, the method comprising:
identifying a plurality of regions in the one or more input images;
calculating a feature amount of each region; and
generating relationship information indicating relationships between the feature amounts of the regions based on the feature amounts of the regions, and determining the importance of each region based on the relationship information and the feature amounts of the regions.
15. The importance determination method according to claim 14, comprising: calculating a first matrix from input data combining the feature amounts of the regions and a first parameter machine-learned in advance; calculating a second matrix from the input data and a second parameter machine-learned in advance; calculating the relationship information based on the first matrix and the second matrix; and calculating the importance of each region based on the relationship information.
16. The importance determination method according to claim 14 or 15, wherein the feature amount of each region includes an estimation result of the type of object included in the region.
17. The importance determination method according to any one of claims 14 to 16, wherein the feature amount of each region includes the position of the region within the input image.
18. The importance determination method according to any one of claims 14 to 17, comprising acquiring the plurality of regions, which have at least two different sizes and whose positions within the one or more input images are preset.
19. The importance determination method according to any one of claims 14 to 17, comprising detecting a plurality of objects included in the one or more input images, and identifying the plurality of regions by acquiring a region corresponding to each of the detected objects.
20. The importance determination method according to any one of claims 14 to 19, wherein the plurality of regions are identified by identifying one or more regions in each of a plurality of input images input from different cameras.

PCT/JP2022/038458 2022-10-14 2022-10-14 Degree-of-importance assessment system, degree-of-importance assessment device, and degree-of-importance assessment method WO2024079903A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/038458 WO2024079903A1 (en) 2022-10-14 2022-10-14 Degree-of-importance assessment system, degree-of-importance assessment device, and degree-of-importance assessment method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/038458 WO2024079903A1 (en) 2022-10-14 2022-10-14 Degree-of-importance assessment system, degree-of-importance assessment device, and degree-of-importance assessment method

Publications (1)

Publication Number Publication Date
WO2024079903A1 true WO2024079903A1 (en) 2024-04-18

Family

ID=90669346

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/038458 WO2024079903A1 (en) 2022-10-14 2022-10-14 Degree-of-importance assessment system, degree-of-importance assessment device, and degree-of-importance assessment method

Country Status (1)

Country Link
WO (1) WO2024079903A1 (en)

Similar Documents

Publication Publication Date Title
US11610477B2 (en) Traffic assistance system, server, and vehicle-mounted device
JP5693162B2 (en) Image processing system, imaging apparatus, image processing apparatus, control method therefor, and program
US20150312529A1 (en) System and method for video-based determination of queue configuration parameters
CN110176024B (en) Method, device, equipment and storage medium for detecting target in video
CN112232426B (en) Training method, device and equipment of target detection model and readable storage medium
US10616340B2 (en) Distributed computing of large data by selecting a computational resource of a remote server based on selection policies and data information wherein the selections policies are associated with location constraints, time constraints, and data type constraints
US20200272524A1 (en) Method and system for auto-setting of image acquisition and processing modules and of sharing resources in large scale video systems
US10989543B2 (en) Controller
CN111770306A (en) Scene monitoring method and device, computer equipment and storage medium
US20210056774A1 (en) Information processing method and information processing system
JP2019503152A (en) Centralized control server, local terminal, distributed monitoring system, monitoring method, and program
CN111242167B (en) Distributed image labeling method, device, computer equipment and storage medium
CN111860256A (en) Security detection method and device, computer equipment and storage medium
WO2024079903A1 (en) Degree-of-importance assessment system, degree-of-importance assessment device, and degree-of-importance assessment method
JP6807042B2 (en) Information processing equipment, information processing methods and programs
WO2024079904A1 (en) Processing control system, processing control device, and processing control method
WO2024079901A1 (en) Processing control system, processing control device, and processing control method
US11558722B2 (en) Management apparatus, communication apparatus, system, method, and non-transitory computer readable medium
WO2024079902A1 (en) Processing control system, processing control device, and processing control method
CN115345305A (en) Inference system, method, device and related equipment
US11195025B2 (en) Information processing device, information processing method, and storage medium for temporally dividing time-series data for analysis
Wang et al. Real-Time High-Resolution Pedestrian Detection in Crowded Scenes via Parallel Edge Offloading
EP4199495A1 (en) Regulating frame processing rates
JP7151795B2 (en) Data stream allocation method, system and program
US20220357991A1 (en) Information processing apparatus, computer-readable recording medium storing aggregation control program, and aggregation control method