WO2020027074A1 - Solid-state imaging device and electronic apparatus - Google Patents

Solid-state imaging device and electronic apparatus

Publication number
WO2020027074A1
Authority
WO
WIPO (PCT)
Prior art keywords
image data
unit
image
imaging
solid
Prior art date
Application number
PCT/JP2019/029715
Other languages
French (fr)
Japanese (ja)
Inventor
良仁 浴
Original Assignee
ソニーセミコンダクタソリューションズ株式会社
Priority date
Filing date
Publication date
Application filed by ソニーセミコンダクタソリューションズ株式会社
Priority to US17/251,953 (US11820289B2)
Priority claimed from JP2019139196A (JP6725733B2)
Publication of WO2020027074A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N25/00 Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/40 Extracting pixel data from image sensors by controlling scanning circuits, e.g. by modifying the number of pixels sampled or to be sampled
    • H04N25/44 Extracting pixel data from image sensors by controlling scanning circuits, e.g. by modifying the number of pixels sampled or to be sampled by partially reading an SSIS array

Definitions

  • the present disclosure relates to a solid-state imaging device and an electronic device. More specifically, the present invention relates to processing of image data in a chip.
  • CMOS Complementary Metal Oxide Semiconductor
  • DSP Digital Signal Processor
  • the present disclosure proposes a solid-state imaging device and an electronic device that can execute processing in a chip of an image sensor.
  • A solid-state imaging device includes: an imaging unit that acquires image data; a processing unit that executes, on the image data or on data based on the image data, a process of extracting a specific region based on a neural network calculation model; and an output unit that outputs image data processed based on the specific region, or image data read from the imaging unit based on the specific region.
  • FIG. 1 is a block diagram illustrating a schematic configuration example of an imaging device as an electronic apparatus according to the first embodiment.
  • FIG. 2 is a diagram for describing image processing according to the first embodiment.
  • FIG. 3 is a flowchart showing the flow of the processing according to the first embodiment.
  • FIG. 4 is a diagram illustrating a modification of the first embodiment.
  • FIG. 5 is a diagram illustrating an imaging device according to the second embodiment.
  • FIG. 6 is a diagram illustrating a modification of the second embodiment.
  • FIG. 7 is a diagram illustrating an imaging device according to the third embodiment.
  • FIG. 8 is a sequence diagram showing the flow of the processing according to the third embodiment.
  • FIG. 9 is a schematic diagram illustrating a chip configuration example of the image sensor according to the embodiment.
  • FIG. 10 is a diagram for explaining a layout example according to the embodiment.
  • FIG. 11 is a diagram for explaining a layout example according to the embodiment.
  • The remaining drawings are: a block diagram showing an example of the schematic configuration of a vehicle control system; an explanatory diagram showing an example of the installation positions of a vehicle exterior information detection unit and an imaging unit; a diagram showing an example of the schematic configuration of an endoscopic surgery system; a block diagram illustrating an example of the functional configuration of a camera head and a CCU; and a block diagram showing an example of the schematic configuration of a diagnosis support system.
  • FIG. 1 is a block diagram illustrating a schematic configuration example of an imaging device as an electronic device according to the first embodiment.
  • the imaging device 1 is communicably connected to a cloud server 30.
  • The imaging device 1 and the cloud server 30 are communicably connected to each other via a network or a USB (Universal Serial Bus) cable; the connection may be wired or wireless.
  • the cloud server 30 is an example of a server device that stores image data such as still images and moving images transmitted from the imaging device 1.
  • the cloud server 30 can store image data in arbitrary units, such as for each user, for each date, and for each imaging location, and can provide various services such as creating an album using the image data.
  • the imaging device 1 is an example of an electronic device having the image sensor 10 and the application processor 20, and is, for example, a digital camera, a digital video camera, a tablet terminal, a smartphone, or the like.
  • an example in which an image is captured will be described.
  • the present invention is not limited to this, and the same processing can be performed for a moving image and the like.
  • The image sensor 10 is, for example, a CMOS (Complementary Metal Oxide Semiconductor) image sensor composed of one chip. It receives incident light, performs photoelectric conversion, and outputs image data corresponding to the amount of incident light received to the application processor 20.
  • the application processor 20 is an example of a processor such as a CPU (Central Processing Unit) that executes various applications.
  • The application processor 20 executes various processes corresponding to the application, such as a display process of displaying image data input from the image sensor 10 on a display, a biometric authentication process using the image data, and a transmission process of transmitting the image data to the cloud server 30.
  • the imaging device 1 includes an image sensor 10 that is a solid-state imaging device, and an application processor 20.
  • the image sensor 10 includes an imaging unit 11, a control unit 12, a signal processing unit 13, a DSP (also called a processing unit) 14, a memory 15, and a selector 16 (also called an output unit).
  • The imaging unit 11 includes, for example, an optical system 104 including a zoom lens, a focus lens, an aperture, and the like, and a pixel array unit 101 in which unit pixels including light receiving elements such as photodiodes are arranged in a two-dimensional matrix.
  • Light incident from the outside passes through the optical system 104 to form an image on a light receiving surface of the pixel array unit 101 on which light receiving elements are arranged.
  • Each unit pixel of the pixel array unit 101 converts the light incident on the light receiving element into an electric charge, and accumulates a charge corresponding to the amount of incident light in a readable manner.
  • the imaging unit 11 includes a converter (Analog to Digital Converter: hereinafter referred to as an ADC) 17 (for example, see FIG. 2).
  • the ADC 17 generates digital image data by converting an analog pixel signal for each unit pixel read from the imaging unit 11 into a digital value, and outputs the generated image data to the signal processing unit 13.
  • the ADC 17 may include a voltage generation circuit that generates a drive voltage for driving the imaging unit 11 from a power supply voltage or the like.
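  • As a rough illustration of the conversion performed by the ADC 17, the following Python sketch maps an analog pixel voltage to a digital code. The function name `quantize`, the 10-bit depth, and the 1.0 V full-scale reference are assumptions made for illustration; the disclosure does not specify them.

```python
def quantize(voltage, v_ref=1.0, bits=10):
    """Map an analog voltage in [0, v_ref] to an integer code (assumed ideal ADC)."""
    max_code = (1 << bits) - 1               # 1023 for the assumed 10-bit depth
    clamped = min(max(voltage, 0.0), v_ref)  # saturate out-of-range inputs
    return round(clamped / v_ref * max_code)


# One row of hypothetical analog pixel signals, in volts.
analog_row = [0.0, 0.25, 1.0, 1.2]
digital_row = [quantize(v) for v in analog_row]
```

    Inputs above the assumed reference voltage saturate at the maximum code, which is why the last two samples map to the same value.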
  • The size of the image data output by the imaging unit 11 can be selected from a plurality of sizes such as 12M (3968 × 2976 pixels) and VGA (Video Graphics Array) size (640 × 480 pixels).
  • the image data output by the imaging unit 11 can be, for example, a color image of RGB (red, green, and blue) or a monochrome image of only luminance. These selections can be made as a kind of setting of the shooting mode.
  • the control unit 12 controls each unit in the image sensor 10 according to, for example, a user operation or a set operation mode.
  • The signal processing unit 13 performs various kinds of signal processing on digital image data read from the imaging unit 11 or digital image data read from the memory 15 (hereinafter referred to as processing target image data).
  • the signal processing unit 13 converts the format of the image data into YUV image data, RGB image data, or the like.
  • the signal processing unit 13 performs, for example, processing such as noise removal and white balance adjustment on the image data to be processed as necessary.
  • the signal processing unit 13 performs various signal processing (also referred to as pre-processing) necessary for the DSP 14 to process the image data to be processed.
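  • The format conversion and white balance adjustment mentioned above can be sketched per pixel as follows. This is only an illustration: the disclosure does not say which colour matrix is used, so the BT.601 luminance weights, the function names, and the gain values are assumptions.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in 0..1) to YUV using BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue-difference chroma
    v = 0.877 * (r - y)                     # red-difference chroma
    return y, u, v


def white_balance(pixel, gains):
    """Scale each colour channel by a per-channel gain (hypothetical values)."""
    return tuple(c * g for c, g in zip(pixel, gains))


# A bluish-deficient grey pixel is corrected, then converted to YUV.
balanced = white_balance((0.5, 0.5, 0.25), gains=(1.0, 1.0, 2.0))
y, u, v = rgb_to_yuv(*balanced)
```

    After balancing, the pixel is neutral grey, so both chroma components come out at (approximately) zero.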
  • The DSP 14 executes a program stored in the memory 15, thereby functioning as a processing unit that executes various processes using a learned model created by machine learning using a deep neural network (DNN).
  • For example, by executing arithmetic processing based on the learned model stored in the memory 15, the DSP 14 multiplies the image data by the dictionary coefficients stored in the memory 15.
  • the result (calculation result) obtained by such calculation processing is output to the memory 15 and / or the selector 16.
  • the calculation result may include image data obtained by executing a calculation process using the learned model, and various information (metadata) obtained from the image data.
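  • To make the multiplication of image data by stored dictionary coefficients concrete, the sketch below evaluates a single fully connected layer in plain Python. The layer shape, the bias terms, and the ReLU activation are assumptions chosen for illustration; the actual learned model is not specified at this level of detail.

```python
def dense_layer(pixels, coefficients, biases):
    """Multiply a flattened image by stored coefficients (one row per output)."""
    outputs = []
    for row, bias in zip(coefficients, biases):
        acc = bias
        for weight, pixel in zip(row, pixels):
            acc += weight * pixel          # multiply-accumulate
        outputs.append(max(acc, 0.0))      # ReLU, an assumed activation
    return outputs


# Hypothetical 4-pixel input and a 2-output coefficient table.
pixels = [0.0, 0.5, 1.0, 0.5]
coefficients = [[1.0, 0.0, -1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
biases = [0.0, -0.5]
result = dense_layer(pixels, coefficients, biases)
```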
  • the DSP 14 may include a memory controller for controlling access to the memory 15.
  • the arithmetic processing includes, for example, one using a learned learning model which is an example of a neural network calculation model.
  • the DSP 14 can execute DSP processing, which is various processing, using a learned learning model.
  • The DSP 14 reads out image data from the memory 15, inputs it into a learned learning model, and acquires a face position, such as a face outline or a face image area, as the output of the learned model. The DSP 14 then performs processing such as masking, mosaicing, or avatar conversion on the extracted face position in the image data to generate processed image data. After that, the DSP 14 stores the generated processed image data in the memory 15.
  • the learned learning model includes a DNN, a support vector machine, and the like, which have learned the detection of the face position of a person using the learning data.
  • the learned learning model outputs a discrimination result, that is, area information such as an address for specifying a face position.
  • The DSP 14 can also execute the above-described arithmetic processing after updating the learning model by changing the weights of various parameters in the learning model using learning data, after preparing a plurality of learning models and switching the learning model used according to the content of the arithmetic processing, or after acquiring or updating a learned model from an external device.
  • The image data to be processed by the DSP 14 may be image data normally read from the pixel array unit 101, or image data whose data size has been reduced by thinning out pixels of the normally read image data.
  • Alternatively, the image data may be image data read out with a smaller data size than usual by reading the pixel array unit 101 with pixels thinned out.
  • the normal reading here may be reading without skipping pixels.
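  • Thinning out pixels can be pictured as keeping only every n-th pixel in each direction. The sketch below assumes regular decimation with a fixed step; the function name `thin_out` and the step of 2 are illustrative choices, and other thinning patterns are equally possible.

```python
def thin_out(image, step=2):
    """Keep every `step`-th pixel in both directions (assumed regular decimation)."""
    return [row[::step] for row in image[::step]]


# A hypothetical 4 x 6 frame; each value encodes its row and column.
full = [[x + 10 * y for x in range(6)] for y in range(4)]
small = thin_out(full, step=2)
```

    With a step of 2 the thinned frame holds one quarter of the pixels of the normally read frame.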
  • In this way, it is possible to generate processed image data in which the face position of the image data is masked, processed image data in which the face position is mosaic-processed, or avatar-processed image data in which the face position is replaced with a character.
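  • The masking and mosaic processing of a detected face position can be sketched as follows for a grayscale image held as a list of rows. The `(top, left, height, width)` region format, the function names, and the block size are assumptions for illustration, not the representation used by the device.

```python
def mask_region(image, roi, fill=0):
    """Return a copy of the image with the ROI overwritten by a fill value."""
    top, left, height, width = roi
    out = [row[:] for row in image]
    for y in range(top, top + height):
        for x in range(left, left + width):
            out[y][x] = fill
    return out


def mosaic_region(image, roi, block=2):
    """Return a copy with the ROI replaced by per-block average values."""
    top, left, height, width = roi
    out = [row[:] for row in image]
    for by in range(top, top + height, block):
        for bx in range(left, left + width, block):
            ys = range(by, min(by + block, top + height))
            xs = range(bx, min(bx + block, left + width))
            total = sum(image[y][x] for y in ys for x in xs)
            mean = total // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
face_roi = (1, 1, 2, 2)  # hypothetical detection result
masked = mask_region(image, face_roi)
mosaicked = mosaic_region(image, face_roi)
```

    Both functions return a new image and leave the input untouched, so the unprocessed data remains available for output if processing is not selected.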
  • the memory 15 stores the image data output from the imaging unit 11, the image data processed by the signal processing unit 13, the calculation result obtained by the DSP 14, and the like as necessary. Further, the memory 15 stores an algorithm of a learned learning model executed by the DSP 14 as a program and a dictionary coefficient.
  • In addition to the image data output from the signal processing unit 13 and the image data processed by the DSP 14 (hereinafter referred to as processed image data), the memory 15 may store the ISO (International Organization for Standardization) sensitivity, exposure time, frame rate, focus, shooting mode, cutout range, and the like. That is, the memory 15 can store various types of imaging information set by the user.
  • The selector 16 selectively outputs the processed image data output from the DSP 14 or the image data stored in the memory 15 according to, for example, a selection control signal from the control unit 12. For example, according to a user setting or the like, the selector 16 selects either the processed image data or a calculation result such as metadata stored in the memory 15, and outputs the selected data to the application processor 20.
  • The selector 16 reads the processed image data generated by the DSP 14 from the memory 15 and outputs it to the application processor 20.
  • The selector 16 outputs the image data input from the signal processing unit 13 to the application processor 20.
  • the selector 16 may directly output the calculation result output from the DSP 14 to the application processor 20.
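  • The selection behaviour described above can be modelled as a simple dispatch on a control signal. In this sketch the `OutputMode` values, the dictionary keys, and the function name are all hypothetical; they stand in for the selection control signal and the contents of the memory 15.

```python
from enum import Enum


class OutputMode(Enum):
    PROCESSED = "processed"      # processed image data generated by the DSP
    UNPROCESSED = "unprocessed"  # image data from the signal processing unit
    METADATA = "metadata"        # calculation result such as detected regions


def select_output(mode, memory):
    """Return the item chosen by the selection control signal."""
    if mode is OutputMode.PROCESSED:
        return memory["processed_image"]
    if mode is OutputMode.UNPROCESSED:
        return memory["image"]
    if mode is OutputMode.METADATA:
        return memory["metadata"]
    raise ValueError(f"unknown output mode: {mode!r}")


memory = {
    "image": [[1, 2], [3, 4]],
    "processed_image": [[0, 0], [3, 4]],
    "metadata": {"face_roi": (0, 0, 1, 2)},
}
output = select_output(OutputMode.PROCESSED, memory)
```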
  • the image data and the processed image data output from the selector 16 as described above are input to the application processor 20 that processes display and user interface.
  • the application processor 20 is configured using, for example, a CPU, and executes an operating system, various application software, and the like.
  • the application processor 20 may have functions such as a GPU (Graphics Processing Unit) and a baseband processor.
  • The application processor 20 performs various processes as needed on the input image data and calculation results, displays them to the user, or transmits them to the external cloud server 30 via the predetermined network 40.
  • Various networks such as the Internet, a wired LAN (Local Area Network) or a wireless LAN, a mobile communication network, and Bluetooth (registered trademark) can be applied to the predetermined network 40.
  • The transmission destination of the image data and the calculation result is not limited to the cloud server 30, and may be any of various information processing devices (systems) having a communication function, such as a server that operates alone, a file server that stores various data, or a communication terminal such as a mobile phone.
  • FIG. 2 is a diagram illustrating processing of an image according to the first embodiment.
  • the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the processed data in the memory 15.
  • the DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (Process 1).
  • The DSP 14 then performs processing (Process 2) such as masking or mosaicing on the detected face position, generates processed image data, and stores the processed image data in the memory 15. Thereafter, the selector 16 outputs the processed image data in which the face area has been processed to the application processor 20 according to the user's selection.
  • FIG. 3 is a flowchart showing the flow of the processing according to the first embodiment. As shown in FIG. 3, the image data captured by the imaging unit 11 is stored in the memory 15 (S101).
  • the DSP 14 reads out the image data from the memory 15 (S102), and detects the face position using the learned learning model (S103). Subsequently, the DSP 14 generates processed image data obtained by processing the face position of the image data, and stores the processed image data in the memory 15 (S104).
  • Thereafter, when the user has selected that processing be performed, the selector 16 reads the processed image data from the memory 15 and outputs it to an external device such as the application processor 20 (S106).
  • On the other hand, when the user has selected that processing not be performed, the selector 16 reads the unprocessed image data from the memory 15 and outputs it to an external device such as the application processor 20 (S107).
  • As described above, the image sensor 10 can execute the processing in a closed area within one chip even when processing is required, so that the captured image data can be prevented from being output to the outside as it is, and security and privacy protection can be improved. Further, since the image sensor 10 allows the user to select whether or not to perform processing, the processing mode can be selected according to the application, and convenience for the user can be improved.
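  • The flow of steps S101 to S107 can be summarised in a short sketch. The detector is passed in as a callable because the learned model itself is outside the scope of this illustration; `run_frame`, `stub_detector`, the dictionary standing in for the memory 15, and the choice of masking as the processing are all assumptions.

```python
def run_frame(frame, detect_face, apply_processing):
    """Sketch of steps S101 to S107 for one frame."""
    memory = {"image": frame}                      # S101: store captured data
    image = memory["image"]                        # S102: DSP reads it back
    top, left, height, width = detect_face(image)  # S103: detect face position
    processed = [row[:] for row in image]          # S104: mask the face region
    for y in range(top, top + height):
        for x in range(left, left + width):
            processed[y][x] = 0
    memory["processed_image"] = processed
    if apply_processing:                           # user's selection
        return memory["processed_image"]           # S106: output processed data
    return memory["image"]                         # S107: output unprocessed data


def stub_detector(image):
    """Stand-in for the learned model; always reports the same region."""
    return (0, 1, 2, 2)


frame = [[9, 9, 9, 9], [9, 9, 9, 9], [9, 9, 9, 9]]
with_processing = run_frame(frame, stub_detector, apply_processing=True)
without_processing = run_frame(frame, stub_detector, apply_processing=False)
```

    Everything up to the final branch happens inside the function, mirroring the point that the unprocessed frame need not leave the chip when processing is selected.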
  • FIG. 4 is a diagram illustrating a modification of the first embodiment.
  • the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the processed data in the memory 15.
  • the DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (Process 1).
  • the DSP 14 generates partial image data from which the detected face position is extracted (Process 2), and stores the partial image data in the memory 15.
  • the selector 16 outputs the partial image data of the face to the application processor 20 according to the user's selection.
  • As described above, the image sensor 10 can extract partial image data in a closed area within one chip even when processing is required, so that it can output an image suited to the processing of the application processor 20, such as identification of a person, face authentication, or image collection for each person. As a result, transmission of unnecessary images can be suppressed, security can be improved, privacy can be protected, and data capacity can be reduced.
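  • Extracting partial image data amounts to cropping the detected region, and the reduction in data volume follows directly from the pixel counts. The sketch below assumes a `(top, left, height, width)` region format and hypothetical helper names.

```python
def crop_region(image, roi):
    """Return only the pixels inside the ROI as a new, smaller image."""
    top, left, height, width = roi
    return [row[left:left + width] for row in image[top:top + height]]


def pixel_count(image):
    """Total number of pixels in an image stored as a list of rows."""
    return sum(len(row) for row in image)


# Hypothetical 6 x 8 frame and detected face region.
image = [[x + 10 * y for x in range(8)] for y in range(6)]
face = crop_region(image, (2, 3, 2, 4))
ratio = pixel_count(face) / pixel_count(image)
```

    Here the partial image carries 8 of the 48 pixels of the frame, which illustrates how outputting only the region of interest reduces the data sent to the application processor 20.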
  • FIG. 5 is a diagram illustrating an imaging device according to the second embodiment. As shown in FIG. 5, since the configuration of the image sensor 10 according to the second embodiment is the same as that of the image sensor 10 according to the first embodiment, a detailed description is omitted. The difference from the first embodiment is that the DSP 14 of the image sensor 10 notifies the selector 16 of the position information of the face position extracted using the learning model.
  • the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the processed data in the memory 15.
  • the DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (Process 1). Then, the DSP 14 notifies the selector 16 of position information such as an address for specifying the face position.
  • the selector 16 reads the image data from the memory 15 when the processing is selected by the user, and specifies the ROI (Region of interest) to be processed by using the position information acquired from the DSP 14. Then, the selector 16 performs processing such as masking on the specified ROI to generate processed image data (Process 2), and outputs the processed image data to the application processor 20. Note that the selector 16 can also store the processed image data in the memory 15.
  • In the second embodiment as well, the selector 16 can generate a partial image in which the face position is extracted.
  • FIG. 6 is a diagram illustrating a first modification of the second embodiment.
  • the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the processed data in the memory 15.
  • the DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (Process 1). Then, the DSP 14 notifies the selector 16 of position information such as an address for specifying the face position.
  • The selector 16 reads out the image data from the memory 15 and specifies the ROI (Region of Interest) to be processed using the position information acquired from the DSP 14. Thereafter, the selector 16 generates partial image data in which a portion corresponding to the ROI is extracted from the image data (Process 2), and outputs the partial image data to the application processor 20.
  • In the examples above, the selector 16 performs Process 2, such as extraction (also called cutout or trimming) or processing (such as masking) of an ROI, on image data stored in the memory 15. However, the selector 16 can also be configured to perform Process 2 directly on the image data output from the signal processing unit 13.
  • the image data itself read from the imaging unit 11 may be partial image data only of the ROI or image data not including the ROI.
  • In this case, the face position extracted by the DSP 14 for a first frame is notified to the control unit 12, and for the second frame, which is the frame following the first frame, the control unit 12 causes the imaging unit 11 to read partial image data from the pixel area corresponding to the ROI, or to read image data from the pixel area corresponding to the area other than the ROI.
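  • One way to picture the readout instruction for the second frame is as a per-pixel flag saying whether that pixel is read. The sketch below is an assumption about representation only; the function name `readout_mask` and the boolean-grid form are not taken from the disclosure, which does not describe how the control unit 12 encodes the instruction.

```python
def readout_mask(sensor_height, sensor_width, roi, read_roi=True):
    """Build a grid of flags: True where the pixel is read in the next frame."""
    top, left, height, width = roi
    mask = []
    for y in range(sensor_height):
        row = []
        for x in range(sensor_width):
            inside = top <= y < top + height and left <= x < left + width
            # read_roi=True reads only the ROI; False reads everything else.
            row.append(inside if read_roi else not inside)
        mask.append(row)
    return mask


roi_from_first_frame = (1, 2, 2, 3)  # hypothetical face position
only_roi = readout_mask(4, 6, roi_from_first_frame, read_roi=True)
without_roi = readout_mask(4, 6, roi_from_first_frame, read_roi=False)
pixels_read_roi = sum(flag for row in only_roi for flag in row)
pixels_read_rest = sum(flag for row in without_roi for flag in row)
```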
  • The selector 16 is not limited to processing such as masking; it can also rewrite only the area corresponding to the ROI in the image data with another image and output the result, or output the image data without reading only the area corresponding to the ROI from the memory 15. Note that this processing can also be executed by the DSP 14 in the first embodiment.
  • As described above, because the image sensor 10 can execute the processing in the selector 16, the processing load on the DSP 14 can be reduced when processing is unnecessary. Further, since the image sensor 10 can output the image processed by the selector 16 without storing it in the memory 15, the used capacity of the memory 15 can be reduced, and the cost and size of the memory 15 can be reduced. As a result, the size of the entire image sensor 10 can be reduced.
  • The image sensor 10 can read out small-capacity image data first and detect the face position before the entire image data has been read out from the imaging unit 11, thereby speeding up the processing.
  • FIG. 7 is a diagram illustrating an imaging device according to the third embodiment. As shown in FIG. 7, the configuration of the image sensor 10 according to the third embodiment is the same as that of the image sensor 10 according to the first embodiment, and a detailed description thereof will be omitted. Here, differences from the first embodiment will be described.
  • When reading image data from all the unit pixels, the imaging unit 11 first reads out small-capacity image data by thinning, that is, from target unit pixels rather than from all the unit pixels, and stores the thinned-out small-capacity image data in the memory 15. In parallel with this, the imaging unit 11 executes normal reading of image data.
  • the DSP 14 reads out a small amount of image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (Process 1). Then, the DSP 14 notifies the selector 16 of position information such as an address for specifying the face position.
  • The selector 16 uses the position information acquired from the DSP 14 to identify the ROI (Region of Interest) to be processed in the normal image data. Then, the selector 16 performs processing such as masking on the area corresponding to the ROI to generate processed image data (Process 2), and outputs the processed image data to the application processor 20.
  • FIG. 8 is a sequence diagram illustrating a flow of the processing according to the third embodiment.
  • the imaging unit 11 reads out the image by thinning it out (S201), and stores the thinned-out image data in the memory 15 (S202). After that, the imaging unit 11 continues reading the normal image data.
  • the DSP 14 performs face detection from the small-capacity image data using DNN or the like, and detects the face position (S203). Then, the DSP 14 notifies the selector 16 of the position information of the detected face position (S205 and S206).
  • the selector 16 holds the position information of the face position notified from the DSP 14 (S207). Thereafter, when the reading of the normal image data is completed, the imaging unit 11 outputs the data to the selector 16 (S209 and S210), and the selector 16 specifies the face position from the normal image data using the position information of the face position. (S211).
  • The selector 16 generates processed image data obtained by processing the face position (S212), and outputs the processed image data to an external device (S213). For example, the selector 16 cuts out and outputs only the position of the face detected by the DNN. As described above, since the image sensor 10 can detect the face position before the normal reading of the image data is completed, it can execute the processing without delay after the image data has been read, and processing can be sped up compared with the first embodiment.
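  • Because the face position is detected on thinned-out data but applied to normally read data, the position must be mapped between the two resolutions. The sketch below shows one plausible mapping, assuming regular decimation by a fixed step; the function names, the region format, and the clamping rule are illustrative assumptions.

```python
def scale_roi(roi, step, full_height, full_width):
    """Map an ROI found in a thinned image back to full-resolution coordinates."""
    top, left, height, width = (value * step for value in roi)
    height = min(height, full_height - top)   # clamp to the full image
    width = min(width, full_width - left)
    return (top, left, height, width)


def cut_out(image, roi):
    """Extract the ROI from an image stored as a list of rows."""
    top, left, height, width = roi
    return [row[left:left + width] for row in image[top:top + height]]


full = [[x + 100 * y for x in range(8)] for y in range(6)]  # normal readout
thinned = [row[::2] for row in full[::2]]                   # early 3 x 4 readout
roi_small = (1, 2, 2, 2)     # hypothetical detection on the thinned image
roi_full = scale_roi(roi_small, 2, len(full), len(full[0]))
face = cut_out(full, roi_full)
```

    The detection can run on `thinned` while `full` is still being read, and only the cheap coordinate mapping and cutout remain once the normal readout completes.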
  • FIG. 9 is a schematic diagram illustrating an example of a chip configuration of the image sensor according to the present embodiment.
  • The image sensor 10 has a laminated structure in which a rectangular flat plate-shaped first substrate (die) 100 and a rectangular flat plate-shaped second substrate (die) 120 are bonded together.
  • The size of the first substrate 100 and the size of the second substrate 120 may be the same, for example. Further, the first substrate 100 and the second substrate 120 may each be a semiconductor substrate such as a silicon substrate.
  • On the second substrate 120, the ADC 17, the control unit 12, the signal processing unit 13, the DSP 14, the memory 15, and the selector 16 are arranged.
  • an interface circuit, a driver circuit, and the like may be arranged on the second substrate 120.
  • The first substrate 100 and the second substrate 120 may be bonded by a so-called CoC (Chip-on-Chip) method, in which the first substrate 100 and the second substrate 120 are each singulated into chips and the singulated chips are then bonded together; by a so-called CoW (Chip-on-Wafer) method, in which one of the first substrate 100 and the second substrate 120 (for example, the first substrate 100) is singulated into a chip and the singulated chip is then bonded to the other substrate before singulation (that is, in a wafer state); or by a so-called WoW (Wafer-on-Wafer) method, in which the first substrate 100 and the second substrate 120 are bonded together while both are in a wafer state.
  • As a method for bonding the first substrate 100 and the second substrate 120, plasma bonding or the like can be used, for example.
  • the present invention is not limited to this, and various joining methods may be used.
  • FIGS. 10 and 11 are diagrams for explaining a layout example according to the present embodiment.
  • FIG. 10 shows a layout example of the first substrate 100, and FIG. 11 shows a layout example of the second substrate 120.
  • the pixel array unit 101 is arranged to be shifted toward one side L101 among the four sides L101 to L104 of the first substrate 100.
  • the pixel array unit 101 is arranged such that the center O101 is closer to the side L101 than the center O100 of the first substrate 100.
  • the side L101 may be, for example, the shorter side.
  • the present invention is not limited to this, and the pixel array unit 101 may be arranged to be offset on the longer side.
  • In a region near the side L101 among the four sides of the pixel array unit 101, in other words, in a region between the side L101 and the pixel array unit 101, a TSV array 102 is provided. In the TSV array 102, a plurality of through wirings (Through Silicon Vias, hereinafter referred to as TSVs) penetrating the first substrate 100 are arranged as wiring for electrically connecting each of the unit pixels 101a in the pixel array unit 101 to the ADC 17 arranged on the second substrate 120.
  • The TSV array 102 may also be provided in a region close to one of the two sides L103 and L104 that intersect the side L101, for example the side L104 (although it may be the side L103), in other words, in a region between the side L104 (or the side L103) and the pixel array unit 101.
  • each of the sides L102 to L103 in which the pixel array unit 101 is not offset is provided with a pad array 103 including a plurality of pads arranged linearly.
  • the pads included in the pad array 103 include, for example, a pad (also referred to as a power supply pin) to which a power supply voltage for an analog circuit such as the pixel array unit 101 and the ADC 17 is applied, a signal processing unit 13, a DSP 14, a memory 15, and a selector.
  • Each pad is electrically connected to an external power supply circuit or interface circuit via a wire, for example. It is preferable that each pad array 103 and the TSV array 102 be separated sufficiently that the influence of signal reflection from the wires connected to the pads in the pad array 103 can be ignored.
  • the memory 15 is divided into two areas, a memory 15A and a memory 15B.
  • the ADC 17 is divided into two areas: an ADC 17A and a DAC (Digital-to-Analog Converter) 17B.
  • the DAC 17B supplies a reference voltage for AD conversion to the ADC 17A, and is included in a part of the ADC 17 in a broad sense.
  • the selector 16 is also arranged on the second substrate 120.
  • the second substrate 120 includes a wiring 122 electrically connected to each TSV in the TSV array 102 penetrating the first substrate 100 (hereinafter, simply referred to as the TSV array 102), and a pad array 123 in which a plurality of pads electrically connected to the respective pads in the pad array 103 of the first substrate 100 are linearly arranged.
  • For connection between the TSV array 102 and the wiring 122, for example, a so-called twin TSV method, in which two TSVs, that is, a TSV provided in the first substrate 100 and a TSV provided from the first substrate 100 to the second substrate 120, are connected outside the chip, or a so-called shared TSV method, in which connection is made by a common TSV provided from the first substrate 100 to the second substrate 120, can be employed.
  • the present invention is not limited thereto, and various connection modes can be adopted, such as a so-called Cu-Cu bonding method in which copper (Cu) exposed on the bonding surface of the first substrate 100 and copper (Cu) exposed on the bonding surface of the second substrate 120 are bonded to each other.
  • connection form between each pad in the pad array 103 of the first substrate 100 and each pad in the pad array 123 of the second substrate 120 is, for example, wire bonding.
  • connection forms such as through holes and castellations may be used.
  • with the vicinity of the wiring 122 connected to the TSV array 102 taken as the upstream side, the ADC 17A, the signal processing unit 13, and the DSP 14 are arranged in this order from upstream along the flow of the signal read from the pixel array unit 101. That is, the ADC 17A, to which the pixel signal read from the pixel array unit 101 is first input, is arranged near the wiring 122 on the most upstream side, the signal processing unit 13 is arranged next, and the DSP 14 is arranged in the region farthest from the wiring 122.
  • the control unit 12 is arranged, for example, near the wiring 122 on the upstream side. In FIG. 10, the control unit 12 is disposed between the ADC 17A and the signal processing unit 13. With such a layout, it is possible to reduce the signal delay, reduce the signal propagation loss, improve the S / N ratio, and reduce the power consumption when the control unit 12 controls the pixel array unit 101.
  • signal pins and power supply pins for analog circuits can be collectively arranged near the analog circuits (for example, the lower side in FIG. 10), the remaining signal pins and power supply pins for digital circuits can be collectively arranged near the digital circuits (for example, the upper side in FIG. 10), and the power supply pins for analog circuits and the power supply pins for digital circuits can be sufficiently separated.
  • the DSP 14 is disposed on the opposite side of the ADC 17A, which is the most downstream side.
  • the DSP 14 can thus be disposed in a region that does not overlap with the pixel array unit 101 in the stacking direction of the first substrate 100 and the second substrate 120 (hereinafter, simply referred to as the vertical direction).
  • the DSP 14 and the signal processing unit 13 are connected by a connection unit 14a formed by a part of the DSP 14 or by a signal line.
  • the selector 16 is arranged, for example, near the DSP 14.
  • when the connection unit 14a is a part of the DSP 14, a part of the DSP 14 overlaps the pixel array unit 101 in the vertical direction, but even in such a case, the intrusion of noise into the pixel array unit 101 can be reduced compared with the case where the entire DSP 14 overlaps the pixel array unit 101 in the vertical direction.
  • the memories 15A and 15B are arranged, for example, so as to surround the DSP 14 from three directions. In this way, by disposing the memories 15A and 15B so as to surround the DSP 14, it is possible to shorten the overall distance while averaging the wiring distance between each memory element and the DSP 14 in the memory 15. This makes it possible to reduce signal delay, signal propagation loss, and power consumption when the DSP 14 accesses the memory 15.
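The effect of surrounding the DSP with memory can be checked on a toy grid. The sketch below is only an illustration under stated assumptions: the block sizes, the cell coordinates, and the use of Manhattan distance as a stand-in for wiring length are all made up for this example and are not taken from the disclosure.

```python
def distance_to_block(cell, block):
    """Manhattan distance from a cell to the nearest cell of a rectangular block.

    block = (x_min, y_min, x_max, y_max) in inclusive grid coordinates.
    """
    x, y = cell
    x_min, y_min, x_max, y_max = block
    dx = max(x_min - x, 0, x - x_max)
    dy = max(y_min - y, 0, y - y_max)
    return dx + dy


def wiring_stats(memory_cells, block):
    """Return (mean, max) wiring distance from the memory cells to the block."""
    distances = [distance_to_block(c, block) for c in memory_cells]
    return sum(distances) / len(distances), max(distances)


DSP = (0, 0, 3, 3)  # 4 x 4 block standing in for the DSP 14

# Layout A: all 24 memory cells stacked on one side of the DSP.
one_side = [(x, y) for x in range(0, 4) for y in range(4, 10)]

# Layout B: the same 24 cells split into strips on three sides.
three_sides = (
    [(x, y) for x in (-2, -1) for y in range(0, 4)]    # left strip
    + [(x, y) for x in (4, 5) for y in range(0, 4)]    # right strip
    + [(x, y) for x in range(0, 4) for y in (4, 5)]    # top strip
)

print(wiring_stats(one_side, DSP))     # (3.5, 6)
print(wiring_stats(three_sides, DSP))  # (1.5, 2)
```

With the same number of memory cells, the three-sided layout gives both a shorter mean distance and a smaller spread between the nearest and farthest cells, which is the averaging and shortening effect the text describes.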
  • the pad array 123 is disposed, for example, at a position on the second substrate 120 corresponding to the pad array 103 of the first substrate 100 in the vertical direction.
  • a pad located near the ADC 17A is used for transmitting a power supply voltage and an analog signal for an analog circuit (mainly, the ADC 17A).
  • pads located near the control unit 12, the signal processing unit 13, the DSP 14, and the memories 15A and 15B are used for propagating the power supply voltage and digital signals for digital circuits (mainly, the control unit 12, the signal processing unit 13, the DSP 14, and the memories 15A and 15B). With such a pad layout, it is possible to shorten the wiring distance between each pad and each part. This makes it possible to reduce signal delay, reduce the propagation loss of signals and power supply voltage, improve the S / N ratio, and reduce power consumption.
  • various processes other than those described in the above embodiment can be executed according to the content learned by the learning model. For example, it is possible to extract not only the whole face but also the outline of the face, only a part such as the eyes or the nose, the owner of the imaging device 1 or a specific person, or a part such as a nameplate or a window from an image of a house. In addition, it is also possible to extract an outdoor part appearing in image data captured indoors, to extract a person and an animal separately, and to extract a window part from image data.
  • examples of the processing include reading only an extracted specific area such as a face, not reading only a specific area, painting a specific area black, and reading an image in which only a specific area is cut out. Further, not only a rectangular area but also an arbitrary area such as a triangle can be extracted. Processing such as masking and mosaic processing is not limited to one type, and a plurality of types can be combined. The extraction of the face position and the like can be executed not only by the DSP 14 but also by the signal processing unit 13.
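The region-based processing listed above (cutting out, painting black, mosaic) can be sketched on a small grayscale image held as nested lists. This is a minimal illustration, not the device's implementation: the function names, the `(top, left, bottom, right)` region format, and the assumption that the region is a rectangle already supplied by the extraction step are all hypothetical.

```python
def black_out(image, region):
    """Return a copy of image with the region painted black (0)."""
    top, left, bottom, right = region  # bottom and right are exclusive
    out = [row[:] for row in image]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = 0
    return out


def crop(image, region):
    """Return only the pixels inside the region."""
    top, left, bottom, right = region
    return [row[left:right] for row in image[top:bottom]]


def mosaic(image, region, block=2):
    """Replace each block x block tile inside the region by its integer mean."""
    top, left, bottom, right = region
    out = [row[:] for row in image]
    for r0 in range(top, bottom, block):
        for c0 in range(left, right, block):
            rows = range(r0, min(r0 + block, bottom))
            cols = range(c0, min(c0 + block, right))
            total = sum(image[r][c] for r in rows for c in cols)
            mean = total // (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


image = [[10 * r + c for c in range(4)] for r in range(4)]
face = (1, 1, 3, 3)  # hypothetical region reported by the extraction step

print(crop(image, face))       # [[11, 12], [21, 22]]
print(black_out(image, face))  # region set to 0, rest unchanged
print(mosaic(image, face))     # region set to its mean value, 16
```

Because each function returns a new image, several kinds of processing can be chained, for example applying `mosaic` to one region and `black_out` to another.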
  • the learning model learned by the DNN has been exemplified.
  • various neural networks such as a RNN (Recurrent Neural Network) and a CNN (Convolutional Neural Network) can be used in addition to the DNN.
  • the learning model is not limited to a learning model using DNN, but may be a learning model learned by various other machine learning such as a decision tree or a support vector machine.
  • the components of each device shown in the drawings are functionally conceptual, and do not necessarily need to be physically configured as shown in the drawings. That is, the specific form of distribution and integration of each device is not limited to the one shown in the drawings, and all or a part thereof may be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.
  • the control unit 12 and the signal processing unit 13 shown in FIG. 1 may be integrated.
  • the technology (the present technology) according to the present disclosure can be applied to various products.
  • the technology according to the present disclosure may be realized as a device mounted on any type of moving object such as an automobile, an electric vehicle, a hybrid electric vehicle, a motorcycle, a bicycle, a personal mobility device, an airplane, a drone, a ship, or a robot.
  • FIG. 12 is a block diagram illustrating a schematic configuration example of a vehicle control system that is an example of a mobile object control system to which the technology according to the present disclosure may be applied.
  • Vehicle control system 12000 includes a plurality of electronic control units connected via communication network 12001.
  • the vehicle control system 12000 includes a drive system control unit 12010, a body system control unit 12020, an outside information detection unit 12030, an inside information detection unit 12040, and an integrated control unit 12050.
  • a microcomputer 12051, an audio / video output unit 12052, and a vehicle-mounted network I / F (Interface) 12053 are illustrated.
  • the drive system control unit 12010 controls the operation of the device related to the drive system of the vehicle according to various programs.
  • the drive system control unit 12010 functions as a control device for a driving force generation device that generates the driving force of the vehicle, such as an internal combustion engine or a drive motor, a driving force transmission mechanism that transmits the driving force to the wheels, a steering mechanism that adjusts the steering angle of the vehicle, and a braking device that generates the braking force of the vehicle.
  • the body system control unit 12020 controls the operation of various devices mounted on the vehicle body according to various programs.
  • the body system control unit 12020 functions as a control device for a keyless entry system, a smart key system, a power window device, or various lamps such as a head lamp, a back lamp, a brake lamp, a blinker, and a fog lamp.
  • radio waves transmitted from a portable device that substitutes for a key, or signals from various switches, can be input to the body system control unit 12020.
  • the body system control unit 12020 receives the input of these radio waves or signals, and controls a door lock device, a power window device, a lamp, and the like of the vehicle.
  • the outside information detection unit 12030 detects information on the outside of the vehicle on which the vehicle control system 12000 is mounted.
  • an imaging unit 12031 is connected to the outside information detection unit 12030.
  • the outside information detection unit 12030 causes the imaging unit 12031 to capture an image of the outside of the vehicle, and receives the captured image.
  • the outside information detection unit 12030 may perform object detection processing or distance detection processing for a person, a vehicle, an obstacle, a sign, a character on a road surface, or the like based on the received image.
  • the imaging unit 12031 is an optical sensor that receives light and outputs an electric signal according to the amount of received light.
  • the imaging unit 12031 can output the electric signal as an image or as distance measurement information.
  • the light received by the imaging unit 12031 may be visible light or non-visible light such as infrared light.
  • the inside information detection unit 12040 detects information on the inside of the vehicle.
  • the inside information detection unit 12040 is connected to, for example, a driver state detection unit 12041 that detects the state of the driver.
  • the driver state detection unit 12041 includes, for example, a camera that captures an image of the driver, and the inside information detection unit 12040 may calculate the degree of fatigue or concentration of the driver based on the detection information input from the driver state detection unit 12041, or may determine whether the driver is dozing off.
  • the microcomputer 12051 can calculate a control target value of the driving force generation device, the steering mechanism, or the braking device based on the information on the inside and outside of the vehicle acquired by the outside information detection unit 12030 or the inside information detection unit 12040, and can output a control command to the drive system control unit 12010.
  • the microcomputer 12051 can perform cooperative control for the purpose of realizing the functions of an ADAS (Advanced Driver Assistance System), including vehicle collision avoidance or shock mitigation, following travel based on the inter-vehicle distance, vehicle speed maintaining travel, vehicle collision warning, and vehicle lane departure warning.
  • the microcomputer 12051 can also perform cooperative control for automatic driving or the like, in which the vehicle travels autonomously without depending on the operation of the driver, by controlling the driving force generation device, the steering mechanism, the braking device, and the like based on the information on the surroundings of the vehicle acquired by the outside information detection unit 12030 or the inside information detection unit 12040.
  • the microcomputer 12051 can output a control command to the body system control unit 12020 based on information on the outside of the vehicle acquired by the outside information detection unit 12030.
  • the microcomputer 12051 can perform cooperative control for the purpose of preventing glare, such as switching from high beam to low beam, by controlling the headlamp according to the position of the preceding vehicle or the oncoming vehicle detected by the outside information detection unit 12030.
  • the audio / video output unit 12052 transmits an output signal of at least one of sound and image to an output device capable of visually or audibly notifying a passenger of the vehicle or the outside of the vehicle of information.
  • an audio speaker 12061, a display unit 12062, and an instrument panel 12063 are illustrated as output devices.
  • the display unit 12062 may include, for example, at least one of an on-board display and a head-up display.
  • FIG. 13 is a diagram illustrating an example of an installation position of the imaging unit 12031.
  • the imaging unit 12031 includes imaging units 12101, 12102, 12103, 12104, and 12105.
  • the imaging units 12101, 12102, 12103, 12104, and 12105 are provided, for example, at positions such as a front nose, a side mirror, a rear bumper, a back door, and an upper part of a windshield in the vehicle interior of the vehicle 12100.
  • the imaging unit 12101 provided on the front nose and the imaging unit 12105 provided above the windshield in the passenger compartment mainly acquire an image in front of the vehicle 12100.
  • the imaging units 12102 and 12103 provided in the side mirror mainly acquire images of the side of the vehicle 12100.
  • the imaging unit 12104 provided in the rear bumper or the back door mainly acquires an image behind the vehicle 12100.
  • the imaging unit 12105 provided above the windshield in the passenger compartment is mainly used for detecting a preceding vehicle, a pedestrian, an obstacle, a traffic light, a traffic sign, a lane, and the like.
  • FIG. 13 shows an example of the imaging range of the imaging units 12101 to 12104.
  • the imaging range 12111 indicates the imaging range of the imaging unit 12101 provided on the front nose, the imaging ranges 12112 and 12113 indicate the imaging ranges of the imaging units 12102 and 12103 provided on the side mirrors, respectively, and the imaging range 12114 indicates the imaging range of the imaging unit 12104 provided in the rear bumper or the back door. For example, by superimposing the image data captured by the imaging units 12101 to 12104, an overhead image of the vehicle 12100 viewed from above can be obtained.
  • At least one of the imaging units 12101 to 12104 may have a function of acquiring distance information.
  • at least one of the imaging units 12101 to 12104 may be a stereo camera including a plurality of imaging elements or an imaging element having pixels for detecting a phase difference.
  • the microcomputer 12051 calculates a distance to each three-dimensional object in the imaging ranges 12111 to 12114 and a temporal change of the distance (relative speed with respect to the vehicle 12100).
  • the microcomputer 12051 can set an inter-vehicle distance to be secured in advance from the preceding vehicle and perform automatic brake control (including follow-up stop control), automatic acceleration control (including follow-up start control), and the like. In this way, it is possible to perform cooperative control for automatic driving or the like in which the vehicle travels autonomously without depending on the operation of the driver.
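The temporal change of the distance (relative speed) mentioned above can be estimated from successive distance measurements. The following is a minimal sketch under the assumption of evenly spaced samples; the function name, the sampling interval, and the sample values are made up for illustration and are not specified in the disclosure.

```python
def relative_speed(distances_m, dt_s):
    """Estimate relative speed (m/s) from evenly spaced distance samples.

    A negative result means the object is getting closer to the own vehicle.
    Uses the average slope between the first and the last sample.
    """
    if len(distances_m) < 2:
        raise ValueError("need at least two distance samples")
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    elapsed = dt_s * (len(distances_m) - 1)
    return (distances_m[-1] - distances_m[0]) / elapsed


samples = [30.0, 29.0, 28.0, 27.0]  # metres, made-up values, one per 0.5 s
print(relative_speed(samples, dt_s=0.5))  # -2.0
```

A real system would likely filter the measurements before differentiating them; the average slope is used here only to keep the example short.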
  • the microcomputer 12051 can classify three-dimensional object data relating to three-dimensional objects into motorcycles, ordinary vehicles, large vehicles, pedestrians, telephone poles, and other three-dimensional objects based on the distance information obtained from the imaging units 12101 to 12104, extract the data, and use it for automatic avoidance of obstacles. For example, the microcomputer 12051 distinguishes obstacles around the vehicle 12100 into obstacles that are visible to the driver of the vehicle 12100 and obstacles that are difficult to see. Then, the microcomputer 12051 determines a collision risk indicating the risk of collision with each obstacle, and when the collision risk is equal to or higher than a set value and there is a possibility of collision, driving assistance for collision avoidance can be performed by outputting an alarm to the driver via the audio speaker 12061 or the display unit 12062, or by performing forced deceleration or avoidance steering via the drive system control unit 12010.
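The threshold decision described above can be sketched as follows. The disclosure does not define how the collision risk is computed, so this example assumes, purely for illustration, that the risk is the inverse of the time to collision; the function names, the default set value, and the action labels are likewise hypothetical.

```python
def collision_risk(distance_m, closing_speed_mps):
    """Hypothetical risk score: inverse time-to-collision in 1/s.

    closing_speed_mps > 0 means the gap to the obstacle is shrinking.
    Returns 0.0 when the obstacle is not approaching.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    if closing_speed_mps <= 0:
        return 0.0
    return closing_speed_mps / distance_m


def assistance_actions(distance_m, closing_speed_mps, set_value=0.5):
    """Return the candidate assistance actions for one obstacle.

    At or above set_value, both a driver alarm and an intervention request
    are listed; how the two are prioritised is not stated in the source.
    """
    if collision_risk(distance_m, closing_speed_mps) >= set_value:
        return ["alarm", "forced_deceleration_or_avoidance_steering"]
    return []


print(assistance_actions(10.0, 10.0))  # risk 1.0, assistance requested
print(assistance_actions(50.0, 5.0))   # risk 0.1, no action
```

In terms of the configuration above, the alarm would go through the audio speaker 12061 or the display unit 12062, and the intervention through the drive system control unit 12010.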
  • At least one of the imaging units 12101 to 12104 may be an infrared camera that detects infrared light.
  • the microcomputer 12051 can recognize a pedestrian by determining whether or not a pedestrian exists in the captured images of the imaging units 12101 to 12104. The recognition of such a pedestrian is performed by, for example, a procedure of extracting feature points in the images captured by the imaging units 12101 to 12104 serving as infrared cameras, and a procedure of performing pattern matching processing on a series of feature points indicating the outline of an object to determine whether or not the object is a pedestrian.
  • the audio / video output unit 12052 controls the display unit 12062 so that a rectangular outline for emphasis is superimposed on the recognized pedestrian.
  • the audio / video output unit 12052 may also control the display unit 12062 to display an icon or the like indicating a pedestrian at a desired position.
  • the technology according to the present disclosure can be applied to the imaging unit 12031 or the like among the configurations described above.
  • By applying the technology according to the present disclosure to the imaging unit 12031 and the like, it is possible to reduce the size of the imaging unit 12031 and the like, so that the interior and exterior of the vehicle 12100 can be designed more easily.
  • By applying the technology according to the present disclosure to the imaging unit 12031 and the like, a clear image with reduced noise can be obtained, so that a more easily viewable captured image can be provided to the driver. This makes it possible to reduce driver fatigue.
  • the technology (the present technology) according to the present disclosure can be applied to various products.
  • the technology according to the present disclosure may be applied to an endoscopic surgery system.
  • FIG. 14 is a diagram illustrating an example of a schematic configuration of an endoscopic surgery system to which the technology (the present technology) according to the present disclosure may be applied.
  • FIG. 14 illustrates a situation where an operator (doctor) 11131 is performing an operation on a patient 11132 on a patient bed 11133 using the endoscopic surgery system 11000.
  • the endoscopic surgery system 11000 includes an endoscope 11100, other surgical tools 11110 such as an insufflation tube 11111 and an energy treatment tool 11112, a support arm device 11120 that supports the endoscope 11100, and a cart 11200 on which various devices for endoscopic surgery are mounted.
  • the endoscope 11100 includes a lens barrel 11101, a region of which extending a predetermined length from the distal end is inserted into the body cavity of the patient 11132, and a camera head 11102 connected to the proximal end of the lens barrel 11101.
  • the endoscope 11100 which is configured as a so-called rigid endoscope having a hard lens barrel 11101 is illustrated.
  • the endoscope 11100 may be configured as a so-called flexible endoscope having a soft lens barrel.
  • An opening in which an objective lens is fitted is provided at the tip of the lens barrel 11101.
  • a light source device 11203 is connected to the endoscope 11100, and light generated by the light source device 11203 is guided to the distal end of the lens barrel by a light guide that extends inside the lens barrel 11101, and is radiated toward the observation target in the body cavity of the patient 11132 via the objective lens.
  • the endoscope 11100 may be a direct-viewing endoscope, an oblique-viewing endoscope, or a side-viewing endoscope.
  • An optical system and an image sensor are provided inside the camera head 11102, and the reflected light (observation light) from the observation target is focused on the image sensor by the optical system.
  • the observation light is photoelectrically converted by the imaging element, and an electric signal corresponding to the observation light, that is, an image signal corresponding to the observation image is generated.
  • the image signal is transmitted to a camera control unit (CCU: Camera Control Unit) 11201 as RAW data.
  • the CCU 11201 is configured by a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and the like, and controls the overall operations of the endoscope 11100 and the display device 11202. Further, the CCU 11201 receives an image signal from the camera head 11102, and performs various types of image processing for displaying an image based on the image signal, such as development processing (demosaicing processing).
  • the display device 11202 displays an image based on an image signal on which image processing has been performed by the CCU 11201 under the control of the CCU 11201.
  • the light source device 11203 is configured by a light source such as an LED (light emitting diode), for example, and supplies the endoscope 11100 with irradiation light when imaging an operation part or the like.
  • the input device 11204 is an input interface for the endoscopic surgery system 11000.
  • the user can input various information and input instructions to the endoscopic surgery system 11000 via the input device 11204.
  • the user inputs an instruction or the like to change imaging conditions (type of irradiation light, magnification, focal length, and the like) by the endoscope 11100.
  • the treatment instrument control device 11205 controls the driving of the energy treatment instrument 11112 for cauterizing, incising a tissue, sealing a blood vessel, and the like.
  • the insufflation device 11206 is used to inflate the body cavity of the patient 11132 for the purpose of securing the visual field by the endoscope 11100 and securing the working space of the operator.
  • the recorder 11207 is a device that can record various types of information related to surgery.
  • the printer 11208 is a device capable of printing various types of information on surgery in various formats such as text, images, and graphs.
  • the light source device 11203 that supplies the endoscope 11100 with irradiation light at the time of imaging the operation site can be configured by, for example, a white light source including an LED, a laser light source, or a combination thereof.
  • when the white light source is configured by a combination of RGB laser light sources, the output intensity and output timing of each color (each wavelength) can be controlled with high accuracy, so that the light source device 11203 can adjust the white balance of the captured image.
  • in this case, by irradiating the observation target with the laser light from each of the RGB laser light sources in a time-division manner and controlling the driving of the image sensor of the camera head 11102 in synchronization with the irradiation timing, images corresponding to each of R, G, and B can also be captured in a time-division manner. According to this method, a color image can be obtained without providing a color filter in the image sensor.
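The time-division capture described above yields one monochrome frame per laser color, and a color image is obtained by merging them pixel by pixel. The sketch below illustrates only that merging step; the function name and the nested-list frame format are assumptions, and motion between the three exposures is ignored.

```python
def combine_rgb_frames(red, green, blue):
    """Merge three same-sized monochrome frames into one RGB frame.

    Each input is a 2D list of intensities captured while only the
    corresponding laser was on. Returns a 2D list of (R, G, B) tuples.
    """
    if not (len(red) == len(green) == len(blue)):
        raise ValueError("frames must have the same number of rows")
    color = []
    for r_row, g_row, b_row in zip(red, green, blue):
        if not (len(r_row) == len(g_row) == len(b_row)):
            raise ValueError("frames must have the same row length")
        color.append(list(zip(r_row, g_row, b_row)))
    return color


red = [[1, 2], [3, 4]]      # made-up frame captured under the R laser
green = [[5, 6], [7, 8]]    # made-up frame captured under the G laser
blue = [[9, 10], [11, 12]]  # made-up frame captured under the B laser
print(combine_rgb_frames(red, green, blue))
# [[(1, 5, 9), (2, 6, 10)], [(3, 7, 11), (4, 8, 12)]]
```

Because every pixel receives a full R, G, and B sample, no color filter or demosaicing step is needed in this scheme, at the cost of three exposures per color frame.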
  • the driving of the light source device 11203 may be controlled so as to change the intensity of output light at predetermined time intervals.
  • by controlling the driving of the image sensor of the camera head 11102 in synchronization with the timing of the change in light intensity to acquire images in a time-division manner, and then combining the images, an image with a high dynamic range, free from so-called blocked-up shadows and blown-out highlights, can be generated.
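One simple way to combine frames captured under different light intensities is to scale each frame by its intensity and average only the samples that are neither too dark nor saturated. The sketch below follows that idea; it is an assumed merging rule chosen for illustration, not the method of the disclosure, and the function name, thresholds, and sample values are hypothetical.

```python
def merge_hdr(frames, intensities, low=5, high=250):
    """Merge frames taken under different illumination intensities.

    frames: list of 2D lists of 8-bit values. intensities: relative light
    intensity used for each frame (non-zero). Each sample is divided by its
    frame's intensity to estimate scene brightness, and only samples that
    are neither crushed (< low) nor clipped (> high) are averaged.
    """
    if not frames or len(frames) != len(intensities):
        raise ValueError("need one intensity per frame")
    rows, cols = len(frames[0]), len(frames[0][0])
    merged = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            usable = [f[r][c] / k for f, k in zip(frames, intensities)
                      if low <= f[r][c] <= high]
            if not usable:  # every sample crushed or clipped: use them all
                usable = [f[r][c] / k for f, k in zip(frames, intensities)]
            out_row.append(sum(usable) / len(usable))
        merged.append(out_row)
    return merged


dim = [[2, 100]]     # low-intensity frame: the dark pixel is crushed
bright = [[8, 255]]  # 4x intensity frame: the bright pixel is clipped
print(merge_hdr([dim, bright], [1.0, 4.0]))  # [[2.0, 100.0]]
```

In this example the dark pixel is recovered from the brighter frame and the bright pixel from the dimmer frame, so neither end of the range is lost.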
  • the light source device 11203 may be configured to be able to supply light in a predetermined wavelength band corresponding to special light observation.
  • in special light observation, for example, so-called narrow-band light observation (Narrow Band Imaging) is performed, in which a predetermined tissue such as a blood vessel in the surface layer of the mucous membrane is imaged with high contrast by irradiating light in a narrower band than the irradiation light used in normal observation (that is, white light), utilizing the wavelength dependence of light absorption in body tissue.
  • fluorescence observation in which an image is obtained by fluorescence generated by irradiating excitation light may be performed.
  • in fluorescence observation, the body tissue can be irradiated with excitation light to observe fluorescence from the body tissue (autofluorescence observation), or a reagent such as indocyanine green (ICG) can be locally injected into the body tissue and the body tissue irradiated with excitation light corresponding to the fluorescence wavelength of the reagent to obtain a fluorescence image.
  • the light source device 11203 can be configured to be able to supply narrowband light and / or excitation light corresponding to such special light observation.
  • FIG. 15 is a block diagram showing an example of a functional configuration of the camera head 11102 and the CCU 11201 shown in FIG.
  • the camera head 11102 includes a lens unit 11401, an imaging unit 11402, a driving unit 11403, a communication unit 11404, and a camera head control unit 11405.
  • the CCU 11201 includes a communication unit 11411, an image processing unit 11412, and a control unit 11413.
  • the camera head 11102 and the CCU 11201 are communicably connected to each other by a transmission cable 11400.
  • the lens unit 11401 is an optical system provided at a connection with the lens barrel 11101. Observation light taken in from the tip of the lens barrel 11101 is guided to the camera head 11102, and enters the lens unit 11401.
  • the lens unit 11401 is configured by combining a plurality of lenses including a zoom lens and a focus lens.
  • the number of imaging elements constituting the imaging unit 11402 may be one (so-called single-panel type) or plural (so-called multi-panel type).
  • When the imaging unit 11402 is configured as a multi-panel type, for example, an image signal corresponding to each of RGB may be generated by each imaging element, and a color image may be obtained by combining the image signals.
  • the imaging unit 11402 may be configured to include a pair of imaging elements for acquiring right-eye and left-eye image signals corresponding to 3D (dimensional) display. By performing the 3D display, the operator 11131 can more accurately grasp the depth of the living tissue in the operative part.
  • a plurality of lens units 11401 may be provided for each imaging element.
  • the imaging unit 11402 does not necessarily have to be provided in the camera head 11102.
  • the imaging unit 11402 may be provided inside the lens barrel 11101 immediately after the objective lens.
  • the drive unit 11403 is configured by an actuator, and moves the zoom lens and the focus lens of the lens unit 11401 by a predetermined distance along the optical axis under the control of the camera head control unit 11405.
  • the magnification and the focus of the image captured by the imaging unit 11402 can be appropriately adjusted.
  • the communication unit 11404 is configured by a communication device for transmitting and receiving various information to and from the CCU 11201.
  • the communication unit 11404 transmits the image signal obtained from the imaging unit 11402 as RAW data to the CCU 11201 via the transmission cable 11400.
  • the communication unit 11404 receives a control signal for controlling driving of the camera head 11102 from the CCU 11201 and supplies the control signal to the camera head control unit 11405.
  • the control signal includes information about imaging conditions, for example, information specifying the frame rate of the captured image, information specifying the exposure value at the time of imaging, and / or information specifying the magnification and focus of the captured image.
  • imaging conditions such as the frame rate, the exposure value, the magnification, and the focus may be appropriately designated by the user, or may be automatically set by the control unit 11413 of the CCU 11201 based on the acquired image signal.
  • a so-called AE (Auto Exposure) function, an AF (Auto Focus) function, and an AWB (Auto White Balance) function are mounted on the endoscope 11100.
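The control signal described above carries imaging conditions, any of which may or may not be specified, and the values may come from the user or from automatic setting. A minimal sketch of such a structure follows; the class name, field names, units, and the rule that user-designated values override automatic ones are all assumptions made for this example.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class ControlSignal:
    """Imaging conditions sent from the CCU to the camera head.

    Field names and units are hypothetical. None means "not specified".
    """
    frame_rate_fps: Optional[float] = None
    exposure_value: Optional[float] = None
    magnification: Optional[float] = None
    focus_position: Optional[float] = None

    def specified(self) -> dict:
        """Return only the conditions that carry a value."""
        return {k: v for k, v in asdict(self).items() if v is not None}


def merge_conditions(automatic: ControlSignal, user: ControlSignal) -> ControlSignal:
    """Combine two sources; user-designated values win (an assumed policy)."""
    return ControlSignal(**{**automatic.specified(), **user.specified()})


auto = ControlSignal(frame_rate_fps=60.0, exposure_value=0.0)  # e.g. from AE
user = ControlSignal(exposure_value=1.0, magnification=2.0)    # user input
print(merge_conditions(auto, user).specified())
# {'frame_rate_fps': 60.0, 'exposure_value': 1.0, 'magnification': 2.0}
```

Making every field optional mirrors the "and / or" wording of the text: a control signal may specify any subset of the conditions.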
  • the camera head control unit 11405 controls the driving of the camera head 11102 based on the control signal from the CCU 11201 received via the communication unit 11404.
  • the communication unit 11411 is configured by a communication device for transmitting and receiving various information to and from the camera head 11102.
  • the communication unit 11411 receives an image signal transmitted from the camera head 11102 via the transmission cable 11400.
  • the communication unit 11411 transmits a control signal for controlling driving of the camera head 11102 to the camera head 11102.
  • the image signal and the control signal can be transmitted by electric communication, optical communication, or the like.
  • the image processing unit 11412 performs various types of image processing on an image signal that is RAW data transmitted from the camera head 11102.
  • the control unit 11413 performs various kinds of control related to imaging of the operation section and the like by the endoscope 11100 and display of a captured image obtained by imaging the operation section and the like. For example, the control unit 11413 generates a control signal for controlling driving of the camera head 11102.
  • the control unit 11413 causes the display device 11202 to display a captured image showing the operative part or the like based on the image signal subjected to the image processing by the image processing unit 11412.
  • the control unit 11413 may recognize various objects in the captured image using various image recognition techniques. For example, by detecting the shape, color, or the like of the edges of objects included in the captured image, the control unit 11413 can recognize a surgical tool such as forceps, a specific living body site, bleeding, mist generated when the energy treatment tool 11112 is used, and the like.
  • the control unit 11413 may use the recognition result to superimpose and display various types of surgery support information on the image of the operative site.
  • the burden on the operator 11131 can be reduced, and the operator 11131 can reliably perform the operation.
  • the transmission cable 11400 connecting the camera head 11102 and the CCU 11201 is an electric signal cable corresponding to electric signal communication, an optical fiber corresponding to optical communication, or a composite cable thereof.
  • the communication is performed by wire using the transmission cable 11400, but the communication between the camera head 11102 and the CCU 11201 may be performed wirelessly.
  • the technology according to the present disclosure can be applied to, for example, the imaging unit 11402 of the camera head 11102 among the configurations described above.
  • By applying the technology according to the present disclosure to the camera head 11102, the endoscopic surgery system 11000 can be reduced in size.
  • By applying the technology according to the present disclosure to the camera head 11102 and the like, a clear image with reduced noise can be obtained, so that a captured image that is easier to view can be provided to the operator. This makes it possible to reduce operator fatigue.
  • the endoscopic surgery system has been described as an example, but the technology according to the present disclosure may be applied to, for example, a microscopic surgery system or the like.
  • the technology according to the present disclosure can be applied to various products.
  • the technology according to the present disclosure may be applied to a pathological diagnosis system in which a doctor or the like observes cells or tissues collected from a patient to diagnose a lesion or a support system therefor (hereinafter, referred to as a diagnosis support system).
  • This diagnosis support system may be a WSI (Whole Slide Imaging) system that diagnoses or supports a lesion based on an image acquired using digital pathology technology.
  • FIG. 16 is a diagram illustrating an example of a schematic configuration of a diagnosis support system 5500 to which the technology according to the present disclosure is applied.
  • the diagnosis support system 5500 includes one or more pathology systems 5510. Further, a medical information system 5530 and a derivation device 5540 may be included.
  • Each of the one or more pathology systems 5510 is a system mainly used by a pathologist, and is introduced into, for example, a research laboratory or a hospital.
  • The pathology systems 5510 may be installed in different hospitals, and each is connected to the medical information system 5530 and the derivation device 5540 via various networks such as a WAN (Wide Area Network) (including the Internet), a LAN (Local Area Network), a public line network, and a mobile communication network.
  • Each pathological system 5510 includes a microscope 5511, a server 5512, a display control device 5513, and a display device 5514.
  • the microscope 5511 has a function of an optical microscope, and captures an observation object contained in a glass slide to acquire a pathological image as a digital image.
  • the observation target is, for example, a tissue or a cell collected from a patient, and may be a piece of organ, saliva, blood, or the like.
  • The server 5512 stores the pathological image acquired by the microscope 5511 in a storage unit (not shown). In addition, when the server 5512 receives a browsing request from the display control device 5513, it searches the storage unit (not shown) for the pathological image and sends the retrieved pathological image to the display control device 5513.
  • The display control device 5513 sends a browsing request for a pathological image received from the user to the server 5512. It then displays the pathological image received from the server 5512 on the display device 5514, which uses liquid crystal, EL (Electro-Luminescence), a CRT (Cathode Ray Tube), or the like. Note that the display device 5514 may support 4K or 8K, and the number of display devices is not limited to one and may be plural.
  • When the observation target is a solid such as a piece of an organ, the observation target may be, for example, a stained thin section.
  • the thin section may be produced by, for example, thinly cutting a block piece cut out from a specimen such as an organ.
  • the block pieces may be fixed with paraffin or the like.
  • Various stains may be applied to the staining of the thin sections, such as general staining indicating the morphology of the tissue such as HE (Hematoxylin-Eosin) staining, and immunostaining indicating the immune state of the tissue such as IHC (Immunohistochemistry) staining.
  • One thin section may be stained using a plurality of different reagents, or two or more thin sections cut out continuously from the same block piece (also referred to as adjacent thin sections) may be stained using mutually different reagents.
  • the microscope 5511 may include a low-resolution imaging unit for imaging at low resolution and a high-resolution imaging unit for imaging at high resolution.
  • the low-resolution imaging unit and the high-resolution imaging unit may be different optical systems or may be the same optical system. When the optical systems are the same, the resolution of the microscope 5511 may be changed according to the imaging target.
  • the glass slide containing the observation target is placed on a stage located within the angle of view of the microscope 5511.
  • The microscope 5511 first obtains an entire image within the angle of view using the low-resolution imaging unit, and specifies the region of the observation target from the obtained entire image. Subsequently, the microscope 5511 divides the region where the observation target is present into a plurality of divided regions of a predetermined size, and sequentially captures each divided region with the high-resolution imaging unit, thereby obtaining a high-resolution image of each divided region.
  • the stage may be moved, the imaging optical system may be moved, or both of them may be moved.
  • each divided region may overlap with an adjacent divided region in order to prevent occurrence of an imaging omission region due to unintentional sliding of the glass slide.
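The division into overlapping regions described above can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the function name `divide_with_overlap`, the pixel-coordinate bounding box, and the tile and overlap sizes are all assumptions made for the example.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Region = Tuple[int, int, int, int]


def divide_with_overlap(bbox: Region, tile: int, overlap: int) -> List[Region]:
    """Cover bbox with tile x tile regions whose neighbours share `overlap` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    left, top, right, bottom = bbox
    step = tile - overlap  # distance between the origins of adjacent regions

    def starts(lo: int, hi: int) -> List[int]:
        # Origins along one axis. The last region may extend past `hi`
        # so that no part of the target is left unimaged.
        out = [lo]
        while out[-1] + tile < hi:
            out.append(out[-1] + step)
        return out

    return [
        (x, y, x + tile, y + tile)
        for y in starts(top, bottom)
        for x in starts(left, right)
    ]


# Hypothetical target region of 1000 x 600 pixels found in the low-resolution image.
regions = divide_with_overlap((0, 0, 1000, 600), tile=512, overlap=64)
```

Because each region starts `tile - overlap` pixels after its neighbour, a small unintended shift of the slide still leaves every point covered by at least one region.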
  • the whole image may include identification information for associating the whole image with the patient. This identification information may be, for example, a character string or a QR code (registered trademark).
  • the high-resolution image acquired by the microscope 5511 is input to the server 5512.
  • The server 5512 divides each high-resolution image into smaller partial images (hereinafter referred to as tile images). For example, the server 5512 divides one high-resolution image into a total of 100 tile images, 10 × 10 vertically and horizontally. At this time, if adjacent divided regions overlap, the server 5512 may perform a stitching process on the adjacent high-resolution images using a technique such as template matching. In that case, the server 5512 may generate the tile images by dividing the entire high-resolution image joined by the stitching process. Alternatively, the tile images may be generated from the high-resolution images before the stitching process.
  • the server 5512 may generate a tile image of a smaller size by further dividing the tile image. Such generation of a tile image may be repeated until a tile image of a size set as the minimum unit is generated.
  • the server 5512 executes a tile synthesis process of generating one tile image by synthesizing a predetermined number of adjacent tile images for all tile images. This tile synthesizing process can be repeated until one tile image is finally generated.
  • a tile image group having a pyramid structure in which each layer is configured by one or more tile images is generated.
  • the tile image of a certain layer and the tile image of a layer different from this layer have the same number of pixels, but have different resolutions.
  • The resolution of the tile image of the upper layer is ⁇ times the resolution of the tile image of the lower layer used for the synthesis.
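The tile division and tile synthesis described above can be sketched as follows. This is a sketch under stated assumptions rather than the implementation in the disclosure: the text only says that "a predetermined number" of adjacent tiles are synthesized, so merging 2 × 2 tiles, averaging pixel values, and requiring a square image whose side is the tile size times a power of two are all choices made for this example. The names `split_into_tiles`, `merge_2x2`, and `build_pyramid` are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]            # grayscale image as rows of pixel values
Layer = Dict[Tuple[int, int], Image]  # tiles keyed by (row, col)


def split_into_tiles(img: Image, tile: int) -> Layer:
    """Divide img into tile x tile pieces keyed by (row, col)."""
    rows, cols = len(img) // tile, len(img[0]) // tile
    return {
        (r, c): [row[c * tile:(c + 1) * tile] for row in img[r * tile:(r + 1) * tile]]
        for r in range(rows)
        for c in range(cols)
    }


def merge_2x2(tiles: Layer, r: int, c: int, tile: int) -> Image:
    """Synthesize four adjacent tiles into one tile with the same pixel count."""
    # Join the 2 x 2 neighbourhood into a (2 * tile) x (2 * tile) block.
    top = [a + b for a, b in zip(tiles[(2 * r, 2 * c)], tiles[(2 * r, 2 * c + 1)])]
    bottom = [a + b for a, b in zip(tiles[(2 * r + 1, 2 * c)], tiles[(2 * r + 1, 2 * c + 1)])]
    block = top + bottom
    # Average each 2 x 2 pixel group so that the result is again tile x tile.
    return [
        [
            (block[2 * y][2 * x] + block[2 * y][2 * x + 1]
             + block[2 * y + 1][2 * x] + block[2 * y + 1][2 * x + 1]) / 4
            for x in range(tile)
        ]
        for y in range(tile)
    ]


def build_pyramid(img: Image, tile: int) -> List[Layer]:
    """Return layers from the finest (index 0) up to a single tile."""
    layers = [split_into_tiles(img, tile)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        n = max(r for r, _ in prev) + 1  # tiles per side in the previous layer
        layers.append({
            (r, c): merge_2x2(prev, r, c, tile)
            for r in range(n // 2)
            for c in range(n // 2)
        })
    return layers


# Toy 8 x 8 image: 16 tiles, then 4 tiles, then 1 tile.
image = [[float(x + y) for x in range(8)] for y in range(8)]
pyramid = build_pyramid(image, tile=2)
```

Every tile in every layer has the same number of pixels, while each upper-layer tile covers a larger area of the slide at lower resolution, which matches the pyramid structure described in the text.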
  • the generated pyramid-structured tile image group is stored in a storage unit (not shown) together with identification information (referred to as tile identification information) capable of uniquely identifying each tile image, for example.
  • The server 5512 transmits the tile image corresponding to the tile identification information to another device.
  • a tile image as a pathological image may be generated for each imaging condition such as a focal length and a staining condition.
  • When a tile image is generated for each imaging condition, a specific pathological image may be displayed side by side with another pathological image of the same region that corresponds to an imaging condition different from the specific imaging condition.
  • The specific imaging condition may be specified by the viewer. When the viewer specifies a plurality of imaging conditions, pathological images of the same region corresponding to each imaging condition may be displayed side by side.
  • the server 5512 may store the pyramid-structured tile image group in a storage device other than the server 5512, for example, a cloud server. Further, a part or all of the tile image generation processing as described above may be executed by a cloud server or the like.
  • the display control device 5513 extracts a desired tile image from the pyramid-structured tile image group in response to an input operation from the user, and outputs this to the display device 5514. Through such processing, the user can obtain a feeling as if the user is observing the observation target object while changing the observation magnification. That is, the display control device 5513 functions as a virtual microscope. The virtual observation magnification here actually corresponds to the resolution.
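Since the virtual observation magnification corresponds to resolution, choosing which layer of the tile pyramid to read can be sketched as below. This is only an illustration: it assumes that each layer halves the resolution of the layer beneath it, and the function name `layer_for_magnification` and the 40x base magnification are hypothetical values not taken from the disclosure.

```python
import math


def layer_for_magnification(requested: float, base: float, num_layers: int) -> int:
    """Pick the pyramid layer (0 = finest) to serve a requested magnification.

    Assumes layer k holds images at base / 2**k magnification. The coarsest
    layer that is still at least as detailed as the request is chosen, so the
    viewer only ever scales tiles down, never up.
    """
    if requested <= 0 or base <= 0 or num_layers < 1:
        raise ValueError("magnifications and layer count must be positive")
    ratio = base / requested
    level = math.floor(math.log2(ratio)) if ratio >= 1 else 0
    return min(max(level, 0), num_layers - 1)


# Hypothetical pyramid with 3 layers whose finest layer was captured at 40x.
layer = layer_for_magnification(requested=10, base=40, num_layers=3)
```

A request above the base magnification falls back to the finest layer, and a request below the coarsest layer is clamped to the top of the pyramid.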
  • Any method may be used to capture the high-resolution images. The divided regions may be captured while repeatedly stopping and moving the stage to obtain the high-resolution images, or the divided regions may be captured while moving the stage at a predetermined speed to obtain high-resolution images in strips.
  • The process of generating tile images from a high-resolution image is not essential. Images whose resolution changes stepwise may instead be generated by changing, step by step, the resolution of the entire high-resolution image joined by the stitching process. Even in this case, the user can be presented stepwise with images ranging from a low-resolution image of a wide area to a high-resolution image of a narrow area.
  • the medical information system 5530 is a so-called electronic medical record system, and stores information for identifying a patient, information on a patient's disease, examination information and image information used for diagnosis, diagnosis results, and information on diagnosis such as prescription drugs.
  • a pathological image obtained by imaging an observation target of a patient may be temporarily stored via the server 5512, and then displayed on the display device 5514 by the display control device 5513.
  • a pathologist using the pathological system 5510 makes a pathological diagnosis based on the pathological image displayed on the display device 5514.
  • the result of the pathological diagnosis performed by the pathologist is stored in the medical information system 5530.
  • The derivation device 5540 can execute analysis on a pathological image. A learning model created by machine learning can be used for this analysis. The derivation device 5540 may derive a classification result of a specific region, a tissue identification result, or the like as the analysis result. Furthermore, the derivation device 5540 may derive identification results such as cell information, number, position, and luminance information, as well as scoring information for them. These pieces of information derived by the derivation device 5540 may be displayed on the display device 5514 of the pathology system 5510 as diagnosis support information.
  • The derivation device 5540 may be a server system including one or more servers (including a cloud server).
  • The derivation device 5540 may be incorporated in, for example, the display control device 5513 or the server 5512 in the pathology system 5510. That is, various analyses of the pathological image may be executed within the pathology system 5510.
  • the technology according to the present disclosure can be suitably applied to, for example, the microscope 5511 among the configurations described above.
  • the technology according to the present disclosure can be applied to the low-resolution imaging unit and / or the high-resolution imaging unit of the microscope 5511.
  • By applying the technology according to the present disclosure to the low-resolution imaging unit, the region of the observation target in the entire image can be specified within the low-resolution imaging unit.
  • By applying the technology according to the present disclosure to the high-resolution imaging unit, part or all of the tile image generation processing and the analysis processing for the pathological image can be performed within the high-resolution imaging unit.
  • a part or all of the processing from the acquisition of the pathological image to the analysis of the pathological image can be executed on the fly within the microscope 5511, so that the diagnosis support information can be output more quickly and accurately.
  • Partial extraction of a specific tissue, partial output of an image in consideration of personal information, and the like can be executed within the microscope 5511, which makes it possible to shorten the imaging time, reduce the data amount, reduce the time required for the pathologist's workflow, and so on.
  • the configuration described above can be applied not only to the diagnosis support system but also to all biological microscopes such as a confocal microscope, a fluorescence microscope, and a video microscope.
  • the observation target may be a biological sample such as a cultured cell, a fertilized egg, or a sperm, a biological material such as a cell sheet or a three-dimensional cell tissue, or a living body such as a zebrafish or a mouse.
  • the observation target object is not limited to a glass slide, and can be observed in a state stored in a well plate, a petri dish, or the like.
  • a moving image may be generated from a still image of the observation target acquired using a microscope.
  • a moving image may be generated from still images captured continuously for a predetermined period, or an image sequence may be generated from still images captured at predetermined intervals.
  • (1) A solid-state imaging device having: an imaging unit that acquires image data; a processing unit that performs, on the image data or data based on the image data, a process of extracting a specific region based on a neural network calculation model; and an output unit that outputs image data processed based on the specific region, or image data read from the imaging unit based on the specific region.
  • (2) The solid-state imaging device according to (1), wherein the processing unit extracts the specific region to be processed from the image data by an arithmetic process using a learned learning model.
  • (3) The solid-state imaging device according to (1) or (2), wherein the processing unit performs a masking process, a mosaic process, or an avatar process on the specific region to generate the processed image data.
  • The solid-state imaging device according to any one of (1) to (10), wherein the imaging unit acquires the image data by thinning out the unit pixels to be read when reading a captured image, and the processing unit extracts the specific region from the thinned image data.
  • An electronic apparatus having: a solid-state imaging device having an imaging unit that acquires image data, a processing unit that performs a process of extracting a specific region based on a neural network calculation model on data based on the image data, and an output unit that outputs image data processed based on the specific region, or image data read from the imaging unit based on the specific region; and a control device that executes a process by an application on the processed image data output from the solid-state imaging device or on the image data read from the imaging unit.

Abstract

This solid-state imaging device includes: an imaging unit (11) that acquires image data; a processing unit (14) that executes processing for extracting a specified region on the basis of a neural network calculation model, such processing executed on the image data or data based on the image data; and an output unit (16) that outputs image data which was manipulated on the basis of the specified region, or image data which was read from the imaging unit on the basis of the specified region.

Description

Solid-state imaging device and electronic apparatus
The present disclosure relates to a solid-state imaging device and an electronic apparatus, and more specifically to processing of image data within a chip.
Devices typified by digital cameras are equipped with image sensors having a CMOS (Complementary Metal Oxide Semiconductor) and a DSP (Digital Signal Processor). In an image sensor, a captured image is supplied to the DSP, subjected to various kinds of processing in the DSP, and output to an external device such as an application processor.
International Publication No. WO 2018/051809
However, in the above conventional technology, it is typical that only simple image processing such as noise removal is executed in the DSP in the image sensor, while complicated processing such as face authentication using image data is executed by an application processor or the like. Consequently, the image captured by the image sensor is output to the application processor as it is, and from the viewpoints of security and privacy it is desirable to execute processing within the chip of the image sensor.
Therefore, the present disclosure proposes a solid-state imaging device and an electronic apparatus that can execute processing within the chip of an image sensor.
To solve the above problem, a solid-state imaging device according to one embodiment of the present disclosure includes: an imaging unit that acquires image data; a processing unit that executes, on the image data or data based on the image data, a process of extracting a specific region based on a neural network calculation model; and an output unit that outputs image data processed based on the specific region, or image data read from the imaging unit based on the specific region.
By mounting on the solid-state imaging device a processing unit that extracts a specific region to be processed from the image data acquired by the imaging unit, both the extraction of the region to be processed and the processing itself can be executed within the chip. This prevents privacy information and the like contained in the unprocessed image data from being output outside the chip, making it possible to realize a secure solid-state imaging device. It also provides the advantage that the amount of data output from the solid-state imaging device to the outside can be reduced.
According to the present disclosure, processing can be executed within the chip of the image sensor. Note that the effects described here are not necessarily limiting, and any of the effects described in the present disclosure may be obtained.
FIG. 1 is a block diagram illustrating a schematic configuration example of an imaging device as an electronic apparatus according to the first embodiment.
FIG. 2 is a diagram for describing image processing according to the first embodiment.
FIG. 3 is a flowchart illustrating the flow of processing according to the first embodiment.
FIG. 4 is a diagram illustrating a modification of the first embodiment.
FIG. 5 is a diagram illustrating an imaging device according to the second embodiment.
FIG. 6 is a diagram illustrating a modification of the second embodiment.
FIG. 7 is a diagram illustrating an imaging device according to the third embodiment.
FIG. 8 is a sequence diagram illustrating the flow of processing according to the third embodiment.
FIG. 9 is a schematic diagram illustrating a chip configuration example of the image sensor according to the present embodiment.
FIG. 10 is a diagram for describing a layout example according to the present embodiment.
FIG. 11 is a diagram for describing a layout example according to the present embodiment.
FIG. 12 is a block diagram illustrating an example of a schematic configuration of a vehicle control system.
FIG. 13 is an explanatory diagram illustrating an example of installation positions of a vehicle exterior information detection unit and imaging units.
FIG. 14 is a diagram illustrating an example of a schematic configuration of an endoscopic surgery system.
FIG. 15 is a block diagram illustrating an example of the functional configuration of a camera head and a CCU.
FIG. 16 is a block diagram illustrating an example of a schematic configuration of a diagnosis support system.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. In the following embodiments, identical parts are denoted by the same reference numerals, and redundant description is omitted.
The present disclosure will be described in the following order of items.
 1. First embodiment
 2. Modification of the first embodiment
 3. Second embodiment
 4. Third embodiment
 5. Chip configuration of the image sensor
 6. Layout example
 7. Other embodiments
 8. Application example to a moving body
 9. Application example to an endoscopic surgery system
 10. Application example to a WSI (Whole Slide Imaging) system
(1. First Embodiment)
[1-1. Configuration of Image Processing System According to First Embodiment]
FIG. 1 is a block diagram illustrating a schematic configuration example of an imaging device as an electronic apparatus according to the first embodiment. As shown in FIG. 1, the imaging device 1 is communicably connected to a cloud server 30. The imaging device 1 and the cloud server 30 are communicably connected via various networks, a USB (Universal Serial Bus) cable, or the like, whether wired or wireless.
The cloud server 30 is an example of a server device that stores image data such as still images and moving images transmitted from the imaging device 1. For example, the cloud server 30 can store image data in arbitrary units, such as per user, per date, or per imaging location, and can also provide various services such as creating an album from the image data.
The imaging device 1 is an example of an electronic apparatus having the image sensor 10 and the application processor 20, and is, for example, a digital camera, a digital video camera, a tablet terminal, or a smartphone. The following embodiments are described using an example in which an image is captured, but the present disclosure is not limited to this, and moving images and the like can be processed in the same way.
The image sensor 10 is, for example, a CMOS (Complementary Metal Oxide Semiconductor) image sensor configured as a single chip. It receives incident light, performs photoelectric conversion, and outputs image data corresponding to the amount of incident light received to the application processor 20.
The application processor 20 is an example of a processor, such as a CPU (Central Processing Unit), that executes various applications. The application processor 20 executes various kinds of processing corresponding to the applications, such as display processing for displaying the image data input from the image sensor 10 on a display, biometric authentication processing using the image data, and transmission processing for transmitting the image data to the cloud server 30.
[1-2. Configuration of Imaging Device According to First Embodiment]
As illustrated in FIG. 1, the imaging device 1 includes an image sensor 10, which is a solid-state imaging device, and an application processor 20. The image sensor 10 includes an imaging unit 11, a control unit 12, a signal processing unit 13, a DSP 14 (also referred to as a processing unit), a memory 15, and a selector 16 (also referred to as an output unit).
The imaging unit 11 includes, for example, an optical system 104 including a zoom lens, a focus lens, an aperture, and the like, and a pixel array unit 101 in which unit pixels, each including a light receiving element such as a photodiode, are arranged in a two-dimensional matrix. Light incident from the outside passes through the optical system 104 and forms an image on the light receiving surface of the pixel array unit 101, on which the light receiving elements are arranged. Each unit pixel of the pixel array unit 101 photoelectrically converts the light incident on its light receiving element, thereby readably accumulating charge corresponding to the amount of incident light.
The imaging unit 11 also includes a converter (Analog to Digital Converter; hereinafter referred to as ADC) 17 (see, for example, FIG. 2). The ADC 17 generates digital image data by converting the analog pixel signal of each unit pixel read from the imaging unit 11 into a digital value, and outputs the generated image data to the signal processing unit 13. The ADC 17 may include a voltage generation circuit or the like that generates a drive voltage for driving the imaging unit 11 from a power supply voltage or the like.
The size of the image data output by the imaging unit 11 can be selected from a plurality of sizes, such as 12M (3968 × 2976) pixels and VGA (Video Graphics Array) size (640 × 480 pixels). In addition, for the image data output by the imaging unit 11, it is possible to select, for example, between an RGB (red, green, blue) color image and a monochrome image containing only luminance. These selections can be made as a kind of shooting mode setting.
The control unit 12 controls each unit in the image sensor 10 according to, for example, a user operation or a set operation mode.
The signal processing unit 13 executes various kinds of signal processing on the digital image data read from the imaging unit 11 or the digital image data read from the memory 15 (hereinafter referred to as the image data to be processed). For example, when the image data to be processed is a color image, the signal processing unit 13 converts the format of this image data into YUV image data, RGB image data, or the like. The signal processing unit 13 also executes processing such as noise removal and white balance adjustment on the image data to be processed as necessary. In addition, the signal processing unit 13 executes, on the image data to be processed, the various kinds of signal processing (also referred to as preprocessing) that the DSP 14 requires in order to process that image data.
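As an illustration of the format conversion mentioned above, the following sketch converts one RGB pixel to YUV. The disclosure does not state which conversion the signal processing unit 13 uses; the widely used ITU-R BT.601 luma weights and the classic analog U/V scale factors are assumed here, and the function name `rgb_to_yuv` is hypothetical.

```python
from typing import Tuple


def rgb_to_yuv(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one RGB pixel to YUV using BT.601 luma weights (an assumption)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v


# A neutral gray pixel carries luminance only, so both chroma terms vanish.
y, u, v = rgb_to_yuv(128.0, 128.0, 128.0)
```

Keeping only the Y component of each pixel would give the luminance-only monochrome image that the imaging unit can also be set to output.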
The DSP 14 functions as a processing unit that executes various kinds of processing using a learned model created by machine learning with a deep neural network (DNN), for example by executing a program stored in the memory 15. For example, the DSP 14 executes arithmetic processing based on the learned model stored in the memory 15, thereby multiplying the dictionary coefficients stored in the memory 15 by the image data. The result obtained by such arithmetic processing (the calculation result) is output to the memory 15 and/or the selector 16. The calculation result may include image data obtained by executing arithmetic processing using the learned model, and various kinds of information (metadata) obtained from that image data. The DSP 14 may also incorporate a memory controller that controls access to the memory 15.
The arithmetic processing includes, for example, processing that uses a learned learning model, which is one example of a neural network calculation model. For example, the DSP 14 can execute DSP processing, that is, the various kinds of processing, using the learned learning model. For example, the DSP 14 reads image data from the memory 15, inputs it to the learned learning model, and obtains, as the output of the learned model, a face position such as the outline of a face or the region of a face image. The DSP 14 then performs processing such as masking, mosaicing, or avatar conversion on the extracted face position in the image data to generate processed image data. After that, the DSP 14 stores the generated processed image data in the memory 15.
The learned learning model includes a DNN, a support vector machine, or the like that has learned, using learning data, tasks such as detecting the position of a person's face. When image data, which is the data to be discriminated, is input, the learned learning model outputs a discrimination result, that is, region information such as an address specifying the face position. The DSP 14 can execute the above arithmetic processing while updating the learning model by changing the weights of various parameters in the learning model using learning data, preparing a plurality of learning models and switching the learning model to be used according to the content of the arithmetic processing, or acquiring or updating a learned learning model from an external device.
The image data to be processed by the DSP 14 may be image data read out normally from the pixel array unit 101, or image data whose data size has been reduced by thinning out pixels of the normally read image data. Alternatively, it may be image data read out at a smaller data size than usual by performing a thinned readout on the pixel array unit 101. Normal readout here may mean reading out without thinning out pixels.
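The thinning described above can be sketched, purely for illustration, as keeping every n-th pixel in both directions. The function name `thin_pixels` and the uniform `step` parameter are assumptions for this example; the description does not specify the thinning pattern.

```python
def thin_pixels(image, step=2):
    """Return a reduced image that keeps every `step`-th pixel in each row
    and every `step`-th row (a simple uniform thinning, assumed here)."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return [row[::step] for row in image[::step]]
```

With `step=2`, the thinned image holds roughly one quarter of the original pixels, so a detector running on it has correspondingly less data to process.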
Through face position extraction and processing with such a learning model, it is possible to generate, for example, processed image data in which the face position in the image data is masked, processed image data in which the face position is mosaiced, or processed image data in which the face position is replaced with a character to form an avatar.
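A minimal sketch of two of these processing operations, masking and mosaicing, is given below. It assumes a single-channel image stored as a list of rows and a face region given as a hypothetical `(x0, y0, x1, y1)` rectangle with exclusive end coordinates; the description itself only says that the model outputs region information such as an address.

```python
def mask_region(image, roi, fill=0):
    """Return a copy of `image` with every pixel inside `roi` set to `fill`."""
    x0, y0, x1, y1 = roi
    out = [row[:] for row in image]
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = fill
    return out


def mosaic_region(image, roi, block=2):
    """Return a copy of `image` in which `roi` is split into block x block
    tiles and each tile is replaced by its (integer) average value."""
    x0, y0, x1, y1 = roi
    out = [row[:] for row in image]
    for by in range(y0, y1, block):
        for bx in range(x0, x1, block):
            ys = range(by, min(by + block, y1))
            xs = range(bx, min(bx + block, x1))
            values = [image[y][x] for y in ys for x in xs]
            average = sum(values) // len(values)
            for y in ys:
                for x in xs:
                    out[y][x] = average
    return out
```

Both functions leave the input image untouched, which mirrors the arrangement in which the unprocessed image data remains available in memory alongside the processed image data.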
The memory 15 stores, as necessary, the image data output from the imaging unit 11, the image data processed by the signal processing unit 13, the calculation results obtained by the DSP 14, and the like. The memory 15 also stores the algorithm of the learned learning model executed by the DSP 14 as a program and dictionary coefficients.
In addition to the image data output from the signal processing unit 13 and the image data that has undergone arithmetic processing and is output from the DSP 14 (hereinafter, processed image data), the memory 15 may store ISO (International Organization for Standardization) sensitivity, exposure time, frame rate, focus, shooting mode, cutout range, and the like. That is, the memory 15 can store various kinds of imaging information set by the user.
The selector 16 selectively outputs the processed image data output from the DSP 14 or the image data stored in the memory 15 in accordance with, for example, a selection control signal from the control unit 12. For example, the selector 16 selects, according to a user setting or the like, either the processed image data stored in the memory 15 or a calculation result such as metadata, and outputs it to the application processor 20.
For example, when the processing mode in which processed image data is output is selected, the selector 16 reads the processed image data generated by the DSP 14 from the memory 15 and outputs it to the application processor 20. On the other hand, when the normal processing mode in which processed image data is not output is selected, the selector 16 outputs the image data input from the signal processing unit 13 to the application processor 20. When the first processing mode is selected, the selector 16 may output the calculation result output from the DSP 14 directly to the application processor 20.
The image data and processed image data output from the selector 16 as described above are input to the application processor 20, which handles display, the user interface, and the like. The application processor 20 is configured using, for example, a CPU, and executes an operating system, various kinds of application software, and the like. The application processor 20 may also have functions such as a GPU (Graphics Processing Unit) and a baseband processor. The application processor 20 performs various kinds of processing on the input image data and calculation results as necessary, displays them to the user, or transmits them to an external cloud server 30 via a predetermined network 40.
Various networks can be used as the predetermined network 40, such as the Internet, a wired LAN (Local Area Network) or wireless LAN, a mobile communication network, or Bluetooth (registered trademark). The destination of the image data and calculation results is not limited to the cloud server 30 and may be any of various information processing devices (systems) having a communication function, such as a server that operates alone, a file server that stores various kinds of data, or a communication terminal such as a mobile phone.
[1-3. Description of Image Processing According to First Embodiment]
FIG. 2 is a diagram illustrating image processing according to the first embodiment. As shown in FIG. 2, the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the result in the memory 15. The DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (process 1).
Next, the DSP 14 executes processing that applies masking, mosaicing, or the like to the detected face position (process 2) to generate processed image data, and stores it in the memory 15. After that, the selector 16 outputs the processed image data, in which the face area has been processed, to the application processor 20 in accordance with the user's selection.
[1-4. Process Flow According to First Embodiment]
FIG. 3 is a flowchart showing the flow of the processing according to the first embodiment. As shown in FIG. 3, the image data captured by the imaging unit 11 is stored in the memory 15 (S101).
The DSP 14 then reads the image data from the memory 15 (S102) and detects the face position using the learned learning model (S103). Next, the DSP 14 generates processed image data in which the face position in the image data has been processed, and stores it in the memory 15 (S104).
After that, when the processing mode, that is, the mode with processing applied, is selected (S105: Yes), the selector 16 reads the processed image data from the memory 15 and outputs it to an external device such as the application processor 20 (S106).
On the other hand, when the normal processing mode, that is, the mode without processing, is selected (S105: No), the selector 16 reads the unprocessed image data from the memory 15 and outputs it to an external device such as the application processor 20 (S107).
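The flow of steps S101 to S107 can be summarized by the following sketch. It is illustrative only: the dictionary standing in for the memory 15, the `detect` and `process` callables standing in for the learned model and the processing of the DSP 14, and the boolean `processing_mode` flag are all assumptions introduced for this example.

```python
def run_first_embodiment_flow(captured, detect, process, processing_mode):
    """Walk through S101-S107 with hypothetical stand-ins for each block."""
    memory = {}
    memory["image"] = captured                    # S101: store captured image
    image = memory["image"]                       # S102: DSP reads the image
    face_roi = detect(image)                      # S103: detect face position
    memory["processed"] = process(image, face_roi)  # S104: store processed image
    if processing_mode:                           # S105: Yes
        return memory["processed"]                # S106: output processed image
    return memory["image"]                        # S107: output unprocessed image
```

Note that, as in the flowchart, the processed image data is generated regardless of the selected mode; the mode only determines which stored image the selector outputs.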
[1-5. Action and Effect]
As described above, the image sensor 10 can execute the processing in a closed area within one chip even when processing is required. This keeps captured image data from being output to the outside unmodified, which improves security and protects privacy. In addition, because the image sensor 10 lets the user select whether or not processing is applied, the processing mode can be chosen to suit the application, improving user convenience.
(2. Modification of First Embodiment)
The first embodiment described an example in which masking or the like is applied to the face position, but the processing is not limited to this. For example, a partial image in which the face position is extracted can also be generated.
FIG. 4 is a diagram illustrating a modification of the first embodiment. As shown in FIG. 4, the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the result in the memory 15. The DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (process 1).
Next, the DSP 14 generates partial image data in which the detected face position is extracted (process 2) and stores it in the memory 15. After that, the selector 16 outputs the partial image data of the face to the application processor 20 in accordance with the user's selection.
As described above, the image sensor 10 can extract partial image data in a closed area within one chip even when processing is required, so it can output an image suited to the processing performed by the application processor 20, such as person identification, face authentication, or image collection for each person. As a result, transmission of unnecessary images can be suppressed, security can be improved, privacy can be protected, and the data volume can be reduced.
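The extraction of partial image data and the resulting reduction in data volume can be illustrated as follows. The `(x0, y0, x1, y1)` rectangle with exclusive end coordinates and both function names are assumptions made for this sketch.

```python
def crop_region(image, roi):
    """Return only the pixels of `image` that fall inside `roi`."""
    x0, y0, x1, y1 = roi
    return [row[x0:x1] for row in image[y0:y1]]


def pixel_count(image):
    """Total number of pixels in a row-major image."""
    return sum(len(row) for row in image)
```

For instance, cropping a 64 × 64 face region out of a VGA (640 × 480) frame leaves 4,096 of the original 307,200 pixels to be transmitted.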
(3. Second Embodiment)
[3-1. Description of Imaging Apparatus According to Second Embodiment]
The first embodiment described an example in which the DSP 14 executes the processing, but the configuration is not limited to this, and the selector 16 can also perform the processing. The second embodiment therefore describes an example in which the selector 16 performs the processing.
FIG. 5 is a diagram illustrating an imaging apparatus according to the second embodiment. As shown in FIG. 5, the configuration of the image sensor 10 according to the second embodiment is the same as that of the image sensor 10 according to the first embodiment, so a detailed description is omitted. The difference from the first embodiment is that the DSP 14 of the image sensor 10 notifies the selector 16 of the position information of the face position extracted using the learning model.
For example, as shown in FIG. 5, the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the result in the memory 15. The DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (process 1). The DSP 14 then notifies the selector 16 of position information, such as an address specifying the face position.
When processing has been selected by the user, the selector 16 reads the image data from the memory 15 and uses the position information acquired from the DSP 14 to specify the ROI (Region of Interest) to be processed. The selector 16 then performs processing such as masking on the specified ROI to generate processed image data (process 2), and outputs the processed image data to the application processor 20. The selector 16 can also store the processed image data in the memory 15.
[3-2. First Modification of Second Embodiment]
As in the modification of the first embodiment, in the second embodiment the selector 16 can also generate a partial image in which the face position is extracted.
FIG. 6 is a diagram illustrating a first modification of the second embodiment. As shown in FIG. 6, the signal processing unit 13 performs signal processing on the image data read from the imaging unit 11 and stores the result in the memory 15. The DSP 14 reads the image data from the memory 15, executes face detection using the learned learning model, and detects a face position from the image data (process 1). The DSP 14 then notifies the selector 16 of position information, such as an address specifying the face position.
Next, when processing has been selected by the user, the selector 16 reads the image data from the memory 15 and uses the position information acquired from the DSP 14 to specify the ROI (Region of Interest) to be processed. After that, the selector 16 generates partial image data in which the portion corresponding to the ROI is extracted from the image data (process 2), and outputs it to the application processor 20.
[3-3. Second Modification of Second Embodiment]
In the second embodiment and its first modification described above, the selector 16 performs process 2, such as extraction of the ROI (also referred to as cutout or trimming) or processing (masking or the like), on the image data stored in the memory 15. However, the configuration is not limited to this. For example, the selector 16 may be configured to perform process 2, such as ROI cutout or processing (masking or the like), directly on the image data output from the signal processing unit 13.
[3-4. Third Modification of Second Embodiment]
The image data read from the imaging unit 11 can itself be partial image data containing only the ROI, or image data that does not include the ROI. In that case, the face position extracted by the DSP 14 from a first frame is notified to the control unit 12, and the control unit 12 causes the imaging unit 11 to read out, in a second frame that follows the first frame, partial image data from the pixel region corresponding to the ROI, or image data from the pixel region corresponding to the area other than the ROI.
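The two-frame arrangement described above can be sketched as follows. This is an illustration under stated assumptions: the `detect` callable stands in for the DSP 14, the rectangle `(x0, y0, x1, y1)` with exclusive end coordinates stands in for the notified face position, and pixels that are not read out are represented by `None`.

```python
def read_pixels(pixel_array, roi, roi_only=True):
    """Read either only the ROI, or everything except the ROI.

    With roi_only=False, pixels inside the ROI are not read and are
    represented as None (a convention assumed for this sketch).
    """
    x0, y0, x1, y1 = roi
    if roi_only:
        return [row[x0:x1] for row in pixel_array[y0:y1]]
    return [
        [None if (x0 <= x < x1 and y0 <= y < y1) else value
         for x, value in enumerate(row)]
        for y, row in enumerate(pixel_array)
    ]


def second_frame_readout(first_frame, second_pixel_array, detect, roi_only=True):
    """Detect the ROI on the first frame, then apply it to the second frame's readout."""
    roi = detect(first_frame)
    return read_pixels(second_pixel_array, roi, roi_only)
```

The sketch assumes that the face has not moved appreciably between the two frames, since the ROI found in the first frame is applied unchanged to the second.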
In the second embodiment and its modifications, the selector 16 is not limited to processing such as masking. It can also output the image data after rewriting only the area corresponding to the ROI with another image, or output the image data without reading only the area corresponding to the ROI from the memory 15. This processing can also be executed by the DSP 14 in the first embodiment.
As described above, because the image sensor 10 can execute the processing in the selector 16, the processing load on the DSP 14 can be reduced when processing is unnecessary. In addition, because the image sensor 10 can output the image processed by the selector 16 without saving it in the memory 15, the used capacity of the memory 15 can be reduced, which allows the memory 15 to be made cheaper and smaller. As a result, the image sensor 10 as a whole can also be made smaller.
(4. Third Embodiment)
[4-1. Description of Imaging Apparatus According to Third Embodiment]
The image sensor 10 can speed up processing by reading out small-capacity image data and detecting the face position before the entire image data is read out from the imaging unit 11. The third embodiment therefore describes an example of speeding up processing in this way.
FIG. 7 is a diagram illustrating an imaging apparatus according to the third embodiment. As shown in FIG. 7, the configuration of the image sensor 10 according to the third embodiment is the same as that of the image sensor 10 according to the first embodiment, so a detailed description is omitted. Only the differences from the first embodiment are described here.
For example, as shown in FIG. 7, when reading out image data from all unit pixels, the imaging unit 11 first performs a thinned readout from selected target unit pixels rather than from all unit pixels, and stores the thinned, small-capacity image data in the memory 15. In parallel with this, the imaging unit 11 executes normal readout of the image data.
The DSP 14 then reads the small-capacity image data from the memory 15, executes face detection using the learned learning model, and detects a face position from that image data (process 1). The DSP 14 then notifies the selector 16 of position information, such as an address specifying the face position.
After that, when the normal image data read out by the imaging unit 11 is input, the selector 16 uses the position information acquired from the DSP 14 to specify, in the normal image data, the ROI (Region of Interest) to be processed. The selector 16 then performs processing such as masking on the area corresponding to the ROI to generate processed image data (process 2), and outputs the processed image data to the application processor 20.
[4-2. Process Flow According to Third Embodiment]
Next, the flow of the processing described with reference to FIG. 7 is explained. FIG. 8 is a sequence diagram showing the flow of the processing according to the third embodiment. As shown in FIG. 8, the imaging unit 11 reads out the image with thinning (S201) and stores the thinned, small-capacity image data in the memory 15 (S202). After that, the imaging unit 11 continues reading out the normal image data.
In parallel with this, the DSP 14 executes face detection on the small-capacity image data using a DNN or the like and detects the face position (S203). The DSP 14 then notifies the selector 16 of the position information of the detected face position (S205 and S206).
The selector 16 then holds the position information of the face position notified from the DSP 14 (S207). After that, when readout of the normal image data is complete, the imaging unit 11 outputs it to the selector 16 (S209 and S210), and the selector 16 uses the position information of the face position to specify the face position in the normal image data (S211).
After that, the selector 16 generates processed image data in which the face position has been processed (S212), and outputs the processed image data to an external device (S213). For example, the selector 16 cuts out and outputs only the position of the face detected by the DNN. In this way, the image sensor 10 can detect the face position before readout of the normal image data is complete, so it can execute the processing without delay once the image data has been read out, and processing is faster than in the first embodiment.
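Because the face position is detected on the thinned image but applied to the normal image, the position information has to be expressed in the coordinates of the normal image. The description does not state how this mapping is done; the following sketch assumes uniform thinning that keeps every `step`-th pixel, a rectangle `(x0, y0, x1, y1)` with exclusive end coordinates, and a hypothetical function name.

```python
def scale_roi(roi, step, width, height):
    """Map an ROI found on a thinned image back to full-resolution coordinates.

    Pixel i of the thinned image corresponds to pixel i * step of the
    full image, so each coordinate is multiplied by `step` and clipped
    to the full image size.
    """
    x0, y0, x1, y1 = roi
    return (
        min(x0 * step, width),
        min(y0 * step, height),
        min(x1 * step, width),
        min(y1 * step, height),
    )
```

Under these assumptions, a face found at `(1, 1, 3, 2)` on an image thinned with `step=4` corresponds to the region `(4, 4, 12, 8)` of a 16 × 8 full-resolution image.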
(5. Image Sensor Chip Configuration)
Next, an example of the chip configuration of the image sensor 10 shown in FIG. 1 is described in detail below with reference to the drawings.
FIG. 9 is a schematic diagram showing an example of the chip configuration of the image sensor according to the present embodiment. As shown in FIG. 9, the image sensor 10 has a stacked structure in which a rectangular flat-plate first substrate (die) 100 and a likewise rectangular flat-plate second substrate (die) 120 are bonded together.
The first substrate 100 and the second substrate 120 may have, for example, the same size. The first substrate 100 and the second substrate 120 may each be a semiconductor substrate such as a silicon substrate.
Of the configuration of the image sensor 10 shown in FIG. 1, the pixel array unit 101 of the imaging unit 11 is arranged on the first substrate 100. Part or all of the optical system 104 may also be provided on-chip on the first substrate 100.
Of the configuration of the image sensor 10 shown in FIG. 1, the ADC 17, the control unit 12, the signal processing unit 13, the DSP 14, the memory 15, and the selector 16 are arranged on the second substrate 120. An interface circuit, a driver circuit, and the like (not shown) may also be arranged on the second substrate 120.
The first substrate 100 and the second substrate 120 may be bonded by a so-called CoC (Chip on Chip) method, in which the first substrate 100 and the second substrate 120 are each singulated into chips and the singulated substrates are then bonded together. Alternatively, a so-called CoW (Chip on Wafer) method may be used, in which one of the first substrate 100 and the second substrate 120 (for example, the first substrate 100) is singulated into chips and the singulated first substrate 100 is then bonded to the second substrate 120 before singulation (that is, in the wafer state). A so-called WoW (Wafer on Wafer) method, in which the first substrate 100 and the second substrate 120 are bonded together while both are in the wafer state, may also be used.
As a method for joining the first substrate 100 and the second substrate 120, plasma bonding or the like can be used, for example. However, the method is not limited to this, and various joining methods may be used.
(6. Layout Example)
FIGS. 10 and 11 are diagrams for explaining a layout example according to the present embodiment. FIG. 10 shows a layout example of the first substrate 100, and FIG. 11 shows a layout example of the second substrate 120.
[6-1. Layout Example of First Substrate]
As shown in FIG. 10, of the configuration of the image sensor 10 shown in FIG. 1, the pixel array unit 101 of the imaging unit 11 is arranged on the first substrate 100. When part or all of the optical system 104 is mounted on the first substrate 100, it is provided at a position corresponding to the pixel array unit 101.
The pixel array unit 101 is arranged offset toward one side L101 among the four sides L101 to L104 of the first substrate 100. In other words, the pixel array unit 101 is arranged so that its center O101 is closer to the side L101 than the center O100 of the first substrate 100 is. When the surface of the first substrate 100 on which the pixel array unit 101 is provided is rectangular, the side L101 may be, for example, a shorter side. However, the arrangement is not limited to this, and the pixel array unit 101 may be offset toward a longer side.
In a region close to the side L101 among the four sides of the pixel array unit 101, in other words, in the region between the side L101 and the pixel array unit 101, a TSV array 102 is provided. In the TSV array 102, a plurality of through wirings (through-silicon vias; hereinafter, TSVs) penetrating the first substrate 100 are arrayed as wiring for electrically connecting each unit pixel 101a in the pixel array unit 101 to the ADC 17 arranged on the second substrate 120. Placing the TSV array 102 close to the side L101, to which the pixel array unit 101 is also close, makes it easier to secure space on the second substrate 120 for arranging each unit such as the ADC 17.
Note that the TSV array 102 may also be provided in a region close to one side L104 (or, alternatively, the side L103) of the two sides L103 and L104 that intersect the side L101, in other words, in a region between the side L104 (or the side L103) and the pixel array unit 101.
Of the four sides L101 to L104 of the first substrate 100, each of the sides L102 to L103, toward which the pixel array unit 101 is not offset, is provided with a pad array 103 made up of a plurality of pads arranged in a straight line. The pads included in the pad array 103 include, for example, pads to which a power supply voltage for analog circuits such as the pixel array unit 101 and the ADC 17 is applied (also referred to as power supply pins), pads to which a power supply voltage for digital circuits such as the signal processing unit 13, the DSP 14, the memory 15, the selector 16, and the control unit 12 is applied (also referred to as power supply pins), pads for interfaces such as MIPI (Mobile Industry Processor Interface) and SPI (Serial Peripheral Interface) (also referred to as signal pins), and pads for input and output of clocks and data (also referred to as signal pins). Each pad is electrically connected to, for example, an external power supply circuit or interface circuit via a wire. Each pad array 103 and the TSV array 102 are preferably separated far enough that the influence of signal reflection from the wires connected to the pads in the pad array 103 can be ignored.
[6-2. Layout example of second substrate]
On the other hand, as shown in FIG. 11, the ADC 17, the control unit 12, the signal processing unit 13, the DSP 14, and the memory 15 in the configuration of the image sensor 10 shown in FIG. 1 are arranged on the second substrate 120. In the first layout example, the memory 15 is divided into two regions, a memory 15A and a memory 15B. Similarly, the ADC 17 is divided into two regions, an ADC 17A and a DAC (Digital-to-Analog Converter) 17B. The DAC 17B supplies a reference voltage for AD conversion to the ADC 17A and is, in a broad sense, part of the ADC 17. Although not shown in FIG. 10, the selector 16 is also arranged on the second substrate 120.
Further, the second substrate 120 is provided with wiring 122 that is electrically connected by contact with each TSV in the TSV array 102 penetrating the first substrate 100 (hereinafter simply referred to as the TSV array 102), and with a pad array 123 in which a plurality of pads electrically connected to the respective pads in the pad array 103 of the first substrate 100 are arranged in a straight line.
For the connection between the TSV array 102 and the wiring 122, it is possible to adopt, for example, a so-called twin TSV method, in which two TSVs, namely a TSV provided in the first substrate 100 and a TSV provided from the first substrate 100 to the second substrate 120, are connected on the outer surface of the chip, or a so-called shared TSV method, in which the connection is made by a common TSV provided from the first substrate 100 to the second substrate 120. However, the connection is not limited to these, and various connection forms can be adopted, such as a so-called Cu-Cu bonding method, in which copper (Cu) exposed on the bonding surface of the first substrate 100 and copper (Cu) exposed on the bonding surface of the second substrate 120 are bonded to each other.
The pads in the pad array 103 of the first substrate 100 and the pads in the pad array 123 of the second substrate 120 are connected by, for example, wire bonding. However, the connection is not limited to this and may take other forms, such as through holes or castellations.
In the layout example of the second substrate 120, for example, the vicinity of the wiring 122 connected to the TSV array 102 is taken as the upstream side, and the ADC 17A, the signal processing unit 13, and the DSP 14 are arranged in this order from upstream along the flow of the signal read from the pixel array unit 101. That is, the ADC 17A, to which the pixel signal read from the pixel array unit 101 is input first, is arranged near the wiring 122 on the most upstream side, the signal processing unit 13 is arranged next, and the DSP 14 is arranged in the region farthest from the wiring 122. Arranging the units from the ADC 17 to the DSP 14 from the upstream side along the signal flow makes it possible to shorten the wiring that connects them. This in turn makes it possible to reduce signal delay, reduce signal propagation loss, improve the S/N ratio, and reduce power consumption.
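The effect of ordering the blocks along the signal flow can be illustrated with a rough wiring-length estimate. The Python sketch below is illustrative only: the grid coordinates, the block names used as dictionary keys, and the function `chain_length` are hypothetical and are not part of the disclosure. It sums the Manhattan distance between consecutive blocks in the signal chain for two placements.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def chain_length(positions: Dict[str, Point], order: List[str]) -> float:
    """Sum of Manhattan distances between consecutive blocks in the chain."""
    total = 0.0
    for a, b in zip(order, order[1:]):
        (ax, ay), (bx, by) = positions[a], positions[b]
        total += abs(ax - bx) + abs(ay - by)
    return total


# Signal flow described in the text: wiring 122 -> ADC 17A -> unit 13 -> DSP 14.
signal_chain = ["wiring_122", "adc_17a", "signal_processing_13", "dsp_14"]

# Hypothetical placement that follows the signal flow from upstream to downstream.
in_flow_order = {
    "wiring_122": (0, 0),
    "adc_17a": (1, 0),
    "signal_processing_13": (2, 0),
    "dsp_14": (3, 0),
}

# Hypothetical placement using the same sites but ignoring the signal flow.
out_of_order = {
    "wiring_122": (0, 0),
    "adc_17a": (3, 0),
    "signal_processing_13": (1, 0),
    "dsp_14": (2, 0),
}

print(chain_length(in_flow_order, signal_chain))
print(chain_length(out_of_order, signal_chain))
```

With these assumed coordinates, the placement that follows the signal flow yields the shorter total, which is the property the layout relies on.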
In addition, the control unit 12 is arranged, for example, near the wiring 122 on the upstream side. In FIG. 10, the control unit 12 is arranged between the ADC 17A and the signal processing unit 13. Such a layout makes it possible to reduce signal delay, reduce signal propagation loss, improve the S/N ratio, and reduce power consumption when the control unit 12 controls the pixel array unit 101. This layout also has the advantage that the signal pins and power supply pins for analog circuits can be grouped near the analog circuits (for example, on the lower side in FIG. 10), the remaining signal pins and power supply pins for digital circuits can be grouped near the digital circuits (for example, on the upper side in FIG. 10), and the power supply pins for analog circuits can be placed sufficiently far from the power supply pins for digital circuits.
In the layout shown in FIG. 10, the DSP 14 is arranged on the most downstream side, that is, on the side opposite to the ADC 17A. In other words, this layout makes it possible to arrange the DSP 14 in a region that does not overlap the pixel array unit 101 in the stacking direction of the first substrate 100 and the second substrate 120 (hereinafter simply referred to as the vertical direction).
With a configuration in which the pixel array unit 101 and the DSP 14 do not overlap in the vertical direction, it is possible to reduce the amount of noise generated by the signal processing of the DSP 14 that enters the pixel array unit 101. As a result, even when the DSP 14 operates as a processing unit that executes computation based on a learned model, noise caused by the signal processing of the DSP 14 is less likely to enter the pixel array unit 101, so that an image with reduced quality degradation can be acquired.
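The non-overlap condition can be illustrated with a small geometric check. The Python sketch below is illustrative only: the `Rect` type, the function `overlaps`, and all coordinates are hypothetical and do not come from the disclosure. It tests whether the footprints of two blocks, projected along the stacking direction, share any area.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned footprint of a block, in arbitrary layout units."""
    x0: float
    y0: float
    x1: float
    y1: float


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if the two footprints share a region of non-zero area."""
    return a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1


# Hypothetical footprints: the pixel array is offset toward one side of the
# first substrate, and the DSP sits toward the opposite side of the second.
pixel_array_101 = Rect(0.0, 0.0, 6.0, 8.0)
dsp_14 = Rect(7.0, 2.0, 10.0, 6.0)

# A hypothetical alternative placement in which the DSP slides under the array.
dsp_14_shifted = Rect(5.0, 2.0, 8.0, 6.0)

print(overlaps(pixel_array_101, dsp_14))          # footprints are disjoint
print(overlaps(pixel_array_101, dsp_14_shifted))  # footprints share a strip
```

A check of this kind only captures the geometric condition; how much noise coupling actually decreases depends on the circuit and process and is not modeled here.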
Note that the DSP 14 and the signal processing unit 13 are connected by a connection portion 14a that is formed by a part of the DSP 14 or by a signal line. The selector 16 is arranged, for example, near the DSP 14. When the connection portion 14a is a part of the DSP 14, a part of the DSP 14 overlaps the pixel array unit 101 in the vertical direction. Even in this case, noise entering the pixel array unit 101 can be reduced compared with the case where the entire DSP 14 overlaps the pixel array unit 101 in the vertical direction.
The memories 15A and 15B are arranged, for example, so as to surround the DSP 14 from three directions. Arranging the memories 15A and 15B around the DSP 14 in this way makes it possible to shorten the wiring distance between the DSP 14 and each memory element in the memory 15 overall while also evening it out. This makes it possible to reduce signal delay, signal propagation loss, and power consumption when the DSP 14 accesses the memory 15.
The pad array 123 is arranged, for example, at a position on the second substrate 120 that corresponds in the vertical direction to the pad array 103 of the first substrate 100. Of the pads included in the pad array 123, the pads located near the ADC 17A are used for propagating the power supply voltage and analog signals for the analog circuits (mainly the ADC 17A). On the other hand, the pads located near the control unit 12, the signal processing unit 13, the DSP 14, and the memories 15A and 15B are used for propagating the power supply voltage and digital signals for the digital circuits (mainly the control unit 12, the signal processing unit 13, the DSP 14, and the memories 15A and 15B). Such a pad layout makes it possible to shorten the wiring distance between each pad and each unit. This makes it possible to reduce signal delay, reduce propagation loss of signals and the power supply voltage, improve the S/N ratio, and reduce power consumption.
(7. Other embodiments)
The processing according to each of the embodiments described above may be carried out in various forms other than those embodiments.
For example, in addition to what is described in the above embodiments, the processing can include various processes that depend on what the learning model has been trained to do. For example, it is possible not only to extract the entire face but also to extract the outline of the face, to extract only a part such as the eyes or the nose, to extract the owner of the imaging device 1 or a specific person, or to extract a part such as a nameplate or a window from an image of a house. It is also possible to extract an outdoor portion that appears in image data of a room, to extract people and animals separately, or to extract a window portion from image data. Examples of the processing also include reading out only an extracted specific region such as a face, reading out everything except the specific region, painting the specific region black, and reading out an image in which only the specific region is cut out. The extracted region is not limited to a rectangle and may be any shape, such as a triangle. Processing such as masking or mosaic processing is not limited to a single process, and a plurality of processes can be combined. Extraction of a face position or the like is not limited to the DSP 14 and can also be executed by the signal processing unit 13.
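Two of the processing examples above, painting a specific region black and cutting out only that region, can be sketched as follows. This Python sketch is illustrative only: the image is a small list of grayscale rows, and `black_out`, `crop`, and `face_box` are hypothetical names; in the described device the region would be supplied by the learned model, which is not modeled here.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, values 0..255
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def black_out(image: Image, box: Box) -> Image:
    """Return a copy of the image with the region painted black (0)."""
    top, left, bottom, right = box
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = 0
    return out


def crop(image: Image, box: Box) -> Image:
    """Return only the pixels inside the region."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# A 3x4 test frame whose pixel value encodes its position (10 * row + column).
frame = [[10 * y + x for x in range(4)] for y in range(3)]

# Placeholder for a region reported by the extraction step (assumed values).
face_box = (1, 1, 3, 3)

masked = black_out(frame, face_box)  # region replaced by zeros
cut = crop(frame, face_box)          # only the region is kept
print(masked)
print(cut)
```

Because both functions take the same box, they can be chained or combined with other operations, in line with the remark that several processes may be combined.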
In the above embodiments, a learning model trained with a DNN has been given as an example. However, various neural networks other than a DNN, such as an RNN (Recurrent Neural Networks) or a CNN (Convolutional Neural Network), can be used. Further, the learning model is not limited to one using a DNN or the like, and a learning model trained by various other machine learning methods, such as a decision tree or a support vector machine, can also be used.
The processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and in the drawings can be changed arbitrarily unless otherwise specified. The specific examples, distributions, numerical values, and the like described in the embodiments are merely examples and can be changed arbitrarily.
The components of each illustrated device are functional and conceptual, and do not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of it can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. For example, the control unit 12 and the signal processing unit 13 shown in FIG. 1 may be integrated.
(8. Application example to mobile object)
The technology according to the present disclosure (the present technology) can be applied to various products. For example, the technology according to the present disclosure may be realized as a device mounted on any type of mobile object, such as an automobile, an electric vehicle, a hybrid electric vehicle, a motorcycle, a bicycle, a personal mobility device, an airplane, a drone, a ship, or a robot.
FIG. 12 is a block diagram illustrating a schematic configuration example of a vehicle control system, which is an example of a mobile object control system to which the technology according to the present disclosure can be applied.
The vehicle control system 12000 includes a plurality of electronic control units connected via a communication network 12001. In the example shown in FIG. 12, the vehicle control system 12000 includes a drive system control unit 12010, a body system control unit 12020, an outside-vehicle information detection unit 12030, an in-vehicle information detection unit 12040, and an integrated control unit 12050. As the functional configuration of the integrated control unit 12050, a microcomputer 12051, an audio image output unit 12052, and an in-vehicle network I/F (Interface) 12053 are illustrated.
The drive system control unit 12010 controls the operation of devices related to the drive system of the vehicle in accordance with various programs. For example, the drive system control unit 12010 functions as a control device for a driving force generation device that generates the driving force of the vehicle, such as an internal combustion engine or a drive motor, a driving force transmission mechanism that transmits the driving force to the wheels, a steering mechanism that adjusts the steering angle of the vehicle, a braking device that generates the braking force of the vehicle, and the like.
The body system control unit 12020 controls the operation of various devices mounted on the vehicle body in accordance with various programs. For example, the body system control unit 12020 functions as a control device for a keyless entry system, a smart key system, a power window device, or various lamps such as headlamps, back lamps, brake lamps, turn signals, and fog lamps. In this case, radio waves transmitted from a portable device that substitutes for a key, or signals from various switches, can be input to the body system control unit 12020. The body system control unit 12020 accepts the input of these radio waves or signals and controls the door lock device, the power window device, the lamps, and the like of the vehicle.
The outside-vehicle information detection unit 12030 detects information about the outside of the vehicle on which the vehicle control system 12000 is mounted. For example, an imaging unit 12031 is connected to the outside-vehicle information detection unit 12030. The outside-vehicle information detection unit 12030 causes the imaging unit 12031 to capture an image of the outside of the vehicle and receives the captured image. On the basis of the received image, the outside-vehicle information detection unit 12030 may perform object detection processing or distance detection processing for a person, a vehicle, an obstacle, a sign, characters on the road surface, or the like.
The imaging unit 12031 is an optical sensor that receives light and outputs an electric signal corresponding to the amount of light received. The imaging unit 12031 can output the electric signal as an image or as distance measurement information. The light received by the imaging unit 12031 may be visible light or invisible light such as infrared light.
The in-vehicle information detection unit 12040 detects information about the inside of the vehicle. For example, a driver state detection unit 12041 that detects the state of the driver is connected to the in-vehicle information detection unit 12040. The driver state detection unit 12041 includes, for example, a camera that captures an image of the driver. On the basis of the detection information input from the driver state detection unit 12041, the in-vehicle information detection unit 12040 may calculate the degree of fatigue or the degree of concentration of the driver, or may determine whether the driver is dozing off.
The microcomputer 12051 can calculate a control target value for the driving force generation device, the steering mechanism, or the braking device on the basis of the information about the inside and outside of the vehicle acquired by the outside-vehicle information detection unit 12030 or the in-vehicle information detection unit 12040, and can output a control command to the drive system control unit 12010. For example, the microcomputer 12051 can perform cooperative control aimed at realizing the functions of an ADAS (Advanced Driver Assistance System), including vehicle collision avoidance or impact mitigation, following travel based on the inter-vehicle distance, constant-speed travel, vehicle collision warning, and vehicle lane departure warning.
In addition, the microcomputer 12051 can perform cooperative control aimed at automated driving or the like, in which the vehicle travels autonomously without depending on the driver's operation, by controlling the driving force generation device, the steering mechanism, the braking device, and the like on the basis of the information about the surroundings of the vehicle acquired by the outside-vehicle information detection unit 12030 or the in-vehicle information detection unit 12040.
In addition, the microcomputer 12051 can output a control command to the body system control unit 12020 on the basis of the information about the outside of the vehicle acquired by the outside-vehicle information detection unit 12030. For example, the microcomputer 12051 can perform cooperative control aimed at preventing glare, such as switching from high beam to low beam, by controlling the headlamps according to the position of a preceding vehicle or an oncoming vehicle detected by the outside-vehicle information detection unit 12030.
The audio image output unit 12052 transmits an output signal of at least one of audio and an image to an output device capable of visually or audibly notifying an occupant of the vehicle or the outside of the vehicle of information. In the example of FIG. 12, an audio speaker 12061, a display unit 12062, and an instrument panel 12063 are illustrated as output devices. The display unit 12062 may include, for example, at least one of an on-board display and a head-up display.
FIG. 13 is a diagram illustrating an example of installation positions of the imaging unit 12031.
In FIG. 13, imaging units 12101, 12102, 12103, 12104, and 12105 are provided as the imaging unit 12031.
The imaging units 12101, 12102, 12103, 12104, and 12105 are provided at positions on the vehicle 12100 such as the front nose, the side mirrors, the rear bumper, the back door, and the upper part of the windshield in the vehicle interior. The imaging unit 12101 provided on the front nose and the imaging unit 12105 provided at the upper part of the windshield in the vehicle interior mainly acquire images of the area in front of the vehicle 12100. The imaging units 12102 and 12103 provided on the side mirrors mainly acquire images of the areas to the sides of the vehicle 12100. The imaging unit 12104 provided on the rear bumper or the back door mainly acquires images of the area behind the vehicle 12100. The imaging unit 12105 provided at the upper part of the windshield in the vehicle interior is mainly used for detecting a preceding vehicle, a pedestrian, an obstacle, a traffic light, a traffic sign, a lane, or the like.
Note that FIG. 13 shows an example of the imaging ranges of the imaging units 12101 to 12104. The imaging range 12111 indicates the imaging range of the imaging unit 12101 provided on the front nose, the imaging ranges 12112 and 12113 indicate the imaging ranges of the imaging units 12102 and 12103 provided on the side mirrors, respectively, and the imaging range 12114 indicates the imaging range of the imaging unit 12104 provided on the rear bumper or the back door. For example, by superimposing the image data captured by the imaging units 12101 to 12104, an overhead image of the vehicle 12100 as viewed from above is obtained.
At least one of the imaging units 12101 to 12104 may have a function of acquiring distance information. For example, at least one of the imaging units 12101 to 12104 may be a stereo camera made up of a plurality of imaging elements, or may be an imaging element having pixels for phase difference detection.
For example, on the basis of the distance information obtained from the imaging units 12101 to 12104, the microcomputer 12051 obtains the distance to each three-dimensional object within the imaging ranges 12111 to 12114 and the temporal change of this distance (the relative speed with respect to the vehicle 12100). It can thereby extract, as a preceding vehicle, the nearest three-dimensional object that is on the travel path of the vehicle 12100 and is traveling at a predetermined speed (for example, 0 km/h or more) in substantially the same direction as the vehicle 12100. Further, the microcomputer 12051 can set in advance an inter-vehicle distance to be secured to the preceding vehicle, and can perform automatic brake control (including follow-up stop control), automatic acceleration control (including follow-up start control), and the like. In this way, cooperative control aimed at automated driving or the like, in which the vehicle travels autonomously without depending on the driver's operation, can be performed.
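The preceding-vehicle extraction described above can be sketched as a filter followed by a nearest-object selection. The Python sketch below is illustrative only: the `TrackedObject` fields, the function names, the sampling interval, and the way the object speed is estimated (own speed plus relative speed) are assumptions made for this example and are not specified in the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrackedObject:
    distance_m: float       # current distance from the own vehicle
    prev_distance_m: float  # distance one sample earlier
    on_travel_path: bool    # lies on the own vehicle's travel path
    same_direction: bool    # heading roughly matches the own vehicle's


def relative_speed_kmh(obj: TrackedObject, dt_s: float) -> float:
    """Temporal change of the distance, converted from m/s to km/h."""
    return (obj.distance_m - obj.prev_distance_m) / dt_s * 3.6


def pick_preceding_vehicle(
    objects: List[TrackedObject],
    own_speed_kmh: float,
    dt_s: float,
    min_speed_kmh: float = 0.0,
) -> Optional[TrackedObject]:
    """Nearest on-path object moving the same way at min_speed_kmh or more."""
    candidates = []
    for obj in objects:
        if not (obj.on_travel_path and obj.same_direction):
            continue
        # Assumed estimate of the object's own speed along the travel direction.
        object_speed = own_speed_kmh + relative_speed_kmh(obj, dt_s)
        if object_speed >= min_speed_kmh:
            candidates.append(obj)
    return min(candidates, key=lambda o: o.distance_m, default=None)


objects = [
    TrackedObject(30.0, 30.1, True, True),    # slightly slower car ahead
    TrackedObject(20.0, 20.0, False, True),   # nearer, but in another lane
    TrackedObject(50.0, 49.5, True, True),    # farther car pulling away
    TrackedObject(25.0, 27.0, True, False),   # oncoming object
]

preceding = pick_preceding_vehicle(objects, own_speed_kmh=60.0, dt_s=0.1)
print(preceding)
```

In this sample the object at 30 m is selected: the nearer objects are rejected because one is off the travel path and the other is not heading in the same direction.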
For example, on the basis of the distance information obtained from the imaging units 12101 to 12104, the microcomputer 12051 can classify three-dimensional object data into two-wheeled vehicles, ordinary vehicles, large vehicles, pedestrians, utility poles, and other three-dimensional objects, extract the data, and use it for automatic avoidance of obstacles. For example, the microcomputer 12051 classifies obstacles around the vehicle 12100 into obstacles that the driver of the vehicle 12100 can see and obstacles that are difficult to see. The microcomputer 12051 then determines a collision risk indicating the degree of danger of collision with each obstacle. When the collision risk is equal to or higher than a set value and a collision is possible, the microcomputer 12051 can provide driving assistance for collision avoidance by outputting a warning to the driver via the audio speaker 12061 or the display unit 12062, or by performing forced deceleration or avoidance steering via the drive system control unit 12010.
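The threshold decision on the collision risk can be sketched as follows. The disclosure does not define how the collision risk is computed, so this Python sketch assumes an inverse time-to-collision score purely for illustration; the function names, the two separate thresholds, and the returned action labels are likewise hypothetical.

```python
def collision_risk(distance_m: float, closing_speed_mps: float) -> float:
    """Assumed risk score: inverse time-to-collision in 1/s (0 if not closing)."""
    if closing_speed_mps <= 0.0:
        return 0.0
    # Guard against division by zero for an object at (almost) zero distance.
    return closing_speed_mps / max(distance_m, 1e-6)


def choose_action(risk: float, warn_threshold: float, brake_threshold: float) -> str:
    """Map a risk score to an assistance action using assumed set values."""
    if risk >= brake_threshold:
        return "forced_deceleration"  # would go through the drive system control unit
    if risk >= warn_threshold:
        return "warn_driver"          # would go through the speaker or display
    return "none"


# Hypothetical set values; a real system would calibrate these.
WARN_THRESHOLD = 0.2
BRAKE_THRESHOLD = 0.5

far_obstacle = collision_risk(distance_m=40.0, closing_speed_mps=10.0)
near_obstacle = collision_risk(distance_m=10.0, closing_speed_mps=10.0)
receding = collision_risk(distance_m=40.0, closing_speed_mps=-5.0)

print(choose_action(far_obstacle, WARN_THRESHOLD, BRAKE_THRESHOLD))
print(choose_action(near_obstacle, WARN_THRESHOLD, BRAKE_THRESHOLD))
print(choose_action(receding, WARN_THRESHOLD, BRAKE_THRESHOLD))
```

The text describes a single set value; splitting it into a warning level and a braking level is a design choice of this sketch that keeps the two kinds of assistance distinguishable.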
At least one of the imaging units 12101 to 12104 may be an infrared camera that detects infrared light. For example, the microcomputer 12051 can recognize a pedestrian by determining whether a pedestrian is present in the images captured by the imaging units 12101 to 12104. Such pedestrian recognition is performed by, for example, a procedure of extracting feature points in the images captured by the imaging units 12101 to 12104 serving as infrared cameras, and a procedure of performing pattern matching processing on a series of feature points indicating the outline of an object to determine whether the object is a pedestrian. When the microcomputer 12051 determines that a pedestrian is present in the images captured by the imaging units 12101 to 12104 and recognizes the pedestrian, the audio image output unit 12052 controls the display unit 12062 so that a rectangular contour line for emphasis is superimposed on the recognized pedestrian. The audio image output unit 12052 may also control the display unit 12062 so that an icon or the like indicating the pedestrian is displayed at a desired position.
An example of a vehicle control system to which the technology according to the present disclosure can be applied has been described above. Among the configurations described above, the technology according to the present disclosure can be applied to the imaging unit 12031 and the like. Applying the technology according to the present disclosure to the imaging unit 12031 and the like makes it possible to reduce their size, which makes the interior and exterior of the vehicle 12100 easier to design. In addition, applying the technology according to the present disclosure to the imaging unit 12031 and the like makes it possible to acquire a clear image with reduced noise, so that a captured image that is easier to see can be provided to the driver. This makes it possible to reduce driver fatigue.
(9. Example of application to endoscopic surgery system)
The technology according to the present disclosure (the present technology) can be applied to various products. For example, the technology according to the present disclosure may be applied to an endoscopic surgery system.
FIG. 14 is a diagram illustrating an example of a schematic configuration of an endoscopic surgery system to which the technology according to the present disclosure (the present technology) can be applied.
FIG. 14 illustrates an operator (doctor) 11131 performing surgery on a patient 11132 on a patient bed 11133 using the endoscopic surgery system 11000. As illustrated, the endoscopic surgery system 11000 includes an endoscope 11100, other surgical tools 11110 such as an insufflation tube 11111 and an energy treatment tool 11112, a support arm device 11120 that supports the endoscope 11100, and a cart 11200 on which various devices for endoscopic surgery are mounted.
The endoscope 11100 includes a lens barrel 11101, of which a region of a predetermined length from the distal end is inserted into the body cavity of the patient 11132, and a camera head 11102 connected to the proximal end of the lens barrel 11101. The illustrated example shows the endoscope 11100 configured as a so-called rigid endoscope having a rigid lens barrel 11101, but the endoscope 11100 may instead be configured as a so-called flexible endoscope having a flexible lens barrel.
An opening in which an objective lens is fitted is provided at the distal end of the lens barrel 11101. A light source device 11203 is connected to the endoscope 11100, and light generated by the light source device 11203 is guided to the distal end of the lens barrel by a light guide extending inside the lens barrel 11101 and is emitted through the objective lens toward an observation target in the body cavity of the patient 11132. The endoscope 11100 may be a forward-viewing endoscope, an oblique-viewing endoscope, or a side-viewing endoscope.
An optical system and an imaging element are provided inside the camera head 11102, and reflected light (observation light) from the observation target is focused onto the imaging element by the optical system. The imaging element photoelectrically converts the observation light and generates an electric signal corresponding to the observation light, that is, an image signal corresponding to the observation image. The image signal is transmitted as RAW data to a camera control unit (CCU: Camera Control Unit) 11201.
The CCU 11201 includes a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and the like, and centrally controls the operations of the endoscope 11100 and the display device 11202. Furthermore, the CCU 11201 receives the image signal from the camera head 11102 and applies to it various kinds of image processing for displaying an image based on the image signal, such as development processing (demosaicing).
Under the control of the CCU 11201, the display device 11202 displays an image based on the image signal that has undergone image processing by the CCU 11201.
The light source device 11203 includes a light source such as an LED (light emitting diode) and supplies the endoscope 11100 with irradiation light for imaging the surgical site or the like.
The input device 11204 is an input interface to the endoscopic surgery system 11000. Through the input device 11204, the user can input various kinds of information and instructions to the endoscopic surgery system 11000. For example, the user inputs an instruction to change the imaging conditions of the endoscope 11100 (type of irradiation light, magnification, focal length, and the like).
The treatment tool control device 11205 controls the driving of the energy treatment tool 11112 used for cauterizing or incising tissue, sealing blood vessels, and the like. The insufflation device 11206 sends gas into the body cavity of the patient 11132 through the insufflation tube 11111 to inflate the body cavity, in order to secure the field of view of the endoscope 11100 and the working space of the operator. The recorder 11207 is a device capable of recording various kinds of information related to the surgery. The printer 11208 is a device capable of printing various kinds of information related to the surgery in various formats such as text, images, and graphs.
The light source device 11203, which supplies the endoscope 11100 with irradiation light for imaging the surgical site, can be a white light source formed by, for example, an LED, a laser light source, or a combination of these. When the white light source is formed by a combination of RGB laser light sources, the output intensity and output timing of each color (each wavelength) can be controlled with high accuracy, so the white balance of the captured image can be adjusted in the light source device 11203. In this case, it is also possible to irradiate the observation target with laser light from each of the RGB laser light sources in a time-division manner and to control the driving of the imaging element of the camera head 11102 in synchronization with the irradiation timing, thereby capturing images corresponding to each of R, G, and B in a time-division manner. With this method, a color image can be obtained without providing a color filter on the imaging element.
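The time-division capture described above can be sketched roughly as follows. This is only an illustrative sketch in Python: the callbacks `illuminate` and `capture` are hypothetical stand-ins for the light source device 11203 and the imaging element of the camera head 11102, and they are not interfaces defined in the present disclosure.

```python
from typing import Callable, Dict, List, Tuple

Frame = List[List[int]]  # monochrome frame: rows of pixel intensities


def capture_frame_sequential(
    illuminate: Callable[[str], None],
    capture: Callable[[], Frame],
) -> List[List[Tuple[int, int, int]]]:
    """Build a color image from three monochrome exposures.

    `illuminate(channel)` is assumed to switch on only the laser of that
    color; `capture()` is assumed to expose the filterless sensor once,
    in synchronization with the irradiation timing.
    """
    planes: Dict[str, Frame] = {}
    for channel in ("R", "G", "B"):
        illuminate(channel)          # irradiate with one color only
        planes[channel] = capture()  # read the frame lit by that color

    height = len(planes["R"])
    width = len(planes["R"][0])
    # Stack the three planes pixel by pixel into (R, G, B) triples.
    return [
        [(planes["R"][y][x], planes["G"][y][x], planes["B"][y][x])
         for x in range(width)]
        for y in range(height)
    ]
```

Because each exposure sees only one wavelength, every pixel contributes to all three color planes, which is why no color filter is needed in this arrangement.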
The driving of the light source device 11203 may also be controlled so that the intensity of the output light is changed at predetermined time intervals. By controlling the driving of the imaging element of the camera head 11102 in synchronization with the timing of the intensity change, acquiring images in a time-division manner, and combining those images, a high dynamic range image free of so-called crushed blacks and blown-out highlights can be generated.
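The present disclosure does not specify how the time-division images are combined. One simple possibility, shown here purely as an assumption, is to normalize each frame by its relative light intensity and average only the samples that are neither crushed nor saturated. The function name, the thresholds `low` and `high`, and the 8-bit value range are all hypothetical.

```python
from typing import List, Sequence

Frame = List[List[float]]


def merge_hdr(
    frames: Sequence[Frame],
    intensities: Sequence[float],
    low: float = 5,
    high: float = 250,
) -> Frame:
    """Merge frames captured under different light intensities.

    Each pixel value is divided by the relative intensity used for its
    frame, and only samples within [low, high] are averaged. If every
    sample of a pixel is clipped, all samples are averaged as a fallback.
    """
    height, width = len(frames[0]), len(frames[0][0])
    merged: Frame = []
    for y in range(height):
        row = []
        for x in range(width):
            usable = [
                f[y][x] / k
                for f, k in zip(frames, intensities)
                if low <= f[y][x] <= high
            ]
            if not usable:  # every exposure clipped at this pixel
                usable = [f[y][x] / k for f, k in zip(frames, intensities)]
            row.append(sum(usable) / len(usable))
        merged.append(row)
    return merged
```

The result is expressed in units of the weakest illumination, so a pixel that saturates under strong light is represented by its value from the weaker exposure.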
The light source device 11203 may also be configured to supply light in a predetermined wavelength band corresponding to special light observation. In special light observation, for example, so-called narrow band imaging (Narrow Band Imaging) is performed: by using the wavelength dependence of light absorption in body tissue and irradiating light in a narrower band than the irradiation light used in normal observation (that is, white light), predetermined tissue such as blood vessels in the mucosal surface layer is imaged with high contrast. Alternatively, in special light observation, fluorescence observation may be performed, in which an image is obtained from fluorescence generated by irradiation with excitation light. In fluorescence observation, body tissue may be irradiated with excitation light and the fluorescence from that tissue observed (autofluorescence observation), or a reagent such as indocyanine green (ICG) may be locally injected into body tissue and the tissue irradiated with excitation light corresponding to the fluorescence wavelength of the reagent to obtain a fluorescence image. The light source device 11203 can be configured to supply narrow band light and/or excitation light corresponding to such special light observation.
FIG. 15 is a block diagram illustrating an example of the functional configuration of the camera head 11102 and the CCU 11201 illustrated in FIG. 14.
The camera head 11102 includes a lens unit 11401, an imaging unit 11402, a driving unit 11403, a communication unit 11404, and a camera head control unit 11405. The CCU 11201 includes a communication unit 11411, an image processing unit 11412, and a control unit 11413. The camera head 11102 and the CCU 11201 are communicably connected to each other by a transmission cable 11400.
The lens unit 11401 is an optical system provided at the connection with the lens barrel 11101. Observation light taken in from the distal end of the lens barrel 11101 is guided to the camera head 11102 and enters the lens unit 11401. The lens unit 11401 is formed by combining a plurality of lenses including a zoom lens and a focus lens.
The imaging unit 11402 may include one imaging element (a so-called single-plate type) or a plurality of imaging elements (a so-called multi-plate type). When the imaging unit 11402 is of the multi-plate type, for example, the imaging elements may each generate an image signal corresponding to one of R, G, and B, and a color image may be obtained by combining these signals. Alternatively, the imaging unit 11402 may include a pair of imaging elements for acquiring right-eye and left-eye image signals for 3D (dimensional) display. The 3D display enables the operator 11131 to grasp the depth of living tissue at the surgical site more accurately. When the imaging unit 11402 is of the multi-plate type, a plurality of lens units 11401 may be provided, one for each imaging element.
The imaging unit 11402 does not necessarily have to be provided in the camera head 11102. For example, the imaging unit 11402 may be provided inside the lens barrel 11101, immediately behind the objective lens.
The driving unit 11403 includes an actuator and, under the control of the camera head control unit 11405, moves the zoom lens and the focus lens of the lens unit 11401 by a predetermined distance along the optical axis. This allows the magnification and focus of the image captured by the imaging unit 11402 to be adjusted as appropriate.
The communication unit 11404 includes a communication device for transmitting and receiving various kinds of information to and from the CCU 11201. The communication unit 11404 transmits the image signal obtained from the imaging unit 11402 as RAW data to the CCU 11201 via the transmission cable 11400.
The communication unit 11404 also receives, from the CCU 11201, a control signal for controlling the driving of the camera head 11102 and supplies it to the camera head control unit 11405. The control signal includes information on imaging conditions, such as information specifying the frame rate of the captured image, information specifying the exposure value at the time of imaging, and/or information specifying the magnification and focus of the captured image.
The imaging conditions described above, such as the frame rate, exposure value, magnification, and focus, may be specified by the user as appropriate or may be set automatically by the control unit 11413 of the CCU 11201 based on the acquired image signal. In the latter case, the endoscope 11100 is equipped with so-called AE (Auto Exposure), AF (Auto Focus), and AWB (Auto White Balance) functions.
The camera head control unit 11405 controls the driving of the camera head 11102 based on the control signal from the CCU 11201 received via the communication unit 11404.
The communication unit 11411 includes a communication device for transmitting and receiving various kinds of information to and from the camera head 11102. The communication unit 11411 receives the image signal transmitted from the camera head 11102 via the transmission cable 11400.
The communication unit 11411 also transmits, to the camera head 11102, a control signal for controlling the driving of the camera head 11102. The image signal and the control signal can be transmitted by electrical communication, optical communication, or the like.
The image processing unit 11412 applies various kinds of image processing to the image signal, which is the RAW data transmitted from the camera head 11102.
The control unit 11413 performs various kinds of control related to the imaging of the surgical site or the like by the endoscope 11100 and to the display of the captured image obtained by that imaging. For example, the control unit 11413 generates a control signal for controlling the driving of the camera head 11102.
The control unit 11413 also causes the display device 11202 to display a captured image showing the surgical site or the like, based on the image signal that has undergone image processing by the image processing unit 11412. In doing so, the control unit 11413 may recognize various objects in the captured image using various image recognition techniques. For example, by detecting the edge shape, color, and the like of objects included in the captured image, the control unit 11413 can recognize surgical tools such as forceps, specific body parts, bleeding, mist produced when the energy treatment tool 11112 is used, and so on. When causing the display device 11202 to display the captured image, the control unit 11413 may use the recognition results to superimpose various kinds of surgery support information on the image of the surgical site. Superimposing the surgery support information and presenting it to the operator 11131 makes it possible to reduce the burden on the operator 11131 and allows the operator 11131 to proceed with the surgery reliably.
The transmission cable 11400 connecting the camera head 11102 and the CCU 11201 is an electric signal cable supporting electric signal communication, an optical fiber supporting optical communication, or a composite cable of these.
In the illustrated example, communication is performed by wire using the transmission cable 11400, but the communication between the camera head 11102 and the CCU 11201 may be performed wirelessly.
An example of an endoscopic surgery system to which the technology according to the present disclosure can be applied has been described above. Among the configurations described above, the technology according to the present disclosure can be applied to, for example, the imaging unit 11402 of the camera head 11102. Applying the technology according to the present disclosure to the camera head 11102 makes it possible to reduce the size of the camera head 11102 and the like, so that the endoscopic surgery system 11000 can be made compact. In addition, applying the technology according to the present disclosure to the camera head 11102 and the like makes it possible to acquire a clear image with reduced noise, so that a captured image that is easier to view can be provided to the operator. This makes it possible to reduce operator fatigue.
Although an endoscopic surgery system has been described here as an example, the technology according to the present disclosure may also be applied to, for example, a microscopic surgery system.
(10. Example of application to WSI (Whole Slide Imaging) system)
The technology according to the present disclosure can be applied to various products. For example, the technology according to the present disclosure may be applied to a pathological diagnosis system in which a doctor or the like observes cells or tissue collected from a patient to diagnose a lesion, or to a system that supports such diagnosis (hereinafter referred to as a diagnosis support system). This diagnosis support system may be a WSI (Whole Slide Imaging) system that diagnoses a lesion, or supports its diagnosis, based on images acquired using digital pathology technology.
FIG. 16 is a diagram illustrating an example of a schematic configuration of a diagnosis support system 5500 to which the technology according to the present disclosure is applied. As illustrated in FIG. 16, the diagnosis support system 5500 includes one or more pathology systems 5510. It may further include a medical information system 5530 and a derivation device 5540.
Each of the one or more pathology systems 5510 is a system used mainly by pathologists and is installed in, for example, a research laboratory or a hospital. The pathology systems 5510 may be installed in different hospitals, and each is connected to the medical information system 5530 and the derivation device 5540 via various networks such as a WAN (Wide Area Network) (including the Internet), a LAN (Local Area Network), a public line network, or a mobile communication network.
Each pathology system 5510 includes a microscope 5511, a server 5512, a display control device 5513, and a display device 5514.
The microscope 5511 has the functions of an optical microscope; it images an observation target held on a glass slide and acquires a pathological image, which is a digital image. The observation target is, for example, tissue or cells collected from a patient, and may be a piece of an organ, saliva, blood, or the like.
The server 5512 stores the pathological image acquired by the microscope 5511 in a storage unit (not shown). When the server 5512 receives a viewing request from the display control device 5513, it retrieves the pathological image from the storage unit (not shown) and sends the retrieved pathological image to the display control device 5513.
The display control device 5513 sends a pathological image viewing request received from the user to the server 5512. The display control device 5513 then causes the pathological image received from the server 5512 to be displayed on the display device 5514, which uses liquid crystal, EL (Electro-Luminescence), a CRT (Cathode Ray Tube), or the like. The display device 5514 may support 4K or 8K, and is not limited to a single device; a plurality of display devices may be used.
When the observation target is a solid object such as a piece of an organ, the observation target may be, for example, a stained thin section. The thin section may be prepared by, for example, thinly slicing a block piece cut out from a specimen such as an organ. The block piece may be fixed with paraffin or the like for slicing.
Various kinds of staining may be applied to the thin section, such as general staining that shows tissue morphology, for example HE (Hematoxylin-Eosin) staining, and immunostaining that shows the immune state of tissue, for example IHC (Immunohistochemistry) staining. In that case, one thin section may be stained with a plurality of different reagents, or two or more thin sections cut out consecutively from the same block piece (also referred to as adjacent thin sections) may be stained with mutually different reagents.
The microscope 5511 may include a low-resolution imaging unit for imaging at low resolution and a high-resolution imaging unit for imaging at high resolution. The low-resolution imaging unit and the high-resolution imaging unit may be different optical systems or the same optical system. When they are the same optical system, the resolution of the microscope 5511 may be changed according to the imaging target.
The glass slide holding the observation target is placed on a stage located within the angle of view of the microscope 5511. The microscope 5511 first acquires an overall image within the angle of view using the low-resolution imaging unit and identifies the region of the observation target from the acquired overall image. The microscope 5511 then divides the region where the observation target exists into a plurality of divided regions of a predetermined size and sequentially images each divided region with the high-resolution imaging unit, thereby acquiring a high-resolution image of each divided region. To switch the target divided region, the stage may be moved, the imaging optical system may be moved, or both may be moved. Each divided region may overlap with adjacent divided regions in order to prevent, for example, regions being missed in imaging due to unintended slipping of the glass slide. Furthermore, the overall image may include identification information for associating the overall image with the patient. This identification information may be, for example, a character string or a QR code (registered trademark).
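The division of the target region into overlapping divided regions can be sketched as below. This is a minimal illustration only; the function name, the rectangle representation `(x, y, width, height)`, and the use of square divided regions are assumptions and do not come from the present disclosure.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def plan_divided_regions(region: Rect, size: int, overlap: int) -> List[Rect]:
    """List the divided regions covering `region` in raster order.

    Each divided region is a `size` x `size` square, and neighboring
    regions share `overlap` pixels so that a small slip of the glass
    slide does not leave an unimaged gap.
    """
    x0, y0, width, height = region
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap

    def starts(origin: int, extent: int) -> List[int]:
        # Advance by `step` until the last square reaches the far edge.
        positions = [origin]
        while positions[-1] + size < origin + extent:
            positions.append(positions[-1] + step)
        return positions

    return [
        (x, y, size, size)
        for y in starts(y0, height)
        for x in starts(x0, width)
    ]
```

For instance, a 250 x 100 region with 100-pixel divided regions and a 20-pixel overlap is covered by three regions starting at x = 0, 80, and 160.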
The high-resolution images acquired by the microscope 5511 are input to the server 5512. The server 5512 divides each high-resolution image into smaller partial images (hereinafter referred to as tile images). For example, the server 5512 divides one high-resolution image into 10 × 10 tile images, 100 in total. If adjacent divided regions overlap, the server 5512 may apply stitching processing to the neighboring high-resolution images using a technique such as template matching. In that case, the server 5512 may generate the tile images by dividing the entire high-resolution image joined by the stitching processing. However, the tile images may also be generated from the high-resolution images before the stitching processing.
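As a rough illustration of the 10 × 10 division mentioned above, the following sketch splits an image held as a list of pixel rows into a grid of tile images. The function name and the requirement that the image dimensions be exact multiples of the grid are simplifying assumptions, not details from the present disclosure.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]


def split_into_tiles(
    image: Image, rows: int = 10, cols: int = 10
) -> Dict[Tuple[int, int], Image]:
    """Split `image` into rows x cols tile images keyed by (row, col)."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be a multiple of the tile grid")
    tile_h, tile_w = height // rows, width // cols
    return {
        (r, c): [
            line[c * tile_w:(c + 1) * tile_w]
            for line in image[r * tile_h:(r + 1) * tile_h]
        ]
        for r in range(rows)
        for c in range(cols)
    }
```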
The server 5512 may also generate tile images of a smaller size by further dividing the tile images. Such generation of tile images may be repeated until tile images of the size set as the minimum unit are generated.
After generating the minimum-unit tile images in this way, the server 5512 executes, for all tile images, tile combining processing that generates one tile image by combining a predetermined number of adjacent tile images. This tile combining processing may be repeated until a single tile image is finally generated. Through this processing, a pyramid-structured group of tile images is generated in which each layer consists of one or more tile images. In this pyramid structure, a tile image in one layer and a tile image in a different layer have the same number of pixels but different resolutions. For example, when 2 × 2 tile images, four in total, are combined to generate one tile image in the upper layer, the resolution of the upper-layer tile image is 1/2 that of the lower-layer tile images used for the combination.
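The 2 × 2 tile combining step can be sketched as follows. The present disclosure does not state how pixels are reduced when four tiles are combined; averaging each 2 × 2 pixel block is used here purely as an assumption, as are the function names and the restriction to a square tile grid whose side is a power of two.

```python
from typing import Dict, List, Tuple

Tile = List[List[float]]
Layer = Dict[Tuple[int, int], Tile]


def merge_2x2(tl: Tile, tr: Tile, bl: Tile, br: Tile) -> Tile:
    """Combine four adjacent tiles into one tile with the same pixel count."""
    # Stitch the four T x T tiles into a 2T x 2T mosaic.
    mosaic = [a + b for a, b in zip(tl, tr)] + [a + b for a, b in zip(bl, br)]
    size = len(tl)
    # Average each 2 x 2 pixel block, halving the resolution.
    return [
        [
            (mosaic[2 * y][2 * x] + mosaic[2 * y][2 * x + 1]
             + mosaic[2 * y + 1][2 * x] + mosaic[2 * y + 1][2 * x + 1]) / 4
            for x in range(size)
        ]
        for y in range(size)
    ]


def build_pyramid(base: Layer, grid: int) -> List[Layer]:
    """Return the layers from the finest (index 0) up to a single tile.

    `base` holds grid x grid minimum-unit tiles; `grid` is assumed to be
    a power of two.
    """
    layers = [base]
    while grid > 1:
        grid //= 2
        prev = layers[-1]
        layers.append({
            (r, c): merge_2x2(
                prev[(2 * r, 2 * c)], prev[(2 * r, 2 * c + 1)],
                prev[(2 * r + 1, 2 * c)], prev[(2 * r + 1, 2 * c + 1)],
            )
            for r in range(grid)
            for c in range(grid)
        })
    return layers
```

Every layer stores tiles of identical pixel dimensions, so an upper-layer tile shows four times the area of a lower-layer tile at half the resolution, which matches the relationship described above.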
Constructing such a pyramid-structured group of tile images makes it possible to switch the level of detail of the observation target displayed on the display device according to the layer to which the displayed tile image belongs. For example, when the tile images of the lowest layer are used, a narrow region of the observation target is displayed in detail, and the higher the layer of the tile images used, the wider and coarser the displayed region of the observation target.
The generated pyramid-structured group of tile images is stored in a storage unit (not shown) together with, for example, identification information that uniquely identifies each tile image (referred to as tile identification information). When the server 5512 receives, from another device (for example, the display control device 5513 or the derivation device 5540), a tile image acquisition request that includes tile identification information, it transmits the tile image corresponding to that tile identification information to the requesting device.
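A lookup by tile identification information might look like the sketch below. The class name and the "layer/row/col" identifier format are hypothetical; the present disclosure only requires that the identification information uniquely identify each tile image.

```python
from typing import Dict, List

Tile = List[List[int]]


class TileStore:
    """In-memory stand-in for the storage unit holding the tile images."""

    def __init__(self) -> None:
        self._tiles: Dict[str, Tile] = {}

    @staticmethod
    def tile_id(layer: int, row: int, col: int) -> str:
        # Hypothetical identifier format: "layer/row/col".
        return f"{layer}/{row}/{col}"

    def put(self, layer: int, row: int, col: int, tile: Tile) -> str:
        """Store a tile and return its tile identification information."""
        key = self.tile_id(layer, row, col)
        self._tiles[key] = tile
        return key

    def get(self, tile_id: str) -> Tile:
        """Return the tile for an acquisition request; KeyError if unknown."""
        return self._tiles[tile_id]
```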
Tile images, which are pathological images, may be generated for each imaging condition, such as focal length or staining condition. When tile images are generated for each imaging condition, a specific pathological image may be displayed side by side with another pathological image that corresponds to an imaging condition different from the specific imaging condition and that shows the same region as the specific pathological image. The specific imaging condition may be specified by the viewer. When the viewer specifies a plurality of imaging conditions, the pathological images of the same region corresponding to each imaging condition may be displayed side by side.
 また、サーバ5512は、ピラミッド構造のタイル画像群をサーバ5512以外の他の記憶装置、例えば、クラウドサーバ等に記憶してもよい。さらに、以上のようなタイル画像の生成処理の一部又は全部は、クラウドサーバ等で実行されてもよい。 The server 5512 may store the pyramid-structured tile image group in a storage device other than the server 5512, for example, a cloud server. Further, a part or all of the tile image generation processing as described above may be executed by a cloud server or the like.
 表示制御装置5513は、ユーザからの入力操作に応じて、ピラミッド構造のタイル画像群から所望のタイル画像を抽出し、これを表示装置5514に出力する。このような処理により、ユーザは、観察倍率を変えながら観察対象物を観察しているような感覚を得ることができる。すなわち、表示制御装置5513は仮想顕微鏡として機能する。ここでの仮想的な観察倍率は、実際には解像度に相当する。 The display control device 5513 extracts a desired tile image from the pyramid-structured tile image group in response to an input operation from the user, and outputs this to the display device 5514. Through such processing, the user can obtain a feeling as if the user is observing the observation target object while changing the observation magnification. That is, the display control device 5513 functions as a virtual microscope. The virtual observation magnification here actually corresponds to the resolution.
 なお、高解像度画像の撮像方法は、どの様な方法を用いてもよい。ステージの停止、移動を繰り返しながら分割領域を撮像して高解像度画像を取得してもよいし、所定の速度でステージを移動しながら分割領域を撮像してストリップ上の高解像度画像を取得してもよい。また、高解像度画像からタイル画像を生成する処理は必須の構成ではなく、スティッチング処理により貼り合わされた高解像度画像全体の解像度を段階的に変化させることで、解像度が段階的に変化する画像を生成してもよい。この場合でも、広いエリア域の低解像度画像から狭いエリアの高解像度画像までを段階的にユーザに提示することが可能である。 Note that any method may be used as a method for capturing a high-resolution image. Stopping and moving the stage may be repeated to obtain a high-resolution image by capturing the divided area while moving the stage, or by moving the stage at a predetermined speed and capturing a high-resolution image on the strip by capturing the divided area. Is also good. Also, the process of generating a tile image from a high-resolution image is not an indispensable configuration. By changing the resolution of the entire high-resolution image combined by the stitching process in a stepwise manner, an image in which the resolution changes stepwise can be obtained. May be generated. Even in this case, it is possible to gradually present the user from a low-resolution image in a wide area to a high-resolution image in a narrow area.
 医療情報システム5530は、いわゆる電子カルテシステムであり、患者を識別する情報、患者の疾患情報、診断に用いた検査情報や画像情報、診断結果、処方薬などの診断に関する情報を記憶する。例えば、ある患者の観察対象物を撮像することで得られる病理画像は、一旦、サーバ5512を介して保存された後、表示制御装置5513によって表示装置5514に表示され得る。病理システム5510を利用する病理医は、表示装置5514に表示された病理画像に基づいて病理診断を行う。病理医によって行われた病理診断結果は、医療情報システム5530に記憶される。 The medical information system 5530 is a so-called electronic medical record system, and stores information for identifying a patient, information on a patient's disease, examination information and image information used for diagnosis, diagnosis results, and information on diagnosis such as prescription drugs. For example, a pathological image obtained by imaging an observation target of a patient may be temporarily stored via the server 5512, and then displayed on the display device 5514 by the display control device 5513. A pathologist using the pathological system 5510 makes a pathological diagnosis based on the pathological image displayed on the display device 5514. The result of the pathological diagnosis performed by the pathologist is stored in the medical information system 5530.
 導出装置5540は、病理画像に対する解析を実行し得る。この解析には、機械学習によって作成された学習モデルを用いることができる。導出装置5540は、当該解析結果として、特定領域の分類結果や組織の識別結果等を導出してもよい。さらに、導出装置5540は、細胞情報、数、位置、輝度情報等の識別結果やそれらに対するスコアリング情報等を導出してもよい。導出装置5540によって導出されたこれらの情報は、診断支援情報として、病理システム5510の表示装置5514に表示されてもよい。 The derivation device 5540 can execute analysis on a pathological image. For this analysis, a learning model created by machine learning can be used. The derivation device 5540 may derive a classification result of a specific area, a tissue identification result, or the like as the analysis result. Furthermore, the deriving device 5540 may derive identification results of cell information, number, position, luminance information, and the like, and scoring information for them. These pieces of information derived by the derivation device 5540 may be displayed on the display device 5514 of the pathology system 5510 as diagnosis support information.
 なお、導出装置5540は、1台以上のサーバ(クラウドサーバを含む)等で構成されたサーバシステムであってもよい。また、導出装置5540は、病理システム5510内の例えば表示制御装置5513又はサーバ5512に組み込まれた構成であってもよい。すなわち、病理画像に対する各種解析は、病理システム5510内で実行されてもよい。 Note that the deriving device 5540 may be a server system including one or more servers (including a cloud server). The derivation device 5540 may be configured to be incorporated in, for example, the display control device 5513 or the server 5512 in the pathology system 5510. That is, various analyzes on the pathological image may be executed in the pathological system 5510.
 本開示に係る技術は、以上説明した構成のうち、例えば、顕微鏡5511に好適に適用され得る。具体的には、顕微鏡5511における低解像度撮像部及び/又は高解像度撮像部に本開示に係る技術を適用することができる。本開示に係る技術を低解像度撮像部に適用することで、全体画像における観察対象物の領域の特定を低解像度撮像部内で実行可能となる。また、本開示に係る技術を高解像度撮像部に適用することで、タイル画像の生成処理や病理画像に対する解析処理の一部又は全部を、高解像度撮像部内で実行可能となる。それにより、病理画像の取得から病理画像の解析までの処理の一部又は全部を顕微鏡5511内においてオンザフライで実行可能となるため、より迅速且つ的確な診断支援情報の出力が可能となる。例えば、特定組織の部分抽出や、個人情報に配慮しての画像の一部出力などを顕微鏡5511内で実行可能となり、撮像時間の短縮化や、データ量の縮小化や、病理医のワークフローの時間短縮等を実現することが可能となる。 技術 The technology according to the present disclosure can be suitably applied to, for example, the microscope 5511 among the configurations described above. Specifically, the technology according to the present disclosure can be applied to the low-resolution imaging unit and / or the high-resolution imaging unit of the microscope 5511. By applying the technology according to the present disclosure to the low-resolution imaging unit, it is possible to specify the region of the observation target in the entire image in the low-resolution imaging unit. In addition, by applying the technology according to the present disclosure to the high-resolution imaging unit, part or all of the generation processing of the tile image and the analysis processing for the pathological image can be performed in the high-resolution imaging unit. Thereby, a part or all of the processing from the acquisition of the pathological image to the analysis of the pathological image can be executed on the fly within the microscope 5511, so that the diagnosis support information can be output more quickly and accurately. For example, partial extraction of a specific tissue, partial output of an image in consideration of personal information, and the like can be executed in the microscope 5511, thereby shortening the imaging time, reducing the data amount, and improving the workflow of a pathologist. It is possible to reduce time and the like.
 なお、上記で説明した構成は、診断支援システムに限らず、共焦点顕微鏡や蛍光顕微鏡、ビデオ顕微鏡等の生物顕微鏡全般にも適用され得る。ここで、観察対象物は、培養細胞や受精卵、精子等の生体試料、細胞シート、三次元細胞組織等の生体材料、ゼブラフィッシュやマウス等の生体であってもよい。また、観察対象物は、ガラススライドに限らず、ウェルプレートやシャーレ等に格納された状態で観察されることもできる。 The configuration described above can be applied not only to the diagnosis support system but also to all biological microscopes such as a confocal microscope, a fluorescence microscope, and a video microscope. Here, the observation target may be a biological sample such as a cultured cell, a fertilized egg, or a sperm, a biological material such as a cell sheet or a three-dimensional cell tissue, or a living body such as a zebrafish or a mouse. Further, the observation target object is not limited to a glass slide, and can be observed in a state stored in a well plate, a petri dish, or the like.
 さらに、顕微鏡を利用して取得した観察対象物の静止画像から動画像が生成されてもよい。例えば、所定期間連続的に撮像した静止画像から動画像を生成してもよいし、所定の間隔を空けて撮像した静止画像から画像シーケンスを生成してもよい。このように、静止画像から動画像を生成することで、がん細胞や神経細胞、心筋組織、精子等の拍動や伸長、遊走等の動きや培養細胞や受精卵の分裂過程など、観察対象物の動的な特徴を機械学習を用いて解析することが可能となる。 動 Furthermore, a moving image may be generated from a still image of the observation target acquired using a microscope. For example, a moving image may be generated from still images captured continuously for a predetermined period, or an image sequence may be generated from still images captured at predetermined intervals. In this way, by generating a moving image from a still image, it is possible to observe the movement of cancer cells, nerve cells, myocardial tissue, sperm, etc. Dynamic characteristics of an object can be analyzed using machine learning.
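As a rough illustration of building an image sequence from stills captured at predetermined intervals, the following sketch selects frames by timestamp. The `Still` record, its fields, and the selection rule (keep a still once at least the interval has elapsed since the last kept one) are assumptions made for this example; the disclosure does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Still:
    timestamp_s: float  # capture time in seconds (hypothetical field)
    frame_id: str       # stand-in for the image payload


def select_at_interval(stills: List[Still], interval_s: float) -> List[Still]:
    """Keep the first still, then each still at least interval_s after the last kept one."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    sequence: List[Still] = []
    for still in sorted(stills, key=lambda s: s.timestamp_s):
        if not sequence or still.timestamp_s - sequence[-1].timestamp_s >= interval_s:
            sequence.append(still)
    return sequence
```

The resulting list could then be encoded as a moving image or passed frame by frame to an analysis model; both steps are outside the scope of this sketch.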
The embodiments and modifications described above can be combined as appropriate to the extent that their processing contents do not contradict one another.
The effects described in this specification are merely examples and are not limiting; other effects may also be obtained.
Note that the present technology can also have the following configurations.
(1)
A solid-state imaging device comprising:
an imaging unit that acquires image data;
a processing unit that executes, on the image data or on data based on the image data, processing to extract a specific region based on a neural network calculation model; and
an output unit that outputs image data processed based on the specific region, or image data read from the imaging unit based on the specific region.
(2)
The solid-state imaging device according to (1), wherein the processing unit extracts the specific region to be processed from the image data by arithmetic processing using a trained learning model.
(3)
The solid-state imaging device according to (1) or (2), wherein the processing unit generates the processed image data by executing masking processing, mosaic processing, or avatar conversion processing on the specific region.
(4)
The solid-state imaging device according to (1), wherein the processing unit extracts, from the image data, partial image data corresponding to the specific region.
(5)
The solid-state imaging device according to (4), wherein the output unit outputs the partial image data to the outside.
(6)
The solid-state imaging device according to any one of (1) to (5), wherein the output unit outputs, to the outside, the partial image data of the specific region extracted by the processing unit as the processed image data or as the image data read from the imaging unit.
(7)
The solid-state imaging device according to any one of (1) to (5), wherein the output unit outputs, to the outside, image data other than the specific region extracted by the processing unit as the processed image data or as the image data read from the imaging unit.
(8)
The solid-state imaging device according to any one of (1) to (7), further comprising a control unit that controls reading of image data from the imaging unit,
wherein the control unit reads partial image data corresponding to the specific region from the imaging unit.
(9)
The solid-state imaging device according to any one of (1) to (7), further comprising a control unit that controls reading of image data from the imaging unit,
wherein the control unit reads image data that does not include the specific region from the imaging unit.
(10)
The solid-state imaging device according to any one of (1) to (9), wherein the specific region is a region including at least one of a person's face, eyes, nose, or mouth, a window, and a nameplate.
(11)
The solid-state imaging device according to any one of (1) to (10), wherein the imaging unit acquires the image data by thinning out the unit pixels to be read when reading a captured image, and
the processing unit extracts the specific region from the thinned image data.
(12)
An electronic apparatus comprising:
a solid-state imaging device including an imaging unit that acquires image data, a processing unit that executes, on data based on the image data, processing to extract a specific region based on a neural network calculation model, and an output unit that outputs the image data processed based on the specific region, or image data read from the imaging unit based on the specific region; and
a control device that executes processing by an application on the processed image data output from the solid-state imaging device or on the image data read from the imaging unit.
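The masking, mosaic, and partial-extraction operations named in configurations (3) and (4) can be sketched as follows. This is an illustrative approximation only: the rectangular `(top, left, height, width)` region format, the block-averaging mosaic, and all function names are assumptions, and the region is supplied by hand here, whereas in the configurations it is extracted by the processing unit using a neural network calculation model.

```python
from typing import List, Tuple

Image = List[List[int]]             # grayscale image as rows of pixel values
Region = Tuple[int, int, int, int]  # hypothetical layout: (top, left, height, width)


def mask_region(img: Image, region: Region, fill: int = 0) -> Image:
    """Masking: overwrite every pixel of the region with a constant value."""
    top, left, h, w = region
    out = [row[:] for row in img]
    for y in range(top, min(top + h, len(out))):
        for x in range(left, min(left + w, len(out[0]))):
            out[y][x] = fill
    return out


def mosaic_region(img: Image, region: Region, block: int = 2) -> Image:
    """Mosaic: replace each block x block cell of the region with its mean value."""
    top, left, h, w = region
    out = [row[:] for row in img]
    bottom = min(top + h, len(img))
    right = min(left + w, len(img[0]))
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            mean = sum(img[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def crop_region(img: Image, region: Region) -> Image:
    """Partial extraction: return only the pixels inside the region."""
    top, left, h, w = region
    return [row[left:left + w] for row in img[top:top + h]]
```

Each function returns a new image and leaves the input untouched, so the same captured frame could be both processed and cropped, for example to output a masked full frame alongside a cropped partial image.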
Reference Signs List
1 imaging device
10 image sensor
11 imaging unit
12 control unit
13 signal processing unit
14 DSP
15 memory
16 selector
20 application processor
30 cloud server

Claims (12)

  1.  A solid-state imaging device comprising:
     an imaging unit that acquires image data;
     a processing unit that executes, on the image data or on data based on the image data, processing to extract a specific region based on a neural network calculation model; and
     an output unit that outputs image data processed based on the specific region, or image data read from the imaging unit based on the specific region.
  2.  The solid-state imaging device according to claim 1, wherein the processing unit extracts the specific region to be processed from the image data by arithmetic processing using a trained learning model.
  3.  The solid-state imaging device according to claim 1, wherein the processing unit generates the processed image data by executing masking processing, mosaic processing, or avatar conversion processing on the specific region.
  4.  The solid-state imaging device according to claim 1, wherein the processing unit extracts, from the image data, partial image data corresponding to the specific region.
  5.  The solid-state imaging device according to claim 4, wherein the output unit outputs the partial image data to the outside.
  6.  The solid-state imaging device according to claim 1, wherein the output unit outputs, to the outside, the partial image data of the specific region extracted by the processing unit as the processed image data or as the image data read from the imaging unit.
  7.  The solid-state imaging device according to claim 1, wherein the output unit outputs, to the outside, image data other than the specific region extracted by the processing unit as the processed image data or as the image data read from the imaging unit.
  8.  The solid-state imaging device according to claim 1, further comprising a control unit that controls reading of image data from the imaging unit,
     wherein the control unit reads partial image data corresponding to the specific region from the imaging unit.
  9.  The solid-state imaging device according to claim 1, further comprising a control unit that controls reading of image data from the imaging unit,
     wherein the control unit reads image data that does not include the specific region from the imaging unit.
  10.  The solid-state imaging device according to claim 1, wherein the specific region is a region including at least one of a person's face, eyes, nose, or mouth, a window, and a nameplate.
  11.  The solid-state imaging device according to claim 1, wherein the imaging unit acquires the image data by thinning out the unit pixels to be read when reading a captured image, and
     the processing unit extracts the specific region from the thinned image data.
  12.  An electronic apparatus comprising:
     a solid-state imaging device including an imaging unit that acquires image data, a processing unit that executes, on data based on the image data, processing to extract a specific region based on a neural network calculation model, and an output unit that outputs the image data processed based on the specific region, or image data read from the imaging unit based on the specific region; and
     a control device that executes processing by an application on the processed image data output from the solid-state imaging device or on the image data read from the imaging unit.
PCT/JP2019/029715 2018-07-31 2019-07-29 Solid-state imaging device and electronic apparatus WO2020027074A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/251,953 US11820289B2 (en) 2018-07-31 2019-07-29 Solid-state imaging device and electronic device

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2018144173 2018-07-31
JP2018-144173 2018-07-31
JP2019-139196 2019-07-29
JP2019139196A JP6725733B2 (en) 2018-07-31 2019-07-29 Solid-state imaging device and electronic device

Publications (1)

Publication Number Publication Date
WO2020027074A1 true WO2020027074A1 (en) 2020-02-06

Family

ID=69231883

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/029715 WO2020027074A1 (en) 2018-07-31 2019-07-29 Solid-state imaging device and electronic apparatus

Country Status (1)

Country Link
WO (1) WO2020027074A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009001530A1 (en) * 2007-06-22 2008-12-31 Panasonic Corporation Camera device and imaging method
WO2018051809A1 (en) * 2016-09-16 2018-03-22 ソニーセミコンダクタソリューションズ株式会社 Image pickup device and electronic apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BONG, KYEONGRYEOL ET AL.: "14.6 A 0.62mW ultra-low-power convolutional-neural-network face-recognition processor and a CIS integrated with always-on Haar-like face detector", IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE (ISSCC), 2017 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19845150

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19845150

Country of ref document: EP

Kind code of ref document: A1