WO2019097804A1 - Image recording device and image recording method - Google Patents

Image recording device and image recording method

Info

Publication number
WO2019097804A1
WO2019097804A1
Authority
WO
WIPO (PCT)
Prior art keywords
phase
unit
recording
procedure
image
Prior art date
Application number
PCT/JP2018/031838
Other languages
French (fr)
Japanese (ja)
Inventor
裕亮 櫻田
Original Assignee
Olympus Corporation (オリンパス株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Olympus Corporation
Publication of WO2019097804A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B1/00 Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
    • A61B1/04 Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor combined with photographic or television appliances
    • A61B1/045 Control thereof
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • the present invention relates to an image recording apparatus and an image recording method for recording an image obtained by a medical device such as an endoscope.
  • endoscopes are widely adopted in the medical field and the like. Medical images obtained by the endoscope are recorded in various media for diagnosis and case recording. In recent years, with the increase in capacity of recording media, recording of moving images from endoscopes has also been performed.
  • various images obtained during a procedure or examination, such as an endoscopic image, an ultrasound image, an X-ray image, an image of the operator's hands, and an image of conditions in the room (hereinafter referred to as medical images), may be recorded as moving images.
  • among image recording apparatuses, there are not only those operated from the apparatus main body but also those in which a recording operation can be performed by a scope switch or the like provided on the endoscope.
  • a diagnostic support system has also been developed that records not only the image but also sound related to the examination object being photographed, together with information on the recording time.
  • it is also desired to record a medical image as a backup for an evidence image or the like, or to use it as educational material.
  • recorded images can be shared at academic meetings or in-hospital conferences and used for education of young doctors.
  • in a technique certification system, it is also possible to record an endoscopic procedure and the like and to perform procedure recognition from the recorded image.
  • An object of the present invention is to provide an image recording apparatus and an image recording method capable of adding index information at a desired timing by estimating the procedure phase based on conversation of a doctor, a nurse, or the like during a procedure.
  • An image recording apparatus includes a video input unit for acquiring a medical image, an audio acquisition unit for acquiring audio generated by a medical worker, and an audio analysis unit for analyzing audio acquired by the audio acquisition unit.
  • a procedure phase estimation unit that estimates a procedure phase based on words and phrases analyzed by the voice analysis unit; a determination unit that determines the estimated phase based on a comparison between a change of the phase estimated by the procedure phase estimation unit and a time series of phases set in advance; and a recording data generation unit that, according to the determination result of the determination unit, generates recording data in which index information correlating the medical image with the phase is added to the medical image, and records the recording data in a recording unit.
  • An image recording method includes a video input step of acquiring a medical image, an audio acquisition step of acquiring audio generated by a medical worker, and an audio analysis step of analyzing audio acquired in the audio acquisition step.
  • a procedure phase estimation step of estimating a procedure phase based on words and phrases obtained by analysis in the voice analysis step; a determination step of determining the estimated phase based on a comparison between a change of the phase estimated in the procedure phase estimation step and a predetermined time series of phases; and a recording data generation step of generating, according to the determination result in the determination step, recording data in which index information correlating the medical image with the phase is added.
  • FIG. 1 is a block diagram showing an image recording apparatus according to a first embodiment of the present invention.
  • FIG. 2 is an explanatory drawing showing the operating room in which the image recording apparatus of FIG. 1 is arranged.
  • FIG. 4 is a flowchart for explaining the operation of the first embodiment.
  • FIG. 5 is an explanatory view showing an example of an image file generated by the recording unit 55.
  • FIG. 6 is an explanatory view showing an example of an image file generated by the recording unit 55.
  • A flowchart showing the operation.
  • FIG. 1 is a block diagram showing an image recording apparatus according to a first embodiment of the present invention.
  • FIG. 2 is an explanatory view showing the operating room where the image recording apparatus of FIG. 1 is disposed.
  • the medical system 3 disposed in the operating room 2 is provided with a system controller 41 that controls medical equipment such as the operating table 10 on which the patient 48 lies and the electric scalpel device 13.
  • a first cart 11 and a second cart 12 are provided in the operating room 2, and a system controller 41 is placed on the first cart 11.
  • on the first cart 11, devices such as an electric scalpel device 13, an insufflation device 14, a video processor 15, and a light source device 16, which are medical devices to be controlled, and a gas cylinder 18 filled with carbon dioxide are placed.
  • the video processor 15 is connected to the first endoscope 31 via a camera cable 31a.
  • the light source device 16 is connected to the first endoscope 31 via the light guide cable 31b. Further, on the first cart 11, the display device 19, the first central display panel 20, the operation panel 49, and the like are placed.
  • the display device 19 is, for example, a TV monitor that displays an endoscopic image or the like from the video processor 15.
  • the central display panel 20 is a display means capable of selectively displaying any data during surgery.
  • the operation panel 49 includes, for example, a display screen such as a liquid crystal display and a touch sensor integrally provided on the display screen, and is a centralized operation device operated by a nurse or the like who is in a non-sterile area.
  • the operating table 10, the shadowless lamp 6, the electric scalpel device 13, the insufflation device 14, the video processor 15, and the light source device 16 are connected via a communication line (not shown) to the system controller 41, which is a central control device.
  • the first cart 11 is provided with an RFID (Radio Frequency Identification) terminal 35 that can wirelessly read and write individual ID information of an object via an ID tag embedded in the first endoscope 31 or in a treatment tool of the electric scalpel device 13 or the like.
  • on the second cart 12, a video processor 23, which is a controlled device, a light source device 24, an image processing device 25, a display device 26, a second centralized display panel 27, and an image recording device 50 are placed.
  • the video processor 23 is connected to the second endoscope 32 via a camera cable 32a.
  • the light source device 24 is connected to the second endoscope 32 via the light guide cable 32b.
  • the display device 26 displays an endoscopic image or the like captured by the video processor 23.
  • the second central display panel 27 can selectively display any data during the operation.
  • the video processor 23, the light source device 24, the image processing device 25 and the image recording device 50 are connected to the relay unit 28 placed on the second cart 12 via a communication line (not shown).
  • the relay unit 28 is connected to the system controller 41 mounted on the first cart 11 by a relay cable 29.
  • the system controller 41 can centrally control the video processor 23, the light source device 24, the image processing device 25, and the image recording device 50 mounted on the second cart 12, as well as the electric scalpel device 13, the insufflation device 14, the video processor 15, the light source device 16, and the operating table 10 mounted on the first cart 11.
  • the system controller 41 can display the setting states of the connected devices, setting screens including operation switches, and the like on the display screen of the operation panel 49.
  • on the system controller 41, operation input such as changing setting values can be performed by touching a desired operation switch and operating the touch panel in a predetermined area.
  • the remote controller 30 is a second centralized operation device operated by a surgeon or the like who is in a sterile area, and can operate other devices, with which communication has been established, via the system controller 41.
  • an infrared communication port (not shown) which is a communication means is attached to the system controller 41.
  • the infrared communication port is provided at a position from which infrared light can easily be emitted, such as in the vicinity of the display device 19, and is connected to the system controller 41 with a cable.
  • the system controller 41 is connected to the patient monitoring system 4 by the cable 9.
  • the patient monitoring system 4 can analyze biological information and can display the analysis result on a required display device.
  • a camera 37 for imaging a medical device such as the operating table 10 is also provided. It is possible to determine the operation state by imaging a medical device such as the operating table 10 with the camera 37 and analyzing the captured image. The determination result and the image captured by the camera 37 are supplied to the system controller 41.
  • the video processors 15 and 23 can generate endoscopic images based on the outputs of the endoscopes 31 and 32, respectively.
  • the endoscopic images from the video processors 15 and 23 are supplied to the image recording device 50.
  • the video processors 15 and 23 and the image recording device 50 are connected via a network (not shown), and various information including inspection information is supplied from the video processors 15 and 23 to the image recording device 50. (Not shown in Figure 1).
  • as the network connecting the video processors 15 and 23 and the image recording apparatus 50, communication lines of various communication standards can be adopted.
  • a headset type microphone 33 can be connected to the image recording apparatus 50.
  • the microphone 33 picks up the voice emitted by the wearer and outputs a voice signal to the image recording device 50.
  • a plurality of microphones 33 can be connected to the image recording device 50, and the image recording device 50 acquires audio from the plurality of microphones 33.
  • although the microphone 33 is shown as being connected by a cable, it may transmit an audio signal to the image recording apparatus 50 via a wireless transmission path such as Wi-Fi (registered trademark) or Bluetooth (registered trademark).
  • a microphone 34 is also provided, supported by a support member (not shown).
  • the microphone 34 is connected to the image recording apparatus 50 via a cable (not shown) or a wireless transmission path such as Wi-Fi or Bluetooth (registered trademark), and can pick up the sound in the operating room 2 and supply the audio signal to the image recording apparatus 50.
  • FIG. 1 shows an example of a specific configuration of the image recording apparatus 50 in FIG. 2.
  • the image recording device 50 is provided with a control unit 51.
  • the control unit 51 can control each part of the image recording apparatus 50.
  • the control unit 51 may be configured by a processor such as a CPU (not shown) that operates according to a program stored in a memory (not shown) to control each unit, or may be configured by a field programmable gate array (FPGA) or the like.
  • the video input unit 52 of the image recording device 50 is an interface suitable for image transmission, and takes in various medical images.
  • the video input unit 52 can adopt various terminals such as a DVI (Digital Visual Interface) terminal, an SDI (Serial Digital Interface) terminal, an RGB terminal, a Y / C terminal, and a VIDEO terminal.
  • the video input unit 52 can capture various medical images, for example an endoscopic image from the video processor 15 or 23, or images from an ultrasound apparatus, an operative field camera, an X-ray observation apparatus, or an endoscope processor (not shown) other than the video processors 15 and 23.
  • the medical image captured by the video input unit 52 is given to the recording data generation unit 54.
  • the recording data generation unit 54 converts the input medical image into a video signal of a predetermined format by performing a predetermined encoding process on it.
  • the recording data generation unit 54 can convert the input medical image into a video signal of the MPEG-2 format, the MPEG-4 AVC/H.264 format, or the like, and output it as recording data.
  • the recording data generation unit 54 is controlled by the control unit 51 and can add index information to the recording data.
  • the recording data generation unit 54 may include the index information as meta information in the image file that is the recording data, or may generate a file containing the index information separate from the image file; both types of index information may also be recorded.
  • the procedure includes phases such as anesthesia, ablation, excision, dissection, hemostasis, and bypass.
  • the index information corresponds to the time information of the medical image included in the recording data, and is information corresponding to the recording time of each stage (phase) of the procedure among the scenes of the medical image as described later. By using index information at the time of reproduction, it is possible to make the reproduction position jump to a predetermined position of each phase in the medical image, for example, the head position.
  • the recording data generation unit 54 is configured to output the generated recording data to the recording unit 55.
  • the recording unit 55 is controlled by the control unit 51 to record the record data of the medical image as an image file.
  • a hard disk drive or the like can be employed as the recording unit 55.
  • the recording unit 55 can be controlled by the control unit 51 to read out the recording data and output it to the external media recording and reproducing unit 56.
  • the external media recording and reproducing unit 56 can record the recording data from the recording unit 55 on an external medium 65 (not shown), which is an external recording medium.
  • as the external medium 65, not only a BD (Blu-ray Disc), a DVD, or a USB memory but also a server or the like on a network, or other recording media, may be adopted.
  • the record data generation unit 54 can also output the input medical image to the video output unit 57.
  • the video output unit 57 outputs the input medical image to the external monitor 66. Thereby, the external monitor 66 can display a medical image.
  • the image recording apparatus 50 is also provided with a sound acquisition unit 53.
  • the voice acquisition unit 53 receives voice signals from the microphones 33 and 34.
  • the audio acquisition unit 53 is configured by a connector or the like to which a cable is connected when the transmission path of the audio signals from the microphones 33 and 34 is wired, and by an antenna, a receiver, or the like for receiving a radio signal when the transmission path is wireless; it takes in the input audio signals and outputs them to the voice analysis unit 58.
  • the sound acquisition unit 53 can acquire a plurality of sound signals.
  • the audio acquisition unit 53 can detect which microphone picked up each audio signal. For example, when the audio signal is transmitted by wire, the audio acquisition unit 53 can detect which microphone picked up the signal from the connector to which each cable is connected. When the signal is transmitted wirelessly, the microphone can be detected, for example, from a device ID acquired when the wireless communication was established.
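The connector-based and device-ID-based detection described above could be sketched as follows. This is a minimal illustration only: the connector numbers, device IDs, and microphone labels are assumptions and do not appear in the patent.

```python
# Sketch of how the audio acquisition unit 53 could resolve which microphone
# picked up a signal. All keys and labels below are illustrative assumptions.

WIRED_CONNECTORS = {1: "doctor headset microphone 33", 2: "nurse headset microphone 33"}
WIRELESS_DEVICE_IDS = {"BT-0001": "room microphone 34"}

def identify_microphone(transport, key):
    """Return a label for the microphone that picked up the audio signal."""
    if transport == "wired":
        # Wired case: the connector to which the cable is attached identifies the mic.
        return WIRED_CONNECTORS.get(key, "unknown microphone")
    if transport == "wireless":
        # Wireless case: the device ID from connection establishment identifies it.
        return WIRELESS_DEVICE_IDS.get(key, "unknown microphone")
    return "unknown microphone"
```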
  • the voice analysis unit 58 performs analysis processing on the input voice to obtain a voice analysis result.
  • the speech analysis unit 58 performs speech recognition using an acoustic model, a word dictionary, a language model and the like prepared in advance in the system.
  • the speech analysis unit 58 analyzes the speech of the doctor or nurse picked up by the microphones 33 and 34 by speech recognition processing, acquires the recognition result of the words or phrases spoken, and outputs the result to the determination unit 59 as the analysis result.
  • the determination unit 59 determines the phase of the procedure corresponding to the voice collected by the microphones 33 and 34 by referring to the lookup table stored in the storage unit 60.
  • the determination unit 59 may be configured by a processor such as a CPU operating in accordance with a program stored in a memory (not shown), or part or all of its functions may be realized by an electronic circuit in hardware.
  • FIG. 3 is an explanatory diagram for explaining an example of the contents of the lookup table stored in the storage unit 60.
  • the look-up table of FIG. 3 registers the relationship between each phase (Phase 1, Phase 2,..., Phase n) of the procedure and the corresponding word.
  • for example, phrases such as “start” are registered corresponding to Phase 1, and phrases such as “separation complete” and “end complete” are registered corresponding to Phase 5.
  • FIG. 3 shows an example in which each phase is associated with one or a relatively small number of words or phrases, but a larger number of words or phrases may be registered for each phase (Phase 1, Phase 2, ...) of FIG. 3.
  • the lookup table of FIG. 3 corresponds to a predetermined procedure; if the procedure is different, a lookup table in which each phase corresponding to that procedure and the words and phrases corresponding to each phase are registered is adopted.
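The lookup table of FIG. 3 can be sketched as a simple mapping from registered phrases to phase numbers. Only "start" (Phase 1), the completion phrases (Phase 5), and "peeling" (Phase 7) are mentioned in the text; everything else here is an illustrative assumption.

```python
# Minimal sketch of the storage unit 60 lookup table (FIG. 3):
# registered words/phrases mapped to procedure phases.

LOOKUP_TABLE = {
    "start": 1,                # registered for Phase 1
    "separation complete": 5,  # registered for Phase 5
    "end complete": 5,         # registered for Phase 5
    "peeling": 7,              # registered for Phase 7
}

def estimate_phase(phrase):
    """Procedure phase estimation unit 59a: phrase -> phase, or None if unregistered."""
    return LOOKUP_TABLE.get(phrase)
```

Words not registered in the table yield no phase estimate, which corresponds to the flow in which processing continues without adding index information.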
  • the determination unit 59 includes a procedure phase estimation unit 59a.
  • the procedure phase estimation unit 59a estimates the phase corresponding to the phrase obtained from the collected voice by referring to the lookup table using the phrase obtained from the analysis result of the voice analysis unit 58, and outputs the estimation result to the control unit 51.
  • for example, when “peeling” is obtained as the voice analysis result, the procedure phase estimation unit 59a refers to the lookup table stored in the storage unit 60 and estimates that the procedure shifts to the corresponding phase (Phase 7) at the timing of the voice acquired by the microphones 33 and 34.
  • FIG. 1 shows an example in which the procedure phase estimation unit 59a is provided in the determination unit 59, but the procedure phase estimation unit 59a may be provided separately from the determination unit 59.
  • the image recording device 50 is provided with an operation unit 61.
  • the operation unit 61 includes, for example, switches, buttons, keys, and a touch panel (not shown), and is configured to receive a user operation and output an operation signal based on the user operation to the control unit 51.
  • the control unit 51 is configured to control each unit based on a user operation on the operation unit 61. For example, the control unit 51 can edit the look-up table stored in the storage unit 60 based on a user operation on the operation unit 61.
  • the control unit 51 can also provide various types of information about the patient to the recording data generation unit 54 to be recorded as metadata of the image file.
  • the control unit 51 may be configured to take in various information related to the patient from an external database server (not shown) via a communication circuit (not shown).
  • control unit 51 may read the look-up table stored in the external medium 65 via the external medium recording and reproducing unit 56 and store the lookup table in the storage unit 60.
  • the determination unit 59 determines whether or not the phase estimated by speech recognition corresponds to the actual order of the phases, taking into account the order of the phases arranged in time series. If so, the determination unit 59 determines that the phase estimation based on speech recognition is correct; otherwise, it determines that the estimation is incorrect. That is, the determination unit 59 judges the estimated phase based on a comparison between the change of the estimated procedure phase and the time series of phases set in advance.
  • for example, when the current phase has been determined to be the phase (Phase 5) and an estimation result of the phase (Phase 4) is then obtained by speech recognition, the determination unit 59 determines that the estimation result of the phase (Phase 4) is incorrect; if an estimation result of the phase (Phase 6) is obtained by speech recognition, it determines that the estimation result of the phase (Phase 6) is correct.
  • only when the phase estimated by speech recognition is correct in time series does the determination unit 59 store the determination result, including the determination of the phase transition and information on the determined phase, and output it to the control unit 51.
  • the determination unit 59 may store the determination result in a memory (not shown) in the determination unit 59.
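The time-series check of the determination unit 59 can be sketched as below. Treating the phases as numbers in their preset order and accepting only forward transitions is an assumption, but it is consistent with the Phase 5 → Phase 4 (rejected) and Phase 5 → Phase 6 (accepted) example above.

```python
def judge_phase(current_phase, estimated_phase):
    """Determination unit 59 sketch: accept an estimated phase only if it
    advances the preset time series of phases.

    Assumes phases are numbered in time-series order, so a valid transition
    moves forward (e.g. Phase 5 -> Phase 6 is accepted, Phase 5 -> Phase 4
    and a repeat of the current phase are rejected).
    """
    return estimated_phase > current_phase
```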
  • when the control unit 51 receives the determination result of the phase from the determination unit 59, it generates a recording request to the recording data generation unit 54 so that index information corresponding to the determination result is added to the recording data at the timing when the determination result is input. Thereby, the recording data generation unit 54 adds, to the recording data of the medical image, index information associated with the timing at which the procedure shifts to the next phase, that is, the recording time (reproduction time) at which the scene of the recorded medical image shifts to the next phase.
  • the index information may be information including the value of the time code at the timing of the shift to each phase. In addition, the index information may include not only information on the recording time but also information on the type of the determined phase.
  • the recording data generation unit 54 may record the index information in a separate file independent of the image file. For example, each time a phase is determined based on voice recognition, it may generate a file in which the determination time (recording time, reproduction time) and the text of the voice recognition result are listed.
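The separate-file variant above could be sketched as one serialized entry per confirmed phase determination, listing the determination time and the recognized text. The field names and JSON serialization are illustrative assumptions, not a format specified by the patent.

```python
import json

# Sketch of one entry of a standalone index file: determination time,
# determined phase, recognized text, and speaker. Field names are assumed.

def make_index_entry(recording_time, phase, text, speaker):
    """Serialize one index entry as a JSON string (one line per determination)."""
    return json.dumps({
        "time": recording_time,   # recording time == reproduction time
        "phase": phase,           # type of the determined phase
        "text": text,             # voice recognition result used for the determination
        "speaker": speaker,       # speaker identified from the microphone
    })
```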
  • the control unit 51 is configured to receive, from the voice acquisition unit 53, information indicating which microphone the voice subjected to voice analysis in the voice analysis unit 58 is picked up by.
  • the control unit 51 refers to a table using the information from the audio acquisition unit 53, specifies the speaker of the words used in the phase determination, and controls the recording data generation unit 54 so that the speaker information is included in the index information.
  • FIG. 4 is a flowchart for explaining the operation of the first embodiment.
  • the video input unit 52 takes in a medical image from the video processor 15 and outputs the medical image to the recording data generation unit 54.
  • the record data generation unit 54 is controlled by the control unit 51, converts the medical image into a moving image of a predetermined format, and starts recording in the recording unit 55 in step S1 of FIG. 4. Further, the medical image from the record data generation unit 54 is displayed on the external monitor 66 by the video output unit 57.
  • the voice acquisition unit 53 takes in the voice collected by the plurality of microphones 33 and 34 and outputs the voice to the voice analysis unit 58.
  • the voice analysis unit 58 acquires the words (conversation) uttered by the doctor or nurse by known voice recognition processing, and outputs the voice recognition result to the determination unit 59.
  • the procedure phase estimation unit 59a of the determination unit 59 estimates the phase corresponding to the speech recognition result by referring to the lookup table of the storage unit 60 using the speech recognition result (step S3).
  • the voice analysis unit 58 performs voice recognition of the doctor's speech and outputs the recognition result to the determination unit 59.
  • when the speech recognition result of the word “start” is present in the lookup table (LUT) of the storage unit 60, the procedure phase estimation unit 59a of the determination unit 59 acquires an estimation result indicating that the phase corresponding to the word is the phase (Phase 1).
  • if a word or phrase obtained by speech recognition does not exist in the lookup table (LUT) of the storage unit 60, the process proceeds from step S3 to step S7.
  • in step S7, the control unit 51 determines whether or not there has been an operation to end recording of the moving image, which is a medical image. If there is no operation to end recording, the process returns to step S2; if there is, the recording end process is performed in step S8.
  • when the determination unit 59 determines in step S3 that the speech recognition result is present in the lookup table, in the next step S4 it reads out the determination results of previous phases stored in the storage unit 60 or in a memory (not shown) in the determination unit 59, and determines whether or not the time series of the phase estimated by speech recognition is correct.
  • the phase (Phase 1) is the first phase after the start of the procedure, and the estimation result of this phase is correct in time series, so the determination unit 59 shifts the process from step S4 to step S5 and outputs the determined estimation result, that is, information on the phase transition and the type of phase, to the control unit 51. If the phase estimated by speech recognition is not correct in time series, the determination unit 59 returns the process to step S2 without using the current estimation result.
  • after determining in step S5, for example, which microphone picked up the voice that produced the output of the voice analysis unit 58, the control unit 51 identifies the speaker who uttered the voice used for the phase determination, and issues a recording request for adding index information to the moving image being recorded (step S6).
  • the recording data generation unit 54 adds the index information to the recording data. That is, the index information has time information corresponding to the image portion recorded at the timing when the doctor uttered “start”.
  • the recording data to which the index information is added is recorded in the recording unit 55.
  • after making the recording request in step S6, the control unit 51 shifts the process to step S7 and determines whether or not a recording end operation has been performed. Thereafter, the processes of steps S2 to S7 are repeated, and phase determination and addition of index information to the recording data are performed for each utterance of the doctor or nurse.
  • the control unit 51 adds index information to the recording data according to this determination.
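The repeated flow of steps S2 to S7 can be condensed into a short loop for a finite stream of recognized utterances. The function and variable names are stand-ins for the units described in the text, and the strictly-forward phase check is the same assumption as the time-series judgment above; this is a sketch, not the patented implementation.

```python
# Condensed sketch of steps S2-S7 of FIG. 4 for a finite list of
# (time_code, recognized_phrase) pairs.

def recording_loop(utterances, lookup_table):
    """Return the index information produced while 'recording' the stream."""
    index_info = []
    current_phase = 0
    for time_code, phrase in utterances:
        phase = lookup_table.get(phrase)       # step S3: LUT lookup
        if phase is None:                      # phrase not registered: keep recording
            continue
        if phase <= current_phase:             # step S4: wrong time series, ignore
            continue
        current_phase = phase                  # step S5: accept the estimate
        index_info.append((time_code, phase, phrase))  # step S6: add index info
    return index_info
```

For instance, a repeated "start" after the transition to Phase 2 would be ignored as a backward move, matching the example in the text.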
  • for example, suppose the doctor utters “start” after the index information accompanying the transition to the phase (Phase 2) has been added in response to the utterance “dissection start”.
  • in this case, the phase estimated by speech recognition by the procedure phase estimation unit 59a indicates a return from the phase (Phase 2) to the phase (Phase 1); the determination unit 59 determines that this estimation result is not correct in time series, and ignores the estimation based on the speech recognition of “start”.
  • in this way, the transition to each phase is correctly determined following the time series of the phases (Phase 1, Phase 2, ...) of FIG. 3, and index information is added to the recording data at the timing of the transition to each phase.
  • when the control unit 51 determines in step S7 that the recording end operation has been performed, it instructs the recording data generation unit 54 to end recording in step S8.
  • the recording data generation unit 54 converts the recording data into a file and records the file in the recording unit 55.
  • the external media recording and reproducing unit 56 can record the image file recorded in the recording unit 55 in the external medium 65.
  • it may also be that the control unit 51 determines that the recording end operation has been performed and performs the end of recording and the conversion into an image file.
  • FIGS. 5 and 6 are explanatory diagrams showing examples of index information added to recording data.
  • FIG. 5 shows an example of recording index information as metadata of recording data.
  • the example of FIG. 5 shows index information added as metadata for recorded data of a predetermined procedure for a certain patient.
  • the index information includes patient information including “Date”, “Patient ID”, “Patient Name”, “BOD”, “Age”, and “Sex”.
  • The index information also includes information on the file name of the image file, indicated by <title 1>.
  • the index information includes information obtained by the determination of the phase, that is, the time of the phase determination, the speaker of the phrase used for the determination, and the information of the phrase used for the phase determination.
  • In the example of FIG. 5, it is indicated that the phase related to peeling started at time xx:xx:xx according to the phrase "peeling start" uttered by the speaker OO, and that index information indicating the recording time of the image related to the peeling (time xx hour xx minute xx second) is recorded as metadata of the recording data.
  • the information on the recording time indicates the time from the start of recording, and is the same information as the reproduction time.
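The metadata of FIG. 5 could be serialized, for example, as follows. The JSON layout and all field values are assumptions for illustration; only the field names ("Date", "Patient ID", and so on) follow the figure as described above.

```python
import json

# Hypothetical serialization of the FIG. 5 style metadata: patient information
# plus one entry per determined phase (time, speaker, and the phrase used).
index_metadata = {
    "Date": "2018-08-29",
    "Patient ID": "0001",
    "Patient Name": "TARO YAMADA",       # illustrative values only
    "title 1": "procedure_0001.mp4",     # file name of the image file
    "phases": [
        {
            "time": "10:15:30",            # time of the phase determination
            "recording_time": "00:03:22",  # elapsed time from recording start
            "speaker": "Dr. OO",
            "phrase": "peeling start",     # phrase used for the determination
        }
    ],
}

serialized = json.dumps(index_metadata, indent=2)
restored = json.loads(serialized)
```

Because the recording time equals the reproduction time, a player can seek directly to `recording_time` for each phase entry.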
  • Although the reproduction system is not shown in FIG. 1, it is also possible to read out the image file recorded in the recording unit 55 and output the reproduced image to the external monitor 66 via the video output unit 57.
  • When reproducing the recording data, a reproduction circuit (not shown) can use the index information to jump the reproduction position or the editing position of the recorded image to the position corresponding to the index information. This makes it easy to move to the head position of each phase and perform reproduction.
  • FIG. 6 shows index information in the form of a list associated with an image file, in consideration of playback devices that do not have such a jump function using metadata.
  • The list of FIG. 6 has an item for the playback time and an item for the corresponding text. The playback-time item indicates the recording time (reproduction time) at which the phase was determined, and the text item indicates the phrase used for the determination of the phase.
  • For example, in the example of FIG. 6, it is determined from the phrase "Start xx for the subject OO from this" that the first phase of the procedure xx starts at time 00:03:22 from the start of recording, and index information indicating the recording time of the image for that phase (time xx:xx:xx) is recorded as a file separate from the image file of the recording data.
  • By using index information in a list format as shown in FIG. 6, the head position of each phase can easily be confirmed, for example when reproducing the image file on a personal computer or the like.
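The FIG. 6 style list can be kept as a sidecar file next to the image file. A possible CSV rendering is sketched below; the column names and entries are assumptions based on the description of the figure, not its actual layout.

```python
import csv
import io

# Write a FIG. 6 style list (playback time + corresponding text) as a CSV
# sidecar for the image file. The entries are illustrative.
entries = [
    ("00:03:22", "Start xx for the subject OO from this"),
    ("00:41:07", "hemostasis"),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["playback_time", "text"])   # list items of FIG. 6
writer.writerows(entries)
sidecar = buf.getvalue()

# A viewer (e.g. on a personal computer) can read the sidecar to find the
# head position of each phase without any metadata support in the player.
rows = list(csv.reader(io.StringIO(sidecar)))
```

Keeping the list as a separate plain-text file is what lets players without a metadata jump function still benefit from the index information.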
  • As described above, in the present embodiment, the transition to each phase of the procedure is determined by analyzing the voice uttered by the operator or another staff member, and index information can be reliably added to the image file to be recorded.
  • Doctors and nurses generally exchange conversation about the procedure in order to cooperate with each other, and the phase of the procedure can be estimated by speech analysis based on that conversation. That is, the switching of phases can be determined even when there is no change in device status or in the video. Therefore, the operator does not need to perform any special operation to add the index information, and the index information can be added at the desired positions without distracting the operator from the procedure. By using this index information, the reproduction position or the editing position can easily be moved to each phase at the time of reproduction, editing, and the like.
  • FIG. 7 is a flowchart showing an operation flow employed in the second embodiment of the present invention.
  • the hardware configuration in the present embodiment is the same as that shown in FIG.
  • In the present embodiment, a lookup table corresponding to each procedure is adopted.
  • a plurality of lookup tables corresponding to each procedure are stored in the storage unit 60, and the lookup table can be selected according to the procedure.
  • Furthermore, a lookup table is prepared not only for each procedure but also for each operator, and each phase is estimated and determined using the lookup table for each procedure and each operator.
  • the sound acquisition unit 53 can acquire sound for each microphone, that is, for each operator.
  • the voice analysis unit 58 can obtain a voice analysis result for each microphone by inputting a voice signal for each microphone from the voice acquisition unit 53.
  • The determination unit 59 can identify the speaker of the phrase used in the determination of the phase based on the information from the control unit 51.
  • The voice analysis unit 58 can improve voice recognition accuracy by using a personal dictionary for each user during the voice recognition processing, and it is also possible to identify the user who spoke by the voice recognition processing itself, without identifying a microphone.
  • the storage unit 60 stores a lookup table for each procedure and each operator as a lookup table.
  • the storage unit 60 may store only one look-up table for each procedure or each operator.
  • the determination unit 59 identifies the procedure based on the voice recognition result, and identifies the operator based on the information from the control unit 51 or the information acquired in the process of the voice recognition process.
  • the determination unit 59 determines the phase using a look-up table for each procedure and each operator. Thereby, the estimation and determination accuracy of the phase can be improved.
  • The lookup table for each operator corresponds to the language and/or manner of speaking of the operator. For example, when the operator speaks English, English phrases are registered.
  • When the recording of the medical image is started in step S1, the voice analysis unit 58 performs voice recognition in step S21, and the voice analysis unit 58 or the control unit 51 identifies the operator whose utterance is the target of the voice recognition.
  • step S22 the determination unit 59 identifies the procedure based on the voice recognition result, and selects a lookup table corresponding to the identified procedure among the lookup tables stored in the storage unit 60.
  • When lookup tables for each procedure and for each operator are stored in the storage unit 60, the lookup table group corresponding to the identified procedure is selected.
  • FIGS. 8 and 9 are explanatory diagrams showing an example of lookup tables for two procedures stored in the storage unit 60.
  • The lookup table (LUT 1) in FIG. 8 corresponds to the procedure of liver resection.
  • By using this lookup table, the phase (Phase 1) is determined by an utterance such as "start" by the operator or another staff member, and the phases (Phase 2, Phase 3, ...) are then determined sequentially by subsequent utterances such as "cut start", and so on.
  • The lookup table (LUT 2) in FIG. 9 corresponds to the procedure for bile duct stone removal. By using this lookup table, the phase (Phase 1) is determined by an utterance such as "start" by the operator or another staff member, and the phases (Phase 2, Phase 3, ...) are then determined sequentially by the utterances "calculus confirmation", "stent insertion", and so on.
  • In the case of liver resection, the determination unit 59 selects the lookup table LUT1 in step S22.
  • The determination unit 59 may specify the procedure not only from the voice recognition result but also, for example, from a designation by the control unit 51 according to an operation of the operation unit 61.
  • Alternatively, the determination unit 59 may specify the procedure based on a designation by the control unit 51 that has acquired such information.
  • For example, assume that the doctor or another staff member utters "start". The determination unit 59 identifies the doctor in step S23; if a lookup table corresponding to the identified doctor exists, that lookup table is selected, and if it does not, a lookup table for "Hepatectomy" common to all operators is selected to perform the phase determination.
  • As described above, the present embodiment provides the same effects as the first embodiment and, in addition, determines the phase with reference to a lookup table for each procedure or for each operator, which has the effect of improving the determination accuracy.
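The selection in steps S22 and S23 — a per-procedure lookup table, refined per operator when a personal table exists — can be sketched as below. The table contents, procedure names, and operator names are assumptions for illustration; they are not the actual LUT1/LUT2 of FIGS. 8 and 9.

```python
# Sketch of steps S22-S23: pick the lookup table for the identified procedure,
# preferring an operator-specific table and falling back to the table common
# to all operators. Table contents are illustrative.
LOOKUP_TABLES = {
    # (procedure, operator) -> table; operator None means the common table
    ("Hepatectomy", None): {"start": 1, "cut start": 2},
    ("Hepatectomy", "Dr. OO"): {"begin": 1, "cut start": 2},
    ("Bile duct stone removal", None): {"start": 1,
                                        "calculus confirmation": 2,
                                        "stent insertion": 3},
}

def select_lookup_table(procedure, operator):
    """Operator-specific table if registered, else the common one (step S23)."""
    table = LOOKUP_TABLES.get((procedure, operator))
    if table is None:
        table = LOOKUP_TABLES[(procedure, None)]
    return table

lut = select_lookup_table("Hepatectomy", "Dr. XX")  # no personal table
# -> falls back to the common "Hepatectomy" table
```

The fallback mirrors the behavior described above: a personal table is used when one exists, and the table common to all operators is used otherwise.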
  • FIG. 10 is a flowchart showing an operation flow employed in the third embodiment of the present invention.
  • the same steps as in FIG. 4 will be assigned the same reference numerals and descriptions thereof will be omitted.
  • the hardware configuration in the present embodiment is the same as that shown in FIG.
  • In the first embodiment, the lookup table was described as an example in which only the time-series phases of each procedure are registered.
  • The present embodiment shows an example in which not only the time-series phases but also phases occurring at arbitrary timing are registered in the lookup table.
  • FIG. 11 is an explanatory drawing showing an example of the content of the look-up table employed in the present embodiment.
  • The lookup table of FIG. 11 is stored in the storage unit 60 and, as in FIG. 3, registers the relationship between each phase of the procedure (Phase 1, Phase 2, ...) and the corresponding phrase.
  • In the present embodiment, phases that do not necessarily occur in time series are also registered.
  • The hemorrhage phase, the hemostasis phase, and the cleaning phase are registered as examples of phases that do not necessarily occur in time series. For these phases, when the speech recognition result is "bleeding", "hemostasis", or "washing", it is determined that the corresponding phase has been entered.
  • the procedure phase estimation unit 59a of the determination unit 59 determines whether or not there is a phase corresponding to the speech recognition result in step S3 of FIG. 10, and if it exists, shifts the process to step S31.
  • In step S31, the determination unit 59 determines whether or not the phase based on the speech recognition result is a phase that occurs in time series. For example, when a time-series phase is estimated, as with the phase (Phase 3) corresponding to "separation start", the determination unit 59 determines in the next step S4 whether or not it is the correct time-series phase.
  • On the other hand, when a phase that does not occur in time series, such as the hemostasis phase, is estimated, the determination unit 59 shifts the process from step S31 to step S5 and performs speaker identification without judging the time series.
  • As a result, index information corresponding to the occurrence of the phase can be reliably added to the recording data even for a phase that may occur at an arbitrary timing, such as "hemostasis".
  • As described above, the present embodiment provides the same effects as the first embodiment and, in addition, has the effect that even a phase that does not necessarily occur in time series is reliably determined so that index information can be added to the recording data.
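Step S31 adds one branch to the first embodiment's loop: time-series phases still pass the ordering check of step S4, while any-time phases such as hemostasis skip it. A sketch under the same illustrative assumptions (the phrase strings and phase numbers are not the actual FIG. 11 table):

```python
# Sketch of the third embodiment's determination: phrases mapping to ordered
# phases are checked against the time series (step S4), while phrases such as
# "hemostasis" map to any-time phases and skip that check (step S31 -> S5).
TIME_SERIES_PHASES = {"start": 1, "separation start": 3}
ANY_TIME_PHASES = {"bleeding", "hemostasis", "washing"}

def determine(text, current_phase):
    """Return (accepted_phase_label, new_current_phase); label is None if rejected."""
    if text in ANY_TIME_PHASES:                # step S31: not time-sequential
        return text, current_phase             # index it; ordering is unchanged
    phase = TIME_SERIES_PHASES.get(text)
    if phase is not None and phase > current_phase:  # step S4 ordering check
        return f"Phase {phase}", phase
    return None, current_phase

label, cur = determine("hemostasis", current_phase=1)
# "hemostasis" is indexed even though it occurred between Phase 1 and Phase 3.
```

Note that accepting an any-time phase leaves the current time-series position unchanged, so a later "separation start" is still judged against the correct ordering.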
  • In the above embodiments, the voice analysis unit 58 and the determination unit 59 perform the speech recognition and the phase determination.
  • These processes may instead be performed using artificial intelligence: the collected voice is provided to an external server having an artificial-intelligence function, the content of the conversation is determined by the artificial intelligence, the phase is determined, the determination result is taken in by the control unit 51, and the index information is added to the recorded image.
  • The present invention is not limited to the above-described embodiments as they are; at the implementation stage, the constituent elements can be modified and embodied without departing from the scope of the invention.
  • various inventions can be formed by appropriate combinations of a plurality of components disclosed in the above-described embodiments. For example, some components of all the components shown in the embodiment may be deleted. Furthermore, components in different embodiments may be combined as appropriate.

Abstract

This image recording device is provided with: an image input unit for acquiring a medical image; a speech acquisition unit for acquiring speech produced by a health professional; a speech analysis unit for analyzing the speech acquired by the speech acquisition unit; a procedure phase estimation unit for estimating a phase of a procedure on the basis of a phrase in the analysis results of the speech analysis unit; a determination unit for confirming the estimated phase on the basis of a time-series comparison between the change in the phase estimated by the procedure phase estimation unit and phases set in advance; and a recording data generation unit which generates recording data by adding to the medical image index information associating the medical image with the phase according to the determination results of the determination unit, and records the recording data in a recording unit.

Description

Image recording apparatus and image recording method
 The present invention relates to an image recording apparatus and an image recording method for recording an image obtained by a medical device such as an endoscope.
 Conventionally, endoscopes have been widely adopted in the medical field and elsewhere. Medical images obtained by an endoscope are recorded on various media for diagnosis and for case records. In recent years, with the increasing capacity of recording media, moving images from endoscopes have also come to be recorded.
 For example, in a procedure or examination using an endoscope, various images (hereinafter referred to as medical images) may be recorded as moving images: not only endoscopic images, ultrasound images, and X-ray images during the procedure or examination, but also images of the operator's hands and of the room. Some such image recording apparatuses allow the recording operation to be performed not only on the image recording apparatus main body but also by a scope switch or the like provided on the endoscope.
 Furthermore, Japanese Patent Application Laid-Open No. 2007-275237 discloses a diagnostic support system that records not only images but also sound related to the imaged examination object, together with recording-time information.
 Possible purposes of recording a case include using the medical images as a backup, for example as evidence, and using them as educational material. For example, recorded images of important anatomical scenes in a case can be shared at academic meetings or in-hospital conferences and used for the education of young doctors. In addition, for use with a technique certification system, endoscopic procedures can be recorded and the technique certified from the recorded images.
 When recording for backup purposes, the entire case must be recorded; in a surgical operation, for example, image recording continues for a relatively long time. In contrast, images used for education often cover only part of the procedure or examination period. Therefore, when a recorded image of an entire case is used for educational purposes, searching for a desired scene can be laborious.
 One conceivable method is to designate editing points with a scope switch or the like during recording of the medical image, so that editing-point information is added at recording time. However, performing the operation of adding an editing point for a desired scene is not always easy during surgery, and the operation may be forgotten.
 An image recording apparatus has therefore been proposed that estimates the switching of procedure phases from changes in the image and adds editing-point meta information to the recorded image at the estimated switching points, thereby making it easy to search for the image of a desired scene.
 However, even at the start timing of a scene needed, for example, for educational use, if the image does not change significantly at that timing, the timing may not be determined to be an editing point and meta information may not be added.
 An object of the present invention is to provide an image recording apparatus and an image recording method capable of adding index information at a desired timing by estimating the procedure phase based on the conversations of doctors, nurses, and others during the procedure.
 An image recording apparatus according to one aspect of the present invention comprises: a video input unit that acquires a medical image; a voice acquisition unit that acquires voice uttered by a medical worker; a voice analysis unit that analyzes the voice acquired by the voice acquisition unit; a procedure phase estimation unit that estimates a phase of a procedure based on a phrase in the analysis result of the voice analysis unit; a determination unit that confirms the estimated phase based on a comparison between the change of the phase estimated by the procedure phase estimation unit and a preset time series of phases; and a recording data generation unit that, according to the determination result of the determination unit, generates recording data by adding to the medical image index information associating the medical image with the phase, and records the recording data in a recording unit.
 An image recording method according to one aspect of the present invention comprises: a video input step of acquiring a medical image; a voice acquisition step of acquiring voice uttered by a medical worker; a voice analysis step of analyzing the voice acquired in the voice acquisition step; a procedure phase estimation step of estimating a phase of a procedure based on a phrase in the analysis result of the voice analysis step; a determination step of confirming the estimated phase based on a comparison between the change of the phase estimated in the procedure phase estimation step and a preset time series of phases; and a recording data generation step of, according to the determination result of the determination step, generating recording data by adding to the medical image index information associating the medical image with the phase, and recording the recording data in a recording unit.
FIG. 1 is a block diagram showing an image recording apparatus according to a first embodiment of the present invention. FIG. 2 is an explanatory view showing the operating room in which the image recording apparatus of FIG. 1 is arranged. FIG. 3 is an explanatory view for explaining an example of the contents of the lookup table stored in the storage unit 60. FIG. 4 is a flowchart for explaining the operation of the first embodiment. FIGS. 5 and 6 are explanatory views showing examples of an image file generated in the recording unit 55. FIG. 7 is a flowchart showing the operation flow employed in a second embodiment of the present invention. FIGS. 8 and 9 are explanatory views showing an example of lookup tables of two procedures stored in the storage unit 60. FIG. 10 is a flowchart showing the operation flow employed in a third embodiment of the present invention. FIG. 11 is an explanatory view showing an example of the contents of the lookup table employed in the third embodiment.
 Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
(First Embodiment)
 FIG. 1 is a block diagram showing an image recording apparatus according to the first embodiment of the present invention. FIG. 2 is an explanatory view showing the operating room in which the image recording apparatus of FIG. 1 is arranged.
 Usually, a plurality of doctors and nurses are present in the operating room to perform a procedure, and in order to coordinate with each other during the procedure, conversations are exchanged at key points concerning the handing over of instruments, confirmation of the patient's condition, operation of the devices, and so on. The content of the conversations uttered during a procedure often corresponds to the phases of the procedure. The present embodiment uses such conversations to estimate the phase of the procedure and, at the estimated timing, generates information on singular points such as editing points (index information described later). In important scenes within a procedure, it is extremely likely that conversations corresponding to each scene are exchanged, so a singular point specifying a desired scene can be reliably estimated.
 First, the arrangement of the image recording apparatus 50 in the operating room 2 will be described with reference to FIG. 2. As shown in FIG. 2, the medical system 3 arranged in the operating room 2 is provided with a system controller 41 that controls medical equipment such as the operating table 10 on which the patient 48 lies and the electric scalpel device 13. A first cart 11 and a second cart 12 are provided in the operating room 2, and the system controller 41 is placed on the first cart 11.
 On the first cart 11 are placed devices such as the electric scalpel device 13, an insufflation device 14, a video processor 15, and a light source device 16, which are medical devices to be controlled, together with a gas cylinder 18 filled with carbon dioxide. The video processor 15 is connected to a first endoscope 31 via a camera cable 31a.
 The light source device 16 is connected to the first endoscope 31 via a light guide cable 31b. A display device 19, a first centralized display panel 20, an operation panel 49, and the like are also placed on the first cart 11. The display device 19 is, for example, a TV monitor that displays endoscopic images and the like from the video processor 15.
 The centralized display panel 20 is a display means capable of selectively displaying any data during surgery. The operation panel 49 comprises a display screen such as a liquid crystal display and, for example, a touch sensor provided integrally on the display screen, and serves as a centralized operation device operated by a nurse or another person in the non-sterile area.
 The operating table 10, the shadowless lamp 6, the electric scalpel device 13, the insufflation device 14, the video processor 15, and the light source device 16 are connected to the system controller 41, which is a centralized control device, via communication lines (not shown).
 The first cart 11 is also provided with an RFID (Radio Frequency Identification) terminal 35 that can wirelessly read and write the individual ID information of objects via ID tags embedded in the first endoscope 31, the treatment tools of the electric scalpel device 13, and the like.
 On the second cart 12 are placed a video processor 23, a light source device 24, an image processing device 25, a display device 26, a second centralized display panel 27, and the image recording apparatus 50, which are devices to be controlled. The video processor 23 is connected to a second endoscope 32 via a camera cable 32a, and the light source device 24 is connected to the second endoscope 32 via a light guide cable 32b.
 The display device 26 displays endoscopic images and the like captured by the video processor 23. The second centralized display panel 27 can selectively display any data during the operation.
 The video processor 23, the light source device 24, the image processing device 25, and the image recording apparatus 50 are connected via communication lines (not shown) to a relay unit 28 placed on the second cart 12. The relay unit 28 is connected by a relay cable 29 to the system controller 41 mounted on the first cart 11.
 Thus, the system controller 41 can centrally control the video processor 23, the light source device 24, the image processing device 25, and the image recording apparatus 50 mounted on the second cart 12; the electric scalpel device 13, the insufflation device 14, the video processor 15, and the light source device 16 mounted on the first cart 11; and the operating table 10. When communication is established between the system controller 41 and these devices, the system controller 41 can display the setting states of the connected devices, setting screens for operation switches, and the like on the display screen of the operation panel 49. Furthermore, the system controller 41 accepts operation inputs such as changes of setting values when a desired operation switch is touched and the touch panel is operated in the corresponding area.
 The remote controller 30 is a second centralized operation device operated by the surgeon or another person in the sterile area, and can operate, via the system controller 41, the other devices with which communication has been established.
 An infrared communication port (not shown), which is a communication means, is attached to the system controller 41. The infrared communication port is provided at a position from which infrared light can easily be emitted, such as in the vicinity of the display device 19, and is connected to the system controller 41 by a cable.
 The system controller 41 is connected to a patient monitoring system 4 by a cable 9. The patient monitoring system 4 analyzes biological information and can display the analysis results on a required display device.
 A camera 37 for imaging medical equipment such as the operating table 10 is also provided in the operating room 2. By imaging medical equipment such as the operating table 10 with the camera 37 and analyzing the captured image, the operation state of the equipment can be determined. The determination result and the image captured by the camera 37 are supplied to the system controller 41.
 The video processors 15 and 23 can generate endoscopic images based on the outputs of the endoscopes 31 and 32, respectively. The endoscopic images from the video processors 15 and 23 are supplied to the image recording apparatus 50. The video processors 15 and 23 and the image recording apparatus 50 are also connected via a network (not shown), through which various information including examination information is supplied from the video processors 15 and 23 to the image recording apparatus 50 (omitted in FIG. 1). Communication lines of various communication standards can be adopted as the network connecting the video processors 15 and 23 and the image recording apparatus 50.
 本実施の形態においては、画像記録装置50には、ヘッドセット型のマイク33を接続できるようになっている。マイク33は装着者が発した音声を収音して音声信号を画像記録装置50に出力する。なお、図1では、1つのマイク33のみを示しているが、画像記録装置50には複数のマイク33を接続可能であり、画像記録装置50は、複数のマイク33からの音声を取得することができるようになっている。また、マイク33はケーブルにより有線接続された例を示しているが、Wi-Fi(登録商標)やブルートゥース(登録商標)等の無線伝送路を介して音声信号を画像記録装置50に伝送することができるものであってもよい。 In the present embodiment, a headset type microphone 33 can be connected to the image recording apparatus 50. The microphone 33 picks up the voice emitted by the wearer and outputs a voice signal to the image recording device 50. Although only one microphone 33 is shown in FIG. 1, a plurality of microphones 33 can be connected to the image recording device 50, and the image recording device 50 acquires audio from the plurality of microphones 33. It is possible to Also, although the microphone 33 is shown as being wired connected by a cable, transmitting an audio signal to the image recording apparatus 50 via a wireless transmission path such as Wi-Fi (registered trademark) or Bluetooth (registered trademark). You may be able to
 A microphone 34, supported by a support member (not shown), is also provided in the operating room 2. The microphone 34 is connected to the image recording device 50 via a cable (not shown) or a wireless transmission path such as Wi-Fi or Bluetooth (registered trademark), and can pick up sound in the operating room 2 and supply the audio signal to the image recording device 50.
 Although FIG. 1 illustrates an example employing the plurality of microphones 33 and 34, the types and numbers of microphones and audio-signal transmission paths can be set as appropriate, as long as the conversation of the doctors and nurses in the operating room 2 can be picked up.
 FIG. 1 shows an example of a specific configuration of the image recording device 50 in FIG. 2.
 The image recording device 50 is provided with a control unit 51, which can control each part of the image recording device 50. The control unit 51 may be constituted by a processor such as a CPU (not shown) that operates according to a program stored in a memory (not shown) to control each part, or may be constituted by a field-programmable gate array (FPGA) or the like.
 The video input unit 52 of the image recording device 50 is an interface suitable for image transmission and takes in various medical images. The video input unit 52 can employ various terminals such as a DVI (Digital Visual Interface) terminal, an SDI (Serial Digital Interface) terminal, an RGB terminal, a Y/C terminal, and a VIDEO terminal. The video input unit 52 can take in, for example, endoscopic images from the video processors 15 and 23, as well as various medical images from an ultrasound apparatus, a surgical field camera, an X-ray observation apparatus, an endoscope processor (not shown) other than the video processors 15 and 23, and the like.
 The medical image taken in by the video input unit 52 is given to the recording data generation unit 54. The recording data generation unit 54 performs predetermined encoding processing on the input medical image to convert it into a video signal of a predetermined image format. For example, the recording data generation unit 54 can convert the input medical image into a video signal in MPEG-2 format, MPEG-4 AVC/H.264 format, or the like, and output it as recording data.
 In the present embodiment, the recording data generation unit 54 is controlled by the control unit 51 and can add index information to the recording data. For example, the recording data generation unit 54 may include the index information as meta information in the image file that constitutes the recording data, may generate a file containing the index information separate from the image file, or may record both of these two types of index information.
 For example, a procedure includes stages (phases) such as anesthesia, dissection, excision, transection, hemostasis, and bypass. The index information corresponds to the time information of the medical image included in the recording data and, as described later, indicates the recording time of each stage (phase) of the procedure among the scenes of the medical image. By using the index information at playback time, the playback position can be made to jump to a predetermined position of each phase in the medical image, for example the head position of the phase.
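 The phase jump described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the phase labels and times are assumed placeholder values.

```python
# Sketch: jumping the playback position to the head of a phase using
# index information.  Each entry pairs an (assumed) phase label with the
# recording time, in seconds, at which that phase begins.

index_info = [
    ("Phase1 anesthesia", 0.0),
    ("Phase2 excision",   202.0),
    ("Phase5 hemostasis", 1840.5),
]

def jump_position(index_info, phase_label):
    """Return the playback position (seconds) for the head of a phase."""
    for label, start_time in index_info:
        if label.startswith(phase_label):
            return start_time
    raise KeyError(f"no index entry for {phase_label!r}")

print(jump_position(index_info, "Phase2"))  # → 202.0
```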
 The recording data generation unit 54 outputs the generated recording data to the recording unit 55. The recording unit 55 is controlled by the control unit 51 and records the recording data of the medical image as an image file. For example, a hard disk drive or the like can be employed as the recording unit 55.
 The recording unit 55 can also be controlled by the control unit 51 to read out the recording data and output it to the external media recording/playback unit 56. The external media recording/playback unit 56 can supply the recording data from the recording unit 55 to an external medium 65, which is an external recording medium (not shown), and have it recorded there. As the external medium 65, not only a BD (Blu-ray Disc), DVD, or USB device but also a server on a network or another recording medium may be employed.
 The recording data generation unit 54 can also output the input medical image to the video output unit 57. The video output unit 57 outputs the input medical image to an external monitor 66, which can thereby display the medical image.
 In the present embodiment, the image recording device 50 is also provided with an audio acquisition unit 53, to which the audio signals from the microphones 33 and 34 are input. When the transmission path of the audio signals from the microphones 33 and 34 is wired, the audio acquisition unit 53 is constituted by connectors or the like to which the cables are connected; when the transmission path is wireless, it is constituted by an antenna, a receiver, or the like for receiving radio signals. The audio acquisition unit 53 takes in the input audio signals and outputs them to the audio analysis unit 58.
 The audio acquisition unit 53 can acquire a plurality of audio signals and can detect which microphone picked up each audio signal. For example, when the audio signals are transmitted by wire, the audio acquisition unit 53 can identify the source microphone of an audio signal from the connector to which its cable is connected; when the signals are transmitted wirelessly, it can identify the source microphone from, for example, a device ID acquired when the wireless connection was established.
 The audio analysis unit 58 performs analysis processing on the input audio to obtain an audio analysis result. For example, the audio analysis unit 58 performs speech recognition using an acoustic model, a word dictionary, a language model, and the like prepared in advance in the system. For example, the audio analysis unit 58 analyzes, by speech recognition processing, the utterances of the doctors and nurses picked up by the microphones 33 and 34, acquires the speech recognition results of the words and conversations they uttered, and outputs them as analysis results to the determination unit 59.
 The determination unit 59 determines the phase of the procedure corresponding to the sound picked up by the microphones 33 and 34 by referring to a lookup table stored in the storage unit 60. The determination unit 59 may be constituted by a processor such as a CPU that operates according to a program stored in a memory (not shown), or some or all of its functions may be realized by hardware electronic circuits.
 FIG. 3 is an explanatory diagram for explaining an example of the contents of the lookup table stored in the storage unit 60. The lookup table of FIG. 3 registers the relationship between each phase of the procedure (Phase1, Phase2, ..., Phasen) and the corresponding words and phrases. In FIG. 3, for example, the phrases "start" and "begin" are registered for Phase1, and the phrases "transection complete", "transection finished", and "transection over" are registered for Phase5. Although FIG. 3 shows an example in which each phase is associated with one or a relatively small number of words, a lookup table in which each phase is associated with conversational content consisting of a relatively large number of words may also be adopted.
 The phases in FIG. 3 (Phase1, Phase2, ...) occur in time series in the order Phase1, Phase2, .... FIG. 3 corresponds to one predetermined procedure; for a different procedure, a lookup table is adopted in which the phases of that procedure and the words corresponding to each of those phases are registered.
 The determination unit 59 includes a procedure phase estimation unit 59a. The procedure phase estimation unit 59a refers to the lookup table using the words obtained from the analysis result of the audio analysis unit 58, thereby estimates the phase corresponding to the content of the words obtained from the picked-up sound, and outputs the estimation result to the control unit 51. For example, when "dissection" is obtained as the analysis result of the audio, the procedure phase estimation unit 59a refers to the lookup table stored in the storage unit 60 and estimates that the procedure transitions to Phase7 at the timing of the audio acquired by the microphones 33 and 34.
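 The lookup described above can be sketched as a reverse search over a phrase table. This is an illustrative assumption, not the patent's data structure; the English phrases stand in for the Japanese examples in the text.

```python
# Sketch of the lookup table of FIG. 3 and the estimation performed by
# the procedure phase estimation unit 59a: map each phase to its
# registered trigger phrases, then search for a recognized phrase.

LOOKUP_TABLE = {
    "Phase1": ["start", "begin"],
    "Phase5": ["transection complete", "transection finished"],
    "Phase7": ["dissection"],
}

def estimate_phase(recognized_phrase, table=LOOKUP_TABLE):
    """Return the phase whose registered phrases contain the recognized
    phrase, or None if the phrase is not registered."""
    for phase, phrases in table.items():
        if recognized_phrase in phrases:
            return phase
    return None

print(estimate_phase("dissection"))  # → Phase7
```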
 Although FIG. 1 shows an example in which the procedure phase estimation unit 59a is provided within the determination unit 59, the procedure phase estimation unit 59a may be provided separately from the determination unit 59.
 The image recording device 50 is provided with an operation unit 61. The operation unit 61 is constituted by, for example, switches, buttons, keys, a touch panel, and the like (not shown); it accepts user operations and outputs operation signals based on those operations to the control unit 51. The control unit 51 controls each unit based on user operations on the operation unit 61. For example, the control unit 51 can edit the lookup table stored in the storage unit 60 based on a user operation on the operation unit 61.
 It is also possible to input various kinds of information about the patient via the operation unit 61, and the control unit 51 can supply this patient information to the recording data generation unit 54 to have it recorded as metadata of the image file to be recorded. The control unit 51 may also take in various kinds of patient information from an external database server (not shown) via a communication circuit (not shown).
 The control unit 51 may also read out a lookup table stored in the external medium 65 via the external media recording/playback unit 56 and store it in the storage unit 60.
 Taking into account the order of the phases arranged in time series, the determination unit 59 determines whether the phase estimation result based on speech recognition indicates a phase that corresponds to the actual order of the phases. If it does, the determination unit 59 determines that the phase estimation result based on speech recognition is correct; otherwise, it determines that the estimation result is wrong. That is, the determination unit 59 confirms the estimated phase based on a comparison between the phase change given by the procedure-phase estimation result and the preset time series of phases.
 For example, after determining that the current phase, Phase5, is correct, if an estimation result of Phase4 is obtained based on speech recognition, the determination unit 59 determines that the Phase4 estimation result is wrong; if an estimation result of Phase6 is obtained based on speech recognition, it determines that the Phase6 estimation result is correct. Only when the phase estimation result based on speech recognition is correct in time series does the determination unit 59 store in the storage unit 60, and output to the control unit 51, a determination result including the fact that a phase transition has been determined based on speech recognition and information on the determined phase. The determination unit 59 may instead store the determination result in a memory (not shown) within the determination unit 59.
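 The time-series check above can be sketched as follows. This is a hedged illustration under the assumption that the phases advance strictly forward, as in the Phase5/Phase4/Phase6 example; the phase order list is a placeholder.

```python
# Sketch of the time-series validation of the determination unit 59:
# an estimated phase is confirmed only if it comes after the current
# phase in the preset phase order (assumed here to be Phase1..Phase10).

PHASE_ORDER = [f"Phase{i}" for i in range(1, 11)]

def confirm_phase(current_phase, estimated_phase, order=PHASE_ORDER):
    """Return True if the estimate moves forward in the time series."""
    if current_phase is None:              # first phase of the procedure
        return estimated_phase == order[0]
    return order.index(estimated_phase) > order.index(current_phase)

print(confirm_phase("Phase5", "Phase6"))  # → True  (accepted)
print(confirm_phase("Phase5", "Phase4"))  # → False (ignored)
```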
 In the present embodiment, when the phase determination result is given from the determination unit 59, the control unit 51 issues a recording request to the recording data generation unit 54 so that index information corresponding to the determination result is added to the recording data at the timing at which the determination result was input. The recording data generation unit 54 thereby adds, to the recording data of the medical image, index information associated with the timing at which the procedure transitions to the next phase, that is, the recording time (playback time) at which the scene of the recorded medical image transitions to the next phase.
 For example, when the recording data generation unit 54 includes, in the recording data, time information serving as the time reference of the encoding processing as the timecode of the recorded medical image, the index information may be information including the timecode value of the timing of the transition to each phase. The index information may also include not only recording-time information but also information on the type of the determined phase.
 The recording data generation unit 54 may also generate the index information as a separate file independent of the image file being recorded. For example, each time a phase is determined based on speech recognition, the recording data generation unit 54 may generate a file listing the time of the determination (recording time, playback time) and the text of the speech recognition result.
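 Such a separate list file could look like the sketch below. The file name and the tab-separated layout are assumptions for illustration only; the sample entries echo the examples given later in the text.

```python
# Sketch of writing a list-format index file: one line per confirmed
# phase, pairing the playback time with the recognized text.

entries = [
    ("00:03:22", "We will now start xx on subject XX"),
    ("00:17:05", "excision start"),
]

with open("index_list.txt", "w", encoding="utf-8") as f:
    for playback_time, text in entries:
        f.write(f"{playback_time}\t{text}\n")
```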
 The control unit 51 is given, from the audio acquisition unit 53, information indicating which microphone picked up the audio being analyzed by the audio analysis unit 58. When a table registering the correspondence between microphones and their users is stored in a memory (not shown), the control unit 51 can refer to that table using the information from the audio acquisition unit 53 to identify the speaker of the words used in the phase determination, and can control the recording data generation unit 54 so that the speaker's information is included in the index information.
 Next, the operation of the embodiment configured as described above will be described with reference to FIG. 4, which is a flowchart for explaining the operation of the first embodiment.
 Assume now that a procedure having the phases of FIG. 3 (Phase1, Phase2, ...) is performed. The video input unit 52 takes in, for example, a medical image from the video processor 15 and outputs it to the recording data generation unit 54. Under the control of the control unit 51, the recording data generation unit 54 converts the medical image into a moving image of a predetermined format and starts recording it in the recording unit 55 in step S1 of FIG. 4. The medical image from the recording data generation unit 54 is also displayed on the external monitor 66 by the video output unit 57.
 Meanwhile, the audio acquisition unit 53 takes in the sound picked up by the microphones 33 and 34 and outputs it to the audio analysis unit 58. In step S2, the audio analysis unit 58 acquires the words (conversation) uttered by the doctors and nurses by known speech recognition processing and outputs the speech recognition results to the determination unit 59. The procedure phase estimation unit 59a of the determination unit 59 refers to the lookup table in the storage unit 60 using the speech recognition results and estimates the phase corresponding to them (step S3).
 Here, assume that the doctor utters "start" or "begin" at the start of the procedure. The audio analysis unit 58 performs speech recognition on the doctor's utterance and outputs the recognition result to the determination unit 59. When the procedure phase estimation unit 59a of the determination unit 59 determines that the speech recognition result "start" or "begin" exists in the lookup table (LUT) of the storage unit 60, it obtains an estimation result indicating that the phase corresponding to that phrase is Phase1. If a word obtained by speech recognition does not exist in the lookup table (LUT) of the storage unit 60, the process proceeds from step S3 to step S7. In step S7, the control unit 51 determines whether an operation to end recording of the moving image, which is a medical image, has been performed; if not, the process returns to step S2, and if so, recording end processing is performed in step S8.
 When the determination unit 59 determines in step S3 that the speech recognition result exists in the lookup table, in the next step S4 it reads out the previous phase determination results stored in the storage unit 60 or in a memory (not shown) within the determination unit 59 and determines whether the time series of the phase estimated from speech recognition is correct. Phase1 is the first phase after the start of the procedure, so this phase estimation result is correct in time series; the determination unit 59 therefore moves from step S4 to step S5 to confirm the estimation result and outputs information on the phase transition and the phase type to the control unit 51. If the phase estimation result based on speech recognition is not correct in time series, the determination unit 59 returns the process to step S2 without using the current estimation result.
 In step S5, the control unit 51 identifies the speaker of the audio used for the phase determination, for example based on which microphone's pickup produced the output of the audio analysis unit 58, and then issues a recording request to add index information to the moving image being recorded (step S6).
 In accordance with this recording request, the recording data generation unit 54 adds index information to the recording data. That is, the index information has time information corresponding to the image portion recorded at the timing when the doctor uttered "start" or "begin". The recording data with the index information added is recorded in the recording unit 55.
 After issuing the recording request in step S6, the control unit 51 moves the process to step S7 and determines whether a recording end operation has been performed. Thereafter, the processing of steps S2 to S7 is repeated, and the phase determination and the addition of index information to the recording data are performed for each utterance of a doctor or nurse.
 For example, when the doctor utters "excision start", at this timing the determination unit 59 determines that the phase of the procedure has transitioned to Phase2, which is correct in time series, and the control unit 51 has index information added to the recording data in accordance with this determination.
 For example, assume that the doctor utters "start" after index information has been added in connection with the transition to Phase2 triggered by the utterance "excision start". In this case, the phase estimation result based on speech recognition by the procedure phase estimation unit 59a would indicate a return from Phase2 to Phase1, so the determination unit 59 determines that this estimation result is not correct in time series and ignores the estimate based on the speech recognition of "start". In this way, the transition to each phase is correctly determined in the time-series order of the phases of FIG. 3 (Phase1, Phase2, ...), and index information is added to the recording data at the timing corresponding to the transition to each phase.
 When the control unit 51 determines in step S7 that a recording end operation has been performed, it instructs the recording data generation unit 54 to end recording in step S8. The recording data generation unit 54 converts the recording data into a file and records it in the recording unit 55. The external media recording/playback unit 56 can record the image file recorded in the recording unit 55 onto the external medium 65. The control unit 51 may instead determine that a recording end operation has been performed when it is given a phase determination result based on the doctor's utterance "end", and may then end the recording and create the image file.
 FIGS. 5 and 6 are explanatory diagrams showing examples of index information added to recording data. FIG. 5 shows an example of recording index information as metadata of the recording data, here index information added as metadata to the recording data of a predetermined procedure for a certain patient. In the example of FIG. 5, the index information includes patient information comprising "Date", "Patient ID", "Patient Name", "BOD", "Age", and "Sex". The index information also includes the file name of the image file, indicated by <Title 1>.
 In the present embodiment, the index information includes information obtained by the phase determination, that is, the time of the phase determination, the speaker of the words used for that determination, and the words used for the phase determination. The example of FIG. 5 shows that, based on the phrase "dissection start" uttered by speaker XX, the start of the dissection phase was determined at time xx:xx:xx, and index information indicating the recording time of the image relating to the dissection (time xx:xx:xx) was recorded as metadata of the recording data. The recording-time information indicates the time from the start of recording and is the same information as the playback time.
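 A metadata record of this kind could be sketched as below. The field values are placeholders and the nested layout is an assumption; the actual metadata format is device-specific and is only described abstractly in the text.

```python
# Sketch of metadata-style index information in the spirit of FIG. 5:
# patient information, the image file name, and one phase-determination
# event (time, speaker, recognized phrase).

index_metadata = {
    "Date": "2018-08-29",          # placeholder values throughout
    "Patient ID": "0001",
    "Patient Name": "XX XX",
    "BOD": "1960-01-01",
    "Age": 58,
    "Sex": "M",
    "file": "<Title 1>",
    "events": [
        {"time": "xx:xx:xx", "speaker": "XX", "phrase": "dissection start"},
    ],
}

print(index_metadata["events"][0]["phrase"])  # → dissection start
```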
 Although the playback system is omitted from FIG. 1, it is also possible to read out the image file recorded in the recording unit 55 and have the video output unit 57 output the played-back image to the external monitor 66. In this case, a playback circuit (not shown) can use the index information when playing back the recording data to jump the playback position or editing position of the recorded image to the position corresponding to the index information. This makes it easy to move to the head position of each phase and play back from there. Likewise, any playback device with a jump function using such metadata can use the index information when playing back an image file recorded on the external medium 65 to move to the head position of each phase and play back.
 FIG. 6 shows index information in list format associated with an image file, in consideration of playback devices that do not have such a metadata-based jump function. The list of FIG. 6 has a playback-time item and a corresponding text item: the playback-time item indicates the recording time (playback time) at which the phase was determined, and the text item indicates the words used for the phase determination. For example, in the example of FIG. 6, based on the phrase "We will now start xx on subject XX", the start of the first phase of procedure xx was determined at time 00:03:22 from the start of recording, and index information indicating the recording time of the image relating to that phase was recorded as a file separate from the image file of the recording data.
 By using index information in list form as in FIG. 6, the start position of each phase can easily be located when the image file is played back on a personal computer or the like, for example.
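As a rough sketch of the list-format index of FIG. 6, the entries could be written to a sidecar file and used to look up the start time of each phase during playback. The file format, class names, and sample phrases below are assumptions for illustration; the patent does not specify an encoding.

```python
from dataclasses import dataclass

@dataclass
class IndexEntry:
    playback_time: str  # "HH:MM:SS" offset from the start of recording
    text: str           # phrase that triggered the phase determination

def write_index(entries, path):
    """Write the list-format index as a sidecar file, one entry per line."""
    with open(path, "w", encoding="utf-8") as f:
        for e in entries:
            f.write(f"{e.playback_time}\t{e.text}\n")

def to_seconds(hhmmss):
    """Convert an 'HH:MM:SS' string to a playback offset in seconds."""
    h, m, s = (int(x) for x in hhmmss.split(":"))
    return h * 3600 + m * 60 + s

def phase_start_seconds(entries, phase_number):
    """Return the playback offset (in seconds) of the Nth determined phase."""
    return to_seconds(entries[phase_number - 1].playback_time)

entries = [IndexEntry("00:03:22", "We will now start xx on subject OO"),
           IndexEntry("00:17:05", "start resection")]
print(phase_start_seconds(entries, 1))  # → 202
```

A player without a metadata jump function could read such a sidecar file and seek to the returned offset manually.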
 As described above, in the present embodiment the transition to each phase of a procedure is determined by analyzing speech uttered by the operator and others, so index information can be reliably added to the recorded image file. During a procedure, doctors and nurses commonly talk about the procedure in order to coordinate with one another, and the phase of the procedure can be estimated by speech analysis of this conversation. That is, a phase change can be detected even when no device status change or video change occurs. The operator therefore need not perform any special operation to add index information, and the information can be added at the desired positions without distracting the operator from the procedure. By using this index information, the playback or editing position can easily be moved to each phase during playback, editing, and so on.
(Second Embodiment)
 FIG. 7 is a flowchart showing the operation flow employed in the second embodiment of the present invention. The hardware configuration of this embodiment is the same as in FIG. 1. In the first embodiment as well, a lookup table corresponding to each procedure was used. In the present embodiment, a plurality of lookup tables, one for each procedure, are stored in the storage unit 60, and a lookup table can be selected according to the procedure. Furthermore, in the present embodiment, lookup tables are prepared not only per procedure but also per operator, and each phase is estimated and determined using the lookup table for the given procedure and operator.
 As described above, the sound acquisition unit 53 can acquire sound per microphone, that is, per operator. By receiving the audio signal of each microphone from the sound acquisition unit 53, the voice analysis unit 58 can obtain a speech analysis result per microphone. When a table registering the correspondence between microphones and their users is stored in a memory (not shown), the determination unit 59 can, based on information from the control unit 51, identify the speaker of the phrase used for phase determination.
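The microphone-to-user correspondence table described above can be sketched as a simple mapping. The microphone IDs and registered names below are hypothetical.

```python
# Hypothetical registration of microphones to their users.
mic_to_user = {"mic1": "Dr. A", "mic2": "Nurse B"}

def identify_speaker(mic_id, recognized_phrase):
    """Attach the registered speaker to a recognized phrase, if known."""
    speaker = mic_to_user.get(mic_id, "unknown")
    return {"speaker": speaker, "phrase": recognized_phrase}

result = identify_speaker("mic1", "start resection")
print(result["speaker"])  # → Dr. A
```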
 Note that the voice analysis unit 58 can improve speech recognition accuracy by using a personal dictionary for each user during recognition, and can also identify the speaking user through the speech recognition process itself, without identifying the microphone.
 The storage unit 60 stores lookup tables per procedure and per operator. In some cases, the storage unit 60 stores only one kind of lookup table, either per procedure or per operator. The determination unit 59 identifies the procedure based on the speech recognition result, and identifies the operator based on information from the control unit 51 or information obtained during the speech recognition process. The determination unit 59 then determines the phase using the lookup table for that procedure and operator. This improves phase estimation and determination accuracy. Note that the per-operator lookup table reflects the operator's language and habitual phrasing; for example, if the operator speaks English, English phrases are registered.
 The flow of FIG. 7 differs from that of FIG. 4 in that steps S2 and S5 of FIG. 4 are omitted and steps S21 to S23 are added. When recording of the medical image starts in step S1, the voice analysis unit 58 performs speech recognition in step S21, and the voice analysis unit 58 or the control unit 51 identifies the operator who uttered the recognized phrase.
 In step S22, the determination unit 59 identifies the procedure based on the speech recognition result and selects, from the lookup tables stored in the storage unit 60, the lookup table corresponding to the identified procedure. If lookup tables per procedure and per operator are stored in the storage unit 60, the group of per-operator lookup tables for the identified procedure is selected.
 FIGS. 8 and 9 are explanatory diagrams showing examples of the lookup tables of two procedures stored in the storage unit 60. The lookup table (LUT1) of FIG. 8 corresponds to a liver resection procedure. Using this table, the first phase (Phase1) is determined by an utterance of "start" or "begin" by the operator or others, and the subsequent phases (Phase2, Phase3, ...) are determined in turn by utterances such as "start resection", "start division", and so on. The lookup table (LUT2) of FIG. 9 corresponds to a bile duct stone removal procedure. Using this table, the first phase (Phase1) is determined by an utterance such as "start" or "begin", and the subsequent phases (Phase2, Phase3, ...) are determined in turn by utterances such as "stone confirmed", "stent insertion", and so on.
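The two lookup tables could be represented as phrase-to-phase mappings. The English phrases below loosely follow FIGS. 8 and 9; the exact wording is an assumption of this sketch.

```python
# LUT1: liver resection; LUT2: bile duct stone removal (hypothetical phrases).
LUT1 = {"start": 1, "begin": 1, "start resection": 2, "start division": 3}
LUT2 = {"start": 1, "begin": 1, "stone confirmed": 2, "stent insertion": 3}

def estimate_phase(lut, recognized_phrase):
    """Return the phase number matching the recognized phrase, or None."""
    return lut.get(recognized_phrase)

print(estimate_phase(LUT1, "start resection"))  # → 2
print(estimate_phase(LUT2, "stent insertion"))  # → 3
```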
 For example, when the doctor says "liver resection", the determination unit 59 selects the lookup table LUT1 in step S22. The determination unit 59 may identify the procedure not only from the speech recognition result but also, for example, from a procedure designation made by the control unit 51 in response to an operation of the operation unit 61. Furthermore, when information on the procedure is available from an external database server (not shown), the determination unit 59 may identify the procedure based on a designation from the control unit 51 that acquired that information.
 After the procedure has been identified, when the doctor or another speaker says "start", the determination unit 59 identifies that doctor in step S23. If a lookup table corresponding to the identified doctor exists, the determination unit 59 selects it; if not, it selects the lookup table for "liver resection" that is common to all operators, and performs phase determination with it.
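The fallback in step S23 from an operator-specific table to the procedure-common table might be sketched as follows. Table contents and operator names are hypothetical.

```python
# Per-procedure tables keyed by operator; "common" is the shared fallback.
tables = {
    "liver resection": {
        "common": {"start": 1, "start resection": 2},
        "Dr. A": {"let's go": 1, "resecting now": 2},  # Dr. A's habitual phrasing
    }
}

def select_table(procedure, operator):
    """Prefer the operator-specific table; fall back to the common one."""
    per_procedure = tables[procedure]
    return per_procedure.get(operator, per_procedure["common"])

print(select_table("liver resection", "Dr. A")["let's go"])  # → 1
print(select_table("liver resection", "Dr. X")["start"])     # → 1
```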
 The subsequent processing of steps S3, S4, and S6 to S8 is the same as in the first embodiment.
 As described above, the present embodiment provides the same effects as the first embodiment and, in addition, determines phases by referring to per-procedure and per-operator lookup tables, which improves determination accuracy.
(Third Embodiment)
 FIG. 10 is a flowchart showing the operation flow employed in the third embodiment of the present invention. In FIG. 10, steps identical to those in FIG. 4 are given the same reference numerals, and their description is omitted. The hardware configuration of this embodiment is the same as in FIG. 1. In the first and second embodiments, only the time-series phases of each procedure were registered in the lookup table. The present embodiment shows an example in which not only time-series phases but also phases that can occur at arbitrary timing are registered in the lookup table.
 FIG. 11 is an explanatory diagram showing an example of the contents of the lookup table employed in this embodiment. The lookup table of FIG. 11 is stored in the storage unit 60 and, as in FIG. 3, registers the relationship between each phase of the procedure (Phase1, Phase2, ...) and the corresponding phrases. In addition, as indicated by the circles in the non-time-series column, the lookup table of FIG. 11 registers phases that do not necessarily occur in time-series order. In the example of FIG. 11, a bleeding phase, a hemostasis phase, and an irrigation phase are registered as examples of such phases. When the speech recognition result is "bleeding", "hemostasis", or "irrigation", respectively, it is determined that the corresponding phase has been entered.
 In step S3 of FIG. 10, the procedure phase estimation unit 59a of the determination unit 59 determines whether a phase corresponding to the speech recognition result exists and, if so, advances the processing to step S31. In step S31, the determination unit 59 determines whether the phase based on the speech recognition result is one that occurs in time-series order. For example, when a time-series phase is estimated, such as the phase (Phase3) corresponding to "start division", the determination unit 59 determines in the next step S4 whether it is the correct phase in the time series.
 On the other hand, when a phase that does not necessarily occur in time-series order is estimated, such as the hemostasis phase corresponding to "hemostasis", the determination unit 59 advances the processing from step S31 to step S5 and identifies the speaker without performing the time-series determination.
 The other operations are the same as in the first embodiment. Thus, in this embodiment, even for a phase that may occur at arbitrary timing, such as "hemostasis", index information corresponding to the occurrence of that phase can reliably be added to the recorded data.
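The branch at step S31 can be sketched as follows: a time-series phase is accepted only when it directly follows the current phase, while a non-time-series phase (bleeding, hemostasis, irrigation) is accepted at any time. The phase names follow FIG. 11; the specific acceptance rule is an assumption of this sketch.

```python
NON_TIME_SERIES = {"bleeding", "hemostasis", "irrigation"}
TIME_SERIES_ORDER = ["start", "start resection", "start division"]

def confirm_phase(recognized, current_index):
    """Return (accepted, new_index). Non-time-series phases never advance the index."""
    if recognized in NON_TIME_SERIES:
        return True, current_index            # step S31 → S5: skip the order check
    if recognized in TIME_SERIES_ORDER:
        expected = current_index + 1
        if TIME_SERIES_ORDER.index(recognized) == expected:
            return True, expected             # step S4: correct next phase
    return False, current_index               # out of order or unknown phrase

print(confirm_phase("hemostasis", 0))        # → (True, 0)
print(confirm_phase("start resection", 0))   # → (True, 1)
print(confirm_phase("start division", 0))    # → (False, 0)
```

Keeping the time-series index unchanged on a non-time-series phase lets the procedure resume at the correct next phase after an interruption such as bleeding.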
 As described above, the present embodiment provides the same effects as the first embodiment and, in addition, can reliably determine even phases that do not necessarily occur in time-series order and add their index information to the recorded data.
 Although this embodiment has been described as applied to the first embodiment, it can clearly be applied to the second embodiment as well by including, in the per-procedure and per-operator lookup tables, information on whether each phase is a non-time-series phase.
 In each of the above embodiments, speech recognition and phase determination are performed by the voice analysis unit 58 and the determination unit 59, but these processes may instead be performed using artificial intelligence. For example, the captured audio may be supplied to an external server with artificial intelligence functions, which determines the content of the conversation and thereby the phase; the control unit then takes in the determination result and adds the index information to the recorded image.
 The present invention is not limited to the above embodiments as they stand; at the implementation stage, the components can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriate combinations of the plurality of components disclosed in the above embodiments. For example, some of the components shown in an embodiment may be omitted. Furthermore, components from different embodiments may be combined as appropriate.
 This application claims priority based on Japanese Patent Application No. 2017-219040, filed in Japan on November 14, 2017, the disclosure of which is incorporated into the present specification and claims.

Claims (7)

  1.  An image recording device comprising:
     a video input unit that acquires a medical image;
     a sound acquisition unit that acquires speech uttered by medical personnel;
     a voice analysis unit that analyzes the speech acquired by the sound acquisition unit;
     a procedure phase estimation unit that estimates a phase of a procedure based on phrases in the analysis result of the voice analysis unit;
     a determination unit that finalizes the estimated phase based on a comparison between the change of the phase estimated by the procedure phase estimation unit and a preset time series of phases; and
     a recording data generation unit that, according to the determination result of the determination unit, adds index information associating the medical image with the phase to the medical image, generates recording data, and records the recording data in a recording unit.
  2.  The image recording device according to claim 1, wherein the procedure phase estimation unit estimates the phase by referring to a table in which phrases obtained from the analysis result of the voice analysis unit are associated with corresponding phases.
  3.  The image recording device according to claim 2, wherein the table is set for at least one of each procedure and each medical worker.
  4.  The image recording device according to claim 1, wherein, when the determination unit determines that the estimated phase is one that can occur at an arbitrary timing during the procedure, the determination unit finalizes the estimated phase without comparing the change of the phase with the time series of phases.
  5.  The image recording device according to claim 1, wherein the recording data generation unit records the index information in the recording unit as metadata of the recording data.
  6.  The image recording device according to claim 1, wherein the recording data generation unit records the index information in the recording unit as data separate from the recording data.
  7.  An image recording method comprising:
     a video input step of acquiring a medical image;
     a sound acquisition step of acquiring speech uttered by medical personnel;
     a voice analysis step of analyzing the speech acquired in the sound acquisition step;
     a procedure phase estimation step of estimating a phase of a procedure based on phrases in the analysis result of the voice analysis step;
     a determination step of finalizing the estimated phase based on a comparison between the change of the phase estimated in the procedure phase estimation step and a preset time series of phases; and
     a recording data generation step of, according to the determination result of the determination step, adding index information associating the medical image with the phase to the medical image, generating recording data, and recording the recording data in a recording unit.
PCT/JP2018/031838 2017-11-14 2018-08-28 Image recording device and image recording method WO2019097804A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2017219040 2017-11-14
JP2017-219040 2017-11-14

Publications (1)

Publication Number Publication Date
WO2019097804A1 true WO2019097804A1 (en) 2019-05-23

Family

ID=66538992

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/031838 WO2019097804A1 (en) 2017-11-14 2018-08-28 Image recording device and image recording method

Country Status (1)

Country Link
WO (1) WO2019097804A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006164251A (en) * 2004-11-09 2006-06-22 Toshiba Corp Medical information system, medical information system program, and medical information processing method for performing information processing for management of medical practice
JP2011167301A (en) * 2010-02-17 2011-09-01 Asahikawa Medical College Surgery video image accumulating device, surgery video image accumulating method, and program

Similar Documents

Publication Publication Date Title
JP5690450B2 (en) Image recording device
EP2682050B1 (en) Medical information recording apparatus
US20180092509A1 (en) Image recording device
WO2019181432A1 (en) Operation assistance system, information processing device, and program
WO2018163600A1 (en) Medical information management device, medical information management method, and medical information management system
JP2004181229A (en) System and method for supporting remote operation
WO2012165381A1 (en) Medical information recording device
JP2011036372A (en) Medical image recording apparatus
US10275905B2 (en) Color integration system for medical images and recording and color management apparatus for medical images
US11483515B2 (en) Image recording and reproduction apparatus, image recording method, and endoscope system
JPWO2015114901A1 (en) Medical video recording / playback system and medical video recording / playback device
JP2011036370A (en) Medical image recording apparatus
US20220008161A1 (en) Information processing device, presentation method, and surgical system
JPWO2017187676A1 (en) Control device, control method, program, and sound output system
WO2019097804A1 (en) Image recording device and image recording method
WO2014196292A1 (en) Medical assistance device and method for processing setting information for medical equipment by scene
US20070083480A1 (en) Operation information analysis device and method for analyzing operation information
JP2005124824A (en) Operation supporting system
JP7451707B2 (en) Control device, data log display method, and medical centralized control system
US20230149100A1 (en) Control apparatus, medical central control system, and surgery-related information display method
WO2022201800A1 (en) Surgical operation room system, image recording method, program, and medical information processing system
JP5551522B2 (en) Medical information recording device
JP2018175757A (en) Image recording regeneration device and endoscope system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18878461

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18878461

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP