Increasingly powerful technological developments in surgery, such as modern operating rooms (OR) featuring digital, interconnected and robotic devices, provide a huge amount of valuable data that can be used to improve patient therapy. Although a lot of data is available, the human ability to exploit it, especially in complex and time-critical situations such as surgery, is limited and depends heavily on the experience of the surgical staff. This talk focuses on AI-assisted surgery with a specific focus on the analysis of intraoperative video data. The goal is to democratize surgical skills and enhance the collaboration between surgeons and cyber-physical systems by quantifying surgical experience and making it accessible to machines. Several examples of optimizing the therapy of the individual patient along the surgical treatment path are given. Finally, remaining challenges and strategies to overcome them are discussed.
Endoscopic optical coherence tomography (OCT) enables the assessment of the eardrum and the middle ear in vivo. However, revealing the ossicles is often limited due to shadowing effects of preceding structures, and the 3D impression is difficult to interpret. To compare the identified middle ear structures, OCT and cone-beam CT of a patient were spatially aligned and showed good agreement in locating the malleus and the promontory wall. As CT imaging uses ionizing radiation and is thus limited in application, we furthermore provide a concept of how radiology can be utilized as a priori knowledge for OCT imaging. To this end, a statistical shape model derived from μCT data of temporal bone specimens was fitted to in vivo OCT measurements, potentially providing a real-time augmentation of endoscopic OCT for middle ear diagnostics in the future.
Computer-Assisted Surgery (CAS) aids the surgeon by enriching the surgical scene with additional information in order to improve patient outcome. One such aid may be the superimposition of important structures (such as blood vessels and tumors) over a laparoscopic image stream. In liver surgery, this may be achieved by creating a dense map of the abdominal environment surrounding the liver, registering a preoperative model (CT scan) to the liver within this map, and tracking the relative pose of the camera. Thereby, known structures may be rendered into images from the camera perspective. This intraoperative map of the scene may be constructed, and the relative pose of the laparoscope camera estimated, using Simultaneous Localisation and Mapping (SLAM). The intraoperative scene poses unique challenges, such as homogeneous surface textures, sparse visual features, specular reflections and camera motions specific to laparoscopy. This work compares the efficacies of two state-of-the-art SLAM systems in the context of laparoscopic surgery, on a newly collected phantom dataset with ground truth trajectory and surface data. The SLAM systems chosen contrast strongly in implementation: one sparse and feature-based, ORB-SLAM3, and one dense and featureless, ElasticFusion. We find that ORB-SLAM3 greatly outperforms ElasticFusion in trajectory estimation and is more stable on sequences from laparoscopic surgeries. However, when extended to give a dense output, ORB-SLAM3 performs surface reconstruction comparably to ElasticFusion. Our evaluation of these systems serves as a basis for expanding the use of SLAM algorithms in the context of laparoscopic liver surgery and Minimally Invasive Surgery (MIS) more generally.
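Trajectory accuracy in such comparisons is commonly reported as the absolute trajectory error (ATE) after rigidly aligning the estimated trajectory to the ground truth. The following is a minimal sketch of that metric (a standard technique, not code from the paper; function and variable names are illustrative):

```python
import numpy as np

def ate_rmse(gt, est):
    """Absolute trajectory error (RMSE) after rigid alignment (Kabsch/Horn).

    gt, est: (N, 3) arrays of temporally corresponding camera positions.
    """
    # Center both trajectories.
    mu_gt, mu_est = gt.mean(axis=0), est.mean(axis=0)
    gt_c, est_c = gt - mu_gt, est - mu_est
    # Optimal rotation via SVD of the cross-covariance matrix.
    U, _, Vt = np.linalg.svd(est_c.T @ gt_c)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = (U @ S @ Vt).T  # rotation mapping est into the gt frame
    aligned = est_c @ R.T + mu_gt
    return float(np.sqrt(np.mean(np.sum((gt - aligned) ** 2, axis=1))))
```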
The evaluation and trial of computer-assisted surgery systems is an important part of the development process. Since human and animal trials are difficult to perform and raise serious ethical concerns, artificial organs and phantoms have become a key component for testing clinical systems. For soft-tissue phantoms like the liver, it is important to match the organ's biomechanical properties as closely as possible. Organ phantoms are often created from silicone that is shaped in casting molds. Silicone is relatively cheap and the method does not rely on expensive equipment. One big disadvantage of silicone phantoms, however, is their high rigidity. To this end, we propose a new method for the generation of silicone phantoms with a softer and mechanically more accurate structure. Since we cannot change the rigidity of the silicone itself, we developed a new and easy method to weaken the structure of the silicone phantom. The key component is the repurposing of water-soluble support material from 3D FDM printing. We designed casting molds with an internal grid structure to reduce the rigidity of the phantom. The molds are printed with an FDM (Fused Deposition Modeling) printer entirely from water-soluble PVA (polyvinyl alcohol) material. After the silicone has hardened, the mold with the internal structure can be dissolved in water. The silicone phantom is then pervaded by a grid of cavities. Our experiments have shown that we can reduce the rigidity of the model by up to 70% of its original value. The rigidity of our silicone models is controlled simply by the size of the internal grid structure.
Providing the surgeon with the right assistance at the right time during minimally-invasive surgery requires computer-assisted surgery systems to perceive and understand the current surgical scene. This can be achieved by analyzing the endoscopic image stream. However, endoscopic images often contain artifacts, such as specular highlights, which can hinder further processing steps, e.g., stereo reconstruction, image segmentation, and visual instrument tracking. Hence, correcting them is a necessary preprocessing step. In this paper, we propose a machine learning approach for automatic specular highlight removal from a single endoscopic image. We train a residual convolutional neural network (CNN) to localize and remove specular highlights in endoscopic images using weakly labeled data. The labels merely indicate whether an image does or does not contain a specular highlight. To train the CNN, we employ a generative adversarial network (GAN), which introduces an adversary to judge the performance of the CNN during training. We extend this approach by (1) adding a self-regularization loss to reduce image modification in non-specular areas and by (2) including a further network to automatically generate paired training data from which the CNN can learn. A comparative evaluation shows that our approach outperforms model-based methods for specular highlight removal in endoscopic images.
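To make the training objective concrete, here is a heavily simplified PyTorch sketch of the described setup: a residual generator, a patch-level adversary, and a self-regularization term. All layer sizes and the loss weight are assumptions, and for brevity the self-regularization below penalizes changes over the whole image, whereas the paper restricts modification in non-specular areas:

```python
import torch
import torch.nn as nn

class ResidualGenerator(nn.Module):
    """Predicts a highlight-free image as input plus a learned correction."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        # Residual formulation: the network only learns the correction.
        return torch.clamp(x + self.body(x), 0.0, 1.0)

# Patch-level real/fake classifier acting as the adversary.
discriminator = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 1, 4, stride=2, padding=1),
)

bce = nn.BCEWithLogitsLoss()

def generator_loss(G, D, img, lambda_reg=10.0):
    fake = G(img)
    logits = D(fake)
    adv = bce(logits, torch.ones_like(logits))  # fool the adversary
    # Simplified self-regularization: change the input as little as possible
    # (the paper applies this only to non-specular regions).
    self_reg = (fake - img).abs().mean()
    return adv + lambda_reg * self_reg
```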
Andreas Fetzer, Jasmin Metzger, Darko Katic, Keno März, Martin Wagner, Patrick Philipp, Sandy Engelhardt, Tobias Weller, Sascha Zelzer, Alfred Franz, Nicolai Schoch, Vincent Heuveline, Maria Maleshkova, Achim Rettinger, Stefanie Speidel, Ivo Wolf, Hannes Kenngott, Arianeb Mehrabi, Beat Müller-Stich, Lena Maier-Hein, Hans-Peter Meinzer, Marco Nolden
KEYWORDS: Surgery, Data integration, Data modeling, Imaging informatics, Data storage, Standards development, Knowledge acquisition, Cognition, Knowledge management, Medical imaging, Image segmentation, Picture Archiving and Communication System, Information science, Neuroimaging, Data archive systems
In the surgical domain, individual clinical experience, which is derived in large part from past clinical cases, plays an important role in the treatment decision process. Simultaneously, the surgeon has to keep track of a large amount of clinical data, emerging from a number of heterogeneous systems during all phases of surgical treatment. This is complemented by the constantly growing knowledge derived from clinical studies and the literature. Recalling this vast amount of information at the right moment poses a growing challenge that should be supported by adequate technology.
While many tools and projects aim at sharing or integrating data from various sources or even provide knowledge-based decision support, to our knowledge no concept has been proposed that addresses the entire surgical pathway by accessing all available information in order to provide context-aware cognitive assistance. A semantic representation and central storage of data and knowledge is therefore a fundamental requirement.
We present a semantic data infrastructure for integrating heterogeneous surgical data sources based on a common knowledge representation. A combination of the Extensible Neuroimaging Archive Toolkit (XNAT) with semantic web technologies, standardized interfaces and a common application platform enables applications to access and semantically annotate data, perform semantic reasoning and eventually create individual context-aware surgical assistance.
The infrastructure meets the requirements of a cognitive surgical assistant system and has been successfully applied in various use cases. The system is based completely on free technologies and is available to the community as an open-source package.
KEYWORDS: Stereoscopic cameras, Endoscopes, Image registration, Endoscopy, Data modeling, 3D modeling, Surgery, Augmented reality, 3D acquisition, Cameras, Filtering (signal processing), Personal digital assistants, Imaging systems
The number of minimally invasive procedures is growing every year. These procedures are highly complex and very demanding for the surgeons. It is therefore important to provide intraoperative assistance to alleviate these difficulties. For most computer-assistance systems, like visualizing target structures with augmented reality, a registration step is required to map preoperative data (e.g. CT images) to the ongoing intraoperative scene. Without additional hardware, the (stereo-) endoscope is the prime intraoperative data source and with it, stereo reconstruction methods can be used to obtain 3D models from target structures. To link reconstructed parts from different frames (mosaicking), the endoscope movement has to be known. In this paper, we present a camera tracking method that uses dense depth and feature registration which are combined with a Kalman Filter scheme. It provides a robust position estimation that shows promising results in ex vivo and in silico experiments.
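As an illustration of the fusion idea (not the paper's actual state or measurement model), a linear Kalman filter with a constant-velocity state can combine per-frame position estimates from two sources, e.g. feature registration and dense depth registration. All matrices below are assumed values:

```python
import numpy as np

dt = 1.0 / 25.0                                  # assumed frame interval
F = np.eye(6); F[:3, 3:] = dt * np.eye(3)        # state transition: [pos, vel]
H = np.hstack([np.eye(3), np.zeros((3, 3))])     # we only measure position
Q = 1e-4 * np.eye(6)                             # process noise (assumed)
R_feat = 1e-3 * np.eye(3)                        # feature-registration noise (assumed)
R_dense = 5e-3 * np.eye(3)                       # dense-depth noise (assumed)

def predict(x, P):
    """Propagate state and covariance one frame forward."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, R):
    """Fuse one position measurement z with noise covariance R."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P

# Per frame: predict once, then update with each available source, e.g.
#   x, P = predict(x, P)
#   x, P = update(x, P, z_feature, R_feat)
#   x, P = update(x, P, z_dense, R_dense)
```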
Minimally-invasive interventions offer multiple benefits for patients, but also entail drawbacks for the surgeon. The goal of context-aware assistance systems is to alleviate some of these difficulties. Localizing and identifying anatomical structures, malignant tissue and surgical instruments through endoscopic image analysis is paramount for an assistance system, making online measurements and augmented reality visualizations possible. Furthermore, such information can be used to assess the progress of an intervention, thereby allowing for context-aware assistance. In this work, we present an approach for such an analysis. First, a given laparoscopic image is divided into groups of connected pixels, so-called superpixels, using the SEEDS algorithm. The content of a given superpixel is then described using information regarding its color and texture. Using a Random Forest classifier, we determine the class label of each superpixel. We evaluated our approach on a publicly available dataset for laparoscopic instrument detection and achieved a DICE score of 0.69.
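A rough sketch of such a pipeline, using the SEEDS implementation from OpenCV's contrib module and scikit-learn's Random Forest. The per-superpixel color and texture descriptors below are simple illustrative choices, not the exact features from the paper:

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def superpixel_features(img_bgr, n_superpixels=400):
    """Segment an image into SEEDS superpixels and describe each one."""
    h, w = img_bgr.shape[:2]
    seeds = cv2.ximgproc.createSuperpixelSEEDS(w, h, 3, n_superpixels, 4)
    seeds.iterate(img_bgr, 10)
    labels = seeds.getLabels()
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    grad = cv2.Laplacian(cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY), cv2.CV_32F)
    feats = []
    for sp in range(labels.max() + 1):
        mask = labels == sp
        color = hsv[mask].mean(axis=0)                    # mean H, S, V
        texture = [grad[mask].mean(), grad[mask].std()]   # crude texture cue
        feats.append(np.concatenate([color, texture]))
    return np.array(feats), labels

# Training on annotated frames (per-superpixel class labels y_train):
#   clf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
# Prediction on a new frame:
#   X, labels = superpixel_features(frame)
#   pred = clf.predict(X)     # one class label per superpixel
```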
The goal of computer-assisted surgery is to provide the surgeon with guidance during an intervention, e.g., using augmented reality. To display preoperative data, soft tissue deformations that occur during surgery have to be taken into consideration. Laparoscopic sensors, such as stereo endoscopes, can be used to create a three-dimensional reconstruction of stereo frames for registration. Due to the small field of view and the homogeneous structure of tissue, reconstructing just one frame, in general, will not provide enough detail to register preoperative data, since every frame only contains a part of an organ surface. A correct assignment to the preoperative model is possible only if the patch geometry can be unambiguously matched to a part of the preoperative surface. We propose and evaluate a system that combines multiple smaller reconstructions from different viewpoints to segment and reconstruct a large model of an organ. Using graphics processing unit-based methods, we achieved four frames per second. We evaluated the system with in silico, phantom, ex vivo, and in vivo (porcine) data, using different methods for estimating the camera pose (optical tracking, iterative closest point, and a combination). The results indicate that the proposed method is promising for on-the-fly organ reconstruction and registration.
The goal of computer-assisted surgery is to provide the surgeon with guidance during an intervention using augmented reality (AR). To display preoperative data correctly, soft tissue deformations that occur during surgery have to be taken into consideration. Optical laparoscopic sensors, such as stereo endoscopes, can produce a 3D reconstruction of single stereo frames for registration. Due to the small field of view and the homogeneous structure of tissue, reconstructing just a single frame in general will not provide enough detail to register and update preoperative data due to ambiguities. In this paper, we propose and evaluate a system that combines multiple smaller reconstructions from different viewpoints to segment and reconstruct a large model of an organ. By using GPU-based methods we achieve near real-time performance. We evaluated the system on an ex vivo porcine liver (4.21 mm ± 0.63) and on two synthetic silicone livers (3.64 mm ± 0.31 and 1.89 mm ± 0.19) using three different methods for estimating the camera pose (no tracking, optical tracking and a combination).
To date, cardiovascular surgery enables the treatment of a wide range of aortic pathologies. One of the current challenges in this field is the detection of patients at high risk for adverse aortic events, who should be treated electively. Reliable diagnostic parameters indicating the urgency of treatment have to be determined. Functional imaging by means of 4D phase-contrast magnetic resonance imaging (PC-MRI) enables the time-resolved measurement of blood flow velocity in 3D. Applied to aortic phantoms, three-dimensional blood flow properties and their relation to adverse dynamics can be investigated in vitro. Emerging "in silico" methods of numerical simulation can supplement these measurements by computing additional information on crucial parameters. We propose a framework that complements 4D PC-MRI imaging by means of numerical simulation based on the Finite Element Method (FEM). The framework is developed on the basis of a prototypic aortic phantom and validated by 4D PC-MRI measurements of the phantom. Based on physical principles of biomechanics, the derived simulation depicts aortic blood flow properties and characteristics. The framework might help identify factors that induce aortic pathologies such as aortic dilatation or aortic dissection. Alarming thresholds of parameters such as the wall shear stress distribution can be evaluated. The combined techniques of 4D PC-MRI and numerical simulation can be used as complementary tools for risk stratification of aortic pathology.
KEYWORDS: Image segmentation, RGB color model, Surgery, Laparoscopy, Shape analysis, Light sources and illumination, Endoscopy, In vivo imaging, Visualization, Tissues
One of the most complex and difficult tasks for surgeons during minimally invasive interventions is suturing. A prerequisite to assist the suturing process is the tracking of the needle. The endoscopic images provide a rich source of information which can be used for needle tracking. In this paper, we present an image-based method for markerless needle tracking. The method uses a color-based and geometry-based segmentation to detect the needle. Once an initial needle detection is obtained, a region of interest enclosing the extracted needle contour is passed on to a reduced segmentation. It is evaluated with in vivo images from da Vinci interventions.
A mitral valve reconstruction (MVR) is a complex operation in which the functionality of incompetent mitral valves is re-established by applying surgical techniques. This work deals with predictive biomechanical simulations of operation scenarios for an MVR, and the simulation's integration into a knowledge-based surgery assistance system. We present a framework for the definition of the corresponding surgical workflow, which combines semantically enriched surgical expert knowledge with a biomechanical simulation. Using an ontology, 'surgical rules' which describe decision and assessment criteria for surgical decision-making are represented in a knowledge base. Through reasoning, these 'rules' can then be applied to patient-specific data in order to be converted into boundary conditions for the biomechanical soft tissue simulation, which is based on the Finite Element Method (FEM). The simulation, which is implemented in the open-source C++ FEM software HiFlow3, is controlled via the Medical Simulation Markup Language (MSML), and makes use of High Performance Computing (HPC) methods to cope with real-time requirements in surgery. The simulation results are presented to surgeons to assess the quality of the virtual reconstruction and the consequential remedial effects on the mitral valve and its functionality. The whole setup has the potential to support the intraoperative decision-making process during MVR, where the surgeon usually has to make fundamental decisions under time pressure.
Michael Delles, Sebastian Schalck, Yves Chassein, Tobias Müller, Fabian Rengier, Stefanie Speidel, Hendrik von Tengg-Kobligk, Hans-Ulrich Kauczor, Rüdiger Dillmann, Roland Unterhinninghofen
Patient-specific blood pressure values in the human aorta are an important parameter in the management of cardiovascular diseases. A direct measurement of these values is only possible by invasive catheterization at a limited number of measurement sites. To overcome these drawbacks, two non-invasive approaches of computing patient-specific relative aortic blood pressure maps throughout the entire aortic vessel volume are investigated by our group. The first approach uses computations from complete time-resolved, three-dimensional flow velocity fields acquired by phase-contrast magnetic resonance imaging (PC-MRI), whereas the second approach relies on computational fluid dynamics (CFD) simulations with ultrasound-based boundary conditions. A detailed evaluation of these computational methods under realistic conditions is necessary in order to investigate their overall robustness and accuracy as well as their sensitivity to certain algorithmic parameters. We present a comparative study of the two blood pressure computation methods in an experimental phantom setup, which mimics a simplified thoracic aorta. The comparative analysis includes the investigation of the impact of algorithmic parameters on the MRI-based blood pressure computation and the impact of extracting pressure maps on a voxel grid from the CFD simulations. Overall, a very good agreement between the results of the two computational approaches can be observed, despite the fact that both methods used completely separate measurements as input data. The comparative study therefore indicates that both non-invasive pressure computation methods show excellent robustness and accuracy and can be used for research purposes in the management of cardiovascular diseases.
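As background (a generic sketch of the standard technique, not necessarily the authors' exact formulation), MRI-based relative pressure computation typically starts from the Navier-Stokes momentum balance, which expresses the pressure gradient in terms of the measured velocity field $\mathbf{v}$, blood density $\rho$ and dynamic viscosity $\mu$:

```latex
\nabla p = -\rho \left( \frac{\partial \mathbf{v}}{\partial t}
         + (\mathbf{v} \cdot \nabla)\,\mathbf{v} \right) + \mu \, \Delta \mathbf{v}
```

A relative pressure map is then obtained by spatially integrating this gradient over the vessel volume, commonly by solving the corresponding pressure Poisson equation.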
The increase of technological complexity in surgery has created a need for novel man-machine interaction techniques. Specifically, context-aware systems which automatically adapt themselves to the current circumstances in the OR have great potential in this regard. To create such systems, models of surgical procedures are vital, as they allow analyzing the current situation and assessing the context. For this purpose, we have developed a Surgical Process Model based on Description Logics. It incorporates general medical background knowledge as well as intraoperatively observed situational knowledge. The representation consists of three parts: the Background Knowledge Model, the Preoperative Process Model and the Integrated Intraoperative Process Model. All models depend on each other and create a concise view of the surgery. As a proof of concept, we applied the system to a specific intervention, the laparoscopic distal pancreatectomy.
Intraoperative tracking of laparoscopic instruments is a prerequisite to realize further assistance functions. Since endoscopic images are always available, this sensor input can be used to localize the instruments without special devices or robot kinematics. In this paper, we present image-based markerless 3D tracking of different da Vinci instruments in near real-time without an explicit model. The method is based on different visual cues to segment the instrument tip, calculates a tip point and uses a multiple object particle filter for tracking. The accuracy and robustness are evaluated with in vivo data.
Context-aware technologies have great potential to help surgeons during laparoscopic interventions. Their underlying idea is to create systems which can adapt their assistance functions automatically to the situation in the OR, thus relieving surgeons from the burden of managing computer-assisted surgery devices manually. For this purpose, a certain understanding of the current situation in the OR is essential. Beyond that, anticipatory knowledge of incoming events is beneficial, e.g. for early warnings of imminent risk situations. To achieve the goal of predicting surgical events based on previously observed ones, we developed a language to describe surgeries and surgical events using Description Logics and integrated it with methods from computational linguistics. Using n-grams to compute probabilities of follow-up events, we are able to make sensible predictions of upcoming events in real-time. The system was evaluated on professionally recorded and labeled surgeries and showed an average prediction rate of 80%.
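The n-gram idea reduces to counting observed event transitions and normalizing them into probabilities. A toy bigram sketch (event names are invented placeholders; the actual system describes events via Description Logics):

```python
from collections import Counter, defaultdict

def train_bigrams(surgeries):
    """Count follow-up events over a corpus of recorded surgeries."""
    counts = defaultdict(Counter)
    for events in surgeries:                     # each surgery: list of event labels
        for prev, nxt in zip(events, events[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_event):
    """Most probable follow-up event with its estimated probability."""
    followups = counts[prev_event]
    total = sum(followups.values())
    event, n = followups.most_common(1)[0]
    return event, n / total

logs = [["incision", "dissection", "clipping", "cutting"],
        ["incision", "dissection", "coagulation", "clipping", "cutting"]]
model = train_bigrams(logs)
print(predict_next(model, "clipping"))           # ('cutting', 1.0)
```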
Minimally invasive surgery is a highly complex medical discipline with several difficulties for the surgeon. To alleviate these difficulties, augmented reality can be used for intraoperative assistance. For visualization, the endoscope pose must be known, which can be acquired with a SLAM (Simultaneous Localization and Mapping) approach using the endoscopic images. In this paper, we focus on feature tracking for SLAM in minimally invasive surgery. Robust feature tracking and minimization of false correspondences is crucial for localizing the endoscope. As sensory input we use a stereo endoscope and evaluate different feature types in a developed SLAM framework. The accuracy of the endoscope pose estimation is validated with synthetic and ex vivo data. Furthermore, we test the approach with in vivo image sequences from da Vinci interventions.
In order to provide real-time intraoperative guidance, computer-assisted surgery (CAS) systems often rely on computationally expensive algorithms. The real-time constraint is especially challenging if several components such as intraoperative image processing, soft tissue registration or context-aware visualization are combined in a single system. In this paper, we present a lightweight approach to distribute the workload over several workstations based on the OpenIGTLink protocol. We use XML-based message passing for remote procedure calls and native types for transferring data such as images, meshes or point coordinates. Two different, but typical scenarios are considered in order to evaluate the performance of the new system. First, we analyze a real-time soft tissue registration algorithm based on a finite element (FE) model. Here, we use the proposed approach to distribute the computational workload between a primary workstation that handles sensor data processing and visualization and a dedicated workstation that runs the real-time FE algorithm. We show that the additional overhead introduced by the technique is small compared to the total execution time. Furthermore, the approach is used to speed up a context-aware augmented-reality-based navigation system for dental implant surgery. In this scenario, the additional delay for running the computationally expensive reasoning server on a separate workstation is less than a millisecond. The results show that the presented approach is a promising strategy to speed up real-time CAS systems.
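To illustrate the flavor of XML-based remote procedure calls between workstations, here is a deliberately simplified sketch. It is not the OpenIGTLink wire format (the OpenIGTLink C++ library or pyigtl would be used in practice); the host name, port and method names are invented:

```python
import socket
import xml.etree.ElementTree as ET

def send_rpc(host, port, method, **params):
    """Send a length-prefixed XML command message and read the XML reply."""
    root = ET.Element("rpc", attrib={"method": method})
    for key, value in params.items():
        ET.SubElement(root, "param", attrib={"name": key}).text = str(value)
    payload = ET.tostring(root)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(len(payload).to_bytes(4, "big") + payload)
        reply_len = int.from_bytes(sock.recv(4), "big")
        return sock.recv(reply_len)  # single recv for brevity; a robust client loops

# e.g. ask a dedicated FE workstation to run one registration step
# (hypothetical host and method; bulk data such as meshes would be
# transferred natively rather than embedded in the XML):
# send_rpc("fe-server.local", 18944, "solve_fe_step", mesh_id=42, timestep=0.01)
```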
Minimally invasive surgery is medically complex and can heavily benefit from computer assistance. One way to help the surgeon is to integrate preoperative planning data into the surgical workflow. This information can be represented as a customized preoperative model of the surgical site. To use it intraoperatively, it has to be updated during the intervention due to the constantly changing environment. Hence, intraoperative sensor data has to be acquired and registered with the preoperative model. Haptic information, which could complement the visual sensor data, is not yet established. In addition, biomechanical modeling of the surgical site can help in reflecting the changes which cannot be captured by intraoperative sensors.
We present a setting where a force sensor is integrated into a laparoscopic instrument. In a test scenario using a silicone liver phantom, we register the measured forces with a surface model reconstructed from stereo endoscopic images and a finite element model. The endoscope, the instrument and the liver phantom are tracked with a Polaris optical tracking system. By fusing this information, we can transfer the deformation onto the finite element model. The purpose of this setting is to demonstrate the principles needed and the methods developed for intraoperative sensor data fusion. One emphasis lies on the calibration of the force sensor with the instrument and first experiments with soft tissue. We also present our solution and first results concerning the integration of the force sensor as well as the accuracy of the fusion of force measurements, surface reconstruction and biomechanical modeling.
Organ motion due to respiration and contact with surgical instruments can significantly degrade the accuracy of image-guided surgery. In most applications the ensuing soft tissue deformations have to be compensated in order to register preoperative planning data to the patient. Biomechanical models can be used to perform an accurate registration based on sparse intraoperative sensor data. Using elasticity theory, the approach can be formulated as a boundary value problem with displacement boundary conditions. In this paper, several models of the liver from the literature and a new simplified model are evaluated with regard to their application to intraoperative soft tissue registration. We construct finite element models of a liver phantom using the different material laws. Thereafter, typical deformation patterns that occur during surgery are imposed by applying displacement boundary conditions. A comparative numerical study shows that the maximal registration error of all non-linear models stays below 1.1 mm, while the linear model produces errors up to 3.9 mm. It can be concluded that linear elastic models are not suitable for the registration of the liver and that a geometrically non-linear formulation has to be used. Although the stiffness parameters of the non-linear materials differ considerably, the calculated displacement fields are very similar. This suggests that a difficult patient-specific parameterization of the model might not be necessary for intraoperative soft tissue registration. We also demonstrate that the new simplified model achieves nearly the same registration accuracy as complex quasi-linear viscoelastic models.
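For reference, the boundary value problem mentioned above reads, in the geometrically linear elastostatic case (a generic sketch; the paper's non-linear material laws replace the stress-strain relation below):

```latex
\begin{aligned}
\nabla \cdot \boldsymbol{\sigma} &= \mathbf{0} \quad \text{in } \Omega
  && \text{(static equilibrium, no body forces)}\\
\boldsymbol{\sigma} &= \lambda \operatorname{tr}(\boldsymbol{\varepsilon})\,\mathbf{I}
  + 2\mu\,\boldsymbol{\varepsilon},
  \qquad \boldsymbol{\varepsilon}
  = \tfrac{1}{2}\bigl(\nabla \mathbf{u} + \nabla \mathbf{u}^{\mathsf{T}}\bigr)\\
\mathbf{u} &= \bar{\mathbf{u}} \quad \text{on } \Gamma_D
  && \text{(displacements prescribed from sensor data)}
\end{aligned}
```

A geometrically non-linear formulation, as the results above call for, replaces the small-strain tensor $\boldsymbol{\varepsilon}$ with the Green-Lagrange strain $\mathbf{E} = \tfrac{1}{2}(\nabla\mathbf{u} + \nabla\mathbf{u}^{\mathsf{T}} + \nabla\mathbf{u}^{\mathsf{T}}\nabla\mathbf{u})$.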
One of the main challenges related to computer-assisted laparoscopic surgery is the accurate registration of pre-operative planning images with the patient's anatomy. One popular approach for achieving this involves intraoperative 3D reconstruction of the target organ's surface with methods based on multiple view geometry. The latter, however, require robust and fast algorithms for establishing correspondences between multiple images of the same scene. Recently, the first endoscope based on Time-of-Flight (ToF) camera technology was introduced. It generates dense range images with high update rates by continuously measuring the run-time of intensity-modulated light. While this approach yielded promising results in initial experiments, the endoscopic ToF camera has not yet been evaluated in the context of related work. The aim of this paper was therefore to compare its performance with different state-of-the-art surface reconstruction methods on identical objects. For this purpose, surface data from a set of porcine organs as well as organ phantoms was acquired with four different cameras: a novel Time-of-Flight (ToF) endoscope, a standard ToF camera, a stereoscope, and a High Definition Television (HDTV) endoscope. The resulting reconstructed partial organ surfaces were then compared to corresponding ground truth shapes extracted from computed tomography (CT) data using a set of local and global distance metrics. The evaluation suggests that the ToF technique has high potential as a means for intraoperative endoscopic surface registration.
Minimally invasive surgery is a medically complex discipline that can heavily benefit from computer assistance. One way to assist the surgeon is to blend useful information about the intervention into the surgical view using Augmented Reality. This information can be obtained during preoperative planning and integrated into a patient-tailored model of the intervention. Due to soft tissue deformation, intraoperative sensor data such as endoscopic images have to be acquired and non-rigidly registered with the preoperative model to adapt it to local changes.
Here, we focus on a procedure that reconstructs the organ surface from stereo endoscopic images with millimeter accuracy in real-time. It covers stereo camera calibration, pixel-based correspondence analysis, 3D reconstruction and point cloud meshing. Accuracy, robustness and speed are evaluated with images from a test setting as well as intraoperative images. We also present a workflow where the reconstructed surface model is registered with a preoperative model using an optical tracking system. As a preliminary result, we show an initial overlay between an intraoperative and a preoperative surface model that leads to a successful rigid registration between these two models.
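The reconstruction chain described here corresponds closely to the classic OpenCV stereo pipeline. A hedged sketch (calibration is assumed to be done offline; matcher parameters are illustrative, not the paper's values):

```python
import cv2
import numpy as np

def reconstruct_surface(left, right, K1, D1, K2, D2, R, T):
    """Rectify, match and triangulate one stereo frame pair.

    K*, D*: per-camera intrinsics/distortion; R, T: extrinsics between the
    cameras, all from an offline stereo calibration.
    """
    size = (left.shape[1], left.shape[0])
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, size, R, T)
    m1x, m1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, size, cv2.CV_32FC1)
    m2x, m2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, size, cv2.CV_32FC1)
    left_r = cv2.remap(left, m1x, m1y, cv2.INTER_LINEAR)
    right_r = cv2.remap(right, m2x, m2y, cv2.INTER_LINEAR)

    # Pixel-based correspondence analysis via semi-global block matching.
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    disp = sgbm.compute(cv2.cvtColor(left_r, cv2.COLOR_BGR2GRAY),
                        cv2.cvtColor(right_r, cv2.COLOR_BGR2GRAY))
    disp = disp.astype(np.float32) / 16.0        # SGBM returns fixed-point values

    # 3D reconstruction: reproject valid disparities into a metric point cloud.
    points = cv2.reprojectImageTo3D(disp, Q)
    return points[disp > 0]                      # (N, 3); meshing would follow
```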
Minimally invasive surgery is a highly complex medical discipline and can be regarded as a major breakthrough in surgical technique. A minimally invasive intervention requires enhanced motor skills to deal with difficulties like the complex hand-eye coordination and restricted mobility. To alleviate these constraints we propose to enhance the surgeon's capabilities by providing context-aware assistance using augmented reality techniques. To recognize and analyze the current situation for context-aware assistance, we need intraoperative sensor data and a model of the intervention. Characteristics of a situation are the performed activity, the used instruments, the surgical objects and the anatomical structures. Important information about the surgical activity can be acquired by recognizing the surgical gesture performed. Surgical gestures in minimally invasive surgery like cutting, knot-tying or suturing are here referred to as surgical skills. We use the motion data from the endoscopic instruments to classify and analyze the performed skill and even use it for skill evaluation in a training scenario. The system uses Hidden Markov Models (HMM) to model and recognize a specific surgical skill like knot-tying or suturing with an average recognition rate of 92%.
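A minimal sketch of such HMM-based skill classification, e.g. with the hmmlearn package: one Gaussian HMM is trained per skill on instrument motion sequences, and a new sequence is assigned to the skill whose model scores it highest. The feature choice (e.g. 3D tip positions) and model sizes are assumptions:

```python
import numpy as np
from hmmlearn import hmm

def train_skill_models(sequences_by_skill, n_states=5):
    """Fit one Gaussian HMM per skill; seqs are lists of (T_i, 3) arrays."""
    models = {}
    for skill, seqs in sequences_by_skill.items():
        X = np.vstack(seqs)                      # stacked observations
        lengths = [len(s) for s in seqs]         # per-sequence lengths
        m = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[skill] = m
    return models

def classify(models, seq):
    """Pick the skill whose HMM gives the highest log-likelihood; seq: (T, 3)."""
    return max(models, key=lambda skill: models[skill].score(seq))

# e.g. models = train_skill_models({"suturing": [...], "knot_tying": [...]})
#      label = classify(models, new_motion_sequence)
```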
Minimally invasive surgery is nowadays a frequently applied technique and can be regarded as a major breakthrough in surgery. The surgeon has to adopt special operation techniques and deal with difficulties like the complex hand-eye coordination and restricted mobility. To alleviate these constraints we propose to enhance the surgeon's capabilities by providing context-aware assistance using augmented reality techniques. To analyze the current situation for context-aware assistance, we need intraoperatively gained sensor data and a model of the intervention. A situation consists of information about the performed activity, the used instruments, the surgical objects and the anatomical structures, and defines the state of an intervention at a given moment in time. The endoscopic images provide a rich source of information which can be used for an image-based analysis. Different visual cues are observed in order to perform an image-based analysis with the objective of gaining as much information as possible about the current situation. An important visual cue is the automatic recognition of the instruments which appear in the scene. In this paper we present the classification of minimally invasive instruments using the endoscopic images. The instruments are not modified by markers. The system segments the instruments in the current image and recognizes the instrument type based on three-dimensional instrument models.
Minimally invasive surgery has gained significantly in importance over the last decade due to its numerous advantages on the patient side. The surgeon has to adopt special operation techniques and deal with difficulties like the complex hand-eye coordination, limited field of view and restricted mobility. To alleviate these constraints we propose to enhance the surgeon's capabilities by providing context-aware assistance using augmented reality (AR) techniques. In order to generate context-aware assistance it is necessary to recognize the current state of the intervention using intraoperatively gained sensor data and a model of the surgical intervention. In this paper we present the recognition of risk situations: the system warns the surgeon if an instrument gets too close to a risk structure. The context-aware assistance system starts with an image-based analysis to retrieve information from the endoscopic images. This information is classified and a semantic description is generated. The description is used to recognize the current state and launch an appropriate AR visualization. In detail, we present automatic vision-based instrument tracking to obtain the positions of the instruments. Situation recognition is performed using a knowledge representation based on a description logic system. Two augmented reality visualization programs are realized to warn the surgeon if a risk situation occurs.
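The geometric core of such a proximity warning can be illustrated with a simple distance test between the tracked instrument tip and points sampled on a risk structure (a toy sketch; the safety margin and the AR callback are invented):

```python
import numpy as np

def too_close(tip_xyz, risk_points, margin_mm=10.0):
    """tip_xyz: (3,), risk_points: (N, 3), both in the same (e.g. camera) frame.

    Returns whether the tip is within the safety margin, and the distance.
    """
    d = np.linalg.norm(risk_points - tip_xyz, axis=1).min()
    return d < margin_mm, d

# warn, dist = too_close(tracked_tip, risk_structure_vertices)
# if warn: overlay_warning(dist)   # hypothetical AR visualization call
```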
Minimally invasive surgery is a highly complex medical discipline with various risks for surgeon and patient, but also numerous advantages on the patient side. The surgeon has to adopt special operation techniques and deal with difficulties like the complex hand-eye coordination, limited field of view and restricted mobility. To alleviate these problems, we propose to support the surgeon's spatial cognition by using augmented reality (AR) techniques to directly visualize virtual objects in the surgical site. In order to generate intelligent support, it is necessary to have an intraoperative assistance system that recognizes the surgical skills during the intervention and provides context-aware assistance to the surgeon using AR techniques. With MEDIASSIST we bundle our research activities in the field of intraoperative intelligent support and visualization. Our experimental setup consists of a stereo endoscope, an optical tracking system and a head-mounted display for 3D visualization. The framework will be used as a platform for the development and evaluation of our research in the field of skill recognition and context-aware assistance generation. This includes methods for surgical skill analysis, skill classification, context interpretation as well as assistive visualization and interaction techniques. In this paper we present the objectives of MEDIASSIST and first results in the fields of skill analysis, visualization and multi-modal interaction. In detail, we present markerless instrument tracking for surgical skill analysis as well as visualization techniques and recognition of interaction gestures in an AR environment.