Soft Robot. 2017 December 1; 4(4): 324–337.
Published online 2017 December 1. doi:  10.1089/soro.2016.0065
PMCID: PMC5734182

Nonparametric Online Learning Control for Soft Continuum Robot: An Enabling Technique for Effective Endoscopic Navigation


Bioinspired robotic structures comprising soft actuation units have attracted increasing research interest. Taking advantage of their inherent compliance, soft robots can assure safe interaction with external environments, provided that precise and effective manipulation can be achieved. Endoscopy is a typical application. However, previous model-based control approaches often require simplified geometric assumptions on the soft manipulator, which can be very inaccurate in the presence of unmodeled external interaction forces. In this study, we propose a generic control framework based on nonparametric and online, as well as local, training to learn the inverse model directly, without prior knowledge of the robot's structural parameters. Detailed experimental evaluation was conducted on a soft robot prototype with control redundancy, performing trajectory tracking in dynamically constrained environments. Advanced element formulation of finite element analysis is employed to initialize the control policy, hence eliminating the need for random exploration in the robot's workspace. The proposed control framework enabled a soft fluid-driven continuum robot to follow a 3D trajectory precisely, even under dynamic external disturbance. Such enhanced control accuracy and adaptability would facilitate effective endoscopic navigation in complex and changing environments.

Keywords: endoscopic navigation, finite element analysis, inverse transition model, soft robot control


Design of nature-inspired manipulators actuated based on soft material properties has become one of the most engaged research areas in robotics.1 Soft robots embedded with delicate chambers can be driven by fluidic input,1–4 resulting in functional deformations such as bending and elongation/shortening.5 Owing to the limber robotic structure, such manipulation assures high compliance within a confined region, facilitating versatile interaction with surrounding objects.6,7 These features could have an impact on many robotic applications demanding safe interaction within a dynamic environment, such as soft tissue in minimally invasive surgery.8,9 Therefore, endoscopy is one of the timely applications.

Conventional endoscopes predominantly comprise a metallic skeleton driven by steel cables, governing the kinematics of a series of bending mechanisms. This design inevitably induces high friction and is susceptible to fatigue failure after prolonged service. These metallic structures also come with high rigidity at the scope tip that may increase the risk of causing trauma or even perforation when the scope is forcefully pushed against the wall of a confined lumen or cavity.10 This has motivated the development of soft robotic instruments for surgical interventions,11–14 which can also be disposable to ensure zero risk of endoscopy-related infection transmission. Endotics11,12 was the first system developed for the purpose of pain-free colonoscopy. Its novel locomotion scheme attempted to prevent the formation of complicated looping at the sigmoid/descending colon. As a result, its single-segment bending is capable of omnidirectional endoscopic exploration along the colon. Aer-O-Scope13 was another commercial colonoscope relying on a simple approach making use of single-segment bending, which is combined with effective locomotion. The STIFF-FLOP soft robot9,14 was another milestone in keyhole surgery to offer intracavitary exploration using a soft-material robot validated in a cadaveric trial for the first time.

Soft robotic endoscopes have brought several research directions into the limelight. Various control approaches have also been developed to master the dexterity of such manipulators, giving rise to agile and responsive telemanipulation. Paramount to surgical safety, having a decent control performance in the presence of a confined and dynamic environment is also essential. Therefore, much research effort15–18 has been devoted to deriving analytical models with the aim of describing or predicting the robot kinematic/dynamic behavior,19 akin to controlling conventional rigid-link robots. However, these analytical models are complex due to the intrinsic nonlinear hyperelastic property of soft elastomeric materials that constitute the robot body. Any additional control dimensionality of the soft robot would further exacerbate the complexity of such kinematic equations.16

To simplify the modeling process, the piecewise constant curvature (PCC) assumption is one of the widely used techniques15,16,18,20 to obtain closed-form solutions.21,22 This enables real-time kinematic control of curvature discrepancy to attain the desired pose23 and to perform dynamic motion primitives24 for fluidically driven soft continuum robots. The parameters that govern the analytical models can also be estimated online.25 Other model-based methods have been proposed without the PCC assumption, such as approximation of trunk-like structures as an infinite degree-of-freedom (DoF) system26 and spring–mass modeling techniques,27,28 which can be incorporated in a hierarchical controller for generating stereotyped motions of an octopus-like manipulator.27 Recently, the Cosserat theory29 of elasticity has been used to predict underwater motion of a cable-driven, octopus-like soft robot30 by deducing its geometrically exact formulations.

Yet, external disturbance to the robot, such as gravity, payload, and external interaction, can promptly invalidate those assumptions. These oversimplified assumptions would substantially degrade the model's reliability in real applications. Moreover, structural parameters in the kinematics have to be determined before the modeling process. The search for these invariant coefficients is heuristic in nature. This might induce further complications when mapping the robot motion analytically. In addition, such invariants can only hold upon slight modification of the robot as they possess strong correlation with the robot's mechanical structure. Inevitably, the analytical model has to be revisited after any major change to the robot structure, further diminishing the effectiveness of such an approach.

With the foreseen difficulty of developing the analytical/kinematic model, research attempts were made to control the soft pliable robot using nonparametric learning-based approaches. The idea is to obtain forward/inverse mappings for kinematic/dynamic robot control based on measurement data only. Model-free control methods can also be developed based on direct modeling architecture,31 where the inverse mapping is directly obtained. This mapping depicts the inverse transition model of the robot, which could be a changing function due to the contact between the robot and the environments, such as soft tissue.

The use of neural networks (NNs) has been proposed to globally approximate the inverse mapping between end-effector and robot actuation.32,33 Such an approach can compensate for uncertainties in robot dynamics32 and has been demonstrated to yield even more reliable solutions when compared with using an analytical model of a cable-driven soft robot.33 Previous studies of NNs mostly consider simplified scenarios, such as a nonredundant manipulator and contact-free situation.32,33 Although redundantly actuated robotic systems can be controlled in lower dimensionality in a hierarchical manner, it may require predefined movement patterns (primitives) for specific task goals.27

Moreover, there has been a great demand for using machine learning approaches to address the change in inverse mapping of the hyperelastic robot upon contact.1 A Jacobian-based model-free controller has shown its capabilities to manipulate a planar, cable-driven continuum robot in an environment with static constraints.34 However, there is still no example that demonstrates manipulation of a redundantly actuated soft continuum robot in three-dimensional (3D) space that is adaptive to unknown external disturbance.

In this article, we propose a control framework based on a nonparametric local learning technique. Nonparametric local learning methods, such as those described by Nguyen et al. and Peters et al.,35,36 possess the ability to learn the high-dimensional inverse transition of rigid-link robots. The essence of nonparametric local methods is to construct a batch of locally weighted models that collectively approximate the inverse mapping. Each of these models is spawned and updated in an independent manner such that the overall architecture can be rapidly transformed to accommodate new input data. Meanwhile, the weighted global approximation can be optimized on the fly to remain consistent with the desired control behavior.36 Such a nonparametric local learning approach can thus facilitate fast online correction of the learning model.37 Therefore, the proposed framework is suitable for providing a rapid response to soft robot manipulation within constrained environments.

Workspace exploration is a prerequisite to collect pretraining data for learning the proposed controller. It is desirable to have accurate enough kinematic data to initialize the controller offline since it is impractical to carry out robot exploration in the confined transluminal workspace. We propose to use finite element analysis (FEA) to sample the kinematic data for the offline learning process. FEA has been widely used in design optimization and miniaturization of soft robots.13 Not only can the FEA accurately predict the highly deformable behaviors but it can also provide data for characterization of inverse kinematic relationships for control.38 However, the application of FEA to robotic control has only been minimally investigated in continuum structure with small deformation.38,39 The major contributions of this work are as follows:

  •  It is the first attempt to exploit an online nonparametric local learning technique with the aim of directly approximating the inverse kinematics of a redundantly actuated, fluid-driven endoscope prototype for soft robot control in 3D space (see the Methods section).
  •  Integration of FEA into the online learning method is implemented to initialize a reliable inverse model offline before deployment of the proposed controller in practical scenarios (see the Experiments, Results, and Discussion section).
  •  Experimental validation of the control performance and adaptability is conducted to demonstrate 3D trajectory tracking (mean error <2.49°) of a soft continuum robot even under dynamic external disturbance (see the Experiments, Results, and Discussion section).


Design of soft endoscope prototype

A generic, fluidic-driven soft continuum robot made of RTV (Room Temperature Vulcanization) silicone rubber (Ecoflex 0050; Smooth-On, Inc.) is designed and fabricated to evaluate the proposed framework for endoscopic navigation (Fig. 1a). The soft robot comprises three cylindrical inflatable chambers, each covered by a helical Kevlar string layer with a pitch of 1 mm. This fiber-constrained structure was first proposed by Suzumori et al.,4,40 in which the helical constraint layer enforces axial anisotropic expansion of inflatable chambers so as to generate an effective bending moment when subject to pressure input. To enable effective endoscopic navigation, the three air chambers can be individually actuated by air or other fluid, facilitating a large panoramic workspace with a bending angle >150°. The slender robot configuration with 13-mm outer diameter and 93-mm length is also compatible with conventional endoscopes, which is of importance to dexterous manipulation inside a confined transluminal workspace.

FIG. 1.
(a) Soft robotic endoscope prototype made of silicone rubber. Its dimensions are compatible with the insertion tube of a conventional endoscope; (b) CAD/CAM model of the soft manipulator showing simulated helical strain-wrapping constraints around its individual ...

Fabrication of the robot involves three major phases: (1) three cylindrical air chambers are cast with RTV silicone in inner molds; (2) Kevlar strings are wrapped densely in a single helical structure along each soft chamber; and (3) additional layers of silicone are cast to house the three inflatable chambers into one. This could fix the strings against dislocation, even after numerous bending actions.

Characterization of robot motion transition

Gradual smooth regulation of the fluidic flow rate allows steady bending of the presented soft manipulator. It also allows rapid reaching of fluid pressure equilibrium, minimizing the residual motion generated during such fluidic actuation. During endoscopic navigation within small and confined spaces (e.g., duodenum), such quasi-static motion characteristic41 can facilitate effective precise targeting of the endoscopic camera or interventional tools (e.g., biopsy forceps or brush cytology) at the surgical regions of interest, thereby avoiding inadvertent damage to delicate tissue and potential discomfort to the patient.

To mathematically describe motion transition of the soft robot, let equation eq1 be the fluid pressure (at equilibrium) in the actuation chambers at time step k where U denotes the control space. Let equation eq2 be the state of the robot when the chambers are filled with the pressure of equation eq3 at equilibrium. This state corresponds to the distal tip position equation eq4 and orientation normal equation eq5 in the Cartesian space (Fig. 2), which are collectively represented by equation eq6. The forward transition model of the soft robot can be described by the following equation system:

equation eq7
FIG. 2.
Three robot configurations illustrating an example of localized inverse models. Assume that their tip directions si will undergo the same rotation equation eq98 (blue arrow) when proper pressure changes equation eq99 are applied, where equation eq100. In the case of configurations 1 and 2, ...

where equation eq8 is the difference of the fluid pressure. The motion transition function f is a continuous mapping that depends on the current state of the robot equation eq9. Compared with rigid-link robots where the robot state can be well defined by joint kinematics, it is difficult to describe the exact state of the soft robot. For example, model-based approaches approximate this robot state based on PCC15,16,18,20–25 and non-PCC26–30 constraints. The nonlinear function h transforms robot state equation eq10 to Cartesian representation equation eq11.

Typical endoscopic navigation requires delicate articulation of the distal tip so as to provide accurate positioning and easy access to the soft tissue lesion. A microcamera at the soft robot tip provides forward vision. Therefore, the operator can aim the distal tip at a lesion target on the luminal wall so as to guide the interventional instruments to deploy from the tip through the biopsy channel. This telemanipulated endoscopic navigation gives rise to a robot task space coordinate equation eq12 defined by its viewing direction (i.e., pitch and yaw angle). The system equation in Equation (1) can hence be extended to an actuation to task space mapping equation eq13 as follows:

equation eq14

where equation eq15 is the task space coordinate at time step equation eq16 after the change in fluid pressure equation eq17 is applied.

Inverse problem for online learning of task space control

Our control objective is to enable the operator to control displacement of the robot directly in the task space coordinate equation eq18 (i.e., the desired change in the robot tip orientation) with the use of a motion input device. The superscript “*” denotes the desired motion specified by users or other reference input. Thus, the controller is designed to approximate the inverse of the motion transition equation eq19 in Equation (2), that is, equation eq20, to estimate the required change in control input equation eq21 (as shown in Fig. 5). The inverse motion transition model equation eq22 heavily depends on the current robot state. However, the exact state equation eq23 cannot be directly measured due to its hyperflexibility and the interactions with enclosed workspace inside a patient's cavity. We sought to adopt the task space coordinates equation eq24, which would offer updated clues about the current robot state.

FIG. 5.
FEA-simulated kinematic data covering the entire workspace of the soft robot. The arrows illustrate the predicted movement of the robot tip when an arbitrary pressure change equation eq25 is applied. These data enable pretraining of a reasonable initial control policy ...

This approach is also of practical interest because these measurements are readily available in our control system. The task space coordinate equation eq26 can be tracked using advanced positional tracking systems. For example, electromagnetic (EM) tracking systems are commonly used in medical application to provide submillimeter-level tracking.42,43 Together with the actuator's input equation eq27, these online acquired data are presented to the learning algorithms to update the inverse mapping equation eq28 during robot run time.

equation eq29

Note that equation eq30 is the approximation of the true inverse mapping equation eq31. If dimensionality of the task space is smaller than that of the control space, theoretically there exist an infinite number of solutions of equation eq32 that result in the same task space displacement equation eq33. This leads to the ill-posed problem in learning the inverse mapping equation eq34.

Inverse model learning with multiple local controllers

Nonparametric local learning techniques have been applied to learn the ill-posed inverse problem, aiming to control redundantly actuated robots.31,44,45 Referring to Peters and Schaal,36 the inverse model of a rigid-link robot can be learnt using spatially localized nonparametric learning techniques given that the robot state is well defined by joint kinematics. In this study, spatial localization refers to the robot state equation eq35. Such a localization scheme is motivated by the hypothesis that the inverse problem would be well defined locally.36 This is because nonparametric learning techniques essentially average out the sampled data, so model learning based on nonconvex training datasets would give invalid solutions.36
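The effect of averaging over a nonconvex solution set can be seen in a toy example. The sketch below is not the soft robot considered in this study: it assumes a hypothetical planar two-link rigid arm with unit link lengths (all function names are invented for illustration), chosen only because its two inverse solutions are available in closed form. Both solutions reach the target, whereas their average does not.

```python
import math

L1, L2 = 1.0, 1.0  # hypothetical link lengths of a planar two-link arm


def forward(q1, q2):
    """Tip position of the planar two-link arm for joint angles q1, q2."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y


def inverse(x, y, elbow_sign):
    """Closed-form inverse kinematics; elbow_sign (+1 or -1) selects the branch."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2.0 * L1 * L2)
    q2 = elbow_sign * math.acos(max(-1.0, min(1.0, c2)))
    q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
    return q1, q2


def reach_error(q, target):
    """Distance between the tip reached with joint angles q and the target."""
    x, y = forward(*q)
    return math.hypot(x - target[0], y - target[1])


target = (1.0, 1.0)
sol_a = inverse(target[0], target[1], +1)   # one valid solution
sol_b = inverse(target[0], target[1], -1)   # the other valid solution
averaged = tuple(0.5 * (a + b) for a, b in zip(sol_a, sol_b))

print(reach_error(sol_a, target), reach_error(sol_b, target))
print(reach_error(averaged, target))  # clearly nonzero: the average is not a solution
```

Restricting the training data to a neighborhood in which only one solution branch is sampled removes this problem, which is the motivation for the localization described here.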

However, in the vicinity of equation eq36, the average of equation eq37 would be consistent with the average of the task space displacement equation eq38 (Fig. 2). Therefore, in a local region of a given equation eq39, the training dataset equation eq40 would become a convex set. This enables learning of inverse mapping in the vicinity of equation eq41 (Fig. 2). We approximate the local inverse mapping from the desired task space displacement to the actuation command as follows:

equation eq42

where equation eq43 is the parameter of the local inverse model. Each mapping serves as a local controller. Compared with Peters et al.,36 we do not include an intercept/bias term since the change of actuation command Δu should have zero mean. The computation of equation eq44 will be explained in the later context.

Online learning of the global controller

To approximate the global inverse mapping, we employ a linear combination of the locally learned mapping46:

equation eq45

This controller architecture allows straightforward one-iteration computation in each time step, in contrast to indirect modeling approaches.34 The number of local models n and the weight equation eq46, as well as the local controllers equation eq47, can be obtained in an online manner.
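A minimal sketch of how such a weighted combination could be evaluated in a single pass is given below. It assumes a 2D task space (pitch and yaw), three actuation chambers, and normalization of the weights by their sum. The normalization, the matrix values, and all function names are assumptions made for illustration and are not confirmed by the text above.

```python
def apply_local(beta, ds):
    """Multiply a 3x2 local inverse matrix by a 2D task-space displacement."""
    return [row[0] * ds[0] + row[1] * ds[1] for row in beta]


def global_command(weights, betas, ds_desired):
    """Blend the local controllers using normalized membership weights."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("no local controller is active")
    du = [0.0, 0.0, 0.0]
    for w, beta in zip(weights, betas):
        local_du = apply_local(beta, ds_desired)
        for j in range(3):
            du[j] += (w / total) * local_du[j]
    return du


# Two hypothetical local controllers and their current activations.
beta_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
beta_b = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]
du = global_command([0.75, 0.25], [beta_a, beta_b], [0.1, -0.2])
print(du)
```

Because the command is a direct weighted sum, no iterative solver is needed at run time; the cost per time step grows linearly with the number of local models n.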

For this purpose, the local forward model is learnt using locally weighted projection regression (LWPR),37 which offers piecewise linear function approximation, while it simultaneously determines the appropriate local region of each linear model. Each local forward model performs a linear mapping as follows:

equation eq48

where equation eq49 denotes the corresponding parameter. Each local region, namely the receptive field (RF), is shaped based on the membership function:

equation eq50

centered at equation eq51, where Di is the distance metric. Each membership function weights the corresponding locally learned inverse model in the controller (Eq. 6).
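The membership function itself is not reproduced in this text. A common choice in LWPR is a Gaussian kernel of the form w = exp(-0.5 (x - c)^T D (x - c)); the sketch below assumes that form with a diagonal D, and the symbol names x, c, and d_diag are illustrative rather than taken from the original equation.

```python
import math


def membership(x, center, d_diag):
    """Gaussian receptive-field activation with a diagonal distance metric."""
    quad = sum(d * (xi - ci) ** 2 for d, xi, ci in zip(d_diag, x, center))
    return math.exp(-0.5 * quad)


center = [0.0, 0.0]
query = [1.0, 0.0]
wide = membership(query, center, [0.5, 0.5])    # small D: large receptive field
narrow = membership(query, center, [8.0, 8.0])  # large D: small receptive field
print(wide, narrow)
```

Under this assumed form, the activation equals 1 at the center and decays faster for larger entries of D, so a smaller D corresponds to a larger RF.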

One advantage of LWPR is that it can automatically spawn new linear models and the corresponding RFs when new data lying outside all existing RFs are presented. Meanwhile, the center equation eq52 of each RF is determined by the input space of the new data through incremental learning, as is the total number of local regions n (Fig. 3). Each newly spawned RF is initialized with a diagonal distance metric Di value. This Di value will be updated throughout the incremental learning process to improve the overall regression accuracy and convergence rate. To prevent overfitting and allocation of too many RFs (i.e., an excessive n), a smaller initial Di value is preferred (i.e., larger RFs). Cross-validation is also employed in determining the initial Di, which is important to ensure that the forward model can be accurately reflected by piecewise linear regression.
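The spawning rule can be sketched as follows. This is a simplified illustration, not the LWPR library itself: it assumes a Gaussian activation, a hypothetical activation threshold W_GEN below which a sample counts as lying outside every RF, and it omits the incremental update of Di and of the local regression.

```python
import math

W_GEN = 0.2  # hypothetical activation threshold for spawning a new RF


class ReceptiveField:
    def __init__(self, center, d_init):
        self.center = list(center)
        self.d_diag = [d_init] * len(center)  # diagonal distance metric

    def activation(self, x):
        quad = sum(d * (xi - ci) ** 2
                   for d, xi, ci in zip(self.d_diag, x, self.center))
        return math.exp(-0.5 * quad)


def present_sample(fields, x, d_init):
    """Spawn an RF centered on x if no existing RF is activated above W_GEN."""
    if all(rf.activation(x) < W_GEN for rf in fields):
        fields.append(ReceptiveField(x, d_init))
    return fields


def count_fields(samples, d_init):
    """Number of RFs allocated after presenting all samples once."""
    fields = []
    for x in samples:
        present_sample(fields, x, d_init)
    return len(fields)


samples = [[0.0, 0.0], [0.3, 0.3], [3.0, 3.0]]
print(count_fields(samples, d_init=1.0))    # large RFs: fewer models
print(count_fields(samples, d_init=100.0))  # small RFs: more models
```

With the small initial Di, the second sample still falls inside the first RF, so only the distant third sample triggers a new model; with the large initial Di, every sample spawns its own RF.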

FIG. 3.
Example set of localized linear controllers that approximate the nonlinear inverse mapping equation eq53 of a 1D actuation equation eq54. The valid region of each spatially localized controller is centered at equation eq55 (denoted by plus sign), with the range parameterized by equation eq56 (colored ellipse ...

Despite the fact that each RF could fulfill the local convexity requirement due to redundancy in the robotic system, the solutions of local controllers (Eq. 4) could be inconsistent with the desired solutions.36 Although this problem could be resolved by preprocessing the training data such that it only produces one particular solution, it lacks generality and is difficult to apply in high-dimensional systems.31 Therefore, we employ another approach that reshapes local inverse models using constrained optimization, where the local controllers are enforced to provide consistent solutions from infinite possibilities in the null space of the control space. We then define the optimization problem as follows:

equation eq60

subject to equation eq61

where the cost function Ck represents the user-defined optimality scaled by a diagonal matrix N. equation eq62 is the user-defined null-space behavior. One example of null-space behavior could be minimizing the elongation of the robot, which results in smaller bending radius to facilitate dexterous motion inside enclosed cavity. Finally, the optimization constraint equation eq63 ensures the correctness of the inverse solution.

The constrained optimization problem can be solved by introducing a reward function (Eq. 9) and a cost function (Eq. 10):

equation eq64

equation eq65

The reward function equation eq66 is scaled by the mean cost equation eq67 to improve learning efficiency36:

equation eq68

The cost function is then minimized by means of reward-weighted regression, where each local model needs to be updated:

equation eq69

where equation eq70, equation eq71, and equation eq72 are the training datasets. The overall procedures of the learning-based controller are summarized in Algorithm 1.
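Because the equations of this subsection are not reproduced in the text, the following is only a hedged sketch of one reward-weighted regression step and is not the authors' Algorithm 1. It assumes a quadratic null-space cost weighted by a diagonal matrix N, an exponential reward scaled by the inverse of the mean cost, and a bias-free weighted least-squares fit of a 3x2 local inverse matrix. These functional forms and all names are assumptions.

```python
import math


def null_space_cost(du, du_null, n_diag):
    """Quadratic deviation of a sample from the desired null-space behavior."""
    return sum(n * (a - b) ** 2 for n, a, b in zip(n_diag, du, du_null))


def rewards(costs):
    """Exponential reward scaled by the mean cost (assumed form)."""
    mean_cost = sum(costs) / len(costs)
    sigma = 1.0 / mean_cost if mean_cost > 0.0 else 1.0
    return [math.exp(-sigma * c) for c in costs]


def weighted_fit(ds_samples, du_samples, weights):
    """Solve B = argmin sum_k w_k ||du_k - B ds_k||^2 for a bias-free matrix B."""
    a00 = a01 = a11 = 0.0                      # A = sum_k w_k ds_k ds_k^T
    m = [[0.0, 0.0] for _ in du_samples[0]]    # M = sum_k w_k du_k ds_k^T
    for ds, du, w in zip(ds_samples, du_samples, weights):
        a00 += w * ds[0] * ds[0]
        a01 += w * ds[0] * ds[1]
        a11 += w * ds[1] * ds[1]
        for j, duj in enumerate(du):
            m[j][0] += w * duj * ds[0]
            m[j][1] += w * duj * ds[1]
    det = a00 * a11 - a01 * a01
    if abs(det) < 1e-12:
        raise ValueError("task-space samples do not span two dimensions")
    inv = [[a11 / det, -a01 / det], [-a01 / det, a00 / det]]
    # B = M A^{-1}
    return [[row[0] * inv[0][0] + row[1] * inv[1][0],
             row[0] * inv[0][1] + row[1] * inv[1][1]] for row in m]


# Hypothetical training samples generated by a known 3x2 matrix.
b_true = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ds_samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
du_samples = [[r[0] * s[0] + r[1] * s[1] for r in b_true] for s in ds_samples]

costs = [null_space_cost(du, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]) for du in du_samples]
r = rewards(costs)
b_fit = weighted_fit(ds_samples, du_samples, r)
print(b_fit)
```

In a redundant system, samples with a lower null-space cost receive larger rewards, so the fitted matrix is pulled toward the solutions that agree with the desired null-space behavior. In the noise-free example above, all samples come from one matrix, so the fit recovers it regardless of the weights.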

Algorithm 1.
Online Learning Algorithm of Inverse Mapping

Experiments, Results, and Discussion

The proposed control framework is implemented on a custom-made soft robot to investigate its performance and behavior under external dynamic constraints. We have also attempted to utilize FEA to simulate robot motion data for pretraining of an initial control policy. This can avoid the need for random exploration of its robot workspace to initialize online learning functions. Such exploration is usually time-consuming and may not be practical, particularly for single-use purposes in surgical applications. Accuracy and stability of the proposed controller are examined through path following under various constrained environments. The interaction force with the external constraint is also measured throughout the experiments. The control block diagram of the overall robotic system, including the processing core and actuation system, is illustrated in Figure 6.

FIG. 6.
System architecture of the proposed control framework depicting interconnections of key components. The processing core is responsible for fast computation of inverse solution. The inverse model is also updated continuously by incorporating the online ...

Initialization of online learning by FEA-based model

Proper initialization of pretraining data is essential to many online learning techniques. These preceding data are dedicated to pretraining an initial control policy before the online learning begins. It is usually acquired by driving the robot with random input. Instead, we proposed to incorporate FEA, by which robot deformation can be simulated with a hyperelastic computation model. This simulation can generate comprehensive pretraining samples that cover the entire robot workspace at a high resolution, facilitating offline pretraining of the learning-based controller (Fig. 5).

The FEA model of the robot is constructed using ABAQUS47 to predict the robot kinematics and workspace. RTV silicone rubber is modeled as an incompressible hyperelastic material formulated by the Ogden material model.48 It exhibits negligible volume change under hydrostatic compression and has a Poisson's ratio close to 0.5. Due to the incompressibility of silicone rubber and the large deformation nature of the simulation, the element formulation and mesh quality pose a compelling effect on both accuracy and convergence of the simulation. Therefore, a hexahedral element (C3D8RH; Fig. 1c) based on u-p hybrid formulation with hourglass control47 is chosen over the commonly used quadratic tetrahedral elements (Fig. 1d) in the FEA of our soft robotic manipulators.
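For orientation, the stress response of an incompressible Ogden material in uniaxial tension can be evaluated in closed form. The sketch below assumes the strain-energy convention W = sum_i (mu_i / alpha_i) (l1^alpha_i + l2^alpha_i + l3^alpha_i - 3); FEA packages may use a different parameter convention, and the material parameters shown are hypothetical placeholders rather than the values used for the prototype.

```python
def ogden_uniaxial_nominal_stress(stretch, terms):
    """Nominal stress of an incompressible Ogden solid in uniaxial tension.

    terms is a list of (mu_i, alpha_i) pairs. Incompressibility fixes the
    lateral stretches at stretch ** -0.5, which gives
    P = sum_i mu_i * (stretch**(alpha_i - 1) - stretch**(-alpha_i / 2 - 1)).
    """
    if stretch <= 0.0:
        raise ValueError("stretch must be positive")
    return sum(mu * (stretch ** (alpha - 1.0) - stretch ** (-0.5 * alpha - 1.0))
               for mu, alpha in terms)


# Hypothetical single-term parameters; alpha = 2 reduces to the neo-Hookean case.
terms = [(1.0, 2.0)]
for lam in (1.0, 1.5, 2.0):
    print(lam, ogden_uniaxial_nominal_stress(lam, terms))
```

The stress vanishes in the undeformed state (stretch equal to 1) and grows nonlinearly with stretch, which is the behavior the hybrid element formulation has to resolve under large deformation.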

The C3D8RH element possesses eight displacement nodes and one interior pressure node. The combination of these displacement and pressure nodes is often close to optimal.49 Such integration scheme improves not only element efficiency but also element accuracy under bending load. However, compared with tetrahedrons, automatic mesh generation of hexahedrons is relatively ineffective, resulting in poor tessellation quality. To this end, the presented meshing has to be obtained by custom-designed protrusions, and all elements are right prisms initially. By restoring the mesh quality, the assemblage contains far fewer elements and is much more robust in convergence.

The presented manipulator model is tessellated with 12k linear hexahedral elements (C3D8RH; Fig. 1c). There are also 2,214 linear truss elements (T3D2) being placed along each actuation chamber in a layer-by-layer arrangement (Fig. 1b). Truss elements are used to model the helical strain-wrapping constraints that ensure the anisotropic expansion of chambers upon pressure actuation. Actuation and gravity loads are applied to the presented FEA model. The gradual change of the stress input, which is distributed across the surface mesh along the inner chamber surface, guarantees reliable convergence, giving rise to an equilibrium solution throughout all the time steps during the FEA.

Quasi-static motion with negligible hysteresis can be achieved when the real robot prototype is manipulated while delicately regulating the inflation pressure into the chamber at high-resolution steps. It is worth noting that the deformation/bending of the FEA-modeled manipulator and that of the actual one are very similar at the same levels of inflation pressure, as shown in Figure 4. Over 1,000 simulated motion samples equation eq74 have been obtained using the FEA, covering the entire robot workspace (Fig. 5). These simulated data are adopted to pretrain the online learning controller as described in the following sections.

FIG. 4.
FEA models (left) simulated with seven levels of inflation pressure in a single chamber. Similar deformation characteristics are exhibited in actual configurations of the soft manipulator (right) under the same corresponding pressure levels. FEA, finite ...

Experimental setup

To evaluate the proposed control performance, three motorized pneumatic units are employed to actuate the presented soft manipulator incorporated with our closed-loop control testing platform (Fig. 6). Each unit consists of a pneumatic cylinder coupled to a precise stepper motor through a lead screw transmission. This facilitates accurate regulation of air flow. Our soft robotic manipulator can be fully articulated in a dome-shaped workspace with a maximum curve angle of >150° in all directions.

An EM tracking system (NDI Medical Aurora) is employed to close the robot control loop by the continuous positional data feedback (Fig. 7a). This tracking system is commonly available in many image-guided intervention systems. It can track the position and orientation of tiny EM coils in real time with root mean square (RMS) accuracy of 0.7 mm and 0.2° at 40 Hz. A tiny tracking coil is embedded at the robot distal tip. Online updating (at 20 Hz) of the inverse mapping estimation equation eq75 by the local learning algorithm is achieved, where equation eq76 is measured tip direction. The positional data are also recorded throughout the robot task so as to evaluate overall control performance. The entire control framework is implemented in the MATLAB environment. The open-source library of LWPR50 is employed to incrementally learn the robot forward model, which determines valid linearization of each local controller.

FIG. 7.
(a) Registration process of the predefined trajectory using an electromagnetic (EM) position tracking system. Blue line on the transparent sphere illustrates the tracking trajectory on the task space; (b) Soft manipulator is commanded to follow the desired ...

A series of path-following tasks is performed under various constraint scenarios to investigate how the online learning control approach reacts to such unknown interactions. At the beginning, the robot is allowed to move freely in its workspace without any interference. This serves as the control experiment to establish the baseline of controller performance. Subsequently, the robot is gently pushed by a plastic rod to simulate an unknown dynamic interaction with the robot manipulation (Fig. 7b). The rod is actuated by a high-precision stepping motor to generate repeatable contact with the robot body; meanwhile, the contact force is monitored by a force/torque sensor (ATI Industrial Automation: F/T Nano17). The tracking error is defined as the shortest distance between the robot targeting direction equation eq77 and the desired trajectory.
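One way to compute such a tracking error is sketched below. It assumes that the targeting direction and the desired trajectory are both available as 3D direction vectors, that the trajectory is densely sampled, and that "distance" is measured as the angle between directions in degrees; the function names are illustrative.

```python
import math


def angle_deg(a, b):
    """Angle in degrees between two nonzero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard rounding
    return math.degrees(math.acos(cosine))


def tracking_error_deg(tip_direction, trajectory_directions):
    """Smallest angle between the tip direction and any sampled trajectory point."""
    return min(angle_deg(tip_direction, d) for d in trajectory_directions)


# Hypothetical tip direction and a coarsely sampled trajectory.
tip = (0.0, 0.0, 1.0)
trajectory = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]
print(tracking_error_deg(tip, trajectory))
```

The resolution of the sampled trajectory bounds the accuracy of this estimate, so the trajectory should be sampled more finely than the expected tracking error.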

Evaluation of online local learning controller

To realize accurate navigation under unknown constraints, the inverse model is adapted in the proposed learning-based controller, which has to be updated online based on the newly acquired motion data. In this study, we compared three types of data sources for the inverse model training: (1) pretrained by FEA data without using online data; (2) initialized by random exploration with online learning data; and (3) pretrained by the FEA data, and then updated by online data. These online-updated inverse models are evaluated for resolved motion rate control51 to track a predefined trajectory. Thus, the desired task space displacement equation eq78 that tracks the reference input is obtained as follows:

equation eq79

where equation eq80 and equation eq81 are the reference task space displacements and coordinates generated from interpolating a predefined trajectory. Note that the reference input can be replaced by manual control in actual endoscopic navigation scenario. We employed the same proportional–derivative (PD) gain equation eq82 for all three settings to perform tracking along a reference trajectory. Thus, the actuation input equation eq83 is estimated by the online learning inverse model as depicted in Equation (4).
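Since the tracking law is not reproduced in this text, the sketch below shows only one plausible form: the reference displacement is used as a feedforward term, and the gain acts on the error between the reference and measured task space coordinates. The function name, the scalar gain, and the numerical values are assumptions.

```python
def desired_displacement(ds_ref, s_ref, s_meas, gain):
    """Reference displacement plus a gain-weighted tracking-error correction."""
    return [d + gain * (r - m) for d, r, m in zip(ds_ref, s_ref, s_meas)]


# Hypothetical pitch/yaw values in degrees.
ds_star = desired_displacement(
    ds_ref=[1.0, 0.0],   # displacement along the interpolated trajectory
    s_ref=[10.0, 5.0],   # reference coordinate at this time step
    s_meas=[9.0, 6.0],   # measured tip direction
    gain=0.5,
)
print(ds_star)
```

The resulting desired displacement is what the learned inverse model would convert into a change in chamber pressure at each time step.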

To enforce the consistency of inverse mapping among all localized linear controllers, a standard null-space behavior equation eq84 is defined. This gives rise to an immediate reward function equation eq85 to weigh the training data that best imitate the desired null-space behavior (Eq. 9). For the presented soft robot, we first choose a rest configuration to be equation eq86, which can minimize the overall inflation pressure as well as elongation of the manipulator. Then, the robot is attracted toward the rest configuration with a loose attractor function equation eq87, where equation eq88. We defined an identity metric equation eq89 as all three inflatable actuators of the robot are identical and should contribute the same in achieving the desired null-space behavior.

It is also necessary to normalize the training dataset into the same scale component-wise so that the LWPR can learn the data variance properly. Min-max normalization is a simple but effective technique commonly used52:

equation eq90

However, the statistical max(qi) and min(qi) values would be sensitive to outliers; therefore, we define the min–max values according to the physical constraints of data, including the typical robot workspace and the maximum volume of the cylinder unit.
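A minimal sketch of this normalization is given below, assuming the standard min–max form (q − q_min) / (q_max − q_min) applied component-wise, with bounds supplied from physical limits rather than computed from the data. The bound values in the example are hypothetical and are not taken from the experiments.

```python
import numpy as np

def normalize_with_physical_bounds(q, q_min, q_max):
    """Scale each component of q to [0, 1] using fixed, physically motivated bounds."""
    q = np.asarray(q, dtype=float)
    q_min = np.asarray(q_min, dtype=float)
    q_max = np.asarray(q_max, dtype=float)
    span = q_max - q_min
    if np.any(span <= 0):
        raise ValueError("each upper bound must exceed its lower bound")
    return (q - q_min) / span

# Hypothetical bounds: component 0 limited to [0, 20], component 1 to [0, 10].
samples = [[0.0, 5.0], [10.0, 10.0]]
normalized = normalize_with_physical_bounds(samples, q_min=[0.0, 0.0], q_max=[20.0, 10.0])
```

With fixed bounds, a single outlier in the recorded data cannot rescale every other sample, which is the motivation stated above.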

Pretrained by FEA without using online data

In this setting, both the forward model and control policy are pretrained solely by the FEA-simulated data (see the Initialization of online learning by FEA-based model section). The online data were not taken into account in this setting. This acts as a control experiment to depict the actual influence of external interactions. In the unconstrained experiment (Fig. 8a), it was observed that the controller could roughly follow the trajectory with a relatively large tracking error of ±1.79° and a maximum error of ±6.96° with the use of the feedback controller (Table 1). Despite the considerable discrepancy between the FEA-simulated and actual configuration, this experiment still demonstrates that the FEA data are capable of pretraining a reasonable inverse model for rough path following.

FIG. 8.
Tracked trajectory plotted (left) and the corresponding tracking error in time domain (right). In the control experiment, the robot is allowed to move freely without any constraint. Control performance of the online learning controllers trained by three ...
Table 1.
Trajectory Tracking Performance Under Freely Moveable Environment

In the later constrained experiment (Fig. 9a), the robot maintained tracking of the trajectory with similar accuracy at the beginning. When the external interaction was engaged at 25 s, the robot was pushed further away from the desired trajectory, resulting in an increased mean tracking error of ±4.64° and a maximum error of ±14° (Table 2). This indicates that the feedback controller cannot fully compensate for the significant motion bias induced by the external disturbance.

FIG. 9.
Tracked trajectory plotted (left) and the corresponding tracking error in time domain (right) under external interactions. Control performance is validated in three different conditions as in Figure 8. It can be observed that online learning for (b, c) ...
Table 2.
Trajectory Tracking Performance Under Constrained Environment

In the case of a conventional rigid-linked robot, this kind of error due to the interaction with the constraint is often considered as a perturbation. The error can hence be compensated by increasing the feedback control gain, given that the inverse model is readily available from the kinematic chain. However, such an approach is not directly applicable to a soft robot, because its mechanical compliance inevitably induces much larger positioning errors. In addition, the interaction force may also alter the force equilibrium of the robot, thereby substantially degrading the reliability of the predetermined inverse model. The following experiments demonstrate how the proposed online algorithm can accommodate the influence of a constrained environment, which is particularly demanding for the control of soft robots.

Initialized by random exploration with online learning

The random exploration of the robot workspace is a typical approach34 to initialize a data-driven controller before its actual deployment. This kind of arbitrary movement is necessary to provide initial data for setting up a learning model. It involves tracking 50 random input pressure waypoints equation eq93 with a PD feedback controller. The deliberately tuned PD gains can cause poor tracking of random waypoints. Such babbling movement (green path in Figs. 8b and 9b) can facilitate a faster learning rate as the robot sweeps throughout a wider neighboring workspace. Pretraining with the exploration data resulted in a forward LWPR model with 110 RFs, which define the linearization for the piecewise linear inverse model in advance of the actual deployment of the online learning.
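To illustrate how a set of receptive fields (RFs) yields a piecewise linear prediction, the sketch below blends local linear models with Gaussian weights, which is the general prediction form used in locally weighted regression. It is a simplified illustration and not the LWPR library: the projection steps of LWPR are omitted, and all function and variable names are hypothetical.

```python
import numpy as np

def lw_predict(x, centers, metrics, slopes, offsets):
    """Weighted average of local linear models.

    Each receptive field k has a center c_k, a distance metric D_k, and a local
    model y_k(x) = b_k + A_k (x - c_k). Its weight is exp(-0.5 (x-c_k)^T D_k (x-c_k)).
    """
    x = np.asarray(x, dtype=float)
    num = 0.0
    den = 0.0
    for c, D, A, b in zip(centers, metrics, slopes, offsets):
        diff = x - c
        w = np.exp(-0.5 * diff @ D @ diff)
        num = num + w * (b + A @ diff)
        den += w
    # Note: den approaches zero if x lies far from every receptive field.
    return num / den

# Hypothetical 1D example with two receptive fields centered at 0 and 2.
centers = [np.array([0.0]), np.array([2.0])]
metrics = [np.eye(1), np.eye(1)]
flat = [np.zeros((1, 1)), np.zeros((1, 1))]          # constant local models
offsets = [np.array([0.0]), np.array([2.0])]
midpoint = lw_predict([1.0], centers, metrics, flat, offsets)
```

At the midpoint between the two centers both weights are equal, so the prediction is the average of the two local outputs.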

Upon exploration, the online learning controller could follow the desired trajectory with an average error of ±1.13° in the first cycle under the constraint-free environment (Fig. 8b). The error was found to be significantly lower than that of the inverse model pretrained by FEA-simulated data. This is reasonable because the actual robot data were used. After a few cycles, the tracking error further decayed to an average of ±0.87° and a maximum of ±1.92° as the online learning controller adapted to the trajectory.

Next, the feasibility of online inverse model adaptation was validated by engaging external force interactions (Fig. 9b). The online learning controller can compensate for the bias and hence reduce the error to an average of ±2.35° within 5 s upon contact with the constraint. The external constraint is moved away after 30 s of contact. It is also worth noting that the controller could quickly update the inverse mapping online and follow the trajectory with high accuracy. No control instability is observed throughout the experiment. The pure online learning approach achieves the highest average accuracy among all settings, both for constrained and unconstrained scenarios (Tables 1 and 2). However, the need for initialization by babbling motion (green path in Figs. 8b and 9b) should be avoided in clinical scenarios to prevent unnecessary interactions with patient anatomy.

Pretrained by FEA data, then updated by online data

To alleviate the need for random exploration, we attempted to pretrain the controller with FEA data and then update the inverse model by online learning. This approach combines the advantages of both aforementioned settings, in which the inverse model can be initialized with FEA data. The robot can immediately begin navigation using this pretrained model without the need for initialization through undesired babbling movement. The subsequent manipulation data are also acquired to incrementally train a more precise inverse model so as to adapt to external interactions. This feature is demonstrated in Figure 8c, in which the robot is allowed to move freely.
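The pretrain-then-update idea can be sketched with a much simpler learner than the one used in this study. The example below substitutes a single global linear model updated by recursive least squares with a forgetting factor for the LWPR-based inverse model, purely to show how a model initialized on less accurate simulated data is corrected by subsequent online samples. The "simulated" and "real" mappings are hypothetical toy functions.

```python
import numpy as np

class RecursiveLinearModel:
    """Linear map y ≈ W @ [x, 1], updated one sample at a time by recursive least squares."""

    def __init__(self, n_in, n_out, forgetting=0.98, p0=1e3):
        self.W = np.zeros((n_out, n_in + 1))
        self.P = np.eye(n_in + 1) * p0   # inverse information matrix
        self.lam = forgetting            # values below 1 discount older samples

    def predict(self, x):
        return self.W @ np.append(x, 1.0)

    def update(self, x, y):
        z = np.append(x, 1.0)
        Pz = self.P @ z
        k = Pz / (self.lam + z @ Pz)                      # gain vector
        self.W += np.outer(np.asarray(y, dtype=float) - self.W @ z, k)
        self.P = (self.P - np.outer(k, Pz)) / self.lam

rng = np.random.default_rng(0)
model = RecursiveLinearModel(n_in=1, n_out=1)

# Stage 1: pretraining on a toy "simulated" mapping, y = x (stand-in for FEA data).
for x in rng.uniform(-1.0, 1.0, size=200):
    model.update([x], [x])
pred_pretrained = model.predict([1.0])[0]

# Stage 2: online samples from a different toy "real" mapping, y = 2x + 0.5.
for x in rng.uniform(-1.0, 1.0, size=300):
    model.update([x], [2.0 * x + 0.5])
pred_adapted = model.predict([1.0])[0]
```

The forgetting factor is the design choice that lets newer data override the pretraining data; in the local learning setting of this study, each local model would be updated independently instead.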

Although the robot begins with a relatively large tracking error of average ±2.21° and maximum of ±7.49° in the first cycle, the error is quickly compensated by the online learning and converged to an average of ±0.90° and maximum of ±2.80°. This tracking result is compared with the other two approaches in Table 1. In the first cycle, the combined approach exhibits tracking error close to pretraining with FEA only (average ±2.21° vs. ±1.79°) because both inverse models are initialized with less accurate FEA data. The learning technique then corrects the inverse model with online data so that the tracking error decreases rapidly and becomes comparable with the pure online approach (average ±0.90° vs. ±0.87°).

This shows that the combined approach can initialize a reasonable learning-based controller with less accurate FEA data, then further refine the inverse model while performing the tracking task. Note that the combined approach does not require random exploration (green path in Figs. 8b and 9b) to obtain pretraining data; such exploration can hardly cover the entire robot workspace with sufficient density.

This combined approach is also capable of adapting to the unknown external interaction (Fig. 9c). The inverse model can quickly adapt the inverse mapping upon contact with the external interaction at 36 s. It continues to follow the trajectory with a small mean absolute error of ±2.49°. The controller also remains stable and readapts after the removal of constraints. Readers could also refer to the attached Supplementary Video (Supplementary Data are available online) for extra details about the robot behavior and the characteristics of the constraint.

Referring to the Evaluation of online local learning controller section, we presented the challenge in learning an inverse model spatially localized by the unmeasurable robot state equation eq94 as well as how this robot state can be retrieved indirectly from sensory measurements. These trajectory tracking experiments have shown that the inverse model could be successfully learnt by continuous updates of both the task space coordinate equation eq95 and control input equation eq96. Both are set as the localization parameters required in the inverse model. Therefore, the robot state equation eq97 could be estimated sustainably by the learning algorithm. These 3-6D positional data updates are clinically practical. The comparable position-tracking techniques designed for image-guided interventions are also under active research,53 one of which would be magnetic resonance imaging-guided endoscopic retrograde cholangiopancreatography.

Conclusion and Future Work

We have proposed a model-free control framework that adopts an online nonparametric local learning technique for manipulation of a redundantly actuated, fluid-driven soft continuum robot in the presence of a dynamic external disturbance. Nonparametric techniques are capable of constructing highly nonlinear functions by measurement of data solely, which is particularly suitable for characterization of hyperelastic robot structure. To accommodate the flexibility of soft robot body, we approximate the global inverse kinematics by a linear combination of many locally learnt inverse kinematic models.

Our model-free controller employs this global approximation, where the behavior of the redundant actuator can be optimized by a user-defined criterion while simultaneously fulfilling the control objective defined in task space coordinates. In addition, the controller is adaptive to changes in the environment, where each local model can be updated online independently according to newly acquired data. This equips the robot with the ability to maintain control accuracy under external dynamic disturbance. Our work is the first attempt at implementing such direct inverse modeling using an online nonparametric learning technique to control a redundantly actuated soft continuum robot.

We have also incorporated FEA into the learning control framework for proper initialization of the robot inverse model. It enables precise prediction of the hyperelastic robot deformation under various actuation pressures, without the need for an oversimplified analytical model. It can also offer adequate sample data covering the entire workspace at high resolution. This avoids the need for time-consuming random exploration to initialize the learning model, which may not be practical in many surgical applications. The proposed controller can hence be initialized offline using FEA-simulated data, ready for endoscopic navigation procedures.

The proposed novel control framework has been experimentally validated. In the constrained experiment, after FEA-based initialization of the controller, the endoscope prototype could follow a 3D trajectory with an accuracy of mean ±2.21° and maximum ±7.49° and attained almost the same tracking accuracy (mean ±2.49° and maximum ±11.03°) after 5 s upon addition/removal of external disturbance (maximum 1 N). This is also the first demonstration of realizing model-free closed-loop control of a fluid-driven soft continuum robot in 3D task space even under dynamic external disturbance.

The current form of our learning-based control method is designed for a single-segment manipulator. In our future work, we intend to extend the framework to address soft manipulation with multiple segments.54 As a cascade of multiple actuation modules, such a manipulator provides enhanced manipulation flexibility for interventional tools, facilitating more complicated operations in a confined space. In this case, a generic optimization function will be developed to resolve the null-space control of hyper-redundant robots.55 Further characterization of such multisegment soft manipulators will be investigated. To address their hyper-redundancy, additional sensory systems or algorithms will also be required to parameterize the possible motion transition of the robot configuration, thus estimating the inverse model for the higher DoF robot.


This work is supported, in part, by the Croucher Foundation, the Research Grants Council (RGC) of Hong Kong (Ref. Nos. 27209151 and 17227616), the Innovation and Technology Fund (ITF) of Hong Kong (ITS/361/15FX), and NISI (HK) Limited.

Author Disclosure Statement

No competing financial interests exist.


1. Trivedi D, et al. Soft robotics: Biological inspiration, state of the art, and future research. Appl Bionics Biomech 2008;5:99–117
2. Mao S, et al. Gait study and pattern generation of a starfish-like soft robot with flexible rays actuated by SMAs. J Bionic Eng 2014;11:400–411
3. Sareh S, et al. Bio-inspired tactile sensor sleeve for surgical soft manipulators. In: IEEE International Conference on Robotics and Automation (ICRA) Hong Kong, China, May 31–June 7, 2014
4. Suzumori K., Iikura S., Tanaka H. Development of flexible microactuator and its applications to robotic mechanisms. In: The 1991 IEEE International Conference on Robotics and Automation Sacramento, CA, April 9–11, 1991
5. Wang L., Iida F. Deformation in soft-matter robotics. IEEE Robot Automat Magaz 2015;22:125–139
6. McMahan W, et al. Field trials and testing of the OctArm continuum manipulator. In: IEEE International Conference on Robotics and Automation (ICRA) Orlando, FL, May 15–19, 2006
7. Runge G, et al. SpineMan: Design of a soft robotic spine-like manipulator for safe human-robot interaction. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) Hamburg, Germany, September 28–October 2, 2015
8. Maghooa F, et al. Tendon and pressure actuation for a bio-inspired manipulator based on an antagonistic principle. In: IEEE International Conference on Robotics and Automation (ICRA) Seattle, WA, May 26–30, 2015
9. Cianchetti M, et al. Soft robotics technologies to address shortcomings in today's minimally invasive surgery: The STIFF-FLOP approach. Soft Robotics 2014;1:122–131
10. Lohsiriwat V. Colonoscopic perforation: incidence, risk factors, management and outcome. World J Gastroenterol 2010;16:425
11. Tumino E, et al. Endotics system vs colonoscopy for the detection of polyps. World J Gastroenterol 2010;16:5452–5456
12. Cosentino F, et al. Functional evaluation of the endotics system, a new disposable self-propelled robotic colonoscope: in vitro tests and clinical trial. Int J Art Organs 2009;32:517–527
13. Pfeffer J, et al. The Aer-O-Scope: Proof of the concept of a pneumatic, skill-independent, self-propelling, self-navigating colonoscope in a pig model. Endoscopy 2006;38:144–148
14. Fras J, et al. New STIFF-FLOP module construction idea for improved actuation and sensing. In: IEEE International Conference on Robotics and Automation (ICRA) Seattle, WA, May 26–30, 2015
15. Camarillo DB, et al. Mechanics modeling of tendon-driven continuum manipulators. IEEE Trans Robot 2008;24:1262–1273
16. Jones B., Walker ID. Kinematics for multisection continuum robots. IEEE Trans Robot 2006;22:43–55
17. Mahvash M., Dupont PE. Stiffness control of surgical continuum manipulators. IEEE Trans Robot 2011;27:334–345
18. Webster RJ., III, Jones BA. Design and kinematic modeling of constant curvature continuum robots: A review. Int J Robot Res 2010;29:1661–1683
19. Ganji Y., Janabi-Sharifi F. Catheter kinematics for intracardiac navigation. IEEE Trans Biomed Eng 2009;56:621–632
20. Jones BA., Walker ID. A New Approach to Jacobian Formulation for a Class of Multi-Section Continuum Robots. In: IEEE International Conference on Robotics and Automation (ICRA) Barcelona, Spain, April 18–22, 2005
21. Webster RJ III, et al. Closed-form differential kinematics for concentric-tube continuum robots with application to visual servoing. In: Experimental Robotics: The Eleventh International Symposium Athens, Greece, July 13–16, 2008, pp. 485–494
22. Neppalli S, et al. Closed-form inverse kinematics for continuum manipulators. Adv Robot 2009;23:2077–2091
23. Marchese AD, et al. Design and control of a soft and continuously deformable 2d robotic manipulation system. In: The IEEE International Conference on Robotics and Automation Hong Kong, China, May 31–June 7, 2014, pp. 2189–2196
24. Marchese AD., Tedrake R., Rus D. Dynamics and trajectory optimization for a soft spatial fluidic elastomer manipulator. Int J Robot Res 2016;35:1000–1019
25. Wang H, et al. Visual servo control of cable-driven soft robotic manipulator. In: The 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems Tokyo, Japan, November 3–7, 2013, pp. 57–62
26. Chirikjian GS. Hyper-redundant manipulator dynamics: a continuum approximation. J Adv Robot 1995;9:217–243
27. Kang R, et al. Design, modeling and control of a pneumatically actuated manipulator inspired by biological continuum structures. Bioinspir Biomim 2013;8:036008
28. Yekutieli Y, et al. Dynamic model of the octopus arm. I. biomechanics of the octopus reaching movement. J Neurophysiol 2005;94:1443–1458
29. Giorelli M, et al. A two dimensional inverse kinetics model of a cable driven manipulator inspired by the octopus arm. In: The 2012 IEEE International Conference on Robotics and Autonomous Systems Saint Paul, MN, May 14–18, 2012, pp. 3819–3824
30. Renda F, et al. Dynamic model of a multibending soft robot arm driven by cables. IEEE Trans Robot 2014;30:1109–1122
31. Nguyen-Tuong D., Peters J. Model learning for robot control: A survey. Cogn Process 2011;12:319–340
32. Braganza D, et al. A neural network controller for continuum robots. IEEE Trans Robot 2007;23:1270–1277
33. Giorelli M, et al. Neural network and Jacobian method for solving the inverse statics of a cable-driven soft arm with nonconstant curvature. IEEE Trans Robot 2015;31:823–834
34. Yip MC., Camarillo DB. Model-less feedback control of continuum manipulators in unknown environments. IEEE Trans Robot 2014;30:880–889
35. Nguyen-Tuong D., Seeger M., Peters J. Model learning with local gaussian process regression. Adv Robot 2009;23:2015–2034
36. Peters J., Schaal S. Learning to control in operational space. Int J Robot Res 2008;27:197–212
37. Vijayakumar S., D'Souza A., Schaal S. Incremental online learning in high dimensions. Neural Comput 2005;17:2602–2634
38. Largilliere F, et al. Real-time control of soft-robots using asynchronous finite element modeling. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, May 26–30, 2015
39. Duriez C. Control of elastic soft robots based on real-time finite element method. In: 2013 IEEE International Conference on Robotics and Automation (ICRA) Karlsruhe, Germany, May 6–10, 2013
40. Faudzi AAM, et al. Development of bending soft actuator with different braided angles. In: IEEE/ASME International Conference on Advanced Intelligent Mechatronics Kaohsiung, Taiwan, July 11–14, 2012
41. Greigarn T., Cavusoglu MC. Task-space motion planning of MRI-actuated catheters for catheter ablation of atrial fibrillation. In: International Conference on Intelligent Robots and Systems (IROS 2014) Chicago, IL, September 14–18, 2014
42. Ko SY., Frasson L., Baena FRY. Closed-loop planar motion control of a steerable probe with a “programmable bevel” inspired by nature. IEEE Trans Robot 2011;27:970–983
43. Xu R, et al. Position control of concentric-tube continuum robots using a modified Jacobian-based approach. In: 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, May 6–10, 2013
44. Hartmann C, et al. Real-time inverse dynamics learning for musculoskeletal robots based on echo state Gaussian process regression. In: Robotics: Science and Systems. Sydney, NSW, Australia, July 9–13, 2012
45. Sigaud O., Salaün C., Padois V. On-line regression algorithms for learning mechanical models of robots: A survey. Robot Autonom Syst 2011;59:1115–1129
46. Schaal S., Atkeson CG., Vijayakumar S. Scalable techniques from nonparametric statistics for real time robot learning. Appl Intellig 2002;17:49–60
47. Simulia D. ABAQUS 6.13 User's Manual. Providence, RI: Dassault Systems, 2013
48. Başar Y., Itskov M. Finite element formulation of the Ogden material model with application to rubber‐like shells. Int J Numer Methods Eng 1998;42:1279–1305
49. Hughes TJ. The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. New York: Dover Publications, 2012
50. Klanke S., Vijayakumar S., Schaal S. A library for locally weighted projection regression. J Machine Learn Res 2008;9:623–626
51. Whitney DE. Resolved motion rate control of manipulators and human prostheses. IEEE Trans Man-Machine Syst 1969;10:47–53
52. Jain A., Nandakumar K., Ross A. Score normalization in multimodal biometric systems. Pattern Recognit 2005;38:2270–2285
53. Chen Y, et al. Design and fabrication of MR-tracked metallic stylet for gynecologic brachytherapy. IEEE/ASME Trans Mechatron 2015;21:956–962
54. Sadati SH, et al. Stiffness control of soft robotic manipulator for minimally invasive surgery (MIS) using scale jamming. In: Intelligent Robotics and Applications. Portsmouth, UK; Springer, August 24–27, 2015, pp. 141–151
55. Kwok KW, et al. Dimensionality reduction in controlling articulated snake robot for endoscopy under dynamic active constraints. IEEE Trans Robot 2013;29:15–31

Articles from Soft Robotics are provided here courtesy of Mary Ann Liebert, Inc.