Bioinspired robotic structures comprising soft actuation units have attracted increasing research interest. Taking advantage of their inherent compliance, soft robots can assure safe interaction with external environments, provided that precise and effective manipulation can be achieved. Endoscopy is a typical application. However, previous model-based control approaches often require simplified geometric assumptions about the soft manipulator, which can be very inaccurate in the presence of unmodeled external interaction forces. In this study, we propose a generic control framework based on nonparametric, online, and local training to learn the inverse model directly, without prior knowledge of the robot's structural parameters. Detailed experimental evaluation was conducted on a soft robot prototype with control redundancy, performing trajectory tracking in dynamically constrained environments. An advanced element formulation of finite element analysis is employed to initialize the control policy, hence eliminating the need for random exploration in the robot's workspace. The proposed control framework enabled a soft fluid-driven continuum robot to follow a 3D trajectory precisely, even under dynamic external disturbance. Such enhanced control accuracy and adaptability would facilitate effective endoscopic navigation in complex and changing environments.
Design of nature-inspired manipulators actuated based on soft material properties has become one of the most active research areas in robotics.1 Soft robots embedded with delicate chambers can be driven by fluidic input,1–4 resulting in functional deformations such as bending and elongation/shortening.5 Owing to the limber robotic structure, their manipulation assures high compliance within a confined region, facilitating versatile interaction with surrounding objects.6,7 These features could have an impact on many robotic applications demanding safe interaction within a dynamic environment, such as soft tissue in minimally invasive surgery.8,9 Therefore, endoscopy is one of the timely applications.
Conventional endoscopes predominantly comprise a metallic skeleton driven by steel cables, governing the kinematics of a series of bending mechanisms. This design inevitably induces high friction and is susceptible to fatigue failure after prolonged service. These metallic structures also come with high rigidity at the scope tip that may increase the risk of causing trauma or even perforation when the scope is forcefully pushed against the wall of a confined lumen or cavity.10 This has motivated the development of soft robotic instruments for surgical interventions,11–14 which can also be disposable to ensure zero risk of endoscopy-related infection transmission. Endotics11,12 was the first system developed for the purpose of pain-free colonoscopy. Its novel locomotion scheme attempted to prevent the formation of complicated looping at the sigmoid/descending colon. As a result, its single-segment bending is capable of omnidirectional endoscopic exploration along the colon. Aer-O-Scope13 was another commercial colonoscope relying on a simple approach making use of single-segment bending, which is combined with effective locomotion. The STIFF-FLOP soft robot9,14 was another milestone in keyhole surgery, offering intracavitary exploration using a soft-material robot validated in a cadaveric trial for the first time.
Soft robotic endoscopes have brought several research directions into the limelight. Various control approaches have also been developed to master the dexterity of such manipulators, giving rise to agile and responsive telemanipulation. Paramount to surgical safety, decent control performance in a confined and dynamic environment is also essential. Therefore, much research effort15–18 has been devoted to deriving analytical models that describe or predict the robot kinematic/dynamic behavior,19 akin to controlling conventional rigid-link robots. However, these analytical models are complex due to the intrinsic nonlinear hyperelastic property of the soft elastomeric materials that constitute the robot body. Any additional control dimensionality of the soft robot would further exacerbate the complexity of such kinematic equations.16
To simplify the modeling process, the piecewise constant curvature (PCC) assumption is one of the widely used techniques15,16,18,20 to obtain closed-form solutions.21,22 This enables real-time kinematic control of curvature discrepancy to attain the desired pose23 and to perform dynamic motion primitives24 for fluidically driven soft continuum robots. The parameters that govern the analytical models can also be estimated online.25 Other model-based methods have been proposed without the PCC assumption, such as approximating trunk-like structures as infinite degree-of-freedom (DoF) systems26 and spring–mass modeling techniques,27,28 which can be incorporated in a hierarchical controller for generating stereotyped motions of an octopus-like manipulator.27 Recently, the Cosserat theory29 of elasticity has been used to predict underwater motion of a cable-driven, octopus-like soft robot30 by deducing its geometrically exact formulations.
Yet, external disturbance to the robot, such as gravity, payload, and external interaction, can promptly invalidate those assumptions. These oversimplified assumptions would substantially degrade the model's reliability in real applications. Moreover, structural parameters in the kinematics have to be determined before the modeling process. The search for these invariant coefficients is heuristic in nature. This might induce further complications when mapping the robot motion analytically. In addition, such invariants can only hold upon slight modification of the robot as they possess strong correlation with the robot's mechanical structure. Inevitably, the analytical model has to be revisited after any major change to the robot structure, further diminishing the effectiveness of such an approach.
With the foreseen difficulty of developing the analytical/kinematic model, research attempts were made to control the soft pliable robot using nonparametric learning-based approaches. The idea is to obtain forward/inverse mappings for kinematic/dynamic robot control based on measurement data only. Model-free control methods can also be developed based on direct modeling architecture,31 where the inverse mapping is directly obtained. This mapping depicts the inverse transition model of the robot, which could be a changing function due to the contact between the robot and the environments, such as soft tissue.
The use of neural networks (NNs) has been proposed to globally approximate the inverse mapping between end-effector and robot actuation.32,33 Such an approach can compensate for uncertainties in robot dynamics32 and has been demonstrated to yield even more reliable solutions when compared with using an analytical model of a cable-driven soft robot.33 Previous studies of NNs mostly consider simplified scenarios, such as a nonredundant manipulator and contact-free situation.32,33 Although redundantly actuated robotic systems can be controlled in lower dimensionality in a hierarchical manner, it may require predefined movement patterns (primitives) for specific task goals.27
Moreover, there has been a great demand for machine learning approaches that address the change in inverse mapping of the hyperelastic robot upon contact.1 A Jacobian-based model-free controller has shown its capability to manipulate a planar, cable-driven continuum robot in an environment with static constraints.34 However, there is still no example that demonstrates manipulation of a redundantly actuated soft continuum robot in three-dimensional (3D) space that is adaptive to unknown external disturbance.
In this article, we propose a control framework based on a nonparametric local learning technique. Nonparametric local learning methods, such as those described by Nguyen et al. and Peters et al.,35,36 possess the ability to learn the high-dimensional inverse transition of rigid-link robots. The essence of nonparametric local methods is to construct a batch of locally weighted models that collectively approximate the inverse mapping. Each of these models is spawned and updated independently such that the overall architecture can be rapidly transformed to accommodate new input data. Meanwhile, the weighted global approximation can be optimized on the fly and kept consistent with the desired control behavior.36 Such a nonparametric local learning approach can thus facilitate fast online correction of the learning model.37 Therefore, the proposed framework is suitable for providing a rapid response to soft robot manipulation within constrained environments.
Workspace exploration is a prerequisite to collect pretraining data for learning the proposed controller. It is desirable to have accurate enough kinematic data to initialize the controller offline since it is impractical to carry out robot exploration in the confined transluminal workspace. We propose to use finite element analysis (FEA) to sample the kinematic data for the offline learning process. FEA has been widely used in design optimization and miniaturization of soft robots.13 Not only can the FEA accurately predict the highly deformable behaviors but it can also provide data for characterization of inverse kinematic relationships for control.38 However, the application of FEA to robotic control has only been minimally investigated in continuum structure with small deformation.38,39 The major contributions of this work are as follows:
A generic, fluid-driven soft continuum robot made of RTV (Room Temperature Vulcanization) silicone rubber (Ecoflex 0050; Smooth-On, Inc.) is designed and fabricated to evaluate the proposed framework for endoscopic navigation (Fig. 1a). The soft robot comprises three cylindrical inflatable chambers, each covered by a helical Kevlar string layer with a pitch of 1 mm. This fiber-constrained structure was first proposed by Suzumori et al.,4,40 in which the helical constraint layer enforces axial anisotropic expansion of inflatable chambers so as to generate an effective bending moment when subject to pressure input. To enable effective endoscopic navigation, the three air chambers can be individually actuated by air or other fluid, facilitating a large panoramic workspace with a bending angle >150°. The slender robot configuration with 13-mm outer diameter and 93-mm length is also compatible with conventional endoscopes, which is important for dexterous manipulation inside a confined transluminal workspace.
Fabrication of the robot involves three major phases: (1) three cylindrical air chambers are cast with RTV silicone in inner molds; (2) Kevlar strings are wrapped densely in a single helical structure along each soft chamber; and (3) additional layers of silicone are cast to house the three inflatable chambers into one. This could fix the strings against dislocation, even after numerous bending actions.
Gradual smooth regulation of the fluidic flow rate allows steady bending of the presented soft manipulator. It also allows rapid reaching of fluid pressure equilibrium, minimizing the residual motion generated during such fluidic actuation. During endoscopic navigation within small and confined spaces (e.g., duodenum), such quasi-static motion characteristic41 can facilitate effective precise targeting of the endoscopic camera or interventional tools (e.g., biopsy forceps or brush cytology) at the surgical regions of interest, thereby avoiding inadvertent damage to delicate tissue and potential discomfort to the patient.
To mathematically describe motion transition of the soft robot, let be the fluid pressure (at equilibrium) in the actuation chambers at time step k where U denotes the control space. Let be the state of the robot when the chambers are filled with the pressure of at equilibrium. This state corresponds to the distal tip position and orientation normal in the Cartesian space (Fig. 2), which are collectively represented by . The forward transition model of the soft robot can be described by the following equation system:
where is the difference of the fluid pressure. The motion transition function f is a continuous mapping that depends on the current state of the robot . Compared with rigid-link robots where the robot state can be well defined by joint kinematics, it is difficult to describe the exact state of the soft robot. For example, model-based approaches approximate this robot state based on PCC15,16,18,20–25 and non-PCC26–30 constraints. The nonlinear function h transforms robot state to Cartesian representation .
Typical endoscopic navigation requires delicate articulation of the distal tip so as to provide accurate positioning and easy access to the soft tissue lesion. A microcamera at the soft robot tip provides forward vision. Therefore, the operator can aim the distal tip at a lesion target on the luminal wall so as to guide the interventional instruments to deploy from the tip through the biopsy channel. This telemanipulated endoscopic navigation gives rise to a robot task space coordinate defined by its viewing direction (i.e., pitch and yaw angle). The system equation in Equation (1) can hence be extended to an actuation to task space mapping as follows:
where is the task space coordinate at time step after the change in fluid pressure is applied.
Our control objective is to enable the operator to control displacement of the robot directly in the task space coordinate (i.e., the desired change in the robot tip orientation) with the use of a motion input device. The superscript “*” denotes the desired motion specified by users or other reference input. Thus, the controller is designed to approximate the inverse of the motion transition in Equation (2), that is, , to estimate the required change in control input (as shown in Fig. 5). The inverse motion transition model heavily depends on the current robot state. However, the exact state cannot be directly measured due to its hyperflexibility and the interactions with enclosed workspace inside a patient's cavity. We sought to adopt the task space coordinates , which would offer updated clues about the current robot state.
This approach is also of practical interest because these measurements are readily available in our control system. The task space coordinate can be tracked using advanced positional tracking systems. For example, electromagnetic (EM) tracking systems are commonly used in medical application to provide submillimeter-level tracking.42,43 Together with the actuator's input , these online acquired data are presented to the learning algorithms to update the inverse mapping during robot run time.
Note that is the approximation of the true inverse mapping . If dimensionality of the task space is smaller than that of the control space, theoretically there exist an infinite number of solutions of that result in the same task space displacement . This leads to the ill-posed problem in learning the inverse mapping .
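The redundancy argument can be illustrated with a small numerical sketch. It assumes a purely hypothetical linearized forward map `J` from three chamber pressures to two task-space angles; the matrix values are invented for illustration and are not taken from the robot described here. Two different actuation changes produce the same task-space displacement because they differ only by a null-space direction.

```python
# Hypothetical linearized forward map: 3 chamber pressures -> 2 task-space angles.
# The entries are illustrative only, not measured from any robot.
J = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

def forward(du):
    """Task-space displacement produced by an actuation change du."""
    return [sum(j * u for j, u in zip(row, du)) for row in J]

# Any multiple of this vector is mapped to zero by J (a null-space direction).
null_dir = [1.0, 1.0, -1.0]

du_a = [0.2, 0.1, 0.0]
du_b = [u + 0.5 * n for u, n in zip(du_a, null_dir)]

dx_a = forward(du_a)
dx_b = forward(du_b)
print(dx_a, dx_b)  # the two displacements agree up to floating-point rounding
```

Because every scaling of `null_dir` gives another valid solution, a learner that simply averages observed actuation changes has no unique target unless the data are restricted or an extra criterion is imposed.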
Nonparametric local learning techniques have been applied to learn the ill-posed inverse problem, aiming to control redundantly actuated robots.31,44,45 Referring to Peters and Schaal,36 the inverse model of a rigid-link robot can be learnt using spatially localized nonparametric learning techniques given that the robot state is well defined by joint kinematics. In this study, spatial localization refers to the robot state . Such localization scheme is motivated by the hypothesis that the inverse problem would be well defined locally.36 It is because nonparametric learning techniques essentially average out the sampled data. Model learning based on nonconvex training datasets would give invalid solutions.36
However, in the vicinity of , the average of would be consistent with the average of the task space displacement (Fig. 2). Therefore, in a local region of a given , the training dataset would become a convex set. This enables learning of inverse mapping in the vicinity of (Fig. 2). We approximate the local inverse mapping from the desired task space displacement to the actuation command as follows:
where is the parameter of the local inverse model. Each mapping serves as a local controller. Compared with Peters et al.,36 we do not include an intercept/bias term since the change of actuation command Δu should have zero mean. The computation of will be explained in the later context.
To approximate the global inverse mapping, we employ a linear combination of the locally learned mapping46:
This controller architecture allows straightforward one-iteration computation in each time step, in contrast to indirect modeling approaches.34 The number of local models n and the weight , as well as the local controllers , can be obtained in an online manner.
For this purpose, the local forward model is learnt using locally weighted projection regression (LWPR),37 which offers piecewise linear function approximation, while it simultaneously determines the appropriate local region of each linear model. Each local forward model performs a linear mapping as follows:
where denotes the corresponding parameter. Each local region, namely the receptive field (RF), is shaped based on the membership function:
centered at , where Di is the distance metric. Each membership function weights the corresponding locally learned inverse model in the controller (Eq. 6).
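A minimal sketch of how receptive-field weighting could blend local controllers is given below. It assumes the Gaussian kernel form commonly used in LWPR with a diagonal distance metric, and a normalized weighted sum of local linear maps; the names `rf_weight`, `blended_command`, and the `(center, d_diag, beta)` tuple layout are hypothetical and chosen only for illustration.

```python
import math

def rf_weight(x, center, d_diag):
    """Gaussian receptive-field activation with a diagonal distance metric."""
    dist2 = sum(d * (xi - ci) ** 2 for xi, ci, d in zip(x, center, d_diag))
    return math.exp(-0.5 * dist2)

def blended_command(x, dx_des, local_models):
    """Normalized, activation-weighted sum of local linear controllers.

    x            : query point used for localization
    dx_des       : desired task-space displacement
    local_models : list of (center, d_diag, beta); beta has one row per actuator
    """
    weights = [rf_weight(x, c, d) for c, d, _ in local_models]
    total = sum(weights)  # assumed non-zero, i.e. x is not far from every RF
    n_act = len(local_models[0][2])
    du = [0.0] * n_act
    for w, (_, _, beta) in zip(weights, local_models):
        for i, row in enumerate(beta):
            du[i] += w * sum(b * dx for b, dx in zip(row, dx_des))
    return [v / total for v in du]

# Usage with two hypothetical local models that share the same center spacing.
models = [
    ([0.0, 0.0], [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
    ([2.0, 0.0], [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
]
print(blended_command([1.0, 0.0], [0.5, 0.25], models))
```

The command is computed in a single pass over the local models, which matches the one-iteration computation per time step described in the text.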
One advantage of LWPR is that it can automatically spawn new linear models and the corresponding RFs when new data lying outside all existing RFs are presented. The center of each RF is determined by the input space of new data through incremental learning, as is the total number of local regions n (Fig. 3). Each newly spawned RF is initialized with a diagonal distance metric Di value. This Di value will be updated throughout the incremental learning process to improve the overall regression accuracy and convergence rate. To prevent overfitting and the allocation of too many RFs n, a smaller initial Di value is preferred (i.e., larger RFs). Cross-validation is also employed in determining the initial Di, which is important to ensure that the forward model can be accurately reflected by piecewise linear regression.
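The effect of the initial distance metric on the number of RFs can be sketched as follows. This is a simplified illustration, not the LWPR library itself: it assumes a Gaussian activation and a hypothetical spawning threshold `w_gen`, and it omits the incremental update of the distance metric.

```python
import math

def activation(x, center, d_diag):
    """Gaussian RF activation with a diagonal distance metric."""
    dist2 = sum(d * (xi - ci) ** 2 for xi, ci, d in zip(x, center, d_diag))
    return math.exp(-0.5 * dist2)

def maybe_spawn(x, fields, init_d, w_gen=0.1):
    """Add a new RF centered at x if no existing RF is activated above w_gen."""
    if all(activation(x, f["center"], f["d_diag"]) < w_gen for f in fields):
        fields.append({"center": list(x), "d_diag": list(init_d)})
        return True
    return False

def count_fields(samples, init_d):
    """Stream samples through the spawning rule and count the resulting RFs."""
    fields = []
    for x in samples:
        maybe_spawn(x, fields, init_d)
    return len(fields)

# Ten 1-D samples spaced one unit apart.
samples = [[float(i)] for i in range(10)]
n_small_d = count_fields(samples, [0.1])   # small D -> wide RFs -> few models
n_large_d = count_fields(samples, [10.0])  # large D -> narrow RFs -> many models
print(n_small_d, n_large_d)
```

With the small initial metric only a couple of wide RFs are needed to cover the samples, whereas the large initial metric allocates one RF per sample, which is the overfitting risk mentioned above.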
Despite the fact that each RF could fulfill the local convexity requirement due to redundancy in the robotic system, the solutions of local controllers (Eq. 4) could be inconsistent with the desired solutions.36 Although this problem could be resolved by preprocessing the training data such that it only produces one particular solution, it lacks generality and is difficult to apply in high-dimensional systems.31 Therefore, we employ another approach that reshapes local inverse models using constrained optimization, where the local controllers are enforced to provide consistent solutions from infinite possibilities in the null space of the control space. We then define the optimization problem as follows:
where the cost function Ck represents the user-defined optimality scaled by a diagonal matrix N. is the user-defined null-space behavior. One example of null-space behavior could be minimizing the elongation of the robot, which results in smaller bending radius to facilitate dexterous motion inside enclosed cavity. Finally, the optimization constraint ensures the correctness of the inverse solution.
The reward function is scaled by the mean cost to improve learning efficiency36:
The cost function is then minimized by means of reward-weighted regression, where each local model needed to be updated:
where , , and are the training datasets. The overall procedures of the learning-based controller are summarized in Algorithm 1.
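The following sketch shows one generic way a reward-weighted regression step could be written for a single local model. It is an illustration under stated assumptions rather than the exact update used here: the reward is taken as an exponential of the cost divided by the mean cost, the regression has no intercept, and the task space is assumed to be two-dimensional. All function names and sample values are hypothetical.

```python
import math

def null_space_cost(du, du0, n_diag):
    """Quadratic cost of deviating from the preferred null-space command du0."""
    return sum(n * (a - b) ** 2 for a, b, n in zip(du, du0, n_diag))

def rewards_from_costs(costs):
    """Map costs to positive rewards, scaled by the mean cost (assumed form)."""
    mean_cost = sum(costs) / len(costs)
    if mean_cost == 0.0:
        return [1.0] * len(costs)
    return [math.exp(-c / mean_cost) for c in costs]

def fit_local_inverse(dxs, dus, sample_weights):
    """Weighted least squares without intercept, so that du ~ beta * dx.

    dxs holds 2-D task-space displacements and dus the matching actuation
    changes. Returns beta with one row per actuator and two columns.
    """
    a11 = a12 = a22 = 0.0
    n_act = len(dus[0])
    b = [[0.0, 0.0] for _ in range(n_act)]
    for dx, du, w in zip(dxs, dus, sample_weights):
        a11 += w * dx[0] * dx[0]
        a12 += w * dx[0] * dx[1]
        a22 += w * dx[1] * dx[1]
        for i in range(n_act):
            b[i][0] += w * du[i] * dx[0]
            b[i][1] += w * du[i] * dx[1]
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("task-space samples do not span both directions")
    inv = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]
    # beta = B * A^{-1}
    return [[row[0] * inv[0][0] + row[1] * inv[1][0],
             row[0] * inv[0][1] + row[1] * inv[1][1]] for row in b]

# Synthetic data generated from a known linear map, for checking the fit.
beta_true = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
dxs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
dus = [[sum(b * x for b, x in zip(row, dx)) for row in beta_true] for dx in dxs]
du0 = [0.0, 0.0, 0.0]  # hypothetical preferred command
costs = [null_space_cost(du, du0, [1.0, 1.0, 1.0]) for du in dus]
beta_fit = fit_local_inverse(dxs, dus, rewards_from_costs(costs))
print(beta_fit)
```

In a full controller the sample weight would typically also include the RF activation of each sample, so that every local model is fitted mainly on data from its own region.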
The proposed control framework is implemented on a custom-made soft robot to investigate its performance and behavior under external dynamic constraints. We have also attempted to utilize FEA to simulate robot motion data for pretraining of an initial control policy. This can avoid the need for random exploration of its robot workspace to initialize online learning functions. Such exploration is usually time-consuming and may not be practical, particularly for single-use purposes in surgical applications. Accuracy and stability of the proposed controller are examined through path following under various constrained environments. The interaction force with the external constraint is also measured throughout the experiments. The control block diagram of the overall robotic system, including the processing core and actuation system, is illustrated in Figure 6.
Proper initialization of pretraining data is essential to many online learning techniques. These preceding data are dedicated to pretraining an initial control policy before the online learning begins. It is usually acquired by driving the robot with random input. Instead, we proposed to incorporate FEA, by which robot deformation can be simulated with a hyperelastic computation model. This simulation can generate comprehensive pretraining samples that cover the entire robot workspace at a high resolution, facilitating offline pretraining of the learning-based controller (Fig. 5).
The FEA model of the robot is constructed using ABAQUS47 to predict the robot kinematics and workspace. RTV silicone rubber is considered as an incompressible hyperelastic material formulated by the Ogden material model.48 It exhibits negligible volume change under hydrostatic compression and has a Poisson's ratio close to 0.5. Due to the incompressibility of silicone rubber and the large deformation nature of the simulation, the element formulation and mesh quality have a strong effect on both accuracy and convergence of the simulation. Therefore, a hexahedral element (C3D8RH; Fig. 1c) based on u-p hybrid formulation with hourglass control47 is chosen over the commonly used quadratic tetrahedral elements (Fig. 1d) in the FEA of our soft robotic manipulators.
The C3D8RH element possesses eight displacement nodes and one interior pressure node. The combination of these displacement and pressure nodes is often close to optimal.49 Such integration scheme improves not only element efficiency but also element accuracy under bending load. However, compared with tetrahedrons, automatic mesh generation of hexahedrons is relatively ineffective, resulting in poor tessellation quality. To this end, the presented meshing has to be obtained by custom-designed protrusions, and all elements are right prisms initially. By restoring the mesh quality, the assemblage contains far fewer elements and is much more robust in convergence.
The presented manipulator model is tessellated with 12k linear hexahedral elements (C3D8RH; Fig. 1c). There are also 2,214 linear truss elements (T3D2) being placed along each actuation chamber in a layer-by-layer arrangement (Fig. 1b). Truss elements are used to model the helical strain-wrapping constraints that ensure the anisotropic expansion of chambers upon pressure actuation. Actuation and gravity loads are applied to the presented FEA model. The gradual change of the stress input, which is distributed across the surface mesh along the inner chamber surface, guarantees reliable convergence, giving rise to an equilibrium solution throughout all the time steps during the FEA.
Quasi-static motion with negligible hysteresis can be achieved when the real robot prototype is manipulated while delicately regulating the inflation pressure into the chamber at high-resolution steps. It is worth noting that deformation/bending of both the FEA-modeled manipulator and the actual one are very similar corresponding to the same levels of inflation pressure simulated, as shown in Figure 4. Over 1,000 simulated motion samples have been obtained using the FEA, covering the entire robot workspace (Fig. 5). These simulated data are adopted to pretrain the online learning controller as described in the following sections.
To evaluate the proposed control performance, three motorized pneumatic units are employed to actuate the presented soft manipulator, which is incorporated in our closed-loop control testing platform (Fig. 6). Each unit consists of a pneumatic cylinder coupled to a precise stepper motor through a lead screw transmission. This facilitates accurate regulation of air flow. Our soft robotic manipulator can be fully articulated in a dome-shaped workspace with a maximum curve angle of >150° in all directions.
An EM tracking system (NDI Medical Aurora) is employed to close the robot control loop by the continuous positional data feedback (Fig. 7a). This tracking system is commonly available in many image-guided intervention systems. It can track the position and orientation of tiny EM coils in real time with root mean square (RMS) accuracy of 0.7 mm and 0.2° at 40 Hz. A tiny tracking coil is embedded at the robot distal tip. Online updating (at 20 Hz) of the inverse mapping estimation by the local learning algorithm is achieved, where is measured tip direction. The positional data are also recorded throughout the robot task so as to evaluate overall control performance. The entire control framework is implemented in the MATLAB environment. The open-source library of LWPR50 is employed to incrementally learn the robot forward model, which determines valid linearization of each local controller.
A series of path-following tasks is performed under various constraint scenarios to investigate how the online learning control approach reacts to such unknown interactions. At the beginning, the robot is allowed to move freely in its workspace without any interference. This serves as the control experiment to establish the baseline of controller performance. Subsequently, the robot is gently pushed by a plastic rod to simulate an unknown dynamic interaction with the robot manipulation (Fig. 7b). The rod is actuated by a high-precision stepping motor to generate repeatable contact with the robot body; meanwhile, the contact force is monitored by a force/torque sensor (ATI Industrial Automation: F/T Nano17). The tracking error is defined as the shortest distance between the robot targeting direction and the desired trajectory.
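The tracking-error definition can be sketched as a point-to-polyline distance. This is an illustrative sketch that assumes the targeting direction and the desired trajectory are both expressed as (pitch, yaw) pairs in degrees and that a planar Euclidean distance in that plane is adequate; the function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    if seg2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter, clamped so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def tracking_error(direction, trajectory):
    """Shortest distance from the measured direction to a polyline trajectory."""
    return min(point_segment_distance(direction, a, b)
               for a, b in zip(trajectory, trajectory[1:]))

# Usage with a hypothetical L-shaped reference path in the pitch-yaw plane.
path = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
print(tracking_error((5.0, 2.0), path))  # 2.0
```

Mean and maximum errors such as those reported in the tables would then be the average and the largest value of this quantity over all recorded samples.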
To realize accurate navigation under unknown constraints, the inverse model is adapted in the proposed learning-based controller, which has to be updated online based on the newly acquired motion data. In this study, we compared three types of data sources for the inverse model training: (1) pretrained by FEA data without using online data; (2) initialized by random exploration with online learning data; and (3) pretrained by the FEA data, and then updated by online data. These online-updated inverse models are evaluated for resolved motion rate control51 to track a predefined trajectory. Thus, the desired task space displacement that tracks the reference input is obtained as follows:
where and are the reference task space displacements and coordinates generated from interpolating a predefined trajectory. Note that the reference input can be replaced by manual control in actual endoscopic navigation scenario. We employed the same proportional–derivative (PD) gain for all three settings to perform tracking along a reference trajectory. Thus, the actuation input is estimated by the online learning inverse model as depicted in Equation (4).
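One possible form of the desired task-space displacement is sketched below. It assumes that the reference displacement acts as the rate term and the pose error as the proportional term; this particular split, the function name, and the gain values are assumptions made for illustration and may differ from the exact law used here.

```python
def desired_displacement(x_ref, dx_ref, x_meas, kp, kd):
    """Desired task-space step: rate term on the reference displacement plus a
    proportional correction of the current pose error (assumed form)."""
    return [kd * d + kp * (r - m) for r, d, m in zip(x_ref, dx_ref, x_meas)]

# Usage: reference at (10, 5) deg moving by (1, 0) deg per step, robot at (8, 5) deg.
step = desired_displacement([10.0, 5.0], [1.0, 0.0], [8.0, 5.0], kp=0.5, kd=1.0)
print(step)  # [2.0, 0.0]
```

The resulting displacement would then be passed to the learned inverse model to obtain the change in actuation input.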
To enforce the consistency of inverse mapping among all localized linear controllers, a standard null-space behavior is defined. This gives rise to an immediate reward function to weigh the training data that best imitate the desired null-space behavior (Eq. 9). For the presented soft robot, we first choose a rest configuration to be , which can minimize the overall inflation pressure as well as elongation of the manipulator. Then, the robot is attracted toward the rest configuration with a loose attractor function , where . We defined an identity metric as all three inflatable actuators of the robot are identical and should contribute the same in achieving the desired null-space behavior.
It is also necessary to normalize the training dataset into the same scale component-wise so that the LWPR can learn the data variance properly. Min-max normalization is a simple but effective technique commonly used52:
However, the statistical max(qi) and min(qi) values would be sensitive to outliers; therefore, we define the min–max values according to the physical constraints of data, including the typical robot workspace and the maximum volume of the cylinder unit.
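A minimal sketch of component-wise min-max normalization with fixed physical bounds is shown below. The bounds used in the example (a ±150° angular range and a 0 to 150 unit actuator range) are hypothetical placeholders, not the values used in the experiments.

```python
def normalize(q, q_min, q_max):
    """Scale each component of q to [0, 1] using fixed physical bounds."""
    return [(v - lo) / (hi - lo) for v, lo, hi in zip(q, q_min, q_max)]

def denormalize(q_norm, q_min, q_max):
    """Inverse of normalize: map values in [0, 1] back to physical units."""
    return [lo + v * (hi - lo) for v, lo, hi in zip(q_norm, q_min, q_max)]

# Hypothetical bounds: an angle in [-150, 150] deg and a volume in [0, 150].
q_min = [-150.0, 0.0]
q_max = [150.0, 150.0]
print(normalize([0.0, 75.0], q_min, q_max))  # [0.5, 0.5]
```

Fixing the bounds in advance keeps the scaling constant throughout online learning, so a single outlying sample cannot shift the normalization of all other data.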
In this setting, both the forward model and control policy are pretrained solely by the FEA-simulated data (see the Initialization of online learning by FEA-based model section). The online data were not taken into account in this setting. This acts as a control experiment to depict the actual influence of external interactions. In the unconstrained experiment (Fig. 8a), it was observed that the controller could roughly follow the trajectory with a relatively large tracking error of ±1.79° and a maximum error of ±6.96° with the use of the feedback controller (Table 1). Despite the considerable discrepancy between the FEA-simulated and actual configuration, this experiment still demonstrates that the FEA data are capable of pretraining a reasonable inverse model for rough path following.
In the later constrained experiment (Fig. 9a), the robot maintained tracking of the trajectory with similar accuracy at the beginning. When the external interaction was engaged at 25 s, the robot was pushed further away from the desired trajectory, resulting in an increased mean tracking error of ±4.64° and a maximum error of ±14° (Table 2). This indicates that the feedback controller cannot fully compensate for the significant motion bias induced by external disturbance.
In the case of a conventional rigid-link robot, this kind of error due to the interaction with the constraint is often considered as a perturbation. The error can hence be compensated by increasing the feedback control gain, given that the inverse model is readily available from the kinematic chain. However, such an approach is not directly applicable to a soft robot because its mechanical compliance inevitably induces much larger positioning errors. In addition, the interaction force may also alter the force equilibrium of the robot and therefore substantially degrade the reliability of the predetermined inverse model. The following experiments demonstrate how the proposed online algorithm can accommodate the influence of a constrained environment, which is particularly demanding for the control of soft robots.
The random exploration of robot workspace is a typical approach34 to initialize a data-driven controller before its actual deployment. This kind of arbitrary movement is necessary to provide preceding data for setting up a learning model. It involves tracking 50 random input pressure waypoints with a PD feedback controller. The deliberately tuned PD gains can cause poor tracking of random waypoints. Such babbling movement (green path in Figs. 8b and 9b) can facilitate a faster learning rate as the robot sweeps throughout a wider neighboring workspace. Pretraining with the exploration data resulted in a forward LWPR model with 110 RFs, which define linearization for the piecewise linear inverse model in advance of actual deployment of the online learning.
Upon exploration, the online learning controller could follow the desired trajectory with an average error of ±1.13° in the first cycle under the constraint-free environment (Fig. 8b). The error was found to be significantly lower than that of the inverse model pretrained by FEA-simulated data. This is reasonable because the actual robot data were used. After a few cycles, the tracking error further decayed to an average of ±0.87° and a maximum of ±1.92° as the online learning controller adapted to the trajectory.
Next, the feasibility of online inverse model adaptation was validated by engaging external force interactions (Fig. 9b). The online learning controller can compensate the bias and hence reduce the error to an average of ±2.35° within 5 s upon contact with the constraint. The external constraint is moved away after 30 s of contact. It is also worth noting that the controller could quickly update the inverse mapping online and follow the trajectory with high accuracy. No control instability is observed throughout the experiment. The pure online learning approach achieves the highest average accuracy among all settings, for both constrained and unconstrained scenarios (Tables 1 and 2). However, the need for initialization by babbling motion (green path in Figs. 8b and 9b) should be avoided in clinical scenarios to prevent unnecessary interactions with patient anatomy.
To alleviate the need for random exploration, we attempted to pretrain the controller with FEA data and then update the inverse model by online learning. This approach combines the advantages of both aforementioned settings, in that the inverse model can be initialized with FEA data. The robot can immediately begin navigation using this pretrained model without the need for initialization through undesired babbling movement. The subsequent manipulation data are also acquired to incrementally train a more precise inverse model so as to adapt to external interactions. This feature is demonstrated in Figure 8c, in which the robot is allowed to move freely.
Although the robot begins with a relatively large tracking error of an average of ±2.21° and a maximum of ±7.49° in the first cycle, the error is quickly compensated by the online learning and converges to an average of ±0.90° and a maximum of ±2.80°. This tracking result is compared with the other two approaches in Table 1. In the first cycle, the combined approach exhibits a tracking error close to that of pretraining with FEA only (average ±2.21° vs. ±1.79°) because both inverse models are initialized with less accurate FEA data. The learning technique then corrects the inverse model with online data so that the tracking error decreases rapidly and becomes comparable with the pure online approach (average ±0.90° vs. ±0.87°).
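Per-cycle figures such as the average and maximum errors quoted above could be computed from a logged error trace along the following lines. This is a hedged sketch: the function name and the assumption of a fixed number of samples per cycle are illustrative, and the source does not describe how its statistics were actually computed.

```python
import numpy as np

def cycle_error_stats(errors_deg, samples_per_cycle):
    """Return (mean_abs, max_abs) arrays, one entry per complete cycle.

    errors_deg: 1-D signed tracking errors in degrees (hypothetical log).
    samples_per_cycle: assumed constant number of samples in each cycle.
    Trailing samples that do not fill a whole cycle are ignored.
    """
    errors = np.abs(np.asarray(errors_deg, dtype=float))
    n_cycles = errors.size // samples_per_cycle
    per_cycle = errors[: n_cycles * samples_per_cycle].reshape(
        n_cycles, samples_per_cycle
    )
    return per_cycle.mean(axis=1), per_cycle.max(axis=1)

# Example with a made-up trace of two cycles, two samples each.
mean_abs, max_abs = cycle_error_stats([2.0, -4.0, 1.0, -1.0], samples_per_cycle=2)
```

Comparing the first entry with the last entry of each array gives the kind of first-cycle versus converged comparison reported in Table 1.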
This shows that the combined approach can initialize a reasonable learning-based controller with less accurate FEA data, then further refine the inverse model while performing the tracking task. Note that the combined approach does not require random exploration (green path in Figs. 8b and 9b) to obtain pretraining data, with which it is difficult to cover the entire robot workspace with sufficient density.
This combined approach is also capable of adapting to the unknown external interaction (Fig. 9c). The controller can quickly adapt the inverse mapping upon contact with the external constraint at 36 s. It continues to follow the trajectory with a small mean absolute error of ±2.49°. The controller also remains stable and readapts after the removal of the constraint. Readers may also refer to the attached Supplementary Video (Supplementary Data are available online at www.liebertpub.com/soro) for extra details about the robot behavior and the characteristics of the constraint.
In the Evaluation of online local learning controller section, we presented the challenge of learning an inverse model spatially localized by the unmeasurable robot state, as well as how this robot state can be retrieved indirectly from sensory measurements. These trajectory tracking experiments have shown that the inverse model could be successfully learnt by continuous updates of both the task space coordinate and the control input. Both are set as the localization parameters required in the inverse model. Therefore, the robot state could be estimated sustainably by the learning algorithm. These 3-6D positional data updates are clinically practical. Comparable position-tracking techniques designed for image-guided interventions are also under active research,53 one example being magnetic resonance imaging-guided endoscopic retrograde cholangiopancreatography.
We have proposed a model-free control framework that adopts an online nonparametric local learning technique for manipulation of a redundantly actuated, fluid-driven soft continuum robot in the presence of a dynamic external disturbance. Nonparametric techniques are capable of constructing highly nonlinear functions solely from measured data, which is particularly suitable for characterization of a hyperelastic robot structure. To accommodate the flexibility of the soft robot body, we approximate the global inverse kinematics by a linear combination of many locally learnt inverse kinematic models.
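The idea of combining many local linear models into one global approximation can be illustrated with a small sketch. It assumes Gaussian receptive fields with diagonal distance metrics and a normalized weighted average, which is the general form used by locally weighted regression methods; the array layout and the function name are assumptions for this example, not the paper's code.

```python
import numpy as np

def blended_prediction(query, centers, widths, slopes, offsets):
    """Blend M local linear models into one prediction.

    query:   (d,)      input point
    centers: (M, d)    receptive field centers
    widths:  (M, d)    diagonal distance metric of each receptive field
    slopes:  (M, o, d) local linear slopes
    offsets: (M, o)    local predictions at each center
    Assumes at least one receptive field has non-negligible activation.
    """
    query = np.asarray(query, dtype=float)
    diffs = query - centers                                      # (M, d)
    weights = np.exp(-0.5 * np.sum(widths * diffs**2, axis=1))   # (M,)
    local = np.einsum("mod,md->mo", slopes, diffs) + offsets     # (M, o)
    return weights @ local / weights.sum()                       # (o,)

# Two constant local models centered at 0 and 2; halfway between them
# both receptive fields are equally active.
centers = np.array([[0.0], [2.0]])
widths = np.ones((2, 1))
slopes = np.zeros((2, 1, 1))
offsets = np.array([[1.0], [3.0]])
midpoint = blended_prediction([1.0], centers, widths, slopes, offsets)
```

Because each local model only matters where its receptive field is active, a new observation changes the prediction in its own neighborhood and leaves distant regions of the workspace untouched.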
Our model-free controller employs this global approximation, where the behavior of the redundant actuator can be optimized by a user-defined criterion while simultaneously fulfilling the control objective defined in task space coordinates. In addition, the controller is adaptive to changes in the environment, where each local model can be updated online independently according to newly acquired data. This equips the robot with the ability to maintain control accuracy under external dynamic disturbance. Our work is the first attempt to implement such direct inverse modeling using an online nonparametric learning technique to control a redundantly actuated soft continuum robot.
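To make the notion of independently updating a local model online more concrete, the sketch below uses weighted recursive least squares with a forgetting factor. Note that this is a swapped-in, simpler technique chosen for illustration: LWPR itself relies on incremental partial least squares, and the class name, the forgetting factor, and the initial covariance are all assumptions.

```python
import numpy as np

class LocalLinearModel:
    """One local linear model updated by weighted recursive least squares."""

    def __init__(self, n_in, n_out, forgetting=0.99, p0=1e3):
        self.W = np.zeros((n_in + 1, n_out))   # last row is the bias term
        self.P = np.eye(n_in + 1) * p0          # inverse covariance estimate
        self.lam = forgetting                   # < 1 discounts old samples

    def predict(self, x):
        return np.append(np.asarray(x, dtype=float), 1.0) @ self.W

    def update(self, x, y, weight=1.0):
        """Incorporate one sample; weight (> 0) is the receptive field activation."""
        phi = np.append(np.asarray(x, dtype=float), 1.0)
        p_phi = self.P @ phi
        gain = p_phi / (self.lam / weight + phi @ p_phi)
        error = np.asarray(y, dtype=float) - phi @ self.W
        self.W += np.outer(gain, error)
        self.P = (self.P - np.outer(gain, p_phi)) / self.lam

# Example: learn y = 2x + 1, then adapt after the offset shifts to 3,
# loosely mimicking a bias introduced by an external constraint.
model = LocalLinearModel(n_in=1, n_out=1)
xs = np.linspace(-1.0, 1.0, 50)
for _ in range(4):
    for x in xs:
        model.update([x], [2.0 * x + 1.0])
before_shift = model.predict([2.0])[0]
for _ in range(10):
    for x in xs:
        model.update([x], [2.0 * x + 3.0])
after_shift = model.predict([2.0])[0]
```

The forgetting factor is the design choice that lets the model track a changing environment: older samples are discounted geometrically, so the estimate follows the new mapping once contact conditions change.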
We have also incorporated FEA into the learning control framework for proper initialization of the robot inverse model. It enables precise prediction of the hyperelastic robot deformation under various actuation pressures, without the need for an oversimplified analytical model. It can also offer adequate sample data covering the entire workspace at high resolution. This avoids the need for time-consuming random exploration to initialize the learning model, which may not be practical in many surgical applications. The proposed controller can hence be initialized offline using FEA-simulated data, ready for an endoscopic navigation procedure.
The proposed novel control framework has been experimentally validated. In the constrained experiment, after FEA-based initialization of the controller, the endoscope prototype could follow a 3D trajectory with an accuracy of mean ±2.21° and maximum ±7.49° and attained almost the same tracking accuracy (mean ±2.49° and maximum ±11.03°) within 5 s of addition/removal of the external disturbance (maximum 1 N). This is also the first demonstration of model-free closed-loop control of a fluid-driven soft continuum robot in 3D task space, even under dynamic external disturbance.
The current form of our learning-based control method is designed for a single-segment manipulator. In our future work, we intend to extend the framework to address soft manipulation with multiple segments.54 A cascade of multiple actuation modules provides enhanced manipulation flexibility for interventional tools, facilitating more complicated operations in a confined space. In this case, a generic optimization function will be developed to resolve the null-space control of a hyper-redundant robot.55 Further characterization of such multisegment soft manipulators will be investigated. Addressing this hyper-redundancy will also require additional sensory systems or algorithms to parameterize the possible motion transitions of the robot configuration, thus estimating the inverse model for the higher DoF robot.
This work is supported, in part, by the Croucher Foundation, the Research Grants Council (RGC) of Hong Kong (Ref. Nos. 27209151 and 17227616), the Innovation and Technology Fund (ITF) of Hong Kong (ITS/361/15FX), and NISI (HK) Limited.
No competing financial interests exist.