. The business process execution language is a specification designed to support the high level orchestration of web services. The heart of the BPEL specification is the scripting language which defines how services and data produced by them are linked together. This specification is rich enough to allow for most workflows and defines both how method invocations and data are linked and the how web services should be coordinated (e.g. concurrency, choices, sequential operations). The specification also defines extensions to the WSDL which can be used to specify links between services (which can be discovered dynamically through UDDI querying). A number of implementations are available, including those that run under JBOSS [22
] and ODE from Apache [23
]. Information about the specification can be found at OASIS.
CORBA. The Common Object Request Broker Architecture supports interoperability between distributed processes (applications). Central to the architecture is an ORB (object request broker) which both marshals data and controls compartmentalisation (to allow for invocation on specific the threads etc) of between the different processes. The specification was defined by the OMG, and ORBS are available for most platforms.
. The Distributed Annotation System [24
] defines a protocol for the retrieval of annotations from genomes. Requests are sent using http encoding which returns an XML document. A number of clients and servers are available, and it is used in a number of large scale genome projects (e.g. Ensembl).
DBMS. A database management system is the environment in which database instances exists. A DBMS provides a unified framework which can be used to control the physical (tablespaces), conceptual (logical schemas) and external (views) of databases.
DCOM. Provides for a means to make distributed calls between COM (Component Object Model) objects. Thread compartmentalisation and marshalling (using low level XML interchange) is handled automatically. For application developers this has largely been superseded by .NET Remoting.
. An Enterprise Java Bean (EJB) is a server side component that lives inside an EJB Container. There are different types of EJBs, and each has a different purpose: an entity bean
which serves as a data cache from an underlying data store, this is used for transformation and data integration logic; a session bean
which is typically used to hold application logic which communicates with information stored within entity beans; a stateless session bean
which typically represent simple stateless logic and generally act as end points for high level services; and a message bean
which is used to pass message between the other beans. The EJB 3.0 standard (2006) represented a major change as the complexities of developing EJBs was inhibitive for most projects, so a simplified process of building EJBs was outlined (through the use of detachment, dependency injection and aspects). Full details are available from Sun [25
. The EJB container controls the life cycle of the bean, facilitates access to core services and manages server-based resources. The services that are available through a EJB Container provide: security management, including method level and ACL; transaction management; including 2 phase/distributed commits; life cycle management, including pooling of bean instances and swapping beans in/out (persisting) of memory for resource management; naming/directory management, typically through JNDI; persistence management; using ORM tools such as hibernate; remote access management, so that the bean can be accessed via RMI-IIOP/CORBA and Web Services; and a number of utility services (e.g. mailing, clustering, caching, monitoring). A widely used, and free, container is JBOSS [26
I3C. The I3C was a short lived commercially led organisation established to standardise aspects of life science informatics. The organisation was led by Oracle, Sun and IBM. The I3C did promote the use of LSIDs, which have been adopted by the OMG.
IDL. The Interface Definition Language formalises the remote interfaces that can be accessed through CORBA. IDL has evolved considerably, with the advent of pass-by-value and components (facets). A WSDL serves the same type of purpose for Web Services.
. The Java Content Repository is a specific Java Standard (JSR-170) for defining the interface to a content repository. A content repository is a flexible system that is typically customised for a specific usage, when customised it is referred to as a Content Management System (CMS). A number of implementations are available, including Jackrabbit which is licensed through Apache [27
. The Life Science Identifier standard [28
] provides a concrete definition and implementation of a URN. The LSID specification outlines how the URN is resolved to two locations (the data and the metadata) through the use of "an authority". In this way the authority acts as a registry. The documents that are retrieved are returned as objects and an associated RDF data file which encodes the metadata. The standard also encompasses many aspects of using URNS, and includes specifications for associated services (e.g. assignment). Details about the specification and implementations are available [29
. The Life Science Research group of the OMG [4
] defines standard in the "vertical" life science domain. The body have defined and adopted a number of standards. These standards cover a wide range of areas (including "sequence" and "literature").
MDA. A Model Driven Architecture is one where the model underlying the system is defined in a language independent way, and the corresponding services/classes are automatically pushed out from that model. Typically the model is defined in UML and them XMI is used to automatically generate stubs/skeletons which can be used to provide implementations of the model.
MIDL. The Microsoft Interface Definition Language serves a similar purpose to IDL, but is generally based on specifying the remote procedure call interface which is used between COM components.
.NET Remoting. The .NET framework provides a mechanism for making remote calls called Remoting. Remoting includes many useful features for the development of distributed systems, these include: life cycle management, so that distributed behaviour/GC and leasing can be controlled; protocol support for binary socket based communication and other streams; and specification of the behaviour of a remote service/object (e.g. singleton).
ODBC/OLEDB. The Open Database Base Connectivity is a definition of the interface presented by a DBMS. The ODBC specification is well established and bridges with other technologies (including JDBC). The OLEDB is an extension to the ODBC offer richer functionality.
. The Object Management Group [30
] is an open not for profit standardisation body. The OMG have produced a number of horizontal (e.g. Trader service, Naming service, Event Service) and vertical (see LSR [4
]) standards for use with CORBA.
. The Web Ontology Language is an RDF description of an underlying data resource. The ontology describes the data items produced through a web service as well as the relationships between them. Details are available from the W3C [31
. The Resource Description Framework is a W3C (WWW consortium) standard for describing resources available on the Web. RDF forms the basis of formalised descriptions of services (e.g. OWL) and can be used in conjunction with extensible metadata descriptions (e.g. Dublin core [32
]). RDF consists of a series of connected triples, so that complex representations can be constructed as a graph. Details are available from the W3C [33
. Representational State Transfer (REST) [34
] can be considered an alternative to SOAP, although it is considerably easier to implement. REST uses pre-existing technologies as the basis for the protocol (e.g. "the web is the platform"). There exists some confusion about what represents a Restful service, rather than just an HTTP encoded request for an XML document. True REST is based upon the verb/noun/type based calls, where you apply an operation (verb e.g. POST, GET, PUT and DELETE) to a URI (noun) with a certain view (type).
RMI. Remote Method Innovation is a Java-to-Java solution for communication between distributed Java threads/applications. RMI uses a number of abstraction layers (remote reference layer/RRL and transport layer), this has a number of advantages including the fact that different underlying protocols can be used to actually provide the communication (e.g. IIOP). Marshalling is done through serialization, leasing is available, and distributed GC is supported. RMI is a convenient, but not interoperable, protocol.
SOA. A Service Oriented Architecture is one which consists of loosely coupled federated services. There is typically little linkage between these services, and they are generally discovered dynamically using a registry system or similar. SOAs have grown in popularity within many enterprises, as they provide a practical mechanism for disparate groups to share information/processes.
. SOAP is a protocol for making requests on remote services to return structured data. It is designed to use a any high level protocol that supports the sending of information, and is primarily used with HTTP. Much like CORBA, interoperability is the big draw of SOAP, and (unlike CORBA) SOAP has the advantage of being simple to develop and test. The original stateless nature of SOAP limited it usage, however with the advent of WS-RF (and other standards) SOAP is maturing into a general purpose object protocol. More information about SOAP specifications is available from the W3C [35
. The SPARQL Protocol and RDF Query Language is designed to allow for the querying and retrieval of documents across multiple unstructured data stores. The power of the system is the distributed RDF documents (or other data stores) remain unchanged, but queries can be run across them – and so it fits well with a "bottom-up" approach. Such a unified approach to accessing information is required to make the semantic web (Web 3.0) a reality, and there do already exist some implementations (e.g. Virtuoso [36
]). More information about the query specification is available from the W3C [37
. Universal Description Discovery and Integration is a WSDL (therefore interoperable) defined registry/dictionary system. UDDI 2.0 is the currently used version, and it supports the registration and querying of Web Services using specific mappings. OASIS [38
] have details of the different standards (version 2 and 3), and Apache have jUDDI which is a 2.0 implementation [39
URN. A Uniform Resource Name is a type of URI (Uniform Resource Identifier). It is the logical counterpart to a URL, in that it provides the name of a resource rather than the exact location of a resource. A number of URN implementations are available, including LSIDs.
Web Services. A Web Service is a server which performs request/response operations and (generally) works using documents. A request is sent (either as a well formed document or using http encoding), and a well formed document is returned. The term Web Service originates from the fact the web based protocols are generally used to provide the communications.
. The WS-* are a series of specifications for adding functionality to SOAP. These extensions provide new functionality such as security, messaging, binary object attachment and state. These extensions generally involve the addition of information to the SOAP message (within the envelope). State information can be maintained between SOAP calls through the use of resource frameworks (e.g. WS-RF). OASIS [38
] keep a large number of specifications.
. The Web Service Description Language provides a means to specify the interface exposed by a SOAP Web Service. The WSDL document can be automatically retrieved, and tools can be use to generate convenience classes for specific languages, so that no XML parsing code needs to be written by the developer. When writing a WSDL a number of standards (e.g. WS-I) are available to ensure interoperability, typically though the use of profiles with literal/document "styles". The W3C have details of the standard [40