The Restaurant Problem [17
] is a classical hypothetical problem used to illustrate concepts of machine learning. The problem describes a decision process by which restaurant guests who have just arrived at a restaurant need to decide whether to wait for a table or go elsewhere. The decision is made based on answers to several questions such as the approximate waiting time, and the weather outside. The restaurant problem decision tree was used to create a complex Boolean function from the problem description . This function was chosen because its output exhibits a nontrivial structure. For example, some inputs are more influential in determining the output than others. Each of the ten binary inputs corresponds and decodes either one question or part of a question from the decision tree. Questions with more than two answers are decoded with two inputs. For example, the “Patrons?”  question is decoded by inputs 1 and 2, such that 00 means none, 01 means some, 10 means full, and 11 is not a valid option. The function has ten binary inputs (attributes) and one corresponding binary output (class)
FIG. 1 A schematic representation (adapted from Russell and Norvig ) (a) of the decision making process in the “restaurant problem,” and the conversion of the decision tree to a Boolean function (b) and (c). The problem describes a scenario (more ...)
This equation produces 210 input/output relationships. A number of inputs/output relationships (i.e., 100) were chosen randomly to create training sets. This complex Boolean function is used to train growing and adapting networks.
Growing network models attempt to decode correctly a set of examples randomly picked from the Boolean function solution space. The networks adapt to decode correctly examples through an evolutionary strategy growth process [18
]. The growing networks are connected undirected graphs
with a fixed number of source nodes (vertices):
, a variable number of core nodes:
, and one target node (sink): Vt
}. The nodes are connected by a set E
of edges (links). This setup is similar to an array of cell surface receptors (inputs) connected through multiple core nodes (transducers) and a downstream transcription factor (effector/output). The seed network is a network (undirected graph) where each source node is linked, with one link (edge), to a unique core node
Each core node is directly linked, with one link, to the single sink target node
In this implementation, source nodes represent n=10 inputs directly connected to m=10 core nodes connected to one output sink node .
FIG. 2 Initial network and a network after growth and adaptation. (a) The initial network with ten input nodes and ten core nodes. (b) An example of a SF model network after a growth period of 41 generations. For each generation a training set of 100 inputs/ (more ...)
These seed networks grow using three previously described algorithms: duplication/divergence (DD) [12
], scale free (SF) [11
], and exponential (EXP) [11
]. Duplication/divergence networks grow by duplicating a randomly selected core node with its links, and then removing some of the links from the newly generated node
The set of Neighbors (nodes that are directly connected to the specific node) of the newly added nodes is a subset of the Neighbors of a randomly chosen node that is copied. When a node is duplicated there is also a chance that the newly created node will also have a link to the node it was copied from. The algorithm for this growth process uses the probabilities p
=0.1 for connecting the new node with the copied node and the probability p
=0.7 for removing new links (neighbors) from the newly generated node. These p
values were chosen because they were used to generate a network similar to the protein-protein interaction network of Saccharomyces cerevisiae
]. Additionally, a rule was added to ensure that newly generated copied nodes are not isolated. New nodes must have at least one path to one source node, and at least one path to the target node. This rule was used for all three growing networks methods.
The scale-free networks (SF) grow by randomly adding one new core node [Eq. (5)
] and connecting it to any other two or three existing nodes (with probability p
The preferential attachment probability selects nodes for making a connection to the new node based on their current connectivity
is the number of Neighbors directly connected (number of edges) to the existing core node
. Barabasi and Albert showed that this model produces scale-free networks [11
]. The exponential (EXP) networks, also introduced by Barabasi and Albert, follow the same growth rules as the SF networks, except that here the probability of connecting new node to any existing nodes is even
Barabasi and Albert showed that networks that grow with this rule display connectivity distributions that follow an exponential distribution [11
Adding one node to a growing network is considered one generation. In each new generation, several alternative networks (Nalt
) are generated with one additional core node. The fitness of each new network is evaluated by the network’s ability to decode correctly a training set which consists of several (i.e., 100) inputs/output known relationships [Eq. (1)
] from the “restaurant problem” Boolean function dataset. The fittest network is then selected, and the process is repeated until networks grow to a certain size (i.e., 65 nodes) . The size of 65 nodes was chosen because networks stop improving their adaptation ability at this size.
The fitness evaluation uses the count of alternative pathways from each input node to the output node to determine the weight of each input. To determine the fitness two strategies were used: The first strategy considers the links to have a neutral sign. The number of all pathways starting from input nodes with the value 0 is compared to the number of all pathways from input nodes with input value 1. If there are more “1” pathways to the output node, the predicted output will be 1, and if there are more “0” pathways, the predicted output will be 0. Thus, the initial network, before adaptation and growth, is a simple voting function. The second strategy randomly assigns a sign (+ or −) to each link when the link is created. The sum of all positive pathways starting from input nodes with value 1 is added to the sum of all negative pathways from input nodes with input value 0; this sum is compared to the sum of all positive pathways starting from input nodes with value 0 added to the sum of all negative pathways from input nodes with input value 1. If there are more positive pathways from “1” inputs and negative pathways from “0” inputs the predicted output will be 1, and 0 if the opposite is true. Positive pathways in addition to containing positive links may contain an even number of negative links or no negative links. Negative pathways have an odd number of negative links. Thus, the algorithms for determining the output are as follows: The input parameters are a network G [Eq. (2)
] and a set of attributes: I
The algorithm computes the class (either 0 or 1) by use of the following procedure:
- Count number of unique pathways (walks) Wi from each source nodes to the sink target node
- Count number of positive pathways and negative pathways from each input (this step is only needed for the second implementation)
- Assign to each source node its matching input attribute
- For the first implementation, the total number of pathways with input 0 is compared to the total number of pathways with input 1. If there are more pathways with the input 0 then the Class would be 0, or 1 otherwise; if
then the Class is 1; else it is 0, where Ī is the complement of I.
- For the second implementation, the total number of positive pathways with the input 1 and the total number of negative pathways with the input 0 are compared to total negative pathways with input 1 and positive pathways with input 0. If there are more positive pathways with the input 1 and negative pathways with the input 0 then the Class would be 1, or 0 otherwise; if
then the Class is 1; else it is 0.
As a control, networks that grow under selective pressure were compared to networks that grow randomly. For networks to grow randomly, in each generation one core node is added according to the same rules described above without the fitness and selection process, namely Nalt=1. The randomly growing networks are created to demonstrate that the evolutionary selection process produces networks that adapt to the Boolean function better than by chance. Thus, the randomly growing networks are not expected to improve in their adaptation as they grow. Randomly growing networks are growing according to the previously proposed methods without adaptation such that their topology can be compared with networks that adapt.
The topological structural features of the adaptation-driven artificial networks were compared to that of real-world systems abstracted to networks with many inputs and one output. For this comparison a biological mammalian regulatory cellular signaling network and an air traffic control network were used. Signaling pathways thought to exist in mammalian hippocampal CA1 neurons were assembled in silico
from known binary interactions extracted from biomedical literature to form a connections map [19
]. This abstract representation considers biochemical components, such as proteins, as nodes and binding interactions or enzymatic reactions as links. Links represent direct molecular interactions and are associated with an effect. Thus, the network is a partially directed graph with signs on the edges (+ positive, − negative, or 0 neutral). Neutral links are undirected links where the source/target relationship is not directional. Alternative pathways from different extracellular ligands (“inputs”) to the transcription factor cAMP response element binding protein (CREB), were counted if the ligand can reach CREB in a maximum of seven steps through direct intermediate cellular components (seven nodes and six links) or less. The transcription factor CREB was chosen because it is a major driver of activity dependent plasticity in these neurons. See Ref. [20
] for the dataset.
An air traffic control network was constructed from the FFA (Federal Aviation Administration) NFDC (National Flight Data Center) Preferred Routes Database [21
]. This network contains 1226 nodes and 2615 links. A list of airport codes was downloaded from Ref. [22
]. A subset list of airport acronyms was created by choosing airport codes representing airports from U.S. states on the Atlantic coast from Maine (north) to Florida (south). The subset list contains 148 airports. These airports were used as source nodes to count potential routes to the destination LAX (Los Angeles International Airport), a major U.S. airport hub on the Pacific (west) coast. The route strings columns in the database were broken into pairs, representing links in the network. Nodes in this network are either airports or flight service stations that are used to direct small airplanes to follow preferred routes specified by the NFDC. Potential routes to LAX from east coast airports were limited to maximum seven hops to reach LAX.