Echoing the Emergence
We find ourselves in a world of unbelievable complexity and breathtaking
beauty. How did it all come about? How is this all possible? Western science
has been on a noble and difficult quest to understand ourselves and the life
around us for the past 500 years. Today our knowledge of the sciences,
and probably of the world, is richer and more structured than it was back
then or at any other period in human history. To people who inhabited the
planet 200 years ago, our current knowledge and state of affairs would probably
have been undreamed of and incomprehensible. Today we take for granted Newton's
laws, Darwin's theory, entropy, Maxwell's equations, the universal Turing machine,
Gödel's incompleteness theorems and Kolmogorov complexity. Yet these represent
milestones in our, at times frustrating, attempts to piece together and
cope with the complexity of the surrounding world. While for centuries scientists
focused on dividing the study of nature into different
sciences, lately they have started to realize that there are common patterns
and themes that unite them. One thing that became clear during the twentieth century
is that living systems exhibit the phenomenon of self-organization.
Self-organization is the 'invisible hand' of Adam Smith that pieces the economy together; it is the election of the President of the United States; it is the rise of the first cities and societies; it is the World Wide Web; it is what your project group is based on; it is how elementary RNA molecules combined to form ever more complex structures that led, billions of years later, to the human brain, which in turn is another self-organizing system. Self-organization is one of the keystones in the study of complex systems, or, as John Holland calls them, Complex Adaptive Systems (CAS). CAS received greater attention in the mid-1980s with the formation of the Santa Fe Institute (www.santafe.edu). This institution gathered an eclectic collection of multidisciplinary thinkers to tackle problems ranging from bar attendance patterns to stock market simulation to the study of elementary biological systems. The key thing that brought these diverse people together was the belief that there are common roots and patterns that underlie sciences as different as physics and sociology.
One of the first people offered a permanent position at Santa Fe was John Holland. He works as a professor of Computer Science and Psychology at the University of Michigan. While Professor Holland is the originator of most key concepts in Complex Adaptive Systems, he is better known for his invention of the genetic algorithm. Throughout his research career he has always looked to nature, mainly to biology, to understand how to model or describe certain phenomena. It is this synthetic approach that allowed him to come up with succinct and beautiful theories. He works to complement his theories with practical simulations. One of them, called Echo, is described in his book 'Hidden Order', and is the main subject of our project.
Complex Adaptive Systems (CAS) are all around us. In fact, we are complex adaptive systems. An absolutely inextricable property of CAS is feedback. We develop and change based on our interactions with the world. We are reflections of the world around us. These ideas are very similar to the ideas of Eastern philosophies - we are the reflections of the whole, we are the whole. It is true: everything around us makes an impact. A brilliant description of feedback is captured by the name of the science created by the mathematician Norbert Wiener: cybernetics. In Greek, kybernetes means steersman. Since the sea is always in motion, the steersman needs to adjust the wheel all the time. This causes changes in the ship's course, which in turn changes the motion of the water, which again affects the ship and causes the steersman to respond.
So CAS co-evolve. In our EWorld, coevolution means that entities will move around the site and interact with each other. You can imagine that the simulation would be quite dull if we had just one entity, or entities which can't move. Another very important question is: do CAS sustain themselves? An answer comes from one of the most eminent scientists of the twentieth century, Nobel laureate Ilya Prigogine. As many of you know from basic thermodynamics, everything eventually disintegrates into chaos. This is the essence of the second law, which says that as time goes by the amount of entropy (= disorder or chaos) does not decrease. In general this is bad news. But not for us. It turns out that this law only applies to closed thermodynamic systems (what physicists more precisely call isolated systems). CAS, however, are open ones. A closed system here means that there is no flow of energy between the system and the outside world. This is certainly not the case with living beings on Earth. Living systems move around and consume energy; this is precisely what allows them to sustain order at the edge of chaos. Remember that in our EWorld, we will have resources scattered through the site. Entities can compete with each other for these resources. The winners will receive the simplest reward found in evolution - life and, ultimately, proliferation.
Another important property of complex systems is derived from game theory. There is a famous game called the Prisoner's Dilemma. In this game two thieves are captured, and each is promised freedom and a reward if he speaks against his fellow. The problem is that, ideally, they should cooperate and simply refuse to speak. But this is not what happens when the game is played just once. Here is why. Each of them thinks: 'What if the other guy tells on me? Then I will be in jail, but he will be spared and even get a reward...' So they both defect, i.e. tell on each other, and lose. This game brings us to one of the most fundamental insights - evolutionary games are iterative, they are played over and over again. The reasoning above does not apply to repeated games. Cooperation can arise, after trial and error, as the strategy that pays off in the long run. So in our EWorld we want to observe interactions which occur over a significant period of time. We want to see the essential, global properties which will arise after several hundred rounds of simulation. Now we come to the seven basic properties found in all CAS, which John Holland outlines in his book. We want to understand them in order to be able to incorporate them into our simulation.
So this gives us an idea of what we are trying to model. Certainly we can't implement the entire 'Echo' in such a short period of time, so what we need to do is come up with a meaningful subset of the functionality. Hopefully, after this lecture you will have a clear enough picture of what you are to do. Now we will analyze our version of Echo in more detail. During the analysis, we need to decide what the parameters of EWorld are: which things the user should be able to specify in the config file, and perhaps on the fly.
So as we discuss the model, we need to discover meaningful variables, candidates for system parameters. Let's start by reviewing the setup.
Start up
The system starts by reading a configuration file. In the configuration file we find global simulation parameters as well as descriptions of several sites. Sites will differ in the kinds of resources found in them and in their entity generators.
Observing a site
We need to offer the capability of observing interactions of actual agents. The user will be able to choose a site and zoom in on it. The system automatically switches into a slower mode when the user zooms in on a site. This means that the speed of the simulation is calculated based on a certain parameter. When we zoom into a site, the entities which live in it continue to move around, except that their speed is decreased by a certain automatic factor. Zooming in on a particular site is useful for several reasons. First, it is a kind of visual debugging. We can watch the interactions, query entities and see if the site functions the way we expect it to. Second, once we are done debugging, the end user of the system will be able to enjoy zooming in on a site and seeing the actual interactions. A related piece of functionality is the ability to suspend the system. In the suspended mode, entities are alive, but aren't moving. There should be a way to investigate the state (genome) of a particular entity at this point. Of course, the user should be able to resume the system.
Site dynamics
As we said before, every site will consist of entities and resources. The resources will consist of bases ( 'A', 'T', 'C', 'G' ). The resources will be of two types - homogeneous and heterogeneous. A homogeneous resource contains only one type of base. A heterogeneous one contains several. Heterogeneous resources can be of two types as well: they can contain either the same number of bases of each type or different numbers. As you remember, our configuration file contains the initial specification of the world. So the resource specification should reside in the configuration file as well.
We should also be able to make resources renewable or not. We can easily specify a homogeneous resource like this:
<RESOURCE CLASS=edu.nyu.sejava.iskold.eworld.HomogeneousResource>
  <PROPERTY NAME=Quantity VALUE=25>
  <PROPERTY NAME=Base VALUE=a>
  <PROPERTY NAME=Renewable VALUE=true>
</RESOURCE>
Then a heterogeneous resource would be specified something like this:
<RESOURCE CLASS=edu.nyu.sejava.iskold.eworld.HeterogeneousResource>
  <RBASE>
    <PROPERTY NAME=Quantity VALUE=25>
    <PROPERTY NAME=Base VALUE=a>
  </RBASE>
  <RBASE>
    <PROPERTY NAME=Quantity VALUE=2>
    <PROPERTY NAME=Base VALUE=b>
  </RBASE>
  <RBASE>
    <PROPERTY NAME=Quantity VALUE=11>
    <PROPERTY NAME=Base VALUE=c>
  </RBASE>
  <PROPERTY NAME=Renewable VALUE=false>
</RESOURCE>
It would also be interesting to allow the user to control site resources on the fly. Namely, users could add, remove and modify resources dynamically. Also, it would be interesting to generate resources probabilistically. For example, we can imagine a resource generator where the user specifies a probability associated with every base, and the generator then creates the resource accordingly.
Entities
Let's now look at entities. The entities run around the site, collect resources, fight, trade and mate. Every entity contains the following:
Entity Generators
Obviously it would be ridiculous for us to specify every entity in the configuration file - it would be huge. Instead, we will specify entity generators. You may choose to implement more or less sophisticated ones. The simplest entity generator has a single parameter - the number of entities to generate. Another one could take instructions on how to generate various genes for an entity in some kind of probabilistic way. In addition, to achieve absolute flexibility, you may want to allow the user to create entities manually as well.
<GENERATOR CLASS=edu.nyu.sejava.iskold.eworld.BasicRandomEntityGenerator>
  <PROPERTY NAME=Quantity VALUE=25>
</GENERATOR>
<GENERATOR CLASS=edu.nyu.sejava.iskold.eworld.BasicConstrainedEntityGenerator>
  <PROPERTY NAME=Quantity VALUE=25>
  <PROPERTY NAME=OffenseGeneLength VALUE=5>
  <PROPERTY NAME=OffenseGeneBases VALUE=AT>
</GENERATOR>
Now consider the entity genes in turn. Every entity contains a reservoir - a collection of found resources. When an entity finds a resource, it places it into its reservoir. The reservoir is the only source of genetic material which entities use for cloning. If an entity finds itself near a resource which is not empty, it tries to acquire the next base from this resource. Before the entity can acquire the base, it must look up in its resource collection gene whether it is allowed to acquire this base. A typical resource collection gene would look like this:
  a b d
This means that this entity is only allowed to pick up the bases specified above. This is the function of the resource collection gene.
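The lookup in the resource collection gene can be sketched in Java roughly as follows. This is only an illustration: the class name CollectingEntity and its methods are hypothetical, and representing bases as single characters is an assumption, not something the project prescribes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not part of the assignment.
class CollectingEntity {
    // Bases this entity is allowed to pick up (its resource collection gene).
    private final Set<Character> collectionGene = new HashSet<>();
    // Bases collected so far (the reservoir).
    private final List<Character> reservoir = new ArrayList<>();

    CollectingEntity(String collectionGeneBases) {
        for (char base : collectionGeneBases.toCharArray()) {
            collectionGene.add(base);
        }
    }

    /** True if the resource collection gene lists this base. */
    boolean canAcquire(char base) {
        return collectionGene.contains(base);
    }

    /** Tries to take the next base offered by a resource; returns true if it was stored. */
    boolean tryAcquire(char base) {
        if (!canAcquire(base)) {
            return false; // the base stays in the resource
        }
        reservoir.add(base);
        return true;
    }

    int reservoirSize() {
        return reservoir.size();
    }
}
```

With the gene a b d from the example, new CollectingEntity("abd") would accept an 'a' offered by a resource and refuse a 'c'.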
As an entity travels through the site and acquires resources, it will occasionally fork off a copy of itself, if it has accumulated enough resources. The reproduction threshold is one of the EWorld parameters. Its minimum value is 1. This simply means that in order to make a clone, an entity must contain in its reservoir the same bases as are contained in its entire genome (all other genes). This makes sense: the clone needs one full copy of the genome, and the reservoir is the only source of material for it. More often, the reproduction threshold will be set to 2, so that the entity still has some resources after cloning. Another EWorld parameter, the reproduction fraction, is used to determine how to split the remaining part of the reservoir (if there is any) between the mother and daughter entities. For example, 0.6 would mean that the mother keeps 40% while 60% are chosen at random and transferred to the newborn. During replication, it is possible for a base to mutate with a small probability. This parameter is called the mutation probability and would usually be 0.001, which means that on average 1 out of 1000 bases would be changed arbitrarily during cloning.
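The three reproduction parameters can be illustrated with a short Java sketch. It rests on assumptions that go beyond the text: genomes and reservoirs are represented as sequences of single-character bases, a threshold of n is read as "the reservoir holds at least n copies of every base in the genome", and a mutated base is drawn uniformly from the alphabet (so it may come out unchanged). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper; one possible reading of the reproduction rules.
class Reproduction {
    /** Counts how many times each base occurs in a sequence. */
    static Map<Character, Integer> countBases(String bases) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char b : bases.toCharArray()) {
            counts.merge(b, 1, Integer::sum);
        }
        return counts;
    }

    /** True if the reservoir holds at least threshold copies of every base in the genome. */
    static boolean canClone(String genome, String reservoir, int threshold) {
        Map<Character, Integer> have = countBases(reservoir);
        for (Map.Entry<Character, Integer> need : countBases(genome).entrySet()) {
            if (have.getOrDefault(need.getKey(), 0) < threshold * need.getValue()) {
                return false;
            }
        }
        return true;
    }

    /** Copies the genome; each base is replaced by a random one with the given probability. */
    static String copyWithMutation(String genome, String alphabet,
                                   double mutationProbability, Random rng) {
        StringBuilder child = new StringBuilder(genome.length());
        for (char b : genome.toCharArray()) {
            if (rng.nextDouble() < mutationProbability) {
                child.append(alphabet.charAt(rng.nextInt(alphabet.length())));
            } else {
                child.append(b);
            }
        }
        return child.toString();
    }

    /** Moves a random share of the leftover reservoir to the daughter and returns that share. */
    static List<Character> splitReservoir(List<Character> leftover,
                                          double reproductionFraction, Random rng) {
        Collections.shuffle(leftover, rng);
        int toDaughter = (int) Math.round(leftover.size() * reproductionFraction);
        List<Character> daughter = new ArrayList<>(leftover.subList(0, toDaughter));
        leftover.subList(0, toDaughter).clear(); // the mother keeps the rest
        return daughter;
    }
}
```

With a reproduction fraction of 0.6 and ten leftover bases, splitReservoir hands six bases to the daughter and leaves four with the mother, matching the 60/40 example above.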
Combat
When two entities meet somewhere in a site, there are three types of interaction that may take place. They are (in order) combat, trading and mating. In order to determine if two entities will engage in combat, their combat conditions and offense genes are compared. Entity A will attack entity B if A's combat condition is a prefix of B's offense gene, and vice versa. Thus combat can be either mutual or one-way. If the combat is NOT mutual, then the entity which is being attacked has a chance to flee. There are several ways to model this. One is to establish a universal fleeing probability. This is simple to implement, but is not realistic. Another, more sophisticated way is to let the entity which is being attacked estimate its chances of success. The combat calculations are shown in the diagrams below:
So the entity being attacked has a chance to flee equal to one minus its probability of winning. If it doesn't flee, then the winner is computed based on the above formulas. If the winner is unconditional, then ALL the resources of the loser are transferred to the winner and the loser dies :<. However, if the winner was chosen probabilistically, then it receives only a portion of the loser's overall resources. The portion is equal to the loser's probability of winning. The winner takes its share of the loser's resources starting with the loser's reservoir and then proceeding to its genes, if necessary.
Trade
If entities don't engage in combat, they may choose to trade. A and B will trade if A's trade condition gene is a prefix of B's offense gene and vice versa. This means that trade is a MUTUAL agreement. The only thing that an entity can give up is its excess trading resource, which is simply one base. The entity gives up one of the bases currently in its reservoir. If it doesn't have any, then it gives the other entity nothing and is just bluffing.
Mating
Finally, if two entities neither fight nor trade, they may choose to mate. In order for mating to occur, A's mating condition gene must be a NON-ZERO-length prefix of B's mating gene, or vice versa. The mating can be done using either single-point or two-point crossover, as depicted below:
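The two crossover variants can also be sketched in Java. This is only an illustration under assumptions: a gene is represented as a string of bases, the cut points must lie within the shorter of the two genes, and the crossover probability is passed in as a parameter. The class and method names are hypothetical.

```java
import java.util.Random;

// Hypothetical sketch of crossover on one pair of corresponding genes.
class Crossover {
    /** Single-point crossover: the two children swap everything after the cut point. */
    static String[] singlePoint(String a, String b, int point) {
        return new String[] {
            a.substring(0, point) + b.substring(point),
            b.substring(0, point) + a.substring(point)
        };
    }

    /** Two-point crossover: the two children swap the segment [from, to). */
    static String[] twoPoint(String a, String b, int from, int to) {
        return new String[] {
            a.substring(0, from) + b.substring(from, to) + a.substring(to),
            b.substring(0, from) + a.substring(from, to) + b.substring(to)
        };
    }

    /** Applies two-point crossover at random cut points with the given probability. */
    static String[] randomTwoPoint(String a, String b,
                                   double crossoverProbability, Random rng) {
        int limit = Math.min(a.length(), b.length());
        if (limit == 0 || rng.nextDouble() >= crossoverProbability) {
            return new String[] { a, b }; // no crossover this time
        }
        int p = rng.nextInt(limit + 1);
        int q = rng.nextInt(limit + 1);
        return twoPoint(a, b, Math.min(p, q), Math.max(p, q));
    }
}
```

For example, twoPoint("AAAA", "TTTT", 1, 3) yields the children "ATTA" and "TAAT". In a full entity, the same operation would be applied separately to each pair of corresponding genes.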
If you don't have time to implement both, implement two-point crossover. Note that ONLY corresponding genes should be crossed over, including the reservoir. This means that you should cross over offense gene with offense gene, etc. The crossover should not be automatic, even for compatible entities. It should take place with a certain fairly high probability (like 0.75), which should be one of the system's parameters.
Mortality
There are several ways to deal with the mortality of entities in the system. You may choose to kill a random number of entities every traversal cycle. You can also kill entities that don't have enough 'food' in their reservoir. These techniques are simple and effective; however, they are inferior to the following method, which is based on the Gompertz mortality law. The Gompertz mortality law is very simple: it says that the longer an entity lives, the higher its chances of dying. Specifically, the probability that an entity will die increases exponentially with the entity's age.
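A minimal Java sketch of Gompertz-style mortality is given below. It assumes the per-cycle death probability takes the form baseline * exp(agingRate * age), capped at 1. The two parameter names and the class name are hypothetical, and suitable values would have to be found by experiment.

```java
import java.util.Random;

// Hypothetical sketch of age-dependent mortality following the Gompertz law.
class GompertzMortality {
    private final double baseline;  // death probability at age 0 (assumed parameter)
    private final double agingRate; // exponential growth rate per traversal cycle (assumed parameter)

    GompertzMortality(double baseline, double agingRate) {
        this.baseline = baseline;
        this.agingRate = agingRate;
    }

    /** Probability of dying during one traversal cycle at the given age, capped at 1. */
    double deathProbability(int age) {
        return Math.min(1.0, baseline * Math.exp(agingRate * age));
    }

    /** Draws a random number to decide whether an entity of this age dies now. */
    boolean diesThisCycle(int age, Random rng) {
        return rng.nextDouble() < deathProbability(age);
    }
}
```

A site could call diesThisCycle once per entity on every traversal, with the entity's age measured in traversal cycles.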
Sites
We should start by simply creating one site. Once we get it working, it will be fairly straightforward to add others. Each site should run in its own thread. The site will essentially execute some kind of traversal on every step. It will move every entity to a new position and then walk through and execute the necessary interactions. In order to capture site statistics, you will probably have to create a SiteListener interface, which will be implemented by the statistics manager. Decide which important things we need to keep track of.
Statistics
On the large scale, we will be concerned with collecting system statistics. Every site should be registered with some kind of centralized statistics manager. The manager has one parameter for how often to collect statistics and another for how often to save them to a file. The idea is to be able to plot at least the number of entities on different sites as the system is running. Then, it would be of interest to compare the characteristics of the majority of the population at the end of the simulation. The fastest way to store statistics would probably be using Serialization. The wiser thing to do would be to store them in an object-oriented or relational database, but we don't have this luxury in this class. Also, users should be able to view statistics without running the system.
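The relationship between a site and the statistics manager could look roughly like the Java sketch below. Only the name SiteListener comes from the text above; the callback method, its arguments, the sample format and the Site class are assumptions made for illustration, and saving to a file is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Callback interface; the method name and arguments are assumptions. */
interface SiteListener {
    void stepCompleted(String siteName, int step, int entityCount);
}

/** Hypothetical manager that records the entity count every collectEvery steps. */
class StatisticsManager implements SiteListener {
    private final int collectEvery;
    private final List<String> samples = new ArrayList<>();

    StatisticsManager(int collectEvery) {
        this.collectEvery = collectEvery;
    }

    // synchronized because each site is expected to run in its own thread
    @Override
    public synchronized void stepCompleted(String siteName, int step, int entityCount) {
        if (step % collectEvery == 0) {
            samples.add(siteName + "," + step + "," + entityCount);
        }
    }

    /** Returns a copy of the samples recorded so far. */
    synchronized List<String> samples() {
        return new ArrayList<>(samples);
    }
}

/** Hypothetical site that notifies its listeners after every traversal. */
class Site {
    private final String name;
    private final int entityCount;
    private final List<SiteListener> listeners = new ArrayList<>();
    private int step = 0;

    Site(String name, int entityCount) {
        this.name = name;
        this.entityCount = entityCount;
    }

    void addListener(SiteListener listener) {
        listeners.add(listener);
    }

    void traverse() {
        step++; // moving entities and running interactions is omitted here
        for (SiteListener listener : listeners) {
            listener.stepCompleted(name, step, entityCount);
        }
    }
}
```

Registering one StatisticsManager with every site gives the centralized collection described above, and the recorded samples are enough to plot the number of entities per site over time.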
Please read Holland's Hidden Order.
You can learn more about complexity here: Complexity, by M. Mitchell Waldrop