Category Science

A humbling programming experience

I’m working on a short script to post-process some simulation data from TransmissionLab, and because the scripting language I know best is Perl 5, I’ve written a short Perl program. I’ve been writing Perl since early 1994, and from about 1997 through 2005 I was fairly expert in the language, able to build and maintain fairly large, object-oriented systems that were actually readable by others. I even knew a fair bit about Perl internals, could link a C library to Perl via XS, and followed the (interminable) Perl 6 development process quite closely.

But I realized today that I’ve completely lost my fluency in the language. I’m struggling to re-activate the parts of my brain that understand deeply nested hash tables, objects, and other Perl-isms. I had to look at the perl man pages today to remember bits about foreach loops and the “defined” function. It’s coming back, and the program works, but it’s been slow. I guess that’s what you get for not using a language in several years.

Java is a terrific language for object-oriented development (as is C#, if I were working primarily in Windows), but it does insulate you from a lot of fairly low-level issues, in favor of giving you higher-level expression. This little program I’m writing basically just looks for and reduces rows of data from experimental replicates and outputs the reduced data set with error terms. Simple descriptive statistics, plus a bit of data structure work. But without the Collections library and some of the Jakarta Commons stuff, I really had to think about how to do this.
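For the curious, the reduction step amounts to something like the sketch below. This is not my actual Perl script; it is a minimal Python illustration, and the input shape (pairs of a condition label and a measured value) and the name reduce_replicates are assumptions made up for the example. It groups replicate rows by condition and reports the mean, the standard error of the mean, and the replicate count.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def reduce_replicates(rows):
    """Collapse replicate rows into one (mean, standard error, n) per condition.

    `rows` is an iterable of (condition, value) pairs. Both field names are
    hypothetical stand-ins for whatever the simulation actually emits.
    """
    # Gather every replicate value under its experimental condition.
    groups = defaultdict(list)
    for condition, value in rows:
        groups[condition].append(float(value))

    reduced = {}
    for condition, values in groups.items():
        n = len(values)
        # A single replicate has no sample standard deviation, so report NaN.
        se = stdev(values) / math.sqrt(n) if n > 1 else float("nan")
        reduced[condition] = (mean(values), se, n)
    return reduced
```

Calling reduce_replicates on the parsed rows of an output file gives a dictionary keyed by condition, which can then be written back out one line per condition. The nested-hash bookkeeping that tripped me up in Perl is the grouping loop at the top.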

Guess it points out how you need to keep using skills in order to keep them sharp.

Updates to RandomCopyModel simulation code and new TransmissionLab model

Bandwidth has returned here at the house, so I’m catching up on a backlog of things that needed reasonably comfortable internet connectivity. The Canopy wireless service off Mt. Constitution has been having problems with a power supply and some other stuff for a couple of days, so I’ve been relying on my low-speed backup DSL link, which I have to say lets me get email but otherwise feels more like “super dialup” than “broadband.” Oh well.

I’ve been busy on the simulation modeling front this week, coming out with a “final” version of the RandomCopyModel from an upcoming paper by Alex Bentley, Carl Lipo, Harold Herzog, and Matthew Hahn. I didn’t work on the original model or paper, but Alex graciously allowed me to use the original code as the basis for some future experimental models we’re working on, as well as simulation setups I’ll need for my dissertation research. In return I refactored and made the original simulation a bit clearer to follow in terms of the code, so we’re providing version 1.3 of the RandomCopyModel under the terms of a Creative Commons-GPL license for non-commercial use.

In the meantime, I’ve also been planning the next “version” of the model, which is now a separate codebase managed by Google Code under the name TransmissionLab.

The goal of TransmissionLab is to accurately represent theoretical models of cultural transmission (CT) (e.g., random copying, prestige-biased transmission, frequency-biased transmission) within a variety of population structures (e.g., complete graphs/well-mixed, sparse random graphs and social networks of varying topologies, spatial lattices), and using a variety of update algorithms (e.g., Moran processes, Wright-Fisher processes, various other birth-death processes). TransmissionLab also seeks to make data collection and “observation” of simulated populations simple, with modules which are completely separate from the simulated population itself, thus preventing observational “side-effects” on the model. Analysis flows from data collection, and can be done in a variety of ways.
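To make those terms concrete, here is a minimal sketch of the simplest combination: random copying in a well-mixed population with a Wright-Fisher style update. It is written in Python purely for illustration and is not TransmissionLab code; the function names, the innovation rate mu, and the default parameter values are all assumptions chosen for the example. The observer function only reads the population, which is the separation of model and data collection described above.

```python
import random
from collections import Counter


def wright_fisher_step(population, mu, next_variant, rng):
    """One Wright-Fisher generation of random copying, well-mixed population.

    Each member of the new generation copies a uniformly chosen member of the
    previous generation, or with probability `mu` adopts a brand-new variant.
    Returns the new population and the next unused variant id.
    """
    new_population = []
    for _ in range(len(population)):
        if rng.random() < mu:
            new_population.append(next_variant)  # innovation
            next_variant += 1
        else:
            new_population.append(rng.choice(population))  # unbiased copying
    return new_population, next_variant


def count_variants(population):
    """Observer: tallies variant frequencies without touching the population."""
    return Counter(population)


def run(n=100, mu=0.01, generations=50, seed=1):
    """Run the model and record the number of distinct variants per generation."""
    rng = random.Random(seed)
    population = list(range(n))  # everyone starts with a unique variant
    next_variant = n
    history = []
    for _ in range(generations):
        population, next_variant = wright_fisher_step(
            population, mu, next_variant, rng
        )
        history.append(len(count_variants(population)))
    return population, history
```

Swapping in a Moran process would mean replacing one individual per step instead of the whole generation, and a different population structure would mean restricting whom each individual is allowed to copy. Neither change needs to touch the observer.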

Naturally we’re not the first people by any means to do agent-based simulation models of cultural transmission, imitation behavior within populations, or the diffusion of innovations. Heck, this isn’t even the first simulation of these sorts of phenomena I’ve been involved with. Where I hope that TransmissionLab differs is that nearly all of my previous simulation models have proven fairly special-purpose, expedient models for working on one particular question or problem, and I’m trying to make TransmissionLab a common platform that can span projects both for myself and my research group.

This is important for a number of reasons:

  1. Stable, well-used models tend to be well-structured, well-tested models. The issues of whether the simulation is showing us artifacts of writing software or the theoretical behavior we’re trying to describe can only be solved by deep investment in design, coding, and testing.
  2. Relating the results of one study to another when each uses different simulation code is difficult, for obvious reasons. To the extent that we use the same code framework and models to perform multiple studies, we can make arguments (and even measurements) which relate the results of several research studies to each other.
  3. The relationship issue mentioned in the previous item could span multiple research teams if the model is well-structured and tested enough that others adopt it for research.

Thus, I’m putting some reasonable effort into developing TransmissionLab, and if you have an interest in agent-based modeling and cultural phenomena, I hope you give it a look in a version or two. Right now I’m moving from the older RandomCopyModel to a new set of development tasks, which will be outlined shortly at the Google Code wiki for the project. Once these are checked in, there will be some interesting elements beyond the earlier model to explore. I’ll post when that happens.

Research website

I’ve made my University of Washington website live this morning, as a location for discussing my research, distributing software and publications, and so on. The site isn’t finished yet: publications and research areas still need to be filled in. The immediate impetus for putting up the site was to create a distribution point for the agent-based simulation software I’m working on with Dr. Alex Bentley. I’m getting ready to make another release of it, which generalizes it for use beyond Bentley’s original experiments, so when it’s ready I’ll post a notice and description here.

Darwin Day 2007: Darwin’s Impact on the Social Sciences

In addition to being Lincoln’s birthday today, it’s also the 198th birthday of Charles Darwin, celebrated around the world as “Darwin Day.” In recognition of the day, I thought I’d share some thinking on Darwin’s contribution to the social sciences, because it is potentially as powerful as his direct effect on biology, if much less well developed. What follows is necessarily a sketch, since (a) much of it is reprising other sources, and (b) fully justifying and documenting this would turn it into an article or book, rather than a blog posting.

Ernst Mayr, the great biologist and architect of the Modern Synthesis, wrote in a 1959 essay that Darwin’s great contribution to biology was anti-essentialism, or what Mayr called “population thinking.” By this, Mayr meant that Darwin was one of the first biological thinkers to offer a theory of the evolution of species which did not rely upon changes to, differences in, or transformation of the “essence” of a species. In the older view, which goes back to Aristotle (at least in terms of codifying this view; the origins of this view are much more ancient and are likely bound up in the cognitive science of “natural kinds”), each species is characterized by an “essence” or definition, which tells us what characteristics an animal, plant, and so on must have in order to belong to that species. The fact that each individual in a species is unique, and that often many individuals lack one or more essential characteristics, but are still considered part of the species, is explained away in Aristotelian essentialism as simply noise or reproductive error that causes the real world to be an imperfect reflection of the species’s underlying “reality.” Variation is thus explanatorily unimportant when species are viewed the way pre-Darwinian biology characterized them. And the evolution of one species into another, over time, seems to run up against a massive gulf between two “essences.” One sees echoes of this “problem,” for example, in the objections of many contemporary anti-evolutionists when presented with what biologists believe is the abundant empirical evidence of evolution: the key words “intermediate forms” pop up as tell-tale signs, even though such a notion really arises only if one thinks a species has an “essence” between which a form could be “intermediate.”

Darwin’s great contribution, at least according to Mayr and others, was in founding modern evolutionary biology on a firm basis of anti-essentialism. In this case, what Mayr originally meant by “population thinking” (philosophers will recognize it as a variety of anti-essentialism) is that variation among individuals is not “noise,” nor is it meaningless error; variation among individuals is both the cause of evolution, and the great engine that powers the development of successive adaptations through the continuous (if occasionally roundabout) process of selection. Variation isn’t just important for selection; variation is critical for selection. No variation, no selection, as pointed out by the great mathematical biologist and statistician Ronald Fisher. In fact, selection might simply be the statistical consequence of having variation, some of which makes a difference to our life chances, in an environment where there aren’t enough resources, or enough room, or enough time, for every individual to succeed equally. Selection almost comes naturally when you think about the world through the lens of Darwin’s population thinking.

What has this got to do with social science and Darwin’s contribution to the human sciences? Potentially everything.

Notes on a cold Superbowl Sunday

I’m about to start watching the game, but I like to let Tivo get a little ahead so I can zip through the slow parts of most games (the TV pace and commentary virtually guarantee that the game is much slower than it needs to be). So I’m sitting here in the office getting a few things done, writing up some notes from last week’s talks by Robert Boyd at the UW. And just a minute ago, two relatively young bald eagles swooped past the window, one chasing the other, as they dived and twirled their way north along the shore of Rocky Bay. I’m not quite sure where the nest is, but they like to stop on a dead tree snag just off my property line but towering over the deck, so I see quite a few eagles here at the house. (Note: the picture here isn’t from today, since it’s far cloudier and grey today, but it is the snag and one of my raptor neighbors.)
 

Boyd’s talks were excellent, discussing evolutionary models for the association of “group markers” with traits that represent behavioral norms. The markers assort non-randomly with the norms, and so provide a way to make in-group/out-group identifications in situations where the underlying norm may not be directly observable (e.g., attitude, religious belief, moral rule, etc.). The analysis followed McElreath et al.’s 2003 article in Current Anthropology, and when the model is overlaid on spatially distributed populations, it demonstrates that within-group covariance between the marker trait and the norm trait is strongest at boundaries, not in group cores. This is true even in the simplest spatial case of a 1-D lattice or ring. This makes sense because similarity between any two random individuals is lowest at group boundaries and highest in group cores. In the cores, two randomly chosen interaction partners have the highest chance of being similar in their expression of a norm, so the selective advantage of the marker/norm covariance is lower there.

While down in Seattle I moved into the room I’m renting near the University, complete with a nice 3-layer futon from Soaring Heart (man, those 3-layer systems are comfortable…might switch my traditional mattress at home for the 3-layer in the guest room….), internet service from Clearwire (which works surprisingly well once you futz around with antenna positioning a bit), and a couple of items from my storage locker. I still need to sort through the locker (which has enough stuff to furnish a 1 bedroom apt), take the desk and dresser to the new house, and haul the rest of this crap up to the island since I don’t need to furnish a kitchen, etc. But that’s work for a future trip.

Week in Seattle

It’s a beautiful day on the ferry Yakima, headed up to Friday Harbor. Clear and cold, the remnants of this week’s snow hardened into icy hummocks in the ferry line. I have no idea what I’ll find when I get back home to the island, except possibly more of the same. As long as power has been fairly continuous and no trees fell, there shouldn’t be any problem at the house. I’ll feel better next week, though, when my generator gets installed (it’s now sitting on a little pallet in the garage).

I spent the week down in Seattle, handling beginning-of-school chores, finding a place to rent in town, and doing some social events.

In the latter category, our book group is reading Proust, and we’re making decent progress through Swann’s Way; most of us are reading the new Lydia Davis translation, which I have to say is very readable. Not sure whether I’d ever have read Proust without a group commitment, but between Richard Rorty’s writings on Proust and my friends, it seemed well worth it. This time around Christian hosted, serving a great Italian dinner, and we finished off with home-baked cookies and a 1983 Filhot Sauternes I had in the cellar (very tasty and nearing full maturity in my opinion).

I also attended Roy Hersh’s “Great Seattle Madeira Tasting,” but since he makes his living writing about wine, I’ll give him a chance to write his article about the wines and the tasting before I comment on the wines. I will say, however, that it was a great opportunity to get a perspective across many great wines, and reconfirm my impressions about which producers and styles of Madeira I most enjoy.

I’ve found a place in Seattle, so starting February 1st I’ll have a place to live while working at the UW. My landlord and roommate, Scott, is an artist and the house is chock full of art, deeply homey, and just a little bit funky. It should be fun. The only downside (if there is one at all) is that I’d been enjoying my time at the WAC: the king beds are amazingly comfortable and it’s really good for me to be a couple of floors away from the gym. But it’s also fairly expensive if I’m down in Seattle every two weeks, so it’s time for something different.

After some administrative preliminaries at school, I stocked up on academic-priced software (Mathematica, Endnote, and the Adobe CS2 suite) and math books (I need to bone up in several areas for my dissertation research). The University Book Store continues to be a terrific source, not just for textbooks, but technical books of all kinds. I wish I could say the same for Barnes and Noble at University Village, however. It still rivals and sometimes exceeds UBS for computers and programming books, but in days past the math and science sections were also highly competitive. Sadly, both subjects have been gutted, reduced to an aisle or so from their former 2-3 full aisles and a couple of side displays. Market forces, no doubt, but this does point out why the extreme libertarian argument for “markets in everything” ought to be rejected in certain realms of life — obscure and low-volume books might be useless commercially but they often serve a key role in research and scholarship. Which is why we have libraries and university-connected bookstores, I guess. And Amazon, of course.

We’re now past Lopez Island and on our way to San Juan Channel. The sky is clouding up a bit, and the island shores around us are white with light snow accumulation. It’s a frosty winter world up here, but a beautiful one. Seattle is a good change, but I can’t wait to be home.