Why are Systems Biologists so bad at Computer Modeling?


The title of this blog post is meant to be a little brazen and eye-catching, but I decided to write it after serving on a variety of government grant panels over the years, where I am always shocked at the poor quality of the wet lab proposals that employ computer modeling. Even after 12 years of full-on systems biology, the community still doesn’t seem to quite get it (obviously there are exceptions). What do I mean by this? Let’s say you’re an experimentalist who is about to write a grant proposal for, say, the NIH. In your mind I am sure you have the following broad plan for your proposed work: you start off with a hypothesis in Aim 1, follow it with some validation in Aim 2, then maybe refine the hypothesis in Aim 3. The classic approach.

Perhaps less likely but equally valid, you may not have a hypothesis up front, so you start by collecting some data in Aim 1, formulate your hypothesis in Aim 2, then do some validation in Aim 3. There may also be some iteration between Aims 2 and 3 as the researcher refines the main hypothesis. These are tried and tested methods in science, and almost every grant proposal that plans to do experiments follows some pattern like this (see the footnote for an exception).

So why is this pattern completely different the minute an author of a proposal decides to do some computer modeling? Computer simulation models are, after all, just hypotheses. What I generally see are the patterns shown in the figure below:

What we see here is that the start of the proposal looks something like the classical pattern; then, for some inexplicable reason, a modeling project is tacked on at the end in Aim 3. There are obviously exceptions to this, and some authors do understand that a computer model is just like any other hypothesis and should be treated just like the non-quantitative hypotheses used in the classical approach. In other words, the computer model comes first! Aim 1 should actually read something like:

Aim 1: Formulate an Initial Computer Model of our System.

Some might say: but wait a minute, how can I formulate a computer model before I have the data? Are you crazy or what? The point of the initial model (the null hypothesis, if you like) is to state up front what we currently think we know about our system. The beauty of this is that the model will then also generate hypotheses, which can be tested in Aim 2. Aim 3 can then refine the model further. In other words, this is no different from the classical approach.
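To make the idea of an initial model concrete, here is a minimal sketch in Python. Everything in it is an assumption made purely for illustration: the system (a single protein P made at a constant rate and degraded in proportion to its level), the function names simulate and predicted_fold_change, and the parameter values, none of which are measurements. The point is only that even a crude model like this states what we think we know and produces a number an experiment can check.

```python
# Hypothetical initial model: constant synthesis, first-order degradation.
#   dP/dt = k_syn - k_deg * P
# Parameter values below are illustrative placeholders, not measurements.

def simulate(k_syn, k_deg, p0=0.0, t_end=50.0, dt=0.01):
    """Integrate dP/dt = k_syn - k_deg * P with forward Euler; return P at t_end."""
    p = p0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        p += dt * (k_syn - k_deg * p)
    return p


def predicted_fold_change(k_syn, k_deg, deg_factor):
    """Model prediction: ratio of late-time P levels after scaling k_deg by deg_factor."""
    baseline = simulate(k_syn, k_deg)
    perturbed = simulate(k_syn, k_deg * deg_factor)
    return perturbed / baseline


if __name__ == "__main__":
    # The testable hypothesis for Aim 2: what happens to P if degradation doubles?
    fold = predicted_fold_change(k_syn=2.0, k_deg=0.5, deg_factor=2.0)
    print(f"Predicted fold change in steady-state level: {fold:.3f}")
```

With this toy model, doubling the degradation rate predicts that the steady-state level halves. If the Aim 2 experiment shows something else, the model (that is, the hypothesis) is wrong in an informative way, and Aim 3 refines it.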

Just to make it clear again: The computer model is a hypothesis like any other hypothesis.

What is surprising is that some very well-known scientists have fallen into this trap. Some years ago I was at a meeting that was discussing how a very large project would use modeling, collect data, and so on. It was to be used as a fine example of how modeling would be combined with experiment. What they actually meant by "combine" was that in the last year of the project they would build a computer model of the system. I remember a number of us in the audience were really surprised at this: without a model up front, how would they know what data to collect in the first place? Not surprisingly, the project, after some years, just seemed to slowly peter out with very little to show for all the $$ that were spent, and definitely no model. These kinds of projects give modeling a bad name, and there are an equally large number of smaller ones that seem to get through peer review and fail just as badly in their own small way.

I can only assume that the many modeling courses and workshops that exist just don’t get this message across. As reviewers of proposals we should be diligent about spotting this fundamental error, and we should teach our students what computer modeling is and how it should be interwoven into a scientific project.

Next time you’d like to put some computer modeling into a grant proposal, put the model first.

Footnote: Perhaps the exception to the hypothesis-driven approach is the pure data-collecting proposal, such as sequencing or simply gathering data, which takes a different approach called Discovery Science (Equipping scientists for the new biology). I am not concerned with that type of proposal here.
