Category Archives: Cardiac Electrophysiology


How I got our cluster to send me MMS movies of my simulations

As the models used in our lab become larger both in resolution (more detail) and gross size (bigger pieces of tissue), the time and effort required to visualize and otherwise check results increase. With the largest model currently used in the lab (mine), one short simulation produces 2.0 GB of uncompressed data. Compression gets it down to about 500 MB or so. That still means that — just to make sure the simulation ran correctly — I have to download 500 MB of data, load my model into a viewer of some kind, and load and view the data. This is not acceptable, especially since even loading the data requires a machine with significant graphical power and a large hard drive.

There are other ways to visualize our data. A little while ago (two years?), our programmers Rob and Umar put together an off-screen renderer for the IBVRE project. It's based on VTK and coded in Python. It simply loads the model with views from all 6 sides, maps the data onto the surface, and then writes an image file.

I spent 12 hours Saturday resurrecting this software and tweaking it for my own needs. Now it just loads one view of my model, efficiently steps through a specified number of time steps (it did this very very inefficiently before), writing each one to an image file, and then exits. I run this on the cluster with a script that joins all of the images into a movie and then emails it to me.
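If you're curious what that kind of off-screen rendering loop looks like, here is a minimal sketch using VTK's Python bindings. The file names, the scalar range, and the per-time-step loading helper are all made up for illustration; the actual renderer Rob and Umar wrote (and my tweaked version) is more involved than this.

```python
# Minimal off-screen rendering sketch with VTK's Python bindings.
# File names, data layout, and the loading helper are hypothetical.
import vtk

# Load a surface mesh of the model (format assumed for this sketch).
reader = vtk.vtkXMLPolyDataReader()
reader.SetFileName("model_surface.vtp")
reader.Update()
surface = reader.GetOutput()

mapper = vtk.vtkPolyDataMapper()
mapper.SetInputData(surface)
mapper.SetScalarRange(-90.0, 40.0)  # e.g., membrane voltage range in mV

actor = vtk.vtkActor()
actor.SetMapper(mapper)

renderer = vtk.vtkRenderer()
renderer.AddActor(actor)

window = vtk.vtkRenderWindow()
window.SetOffScreenRendering(1)   # no display needed on a cluster node
window.AddRenderer(renderer)
window.SetSize(640, 480)

to_image = vtk.vtkWindowToImageFilter()
to_image.SetInput(window)

writer = vtk.vtkPNGWriter()
writer.SetInputConnection(to_image.GetOutputPort())

for step in range(100):
    # Swap in the scalar data for this time step; loading is problem-specific,
    # so this helper is a placeholder, not a real function in our code.
    voltage = load_voltage_for_step(step)       # hypothetical helper
    surface.GetPointData().SetScalars(voltage)  # attach as the active scalars
    surface.Modified()

    window.Render()
    to_image.Modified()
    to_image.Update()
    writer.SetFileName("frame_%04d.png" % step)
    writer.Write()
```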

I currently have the cluster email me when a job is done. I also have mail filters that forward these messages to my phone. However, I can now do better. I have integrated the rendering program with my cluster run scripts, such that the following happens. When a simulation is finished, the visualization program is run and dumps images of all of the time steps. They are then joined to make a movie, emailed to me, and emailed directly to my phone. Thus, when the simulation is finished, I not only get an email notification, but I can review the video right on my phone.
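The glue for that last step is simple. Here is a rough sketch of the kind of post-run hook involved; the ffmpeg options, paths, and addresses are placeholders (the phone address assumes a carrier email-to-MMS gateway), and this is illustrative rather than my actual run script.

```python
# Rough sketch of a post-run hook: join rendered frames into a movie and
# email it. Paths, addresses, and encoder options are placeholders.
import smtplib
import subprocess
from email.message import EmailMessage

# Join the PNG frames into a small movie with ffmpeg (options are illustrative).
subprocess.run(
    ["ffmpeg", "-y", "-i", "frame_%04d.png", "-b:v", "500k", "sim.mp4"],
    check=True,
)

msg = EmailMessage()
msg["Subject"] = "Simulation finished"
msg["From"] = "cluster@example.edu"                       # placeholder
msg["To"] = "me@example.edu, 5551234567@mms.example.com"  # inbox + MMS gateway
msg.set_content("Simulation done; movie attached.")
with open("sim.mp4", "rb") as f:
    msg.add_attachment(f.read(), maintype="video", subtype="mp4",
                       filename="sim.mp4")

with smtplib.SMTP("localhost") as server:
    server.send_message(msg)
```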

If you’re not a geeky type, this may not impress you. To me, it is the very apex of cool. Enough so that it drove me to stay at work for nearly 13 hours on a Saturday. Here’s a sample video:

Now, instead of downloading 500 MB or more to a high-powered workstation (or letting it limp along on my laptop) to check the outcome of a simulation, I can have a 1 to 4 MB video automatically sent to my mobile phone and watch it wherever I am.

It all begins when the gates open.

Maria has an excellent little vignette about nerve firing (which is very very similar to cardiac cells firing) over at intueri. Here is an excerpt (whose prose is typical of her excellent writing):

Sodium ions flood into the single brain cell through the gates of channels linking the exterior and interior of the neuron. Other channels along the length of the cell follow suit, opening more gates to allow more sodium ions in. A deluge of positively-charged atoms overtakes the single cell.

Other channels follow the precedence, though they selectively permit calcium, not sodium, ions to join the influx. The excess positive charge soon beckons the potassium channels to open and, with gusto, potassium ions flee from the interior of the cell.

Whew! Jobs running on the cluster. (And travel.)

I've been totally absent from most of my life the last week as a result of some problems we had with our code on the cluster. My jobs kept dying, taking down compute nodes in the process, for no apparent reason. After a while I narrowed it down to the time when restart files (from a previous simulation) are read. It turns out that the way the files were read (and it was done that way for a good reason) was really brutal on the network. It involved way too much communication. This was okay for smaller models, but I currently have the largest model we've ever run in the lab.

After a conversation with our current programmer and one with our former programmer, and about 6 hours of coding last night, the restart files are now read in a less naughty way, and my jobs are reliably running.
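The actual fix is specific to our code, so I won't reproduce it here, but the general pattern of replacing lots of small messages with one collective read-and-scatter looks something like this in mpi4py. This is a sketch with a made-up restart format and sizes, not our simulation code.

```python
# Sketch: read a restart file on one rank and hand out blocks with a single
# collective, instead of many tiny point-to-point messages.
# File format and sizes are made up; this is not the lab's actual restart code.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

n_values = 1_000_000  # pretend the restart state has this many values

# Simple block partition: how many values each rank owns, and where they start.
counts = np.full(size, n_values // size, dtype=int)
counts[: n_values % size] += 1
displs = np.insert(np.cumsum(counts), 0, 0)[:-1]

if rank == 0:
    # Root reads the whole restart state once...
    state = np.fromfile("restart.bin", dtype=np.float64, count=n_values)
    sendbuf = [state, counts, displs, MPI.DOUBLE]
else:
    sendbuf = None

# ...and one Scatterv delivers each rank its block in a single collective call.
local = np.empty(counts[rank], dtype=np.float64)
comm.Scatterv(sendbuf, local, root=0)
```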

-----

Tomorrow I am leaving for about two and a half weeks in New Orleans and Mandeville! I have an early flight, preceded by an even earlier train ride to the airport. I should be hooked up to the “tubes” and (New Year’s resolution here I come) updating the blog more often with the blow-by-blow as I try to get enough data for a Heart Rhythm conference abstract in time for the deadline, despite all of the sundry delays with the cluster.

*gasps for breath*

-----

Also, Penguin liked my cluster video so much that they put it on their front page.

Five years in the lab: looking back, then forward

About this time five years ago, I was a nervous junior undergraduate studying Biomedical Engineering at Tulane University. I had just been accepted as an undergraduate member of Dr. Natalia Trayanova’s computational cardiac electrophysiology lab. The goal at that time was to complete a research project for my undergraduate thesis.

So very many things have happened since then. Here are the highlights:

  • 2002: Started learning the ropes of the lab
  • 2003: Continued to familiarize myself with the computers and code in use in the lab. The most powerful machine in our possession was an SGI with 8 processors (the Origin 300 listed here). There was almost always a wait to use those processors. Spent my summer vacation working in the lab. This was the first time I was paid to do research. Sometime during this year (I think) I created the lab wiki using MoinMoin. By this time I was administering the lab computers and was sick of answering the same questions over and over. In desperation I created a wiki and started putting answers on it, referring people to the wiki when I was asked a question. The wiki is now (as of November 2007) huge, and contains basically all of the documentation of everything used in the lab, as well as gigabytes upon gigabytes of attached models, data, and images.
  • 2004: Graduated from Tulane with my Bachelor of Science in Engineering (BSE) degree. Joined the lab as a graduate student. Sometime in 2004 (I think), Tulane acquired a Linux Networx cluster, and we owned 20 nodes in that cluster.
  • 2005: Shortly after returning from my trip to Niger, Katrina struck New Orleans. The lab was scattered. Few people in the lab had access to their data. A few lab members actually snuck past armed guards to get our file servers and some workstations from our lab at Tulane. We took up residence in St. Louis, MO for two and a half months, aided by our colleagues in the labs of Drs. Yoram Rudy and Igor Efimov at Washington University. By the end of the year, we had returned to a slowly-recovering New Orleans.
  • 2006: Dr. Trayanova accepted a position as a professor at Johns Hopkins University. Almost the entire lab transferred to JHU and moved to Baltimore, MD.
  • 2007: In April, I began discussing a cluster purchase with High Performance Computing (HPC) companies. Around that time, the weather warmed up, the server room could no longer be adequately cooled, and we started limping by on 4 compute nodes. By the end of July, we had placed an order for a new cluster. We moved from Clark Hall into the newly-completed though poorly-named Computational Science and Engineering Building. In mid-November, most of our new cluster arrived, though FedEx dropped and destroyed one rack, and the cluster was not completely set up.

That brings us to the present day. Now, looking forward a little:

In the next two weeks, the cluster set-up will be completed. We will have free rein on 140 compute nodes (20 old, 120 new), all managed from one head node. The new nodes will be connected by the fastest Infiniband interconnects available on the market, and each node will have 8 GB of RAM available, with the potential to hold 64 GB each. There are four 3.0 GHz Opteron cores per node, yielding a total of 480 processors and 960 GB of RAM on the new nodes alone.

To put that in perspective, let me give you some details about the kinds of models we run. When I joined the lab, our two largest models consisted of a 4 mm thick slice of the canine heart, and a very smooth, idealized model of the rabbit heart. These models are composed of 1.6 million and 0.82 million tetrahedral elements, respectively. It took something like an hour of wall-clock time per millisecond of simulation time to run these models. (In other words, to get one millisecond worth of simulation data it was necessary to wait about an hour.) We could run one or two simulations at a time, at that speed.

My newest model, and currently the largest model in use in the lab, is composed of 28 million tetrahedral elements. On a cluster similar to our new one (Lonestar on TeraGrid), using 32 processors, it takes about 22 minutes of wall-clock time to simulate one millisecond in the model. Using a crude unit of speed, (minutes of real time) / (ms of simulation time) / (million tetrahedral elements), and focusing only on the number of simulations we can run at once, not the number of CPUs required (a quick check of the arithmetic follows the list):

  • Old way: 60 minutes / 1 ms / 0.82 million tets ≈ 73 minutes / ms sim time / million tets
  • New way: 22 minutes / 1 ms / 28 million tets ≈ 0.79 minutes / ms sim time / million tets
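Here is that arithmetic checked with throwaway Python, using only the numbers quoted above:

```python
# Quick sanity check of the normalized speeds quoted above.
old_rate = 60 / 0.82   # minutes per ms of simulation per million tets (old)
new_rate = 22 / 28     # minutes per ms of simulation per million tets (new)

print(round(old_rate, 1))           # ~73.2
print(round(new_rate, 2))           # ~0.79
print(round(old_rate / new_rate))   # ~93, i.e. almost a 100-fold speedup
```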

We have increased our simulation speed almost 100-fold. We can run two to four simulations of that size at a time, versus one or two the old way. But that's not all. We can now run bigger models. Much bigger models. We are now capable of running something the size of a dog heart (we have verified this). More importantly, we now have the technical capacity to run a model the size of the human heart, with a resolution near the size of a cardiac cell, and to model contraction in addition to electrical activity. It remains only to develop such models. We are prepared to store the results: the new cluster has a storage capacity of 28 TB online, with the ability to add something like 40 or 50 TB more simply by expanding the existing storage device.

In my time in the lab, I have watched our abilities expand from serial jobs with relatively small models to massively parallel jobs with the capacity to model electrical and mechanical activity in the human heart. We are just beginning a very exciting time in the lab and in the field, and what's really killing me is the fact that there's so much more to tell you.

But I can’t just yet.

(This post was partly inspired by a conversation with Maria and Amanda)

Finding related articles graphically

When doing a literature search, it’s a good idea to start from a few articles and then (if they are along the lines of what you are looking for) use their references and articles that reference them to expand the search.

One handy way of doing that is with the HubMed Graph Browser. You get to it by finding an article (like mine here) and then selecting the “Graph” link next to “Related” in the line of options at the bottom.

Once you load the TouchGraph, you can see the related articles, change the depth of relationships graphed, zoom in and out, and so on. It can be a nice alternative to the normal related articles list, graphically showing distance and relation.