New physics, old computing

I was reading about the recent black hole merger simulations performed by the people at Goddard, more thoroughly described here and in this forthcoming article. These are, undoubtedly, beautiful results, and a testament to the complexity of Einstein’s equations when it comes to obtaining realistic results: according to the reports above, thousands of lines of code (plus an impressive array of supercomputers) were needed to obtain them. An impressive achievement, but still there’s something in it that makes me uneasy: if i’ve read it right, those thousands of lines of code are actually lines of Fortran (or, in the best case, C) code (more concretely, they’re using a library called Paramesh, written in Fortran 90). Now, if you ask anyone with a solid background in computer science, she will probably tell you that nobody (except physicists, that is) programs these days in Fortran. We know better languages, and have developed far better ways of writing computer programs in the 50 years since Fortran was invented. That is, we physicists are using obsolete technologies. Those newer languages (Scheme, Haskell, OCaml, and so on and so forth) are better in many ways, but especially in one that i am sure is close to any physicist’s heart: they provide far, far better means of abstraction. That is, you can write much shorter programs in a language that is conceptually closer to the problem at hand. And shorter may well mean something like a tenfold reduction in the number of lines of code; not to mention the benefits in clarity, maintainability and extensibility that greater abstraction entails. To use a metaphor, it’s as if we were using Levi-Civita’s books and notation as our standard way of calculating in General Relativity, instead of modern differential geometry.

Of course, there are perfectly understandable reasons for our using antiques like Fortran, legacy code being probably the most prominent one; and physicists not having the needed expertise might well be an important one too (but let me rush to say that efficiency of the code is not a good excuse these days). But i’m convinced that numerical physics would be vastly improved if we imported some expertise from the professional computing world. I’m told by friends in the field that some of the most ‘advanced’ guys are trying things like C++ and Java (instead of Fortran) these days: i’m sorry, but these languages were current some 20 years ago, and we’ve learnt since then how to avoid many of the pitfalls and unnecessary complexities they carry. Much more interesting is to use interactive languages like Python (to be on the conservative side) or, if you ask me, functional languages like Scheme or Haskell. To give you a glimpse of what i’m talking about, here is how you’d write quicksort in Fortran 95; in Haskell, it’s a two-liner:

qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
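
For anyone who wants to try it, here is a minimal self-contained sketch of the same definition. The type signature, the main function and the sample list are additions for illustration only, not part of the two-liner itself.

```haskell
-- The same two-line definition, with an explicit type signature added.
-- It works for any element type that can be ordered.
qsort :: Ord a => [a] -> [a]
qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

-- Sample input chosen for illustration.
main :: IO ()
main = print (qsort [3, 1, 4, 1, 5, 9, 2, 6 :: Int])
-- prints [1,1,2,3,4,5,6,9]
```

Note that this version builds new lists at every step instead of sorting in place, so it trades memory for brevity.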

Fortunately, not everyone sticks to Fortran these days: Michele Vallisneri’s Synthetic LISA is a beautiful example of a step in the right direction, and i’m glad to see that numerical libraries like PETSc do in fact provide Python bindings. But, as i said, i think (after nine years or so of earning a living writing computer programs) that there are even better ways. As a matter of fact, i’m seriously considering the possibility of writing some LISA simulation code using Scheme. What deters me, besides lack of time, is the enormous weight of tradition: almost everybody out there in the physics community is using those C and Fortran libraries, and that means millions of lines of well-tested code and wondrous results like those black hole mergers. The easiest thing to do is to go with the flow, but still…

(By the way, these comments are by no means intended as a critique of the work of Baker et al., which is impressive as it stands. Besides, for all i know, they may well be using far more sophisticated techniques than plain Fortran or C. My rants are more geared towards many other cases i’ve seen of physics programmers who were anything but sophisticated.)


15 Responses to “New physics, old computing”

  1. What’s the fastest computer on the earth? » chuyskywlk Says:

    [...] New physics, old computing [...]

  2. Alex Says:

    Thank You

  3. Pierre-Yves Gérardy Says:

    Answering a two-year-old post…

    Sussman and Wisdom used Scheme back in the early 90’s to prove that the motion of the planets of the solar system was chaotic…

    http://www.sciencemag.org/cgi/content/abstract/257/5066/56

    They even published a book describing their methodology (a new functional notation for physics equations, and how to implement them in Scheme): “Structure and Interpretation of Classical Mechanics”

    http://mitpress.mit.edu/SICM/ for the free full text online.

  4. Nico Says:

    Thanks for the interesting blog.

    I would imagine that if you’re talking about running code on “an impressive array of supercomputers” that efficiency of code _is_ important.

    But code efficiency is one factor among many in deciding what combination of language, algorithms, hardware, etc. will give you the best solution to whatever criteria you might find relevant. In addition to performing the simulation, some other factors might be the total time it takes to arrive at a solution, computing cost, programming time, etc.

  5. Adam Hupp Says:

    The Fortress language (currently in development at Sun) is being designed to fill the niche that Fortran currently resides in.

    “Fortress is a new programming language designed for high-performance computing (HPC) with high programmability. ”

    http://fortress.sunsource.net/

    One of the more exciting features is its heavy use of mathematical notation:

    http://research.sun.com/projects/plrg/faq/index.html#six

  6. Aaron Denney Says:

    That’s _not_ quicksort in Haskell. Quicksort is very carefully specified to be in place, and not allocate lots of memory while shuffling the elements around. Haskell is a great language, but a true Haskell quicksort looks far more complicated than this.
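
    A rough sketch of what an in-place version might look like, assuming the array package that ships with GHC and a Lomuto-style partition with the last element as pivot. The names swap, partition, sortRange and qsortInPlace are made up for illustration, and this is not a tuned implementation:

```haskell
import Control.Monad (when)
import Control.Monad.ST (ST)
import Data.Array.ST (STUArray, newListArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (elems)

-- Swap two elements of a mutable unboxed array.
swap :: STUArray s Int Int -> Int -> Int -> ST s ()
swap arr i j = do
  a <- readArray arr i
  b <- readArray arr j
  writeArray arr i b
  writeArray arr j a

-- Lomuto partition of the inclusive range [lo, hi], using arr[hi] as pivot.
-- Returns the final index of the pivot.
partition :: STUArray s Int Int -> Int -> Int -> ST s Int
partition arr lo hi = do
  pivot <- readArray arr hi
  let go i j
        | j >= hi = do swap arr i hi; return i
        | otherwise = do
            x <- readArray arr j
            if x < pivot
              then swap arr i j >> go (i + 1) (j + 1)
              else go i (j + 1)
  go lo lo

-- Recursively sort the inclusive range [lo, hi] without copying the array.
sortRange :: STUArray s Int Int -> Int -> Int -> ST s ()
sortRange arr lo hi = when (lo < hi) $ do
  p <- partition arr lo hi
  sortRange arr lo (p - 1)
  sortRange arr (p + 1) hi

-- Copy the list into a mutable array once, sort it in place, read it back.
qsortInPlace :: [Int] -> [Int]
qsortInPlace xs = elems $ runSTUArray $ do
  let n = length xs
  arr <- newListArray (0, n - 1) xs
  sortRange arr 0 (n - 1)
  return arr

main :: IO ()
main = print (qsortInPlace [3, 1, 4, 1, 5, 9, 2, 6])
-- prints [1,1,2,3,4,5,6,9]
```

    The only allocation here is the single array built from the input list; all the shuffling happens through reads and writes on that array inside the ST monad.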

  7. RandomInt() Says:

    C++ FTW.
    Fast, efficient, runs on anything.

  8. Rasmus Ulvund Says:

    As Aaron said:

    This is not quicksort.

    This algorithm begins thrashing much sooner.

  9. Warren Henning Says:

    I was dismayed at the presence of Fortran among applied math and physics people in college.

    I’m pretty sure MATLAB, LabView, and other (unfortunately, mostly proprietary) tools are quite popular among engineering (as opposed to science) people.

    Also, there’s this lingering stereotype that physicists are often better computer programmers than computer science majors. That might be true if they actively work at learning computer programming, but you need look no further than a typical “numerical methods in FORTRAN for physicists and engineers” type book to see horrible, horrible spaghetti code. Physicists are most certainly capable of producing totally unmaintainable shit code.

  10. mclaren Says:

    You’re an ignorant crackpot. Only a programmer could say something this ignorantly foolish.
    The entire IMSL is written in FORTRAN. It’s millions of lines of super-optimized code perfected over 40 years. No one is going to rewrite in the latest greatest script-kiddy bullshit language you and your clueless programmer pals came up with yesterday.
    Programming is a disaster. 50% of all large programming projects are cancelled because they become unworkable. While every other science races forward toward the stars, programming is still stuck in the dark ages building everything by hand and reinventing the wheel over…and over…and over…and over…and over…and over again.
    Every programmer I’ve ever known is an arrogant clueless clown who claims every programming task is “trivial,” then bogs down taking 5 times as long as estimated to produce code that never runs correctly. Every possible problem is “easy” to programmers: natural language translation, real-time music transcription, face recognition, they’re all “simple” and “easy” and “a piece of cake” according to programmers. Except that no programmer has been able to write a program that solves these “simple” “trivial” “easy” problems.
    Programming isn’t a profession, it’s a scam. The programmers explain that every task is “trivial,” then when they produce garbage code that doesn’t work 2 years after the deadline passed and 5x over-budget, they shrug and blame the programming language instead of their own gross incompetence. Then they invent some new programming language…as a form of welfare, so they can translate all their old buggy non-working programs into even buggier new non-working programs in the new programming language.
    Someone needs to invent an automated bitch-slap machine that will whack every ignorant arrogant incompetent programmer into silence when they spew out yet another ridiculous article like this one.

  11. jdag Says:

    You make a reasonably powerful argument for writing new code in one of the very many languages that is better than Fortran, because it’s a tough language to write big programs in. You will have a _much_ harder time arguing that forty-year-old code that has been extensively optimized and debugged should be dumped just because the people who wrote and perfected it over decades did it the hard way. The Brooklyn Bridge was built using now-100-year-old technology, but it would be stupid to refuse to cross it because the builders didn’t use modern equipment.

  12. steve Says:

    I invite you to try out some physics simulations in Ruby, Scheme, Haskell or LISP. You will very quickly discover the order of magnitude difference in performance.

    In other words, instead of needing 10 supercomputers to run a physics simulation, you will need 100 or 150 supercomputers - and thus a range of calculations moves into the “not possible” category.

    I program in a couple of the latter languages, and mainly in Objective C… but even I would go to Fortran when working in this field, mainly due to the huge highly optimised libraries that have been perfected over 50 years.

    I’m sorry, but you need to study more about the implementation of languages. Many of the very nice languages these days are based on virtual machines with interpreters and are consequently very slow.

  13. Viaceslav Says:

    I’m a first-year physics student, so I’m really confused about which programming language I should learn. I am going to do a master’s and a PhD and do research. So, as I understand from Steve’s comment above, I’m better off learning Fortran and C?

  14. Eneas Labra Says:

    With the introduction of multicore processors this discussion is quite obsolete. The processor is the new transistor. If I need 1000 computers, I will have 1000 computers in one. How can I manage 1000 computers in one? Fortran and C are not good for this kind of computation because they were designed for old computer architectures with one old-style memory (without transactional memory) and one CPU. Fewer CPU cycles are not everything if speed matters. Old Fortran code is not optimized for the machines of the future because the algorithms it implements are not optimized for parallelism. All the serious research on languages for the new machines is about functional programming languages (Haskell, Erlang, Scheme, …).
    Another question is the importance of languages such as Ruby, Python and Objective-C. They are the languages of the present. With these languages you can use a much higher level of abstraction than with C or Fortran. And if you need to, you can use bindings to C and Fortran libraries (which are faster at present).

  15. Guillaume Says:

    I agree with the author of the article. The biggest advantage of Fortran for heavy computations was the fact that it can optimize more than C because it doesn’t have to care about the pointer aliasing problem. But this is not even true anymore, because C now has strict aliasing rules and the restrict keyword, which allow it to optimize the same way as Fortran.

    Besides, as you say, the algorithms can often be split into a huge part that is not speed sensitive (which can be done in Python, Haskell, etc.) and a small low-level part, which can be written in Fortran, C, or even better in D. I don’t know about the black hole simulation code, but from my experience I would never judge the quality of a piece of software by its number of lines.

    I used to work in a physics laboratory doing quantum physics simulations. My code was 99 percent Python and 1 percent C. It ran as fast as any Fortran implementation, and it was much smaller, more readable and more elegant.

    Now about the speed of Fortran. I invite people to have a look at this benchmark:
    http://shootout.alioth.debian.org/debian/fortran.php
    There is not a single Fortran g95 implementation there that can beat C! (OK, I guess with Intel Fortran it could have been much better, but who knows, since it is forbidden to publish benchmarks for this Fortran implementation.)

    Many scientists I know justify their use of Fortran by the fact that it is faster, but they don’t even know why it is (in very, very specific cases) faster. And they use it for things that don’t even need to run fast. (Once I saw a physicist coding a home-made implementation of a regular expression automaton in Fortran: two hundred lines of buggy, useless code.) The guy didn’t even know this problem had been solved a long time ago by other people.

    Most of the PhD students in the place I work now are writing code that doesn’t need aggressive optimizations. They really waste their time with Fortran, and worse, it makes them see the act of coding as a painful experience, so they finally give up and just try to make their simulation work; and then when you read their code it is full of global variables, hacks like suppressing warning output, and other ugly stuff.

    Fortran shouldn’t be taught at university; it should be used only by people who know exactly why they are using it.
