New physics, old computing

I was reading about the recent black hole merger simulations performed by the people at Goddard, more thoroughly described here and in this forthcoming article. These are, undoubtedly, beautiful results, and a testament to the complexity of Einstein’s equations when it comes to obtaining realistic results: according to the reports above, thousands of lines of code (plus an impressive array of supercomputers) were needed to obtain them. An impressive achievement, but still there’s something in it that makes me uneasy: if i’ve read it right, those thousands of lines of code are actually lines of Fortran (or, in the best case, C) code (more concretely, they’re using a library called Paramesh, written in Fortran 90).

Now, if you ask anyone with a solid background in computer science, she will probably tell you that nobody (except physicists, that is) programs in Fortran these days. We know better languages, and have developed far better ways of writing computer programs in the 50 years since Fortran was invented. That is, we physicists are using obsolete technologies. Those newer languages (Scheme, Haskell, OCaml, and so on) are better in many ways, but especially in one that i am sure is close to any physicist’s heart: they provide far, far better means of abstraction. That is, you can write much shorter programs in a language that is conceptually closer to the problem at hand. And shorter may well mean something like a tenfold reduction in the number of lines of code, not to mention the benefits in clarity, maintainability and extensibility that greater abstraction entails. To use a metaphor, it’s as if we were using Levi-Civita’s books and notation as our standard way of calculating in General Relativity, instead of modern differential geometry.

Of course, there are perfectly understandable reasons for our using antiques like Fortran, legacy code being probably the most prominent one; and physicists not having the needed expertise might well be an important one too (but let me rush to say that efficiency of the code is not a good excuse these days). But i’m convinced that numerical physics would be vastly improved if we imported some expertise from the professional computing world. I’m told by friends in the field that some of the most ‘advanced’ guys are trying things like C++ and Java (instead of Fortran) these days: i’m sorry, but these languages were current some 20 years ago, and we’ve learnt since then how to avoid many of the pitfalls and unnecessary complexities they carry. Much more interesting is to use interactive languages like Python (to be on the conservative side) or, if you ask me, functional languages like Scheme or Haskell. To give you a glimpse of what i’m talking about, here is how you’d write quicksort in Fortran 95; in Haskell, it’s a two-liner:

qsort []     = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

Fortunately, not everyone sticks to Fortran these days: Michele Vallisneri’s Synthetic LISA is a beautiful example of a step in the right direction, and i’m glad to see that numerical libraries like PETSc do in fact provide Python bindings. But, as i said, i think (after nine years or so of earning a living writing computer programs) that there are even better ways. As a matter of fact, i’m seriously considering the possibility of writing some LISA simulation code in Scheme. What deters me, besides lack of time, is the enormous weight of tradition: almost everybody out there in the physics community is using those C and Fortran libraries, and that means millions of lines of well-tested code and wondrous results like those black hole mergers. The easiest thing to do is to go with the flow, but still…

(By the way, these comments are by no means intended as a critique of the work of Baker et al., which is impressive as it stands. Besides, for all i know, they may well be using far more sophisticated techniques than plain Fortran or C. My rants are geared more towards the many other cases i’ve seen of physics programmers who were anything but sophisticated.)



20 Responses to “New physics, old computing”

  1. What’s the fastest computer on the earth? » chuyskywlk Says:

    […] New physics, old computing […]

  2. Alex Says:

    Thank You

  3. Pierre-Yves Gérardy Says:

    Answering a two-year-old post…

    Sussman and Wisdom used Scheme back in the early 90’s to prove that the motion of the planets of the solar system was chaotic…

    They even published a book describing their methodology (a new functional notation for physics equations, and how to implement it in Scheme): “Structure and Interpretation of Classical Mechanics”; the full text is free online.

  4. Nico Says:

    Thanks for the interesting blog.

    I would imagine that if you’re talking about running code on “an impressive array of supercomputers” that efficiency of code _is_ important.

    But code efficiency is one factor among many in deciding what combination of language, algorithms, hardware, etc. will give you the best solution to whatever criteria you might find relevant. In addition to performing the simulation, some other factors might be the total time it takes to arrive at a solution, computing cost, programming time, etc.

  5. Adam Hupp Says:

    The Fortress language (currently in development at Sun) is being designed to fill the niche that Fortran currently resides in.

    “Fortress is a new programming language designed for high-performance computing (HPC) with high programmability. ”

    One of the more exciting features is its heavy use of mathematical notation:

  6. Aaron Denney Says:

    That’s _not_ quicksort in Haskell. Quicksort is very carefully specified to be in place, and not allocate lots of memory while shuffling the elements around. Haskell is a great language, but a true Haskell quicksort looks far more complicated than this.
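To make the objection above concrete, here is a minimal sketch of what an in-place quicksort can look like in Haskell. It is only an illustration under stated assumptions: it sorts `Int`s in a mutable unboxed array (`STUArray` from the `array` package that ships with GHC), and it uses the Lomuto partition scheme, chosen for brevity rather than because it is the canonical one. The names `qsortInPlace`, `sortRange`, `partitionRange` and `swap` are made up for this example.

```haskell
module Main where

import Control.Monad (when)
import Control.Monad.ST (ST)
import Data.Array.ST (STUArray, newListArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (elems)

-- Copy the list into a mutable unboxed array, sort that array in place,
-- then read the elements back out as a list.
qsortInPlace :: [Int] -> [Int]
qsortInPlace xs = elems (runSTUArray (do
    let n = length xs
    arr <- newListArray (0, n - 1) xs
    sortRange arr 0 (n - 1)
    return arr))

-- Sort the slice arr[lo..hi] (inclusive bounds) in place.
sortRange :: STUArray s Int Int -> Int -> Int -> ST s ()
sortRange arr lo hi = when (lo < hi) $ do
    p <- partitionRange arr lo hi
    sortRange arr lo (p - 1)
    sortRange arr (p + 1) hi

-- Lomuto partition: arr[hi] is the pivot; returns the pivot's final index.
partitionRange :: STUArray s Int Int -> Int -> Int -> ST s Int
partitionRange arr lo hi = do
    pivot <- readArray arr hi
    let loop i j
          | j >= hi   = return i
          | otherwise = do
              x <- readArray arr j
              if x < pivot
                then swap arr i j >> loop (i + 1) (j + 1)
                else loop i (j + 1)
    p <- loop lo lo
    swap arr p hi
    return p

-- Exchange the elements at indices i and j.
swap :: STUArray s Int Int -> Int -> Int -> ST s ()
swap arr i j = do
    a <- readArray arr i
    b <- readArray arr j
    writeArray arr i b
    writeArray arr j a

main :: IO ()
main = print (qsortInPlace [3, 1, 4, 1, 5, 9, 2, 6])  -- prints [1,1,2,3,4,5,6,9]
```

The sorting itself allocates no intermediate lists, which is the property the two-liner gives up; the price is roughly forty lines instead of two.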

  7. RandomInt() Says:

    C++ FTW.
    Fast, efficient, runs on anything.

  8. Rasmus Ulvund Says:

    As Aaron said:

    This is not quicksort.

    This algorithm begins thrashing much sooner.

  9. Warren Henning Says:

    I was dismayed at the presence of Fortran among applied math and physics people in college.

    I’m pretty sure MATLAB, LabView, and other (unfortunately, mostly proprietary) tools are quite popular among engineering (as opposed to science) people.

    Also, there’s this lingering stereotype that physicists are often better computer programmers than computer science majors. That might be true if they actively work at learning computer programming, but you need look no further than a typical “numerical methods in FORTRAN for physicists and engineers” type book to see horrible, horrible spaghetti code. Physicists are most certainly capable of producing totally unmaintainable shit code.

  10. mclaren Says:

    You’re an ignorant crackpot. Only a programmer could say something this ignorantly foolish.
    The entire IMSL is written in FORTRAN. It’s millions of lines of super-optimized code perfected over 40 years. No one is going to rewrite in the latest greatest script-kiddy bullshit language you and your clueless programmer pals came up with yesterday.
    Programming is a disaster. 50% of all large programming projects are cancelled because they become unworkable. While every other science races forward toward the stars, programming is still stuck in the dark ages building everything by hand and reinventing the wheel over…and over…and over…and over…and over…and over again.
    Every programmer I’ve ever known is an arrogant clueless clown who claims every programming task is “trivial,” then bogs down taking 5 times as long as estimated to produce code that never runs correctly. Every possible problem is “easy” to programmers: natural language translation, real-time music transcription, face recognition, they’re all “simple” and “easy” and “a piece of cake” according to programmers. Except that no programmer has been able to write a program that solves these “simple” “trivial” “easy” problems.
    Programming isn’t a profession, it’s a scam. The programmers explain that every task is “trivial,” then when they produce garbage code that doesn’t work 2 years after the deadline passed and 5x over-budget, they shrug and blame the programming language instead of their own gross incompetence. Then they invent some new programming language…as a form of welfare, so they can translate all their old buggy non-working programs into even buggier new non-working programs in the new programming language.
    Someone needs to invent an automated bitch-slap machine that will whack every ignorant arrogant incompetent programmer into silence when they spew out yet another ridiculous article like this one.

  11. jdag Says:

    You make a reasonably powerful argument for writing new code in one of the very many languages that are better than Fortran, because it’s a tough language to write big programs in. You will have a _much_ harder time arguing that forty-year-old code that has been extensively optimized and debugged should be dumped just because the people who wrote and perfected it over decades did it the hard way. The Brooklyn Bridge was built using now-100-year-old technology, but it would be stupid to refuse to cross it because the builders didn’t use modern equipment.

  12. steve Says:

    I invite you to try out some physics simulations in Ruby, Scheme, Haskell or LISP. You will very quickly discover the order of magnitude difference in performance.

    In other words, instead of needing 10 supercomputers to run a physics simulation, you will need 100 or 150 supercomputers – and thus a range of calculations moves into the “not possible” category.

    I program in a couple of the later languages, and mainly in Objective C… but even I would go to Fortran when working in this field, mainly due to the huge highly optimised libraries that have been perfected over 50 years.

    I’m sorry, but you need to study more about the implementation of languages. Many of the very nice languages these days are based on virtual machines with interpreters and are consequently very slow.

  13. Viaceslav Says:

    I’m a first-year physics student, so I’m really confused about which programming language I should learn. I am going to do a master’s and a PhD and do research. So, as I understand from Steve’s comment above, I’m better off learning Fortran and C?

  14. Eneas Labra Says:

    With the introduction of multicore processors this discussion is quite obsolete. The processor is the new transistor. If i need 1000 computers i will have 1000 computers in one. How can i manage 1000 computers in one? Fortran and C are not good for this kind of computation because they were designed for old computer architectures with a single conventional memory (no transactional memory) and one CPU. Using fewer CPU cycles is not everything when speed matters. Old Fortran code is not optimized for the machines of the future because the algorithms it implements are not optimized for parallelism. All the serious research on languages for the new machines is about functional programming languages (Haskell, Erlang, Scheme, …).
    Another question is the importance of languages such as Ruby, Python and Objective-C. They are the languages of the present. With these languages you can work at a much higher level of abstraction than with C or Fortran. And if you need to, you can use bindings to C and Fortran libraries (which are faster at present).

  15. Guillaume Says:

    I agree with the author of the article. The biggest advantage of Fortran for heavy computations was that its compilers can optimize more than C compilers because they don’t have to worry about the pointer aliasing problem. But this is not even true anymore, because C now has strict aliasing rules and the restrict keyword, which allow the same optimizations as in Fortran.

    Besides, as you say, the algorithms can often be split into a huge part that is not speed-sensitive (and can be done in Python, Haskell, etc.) and a small low-level part that can be written in Fortran, C, or even better in D. I don’t know about the black hole simulation code, but from my experience I would never judge the quality of a piece of software by its number of lines.

    I used to work in a physics laboratory doing quantum physics simulations. My code was 99 percent Python and 1 percent C. It ran as fast as any Fortran implementation, and it was much smaller, more readable and more elegant.

    Now about the speed of fortran. I invite people to have a look at the computer benchmark :
    There is not a single Fortran (g95) implementation there that beats C! (OK, I guess with Intel Fortran it could have been much better, but who knows, since it is forbidden to publish benchmarks for that Fortran implementation.)

    Many scientists I know justify their use of Fortran by the fact that it is faster, but they don’t even know why it is (in very, very specific cases) faster. And they use it for things that don’t even need to run fast. (Once I saw a physicist coding a home-made, buggy regular-expression automaton in Fortran: two hundred lines of buggy, useless code.) The guy didn’t even know this problem had been solved a long time ago by other people.

    Most of the PhD students in the place i work now are writing code that doesn’t need aggressive optimization. They really waste their time with Fortran and, worse, it makes them see the act of coding as a painful experience; they finally give up and just try to make their simulation work, and then when you read their code it is full of global variables, hacks like suppressing warning output, and other ugly stuff.

    Fortran shouldn’t be taught in university, it should be used only by people who know exactly why they use it.

  16. zur Says:

    Efficiency matters a lot with this kind of program … It can matter a lot that your simulation finishes in one night, and not in four days. Or not in one month, if you’d dare to use some absolutely unsuitable high level scripting language to write the program.

    Fortran 90 is not obsolete. It has its serious disadvantages as a general purpose language, but it is much better suited for writing numerical stuff than C. It is also easier to learn.

    People being stuck with Fortran 77 (or even Fortran 66!), and not willing to learn something new or document their code is of course another matter. I hate it when people say how superior their 40 year old unmaintainable horror of unstructured Fortran code is …

  17. navaburo Says:

    Fortran is used extensively (and if I am not mistaken, exclusively) at my uni for physics simulations.

    Personally I have written physics simulations in C and in Scheme, and I must say the C implementation has much more impressive speed and graphics (since I have not found (or, admittedly, even looked for) SDL bindings for Scheme). However, the Scheme code is actually structured, readable, and largely reusable, as a result of the language. I know I could have made my C code readable, structured, and reusable, but I didn’t. I also have learned more about the underlying mathematics by programming in Scheme. And I have time to get coffee while waiting for simulations to run!

  18. Rafael S. Calsaverini Says:

    There are LOTS of important problems in scientific programming that are NOT critical on the side of runtime efficiency.

    I’m a physicist and many of the simulations I code are ready in a matter of hours. I would very gladly accept (as I did) a 25% increase in runtime to achieve a 90% decrease in programming time by using a combination of Python and C (and a 99.9% decrease in my consumption of aspirin).

    About Haskell, it’s a fantastic language, with incredible capabilities. Of course you’re not going to throw BLAS or LAPACK away and rewrite that stuff in Haskell. You don’t need to! You can load your favourite LAPACK functions right inside Haskell and then use them to do the runtime-critical part of your program. The foreign function interface provided by GHC is FANTASTIC. You can patch your C libraries to work inside Haskell in 15 minutes.
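As a rough illustration of that interface, here is a minimal sketch that binds one C function and calls it from Haskell. It uses `cos` from the C math library as a stand-in for a LAPACK routine, an assumption made only so that the example needs no extra libraries to link; the names `c_cos` and `cosViaC` are hypothetical.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble)

-- Import the C library's cos as a pure Haskell function.
-- "unsafe" is appropriate here: the call is short and never calls back
-- into Haskell.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble

-- Thin wrapper converting between Haskell's Double and C's double.
cosViaC :: Double -> Double
cosViaC = realToFrac . c_cos . realToFrac

main :: IO ()
main = print (cosViaC 0)  -- prints 1.0
```

Binding a real numerical routine follows the same pattern, but the arguments are typically pointers to arrays, so the import would return an `IO` action and the wrapper would have to marshal the data.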

    But the real advantage, I think, is that Haskell gives theoretical physicists in particular a way into scientific computing *beyond numerical calculation*, in a very efficient language whose runtime is just a little bigger than C’s.

    At Haskell’s level of abstraction you can code Group Theory. You can code Quantum Field Theory. You can code Statistical Physics. You can calculate Green functions analytically. You can manipulate quantum operators. You can do lots of things that today are only possible with pen and paper or with expensive and extremely slow computer algebra packages.

    I’m very interested in writing Haskell code for physical simulations. I’ve done some stuff, but not much. But what I’ve done really impressed me.

    Just take a look at this: for example.

  19. Rafael S. Calsaverini Says:

    I mean, of course you can code Group Theory in C, but that would take thousands of lines of code to do what can be done in a few dozen lines in Haskell. The thing is, algebraic data types and Haskell type classes allow you to really code mathematical structures almost *as they are defined in your math textbook*.

    This is another cool source:

    This guy coded abstract mathematical structures like groups, rings, fields, vector spaces, tensor products and all that stuff in Haskell. Imagine a programming language that can trivially have a vector field as a data type!
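To give a flavour of what coding a structure "as defined in the textbook" can mean, here is a minimal sketch of a group as a type class. It is not taken from the source linked above; the class `Group`, the type `Z5` and the function `power` are hypothetical names invented for this example, and the group laws are documented in comments rather than enforced by the compiler.

```haskell
module Main where

-- A group: an associative operation, an identity element, and inverses.
-- The laws themselves are not checked by the compiler.
class Group g where
  unit :: g
  op   :: g -> g -> g
  inv  :: g -> g

-- The integers modulo 5 under addition.
newtype Z5 = Z5 Int deriving (Eq, Show)

-- Smart constructor that reduces any Int into the range 0..4.
mkZ5 :: Int -> Z5
mkZ5 n = Z5 (n `mod` 5)

instance Group Z5 where
  unit             = Z5 0
  op (Z5 a) (Z5 b) = mkZ5 (a + b)
  inv (Z5 a)       = mkZ5 (negate a)

-- The direct product of two groups is again a group.
instance (Group a, Group b) => Group (a, b) where
  unit             = (unit, unit)
  op (a, b) (c, d) = (op a c, op b d)
  inv (a, b)       = (inv a, inv b)

-- Written once, usable for every group: x combined with itself n times.
power :: Group g => g -> Int -> g
power x n
  | n < 0     = power (inv x) (negate n)
  | n == 0    = unit
  | otherwise = op x (power x (n - 1))

main :: IO ()
main = do
  print (power (mkZ5 2) 3)                             -- prints Z5 1
  print (op (mkZ5 3, mkZ5 4) (inv (mkZ5 3, mkZ5 4)))   -- prints (Z5 0,Z5 0)
```

The product instance is where the abstraction pays off: the pair of two groups becomes a group in four lines, and `power` works on it without modification.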

  20. java programming lesson Says:

    java programming lesson…

    […]New physics, old computing « physics musings[…]…
