Taking Multi-core Programming Into The Bazaar: An Argument for Open Source Tools

All the major CPU manufacturers have thrown their lot in with multi-core designs. The (multi-billion dollar) question now is how to program these devices. I can tell you with some confidence that we don’t yet know what the answer will be in 10 years. I can’t imagine that any single company can reliably solve this problem…and I think the Open Source community is essential to finding the answer. The main reason lies in the relatively unexplored territory of how multi-core programming models interact. If I’m preaching to the choir (though not in a Cathedral…see below), feel free to skip the rest of this. However, if you’re still unconvinced, read on. Admittedly, much of this argument is not new, but I think the challenges of multi-core programming create a greater imperative.

In today’s parallel programming models, we have a variety of approaches that work now, but they all have shortcomings and limitations. In most cases this isn’t an intrinsic problem with the languages or tools, nor even with their implementations. Rather, it is a shortcoming in our vision: for the most part, as we invented these models, we didn’t envision or implement them to work together.

Getting them to work together isn’t trivial, but it is doable in most cases. (For example, we’ll often find that the underlying threading runtimes weren’t designed to play well with others, but this can be fixed.) The real problem is that of these many choices, some will need to be mutated and many combinations will need to be tried. These models can and will be combined in thousands of interesting ways, with many different semantic implications. Each of these efforts will be risky, all being more likely to fail than succeed on the way to perfecting the model(s) and language(s) that will ultimately be used for large-scale parallel programming. Though big companies do take risks, they are fairly risk-averse for the most part. Moreover, we tend to leverage our existing development investments as much as possible. This means that a fatally flawed bet (product) is not likely to be readily tossed out, as sound technical “natural selection” would require.

The experimental substrate for this evolutionary churn must be real applications, but here again we run into the risks that any (sensible) large software company must weigh. When developing a new major version of a product, it is highly unlikely that the code base is completely rewritten or even significantly turned over. Estimates vary, but let’s assume that major version revisions change (often much) less than 30% of the source base. Given this, how likely is it that a major, risk-averse software developer would rewrite substantial portions (>50%) of an important application to use a combination of parallel programming models? Especially when the initial benefit of parallel programming (raw performance, as opposed to longer-term feature differentiation) is of limited value to the typical application? How about several such models that have never been used together?

This is the great challenge facing us and it is a daunting one. For example, in the research labs, we develop a pretty wide range of multi-core related programming technologies around data parallelism, implicit parallelism, functional programming languages, transactional memory, and speculative multithreading. We have barely begun to think about how these different models interact (we’re starting with the Pillar project).

So what is the answer? I have a strong intuition that the answer lies in the open source community, with its iconoclastic brilliance, unabashed bravado, fearless experimentation, enormous energy and (growing) size, and commitment to quality software development. The open source community may well be the only place where parallel programming constructs, models, libraries and compilers can be deconstructed and recombined at the scale and pace required in the coming years (see The Cathedral and the Bazaar). For recent evidence of this, look at the amazing pace of innovation in web application frameworks (Ruby on Rails is a favorite example).

Does this mean we’re abandoning differentiation in our bread-and-butter products? Hardly. There are so many other components of a platform on which companies can differentiate and compete. For chip companies, we ultimately live and die by leading with our architecture and manufacturing technologies. Programming tools are critical to delivering that value to programmers, but their impact is limited to the extent that access to them is limited.

6 Responses to Taking Multi-core Programming Into The Bazaar: An Argument for Open Source Tools

  1. Charles Bess says:

    I had this argument with someone the other day. Much of the current code in production is relatively linear in construction. It will not be replaced easily. On the other hand, there are large areas of value generation that have never been tackled because they would have required parallel processing – areas like simulation and pattern recognition. Let’s say that the current computational mix stays the same but the additional capabilities consume the majority of processing. It could be that we’re just moving to a new paradigm that needs wholly different tools as well as deliverables.

  2. I think demand for parallel computing must increase many-fold in order to motivate people to get involved, whether in open source or in commercial applications/tools. I typically find people arguing that modern processors are fast enough, that most jobs in the industry DO NOT ask for knowledge of parallel computing, and hence that it isn’t worth the learning effort (but I certainly do not agree). Demand is necessary to get these people to think otherwise.
    I do think open source tools play an important role, especially in showing these people how multi-core processors can directly benefit their applications. Those who do not evolve will lose out to their competitors. Things will move quickly once people revamp their old mindset and start thinking of parallel/concurrent as the default and serial as the special case.

  3. You’re both touching on the same issue: how to translate parallel performance into end-user value and experience; i.e., something more tangible than the raw performance benefits. I think almost everyone agrees that there are new usages, features, etc. that would benefit from this performance…but many of them would require some commitment to refactoring applications to take advantage of it.
    BTW, I think we end up simultaneously better training our university students (which we’ll reap the benefits of in 10 years) while we provide easy-to-use tools (which we’ll benefit from in the very near term).

  4. The training has to spread wider than merely computer science/engineering students. I find many potential applications are actually written by people from other disciplines. CFD simulation, by mechanical engineers? Seismic processing, by geophysicists? … You’ll hardly find them hiring computer scientists/engineers for their development. When they do get some computer scientists/engineers with very little domain knowledge, they simply hope those hires will do all the dirty work required for parallel computing, which is almost impossible.
    How computer scientists/engineers get trained in various application domains, and how non-computer-scientists/engineers get trained in parallel computing, remains an open question. Schools simply do not teach students outside their own domains. Perhaps students should be encouraged to take subjects outside their core domains.

  5. Tom Conte says:

    The parallel programming community is the Boy Who Cried Wolf: we have said many times, “OK, NOW you have to start parallel programming, because we have no choice!” I remember the 80s: the exaggerated reports of the uniprocessor’s death, the huge investment in massively parallel systems, and the eventual rise of the “killer micros.” That cynicism is what keeps ISVs, and software engineering in general, from believing us this time.
    But the Power Wall is for real. It is hard as rock and not budging. It’s telling that major HARDWARE vendors are advocating parallel programming now: there will be no “killer micro” saviors this time around. Physics has truly and finally boxed us in.
    Anwar, you are right that it must be the open source community that takes this on, because even universities are stuck with too much inertia to change the status quo. It’s not JUST a systems problem; it’s a software engineering philosophical problem, a CS education problem, a generational problem. We’re the Boy Who Cried Wolf, and they don’t get that this time the wolf is for real and he’s hungry! However, the open source community of iconoclasts has no such baggage. That is indeed where the “fun” will happen in the coming decade. I predict the arc of the transition for the next ten years will follow a model that would make Thomas Kuhn proud.