Python, Ruby, Haskell - Do they provide true multithreading?


Question

We are planning to write a highly concurrent application in any of the Very-High Level programming languages.

1) Do Python, Ruby, or Haskell support true multithreading?

2) If a program contains threads, will a Virtual Machine automatically assign work to multiple cores (or to physical CPUs if there is more than 1 CPU on the mainboard)?

True multithreading = multiple independent threads of execution utilize the resources provided by multiple cores (not by only 1 core).

False multithreading = threads emulate multithreaded environments without relying on any native OS capabilities.

1
16
12/17/2009 1:22:09 PM

Accepted Answer

1) Do Python, Ruby, or Haskell support true multithreading?

This has nothing to do with the language. It is a question of the hardware (if the machine only has 1 CPU, it is simply physically impossible to execute two instructions at the same time), the Operating System (again, if the OS doesn't support true multithreading, there is nothing you can do) and the language implementation / execution engine.

Unless the language specification explicitly forbids or enforces true multithreading, this has absolutely nothing whatsoever to do with the language.

All the languages that you mention, plus all the languages that have been mentioned in the answers so far, have multiple implementations, some of which support true multithreading, some don't, and some are built on top of other execution engines which might or might not support true multithreading.

Take Ruby, for example. Here are just some of its implementations and their threading models:

  • MRI: green threads, no true multithreading
  • YARV: OS threads, no true multithreading
  • Rubinius: OS threads, true multithreading
  • MacRuby: OS threads, true multithreading
  • JRuby, XRuby: JVM threads, depends on the JVM (if the JVM supports true multithreading, then JRuby/XRuby does, too, if the JVM doesn't, then there's nothing they can do about it)
  • IronRuby, Ruby.NET: just like JRuby, XRuby, but on the CLI instead of on the JVM

See also my answer to another similar question about Ruby. (Note that that answer is more than a year old, and some of it is no longer accurate. Rubinius, for example, uses truly concurrent native threads now, instead of truly concurrent green threads. Also, since then, several new Ruby implementations have emerged, such as BlueRuby, tinyrb, Ruby Go Lightly, Red Sun and SmallRuby.)

Similar for Python:

  • CPython: native threads, no true multithreading
  • PyPy: native threads, depends on the execution engine (PyPy can run natively, or on top of a JVM, or on top of a CLI, or on top of another Python execution engine. Whenever the underlying platform supports true multithreading, PyPy does, too.)
  • Unladen Swallow: native threads, currently no true multithreading, but fix is planned
  • Jython: JVM threads, see JRuby
  • IronPython: CLI threads, see IronRuby

For Haskell, at least the Glorious Glasgow Haskell Compiler supports true multithreading with native threads. I don't know about UHC, LHC, JHC, YHC, HUGS or all the others.

For Erlang, both BEAM and HiPE support true multithreading with green threads.

2) If a program contains threads, will a Virtual Machine automatically assign work to multiple cores (or to physical CPUs if there is more than 1 CPU on the mainboard)?

Again: this depends on the Virtual Machine, the Operating System and the hardware. Also, some of the implementations mentioned above, don't even have Virtual Machines.

33
12/17/2009 1:35:17 PM

The Haskell implementation, GHC, supports multiple mechanisms for parallel execution on shared memory multicore. These mechanisms are described in "Runtime Support for Multicore Haskell".

Concretely, the Haskell runtime divides work be N OS threads, distributed over the available compute cores. These N OS threads in turn run M lightweight Haskell threads (sometimes millions of them). In turn, each Haskell thread can take work for a spark queue (there may be billions of sparks). Like so: enter image description here

The runtime schedules work to be executed on separate cores, migrates work, and load balances. The garbage collector is also a parallel one, using each core to collect part of the heap.

Unlike Python or Ruby, there's no global interpreter lock, so for that, and other reasons, GHC is particularly good on mulitcore in comparison, e.g. Haskell v Python on the multicore shootout


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon