The rise of multicore processors and programmable GPUs has sparked a wave of developments in parallel programming languages.
Developers seeking to exploit multicore and manycore systems (the latter involving hundreds or potentially thousands of processors) now have more options at their disposal. Parallel languages making moves of late include SEJITS from the University of California, Berkeley; the Khronos Group’s OpenCL; the recently open-sourced Cilk Plus; and the newly created ParaSail language. Developers may encounter these languages directly, though the wider community will most likely find them embedded within higher-level languages.
Read on for the details:
Scientific Developments
Parallel computing and programming have been around for years in the high-performance scientific computing field. Recent developments in this arena include SEJITS (selective, embedded, just-in-time specialization), a research effort at the University of California, Berkeley.
The SEJITS implementation for the Python high-level language, which goes by ASP (a recursive acronym: ASP is SEJITS for Python), aims to make it easier for scientists to harness the power of parallelism. Scientists favor getting to a working solution quickly, while professional programmers take the time to devise a parallel strategy that boosts performance.
Armando Fox, adjunct associate professor with UC Berkeley’s Computer Science Division, says SEJITS bridges the gap between productivity programmers and efficiency programmers: it lets productivity programmers write in a high-level language while drawing on the parallel algorithms that efficiency programmers have captured. Intel and Microsoft are early-adopter customers.
Here’s how it works: A scientist/programmer leverages a specializer — a design pattern, essentially — that addresses a specific problem and is optimized to run in parallel settings. Specializers that are currently available cover audio processing and structured grids, among other fields. This approach embeds domain-specific languages into Python with compilation occurring at runtime.
ASP specializers are available via GitHub, with a planned repository to provide a catalog of specializers and metadata. The beginnings of such a repository may be in place by December, says Fox.
“As more and more efficiency programmers contribute their patterns to this repository of patterns, application writers can pick up and use them as they would use libraries,” explains Fox.
Fox characterizes SEJITS as a prototype, albeit one with customers. He says researchers are working to make the SEJITS documentation more complete.
Tapping GPUs and More
Stemming from a graphics background, OpenCL appears to be broadening its reach after emerging in Mac OS X a couple of years ago.
OpenCL, now a Khronos Group specification, consists of an API set and OpenCL C, a programming language. On one level, OpenCL lets programmers write applications that take advantage of a computer’s GPU for general, non-graphical purposes. GPUs, inherently parallel, have become programmable in recent years. But OpenCL’s role extends to increasingly parallel CPUs, notes Neil Trevett, vice president of mobile content at NVIDIA and president of the Khronos Group.
“Historically, you have had to use different programming frameworks for programming … CPUs and GPUs,” says Trevett. “OpenCL lets developers write a single program using a single framework to use all of the heterogeneous parallel resources on a system.”
Those resources could include multiple CPUs and GPUs mixed together and exploited by a single application, he adds.
OpenCL’s scope includes multicore CPUs, field-programmable gate arrays, and digital signal processors. The basic approach is to use OpenCL C to write kernels of work and employ the APIs to spread those kernels across the available computing resources, says Trevett.
OpenCL C is based on C99 with a few modifications, says Trevett. Those include additions that let developers express parallelism, as well as the removal of recursion, he notes.
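To make the kernel-plus-APIs division concrete, here is a minimal sketch along the lines Trevett describes: an OpenCL C kernel that squares a vector, with abbreviated C-style host code that compiles the kernel at runtime and launches one work-item per element on the first device it finds. The kernel name, the problem size, and the omission of error checking are illustrative choices, not details from a Khronos sample.

```cpp
#include <cstdio>
#include <CL/cl.h>

// OpenCL C kernel source, compiled at runtime by the host program below.
// Each work-item squares one element; get_global_id() replaces the loop
// index, which is how OpenCL C expresses data parallelism.
static const char *src =
    "__kernel void square(__global const float *in,\n"
    "                     __global float *out) {\n"
    "    size_t i = get_global_id(0);\n"
    "    out[i] = in[i] * in[i];\n"
    "}\n";

int main() {
    const int N = 1024;
    float in[N], out[N];
    for (int i = 0; i < N; ++i) in[i] = float(i);

    // Take the first platform and its default device; a real program would
    // enumerate CPUs, GPUs, and other devices and choose among them.
    cl_platform_id plat;
    cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    // Build the kernel from source at runtime, for whatever device was found.
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "square", NULL);

    cl_mem d_in = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                 sizeof in, in, NULL);
    cl_mem d_out = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof out,
                                  NULL, NULL);
    clSetKernelArg(k, 0, sizeof d_in, &d_in);
    clSetKernelArg(k, 1, sizeof d_out, &d_out);

    // Spread N instances of the kernel across the device's compute units.
    size_t global = N;
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, d_out, CL_TRUE, 0, sizeof out, out, 0, NULL, NULL);

    std::printf("out[3] = %f\n", out[3]); // expect 9.0
    return 0;
}
```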
OpenCL emphasizes power and flexibility over ease of programming. A programmer explicitly controls memory management and has considerable control over how computation happens on a system, says Trevett. But higher-level language tools and frameworks may be built upon OpenCL’s foundational APIs, he adds. Indeed, the Khronos Group has made C++ bindings available for OpenCL.
Trevett says the C++ bindings will make OpenCL more accessible. In another initiative, the Khronos Group is working on an intermediate binary representation of OpenCL. The objective is to help developers who don’t want to ship source code along with the programs they write in OpenCL.
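For comparison, here is a rough sketch of the same host-side steps using the Khronos C++ bindings (the cl.hpp header), assuming the same illustrative “square” kernel as above. Constructor signatures have varied across header revisions, so treat this as indicative rather than definitive:

```cpp
#include <vector>
#include <CL/cl.hpp>  // the Khronos C++ bindings over the C API

int main() {
    // One context on the default device type; RAII replaces the manual
    // clCreate*/clRelease* bookkeeping of the C API.
    cl::Context context(CL_DEVICE_TYPE_DEFAULT);
    std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();
    cl::CommandQueue queue(context, devices[0]);

    const char *src =
        "__kernel void square(__global const float *in, __global float *out)"
        "{ size_t i = get_global_id(0); out[i] = in[i] * in[i]; }";
    cl::Program program(context, src, /*build=*/true);
    cl::Kernel kernel(program, "square");

    const int N = 1024;
    std::vector<float> host(N, 2.0f), result(N);
    cl::Buffer in(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                  sizeof(float) * N, host.data());
    cl::Buffer out(context, CL_MEM_WRITE_ONLY, sizeof(float) * N);

    kernel.setArg(0, in);
    kernel.setArg(1, out);
    queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(N),
                               cl::NullRange);
    queue.enqueueReadBuffer(out, CL_TRUE, 0, sizeof(float) * N, result.data());
    return 0;
}
```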
Earlier this year, Intel set its Cilk Plus language on an open-source path as part of the company’s effort to make parallel programming more widely available.
Cilk Plus is an extension to C and C++ that supports parallel programming. Robert Geva, principal engineer at Intel, notes that Intel first implemented Cilk Plus in its own compiler products. Then, after initial success with customer adoption, the company extended the effort to open source by implementing Cilk Plus in the GNU Compiler Collection (GCC) through a series of releases.
The Cilk Plus extension to C/C++ aims to benefit programmers by enabling composable parallelism and by utilizing hardware resources, including multiple cores and the vector units within each core, while remaining cache friendly.
Geva says that Cilk Plus provides a tasking model with a user-level “work stealing” runtime task scheduler. The work-stealing scheduler assigns tasks, identified by the programmer as able to execute in parallel with one another, to OS threads. According to Intel, the dynamic assignment of tasks to threads guarantees load balancing independent of an application’s software architecture. This approach to load balancing delivers a composable parallelism model: the components of a large system may use parallelism and come from independent authors, yet still be integrated into a single, parallel application.
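As a brief illustration of that tasking model, the canonical Cilk example is recursive Fibonacci. The <cilk/cilk.h> header maps the cilk_spawn and cilk_sync spellings onto the underlying keywords; each spawn marks a call the work-stealing scheduler is free to run on another OS thread, and the sync waits for spawned children to finish.

```cpp
#include <cstdio>
#include <cilk/cilk.h>  // maps cilk_spawn/cilk_sync onto _Cilk_spawn/_Cilk_sync

// The two recursive calls are tasks; the scheduler may steal fib(n - 1)
// onto an idle thread while this thread continues with fib(n - 2).
long fib(long n) {
    if (n < 2) return n;
    long x = cilk_spawn fib(n - 1);  // may run in parallel with the next line
    long y = fib(n - 2);
    cilk_sync;                       // wait for the spawned call to complete
    return x + y;
}

int main() {
    std::printf("fib(30) = %ld\n", fib(30));
    return 0;
}
```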
Geva says this solves a problem for developers who tried to build complex parallel software systems without a good dynamic load-balancing scheduler, and who encountered oversubscription of hardware resources and, therefore, poor performance.
The re-implementation of Cilk Plus in open-source GCC is intended to help with adoption by two types of developers: those who prefer the GCC compiler over the Intel compiler, and those who are comfortable with the Intel compiler but would like to have a second source.
The first components of Cilk Plus to be released into open source include the language’s tasking portion and one language construct for vector-level parallelization (#pragma simd). The tasking portion comprises the compiler implementation of the three keywords _Cilk_spawn, _Cilk_sync, and _Cilk_for; the runtime task scheduler; and the hyperobject library. The remainder of the language will be introduced in multiple steps.
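A short sketch of those two pieces, using an invented saxpy routine for illustration: _Cilk_for (written here as cilk_for via the header) parallelizes a loop across cores, while #pragma simd asks the compiler to vectorize a loop within a core.

```cpp
#include <cilk/cilk.h>  // provides the cilk_for spelling of _Cilk_for

// Task-level parallelism: iterations are divided into chunks that the
// work-stealing scheduler spreads across the available cores.
void saxpy_tasks(float a, const float *x, float *y, int n) {
    cilk_for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// Vector-level parallelism: the pragma asserts the loop is safe to
// vectorize, so the compiler can use the core's SIMD units.
void saxpy_simd(float a, const float *x, float *y, int n) {
    #pragma simd
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```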
Porting to GCC will also help with Intel’s standardization objectives. The current plan is to take Cilk Plus to the C++ standards body and work on a proposal there, says Geva.
“We will be in a better position working inside a standards body with two implementations instead of one,” he explains.
A High-integrity Initiative
A newly launched language, ParaSail, focuses on high-integrity parallel programming.
Tucker Taft, chairman and chief technology officer at SofCheck, a software analysis and verification firm, designed the language. The alpha release of a compiler with executables for Mac, Linux and Windows emerged in October. Taft says the compiler isn’t intended for production use, but can be used to learn the language.
“Right now, we’re just trying to get it out there and get people interested,” says Taft.
According to Taft, creating a parallel programming language from scratch gave him the opportunity to build in safety and security. The language incorporates formal methods such as preconditions and postconditions, which are enforced by the compiler. That approach makes ParaSail “oriented toward building a high-integrity embedded system,” notes Taft.
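ParaSail’s own contract syntax isn’t reproduced here; purely as a conceptual analogue, the C++ sketch below expresses a precondition and a postcondition as runtime assert() checks on an invented integer square-root function, whereas ParaSail attaches such annotations to the operation itself and the compiler enforces them.

```cpp
#include <cassert>

// Conceptual analogue only: ParaSail's compiler enforces contracts like
// these statically; here they are merely checked at runtime via assert().
int isqrt(int n) {
    assert(n >= 0);  // precondition: the argument must be non-negative
    int r = 0;
    while ((r + 1) * (r + 1) <= n)
        ++r;
    // postcondition: r is the largest integer whose square is at most n
    assert(r * r <= n && (r + 1) * (r + 1) > n);
    return r;
}
```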
In another nod to secure, safety-critical systems, ParaSail does away with garbage-collected memory management. Taft says garbage collection isn’t a good match for high-integrity systems, noting the difficulty of proving that a garbage collector is “correct.”
“It is also very difficult to test a garbage collector as thoroughly as is required by high-integrity systems,” he adds.
Taft’s experience in the high-integrity area includes designing Ada 95 and Ada 2005. The Defense Department once made Ada its official language, citing its ability to create secure systems. The language has found a continuing role in avionics software.
Similarly, ParaSail could cultivate a niche in aerospace. Taft cites the example of an autopilot system for a commercial jet. He also lists control systems for high-speed trains, medical devices, and collision-avoidance systems for cars.
As for distribution methods, Taft says he is working with other companies, including one closely associated with GCC. Hooking the ParaSail front end (parser, semantic analyzer, and assertion checker) to the GCC back end, he adds, would be a natural way to make the language widely available.
Another possibility: making ParaSail available as a modeling language. In that context, ParaSail could be used to prototype a complex system that would be written in another language.