Sunday, September 13, 2009

Apple Open-Sources Key New API

Apple's new operating system, Snow Leopard, contains several new technologies that let programmers make better use of modern hardware. Remember the bad old days, when you thought about buying a multiprocessor or multi-core system and were told that whether you derived any benefit from the extra processors would depend heavily on the acrobatics developers performed to detect what you had and use it properly? Them days are numbered. The assault has begun in earnest. Apple attacks the problem with two tools it has opened to the world.

OpenCL: OpenGL for Non-Graphics Applications (delivers cross-platform hardware acceleration)
One, OpenCL, was opened to the public when it was submitted as an open standard, and it has the backing of graphics card vendors, whose hardware becomes more valuable as developers find more uses for it. OpenCL is a general-purpose language that lets applications that aren't, strictly speaking, graphics applications take advantage of the extremely powerful GPU hardware commonly found in quality computers, offloading from the CPU the computations that are well suited to GPUs. OpenCL lets programmers use computational hardware in parallel without having to know what hardware will be available at runtime, which vendor will supply the GPU, how many processors will be available, and so on; it promises to be like OpenGL for non-graphics applications, allowing hardware acceleration of computations without foreknowledge of hardware specifications. To the extent that non-graphics applications come to rely on high-end GPUs to speed up ordinary program logic, those GPUs could become standard equipment in a broader class of machine. That is a win for graphics OEMs. And because OpenCL applications make better use of the high-end hardware Apple ships, it is a win for Apple too.

Grand Central Dispatch: All The Benefit of Multithreading Without The Bother Of Unnecessarily Numerous Threads
Apple's other major technology for making better use of available hardware, Grand Central Dispatch, has just been open-sourced. This means two things. First, Apple wants projects commonly ported to MacOS from other environments (*cough*penguins*cough*) to be able to enjoy Apple's new CPU-saturation technology. GCD lets an application keep the CPUs busy without advance knowledge of system load or system capacity, and it uses fewer system resources because work is divided into work units that do not each carry the overhead of their own thread. If that technology is broadly adopted, applications ported to MacOS will be less likely to suffer performance degradation. Moreover, MacOS will become more attractive as a development platform, because the tools that make widely-used Unix software (*cough*Apache or Perl or Python or Ruby or your favorite Go program*cough*) perform better on Macs will be available on all Unix platforms. (In the case of Apache, tuning the job thread count becomes trivial under GCD, since GCD itself decides how many threads to run in a well-written GCD-enabled application; parallelizing Go calculations or the like will be a snap.) As Macs become a development platform of choice (which, from anecdotal evidence, isn't entirely crazy), applications for Macs get better and better.

Second, Apple is making LLVM, which in some cases already produces smaller executables and faster code than the open-source GCC compiler, even more attractive to developers. Apple has had to maintain its own fork of GCC because the compiler project's maintainers (who include some folks hostile to proprietary software vendors like Apple) have different goals than Apple. Migrating to LLVM frees Apple from the maintenance overhead of synchronizing a fork with an actively-developed main branch. OpenBSD ran into the same problem after incorporating ProPolice into its compiler for security, which left it "stuck" with a GCC fork based on more primitive versions than those in use on other platforms: as of September 2009, OpenBSD ships patched versions of GCC 2.95.3 and 3.3.5, whereas the GCC project offers updates through version 4.3.4. (OpenBSD also ships a version of Apache 1.3 rather than one of the Apache Project's own successor versions.)
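
To make the Grand Central Dispatch idea concrete: the point is that an application hands over many small work units and lets a pool of threads, sized to the machine at runtime, drain them, rather than creating one thread per unit. The sketch below is not GCD and does not use its API (GCD is a C library); it is only a rough analogy built on Python's standard thread pool. The names run_work_units and make_square_task are made up for illustration, and because of Python's global interpreter lock this shows the queueing model rather than real multi-core speedup.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def run_work_units(work_units, worker_count=None):
    """Run many small callables on a small, fixed pool of threads.

    Returns (results in submission order, number of distinct threads used).
    """
    # Size the pool from the machine at runtime, not from the number of tasks.
    if worker_count is None:
        worker_count = os.cpu_count() or 1

    threads_seen = set()
    lock = threading.Lock()

    def tracked(unit):
        # Record which pool thread picked up this unit, then run it.
        with lock:
            threads_seen.add(threading.get_ident())
        return unit()

    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        results = list(pool.map(tracked, work_units))

    return results, len(threads_seen)


def make_square_task(n):
    # Hypothetical work unit: a closure standing in for a GCD block.
    return lambda: n * n


if __name__ == "__main__":
    units = [make_square_task(n) for n in range(1000)]
    results, threads_used = run_work_units(units)
    print(len(results), "work units ran on", threads_used, "thread(s)")
```

A thousand work units go in, but the number of threads never exceeds the pool size, which is the resource saving the paragraph above describes. In real GCD the system, not the application, chooses that pool size and adjusts it to current load.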

By helping make LLVM the wave of the future – that is, by enabling LLVM developers to leverage things like Grand Central Dispatch on all target platforms – Apple decreases its maintenance overhead by joining forces with a compiler project that isn't actively hostile to Apple's development plans. Apple also increases the potential code base that easily ports to Apple's platform with built-in resource-saturation adaptations. Further, Apple simultaneously increases the potential quality of compilers available to its developers. (As the OpenBSD team commented, the GCC tool chain has some shortcomings and can profitably use competition.)

By opening the source of Grand Central Dispatch in an LLVM release, Apple advances a programming model that improves the quality of code available to run on Apple's machines, while promoting a software development tool chain with a potentially interesting future not only on Apple's platform but on other Unix and Unix-like platforms (*cough*penguins*cough*). Promoting Unix-like systems increases both the demand for such systems and the supply of people who can support them, which is of course a win for Apple (as a leading vendor of such systems). Performance improvements that developers of Mac products can obtain simply by recompiling with the new tool sets will enable a generation of performance updates that benefit both consumers and their software vendors.

Open-sourcing GCD is a win for Apple and the C-coding Unix world.

UPDATE: HardMac reports an example of real-world performance improvements experienced by an application developer who added OpenCL and GCD to an application. Offloading decoding work to the GPU with OpenCL decreased CPU use in decoding, and improving processor saturation in encoding with GCD drove CPU use in encoding from 100% (saturating one CPU) to 130% (effectively balancing load across processors at least some of the time). Frame-rate performance increased 44% on the same hardware. Now that virtually all hardware is multiprocessor (especially considering the proliferation of coprocessors like GPUs, DSPs, and network interface hardware accelerators), GCD and OpenCL let applications differentiate themselves from competitors on platforms that don't offer similarly elegant means of accessing diverse hardware that is unknown until runtime.
