[Editor's note: Part 2 of this series shows how to optimize DSP “kernels,” i.e., inner loops. For more programming tips, see the DSP programmer’s guide.] DSP applications typically have tough ...
Digital signal processors (DSPs) have found widespread use for many computationally intensive signal processing applications in fields such as communications. As the communication applications become ...
In this special guest feature, James Reinders describes why roofline estimation is a great tool for code optimization in HPC. Roofline Analysis is a technique that projects a view of realism into ...
This course focuses on developing and optimizing applications software on massively parallel graphics processing units (GPUs). Such processing units routinely come with hundreds to thousands of cores ...
A technical paper titled “Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation” was published by researchers at MIT (CSAIL), Argonne National Lab, and TU ...
In this slidecast, Torsten Hoefler from ETH Zurich presents: Data-Centric Parallel Programming. The ubiquity of accelerators in high-performance computing has driven programming complexity beyond the ...
One size does not fit all, and it never will. Parallel programming looks to level the playing field by leveraging multicore hardware. It was easy to program applications in the days when one chip, one ...
A hands-on introduction to parallel programming and optimizations for 1000+ core GPU processors, their architecture, the CUDA programming model, and performance analysis. Students implement various ...