Massively Parallel Systems: Architecture and Programming

MS, Winter semester 2023

Module General Information

The module consists of lectures and several programming assignments.

  • Level: MS
  • Credit points: 6
  • Semester hours per week (SWS): 4
  • Instructor: Sohan Lal
  • Teaching Assistant: Tim Lühnen
  • Lecture: Thursday, 11:00-13:00, room O-0.018
  • Lab: Monday, 10:00-12:00, Linux pool, room CIP/E-2.012P3b

Desirable Previous Knowledge

An introductory module on basic computer architecture and good programming skills in C/C++.

Course Description

This course prepares students to understand the architecture, organization, and programming of parallel computers. It starts with the classification of parallel computers and multithreading, then covers the architecture of centralized and distributed shared-memory parallel systems and multiprocessor cache coherence, including snooping and directory-based cache coherence protocols, their implementation, and their limitations. Next, students study interconnection networks and routing in parallel systems. Because a shared-memory multithreaded program must be correct regardless of the relative execution speed of its threads, memory consistency and synchronization are covered in detail. As a case study, the architecture of a few accelerators such as GPUs is also discussed in detail. Beyond understanding the architecture and organization of parallel systems, programming them is a challenge in its own right. The course therefore covers how to program massively parallel systems using APIs and libraries such as CUDA/OpenCL and MPI/OpenMP.

Problem-based Assignments/Project

There will be 3-4 assignments for project-based learning consisting of the following:

  • Implement and compare different cache coherence protocols using a simulator or a high-level, event-driven simulation interface such as SystemC
  • Program massively parallel systems to solve computationally intensive problems such as password cracking using CUDA/OpenCL/MPI/OpenMP

Course Evaluation

  • Assignments + 30-minute oral exam

Course Registration and Further Information

Please register for the course on Tune. Registration is mandatory. We will use Stud.IP to share course-related information and material such as slides and assignments.

Technical Infrastructure



  • Michel Dubois, Murali Annavaram, and Per Stenström, Parallel Computer Organization and Design (Book)
  • David B. Kirk, Wen-mei W. Hwu, Programming Massively Parallel Processors - A Hands-on Approach, Second Edition (Book)
  • John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach, 5th Edition (Book)
  • MPI Forum
  • SystemC