Cluster Computing At Home
Complete Developer Podcast - A podcast by BJ Burns and Will Gant - Thursdays
High-performance computation using a cluster of compute nodes can be excessively expensive. Even with the various cloud Platform-as-a-Service (PaaS) options, spinning up a cluster can become cost-prohibitive for an individual, an open source project, or a non-profit. Raspberry Pis, on the other hand, are inexpensive, fully functional computing boards that can be combined to build your own cluster. For his term paper in a graduate Parallel Programming course, BJ built a small four-node cluster using the most recent, as of 2021, Raspberry Pi model. The Raspberry Pi 4 Model B he used has 8 GB of RAM and a 1.5 GHz quad-core ARM processor. After building the Raspberry Pi cluster, he compared it to four nodes of a larger, more expensive university cluster.

To test the clusters, BJ modified an MD5 hash cracking algorithm to use a hybrid model of parallelism combining MPI and OpenMP. He tested both clusters at 2, 3, and 4 nodes, with each configuration running 1-4 threads, against hashed strings of four, five, and six characters. Both the Raspberry Pi cluster and the university cluster showed similar behavior when comparing string sizes across the different node and thread combinations. In tests with fewer nodes per cluster, the more powerful university cluster outperforms the Raspberry Pi cluster. However, as more nodes are added, the Raspberry Pi cluster showed speeds similar to the university cluster, even outperforming it in a few places.

BJ enjoyed building and testing this Raspberry Pi cluster. It is an example of some of the fun things we get to do when learning in the field. It did turn out to be a little more expensive than expected, mainly because he chose the top Raspberry Pi 4 with 8 GB of RAM. That decision came from BJ's desire to reuse the Raspberry Pis this summer to build other projects. He could have saved money and still accomplished the hybrid parallelization with less RAM, or even with a Raspberry Pi 3.
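The hybrid MPI/OpenMP approach works by splitting the brute-force keyspace so that each node, and each thread within a node, checks a disjoint set of candidate strings. The episode does not show BJ's actual code, so the following is only a minimal, single-process Python sketch of one common partitioning scheme (round-robin dealing by a global worker id); the function names and the choice of a lowercase alphabet are illustrative assumptions, not details from the project.

```python
import hashlib
import string
from itertools import product

def candidates_for_worker(rank, num_ranks, thread, num_threads, length,
                          alphabet=string.ascii_lowercase):
    """Yield the slice of the keyspace owned by one (rank, thread) pair.

    Candidates are enumerated in a fixed order and dealt out round-robin,
    so every worker gets a disjoint, near-equal share of the search space.
    (In the real hybrid version, rank would come from MPI and thread from
    OpenMP; here both are plain parameters.)
    """
    worker = rank * num_threads + thread   # global worker id
    total = num_ranks * num_threads        # total workers across all nodes
    for i, chars in enumerate(product(alphabet, repeat=length)):
        if i % total == worker:
            yield "".join(chars)

def crack(target_hash, rank, num_ranks, thread, num_threads, length):
    """Check this worker's share of candidates against the target MD5 digest."""
    for cand in candidates_for_worker(rank, num_ranks, thread, num_threads, length):
        if hashlib.md5(cand.encode()).hexdigest() == target_hash:
            return cand
    return None

# Single-process demo: 4 "nodes" x 2 "threads" searching for the hash of "hi".
target = hashlib.md5(b"hi").hexdigest()
results = [crack(target, r, 4, t, 2, 2) for r in range(4) for t in range(2)]
found = [r for r in results if r]
print(found)  # exactly one worker's share contains "hi"
```

Because the shares are disjoint, exactly one worker ever finds the plaintext; in a real cluster that worker would broadcast the result (e.g., via MPI) so the others can stop early.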
Whether you want to build the best or save money, use the information here as an example of the many fun things we get to do as programmers. If building a cluster computer or playing with Raspberry Pis doesn't inspire you, find something that does to help you find joy in learning within the field.

Episode Breakdown

Background Information

Cluster Computing

Cluster computing involves distributing a computational workload across a group of two or more interconnected computers. Each connected computer, or node, in the cluster is an independent, self-contained system that may contain a single processor or multiple processor cores. A node can function as a complete computer with its own operating system, memory, and input/output drivers. Clusters typically use distributed memory, as each node in the cluster has its own memory, though the internal architecture may vary from node to node in how memory is shared between cores within a multiprocessor node. It is also possible to mimic shared memory in a cluster using a distributed shared memory architecture.

In addition to the compute nodes, and depending on the architecture a control node, the cluster will also contain a network switch to allow for internode communication. Nodes will also need a message-passing library such as an implementation of the Message Passing Interface (MPI). Clusters allow for high performance and high availability without the higher cost of a single high-powered system. They also add redundancy and protection from failure, as they can be configured to survive the failure of an individual node.

Message Passing Interface

The Message Passing Interface (MPI) is a committee-developed specification for message-passing library interfaces. It was created by a group of vendors, specialists, and library developers working together on the MPI Forum. MPI is not a language, implementation,