RAIN: Reliable Array of Independent Nodes | Seminar Report

A seminar report on the IEEE topic RAIN (Reliable Array of Independent Nodes). The report discusses the RAIN project, a research collaboration between Caltech and NASA-JPL on distributed computing and data storage systems for future spaceborne missions.

Abstract

The RAIN (Reliable Array of Independent Nodes) project is a research collaboration between Caltech and NASA-JPL on distributed computing and data storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this seminar report, we describe the following contributions:
  1. Fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures
  2. Fault management techniques based on group membership
  3. Data storage schemes based on computationally efficient error-control codes. 
The seminar report presents several proof-of-concept applications: a highly available video server, a highly available Web server, and a distributed checkpointing system. The report also describes a commercial product, Rainwall, built with the RAIN technology.
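
To make the third contribution listed above more concrete, the following is a minimal sketch of data storage protected by an error-control code. It assumes the simplest possible scheme, a single XOR parity block spread across nodes, which tolerates the loss of any one node. This is an illustration only and is not the coding scheme used in the RAIN project; the function names (`xor_blocks`, `encode`, `recover`) and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks plus one parity block (one block per node)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored, lost_index):
    """Rebuild the block held by a single failed node from the survivors."""
    survivors = [block for i, block in enumerate(stored) if i != lost_index]
    return xor_blocks(survivors)


# Hypothetical example: three data nodes and one parity node.
data = [b"RAIN", b"node", b"data"]
stored = encode(data)

# Simulate the failure of node 1 and rebuild its block from the other three.
assert recover(stored, 1) == b"node"
```

Because the code uses only XOR operations, encoding and recovery are computationally cheap, which is the property the abstract highlights. Tolerating more than one simultaneous node failure would need a stronger code than this single-parity sketch.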