User Authentication and Disk Monitoring Discussions

Authentication and disk help on the way

The Beowulf mailing list provides detailed discussions about issues concerning Linux HPC clusters. In this article I review some postings to the Beowulf list on user authentication within clusters, along with some postings to the smartmontools mailing list on monitoring disks.

Authentication Within Clusters

A good cluster topic for discussion is how people authenticate within a cluster. Authentication is the process of establishing who you are; together with authorization, it determines what you can do on a system. In layman's terms, it is what allows you to log into a node and run jobs. On January 30, 2004, Brent Clements asked the Beowulf mailing list how people handled authentication on their clusters.

As you might expect, the question drew a number of responses. The first was from Daniel Widyono, who said that they use one form of authentication to log into the head node and their own system for authentication inside the cluster. They copy /etc/passwd to all of the nodes via a cron script and have written wrappers for useradd and userdel that copy /etc/passwd and /etc/shadow to the nodes when a user is added or removed. They use /etc/passwd for account information and update an authentication token on each node once it is assigned to a user (through a scheduling system). ssh then checks the authentication token using a PAM module before execution begins. They also use Bproc to determine ownership on the head node.
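The file-pushing approach Daniel describes can be sketched in a few lines of Python. This is only an illustration, not his actual script: the node names, the root@ login, and the function names are assumptions made for the example. It builds one rsync command per node and, unless told otherwise, only returns the commands instead of running them.

```python
import shlex
import subprocess

# Hypothetical node names; a real cluster would read these from its own inventory.
NODES = ["node01", "node02", "node03"]
# The account files named in the discussion.
FILES = ["/etc/passwd", "/etc/shadow"]


def build_rsync_command(node, files=FILES):
    """Return the rsync argument list that copies the account files to one node."""
    # -a preserves ownership, permissions and timestamps. Logging in as root is
    # an assumption, made because /etc/shadow is normally writable only by root.
    return ["rsync", "-a", *files, f"root@{node}:/etc/"]


def push_account_files(nodes=NODES, dry_run=True):
    """Push the files to every node; with dry_run=True only return the commands."""
    commands = [build_rsync_command(node) for node in nodes]
    if not dry_run:
        for cmd in commands:
            # check=False so that one unreachable node does not abort the push.
            result = subprocess.run(cmd, check=False)
            if result.returncode != 0:
                print(f"push failed: {shlex.join(cmd)}")
    return commands
```

Calling push_account_files() from a cron job, or from a wrapper around useradd and userdel, would correspond to the two triggers mentioned above.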


Robert Brown (RGB to his friends) then pointed out what many experienced cluster people know: NIS is a high-overhead protocol that hurts cluster performance. There have been past discussions about NIS usage in clusters, and if you search the web for "NIS" and "cluster" you should be able to find them (try adding "beowulf" to refine the search). RGB pointed out that you will get NIS traffic any time a file stat is performed. Imagine this across many nodes and you will see how NIS can become a drain on network performance. RGB also discussed the security aspects of NIS. There have been many well-known problems with NIS, including the fact that it sends information in the clear (i.e., not encrypted). RGB then noted that many people use rsync to copy /etc/passwd and /etc/shadow to the nodes (in much the fashion that Daniel mentioned). However, he cautioned that you have to watch for password changes and copy the updated files to the nodes when they occur (you could write a wrapper for passwd to perform parts of this operation).
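One way to "watch for password changes" is to compare a checksum of each file against the last one seen and push only when it differs. The sketch below is an assumption about how that could be done, not something RGB posted; the function names are made up for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_files(paths, last_seen):
    """Return the paths whose digest differs from last_seen, updating last_seen."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if last_seen.get(path) != digest:
            changed.append(path)
            # Remember the new digest so the same change is reported only once.
            last_seen[path] = digest
    return changed
```

A periodic job could keep the last_seen dictionary between runs and trigger the copy to the nodes only when changed_files() returns a non-empty list.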

A user with the email name of "Jag" replied that at his university they configured PAM on the head node to authenticate against the main Kerberos server, but they remap the home directories and other things for the cluster. They also use NIS within the cluster, but only for name service information. To access the compute nodes they use host-based authentication with ssh. Jag also suggested that people using NIS could run nscd (the Name Service Cache Daemon), which is part of glibc. nscd doesn't eliminate the NIS traffic but reduces it, because it caches lookup results for subsequent requests. Leif Nixon posted that he was suspicious of nscd because he had seen it hang on to stale information for no good reason.
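The trade-off between Jag's suggestion and Leif's worry is the usual one for any cache with a time-to-live. The toy class below is not how nscd is implemented; it is a minimal sketch, with hypothetical names, showing why repeated lookups generate less traffic and why an answer can be stale until its entry expires.

```python
import time


class TTLCache:
    """Cache lookup results for a fixed number of seconds."""

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # the expensive call, e.g. a query to a name server
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the behaviour can be tested
        self._entries = {}         # key -> (expiry_time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]        # still fresh: answered locally, no network traffic
        value = self._lookup(key)  # missing or expired: ask the server again
        self._entries[key] = (now + self._ttl, value)
        return value
```

Until an entry expires, get() keeps returning the old value even if the server's answer has changed, which is the kind of stale information Leif describes.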

Mark Hahn chipped in that he uses the ubiquitous rsync-ing of the password/shadow files and uses ssh to get to the nodes inside the cluster. Mark also had some good comments about why he doesn't like centralized authentication for a campus: it creates a central point of failure (despite fail-over servers, etc.) and a network hotspot, and it can increase the workload on the poor person who has to administer the central authentication system.

Joe Landman posted that he was very leery of NIS because he has had customers crash it when serving login information just by running a simple script across the cluster. Joe said that he prefers to push name service lookups through DNS, particularly dnsmasq. Joe added that configuring a full-blown named/bind system for a cluster is significant overkill in many cases. For authentication, Joe had been hoping that LDAP would solve his problems, but he hasn't been able to repeatably build a working LDAP server with databases. He said that he's beginning to think about a simple database with PAM modules on the front end (such as pam-mysql).
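Part of dnsmasq's appeal for small clusters is that, in its default mode, it answers queries from the hosts file on the machine it runs on, so name service can be reduced to maintaining one file on the head node. The helper below is a hypothetical sketch for generating such entries; the naming scheme, the address range, and the function name are assumptions and do not come from Joe's post.

```python
import ipaddress


def hosts_entries(prefix, count, first_ip):
    """Return hosts-file lines such as '10.0.0.1 node01' for count nodes."""
    start = ipaddress.IPv4Address(first_ip)
    lines = []
    for i in range(count):
        # IPv4Address supports integer addition, giving consecutive addresses.
        lines.append(f"{start + i} {prefix}{i + 1:02d}")
    return lines
```

For example, hosts_entries("node", 3, "10.0.0.1") yields lines for node01 through node03 that could be appended to the head node's hosts file.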

Brent Clements responded that they had been using LDAP and found it to work very well, especially with Red Hat. They like it because they can integrate it with a web-based account management system for various groups within the campus. Joe responded that he thought the client side of LDAP was very easy to configure and run, but it was the server side that he had trouble with. He used Red Hat's LDAP RPMs and tried various things but could never get it to work the way he wanted.

The final poster was Steve Timm, who had some good information about NIS. Steve's group has used NIS on their cluster but found problems with it. In particular, when a job such as a cron job that runs a script starts on all the nodes at once, the NIS server is hammered by all the nodes (a.k.a. an "NIS storm"). In an effort to prevent NIS storms, they tried making each node in the cluster an NIS slave, but found that the transmission protocol is not perfect and there were always a few slaves that were down a map or two. Steve said they ended up pushing the password and shadow files out to the compute nodes from the head node using rsync.

For the time being, it seems that many people prefer using rsync to copy the password and shadow files to the compute nodes. While not an ideal method, it is very simple and effective and has a very low impact on the network (unlike NIS). Perhaps some ingenious person will come up with a better way some day (hint, hint).




    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.