So You Want to Set Up a Grid

Security Infrastructure: Once a software stack has been defined, security issues must be addressed. These include trust issues such as CA acceptance, identification/authorization policy, gridmap file management, and accounting and allocation agreements.

Grid environments with a common Globus Toolkit deployment typically use Public Key Infrastructure (PKI) tools, which rely on certificates issued by a certificate authority (CA). Once you have decided on a CA, it is important to settle on policies for identifying users and resources. Sites often differ in how a user must prove his or her identity before accessing resources. Some sites may issue an identity certificate based on an email address, while others may require users to present a driver's license or passport in person before the certificate is issued. Typically, all sites in a Grid must meet the strictest identification policy required by any one site.
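The "strictest site wins" rule can be pictured with a small sketch. The policy levels and site names below are hypothetical and serve only to illustrate the idea; a real Grid would define its own levels in its policy documents.

```python
from enum import IntEnum


class IdentityProof(IntEnum):
    """Hypothetical identification levels, ordered from weakest to strictest."""
    EMAIL_ADDRESS = 1
    IN_PERSON_PHOTO_ID = 2


def grid_wide_requirement(site_requirements):
    """Return the strictest identification level required by any one site."""
    return max(site_requirements.values())


# Hypothetical sites and their local requirements.
requirements = {
    "site-a": IdentityProof.EMAIL_ADDRESS,
    "site-b": IdentityProof.IN_PERSON_PHOTO_ID,
}
print(grid_wide_requirement(requirements).name)
```

Because one site in this example requires in-person identification, that level becomes the requirement for every user of the Grid.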

Once the identity policy issues are understood, certificates must be issued from the chosen certificate authority for all hosts, services, and users. In addition, the mapping from each user certificate to a local login must be maintained throughout the Grid. In Globus Toolkit-based Grids, this is done through a gridmap file that must be maintained in a consistent way on each of the resources.
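As a rough illustration of what keeping gridmap files consistent can involve, the sketch below parses gridmap entries and reports certificate subjects that are missing at some site. It assumes the common entry format of a double-quoted certificate subject followed by one local login, and it takes "consistent" to mean only that every subject appears at every site, since local logins may legitimately differ. The function names, site names, and sample subjects are hypothetical.

```python
import re

# Assumed entry format: "certificate subject" local_login
GRIDMAP_LINE = re.compile(r'^"([^"]+)"\s+(\S+)$')


def parse_gridmap(text):
    """Parse gridmap file text into {certificate subject: local login}."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = GRIDMAP_LINE.match(line)
        if match is None:
            raise ValueError(f"malformed gridmap line: {raw!r}")
        subject, login = match.groups()
        mapping[subject] = login
    return mapping


def missing_subjects(gridmaps):
    """Given {site: parsed gridmap}, return {site: subjects it lacks}."""
    all_subjects = set()
    for mapping in gridmaps.values():
        all_subjects.update(mapping)
    report = {}
    for site, mapping in gridmaps.items():
        missing = sorted(all_subjects - set(mapping))
        if missing:
            report[site] = missing
    return report


# Hypothetical gridmap contents from two sites.
site_a = parse_gridmap(
    '# users\n'
    '"/O=Grid/OU=Example/CN=Jane Doe" jdoe\n'
    '"/O=Grid/OU=Example/CN=John Roe" jroe\n'
)
site_b = parse_gridmap('"/O=Grid/OU=Example/CN=Jane Doe" janed\n')

for site, subjects in missing_subjects({"site-a": site_a, "site-b": site_b}).items():
    print(site, "is missing:", subjects)
```

A check like this could run whenever a gridmap file changes, so that a user added at one site is not silently left out at another.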

Firewalls are another important issue. Some details are given in this document, and these will also be discussed in an upcoming column.

Grid Functionality: After establishing the basic software infrastructure, you should verify that the installation is functioning properly. This verification should be done first at each site in isolation and then across sites. One set of verification tests is available at the Globus Toolkit website. Similar tests should be run for your other services. For example, MPI has a nice set of verification tests (see Resources).
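The "each site in isolation, then across sites" ordering can be sketched as a small driver. The check functions here are placeholders supplied by the caller and are hypothetical; in practice they would wrap whichever verification tests you run for the Globus Toolkit and your other services.

```python
from itertools import permutations


def verify_grid(sites, local_check, cross_check):
    """Run local checks at every site, then cross-site checks between
    ordered pairs of sites that passed their local checks."""
    local_ok = {site: bool(local_check(site)) for site in sites}
    healthy = [site for site in sites if local_ok[site]]
    cross_ok = {
        (src, dst): bool(cross_check(src, dst))
        for src, dst in permutations(healthy, 2)
    }
    return local_ok, cross_ok


# Stub checks for illustration: pretend "site-c" fails its local tests.
local_ok, cross_ok = verify_grid(
    ["site-a", "site-b", "site-c"],
    local_check=lambda site: site != "site-c",
    cross_check=lambda src, dst: True,
)
print(local_ok)
print(sorted(cross_ok))
```

Skipping cross-site tests for a site that fails locally keeps the failure reports short: a broken local installation would otherwise show up again as a failure in every pair that includes it.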

User Support: After your Grid has been established and everything is up and running, it is time to support the users as they use it in ways you never imagined! In our experience, several simple mechanisms can ease the pain of this process.

  • Establish Web pages and a quick-start guide. Several questions will be asked over and over: How do I get an account? What software is supported? Which commands do I use to run a job? We recommend (at least initially) simply adapting a quick-start guide from a project similar to yours.
  • Set up mailing lists. Though simple, a mailing list of members from your sites, both systems- and application-oriented, can serve as a front line of defense for many user concerns.
  • Establish a trouble ticket system for your project. Because of the cross-site nature of any Grid, a cross-site trouble ticket system in which tickets are rerouted as necessary can greatly improve the usability of your Grid. Freeware such as Bugzilla is used extensively for trouble ticketing in many Grid projects.

First Steps to Success

Setting up a Grid, as we've seen, involves numerous issues stemming from the fact that different sites will have different local policies that must be reconciled with global control. In this column we have discussed many of those issues and have provided guidelines for setting up teams to run the Grid, defining a software stack, handling security infrastructure, and addressing user support for your organization.

Additional infrastructure may be needed as your Grid scales. We have not discussed accounting or shared allocations, higher-level schedulers, network monitoring between sites, or user-level portals. Many of these capabilities may be needed for a successful Grid, and we expect that these and others will be addressed in future columns.

The Globus Toolkit is a registered trademark held by the University of Chicago.

This work was supported in part by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract W-31-109-ENG-38 with the University of Chicago and under Contract DE-AC03-76SF0098 with the University of California; by the National Science Foundation; by the NASA Information Power Grid program; and by IBM.

Sidebar One: Grid Resources
Implementing Production Grids for Science and Engineering: W. Johnston et al., in The Grid: Blueprint for a New Computing Infrastructure (Second Edition), ed. Foster and Kesselman, 2003

Softenv and Modules:

Globus Toolkit Information:

Grid Bundles:

MPI verification tests

This article was originally published in ClusterWorld Magazine. It has been updated and formatted for the web. If you want to read more about HPC clusters and Linux you may wish to visit Linux Magazine.

Jennifer M. Schopf is a researcher at Argonne National Laboratory and part of the Globus Alliance, with a focus on monitoring and performance.

Keith R. Jackson is a scientist at the Lawrence Berkeley Lab where he leads the DOE Science Grid Engineering team.

    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.