NPACI Archive Page
The NPACI program ended on September 30, 2004. This site is presented for archival purposes only.
For current resources at each of the partner sites, please refer to the appropriate institution site.
- Why should I use the NPACI Grid and/or
NPACKage?
- How do I access the NPACI Grid?
- How do I obtain a certificate for the
NPACI Grid?
- May I use a certificate in a non-Globus
environment?
- How do I start a new grid session?
- The grid-proxy-init command gives
"command not found"
- Why do I have a
bunch of gram_job_mgr_*.log files in my home directory?
- grid-proxy-init fails on an
AIX system
- Why do I get authentication errors
when I try to connect to horizon.sdsc.edu under Globus?
- I want to use mpich/mpich-g2/vendormpi,
but they are in the wrong order in my path
Answers
- Why should I use the NPACI Grid
and/or NPACKage?
Simplified Job Submission: Because NPACKage is
installed on the NPACI Grid, a single job description language
(Resource Specification Language) may be used to submit
jobs to any site. In addition, with Condor-G one may
submit and monitor all jobs from a single site.
Single Sign-on: Grid certificates enable
single sign-on capabilities on the NPACI Grid.
Extend Local Resources: By installing NPACKage
on your local resources, you will be extending your own
grid with supercomputing resources. This will allow
you to run smaller jobs locally and large jobs on the NPACI
Grid.
Portals: Grid portals may be developed
to provide a single point of access to data and tools.
In addition, portals simplify complex programs and workflows.
- How do I access
the NPACI Grid?
For information on obtaining an account, setting up your
environment, and obtaining certificates, refer to Getting
Started. For information on some of the NPACI
Grid programs and features, see the Tutorial.
Refer to the Grid Services
Matrix for the services and parameters required to run
on the grid.
- How do I obtain a certificate
for the NPACI Grid?
See Certificates for
instructions.
- Can I use my NPACI certificate
in a non-Globus environment?
No.
- How do I start a new grid session?
Once you have obtained
a certificate and initialized
your NPACKage environment, run grid-proxy-init.
This will create a proxy certificate for you so you won't
have to enter a passphrase each time you access a new site.
Proxies are generally valid for one day.
b80n03 ~ 2% grid-proxy-init
Your identity: /C=US/O=NPACI/OU=SDSC/CN=J Doe/USERID=jdoe
Enter GRID pass phrase for this identity:
Creating proxy .............................................
Done
Your proxy is valid until: Wed Jul 16 02:59:27 2003
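Before starting a new session, it can be handy to check whether an existing proxy is still usable. The following is a minimal sketch, assuming the Globus grid-proxy-info command is on your path and supports the -exists and -valid options. The function name have_valid_proxy and the one-hour default are illustrative choices, not part of NPACKage.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a proxy exists and has at least the
# requested lifetime left (H:M format, default 1:00). It relies on
# "grid-proxy-info -exists -valid H:M" exiting 0 for a usable proxy.
have_valid_proxy() {
    grid-proxy-info -exists -valid "${1:-1:00}" >/dev/null 2>&1
}

# Typical use: only prompt for the pass phrase when it is really needed.
#   have_valid_proxy || grid-proxy-init
```

With this in place, a login script could call grid-proxy-init only when the check fails, rather than on every login.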
- The grid-proxy-init
command gives "command not found"
This error means that your NPACKage environment has not been
initialized. To use the NPACI Grid and the NPACKage software
stack, place a few setup commands in your shell initialization
files, then log out and log back in so that the changes take
effect.
Read more
about configuration.
- Why
do I have a bunch of gram_job_mgr_*.log files in my home
directory?
These log files are automatically placed in your
home directory when you run Globus jobs and are removed
when the job completes successfully. If the job fails,
the log is kept because it may be useful for debugging
the problem.
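To see which of these logs are still lying around, a small helper along the following lines may help. This is only a sketch: the function name list_gram_logs is hypothetical, and it assumes the logs sit directly in the directory you pass in (your home directory by default).

```shell
#!/bin/sh
# Hypothetical helper: print leftover GRAM job manager logs in a directory
# (default: your home directory), one path per line.
list_gram_logs() {
    dir=${1:-$HOME}
    for f in "$dir"/gram_job_mgr_*.log; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# Review the logs first. Once they are no longer needed for debugging,
# they can be removed by hand, for example:
#   list_gram_logs | while read -r f; do rm -- "$f"; done
```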
- grid-proxy-init
fails on an AIX system
If grid-proxy-init fails and rerunning it with the -debug
flag gives output like the following,
[uxxx@longhorn uxxx]$ grid-proxy-init -debug
Output File: /tmp/x509up_uxx
Your identity: /C=US/O=NPACI/USERID=uxxx
Enter GRID pass phrase for this identity:
Creating proxy
ERROR: Couldn't create proxy certificate
grid_proxy_init.c:869:
globus_gsi_proxy.c:763: globus_gsi_proxy_create_signed: Error with the proxy handle
globus_gsi_proxy.c:234: globus_gsi_proxy_create_req: Error with private key: Couldn't generate RSA key pair for proxy handle
OpenSSL Error: rsa_gen.c:182: in library: rsa routines, function RSA_generate_key: BN lib
OpenSSL Error: md_rand.c:501: in library: random number generator, function SSLEAY_RAND_BYTES: PRNG not seeded
OpenSSL Error: pem_lib.c:666: in library: PEM routines, function PEM_read_bio: no start line
there are two possible causes.
- There is an internal error. The 'SSLEAY_RAND_BYTES:
PRNG not seeded' error means that a daemon called "entropy"
did not have enough random numbers available
to create your proxy certificate. The entropy
daemon will collect more random numbers; wait several
seconds and try again.
- On AIX machines, /dev/random is not producing
random numbers. There are two workarounds.
1) Create a .rnd file in your home
directory with 200 bytes of random data (it doesn't
matter what data this file contains).
2) Set this environment variable every
time you log in (place it in your shell startup file,
e.g., .cshrc):
setenv EGD_PATH /etc/entropy
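The first workaround can be sketched as follows. Because /dev/random is the device that is misbehaving, this sketch fills the file from ordinary command output instead, which fits the note above that the contents do not matter. The function name make_rnd_file is hypothetical, and the data it writes is filler, not strong randomness.

```shell
#!/bin/sh
# Hypothetical helper: write exactly 200 bytes of arbitrary data to a file
# (default: $HOME/.rnd) without reading from /dev/random.
make_rnd_file() {
    target=${1:-$HOME/.rnd}
    # Command output is used only as filler; dd stops after 200 bytes.
    { date; env; ls -la / /tmp; } 2>/dev/null |
        dd bs=1 count=200 of="$target" 2>/dev/null
    # Keep the file private to its owner.
    chmod 600 "$target"
}

# Usage:
#   make_rnd_file            # creates ~/.rnd
```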
- Authentication errors
connecting to horizon under Globus
When attempting to run batch jobs on horizon.sdsc.edu remotely
using Globus, mutual authentication errors occur that may
look like this:
GRAM Authentication test failure:
authentication failed:
GSS Major Status: Unexpected Gatekeeper or Service Name
GSS Minor Status Error Chain:
init.c:499: globus_gss_assist_init_sec_context_async: Error during context initialization
init_sec_context.c:286: gss_init_sec_context: Mutual authentication failed: The target name (/C=US/O=NPACI/OU=SDSC/CN=tf005i.sdsc.edu) in the context, and the target name (/CN=host/tf005ig.sdsc.edu) passed to the function do not match
The problem occurs because horizon.sdsc.edu
is not a single machine. A connection to "horizon"
is distributed round-robin to either tf004i.sdsc.edu or
tf005i.sdsc.edu, so the hostname in that machine's host
certificate does not match the name you tried to contact.
Because neither host certificate recognizes
the hostname 'horizon', the connection fails.
Solution:
You can usually get around this problem by connecting
to a specific host instead of a CNAME or round-robin DNS
name. Also be careful when authenticating against hosts
with multiple interfaces.
- Instead of horizon.sdsc.edu, use tf004i.sdsc.edu
or tf005i.sdsc.edu
- Instead of b80login.sdsc.edu, use b80n01.sdsc.edu
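The substitutions above can be captured in a small lookup so that scripts never pass an alias to Globus. This is a sketch only: the function name grid_contact is hypothetical, the mappings are just the two examples listed above, and the usage line assumes that globusrun is on your path.

```shell
#!/bin/sh
# Hypothetical helper: translate a round-robin or alias name into a concrete
# host whose certificate matches. The mappings come from the examples above;
# tf005i.sdsc.edu would work equally well for horizon.
grid_contact() {
    case $1 in
        horizon.sdsc.edu)  echo tf004i.sdsc.edu ;;
        b80login.sdsc.edu) echo b80n01.sdsc.edu ;;
        *)                 echo "$1" ;;
    esac
}

# Usage (authentication test only):
#   globusrun -a -r "$(grid_contact horizon.sdsc.edu)"
```

Names that are not in the table are passed through unchanged, so the helper is safe to wrap around any contact string.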
- I want to
use mpich/mpich-g2/vendormpi, but they are in the wrong
order in my path
The problem with having multiple MPI implementations available
is that the tools are all named the same (mpicc, mpirun,
...), so the first implementation in the path takes precedence.
That is not necessarily the one that you want to use.
The trick is to put the directory of the implementation
you want at the front of your path:
export PATH=/usr/npaci-grid-1.1/grid/bin:$PATH
The example above puts mpich-g2 first in the path.
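After changing the path, it is worth confirming which tools the shell will actually run. The sketch below assumes a Bourne-style shell (sh, ksh, or bash). The function name path_prepend is hypothetical; it moves a directory to the front of PATH and removes any later duplicate so that repeated logins do not keep growing the path.

```shell
#!/bin/sh
# Hypothetical helper: move (or add) a directory to the front of PATH,
# dropping any later duplicate of the same directory.
path_prepend() {
    new=$1
    rest=
    old_ifs=$IFS
    IFS=:
    for d in $PATH; do
        [ "$d" = "$new" ] || rest="${rest:+$rest:}$d"
    done
    IFS=$old_ifs
    PATH="$new${rest:+:$rest}"
    export PATH
}

# Put the mpich-g2 directory from the example above first.
path_prepend /usr/npaci-grid-1.1/grid/bin

# Show which mpicc and mpirun the shell will now pick up, if any.
for tool in mpicc mpirun; do
    command -v "$tool" || echo "$tool: not found on PATH"
done
```

If the reported locations are not under the directory you just prepended, that directory probably does not contain the MPI wrappers on your system.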