SLACVX Cluster Issues
- Major Push to Improve Reliability of Cluster
- Tuned up staging system
- Reduced load on tape system
- “How to fix” page
- “How to report problems” page
- GNATS problem tracking system
- Log of system problems
- Automatic paging when problems occur
- Nightly job checks full MC/recon chain.
- Remaining Issues
- Track down and fix “fatal drive error”
- Persuade more users to use Alphas for batch.
- Improve debugging on Alpha.
- Comments/suggestions to Gary Bower
http://www-sld.slac.stanford.edu/sldwww/staging/monitoring.html
http://www-sld.slac.stanford.edu/htbin/syslog
http://www-sld.slac.stanford.edu/sldwww/runinfo/tdint.html
http://www-sld.slac.stanford.edu/sldwww/slacvx/how-to-fix.html