Abstract:
MS 1/OM/ATE, a monitor system for parallel and distributed computers, is introduced. MS 1 is an event-driven, hybrid, universal distributed monitor system. Its event recorder uses a high-speed buffer and reaches a peak performance of 10 million events/s. OM is an on-line monitoring and analysis environment in which system designers and programmers can view and analyze the internal dynamic behavior of parallel and distributed systems while they are running. ATE is an analyzing tools environment for off-line use that includes statistical and visualization tools. The monitor system is used to debug the prototype of TJ MPP. It can also be used to debug and tune computer systems connected or coordinated in any mode.