<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.6000.16414" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>Hi Jonathan,</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Thanks for using MVAPICH. We are glad to work with
you to solve this problem.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>The <FONT face="Courier New">Got completion
with error, code=12</FONT> failure is not related to the VIADEV_CM_TIMEOUT
environment variable. You can try increasing VIADEV_DEFAULT_TIME_OUT to 22.
The unit of VIADEV_DEFAULT_TIME_OUT is specified by the IB Spec, page 340: the
timeout is 4.096 us * 2 ^ (&lt;5-bit timeout value&gt;).</FONT></DIV>
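<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>To give a feel for what the formula means in
practice, here is a minimal Python sketch that converts the 5-bit value into
seconds. The function name is only illustrative and is not part of MVAPICH; the
sketch applies the formula literally and does not model any special-case values
the spec may define.</FONT></DIV>
<PRE>
```python
# Sketch: convert a 5-bit InfiniBand timeout value to seconds using the
# formula quoted above: 4.096 us * 2 ** value.
# The name ib_timeout_seconds is hypothetical, chosen for illustration.
BASE_US = 4.096  # base unit in microseconds, taken from the formula above


def ib_timeout_seconds(value):
    # The field is 5 bits wide, so only 0..31 can be encoded.
    if value not in range(32):
        raise ValueError("timeout value must fit in 5 bits (0..31)")
    return BASE_US * (2 ** value) / 1_000_000


if __name__ == "__main__":
    # 22 is the value suggested above; each increment doubles the timeout.
    for v in (21, 22):
        print(v, round(ib_timeout_seconds(v), 3), "seconds")
```
</PRE>
<DIV><FONT face=Arial size=2>With VIADEV_DEFAULT_TIME_OUT set to 22 this works
out to roughly 17.2 seconds, and each step up or down doubles or halves the
timeout.</FONT></DIV>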
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>As for VIADEV_CM_TIMEOUT, it is used only for
connection setup, and its unit is milliseconds (the default value is
500 milliseconds). Thanks for your suggestion; we will update the
user guide to make this clearer.</FONT></DIV>
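<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Since the unit is milliseconds, the value 5000000
from your message would correspond to 5000 seconds (more than 83 minutes), not
5 seconds. The short Python sketch below shows the conversion; the helper names
are hypothetical and only serve the example.</FONT></DIV>
<PRE>
```python
# Sketch: interpret a VIADEV_CM_TIMEOUT setting, assuming the unit is
# milliseconds as stated above. Helper names are illustrative only.
DEFAULT_CM_TIMEOUT_MS = 500  # default value quoted above


def cm_timeout_seconds(value_ms):
    # Convert a millisecond setting to seconds.
    return value_ms / 1000


def ms_for_seconds(seconds):
    # Value to set for a desired timeout given in seconds.
    return int(round(seconds * 1000))


if __name__ == "__main__":
    configured = 5000000  # value from the original report
    print(cm_timeout_seconds(configured), "seconds as configured")
    print(ms_for_seconds(5), "is the setting for a 5 second timeout")
    print(cm_timeout_seconds(DEFAULT_CM_TIMEOUT_MS), "seconds by default")
```
</PRE>
<DIV><FONT face=Arial size=2>So if a 5 second connection-setup timeout was
intended, VIADEV_CM_TIMEOUT=5000 would be the matching setting.</FONT></DIV>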
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Please let us know if you have any
questions.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Regards,</FONT></DIV>
<DIV><FONT face=Arial size=2>--Qi</FONT></DIV>
<BLOCKQUOTE
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
<DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
<DIV
style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>
<A title=jonathan_follows@uk.ibm.com
href="mailto:jonathan_follows@uk.ibm.com">Jonathan Follows</A> </DIV>
<DIV style="FONT: 10pt arial"><B>To:</B> <A
title=mvapich-discuss@cse.ohio-state.edu
href="mailto:mvapich-discuss@cse.ohio-state.edu">mvapich-discuss@cse.ohio-state.edu</A>
</DIV>
<DIV style="FONT: 10pt arial"><B>Sent:</B> Thursday, February 22, 2007 1:30
PM</DIV>
<DIV style="FONT: 10pt arial"><B>Subject:</B> [mvapich-discuss] MVAPICH on
large clusters - timeouts - any advice?</DIV>
<DIV><BR></DIV><BR><FONT face=sans-serif size=2>Hello,</FONT>
<P><FONT face=sans-serif size=2>I'm running on a relatively large cluster (160
nodes, dual-core dual-socket) with IB connecting all nodes.</FONT>
<P><FONT face=sans-serif size=2>I recompiled MVAPICH 0.9.8 because I wanted to
run under IBM's batch scheduler, LoadLeveler, and that worked fine.</FONT>
<P><FONT face=sans-serif size=2>The IB implementation is with Voltaire PCIe
adapters and I compiled MVAPICH using the "make.mvapich.gen2" script with
appropriate modifications. I'm using Pathscale compilers, for example.</FONT>
<P><FONT face=sans-serif size=2>With anything like a "reasonable" number of
nodes (sometimes even 16, but &gt;=64 for sure) I'm getting failures:</FONT>
<P><TT><FONT size=2>[chpcc022:14] Got completion with error, code=12, dest
rank=78 at line 397 in file viacheck.c</FONT></TT> <BR><BR><FONT
face=sans-serif size=2>I have now recompiled MVAPICH with -DON_DEMAND and, at
run-time, VIADEV_CM_TIMEOUT=5000000.</FONT>
<P><FONT face=sans-serif size=2>[REQUEST: the documentation is unclear but the
value for this parameter needs to be specified in microseconds, I
believe]</FONT>
<P><FONT face=sans-serif size=2>Now my job is running, but it's probably
running very badly; in due course I plan on changing this timeout value to
something less (but greater than the default).</FONT>
<P><FONT face=sans-serif size=2>Just looking for now for any comments, ideas,
experiences, advice?</FONT>
<P><FONT face=sans-serif size=2>Gratefully received of course,</FONT>
<P><FONT face=sans-serif size=2>Thanks,</FONT>
<P><FONT face=sans-serif size=2>Jonathan Follows<BR>Deep Computing, Consulting
I/T Specialist<BR>IBM UK, Manchester [Internal 487099]<BR>POST: c/o IBM UK
Limited, NHBR-1PH, Portsmouth PO6 3AU<BR>Tel: (+44) 1619057099 FAX: (+44) 870
1385642<BR>Mobile: (+44) 7764660714 MOBX 273842<BR>E-mail:
Jonathan_Follows@uk.ibm.com<BR>Text messaging:
http://www.jonathanfollows.com/pageme.html<BR></FONT><FONT face=sans-serif
size=3><BR></FONT><BR><FONT face=sans-serif size=3><BR></FONT>
</BLOCKQUOTE></BODY></HTML>