<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>Failed to Initialize HCA type for mvapich2-0.9.8</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>Hi,<BR>
I just set up two nodes, connected by an InfiniBand cable, running Fedora Core 6 (kernel 2.6.19-1.2911.fc6) and OFED-1.1. The ibstat and ibnodes outputs are below. I ran the make.mvapich2.gen2 script to build mvapich2-0.9.8, and I get the following error when I run mpiexec. Could you please tell me what I am doing wrong? The configure step inside make.mvapich2.gen2 uses --with-device=osu_ch3:mrail; I don't know whether that is the wrong device for this setup. Also, ulimit -l shows unlimited. Thanks for your help.<BR>
<BR>
<BR>
Prakashan Korambath<BR>
UCLA<BR>
<BR>
------------------------------------------<BR>
<BR>
<BR>
<BR>
-bash-3.1$ mpd &<BR>
[1] 13652<BR>
-bash-3.1$ !mpdboot<BR>
mpdboot -n 2 -f hostfile<BR>
[1]+ Done mpd<BR>
-bash-3.1$ mpicc -o bones bones.c<BR>
-bash-3.1$ which mpicc<BR>
~/mvapich2/bin/mpicc<BR>
-bash-3.1$ mpiexec -n 2 ./bones<BR>
cannot create cq<BR>
Failed to Initialize HCA type<BR>
Fatal error in MPI_Init: Other MPI error, error stack:<BR>
MPIR_Init_thread(230): Initialization failed<BR>
MPID_Init(81)........: channel initialization failed<BR>
(unknown)(): Other MPI errorrank 1 in job 1 grid4.ats.ucla.edu_33136 caused collective abort of all ranks<BR>
exit status of rank 1: killed by signal 9<BR>
-bash-3.1$<BR>
-bash-3.1$ mpdtrace<BR>
grid4<BR>
n11<BR>
<BR>
<BR>
<BR>
-----------------------<BR>
[root@grid4 ~]# ibstat<BR>
CA 'mthca0'<BR>
CA type: MT25204<BR>
Number of ports: 1<BR>
Firmware version: 1.0.800<BR>
Hardware version: a0<BR>
Node GUID: 0x00066a0098007a39<BR>
System image GUID: 0x00066a0098007a39<BR>
Port 1:<BR>
State: Active<BR>
Physical state: LinkUp<BR>
Rate: 20<BR>
Base lid: 1<BR>
LMC: 0<BR>
SM lid: 2<BR>
Capability mask: 0x02510a6a<BR>
Port GUID: 0x00066a00a0007a39<BR>
[root@grid4 ~]# ibnodes<BR>
Ca : 0x00066a0098007a25 ports 1 "n11 HCA-1"<BR>
Ca : 0x00066a0098007a39 ports 1 "grid4 HCA-1"<BR>
</FONT>
</P>
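<P><FONT SIZE=2>On the ulimit -l check mentioned above: that value comes from my interactive shell, and I am assuming (I am not certain) that processes started through mpd could see a different locked-memory limit. Below is a minimal sketch for printing the limit from inside a launched process. It uses only Python's standard resource module; the function names format_limit and memlock_limits are just illustrative.</FONT></P>
<PRE>
```python
import resource


def format_limit(value):
    """Render a byte limit the way `ulimit -l` does: kilobytes, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return str(value // 1024)


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    # Prints what this process sees, which is the point of the check:
    # the limit is per process and is inherited from whatever started it.
    soft, hard = memlock_limits()
    print("max locked memory (kB): soft=%s hard=%s" % (soft, hard))
```
</PRE>
<P><FONT SIZE=2>Saved as, say, memlock.py (a hypothetical file name, assumed to be visible on both nodes), it could be started the same way as the test program, for example mpiexec -n 2 python memlock.py, so that each node reports the limit its own process sees.</FONT></P>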
</BODY>
</HTML>