Double, double packet trouble: too many NICs, too many problems in three Acts

Act 1 - The Problem: Here be known our Complicating Incident. Look at that fancy new server just installed in the rack with 8x HDR 200Gb InfiniBand HCAs on it. WOW. That is going to make your data scientist stakeholders very happy. Linux is installed and it's just dying to consume input. This particular node [...]

August 3rd, 2023 | HPC Blog

“ib_srp REJ reason 0x3” and you

Most HPC administrators have had to cut their teeth on SCSI RDMA Protocol (SRP) at some point. And most of them have torn their hair out getting SRP to work. SRP is a fine remote block storage protocol, if a little dated, and one of the rocks HPC was built upon. Troubleshooting [...]

June 13th, 2023 | HPC Blog