Sean Hefty [Thu, 24 May 2012 21:36:41 +0000 (14:36 -0700)]
rsocket: Reduce SQ from 2 SGE per WR to 1 SGE
We currently request 2 SGEs per WR when allocating a QP. The
second SGE is only used when a send wraps around from the end
of the circular send buffer to its start. All other sends are
restricted to a single SGE.
Reduce the size of the SQ by only requesting 1 SGE per WR. The
resulting performance is basically unaffected.
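
As an illustration of the wrap-around case described above, the sketch below splits a send in a circular buffer into one or two contiguous pieces. The names (struct seg, split_send) are hypothetical and not taken from the rsocket source; how rsockets itself handles a wrapped send once only 1 SGE per WR is available is not stated in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one contiguous piece of a send. */
struct seg {
    size_t offset;  /* start of the piece within the circular buffer */
    size_t length;  /* number of bytes in the piece */
};

/*
 * Split a send of `len` bytes starting at `offset` in a circular buffer
 * of `size` bytes into contiguous pieces. Returns the piece count: 1 when
 * the data fits before the end of the buffer, 2 when it wraps to the start.
 */
static int split_send(size_t size, size_t offset, size_t len,
                      struct seg out[2])
{
    assert(size > 0 && offset < size && len > 0 && len <= size);

    size_t to_end = size - offset;  /* bytes available before the wrap */

    if (len <= to_end) {
        out[0].offset = offset;
        out[0].length = len;
        return 1;
    }

    /* Wrapped send: tail of the buffer first, then the head. */
    out[0].offset = offset;
    out[0].length = to_end;
    out[1].offset = 0;
    out[1].length = len - to_end;
    return 2;
}
```

Only the second return path needs two pieces, which matches the observation that a second SGE was used solely for sends that cross the end of the buffer.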
Sean Hefty [Thu, 24 May 2012 21:31:12 +0000 (14:31 -0700)]
rsockets: Change the default QP size from 512 to 384
Simple bandwidth tests using rstream showed no difference in
performance between using a QP sized to 384 entries versus 512.
Reduce the overhead of a default rsocket by using 384 entries.
A user can request a larger size by calling rsetsockopt.
Sean Hefty [Tue, 22 May 2012 01:46:36 +0000 (18:46 -0700)]
rsocket preload: Use environment variable to set QP size
Allow the user to specify the size of the send/receive queues
and inline data size through environment variables:
RS_SQ_SIZE, RS_RQ_SIZE, and RS_INLINE.
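
A minimal sketch of how a preload library might read these variables. Only the variable names RS_SQ_SIZE, RS_RQ_SIZE, and RS_INLINE come from this message; the helper env_int and the fall-back-on-invalid-input behavior are assumptions for illustration, not the preload library's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the positive decimal integer stored in the
 * environment variable `name`, or `fallback` if the variable is unset,
 * empty, not a number, or out of range.
 */
static int env_int(const char *name, int fallback)
{
    assert(name != NULL);

    const char *val = getenv(name);
    char *end;
    long n;

    if (val == NULL || *val == '\0')
        return fallback;

    errno = 0;
    n = strtol(val, &end, 10);
    if (errno != 0 || *end != '\0' || n <= 0 || n > INT_MAX)
        return fallback;

    return (int)n;
}
```

A caller could then use, for example, env_int("RS_SQ_SIZE", default_sq_size), where default_sq_size stands for whatever default the library uses when the variable is absent.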
Sean Hefty [Fri, 25 May 2012 17:48:47 +0000 (10:48 -0700)]
librdmacm: Delay ACM connection until resolving an address
Avoid creating a connection to the ACM service when
it's not needed. For example, if the user of the librdmacm
is a server application, it will not use ACM services.
Sean Hefty [Fri, 25 May 2012 17:48:47 +0000 (10:48 -0700)]
librdmacm: Delay ACM connection until resolving an address
When ACM support is enabled, the librdmacm will attempt to connect
to the ACM service during startup. This results in the library
hanging when rsockets are being used with the rs-preload library.
The code path ends up as:
This problem was pointed out by Chet Murthy <chet@watson.ibm.com>.
To fix this, delay connecting to the ACM service until it's
actually needed. This not only avoids the hang described above,
but also avoids creating a connection to the ACM service when
it's not needed. For example, if the user of the librdmacm
is a server application, it will not use ACM services.
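
The deferred-connection idea can be sketched as a connect-on-first-use guard. All names below (acm_connect, acm_ensure_connected, resolve_address) are hypothetical stand-ins, not the librdmacm functions, and the sketch leaves out thread safety, which a real library would need (for example via pthread_once).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool acm_connected;      /* hypothetical: set once connected */
static int acm_connect_calls;   /* counts connection attempts, for illustration */

/* Stand-in for the real connection setup to the ACM service. */
static void acm_connect(void)
{
    acm_connect_calls++;
    acm_connected = true;
}

/* Connect on first use only; nothing runs at library startup. */
static void acm_ensure_connected(void)
{
    if (!acm_connected)
        acm_connect();
}

/* Hypothetical address-resolution entry point that needs the service. */
static int resolve_address(const char *name)
{
    assert(name != NULL);
    acm_ensure_connected();
    /* ... the actual lookup would happen here ... */
    return 0;
}
```

A process that never calls resolve_address, such as a pure server in this sketch, never triggers acm_connect at all.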
Sean Hefty [Fri, 25 May 2012 01:18:43 +0000 (18:18 -0700)]
rsockets: Reduce the default inline size
Inline data consumes the same space used by the SGL. Since
we reduced the default number of SGEs per SQ entry to 1,
also reduce the default inline data size to 16 bytes.
Otherwise, the SQ size won't actually be reduced.
Although this increases the latency of small messages over
16 bytes, tests show that decreasing the inline data size
from 64 bytes to 32 or 16 bytes improves large message
bandwidth 8-10%.