Mirror of https://github.com/NVIDIA/nccl-tests.git (synced 2026-01-13 18:37:16 +08:00)
Clarified use of Mebibytes and Gibibytes for sizes
parent 2656c58421
commit 7278698c1b
README.md (13 lines changed)
@@ -32,13 +32,14 @@ NCCL tests can run on multiple processes, multiple threads, and multiple CUDA de
 
 ### Quick examples
 
-Run on single node with 8 GPUs (`-g 8`), scanning from 8 Bytes to 128MBytes :
+Run on single node with 8 GPUs (`-g 8`), scanning from 8 Bytes to 128MiB (Mebibytes), doubling between each test (`-f 2`) :
 
 ```shell
 $ ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 8
 ```
 
-Run 64 MPI processes on nodes with 8 GPUs each, for a total of 64 GPUs spread across 8 nodes :
+Run 64 MPI processes on nodes with 8 GPUs each, for a total of 64 GPUs spread across 8 nodes.
+Scanning from 8 Bytes to 32GiB (Gibibytes), doubling between each test (`-f 2`).
 (NB: The nccl-tests binaries must be compiled with `MPI=1` for this case)
 
 ```shell
@@ -57,10 +58,10 @@ All tests support the same set of arguments :
   * `-t,--nthreads <num threads>` number of threads per process. Default : 1.
   * `-g,--ngpus <GPUs per thread>` number of gpus per thread. Default : 1.
 * Sizes to scan
-  * `-b,--minbytes <min size in bytes>` minimum size to start with. Default : 32M.
-  * `-e,--maxbytes <max size in bytes>` maximum size to end at. Default : 32M.
-  * Increments can be either fixed or a multiplication factor. Only one of those should be used
-    * `-i,--stepbytes <increment size>` fixed increment between sizes. Default : 1M.
+  * `-b,--minbytes <min size in bytes>` minimum size to start with. Default : 32M (Mebibytes).
+  * `-e,--maxbytes <max size in bytes>` maximum size to end at. Default : 32M (Mebibytes).
+  * Increments can be either fixed or a multiplication factor. Only one of those should be used.
+    * `-i,--stepbytes <increment size>` fixed increment between sizes. Default : 1M (Mebibytes).
     * `-f,--stepfactor <increment factor>` multiplication factor between sizes. Default : disabled.
 * NCCL operations arguments
   * `-o,--op <sum/prod/min/max/avg/all>` Specify which reduction operation to perform. Only relevant for reduction operations like Allreduce, Reduce or ReduceScatter. Default : Sum.
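The interaction between `-b`, `-e`, `-i` and `-f` described in the README hunk above can be illustrated with a small C sketch. This is not the nccl-tests scan loop: `count_scan_sizes` and the `MIB` macro are hypothetical names, and the rule that a factor greater than 1 takes precedence over the fixed increment is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One mebibyte, matching the binary meaning of the "M" suffix. */
#define MIB ((size_t)1 << 20)

/* Hypothetical helper (not part of nccl-tests): counts the message sizes
 * visited when scanning from min_bytes to max_bytes inclusive.
 * Assumption: a step_factor greater than 1 multiplies the size each step;
 * otherwise the fixed step_bytes increment is added. */
size_t count_scan_sizes(size_t min_bytes, size_t max_bytes,
                        size_t step_bytes, size_t step_factor) {
  assert(min_bytes > 0);                     /* a zero start never grows by a factor */
  assert(step_factor > 1 || step_bytes > 0); /* the scan must make progress */

  size_t count = 0;
  for (size_t size = min_bytes; size <= max_bytes;
       size = (step_factor > 1) ? size * step_factor : size + step_bytes) {
    count++;
  }
  return count;
}
```

Under these assumptions, `count_scan_sizes(8, 128 * MIB, MIB, 2)` corresponds to `-b 8 -e 128M -f 2` and visits 25 sizes (2^3 through 2^27 bytes), while the documented defaults (32M to 32M) visit a single size.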
@@ -210,6 +210,7 @@ testResult_t initComms(ncclComm_t* comms, int nComms, int firstRank, int nRanks,
   return testSuccess;
 }
 
+// NOTE: We use the binary system, so M=Mebibytes and G=Gibibytes
 static double parsesize(const char *value) {
   long long int units;
   double size;
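The hunk above shows only the first lines of `parsesize`, so the following is a minimal sketch of the binary-suffix convention the new comment documents, not the repository's implementation. The function name `parse_size_binary`, the `-1.0` error value, and the `K` suffix are assumptions; only `M` (mebibytes) and `G` (gibibytes) are stated in the commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for parsesize() (not the nccl-tests code).
 * Converts strings such as "8", "128M" or "32G" to a byte count using
 * binary multiples: M = 1024^2 bytes, G = 1024^3 bytes.
 * Returns -1.0 on malformed input (an assumed error convention). */
double parse_size_binary(const char *value) {
  assert(value != NULL);

  char *end = NULL;
  double size = strtod(value, &end);
  if (end == value || size < 0) return -1.0; /* no number, or negative */

  double units = 1.0;
  switch (*end) {
    case 'G': case 'g': units = 1024.0 * 1024.0 * 1024.0; end++; break;
    case 'M': case 'm': units = 1024.0 * 1024.0; end++; break;
    case 'K': case 'k': units = 1024.0; end++; break; /* assumed suffix */
    default: break;                                   /* plain bytes */
  }
  if (*end != '\0') return -1.0; /* trailing characters after the suffix */

  return size * units;
}
```

With this sketch, `parse_size_binary("128M")` yields 134217728 bytes rather than 128000000, which is the distinction the commit message draws between mebibytes and megabytes.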