查看详细节点信息:scontrol show node¶
scontrol show node 显示全部节点信息,scontrol show node NODENAME 或
scontrol show node=NODENAME 显示节点名NODENAME的节点信息,输出类似:
NodeName=anode01 Arch=x86_64 CoresPerSocket=20
CPUAlloc=0 CPUTot=40 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=anode01 NodeHostName=anode01 Version=19.05.4
OS=Linux 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019
RealMemory=2031623 AllocMem=0 FreeMem=1989520 Sockets=2 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=2TB-AEP-Mem
BootTime=2019-11-09T15:47:56 SlurmdStartTime=2019-12-01T19:01:59
CfgTRES=cpu=40,mem=2031623M,billing=40
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=gnode01 Arch=x86_64 CoresPerSocket=20
CPUAlloc=0 CPUTot=40 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=gpu:v100:2
NodeAddr=gnode01 NodeHostName=gnode01 Version=19.05.4
OS=Linux 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019
RealMemory=385560 AllocMem=0 FreeMem=368966 Sockets=2 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=GPU-V100
BootTime=2019-11-13T16:51:31 SlurmdStartTime=2019-12-01T19:54:55
CfgTRES=cpu=40,mem=385560M,billing=40,gres/gpu=2
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
scontrol show node主要输出项¶
NodeName:节点名。Arch:系统架构,如x86_64。CoresPerSocket:每颗 CPU 处理器的核数。CPUAlloc:已分配的 CPU 核数。CPUErr:出错的 CPU 核数。CPUTot:总 CPU 核数。CPULoad:CPU 当前负载。AvailableFeatures:可用特性。ActiveFeatures:激活的特性。Gres:通用资源,包括 GPU 等。例如上面Gres=gpu:v100:2指明了该节点上安装有两块V100 GPU。NodeAddr:节点 IP 地址或当前主机可以访问的主机名。NodeHostName:节点名。Version:Slurm版本。OS:操作系统及版本信息 。RealMemory:实际物理内存,单位 MB。AllocMem:已分配内存,单位 MB。FreeMem:可用内存,单位 MB。Sockets:CPU 颗数。Boards:主板数。State:状态。ThreadsPerCore:每颗CPU核线程数。TmpDisk:设定的临时存储空间大小。Weight:权重。BootTime:开机时间。SlurmdStartTime:Slurmd守护进程启动时间。