SYSFS FILES
For each InfiniBand device, the InfiniBand drivers create the
following files under /sys/class/infiniband/<device name>:
  node_type      - Node type (CA, switch or router)
  node_guid      - Node GUID
  sys_image_guid - System image GUID
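
As a rough sketch of how these files might be read from a shell, the
function below prints the three attributes for every device it finds.
The name ib_show_devices and the optional root-directory argument are
illustrative assumptions and not part of any InfiniBand tool; the
argument only exists so the function can be pointed at another tree.

```shell
# Hypothetical helper: print node_type, node_guid and sys_image_guid
# for each device directory under the given root.
ib_show_devices() {
    root="${1:-/sys/class/infiniband}"
    for dev in "$root"/*; do
        # Skip the unexpanded glob when no devices are present.
        [ -d "$dev" ] || continue
        echo "$(basename "$dev"):"
        for attr in node_type node_guid sys_image_guid; do
            # A file may be missing or unreadable; skip it rather than abort.
            [ -r "$dev/$attr" ] && echo "  $attr: $(cat "$dev/$attr")"
        done
    done
    return 0
}
```

Calling ib_show_devices with no argument walks /sys/class/infiniband.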
In addition, there is a "ports" subdirectory, with one subdirectory
for each port. For example, if mthca0 is a 2-port HCA, there will
be two directories:
  /sys/class/infiniband/mthca0/ports/1
  /sys/class/infiniband/mthca0/ports/2
(A switch will only have a single "0" subdirectory for switch port
0; no subdirectory is created for normal switch ports.)
In each port subdirectory, the following files are created:
  cap_mask       - Port capability mask
  lid            - Port LID
  lid_mask_count - Port LID mask count
  rate           - Port data rate (active width * active speed)
  sm_lid         - Subnet manager LID for port's subnet
  sm_sl          - Subnet manager SL for port's subnet
  state          - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER)
  phys_state     - Port physical state (Sleep, Polling, LinkUp, etc)
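
A minimal sketch of walking the "ports" subdirectory of one device and
summarizing a few of the per-port files. The function name ib_show_ports
is hypothetical; the device directory is passed in by the caller.

```shell
# Hypothetical helper: for each port of one device directory, print the
# lid, state, phys_state and rate files on a single line.
ib_show_ports() {
    devdir="$1"
    for port in "$devdir"/ports/*; do
        # Skip the unexpanded glob when there is no ports directory.
        [ -d "$port" ] || continue
        line="port $(basename "$port"):"
        for attr in lid state phys_state rate; do
            [ -r "$port/$attr" ] && line="$line $attr=[$(cat "$port/$attr")]"
        done
        echo "$line"
    done
    return 0
}
```

For the 2-port HCA in the example above, this would be invoked as
ib_show_ports /sys/class/infiniband/mthca0 and print one line per port.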
There is also a "counters" subdirectory, with files:
  VL15_dropped
  excessive_buffer_overrun_errors
  link_downed
  link_error_recovery
  local_link_integrity_errors
  port_rcv_constraint_errors
  port_rcv_data
  port_rcv_errors
  port_rcv_packets
  port_rcv_remote_physical_errors
  port_rcv_switch_relay_errors
  port_xmit_constraint_errors
  port_xmit_data
  port_xmit_discards
  port_xmit_packets
  symbol_error
Each of these files contains the corresponding value from the port's
Performance Management PortCounters attribute, as described in
section 16.1.3.5 of the InfiniBand Architecture Specification.
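
The counters can be dumped with a simple loop over the directory. This
is only a sketch: ib_dump_counters is a hypothetical name, and the port
directory is supplied by the caller.

```shell
# Hypothetical helper: print "<counter name> <value>" for every readable
# file in the "counters" subdirectory of one port.
ib_dump_counters() {
    portdir="$1"
    for f in "$portdir"/counters/*; do
        # Ignore anything that is not a readable regular file.
        [ -f "$f" ] && [ -r "$f" ] || continue
        echo "$(basename "$f") $(cat "$f")"
    done
    return 0
}
```

Example: ib_dump_counters /sys/class/infiniband/mthca0/ports/1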
The "pkeys" and "gids" subdirectories contain one file for each
entry in the port's P_Key or GID table respectively. For example,
ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key
table.
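
Since each table entry is a file named after its index, a lookup reduces
to reading one file. The sketch below assumes nothing beyond that
layout; ib_read_pkey is a hypothetical name.

```shell
# Hypothetical helper: print the P_Key stored at a given index of one
# port's P_Key table, or fail if that index has no file.
ib_read_pkey() {
    portdir="$1"
    index="$2"
    file="$portdir/pkeys/$index"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "no P_Key entry at index $index" >&2
        return 1
    fi
}
```

Example: ib_read_pkey /sys/class/infiniband/mthca0/ports/1 10
The same pattern applies to the "gids" subdirectory.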
There is an optional "hw_counters" subdirectory that may be under either
the parent device or the port subdirectories or both. If present, it
contains a list of counters provided by the hardware. They may match
some of the counters in the counters directory, but they often include
many other counters. In addition to the various counters, there will
be a file named "lifespan" that configures how frequently the core
should update the counters when they are being accessed (counters are
not updated if they are not being accessed). The lifespan is in
milliseconds and defaults to 10 unless set to something else by the
driver. Users may echo a value between 0 and 10000 to the lifespan file
to set the length of time between updates in milliseconds.
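
The range check described above can be sketched as a small wrapper
around the echo. The name ib_set_lifespan is hypothetical, and the
caller supplies the hw_counters directory, since it may sit under the
device, under a port, or both.

```shell
# Hypothetical helper: validate a lifespan value (0 to 10000 ms) and
# write it to the "lifespan" file in the given hw_counters directory.
ib_set_lifespan() {
    dir="$1"
    ms="$2"
    case "$ms" in
        ''|*[!0-9]*)
            echo "lifespan must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$ms" -gt 10000 ]; then
        echo "lifespan must be between 0 and 10000" >&2
        return 1
    fi
    echo "$ms" > "$dir/lifespan"
}
```

Example: ib_set_lifespan /sys/class/infiniband/<device name>/hw_counters 100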
MTHCA
The Mellanox HCA driver also creates the files:
  hw_rev   - Hardware revision number
  fw_ver   - Firmware version
  hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)",
             or "MT25208"
HFI1
The hfi1 driver also creates these additional files:
  hw_rev           - hardware revision
  board_id         - manufacturing board id
  tempsense        - thermal sense information
  serial           - board serial number
  nfreectxts       - number of free user contexts
  nctxts           - number of allowed contexts (PSM2)
  chip_reset       - diagnostic (root only)
  boardversion     - board version
  sdma<N>/         - one directory per sdma engine (0 - 15)
  sdma<N>/cpu_list - read-write, list of cpus for user-process to sdma
                     engine assignment.
  sdma<N>/vl       - read-only, vl the sdma engine maps to.
The sdma<N> files give the user control over the affinity settings
for the hfi1 device.
As an example, to set the irq affinity of an sdma engine and the thread
affinity of the user processes that use it, so that the engine is "near"
those processes in terms of NUMA configuration or physical cpu location,
the user will do:
  echo "3" > /proc/irq/<N>/smp_affinity_list
  echo "4-7" > /sys/devices/.../sdma3/cpu_list
  cat /sys/devices/.../sdma3/vl
  0
  echo "8" > /proc/irq/<M>/smp_affinity_list
  echo "9-12" > /sys/devices/.../sdma4/cpu_list
  cat /sys/devices/.../sdma4/vl
  1
This makes sure that when a process runs on cpus 4, 5, 6 or 7 and uses
vl=0, the driver selects sdma engine 3 and the interrupt of sdma engine
3 is steered to cpu 3. Similarly, when a process runs on cpus 9, 10, 11
or 12 and sets vl=1, engine 4 is selected and the irq of sdma engine 4
is steered to cpu 8.
This assumes that, in the above, N is the irq number of "sdma3" and M is
the irq number of "sdma4" in the /proc/interrupts file.
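
The two writes per engine can be wrapped in one function, sketched
below. The name hfi1_bind_sdma and its argument order are assumptions
made for illustration. The sdma<N> directory is passed in by the caller
because its full path is not spelled out above, and the optional fifth
argument only exists so the function can be tried against a scratch
directory instead of /proc/irq.

```shell
# Hypothetical helper wrapping the writes shown in the example above.
#   $1  irq number of the sdma engine (from /proc/interrupts)
#   $2  cpu list for the interrupt, e.g. "3"
#   $3  path to the sdma<N> sysfs directory (supplied by the caller)
#   $4  cpu list for user processes, e.g. "4-7"
#   $5  optional irq root, defaults to /proc/irq
hfi1_bind_sdma() {
    irq="$1"
    irq_cpus="$2"
    sdma_dir="$3"
    user_cpus="$4"
    irq_root="${5:-/proc/irq}"
    # Steer the engine's interrupt to the chosen cpu(s).
    echo "$irq_cpus" > "$irq_root/$irq/smp_affinity_list" || return 1
    # Assign user processes on these cpus to this engine.
    echo "$user_cpus" > "$sdma_dir/cpu_list" || return 1
    # Report which vl the engine maps to (read-only file).
    echo "vl: $(cat "$sdma_dir/vl")"
}
```

With N taken from /proc/interrupts, the first half of the example would
become: hfi1_bind_sdma <N> 3 /sys/devices/.../sdma3 4-7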
  ports/1/
    CCMgtA/
      cc_settings_bin - CCA tables used by PSM2
      cc_table_bin
      cc_prescan - enable prescanning for faster BECN response
    sc2vl/  - 32 files (0 - 31) used to translate sc->vl
    sl2sc/  - 32 files (0 - 31) used to translate sl->sc
    vl2mtu/ - 16 files (0 - 15) used to determine MTU for vl
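
Because each of these tables is a directory of files named by index, one
loop can print any of them. This is a sketch only: hfi1_dump_table is a
hypothetical name, and the table name and entry count are passed in by
the caller rather than assumed.

```shell
# Hypothetical helper: print entries 0..count-1 of one per-port table
# directory (for example sl2sc with 32 entries, vl2mtu with 16).
hfi1_dump_table() {
    portdir="$1"
    table="$2"
    count="$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        f="$portdir/$table/$i"
        # Skip indexes whose file is missing or unreadable.
        [ -r "$f" ] && echo "${table}[$i] = $(cat "$f")"
        i=$((i + 1))
    done
    return 0
}
```

Example: hfi1_dump_table /sys/class/infiniband/<device name>/ports/1 vl2mtu 16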