| 1 | +# The NBD Netlink Control Protocol |
| 2 | + |
| 3 | +## Introduction |
| 4 | + |
| 5 | +The NBD kernel driver provides a netlink-based control interface that allows userspace |
| 6 | +tools to configure, manage, and monitor NBD devices. This interface is used by the |
| 7 | +nbd-client utility to establish connections, configure devices, and query status. |
| 8 | + |
| 9 | +The netlink protocol uses the generic netlink (genl) framework with family name "nbd" |
| 10 | +and version 0x1. It supports both unicast commands/responses and multicast notifications |
| 11 | +for link death events. |
| 12 | + |
| 13 | +## Protocol Overview |
| 14 | + |
| 15 | +### Family Information |
| 16 | + |
| 17 | +- **Family Name**: `nbd` |
| 18 | +- **Version**: `0x1` |
| 19 | +- **Multicast Group**: `nbd_mc_group` |
| 20 | + |
| 21 | +### Message Types |
| 22 | + |
| 23 | +The protocol defines the following command types: |
| 24 | + |
| 25 | +- `NBD_CMD_CONNECT` - Connect and configure an NBD device |
| 26 | +- `NBD_CMD_DISCONNECT` - Disconnect an NBD device |
| 27 | +- `NBD_CMD_RECONFIGURE` - Reconfigure an existing connection |
| 28 | +- `NBD_CMD_STATUS` - Query device status |
| 29 | +- `NBD_CMD_LINK_DEAD` - Multicast notification of link failure (kernel → userspace) |
| 30 | + |
| 31 | +## Attributes |
| 32 | + |
| 33 | +### Configuration Attributes |
| 34 | + |
| 35 | +These top-level attributes are carried in command requests and responses: |
| 36 | + |
| 37 | +- `NBD_ATTR_INDEX` (u32) - NBD device index |
| 38 | +- `NBD_ATTR_SIZE_BYTES` (u64) - Device size in bytes |
| 39 | +- `NBD_ATTR_BLOCK_SIZE_BYTES` (u64) - Block size in bytes |
| 40 | +- `NBD_ATTR_TIMEOUT` (u64) - Connection timeout |
| 41 | +- `NBD_ATTR_SERVER_FLAGS` (u64) - Server flags from negotiation |
| 42 | +- `NBD_ATTR_CLIENT_FLAGS` (u64) - Client flags |
| 43 | +- `NBD_ATTR_SOCKETS` (nested) - Socket configuration |
| 44 | +- `NBD_ATTR_DEAD_CONN_TIMEOUT` (u64) - Dead connection timeout |
| 45 | +- `NBD_ATTR_DEVICE_LIST` (nested) - List of devices (for STATUS response) |
| 46 | +- `NBD_ATTR_BACKEND_IDENTIFIER` (string) - Backend identifier |
| 47 | + |
| 48 | +### Socket Attributes |
| 49 | + |
| 50 | +Nested within `NBD_ATTR_SOCKETS`: |
| 51 | + |
| 52 | +- `NBD_SOCK_ITEM` (nested) - Individual socket item |
| 53 | + - `NBD_SOCK_FD` (u32) - File descriptor for the socket |
| 54 | + |
| 55 | +### Device List Attributes |
| 56 | + |
| 57 | +Nested within `NBD_ATTR_DEVICE_LIST`: |
| 58 | + |
| 59 | +- `NBD_DEVICE_ITEM` (nested) - Individual device item |
| 60 | + - `NBD_DEVICE_INDEX` (u32) - Device index |
| 61 | + - `NBD_DEVICE_CONNECTED` (u8) - Connection status (1 = connected, 0 = disconnected) |
| 62 | + |
| 63 | +## Command Details |
| 64 | + |
| 65 | +### NBD_CMD_CONNECT |
| 66 | + |
| 67 | +Connect and configure an NBD device. |
| 68 | + |
| 69 | +**Request Attributes:** |
| 70 | +- `NBD_ATTR_INDEX` (optional) - Device index to use; the kernel assigns one if not specified |
| 71 | +- `NBD_ATTR_SIZE_BYTES` (required) - Export size in bytes |
| 72 | +- `NBD_ATTR_BLOCK_SIZE_BYTES` (required) - Block size in bytes |
| 73 | +- `NBD_ATTR_SERVER_FLAGS` (required) - Flags from server negotiation |
| 74 | +- `NBD_ATTR_TIMEOUT` (optional) - Connection timeout in seconds |
| 75 | +- `NBD_ATTR_SOCKETS` (required) - Nested socket configuration |
| 76 | +- `NBD_ATTR_DEAD_CONN_TIMEOUT` (optional) - Dead connection timeout |
| 77 | +- `NBD_ATTR_BACKEND_IDENTIFIER` (optional) - Backend identifier string |
| 78 | + |
| 79 | +**Response Attributes:** |
| 80 | +- `NBD_ATTR_INDEX` - Assigned device index |
| 81 | + |
| 82 | +**Example Request Structure:** |
| 83 | +``` |
| 84 | +NBD_CMD_CONNECT |
| 85 | +├── NBD_ATTR_SIZE_BYTES: 10737418240 |
| 86 | +├── NBD_ATTR_BLOCK_SIZE_BYTES: 4096 |
| 87 | +├── NBD_ATTR_SERVER_FLAGS: 0x123 |
| 88 | +├── NBD_ATTR_TIMEOUT: 30 |
| 89 | +└── NBD_ATTR_SOCKETS |
| 90 | + ├── NBD_SOCK_ITEM |
| 91 | + │ └── NBD_SOCK_FD: 5 |
| 92 | + └── NBD_SOCK_ITEM |
| 93 | + └── NBD_SOCK_FD: 6 |
| 94 | +``` |
| 95 | + |
| 96 | +### NBD_CMD_DISCONNECT |
| 97 | + |
| 98 | +Disconnect an NBD device. |
| 99 | + |
| 100 | +**Request Attributes:** |
| 101 | +- `NBD_ATTR_INDEX` (required) - Device index to disconnect |
| 102 | + |
| 103 | +**Response Attributes:** |
| 104 | +- None (success/failure indicated by return code) |
| 105 | + |
| 106 | +### NBD_CMD_RECONFIGURE |
| 107 | + |
| 108 | +Reconfigure an existing NBD connection. |
| 109 | + |
| 110 | +**Request Attributes:** |
| 111 | +- `NBD_ATTR_INDEX` (required) - Device index to reconfigure |
| 112 | +- `NBD_ATTR_SOCKETS` (required) - New socket configuration |
| 113 | +- `NBD_ATTR_DEAD_CONN_TIMEOUT` (optional) - New dead connection timeout |
| 114 | + |
| 115 | +**Response Attributes:** |
| 116 | +- None (success/failure indicated by return code) |
| 117 | + |
| 118 | +### NBD_CMD_STATUS |
| 119 | + |
| 120 | +Query the status of NBD devices. |
| 121 | + |
| 122 | +**Request Attributes:** |
| 123 | +- `NBD_ATTR_INDEX` (optional) - Device index to query; all devices are reported if not specified |
| 124 | + |
| 125 | +**Response Attributes:** |
| 126 | +- `NBD_ATTR_DEVICE_LIST` - Nested list of device statuses |
| 127 | + - `NBD_DEVICE_ITEM` (repeated) |
| 128 | + - `NBD_DEVICE_INDEX` - Device index |
| 129 | + - `NBD_DEVICE_CONNECTED` - Connection status |
| 130 | + |
| 131 | +**Example Response Structure:** |
| 132 | +``` |
| 133 | +NBD_CMD_STATUS Response |
| 134 | +└── NBD_ATTR_DEVICE_LIST |
| 135 | + ├── NBD_DEVICE_ITEM |
| 136 | + │ ├── NBD_DEVICE_INDEX: 0 |
| 137 | + │ └── NBD_DEVICE_CONNECTED: 1 |
| 138 | + ├── NBD_DEVICE_ITEM |
| 139 | + │ ├── NBD_DEVICE_INDEX: 1 |
| 140 | + │ └── NBD_DEVICE_CONNECTED: 0 |
| 141 | + └── NBD_DEVICE_ITEM |
| 142 | + ├── NBD_DEVICE_INDEX: 2 |
| 143 | + └── NBD_DEVICE_CONNECTED: 1 |
| 144 | +``` |
| 145 | + |
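To make the nested layout of the status response concrete, the following is a minimal sketch of walking the payload of `NBD_ATTR_DEVICE_LIST` as raw netlink attributes (a 16-bit length, a 16-bit type, then the value padded to 4 bytes). It is an illustration, not the kernel's or nbd-client's code: the numeric enum values are assumed to match `<linux/nbd-netlink.h>` and should be checked against your header, and `struct attr`, `next_attr`, `put_attr`, `parse_device_list`, and `struct nbd_dev_status` are hypothetical names. `put_attr` is only there to build sample input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, mirroring <linux/nbd-netlink.h>; verify against your header. */
enum { NBD_DEVICE_ITEM = 1 };
enum { NBD_DEVICE_INDEX = 1, NBD_DEVICE_CONNECTED = 2 };

#define ATTR_HDRLEN 4u                      /* u16 length + u16 type */
#define ATTR_ALIGN(n) (((n) + 3u) & ~3u)    /* attributes are padded to 4 bytes */
#define ATTR_TYPE_MASK 0x3fffu              /* drops the nested/byte-order flag bits */

struct attr { uint16_t type; const uint8_t *data; size_t len; };
struct nbd_dev_status { uint32_t index; uint8_t connected; };

/* Read the attribute at *off and advance *off.
 * Returns 1 on success, 0 at end of buffer, -1 if the data is malformed. */
static int next_attr(const uint8_t *buf, size_t len, size_t *off, struct attr *a)
{
    uint16_t alen, atype;

    if (*off >= len)
        return 0;
    if (len - *off < ATTR_HDRLEN)
        return -1;
    memcpy(&alen, buf + *off, 2);           /* netlink uses host byte order */
    memcpy(&atype, buf + *off + 2, 2);
    if (alen < ATTR_HDRLEN || alen > len - *off)
        return -1;
    a->type = atype & ATTR_TYPE_MASK;
    a->data = buf + *off + ATTR_HDRLEN;
    a->len = alen - ATTR_HDRLEN;
    *off += ATTR_ALIGN(alen);
    return 1;
}

/* Walk the payload of NBD_ATTR_DEVICE_LIST and fill `out` with up to `max`
 * entries. Returns the number of devices stored, or -1 on malformed input. */
static int parse_device_list(const uint8_t *buf, size_t len,
                             struct nbd_dev_status *out, int max)
{
    struct attr item, field;
    size_t off = 0;
    int n = 0, rc;

    while ((rc = next_attr(buf, len, &off, &item)) == 1) {
        if (item.type != NBD_DEVICE_ITEM || n == max)
            continue;
        struct nbd_dev_status st = {0, 0};
        size_t inner = 0;
        int rc2;
        while ((rc2 = next_attr(item.data, item.len, &inner, &field)) == 1) {
            if (field.type == NBD_DEVICE_INDEX && field.len == 4)
                memcpy(&st.index, field.data, 4);
            else if (field.type == NBD_DEVICE_CONNECTED && field.len == 1)
                st.connected = field.data[0];
        }
        if (rc2 < 0)
            return -1;
        out[n++] = st;
    }
    return rc < 0 ? -1 : n;
}

/* Sample-data helper: append one attribute at `off`, return the new offset. */
static size_t put_attr(uint8_t *buf, size_t cap, size_t off, uint16_t type,
                       const void *data, uint16_t len)
{
    uint16_t total = (uint16_t)(ATTR_HDRLEN + len);

    assert(off + ATTR_ALIGN(total) <= cap);
    memset(buf + off, 0, ATTR_ALIGN(total));
    memcpy(buf + off, &total, 2);
    memcpy(buf + off + 2, &type, 2);
    if (len)
        memcpy(buf + off + ATTR_HDRLEN, data, len);
    return off + ATTR_ALIGN(total);
}
```

In a real client a netlink library's nested-attribute helpers would normally do this walk. The hand-rolled version is shown only because it exposes the length, type, and padding rules that the tree diagrams above abstract away.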
| 146 | +### NBD_CMD_LINK_DEAD |
| 147 | + |
| 148 | +Multicast notification sent by the kernel when a link dies. |
| 149 | + |
| 150 | +**Message Attributes:** |
| 151 | +- `NBD_ATTR_INDEX` - Device index whose link died |
| 152 | + |
| 153 | +**Delivery:** |
| 154 | +- Sent via multicast group `nbd_mc_group` |
| 155 | +- No direct response expected |
| 156 | + |
| 157 | +## Error Handling |
| 158 | + |
| 159 | +Commands report their result as a standard errno value in the netlink acknowledgement (`NLMSG_ERROR`) message: |
| 160 | +- Success: 0 |
| 161 | +- Invalid parameters: `-EINVAL` |
| 162 | +- Device not found: `-ENOENT` |
| 163 | +- Device busy: `-EBUSY` |
| 164 | +- Memory allocation failure: `-ENOMEM` |
| 165 | +- Permission denied: `-EPERM` |
| 166 | + |
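As a small illustration of the list above, the sketch below maps the error value carried in a netlink error message to the descriptions used in this document. It assumes the caller has access to the raw negative errno (for example the `error` field of `struct nlmsgerr`); netlink libraries may translate this value into their own error codes before returning it. The helper name `nbd_nl_describe` is hypothetical.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: describe the raw error value from a netlink ACK.
 * `err` is 0 on success or a negative errno on failure. */
static const char *nbd_nl_describe(int err)
{
    switch (err) {
    case 0:       return "success";
    case -EINVAL: return "invalid parameters";
    case -ENOENT: return "device not found";
    case -EBUSY:  return "device busy";
    case -ENOMEM: return "memory allocation failure";
    case -EPERM:  return "permission denied";
    default:      return strerror(-err);   /* fall back to the libc text */
    }
}
```

A caller might log `nbd_nl_describe(err)` next to the command name so that a failed `NBD_CMD_CONNECT` can be told apart from, say, a disconnect of a device index that does not exist.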
| 167 | +## Implementation Notes |
| 168 | + |
| 169 | +### Socket Management |
| 170 | + |
| 171 | +The kernel expects file descriptors for already-connected sockets to be passed |
| 172 | +via `NBD_ATTR_SOCKETS`. This allows userspace to handle: |
| 173 | +- TCP connections |
| 174 | +- TLS negotiations |
| 175 | +- Authentication |
| 176 | +- Connection establishment |
| 177 | + |
| 178 | +The kernel takes ownership of these file descriptors and will close them |
| 179 | +when the device is disconnected. |
| 180 | + |
| 181 | +### Timeout Handling |
| 182 | + |
| 183 | +Two timeout types are supported: |
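| 184 | +- `NBD_ATTR_TIMEOUT` - Timeout, in seconds, for I/O requests sent over the connection |
| 185 | +- `NBD_ATTR_DEAD_CONN_TIMEOUT` - Time, in seconds, the kernel waits for a replacement socket after a connection dies before failing I/O |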
| 186 | + |
| 187 | +### Multicast Notifications |
| 188 | + |
| 189 | +Userspace applications can subscribe to the `nbd_mc_group` multicast group |
| 190 | +to receive asynchronous notifications about link death events. |
| 191 | + |
| 192 | +## Security Considerations |
| 193 | + |
| 194 | +- Netlink communications require appropriate capabilities (typically CAP_NET_ADMIN) |
| 195 | +- File descriptors passed to the kernel are validated |
| 196 | +- Backend identifiers should be treated as opaque strings |
| 197 | +- Timeout values should be reasonable to avoid resource exhaustion |
| 198 | + |
| 199 | +## Protocol Evolution |
| 200 | + |
| 201 | +The protocol version is 0x1. Future versions will maintain backward compatibility |
| 202 | +where possible, with new attributes being optional. Unknown attributes should be |
| 203 | +ignored by implementations. |
| 204 | + |
| 205 | +## Usage Examples |
| 206 | + |
| 207 | +### Connecting a Device |
| 208 | + |
| 209 | +```c |
| 210 | +// Create netlink message |
| 211 | +struct nl_msg *msg = nlmsg_alloc(); |
| 212 | +genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, driver_id, 0, 0, |
| 213 | + NBD_CMD_CONNECT, 0); |
| 214 | + |
| 215 | +// Add attributes (each nla_put_* call returns 0 on success; check it in real code) |
| 216 | +nla_put_u64(msg, NBD_ATTR_SIZE_BYTES, size); |
| 217 | +nla_put_u64(msg, NBD_ATTR_BLOCK_SIZE_BYTES, blocksize); |
| 218 | +nla_put_u64(msg, NBD_ATTR_SERVER_FLAGS, flags); |
| 219 | + |
| 220 | +// Add sockets: one NBD_SOCK_ITEM nest per connected socket |
| 221 | +struct nlattr *socks = nla_nest_start(msg, NBD_ATTR_SOCKETS); |
| 222 | +for (int i = 0; i < num_sockets; i++) { |
| 223 | +    struct nlattr *sock = nla_nest_start(msg, NBD_SOCK_ITEM); |
| 224 | +    nla_put_u32(msg, NBD_SOCK_FD, sock_fds[i]); |
| 225 | +    nla_nest_end(msg, sock); |
| 226 | +} |
| 227 | +nla_nest_end(msg, socks); |
| 228 | + |
| 229 | +// Send message |
| 230 | +nl_send_sync(socket, msg); |
| 231 | +``` |
| 232 | + |
| 233 | +### Querying Status |
| 234 | + |
| 235 | +```c |
| 236 | +struct nl_msg *msg = nlmsg_alloc(); |
| 237 | +genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, driver_id, 0, 0, |
| 238 | + NBD_CMD_STATUS, 0); |
| 239 | + |
| 240 | +// Optional: query a specific device (omit to list all devices) |
| 241 | +nla_put_u32(msg, NBD_ATTR_INDEX, device_index); |
| 242 | + |
| 243 | +nl_send_sync(socket, msg); |
| 244 | +``` |
| 245 | + |
| 246 | +### Disconnecting a Device |
| 247 | + |
| 248 | +```c |
| 249 | +struct nl_msg *msg = nlmsg_alloc(); |
| 250 | +genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, driver_id, 0, 0, |
| 251 | + NBD_CMD_DISCONNECT, 0); |
| 252 | +nla_put_u32(msg, NBD_ATTR_INDEX, device_index); |
| 253 | +nl_send_sync(socket, msg); |
| 254 | +``` |