NVMe-oF Initiator

The NVMe-oF (NVMe over Fabrics) initiator connects the server to external NVMe storage arrays (targets) over the network. Remote storage is accessed with high performance, and because the connection runs over the network rather than direct-attached cabling, servers and storage arrays can be placed a substantial distance apart.

Supported Protocols

The initiator supports two NVMe-oF transport protocols:

  • TCP – A widely adopted protocol ensuring ease of implementation and compatibility with conventional networking infrastructure.
  • RDMA – A protocol providing lower latency and higher throughput than TCP, suited to performance-critical environments. Taking full advantage of RDMA requires specialized hardware, such as Mellanox/NVIDIA ConnectX or ATTO network interface cards.


Configuration

Follow these steps to configure the NVMe-oF initiator:

  1. Start Discovery
    Click the “Discover” button to start the discovery wizard.
  2. Enter Connection Details
    • Server IP: IP address of the NVMe storage target.
    • Server port: Network port for communication (default is 4420).
    • Server protocol: Choose between TCP and RDMA.
    • Advanced settings (optional): Enable and specify the number of I/O queues. Leave blank or disabled to use the system default, or enter a specific number to override.
    The number of I/O queues refers to the parallel channels through which data is transferred between the NVMe initiator and the target. Increasing this number can improve performance by enabling higher parallelism and reducing latency. However, each queue consumes system resources, and setting the number too high may exceed hardware or network capabilities, leading to connection issues. Adjust this value based on performance requirements and available resources.
  3. Proceed to Subsystems
    Click “Next”. A list of available NVMe-oF subsystems appears. Select the subsystems you want to connect to and click “Connect”.
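
For readers who want to relate the wizard fields to a command line, the following is a minimal sketch that maps the same connection details onto the Linux nvme-cli tool. This page does not state that the product uses nvme-cli internally; that mapping is an assumption made for illustration. The IP address, the subsystem NQN, and the function names are hypothetical placeholders. The sketch only builds and validates the argument lists and does not execute anything.

```python
import ipaddress
from typing import List, Optional

DEFAULT_PORT = 4420            # default port named in the steps above
TRANSPORTS = ("tcp", "rdma")   # the two protocols offered by the wizard


def _common_args(ip: str, port: int, transport: str) -> List[str]:
    """Validate the connection details and return the shared nvme-cli options."""
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    if transport not in TRANSPORTS:
        raise ValueError(f"transport must be one of {TRANSPORTS}")
    if not 1 <= port <= 65535:
        raise ValueError("port must be in the range 1..65535")
    return ["-t", transport, "-a", ip, "-s", str(port)]


def discover_command(ip: str, port: int = DEFAULT_PORT,
                     transport: str = "tcp") -> List[str]:
    """Argument list that lists the subsystems offered by a target (step 1-2)."""
    return ["nvme", "discover"] + _common_args(ip, port, transport)


def connect_command(ip: str, nqn: str, port: int = DEFAULT_PORT,
                    transport: str = "tcp",
                    io_queues: Optional[int] = None) -> List[str]:
    """Argument list that connects to one subsystem (step 3).

    io_queues=None mirrors leaving the advanced setting blank,
    so the system default is used.
    """
    cmd = ["nvme", "connect"] + _common_args(ip, port, transport) + ["-n", nqn]
    if io_queues is not None:
        if io_queues < 1:
            raise ValueError("io_queues must be a positive integer")
        cmd.append(f"--nr-io-queues={io_queues}")
    return cmd


if __name__ == "__main__":
    # Placeholder address and NQN; replace with values from your own target.
    print(" ".join(discover_command("192.0.2.10")))
    print(" ".join(connect_command("192.0.2.10",
                                   "nqn.2014-08.org.example:subsys1",
                                   io_queues=4)))
```

Building the argument list separately from running it makes it easy to review exactly which address, port, protocol, and queue count would be used before any connection is attempted.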


Manage Connection Paths

Add a new path: Click the “Options” dropdown menu and select “Add path”. Enter the required connection details (Server IP, port, protocol, and optionally the number of I/O queues).

Disconnect a subsystem: Use the “Options” menu and select “Disconnect subsystem”.

You can perform additional discoveries at any time to connect new subsystems.


Practical Implementation

After connecting to a subsystem, a list of available namespaces will be displayed, including:

  • Namespace ID
  • Namespace capacity
  • Namespace aliases

Namespaces are portions of NVMe storage that the storage array exposes through a subsystem. Each namespace appears to the server as an independent NVMe disk, can be identified by its alias, and is managed in the same manner as a standard NVMe disk. Namespaces can be partitioned and added to storage pools.

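Note: Only one partition per disk can be active within a single pool or data group. If two partitions of the same disk belonged to the same group, the loss of that one disk would remove both members at once and undermine the group's redundancy.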
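
The rule in the note above can be expressed as a simple check. The following is a minimal sketch, not the product's actual validation logic: it assumes a data group is described as a list of (disk alias, partition number) pairs, and both that representation and the function name are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# (disk alias, partition number); a hypothetical way to describe group members
Partition = Tuple[str, int]


def disks_used_more_than_once(group: List[Partition]) -> List[str]:
    """Return the aliases of disks that contribute more than one partition."""
    counts = Counter(disk for disk, _partition in group)
    return sorted(disk for disk, n in counts.items() if n > 1)


if __name__ == "__main__":
    group = [("ns-a", 1), ("ns-a", 2), ("ns-b", 1)]
    # "ns-a" appears twice, so this group would violate the rule.
    print(disks_used_more_than_once(group))
```

An empty result means every disk contributes at most one partition to the group.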


Multi-path Connectivity

The initiator supports multi-path connectivity, allowing multiple redundant network paths to a single NVMe target. Each path requires a distinct IP address (Virtual IP) to ensure redundancy and high availability.
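
Before adding paths, it can help to confirm that each planned path really uses a distinct address. The sketch below is an illustration under stated assumptions: the paths are given as a plain list of IP address strings, and the function name is hypothetical. Addresses are normalized with Python's standard `ipaddress` module so that two spellings of the same IPv6 address are recognized as duplicates.

```python
import ipaddress
from typing import List


def duplicate_path_addresses(addresses: List[str]) -> List[str]:
    """Return each address that is used by more than one path.

    Raises ValueError if any entry is not a valid IPv4 or IPv6 address.
    """
    seen = set()
    duplicates = []
    for raw in addresses:
        ip = ipaddress.ip_address(raw)  # normalizes equivalent spellings
        if ip in seen and str(ip) not in duplicates:
            duplicates.append(str(ip))
        seen.add(ip)
    return duplicates


if __name__ == "__main__":
    # Placeholder addresses from the documentation ranges.
    planned = ["192.0.2.10", "192.0.2.11", "192.0.2.10"]
    print(duplicate_path_addresses(planned))
```

An empty list means every path has its own address, which is the precondition stated above for multi-path redundancy.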


Troubleshooting

If you encounter connection issues (for example, the “Could not connect to subsystem(s)” error), work through the following checks:

  1. Check Network Connectivity:
    • Ensure that the server can ping the target’s IP address.
    • Verify that the correct port (default 4420) is open and not blocked by a firewall.
  2. Validate Target Configuration:
    • Verify that the NVMe target is online and properly configured to accept NVMe-oF connections.
    • Ensure that access control lists (ACLs) or authentication settings on the target allow the initiator to establish a connection.
  3. Adjust I/O Queues:
    • If connection errors occur due to queue limits, try lowering the number of I/O queues in the advanced settings to match target capabilities.
  4. Use Alternative Paths:
    • If multiple network interfaces are available (typical in JBOD or HA environments), try using an alternative IP address or configure multi-path connectivity.
  5. Review Logs:
    • Check logs for detailed error messages that can guide further troubleshooting.
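
The port check from step 1 can be approximated with a short script run from any machine that has Python and sits on the same network as the initiator. This is a minimal sketch with a hypothetical function name and a placeholder address. It only tests whether a TCP connection to the port can be opened, so it is most meaningful for the TCP protocol; a successful result does not confirm that an RDMA connection or the NVMe-oF handshake itself will succeed.

```python
import socket


def port_is_reachable(host: str, port: int = 4420, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address; replace with the IP of your NVMe target.
    target = "192.0.2.10"
    state = "reachable" if port_is_reachable(target) else "not reachable"
    print(f"{target}:4420 is {state}")
```

If the port is not reachable while ping succeeds, a firewall rule or a target that is not listening on that port is a likely cause.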