Internet-Draft Recursively Setting Attributes July 2024
Zhang, et al. Expires 27 January 2025 [Page]
Workgroup: Network File System Version 4
Internet-Draft: draft-mzhang-nfsv4-recursively-setting-04
Published: July 2024
Intended Status: Standards Track
Expires: 27 January 2025
Authors:
M. Zhang
Huawei Technologies
S. Bhargo
VMware
R. Parambattu
Huawei Technologies
D. Geng
Huawei Technologies
Y. Du
Huawei Technologies

Recursively Setting Attributes of Subdirectories and Files

Abstract

In recent years, the concept of near-data computing has been widely recognized in storage architectures. The core idea is to process data close to where it is stored, which reduces the overhead of network transmission and makes use of the computing capability of smart devices (such as intelligent NICs, smart SSDs, and DPUs). This reduces CPU and memory usage on clients (computing nodes) and improves data processing efficiency. This design idea is applied in NFSv4.2 and may be applied in future NFS versions; an example is Server-Side Copy, in which the client sends a control command and the storage server copies the data without transmitting it between client and server. In a traditional copy operation, by contrast, the client reads the data from the source storage server and then writes it to the target storage server, so the data crosses the network twice. With Server-Side Copy, data transmission on the network is reduced and bandwidth is freed. In addition, the client changes from the executor of the data copy to its controller, and the actual copying is performed by the storage server. As a result, a large amount of computing and memory resources is saved on the client side.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 27 January 2025.

1. Problem Statement

In actual storage applications, users often recursively set the attributes of a directory and its subitems (its files and subdirectories). With the existing protocol, the message interaction between client and server is complex and the client consumes a lot of resources, which does not match the concept of near-data computing. Figure 1 shows the existing flow for recursively setting the attributes of all files under a directory.

Step 1: The client sends the READDIR command to obtain the list of all files in dir1.

Step 2: The storage server responds to the READDIR operation. If the directory contains many subdirectories and files, the client needs to issue the READDIR operation multiple times.

Step 3: The client sends a SETATTR request for each subdirectory and file.

Step 4: The storage server responds to the SETATTR request.

If the parent directory contains 100,000 files, the client needs to repeat Steps 3 and 4 100,000 times. The whole process consumes considerable CPU and memory resources on the client, and a large number of RPC messages are exchanged between the client and the storage server. As a result, the end-to-end time for the attribute set operation is relatively long.

                 Client                                Server
                 +                                       +
                 |                                       |
                 |------ READDIR ----------------------->|
                 |<--------------------------------------|
                 |------ GETATTR ----------------------->|
                 |<--------------------------------------|
                 |------ SETATTR ----------------------->|
                 |<--------------------------------------|
                 |         ....                          |
                 |                                       |

       Figure 1: Existing flowchart for recursive set operation
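
To put a rough number on the cost of the existing flow, the following non-normative Python sketch counts its round trips. Only READDIR and SETATTR are counted, as in Steps 1 to 4. The function name and the entries_per_readdir value are assumptions made for illustration; the real number of entries per READDIR reply depends on the server and on the reply size requested by the client.

```python
import math

def existing_flow_round_trips(num_entries, entries_per_readdir):
    """Round trips for READDIR paging plus one SETATTR per entry."""
    # At least one READDIR is needed even for an empty directory.
    readdir_calls = max(1, math.ceil(num_entries / entries_per_readdir))
    setattr_calls = num_entries
    return readdir_calls + setattr_calls

# 100,000 entries, assuming (hypothetically) 1,000 entries per READDIR reply.
print(existing_flow_round_trips(100_000, 1_000))  # 100100
```

Under these assumptions the cost is dominated by the per-entry SETATTR requests, which is the part that a server-side recursive operation removes.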

Similar to the design of Server-Side Copy, this document proposes four new operations for recursively setting the attributes of a directory and its subdirectories and files. The recursive set can run in synchronous or asynchronous mode. The four new operations are RECURSIVE_SET, RECURSIVE_SET_STATUS, RECURSIVE_SET_CANCEL, and CB_RECURSIVE_SET_NOTIFY.

RECURSIVE_SET is used by the client to request that the attributes of a directory and of the directories and files below it be set.

RECURSIVE_SET_STATUS is used by the client to query the status of the recursive set operation requested by RECURSIVE_SET.

RECURSIVE_SET_CANCEL is used by the client to cancel the recursive set operation.

CB_RECURSIVE_SET_NOTIFY is used by the server to notify the client that the recursive set operation has finished.

2. Protocol Overview

After adopting the concept of near-data computing, the above scenario can be optimized as follows.

Step 1: The client identifies that the target of the attribute setting is a directory and that the setting is recursive, and invokes the new operation RECURSIVE_SET in a compound request, e.g.:

Compound request:

   SEQUENCE
   PUTFH (directory filehandle)
   RECURSIVE_SET
   SETATTR
   RECURSIVE_SET_STATUS

Step 2: The storage server receives the compound request with the RECURSIVE_SET operation before SETATTR. If the current filehandle is a directory filehandle, the server creates a recursive set task, recursively enumerates all files and subdirectories in the directory, and sets the attributes of each one. If the filehandle refers to a regular file, the server SHOULD return NFS4ERR_NOTDIR.

Step 3: The storage server responds to the request once the recursive set operation has set the attributes of all subdirectories and files. RECURSIVE_SET is either synchronous or asynchronous. If the client chooses synchronous RECURSIVE_SET, the server must respond to the client once it finishes the operation. If the server fails to complete the attribute set within the timeout, it responds with the error code NFS4ERR_PENDING together with the recursive task id and verifier. The client then queries the result periodically until the operation is completed on the server side. The recommended timeout for the synchronous operation is one third of the lease time.

If the client chooses asynchronous mode, the server immediately returns the error code NFS4ERR_PENDING with the recursive task id and verifier, and the client starts an observer task to wait for the server's notification. The server sends a callback operation to the client once it finishes the RECURSIVE_SET operation. The client terminates the observer task once it receives the callback notification from the server.

Compared to the original iterative process, the proposed process not only saves CPU and memory on the client, but also significantly reduces the number of RPCs exchanged between the client and server. This greatly improves the performance of setting attributes in subdirectories and files.
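
The server behavior in Steps 2 and 3 can be summarized by the following non-normative Python sketch. It only models which status the server places in its first reply to RECURSIVE_SET. The function names and the task_seconds parameter (a stand-in for how long the task actually takes) are assumptions made for illustration, and status codes are written as strings rather than protocol values.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOTDIR = "NFS4ERR_NOTDIR"
NFS4ERR_PENDING = "NFS4ERR_PENDING"

def sync_timeout(lease_seconds):
    """Recommended synchronous timeout: one third of the lease time."""
    return lease_seconds / 3

def initial_reply(is_directory, sync, task_seconds, lease_seconds):
    """Status carried in the server's first reply to RECURSIVE_SET."""
    if not is_directory:
        return NFS4ERR_NOTDIR       # current filehandle is not a directory
    if not sync:
        return NFS4ERR_PENDING      # asynchronous mode: reply immediately
    if task_seconds <= sync_timeout(lease_seconds):
        return NFS4_OK              # finished within the timeout
    return NFS4ERR_PENDING          # still running: the client polls for status

print(initial_reply(True, True, task_seconds=10, lease_seconds=90))
```

In both NFS4ERR_PENDING cases the reply also carries the task id and verifier, which this sketch leaves out.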

o If no backchannel is created when the client and server establish a connection, the client can only use the synchronous mode in the RECURSIVE_SET request. If the client uses the asynchronous mode, the server returns the error code NFS4ERR_CB_PATH_DOWN.

o If a backchannel is already established, the client can choose either synchronous or asynchronous mode.

Server reboot

When the server reboots, the client will get NFS4ERR_BADSESSION. The client SHOULD retry the RECURSIVE_SET operation after re-establishing the client ID and completing the RECLAIM_COMPLETE procedure.

Client reboot

If the client sends the RECURSIVE_SET operation and later there is a network disruption between the client and server, the client lease may expire. After the lease expiration the server will terminate the RECURSIVE_SET operation, which might result in partially modified files/directories under the parent directory on which the RECURSIVE_SET operation was executed.

Lease Consideration

A RECURSIVE_SET operation is tied to a specific client instance, so if the client lease has expired the server should cancel the RECURSIVE_SET operation. If attributes need to be set on a very large number of files, the server can determine the timeout, but the timeout must be less than the lease time.
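
As a non-normative illustration of tying tasks to the client lease, the Python sketch below drops every task whose owning client has an expired lease. The two dictionaries and the function name are hypothetical bookkeeping structures, not part of the protocol; a real server would also have to stop the work in progress.

```python
def cancel_expired_tasks(tasks, lease_expiry, now):
    """Remove tasks whose client lease has expired and return their ids.

    tasks:        task id -> client id (hypothetical server-side table)
    lease_expiry: client id -> time at which the lease expires
    """
    expired = [task_id for task_id, client_id in tasks.items()
               if lease_expiry.get(client_id, 0) <= now]
    for task_id in expired:
        del tasks[task_id]
    return expired
```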

Backchannel Consideration

Before the client sends the RECURSIVE_SET operation to the server, the client MUST check whether it has a backchannel established with the server. If there is no backchannel, the client MUST use only the synchronous RECURSIVE_SET operation. If there is an existing backchannel, the client can use either the synchronous or the asynchronous RECURSIVE_SET operation. If the server wants to send a callback operation over the backchannel of a session and no backchannel exists for the session, the server cannot establish the backchannel, because only the client can associate connections with the backchannel. If there is no such connection, the server indicates that the session has no backchannel by setting the SEQ4_STATUS_CB_PATH_DOWN_SESSION flag bit in the response to the next SEQUENCE operation from the client. The client then associates a connection with the session (or destroys the session).
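
The client-side check can be sketched as follows (non-normative Python). The flag value is the one assigned to SEQ4_STATUS_CB_PATH_DOWN_SESSION in the NFSv4.1 specification, and sr_status_flags is the corresponding field of the SEQUENCE reply. The function names and the prefer_async parameter are assumptions made for illustration.

```python
SEQ4_STATUS_CB_PATH_DOWN_SESSION = 0x00000200  # value from the NFSv4.1 XDR

def session_has_backchannel(sr_status_flags):
    """True unless the last SEQUENCE reply reported a missing backchannel."""
    return not (sr_status_flags & SEQ4_STATUS_CB_PATH_DOWN_SESSION)

def choose_rsa_sync(has_backchannel, prefer_async):
    """Return the value to place in rsa_sync (True means synchronous)."""
    if not has_backchannel:
        return True              # no backchannel: synchronous mode only
    return not prefer_async      # otherwise either mode is allowed

flags = SEQ4_STATUS_CB_PATH_DOWN_SESSION
print(choose_rsa_sync(session_has_backchannel(flags), prefer_async=True))  # True
```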

Grace Consideration

The RECURSIVE_SET operation must honor the server grace period. During the grace period, the server should return NFS4ERR_GRACE to the client, and the client should retry the request until the grace period is over.
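
A client-side retry loop for the grace period might look like the following non-normative Python sketch. The send callable is a hypothetical stand-in for issuing the compound request and returning its status as a string. The fixed delay and the max_attempts bound are assumptions added so that the sketch always terminates; the text above only requires retrying until the grace period is over.

```python
import time

def send_with_grace_retry(send, delay_seconds=1.0, max_attempts=10,
                          sleep=time.sleep):
    """Call send() until it returns something other than NFS4ERR_GRACE."""
    for _ in range(max_attempts):
        status = send()
        if status != "NFS4ERR_GRACE":
            return status
        sleep(delay_seconds)     # wait before retrying the request
    return "NFS4ERR_GRACE"       # grace period outlasted the retry budget
```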

Position Consideration

The RECURSIVE_SET operation MUST NOT be the first operation of the compound request, and a compound request containing the RECURSIVE_SET operation should always have SEQUENCE as its first operation.
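
The ordering rules can be checked mechanically. The non-normative Python sketch below takes the operation names of a compound request and lists the rules it violates. The function name and message strings are illustrative assumptions; the third check reflects the statement in Section 4.1 that RECURSIVE_SET should be placed before SETATTR.

```python
def check_compound(ops):
    """Return the ordering rules violated by a list of operation names."""
    problems = []
    if "RECURSIVE_SET" not in ops:
        return problems                      # rules apply only to RECURSIVE_SET
    if ops[0] != "SEQUENCE":
        problems.append("SEQUENCE must be the first operation")
    index = ops.index("RECURSIVE_SET")
    if index == 0:
        problems.append("RECURSIVE_SET must not be the first operation")
    if "SETATTR" not in ops[index + 1:]:
        problems.append("RECURSIVE_SET must precede SETATTR")
    return problems

print(check_compound(["SEQUENCE", "PUTFH", "RECURSIVE_SET", "SETATTR"]))  # []
```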

Note to RFC Editor: this section may be removed on publication as an RFC.

3. Implementation Considerations

A recommended Recursive Set operation in synchronous mode is shown in Figure 2.

Step 1: The client sends a RECURSIVE_SET request. In the request, rsa_sync must be set to true.

Step 2: If the storage server finishes recursively setting the attributes within the timeout period, it returns the result to the client. If the attributes are not set within the timeout period, the server must generate rsr_callback_id and rsr_recursiveverf and return them to the client. In addition, the server must respond to the client with NFS4ERR_PENDING.

Step 3: The client sends a RECURSIVE_SET_STATUS query request. The request carries rssa_recursive_taskid, which should be set to the rsr_callback_id obtained from the response to the RECURSIVE_SET operation. If the value of rssa_recursive_taskid is the same as the value of rsr_callback_id cached on the storage server, the storage server returns the current status of the attribute set operation: NFS4_OK if the server has set all the attributes, or NFS4ERR_PENDING if the operation is still in progress. If the server encountered an error during the attribute setting, the result code must be cached and must be set in the response. If the value of rssa_recursive_taskid in the request is different from the value cached on the server, the storage server returns the error code NFS4ERR_INVAL.

Step 4: The client decodes the response. If the response is NFS4ERR_PENDING, the client retries the RECURSIVE_SET_STATUS query after a delay period. If the status returned by the server is NFS4_OK, the recursive attribute setting is successful. If the SETATTR operation encountered an error, the recursive attribute setting fails. In this case, the client returns a response to the application.

                 Client                                                     Server
                 +                                                             +
                 |                                                             |
                 |------ RECURSIVE_SET(rsa_sync = 1) ------------------------> |
                 |                                                             |
                 |<-----Response(rsr_callback_id = 0, rsr_recursiveverf = 0)---|  within the timeout period
                 |                                                             |
                 |                                                             |
                 |<----Response(rsr_callback_id = 1, rsr_recursiveverf = 1)----|  beyond the timeout period
                 |                                                             |
                 |                                                             |
                 |                                                             |
                 |-------RECURSIVE_SET_STATUS(rssa_recursive_taskid = 1)-----> |
                 |                                                             |
                 |<------Response--------------------------------------------- |
                 |                                                             |
                 |                                                             |

                           Figure 2:  A synchronous Recursive Set
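
The client side of the synchronous flow in Figure 2 can be sketched as follows (non-normative Python). The two callables are hypothetical stand-ins for sending the RECURSIVE_SET compound and the RECURSIVE_SET_STATUS query; status codes are written as strings, and the fixed polling delay is an assumption.

```python
import time

NFS4ERR_PENDING = "NFS4ERR_PENDING"

def run_sync(recursive_set, recursive_set_status, delay=1.0, sleep=time.sleep):
    """Issue RECURSIVE_SET, then poll until the task leaves the pending state.

    recursive_set()               -> (status, task_id, verifier)
    recursive_set_status(task_id) -> status
    """
    status, task_id, _verifier = recursive_set()
    while status == NFS4ERR_PENDING:
        sleep(delay)                          # wait before querying again
        status = recursive_set_status(task_id)
    return status   # NFS4_OK, or the error cached by the server
```

The loop ends as soon as the server reports NFS4_OK or an error, and that final status is what the client hands back to the application.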

An alternative Recursive Set operation in asynchronous mode is also given in Figure 3.

Step 1: The client sends a RECURSIVE_SET request. In the request, rsa_sync flag should be set to false.

Step 2: The storage server generates rsr_callback_id and rsr_recursiveverf and sets the error code to NFS4ERR_PENDING. The storage server continues executing the recursive setting operation.

Step 3: After receiving the response, if the error code is NFS4ERR_PENDING, the client starts an asynchronous task to receive the callback message from the server.

Step 4: When a callback message arrives, the asynchronous listening task compares the rsr_callback_id and rsr_recursiveverf it carries with the values from the RECURSIVE_SET response. If both parameters match, the message is a valid response. If rsr_callback_id matches but rsr_recursiveverf does not, the client skips the message.

Step 5: If the client does not receive the asynchronous message, the started task is forcibly terminated when the session is destroyed.

If an error occurs while the storage server recursively sets the attributes of subdirectories and files, the storage server terminates the task and returns the error code to the client. All possible errors are among the error codes defined for SETATTR.

                 Client                                                     Server
                 +                                                             +
                 |                                                             |
                 |------ RECURSIVE_SET(rsa_sync = 0) ------------------------->|
                 |                                                             |
                 |<------Response(rsr_callback_id = 1, rsr_recursiveverf = 1)--|
                 |                                                             |
                 |                                                             |
                 |<------CB_RECURSIVE_SET_NOTIFY-------------------------------|
                 |                                                             |
                 |                                                             |
                 |                                                             |

                            Figure 3: An asynchronous Recursive Set
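
The matching performed by the client in Steps 3 to 5 can be sketched as follows (non-normative Python). The class and method names are hypothetical; the sketch only models the bookkeeping of callback ids and verifiers, not the transport.

```python
class PendingTasks:
    """Client-side table of asynchronous recursive set tasks."""

    def __init__(self):
        self._waiting = {}                    # callback id -> verifier

    def on_pending(self, callback_id, verifier):
        """Record the ids returned with NFS4ERR_PENDING (Step 3)."""
        self._waiting[callback_id] = verifier

    def on_notify(self, callback_id, verifier, status):
        """Return the task status for a valid callback, else None (Step 4)."""
        expected = self._waiting.get(callback_id)
        if expected is None or expected != verifier:
            return None                       # unknown id or wrong verifier: skip
        del self._waiting[callback_id]
        return status

    def on_session_destroyed(self):
        """Drop all waiting tasks when the session goes away (Step 5)."""
        self._waiting.clear()
```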

4. Recursive Set Operations

4.1 Operation TBD1: RECURSIVE_SET – Recursively sets the attributes of a directory and its subdirectories and files.

ARGUMENT

<CODE BEGINS>

struct RECURSIVE_SET4args {
        bool    rsa_sync;
};

<CODE ENDS>

RESULT

<CODE BEGINS>

struct recursive_set_response4 {
        recursive_taskid4       rsr_callback_id;
        verifier4               rsr_recursiveverf;
};

union RECURSIVE_SET4res switch (nfsstat4 rsr_status) {
case NFS4_OK:
        recursive_set_response4 rsr_resok4;
default:
        void;
};

<CODE ENDS>

DESCRIPTION

The RECURSIVE_SET operation is used by the client to recursively set the attributes of a directory and all its subdirectories and files. The operation should be placed before SETATTR in the compound request. When the storage server receives a compound request containing SETATTR, if the SETATTR operation is not preceded by RECURSIVE_SET, the original processing remains unchanged. If the SETATTR operation is preceded by the RECURSIVE_SET operation, the storage server applies the attributes to the directory and to its subdirectories and files in recursive set mode.

If the server completes the recursive set successfully, the values of rsr_callback_id and rsr_recursiveverf are 0.

If the recursive SETATTR processing on the server does not complete within the timeout period, the server generates values for rsr_callback_id and rsr_recursiveverf.

If rsa_sync is set to true, the client can choose one of the implementations below.

o According to the NFSv4 protocol, the client must wait for the response from the server, so the client can simply wait for the server's processing result. A problem with this approach is that the outstanding request occupies a slot in the session, which decreases the number of available slots. If multiple tasks of this type are executing, in extreme cases no slot is available to the client.

o The storage server decides how long to wait. If execution takes too long, the storage server may return non-zero values of rsr_callback_id and rsr_recursiveverf.

After the client receives such a response, it waits for a period of time and then sends RECURSIVE_SET_STATUS to query the progress of the current task. If the server has not finished the execution, NFS4ERR_PENDING is returned. After receiving this error code, the client retries the query after a period of time. If the execution is complete, NFS4_OK is returned.

4.2 Operation TBD2: RECURSIVE_SET_STATUS – Query the result of recursively setting the attributes of subdirectories and files

ARGUMENT

<CODE BEGINS>

struct RECURSIVE_SET_STATUS4args {
        stateid4        rssa_recursive_taskid;
};

<CODE ENDS>

RESULT

<CODE BEGINS>

#define NFS4ERR_PENDING 10090

struct RECURSIVE_SET_STATUS4res {
        nfsstat4        rssr_status;
};

<CODE ENDS>

DESCRIPTION

rssa_recursive_taskid has the same value as rsr_callback_id in the RECURSIVE_SET response. The RECURSIVE_SET_STATUS operation is used by the client to query the status of a recursive set task (setting the attributes of subdirectories and files). The server must check whether rssa_recursive_taskid matches a task id on the server. If it matches and the task is complete, NFS4_OK is returned. If an error occurred during task execution, that error code is returned; the error code is not extended or modified, so it is one of the error codes that can occur during the SETATTR operation. If the task is not yet complete, NFS4ERR_PENDING is returned.
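
A server-side lookup consistent with this description could be sketched as follows (non-normative Python). The TaskTable class is a hypothetical in-memory structure, and status codes are written as strings. Returning NFS4ERR_INVAL for an unknown task id follows Step 3 of the synchronous flow in Section 3.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_PENDING = "NFS4ERR_PENDING"
NFS4ERR_INVAL = "NFS4ERR_INVAL"

class TaskTable:
    """Maps a task id to None while running, or to its cached final status."""

    def __init__(self):
        self._tasks = {}

    def start(self, task_id):
        self._tasks[task_id] = None

    def finish(self, task_id, status=NFS4_OK):
        self._tasks[task_id] = status         # cache the result or SETATTR error

    def status(self, task_id):
        """Status to return for RECURSIVE_SET_STATUS."""
        if task_id not in self._tasks:
            return NFS4ERR_INVAL              # id does not match any task
        final = self._tasks[task_id]
        return NFS4ERR_PENDING if final is None else final
```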

4.3 Operation TBD3: RECURSIVE_SET_CANCEL – Canceling a Running Task on the Client

ARGUMENT

<CODE BEGINS>

/*
 * The following operation is used to cancel the recursive
 * setting task that is being executed.
 */
struct RECURSIVE_SET_CANCEL4args {
        stateid4        rsca_recursive_taskid;
};

<CODE ENDS>

RESULT

<CODE BEGINS>

struct RECURSIVE_SET_CANCEL4res {
        nfsstat4        rscr_status;
};

<CODE ENDS>

DESCRIPTION

RECURSIVE_SET_CANCEL is used to cancel a task that is being executed. The request contains rsca_recursive_taskid, whose value is obtained from the response to RECURSIVE_SET. If the storage server fails to cancel the task, NFS4ERR_DELAY is returned; on receiving it, the client retries after a delay. If the current task is complete, NFS4_OK is returned.

4.4 Operation TBD4: CB_RECURSIVE_SET_NOTIFY – Notify the client of the recursive set result

ARGUMENT

<CODE BEGINS>

struct CB_RECURSIVE_SET_NOTIFY4args {
        nfs_fh4         crsna_fh;
        stateid4        crsna_recursive_taskid;
        verifier4       crsna_recursiveverf;
        nfsstat4        crsna_status;
};

<CODE ENDS>

RESULT

<CODE BEGINS>

struct CB_RECURSIVE_SET_NOTIFY4res {
        nfsstat4        crsnr_status;
};

<CODE ENDS>

DESCRIPTION

CB_RECURSIVE_SET_NOTIFY is the callback that the server sends to notify the client of the result of a task that recursively sets the attributes of subdirectories and files. The client checks crsna_recursive_taskid and crsna_recursiveverf. If they match the values received in the earlier RECURSIVE_SET response, the client finishes the waiting task. If they do not match, the client skips the notification and returns NFS4ERR_INVAL to the server.
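
For illustration, the non-normative Python sketch below serializes CB_RECURSIVE_SET_NOTIFY4args using the standard XDR rules. It assumes the NFSv4 definitions of the field types: nfs_fh4 is a variable-length opaque of at most 128 bytes, stateid4 is a 32-bit seqid followed by 12 opaque bytes, verifier4 is 8 opaque bytes, and nfsstat4 is a 32-bit value. The function names are hypothetical.

```python
import struct

def xdr_opaque_var(data):
    """Variable-length opaque: 4-byte length, data, zero padding to 4 bytes."""
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

def encode_cb_notify_args(fh, seqid, other, verifier, status):
    """Encode crsna_fh, crsna_recursive_taskid, crsna_recursiveverf, crsna_status."""
    if len(fh) > 128:
        raise ValueError("filehandle longer than NFS4_FHSIZE")
    if len(other) != 12 or len(verifier) != 8:
        raise ValueError("stateid4.other is 12 bytes, verifier4 is 8 bytes")
    return (xdr_opaque_var(fh)                   # nfs_fh4
            + struct.pack(">I", seqid) + other   # stateid4
            + verifier                           # verifier4 (fixed-length opaque)
            + struct.pack(">I", status))         # nfsstat4

encoded = encode_cb_notify_args(b"\x01\x02\x03", 1, b"\x00" * 12, b"\x00" * 8, 0)
print(len(encoded))  # 36
```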

Race condition between CB_RECURSIVE_SET_NOTIFY and RECURSIVE_SET_STATUS: a race can occur if a RECURSIVE_SET_STATUS request is in flight when the server sends CB_RECURSIVE_SET_NOTIFY. In this case the server has already cleaned up the recursive task id before it receives the RECURSIVE_SET_STATUS request. The server may return NFS4ERR_INVAL, and the client should handle this gracefully.
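
One way for a client to handle this race gracefully is sketched below (non-normative Python). The function and parameter names are hypothetical. The idea is to prefer the result already delivered by the callback when a late RECURSIVE_SET_STATUS reply reports NFS4ERR_INVAL.

```python
def resolve_status_reply(status_reply, notified_result):
    """Combine a RECURSIVE_SET_STATUS reply with an earlier callback result.

    notified_result is the status delivered by CB_RECURSIVE_SET_NOTIFY,
    or None if no callback has been received for this task.
    """
    if status_reply == "NFS4ERR_INVAL" and notified_result is not None:
        return notified_result   # task already cleaned up: trust the callback
    return status_reply
```

If no callback has been received, NFS4ERR_INVAL is passed through unchanged and indicates a task id that does not match.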

5. Security Considerations

TBD

6. IANA Considerations

TBD

7. References

7.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC7862]
Haynes, T., "Network File System (NFS) Version 4 Minor Version 2 Protocol", RFC 7862, DOI 10.17487/RFC7862, November 2016, <https://www.rfc-editor.org/info/rfc7862>.

Authors' Addresses

Minqian Zhang
Huawei Technologies
1899 Xiyuan
Chengdu
High-tech West District, 611731
China
Sunil Kumar Bhargo
VMware
Rijesh Kunhi Parambattu
Huawei Technologies
Dongyu Geng
Huawei Technologies
Yunfei Du
Huawei Technologies