Distributed File System

Processes

The following processes are used by Dfs:

Maintaining the PKT
Caching referrals by clients
Gaining access to a Dfs shared folder
Linking logical names to physical addresses
Replicating shared folders
Switching between replicas during failover
Establishing security

Maintaining the Partition Knowledge Table (PKT)

The Dfs topology is stored in the server-based PKT. When users access Dfs roots and links, the client computer caches that portion of the PKT and connects to one of the servers in the referral list.

The PKT maps the logical Dfs namespace into physical referrals, as shown in Table 17.3. (Replicas appear as a list for a single Dfs link.)

Table 17.3 PKT Location Mapping

Dfs path	Link [server and share]	Time-To-Live
DFS name #1	UNC name #1	5 minutes (default)
	UNC name #2	5 minutes
	UNC name #3	5 minutes
DFS name #2	UNC name #4	5 minutes
	UNC name #5	5 minutes
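
The shape of Table 17.3 can be pictured as a lookup structure in which each Dfs path carries a list of replicas and a TTL. The following is only an illustrative sketch: the class name, the function name, and the sample server and share names are hypothetical and do not describe the actual PKT storage format.

```python
from dataclasses import dataclass, field

DEFAULT_TTL_SECONDS = 5 * 60  # the 5-minute default shown in Table 17.3


@dataclass
class PktEntry:
    """One Dfs path and the physical targets (replicas) it refers to."""
    dfs_path: str
    targets: list[str] = field(default_factory=list)  # UNC names, one per replica
    ttl_seconds: int = DEFAULT_TTL_SECONDS


# Hypothetical contents that mirror the layout of Table 17.3.
pkt = {
    entry.dfs_path: entry
    for entry in [
        PktEntry(r"\\Domain\Root\Link1",
                 [r"\\Srv1\ShareA", r"\\Srv2\ShareA", r"\\Srv3\ShareA"]),
        PktEntry(r"\\Domain\Root\Link2",
                 [r"\\Srv4\ShareB", r"\\Srv5\ShareB"]),
    ]
}


def referrals_for(dfs_path: str) -> list[str]:
    """Return every replica listed for a Dfs path, or an empty list."""
    entry = pkt.get(dfs_path)
    return list(entry.targets) if entry else []
```

Calling referrals_for with a path that has replicas returns the whole list, which matches the way replicas appear as a list for a single Dfs link.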

A flag in the PKT indicates whether the shared folder is hosted on an earlier version of Windows NT, on Novell NetWare, on network file system (NFS), or on Windows 98–based computers.

The PKT also stores site information, which is used to connect users to Dfs roots and links in the same site. Windows 2000–based computers that access Windows 2000 domain-based Dfs roots and links give preference to servers in the same Active Directory site when such servers exist.

All other combinations of clients and servers do not provide site awareness but still gain load balancing for Dfs links.

Note

The site information is stored in the PKT when the Dfs configuration is created. If you move the Dfs server to a different site, you must redefine the Dfs configuration. This is important if you are prestaging servers from a central location.

The PKT is a sorted lookup table that requires about 400 bytes per entry. One PKT resides in Active Directory for each Dfs root in a domain-based Dfs. The PKT for a stand-alone Dfs resides locally in the registry.

Caching Referrals by Clients

Clients that have access to the Dfs namespace cache portions of the server-based PKT locally to improve performance. When a user traverses a Dfs link in the namespace, the client receives a referral from the appropriate Dfs server and then adds a PKT entry to its local cache. When the client needs to revisit that portion of the Dfs namespace, it uses the mapping from its locally cached PKT.

When the Dfs client attempts to navigate a Dfs link, it first looks to its locally cached PKT entries. If the referral cannot be resolved, the client contacts the Dfs root for an updated PKT entry and resets the TTL. If the referral still cannot be resolved, an error occurs. If the referral is properly resolved, the client adds the referral to its local table of entries.

When a Dfs client obtains a referral from the PKT, the referral is cached for a period of time defined by the TTL parameter on the Dfs server. If the client reuses that referral, the TTL is renewed; otherwise, the cache expires. If a replica set exists for a particular referral, all replicas are sent to and cached by the client. The client then randomly selects which referral to use.
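
The TTL behavior described above (cache on first use, renew on reuse, expire otherwise) can be sketched as a small client-side cache. This is a simplified illustration, not the actual Dfs client: the class and method names are hypothetical, and the injectable clock exists only to make the behavior easy to observe.

```python
import time


class ReferralCache:
    """Client-side cache of referrals; an illustrative sketch only."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # dfs_path -> (targets, ttl_seconds, expires_at)

    def store(self, dfs_path, targets, ttl_seconds):
        """Cache all replicas returned by the server for one Dfs path."""
        expires_at = self._clock() + ttl_seconds
        self._entries[dfs_path] = (list(targets), ttl_seconds, expires_at)

    def lookup(self, dfs_path):
        """Return cached targets and renew the TTL, or None if absent or expired."""
        item = self._entries.get(dfs_path)
        if item is None:
            return None
        targets, ttl_seconds, expires_at = item
        now = self._clock()
        if now >= expires_at:
            # Expired: the caller must go back to the Dfs server.
            del self._entries[dfs_path]
            return None
        # Reusing a referral renews its TTL.
        self._entries[dfs_path] = (targets, ttl_seconds, now + ttl_seconds)
        return list(targets)
```

Because every successful lookup pushes the expiry forward, a link that is used continuously never returns to the server, which is why a large TTL can hide changes to a link.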

In Windows 2000, the TTL interval is assigned on a per-link basis. If the physical location of the underlying shared folder (or shared folders, if there are replicas) is fairly dynamic, you would want to set TTL for the Dfs link to a smaller value. This would cause the client to go back to the server for a fresh copy of the referral more frequently. Similarly, if the physical location of the underlying shared folder is static, a larger TTL value can be used. If you set the TTL value too large and the client accesses the Dfs link before TTL expiration, the client will not receive a new referral to learn about changes to the link.

Suppose, for example, that you had a Dfs link called \\Company\Sales\Contracts\Today, which contained the set of contracts that were created on the current day. This link refers to a physical folder on the \\Sales\Contracts share that corresponds to the current day. So today the Dfs link might refer to \\Sales\Contracts\1999\1231, but tomorrow the link would be modified to refer to \\Sales\Contracts\2000\0101, and so on. Users will always refer to \\Company\Sales\Contracts\Today to get to the current day's folder and do not need to be concerned with the underlying mapping. However, you would have to set the TTL value for this Dfs link to a short enough value to make sure the client goes back to the server to get the updated referrals each day.

Checking Referrals

Windows 2000–based clients contain a shell extension to Windows Explorer that you can use to do the following:

See all the referrals for a Dfs link.
Select a referral for a Dfs link.
Refresh the referral cache for a Dfs link.

For more information about the Dfs tab provided by the shell extension, see "Tracking Shared Folders" later in this chapter.


Gaining Access to a Dfs Shared Folder

A shared file or folder in a Dfs namespace is accessed in exactly the same way that a Windows NT 4.0, Windows 95, or Windows 98 client accesses any UNC path. This implies that wherever a physical UNC path can be used, you can use a Dfs name that refers to an object in the logical Dfs namespace.

This includes the ability to specify a point in a Dfs namespace as the share that corresponds to a Dfs link in another Dfs namespace. This is how you can build up more complex hierarchies of Dfs namespaces from existing sets of Dfs namespaces.

Access to a domain-based Dfs is achieved through either of the following conventions, using the shell or the net use command:

\\Domain_name\Dfs_root

\\Server_name\Dfs_root\

A Dfs client on Windows 2000 and Windows NT 4.0 can also enter a net use command to gain access to any point in the Dfs namespace; this is sometimes referred to as a deep net use.

NET USE * \\Domain_name\Dfs_root\Dfs_path\Shared_folder

NET USE * \\Server_name\Dfs_root\Dfs_path\Shared_folder

Because a domain-based Dfs root is hosted in a Windows 2000 domain, it is accessible by way of the domain name. This removes the burden of the user having to know the physical location of the share; now he or she has to traverse only the logical namespace or namespaces that exist in the domain. The second convention is also supported. A user can use this convention to gain access to a Dfs namespace by specifying one of the servers that hosts the domain-based Dfs root. In this case, that specific server is always used for referrals. In Windows 2000, both the domain name and server name can be specified as either a DNS name or a NetBIOS name.

Older Dfs-aware clients (Windows NT 4.0, Windows 95, and Windows 98) cannot connect to a domain-based Dfs root by its domain name until they are upgraded with an appropriate service pack. They can, however, connect to individual Dfs root servers that participate in a domain-based Dfs by using the second naming convention:

\\Server_name\Dfs_root\Dfs_path\Shared_folder

Gaining access to a stand-alone Dfs is always through the following convention:

\\Server_name\Dfs_root\Path\File

Linking Logical Names to Physical Addresses

When a client specifies the logical name of a shared folder, the referral process provides its physical address. If a Dfs link to another server is encountered, the process is the same. However, in this case, it is important to note that the referral process expressly searches for the longest referral — the one with the most backslashes (\) — that can be resolved from the requested path. This ensures that with a single referral, the final destination has been resolved.

For example, in Figure 17.1, Dfs_link represents a link from Server1 to Server2. Because Dfs can resolve only \\Server1\Dfs_root\ locally, it fetches the longest path from the PKT: \\Server1\Dfs_root\Dfs_link\. However, because Dfs_link is linked to a second server and share (\\Server2\Share\), that server and share are substituted for the referral. In other words, when a client requests access to \\Server1\Dfs_root\Dfs_link\Share\File, Dfs returns the longest path known from PKT knowledge. It first looks at the local cache, then asks the root server, and finally consults Active Directory. In this example, the referral for \\Server1\Dfs_root\Dfs_link would map to the other server and share. Thus, the referral returns a physical address of \\Server2\Share\File.


Figure 17.1 Referral Process Across a Dfs Link
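
The longest-referral rule can be sketched as a longest-prefix match over the known Dfs paths, followed by substitution of the physical server and share for the matched prefix. The function name, the dictionary layout, the case-insensitive comparison, and the sample paths (\\Server1\RootShare and the Reports folder) are assumptions made for illustration. They are not taken from the Dfs implementation.

```python
def resolve(requested_path, links):
    """Map a logical Dfs path to a physical path using the longest matching entry.

    `links` maps a logical Dfs path to the server and share that hosts it.
    Returns None when no entry covers the requested path.
    """
    parts = requested_path.rstrip("\\").split("\\")
    # Assumption: paths are compared without regard to case.
    normalized = {key.rstrip("\\").lower(): value for key, value in links.items()}

    # Try the full path first, then drop one trailing component at a time,
    # so the entry with the most backslashes wins.
    for end in range(len(parts), 0, -1):
        prefix = "\\".join(parts[:end]).lower()
        target = normalized.get(prefix)
        if target is not None:
            # Substitute the physical server and share for the matched prefix.
            return "\\".join([target.rstrip("\\"), *parts[end:]])
    return None


# Hypothetical entries: the root itself, and a link that points to Server2.
links = {
    r"\\Server1\Dfs_root": r"\\Server1\RootShare",
    r"\\Server1\Dfs_root\Dfs_link": r"\\Server2\Share",
}
physical = resolve(r"\\Server1\Dfs_root\Dfs_link\Reports\Q4.doc", links)
```

Because the entry for the link is longer than the entry for the root, a single lookup lands on the final server and share rather than on the root and then on the link.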

The Dfs-aware redirector, SMB Services, and Dfs driver collaborate to reroute path-based operations to the server and share hosting the file or directory. For more information about the application programming interfaces (APIs) that provide this functionality, see the Microsoft Platform SDK link on the Web Resources page at http://windows.microsoft.com/windows2000/reskit/webresources.

Switching Between Replicas During Failover

Referrals are cached locally to maintain performance, and if replicas are available, all replicas are provided to the Dfs client. The client selects a referral at random, although it gives preference to replicas within the same site as the client.
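
One way to picture random selection with same-site preference is to shuffle the replicas and then place those in the client's site ahead of the rest. This is a sketch under stated assumptions: the function name, the (path, site) pairing, and the site names are hypothetical, and the actual client algorithm is not documented here.

```python
import random


def order_referrals(replicas, client_site, rng=random):
    """Return replica paths in random order, with same-site replicas first.

    `replicas` is a list of (unc_path, site_name) pairs.
    `rng` is any object with a shuffle() method (the random module by default).
    """
    same_site = [path for path, site in replicas if site == client_site]
    other_sites = [path for path, site in replicas if site != client_site]
    rng.shuffle(same_site)    # random choice among local replicas
    rng.shuffle(other_sites)  # random choice among remote replicas
    return same_site + other_sites
```

The client would try the first path in the returned list and keep the remainder as failover candidates.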

After a referral is selected from the replicas, a session setup is performed (credentials are passed to the new server if a prior connection does not exist). If the selected referral fails, a failover process begins. The speed and implications of the failover depend on what the client was doing at the time of the failure, how the failure occurred, and how tolerant of delays an application is.

Scenario 1

A client is browsing through a replicated folder. The computer hosting the replica loses power or drops off the network for some reason. In order to fail over, the client must first detect that the hosting computer is no longer present. How long this takes depends on what protocol the client is using. Many protocols, such as TCP/IP, account for slow and loosely connected WAN links and, as such, might retry for up to two minutes before the protocol itself times out. After that occurs, Dfs immediately selects a new replica. If none are available from the local cache, the Dfs client consults the Dfs root to see whether the administrator has modified any PKT entries. If no replicas are available at the root, a failure occurs; otherwise, Dfs initiates a fresh replica selection and session setup.
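
The sequence in this scenario (try the cached replicas, then ask the Dfs root, then fail) can be sketched as follows. The function and its two callbacks are hypothetical stand-ins: try_connect represents a session setup that may return only after a protocol timeout, and ask_root represents a request to the Dfs root for the current replica list. Skipping replicas that have already failed is a simplification chosen for this sketch.

```python
def connect_with_failover(dfs_path, cached_replicas, try_connect, ask_root):
    """Return the first replica that accepts a session, or raise if none does.

    try_connect(replica) -> bool  (hypothetical; False may follow a timeout)
    ask_root(dfs_path) -> list    (hypothetical; replicas known to the Dfs root)
    """
    tried = set()

    # Step 1: work through the replicas already held in the local cache.
    for replica in cached_replicas:
        tried.add(replica)
        if try_connect(replica):
            return replica

    # Step 2: the cache is exhausted, so ask the Dfs root whether the
    # administrator has modified the PKT entry.
    for replica in ask_root(dfs_path):
        if replica in tried:
            continue  # simplification: do not retry a replica that just failed
        if try_connect(replica):
            return replica

    # Step 3: nothing in the cache or at the root is reachable.
    raise ConnectionError(f"no replica available for {dfs_path}")
```

In Scenario 1 most of the delay would come from each failed try_connect call waiting for the protocol to time out, not from the selection logic itself.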

Scenario 2

A client is browsing through a replicated folder. The computer hosting the replica loses the hard disk containing the replica, or the replica itself is deactivated. In this scenario, because the server hosting the replica is still responding to the client request, the failover to a fresh replica is nearly instantaneous.

Scenario 3

A client has open files. The computer hosting the replica loses power or drops off the network for some reason. In this scenario, you have the same protocol failover process described in Scenario 1. In addition, failover depends on the application that held file locks on the previous replica detecting the change and establishing new locks.

New attempts to open files trigger the same failover process that is described in Scenario 1. Operations on already open files fail with appropriate errors.

Scenario 4

A client has open files. The computer hosting the replica loses the hard disk containing the replica, or the replica itself is deactivated. In this scenario, you have the same rapid failover process that is described in Scenario 2. In addition, failover depends on the application that held file handles on the previous replica detecting the change and establishing new handles.

Replicating Files

The load balancing and fault tolerance of Dfs make it well suited for software distribution shares, Web content, and internal documentation. Administrators can optionally enable automatic replication of files and folders between Windows 2000–based computers by using the Replication Policy command in the Dfs Administrative console. The replication policy can be different for each Dfs root and link in the Dfs namespace.

Replication of Dfs content is performed by the File Replication Service (FRS), which provides multimaster updates. For more information about file replication, see "File Replication Service" in this book.

Establishing Security

As each Dfs link is crossed and cached for the first time, the Dfs-aware client establishes a session setup with the server on the other side of the link. The credentials the user originally used to connect with Dfs are used (for example, net use * \\Server\Dfs_Share /u:domain\user). If the user did not supply credentials, the credentials that are cached when the user logged on to his or her workstation are used.

ACLs

File access control lists (ACLs) are administered at each individual shared folder. There is no mechanism to administer ACLs systemwide from the Dfs root, nor is there an attempt to keep ACLs consistent between replicas. Several reasons account for this:

A centrally administered logical ACL database can be bypassed because users can issue the net use command directly to the physical resource.
The logical Dfs root can cross between FAT and NTFS volumes, as well as contain shares from other network operating systems. There is no reasonable way to set an inherited Deny ACL that starts on an NTFS volume, passes to FAT, passes back to NTFS, and concludes on a NetWare share.
A tool that searches the logical namespace and sets ACLs appropriately would require a complicated message and transaction engine to ensure that the ACLs would be queued and updated over loosely connected or unreliable networks.
Storage quotas available in Windows 2000 would require an additional burden of tallying storage for all possible users across all possible volumes to establish when users have exceeded their storage allotment.

Replicating Permissions

FRS replicates changes to file permissions on NTFS. If you change one replica's ACLs, they also change for each member of the replica set. If FRS is not being used to replicate shared folders automatically, you must set the permissions on each copy of a shared folder and manually propagate any changes that occur.