Active Directory Data Storage
Active Directory data is stored in the Ntds.dit ESE database file. Two copies of Ntds.dit are present in separate locations on a given domain controller:
Some interobject references in the directory require back-references for either usability or administrative purposes. For example, if managedBy is an object attribute, you can look at ObjectA and determine that ObjectA is managed by ObjectB. Likewise, it is sometimes helpful to be able to look at ObjectB and determine what objects ObjectB manages (the values of the managedObjects attribute). Active Directory maintains referential integrity between objects that reference each other so that when one object is moved in the directory tree, the reference between it and other objects is maintained. This referencing is accomplished through linked attributes.
Two attributes that are linked are marked in the schema as having the same link-pair identifier — one is marked as the forward link and the other as the back link. For reasons that relate to security and replication, only the forward link attribute can be modified. For example, in the managedBy/managedObjects link pair, managedBy is the forward link. Therefore, to adjust the managedObjects attribute on a user object, you must go to the objects that you want to add or remove from the user's managedObjects value and modify the managedBy value on each object. Back-link attributes are computed when they are requested by a user action.
Note
When you extend the schema, you have to know when to make an object a link object. For more information about extending the schema, see "Active Directory Schema" in this book.
To find all of the objects that ObjectB manages, links are examined for all records in which the link pair is managedBy/managedObjects and the back-link attribute identifies ObjectB. The link pairs of those records provide the database identifiers of all the records (objects) that are managed by ObjectB.
The managedBy and managedObjects example uses a single-value forward link and a multivalue back link, respectively, but there is no requirement that the forward link be a single-value link. For example, distribution list membership is implemented as a forward-link and back-link pair in which both attributes are multivalue. The back-linked objects are the objects that store the isMemberOfDl attribute. The forward-link member attribute is a multivalue attribute, which allows a distribution list to have more than one member; the multivalue back link, in turn, allows a user to be a member of more than one distribution list. The back link must always be a multivalue link because it is impossible to restrict who creates links to various objects.
Table 2.8 shows link values for an object (ObjectB) that is the manager of several other objects (ObjectA, ObjectC, and ObjectD). The distribution list (DL1) is an example of an object that has several objects as members.
Table 2.8 Example of Forward-Link and Back-Link Values
Linked object | Back-linked object | Link pair |
---|---|---|
ObjectA | ObjectB | managedBy/managedObjects |
ObjectC | ObjectB | managedBy/managedObjects |
ObjectD | ObjectB | managedBy/managedObjects |
DL1 | ObjectE | member/isMemberOfDl |
DL1 | ObjectF | member/isMemberOfDl |
DL1 | ObjectG | member/isMemberOfDl |
When an object that is linked is deleted, all of its linked attribute values are deleted. In the preceding example, if ObjectA were deleted, the managedObjects multivalue attribute on ObjectB would suddenly (and with no change to any replication-related metadata) lose a value. Similarly, if ObjectB were deleted, the value of the managedBy attribute on ObjectA would suddenly be blank. Nothing about the object changes in either case, except that the attribute value is gone.
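The behavior of linked attributes can be illustrated with a small model. The following Python sketch is not how the directory database is implemented; it only assumes a list of link rows shaped like Table 2.8, and the names Link, forward_values, back_values, and delete_object are hypothetical. It shows that forward-link values are stored, that back-link values are computed on request, and that deleting an object removes its link values on both sides.

```python
from collections import namedtuple

# One row per link value, mirroring the columns of Table 2.8.
Link = namedtuple("Link", ["linked", "back_linked", "pair"])

MANAGED = "managedBy/managedObjects"
MEMBER = "member/isMemberOfDl"

links = [
    Link("ObjectA", "ObjectB", MANAGED),
    Link("ObjectC", "ObjectB", MANAGED),
    Link("ObjectD", "ObjectB", MANAGED),
    Link("DL1", "ObjectE", MEMBER),
    Link("DL1", "ObjectF", MEMBER),
    Link("DL1", "ObjectG", MEMBER),
]


def forward_values(rows, obj, pair):
    """Values of the forward-link attribute held by obj (for example, managedBy)."""
    return [r.back_linked for r in rows if r.linked == obj and r.pair == pair]


def back_values(rows, obj, pair):
    """Back-link values for obj, computed on request from the forward links."""
    return [r.linked for r in rows if r.back_linked == obj and r.pair == pair]


def delete_object(rows, obj):
    """Drop every link row that names obj on either side of the link."""
    return [r for r in rows if obj not in (r.linked, r.back_linked)]
```

In this model, back_values(links, "ObjectB", MANAGED) returns ObjectA, ObjectC, and ObjectD. After links = delete_object(links, "ObjectA"), the same call returns only ObjectC and ObjectD, although nothing was written to ObjectB itself.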
When you request the value of a back link on a particular object (for example, "What objects are managed by ObjectB?"), the system searches for all objects whose corresponding forward link names the original object (that is, "What objects have ObjectB as the value in their managedBy attribute?"). The results of that search and, hence, the apparent contents of the back-link attribute, depend on the LDAP port to which the client is bound; that is, the results can differ, depending on whether the client binds to the local domain (LDAP port 389) or the Global Catalog (LDAP port 3268).
For example, suppose that you are looking at the user object named "JohnDoe." You are interested in discovering the groups in which JohnDoe has memberships. Suppose further that JohnDoe is an object in the child domain B that has a parent domain A. If you bind to the JohnDoe object in domain B and read the memberOf attribute, you receive a list of all group memberships in domain B, including both domain local and global groups; however, you do not see any memberships in groups outside domain B. On the other hand, if you bind to the copy of the JohnDoe object in the Global Catalog and read the memberOf attribute, you see the group memberships in all universal groups in the forest. You do not see any domain local group memberships, however, because local groups are not replicated to the Global Catalog. Thus, to see all of an object's memberships, you must search both the local and Global Catalog copies of the object.
For example, suppose you are interested in learning what the groups are to which JohnDoe has memberships. The system implicitly searches for all objects whose forward links name the object (that is, the group objects that have JohnDoe as a value for the member attribute). Suppose further that JohnDoe is an object in the child domain B that has a parent domain A. When there is more than one domain in a forest, you must take into account the following group behaviors:
In the example, if you bind to the JohnDoe object in domain B and read the memberOf attribute, Active Directory lists all groups in domain B that have JohnDoe as a member, including both local and global groups; however, no groups except for domain B (the domain to which JohnDoe belongs) are visible.
If you bind to the copy of the JohnDoe object in the Global Catalog and read the memberOf attribute, the groups that are listed depend on what domain contains the Global Catalog server, assuming that there is not a Global Catalog server in both domains.
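The dependence of memberOf on the bind port can be modeled in a few lines. The following Python sketch is a simplified illustration, not directory code. It assumes that a domain controller holds the member values of every group in its own domain, and that a Global Catalog server additionally holds the member values of universal groups from other domains. The Group class, the member_of function, and the group names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    domain: str
    scope: str  # "domain_local", "global", or "universal"
    members: set = field(default_factory=set)


def member_of(user, groups, bind, server_domain):
    """Return the group names that appear in memberOf for user.

    bind is "domain" (LDAP port 389) or "gc" (LDAP port 3268);
    server_domain is the domain of the contacted domain controller.
    """
    visible = []
    for group in groups:
        if user not in group.members:
            continue
        in_server_domain = group.domain == server_domain
        if bind == "domain" and in_server_domain:
            visible.append(group.name)
        elif bind == "gc" and (group.scope == "universal" or in_server_domain):
            visible.append(group.name)
    return visible


groups = [
    Group("B-Local", "B", "domain_local", {"JohnDoe"}),
    Group("B-Global", "B", "global", {"JohnDoe"}),
    Group("A-Universal", "A", "universal", {"JohnDoe"}),
    Group("A-Local", "A", "domain_local", {"JohnDoe"}),
]
```

Under these assumptions, a bind to a domain controller in domain B lists B-Local and B-Global, and a bind to a Global Catalog server in domain A lists A-Universal and A-Local. Neither result is complete on its own, which is why both copies of the object must be searched.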
Note
Memberships in domains that are external to the forest are not found in either type of search because they are outside the scope of the forest. These memberships must be discovered by using the respective external cross-reference. (For more information about external cross-references, see "Name Resolution in Active Directory" in this book.)
If you add a member of a trusted domain from a different forest to a group in your domain, Samsrv.dll creates a placeholder object of the class foreignSecurityPrincipal. This object represents the real object, about which Active Directory has no information because the object exists in a different forest. When you list the members of a group, Active Directory usually lists the distinguished names of the group members. For a member that is from an external domain, Active Directory displays the distinguished name of the foreign security principal object in the form of a NetBIOS name. For example, the user JohnD from the domain Acquired.com would appear as JohnD in "acquired" as shown in Figure 2.7.
Figure 2.7 Example of a Members Tab That Displays the Distinguished Name of a Foreign Security Principal
If you open the properties on the foreign group member, an informational message like the one in Figure 2.8 appears. This message explains that the member is not a real object in Active Directory but a placeholder for the object. The object SID is displayed in the title bar of the dialog box.
Figure 2.8 Properties for a Member from an External Domain
You can use the object's SID in an LDAP query to determine the LDAP name of the object. Such a query involves enumerating all trusted domains and then issuing a query on each one for the object whose objectSid attribute value matches the SID of the foreign security principal object.
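As a rough sketch of such a query, the following Python code converts a string SID to its binary form, builds an LDAP search filter on objectSid with each byte escaped as a backslash followed by two hexadecimal digits, and tries each trusted domain in turn. The function names are hypothetical, and the search argument is an assumed caller-supplied callable that runs the filter against one domain and returns the matching distinguished names; no particular LDAP library is implied.

```python
import struct


def sid_to_bytes(sid):
    """Convert a string SID such as 'S-1-5-32-544' to its binary form."""
    parts = sid.split("-")
    if len(parts) < 3 or parts[0] != "S":
        raise ValueError("not a SID: %r" % sid)
    revision = int(parts[1])
    authority = int(parts[2])
    subauthorities = [int(p) for p in parts[3:]]
    # Layout: revision (1 byte), subauthority count (1 byte),
    # identifier authority (6 bytes, big-endian), then each
    # subauthority as a 4-byte little-endian integer.
    header = struct.pack("BB", revision, len(subauthorities))
    header += authority.to_bytes(6, "big")
    return header + b"".join(struct.pack("<I", s) for s in subauthorities)


def object_sid_filter(sid):
    """Build an LDAP filter that matches the object with the given SID."""
    escaped = "".join("\\%02x" % byte for byte in sid_to_bytes(sid))
    return "(objectSid=%s)" % escaped


def find_by_sid(sid, trusted_domains, search):
    """Query each trusted domain and return the first matching name, or None."""
    ldap_filter = object_sid_filter(sid)
    for domain in trusted_domains:
        for distinguished_name in search(domain, ldap_filter):
            return distinguished_name
    return None
```

The search callable keeps the sketch independent of how the connection to each trusted domain is made; in practice it would wrap whatever LDAP client is in use.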
In Active Directory, all references from one object to another are stored as the database identifier of the referenced object. For example, a user object might have an attribute that defines that user's manager; the value for that attribute is the database identifier of the user object that represents the manager in the database. If the referenced object does not exist (for example, a user account in one domain has a manager in a different domain, and the contacted server is not a Global Catalog), a "phantom" is created as a record in the database, and the database identifier of that record is used. A phantom record contains the GUID, the SID (in the case of references to security principals), and the distinguished name of the object that is being referenced. If a copy of the object named in the attribute exists in the local database, no phantom is needed. If the object is located in an external directory partition, the local database uses a phantom record. For example, if an object in the domain dc=noam,dc=reskit,dc=com holds a reference to an object in dc=europe,dc=reskit,dc=com, a phantom for that object and its parent exist in the domain dc=noam,dc=reskit,dc=com. The infrastructure master deletes phantom objects when the objects that they reference are renamed or deleted. For more information about the infrastructure master, see "Managing Flexible Single-Master Operations" in this book, and see Windows 2000 Server Help.
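The rule for when a phantom is needed can be expressed as a short model. The following Python sketch is an illustration only: the DirectoryDatabase class, its method names, and the record layout are hypothetical, and the sketch omits the phantom that is also created for the parent of the referenced object. It shows that a reference always resolves to a database identifier, and that a phantom record holding the GUID, SID, and distinguished name is created only when no local record exists.

```python
import itertools


class DirectoryDatabase:
    """Toy model of one domain controller's local database."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.records = {}  # database identifier -> record
        self.by_dn = {}    # distinguished name -> database identifier

    def _add(self, dn, guid, sid, phantom):
        db_id = next(self._next_id)
        self.records[db_id] = {"dn": dn, "guid": guid, "sid": sid, "phantom": phantom}
        self.by_dn[dn] = db_id
        return db_id

    def add_object(self, dn, guid, sid=None):
        """Store a real object that belongs to a locally held partition."""
        return self._add(dn, guid, sid, phantom=False)

    def reference(self, dn, guid, sid=None):
        """Return the database identifier to store for a reference to dn."""
        if dn in self.by_dn:
            # A local record (real or phantom) already exists: reuse it.
            return self.by_dn[dn]
        # No local copy: create a phantom that carries only the identity.
        return self._add(dn, guid, sid, phantom=True)
```

For example, a reference from an object in dc=noam,dc=reskit,dc=com to an object in dc=europe,dc=reskit,dc=com creates one phantom record on the first call to reference and reuses its identifier on later calls.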
Operations are written to the Active Directory database as transactions, which are the units of work performed by a database. Transactions are atomic — that is, they are either completed in full or are not applied at all. If for any reason an error occurs and a transaction is unable to complete all of its steps, the system is returned to the state that existed before the transaction began. An example of an atomic transaction is an account transfer transaction. Money is removed from account A and placed into account B. If the system fails after it removes the money from account A, the transaction processing system puts the money back into account A and returns the system to its original state — that is, it rolls back the transaction.
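The account transfer example can be sketched in Python as follows. This is a minimal illustration of atomicity that assumes accounts are held in an in-memory dictionary; the transfer function is hypothetical, and a real transaction processing system restores state from its log rather than from a copy of the data.

```python
def transfer(accounts, source, target, amount):
    """Move amount from source to target, or leave accounts unchanged."""
    snapshot = dict(accounts)  # state to return to if any step fails
    try:
        accounts[source] -= amount
        if accounts[source] < 0:
            raise ValueError("insufficient funds")
        accounts[target] += amount  # raises KeyError if target is missing
    except Exception:
        # Roll back: restore the state that existed before the transaction.
        accounts.clear()
        accounts.update(snapshot)
        raise
```

If the second step fails after the money has left the source account, the exception handler restores the snapshot, so a caller never observes the half-completed transfer.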
In Active Directory, write operations on a single object are transacted; that is, a transaction cannot be applied across multiple objects. Active Directory writes a transaction synchronously to the transaction log file and then to the database. First, a change is made to an in-memory copy of the object. Then the change is written to the log file, which ensures that the change is effected, even if the database shuts down after that point. The database engine continually updates the database file with recent changes. The database update works from memory, not from the log files, so it keeps pace with the updates rather than waiting for the server to be available. This method of performing updates is referred to as "advancing the checkpoint," where the checkpoint is the point in time at which all changes that have been made thus far have been fully written to the database.
The Active Directory logging and recovery system is designed to guarantee data integrity and consistency in the case of a system crash. Logging is the process of recording database operations in a log file. Recovery is the process of using the log file to restore a database after a system crash to the most recent state that is recorded in the log file.
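The write path and the recovery step can be illustrated with a small model. The following Python sketch is not the database engine's implementation; the LoggedStore class and its method names are hypothetical, and in-memory structures stand in for the log file and the database file. It assumes only the order described here: change the in-memory copy, record the change in the log, update the database from memory, and replay the log after a crash.

```python
class LoggedStore:
    """Toy model of logged writes with a checkpoint."""

    def __init__(self):
        self.log = []        # stands in for the log file (survives a crash)
        self.database = {}   # stands in for the database file (survives a crash)
        self.checkpoint = 0  # log entries before this index are in the database
        self.memory = {}     # in-memory copies (lost in a crash)

    def write(self, key, value):
        self.memory[key] = value        # 1. change the in-memory copy
        self.log.append((key, value))   # 2. record the change in the log

    def advance_checkpoint(self):
        self.database.update(self.memory)  # 3. update the database from memory
        self.checkpoint = len(self.log)

    def crash(self):
        self.memory = {}  # everything that was only in memory is gone

    def recover(self):
        # Replay every logged change that had not yet reached the database.
        for key, value in self.log[self.checkpoint:]:
            self.database[key] = value
        self.memory = dict(self.database)
        self.checkpoint = len(self.log)
```

A change that was logged but not yet written to the database when the crash occurred is missing from the database afterward and reappears once recover has replayed the log.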
Note
Because Active Directory is replicated (if you have at least two domain controllers in a domain), you can recover from a disaster by restoring from backup and allowing replication to replicate data that has changed since the last backup.
For efficient disk usage, Active Directory uses circular logging. Circular logging keeps the log file size to a minimum by overwriting data that is no longer needed as rapidly as possible. By using circular logging, the directory database engine automatically deletes unneeded log files every time the checkpoint is advanced.
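The effect of circular logging can be sketched as a pruning rule. In the following Python illustration, the prune_logs function, the file names, and the idea of numbering changes sequentially are all assumptions made for the example; the checkpoint is taken to be the number of the first change that has not yet been written to the database.

```python
def prune_logs(log_files, checkpoint):
    """Return only the log files that are still needed for recovery.

    log_files maps a file name to the (first, last) change numbers that
    the file holds. A file is unneeded once its last change is older
    than the checkpoint.
    """
    return {
        name: span
        for name, span in log_files.items()
        if span[1] >= checkpoint
    }


log_files = {
    "log-1": (1, 100),
    "log-2": (101, 200),
    "log-3": (201, 250),
}
```

With a checkpoint of 150, log-1 is deleted because every change in it has reached the database, while log-2 and log-3 are kept because they still hold changes that recovery would need.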
For more information about backing up and restoring Active Directory, see "Active Directory Backup and Restore" in this book. For more information about allocating log file space, see "Active Directory Diagnostics, Troubleshooting, and Recovery" in this book. For more information about replication of database transactions, see "Active Directory Replication" in this book.
For efficient searches on common attributes, Active Directory supports indexing. Attributes can be indexed to decrease the time required to locate a record in a large database — that is, a certain attribute or combination of attributes can be used to uniquely identify a record.
By default, attributes that are searched often, such as surname, cn (common name), userPrincipalName, and so forth, are indexed. You can select other attributes for indexing by using the Active Directory Schema console. When you open the properties for an attribute object, you can see whether the attribute is already selected for indexing; if it is not, you can select it, which sets an index flag on the attribute. The value of this flag is replicated, and the indexing is performed by the DSA when the schema is refreshed. Likewise, if you reverse the selection, the change is made when the schema is refreshed.
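The benefit of an index can be illustrated by comparing a lookup through a prebuilt index with a scan of every record. The following Python sketch is a conceptual model only; the AttributeIndex class, the scan function, and the sample records are hypothetical and do not reflect how the database engine stores its indexes.

```python
from collections import defaultdict


class AttributeIndex:
    """Maps each value of one attribute to the records that hold it."""

    def __init__(self, records, attribute):
        self.attribute = attribute
        self._index = defaultdict(list)
        for record_id, record in records.items():
            if attribute in record:
                self._index[record[attribute]].append(record_id)

    def lookup(self, value):
        """Find matching records without touching the other records."""
        return list(self._index.get(value, []))


def scan(records, attribute, value):
    """Find matching records by examining every record in turn."""
    return [rid for rid, rec in records.items() if rec.get(attribute) == value]


records = {
    1: {"cn": "John Doe", "sn": "Doe"},
    2: {"cn": "Jane Doe", "sn": "Doe"},
    3: {"cn": "Pat Smith", "sn": "Smith"},
}
```

Both AttributeIndex(records, "sn").lookup("Doe") and scan(records, "sn", "Doe") return records 1 and 2, but the indexed lookup does not examine record 3. The cost is that the index occupies space and has to be updated whenever an indexed value changes.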
Note
Indexing attributes can affect update rate and database size. Attributes should be indexed only when you are certain that they will be used often for searching.
For more information about searching on attributes, see "Active Directory Name Resolution" in this book.