In its ten-plus years as an open source project, OpenAFS has established AFS as an open source success story. OpenAFS provides clients for all of the major operating system distributions and servers for all UNIX/Linux variants. Even so, a great deal of work remains before AFS achieves first-class status on MacOS X and Microsoft Windows.
The work to be accomplished on OpenAFS falls into six broad categories:
It is the goal of the OpenAFS Elders to raise resources from the OpenAFS Community and others to successfully implement all of these functions over the next three to five years.
An Implementation and Release Schedule is provided at the end of the page.
A note about the estimates provided on this page: for many of the projects the Gatekeepers have designs, partially completed work, or even fully implemented systems. The estimates provided are the time necessary to complete each project and/or integrate it into a standard release of OpenAFS.
Core client functionality encompasses the AFS cache manager, file system interfaces, pioctl interfaces, and credential management.
The Microsoft Windows client has received significant attention over the last four years. It is a fully functional client that works on all Microsoft Windows releases from Windows 2000 SP4 through Windows Vista and Server 2008. For a summary, see the OpenAFS for Windows Status Report. Still, there are a number of deficiencies that adversely impact the ability of end users to use AFS to its full existing potential.
Read-only or Read-write disconnected mode:
Microsoft Windows users are accustomed to the "Windows Offline Folders" functionality, which lets them synchronize local copies of files or folders from a CIFS server to their local disk for use when disconnected from the network. UMichigan long ago implemented a read-write disconnected mode for the UNIX AFS client, which permits users to continue working with data in the AFS cache while offline. Once the client is restored to an online state, the modifications made to the cache buffers are written back to the file server, provided there are no conflicts. If there are conflicts, a manual conflict resolution process must be initiated. Conflict resolution is hard, but AFS users would gain a great deal even if the contents of the AFS cache were merely available read-only while disconnected from the file servers.
Estimate: 3 months
Status: no resource commitments
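The write-back step described above can be sketched as an optimistic reconciliation pass: compare the data version each dirty cache buffer was based on with the file server's current version, and flag mismatches for manual conflict resolution. The function and field names below are illustrative, not OpenAFS APIs.

```python
def reconcile(cached_entries, server_versions):
    """Return (write_back, conflicts) after reconnecting.

    cached_entries: {name: {"base_version": int, "dirty": bool}}
    server_versions: {name: int} -- current data version on the file server
    """
    write_back, conflicts = [], []
    for name, entry in cached_entries.items():
        if not entry["dirty"]:
            continue  # clean cache buffers need no write-back
        # If the server's version still matches the version we cached
        # against, our change can be stored without losing anyone's data.
        if server_versions.get(name) == entry["base_version"]:
            write_back.append(name)
        else:
            # Someone else updated the file while we were offline:
            # manual conflict resolution must be initiated.
            conflicts.append(name)
    return write_back, conflicts
```

This illustrates why a read-only disconnected mode is so much simpler: with no dirty buffers, the conflict branch never arises.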
Ongoing maintenance is necessary to keep up with backward-incompatible changes to the Linux kernel and the new interfaces provided and used therein.
Update the client to use system inodes instead of a private inode pool.
There is a growing demand for pervasive access to data from handset devices. Clients for Symbian S60, Windows Mobile, Apple's iOS, OpenHandsetAlliance (aka Android), and Nokia/Intel Maemo/Meego devices will be critical in the years to come.
Client Cache Usage Tracking and Tuning:
The current cache manager implements an explicit Least Recently Used (LRU) algorithm for recycling objects. This algorithm does not take into account:
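A minimal model of the explicit LRU recycling described above. The object and capacity handling here are illustrative only; the real cache manager tracks dcache and vcache entries with considerably more state.

```python
from collections import OrderedDict

class LRUCache:
    """Toy model: recycle the least recently referenced object when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # key -> data, oldest first

    def reference(self, key, data):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # recycle the LRU object
        self.entries[key] = data
```

The weakness is visible in the model itself: eviction considers only recency of reference, with no notion of refetch cost or access pattern.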
There have been many discussions about how hard AFS is to use, and how end users don't want AFS and really want a WebDAV solution. What do those statements really mean? First, AFS is no harder to use than any other authenticated file system from the perspective of end users. If a user has an "encrypted" local disk, she has to authenticate herself by providing her password. With the single sign-on solutions available for OpenAFS, there is little reason today for users to run without tokens when they have network access. Second, the statement that end users don't want AFS (as opposed to some other centralized storage solution) really makes no sense. End users don't ask for technologies; they ask for functionality. If a user wants centralized storage, then the user wants centralized storage.
Users describe their desires using the technologies most familiar to them, which today most often means Windows Shares (CIFS) and browser-based services. Why? Because those are the technologies the user is presented with on his or her operating system's desktop. The vast majority of daily users are uncomfortable with command-line operations. Improving the ease of use of AFS can be achieved by providing tighter integration with the operating system's desktop environment.
The Secure Endpoints Inc. OpenAFS Windows Road Map web page provides a number of mock-ups of Explorer Shell extensions that can not only make AFS more accessible to end users but also significantly improve its ease of use. By making the Explorer Shell AFS-aware, users will be more comfortable using it. No longer will users have to resort to command-line techniques to access AFS and manage its contents and metadata.
One of the most important ideas to come out of discussions with Stanford University's Help Desk staff is the concept of Custom Name Spaces. On Microsoft Windows, a Name Space is a virtual folder that appears as part of the Explorer Shell. The objects "My Computer", "My Documents", "Control Panel", "My Network Places", "My Sharing Folders", etc. are all name spaces. Stanford University has for many years shipped variations of an application now called "Stanford Desktop Tools". One of the features of SDT is the ability to search for classes, users, departments, and projects and map a drive letter to the associated AFS volume. Another feature is the ability to quickly map a drive letter to "my home directory". A final feature is the most recently used volume list.
With Name Spaces, we can implement all of this functionality. We can define a "recently used volumes list" which is always populated with the volumes the user most recently read or stored data to. We can define a "My Stanford Home Directory" name space that always contains a shortcut to the volume associated with the user's token for the ir.stanford.edu cell. We can also create name spaces for "Stanford Users", "Stanford Classes", "Stanford Departments", etc. Other organizations can distribute their own AFS name spaces that represent important data that is stored in their cell. AFS name spaces from multiple organizations can co-exist on the same system. Since name spaces are built into the Explorer Shell they are always easily accessible to the end user because they become a part of the Desktop.
A detailed proposal describing an AFS Name Spaces implementation is available in PDF.
Users expect to find a Control Panel for services that support per-user configuration. For OpenAFS, users can configure the behavior of the AFS Credential Provider for Network Identity Manager and their Protection Service Groups. For more details ...
System-wide configuration of services is performed via Microsoft Management Console plug-ins. For more details ...
Microsoft Windows Vista User Account Control Privilege Separation. For more details ...
Apple doesn't permit the same degree of customization of the Finder as Microsoft does for the Explorer Shell. However, the Finder can be customized with an AFS virtual folder and AFS context menus. Likewise, certain other graphical interfaces which will become available in Leopard provide opportunities for customization to ease use of AFS.
Enhance Finder with an OpenAFS Context menu
In order for AFS to be treated as a first class file system for MacOS X and Microsoft Windows it must gain the following functionality:
Removing Directory Limitations:
The current AFS directory format and RPCs suffer from a number of limitations that adversely affect the user experience. A directory has a maximum of 64,000 entries if all file names are 16 or fewer octets; longer names consume an additional entry for each additional 32 octets of file name. Given the ever-increasing length of file names, some cells are filling directories with as few as 10,000 entries. Meanwhile, some scientific research projects require millions of files, each perhaps containing a single byte of data, within a single directory.
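The slot arithmetic behind these limits can be sketched as follows. This follows the rule stated above (one entry for names of up to 16 octets, plus one additional entry per further 32 octets or part thereof); the exact on-disk layout has details this sketch ignores.

```python
import math

MAX_SLOTS = 64000  # approximate entry budget cited in the text

def slots_for_name(name_len):
    """Directory slots consumed by one entry with a name of name_len octets."""
    if name_len <= 16:
        return 1
    return 1 + math.ceil((name_len - 16) / 32)

def max_entries(name_len):
    """How many entries of a given name length fit in one directory."""
    return MAX_SLOTS // slots_for_name(name_len)
```

With 80-octet names, for example, each entry costs three slots, so the directory holds roughly a third as many files.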
The current AFS directory format is also very inefficient to search when case-insensitivity or Unicode normalization is required: under these circumstances, search time is linear in the number of entries in the directory. Many modern file systems implement the directory as a B+ tree to permit O(log n) searching. The existing format places a heavy burden on each and every cache manager, since every client must download a copy of the directory buffers and perform a linear search. This results in heavy CPU use when searching directories of 500 or more entries.
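The difference between the two search strategies can be illustrated with a sorted list standing in for a B+ tree. This demonstrates the complexity argument only; it is not the proposed on-disk format.

```python
import bisect

def linear_lookup(entries, name):
    """O(n) scan, as forced by today's downloaded directory buffers."""
    for i, entry in enumerate(entries):
        if entry == name:
            return i
    return -1

def sorted_lookup(sorted_entries, name):
    """O(log n) lookup, as a B+ tree (or any sorted layout) permits."""
    i = bisect.bisect_left(sorted_entries, name)
    if i < len(sorted_entries) and sorted_entries[i] == name:
        return i
    return -1
```

For a 64,000-entry directory the linear scan examines up to 64,000 entries; the sorted lookup needs about 16 comparisons.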
Another issue is the lack of support for internationalization. In the current directory format, entries are stored as a sequence of octets without any character set hinting. A file stored under a name encoded with ISO 8859-5 or CP437 will not be presented correctly to a user on a system that expects UTF-8. Even when file names are stored as UTF-8, it is important to recognize that, depending on the input mechanism, a user can enter the same semantic string as different octet sequences. It is therefore crucial that any implementation of Unicode file names support normalized forms for comparison.
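A short illustration of why normalized-form comparison matters: "é" can be entered as a single precomposed code point (U+00E9) or as "e" plus a combining acute accent (U+0301). The octet sequences differ, but the semantic file name is the same.

```python
import unicodedata

def names_equivalent(a, b):
    """Compare file names after Unicode normalization (NFC chosen here)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

A byte-wise comparison would treat the two spellings as distinct files; a normalization-aware lookup treats them as one.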
Finally, Microsoft Windows and MacOS X are now requiring that first class file systems support the concept of multiple data streams per file. These streams are used to store extended attributes, security zone information, resource forks, and other forms of meta data in addition to providing a general purpose storage mechanism for applications. For more details ...
At the 2004 AFS Hackathon in Stockholm there was much discussion of potential methods of extending the existing directory format to support Unicode (http://www.afsig.se/afsig/space/AFS+directory+format+extensions). However, these approaches did not address the directory search performance issues, the entry limitations, or multiple data streams.
The current direction under consideration is to completely replace the on disk directory format with an entirely new one consisting of data blocks representing nodes in a B+ tree with each block containing a variable number of entries. The new data structure would be Unicode aware and support multiple data streams. Microsoft Windows clients would implement extended attributes in a reserved data stream. MacOS X clients would use a reserved stream for the resource fork.
New versions of all of the directory RPCs would be implemented to support the new data structure. Clients that use the new APIs would be delivered directory buffers which construct a B+ tree which in turn would significantly improve directory search times.
For old clients, new implementations of the old RPCs would deliver directory data translated to the old linear format, up to the maximum number of directory entries. It is possible that old clients will not be able to see all the files in a given directory.
Extended Attributes are used by MacOS X to store resources and DOS attributes. When the file system does not support them, MacOS X is forced to create "._" (AppleDouble) files. On Microsoft Windows, Extended Attributes are used to store a variety of metadata about files and directories. The lack of EA support in AFS damages the Windows user experience. AFS cache managers can implement support for extended attributes by storing them in hidden AppleDouble files while waiting for full EA support within AFS volumes.
For more details ...
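The interim approach described above can be sketched as follows, with a plain dictionary standing in for the AppleDouble container. The real AppleDouble format is a packed binary structure, and the class and method names here are illustrative.

```python
def apple_double_name(name):
    """Hidden sibling entry that stashes a file's extended attributes."""
    return "._" + name

class EAStore:
    """Toy cache-manager view of a directory with EA side files."""

    def __init__(self):
        self.directory = {}  # entry name -> file data or EA dict

    def set_ea(self, name, key, value):
        self.directory.setdefault(apple_double_name(name), {})[key] = value

    def get_ea(self, name, key):
        return self.directory.get(apple_double_name(name), {}).get(key)
```

The cost of the interim scheme is also visible here: every file with attributes doubles its directory entry count, which is one reason native EA support in AFS volumes is the real goal.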
AFS supports per-directory ACLs. Per-file ACLs would make it possible to apply a different set of access constraints on a single object within a directory. At the present time storing multiple objects with different access controls requires that they be stored in separate directories. The AFS protocol provides partial support for this from the AFS/DFS translator, and this is supported in clients going back to IBM AFS.
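The access-check difference can be sketched as a simple fallback rule: use the object's own ACL when one exists, otherwise the containing directory's ACL. An absent file ACL models today's per-directory AFS behavior; rights are simplified to sets of flags.

```python
def effective_rights(user, dir_acl, file_acl=None):
    """Rights a user holds on an object.

    dir_acl / file_acl: {user: set-of-rights}.  file_acl=None models
    current AFS, where the directory ACL governs every object in it.
    """
    acl = file_acl if file_acl is not None else dir_acl
    return acl.get(user, set())
```

With per-file ACLs, two objects in the same directory can carry different constraints, removing the need to split them into separate directories.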
Mandatory Locking and Byte range locks:
Platforms such as Microsoft Windows and MacOS X require that their first-class file systems support mandatory lock semantics and byte-range locks. Applications that rely on these capabilities, such as Microsoft Office and databases, risk data corruption if their data files are altered while assumed to be under a lock. AFS provides only advisory full-file locks and no upgradeable lock type. The existing AFS file server lock implementation does not track which clients were issued locks, which results in a number of situations in which lock counts can become incorrect and produce a denial of service on a given file.
The Windows AFS client in the 1.5 series has added a localized implementation of mandatory locking and byte-range locks. Each time an application requests that a byte range be locked, the cache manager ensures that it holds an appropriate full-file lock on the object. The cache manager then accepts the responsibility of tracking each of the locks and doling out a range at a time.
Estimate: 2 months.
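The localized scheme described above can be sketched as follows: one full-file server lock backs many locally tracked byte ranges, and a requested range is granted only if it overlaps no existing grant. The real client also distinguishes shared and exclusive locks, which this sketch omits.

```python
class ByteRangeLocks:
    """Toy model of the Windows client's local byte-range lock tracking."""

    def __init__(self):
        self.have_server_lock = False
        self.ranges = []  # list of (offset, length) grants

    def lock(self, offset, length):
        if not self.have_server_lock:
            # In the real client this is an RPC obtaining a full-file
            # lock from the file server.
            self.have_server_lock = True
        for off, ln in self.ranges:
            if offset < off + ln and off < offset + length:
                return False  # overlaps a range already granted
        self.ranges.append((offset, length))
        return True

    def unlock(self, offset, length):
        self.ranges.remove((offset, length))
        if not self.ranges:
            self.have_server_lock = False  # release the full-file lock
```

The point of the design is that the file server never sees individual byte ranges; it only has to manage one advisory full-file lock per client.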
Status Data (Callback Registration) Expiration Algorithm
Status data and callback registration expiration is currently determined based upon the number of clients that are accessing the data instead of the likelihood that the data is going to change.
Status: Implementation in progress.
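One possible shape for a change-likelihood-based policy, sketched under illustrative assumptions (the scaling rule and the bounds below are not from any OpenAFS design): grant long callback registrations to rarely modified files and short ones to frequently modified files.

```python
MIN_TTL, MAX_TTL = 60, 86400  # seconds; illustrative bounds

def callback_ttl(change_times):
    """Callback registration lifetime based on observed change rate.

    change_times: timestamps of recent updates to the file, in order.
    Files that have rarely or never changed get the maximum lifetime;
    otherwise assume the next change arrives after roughly the same
    interval as the last one.
    """
    if len(change_times) < 2:
        return MAX_TTL
    interval = change_times[-1] - change_times[-2]
    return max(MIN_TTL, min(MAX_TTL, int(interval)))
```

Compared with expiring registrations by client count, this keeps callbacks alive on stable data regardless of how many clients hold them.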
OPEN/CLOSE File Server RPCs:
New file server RPCs would provide new audit data
Additionally, Volker Lendecke implemented a similar project for use with Samba, which is not known to have been completed.
Luke Howard (PADL Ltd.) developed an AFS Protection Service as part of his Active Directory clone, XAD. Ownership of XAD has since been transferred to Novell. However, it is expected that Luke will assist us in developing a new implementation in the coming months.
Status: Standardization complete.
Once AFS can be used as a first-class file system by Microsoft Windows clients, it will make sense to support the AFS servers on the Windows Server platform, as there are a large number of Windows-only IT organizations that lack the expertise to manage UNIX/Linux systems. The servers are mostly there already, but work remains on the NTFS namei implementation, and much better integration is needed with power management, plug-and-play networking, and Windows Event Logging.
Of course if you want to host services on Windows, you must provide a Microsoft Management Console plug-in to manage them.
For more details ...
Estimate: 4 to 6 weeks
Asynchronous RX RPCs:
All Rx calls in the existing implementation are synchronous: the executing thread must wait for completion, so the maximum number of simultaneous requests that can be processed is limited by the number of threads that can be allocated to the process. By adding an asynchronous Rx call mode, the file server can be redesigned to process requests without blocking threads on callback breaks, whoareyou probes, and getcps calls. This would significantly reduce the number of client requests left waiting for threads.
Estimate: 6 weeks for asynchronous Rx and 3 weeks for file server modifications.
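The threading argument can be illustrated with coroutines standing in for asynchronous Rx calls: a callback break no longer ties up a server thread while waiting for the client's reply, so one worker can have many breaks in flight. This is an analogy for the concurrency model, not Rx code.

```python
import asyncio

async def break_callback(client_id, results):
    # Stands in for waiting on the client's reply to a callback break;
    # the awaiting worker is free to service other requests meanwhile.
    await asyncio.sleep(0)
    results.append(client_id)

async def serve_request(fid, clients, results):
    # Fire all callback breaks concurrently instead of blocking a
    # dedicated thread on each one in turn.
    await asyncio.gather(*(break_callback(c, results) for c in clients))
    return fid

def run_demo():
    results = []
    fid = asyncio.run(serve_request("vol.1.1", ["c1", "c2", "c3"], results))
    return fid, sorted(results)
```

In the synchronous model the same work would pin one thread per outstanding break, which is exactly the thread-pool exhaustion the redesign aims to avoid.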
rxgk is designed but has not yet been fully implemented. Love Hörnquist Åstrand, Magnus Ahltorp, Jeffrey Hutzelman, Derrick Brashear and Jeffrey Altman met at KTH the week of 22 Jan 2007 to begin implementation of rxgk and modify as many of the AFS services as possible. Love presented a status report at the 2007 AFS & Kerberos Best Practice Workshop and did more work with Derrick the following week. Status: Standardization in progress; implementation substantially complete, with further contributions promised by MIT.
For users who are willing to give up the location independence of the data, there isn't much preventing the construction of a file server back end that reads and writes the native file system, provided that the native file system has some way of notifying AFS when a file changes. Change notification is required for the file server to be able to call back the clients and report the invalidation of their data.
Another question that needs to be addressed is how to provide for authenticated access and access control lists. Finally, location discovery is a challenge that might be addressed with Apple's Bonjour and/or dyndns; this work could be extended to give any client a similar ability to discover a local cell.
Estimate: 2 months
Most off-the-shelf backup systems see file systems only from the viewpoint of the user, whereas backing up AFS so that a given volume can be restored on demand in a location-independent manner is much more like backing up a distributed database. Backing up the files that the database writes does not allow the granularity of restores that is required, and backing up the database files while they are in use results in data inconsistencies.
Teradactyl is one of the few remaining commercial offerings with integrated support for AFS; VERITAS NetBackup and Tivoli Storage Manager have both dropped integrated AFS support. Teradactyl has been a sponsor of the AFS & Kerberos Best Practice Workshops for the last couple of years.
There have also been various efforts to contribute AFS support to Amanda, http://www.amanda.org/, and there have been efforts to provide an AFS wrapper to Legato Networker.
The implementation schedule for these projects is entirely dependent upon resource availability. Please send inquiries, comments, and offers of support to firstname.lastname@example.org. Where external contributors have promised contributions, they are included, as are timelines when those are provided. The following release schedule is subject to change.
The next release in the previous stable series for UNIX is expected before December 2011. This release will correct implementation defects and will most likely be the last release in the 1.4 series.
The 1.6 series is the current stable series for UNIX and the last stable series for Microsoft Windows without a native IFS. 1.6.8 is the next release in the series and is expected in April 2014.
The 1.7 series is the development branch for the Windows IFS implementation. The first release on this branch was announced on 15 Sept 2011. Subsequent releases are expected every two to four weeks until the code enters maintenance mode after two or three months.
The 1.8 series will become the first stable release of OpenAFS to include the Windows IFS implementation. No other new features will be added to 1.8.
The 1.9 series will replace the 1.5 series as the experimental release series. 1.9 releases will begin shortly after the 1.7 series has the Windows IFS implementation committed. Major new features will be integrated into 1.9 releases in preparation for the 1.10 stable release.
The 2.0 series will replace the 1.6 and 1.8 series as the stable release series for UNIX and Microsoft Windows. The 2.0 series are scheduled to include the rxgk security class including Kerberos v5, RxUDP performance improvements, PTS authentication name extensions, and extended callbacks.