@c $Id: afs-basics.texi,v 1.1 2000/09/11 14:40:52 art Exp $
@node Description of AFS infrastructure, Organization of data, Introduction, Top
@comment  node-name,  next,  previous,  up
@chapter Description of AFS infrastructure

This is an overview of the AFS infrastructure as viewed from a
Transarc perspective, since most people still run Transarc cells.

@heading Server overview

The AFS filespace is split up into smaller parts called cells. These
cells are usually listed under @file{/afs}. A cell is usually a whole
organization or an administrative unit within an organization. An
example is e.kth.se (with the path @file{/afs/e.kth.se}), which is
the department of electrical engineering at KTH and obviously has the
@file{e.kth.se} domain in DNS. Using DNS domains for cell names is
the typical and most convenient way.

All cells (and their db-servers) in the AFS world are listed in a
file named @file{CellServDB}. There is a central copy that is
maintained by Transarc at
@file{/afs/transarc.com/service/etc/CellServDB}.

In addition, Arla can use DNS to find the db-servers of a cell. The
DNS resource record that is used is the AFSDB record. It tells you
what machines are db-servers for a particular cell. The AFSDB
resource record is also used for DCE/DFS. An example (the 1 means
AFS, 2 is used for DCE):

@example
e.kth.se.       1D IN AFSDB     1 sonen.e.kth.se.
e.kth.se.       1D IN AFSDB     1 anden.e.kth.se.
e.kth.se.       1D IN AFSDB     1 fadern.e.kth.se.
@end example

Some cells use an abbreviated version of the cell name (in the
example above that would be @file{/afs/e/}). This might be convenient
when typing, but it is a bad idea, because it does not create the
same name space everywhere. If you create a symbolic link to
@file{/afs/e/foo/bar}, it will not work for people in other cells.
Note that cell names are written in lowercase by convention.

There are several servers running in an AFS cell. For performance
and redundancy reasons, these servers are often run on different
hosts. There is a built-in hierarchy within the servers (in two
different dimensions).

One server keeps track of the other servers within a host: it
restarts them when they die, makes sure they run in the correct
order, saves their core-files when they crash, and provides an
interface for the sysadmin to start/stop/restart the servers. This
server is called the bos-server (Basic Overseer Server).

The other hierarchy separates the servers that keep track of data
(volumes, users, passwords, etc) from the ones performing the real
hard work (serving files). The database server keeps the database
(obviously), and keeps several copies of the database on different
hosts, replicated with Ubik (see below). The file server and the
client software (like afsd/arlad, pts, and vos) pull meta-data out
of the db-server to find user privileges and where volumes reside.

@heading Basic overseer - boserver

The bos-server makes sure the other servers are running. If they
crash, it saves the core-file and starts a new server. It also makes
sure that servers/services that are not supposed to run at the same
time do not. An example of this is the fileserver/volserver and
salvage. It would be devastating if salvage tried to correct data
that the fileserver is changing. The salvager is run before the
fileserver starts. The administrator can also force a file server to
run through salvage again.
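As a minimal sketch of how an administrator talks to the bos-server
(the host name here is hypothetical), the @file{bos} command suite
can list, restart, and stop the server instances on a host:

@example
# bos status fs1.example.com            (list instances and their state)
# bos restart fs1.example.com -instance vlserver
# bos stop fs1.example.com fs           (stop the fs instance)
@end example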
@heading Ubik

Ubik is a distributed database. It is really a (distributed) flat
file on which you can perform read/write/lseek operations. The
important property of Ubik is that it provides a way to make sure
that updates are done once (transactions), and that the database is
kept consistent. It also provides read-only access to the database
as long as one (or more) database servers are available.

This works the following way: a newly booted server sends out a
message to all other servers that tells them that it believes it is
the new master server. If the server gets a notice back from another
server telling it that that server believes that it (or a third
server) is the master, it will switch to the new master, depending
on how long that one has been master. If they cannot agree, the one
with the lowest IP address is supposed to win the argument. If the
server is a slave it still updates its copy of the database to the
current version.

An update to the database can only be done if more than half of the
servers are available and vote for the master. An update is first
propagated to all servers; when that is done, and if all servers
agree with the change, a commit message is sent out from the master,
the update is written to disk, and the serial number of the database
is increased.

All servers in AFS use Ubik to store their data.

@heading Volume Location database server - vlserver

The vldb-server is responsible for the information about which file
server each volume resides on and what kinds of volumes exist on
each fileserver.

To confuse you even more, there are three types of support for the
clients. Basically there is AFS 3.3, 3.4, and 3.6 support. The
different interfaces look the same for the system administrator, but
there are some important differences.

AFS 3.3 is the classic interface. 3.4 adds the possibility of
multihomed servers for the client to talk to, and that introduces
the N interface. To deal with multihomed clients AFS 3.5 was
introduced. This we call the U interface.

The N interface added more replication-sites in the database-entry
structure. The U interface changed the server and clients in two
ways. Now a server registers all its IP addresses when it boots.
This means that a server can add (or remove) a network interface
without rebooting. When registering at the vldb-server, the file
server presents itself with a UUID, a unique identifier. This UUID
is stored in a file so that the UUID stays constant even when
network addresses are changed, added, or removed.

@heading Protection server - ptserver

The protection server keeps track of all users and groups. It is
used a lot by the file servers.

@heading Kerberos server - kaserver

The kaserver is a Kerberos server, but in other clothes. There is a
new RPC interface to get tickets (tokens) and administer the server.
The old Kerberos v4 interface is also implemented, and can be used
by ordinary Kerberos v4 clients. You can replace this server with a
Heimdal kdc, since it provides a superset of the functionality.

@heading Backup server - buserver

The backup server keeps the backup database that is used when
backing up and restoring volumes. The backup server is not used by
other servers, only by operators.

@heading Update server - upserver

With the update server it is possible to automagically update
configuration files and server binaries. You keep masters that are
supposed to contain the correct copy of all the files, and then
other servers can fetch them from there.
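As a hedged sketch of how this is usually wired up (host names are
hypothetical, paths follow the common Transarc layout), the update
server and its clients are created as bos instances, with the
clients pointing at the master:

@example
# bos create master.example.com upserver simple \
    "/usr/afs/bin/upserver -clear /usr/afs/bin"
# bos create fs1.example.com upclientbin simple \
    "/usr/afs/bin/upclient master.example.com -clear /usr/afs/bin"
@end example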
@heading Fileserver and Volume Server - fs and volser

The file server serves data to the clients, keeps track of
callbacks, and breaks callbacks when needed.

Volser is the administrative interface where you add, move, change,
and delete volumes from the server. The volume server and the file
server are run at the same time and they sync with each other to
make sure that the file server does not access a volume that volser
is about to change.

@heading Salvage

Salvage is not a real server. It is run before the fileserver and
volser are started to make sure the partitions are consistent. It is
imperative that the salvager is NOT run at the same time as the
fileserver/volser.

@heading Things that milko does differently

Fileserver, volumeserver, and salvage are all in one program. There
is neither a bu-server nor a ka-server. The ka-server is replaced by
kth-krb or Heimdal. Heimdal's kdc even implements a read-only
ka-server interface, so your users can keep using programs like
klog.
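As a minimal illustration of that last point (user name and cell are
hypothetical, output abbreviated), a user obtains a token with klog
and can inspect it with tokens, regardless of whether a kaserver or
a Heimdal kdc answers on the other end:

@example
$ klog nisse
Password:
$ tokens
Tokens held by the Cache Manager:

User's (AFS ID 1234) tokens for afs@@e.kth.se [Expires Oct  7 03:12]
   --End of list--
@end example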