UNIX File Hierarchy: Structure and Commands

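The UNIX operating system organizes files into a tree structure with a root named by the character '/'. An example of the directory tree is shown below:

[Figure: example directory tree rooted at /, including the directory bin and the program ls within it]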

All files have two types of names. An absolute path name is the name of the file that begins with a "/"; for example, in the above picture, the program ls has as its absolute path name /bin/ls (because to find it, start at the root /, then go to the directory bin in /, then to ls in bin). A relative path name identifies a file relative to an arbitrary directory in the tree; for example, the relative path name of ls with respect to the directory bin is ls, because within the directory bin you need only look for the file ls. An easy way to keep the two types straight is to remember that an absolute path name always begins with a /, and a relative path name never does.
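
The distinction can be tried out in a short sketch. It builds a scratch tree with mktemp -d (an assumption: mktemp is common but not present on every UNIX system), and the names work, bin, and note are made up for the illustration.

```shell
# Build a small scratch tree: $work/bin/note
work=$(mktemp -d)
mkdir "$work/bin"
echo hello > "$work/bin/note"

# Absolute path name: begins with /, works from any current directory.
cd /
abs_contents=$(cat "$work/bin/note")

# Relative path name: interpreted starting at the current directory.
cd "$work/bin"
rel_contents=$(cat note)            # relative to $work/bin
cd "$work"
rel_from_parent=$(cat bin/note)     # relative to $work

echo "$abs_contents $rel_contents $rel_from_parent"

# Clean up the scratch tree.
cd /
rm -r "$work"
```

All three reads reach the same file; only the starting point of the search differs.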

Every directory contains at least two entries, called "." and "..". The directory . refers to the directory containing that entry; in other words, in the above picture, /bin/. and /bin refer to the same directory. The . is most often used in contexts where the "current directory" is important, and not the specific name of the current directory. The directory .. refers to the parent directory; so /bin/.. is another way of saying /, and /usr/ucb/.. is the same as /usr. This gives you a convenient way to get back up the file tree.
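
Here is a small sketch of . and .. in action. It assumes mktemp -d is available, and it recreates usr/ucb inside a scratch directory rather than touching the real /usr/ucb; pwd -P is used so that the names printed are physical path names.

```shell
work=$(mktemp -d)
mkdir -p "$work/usr/ucb"
cd "$work/usr/ucb"

here=$(pwd -P)                   # where we are now
dot=$(cd . && pwd -P)            # . is the current directory itself
parent=$(cd .. && pwd -P)        # .. is the parent directory
expected_parent=$(cd "$work/usr" && pwd -P)

echo "here:   $here"
echo "dot:    $dot"
echo "parent: $parent"

cd /
rm -r "$work"
```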

File Types

There are several types of files in the file system; of these, you need to understand four different types.

An ordinary file (also called a regular file) may contain text, a program, or other data. It can be either an ASCII file, with each of its bytes being in the numerical range 0 to 127 (representing characters), or a binary file, whose bytes can be of all possible values 0 to 255.

A directory is a file that consists of directory entries for the files in that directory. There is one directory entry for each file. Each directory entry contains the name of the file and a pointer to the file.

A device file is a file that represents a device. The UNIX operating system represents physical devices (printers, terminals, etc.) as files so that the same I/O functions used to read and write regular files can be used to read from and write to these devices. Devices which do input and output a character at a time, such as terminals and printers, are represented as character special device files; devices which do input and output in chunks of characters (usually a multiple of 256 characters) are represented as block special device files.
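
As a quick illustration, the sketch below looks at /dev/null, a character special device found on virtually every UNIX system (that it exists on yours is an assumption). In a long listing, ls marks character special files with c and block special files with b in the first flag.

```shell
# -L makes ls describe the device itself even if /dev/null is a symbolic link.
type_flag=$(ls -lL /dev/null | cut -c1)    # first character of the mode column

# The same output redirection used for regular files works on the device.
if echo "discarded" > /dev/null; then write_ok=yes; else write_ok=no; fi

echo "type=$type_flag write=$write_ok"
```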

Understanding link files requires you to know a little about how the system implements the concept of "file." The UNIX operating system associates a structure called an inode with each file; all characteristics of the file are stored there. In addition to things like access permissions, the date and time of last access and modification, and the owner, the inode identifies the data that makes up the file. The "pointer" to the file that a directory contains is simply the inode number of the file.

Two files are said to be linked if they refer to the same inode number. If the file X is linked to the file Y, then X is an alternate name for the file Y. For example, cat X will produce exactly the same effect as cat Y. There are two types of links. A hard link is simply an entry in a directory which contains an inode number that also is associated with another file name; so, using X and Y as examples, the directory entries would have two different names (one would be X and the other Y) but the inode numbers would be the same. A symbolic link (or soft link) is simply a file containing the name of the file it is linked to. The difference is that with a hard link, deleting one of the directory entries does not affect the other but with a symbolic link, deleting the file which is linked to makes the link useless. The command ln(1) creates both types of links; give it the -s option to get a symbolic link, and no options to get a hard link.
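
The difference between the two kinds of link shows up in a short experiment. This is a sketch in a scratch directory (mktemp -d is assumed to be available); X and Y are the names used above, Z is an extra name introduced here for the symbolic link, and ls -i is used to print inode numbers.

```shell
work=$(mktemp -d)
cd "$work"
echo "some data" > Y

ln Y X          # hard link: X is a second directory entry for Y's inode
ln -s Y Z       # symbolic link: Z is a small file holding the name "Y"

# ls -i prints the inode number followed by the name; keep the number.
inode_y=$(ls -i Y | awk '{print $1}')
inode_x=$(ls -i X | awk '{print $1}')
echo "Y: $inode_y  X: $inode_x"

rm Y            # delete one of the two names

hard_contents=$(cat X)      # the data is still reachable through X
if cat Z > /dev/null 2>&1; then soft_ok=yes; else soft_ok=no; fi
echo "hard link reads: $hard_contents; symbolic link usable: $soft_ok"

cd /
rm -r "$work"
```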

The other types depend upon the system you are using. The most common other types are sockets (which are used by processes to communicate with one another) and FIFOs (also called named pipes, which are very similar to sockets).
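
If your system has the mkfifo command (an assumption; it is not covered in this handout), a FIFO can be tried out as sketched below. The name pipe is made up, and ls marks FIFOs with p in the first flag of a long listing.

```shell
work=$(mktemp -d)
mkfifo "$work/pipe"

type_flag=$(ls -l "$work/pipe" | cut -c1)    # first flag of the mode column

# One process writes into the FIFO in the background ...
echo "hello through the pipe" > "$work/pipe" &
# ... and another reads what was written.
received=$(cat "$work/pipe")
wait

echo "type=$type_flag received=$received"
rm -r "$work"
```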

The command ls -l allows you to see the type of each file in a directory (among other things). The following is the output of the command

ls -alg
(the a option means to list those files and directories whose names begin with a period, and the g option means to show the group of the file):
drwxr-xr-x  6 ecs4005   student     1024 Apr 22 13:30 ./
drwxr-xr-x 74 root      student     1536 Mar 24 12:51 ../
-rw-------  1 ecs4005   student      188 Apr 13 15:53 .login
-rw-------  1 ecs4005   student        6 Mar 24 11:29 .logout
-rw-------  1 ecs4005   student      253 Apr 10 12:50 .xinitrc
-rw-r--r--  1 ecs4005   student      516 Apr 10 13:00 .twmrc
-rw-r--r--  1 ecs4005   student     1600 Apr 22 10:59 test2.out
The output is separated into 7 columns. The meaning of each column in the output is shown below:
1st column      type and protection mode of the file                             
2nd column      number of links to that file (the original file is considered    
                a link)                                                          
3rd column      owner of the file                                                
4th column      group of the file                                                
5th column      size of file in bytes                                            
6th column      date and time of last modification                               
7th column      name of the file                                                 
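
The columns can also be picked apart by a program. The sketch below assumes a version of ls whose plain -l listing already includes both the owner and the group (most current systems do), so that the size is the fifth blank-separated field; on a system that needs the g option to show the group, the field numbers shift. The file name is borrowed from the listing above, and mktemp -d is assumed to be available.

```shell
work=$(mktemp -d)
printf 'hello\n' > "$work/test2.out"      # a 6-byte ordinary file

line=$(ls -l "$work/test2.out")
echo "$line"

type_char=$(printf '%s\n' "$line" | cut -c1)        # 1st column, first flag
links=$(printf '%s\n' "$line" | awk '{print $2}')   # 2nd column
size=$(printf '%s\n' "$line" | awk '{print $5}')    # 5th column

echo "type=$type_char links=$links size=$size"
rm -r "$work"
```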

File Access Control

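In the UNIX operating system, all files are protected using a simple access control mechanism so that the owner of a file can deny other users access to it. The first column of the long directory listing shows the access characteristics of a file as a string of 10 flags, for example:
drwxr-xr-x
The meanings of the characters in this column are shown below:
character 1       type of the file (d for a directory, - for an ordinary file)
characters 2-4    read (r), write (w), and execute (x) permission for the owner
characters 5-7    read, write, and execute permission for the group
characters 8-10   read, write, and execute permission for everyone else
A - in place of r, w, or x means that permission is denied. In determining access, the system checks to see if you are the owner of the file; if so, it uses characters 2-4. If not, it checks to see if you are a member of the group of the file; if so, it uses characters 5-7. If not, it uses characters 8-10.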
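
The selection rule can be sketched as a small shell function. The function name pick_flags and its arguments are hypothetical, invented for this illustration; it only slices the 10-flag string and does not query the system.

```shell
# pick_flags MODE CLASS
#   MODE  is a 10-flag string such as drwxr-xr-x
#   CLASS is owner, group, or anything else (treated as "everyone else")
# Prints the three flags the system would consult for that class of user.
pick_flags() {
    case $2 in
        owner) printf '%s\n' "$1" | cut -c2-4 ;;
        group) printf '%s\n' "$1" | cut -c5-7 ;;
        *)     printf '%s\n' "$1" | cut -c8-10 ;;
    esac
}

owner_flags=$(pick_flags drwxr-xr-x owner)
group_flags=$(pick_flags drwxr-xr-x group)
other_flags=$(pick_flags -rw-r----- other)

echo "owner=$owner_flags group=$group_flags other=$other_flags"
```

Note that only one set of three flags is ever used: if you are the owner, the group and other flags are not consulted at all.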

You can change the access characteristics of a file using the command chmod(1), which we will talk about in a later section.

Directory Related Commands

There are several commands affecting directories; here is a brief description and example of each. All are fully documented in section 1 of the UNIX Programmer's Manual.

cd directory

This changes the directory you're in. If you omit directory, you get put in your home directory.

pwd

Display the name of the current working directory. As an example, let's say your current working directory is /usr/ucb. To change it to /tmp, say:
	% pwd				show which directory I'm in
	/usr/ucb
	% cd /tmp			change to /tmp
	% pwd				now check I'm in /tmp
	/tmp

mkdir directory

Create directory. You must have write permission in directory's parent. I strongly recommend you use only letters, numbers, and the dash '-', the underscore '_', and the plus sign '+' in your directory name. You can use other characters (in fact, anything except '/'), but doing so may cause problems.

rmdir directory

Delete directory. Again, you need write permission in directory's parent. Also, directory must contain no files except . and .. or the rmdir will fail.
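
A short sketch of the "must be empty" rule, run in a scratch directory (mktemp -d is assumed to be available; the names help and notes are made up):

```shell
work=$(mktemp -d)
cd "$work"

mkdir help
echo "to do" > help/notes        # help now contains a file

# rmdir refuses to delete a directory that still has files in it.
if rmdir help 2> /dev/null; then first_try=removed; else first_try=refused; fi

rm help/notes                    # empty the directory ...
# ... and now rmdir succeeds.
if rmdir help 2> /dev/null; then second_try=removed; else second_try=refused; fi

echo "first: $first_try, second: $second_try"

cd /
rm -r "$work"
```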

mv old-directory new-directory

Rename old-directory to new-directory. This can only be done if old-directory and new-directory are on the same file system (basically, if you're moving a directory around in your account area, don't worry about this; but you can't move one of your directories to /tmp, for example). As an example, let's say you create a directory called help, decide to rename it nohelp, and finally delete it:
	% ls
	% mkdir help
	% ls
	help
	% mv help nohelp
	% ls
	nohelp
	% rmdir nohelp
	% ls
	%

cp -r old-directory new-directory

Make a copy of old-directory named new-directory. This does work across file systems, so:
	% ls . /tmp
	.:
	hw1
	
	/tmp:
	% mv hw1 /tmp/hw1
	mv: can't mv directories across file systems
	% cp -r hw1 /tmp/hw1
	% ls . /tmp
	.:
	hw1
	
	/tmp:
	hw1

rm -r directory

Delete directory and all of its contents. As with rmdir, you need to have write permission to the parent of directory, directory itself, and all of its subdirectories. Be very careful with this command as you cannot recover files and directories you've accidentally deleted!

File Related Commands

There are many UNIX commands dealing with files. Here, we go through some of the more common ones. Be aware each of these has lots of options this handout does not go into, so be sure to read the manual page if you want the full scoop!

chmod mode filename or chmod who op permission filename

This changes the access permissions of the file. It has two forms, illustrated above.

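In the first form, you specify an absolute mode, which is an octal number constructed from the logical OR of any set of the following modes:
400     read by owner
200     write by owner
100     execute by owner
040     read by group
020     write by group
010     execute by group
004     read by others
002     write by others
001     execute by others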

For example,
chmod 751 .login
makes the file .login readable, writable, and executable by the owner, readable and executable by any members of .login's group, and executable only by everyone else.
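
To see where 751 comes from, the sketch below ORs the individual permission bits together with shell arithmetic and applies the result to a scratch file (not to your real .login). It assumes mktemp -d is available; the file name f is made up.

```shell
# owner: read 0400, write 0200, execute 0100
# group: read 0040,             execute 0010
# other:                        execute 0001
mode=$(( 0400 | 0200 | 0100 | 0040 | 0010 | 0001 ))
octal=$(printf '%o' "$mode")      # print the result in octal
echo "mode is $octal"

work=$(mktemp -d)
: > "$work/f"                     # create an empty scratch file
chmod "$octal" "$work/f"
flags=$(ls -l "$work/f" | cut -c1-10)
echo "$flags"

rm -r "$work"
```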

The second form of the chmod(1) command allows you to specify the mode symbolically. A symbolic mode has the form

who op permission
where who is a combination of u (the owner), g (the group), o (everyone else), and a (all three); op is one of + (add the permission), - (remove the permission), and = (set exactly these permissions); and permission is any combination of r (read), w (write), and x (execute). Look at the permission modes of the .login file after the previous example. Executing
chmod u=rx,g-r,o+r .login
will change the owner's access to read and execute only (the write permission is deleted), will delete the read permission for members of the group, and will add read permission for everyone else. Here's what the two commands would do; note we use
ls -al .login
to check file status:
	% ls -al .login			check current access mode
	-rw-rw-rw-  1 ecs4005       188 Apr 13 15:53 .login
	% chmod 751 .login		change access modes
	% ls -al .login			check current access mode
	-rwxr-x--x  1 ecs4005       188 Apr 13 15:53 .login
	% chmod u=rx,g-r,o+r .login	change permissions again
	% ls -al .login			check current access mode
	-r-x--xr-x  1 ecs4005       188 Apr 13 15:53 .login
	%
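
The same sequence can be replayed safely on a scratch file instead of your real .login. This is a sketch that assumes mktemp -d is available; the file name login-copy is made up.

```shell
work=$(mktemp -d)
f="$work/login-copy"
: > "$f"                                  # create an empty scratch file

chmod 751 "$f"                            # absolute (octal) form
before=$(ls -l "$f" | cut -c1-10)

chmod u=rx,g-r,o+r "$f"                   # symbolic form
after=$(ls -l "$f" | cut -c1-10)

echo "$before -> $after"

chmod u+w "$f"                            # restore write permission before cleanup
rm -r "$work"
```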

touch file

This creates file (if it does not exist) or changes the time of last access and modification of file (if it does exist). It's not used very often, but sometimes is just what you need to get that makefile just right.
	% ls
	% touch xyzzy
	% ls -l
	-rw-r--r--  1 ecs4005       0 Jan 11 15:53 xyzzy
			... 2 minutes go by ...
	% touch xyzzy
	% ls -l
	-rw-r--r--  1 ecs4005       0 Jan 11 15:55 xyzzy
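
Rather than waiting two minutes, the effect on the modification time can be checked with explicit timestamps. This sketch uses two commands beyond the scope of this handout, touch -t (set a given time) and find -newer (compare modification times); mktemp -d is assumed to be available, and the names notes and reference are made up.

```shell
work=$(mktemp -d)
cd "$work"

touch xyzzy                               # xyzzy did not exist: created, empty
if [ -f xyzzy ] && [ ! -s xyzzy ]; then created_empty=yes; else created_empty=no; fi

echo "keep me" > notes
touch -t 200001010000 notes               # pretend notes was last modified 1 Jan 2000
touch -t 200101010000 reference           # a marker file dated 1 Jan 2001

before=$(find notes -newer reference)     # prints nothing: notes is older
touch notes                               # set the times of notes to "now"
after=$(find notes -newer reference)      # prints notes: it is now newer
contents=$(cat notes)                     # the data itself is untouched

echo "created_empty=$created_empty before='$before' after='$after'"

cd /
rm -r "$work"
```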

cp old-file new-file
mv old-file new-file
ln old-file new-file
ln -s old-file new-file

These commands all have the same form, and are often grouped together. The first makes a copy of old-file called new-file, the second changes the name of old-file to new-file, the third creates a hard link named new-file to old-file, and the fourth creates a symbolic link named new-file to old-file.

cp old-file1 old-file2 ... directory
mv old-file1 old-file2 ... directory
ln old-file1 old-file2 ... directory
ln -s old-file1 old-file2 ... directory

These are alternate forms of the above commands. The first makes a copy of the file(s) named old-file1 old-file2 ... in the directory directory; the new files retain the old names (but are in directory). The second command moves the files old-file1 old-file2 ... into directory, the third creates hard links named directory/old-file1, directory/old-file2, ... to old-file1, old-file2, ...; and the fourth creates symbolic links named directory/old-file1, directory/old-file2, ... to old-file1, old-file2, .... Because a symbolic link simply stores the name it was given, and a relative name stored in a link is interpreted starting from the directory containing the link, give absolute path names for old-file1, old-file2, ... when using the fourth form.
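
Here is a sketch of the directory forms of cp, ln, and mv in a scratch directory. It assumes mktemp -d is available; the names a, b, copies, links, and moved are made up.

```shell
work=$(mktemp -d)
cd "$work"
echo one > a
echo two > b
mkdir copies links moved

cp a b copies        # copies/a and copies/b are new, independent files
ln a b links         # links/a and links/b are hard links to a and b
mv a b moved         # a and b leave the current directory for moved

copy_a=$(cat copies/a)
link_b=$(cat links/b)
moved_b=$(cat moved/b)
if [ -e a ] || [ -e b ]; then originals=present; else originals=gone; fi

echo "copies/a=$copy_a links/b=$link_b moved/b=$moved_b originals=$originals"

cd /
rm -r "$work"
```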

cmp file1 file2

This performs a byte-by-byte comparison of two files. It is usually used to compare binary files because of its somewhat sparse output:
	% cmp hello goodbye
	hello goodbye differ: char 17816, line 49
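
In shell scripts, cmp is often used only for its exit status. The sketch below uses the -s option, which suppresses the message and leaves just the status (zero if the files are identical, nonzero otherwise). It assumes mktemp -d is available; the file name same is made up.

```shell
work=$(mktemp -d)
cd "$work"
printf 'abc\n' > hello
printf 'abd\n' > goodbye
cp hello same

# -s: print nothing, just report through the exit status
if cmp -s hello same; then identical=yes; else identical=no; fi
if cmp -s hello goodbye; then differ=no; else differ=yes; fi

echo "hello vs same identical: $identical; hello vs goodbye differ: $differ"

cd /
rm -r "$work"
```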

diff file1 file2
diff -r directory1 directory2

The first form displays line-by-line differences between a pair of text files; the second compares the files in the two named directories, directory1 and directory2. diff is completely unsuited for use with binary files:
	% diff xyzzy hello
	57,58c57,60
	<	and so I came down the mountain
	<	and saw a clear lake
	---
	>	and when he got there
	>	his poor cupboard was bare
	>	and so the poor doggie
	>	got none.
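
To read diff's output, it helps to try it on a tiny pair of files. In the sketch below only the second line differs, so the report starts with 2c2 ("line 2 of the first file changed into line 2 of the second"). It assumes mktemp -d is available and reuses the file names from the example above with made-up contents.

```shell
work=$(mktemp -d)
cd "$work"
printf 'first line\nand saw a clear lake\nlast line\n' > xyzzy
printf 'first line\nand when he got there\nlast line\n' > hello

# diff exits with a nonzero status when the files differ, so "|| true"
# keeps a strict shell from treating that as an error.
changes=$(diff xyzzy hello || true)
echo "$changes"

first_line=$(printf '%s\n' "$changes" | head -n 1)

# The exit status alone answers "are they the same?"
if diff xyzzy hello > /dev/null; then same=yes; else same=no; fi

cd /
rm -r "$work"
```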

df

This displays the amount of free disk space on all file systems:
	% df
	Filesystem     kbytes    used   avail capacity  Mounted on
	/dev/xy0a       14584   10014    3111    76%    /
	/dev/xy0g      686209  304550  313038    49%    /usr

head file

This displays the first few lines of the text file named file.

tail file

This displays the last few lines of the text file file.
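
"A few" means ten lines unless you say otherwise; the -n option to head and tail picks a different count. A sketch, assuming mktemp -d is available and using a made-up file of twelve numbered lines:

```shell
work=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$work/numbers"

default_last=$(head "$work/numbers" | tail -n 1)   # last of head's default ten lines
first_line=$(head -n 1 "$work/numbers")            # just the first line
last_line=$(tail -n 1 "$work/numbers")             # just the last line

echo "default head ends at $default_last; first=$first_line last=$last_line"
rm -r "$work"
```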

Acknowledgement

This document was originally written by Kevin Rich, and has been modified for ECS 40 by Matt Bishop.