I might not be able to give a lecture on filesystems this Friday (or
it might whiz past with my mile-a-minute speech), so here are some
detailed lecture notes on file systems to help you review for the
File systems. You know about file systems because that’s one of the
most obvious things about your computer. You know that your documents
are hidden somewhere on your hard disk. You know that programs are
stored in another directory. You know that you can make your own
directories to organize your programs and your data.
Why are file systems important? Well, remember what happens during
brown-outs or power fluctuations? If the power goes off and then on
again in a computer lab without uninterruptible power supplies
(UPSes), people will lose all their unsaved work. That’s because the
kind of memory we use loses its data if it doesn’t have power. Hard
disks, on the other hand, keep data even when the computer is off – so
Let’s take a look at the data associated with files. Different
operating systems keep track of different things, but here’s a short
- Name: The files in a directory should have unique names. This lets
us refer to that file.
- Type of file: What kind of file is it? Is it an image? A text
document? A webpage? This could help the operating system determine
the best way to open this file.
- Location: Remember our discussion about sectors, tracks, and
cylinders? Where is the file on the hard disk? In terms of the
logical filesystem, in what directory is the file?
- Size: How large is the file? How much data is in it right now?
- Protection: Can other people edit this file? run this file? read
this file? Who can work on this file?
- Time/date: When was it last modified? (Sometimes the operating
system stores more information, like creation time and access time)
- User identification: Who owns this file? (Windows 98 and below
don’t care about this.)
- Create, write, read, delete: Makes sense.
- Reposition within a file: When you’re reading a book and you want
to flip back to a certain place, you don’t have to close the book
and then open it again. You can just go to a certain position. Same
with files. Some files will let you easily jump to a specified
position within the file. On the other hand, some files will only
let you go forward and not back.
- Truncate: Get rid of everything after a certain position.
- Sequential: You can only access it going forward. You can’t go back.
To read the 7th line in the file, you have to read the first 6
- Direct: You can jump around. This is also known as a random-access
- Indexed: This is like the way a dictionary works. You can jump
around, but some places (like the beginning of entries for each
letter) are marked so that you can jump to them easily.
Your hard disk is divided into partitions (PC term) or volumes (Mac
term). These are the “drives” you see on Windows. You can split up one
hard disk into a C: drive and a D: drive. If you learn about something
called RAID, you can combine two hard disks into just one drive.
(“Drive” is a confusing term, though, as you can have C:, D:, E:, F:,
etc. on just one hard disk.)
Long, long ago, MSDOS didn’t have directories. Seriously. Well, there
was one directory, and all of your files had to be under it. You can
imagine how this sucked, as all of your program files had to be in the
same place as your data. Not only that, you were limited to 8
characters for the filename and 3 characters for the extension.
People used clever names like AAAAAAAA.TXT, AAAAAAAB.TXT and
AAAAAAAC.TXT for their files. Of course, after six months, who could
remember what the contents of each file were?
The diagram shows a single-level directory. The directory entries
are “cat”, “bo”, “a”, “test”, “data”, “mail”, “cont”, “hex” and
“records”. All of these entries point to files.
Now we got to organize them by user, at least. Still sucked because
all of your files were in just one directory, but at least you didn’t
have to worry about other people’s naming schemes.
This is the kind of directory tree you got used to in Microsoft
Windows. You can create directories (aka folders) inside directories
inside directories inside directories.
Links aren’t actually real directories. In Microsoft Windows, they’re
fake – for example, you can’t cd into them from the command line.
Remember the ln command from Unix? This is where links in Unix come
in. (They’re _real_ links, not like the fake ones in Microsoft
Windows. =) )
ln somefile anotherfile
creates a link: anotherfile will refer to the exact same file that
somefile refers to on the hard disk. They point to the same place on
the hard disk.
This allows you to have acyclic graph directories, because the same
file is referred to in two or more places.
ln returns! This time, we use it to make a symbolic link.
ln somefileordir anotherfileordir -s
Symbolic links can point to directories, so it’s perfectly acceptable
to make a link that points to one of your parent directories and thus
get into some kind of loop.
Basically, a general graph directory is anything that could have
You don’t want just anyone messing around with your files. Remember
Unix file permissions and chmod? This slide talks about some of those
permissions, although the access groups the slide uses are different
from Unix permissions. Under Unix, it’s user, group, others. Other
operating systems support access control lists (ACLs) – this means
that instead of just giving permission to one group, you can specify
exactly who gets to do what to the file.
Here we start looking at how things are physically stored on your
hard disk. You can start up defrag to get an idea of what it looks
It’s like contiguous allocation for memory. All the space the file
needs should be in one continuous block. The nice thing about it is
that it’s easy to figure out where all the data is – just find the
starting position and count off so and so many bytes. If you need to
allocate space for a new file, try either first-fit or best-fit.
Downside of this is that file sizes are fixed, because once it’s been
allocated and another file has been allocated next to it, the file
The file is treated as a list of blocks, where a block is a fixed-size
contiguous collection of bytes on the hard disk. The blocks don’t all
have to be together on the hard disk – each block in the file points
to the next one. Plus side: files can grow, just link in new blocks.
Minus: To read the file, you have to hop around the hard disk, plus
all of that pointing around wastes space because each block has to
refer to the next one. Solution: use groups of blocks (aka clusters),
but these might be bigger than you need – if so, then space is wasted.
Form of linked allocation. The links to the next block are kept in
one large table in a certain place in the hard disk.
Still uses blocks, but instead of each block pointing to the next one,
one block has an index that points to all of the blocks. This block
is called the inode (index node) under Unix. If the file is too big
for one index block, the index block refers to other index blocks.
How your computer can tell how much space you have free.
- Bit vector/map: keep track of every single block! Requires 1 bit per
block (1 if it’s occupied, 0 if it’s free).
- Linked list: Like the linked allocation scheme (one block points to
the next one) except this keeps track of the free ones.
- Grouping: The first free block has an index of free blocks (as many
as it can). If there are more free blocks than will fit in the
index, it just points to another index block.
- Counting: Keep track of the beginning of each set of free blocks
and how many free blocks there are.
How directories work. That is – how do directories store info about
files within them?
One way is to just keep a list of all the filenames in that directory.
If you have a million files (and it’s happened!), this can get
Hashtables are supposed to be faster at lookup, so naturally there’s
a way to use them too.
Remember the scandisk that shows up when you improperly shutdown your
computer? This is what’s happening. Your computer’s checking if what
it thinks your filesystem should be like is different from what it
Quick review: arrays
Arrays are a neat way to store a fixed number of items. You can
declare and create arrays and loop over them. They’re perfect for
board games and other applications where you know exactly how many
items you’ll need to store.
EXERCISE: (Due 11:59:59 PM Sunday)
Download http://sacha.free.net.ph/notebook/cs21a/address.zip and unzip
it into a convenient directory. Open the HTML file and run it in
JCreator, or open a command prompt and type “appletviewer
address.html”. (If it gives you “Command not found”, try
E-mail your answers to me by the specified time.
- Put 0 in the index field and a name in the name field. Press Set.
Put 1 in the index field and a different name in the name field.
Press Set. Put 0 in the index field and press Show. What happens?
Put 1 in the index field and press Show. What happens? (Note that
the data doesn’t get saved to the hard disk – you’ll learn how to do
that in CS21B!)
- What is the largest number that you can use in the index field
without getting an error?
- What is the smallest number that you can use in the index field
without getting an error?
- How many items can this simple address book store?
- As practice in applets, components, layouts and arrays, write your
version of this applet (ReallySimpleAddressBook). It should look
like this applet and behave exactly like this applet. (5 points)
- As practice in objects and arrays of objects, create a Person class.
The Person class should have a name attribute and a phone attribute
(both Strings), and it can have any methods you want. Make a
SimpleAddressBook applet that has a “Phone:” field after the Name:
field. It should set and show this as well as the name. Instead of
having two separate arrays for name and phone (you might forget to
update one of them – I do that all the time!), use just one array of
Person objects. (5 points)
- Make a SearchableAddressBook applet. Add a “Search by name” button.
When pressed, you should display the first record whose name equals
the one in the name field. NOTE: Use .equals(…) instead of == to
compare strings. You can also use .equalsIgnoreCase(…) if you want
“ABC” to be considered the same as “abc”. (5 points)
You may ask other people for help as long as you type and understand
your entire submission.
Jeremy Hogan said on firstname.lastname@example.org:
Many of you have probably already heard of Meetup.com due to its
prominence in the Democratic primaries. In order to help LUGs, school
groups, and advocates to better and more easily co-ordinate gatherings
we’ve started redhat.meetup.com.
You’ll see us adding functionality over the coming weeks, and we’ll be
co-ordinating a special global meetup on April 1, 2004, so I urge you to
go ahead and sign up now. If you haven’t used it, it’s a great way to
spin up activity, to meet others interested in Red Hat, Linux, F/OSS
face to face in a casual setting.
http://www.meetup.com might be a nice way to organize local meetings.
E-Mail from Jeremy Hogan
I keep getting
Dear eBay Member,
Dear customer, you have been billed for $15.00 recently. Please
update your billing information at eBay Billing Center.
This is eBay auto generated message, if you think you received it
by mistake or you want to remove these notifications, please
update your profile at Billing Center.
in my mail. Now, I don’t use eBay at all, so chances are something weird is happening.
E-Mail from eBay Service
On email@example.com, John McKown said:
where you can download free instructions
on how to build a wearable, chording keyboard.
E-Mail from John McKown
- Alain Chesnais (a-LAHN she-NAY)
- YT Lee, head of SEAGRAPH
- Barbara, in charge of organizing conferences (sponsored and in cooperation of)
- CGEMS link from http://www.siggraph.org (Computer Graphics Educational Materials Store, free)
- ACM DL
- Talk at 3:30 – trends
Yes, there are exemptions: 90 and above. This includes the grades from
the projects. The list of exempted people will be posted on Monday.
Bad news for the CS juniors: you have to take the test also at the
same date! (Uh oh…) Thursday, 1:30 – 3:30 PM at C115. (That’s Chem
115) AND there’s extra work _after_ the finals.
(Apologies for the confusion. Apparently, missed important small text.
(defvar nethack-screenshot-file "~/.nethack-notes" "Filename to store Nethack data in.") (defun sacha/nethack-take-screenshot (caption) (interactive "MCaption: ") (save-window-excursion (save-excursion (find-file nethack-screenshot-file) (goto-char (point-min)) (insert ".#1 " caption "\n\n" (with-current-buffer (get-buffer "*nethack map*") (buffer-substring-no-properties (point-min) (point-max))) "\n\n" (with-current-buffer (get-buffer "*nethack status*") (buffer-substring-no-properties (point-min) (point-max))) "\n\n") (save-buffer))))