Chapter 7 FTP

7.1 TONS OF FILES

Hundreds of systems connected to the Internet have file libraries, or archives, accessible to the public. Much of what they hold is free or low-cost shareware for virtually every make of computer. If you want a different communications program for your IBM, or feel like playing a new game on your Amiga, you'll be able to get it from the Net.

There are also libraries of documents. If you want a copy of a recent U.S. Supreme Court decision, you can find it on the Net. Copies of historical documents, from the Magna Carta to the Declaration of Independence, are also yours for the asking, along with a translation of a telegram from Lenin ordering the execution of rebellious peasants. You can also find song lyrics, poems, even summaries of every "Lost in Space" episode ever made, as well as extensive files detailing everything you could ever possibly want to know about the Net itself.

First you'll see how to get these files; then we'll show you where they're kept.

The commonest way to get these files is through the file transfer protocol, or ftp. As with telnet, not all systems that connect to the Net have access to ftp. However, if your system is one of those without it, you'll still be able to get many of these files through e-mail (see the next chapter).

Starting ftp is as easy as using telnet. At your host system's command line, type

ftp site.name

and hit enter, where "site.name" is the address of the ftp site you want to reach. One major difference between telnet and ftp is that it is considered bad form to connect to most ftp sites during their business hours (generally 6 a.m. to 6 p.m. local time). This is because transferring files across the network takes up considerable computing power, which during the day is likely to be needed for whatever the computer's main function is. There are some ftp sites that are accessible to the public 24 hours a day, though. You'll find these noted in the list of ftp sites in section 7.6.
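The business-hours rule is easy to express as a check. What follows is only a sketch in Python: the function name, the 24-hour flag, and the choice to treat 6 p.m. sharp as already acceptable are assumptions made for illustration, not part of any ftp program.

```python
def is_polite_to_connect(local_hour, open_24_hours=False):
    """Return True if connecting now is considered good form.

    local_hour is the hour (0-23) on the ftp site's own clock, not yours.
    The 6 a.m. to 6 p.m. window is the rule of thumb given in the text.
    """
    if not 0 <= local_hour <= 23:
        raise ValueError("local_hour must be between 0 and 23")
    if open_24_hours:
        # Some sites welcome public callers around the clock.
        return True
    # Assumption: business hours run from 6 a.m. up to, but not including, 6 p.m.
    return not (6 <= local_hour < 18)


print(is_polite_to_connect(14))        # mid-afternoon at the site
print(is_polite_to_connect(22))        # late evening at the site
print(is_polite_to_connect(14, True))  # a site that is open 24 hours
```

Remember that the hour that matters is the one at the ftp site, which may be several time zones away from you.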

7.2 YOUR FRIEND ARCHIE

How do you find a file you want, though? Until a few years ago, this could be quite the pain - there was no master directory to tell you where a given file might be stored on the Net. Who'd want to slog through hundreds of file libraries looking for something?

Alan Emtage, Bill Heelan and Peter Deutsch, students at McGill University in Montreal, asked the same question. Unlike the weather, though, they did something about it. They created a database system, called archie, that would periodically call up file libraries and basically find out what they had available. In turn, anybody could dial into archie, type in a file name, and see where on the Net it was available. Archie currently catalogs close to 1,000 file libraries around the world.

Today, there are three ways to ask archie to find a file for you: through telnet, through a "client" archie program on your own host system, or through e-mail. All three methods let you type in a full or partial file name and will tell you where on the Net it's stored.

If you have access to telnet, you can telnet to one of the following addresses: archie.mcgill.ca; archie.sura.net; archie.unl.edu; archie.ans.net; or archie.rutgers.edu. If asked for a log-in name, type

archie

and hit enter. When you connect, the key command is prog, which you use in this form:

prog filename

followed by enter, where "filename" is the program or file you're looking for. If you're unsure of a file's complete name, try typing in part of the name. For example, "PKZIP" will work as well as "PKZIP204.EXE." The system does not support DOS or Unix wildcards. If you ask archie to look for "PKZIP*," it will tell you it couldn't find anything by that name.

One thing to keep in mind is that a file is not necessarily the same as a program - it could also be a document. This means you can use archie to search for, say, everything online related to the Beatles, as well as computer programs and graphics files.

A number of Net sites now have their own archie programs that take your request for information and pass it on to the nearest archie database - ask your system administrator if she has it online. These "client" programs seem to provide information a lot more quickly than the actual archie itself! If it is available, at your host system's command line, type

archie -s filename

where filename is the program or document you're looking for, and hit enter. The -s tells the program to ignore case in a file name and lets you search for partial matches. You might actually want to type it this way:

archie -s filename|more

which will stop the output every screen (handy if there are many sites that carry the file you want). Or you could open a file on your computer with your text-logging function.

The third way, for people without access to either of the above, is e-mail. Send a message to archie@quiche.cs.mcgill.ca. You can leave the subject line blank. Inside the message, type

prog filename

where filename is the file you're looking for. You can ask archie to look up several programs by putting their names on the same "prog" line, like this:

prog file1 file2 file3

Within a few hours, archie will write back with a list of the appropriate sites.

In all three cases, if there is a system that has your file, you'll get a response that looks something like this:

Host sumex-aim.stanford.edu

Location: /info-mac/comm

FILE -rw-r--r-- 258256 Feb 15 17:07 zterm-09.hqx

Location: /info-mac/misc

FILE -rw-r--r-- 7490 Sep 12 1991 zterm-sys7-color-icons.hqx

Chances are, you will get a number of similar-looking responses for each program. The "Host" is the system that has the file. The "Location" tells you which directory to look in when you connect to that system. Ignore the funny-looking collections of r's and hyphens for now. After them come the size of the file or directory listing in bytes, the date it was uploaded, and the name of the file.
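If you save an archie reply to a file, a few lines of code can pull out the parts you need for ftp. This is a rough sketch that assumes the reply follows the Host / Location / FILE layout shown above; the function name and the dictionary keys are made up for this illustration, and real replies may contain other kinds of lines, which this sketch simply skips.

```python
SAMPLE_REPLY = """\
Host sumex-aim.stanford.edu

Location: /info-mac/comm
FILE -rw-r--r-- 258256 Feb 15 17:07 zterm-09.hqx

Location: /info-mac/misc
FILE -rw-r--r-- 7490 Sep 12 1991 zterm-sys7-color-icons.hqx
"""


def parse_archie_reply(text):
    """Turn an archie reply into a list of dicts, one per FILE line."""
    results = []
    host = None
    location = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Host "):
            host = line.split(None, 1)[1]
        elif line.startswith("Location:"):
            location = line.split(":", 1)[1].strip()
        elif line.startswith("FILE "):
            # Fields: FILE, flags, size, month, day, time-or-year, name
            fields = line.split()
            name = " ".join(fields[6:])
            results.append({
                "host": host,
                "path": (location or "").rstrip("/") + "/" + name,
                "size": int(fields[2]),
                "date": " ".join(fields[3:6]),
            })
    return results


for entry in parse_archie_reply(SAMPLE_REPLY):
    print(entry["host"], entry["path"], entry["size"])
```

Each entry gives you the two things you need next: the site to connect to and the full path of the file on that site.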

7.3 GETTING THE FILES

Now you want to get that file.

Assuming your host site does have ftp, you connect in a similar fashion to telnet, by typing:

ftp sumex-aim.stanford.edu

(or the name of whichever site you want to reach). Hit enter. If the connection works, you'll see this:

Connected to sumex-aim.stanford.edu.

220 SUMEX-AIM FTP server (Version 4.196 Mon Jan 13 13:52:23 PST 1992) ready.

Name (sumex-aim.stanford.edu:adamg):

If nothing happens after a minute or so, hit control-C to return to your host system's command line. But if it has worked, type

anonymous

and hit enter. You'll see a lot of references on the Net to "anonymous ftp." This is how it gets its name - you don't really have to tell the library site what your name is. The reason is that these sites are set up so that anybody can gain access to certain public files, while letting people with accounts on the sites log on and access their own personal files. Next, you'll be asked for your password. As a password, use your e-mail address. This will then come up:

230 Guest connection accepted. Restrictions apply.

Remote system type is UNIX.

Using binary mode to transfer files.

ftp>

Now type

ls

and hit enter. You'll see something awful like this:

200 PORT command successful.

150 Opening ASCII mode data connection for /bin/ls.

total 2636

-rw-rw-r-- 1 0 31 4444 Mar 3 11:34 README.POSTING

dr-xr-xr-x 2 0 1 512 Nov 8 11:06 bin

-rw-r--r-- 1 0 0 11030960 Apr 2 14:06 core

dr--r--r-- 2 0 1 512 Nov 8 11:06 etc

drwxrwsr-x 5 13 22 512 Mar 19 12:27 imap

drwxr-xr-x 25 1016 31 512 Apr 4 02:15 info-mac

drwxr-x--- 2 0 31 1024 Apr 5 15:38 pid

drwxrwsr-x 13 0 20 1024 Mar 27 14:03 pub

drwxr-xr-x 2 1077 20 512 Feb 6 1989 tmycin

226 Transfer complete.

ftp>

Ack! Let's decipher this Rosetta Stone.

First, ls is the ftp command for displaying a directory (you can actually use dir as well, but if you're used to MS-DOS, this could lead to confusion when you try to use dir on your host system, where it won't work, so it's probably better to just remember to always use ls for a directory while online).

The very first letter on each line tells you whether the listing is for a directory or a file. If the first letter is a "d" or an "l," it's a directory (strictly, an "l" marks a link, which on most ftp sites points to a directory). Otherwise, it's a file. The rest of that weird set of letters and dashes consists of "flags" that tell the ftp site who can look at, change or delete the file. You can safely ignore it. You can also ignore the rest of the line until you get to the number just before the date. This tells you how large the file is, in bytes. If the line is for a directory, the number gives you a rough indication of how many items are in that directory - a directory listing of 512 bytes is relatively small. Next comes the date the file or directory was uploaded, followed (finally!) by its name.

Notice the README.POSTING file up at the top of the directory. Most archive sites have a "read me" document, which usually contains some basic information about the site, its resources and how to use them. Let's get this file, both for the information in it and to see how to transfer files from there to here. At the ftp> prompt, type

get README.POSTING

and hit enter. Note that ftp sites are no different from Unix sites in general: they are case-sensitive, so type the name exactly as it appears in the listing. You'll see something like this:

200 PORT command successful.

150 Opening BINARY mode data connection for README.POSTING (4444 bytes).

226 Transfer complete.

4444 bytes received in 1.177 seconds (3.8 Kbytes/s)

And that's it! The file is now located in your home directory on your host system, from which you can now download it to your own computer. The simple "get" command is the key to transferring a file from an archive site to your host system. If you want to download more than one file at a time (say, a series of documents), use mget instead of get; for example:

mget *.txt

This will transfer copies of every file ending with .txt in the given directory. Before each file is copied, you'll be asked if you're sure you want it. Despite this, mget could still save you considerable time - you won't have to type in every single file name. If you want to save even more time, and are sure you really want ALL of the given files, type

prompt

before you do the mget command. This will turn off the prompt, and all the files will be zapped right into your home directory.
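To get a feel for which files a pattern such as *.txt will pick up, here is a small sketch using Python's standard fnmatch module. It only mimics the shell-style matching on a list of names; a real ftp program asks the remote site for the matching names, and the function name and sample listing below are invented for the example.

```python
from fnmatch import fnmatchcase


def mget_candidates(names, pattern):
    """Return the names that an mget with this pattern would offer to transfer."""
    # fnmatchcase is case-sensitive, as Unix ftp sites are.
    return [name for name in names if fnmatchcase(name, pattern)]


listing = ["README", "faq.txt", "intro.txt", "NOTES.TXT", "zterm-09.hqx"]
print(mget_candidates(listing, "*.txt"))  # prints ['faq.txt', 'intro.txt']
```

Notice that NOTES.TXT is left out: because matching is case-sensitive, you would need a second mget with *.TXT to fetch it.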

There is one other command to keep in mind. If you want to get a copy of a computer program, type

bin

and hit enter. This tells the ftp site and your host site that you are transferring a binary file, i.e., a program. Most ftp sites now use binary format as a default, but it's a good idea to do this in case you've connected to one of the few that doesn't.

To switch to a directory, type

cd directory-name

(substituting the name of the directory you want to access) and hit enter. Type

ls

and hit enter to get the file listing for that particular directory. To move back up the directory tree, type

cd ..

(note the space between the d and the first period) and hit enter. Or you could type

cdup

and hit enter. Keep doing this until you get to the directory of interest. Alternatively, if you already know the directory path of the file you want (from our friend archie), after you connect, you could simply type

get directory/subdirectory/filename
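The whole session (connect, log in as anonymous, switch to binary, get a file by its full path) can also be scripted. The sketch below uses Python's standard ftplib module. The function name and the ftp_factory parameter are assumptions added for this example (the latter only so the function can be tried without a live connection), and the site named in the usage note may no longer accept anonymous ftp.

```python
import ftplib


def fetch_file(host, remote_path, local_name, email, ftp_factory=ftplib.FTP):
    """Log in anonymously and copy one remote file to local_name."""
    ftp = ftp_factory(host)            # same as typing: ftp host
    try:
        # "anonymous" is the log-in name; your e-mail address is the password.
        ftp.login("anonymous", email)
        with open(local_name, "wb") as out:
            # retrbinary transfers in binary mode, like "bin" followed by "get".
            ftp.retrbinary("RETR " + remote_path, out.write)
    finally:
        ftp.quit()
    return local_name
```

A call such as fetch_file("sumex-aim.stanford.edu", "info-mac/comm/zterm-09.hqx", "zterm-09.hqx", "you@your.host.name") would correspond to the typed session above, with the e-mail address being a placeholder for your own.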

On many sites, files meant for public consumption are in the pub or public directory; sometimes you'll see an info directory. Almost every site has a bin directory, which at first glance sounds like a bin in which interesting stuff might be dumped. But it actually stands for "binary" and is simply a place for the system administrator to store the programs that run the ftp system. Lost+found is another directory that looks interesting but actually never has anything of public interest in it.

Earlier, you saw how to use archie. From our example, you can see that some system administrators go a little berserk when naming files. Fortunately, there's a way for you to rename the file as it's being transferred. Using our archie example, you'd type

get zterm-sys7-color-icons.hqx zterm.hqx

and hit enter. Instead of having to deal constantly with a file called zterm-sys7-color-icons.hqx, you'll now have one called, simply, zterm.hqx. Those last three letters bring up something else: Many program files are compressed to save on space and transmission time. In order to actually use them, you'll have to use an un-compress program on them first.

7.4 ODD LETTERS - DECODING FILE ENDINGS

There is a wide variety of compression methods in use. You can tell which method was used by the last one to three letters at the end of a file name. Here are some of the more common ones and what you'll need to uncompress the files they create (most of these decompression programs can be located through archie).

.txt or .TXT By itself, this means the file is a document, rather than a program.

.ps or .PS A PostScript document (in Adobe's page description language). You can print this file on any PostScript capable printer, or use a previewer, like GNU project's GhostScript.

.doc or .DOC Another common "extension" for documents. No decompression is needed, unless it is followed by:

.Z This indicates a Unix compression method. To uncompress, type

uncompress filename.Z

and hit enter at your host system's command line. If the file is a compressed text file, you can read it online by instead typing

zcat filename.txt.Z |more

u16.zip is an MS-DOS program that will let you download such a file and uncompress it on your own computer. The Macintosh equivalent program is called MacCompress (use archie to find these).

.zip or .ZIP These indicate the file has been compressed with a common MS-DOS compression program, known as PKZIP (use archie to find PKZIP204.EXE). Many Unix systems will let you un-ZIP a file with a program called, well, unzip.

.gz A Unix version of ZIP. To uncompress, type

gunzip filename.gz

at your host system's command line.

.zoo or .ZOO A Unix and MS-DOS compression format. Use a program called zoo to uncompress.

.Hqx or .hqx Macintosh compression format. Requires the BinHex program.

.shar or .Shar Another Unix format. Use unshar to uncompress.

.tar Another Unix format, often used to bundle several related files into one large file. Most Unix systems will have a program called tar for "un-tarring" such files. Often, a "tarred" file will also be compressed with the gz method, so you first have to use gunzip and then tar.

.sit or .Sit A Macintosh format that requires the StuffIt program.

.ARC Another MS-DOS format, which requires the use of the ARC or ARCE programs.

.LZH Another MS-DOS format; requires the use of LHARC.
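For the formats you can unpack right on a Unix host system, the list above boils down to a lookup from file ending to program. Here is a hedged sketch of that lookup: the table covers only the Unix-side programs named above, the function name is invented, and folding upper and lower case together is a simplification made for the example.

```python
# File ending -> Unix program named in the list above (keys in lower case).
TOOLS = {
    ".z": "uncompress",
    ".zip": "unzip",
    ".gz": "gunzip",
    ".zoo": "zoo",
    ".shar": "unshar",
    ".tar": "tar",
}


def unpack_steps(filename):
    """List the programs to run, outermost wrapper first."""
    steps = []
    name = filename.lower()  # simplification: treat .Z and .z alike
    while True:
        for ending, tool in TOOLS.items():
            if name.endswith(ending):
                steps.append(tool)
                name = name[: -len(ending)]  # peel off this layer and look again
                break
        else:
            # No known ending left, so nothing more to unpack.
            return steps


print(unpack_steps("ls-lR.Z"))
print(unpack_steps("archive.tar.gz"))
print(unpack_steps("README"))
```

A file with two endings needs two steps in order: the outer compression comes off first, then tar unbundles what is inside. An empty list means the file can be read as is.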

A few last words of caution: Check the size of a file before you get it. The Net moves data at phenomenal rates of speed. But that 500,000-byte file that gets transferred to your host system in a few seconds could take half an hour or more to download to your computer if you're using a 2400-baud modem. Your host system may also have limits on the number of bytes you can store online at any one time. Also, although it is extremely unlikely you will ever get a file infected with a virus, if you plan to do much downloading over the Net, you'd be wise to invest in a good anti-viral program, just in case.
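You can estimate the wait before you commit to it. The sketch below gives a best-case figure only, under stated assumptions: a 2400-baud modem is treated as 2400 bits per second, each byte is assumed to cost 10 bits on the line (8 data bits plus a start and a stop bit), and protocol overhead, line noise and retries, which all make real downloads slower, are ignored. The function name is made up for the example.

```python
def download_minutes(size_bytes, bits_per_second, bits_per_byte=10):
    """Best-case transfer time in minutes, ignoring overhead and line noise."""
    seconds = size_bytes * bits_per_byte / bits_per_second
    return seconds / 60


print(round(download_minutes(500000, 2400), 1))  # prints 34.7
```

So even under ideal conditions, the 500,000-byte file that crossed the Net in seconds ties up your phone line for over half an hour.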

7.5 THE KEYBOARD CABAL

System administrators are like everybody else - they try to make things easier for themselves. And when you sit in front of a keyboard all day, that can mean trying everything possible to reduce the number of keys you actually have to hit each day. Unfortunately, that can make it difficult for the rest of us.

You've already read about bin and lost+found directories. Etc is another seemingly interesting directory that turns out to be another place to store files used by the ftp site itself. Again, nothing of any real interest.

Then, once you get into the actual file libraries, you'll find that in many cases, files will have such non-descriptive names as V1.1-AK.TXT. The best known example is probably a set of several hundred files known as RFCs, which provide the basic technical and organizational information on which much of the Internet is built. These files can be found on many ftp sites, but always in a form such as RFC101.TXT, RFC102.TXT and so on, with no clue whatsoever as to what information they contain.

Fortunately, almost all ftp sites have a "Rosetta Stone" to help you decipher these names. Most will have a file named README (or some variant) that gives basic information about the system. Then, most directories will either have a similar README file or will have an index that does give brief descriptions of each file. These are usually the first file in a directory and often are in the form 00INDEX.TXT. Use the ftp get command to fetch this file. You can then scan it online or download it to see which files you might be interested in.

Another file you will frequently see is called ls-lR.Z. This contains a listing of every file on the system, but without any descriptions (the name comes from the Unix command ls -lR, which gives you a listing of all the files in all your directories). The Z at the end means the file has been compressed, which means you will have to use a Unix uncompress command before you can read the file.

And finally, we have those system administrators who almost seem to delight in making things difficult - the ones who take full advantage of Unix's ability to create absurdly long file names. On some ftp sites, you will see file names as long as 80 characters or so, full of capital letters, underscores and every other orthographic device that will make it almost impossible for you to type the file name correctly when you try to get it. Your secret weapon here is the mget command. Just type mget, a space, and the first five or six letters of the file name, followed by an asterisk, for example:

mget This_F*

The FTP site will ask you if you want to get the file that begins with that name. If there are several files that start that way, you might have to answer 'n' a few times, but it's still easier than trying to recreate a ludicrously long file name.

7.6 SOME INTERESTING FTP SITES

What follows is a list of some interesting ftp sites, arranged by category. With hundreds of ftp sites now on the Net, however, this list barely scratches the surface of what is available. Liberal use of archie will help you find specific files. The times listed for each site are in Eastern time and represent the periods during which it is considered acceptable to connect.

AMIGA

ftp.uu.net Has Amiga programs in the systems/amiga directory.

Available 24 hours.

wuarchive.wustl.edu Look in the pub/aminet directory.

Available 24 hours.

ATARI

atari.archive.umich.edu Find almost all the Atari files you'll ever need, in the atari directory.

7 p.m. - 7 a.m.

BOOKS

rtfm.mit.edu The pub/usenet/rec.arts.books directory has reading lists for various authors as well as lists of recommended bookstores in different cities. Unfortunately, this site uses incredibly long file names - so long they may scroll off the end of your screen if you are using an MS-DOS computer or certain other machines. Even if you want just one of the files, it probably makes more sense to use mget than get. This way, you will be asked on each file whether you want to get it; otherwise you may wind up frustrated because the system will keep telling you the file you want doesn't exist (since you may miss the end of its name due to the scrolling problem).

6 p.m. - 6 a.m.

                         
