WebRFM 0.4 (beta) - A Remote CGI File Manager.
Copyright (C) 1999  Yoram Last (ylast@mindless.com)

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.


                             ABOUT THIS FILE:
                            ==================
This is the readme file for WebRFM 0.4, released December 18, 1999. This file
contains general information and installation instructions for WebRFM. More
information can also be obtained by looking at the main WebRFM script in
WebRFM's 'scripts' directory. Further documentation is included in WebRFM's
online help pages. WebRFM must be installed before these pages can be
accessed properly. Also, for the latest information on WebRFM, you should
check its homepage at
http://webrfm.netpedia.net/index.html


                                 CONTENTS:
                               =============
     A. WHAT IS THIS?
     B. WHAT DOES IT DO?
     C. EXTERNAL PROGRAMS
     D. COMPATIBILITY
     E. SYSTEM REQUIREMENTS
     F. INSTALLATION
     G. STATUS OF THIS PROGRAM
     H. TODOS
     I. TECHNICAL SUPPORT


A. WHAT IS THIS?
=================
WebRFM (Web-based Remote File Manager) is a CGI-Perl program aimed at providing
a single solution for remote Web-based file management, and at replacing traditional
FTP-based access for that purpose. It is suitable for managing websites, as well
as for more general purpose file management tasks. WebRFM combines a "visible"
HTML 3.2 compliant form-based layer (which is in the spirit of the tools currently
provided by many large hosting services) along with a "hidden" direct HTTP layer
that implements a class 1 WebDAV server. Support for some legacy HTTP methods
(which are essentially borrowed from AOLserver and Netscape's Enterprise server) is
also provided. While WebRFM can be installed and used by individual users, it is
specifically designed to provide a secure system-wide solution that is suitable for
use by ISPs, web-space providers, etc. WebRFM currently runs on UNIX/Linux
systems.


B. WHAT DOES IT DO?
===================
1. Provides a simple-minded (but effective) HTML form-based file manager.
2. Provides a built-in HTML form-based text editor.
3. Supports file retrieval (downloading) as well as form-based (RFC 1867)
  file uploading. An FTP-like 'text mode' option is available for file
  transfers in both directions.
4. Provides support for the HTTP 1.1 'PUT' and 'DELETE' methods. Files can
  be transparently edited and then published using applications that support
  'PUT' (such as the HTML editor in Netscape's Communicator).
5. Provides extensive support for additional HTTP methods that are used for
   content management. This currently includes a rough implementation of
   a class 1 WebDAV server (MKCOL, MOVE, COPY, and PROPFIND methods), along
   with support for some legacy (non-standardized) HTTP methods (MKDIR, BROWSE,
   INDEX, SAVE, EDIT, and RMDIR methods). This results in WebRFM being able
   to work properly with many clients that are designed to use those HTTP
   extension methods. In particular, AOLpress, SiteCopy, Cadaver, Microsoft's
   Web Folders (which comes as part of Internet Explorer 5) and Office 2000
   applications, and Netscape's Communicator Roaming Profiles are fully
   functional with WebRFM.
6. Designed to operate in a secure way on multi-user systems. In particular,
  the following security-related features are provided:
  a) Runs in the security context (UID/GID) of the authenticated user that
    is using it, so OS-based restrictions (such as quota limits) are
    enforced. A special setuid wrapper is provided as a simple means of
    running WebRFM in this way. It is also possible to use other
    wrappers, such as the Apache 'suEXEC' wrapper.
  b) Optionally implements a user-dependent 'virtual root directory'. By
    default, each user's home directory appears to him as if it were the
    file system root directory and he can't access anything outside of it.
    The 'virtual root directory' can be changed to be any other directory,
    including a subdirectory of the user's home directory.
  c) Contains built-in access control mechanisms (both location-based and
    user-based) that can be used to enhance and double-check server-imposed
    access control. Implements various checks (such as imposing minimal
    UID and GID to run as) to ensure secure operation.
  d) Has a built-in permissions engine that can be easily customized to
    impose various restrictions beyond those that are natively provided
    by the OS.
7. Modular, highly configurable design. WebRFM's behavior is controlled
  by a fairly large number of variables that can be changed to customize it
  in various ways. In particular, WebRFM's HTML interface is very
  configurable. Arbitrary parameters for the '<BODY>' tag, two sets of table
  parameters, and several size parameters can be set, and they control
  the appearance of the interface in a consistent way. Properties that should
  be user-controlled (like those that affect the appearance of WebRFM's
  HTML output) are stored in a per-user configuration file and can be
  modified from a web-based interface. Administrators can disable user
  control of some (or even all) of these properties. Properties that
  should only be changed by administrators are set in the main script 
  and are protected from user intervention.

Remark: Essentially everything that can be done by using WebRFM can also
 be done through FTP. Some of the main advantages of WebRFM over FTP are:
 1) It does not require any client other than a web browser, and it is
   thus likely to be simpler to use for non-technical users.
 2) It can be much more secure, because:
   a) It can be used over completely encrypted connections (by using an SSL
     capable server).
   b) It can be used in conjunction with secure authentication schemes (such
     as digest authentication) that avoid sending plain text passwords over
     non-encrypted connections.
   c) It can be used behind a firewall through a standard HTTP proxy server.
   d) The 'virtual root directory' and other built-in mechanisms can be used
     to limit access to the system itself, as well as to impose many
     restrictions on what can be done through WebRFM.


C. EXTERNAL PROGRAMS:
======================
The WebRFM distribution archive includes the following two external scripts.
They are used by WebRFM, but are not part of it.

1. cgi-lib.pl by Steven E. Brenner: "The de facto standard library for creating
  Common Gateway Interface (CGI) scripts in the Perl language." It is located in
  WebRFM's 'lib' directory. Details concerning usage and distribution of this
  library can be found in the body of the cgi-lib.pl file itself. More
  information concerning it can be found at the cgi-lib.pl homepage at
  http://www.bio.cam.ac.uk/cgi-lib.

2. getcwd.pl by Brandon S. Allbery: Gets the current working directory, and is
  used by WebRFM precisely for this purpose. This is simply the getcwd script
  from the standard Perl library, which is included with the standard Perl
  distribution. We include it here (in WebRFM's own 'lib' directory) to release
  WebRFM from needing anything other than a Perl binary in order to work.


D. COMPATIBILITY:
==================
1. Server-side:
  WebRFM exploits both Perl and the (standard) CGI interface quite a bit. As a
  result, it requires a good Perl interpreter and a server that has a robust CGI
  implementation. The current version was tested mainly on Red Hat Linux 5.x
  systems with Apache 1.2.6/1.3.3 servers and Perl 5.004. However, it should work
  on any UNIX system with Perl 4.036 or later installed, using any web server
  that has a robust CGI implementation (in particular, the server should be willing
  to transfer arbitrary HTTP request methods to CGI programs). Non-UNIX operating
  systems are not currently supported.

2. Client-side:
  WebRFM's HTML form-based layer should work with any HTML 3.2 compliant browser.
  JavaScript support would make some things work a little bit faster, and there
  are some very minor features that are only available with Netscape browsers
  (WebRFM is at its best when using a Navigator 4.xx browser with JavaScript
  enabled). However, those things are not really necessary, and WebRFM remains
  fully functional without them. Some current browsers do not properly support
  form-based file uploading, and thus this particular functionality is not
  available with such browsers. Overall, WebRFM's form-based interface was
  designed to ensure compatibility with a wide range of browsers and screen
  resolutions (including browsers that run on non-PC devices). It has been
  specifically tested for compatibility with pure text browsers (W3m, Lynx)
  and with WebTV. There is a minor problem when using Microsoft's Internet
  Explorer (version 3.0 or higher) in that the 'GET as TEXT' and 'GET as BIN'
  file downloading methods are not effective (they behave the same as 'GET').
  This is because MSIE ignores the server-reported MIME type of files and decides
  by itself what the type of the file is and what it should do with it (this is
  in violation of the HTTP protocol).

  WebRFM's direct HTTP layer should work with any client that is designed
    to do one of the following:
    a) To publish documents using the HTTP 1.1 PUT method, and/or to remove
      documents (or directories) using the HTTP 1.1 DELETE method.
    b) To use the extension methods of an AOLserver (aka NaviServer).
    c) To use the extension methods of a Netscape Enterprise server
      (except for locking and versioning related functionality).
    d) To work with a class 1 WebDAV server.

    Some specific clients that were found to work well with WebRFM's
    direct HTTP layer are mentioned in section B above.


E. SYSTEM REQUIREMENTS:
========================
WebRFM should work on any system that runs an appropriate operating
system, Perl interpreter, and web server. For reasonable performance
(namely, convenient response time), the following minimal system
configurations (or equivalents) are recommended (faster is always better):

Linux: 486DX2 (66 MHz) with 16 MB RAM.

If the system is simultaneously running other programs (e.g., a number
of servers), more memory might be needed to obtain reasonable performance.


F. INSTALLATION:
=================
WebRFM is quite flexible in how it can be installed and used. We provide below
explicit instructions for two types of installation: a single-user installation
and a system-wide installation. Before describing those specific
installations, it would be useful to note a few things about WebRFM's design,
which can be thought of as having the following three parts:

a) A 'Main Program Directory', where most of the program actually resides. In
  general, it can be located anywhere. When WebRFM is run, it must have read
  permission to most of the files in this directory.
  
b) A GIF image file called 'highdir.gif' that should be made retrievable directly
   through the web server. It can be added to an existing 'icons' or 'images'
   directory, or reside in its own (web accessible) directory.

c) The 'main WebRFM script', which is the one and only file that is run as
  a CGI script. It can be moved anywhere and renamed as desired, as long as it
  is enabled as a CGI script. In order for WebRFM to work, there are two
  pieces of information that must be entered in the body of this file as part
  of the installation: The location of the 'Main Program Directory' (so that
  WebRFM can find the rest of itself) and the URI which corresponds to the
  directory where the 'highdir.gif' image file is found (so that WebRFM can
  create appropriate references to it). The 'main WebRFM script' also doubles
  as the main configuration file for WebRFM. The first part of this file
  contains many variables that can be set to control various aspects of WebRFM's
  operation. 

Other than these three parts, we should also note that each user running WebRFM
should have a configuration directory where WebRFM keeps some per-user
configuration files (These files should never be edited manually. WebRFM provides
a form-based interface to manage them.) The location of this configuration
directory can be set in the 'main WebRFM script' (the default is ~/.WebRFM).
If it does not exist, it is created automatically when WebRFM is used for
the first time by a user.

Other than the main configuration information in the 'main WebRFM script',
there are two additional files that contain configuration variables. Both
reside in WebRFM's 'lib' directory (the 'lib' subdirectory of the 'Main Program
Directory'). The first is the file 'initlib.pl', which contains default values
for user controlled variables. The second is the file 'extlib.pl' which
contains most of the implementation of WebRFM's direct HTTP layer. The first
part of this file defines some variables that control some aspects of this
layer. Normally, there should be no need to modify either of these files.

Another file that may need to be edited is WebRFM's default MIME table. This
is the 'mimetable' file in WebRFM's 'lib' directory. If WebRFM is used for
managing web content, then it is recommended that the MIME type matchings
defined in this file correspond as closely as possible to those that
are used by the web server. (Note that all of the file extensions in this file
must be capitalized. The matching WebRFM eventually does is case insensitive.)

We can now proceed to describe some specific installation setups of WebRFM:

Private single-user installation:
----------------------------------
An installation of this type can be done by any user that:
a) Has a valid user account on a UNIX system.
b) Has the privilege of running CGI programs (through an appropriate web
  server on that system) in his own user context.
Shell access may be helpful for the installation, but is not essential. In
most cases, FTP access would suffice (but some of the text editing described
below would need to be done on a remote machine). Prior experience in running
CGI programs is recommended. Do the following:

a) Extract the distribution archive to its final destination. You must extract
  it in a way that preserves directory structure, such that you get a top-level
  'WebRFM' directory (this is your 'Main Program Directory') with a number of
  subdirectories (we refer to those as WebRFM's directories). Your home
  directory should be a good place to extract, such that you will get a
  'WebRFM' subdirectory in your home directory.

b) Copy the file 'highdir.gif' from WebRFM's 'sfdir' directory to some place
   within your web space, such that it can be retrieved through the web server.

c) The file 'webrfm.cgi' in WebRFM's 'scripts' directory is your 'main WebRFM
  script'. Copy it to where you want to run it from, and enable it as a CGI
  script. Restrict access to it such that it is only accessible to whoever is
  supposed to access it (presumably just you). It is strongly recommended that
  you use a username + password authentication scheme.
  
d) Open your CGI-enabled 'main WebRFM script' with a text editor. Look for
  the line starting with '$ProgDir = ', and set the value of $ProgDir to be
  the full path to your 'Main Program Directory'. Then look for the line
  starting with '$SendFilesUrl = ', and set the value of $SendFilesUrl to
  be the URI which corresponds to the directory into which you previously
  copied the 'highdir.gif' file. Also, make sure that the first line of the
  script points to Perl on the system. Save your changes.
  
WebRFM should now be properly installed.
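
Step (d) can also be scripted. The following is only a minimal shell sketch
and is not part of the WebRFM distribution. The helper name 'set_perl_var' is
hypothetical. It assumes that each assignment sits on a single line starting
with '$ProgDir = ' or '$SendFilesUrl = ', and that the new value contains no
single quotes or backslashes. The paths in the usage lines are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with WebRFM): replace the line that starts
# with "$<name> = " in a Perl script by an assignment of a single-quoted value.
# The result is copied back over the original file so that its mode is kept.
set_perl_var() {
    script=$1
    name=$2
    value=$3
    tmp="$script.tmp.$$"
    awk -v name="$name" -v value="$value" '
        index($0, "$" name " = ") == 1 {
            print "$" name " = \047" value "\047;"
            next
        }
        { print }
    ' "$script" > "$tmp" &&
        cat "$tmp" > "$script" &&
        rm -f "$tmp"
}

# Example usage (paths and URI are placeholders):
#   set_perl_var "$HOME/public_html/cgi-bin/webrfm.cgi" ProgDir "$HOME/WebRFM"
#   set_perl_var "$HOME/public_html/cgi-bin/webrfm.cgi" SendFilesUrl "/images"
```

Any comment that followed the original assignment on the same line is lost,
so it is worth reviewing the edited lines afterwards.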

System-wide installation:
--------------------------
In order to perform this type of installation, you should become the root
user.

a) Extract the distribution archive to its final destination. You must extract
  it in a way that preserves directory structure, such that you get a top-level
  'WebRFM' directory (this is your 'Main Program Directory') with a number of
  subdirectories (we refer to those as WebRFM's directories). The
  recommended location to extract the archive is /usr/local/lib, such that
  your 'Main Program Directory' will be /usr/local/lib/WebRFM

b) Copy the file 'highdir.gif' from WebRFM's 'sfdir' directory to some place
   within your web space, such that it can be retrieved through the web server.
   If you have a global 'icons' or 'images' directory, it should be a good
   location for it, as long as you don't already have some other file with
   the same name in there.

c) The file 'webrfm.cgi' in WebRFM's 'scripts' directory is your 'main WebRFM
  script'. Open this file with a text editor. Look for the line starting with
  '$ProgDir = ', and set the value of $ProgDir to be the full path to your
  'Main Program Directory' (if you followed the recommendation in (a), this
  should already be set for you). Then look for the line starting with
  '$SendFilesUrl = ', and set the value of $SendFilesUrl to be the URI which
  corresponds to the directory into which you previously copied the
  'highdir.gif' file. Also, make sure that the first line of the script points
  to Perl on your system. Save your changes.
  
Your basic installation of WebRFM is now complete. However, in order for your
users to be able to use it, they need some (properly authenticated) way
to get the 'webrfm.cgi' file in WebRFM's 'scripts' directory to run as a CGI
program in their appropriate user context (namely, it needs to be run with
their UID/GID). There are several ways of doing that. If you already have
a mechanism (such as the Apache suEXEC wrapper) that enables users to run CGI
programs in their own user context, it can also be used to run WebRFM. Your
users can simply set up (or you can set up for them) a two-line wrapper
script of the form

#!/usr/bin/perl
require "/usr/local/lib/WebRFM/scripts/webrfm.cgi";

Of course, a script of this type should be owned by the appropriate
user/group, and access to it must be restricted appropriately.
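
Such a wrapper script can be created from the shell, for example as sketched
below. The function name 'make_webrfm_wrapper' and the target path in the
usage line are hypothetical, and mode 755 is an assumption about what your
server's CGI mechanism expects. Web access control for the wrapper still has
to be configured separately.

```shell
#!/bin/sh
# Hypothetical helper: write the two-line Perl wrapper shown above to the
# given file and make it executable. Run it as the user who should own it.
make_webrfm_wrapper() {
    wrapper=$1
    main_script=$2
    printf '%s\n' '#!/usr/bin/perl' "require \"$main_script\";" > "$wrapper" &&
        chmod 755 "$wrapper"
}

# Example usage (the target path is a placeholder):
#   make_webrfm_wrapper "$HOME/public_html/cgi-bin/rfm.cgi" \
#       /usr/local/lib/WebRFM/scripts/webrfm.cgi
```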

Another (generally much simpler) way is to use WebRFM's own setuid wrapper
in order to provide all of your users access from a single point. The code
for this wrapper is the file 'wrfmwrap.c' in WebRFM's scripts directory.
You should open this file with a text editor, and then follow the
instructions given there (in particular, note the warnings given there).
Once you have this wrapper properly installed as a setuid CGI program
(if you have a cgi-bin directory, putting the wrapper there should
normally be OK), you need to set up access control for this file, such
that your users are authenticated with their appropriate user names.
Please note that the wrapper works in the following way: It trusts the
server to supply the appropriate user name in the REMOTE_USER environment
variable, and then if it finds a valid system user with that user name,
it spawns WebRFM with the corresponding UID/GID. Technically speaking, you
can achieve the appropriate kind of authentication (we assume here that you
are using an Apache server, although most other servers should be similar)
by using your /etc/passwd file (if you are using it as a user database) as
an 'AuthUserFile'. However, it is strongly recommended NOT TO DO THAT, since
it exposes your root account (and other privileged accounts) to password
guessing attacks. A much better approach would be to use a separate
'AuthUserFile' file. You should just make sure that this file includes
proper user names (that is, names that correspond to all of your system's
users that need to run WebRFM, but not names of any privileged users that
should not send passwords over non-secured connections). Of course, there
are many Apache modules to authenticate against various types of user
databases, and many of them can also be used here. The main principle to
remember here is that users must be valid system users, and they should also
have valid home directories on the system.
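
As a starting point for building a separate 'AuthUserFile', the candidate
user names can be listed from a passwd-format file. This is only a sketch:
the function name 'list_webrfm_users' is hypothetical, and the minimal UID of
500 in the usage line is an assumption that depends on how your system
numbers ordinary accounts. Review the resulting list by hand before using it.

```shell
#!/bin/sh
# Hypothetical helper: print the login names found in a passwd-format file
# whose numeric UID is at least min_uid, one name per line. This leaves out
# root and other low-numbered system accounts.
list_webrfm_users() {
    passwd_file=$1
    min_uid=$2
    awk -F: -v min="$min_uid" \
        '$1 != "" && $3 + 0 >= min + 0 { print $1 }' "$passwd_file"
}

# Example usage:
#   list_webrfm_users /etc/passwd 500
# Each listed user can then be added to the password file with Apache's
# htpasswd utility, which prompts for the password ("-c" creates the file
# and is only needed for the first user):
#   htpasswd -c /path/to/AuthUserFile alice
#   htpasswd /path/to/AuthUserFile bob
```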

An important note concerning Apache and proper WebDAV operation:
-----------------------------------------------------------------
If you are using an Apache server and you would like WebDAV clients such
as Microsoft's Web Folders to work properly with WebRFM, there is still
one further thing that you would need to do. Note that this applies to
both private and system-wide installations. The source of the problem is
that WebDAV clients expect the server to provide a DAV header in responses
to OPTIONS requests for DAV enabled resources. While WebRFM has an
appropriate implementation of the OPTIONS method, Apache handles such
requests by itself and does not transfer them to WebRFM at all. My
(temporary?) workaround is to use the Apache 'Header' directive (this
requires mod_headers to be available) in order to force a 'DAV: 1'
header to be attached to every response from WebRFM. This ensures the
inclusion of this header in OPTIONS responses, and it shouldn't hurt
anything else. For example, if I have a single point system-wide
installation, and WebRFM is available as '/cgi-bin/webrfm', then the
following lines in my httpd.conf file do the trick:

<Location /cgi-bin/webrfm>
Header set Dav 1
</Location>

In case of a private installation, a similar setting in an appropriate
.htaccess file should work.

If you further want WebRFM's WebDAV layer to work smoothly with Microsoft
clients (or even to work at all, if you have the FrontPage extensions
installed on the same server), then there is yet one more header that needs
to be added in a similar way. This is the 'MS-Author-Via' header (a proprietary
Microsoft header that instructs Microsoft clients how they should try to
accomplish content management), which should have the value 'DAV'. That is,
the complete WebDAV-related header setup in your httpd.conf (or equivalent)
should be something like:

<Location /cgi-bin/webrfm>
Header set Dav 1
Header set MS-Author-Via DAV
</Location>
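
For a private installation, the equivalent .htaccess settings might be
written as sketched below. This rests on some assumptions: a '<Files>'
section is used because Apache does not accept '<Location>' sections in
.htaccess files, and the server must allow the 'Header' directive there
(mod_headers available and 'AllowOverride FileInfo' in effect). The helper
name 'add_dav_headers' and the script name in the usage line are
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append the two WebDAV-related headers for one CGI
# script to the .htaccess file in the given directory.
add_dav_headers() {
    dir=$1
    script_name=$2
    cat >> "$dir/.htaccess" <<EOF
<Files "$script_name">
Header set Dav 1
Header set MS-Author-Via DAV
</Files>
EOF
}

# Example usage (directory and script name are placeholders):
#   add_dav_headers "$HOME/public_html/cgi-bin" webrfm.cgi
```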

A note concerning temporary files:
-----------------------------------
When files are uploaded using WebRFM's form-based interface, they are initially
stored as temporary files. Then, if some problem arises in moving them to their
final destination (for example, if there is already a file by that name and the
user didn't check the 'overwrite existing files' box on the upload form), the
user is prompted for further action. If the user doesn't respond to that prompt,
then the temporary file remains, and with time this can lead to the
accumulation of many such "garbage" files. A similar thing also happens when
the user saves a file using the 'Save As' option of the Text Editor (a
temporary file is created, and it might remain if there is a problem
and the user doesn't respond to WebRFM's prompts). These temporary files are
created in WebRFM's temporary directory, which can be set in the
'main WebRFM script' (the default is ~/.WebRFM/temp). Users having a private
installation should occasionally scan their temporary directory and clean out
whatever has accumulated there. In system-wide installations, it is recommended
to run a daily cron job that scans those temporary directories and deletes old
files that are found there.
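
A cleanup job of this kind could look roughly as follows. This is a sketch
under stated assumptions, not something shipped with WebRFM: it assumes that
all home directories live directly under one parent directory (such as
/home), that the default temporary directory (~/.WebRFM/temp) has not been
changed, and that files older than seven days can safely be discarded. The
function name and the crontab entry are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: delete regular files older than max_age_days from the
# .WebRFM/temp directory of every home directory found under homes_root.
clean_webrfm_temp() {
    homes_root=$1
    max_age_days=$2
    for temp_dir in "$homes_root"/*/.WebRFM/temp; do
        # Skip the unexpanded pattern and anything that is not a directory.
        [ -d "$temp_dir" ] || continue
        find "$temp_dir" -type f -mtime +"$max_age_days" -exec rm -f {} +
    done
}

# Example usage, e.g. from a script that cron runs once a day:
#   clean_webrfm_temp /home 7
# A matching crontab entry (the script path is a placeholder):
#   30 4 * * * /usr/local/sbin/webrfm-clean-temp
```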

A note concerning files in WebRFM's 'htm' directory:
-----------------------------------------------------
Files that are located in WebRFM's 'htm' directory (by default, these are
just WebRFM help files), can be retrieved by calling WebRFM with a query
string of the form ?com=rshgethtm+<filename>, where <filename> should
be replaced by the name of the file. Such files are not simply retrieved,
but are considered by WebRFM to be HTML files that are intended to be
dynamically parsed. WebRFM scans those files and replaces certain special
strings with values of corresponding WebRFM parameters (see the 'ParseHTM'
subroutine in the main script for what is being replaced) and it also
attaches its footer (along with the ending </BODY></HTML>) to the end
of those files. This mechanism can be used to create links that would
retrieve WebRFM help pages directly. More importantly, it can be used
to extend WebRFM through HTML pages that would have current values of
various WebRFM parameters inserted into them. However, it also has the
following security implication: Any files that reside in WebRFM's 'htm'
directory would be retrievable by all WebRFM users. Thus, one should be
careful not to place in this directory any files that might contain
confidential information.
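
To illustrate the query string format described above, the following sketch
builds such a link from the URL of the WebRFM script and a file name. The
function name, the host name, and the file name 'help.htm' are hypothetical;
use the name of a file that actually exists in your 'htm' directory.

```shell
#!/bin/sh
# Hypothetical helper: print the URL that retrieves a file from WebRFM's
# 'htm' directory, given the URL of the WebRFM script and the file name.
webrfm_htm_url() {
    printf '%s?com=rshgethtm+%s\n' "$1" "$2"
}

# Example usage (host and file name are placeholders):
#   webrfm_htm_url http://www.example.com/cgi-bin/webrfm help.htm
```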

A note concerning upgrades from previous versions:
---------------------------------------------------
The 'fmopdat' configuration file from versions of WebRFM earlier than 0.3b is
not compatible with the current version (this file resides in WebRFM's per-user
configuration directory, and it holds all of the user configuration options).
As a result, users of such previous versions who are upgrading to the current
one may get an error message about fmopdat being corrupted, and WebRFM will
refuse to run. If you encounter this problem, you can do one of several things,
such as:
1) Delete the existing 'fmopdat' file (or files). A new file with
default values will be created.
2) Change the value of the $CfgDir variable in the 'main WebRFM script'
such that a new configuration directory (with new default configuration
files) will be created for the new version.
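
Option (1) can be carried out from the shell. The sketch below moves the old
file to a backup directory instead of deleting it outright. Keeping a backup
is an extra precaution added here, not a WebRFM requirement, and the function
name and backup location are hypothetical. The configuration directory is
assumed to be the default, ~/.WebRFM.

```shell
#!/bin/sh
# Hypothetical helper: move an existing 'fmopdat' file out of the per-user
# configuration directory so that a fresh one is created on the next run.
reset_fmopdat() {
    cfg_dir=$1
    backup_dir=$2
    # Nothing to do if there is no old file.
    [ -f "$cfg_dir/fmopdat" ] || return 0
    mkdir -p "$backup_dir" &&
        mv "$cfg_dir/fmopdat" "$backup_dir/fmopdat"
}

# Example usage (the backup directory is a placeholder):
#   reset_fmopdat "$HOME/.WebRFM" "$HOME/webrfm-old-config"
```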

If you must preserve configuration options from a previous version to
work with the current one, please drop me a note (to ylast@mindless.com).
It should be quite simple to write a script that would upgrade 'fmopdat'
files to the new format, but I don't see a point in doing that unless
there is actual demand (by at least one person).


G. STATUS OF THIS PROGRAM
==========================
WebRFM is currently in beta status. All of the major features that are planned
for version 1.0 are already implemented, and it seems to work quite nicely
in its current state. However, some bugs may be present, and using it in any
kind of a "working environment" must be done with caution.


H. TODOS
=========
1. Provide more documentation.
2. The author is open to suggestions for improvements.
3. Many wonderful things are being considered for future versions beyond
version 1.0. In the short term, however, the primary goal is to provide
a stable bug-free version 1.0 that is not expected to have any major
features beyond those that are already implemented.


I. TECHNICAL SUPPORT
=====================
No support of any kind is promised for this program, and the author does not
promise to answer e-mail messages related to it. Bug reports and feedback
(of any kind) to ylast@mindless.com would be greatly appreciated. The author
may assist users of this program who run into problems if his time allows
it. However, no commitment to doing that is being given. Users should attempt
to read the documentation that comes with WebRFM and to check the WebRFM
homepage at
http://webrfm.netpedia.net/index.html
for the latest information prior to seeking the author's help. Note that
this program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.