WebRFM 0.4 (beta) - A Remote CGI File Manager.
Copyright (C) 1999 Yoram Last (ylast@mindless.com)

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.


ABOUT THIS FILE:
==================

This is the readme file for WebRFM 0.4, released December 18, 1999.
This file contains general information and installation instructions for
WebRFM. More information can also be obtained by looking at the main
WebRFM script in WebRFM's 'scripts' directory. Further documentation is
included in WebRFM's online help pages. WebRFM needs to first be installed
before these pages can be properly accessed. Also, for the latest
information on WebRFM, you should check its homepage at
http://webrfm.netpedia.net/index.html


CONTENTS:
=============

A. WHAT IS THIS?
B. WHAT DOES IT DO?
C. EXTERNAL PROGRAMS
D. COMPATIBILITY
E. SYSTEM REQUIREMENTS
F. INSTALLATION
G. STATUS OF THIS PROGRAM
H. TODOS
I. TECHNICAL SUPPORT


A. WHAT IS THIS?
=================

WebRFM (Web-based Remote File Manager) is a CGI-Perl program aimed at
providing a single solution for remote Web-based file management, and at
replacing traditional FTP-based access for that purpose. It is suitable
for managing websites, as well as for more general purpose file management
tasks.
WebRFM combines a "visible" HTML 3.2 compliant form-based layer (which is
in the spirit of the tools currently provided by many large hosting
services) along with a "hidden" direct HTTP layer that implements a class 1
WebDAV server. Support for some legacy HTTP methods (which are essentially
borrowed from AOLserver and Netscape's Enterprise server) is also provided.

While WebRFM can be installed and used by individual users, it is
specifically designed to provide a secure system-wide solution that is
suitable for usage by ISPs, web-space providers, etc.

WebRFM currently runs on UNIX/Linux systems.


B. WHAT DOES IT DO?
===================

1. Provides a simple-minded (but effective) HTML form-based file manager.

2. Provides a built-in HTML form-based text editor.

3. Supports file retrieval (downloading) as well as form-based (RFC 1867)
   file uploading. An FTP-like 'text mode' option is available for file
   transfers in both directions.

4. Provides support for the HTTP 1.1 'PUT' and 'DELETE' methods. Files can
   be transparently edited and then published using applications that
   support 'PUT' (such as the HTML editor in Netscape's Communicator).

5. Provides extensive support for additional HTTP methods that are used
   for content management. This currently includes a rough implementation
   of a class 1 WebDAV server (MKCOL, MOVE, COPY, and PROPFIND methods),
   along with support for some legacy (non-standardized) HTTP methods
   (MKDIR, BROWSE, INDEX, SAVE, EDIT, and RMDIR methods). As a result,
   WebRFM works properly with many clients that are designed to use those
   HTTP extension methods. In particular, AOLpress, SiteCopy, Cadaver,
   Microsoft's Web Folders (which comes as part of Internet Explorer 5)
   and Office 2000 applications, and Netscape's Communicator Roaming
   Profiles, are fully functional with WebRFM.

6. Designed to operate in a secure way on multi-user systems.
   In particular, the following security-related features are provided:

   a) Runs in the security context (UID/GID) of the authenticated user
      that is using it, so OS-based restrictions (such as quota limits)
      are enforced. A special setuid wrapper is supplied as a simple way
      of running WebRFM in this manner. It is also possible to use other
      wrappers, such as the Apache 'suEXEC' wrapper.

   b) Optionally implements a user-dependent 'virtual root directory'.
      By default, each user's home directory appears to him as if it were
      the file system root directory and he can't access anything outside
      of it. The 'virtual root directory' can be changed to be any other
      directory, including a subdirectory of the user's home directory.

   c) Contains built-in access control mechanisms (both location-based
      and user-based) that can be used to enhance and double-check
      server-imposed access control. Implements various checks (such as
      imposing minimal UID and GID to run as) to ensure secure operation.

   d) Has a built-in permissions engine that can be easily customized to
      impose various restrictions beyond those that are natively provided
      by the OS.

7. Modular, highly configurable design. WebRFM's behavior is controlled by
   a fairly large number of variables that can be changed to customize it
   in various ways. In particular, WebRFM's HTML interface is very
   configurable. Arbitrary parameters for the '<BODY>' tag, two sets of
   table parameters, and several size parameters can be set, and they
   control the appearance of the interface in a consistent way. Properties
   that should be user-controlled (like those that affect the appearance
   of WebRFM's HTML output) are stored in a per-user configuration file
   and can be modified from a web-based interface. Administrators can
   disable user control of some (or even all) of these properties.
   Properties that should only be changed by administrators are set in
   the main script and are protected from user intervention.
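To make the 'virtual root directory' idea in item 6(b) more concrete, here
is a minimal shell sketch of how a path, as the user sees it, could be
mapped onto the real file system and refused when it would escape the
virtual root. This is only an illustration and not WebRFM's actual code.
The function name map_virtual_path is hypothetical, and the sketch assumes
the GNU coreutils 'realpath' command (its -m flag accepts paths that do not
exist yet).

```shell
#!/bin/sh
# Hypothetical helper, not part of WebRFM: map a user-visible path onto the
# real file system, refusing anything that resolves outside the virtual root.
# Assumes GNU coreutils 'realpath' (-m tolerates missing path components).
map_virtual_path() {
    # $1 = virtual root directory, $2 = path as seen by the user
    vroot=$(realpath -m -- "$1") || return 1
    real=$(realpath -m -- "$vroot/$2") || return 1

    # A virtual root of "/" means no confinement at all.
    if [ "$vroot" = "/" ]; then
        printf '%s\n' "$real"
        return 0
    fi

    case "$real" in
        "$vroot"|"$vroot"/*) printf '%s\n' "$real" ;;  # inside the root
        *) return 1 ;;                                 # '..' or symlink escape
    esac
}
```

For example, calling map_virtual_path with a user's home directory and
'/public_html/index.html' prints the real path below that home directory,
while a request such as '/../../etc/passwd' makes the function return a
non-zero status and print nothing.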
Remark: Essentially everything that can be done by using WebRFM can also be
done through FTP. Some of the main advantages of WebRFM over FTP are:

1) It does not require any client other than a web browser, and it is thus
   likely to be simpler to use for non-technical users.

2) It can be much more secure, because:

   a) It can be used over completely encrypted connections (by using an
      SSL capable server).

   b) It can be used in conjunction with secure authentication schemes
      (such as digest authentication) that avoid sending plain text
      passwords over non-encrypted connections.

   c) It can be used behind a firewall through a standard HTTP proxy
      server.

   d) The 'virtual root directory' and other built-in mechanisms can be
      used to limit access to the system itself, as well as to impose
      many restrictions on what can be done through WebRFM.


C. EXTERNAL PROGRAMS:
======================

The WebRFM distribution archive includes the following two external
scripts. They are used by WebRFM, but are not part of it.

1. cgi-lib.pl by Steven E. Brenner: "The de facto standard library for
   creating Common Gateway Interface (CGI) scripts in the Perl language."
   It is located in WebRFM's 'lib' directory. Details concerning usage and
   distribution of this library can be found in the body of the cgi-lib.pl
   file itself. More information concerning it can be found at the
   cgi-lib.pl homepage at http://www.bio.cam.ac.uk/cgi-lib.

2. getcwd.pl by Brandon S. Allbery: Gets the current working directory,
   and is used by WebRFM precisely for this purpose. This is simply the
   getcwd script from the standard Perl library, which is included with
   the standard Perl distribution. We include it here (in WebRFM's own
   'lib' directory) to release WebRFM from needing anything other than a
   Perl binary in order to work.


D. COMPATIBILITY:
==================

1. Server-side:

   WebRFM exploits both Perl and the (standard) CGI interface quite a bit.
   As a result, it requires a good Perl interpreter and a server that has
   a robust CGI implementation. The current version was tested mainly on
   Red Hat Linux 5.x systems with Apache 1.2.6/1.3.3 servers and Perl
   5.004. However, it should work on any UNIX system with Perl 4.036 or
   later installed, using any web server that has a robust CGI
   implementation (in particular, the server should be willing to transfer
   arbitrary HTTP request methods to CGI programs). Non-UNIX operating
   systems are not currently supported.

2. Client-side:

   WebRFM's HTML form-based layer should work with any HTML 3.2 compliant
   browser. JavaScript support would make some things work a little bit
   faster, and there are some very minor features that are only available
   with Netscape browsers (WebRFM is at its best when using a Navigator
   4.xx browser with JavaScript enabled). However, those things are not
   really necessary, and WebRFM remains fully functional without them.
   Some current browsers do not properly support form-based file
   uploading, and thus this particular functionality is not available with
   such browsers. Overall, WebRFM's form-based interface was designed to
   ensure compatibility with a wide range of browsers and screen
   resolutions (including browsers that run on non-PC devices). It has
   been specifically tested for compatibility with pure text browsers
   (W3m, Lynx) and with WebTV.

   There is a minor problem when using Microsoft's Internet Explorer
   (version 3.0 or higher) in that the 'GET as TEXT' and 'GET as BIN' file
   downloading methods are not effective (they behave the same as 'GET').
   This is because MSIE ignores the server-reported MIME type of files and
   decides by itself what the type of the file is and what it should do
   with it (this is in violation of the HTTP protocol).
   WebRFM's direct HTTP layer should work with any client that is designed
   to do one of the following:

   a) To publish documents using the HTTP 1.1 PUT method, and/or to remove
      documents (or directories) using the HTTP 1.1 DELETE method.

   b) To use the extension methods of an AOLserver (aka NaviServer).

   c) To use the extension methods of a Netscape Enterprise server (except
      for locking and versioning related functionality).

   d) To work with a class 1 WebDAV server.

   Some specific clients that were found to work well with WebRFM's direct
   HTTP layer are mentioned in section B above.


E. SYSTEM REQUIREMENTS:
========================

WebRFM should work on any system that runs an appropriate operating system,
Perl interpreter, and web server. For reasonable performance (namely,
convenient response time), the following minimal system configurations (or
equivalents) are recommended (faster is always better):

Linux: 486DX2 (66 MHz) with 16 MB RAM.

If the system is simultaneously running other programs (e.g., a number of
servers) larger memory might be needed to obtain reasonable performance.


F. INSTALLATION:
=================

WebRFM is quite flexible in how it can be installed and used. We provide
below explicit instructions for two types of installations: A single user
installation, and a system wide installation.

Before we move to describe those specific installations, it would be useful
to note a few things about WebRFM's design, which can be thought of as
having the following three parts:

a) A 'Main Program Directory', where most of the program actually resides.
   In general, it can be located anywhere. When WebRFM is run, it must
   have read permission to most of the files in this directory.

b) A gif image file called 'highdir.gif' that should be made retrievable
   directly through the web server. It can be added to an existing 'icons'
   or 'images' directory, or reside in its own (web accessible) directory.
c) The 'main WebRFM script' which is the one and only file that is being
   run as a CGI script. It can be moved anywhere and renamed as desired,
   as long as it is being enabled as a CGI script. In order for WebRFM to
   work, there are two pieces of information that must be entered in the
   body of this file as part of the installation: The location of the
   'Main Program Directory' (so that WebRFM can find the rest of itself)
   and the URI which corresponds to the directory where the 'highdir.gif'
   image file is found (so that WebRFM can create appropriate references
   to it). The 'main WebRFM script' also doubles as being the main
   configuration file for WebRFM. The first part of this file contains
   many variables that can be set to control various aspects of WebRFM's
   operation.

Other than these three parts, we should also note that each user running
WebRFM should have a configuration directory where WebRFM keeps some
per-user configuration files (These files should never be edited manually.
WebRFM provides a form-based interface to manage them.) The location of
this configuration directory can be set in the 'main WebRFM script' (the
default is ~/.WebRFM). If it does not exist, it is created automatically
when WebRFM is used for the first time by a user.

Other than the main configuration information in the 'main WebRFM script',
there are two additional files that contain configuration variables. Both
reside in WebRFM's 'lib' directory (the 'lib' subdirectory of the 'Main
Program Directory'). The first is the file 'initlib.pl', which contains
default values for user controlled variables. The second is the file
'extlib.pl' which contains most of the implementation of WebRFM's direct
HTTP layer. The first part of this file defines some variables that control
some aspects of this layer. Normally, there should be no need to modify any
of these files.

Another file that may need to be edited is WebRFM's default MIME table.
This is the 'mimetable' file in WebRFM's 'lib' directory.
If WebRFM is used for managing web content, then it is recommended that the
MIME type matchings defined in this file correspond as closely as possible
to those that are done by the web server. (Note that all of the file
extensions in this file must be capitalized. The matching WebRFM eventually
does is case insensitive.)

We can now proceed to describe some specific installation setups of WebRFM:

Private single-user installation:
----------------------------------

An installation of this type can be done by any user that:

a) Has a valid user account on a UNIX system.

b) Has the privilege of running CGI programs (through an appropriate web
   server on that system) in his own user context.

Shell access may be helpful for the installation, but is not essential. In
most cases, FTP access would suffice (but some of the text editing
described below would need to be done on a remote machine). Prior
experience in running CGI programs is recommended.

Do the following:

a) Extract the distribution archive to its final destination. You must
   extract it in a way that preserves directory structure, such that you
   get a top-level 'WebRFM' directory (this is your 'Main Program
   Directory') with a number of subdirectories (we refer to those as
   WebRFM's directories). Your home directory should be a good place to
   extract, such that you will get a 'WebRFM' subdirectory in your home
   directory.

b) Copy the file 'highdir.gif' from WebRFM's 'sfdir' directory to some
   place within your web space, such that it can be retrieved through the
   web server.

c) The file 'webrfm.cgi' in WebRFM's 'scripts' directory is your 'main
   WebRFM script'. Copy it to where you want to run it from, and enable it
   as a CGI script. Restrict access to it such that it is only accessible
   to whoever is supposed to access it (presumably just you). It is
   strongly recommended that you use a username + password authentication
   scheme.

d) Open your CGI-enabled 'main WebRFM script' with a text editor.
   Look for the line starting with '$ProgDir = ', and set the value of
   $ProgDir to be the full path to your 'Main Program Directory'. Then
   look for the line starting with '$SendFilesUrl = ', and set the value
   of $SendFilesUrl to be the URI which corresponds to the directory into
   which you previously copied the 'highdir.gif' file. Also, make sure
   that the first line of the script points to Perl on the system. Save
   your changes.

WebRFM should now be properly installed.

System-wide installation:
--------------------------

In order to perform this type of installation, you should become the root
user.

a) Extract the distribution archive to its final destination. You must
   extract it in a way that preserves directory structure, such that you
   get a top-level 'WebRFM' directory (this is your 'Main Program
   Directory') with a number of subdirectories (we refer to those as
   WebRFM's directories). The recommended location to extract the archive
   is /usr/local/lib, such that your 'Main Program Directory' will be
   /usr/local/lib/WebRFM

b) Copy the file 'highdir.gif' from WebRFM's 'sfdir' directory to some
   place within your web space, such that it can be retrieved through the
   web server. If you have a global 'icons' or 'images' directory, it
   should be a good location for it, as long as you don't already have
   some other file with the same name in there.

c) The file 'webrfm.cgi' in WebRFM's 'scripts' directory is your 'main
   WebRFM script'. Open this file with a text editor. Look for the line
   starting with '$ProgDir = ', and set the value of $ProgDir to be the
   full path to your 'Main Program Directory' (if you followed the
   recommendation in (a), this should already be set for you). Then look
   for the line starting with '$SendFilesUrl = ', and set the value of
   $SendFilesUrl to be the URI which corresponds to the directory into
   which you previously copied the 'highdir.gif' file. Also, make sure
   that the first line of the script points to Perl on your system. Save
   your changes.
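If you prefer not to edit the script by hand, the two variable settings
described in step (c) can also be scripted. The following is a minimal
sketch under stated assumptions: the function name set_webrfm_paths is
hypothetical, each variable is assumed to be assigned on a single line that
starts with '$ProgDir = ' or '$SendFilesUrl = ', and the new value is
written as a double-quoted Perl string. Check the result against the actual
'webrfm.cgi' before relying on it.

```shell
#!/bin/sh
# Sketch: set $ProgDir and $SendFilesUrl in the main WebRFM script.
# Assumptions (not confirmed by the README): each variable is assigned on one
# line starting with '$ProgDir = ' / '$SendFilesUrl = ', and the values do not
# contain any of the characters  " $ @ | & \  (they would need escaping).
set_webrfm_paths() {
    script=$1 progdir=$2 sendurl=$3

    sed -e "s|^\\\$ProgDir = .*|\$ProgDir = \"$progdir\";|" \
        -e "s|^\\\$SendFilesUrl = .*|\$SendFilesUrl = \"$sendurl\";|" \
        "$script" > "$script.new" || return 1

    # Write back in place so the script keeps its owner and permissions.
    cat "$script.new" > "$script" && rm -f "$script.new"
}

# Example call (the '/icons' URI is only an illustration):
#   set_webrfm_paths /usr/local/lib/WebRFM/scripts/webrfm.cgi \
#                    /usr/local/lib/WebRFM /icons
```

The first line of the script (the path to Perl) still has to be checked
separately.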
Your basic installation of WebRFM is now complete. However, in order for
your users to be able to use it, they would need some (properly
authenticated) way to get the 'webrfm.cgi' file in WebRFM's 'scripts'
directory to run as a CGI program in their appropriate user context
(namely, it needs to be run with their UID/GID). There are several ways of
doing that.

If you already have a mechanism (such as the Apache suEXEC wrapper) that
enables users to run CGI programs in their own user context, it can also be
used to run WebRFM. Your users can simply set (or you can set for them) a
simple two-line wrapper script of the form

    #!/usr/bin/perl
    require "/usr/local/lib/WebRFM/scripts/webrfm.cgi";

Of course, a script of this type should be owned by the appropriate
user/group, and access to it must be restricted appropriately.

Another (generally much simpler) way is to use WebRFM's own setuid wrapper
in order to provide all of your users access from a single point. The code
for this wrapper is the file 'wrfmwrap.c' in WebRFM's scripts directory.
You should open this file with a text editor, and then follow the
instructions given there (in particular, note the warnings given there).
Once you have this wrapper properly installed as a setuid CGI program (if
you have a cgi-bin directory, putting the wrapper there should normally be
OK), you would need to set access control for this file, such that your
users would get authenticated with their appropriate user names.

Please note that the wrapper works in the following way: It trusts the
server to supply the appropriate user name in the REMOTE_USER environment
variable, and then if it finds a valid system user with that user name, it
spawns WebRFM with the corresponding UID/GID.
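The decision just described can be illustrated with a small shell sketch.
It is NOT the code of 'wrfmwrap.c' (read that file for the real logic and
its warnings). It only shows the kind of checks involved: a user name must
be present, it must belong to a system user, and, as an extra precaution of
the kind mentioned in section B, the UID must not fall below a minimum.
The function name may_run_as and the threshold of 100 are hypothetical.
Actually switching UID/GID requires a setuid program and is left out.

```shell
#!/bin/sh
# Sketch only, not the logic of wrfmwrap.c: decide whether a server-supplied
# user name would be acceptable for running WebRFM.
# The default threshold of 100 is a hypothetical minimal UID.
may_run_as() {
    user=$1 min_uid=${2:-100}

    [ -n "$user" ] || return 1                        # nobody authenticated
    uid=$(id -u -- "$user" 2>/dev/null) || return 1   # not a system user
    [ "$uid" -ge "$min_uid" ]                         # refuse privileged UIDs
}

# Example call, using the variable the web server sets after authentication:
#   may_run_as "$REMOTE_USER" && echo "would run WebRFM as $REMOTE_USER"
```

With these checks, a request authenticated as 'root' is refused even if the
server accepted the password, which is the behavior one wants from a
wrapper that trusts REMOTE_USER.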
Technically speaking, you can achieve the appropriate kind of
authentication (we assume here that you are using an Apache server,
although most other servers should be similar) by using your /etc/passwd
file (if you are using it as a user database) as an 'AuthUserFile'.
However, it is strongly recommended NOT TO DO THAT, since it exposes your
root account (and other privileged accounts) to password guessing attacks.
A much better approach would be to use a separate 'AuthUserFile' file. You
should just make sure that this file includes proper user names (that is,
names that correspond to all of your system's users that need to run
WebRFM, but not names of any privileged users that should not send
passwords over non-secured connections). Of course, there are many Apache
modules to authenticate against various types of user databases, and many
of them can also be used here. The main principle to remember here is that
users must be valid system users, and they should also have valid home
directories on the system.

An important note concerning Apache and proper WebDAV operation:
-----------------------------------------------------------------

If you are using an Apache server and you would like WebDAV clients such as
Microsoft's Web Folders to work properly with WebRFM, there is still one
further thing that you would need to do. Note that this applies to both
private and system-wide installations.

The source of the problem is that WebDAV clients expect the server to
provide a DAV header in responses to OPTIONS requests for DAV enabled
resources. While WebRFM has an appropriate implementation of the OPTIONS
method, Apache handles such requests by itself and does not transfer them
to WebRFM at all. My (temporary?) workaround is to use the Apache 'Header'
directive (this requires mod_headers to be available) in order to force a
'DAV: 1' header to be attached to every response from WebRFM.
This ensures the inclusion of this header in OPTIONS responses, and it
shouldn't hurt anything else. For example, if I have a single point
system-wide installation, and WebRFM is available as '/cgi-bin/webrfm',
then the following lines in my httpd.conf file do the trick:

    <Location /cgi-bin/webrfm>
    Header set Dav 1
    </Location>

In case of a private installation, a similar setting in an appropriate
.htaccess file should work.

If you further want WebRFM's WebDAV layer to work smoothly with Microsoft
clients (or even to work at all, in case that you have the FrontPage
extensions installed on the same server), then there is yet one more header
that needs to be added in a similar way. This is the 'MS-Author-Via' header
(a proprietary Microsoft header that instructs Microsoft clients how they
should try to accomplish content management) which should have the value
'DAV'. That is, the complete WebDAV-related header setup in your httpd.conf
(or equivalent) should be something like:

    <Location /cgi-bin/webrfm>
    Header set Dav 1
    Header set MS-Author-Via DAV
    </Location>

A note concerning temporary files:
-----------------------------------

When files are uploaded using WebRFM's form-based interface, they are
initially stored as temporary files. Then, if some problem arises in moving
them to their final destination (for example, if there is already a file by
that name and the user didn't check the 'overwrite existing files' box on
the upload form), the user is prompted for further action. If the user
doesn't respond to that prompt, then the temporary file remains, and with
time this can lead to the accumulation of many such "garbage" files. A
similar thing also happens when the user saves a file using the 'Save As'
option of the Text Editor (a temporary file is created, and it might remain
if there is a problem and the user doesn't respond to WebRFM's prompts).
These temporary files are created in WebRFM's temporary directory, which
can be set in the 'main WebRFM script' (the default is ~/.WebRFM/temp).
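Since abandoned temporary files pile up in that directory, old ones can be
pruned with the standard 'find' utility. The following is a minimal sketch
and not something shipped with WebRFM: the function name clean_webrfm_temp
and the 7-day default are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: remove regular files that were last modified more than a given
# number of days ago from one WebRFM temporary directory.
# The 7-day default is an arbitrary example value.
clean_webrfm_temp() {
    tempdir=$1 days=${2:-7}

    [ -d "$tempdir" ] || return 0      # nothing to do if WebRFM never ran
    find "$tempdir" -type f -mtime +"$days" -exec rm -f -- {} +
}

# Example call for a private installation using the default location:
#   clean_webrfm_temp "$HOME/.WebRFM/temp" 7
```

A script built around such a function can be run from cron. For a
system-wide installation it would have to be called once per user, and how
the list of home directories is obtained depends on the local setup.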
Users having a private installation should occasionally scan their
temporary directory and clean out whatever has accumulated there. In
system-wide installations, it is recommended to run a daily cron job that
scans those temporary directories and deletes old files that are found
there.

A note concerning files in WebRFM's 'htm' directory:
-----------------------------------------------------

Files that are located in WebRFM's 'htm' directory (by default, these are
just WebRFM help files) can be retrieved by calling WebRFM with a query
string of the form ?com=rshgethtm+<filename>, where <filename> stands for
the name of the file. Such files are not simply retrieved, but are
considered by WebRFM to be HTML files that are intended to be dynamically
parsed. WebRFM scans those files and replaces certain special strings with
values of corresponding WebRFM parameters (see the 'ParseHTM' subroutine in
the main script for what is being replaced) and it also attaches its footer
(along with the closing HTML tags) to the end of those files.

This mechanism can be used to create links that would retrieve WebRFM help
pages directly. More importantly, it can be used to extend WebRFM through
HTML pages that would have current values of various WebRFM parameters
inserted into them. However, it also has the following security
implication: Any files that reside in WebRFM's 'htm' directory would be
retrievable by all WebRFM users. Thus, one should be careful not to place
in this directory any files that might contain confidential information.

A note concerning upgrades from previous versions:
---------------------------------------------------

The 'fmopdat' configuration file from versions of WebRFM earlier than 0.3b
is not compatible with the current version (this file resides in WebRFM's
per-user configuration directory, and it holds all of the user
configuration options).
As a result, users of such previous versions that are upgrading to the
current one may get an error message about fmopdat being corrupted and
WebRFM would refuse to run. If you encounter this problem, you can do one
of several things, such as:

1) Delete the existing 'fmopdat' file (or files). A new file with default
   values would be created.

2) Change the value of the $CfgDir variable in the 'main WebRFM script'
   such that a new configuration directory (with new default configuration
   files) would be created for the new version.

If you must preserve configuration options from a previous version to work
with the current one, please drop me a note (to ylast@mindless.com). It
should be quite simple to write a script that would upgrade 'fmopdat' files
to the new format, but I don't see a point in doing that unless there is
actual demand (by at least one person).


G. STATUS OF THIS PROGRAM
==========================

WebRFM is currently in beta status. All of the major features that are
planned for version 1.0 are already implemented, and it seems to work quite
nicely in its current state. However, some bugs may be present, and using
it in any kind of a "working environment" must be done with caution.


H. TODOS
=========

1. Provide more documentation.

2. The author is open to suggestions for improvements.

3. Many wonderful things are being considered for future versions beyond
   version 1.0. In the short term, however, the primary goal is to provide
   a stable bug-free version 1.0 that is not expected to have any major
   features beyond those that are already implemented.


I. TECHNICAL SUPPORT
=====================

No support of any kind is promised for this program, and the author does
not promise to answer e-mail messages related to it.

Bug reports and feedback (of any kind) to ylast@mindless.com would be
greatly appreciated.

The author may assist users of this program who run into problems if his
time allows it. However, no commitment to doing that is being given.
Users should attempt to read the documentation that comes with WebRFM and
to check the WebRFM homepage at http://webrfm.netpedia.net/index.html for
the latest information prior to seeking the author's help.

Note that this program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.