Scalable webmail HOW-TO

by Jason Belich, macmaster@pobox.com

v1.0, 11 August 1999

1. Introduction

Thanks to several new web mail products (like IMP), web designers have been able to offer web based email with their web systems. However, as web mail solutions using these products have become more popular, the scalability issue has come to the forefront. Most web mail systems (and certainly the free ones) are designed around single mail server, single web server operation, with all the limitations that a single machine brings. Scaling beyond a few thousand users was simply not feasible, which is why I set out to devise a simple system that has virtually unlimited scalability.

1.1 Background

Web based email is taking off like wildfire, and with such sites as hotmail and bigfoot becoming so popular, other sites are looking to add web based email to their repertoire of services. As such, products like EmuMail and IMP have come over the horizon to help people offer this service to their patrons. However, web mail sites based upon these new products are also becoming quite popular and are taxing network and machine resources. Without a custom solution like hotmail's, popular webmail sites built on off-the-shelf webmail products are starting to feel growing pains. As such, it's becoming necessary to find a webmail solution that is both relatively easy to implement and scalable to thousands or even millions of users.

In my case, I work as a developer for DigitalNation, working on the Worldcities project. Worldcities is a globally local news and information site that also offers web based email. Webmail is highly popular outside of the US since many non-US ISPs do not offer their customers mail services, and since Worldcities is not US centric, we quickly realized that scalability is essential.

Our previous setup used dual servers, one running Sendmail and UW IMAP, and the other running IMP, an open source webmail system built in PHP (and the best, I might add). The system was very easy to implement, and did its job well. However, after only a couple of months it became very apparent that our system was straining under the pressure, and we needed to find a new solution quickly. Also, since we had so many active users, we could not abandon their already existing mail.

Previously we had chosen IMP because of its free status (i.e. I can make changes to it as necessary) and Sendmail and UW IMAP because they were the standard tools. It was obvious that Sendmail and UW IMAP were too slow for our uses, so I researched other MTAs and IMAP servers.

I ultimately decided on postfix as the MTA, since it's at minimum 3 times faster than Sendmail, and cyrus since it neither requires user accounts on the host machine nor uses the system resources that UW IMAP does. We also kept IMP because of its free status, because of its easy configuration, and because we didn't want to subject users to a new interface. Also, in the course of research, I discovered that postfix can use an LDAP server for its alias database (using the maildrop attribute and mailacceptinggeneralid attribute) and that there was a patch for Cyrus that allowed Cyrus to use an LDAP server for authentication. Combine that with a DNS round robin for the frontend web/incoming mail server(s), and you have the ingredients for a highly scalable web mail solution. It just became a matter of putting it all together.

It turned out that the mix was rather easy too. The bulk of the scalability comes from postfix's ability to forward mail to another user or machine based upon the maildrop attribute of an LDAP server. All that needed to be done was to use a full email address for the maildrop attribute, pointing to the machine that the user's mail was actually hosted on. For instance, if the DNS MX record points to the frontend mailserver of mail.dom.ain, and LDAP says to maildrop user@dom.ain to user@machine44.dom.ain, then postfix running on the front end mail server will forward mail for that user to machine44. With this feature alone, you can have unlimited mail accounts in your domain. All you have to do is keep adding boxes to the back end.

The tricky part, or so I thought, was to get the mail off the backend server and into IMP. Since IMP already supports mail servers separate from the web server, what needed to be done was to retrieve the mail server machine name from LDAP, instead of relying on the configuration setting and a single mail server. It turned out that setting the mail server was extremely easy, requiring only a few lines of new code in one file and a few new configuration parameters in IMP's default configuration file. We had postfix forwarding and IMP retrieving mail from multiple machines. I could have left it at that.

However, UW IMAP was still an issue. For a number of reasons, it was too slow for our uses: it used a flat file format, which does not scale to large numbers of mail messages; it used the host machine's authentication system, which required accounts on the machine and had to parse /etc/passwd every time someone logged on; and it was just plain slower than we wanted. So we decided to use Cyrus, which has none of these limitations, as our imap server. Best of all, with Clayton Donley's LDAP patch for Cyrus, we could use the same LDAP server as postfix for authentication. This made our entire mail system portable in addition to scalable, and we didn't have to keep multiple copies of passwords. After all, this is what LDAP was designed for.

So finally, what we have is a frontend farm of web/mail servers in DNS round robin, forwarding and retrieving mail, based upon an LDAP server, to and from the proper backend mailserver, which also uses LDAP for authentication. Since LDAP servers can be replicated, there is no single machine in the system (provided the machines have high-speed interconnectivity) that can either be a point of failure or slow the system down. This is the holy grail of scalability, and with the help of open source tools, it is what we designed.

1.2 Assumptions

This document refers to three servers: the web server, the imap server, and the LDAP server. These servers can be separate machines, several separate machines (e.g. 23 web servers, 5 LDAP servers, and 10 imap servers), or a single machine. It also assumes basic knowledge of installing software from source in a UNIX or Linux environment.

1.3 What is used.

The following software packages are used with this document.

note: the cyrus 1.6 tree uses a different method of authentication, called SASL, and the pwcheck_ldap patch is not designed for it. However, I'm told that an LDAP patch for SASL is in the works, so when that happens, this point will be moot.

1.4 Of Special Note

The setup described in this document does have one point that is not yet scalable, the SQL server used by IMP to store session data, preferences and the address book. The author is currently working on a mechanism by which the SQL server can also be easily scaled.

2. Installing Software

2.1 The Web Server

On the web server, you will need to install Apache, PHP, postfix, OpenLDAP, UW IMAP, and IMP.

Installing Apache and PHP is straightforward and is covered by the PHP install docs. You will need to compile LDAP and IMAP support into PHP.

Install postfix according to the accompanying docs, particularly the LDAP_README.

2.2 The LDAP Server

On the ldap server, you will need to install OpenLDAP. You will also need to decide on a root dn and add that to the LDAP db accordingly.

2.3 The IMAP server

On the IMAP server, you will need to install LDAP, postfix, Cyrus, and the pwcheck_ldap patch for cyrus. Install postfix with ldap support according to the docs. Install Cyrus according to the docs, but adding pwcheck_ldap.c according to the pwcheck_ldap docs. Don't forget that you need to modify the pwcheck_ldap.c file to reflect your ldap server and base dn. Also, you may need to add the line


#include <linux/stddef.h>

to pwcheck_ldap.c if you run linux as your imap server and you may need to make a few syntax corrections in the file (I forget where they are right now) to get pwcheck to compile properly. Configure cyrus with the command


./configure --with-login=unix_pwcheck --with-pwcheck=ldap

since the pwcheck_ldap docs omit the --with-login directive. Then compile and install according to the cyrus docs.

3. Configuring the servers to work together

3.1 The LDAP Server

Each mail user's entry in the LDAP database needs to have the following entries, in addition to any other entries you choose to use (assuming your basedn is o=someorg, c=US):


dn: uid=someuser, o=someorg, c=us
uid: someuser
userpassword: somepassword
maildrop: fulladdress@machine.dom.ain
mailacceptinggeneralid: someuser
(and for an added benefit, if you like)
mailacceptinggeneralid: somealias

Also, you will need to choose a user which will have cyrus admin rights. You don't need to worry about this until you configure cyrus on the imap server, but keep it in mind. Don't choose an existing mail user for admin privs: problems may include a security hole and/or an inability to check mail for that user.

3.2 The IMAP Server

Configure postfix on the imap server to use ldap for its alias mapping. This is explained in LDAP_README in the postfix docs. For Cyrus, follow the install directions that come with the package. Also, don't forget to activate pwcheck and to add your admin user to imapd.conf.

3.3 The Web Server

In this setup, the web servers are also the front end incoming mail gateways. Configure postfix to use ldap for its alias map. Postfix then looks up the maildrop LDAP attribute and forwards the mail to that address, hence the need for the full machine name in the maildrop attribute.

For instance, you could have 700,000 users split among 70 servers, 10,000 apiece. Mail that comes into one of the web servers addressed to user1@dom.ain will be forwarded to the maildrop address user1@mail05.dom.ain, while mail addressed to user657 will be forwarded to user657@mail34.dom.ain. In addition, the maildrop attribute can be used as a forwarding address, e.g. user302 forwards to someuser@somewhereelse.com.
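
The lookup that postfix performs against LDAP can be illustrated with a small PHP sketch. This is only a model of the behaviour described above, not postfix's code: the $directory array stands in for the LDAP server, resolve_maildrop() is a hypothetical helper, and the syntax assumes PHP 7.1 or later rather than PHP3.

```php
<?php
// Stand-in for the LDAP directory (an assumption for illustration): each
// entry lists the names it accepts mail for (mailacceptinggeneralid) and
// the address the mail is actually delivered to (maildrop).
$directory = [
    [
        'mailacceptinggeneralid' => ['user1', 'somealias'],
        'maildrop'               => 'user1@mail05.dom.ain',
    ],
    [
        // maildrop used as a plain forwarding address outside the domain
        'mailacceptinggeneralid' => ['user302'],
        'maildrop'               => 'someuser@somewhereelse.com',
    ],
];

// Return the maildrop address for a recipient, or null if no entry matches.
// Only the local part of the recipient address is compared.
function resolve_maildrop(array $directory, string $recipient): ?string
{
    $local = strtolower(explode('@', $recipient)[0]);
    foreach ($directory as $entry) {
        if (in_array($local, $entry['mailacceptinggeneralid'], true)) {
            return $entry['maildrop'];
        }
    }
    return null;
}

echo resolve_maildrop($directory, 'user1@dom.ain'), "\n";
echo resolve_maildrop($directory, 'user302@dom.ain'), "\n";
```

Because an alias is just another mailacceptinggeneralid value on the same entry, mail for somealias@dom.ain resolves to the same backend machine as mail for user1@dom.ain.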

Configure IMP as you normally would according to the docs. Once it is configured, make the following changes.

To config/defaults.php3, add the following lines:


/* LDAP/IMAP Server Default */
$default->LDAP_server               = 'ldap.dom.ain';
$default->LDAP_dn                   = 'o=someorg,c=US';
$default->LDAP_search_field         = 'uid';
$default->ldap_choose_server        = true;

To mailbox.php3, apply the following patch:

Index: mailbox.php3
===================================================================
RCS file: /home/cvs/imp/mailbox.php3,v
retrieving revision 2.29
diff -c -r2.29 mailbox.php3
*** mailbox.php3        1999/07/29 07:20:00     2.29
--- mailbox.php3        1999/08/04 18:04:10
***************
*** 29,34 ****
--- 29,51 ----
  require './lib/mimetypes.lib';
  require './config/defaults.php3';

+ /************LDAP**************/
+
+ if ($default->ldap_choose_server) {
+   $ldapconnect = ldap_connect($default->LDAP_server);
+   if ($ldapconnect) {
+     print("YES!\n");
+     $ldapbind = ldap_bind($ldapconnect);
+     $ldapsearch = ldap_search($ldapconnect, $default->LDAP_dn, $default->LDAP_search_field."=".$imapuser, array("maildrop"));
+     $ldapget = ldap_get_entries($ldapconnect, $ldapsearch);
+     $ldapspl = explode("@", $ldapget[0]["maildrop"][0]);
+     $server = $ldapspl[1];
+     $port = $default->port;
+   }
+ }
+ /**********end ldap************/
+
+
  /* Html styles configuration */
  require './config/html.php3';
  /* Mailbox configuration */

And you're done. You now have a highly scalable web mail solution.

note: as of August 5, 1999, the development version of IMP contains these patches, so they do not need to be added there.
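
The key step in the IMP change is turning a user's maildrop value into the name of the IMAP server to contact. The following is a minimal sketch of that step on its own, with a fallback for entries that have no usable maildrop. The function names and the fallback behaviour are assumptions for illustration and are not part of IMP; the syntax assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: derive the IMAP host from a maildrop value such as
// "someuser@machine44.dom.ain". Returns null when there is no host part.
function imap_host_from_maildrop(string $maildrop): ?string
{
    $at = strrpos($maildrop, '@');
    if ($at === false || $at === strlen($maildrop) - 1) {
        return null;
    }
    return strtolower(substr($maildrop, $at + 1));
}

// Use the host from LDAP when there is one, otherwise fall back to a
// configured default server (an assumed policy, not taken from the patch).
function choose_imap_server(?string $maildrop, string $fallback): string
{
    $host = $maildrop !== null ? imap_host_from_maildrop($maildrop) : null;
    return $host ?? $fallback;
}

echo choose_imap_server('someuser@machine44.dom.ain', 'mail.dom.ain'), "\n";
```

Checking for a missing or malformed maildrop before using it avoids handing an empty server name to the IMAP connection when the LDAP search returns nothing.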

4. Miscellaneous

4.1 Adding Users

Here's a code snippet that will add a user to the LDAP server and add a mailbox to cyrus. It is designed for one imap server, but can easily be altered to search for a server based on whatever criteria you choose.

<?php
// $username and $password are expected to be set by the caller
// (for example, from a signup form).
$machine = "mail01";

// Add the user's entry to the LDAP directory.
$ldapconn = ldap_connect("ldap.dom.ain");
if ($ldapconn) {
  $ldhb = ldap_bind($ldapconn, "cn=cyrusadmin, o=someorg,c=US", "password");
  $dn = "uid=" . $username . ", o=someorg, c=US";
  $info["uid"] = $username;
  $info["userpassword"] = $password;
  $info["objectclass"] = "account";
  $info["maildrop"] = $username . "@" . $machine . ".dom.ain";
  $info["mailacceptinggeneralid"] = $username;
  $ldhb = ldap_add($ldapconn, $dn, $info);
  ldap_close($ldapconn);
}

// Create the user's mailbox on the Cyrus server as the admin user.
$imapconn = imap_open("{" . $machine . ".dom.ain:143}", "cyrusadmin", "password");
if ($imapconn) {
   imap_createmailbox($imapconn, "{" . $machine . ".dom.ain:143}user." . $username);
   imap_close($imapconn);
}
?>
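
One possible criterion for choosing the server is to pick the backend that currently holds the fewest mailboxes and still has room. The sketch below assumes you already have a count of users per machine (for instance from your own bookkeeping); pick_backend() and the capacity limit are hypothetical and not part of the snippet above. The syntax assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: return the name of the least-loaded backend that is
// still below $capacity, or null if every machine is full.
// $userCounts maps machine name => current number of mailboxes.
function pick_backend(array $userCounts, int $capacity): ?string
{
    $best = null;
    foreach ($userCounts as $machine => $count) {
        if ($count >= $capacity) {
            continue; // this machine is full
        }
        if ($best === null || $count < $userCounts[$best]) {
            $best = $machine;
        }
    }
    return $best;
}

// Example counts are made up for illustration.
$machine = pick_backend(
    ['mail01' => 10000, 'mail02' => 8200, 'mail03' => 9100],
    10000
);
echo $machine, "\n";
```

The result can be assigned to $machine in place of the hard-coded "mail01"; a null result means it is time to add another box to the back end.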

note: due to a bug which I have yet to track down, encrypted passwords cannot be reliably supported. Use plaintext passwords instead.