Difference between revisions of "Email"

From Peyton Hall Documentation

Revision as of 21:43, 7 May 2007

Email has become one of the most important means of communication among people here, in part because they may not be physically anywhere near their collaborators. Being a 'non-intrusive' form of communication also helps: if you send someone an email and they're not available to reply right away, they can take their time formulating a response instead of needing to answer on the spot, or promising a return phone call later when time is available. Quite handy for those 4AM questions that pop up. This section explains a bit about email, including links for setting up various mail clients in the building, as well as some of the details of our email setup here in Peyton Hall.

Email basics

While email has become an important and common method of communication, many don't quite understand what happens behind the scenes after pressing 'send' in a mail client. What follows is a brief explanation of how that all works.


How it works

When you compose an email to send somewhere, your mail client generates a block of text which describes the details of the message. There are a few headers written, including who the mail is from (your From: address), the destinations (To:, CC: and BCC: for blind carbon-copied messages - those which are sent somewhere, but not written in the message itself so other recipients are unaware it was sent to them), the Date: header saying when the message was written, and a Subject: header for the subject line of the message. Some clients will also include other headers, such as priority (which works differently in different mail clients) or headers to identify the client program and version. The email is now ready to be sent to a server that speaks Simple Mail Transfer Protocol, or SMTP.

In some cases, when you send a mail, the message is handed off to an SMTP client program such as Sendmail running from the command line (/usr/bin/sendmail). This client program will take the message and handle speaking SMTP to the mail server. In other cases, your mail client will speak SMTP directly to the server. In both cases, the conversation is fairly short; the client gives the server the sender and recipient addresses, known as the "envelope headers", so called because they could be different than what the message itself has for recipients and headers (i.e., if you're sending a BCC of an email to someone, the envelope recipient is that person, even though their name doesn't exist in the actual headers of the email). Once the headers are passed, the client tells the server it's now sending the data of the message, sends the data, then signals that the message is done. At that point, the mail server can either accept the message completely - usually giving a kind of tracking number for the email - or reject it, which should cause the client to pop up an error that the mail could not be delivered. Should the mail be accepted, it is now the responsibility of the mail server which accepted the mail to either deliver it one step closer to the recipient, or return a message to the sender to say that the message could not be delivered. It is unacceptable to have a legitimate email discarded by a mail server and no "delivery status notification" sent back to the sender of the message.
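The difference between message headers and envelope recipients can be sketched with Python's standard email package. This is only an illustration, not part of the Peyton Hall setup: the build_message helper and the example.org addresses are made up for the example.

```python
from email.message import EmailMessage
from email.utils import formatdate


def build_message(sender, to, cc, bcc, subject, body):
    """Return (message, envelope_recipients) for one outgoing email.

    Hypothetical helper for illustration only.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(to)
    if cc:
        msg["Cc"] = ", ".join(cc)
    msg["Date"] = formatdate(localtime=True)
    msg["Subject"] = subject
    msg.set_content(body)
    # BCC recipients are listed in the envelope only, never in the headers,
    # so the other recipients cannot see them.
    envelope = list(to) + list(cc) + list(bcc)
    return msg, envelope


msg, rcpts = build_message(
    "alice@example.org",
    ["bob@example.org"],
    [],
    ["carol@example.org"],
    "Lunch?",
    "Noon at the usual place.",
)
```

A client would then hand both pieces to the server, for instance with smtplib's SMTP.send_message(msg, to_addrs=rcpts), so that the server delivers to everyone in the envelope while the message text names only the To: and CC: recipients.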

Once the SMTP server holds the mail, it looks at it to determine who needs a copy, and where they are. Through some clever Domain Name System (DNS) entries, known as MX records, it can tell where a mail for a certain person needs to be delivered, and will contact that server (unless this SMTP server is configured to send all mail to a "relay host", in which case it will relay the message up the chain and that host will now be responsible for delivery - this is common if you have an ISP which requires all emails be sent through their own servers, and not directly from your home computer). Once the SMTP server knows where to hand off the message, it initiates a connection to that server and repeats the process, adding to the message a few headers to mark the email's travels through itself (in most cases, a "Received:" header to show that it received the message and handed it off, while in some cases there may be a few headers added if the message was scanned for spam & viruses). This process will continue to repeat, until an SMTP server holds the message and must deliver it locally somehow.
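The DNS step can be pictured as follows. A domain's mail exchanger (MX) records each carry a preference number, and a sending server tries the host with the lowest number first. The sketch below only shows that ordering; the records are invented, and a real server would fetch them from DNS rather than from a hard-coded list.

```python
def delivery_order(mx_records):
    """Sort (preference, hostname) pairs so the most preferred host comes first.

    Lower preference numbers are tried before higher ones.
    """
    return [host for _pref, host in sorted(mx_records)]


# Made-up records for an imaginary domain.
records = [(20, "backup.example.org"), (10, "mail.example.org")]
order = delivery_order(records)
```

Here the sending server would contact mail.example.org first and fall back to backup.example.org only if the first host could not be reached.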

When the local SMTP server holds the mail, it may be delivered many ways depending on the local configuration. In many cases, the mail will be scanned for viruses and spam content, and assuming the message passes it will be delivered. Some mail storage systems work with plain text files for holding messages, so the mail server can deliver them internally. Others use the Local Mail Transfer Protocol (LMTP) to pass a message on to a mail server daemon which clients will connect to for reading mail (using either Post Office Protocol v3 (POP3) or Internet Message Access Protocol (IMAP)).

As mentioned before, if an SMTP daemon cannot deliver a message any closer to the recipient for some reason, it has two choices for what to do:

  1. Reject the message, and the sending program (SMTPD or client) is now responsible for notifying the sender somehow
  2. Generate a Delivery Status Notification (DSN) message to the sender, notifying them that their email could not be delivered and why

In some cases, DSNs are sent because the recipient does not exist, or their mailbox is full. Sometimes it may be because a virus was detected in the email, or because the mail appeared to be spam (or had enough characteristics to make it 'spammy'). It could also be that the recipient's SMTP server is not responding, in which case you'll see the familiar "Message could not be delivered for 4 hours; will keep trying for up to 5 days" DSN. These mean just what they say: the message could not be delivered in the last four hours, but the program will keep trying to deliver it (usually once every hour) until it is five days old, at which time it will send another DSN to let you know the mail could not be delivered at all. Because SMTP servers will hold messages like this, mail tends to always go through even if a mail server is down for a while. All those trying to send mail to it will just hold their messages and wait until a later time, so the worst problem you'll usually see is that messages are delayed an extra hour from when the server is brought back online.
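The hold-and-retry behaviour described above can be sketched as a small decision function. The four-hour warning and five-day limit come from the example DSN text; the function name and the returned labels are made up for illustration, and real mail servers make these intervals configurable.

```python
from datetime import datetime, timedelta

WARN_AFTER = timedelta(hours=4)    # send the "still trying" DSN
GIVE_UP_AFTER = timedelta(days=5)  # send the final failure DSN


def queue_action(queued_at, now, warned):
    """Decide what to do with a queued message on the next delivery attempt."""
    age = now - queued_at
    if age >= GIVE_UP_AFTER:
        return "bounce"           # final DSN, message leaves the queue
    if age >= WARN_AFTER and not warned:
        return "warn_and_retry"   # delay DSN, message stays queued
    return "retry"                # keep trying quietly


start = datetime(2007, 5, 7, 12, 0)
```

For example, queue_action(start, start + timedelta(hours=5), warned=False) yields the delay warning, while the same call five days after queuing yields the final bounce.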

If you'd like to know how our mail system here is configured, see the section "The nasty details" below.


Using email in Peyton Hall

There are a few ways to read your email while in Peyton Hall, which are outlined here. A word on "support" first, however. Because there are so many different ways to access email, we cannot possibly support every permutation of them. However, we have selected a few programs which are widely used and easily set up so that we can guarantee support with those programs. This does not mean you must run only one of those programs; you're free to use whatever software you like for reading and sending emails. However, if the program you've chosen does not work properly, or has some kind of problem that is not a fault of the mail server itself, you will be on your own to figure out the problem and corresponding solution. See Other clients below for information if you wish to use an unsupported email client.


Supported clients

Pine

From any Unix/Linux machine in the building, there are two main options for viewing your email. The first is Pine, a text-based email program which will work over SSH connections as well as while you're sitting in front of a machine in Peyton. While it may not have all the whiz-bang features of a GUI mail client, Pine is still in use by many and is a good no-frills (or low-frills) program to use for accessing email. Because of the way we have it configured, we can also guarantee that even if nothing else works, Pine should be able to read your mail no matter what - as long as the mail server itself is running. You can find more information about Pine in the article linked above in this section.


Thunderbird

Another supported client in the building is Thunderbird. This is what became of the mail client portion of the Mozilla suite. Thunderbird is a great IMAP client with many plugins and extensions that can help you effectively and efficiently use your email. Since it supports encryption and secure connections, you can configure Thunderbird for use on your laptop, and it will work the same if you're in the building or sitting at home. More information on Thunderbird is available in its own article (linked above).


Squirrelmail

Squirrelmail is an IMAP client which is accessed via a web interface. We run Squirrelmail on the mail server directly, and using an SSL certificate for encryption means your user name and password are not sent in-the-clear for others to see. You can do just about anything through Squirrelmail as you would with any other mail client, with the added benefit that you don't need anything more than a web browser to access it. The downside, of course, is if you don't trust the computer in front of you (for example, in an Internet cafe or web kiosk). But if you're visiting some other institution and have web access, you can read and reply to emails with Squirrelmail. You can use the article link above to see more information.


Other clients

For unsupported email clients, we have some information here which may be useful in getting them set up to access the mail server. If you have information about another client, you're welcome to either add it here, or if it's a large article create a new page for it and link to it from here instead.

  • VM is an email client that runs within Emacs.
  • Netscape Mail, while no longer in active development, may still be used - but you should probably upgrade to Thunderbird instead.


Use your own

If none of these clients fit your needs, you can always pick your own for whatever you'd like to do. The information you'll need to know is:

  • SMTP (outgoing) server: mail.astro.princeton.edu
  • IMAP (incoming) server: mail.astro.princeton.edu
  • Security/encryption must be turned on
    This means that the IMAP port will be 993 instead of 143, and SSL must be turned on. For SMTP, you'll need to tell your client to use TLS encryption. You will also have to supply your user name and password to the client, as it will have to authenticate to the mail server to be able to send mail to recipients not at the astro domain (i.e., you're relaying mail through our server from home, and the recipient is in Berkeley).
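For those scripting their own access, the settings above map onto Python's standard imaplib and smtplib modules roughly as follows. This is a minimal sketch rather than a supported client: the function names are made up, and the SMTP port of 25 is taken from the SMTPD notes in "The nasty details", so check it against whatever your network allows.

```python
import imaplib
import smtplib

MAIL_HOST = "mail.astro.princeton.edu"
IMAP_SSL_PORT = 993  # IMAP over SSL, instead of the plain port 143
SMTP_PORT = 25       # standard SMTP port, upgraded to TLS after connecting


def open_inbox(user, password):
    """Connect to the IMAP server over SSL and open the inbox read-only."""
    imap = imaplib.IMAP4_SSL(MAIL_HOST, IMAP_SSL_PORT)
    imap.login(user, password)
    imap.select("INBOX", readonly=True)
    return imap


def send_mail(msg, user, password):
    """Send an email.message.EmailMessage through the server with TLS and authentication."""
    with smtplib.SMTP(MAIL_HOST, SMTP_PORT) as smtp:
        smtp.starttls()              # switch the connection to TLS
        smtp.login(user, password)   # required for relaying to outside domains
        smtp.send_message(msg)
```

Neither function runs until it is called with a real user name and password, so importing the sketch does not touch the network.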


Accessing email from outside Peyton

As nice as the building is, you can't always be here. So how do you access your mail from somewhere else? Simple!

  • If you have Thunderbird configured on your laptop or home computer, it will work just the same while outside the building as it will while connected to our network.
  • If you have SSH access from wherever you're connecting, you can log in to any machine and run Pine.
  • If all you have is a web browser, you can use Squirrelmail.


The nasty details

This section explains all the nitty-gritty of the mail system in use here. For casual users, it's probably not important, but the more curious may want to know exactly what programs touch their email on its travels through the Ether.


SMTPD

After running Sendmail for a long time, I had a look at Postfix before one of the mail server upgrades. While Sendmail configuration has become so cumbersome that there are entire books dedicated to the configuration files (O'Reilly's Bat Book), Postfix's configuration is fairly straightforward. With mostly plain-English directives, I was able to set up a Postfix system that mirrored our Sendmail system in a fraction of the time. What's more, the Postfix system allowed for sending a mail through a filter and then processing it on the other side, and authentication/authorization and encryption for security. Sendmail is only used for secondary systems now, and only because it was the default installation at the time.

We're running Postfix 2.2.x compiled with SMTP AUTH via SASL (which in turn gets information through PAM using LDAP; there's also a local LDAP slave running on the mail server for its own use and to make the machine autonomous). Postfix is configured to use TLS encryption over the standard port (25), allowing for secure authentication to the server for relaying mail.


IMAPD

Originally, all email was delivered to a file named for each user, which was cross-mounted on all machines in the building. While this allowed for the use of things like 'grep' to search your inbox, it had a number of downsides:

  • Mail server downtimes could cause strange NFS locking issues on every machine in the building
  • Lock contention
    If the mail server process and your client were both trying to write to the file at the same time, there was a very real chance that mail would be lost, and/or the file completely corrupted.
  • Remote access to mail was impossible - the only option was to log in and use Pine

To make the system more robust, I moved everything over to Cyrus IMAPD. Cyrus is a "black box" mail system - end users do not (and in fact, cannot) log in to the mail server, but only speak to it over IMAP. This makes the system better for us, since it no longer needs to rely on the availability of the home directory server - in fact, our mail server could be physically removed from the building, and plugged in somewhere else on campus, and as long as it can get the same IP address it uses while here, it will continue to function fully. Cyrus IMAPD allows for filtering via the Sieve filtering language (similar to how Procmail works). Cyrus is also compiled to use SSL or TLS encryption depending on the connection.

We use Cyrus IMAPD 2.3.x.


Spam & virus filtering

In this day and age, it's impossible to run a mail server without filtering for spam (and it's irresponsible to run one without virus scanning as well). Early on in the process, I added AMaViS as a filter in the mail processing pipeline to do just that. It scans every incoming and outgoing mail for viruses, and scores mails based on their content to see if they're spammy or not. The spam filtering comes from SpamAssassin, a powerful filter which is very configurable and routinely updated.

We're running amavisd-new version 2.3.x, and SpamAssassin 3.1.x.


Mailing lists

Originally, mailing lists here were nothing more than "exploders"; an alias which mapped to a long list of recipients, so an email sent to the alias would just be expanded to all the people on the list. While effective, these are harder to maintain and not very powerful. By installing and using Mailman, there's a lot more flexibility available in our lists, including the ability to delegate someone other than us as the maintainer and moderator (important in the case of "user-run" lists for projects and such). We can also offer archiving for mailing lists this way, so that all mails can be stored on the server for future reference.

We currently run Mailman 2.1.x.


Squirrelmail

As mentioned above, Squirrelmail allows anyone to read their email with just a web browser over a secure connection. A web-based IMAP client, Squirrelmail has helped those who might not be at a computer capable of doing SSH, or installing Thunderbird for a full-fledged IMAP client. Some people even use it full-time, instead of a regular IMAP client.

Squirrelmail 1.4.x is the version currently installed.


How it all fits

An email coming in from outside the building to be delivered to a local account takes the following path:

  1. External Postfix
    Postfix takes the details of the email, and when the remote client/server finishes sending it does not immediately reply with an acceptance or rejection message - instead, the connection is held open while processing takes place. Because of this, we never accept responsibility for an email until it's been delivered, so if any stage fails it is still up to the sending computer to either retry or notify the original sender of the failure.
  2. AMaViS
    amavisd-new runs the email through a virus scanner, and through SpamAssassin to assign a score to the message. If the virus scan fails, the message is rejected and Postfix will inform the sending system. SpamAssassin tags all emails with their score, no matter the value, for debugging purposes. If the score is above 3.0 but below 5.0, the email will also be flagged as spam but still delivered (it's up to the user if they want finer-grained filtering at that point). If the score is 5.0 or above, the mail is rejected.
  3. Assuming above passed, mail moves to an internal Postfix queue
    This queue deals with local delivery; either handing the message off to Cyrus, or forwarding it to another computer for processing (SDSS and some other lists are processed elsewhere). The internal Postfix checks to make sure the recipient exists, and if not will reject the mail (which carries through the open pipes back to the sender). At this point, if this Postfix accepts the message, amavisd-new sends back an acceptance to the external Postfix, which in turn tells the sending computer that the email was accepted.
  4. Cyrus via LMTP
    The internal Postfix speaks to Cyrus with the Local Mail Transfer Protocol (LMTP), a lightweight version of SMTP for local intra-network processes. Cyrus will accept the message from Postfix, process it through Sieve (if needed) and save it to a mail folder for later viewing.
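The decisions in the path above can be summarized in a short sketch. The 3.0 and 5.0 thresholds are the ones quoted in step 2; the function names and return labels are made up for illustration, and the real logic lives in the amavisd-new and Postfix configuration rather than in code like this.

```python
FLAG_ABOVE = 3.0  # scores above this are marked as spam but still delivered
REJECT_AT = 5.0   # scores at or above this are rejected


def filter_verdict(virus_found, spam_score):
    """Step 2: what the virus and spam scan decides for one message."""
    if virus_found or spam_score >= REJECT_AT:
        return "reject"
    if spam_score > FLAG_ABOVE:
        return "deliver_flagged"
    return "deliver"


def smtp_reply(virus_found, spam_score, recipient_exists):
    """Reply the external Postfix gives once the held-open connection resolves."""
    if filter_verdict(virus_found, spam_score) == "reject":
        return "reject"
    if not recipient_exists:
        return "reject"  # step 3: the internal Postfix refuses unknown recipients
    return "accept"
```

Because the sending computer only ever sees the final accept or reject, a failure at any stage leaves responsibility for the message with the sender, as described in step 1.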


Email FAQ

See also Mailing Lists for information about the mailing lists available.