I've just presented a draft spec of this system to a private members-only site I am running.
Since it will be implemented in Perl (of course) if it goes ahead in any way, shape or form, I wanted to bounce it off my fellow PerlMonks too.
Let me know what you think:
This is a specification for a non-existent software system codenamed "Segmail". It doesn’t exist yet – and it may never.
++ BACKGROUND ++
The problem with current email infrastructure is that the From address of an email message is based on the honor system.
Anyone can send a fake message to you, and make the message look like it has come from someone else or simply "no one at all".
This is one of the biggest causes of spam. The ability to send email anonymously means that once someone gets your address and sends you email, there is no way to (a) identify them; and/or (b) stop them from sending you email.
Over time, the problem compounds, and you receive more and more junk email.
Current junk email filters are either not very accurate, or require a real correspondent to “jump through a hoop” by answering a challenge/response.
The only way currently to really stop junk mail is changing your email address. This is a pain, because you have to tell everyone you know that you have a new address. Doing it more than once a year is impractical.
++ PROS & CONS ++
With that in mind, let's talk about the Pros & Cons of this new theoretical system called “Segmail” which I am going to specify.
Segmail is compatible with normal email. You can use Segmail to exchange email with someone that doesn't use Segmail, and still get all of the benefits described.
Furthermore, it allows you to continue to use your current email client software (Outlook, Eudora) and even in many cases your current Internet Service Provider and mail server.
Compared to existing systems, Segmail is 100% accurate. That means no false positives. No real mail will be marked junk. Almost no junk email will get through (at least much less than any current system).
Segmail allows you to accurately identify who is sending you email (or at least who gave them your email address). This is not technically possible with current systems.
As a corollary, Segmail allows you to block an individual correspondent from sending you email. This is not technically possible with existing systems (as they can't identify who is sending the email).
Segmail does not require your contacts to “jump through a hoop” with a challenge/response.
The downside is that you can’t give people your email address unless you have Internet access handy to use Segmail. You need Segmail present to generate your email address. You will see why.
Also, your email address looks really long and ugly. This is only really a problem when you have to write it down, or give it to someone over the phone.
Thirdly, you have to change your email address once, when you first start using Segmail. You cannot use Segmail with your current email address.
++ ENTER SEGMAIL ++
Segmail itself is implemented as an email proxy. You check your email through it via POP/SMTP with a normal email client (e.g. Outlook, Eudora). Segmail then talks POP/SMTP to your real email server on your behalf.
A domain is set up for each Segmail user such that any incoming message to that domain goes to their POP mailbox. For example, for John Doe the domain might be john.doe.com – so any message to foo@john.doe.com, bar@john.doe.com or anything@john.doe.com goes to John’s POP mailbox – and through Segmail.
Segmail maintains a database with your address book in it. You can access it via a normal user interface (web, native, wap, whatever).
Segmail generates and stores a random password for each entry in your address book.
It uses this password to generate a different email address for each entry in your address book. These email addresses "segregate" the exposure of your email address(es) - hence the name.
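A minimal sketch of the per-entry password store described above, in Python for brevity. The alphabet (lowercase letters and digits), the length of eight, and the names `new_password` and `AddressBook` are assumptions for illustration; the spec only says the password is random and stored per entry.

```python
import secrets
import string

# Assumed alphabet: lowercase letters and digits, matching the look of "gh3f3gh3".
ALPHABET = string.ascii_lowercase + string.digits


def new_password(length=8):
    """Return a random password drawn from a cryptographically strong source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class AddressBook:
    """Maps a contact label (e.g. "tom") to that contact's password."""

    def __init__(self):
        self._passwords = {}

    def password_for(self, label):
        # Create the entry on first use; reuse the stored password afterwards.
        if label not in self._passwords:
            self._passwords[label] = new_password()
        return self._passwords[label]


book = AddressBook()
print(book.password_for("tom"))  # a random 8-character string
```

Using `secrets` rather than `random` matters here, since the password is the only thing standing between a guesser and the inbox.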
++ SENDING MAIL ++
When you send a message, Segmail checks to see if the recipient is in your address book - if not, a new entry is automatically created.
Next, Segmail changes the From address of your outgoing message to a new address. The new address contains the password associated with the recipient.
For example, suppose John Doe had a friend, Tom Smith.
The secret password that Segmail generated for Tom Smith is gh3f3gh3. Segmail would change the From address of John's outgoing message to be "john-tom-gh3f3gh3@john.doe.com".
The "john-tom" bit is to help keep track of it by humans. (It's John's address given to Tom). Segmail itself is really only interested in the password.
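The From rewrite could be sketched roughly as follows, using Python's standard `email` package. The function names and the sample addresses `john@example.com` and `tom@example.org` are hypothetical; only the `owner-label-password@domain` layout comes from the example above.

```python
from email.message import EmailMessage


def segmail_address(owner, label, password, domain):
    """Build the per-correspondent address: owner-label-password@domain."""
    return f"{owner}-{label}-{password}@{domain}"


def rewrite_from(msg, owner, label, password, domain):
    """Replace the From header of an outgoing message with the Segmail address."""
    del msg["From"]  # deleting a header that is absent is not an error
    msg["From"] = segmail_address(owner, label, password, domain)
    return msg


msg = EmailMessage()
msg["From"] = "john@example.com"   # hypothetical original address
msg["To"] = "tom@example.org"      # hypothetical recipient
msg.set_content("Hi Tom")

rewrite_from(msg, "john", "tom", "gh3f3gh3", "john.doe.com")
print(msg["From"])  # john-tom-gh3f3gh3@john.doe.com
```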
Basically, each person, company, or indeed any place where your email address is exposed – is given a different email address for you. Each of these email addresses has a “password” in it, so that it can’t be guessed.
Each person sees a different email address for you.
++ SETUP ++
As part of the setup process, when you first put your address book in Segmail, it gives you the option of sending out a “my email address has changed” to each of your contacts. Each contact receives a different email address for you (as in the example above).
The user interface to Segmail presents the option of generating a new entry manually. This allows you to generate an address for the places where you expose your email address down other channels, other than by sending email.
For example: Filling in your contact details at your bank. You go to Segmail, create a new entry “bank” in your address book. Segmail gives you an address like "john-bank-g5hj32g5j@john.doe.com" – which you then give to your bank.
++ RECEIVING MAIL ++
When Segmail receives mail it checks to see if the password is valid. If it isn’t, it marks it as junk (bounces it, deletes it, challenge/responses it, moves it to a different folder, whatever).
If it is valid, it lets the message through. If the From address does not match the entry in the Address Book, a warning is added to the bottom of the message that contains the original address of the correspondent in the address book.
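The receive-side check might look something like this. It assumes the password is the last hyphen-separated part of the local part and never contains a hyphen itself, and that the address book can be looked up by password; both are assumptions, as are the names `password_of` and `classify`.

```python
def password_of(address):
    """Extract the password: the last hyphen-separated piece of the local part."""
    local = address.split("@", 1)[0]
    return local.rsplit("-", 1)[-1]


def classify(to_addr, from_addr, book):
    """Classify an incoming message.

    book maps password -> the correspondent address on record.
    Returns ("junk", None), ("ok", None), or ("warn", address_on_record).
    """
    on_record = book.get(password_of(to_addr))
    if on_record is None:
        return ("junk", None)           # unknown password
    if from_addr.lower() != on_record.lower():
        return ("warn", on_record)      # valid password, unexpected sender
    return ("ok", None)


book = {"gh3f3gh3": "tom@example.org"}  # hypothetical entry for Tom
print(classify("john-tom-gh3f3gh3@john.doe.com", "tom@example.org", book))
print(classify("john-tom-gh3f3gh3@john.doe.com", "someone@example.net", book))
print(classify("john@john.doe.com", "spammer@example.net", book))
```

The "warn" case is what identifies who leaked the address: the password says which entry it was given to, whoever the sender claims to be.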
Segmail allows you to "block" or "rotate" an entry in your address book.
When an entry is "blocked" it no longer accepts mail with that password. You would do this if you start receiving unwanted mail through that address.
When an entry is "rotated", a new password is generated and the old one marked old. When a message is received at the old one, an automatic message is sent to the correspondent saying “My email address has changed. Please resend your message to my new address. My new address is BLAH” – where BLAH is the encoded email address with the new password in it.
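Block and rotate could be modelled as state on each address book entry. This is only a sketch: the class name `Entry`, the result labels, and the choice to remember every old password indefinitely are assumptions not fixed by the spec.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits  # assumed alphabet


def new_password(length=8):
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class Entry:
    """One address book entry with a current password and its history."""

    def __init__(self, owner, label, domain):
        self.owner = owner
        self.label = label
        self.domain = domain
        self.current = new_password()
        self.old = set()      # passwords retired by rotate()
        self.blocked = False

    def address(self):
        return f"{self.owner}-{self.label}-{self.current}@{self.domain}"

    def block(self):
        self.blocked = True

    def rotate(self):
        """Retire the current password and return the new address."""
        self.old.add(self.current)
        fresh = new_password()
        while fresh in self.old:  # never reissue a retired password
            fresh = new_password()
        self.current = fresh
        return self.address()

    def handle(self, password):
        """Decide what to do with mail that arrived carrying this password."""
        if self.blocked:
            return ("reject", None)
        if password == self.current:
            return ("accept", None)
        if password in self.old:
            # Sender should be told: "My new address is ..."
            return ("moved", self.address())
        return ("reject", None)


bank = Entry("john", "bank", "john.doe.com")
first = bank.current
bank.rotate()
print(bank.handle(first))  # ("moved", <the new address>)
```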
++ ADVANCED: WEB EXPOSURE ++
A special entry could be placed in the address book under "web", specifically for the address you will place on your web site (if you do so). It could be linked to Segmail, to automatically rotate every few days. A web-exposed email address is one of the most common sources of unwanted mail; making the address stale every few days would go a long way to curbing it.
Physical business cards can be rotated as well, but most likely over a longer time period than a few days.
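A scheduled rotation check could be as small as the sketch below. The three-day and one-year periods are placeholder values chosen for illustration, not part of the spec, and `rotation_due` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Placeholder rotation periods per entry; the real values would be user-configurable.
MAX_AGE = {
    "web": timedelta(days=3),
    "business-card": timedelta(days=365),
}


def rotation_due(label, last_rotated, now):
    """True if the entry's address is older than its configured maximum age."""
    max_age = MAX_AGE.get(label)
    if max_age is None:
        return False  # entries without a schedule are rotated manually only
    return now - last_rotated >= max_age


now = datetime(2005, 9, 24)
print(rotation_due("web", datetime(2005, 9, 20), now))            # True
print(rotation_due("business-card", datetime(2005, 9, 20), now))  # False
```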
++ FEEDBACK ++
Please consider the following questions:
1. Are there Pros of Segmail not mentioned?
2. Are there Cons of Segmail not mentioned?
3. Can you think of any constructive improvements or changes to this system?
4. What do you think of it in general?
Thanks for reading!
Re: RFC: Email 2.0: Segmail
by Tanktalus (Canon) on Sep 24, 2005 at 02:43 UTC
|
A lot of interesting thoughts here. It addresses one part of the spam problem: the spam getting through to your inbox. What it doesn't help with is another part: the traffic of spam. Lots of bandwidth is being used by spam, and this may have no effect. Nay, I propose it will have no effect.
I base this on the domain server that I do run for an ever-shrinking Fidonet network in Toronto, Canada. The vast, vast majority of email that hits my poor server is junk. It's also mostly blocked at the source. How do I know it's junk? Because the majority of email addresses that these people are attempting to hit don't exist, and never did. A significant portion is addressed to addresses that were defunct 5 years ago (and, for the pedantic of us, still are). And the next biggest group is spam for real, live email addresses. And then legitimate traffic.
Thus, using a setup like you propose, which may make my life a bit easier when reading my inbox, won't reduce the strain on the server by the large amounts of junk that it will receive, some of which will still be stored by the server until you retrieve it and Segmail rejects it.
By rejecting mail and rotating your email address, the domain will actually get more junk mail to handle, which Segmail will need to sort through, as spammers send their spam to all your outdated email addresses that they've harvested - you're giving them a much longer list to spam against, even if you don't actually see a single one in your inbox.
That said, if you take it with its shortcomings, it seems pretty neat. It does, however, stop me from sending friend A's email address to friend B - since I don't know what the email address for A is that will be valid for B. And if both A and B use Segmail, we're going to have a hard time connecting without A and B both trusting me to hand off the password-encrusted email address and deleting it right away.
I suppose that part of my concern with this is that your solution does more to hide spam than cure it, which could give a false impression to those who don't know what you're doing under the covers. Many people would be fooled into thinking that you've solved a problem when the amount of bandwidth available to their network hasn't actually improved, thus the cost of their bandwidth won't get better, but it may possibly get worse. That said, if the whole world were doing this, perhaps spammers would give up and that would have the beneficial effect - but you'd have to hit critical mass first.
Just my 2 cents CDN.
| [reply] |
|
That's a pretty neat idea (you should retry your link - it doesn't point to the tarpit entry ;->). As it is, I've reduced the bandwidth by getting my SMTP author to reject email when my filteraddress perl script says to reject it, and then to disconnect after about 4 incorrect commands - most MTAs that send spam seem to ignore the reject command, and just send the rest of the email immediately anyway, which results in hundreds of bad commands to my SMTP server. So the SMTP server now just disconnects after a few bad commands, and I don't even see the rest of the email - it's blocked partway through the header.
The tarpit idea is even better - if I could just delay the rejection by a second or two, then disconnect after 4 bad commands, that could have a beneficial effect. At least that system is a good multitasker with usable threads ;-) otherwise I could overload the system with this - all these extra processes waiting around.
| [reply] |
|
Thanks for your comments.
The sharing problem is no problem. If you give your email address for person A to person B, it will still work - but you and person B are now in the same "segment" together. That means if the address gets compromised then person A will have to rotate the address for both of you.
Still a damn sight better than rotating the address for everyone in your address book. :)
As for the traffic caused by spam - I like your suggestion. Everyone use Segmail and they'll all give up. :)
-Andrew.
| [reply] |
Re: RFC: Email 2.0: Segmail
by tirwhan (Abbot) on Sep 24, 2005 at 08:35 UTC
|
The one thing you'd need to make sure of when implementing this would be to make rejection/acceptance happen at SMTP time. This goes especially for the notifications about a new email address, don't generate a new mail for these but rather give the information in an SMTP error message. Otherwise you're just aggravating the situation, because bounces and automatic messages in reaction to spam are almost as much of a problem as the spam itself these days.
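To make that concrete, here is a rough sketch of mapping Segmail's decision onto a reply to the SMTP `RCPT TO` command, so that nothing is accepted and bounced later. The function name and the three status labels are hypothetical; the 250, 550 and 551 reply codes and their texts are the standard ones from the SMTP specification.

```python
def rcpt_reply(status, new_address=None):
    """Return the SMTP reply line for a recipient, decided before DATA is sent.

    status is one of "accept", "moved", "reject" (labels assumed for this sketch).
    """
    if status == "accept":
        return "250 OK"
    if status == "moved" and new_address:
        # Tell the sending server where to retry instead of mailing a notice.
        return f"551 User not local; please try <{new_address}>"
    return "550 Requested action not taken: mailbox unavailable"


print(rcpt_reply("accept"))
print(rcpt_reply("moved", "john-tom-k2m9x7qa@john.doe.com"))
print(rcpt_reply("reject"))
```

A genuine sender's own mail server turns the 551 into a local bounce, while a forged sender never receives anything.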
All in all I'm not overly optimistic about the efficiency of this scheme. There is a lot of evidence of cooperation between spammers and virus writers, and it'd be trivial to write a virus which, in addition to just sending itself out to everyone in a victim's address book, harvests this address book and sends it back to the spammer, complete with the victim's address. Then the spammer has a set of sender-recipient addresses which are guaranteed to work (unless there's an additional anti-spam-mechanism in place) for segmail addresses. Sure, you can then rotate the address, and the spammer can harvest it again, lather, rinse, repeat. It's an automatic process for the spammer, not so automatic (and rather tortuous) process for you. For this to happen segmail would have to become popular enough to register on the spammers' annoyometer first of course, so I guess it can be useful until then (sorta like grey-listing).
Personally, the annoyance of my correspondents having to keep track of which address is my current valid one (and valid for them, not someone else) would keep me from using such a scheme. I'm quite happy with my anti-spam setup at the moment (in fact, I receive far more spam via snail-mail than email, wish someone would implement Spamassassin::RealWorld ;-).
| [reply] |
|
Your comments are good.
I think that it all depends on how often a segmail address is compromised and needs to be rotated. Intuitively, and from speaking to people that use a manual system in the spirit of segmail - this happens, but quite infrequently.
I think the majority of spam comes "over time" as your address is copied/sold from one spammer's address book to the next.
I've definitely noticed that my older addresses get much more spam than my newer ones. I think this is evidence of that fact.
-Andrew.
| [reply] |
Re: RFC: Email 2.0: Segmail
by gloryhack (Deacon) on Sep 24, 2005 at 08:09 UTC
|
If what you'd like to do is to solve the problem at hand (if possible) using existing tools/without introducing a modification to an existing protocol, try DSPAM and SPF.
Between those two things, and some fairly aggressive DNS-based host rejection filters, I get between one and two spam messages per month in my email inbox -- and my email address was out there on the internet before spam was an issue. If DNS-based host rejection isn't feasible, DSPAM will just have to work harder, but work it will.
In general, I think it unwise to fiddle around with things that work for their intended purpose, especially when desired additional functionality can be had for minimal effort.
Specifically (intended constructively, and offered in a friendly tone), I see your proposed scheme suffering from the usual problems -- your ++RECEIVING MAIL++ section details a mechanism that suffers from the same old problems inherent in unintelligent filtering and/or challenge/response systems. I don't see that you've worked around any limitations in any existing system or provided any functionality not already available in a mature product.
However you go about it, I wish you the very best of luck with your project. | [reply] |
|
Whatever scheme someone thinks up will have to be adopted by enough domains before it starts to be effective. That doesn't mean it's not a good idea. Just that it's going to take a while so be ready to stick with the project. :)
Recently my domain has been getting bounce messages to non-existent accounts. The reason is that spam is currently being sent with forged From addresses that are just some random name @ mydomain.
I have had SPF set up in my DNS for a while now. I like it because it's pretty simple to set up. I don't have any issues where I try to send from different ISPs when I'm on the road, etc.
I have learned that many domains do not use SPF. AOL, Gmail, Hotmail are some that do but many others don't. I've been emailing the whois contacts for each domain that bounces spam to me to tell them about SPF.
(Side note: if your WHOIS contact information doesn't work, I think you should lose your domain.)
| [reply] |
|
That is not true superfrink. The Segmail system works effectively even if only one person takes it up. It is backwards-compatible with current email infrastructure. It does not matter how many other people use it. That is the point.
-Andrew.
| [reply] |
|
Hi Gloryhack.
I think statistical junk mail filters like DSPAM and SPF are great, and better than nothing, but I can't stand the false positives. Losing any legitimate mail is not acceptable, which is the reason most people have a junk email folder rather than simply deleting junk automatically.
Segmail specifically will never have a false positive by design. No legitimate mail will ever get filtered. This is a big win, because it means you can delete stuff marked as junk straight away.
As for the "same old problems inherent in unintelligent filtering and/or challenge/response systems", if you could state what those same old problems are - perhaps I could address them. I think you may be wrong, and many seem to agree with me.
No offense, but it looks to me like you have glossed over the spec - and haven't taken the time to understand how Segmail works - before commenting.
-Andrew.
| [reply] |
|
I haven't seen a DSPAM false positive in many months. Once trained, it's exceptionally accurate and very low-maintenance. I've been using DSPAM for at least two years, and have been quite impressed with it. SPF is not a statistical filter.
Your handling of mail received at an expired ("rotated") address looks to be a sticking point, to me. If Segmail sees an invalid password, "it marks it as junk (bounces it, deletes it, challenge/responses it, moves it to a different folder, whatever)". If a legitimate message is bounced or challenged, the sender might decide instead to abandon the contact -- this is one of challenge/response's sticking points. Legitimate message deletion is a big sticking point with unintelligent filters.
You say that "No legitimate mail will ever get filtered", but what happens if a correspondent doesn't have your most recent address? "it marks it as junk (bounces it, deletes it, challenge/responses it, moves it to a different folder, whatever)". In any except the quarantine response, the legitimate mail will be filtered, perhaps into the bit bucket.
Live like you want to live. I was merely suggesting that you consider alternatives that already exist and have been proven in the real world by thousands or tens of thousands of users.
| [reply] |
|
|
Okay, I've had a look at SPF.
It suffers the same problem as digital signing. It requires that all of your email correspondents use it in order for it to be effective.
As an email user, I don't have control over how my correspondents use email. The only thing I have control over is what email address I give them. By piggy-backing a username and password in the address I give each of my correspondents, I can identify and authenticate them - without buy-in from them.
That is the essence of this Segmail spec - and what makes it different from DSPAM, SPF, Statistical Junk Mail Filters, Challenge/Response systems and Digital Signing.
-Andrew.
| [reply] |
|
Again, good luck with your project. I hope it does for you what you want done.
Just for the record: Of the 89,326 messages destined for my account and processed by DSPAM since I last reset the stats, it's been 99.6% accurate, with a 0.04% false positive rate. I haven't seen a false positive in several months. I see two or fewer spams in my inbox each month, and it takes me all of about a minute a day to clear my spam quarantine. DSPAM is a darn fine product.
My anti-spam system consists of some DNS-based blacklists (one local, the rest third-party) and SPF on the front line, with DSPAM behind it. This configuration meets my goals, in that it stops network transfer of most spam and quarantines the rest. Yesterday, the front-line stopped 396 connections, 18 of them stopped by the local blacklist, seven by SPF. 66 messages got through the front-line and were processed by DSPAM (with 100% accuracy for the day). Most of the spam that gets through the front line does so by virtue of coming via hosts I remotely administer for others, where I'm known variously as webmaster, postmaster, hostmaster, and root, and webmaster is usually visible on the web. Without those, I'd have received only three messages in my quarantine yesterday, which is not bad at all for an account I've had for seven years that's been exposed (unobfuscated) on the web and in Usenet since day one.
Again, I hope your project does for you what you want done, and wish you the best of luck with it.
| [reply] |
|
Re: RFC: Email 2.0: Segmail
by Corion (Patriarch) on Sep 24, 2005 at 06:47 UTC
|
There is a similar idea that Apache::SMTP/Apache2::Protocol::ESMTP surveyed - instead of munging the email address, munge the email hostname by setting up a wildcard domain and giving each email a different hostname. That way you could not only switch off an email address but also immediately stop all incoming traffic for that poisoned email address.
The problem with this setup is a human one (or so I gathered from the comments of the author) - people do not want their email addresses munged and can't stand that they're sending with a different email than what is on their business cards. But it seemed feasible for a small-scale, private/clueful email setup.
| [reply] |
Re: RFC: Email 2.0: Segmail
by chromatic (Archbishop) on Sep 24, 2005 at 07:00 UTC
|
| [reply] |
Re: RFC: Email 2.0: Segmail
by Ultra (Hermit) on Sep 25, 2005 at 16:40 UTC
|
The secret password that Segmail generated for Tom Smith is gh3f3gh3. Segmail would change the From address of John's outgoing message to be "john-tom-gh3f3gh3@john.doe.com".
This thing is bothering me. It's not _that_ secret when sent over the Internet, in plain-text, right?
Suppose Segmail becomes the de facto mailing standard. Then I think the spammers' techniques will change: a little less web spidering, a little more network sniffing and "address book" grabbing.
I guess this is a weakness in the design ... compared with Signing/Encrypting techniques that _require_ the user to do some action (i.e.: typing a password).
| [reply] |
Re: RFC: Email 2.0: Segmail
by radiantmatrix (Parson) on Sep 27, 2005 at 19:26 UTC
|
You Personally advocate a
(x) technical ( ) legislative ( ) market-based ( ) vigilante
approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)
( ) Spammers can easily use it to harvest email addresses
(x) Mailing lists and other legitimate email uses would be affected
( ) No one will be able to find the guy or collect the money
(x) It is defenseless against brute force attacks
(x) It will stop spam for two weeks and then we'll be stuck with it
(x) Users of email will not put up with it
(x) Microsoft will not put up with it
( ) The police will not put up with it
( ) Requires too much cooperation from spammers
( ) Requires immediate total cooperation from everybody at once
(x) Many email users cannot afford to lose business or alienate potential employers
(x) Spammers don't care about invalid addresses in their lists
( ) Anyone could anonymously destroy anyone else's career or business
Specifically, your plan fails to account for
( ) Laws expressly prohibiting it
( ) Lack of centrally controlling authority for email
( ) Open relays in foreign countries
(x) Ease of searching tiny alphanumeric address space of all email addresses
(x) Asshats
( ) Jurisdictional problems
( ) Unpopularity of weird new taxes
( ) Public reluctance to accept weird new forms of money
( ) Huge existing software investment in SMTP
( ) Susceptibility of protocols other than SMTP to attack
( ) Willingness of users to install OS patches received by email
( ) Armies of worm riddled broadband-connected Windows boxes
(x) Eternal arms race involved in all filtering approaches
(x) Extreme profitability of spam
(x) Joe jobs and/or identity theft
( ) Technically illiterate politicians
(x) Extreme stupidity on the part of people who do business with spammers
(x) Extreme stupidity on the part of people who do business with Microsoft
(x) Extreme stupidity on the part of people who do business with Yahoo
( ) Dishonesty on the part of spammers themselves
(x) Bandwidth costs that are unaffected by client filtering
(x) Outlook
and the following philosophical objections may also apply:
(x) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
( ) Any scheme based on opt-out is unacceptable
( ) SMTP headers should not be the subject of legislation
( ) Blacklists suck
(x) Whitelists suck
( ) We should be able to talk about Viagra without being censored
( ) Countermeasures should not involve wire fraud or credit card fraud
( ) Countermeasures should not involve sabotage of public networks
( ) Countermeasures must work if phased in gradually
( ) Sending email should be free
(x) Why should we have to trust you and your servers?
( ) Incompatibility with open source or open source licenses
( ) Feel-good measures do nothing to solve the problem
(x) Temporary/one-time email addresses are cumbersome
( ) I don't want the government reading my email
( ) Killing them that way is not slow and painful enough
Furthermore, this is what I think about you:
(x) Sorry dude, but I don't think it would work.
( ) This is a stupid idea, and you're a fascist for suggesting it.
( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!
<-radiant.matrix->
Larry Wall is Yoda: there is no try{} (ok, except in Perl6; way to ruin a joke, Larry! ;P)
The Code that can be seen is not the true Code
"In any sufficiently large group of people, most are idiots" - Kaa's Law
| [reply] [d/l] |
|
Thanks for your comments radiantmatrix.
(1) Segmail doesn't require wide-spread adoption in order to be effective.
(2) Further, depending on how you use email, it may not be the right solution for you (compare the advantages versus the disadvantages) - and that is okay because see point 1.
Mailing lists and other legitimate email uses would be affected
A mailing list is just another correspondent. You supply your email address, they send you mail at that address, it goes through as normal. I am unclear how they would be affected.
It is defenseless against brute force attacks
The fact that the email address contains an eight character random password means that they are defended against dictionary attacks. Certainly better defended than a non-segmail address is. Define a brute force attack.
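For a sense of scale, here is the arithmetic for an eight-character password. The alphabet of 36 symbols (a-z plus 0-9) is an assumption based on the example "gh3f3gh3", and the guessing rate is a made-up figure purely for illustration.

```python
ALPHABET_SIZE = 36  # assumed: a-z plus 0-9
LENGTH = 8

keyspace = ALPHABET_SIZE ** LENGTH
print(keyspace)  # 2821109907456, about 2.8 trillion


def expected_guesses(keyspace, live_passwords=1):
    """Average number of distinct random guesses needed to hit any live password."""
    return (keyspace + 1) / (live_passwords + 1)


RATE = 1000  # hypothetical delivery attempts per second against one domain
seconds = expected_guesses(keyspace) / RATE
print(seconds / (60 * 60 * 24 * 365), "years on average at", RATE, "guesses/s")
```

The expected cost drops as an address book grows, since any live password counts as a hit, and every guess still costs the receiving server a connection.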
It will stop spam for two weeks and then we'll be stuck with it
Stuck with what? The system or more spam? If one address is compromised it can be rotated without affecting the rest of your correspondents.
Users of email will not put up with it
Not all users of email are required to put up with it. It works whether or not everybody uses it. For the correspondents of a Segmail user, it is simply a normal address change. What's to put up with?
Many email users cannot afford to lose business or alienate potential employers
Once again, it is simply a one-time address change. I don't see how it affects them.
Spammers don't care about invalid addresses in their lists
Therefore what? This statement is irrelevant.
Ease of searching tiny alphanumeric address space of all email addresses
This is a stated disadvantage - and for some users of email, may be worth the tradeoff.
Why should we have to trust you and your servers?
You don't. Run it on your own server. It is a decentralized solution requiring no centralized authority in order for it to work.
Temporary/one-time email addresses are cumbersome
This is a stated disadvantage - and for some users of email, may be worth the tradeoff.
-Andrew.
| [reply] |
Re: RFC: Email 2.0: Segmail
by mattr (Curate) on Dec 03, 2005 at 17:40 UTC
|
Hi,
What can I say, great minds think alike? I never saw this thread and coincidentally posted a very similar idea today, a month later. Weird. Well I updated my page to credit you and hopefully this will lead to more discussion that might help you. Seems like a doable project, though I am not sure about how easy it is to spoof my version which does not use a hash or domain. Also someone above mentioned the need to send an error response back but I am not convinced he's right. | [reply] |