downloadMail - Downloading email error

The downloadMail script is having trouble downloading a specific message (see attached).

The error is:

MySQL Error: Incorrect string value: '\xC2 \xE2\x80\x94 ...' for column 'headers' at row 1

Any thoughts?

Thank you

Attachments:

email-source.txt 5K

By Dave - November 14, 2013

Hi BrownLeather, 

I've patched Download Mail on your server so that it is working now. I'll have an update for Download Mail out shortly that includes this patch and some other code improvements.

Here's the patch I applied. In downloadMail.php, in the _getMessagePartData function on line 299, I replaced this:

$headers = imap_fetchbody($mbox, $msgno, 0);

With this code:

// get headers
$headers = imap_fetchbody($mbox, $msgno, 0);
$headers = _mb_convert_to_utf8($headers, @$params['charset'], $GLOBALS['DOWNLOAD_MAIL_DEFAULT_CHARSET']); // see below
// NOTE: Above line converts headers to UTF-8 if not already UTF-8 *AND* fixes broken UTF-8.  We've seen spam filters
// ... inject broken headers where unicode nbsp \xC2\xA0 was converted to invalid unicode \xC2\x20 by charset unaware
// ... code that was trying to convert ascii nbsp \xA0 to spaces \x20. Invalid UTF-8 causes MySQL to return errors and
// ... be unable to save records containing it.

As the code comment suggests, the issue was caused by a spam filter somewhere between the message being sent and received. The spam filter was adding content to the message headers and encoding it incorrectly. This fix ensures that even incorrectly encoded headers will be processed in future.
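For anyone following along, here is a minimal sketch of the idea, not the plugin's actual code. The body of _mb_convert_to_utf8 isn't shown in this thread, so sketch_scrub_utf8 below is a hypothetical stand-in that only covers the "fix broken UTF-8" half, using PHP's mbstring functions. It also reproduces how a byte-level nbsp replacement can turn valid UTF-8 into the invalid \xC2\x20 sequence.

```php
<?php
// Hypothetical helper (an assumption, NOT the plugin's _mb_convert_to_utf8):
// returns the input unchanged when it is already valid UTF-8, otherwise
// replaces invalid byte sequences with mbstring's substitute character.
function sketch_scrub_utf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid, nothing to do
    }
    // Converting UTF-8 to UTF-8 makes mbstring drop/replace invalid sequences.
    // The exact replacement output can differ between PHP versions.
    return mb_convert_encoding($text, 'UTF-8', 'UTF-8');
}

// How the breakage can happen: a charset-unaware filter replaces the single
// byte \xA0 with a space, which cuts the 2-byte UTF-8 nbsp (\xC2\xA0) in half.
$valid  = "Subject:\xC2\xA0Hello";
$broken = str_replace("\xA0", "\x20", $valid); // now contains \xC2\x20

var_dump(mb_check_encoding($broken, 'UTF-8'));                    // bool(false)
var_dump(mb_check_encoding(sketch_scrub_utf8($broken), 'UTF-8')); // bool(true)
```

On PHP 7.2 or newer, mb_scrub() does a similar cleanup in one call.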

Can you give that a try and let me know if it works for you?  Thanks!

Dave Edis - Senior Developer
interactivetools.com

Thank you!

Thank you!!

Thank you!!!

By Brownleather - December 11, 2013 - edited: December 11, 2013

Hi Dave,

The Download Mail plugin has been running smoothly for a while. Thank you for the previous fix.

We received an email today that is choking up the parser (see attached).

The error is: Incorrect string value: '\xF0\x9F\x8D\x80Ke...' for column 'text' at row 1

I truly appreciate your help

Attachments:

uid-2431.txt 14K

By Dave - December 12, 2013

Hi Brownleather, 

You are doing an amazing job of finding edge cases where the plugin fails.  Thanks for that, and sorry it's not working.

This sequence "\xF0\x9F\x8D\x80" is the 4-byte UTF-8 encoding of the four leaf clover symbol (U+1F340):
http://www.charbase.com/1f340-unicode-four-leaf-clover

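And the problem is, MySQL's utf8 character set only stores up to 3 bytes per character. 4-byte characters aren't supported until version 5.5 (5.5.3 and later), and only through the separate utf8mb4 character set: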
http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html

So the simplest option would probably be to just remove 4-byte sequences. The other option would be to require MySQL 5.5+ and switch to utf8mb4 encoding.
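To illustrate the "just remove 4-byte sequences" option, here is a minimal sketch. It is not code from the plugin: sketch_strip_4byte_utf8 is a hypothetical helper name, and it assumes the text is already valid UTF-8. In UTF-8, every code point above U+FFFF takes 4 bytes, so removing that range removes exactly the characters MySQL's utf8 can't store.

```php
<?php
// Hypothetical helper (an assumption, not part of Download Mail): removes
// every code point above U+FFFF, i.e. every character that needs 4 bytes
// in UTF-8, and leaves 1- to 3-byte characters alone.
function sketch_strip_4byte_utf8(string $text): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $stripped = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $text);

    // With /u, preg_replace() returns null if the subject is not valid UTF-8.
    if ($stripped === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $stripped;
}

// The clover from the error message is dropped, the rest is kept.
echo sketch_strip_4byte_utf8("\xF0\x9F\x8D\x80Keep"), "\n"; // prints "Keep"
```

The trade-off is that the stored text no longer matches the original message exactly, which is why switching the table to utf8mb4 is the alternative.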

Questions:

- What version of MySQL do you have? (It's listed under Admin > General, at the bottom.)
- Would it work to just strip out those characters? (They are things like emoji, smileys, symbols, etc.)

Let me know, thanks!

Dave Edis - Senior Developer
interactivetools.com

Thanks for your reply.

We are running MySQL 5.5.32.

So I'd like to switch to utf8mb4 encoding.

What's the next step?

Thank you

By Dave - December 17, 2013

Hi brownleather,

There's a post here on how to switch over: 
http://dba.stackexchange.com/questions/8239/how-to-easily-convert-utf8-tables-to-utf8mb4-in-mysql-5-5

And here on mysql.com: 
http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html

If you download this free plugin it has an easy interface for entering MySQL commands:
http://www.interactivetools.com/add-ons/detail.php?MySQL-Console-1011

So I'd try the following steps: 

  1. Under Admin > General, back up the table "_incoming_mail".
  2. Check your "Table Prefix" at the bottom of the page (usually cms_).
  3. In the MySQL Console plugin, enter this (using your table prefix):  ALTER TABLE cms__incoming_mail CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
  4. Note that your table prefix ends with a _ and the _incoming_mail table name starts with one, so you need two underscores.
  5. Test downloading mail and check for any problems (if you have problems, restore the backup of the individual _incoming_mail table).
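The double-underscore detail in the steps above is easy to get wrong, so here is a small sketch that builds the statement from the prefix and table name. sketch_build_convert_sql is a hypothetical helper for illustration only. It just produces the SQL text to paste into the console and does not touch the database. One related point that is general MySQL behaviour rather than something covered in this thread: the connection character set also has to be utf8mb4 for 4-byte characters to reach the table intact, and how the CMS sets that is not shown here.

```php
<?php
// Hypothetical helper (illustration only): builds the ALTER TABLE statement
// from a table prefix and a base table name.
function sketch_build_convert_sql(string $tablePrefix, string $baseTable): string
{
    // Prefix "cms_" + table "_incoming_mail" gives "cms__incoming_mail"
    // (two underscores in a row).
    $table = $tablePrefix . $baseTable;

    // Only allow plain identifier characters so the name is safe to embed.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $table)) {
        throw new InvalidArgumentException('Unexpected characters in table name');
    }

    return "ALTER TABLE `$table` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;";
}

echo sketch_build_convert_sql('cms_', '_incoming_mail'), "\n";
// prints: ALTER TABLE `cms__incoming_mail` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```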

Let me know if that works for you or if you run into any issues.

Thanks!

Dave Edis - Senior Developer
interactivetools.com

Thanks.

I'll let you know how it goes.