Fuspam Akismet Function

Summary

Akismet is a service designed to stop comment spam on blogs. It is built into WordPress, but it also has a public API that you can use to add Akismet support to any application you create. It's highly effective at stopping comment spam, contact form spam, and spam on any other interface where users submit data to a website. Spammers have no shame; they will try to spam anything.

Fuspam is our own small PHP function that makes it simple to use Akismet in your PHP application.

The Code

<?php
// Fuspam 1.3
// F-U-Spam!
// This is the fully-commented version of the script
// The downloadable version is much leaner
// http://www.whatsmyip.org/lib/fuspam-akismet-php/

// Set these values of array $comment, then call fuspam().
// Set every array value unless you are only verifying your key; in that case, just set 'blog'

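include("akismet.fuspam.php");
// Include the downloadable version in your own script. Skip this line if you paste
// the function definition below into the same file: PHP cannot declare fuspam() twice.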

$comment['blog'] = "";
// The front page or home URL of the instance making the request (your blog/webapp etc).
// Note: Must be a full URI, including http://

$comment['user_ip'] = "";
// IP address of the comment submitter

$comment['user_agent'] = "";
// User Agent of commenter/spammer (not YOUR user agent!)

$comment['referrer'] = "";
// The content of the HTTP_REFERER header

$comment['permalink'] = "";
// The permanent location of the entry the comment was submitted to.

$comment['comment_type'] = "";
// May be blank, comment, trackback, pingback, or a made up value like "registration", "email", "review" etc.

$comment['comment_author'] = "";
// Submitted name with the comment

$comment['comment_author_email'] = "";
// Submitted email address

$comment['comment_author_url'] = "";
// Commenter's URL.

$comment['comment_content'] = "";
// The content that was submitted

// When submitting data back to Akismet, you have to include all of this data.
// In other words, you have to store the commenter/spammer's IP addresses, User Agents,
// and Referers in your comment database. Don't submit spam/ham with your own User Agent etc!

// Once you fill up the $comment array, you simply call the fuspam() function. Its input is as follows:
// $comment is the array with all of the comment data in it
// $type is a string with the following possible values:
// "check-spam" - used for seeing if a comment is spam
// "submit-spam" - used for submitting a comment to Akismet as spam (when misdiagnosed as not-spam)
// "submit-ham" - used for submitting a comment that is NOT spam back to Akismet (when misdiagnosed as spam)
// "verify-key" - used for verifying your akismet key
// $key is your akismet key. You have to get a key before you can use this service. Go to akismet.com

function fuspam( $comment , $type , $key )
	{
	$payload = http_build_query($comment);
	// Build the post request. This compiles your comment data so you can send it to akismet
	
	switch ($type)
		{
		case "verify-key":
			$call = "1.1/verify-key";
			$payload = http_build_query(array('key' => $key, 'blog' => $comment['blog']));
			break;
			// if you are verifying your key, use the verify key url
			
		case "check-spam":
			$call = "1.1/comment-check";
			break;
			// if you are checking if a comment is spam, use the spam checking url
			
		case "submit-spam":
			$call = "1.1/submit-spam";
			break;
			// if you are submitting spam, use the spam submission url
			
		case "submit-ham":
			$call = "1.1/submit-ham";
			break;
			// if you are submitting a non-spam, use the ham submission url
			
		default:
			return "Error: 'type' not recognized";
			// if the type you pass to fuspam() isn't recognized, return an error
		}
	
	$curl = curl_init("http://$key.rest.akismet.com/$call");
	curl_setopt($curl,CURLOPT_USERAGENT,"Fuspam/1.3 | Akismet/1.11");
	curl_setopt($curl,CURLOPT_TIMEOUT,5);
	curl_setopt($curl,CURLOPT_POSTFIELDS,$payload);
	curl_setopt($curl,CURLOPT_RETURNTRANSFER,true);
	// Set up the cURL session. This is how we send data to Akismet

	
	$i = 0;
	do
		{
		// this loop tries to contact the akismet server up to 5 times before giving up.
		// Very helpful in overcoming network instability
		
		$result = curl_exec($curl);
		// Submit data to Akismet
		
		if ($result === false)
			{ sleep(1); }
			// If request/submission fails, wait 1 second
		
		$i++;
		}
	while ( ($i < 5) and ($result === false) );
	// If request/submission failed, try again until 5 attempts have been made
	
	if ($result === false)
		{ $result = "Error: Repeat Failure"; }
		// Convert boolean failure result into a string
		
	return $result;
	// Return the result to the script that called this function

	}

// fuspam( ) always returns a string. Here are its possible return values:
// "true" - returned if it finds a comment to be spam
// "false" - returned if it finds a comment to be legit
// "valid" - returned if your akismet key is valid (when verifying)
// "invalid" - returned if your akismet key is not valid
// "Error: 'type' not recognized" - returned when the $type you pass to fuspam( ) is not valid
// "Error: Repeat Failure" - returned when curl could not contact the akismet server after 5 attempts
// When you successfully submit spam or ham to akismet, the function will return
// a thank you message that is the akismet server's response

?>
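
The return values listed above can be folded into a handful of labels before the rest of your application looks at them. The following is only a sketch: fuspam_classify() and its label names are hypothetical and not part of Fuspam, and the trim() call assumes the server's reply might carry stray whitespace.

```php
<?php
// Hypothetical helper (not part of Fuspam): normalizes the string
// returned by fuspam() into a short label.
function fuspam_classify($result)
{
    // Assumption: the reply may carry stray whitespace, so trim first.
    $result = trim((string) $result);

    switch ($result) {
        case "true":
            return "spam";
        case "false":
            return "ham";
        case "valid":
            return "key-ok";
        case "invalid":
            return "key-bad";
    }

    // Both error strings produced by fuspam() start with "Error:".
    if (strpos($result, "Error:") === 0) {
        return "error";
    }

    // Anything else is treated as the server's free-form reply,
    // e.g. the thank-you message after submit-spam or submit-ham.
    return "other";
}
```

A typical call would then look like fuspam_classify(fuspam($comment, "check-spam", $key)), with your own code deciding what each label means.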

Usage Tips

When submitting data to Akismet, make sure you set every relevant value in your array. This means you may have to add a few columns to your database to store data such as the poster's User Agent. Submit all the information you have: the more accurate Akismet is, the better for you and all its other users.
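
One way to make sure nothing is forgotten is to build the array in a single place at the moment the comment arrives, then store those same values alongside the comment. This is a minimal sketch, not the author's code: build_fuspam_comment() is a hypothetical helper, $server and $post stand in for $_SERVER and $_POST, and the form field names (name, email, url, message) are assumptions about your form.

```php
<?php
// Hypothetical helper: collects the values fuspam() expects.
// $server stands in for $_SERVER and $post for $_POST.
function build_fuspam_comment(array $server, array $post, $blog, $permalink)
{
    // Return an empty string for any key the request did not supply.
    $get = function (array $source, $key) {
        return isset($source[$key]) ? (string) $source[$key] : "";
    };

    return array(
        'blog'                 => $blog,
        'user_ip'              => $get($server, 'REMOTE_ADDR'),
        'user_agent'           => $get($server, 'HTTP_USER_AGENT'),
        'referrer'             => $get($server, 'HTTP_REFERER'),
        'permalink'            => $permalink,
        'comment_type'         => 'comment',
        // The field names below are assumptions about your form.
        'comment_author'       => $get($post, 'name'),
        'comment_author_email' => $get($post, 'email'),
        'comment_author_url'   => $get($post, 'url'),
        'comment_content'      => $get($post, 'message'),
    );
}
```

Saving user_ip, user_agent, and referrer with the comment row means a later submit-spam or submit-ham call can resend the commenter's original data rather than your own.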

The User Agent string is too short to compress effectively, so I wouldn't bother. However, you can use the MySQL functions INET_ATON( ) and INET_NTOA( ) to store IPv4 addresses as integers instead of strings. An unsigned integer takes 4 bytes instead of up to 15 characters, and it is faster to compare and index. Note that these two functions only handle IPv4 addresses.
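
If you would rather do the conversion in PHP before the value reaches the database, the built-in ip2long() and long2ip() functions do the same job as INET_ATON( ) and INET_NTOA( ). The wrapper names below are hypothetical, and the sketch assumes a 64-bit PHP build, where ip2long() never returns a negative number.

```php
<?php
// Hypothetical wrappers around PHP's ip2long() and long2ip().
// Assumption: 64-bit PHP, so the result is always a non-negative integer.
function ip_to_int($ip)
{
    $long = ip2long($ip);
    if ($long === false) {
        // Not a valid IPv4 address (IPv6, empty string, garbage).
        return null;
    }
    return $long;
}

function int_to_ip($int)
{
    // Convert the stored integer back to dotted-quad notation.
    return long2ip((int) $int);
}
```

Store the result in an unsigned integer column, and convert it back with int_to_ip() before putting it into $comment['user_ip'].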

There are many ways to handle the results of fuspam( ), so decide on a procedure up front. Messages that return "true" are probably spam, and messages that return "false" probably are not. If there is a server outage or some other kind of error, you need to decide what to do with the data. You can assume on error that the comment is not spam, or you can do the opposite and assume that it is. If you have a flag-based spam management system, you could apply a flag value half as strong as a "true" when the function returns an error.
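
The half-strength idea can be sketched as a small scoring function. Everything here is illustrative: fuspam_flag_score() is a hypothetical name, and the default weight of 10 is an arbitrary assumption to be replaced with whatever scale your flag system uses.

```php
<?php
// Hypothetical scoring helper for a flag-based moderation system.
// $fullWeight is the flag value a confirmed "true" adds (an assumed scale).
function fuspam_flag_score($result, $fullWeight = 10)
{
    $result = trim((string) $result);

    if ($result === "true") {
        return $fullWeight;     // Akismet says spam: full-strength flag
    }
    if ($result === "false") {
        return 0;               // Akismet says legit: no flag
    }

    // Errors and any unexpected reply: half-strength flag.
    return $fullWeight / 2;
}
```

With this policy a comment checked during an outage is neither published blindly nor discarded. It simply starts halfway to your spam threshold, and other signals or a moderator can settle it.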