Blog Engine Spam Cleanup Utility

9. January 2010

While this is totally a reactive approach to dealing with spam bots posting comments, I thought I’d share the clean-up project. In the end I have to give the bots credit for leaving such kind, appreciative comments. It’s easy to get a big head reading them, even though they are clearly bots with scripted content.

Attached is the project along with the XML block list. The block list is not complete: it does not have a matching email, author, IP, and website entry for every comment posting I received. It was sufficient for my needs to clean the comments up. The list may vary depending on the different bots that have visited a given blog site, but it should allow anyone in the same situation to clean up the mess. The code is roughly written and meant for a one-off cleanup, but here it is along with the full project that includes the XML block list.

Blog Comment Cleaner Utility

Please note the project was created with Visual Studio 2010 Beta 2, in case you have difficulty opening the solution file.

Example Command Line:

BlogCommentCleaner.exe "c:\blog\App_Data\posts" "c:\blog\App_Data\block.xml"

Programming, Programs & Utilities, .Net

Blog Engine Spam Cleanup

7. January 2010

After reviewing my web logs I’ve noticed a big uptick in the number of external links to spam sites. I knew about the spam bots posting tens of hundreds of comments on this blog and I promptly shut down comment postings. The bad thing is I authorized all of them to appear, not realizing I had set comments to be moderated and had promptly forgotten to authorize any comments. That’s why nobody ever saw or heard from me in reply to legit comment postings. ~Sorry~

I took it upon myself to write a blog engine comment spam cleanup utility. It’s basically a console application that runs through the XML posting files (the blog is not database driven) and uses LINQ to XML to query comments by Author, Email, IP, and Website. These elements exist in the posting files within each comment element, and were core to the filtering process.

Essentially I just created a new XML filter file called ‘blocked.xml’ using the following structure:

<block>
    <emails>
        <email>...</email>
    </emails>
    <authors>
        <author>...</author>
    </authors>
    <ips>
        <ip>...</ip>
    </ips>
    <websites>
        <website>...</website>
    </websites>
</block>

From there I load each section into a generic list using LINQ, then loop through the posting files, creating a new XDocument for each one and looking for the comment elements under its comments element. It queries the comments in each post looking for any comment whose email, IP, author, or website appears in the generic lists created from the block list file. Upon finding an offending element it takes that element’s parent, which is the comment element, and removes it from the comments element. That removes the existing spam from all posting files for good, based on the block list.
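As a rough illustration of the steps above, here is a minimal sketch in Python rather than the C# and LINQ to XML the utility itself uses, so treat it as a stand-in and not the project code. The block list tags come from the structure shown earlier. The comment child element names (author, email, ip, website), the exact, case-insensitive matching, and all of the function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Block list entry tags from the structure above. The sketch assumes each
# comment element has child elements with these same names.
FIELDS = ("email", "author", "ip", "website")


def load_block_list(block_root):
    """Return {field: set of blocked values} read from a <block> element."""
    return {
        field: {
            (entry.text or "").strip().lower()
            for entry in block_root.iter(field)
            if (entry.text or "").strip()
        }
        for field in FIELDS
    }


def remove_blocked_comments(post_root, blocked):
    """Remove each <comment> with a blocked field value; return how many."""
    removed = 0
    # Copy both sequences to lists so removal does not disturb iteration.
    for comments in list(post_root.iter("comments")):
        for comment in list(comments.findall("comment")):
            if any(
                (comment.findtext(field) or "").strip().lower() in blocked[field]
                for field in FIELDS
            ):
                comments.remove(comment)
                removed += 1
    return removed


def clean_posts(posts_dir, block_file):
    """Rewrite every post file in posts_dir that contained blocked comments."""
    blocked = load_block_list(ET.parse(block_file).getroot())
    for path in Path(posts_dir).glob("*.xml"):
        tree = ET.parse(path)
        if remove_blocked_comments(tree.getroot(), blocked):
            tree.write(path, encoding="utf-8", xml_declaration=True)
```

Calling clean_posts with the posts folder and the block list file mirrors the two command line arguments of the utility. Since it rewrites the post files in place, back up the folder before trying anything like it.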

The problem is building out the block list. I don’t want to arbitrarily remove valid comments. Luckily most of the spam bots that have taken root on the blog share many of the same websites, even though email addresses are often random for the same bots. I’m still building out my block list and at this point only have a hundred or so blocked elements added. I will most likely need to script out the creation of the block list so I don’t have to build it manually. Still, after manually building the list for an hour, it’s dropped the existing spam count by a quarter of what it used to be. Not finished, but it’s getting there.

All of this work brought out an interesting fact: there is no option to remove all comments from view in the blog comment settings, which would have been nice to reduce external linking to websites with T & A related content. You’d think that since it can enable and disable allowing comments, there would equally be a show/hide all comments setting for those not looking to ever allow comments. Probably pretty niche, but it would still be nice for just such a spam infestation occasion.
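One way the scripting idea could look, as a hedged sketch in Python and not something the utility does: since the bots tend to reuse websites, count how often each website value shows up across all comments and review the most frequent ones as block list candidates. The website child element name, the function names, and the min_count threshold are all assumptions for the example, and the output is only a list of suggestions to check by hand so valid comments are not removed.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_comment_websites(post_roots):
    """Count how often each website value appears across all comments."""
    counts = Counter()
    for root in post_roots:
        for comment in root.iter("comment"):
            site = (comment.findtext("website") or "").strip().lower()
            if site:
                counts[site] += 1
    return counts


def suggest_blocked_websites(post_roots, min_count=5):
    """Return websites seen at least min_count times, most frequent first."""
    counts = count_comment_websites(post_roots)
    return [site for site, n in counts.most_common() if n >= min_count]
```

The threshold is arbitrary. A regular commenter with their own website would also show up often, which is why the results need a manual look before going into the block list.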

Programming, Programs & Utilities

AWStats

9. October 2009

I’ve been using AWStats for a few years now and figured I would give it a plug. What is AWStats, you may ask? Well, AWStats is a command line application that takes web logs generated by IIS or Apache, formats the log information, and generates HTML reports on the traffic data.

Not only will it work on web server logs, it will also do the same for FTP and mail server logs. The data is formatted into varying sections of interest. For web server logs it presents the specific browsers used, the number of unique visitors, pages accessed, and traffic to the site over a month’s period, with previous months aggregated in a summary table. The information is incredibly handy for figuring out traffic patterns, identifying search engines that may be ignoring robots.txt, spotting bot attacks, and seeing which pages return a 404 or 500 error that needs to be taken care of.

At some levels this program can be compared with Google Analytics (GA), but unlike GA, AWStats can work on a specific log file or series of log files instead of being limited to live site visits. The same information is stored in the web server log, so you’re actually getting more pertinent information from AWStats. AWStats exposes the same keyword referrers and bandwidth usage over time, and can show individual days with traffic breakdowns. Handy information for any web host or personal user looking to not piss off the ISP.

The catch is you’ll have to install Perl in order to install and use AWStats, but by visiting ActiveState there shouldn’t be a problem acquiring Perl for Windows for use with IIS web servers. The configuration is a breeze so long as you actually follow the documentation installed with the program. Essentially, create a config file, change the config domain to the domain name being hosted, and map a new virtual directory to the cgi-bin folder installed under AWStats. That’s really it in a nutshell. Below is a screen capture from the AWStats project site so you can see the initial HTML generated report. There is a lot more to it, so give the image a click to see the live site demo.

[Image: screen capture of the AWStats HTML report]

Programs & Utilities

Windows 7 Party Pack – Let down?

9. October 2009

I’m excited for Windows 7, and I was actually thinking of trying to set up a Windows 7 party. I was really curious to see exactly what would come in the Windows 7 party pack too, and it’s been an itch I couldn’t scratch. I know maybe one person, pro-Microsoft, that I could definitely get to come, and the rest would be inclined to come if the party also included LAN party games.

I was really bummed to hear from PC Pro what exactly was in the party pack. Apparently just playing cards, a puzzle, one poster, and one free copy of Windows 7 Ultimate. The free Windows 7 copy aside, that’s a pretty lame party pack. That’s a party for 50-somethings who sit around the card table and talk about how in their day they didn’t have computers. No sir, they had slide rules and those were the shizzle. Seriously, my parents play cards every now and again with people and call it a party.

Our generations, X and that Y generation, like games. We play games a lot. Remember that the average gamer is a 30-something male with an above average BMI, or a 30-something woman with emotional problems. Yah, I’m actually serious about that fact. So why would we want to have a party with just playing cards, one poster, and a puzzle? I want Windows 7, home networking, and Halo for the PC at the very least!

I’m not interested in a Windows 7 party, unless Microsoft wants to send me a copy of Windows 7, in which case I’d cave in a heartbeat because I want Windows 7. Don’t worry Microsoft, you’ll get my $200 x2 for your sweet merchandise. Still, I’m left wondering why, after the friggn’ millions spent on the new marketing blitz, you’d be so cheap on the thing the party is supposed to be centered around. As it is, it’s more of a sit around and watch Bob install the newest OS while the rest of us play solitaire with the pack of playing cards. Bummer.

Offbeat, Programs & Utilities

Windows Server 2008 Mail Server using hMailServer

16. March 2009

I’ve been toying with Windows Server 2008 for a little while now, nothing major, just hosting a few domains and tweaking to get some experience with the latest server offerings. As such I’ve been really looking into getting an email server set up on Windows Server 2008. The problem is my budget doesn’t extend to the niceties of Exchange Server, or the likes of MailEnable. So after some snooping around I came across hMailServer, which came with a somewhat decent support structure, decent documentation, and, best of all, is free mail server software.

hMailServer allows for multiple domains, seemingly no limit on domain creation, multiple domain user accounts, and even a somewhat decent scripting system for automated maintenance. The only limitation on storage size I can see quickly would probably come from the particular database management system being used. The choices are between MS SQL, MySQL, or PostgreSQL.

After signing into the local hMailServer administration tool you can begin configuring your SMTP settings, set up domains, set up user accounts, configure mail rules, schedule backups, and in general configure your small or enterprise level hMailServer setup. Below you can see a glimpse of the hMailServer Administrator dialog in action.

[Image: hMailServer Administrator dialog]

Programs & Utilities

Hulu – Watch TV and Movies online for FREE!

7. January 2009

I don’t pay for cable, and I don’t pay for dish. What I do pay for is internet, and it gives me more TV than either dish or cable ever could. What it gives me is Hulu.com and all the on demand access I want to popular sitcoms, and even shows that are no longer on the air!

Not only does Hulu provide access to standard broadcast shows, it also provides access to cable shows, and even movies. Yes, movies, and no, this is not Netflix. This site does not charge anything, and only broadcasts a standard ‘commercial’ during the playing of a show. And I can watch what I want, when I want, and wherever I want (bear in mind, wireless laptop).

As I write this I’m watching the Dilbert cartoon series that hasn’t been broadcast in some time. What’s great is I already watched the new series ‘The Legend of the Seeker’ episode 7 before it was even broadcast on standard TV. The shows range from ‘The Family Guy’ and ‘Chuck’ all the way to the National Geographic shows. Man, I love technology. Between Pandora online radio and Hulu online TV, who needs a real TV or radio?

Offbeat, Programs & Utilities

Pandora Radio (Free Online Radio)

4. January 2009

I'm a geek, no question, and I listen to a lot of music. Really I can't imagine a day going by where I'm not listening to some form of music while I'm working out, relaxing, or toiling at work. It started a long time ago while I was working in my first programming position at SDRC in 2000. I began bringing in my personal laptop with a limited selection of music in MP3 format from the days when Napster was not a corporate unit. I'm not admitting to anything, just saying I had a few songs to listen to.

Since that time I've transitioned from downloading songs to a laptop, to listening to FM radio stations, to using a Zen Micro Photo MP3 player, to an iPod Nano after the Zen died, and now I have Pandora Radio. I use Pandora almost exclusively at home, at work, or even on my smart phone. I'm even tempted to hook up a computer system in the car just to use Pandora through the car stereo on long trips. Yes, I have a pocket PC I could use, but come on, a computer in the car is awesome. It's become an integrated part of my life where I can listen to genres I like, artists, and I get the advantage of hearing new music from bands I'd never known existed. I went from Weird Al to The Trucks, Freezepop, Gary Jules, and many more stations waiting to be explored. If you haven't tried it, go try it. It's free, and it's incredibly more personalized than the typical top 40 stations you get with most online radio stations.

Programs & Utilities

CCleaner - A must have Windows utility

28. December 2008

While the name may be a little off the cuff, I have to say CCleaner (crap cleaner) has been a very nice utility. Recently my step-brother acquired yet another virus on his laptop that, like most malware, installed itself within one of his temp folders. I used the standard assortment of Avast, Spy-bot, Ad-aware, and now CCleaner to help get to those hard to reach temporary file locations. When running the typical assortment of anti-viral utilities, the hardest thing to do is to validate that the malware/virus has been removed from all local directories. On top of that, you have to try to locate the registry entries pointing to the now hopefully removed files. CCleaner is able to identify registry file path references that are no longer valid, among other things, and remove them within about 30 seconds. Not to mention it cleans out the contents of almost every temp file location on the system to help prevent reinstallation of the bad-ware.

This ability to remove both stale registry entries and temp files helps ensure that when Spy-bot, Ad-aware, and Avast find and remove the actual files, nothing remains. This is not a 100% solution for removing malware or viruses, but it sure helps. Running your typical virus scanner or anti-malware application can take anywhere from 10 minutes to 3 hours. With CCleaner you're able to run the application flat-out or use the analysis function to check the system for each file to be removed. The best part is it takes mere seconds to run a complete check. In a world of waiting and waiting for most applications to complete a scan, the near instant cleaning capability is very refreshing.

Programs & Utilities