Blog Engine Spam Cleanup

January 8, 2010 at 1:09 AM | RampidByter

After reviewing my web logs I’ve noticed a big uptick in the number of external links to spam sites. I knew about the spam bots posting tens of hundreds of comments on this blog, and I promptly shut down comment posting. The bad thing is that I authorized all of them to appear. I had not realized that I had set comments to be moderated, and I promptly forgot to authorize any comments. That’s why nobody ever saw or heard from me in reply to legit comment postings. ~Sorry~

I took it upon myself to write a blog engine comment spam cleanup utility. It’s basically a console application that runs through the XML posting files (the blog is not database driven) and uses LINQ to XML to query comments by criteria on Author, Email, IP, and Website. These elements exist within the comment element of each posting file, and they were core to the filtering process.

Essentially I just created a new XML filter file called ‘blocked.xml’ using the following structure:

    <block>
      <emails>
        <email>...</email>
      </emails>
      <authors>
        <author>...</author>
      </authors>
      <ips>
        <ip>...</ip>
      </ips>
      <websites>
        <website>...</website>
      </websites>
    </block>

From there I load each section into a generic list using LINQ, then loop through the posting files, creating a new XDocument for each one and looking for the comments element’s child comment elements. It queries the comments in each post for any comment whose email, IP, author, or website is contained in the generic lists created from the block list file. Upon finding an offending element, it takes that element’s parent, the comment element, and removes it from the comments element, thus removing forever the spam that currently exists in all posting files based on the block list.
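For the curious, here is a minimal sketch of that cleanup loop. It is written in Python with the standard library rather than the C# and LINQ to XML the utility actually uses, so treat it as an illustration of the idea and not the real code. It assumes the comment child elements are named author, email, ip, and website (lowercase), that a match means an exact, case-insensitive match, and the function names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Assumed mapping: block-list section name -> tag used both for the
# block-list entry and for the matching child of a <comment> element.
SECTIONS = {"emails": "email", "authors": "author", "ips": "ip", "websites": "website"}


def load_block_list(block_xml):
    """Read each section of blocked.xml into a set keyed by field name."""
    root = ET.fromstring(block_xml)
    blocked = {}
    for section, tag in SECTIONS.items():
        values = set()
        for entry in root.findall(f"{section}/{tag}"):
            text = (entry.text or "").strip().lower()
            if text:
                values.add(text)
        blocked[tag] = values
    return blocked


def is_spam(comment, blocked):
    """True if any of the comment's fields appears in the block list."""
    for tag, values in blocked.items():
        text = (comment.findtext(tag) or "").strip().lower()
        if text and text in values:
            return True
    return False


def remove_spam(post_xml, blocked):
    """Return the cleaned post XML and the number of comments removed."""
    root = ET.fromstring(post_xml)
    removed = 0
    # Copy both lists first so removing elements does not disturb iteration.
    for comments in list(root.iter("comments")):
        for comment in list(comments.findall("comment")):
            if is_spam(comment, blocked):
                comments.remove(comment)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

To run it over the real files you would read blocked.xml once, call remove_spam on the text of each posting file, and write the result back only when the removed count is greater than zero.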

The problem is building out the block list. I don’t want to arbitrarily remove valid comments. Luckily most of the spam bots that have taken root on the blog share many of the same websites, even though the email addresses are often random for the same bots. I’m still building out my block list and at this point only have a hundred or so blocked elements added. I will most likely need to script the creation of the block list so I don’t have to build it manually. Still, after an hour of manually building the list, it has dropped the existing spam count by a quarter of what it used to be. Not finished, but it’s getting there.
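One way that scripting could go, since the bots tend to reuse the same websites, is to count how often each website shows up across all comments and review the most frequent ones as block list candidates. The sketch below is Python, not the C# of the actual utility. The website element name, the min_hits threshold, and the function names are all assumptions for illustration. The output is only a list of suspects, so it still needs a human look before anything goes into blocked.xml.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def website_counts(post_xml_docs):
    """Count how many comments reference each website across all posts."""
    counts = Counter()
    for xml_text in post_xml_docs:
        root = ET.fromstring(xml_text)
        for comment in root.iter("comment"):
            # Assumed element name; normalize so trivial variations collapse.
            site = (comment.findtext("website") or "").strip().lower()
            if site:
                counts[site] += 1
    return counts


def candidates(counts, min_hits=3):
    """Websites seen at least min_hits times, most frequent first."""
    return [site for site, hits in counts.most_common() if hits >= min_hits]
```

A legit regular commenter will also show up with a high count, which is exactly why the result is a review list and not an automatic block list.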

All of this work brought out an interesting fact: there is no option in the blog comment settings to remove all comments from view, which would have been nice for reducing external linking to websites with T & A related content. You’d think that since comments can be enabled and disabled, there would equally be a show/hide all comments setting for those not looking to ever allow comments. Probably pretty niche, but it would still be nice for just such a spam infestation occasion.

Posted in: Programming | Utilities


AWStats

October 9, 2009 at 11:07 PM | RampidByter

I’ve been using AWStats for a few years now and figured I would give it a plug. What is AWStats you may ask? Well, AWStats is a command line application that takes web logs generated by IIS or Apache, formats the log information, and is able to generate HTML reports on the traffic data.

Not only will it function on web server logs, but it will also do the same for FTP and mail server logs. The data is formatted into varying sections of interest. For web server logs it presents the specific browsers used, the number of unique visitors, pages accessed, and traffic to the site over a month’s period, with previous months aggregated into a summary table. The information is incredibly handy for figuring out traffic patterns, identifying potential search engines that ignore robots.txt, spotting bot attacks, and seeing which pages return a 404 or 500 error so they can be taken care of.
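To give a feel for the kind of aggregation behind a report section like the error pages, here is a rough Python sketch that tallies 404 and 500 responses per page. This is not how AWStats is implemented (AWStats is a Perl program with its own log format handling). The sketch assumes Apache-style common log format lines, and the error_pages function and its regular expression are made up for illustration. IIS logs in W3C extended format would need a different parser.

```python
import re
from collections import Counter

# Assumes Apache common/combined log format, where the request and status
# look like: "GET /page.html HTTP/1.1" 404
REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})\b')


def error_pages(lines, statuses=("404", "500")):
    """Count hits per (status, path) for the given HTTP status codes."""
    hits = Counter()
    for line in lines:
        match = REQUEST.search(line)
        # Lines that do not look like a request entry are skipped.
        if match and match.group("status") in statuses:
            hits[(match.group("status"), match.group("path"))] += 1
    return hits
```

Feeding it the lines of a log file and printing hits.most_common() gives a quick list of the pages most in need of fixing.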

At some levels this program can be compared with Google Analytics (GA), but unlike GA, AWStats can work on a specific log file or series of log files instead of being limited to live site visits. The same information is stored in the web server log, so you’re actually getting more pertinent information from AWStats. AWStats exposes the same keyword referrers and bandwidth usage over time, and can show individual days with traffic breakdowns. Handy information for any web host or personal user looking to not piss off the ISP.

The catch is you’ll have to install Perl in order to install and use AWStats, but ActiveState provides Perl for Windows, so there shouldn’t be a problem using it with IIS web servers. The configuration is a breeze so long as you actually follow the documentation installed with the program. Essentially: create a config file, change the config domain to the domain name being hosted, and map a new virtual directory to the cgi-bin installed under AWStats. That’s really it in a nutshell. Below is a screen capture from the AWStats project site so you can see the initial generated HTML report. There is a lot more to it, so check out the live demo on the project site.

[Screenshot: AWStats HTML report]

Posted in: Utilities


Telerik – RadControls or more appropriately BadControls…

March 24, 2009 at 8:30 PM | RampidByter


I’ve been a user of the Telerik RadControls for ASP.Net, RadControls for Winforms, and Reporting tools for the last few months, from the Q2 2008 release through the latest release, Q3 2008.

After these last few months I can say in no simpler terms that if you use any of the RadControls, expect to be extremely irritated by the flat uselessness of these controls in any practical application scenario. Add a few magnitudes of annoyance when you begin building composite control applications using RadControls, as they do not interact well in abundance. The absolute worst case I found was Telerik Reporting, which is just not suitable for any reporting scenario inside a corporate environment, let alone any business environment. Perhaps the reporting tools would make good play toys for some stay-at-home moms making flyers for the umpteen fundraisers that may happen in a child’s life.

I don’t say this lightly. I’ve garnered my fair share of ‘Telerik’ points, the points awarded for discovering (don’t let me make that sound like a challenge) problems or quirks that do not function as expected when using Telerik controls. Worst of all is the ease with which I’ve found these problems, set against the sheer number of man hours spent determining whether it was my own doing (because of course I wouldn’t expect such a highly regarded toolset to be so shoddy). But of course speaking with Telerik’s support confirmed my suspicion that the tools are just that bad. Don’t get me wrong, Telerik’s support is fantastic at providing points quickly and effectively, and at providing the occasional gotty work-around to a problem that shouldn’t be there.

Telerik Reporting – Why you shouldn’t use it.

I’ll start with a simple example. If a developer uses Telerik Reporting, don’t expect the reports to support out-of-process session state. This may not be a problem for some company or personal sites that have no need to scale, but it becomes a deal breaker for any self-respecting web architecture.

Why do they not support out-of-process?

Because of some serialization/deserialization problems that cause the Telerik reports to throw a nice reflection error on page rendering. The error isn’t a page-level error but one caught only by the Global.asax Application_Error event. The great thing about the exception is that when a custom error page is used on the site, that same custom error page will display within the report viewer, where one would expect the report to be generated. The error itself is very generic: if you handle Server.GetLastError() in the Global.asax file, the description is something like “A list corresponding to the selected DataMember was not found.” Thanks, Captain Obvious. I can really pin that down quickly.

I will give them this: they made reports class objects instead of something like the typed XML format of RDLC, which opens the way to more dynamic instantiation of custom report classes. But at the same time, they’re not much in their current form.

Another funny thing about the RadReports is that each report can be bound to a DataSet, but the report itself will only handle ONE DataTable from within the DataSet. At least with RDLC you can just add a new panel/table and bind to X number of DataTables from within the bound DataSet datasource. Telerik, on the other hand, requires that you create a sub-report that you then bind to another DataSet, and again only one DataTable from that DataSet can be used to bind controls to. This becomes incredibly annoying when trying to create a simple order report with a ‘header data’ table and an itemized ‘detail data’ table inside one report. Instead you have to create a master report for the header table and a sub-report for the itemized data. Mix in the lack of support for out-of-process session state and it’s a massive headache.

There is a work-around, not a great one, but I’ll touch on it here. After having bound the DataSet to the report and bound the textboxes/report controls, remove the DataMember property from the report settings, and then dynamically handle the on-demand datasource event of the report. The problem with that approach is that you then have to go through the same process for all the sub-reports, and then handle the data binding for each sub-report within the on-demand datasource event of the master report. The end result is twice the data retrieval calls for a report that could have been handled by supporting multiple DataTable bindings from a single DataSet bound to the report.

For any scenario involving business reports, reports with drill-through, or multi-table applications, just do yourself a favor and skip right over Telerik Reporting. Don’t believe me? Well, check out the Telerik Reporting forums and documentation online and find one sample or one thread talking about multi-datasource reports. Find one that implements something other than a page-level SqlDataSource datasource. Go ahead… I’ll wait. Back already?

Anyway, Telerik says the out-of-process problem will supposedly be resolved in the Q1 2009 release, but we’ll see. That didn’t do us a lot of good at the time, since it took about six days of bouncing back and forth with support (with a gap over the weekend) and creating test projects for them, only to finally discover they just don’t support out-of-process session data. When all was said and done we ditched Telerik Reporting for RDLC. RDLC may not be pretty, but at least it’s dependable.

Next Post – ASP.Net BadControls! Stay tuned…

I think I’ll conclude part 1 of my Telerik experience post here. I’ll follow up with my infamous experience using RadControls for ASP.Net! I can’t wait to share more about the RadMenu, which touts templated items and multi-column menus, none of which is supported with load-on-demand from web services. I’ll talk about the attributes that don’t carry over on RadMenu, the RadTreeView viewstate client-disconnected bombs, RadSplitters looking like what President Obama would call ‘Special Olympics’ in non-IE browsers, and much, much more.

PS. Needless to say, if you want to save yourself time, try Infragistics. It’s been a few years since I last used them, but at least they weren’t as bad as Telerik has been. It could be just me, but I doubt it. Hopefully Q1 2009 brings much needed improvement.

Posted in: ASP.Net | Programming | Utilities


Windows Server 2008 Mail Server using hMailServer

March 16, 2009 at 9:10 PM | RampidByter

I’ve been toying with Windows Server 2008 for a little while now, nothing major, just hosting a few domains and tweaking to get some experience with the latest server offerings. As such I’ve been really looking into getting an email server set up on Windows Server 2008. The problem is my budget doesn’t extend to the niceties of Exchange Server, or the likes of MailEnable. So after some snooping around I came across hMailServer, which comes with a somewhat decent support structure, decent documentation, and, best of all, is free mail server software.

hMailServer allows for multiple domains, seemingly with no limit on domain creation, multiple user accounts per domain, and even a somewhat decent scripting system for automated maintenance. The only limitation on storage size I can see at a quick glance would probably come from the particular database management system being used. The choices are MS SQL, MySQL, or PostgreSQL.

After signing into the local hMailServer administration tool you can begin configuring your SMTP settings, set up domains, set up user accounts, configure mail rules, schedule backups, and in general configure your small or enterprise-level hMailServer setup. Below you can see a glimpse of the hMailServer Administrator dialog in action.

[Screenshot: hMailServer Administrator dialog]

Posted in: Microsoft | Utilities


Cincinnati SQL User Group

January 11, 2009 at 10:03 PM | RampidByter


Last Thursday I went to my first Cincinnati SQL user group meeting. The meeting was suggested to me by my co-worker, and the topic, PowerShell, was something I’ve been interested in. When I installed Windows Server 2008 I was excited to see that PowerShell was included. It was also on my Vista PCs, but I never really took the time to check it out.

I had been warned that at this particular meeting a past employee had been given a hard time for being a .Net developer. I thought they were pulling my chain, so of course I decided to wear the .Net shirt that is completely black with .Net in large, bold white font on the front.

We got to the meeting and I started to see people heading one by one into MAX training for the meeting. We both got out of the car and headed in. The first thing I noticed was that everyone was dressed nicely and I was the one in my Carhartt jacket, .Net shirt, and cargo jeans. From there the speaker, Arie Jones, started to give background on himself, establishing his credentials with good detail on growing up as the computer geek who went to camp and was one of the young developers of his time during high school. I couldn’t help but note a surprising similarity between his background and mine. I went to a college course on DOS when I was about 9 years old, and did co-op in high school when I was 16, becoming the fourth youngest worker at SDRC. Arie then mentioned he joined the military to do something different, something I also did in my life, but that is where the similarities end.

During the presentation we went through the basics of PowerShell, the uses it provides, and some overall benefits, and really just touched the surface of the technology. What I gained was an understanding of how to launch PowerShell by entering powershell at the command prompt. From there you can load (dot-source) a script into the environment and then call the functions defined within that script. It’s pretty easy to load a script file: just include the path with “. ./Functions.ps1” and all functions from that script file will be loaded into the environment.


The syntax was what really caught me off guard, since PowerShell has been touted as bringing .Net to the command line shell scripting level. The syntax is anything but .Net styled, and in fact reminds me heavily of a PHP and C++ hybrid scripting language with Unix-styled functionality. Mainly it’s the way all variables are prefixed with $, how comparisons are done with operators such as ‘-eq’ for equals and ‘-le’ for less than or equal to, and the general curly bracket and semi-colon syntax. You can pipe things in PowerShell as you could in Unix/Linux systems. One friend of mine actually commented in jest that Microsoft has finally delivered something Unix gave us some thirty years ago. How novel of them.

From what I can tell initially (keep in mind I’m a complete novice to PowerShell), variables are completely lazy in declaration: just use them as you go, and they can hold either an object or a simple data type. From there you can make calls to functions only available in .Net assemblies, such as those in the Reflection namespace, and actually use the functions exposed by that namespace. While there isn’t much I can do with PowerShell that I couldn’t do in .Net, it is nice to know that, should I wish, I could create some simple scripts to carry out tasks without the need to compile and recompile assemblies to do the same. I was thinking of converting my Remote Desktop port changing application to a simple PowerShell script, and writing another to send an SMS notification to my phone in the event my IP address changes, so I could log in remotely to the new address. Again, nothing I couldn’t do with .Net, but merely something to try the new technology out and get at least acquainted.

Well, I do have to share one thing. I of course was called out during the SQL user group presentation, asked about dynamic data pages, and polled about our usage of LINQ. This was shortly after Arie asked whether there were any developers in the room, and of course my buddy and I were the only two. It didn’t help that my .Net shirt kind of gave me away on that one. During the whole presentation, besides calling server admins ‘Monkeys’, the presenter also seemed to be directly addressing the two of us developers.

After the presentation I will admit that the lunch was kind of awkward, at least until one guy from Western Southern came to sit with us, and from then on we were engaged in conversation with the three user group founders. I mostly sat in on the conversation regarding SSIS packages, but it was hard to focus after totally getting swag. I got second dibs on the table full of Microsoft products, books, shirts, and assorted drinking containers. It was awesome: I got a sweet Microsoft thermos AND was given an awesome System Center Operations Manager book. I heard the next group meeting is going to be on Analysis Services, and I have to admit I hope to make this a regular group to attend, so long as the topics are interesting.

Posted in: Microsoft | SQL | User Group | Utilities


CCleaner - A must have Windows utility

December 28, 2008 at 10:34 PM | RampidByter

While the name may be a little off the cuff, I have to say CCleaner (crap cleaner) has been a very nice utility. Recently my step-brother acquired yet another virus on his laptop that, like most malware, installed itself within one of his temp folders. I used the standard assortment of Avast, Spy-bot, and Ad-aware, and now CCleaner to help get to those hard-to-reach temporary file locations. When running the typical assortment of anti-viral utilities, the hardest thing to do is to validate that the malware/virus has been removed from all local directories. On top of that comes trying to locate the registry entries pointing to the now hopefully removed files. CCleaner is able to identify registry file path references that are no longer valid, among other things, and remove them within about 30 seconds. Not to mention it cleans out the contents of almost every temp file location on the system to help prevent reinstallation of the bad-ware.

This ability to remove both stale registry entries and temp files helps ensure that when Spy-bot, Ad-aware, and Avast find and remove the actual files, nothing remains. This is not a 100% solution for removing malware or viruses, but it sure helps. Running your typical virus scanner or anti-malware application can take anywhere from 10 minutes to 3 hours. With CCleaner you’re able to run the application flat-out or use the analysis function to check the system for each file to be removed. The best part is it takes mere seconds to run a complete check. In a world of waiting and waiting for most applications to complete a scan, the near-instant cleaning capability is very refreshing.

Posted in: Microsoft | Utilities


Windows Live Writer

December 28, 2008 at 10:00 PM | RampidByter

This is my first test of the Windows Live Writer application. I downloaded it, then spent some time skimming through add-ins for the application. There were some pretty decent add-ins available, and I picked one that formats Visual Studio code and another for Polaroid-style images. All in all I can’t complain about the functionality. I have a very nice editor, it provides spell checking (thank you!), and I am able to preview the post before submitting. I’m going to submit this and we’ll see what happens as I play with more settings, but I figured this would be a good test of blogging without going through the .NetBlogEngine interface.

Quick follow-up. I hit the ‘publish’ button and it did post directly to my blog without much hassle (well, no hassle), but I would have liked the editor to clear itself after posting instead of having to go to ‘File > New Post’ to clear the post.

Posted in: Microsoft | Utilities
