Parsing CSV files with Grails

One of the arguments that I often make for my use of CFML is how you can do so much with so little code. Seemingly every time I attempt to do something that I haven’t done previously with Grails, I find that argument holds less water than I thought, as I can often do it even easier in Grails.

For a current project, we have an occasionally updated CSV document that contains codes related to the customer’s industry. Given that this file will be changing with additional codes being added, while this app is in early development, we decided that we would just keep it in our application config directory and ensure that any new codes are added during the application bootstrap routine.  Here is what I came up with:


// Insert new codes
def csv = new File("grails-app/conf/code.list.csv")
def code
csv.splitEachLine(',') { row ->
   code = Code.findByLabel(row[1]) ?: new Code(
      code: row[0],
      label: row[1]
   ).save(failOnError: true, flush: true)
}

Essentially, the above is saying:

  • Read the CSV file
  • Loop each line, where each line is referred to as “row” in the closure.
  • Search in the database for codes with the same label
  • If the code does not yet exist in the system, create a new instance of Code passing in property values from the row in the CSV file.
  • Save the new code to the database.

As you see, I have added line breaks for readability, but I was able to get the result I was looking for in THREE lines of code! I figured I would share in case anyone is looking for a similar solution.

JRun wsconfig error- Security alert: attempt to connect to JRun server from host

I was experimenting with the Railo 3.3 installer, which includes an IIS connector to Tomcat, which works really well.   Too well in fact!  When I ran it, it actually unmapped all my existing IIS ISAPI mappings to JRun and was sending all requests to Tomcat.

I decided the quickest fix to this would be to simply open up /JRun4/bin/wsconfig.exe and remap the sites that were no longer connected.  However, when I did this, I received the following error:

Could not connect to JRun/ColdFusion servers on host localhost.

Knowing perfectly well that I had an instance of JRun running, I went to the terminal to look at the standard out and saw this:

Security alert: attempt to connect to JRun server from a 10.252.11.207 host

In case that is too hard to read, it says: “Security alert: attempt to connect to JRun server from a 10.252.11.207 host”.   I suspect that because I am attached to a WIFI connection with an IP Address on 192.168.*, and then VPN’d into my company with a second address of 10.252.*, JRun assumes that the connection attempt is coming from outside the subnet.

I went digging through files in JRun4/lib and came across security.properties.  In this file, there is a default setting:

jrun.subnet.restriction=255.255.255.0
jrun.trusted.hosts=
jrun.subnet.restriction.ipv6=[ffff:ffff:ffff:ffff:0:0:0:0]

I altered that restriction  setting from “255.255.255.0″ to “*” like this:

jrun.subnet.restriction=*
jrun.trusted.hosts=
jrun.subnet.restriction.ipv6=[ffff:ffff:ffff:ffff:0:0:0:0]

Once I did this and restarted the server, I was able to use wsconfig without issue.  And my ACF sites are pointed to JRun, my Railo sites are pointed to Tomcat, and all is right in the world again.

NOTE: DO NOT DO THIS ON A PRODUCTION MACHINE!   If you do, I strongly recommend that it is a very temporary change.

CFML wishlist: All collections should extend Iterator

Have you ever really given a second thought to the fact that in ColdFusion/CFML you have to loop queries, arrays, and structures in completely different ways?   For example, in each of these things, we are essentially doing the same thing:

<!--- looping our query --->
<cfloop query="myQuery">
	<cfset doStuff() />
</cfloop>
<!--- looping our array --->
<cfloop array="#myArray#" index="i">
	<cfset doStuff() />
</cfloop>
<!--- or --->
<cfloop from="1" to="#ArrayLen(myArray)#" index="i">
	<cfset doStuff() />
</cfloop>

<!--- looping our structure --->
<cfloop collection="#myStruct#" item="i">
	<cfset doStuff() />
</cfloop>

In each of these cases, we are essentially doing the same thing, that being looping a collection that contains multiple items and acting on each iteration.  I have always liked the fact that the ColdFusion array can be converted to a Java iterator like this:

<cfset iterator = myArray.iterator() />
<cfloop condition="#iterator.hasNext()#">
	<cfset thisIteration = iterator.next() />
	<cfset doStuff() />
</cfloop>

However, given the fact that <cfloop array=”#myArray#” index=”i”> is already an abstraction, it doesn’t make sense to use this in most cases.   But wouldn’t it be cool if you could call myQuery.iterator() or myStruct.iterator() and have the same functionality?  Or even better, why not have those collections all extend an iterator class so that it could be simplified even futher with myQuery.hasNext() or myStruct.hasNext().

Keep in mind, this discussion is only coming from the perspective of the programming interface itself, and I am not going to get into the differences behind the scenes of how a query result is actually a set of arrays of columns, or how an array differs from a struct.  My point is simply that if we have these abstractions, it sure would be cool if they were consistent, but we were still able to call specific functions on them like ArrayFind() , StructDelete(), etc.  If we had this ability, our loops in CFSCRIPT would be a lot more consistent as well.

With that said…. I will leave you with my wishful implementation for writing loops in CFML:

<cfloop condition="#myQuery.hasNext()#">
	<cfset thisRow = myQuery.next() />
	<cfset doStuff() />
</cfloop>

<cfloop condition="#myArray.hasNext()#">
	<cfset thisItem = myArray.next() />
	<cfset doStuff() />
</cfloop>

<cfloop condition="#myStruct.hasNext()#">
	<cfset thisItem = myStruct.next() />
	<cfset doStuff() />
</cfloop>

ColdFusion 9 catch() is not thread-safe!

I almost hate to admit this… no, I really hate to admit this.   For some reason, I was have always been under the impression that when you do a catch() in CFSCRIPT, that the variable you define as the catch is protected within the catch condition.   However, it hit me today that it is written to the variables scope by default.  Not only that, as I was testing it further, I believe that I have discovered that it is not thread-safe at all!   (edit: this problem applies to CFCATCH too. See notes at bottom and in comments)

Want to test this?  Open up a dummy CFM file and run the following:

(edit: I have modified this example since originally posting, putting the try/catch within a method so that it is consistent with the other examples)

WriteDump(ourFunction());	

public void function ourFunction()	{
	try	{
		local.a = b;
	}
	catch( any e )	{
		killE();
		WriteDump(e);
	}
}

public void function killE()	{
	StructDelete(variables,"e");
}

In case what we are doing isn’t abundantly obvious, we are creating a forced exception by referencing variable “b” which doesn’t exist.  In the catch(), we are saving the exception structure as variable “e”  so that we can handle however our business rules dictate.  In the case of our example, we are calling a method called killE(), and as you see it deletes “e” from the variables scope of the current template.  In the following line in catch() we are going to dump out exception details.  However, rather than a nice exception telling us that “b” wasn’t defined, we get the following:

ColdFusion exception

So, all we need to do is change “e” to “local.e” right and it will be thread-safe right?

WRONG!

This is apparently invalid syntax in Adobe ColdFusion (as of v.9).  Take this example:

WriteDump(ourFunction());	

public void function ourFunction()	{
	try	{
		local.a = b;
	}
	catch( any local.e )	{
		killE();
		WriteDump(local.e);
	}
}

public void function killE()	{
	StructDelete(variables,"e");
}

When we run this example, the following occurs:

What?!  So apparently we have to use the “var” scope to define our exception then, right?  So how about this attempt… this should do it, right?

WriteDump(ourFunction());	

public void function ourFunction()	{
	var e = "";

	try	{
		local.a = b;
	}
	catch( any e )	{
		killE();
		WriteDump(e);
	}
}

public void function killE()	{
	StructDelete(variables,"e");
}

WRONG!

If we take this approach, we receive the following:

My impression of this exception is that it is trying to write a variable into the variables scope that is already defined in the var scope and barking about it.

So then, what are we left with?

I am admittedly not the sharpest crayon in the box, but to me this indicates that catch() is 100% not thread-safe!   If anyone sees any problem with this diagnosis that I have made or has any thoughts, I am all ears.  If this is true, this throws huge wrenches into my work, and I am sure it does to others as well.

EDIT: As has been pointed out by Henry, this is an issue that Adobe was notified of this bug July of 2010 – almost 16 months!  It affects at least versions 8 and 9, and is not limited to CFSCRIPT.  It apparently affects CFCATCH as well.  I have looked through the tech notes of ColdFusion 9.0.1 Cumulative Hot Fix 1 and ColdFusion 9.0.1 Cumulative Hot Fix 2 and find no evidence that they have ever addressed it.

EDIT (again): I have tested this in both Railo and OpenBD.  OpenBD, due to the fact that it puts variables within methods in the var scope by default does not fail on the first test.  Railo, does fail.  However by using the variables scope like ColdFusion.  However, it allows you to do catch( any local.e ) and var scope e ahead.  So in a nutshell, ColdFusion is the only engine that does not give us any way to do thread-safe error handling.

Refactoring: avoiding nested conditional statements

Recently at I was given the task of adding an new validation routine to an existing validation process.  In this piece of code, the requirements mandated that a series of sequential tests would be run, but in the event of a failure of any of them, the process would kick out and set an error state, provide user feedback, and whatever other tasks needed to occur.  We have all seen processes like this before.  Essentially it looked like this:

error = true;
if ( testOne() )  {
    if ( testTwo() ) {
        if ( testThree() ) {
            if (testFour() )  {
                 error = false;
                 doAllTestsPassedStuff()
            } 
        }
    }
}
if ( error ) {
	handleErrorCondition()
}

Looking at this block of code, the intent is pretty obvious as we progressively run tests as long as the previous test returned true, eventually firing the doAllTestsPassedStuff() method.  If any of the tests failed, we would call handleErrorCondition().  While this approach is completely functional, the maintainability of it is no fun, and it just feels wrong to me.  For the task I was given, I had to add a new test davesSuperTest()  between the 2nd and 3rd conditional blocks.   If I were to follow the previous approach, I would insert it there, and tab out the previous testThree() and testFour() conditional statements further to the right.  In my opinion this is an ugly block that is getting uglier by the minute.

By altering the approach to use try/catch blocks, we can still maintain the same level of control and order or operations as dictated in the requirements, but each condition becomes insulated from the others,  like this:

try {
	if ( !testOne() )	{
		throw "fail:testOne";
	}
	if ( !testTwo() )	{
		throw "fail:testTwo";
	}
	if ( !davesSuperTest() )	{
		throw "fail:davesSuperTest";
	}
	if ( !testThree() )	{
		throw "fail:testThree";
	}
	if ( !testFour() )	{
		throw "fail:testFour";
	}
	// if we reach this point, then all of the above tests passed.
	 doAllTestsPassedStuff()
}	
catch(e)	{
	handleErrorCondition()
}

Given this approach, it is very simple to add/remove conditions without disrupting other conditions, even better, I don’t have to scroll to my second monitor on the right!

Piecing together optional view elements at runtime with Mach-II

Often in web development, you run across a case where there is a display that contains optional view elements that are derived at runtime.  Perhaps there is a section of a form that is only available to residents of the US.  Maybe, only users with a certain level of group access have the ability to see a section of a page that can be partially viewed by other user types.  I am sure you can think of numerous cases that you have come across in your own work.   In a current project at our company, one of our developers was tasked with rewriting an old piece of legacy code in which logged in agents can select one or more of multiple reports to display.  The legacy code was a complete nightmare that would probably be worthy of an entire series of what not to do, but that is for another day!  To boil this piece down a bit, essentially the agent has a series of checkboxes of specific reports, and “to” and “from” date inputs to provide a date range.   Depending on what the agent selects, the submission page might show a single report, or a series of several reports in line.

One approach to this would be to have an event defined in which you compile each piece of data into some kind of data collection

if ( [Report1 was selected] )
     get data for Report1
if  ( [Report2 was selected] )
     get data for Report2
(... and so on...)

Then on the view, you could do something like:

if ([we have report data for Report1])
show Report1
if ([we have report data for Report2])
     show Report2
(... and so on ....)

Well, we could but it would be wrong!  Why?  For one thing, we now have conditional logic about each report built into multiple places in our application.  From a complexity and maintenance standpoint, you have just made it, at a minimum, twice as complex as it needs to be.  There is also a strong argument that could be made (and I would make it!) that your view shouldn’t be responsible for determining what it’s supposed to display.  It should simply display!

So what is another approach to this?  How could we employ MVC techniques without the individual components involved becoming intertwined, creating yet another administrative issue.  Here is the solution that I proposed to our developer:
First, let’s start with our form, which is remarkably simple:

<h3>Select the reports you would like to view</h3>
<cfoutput>
<form name="reportform" action="#buildUrl( "viewreports" )#" method="post">
<input type="checkbox" name="reportList" value="report1" /> Report 1<br />
<input type="checkbox" name="reportList" value="report2" /> Report 2<br />
<input type="checkbox" name="reportList" value="report3" /> Report 3<br />
<input type="checkbox" name="reportList" value="report4" /> Report 4<br />
<input type="checkbox" name="reportList" value="report5" /> Report 5<br />
<input type="submit" value="run reports">
</form>
</cfoutput>

As you can see, we are going to load up an event-arg on the submission named “reportList” that will be a comma separated list of reports that we will be displaying. For instance, if we make the selections you see below, on the viewreports event, event.getArg( “reportList” ) will be: report1,report3,report5

Report Selection form output

I decided that generally I wanted it to behave with a flow like this:

Reports flow diagram

It is a good goal in application development to move as much specific knowledge of the flow of the application outside of any component (view, service, or otherwise) that is not responsible for it to avoid coupling issues.  For instance, our report display page shouldn’t understand flow should it?   (hint: “no“)   Our service layer that is responsible for retrieving data shouldn’t should it? (you guessed it: “no”)

So where does that responsibility lie?  I place it squarely on the front controller framework at hand, namely Mach-II in this case.

If that is the approach we are going to follow, then how can we have freely operating pieces, and create a composite view without the individual pieces having any knowledge of each other, nor any knowledge of their role in the bigger picture?   We achieve this by creating small encapsulated pieces that are individually responsible for their limited role, and count on our framework to do the rest.

If you look at the flow diagram above, you will see that we start with a conditional statement on our submission event: “Has form output been generated?”

In our Mach-II configuration we can achieve this by doing the following:

<event-handler event="viewreports" access="public">
 <event-mapping event="noData" mapping="multi_report1" />
 <filter name="checkForReportData" />
 <view-page name="selectreports" contentArg="form" />
 <view-page name="reports" />
</event-handler>

Let’s talk about what those pieces are doing.  First, we are defining an event-mapping “noData“.  What this means is that anywhere further in this event, if someone announces “noData“, the event that we are really going to announce is “multi_report1“.   By doing this, we don’t bury specific knowledge into our component responsible for the announcing, but more on that in a moment.  Next, we are calling a filter named checkForReportData. Filters are Mach-II components that contain a single public method filterEvent() which returns a boolean value telling Mach-II whether or not it should continue further within this event.  In the code above, if the filter returns “false”, the <view-page/> nodes will not be processed.   So let’s take a look at the filterEvent() method.

<cffunction name="filterEvent" access="public" returntype="boolean" output="false" hint="I am invoked by the Mach II framework.">
     <cfargument name="event" type="MachII.framework.Event" required="true" hint="I am the current event object created by the Mach II framework." />
     <cfargument name="eventContext" type="MachII.framework.EventContext" required="true" hint="I am the current event context object created by the Mach II framework." />

     <cfset var result = event.isArgDefined( "reportOutput" ) />

     <cfif NOT result>
          <cfset announceEvent( "noData", event.getArgs() ) />
     </cfif>

     <cfreturn result />
</cffunction>

Very simply, we are saying: Is there an event-arg named reportOutput defined? If there is, we are returning true, telling the event to continue.  If not we are going to announce an event noData, and returning false.   By announcing a generic event named “noData”, and then defining what “noData” means in the XML config, we have just insulated this filter from change.  For instance, right now the <event-mapping/> says that this means that we should announce “multi_report1“.  If this ever changes to another report, then we only have to change the config.  Additionally, we might be able to repurpose this filter another way in the future and announce a completely different event by using a different event-mapping.

So in our example, we have no reportOutput on our first pass through this method, so we are being rerouted to the event “multi_report1“.  Here is what it looks like:

<event-handler event="multi_report1" access="private">
     <event-arg name="reportName" value="report1" />
     <event-mapping event="nextEvent" mapping="multi_report2" />
     <filter name="checkIncludeReport" />
     <notify listener="ReportListener" method="getData" resultArg="data" />
     <view-page name="reports.report1" contentArg="reportOutput" append="true" />
     <announce event="nextEvent" copyEventArgs="true" />
 </event-handler>

On the second line, all we are doing is defining an event-arg named “reportName” and assigning a value of “report1“.   We will be using this value in a moment.  Before we get to that, and now that you understand what event-mappings are doing, the third line should be clear.  We are just telling Mach-II “if someone or something announces nextEvent within the context of this event, announce multi_report2 instead“.  Again this allows our components to announce generic events which are explicitly defined in the config.   Next, we are calling a filter to see if report1 has been selected in the form by calling a filter named checkIncludeReport.   If the report was not selected in the form, we will kick out and announce nextEvent aka multi_report2.   However, if the report is included, we will continue down the line calling a method on our listener to retrieve data, and then using that data in a view named “reports.report1“.  We take that generated HTML and append it into an event-arg named “reportOutput“.   If you look at our code above, you will be reminded that this is the argument we were testing for in the checkForReportData filter.   Here is a look at our checkIncludedReport filter which makes the decision to include this report or not.

<cffunction name="filterEvent" access="public" returntype="boolean" output="false" hint="I am invoked by the Mach II framework.">
     <cfargument name="event" type="MachII.framework.Event" required="true" hint="I am the current event object created by the Mach II framework." />
     <cfargument name="eventContext" type="MachII.framework.EventContext" required="true" hint="I am the current event context object created by the Mach II framework." />    

     <cfset var result = ListFindNoCase( event.getArg( "reportList" ), event.getArg( "reportName" ) ) />

     <cfif NOT result>
          <cfset announceEvent( "nextEvent", event.getArgs() ) />
     </cfif>

     <cfreturn result />   
</cffunction>

All this filter is doing is checking in the event-arg reportList, which is a comma separated list of reports, and seeing if the value of event-arg reportName (which was defined on line 2 above) exists in the list.  Based on our example of selecting reports 1, 3, and 5, the plain English translation of this comparison is:  If the list “report1,report3,report5″ contains “report1″, return true, otherwise announce “nextEvent” and return false. As you surely know by now, in this case “nextEvent” translates to “multi_report2

Essentially we just repeat this exact pattern for the next 4 events, with a minor change in the last event:

<event-handler event="multi_report2" access="private">
    <event-arg name="reportName" value="report2" />
    <event-mapping event="nextEvent" mapping="multi_report3" />
    <filter name="checkIncludeReport" />
    <notify listener="ReportListener" method="getData" resultArg="data" />
    <view-page name="reports.report2" contentArg="reportOutput" append="true" />
    <announce event="nextEvent" copyEventArgs="true" />
</event-handler>

<event-handler event="multi_report3" access="private">
     <event-arg name="reportName" value="report3" />
     <event-mapping event="nextEvent" mapping="multi_report4" />
     <filter name="checkIncludeReport" />
     <notify listener="ReportListener" method="getData" resultArg="data" />
     <view-page name="reports.report3" contentArg="reportOutput" append="true" />
     <announce event="nextEvent" copyEventArgs="true" />
</event-handler>

<event-handler event="multi_report4" access="private">
     <event-arg name="reportName" value="report4" />
     <event-mapping event="nextEvent" mapping="multi_report5" />
     <filter name="checkIncludeReport" />
     <notify listener="ReportListener" method="getData" resultArg="data" />
     <view-page name="reports.report4" contentArg="reportOutput" append="true" />
     <announce event="nextEvent" copyEventArgs="true" />
</event-handler>

<event-handler event="multi_report5" access="private">
     <event-arg name="reportName" value="report5" />
     <event-mapping event="nextEvent" mapping="viewreports" />
     <filter name="checkIncludeReport" />
     <notify listener="ReportListener" method="getData" resultArg="data" />
     <view-page name="reports.report5" contentArg="reportOutput" append="true" />
     <announce event="nextEvent" copyEventArgs="true" />
</event-handler>

As I mentioned, there is a slight change in multi_report5 in that nextEvent is defined as “viewreports“.   By doing this, we have then ended our report generation and are redirecting the flow back to the initial event that kicked this process off.  Since we now have reportOutput data, we are directed to the page that ouputs it all.  Quite simply, our big, massive, magnificent multi-form display page looks like this:

these are the reports:

<cfoutput>#event.getArg( "reportOutput" )#</cfoutput>

There is no conditional nonsense, and the view simply outputs all of the generated output that was appended into the event-arg reportOutput.   Additionally, if you reflect on the things we have done, no where are we explicitly saying “if the user selected report1, do something“.  We have left it all fairly generic and hopefully have created some potentially reusable components.    For instance, let’s say that we now have a requirement for an event that only displays report2. No problem!  All we need to do is add an additional event like this:

<event-handler event="report2" access="public">
     <notify listener="ReportListener" method="getData" resultArg="data" />
     <view-page name="reports.report2" />
</event-handler>

Easy, huh!

Lastly,  I know that some of the more astute of you may have noticed a fatal flaw in the design above.  What happens when no reports are selected?   In the interest of keeping this example as stripped down as I could, I let that one go, but it is a very simple fix.  What would you do?  Where would you put it?   Feel free to post your fix in the comments, along with any other thoughts you have on this solution.

download fully-functional example files – NOTE: doesn’t include the Mach-II framework

Help the homeless. Scriptalizer.com is being evicted!

I am cross posting this for AaronScriptalizer is a GREAT project that needs a home.  If you have a server, VPS, etc running a CFML engine (Railo, OpenBlueDragon, or ColdFusion), and wouldn’t mind adding another site to your server, I am sure Aaron would be grateful.

For details contact Aaron in the comments of this post.

End of an era – Turning the lights off at InstantSpot

As of this first week of 2011, we will be permanently turning off the lights at InstantSpot. I felt that it was fitting to give it at least some type of obituary, and share my reflections of what has been a 4-and-a-half year story of the process.  We have learned much through the process, and hopefully we can share some of what we learned with others.

Starting in 2006, Aaron Lynch and I started on modest plan to revolutionize the blogging world.  OK, so that isn’t really true – it happened like this…

In the beginning… a simple CMS

In 2005, Aaron was making a career change and after years of being friends through the Jeep and 4×4 community we found ourselves in the same line of work as ColdFusion developers.  Around the beginning of 2006, he joined my company which gave us way too much time to discuss and conspire on how we might take over the world.  We decided to begin by making a simple no-database CMS in ColdFusion that we could distribute and sell for a nominal fee.  Of the ones that we deployed, we found that no one was really interested in hosting it themselves and we ended up putting a few instances on our server.

A network was born

Once a small handful of them were running, we created a single directory page that listed and linked to all of the individual sites.  Even though it was exceptionally simple and rudimentary, we realized that in essence we had a “network”.  Given the fact that no one really had an interest in hosting themselves, we quickly realized the fallacy of duplicating the code base for each new site and set out to more effectively externalize the settings and make the configuration centrally persisted in a shared database and run all of the sites from a single set of code.  We quickly developed methods of skinning individual sites, and using subdomain URLs.  The methods we came up with are almost exactly the same methods that we would end up using the next 4 years moving forward.

The addition of blogs

Almost immediately we decided that we needed to add blogs to the sites.  In a move that would haunt us to some degree for a long time to come, we chose to implement the open source BlogCFC application created and maintained by Raymond Camden.  While there was nothing inherently wrong with BlogCFC itself, it was just a poor choice on our part for several reasons:

  • We already had a functioning CMS build using the Mach-II framework using its own database.  Adding in multi-site support to the blogs (which I don’t think it had at the time, if I recall correctly) and tying that into our existing application proved to be a ton of work and felt like such a goofy solution.
  • Combining the two databases into a single one, and the vastly different approaches of data persistence (naming schemes, UUID vs integer keys, etc…) made our code base look schizophrenic.
  • We actually had two complete applications and had to manage session replication between them.
  • We ended up using a really a small portion of the actual BlogCFC code base.  If we would have just created the functionality on our own from scratch, we would have spent far less time and ended up with something that seemed like far less of a hodgepodge.

Eventually we got it working, and did so somewhat successfully, but it really felt like a hodgepodge of methodologies.  However, we started growing our user base, got some great community feedback and support, and had a relatively well performing blogging network.

The dirty secret

One of the seriously embarrassing dirty secrets that we had was the original hardware/server architecture of InstantSpot.  When Aaron and I both started out writing the code, we were not the Linux nerds that we grew into being later.  We were both Windows developers and had never paid as close of attention to casing issues as we grew to later.   Unfortunately the first time that we tried to run Spot under Linux, it blew up severely.  We started trying to work through issue by issue,  but decided we would do a temporary fix by throwing it into a Windows VM on our co-located Linux server.  However, this temporary work-around ended up sticking around much longer than we anticipated.   So, for the first year or so, InstantSpot chugged along on a single PC that was little more than a common desktop, running Ubuntu Server, with a Windows VM in VirtualBox!  The fact that it supported as much traffic as it did was always just a little bit comical to us.

The inevitable rewrite

As we continued working to bring in a continual array of new enhancements – many of which were implemented but never even seen by our users! – we found more and more that our architecture sucked badly enough that we would actually have to do something about it.  In addition goofy hardware/OS issue, the maintenance of dealing with the two separate applications became this growing elephant in the room.   As our user base and our exposure continued to grow, we decided it would be worth a complete re-write from the ground up.  When we first started slapping stuff together in the befinning, we didn’t have a defined target in mind, so surely a rewrite would be less time consuming since we actually had a goal now right?!   We set out about mid-2007 to find out.  Around that time, Railo had hit RC 2.0 and Aaron and I were both enamored by the idea of a completely open source version of InstantSpot.  We busted our asses the last quarter of the year and were finally ready to roll right at the turn of the year in 2008.   I will never forget how many people scoffed at us using Railo.  I mean seriously, given the enormous growth of it in 2010, you wouldn’t believe all the snickering we put up with about it early on.  However, we found huge performance increases over ColdFusion and felt like we were really breaking new ground.

Disaster strikes

In January 2008, we rolled out the newly rewritten InstantSpot with HUGE expectations.  As soon as we flipped the switch at 1:00am CST on January 13, we immediately started seeing very strange errors.  Some of these errors were coming in the fashion of ColdSpring returning improper service object bean types, Mach-II throwing errors that made absolutely NO sense at all, and errors in code that was too simple to fail!  Often when someone would visit a blog, someone else’s content would show up.  (not cool!)  It began looking like we had some serious threading issues.  Immediately we discovered one issue in which the getBean() method of ColdSpring (1.x?) was not thread safe!  (Here is an email thread discussing it).    We also realized that we were using getBean() in a few inappropriate places, and moved code around so that the only place this was being used was in the bootstrapping process and we made those processes threadsafe on our own.   Even then, however, we continued to see things spiraling out of control in what became increasingly clear as a concurrency issue within Railo itself.   Gert Franz actually contacted us as soon as we brought the issue to light and wanted to help us resolve it by us sending him our application.  Given the complexity of the application, however, we felt that we didn’t have the option to spend time going down that road without knowing that we would see a solution come out of it quickly.  We made a snap decision to purchase a license of ColdFusion 8, and cut over as soon as we could.  Two nights later we made the cut-over to CF8 and all of our threading issues completely disappeared.  (blog entry from January 2008 discussing this in detail)

Immediately we breathed easier knowing that our months worth of work was actually paying off, but unfortunately this gave some “I told you so” ammo to all the previously mentioned skeptics of our use of Railo, and we were eating crow.  As it turned out, Railo put out a patch within the week that would have completely fixed the problem we had, and in hindsight, I really wish that we had the foresight to hold out and continue down that path.

Easy days

So there we were… we had our new codebase running… and running… and running… with almost zero time spent on maintenance issues.  Before too long we had amassed somewhere around 1000 blogs, with less than 5% of them being relatively active.  We couldn’t have been more pleased with the results of the rewrite as it made our lives so easy.  This period, for better or worse, allowed Aaron and I to put a huge focus on client work in our moonlighting hours.  While this was a temporarily lucrative venture, the focus was definitely off of InstantSpot since it really didn’t demand any of our attention on a daily basis.   One result of this that we essentially quit marketing the network.   Aaron and I are both really cheap dudes… I mean really… ask our wives.   Subsequently we never spent money to push InstantSpot beyond word of mouth.  We had this idea that if we made it an obvious tool, that an organic growth was inevitable.  While this was true to some extent, the flaw of this was the real industry giants that we were up against.  I mean, what had we made really?  A blogging network that from a feature perspective was a competitor against services like Google, WordPress, etc.   These giants that obviously had no idea that we even existed and were only competitors in our wildest imaginations and only on paper from a feature-by-feature perspective, rather than a true competition of any kind.  We had an obvious niche in the CFML development community, but the fact is that many developers want a more hands-on solution than what an out-of-the box solution such as a blogging network would provide.

What about revenue?

So we have never really talked publicly about InstantSpot revenue.  We spent a lot of time trying to figure out how to make money of it, but never really got it worked out right.  I mean, it paid for itself (until recently), but it never really made any money at all.  In periods of our largest traffic, we were still only making about $20.00/day.  So, with our colo fees of about 160/month, we were netting somewhere around 440.00/month.  For those that wondered why we didn’t bust our asses a little more, perhaps you understand our lack of motivation for doing so now!  It was kind of frustrating to have worked so hard, and to put so much time and effort in, for such a small monetary reward.    The thing that was really frustrating was that about 75% or more of that revenue was being generated from one very spammy blog that was set up with keywords all driving around prepaid wireless.  It felt kind of dirty, but it was the one consistent blog that kept us in the black.  Although technical blogs like mine, Aaron’s Sam Farmer’s, and others certainly brought more traffic, the click-through and click-rates on that stuff was dismal.

The decline

Over 2010, we have really seen a solid decline of anything positive from InstantSpot.  What was initially one of the HUGE benefits that we created with InstantSpot ultimately turned into a negative against us.  Our users enjoyed really ridiculous search engine exposure.  A brand new blog could be created and with a relatively well crafted title and content could easily hit page 1 of Google on a wide array of subjects.  This was due in large part to the fact that we started off with a few blogs that already had PR5 ranks, and then we used what we felt was a really smart way of linking pieces together so that new blogs made more relevant by being tied do solidly ranked blogs as they came in.   If you think I am overstating this, here are a couple of examples I can think of…

  • When Saddam Hussein was executed, we had a blogger that put up a youtube video of it ( yeah, the appropriateness  of that was definitely discussed, but we really never established a TOC that restricted specific content ).  This blog entry made page one of Google, and was picked up by El Tiempo, which is a huge latin news source, and linked as the first blog to carry the info.  We hit somewhere around 100K impressions by mid morning and our sever absolutely croaked.
  • We had a blogger that posted about State Farm Life Insurance and for a period, that blog link was higher in Google search results than statefarm.com with the search “state farm life insurance”.  That one was mind blowing to me!

There were a lot more anecdotal examples like this and we were continually shocked by what we were seeing in the way of search relevance.  Unfortunately we were not the only ones that noticed it.  We ended up with a couple of worthless spam sites that capitalized on this.  They would create a blog and then generate scores of blog entries every day with complete bullshit content stuffed with keywords and links to various products.  Over time, our valid content was overshadowed by our spam content as more and more (and more… and more… and MORE) spam sites were created.   Along with them came thousands upon thousands of spam comments.  We were pretty effective at blocking the bots, but there were a surprising number of manually typed spam comments, which were much more difficult to stop.  Eventually when you looked at our recent entries across the network, it was just shameful.   The more spam issues that came up, the more performance issues we had as well.  Unfortunately both Aaron and I were so tied up with our families and our jobs that neither one of us could deal with effectively and things just spiraled negatively.

Realization

At some point it really became quite clear that we were either going to have to invest much more effort and certainly much more money to make a real run at it, or we could just continue sailing along as long as it would go and see what happened.  It became clear to us that the kind of traffic that we would need to actually make InstantSpot something that could support us was so far beyond reality that we didn’t even know what to do about it.  Considering that both Aaron and I have full time jobs and each supports a family of 5,  quitting our jobs, taking loans and going full bore never seemed like the right decision.  “And who knows… maybe someday it will just take off!“.  Obviously, we never made that jump or commitment and both feel that we have hit “… as long as it would go”.  With the drop in revenue we have seen, we are actually spending money on InstantSpot to keep it up now.  We have experienced more and more performance issues that we don’t have the available time – and to be honest, the will – to keep things on track and at a level they should be.  After some discussions, we have decided that it’s time to pull the plug.

But, it’s not all bad

It really has been a great experience, and we have learned so much about so many things.  It has given us not only a great tool that has boosted our education and skill set, but gave us a great reason to be passionate about what we do and strive to do it well.  Hopefully we can find another project on the near horizon that gives us the same opportunity.

Thanks!

In closing, I would like to say thanks…  Thanks first and foremost to my family that were supportive day in and day out, and believed in what we were trying to do simply because we believed in it.  Thanks to all of the early adopters and people who stuck with us even when things were less than ideal with our system.  Thanks to many in the ColdFusion community that gave us really great support.  This has been a really great run, and I am happy to have been a part of it.

Solved: Strange sun.awt.X11.XToolkit exception with Mach-II/Railo/Tomcat/Ubuntu

I am running a BER version of Railo to experiment with the Hibernate ORM functionality for a new project.  I set up a Mach-II app from the 1.8 skeleton using Mach-II 1.8.1 on Railo under Tomcat6 on Ubuntu… whew!  That’s a mouthful huh?

The simple skeleton came up just fine, but after a little bit of customization, I ran across a strange issue.  I had created an event, in which a listener pulled a new entity from a ColdSpring bean, and persisted it using EntitySave().  Somewhere in that process, I started getting exceptions related to sun.awt.X11.XToolkit.

The first time the error would occur, I would see this:

Can’t connect to X11 window server using ‘:0.0′ as the value of the DISPLAY variable.

 /var/lib/tomcat6/webapps/railo-orm/MachII/properties/HtmlHelperProperty.cfc: line 142

    140: <!--- Configure auto-dimensions for addImage() --->
    141: <cfif StructKeyExists(serverInfo, "productLevel") AND serverInfo.productLevel NEQ "Google App Engine">
    142: <cfset variables.AWT_TOOLKIT = CreateObject("java", "java.awt.Toolkit").getDefaultToolkit() />
    143: <cfelse>
    144: <!--- Some hosts (such as GAE) do not support java.awt.* package so replace with mock function --->

On subsequent requests, I would get the following:

Could not initialize class sun.awt.X11.XToolkit again with the specitic exception pointed to HtmlHelperProperty.cfc: line 142.

After some Googling I came across a similar sounding issue in which a guy had added params to his app server.  I added it to mine, and the error went away.  If you come across this yourself, try adding: -Djava.awt.headless=true to the JAVA_OPTS (in catalina.sh for Tomcat).

Implicit constructor and CF9 style implicit accessors without ColdFusion 9

One of the broadly welcomed features of ColdFusion 9 has been the addition of implicit getters and setters to CFCs.  What this means is that you no longer have to hand code the repetitive boiler plate getXXX() and setXXX() methods for each property in your model objects. However, you don’t need ColdFusion 9, (or even ColdFusion for that matter) to enjoy the benefits of the new implicit accessors with the version 9 release.  Since ColdFusion 8, developers have had access to the OnMissingMethod() method which makes tasks like this very simple to implement on your own.  The OnMissingMethod is also available in current versions of OpenBlueDragon and Railo.

For my implementation of this, I use a standard BaseBean class in a number of my projects which is extended by all of my model class objects.  By leveraging the OnMissingMethod() in this BaseBean, I create these implicit accessors, in addition to an implicit constructor init() method as well.

There are multiple examples on the internet that show how you can use OnMissingMethod(), but for the most part, the examples that I have come across do not protect the class from being having infinite previously undefined properties added to it from the outside.  For instance, I could have a class named Person.cfc with define properties of “firstName” and “lastName”, but nothing would stop someone from doing Person.setThisIsNotAProperty(true) and that value would be dynamically added to the Person instance. While some might argue that the flexibility that this approach offers is a positive thing, I prefer to lock my classes down just a bit more.  For instance I like the fact that I can open up a model class and see exactly what properties it can contain with clearly defined <cfproperty/> tags.   From a maintainability standpoint the idea of “mystery” dynamic properties strike me as just wrong.

For my implicit constructor, I take an approach where I loop through any named arguments that were passed, and if the name matches a property that I have defined with <cfproperty/>, then it will be passed to a setter method.  Any arguments that are passed that are not defined as a property are discarded.

So let’s take a look at what this looks like.  First I will paste the entire BaseBean, and below we will break it apart to discuss what is going on.

<cfcomponent output="false">


	<cffunction name="init" access="public" output="false" returntype="BaseBean">
		<cfset var i = "" />
		<cfset initializePropertyList() />
		<cfloop list="#this.propertyList#" index="i">
			<cfif StructKeyExists(arguments,i)>
				<cfset set(i,arguments[i]) />
			</cfif>
		</cfloop>
		<cfreturn this />
	</cffunction>


	<cffunction name="initializePropertyList" access="private" output="false" returntype="void">
		<cfset var i = "" />
		<cfif NOT StructKeyExists(this,"propertyList")>
			<cfset this.propertyList = "" />
			<cfif ArrayLen(GetMetadata(this).properties)>
				<cfloop from="1" to="#ArrayLen(GetMetadata(this).properties)#" index="i">
					<cfset this.propertyList = ListAppend(this.propertyList,GetMetadata(this).properties[i].name)>
				</cfloop>
			</cfif>
		</cfif>
		<cfreturn />
	</cffunction>



	<cffunction name="onMissingMethod" access="public" output="false" returntype="any">
		<cfargument name="missingMethodName" type="string" required="true" />
		<cfargument name="missingMethodArguments" type="struct" required="true" />

		<cfset var property = "" />

		<cfif ReFindNoCase("^[gs](et)",arguments.missingMethodName)>
			<cfset property = ReReplaceNoCase(arguments.missingMethodName,"^[gs](et)","") />
			<cfset initializePropertyList() />

			<cfif ListFindNoCase(this.propertyList,property)>
				<cfif StructIsEmpty(arguments.missingMethodArguments)>
					<cfreturn get(property) />
				<cfelse>
					<cfset set(property,arguments.missingMethodArguments[1]) />
					<cfreturn this />
				</cfif>
			<cfelse>
				<cfthrow message="The class #GetMetadata(this).name# does not have a property named #property# so the implicit #arguments.missingMethodName#() method is not available." />
			</cfif>

		</cfif>
		<cfreturn />
	</cffunction>

	<cffunction name="get" access="public" output="false" returntype="any">
		<cfargument name="property" type="string" required="true" />
		<cfset var value = "" />

		<cfset value = variables[arguments.property] />

		<cfreturn value />
	</cffunction>

	<cffunction name="set" access="public" output="false" returntype="void">
		<cfargument name="property" type="string" required="true" />
		<cfargument name="value" type="any" required="true" />

		<cfset variables[arguments.property] = arguments.value />

		<cfreturn />
	</cffunction>

</cfcomponent>

Before we walk through the actual flow, I would like to point out the method initializePropertyList() that you see here:

<cffunction name="initializePropertyList" access="private" output="false" returntype="void">
	<cfset var i = "" />
	<cfif NOT StructKeyExists(this,"propertyList")>
		<cfset this.propertyList = "" />
		<cfif ArrayLen(GetMetadata(this).properties)>
			<cfloop from="1" to="#ArrayLen(GetMetadata(this).properties)#" index="i">
				<cfset this.propertyList = ListAppend(this.propertyList,GetMetadata(this).properties[i].name)>
			</cfloop>
		</cfif>
	</cfif>
	<cfreturn />
</cffunction>

By using the GetMetadata() function, we are able to introspect our instance and create a list of property names which we can reference in our other methods.  This is what enables us to enforce the rules that I described above in which we make sure that a property exists before setting it in the constructor or through an implicit accessor.  You will see in both the init() and OnMissingMethod() methods that we call this method to ensure the list of properties is available before testing against it.

With that established, let’s take a look at the init() constructor method:

<cffunction name="init" access="public" output="false" returntype="BaseBean">
	<cfset var i = "" />
	<cfset initializePropertyList() />
	<cfloop list="#this.propertyList#" index="i">
		<cfif StructKeyExists(arguments,i)>
			<cfset set(i,arguments[i]) />
		</cfif>
	</cfloop>
	<cfreturn this />
</cffunction>

After making sure that our propertyList has been initialized, we loop through that list and if there is a matching named argument, we call the set() method which sets that value into the variables scope.

So with the object initialized, let’s take a look at our accessors.  For our example, let’s say that we have a Person object and we are doing to set our firstName property like this: person.setFirstName(“Dave”).  Since that setter method doesn’t exist in our Person class, the OnMissingMethod() below will be invoked.

<cffunction name="onMissingMethod" access="public" output="false" returntype="any">
	<cfargument name="missingMethodName" type="string" required="true" />
	<cfargument name="missingMethodArguments" type="struct" required="true" />

	<cfset var property = "" />

	<cfif ReFindNoCase("^[gs](et)",arguments.missingMethodName)>
		<cfset property = ReReplaceNoCase(arguments.missingMethodName,"^[gs](et)","") />
		<cfset initializePropertyList() />

		<cfif ListFindNoCase(this.propertyList,property)>
			<cfif StructIsEmpty(arguments.missingMethodArguments)>
				<cfreturn get(property) />
			<cfelse>
				<cfset set(property,arguments.missingMethodArguments[1]) />
				<cfreturn this />
			</cfif>
		<cfelse>
			<cfthrow message="The class #GetMetadata(this).name# does not have a property named #property# so the implicit #arguments.missingMethodName#() method is not available." />
		</cfif>

	</cfif>
	<cfreturn />
</cffunction>

We are then doing a regular expression test to see if the missing method names starts with either “get” or “set”.  If so, we derive the name of the target property by stripping the “get” or “set” off of the string.  After making sure that our propertyList has been defined, we then look in that list to see if the target property actually exists in the instance.  If it does, the request is then routed on to either the getter or the setter depending on whether arguments were passed to it.  Otherwise an exception will be thrown to let the developer know that he or she has attempted to access an undefined property.

One thing you might notice is that when our conditional block routes the request to the setter, we return an instance of the class itself.  This allows us to do method chaining like this:  person.setFirstName(“Dave”).setLastName(“Shuck”).  it should be noted that this is not the approach that Adobe took with ColdFusion 9.

So let’s take a look at it in action.  Here is our Person.cfc class.

<cfcomponent output="false" extends="BaseBean">

	<cfproperty name="id" type="string" />
	<cfproperty name="firstName" type="string" />
	<cfproperty name="lastName" type="string" />

</cfcomponent>

As you can see in the <cfcomponent/> tag, we are extending our BaseBean which is all that we need to have a workable domain class object.

When we put this together and access it in our code, we can do this:

<cfset person = CreateObject("component","Person").init(id=7,firstName="Dave", lastName="Shuck") />

<cfdump var="#person#" />

Once we have an instance, we can modify those properties like so:

<cfset person.setId(8).setFirstName("Johnny").setLastName("Rotten") />

I have not actually tested this on Railo, but I can confirm that it works in OpenBlueDragon (including GAE version), and ColdFusion versions 8 and 9.