Sunday, September 19, 2010

Government (and your business) needs more beta

MS: Broadly, do you think government's focus on producing "final" projects hinders progress?
DE: Absolutely. Government's obsession with a "final" product is in many ways a relic of the industrial era. The idea that only a finished product can be released to the public -- a public whose needs both the private and public sector often misunderstand -- means that huge cycles are wasted, and launch times delayed, in perfecting programs and products that often don't hit the mark. Everything is a beta today because almost everything can be improved on the fly. What is saddest about this obsession with final products is that it isn't connected to what government does or the technologies. It's a collectively imagined limitation.

http://radar.oreilly.com/2010/03/the-state-of-open-government-i.html
Interview with David Eaves and Mac Slocum of O'Reilly Media.

Deeper Restructuring

Originally posted on SAT, FEBRUARY 6, 2010 AT 09:22

For some Canadian businesses, the recovery may prove as challenging as the downturn. … Canadian companies are emerging from the recession to an altered world – one that may require deeper restructuring and bolder strategic initiatives than currently contemplated.
- Mark Carney, Governor of the Bank of Canada

Although many pundits claim productivity is a four-letter word in Canadian (and European, and even American) polity, I think this signals a true shift in the conversation among Canadian business leaders. Although the gap between Canadian and US productivity growth has proved, with hindsight, to be more illusory than real, Canada does have a real problem. Too long sheltered by a favourable exchange-rate gap with its largest export market, Canadian businesses have under-invested in themselves for much of the last 15 years.
I think this presents a great opportunity for two sectors: manufacturers of productivity-enhancing machinery, and IT. The first is an obvious gimme; with the Canadian dollar's appreciation against the US dollar, much of this machinery is now much more affordable, thereby improving the ROI calculations and shortening the pay-back period for capital investments. The second has a harder row to hoe. IT investment has a disappointing track record when it comes to realizing a return on investment. Too many promises, too many hidden costs of implementation.
Business Intelligence should be better positioned in this regard. In many projects I've participated in, we've managed to eliminate large numbers of person-hours of unproductive, even soul-crushing labour at a cost of less than 15-20% of the reduced outlay. More often than not, this doesn't lead to redundancies and lay-offs. Rather, the people freed from pointless spreadsheet jockeying are able to do the job they were putatively hired to do.

Using Spring XsltView with Apache FOP

Originally posted on SAT, OCTOBER 3, 2009 AT 15:44

In the SpringSource cookbook, there is a recipe for using XSL-FO with an XsltView. However, the recipe uses the now-deprecated AbstractXsltView. Here is an example using a subclass of XsltView and Apache FOP. If you are using SpringSource Tool Suite, be sure to add Apache FOP as a dependency in your Maven pom. Otherwise, download the latest version directly from the Apache FOP project site and include the Batik and Avalon Framework dependencies.



XsltFoView.java


package com.queueq.dandy;

import java.io.StringReader;
import java.util.Map;

import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;

import org.apache.fop.apps.FOUserAgent;
import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FopFactory;
import org.apache.fop.apps.MimeConstants;
import org.springframework.web.servlet.view.xslt.XsltView;

public class XsltFoView extends XsltView {

    @SuppressWarnings("unchecked")
    @Override
    protected void renderMergedOutputModel(Map model, HttpServletRequest request,
            HttpServletResponse response) throws Exception {

        FopFactory fopFactory = FopFactory.newInstance();
        FOUserAgent foUserAgent = fopFactory.newFOUserAgent();

        ServletOutputStream out = null;

        try {
            out = response.getOutputStream();
            response.setContentType(MimeConstants.MIME_PDF);

            // Construct fop with desired output format
            Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);

            // Setup XSLT
            TransformerFactory factory = TransformerFactory.newInstance();

            // Create new transformer and get the stylesheet using
            // getStylesheetSource() from superclass (XsltView)
            // there must be an XSL file in WEB-INF/xsl with file name
            // equal to the name of the view
            Transformer transformer = factory.newTransformer(super.getStylesheetSource());

            transformer.setParameter("versionParam", "2.0");

            // Setup input for XSLT transformation
            StringReader xmlReader = new StringReader((String) model.get("xml"));
            Source src = new StreamSource(xmlReader);

            // Resulting SAX events (the generated FO) must be piped through to FOP
            Result res = new SAXResult(fop.getDefaultHandler());

            // Start XSLT transformation and FOP processing
            transformer.transform(src, res);
        } finally {
            // getOutputStream() may have thrown, in which case out is still null
            if (out != null) {
                out.close();
            }
        }
    }
}


To use the view, you need to add a view resolver to your mvc-config.xml.


from mvc-config.xml


<bean id="xsltViewResolver" class="org.springframework.web.servlet.view.xslt.XsltViewResolver">
    <property name="viewClass" value="com.queueq.dandy.XsltFoView"/>
    <property name="order" value="2"/>
    <property name="prefix" value="/WEB-INF/xsl/"/>
    <property name="suffix" value=".xsl"/>
</bean>


With this configuration, Spring looks in WEB-INF/xsl for .xsl files whose names match the view name.


xslfo.xsl


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="/">
        <fo:root>
            <fo:layout-master-set>
                <fo:simple-page-master master-name="A4-portrait"
                    page-height="29.7cm" page-width="21.0cm" margin="2cm">
                    <fo:region-body/>
                </fo:simple-page-master>
            </fo:layout-master-set>
            <fo:page-sequence master-reference="A4-portrait">
                <fo:flow flow-name="xsl-region-body">
                    <fo:block>
                        Hello, <xsl:value-of select="name"/>!
                    </fo:block>
                </fo:flow>
            </fo:page-sequence>
        </fo:root>
    </xsl:template>
</xsl:stylesheet>


Finally, to use the new XsltFoView, simply add a RequestMapping to your controller. The model must contain an attribute called "xml" that holds the XML to be transformed by the stylesheet. In this case, the XML is simply <name>...</name>, which is matched and substituted in the stylesheet above.




WelcomeController.java

package com.queueq.dandy;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.servlet.ModelAndView;

@Controller
public class WelcomeController {

    @RequestMapping("/transform")
    public ModelAndView transform() {
        // xslfo is the name of the stylesheet created above
        ModelAndView m = new ModelAndView("xslfo");

        String s = "<name>Kristian</name>";
        m.getModelMap().addAttribute("xml", s);

        return m;
    }
}
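
When the PDF comes out wrong, it can help to look at the XSL-FO before FOP ever sees it. The sketch below is not part of the recipe: it runs only the XSLT step with the JDK's built-in transformer and returns the result as a string. The class name XslFoPreview and the trimmed-down inline stylesheet are assumptions made for illustration; the real stylesheet lives in WEB-INF/xsl.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Spring or FOP: runs only the XSLT step
// so the generated XSL-FO can be inspected as text.
final class XslFoPreview {

    // Trimmed-down stand-in for xslfo.xsl; only the fo:block is kept.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
        + "<xsl:template match=\"/\">"
        + "<fo:block>Hello, <xsl:value-of select=\"name\"/>!</fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the serialized result.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Same XML string the controller puts in the model under "xml"
        System.out.println(transform("<name>Kristian</name>"));
    }
}
```

Because the template matches "/", select="name" picks up the document element directly, which is why the model only needs to carry <name>...</name>.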

Operations Mantras

Originally posted on TUE, FEBRUARY 5, 2008 AT 06:56

"- Database configurations are changing. Software like HiveDB, MySQL Proxy, DPM exist now. We're absolutely doing partitioned data for huge datasets. We're also thinking outside of the box with software like starling and Gearman. Learn what these are, and understand that not everything will be in a database."  Dormando

Great list of common-sense, and not-so-common-sense, best practices. I found it a good exercise to mentally check off the ones that I am, and *am not*, so disciplined about.

The quote above is, for me, a real gem.

Using MySQL Proxy to integrate operational data

Originally posted on SUN, JULY 15, 2007 AT 07:51

Most organizations have operational reports that are run on a daily basis. These reports record such things as balances, occurrences and transactions - the day-to-day facts of their operation. Operational reports differ from analytical reports in that they tend to be snapshots of a relatively narrow time slice. Analytical reports range over much more data covering a longer period, and are probed for individual facts or used to generalize from trends and extremes.

To take an example from a call centre, a typical operational report would be a report for the previous day showing the number of calls received, broken down by agent. Duration, and probably call routing information from the Interactive Voice Response (IVR) system, would be included for each call as details. A typical analytical report would aggregate the same data over a longer period, say 13 weeks, and be used to look for trends, extremes and exceptions. If the call centre wanted to understand its relative productivity, or needed to make educated predictions about its future call volume, these are the reports it would use.

What often happens is that companies will have really good coverage on the operational reporting side and abysmal coverage on the analytic side. This, despite the best efforts of the Business Intelligence vendors to make analytic reporting easy and pervasive. One of the main reasons for this is that most data is not in a dimensional model and is distributed all over the place either physically (multiple data sources) or logically (in a normalized form). Companies are often fearful of building Data Warehouse solutions due to perceptions of cost and risk.

One promising approach is to use the operational reports as a de facto data warehouse. Since the data is being extracted as facts along dimensions and with details, a kind of dimensional modelling is occurring in the report design. If the reports are run at intervals and saved as instances, a time series of dimensional data is being accumulated. The trick, though, has been in accessing this data, as there is no query syntax for this kind of data.

MySQL Proxy allows SQL queries submitted to the MySQL server to be intercepted and interpreted before being passed on to the server. This allows for query re-writing and language extension. This provides an opening for querying of report instances. The following imagined query would return aggregate data grouped by call agent for a given time period using data saved in report instances created by running an operational report on a daily basis over a three month period.

SELECT Agent_Name, COUNT(Call_Id), AVG(Call_Duration), MIN(Call_Duration), MAX(Call_Duration) FROM REPORTREPOSITORY( 'Call Centre Reports/Call Details by Agent', '2007-01-01 00:00:00.000', '2007-03-31 23:59:59.999') GROUP BY Agent_Name
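
To make the intent of that imagined query concrete, here is a rough Java sketch of the same aggregation done in memory. It assumes the call detail rows have already been pulled out of the saved report instances; CallDetail and AgentCallSummary are hypothetical names, and the sample rows in main are made up.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical stand-in for one detail row extracted from a report instance.
final class CallDetail {
    final String agentName;
    final String callId;
    final double durationSeconds;

    CallDetail(String agentName, String callId, double durationSeconds) {
        this.agentName = agentName;
        this.callId = callId;
        this.durationSeconds = durationSeconds;
    }
}

final class AgentCallSummary {

    // In-memory equivalent of COUNT/AVG/MIN/MAX ... GROUP BY Agent_Name.
    // A TreeMap keeps the agents in alphabetical order.
    static Map<String, DoubleSummaryStatistics> byAgent(List<CallDetail> rows) {
        return rows.stream().collect(Collectors.groupingBy(
            row -> row.agentName,
            TreeMap::new,
            Collectors.summarizingDouble(row -> row.durationSeconds)));
    }

    public static void main(String[] args) {
        // Made-up rows, as if merged from several daily report instances
        List<CallDetail> rows = Arrays.asList(
            new CallDetail("Alice", "c1", 120.0),
            new CallDetail("Alice", "c2", 60.0),
            new CallDetail("Bob", "c3", 300.0));

        byAgent(rows).forEach((agent, stats) -> System.out.printf(
            "%s: count=%d avg=%.1f min=%.1f max=%.1f%n",
            agent, stats.getCount(), stats.getAverage(),
            stats.getMin(), stats.getMax()));
    }
}
```

The aggregation itself is trivial; as noted, the expensive part is getting the rows out of each saved instance in the first place.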

MySQL Proxy embeds a Lua interpreter, which in turn gives you shell access. This should be sufficient to afford API access to a report repository, provided that the reporting solution offers such an API. The main challenge for this approach is the amount of time that it would take to extract the saved data out of each report instance, which I would expect to vary depending on report format and API design.
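
The first thing an intercepting script would have to do is spot the REPORTREPOSITORY(...) pseudo-table and pull out its three arguments. In MySQL Proxy that logic would be written in Lua; the sketch below uses Java purely to illustrate the idea. The class name, the regular expression and the argument names are all assumptions, and the pattern does not handle quotes escaped inside the arguments.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for the three arguments of the imagined
// REPORTREPOSITORY(path, from, to) pseudo-table.
final class ReportRepositoryCall {
    final String reportPath;
    final String from;
    final String to;

    private ReportRepositoryCall(String reportPath, String from, String to) {
        this.reportPath = reportPath;
        this.from = from;
        this.to = to;
    }

    // Three single-quoted arguments separated by commas; whitespace is optional.
    private static final Pattern CALL = Pattern.compile(
        "REPORTREPOSITORY\\(\\s*'([^']*)'\\s*,\\s*'([^']*)'\\s*,\\s*'([^']*)'\\s*\\)",
        Pattern.CASE_INSENSITIVE);

    // Returns the parsed call, or empty if the query is ordinary SQL
    // that should be passed to the server untouched.
    static Optional<ReportRepositoryCall> find(String sql) {
        Matcher m = CALL.matcher(sql);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new ReportRepositoryCall(m.group(1), m.group(2), m.group(3)));
    }

    public static void main(String[] args) {
        String sql = "SELECT Agent_Name, COUNT(Call_Id) FROM REPORTREPOSITORY("
            + " 'Call Centre Reports/Call Details by Agent',"
            + " '2007-01-01 00:00:00.000', '2007-03-31 23:59:59.999')"
            + " GROUP BY Agent_Name";
        find(sql).ifPresent(call -> System.out.println(
            call.reportPath + " from " + call.from + " to " + call.to));
    }
}
```

Once the path and date range are known, the script would ask the report repository for the matching instances and hand the extracted rows back as a result set.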

A week of choices

Originally posted on SUN, MAY 6, 2007 AT 18:36

This past week has been one of choices. I had to sit down and decide what technologies to use in a project that is well underway. The project required a number of services:


  • Data access
  • Security
  • Presentation layer


Before I could really settle on frameworks and specific technologies, I needed to select a language. I wanted to use open source technologies, and of course, I didn't want to learn a new language during this particular project. For me, this narrowed the field to one of:


  • J2EE
  • PHP
  • Rails


Another goal was to use a framework that provided a lot of functionality, since I don't want to have to roll my own for things like data access and security. I'm sure there are PHP frameworks that provide all or most of the functionality I expect, but I haven't used any of them, so maybe next time. Ruby has great data access and data modeling thanks to ActiveRecord, but I still feel like a bit of a newb with other Rails and Ruby technologies. This leads me to J2EE, which gives me options including:




I find Spring MVC easier to use than Struts, and I was recently involved in a project using Spring, so I've chosen Spring MVC + JSTL. I am also interested in looking into the idea of using Spring's XsltView as a REST-style web service.



For the presentation layer, I really vacillated between using Flex and using XHTML+CSS+Javascript. I finally decided to go the XHTML route because I want to aim for a completely zero-client presentation. I am very tempted to use SVG, but I have concerns around portability: Adobe Viewer in some browsers, native SVG in other browsers. As well, I doubt that a majority of my users will already have the Adobe SVG Viewer installed, or alternately that they can be persuaded to use only Firefox or Safari.



I need to provide some charts for data visualization, and I will use PlotKit, as a starting point at least. Again, this is motivated by the desire for zero-client. PlotKit also has the added advantage of negotiating between SVG and HTML Canvas depending on browser and/or object detection.



Data access was pretty easy given the choice of Spring. For this project I will forgo Hibernate and just use the JDBC classes in the Spring framework, since they are nearly as easy to use most of the time, and considerably easier to use some of the time. For the persistence layer, I will be supporting the usual suspects: MySQL, Oracle, MS SQL and PostgreSQL, in that order.



For security, I've selected Acegi Security, since it works well with Spring, is flexible, lets me change my mind and doesn't mangle my code. In less than four hours of reading and typing, I was able to integrate pretty sophisticated security that I probably won't have to mess with until much later in the development cycle.

Medians and Percentiles

Originally posted on SAT, MARCH 24, 2007 AT 08:24

Here is a really nice visualisation in the annual compensation report posted on aiga.org.





I like the use of a bar graphic to communicate the relative position of the salary ranges being compared, as well as to indicate the median. The tabular data is similarly information-rich, showing not only the median value but also the 25th and 75th percentiles.
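
For anyone wanting to produce the same three numbers from their own data, here is a small Java sketch. It uses linear interpolation between the closest ranks, which is only one of several common percentile conventions; I don't know which one the AIGA report uses. The Percentiles class and the salary figures are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper: percentile by linear interpolation between closest ranks.
final class Percentiles {

    // Returns the p-th percentile (0 to 100) of the given values.
    static double percentile(double[] values, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        if (p < 0 || p > 100) {
            throw new IllegalArgumentException("p must be between 0 and 100");
        }
        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);

        // Fractional position of the percentile in the sorted array
        double rank = (p / 100.0) * (sorted.length - 1);
        int lower = (int) Math.floor(rank);
        int upper = (int) Math.ceil(rank);
        double fraction = rank - lower;
        return sorted[lower] + fraction * (sorted[upper] - sorted[lower]);
    }

    public static void main(String[] args) {
        // Made-up salaries, not taken from the report
        double[] salaries = {80000, 40000, 60000, 50000, 70000};
        System.out.println("25th: " + percentile(salaries, 25));    // 50000.0
        System.out.println("median: " + percentile(salaries, 50));  // 60000.0
        System.out.println("75th: " + percentile(salaries, 75));    // 70000.0
    }
}
```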



This is a great way to communicate a lot of information. The use of a spatial axis to represent the rankings and relations of a range of values is really interesting. It's nice to have a bit more richness to add to non-goal-bearing metrics (honestly, trend is usually not so interesting for smaller-grain metrics -- do you really care if your sales reps' call-to-opportunity ratio is up .0003% from yesterday?).





This map, not so much. Maps are great, but the bubble graph is a little hard to read. It's hard to tell the difference between the sizes of the bubbles, especially in the middle of the range. A tool tip with a number would be clearer.