The SpudSoft BIRT Excel Emitters

There are, to my knowledge, five existing emitters for BIRT that output reports in a format understood by Excel. Unfortunately all four of them have significant issues for my purposes:

  • The built-in emitter.
    The built-in emitter does not output Excel format files, it outputs files in an XML format that Excel can understand, as a result newer versions of Excel complain whenever the files are opened.
    The files also have problems with page layout that I could not work around (specifically wide reports would be cut off) and do not display charts or logos.
  • The Tribix emitter.
    The Tribix emitter solves many of the problems with the built-in emitter, but tries too hard to produce an output document that looks identical to the original report.
    As a result the output spreadsheets contain a lot of very small rows/columns that are introduced purely to make the content line up visibly as it would on a PDF. Unfortunately this results in headers appearing in one column with data appearing in another - sparse crosstabs are particularly bad in this regard with data failing to line up - which all means that the output from the Tribix emitters is not actually very useful as a spreadsheet.
  • The Arctorus emitters.
    When I last tried them, the Arctorus emitters behaved suspiciously similarly to the Tribix emitter, but also support XLSX and cost money.
    It is my understanding that their v3 emitters no longer attempt pixel-perfect layout, and that may make them the best around, but they still cost money and I believe they still create/merge rows/columns for layout.
  • The Actuate 11 emitter.
    Unfortunately I haven't tried the new Actuate 11 emitter, because it comes as part of the Actuate 11 system that is priced out of our range.
  • RameshS' emitter
    A fellow entrant in the Getting Started with BIRT contest.
    My understanding is that RameshS' emitter is based on the built in source code, but writing to native XLS files (I don't think it handles XLSX, but it's not clear from my reading).
    I have not tried this emitter, but my expectation is it still creates/merges rows/columns for layout, and its use of Excel formatting seems to be more primitive than mine (images are restricted to a single cell, auto row heights rely on Excel behaviour) - but it claims to have smaller file sizes than Tribix, which may mean it beats me too.
    If your only complaint about the built in emitter is the file format then give this a go.

So, after discovering Apache POI and realising that BIRT emitters are not so complicated after all, I set about writing an Excel emitter that met our needs.
The design aims for the SpudSoft BIRT emitter are:

  • It must be useful as a spreadsheet, cells should all line up, there should be no blank rows/columns for formatting, there should be no page breaks.
  • It must open in Excel without complaint, preferably support both XLS and XLSX files.
  • It must display images (logos and charts are required).
  • It must include any formatting that does not clash with the other requirements.
  • It must work in the BIRT runtime environment, as well as in eclipse.

Known Limitations
Obviously, given that I'm explicitly favouring spreadsheetiness over formatting, there are going to be some limitations:

  • BIRT allows you to put multiple things into a cell (i.e. two different labels), my emitter will try to output the content of both, but the formatting will only be an amalgamation.
    In particular this means that nested tables (or tables within grids) will not display correctly.
    I'm open to suggestions as to how to resolve this, but I haven't actually got any reports that use nested tables yet, so I haven't worried about it.
  • BIRT allows you to have two tables next to each other, horizontally or vertically, with different row/column height/widths, on a spreadsheet I should just take the maximum.
  • Formatting will be handled on an as-needs basis - when I have a report that needs support for something I'll make sure I can have it :)
  • Column widths will often be left at their default, rather than trying to persuade things to fit (because headings and images often overlap).

If you find more limitations or other issues please let me know (either via this blog or by filing an issue on bitbucket), and if can't live with these limitations let me know how you think I should resolve them and I'll see what I can do.

In short, my emitter will produce good results with all the reports I've seen (better, I hope, than BIRT or Tribix), but I could easily produce a report that it could not cope with at all (that BIRT and Tribix probably could cope with).
Download
The emitter is all GPLv3 and the source is available on BitBucket.
The binary is available from https://bitbucket.org/yaytay/spudsoft-birt-excel-emitters/downloads.
And instructions for installation are available here.

Before throwing that binary directly into your production environment you are warned to take good notice of the version number!
You could well be the first person other than myself to use the emitter, it is bound to have faults ranging from slight limitations to major crashers.
If you do find issues with it please report them at https://bitbucket.org/yaytay/spudsoft-birt-excel-emitters/issues.
Update v0.3
Version 0.3 generates new Excel sheets in response to explicit page-breaks.
Sheets will take their names from the last named table seen on that page (if there are no named tables and there is only a single sheet that sheet will be named with the title of the report).

There is a bug in BIRTs RenderTask that prevents page-break-pagination working. There are two ways around this:

  • The simple option is to use the RunAndRenderTask, which does paginate correctly.
  • If you need to use a separate RenderTask the only option I could find is to write a custom RenderTask.

Update v0.4
Version 0.4 has been through a substantial refactor in order to make it possible to implement nested tables without having spaghetti code.
The result is more explicitly an FSM than it was previously, with state being represented by an instance of a "handler" class - typically a new handler is created in a startThing method call and removes itself in the corresponding endThing method call.
The main changes on top of the refactor are:

  • Nested Tables
    Currently these only work if the child table is the only thing on given row (more specifically, there must be no other cells on the same row, so the cell must be merged across columns). This is a significant limitation, but does allow a great many designs to work that couldn't work before.
    Note that my emitter still does not (and never will) merge or insert columns to try to make things line up.
  • Rotated Text
    Large crosstabs are often unwieldly in spreadsheets because the column headings tend to be quite long, meaning that columns are wide and thus very little of the sheet is visible at any time.
    The fix for this in the spreadsheet world is to rotate the column headings to an angle so multiple headings overlap and the columns can be much narrower.
    This option is now available from my emitter.
    BIRT does not have any built in option for rotating text, and any generic solution would have to generate the text as an image, which is not much use for a spreadsheet.
    To inform the emitter that text should be rotated you have to set a User Property called "ExcelEmitter.Rotation" on the text/label control that is to be rotated, for example:
    [sourcecode language="xml"]
    <label id="57">
        <list-property name="userProperties">
            <structure>
                <property name="name">ExcelEmitter.Rotation</property>
                <property name="type">string</property>
            </structure>
        </list-property>
        <property name="ExcelEmitter.Rotation">60</property>
        <property name="textAlign">right</property>
        <text-property name="text">Flights</text-property>
    </label>
    [/sourcecode]
  • Auto Column Widths
    An oft-requested feature from my No1 client, and one that I've been putting off because reports as spreadsheets often don't work with auto-column widths as they often have titles in the early rows that are expected to flow over multiple cells (without using merged cells).
    My solution is to only base auto column widths on cells in the details band of tables, and for performance reasons I only consider the first 4 rows there.
    Auto column widths will not make columns narrower (if you want them narrower specify the width in the report design) and will not be used at all if the width is specified in the report design.
  • Row Grouping
    A simple change - table groups in the report are now configured as row groups in Excel.

Update v0.5
A bit anti-climactic after version 0.4, version 0.5 has a bug fix to set the top and bottom margins correctly and one new feature:

  • Force Auto Column Widths
    After adding support for auto column widths and turning it off by default I go and find a situation where it needs to be be turned on - specifically if you have columns that you want hidden unless they contain data. By forcing auto column width calculations the column widths can be set to zero and they will only become wider if there is data to fill them.
    As with text rotation, this is controlled by a User Property, this time a boolean property called "ExcelEmitter.ForceAutoColWidths".

Footnote
Finally I'd like to thank my employer, GTI Recruiting Solutions, for permitting me to release this.