Last week I had a regular Excel/Access user approach me with a problem:
I have 40 sale campaigns. I need to put together a quick presentation (as a PDF) which will show 2 key metrics (New and Loss revenue) for each of the 40 sale campaigns. Do you know of a quick way I can produce this presentation?
I thought this would the perfect opportunity to show off Pentaho Report Designer (PRD) as an alternative to Mircosoft Excel, Access or other big BI vendors. Below is a step by step guide on how I accomplished this.
The above statement sums up the users requirements, visually the the report needed to look like this:
After I knew the structure of the report I got the user to export the data into an Excel Worksheet – this was going to be my data source – after this it took me only 5 minutes to create the presentation using PRD.
The end user was able to produce a raw extract of the data which the report needed to be based on. Here is a preview of the data:
- Campaign: Campaign identifier which duplicates for each week row
- Week: Week ending date which is for 2010
- New_09, Loss_09: Revenue for 2009
- New_10, Loss_10: Revenue for 2010
To use an Excel worksheet as a data source make sure the Data tab is active, right click on the Data Sets icon and select the Table option:
After clicking the Table option a Table Datasource Editor window is displayed:
Step 1. Create a Query.
The first step is to create a query which will store the data set. Click on the (Add Query) icon and then specify a name, in this example I’m naming the query dataset.
Step 2. Select the Excel Worksheet.
The next step is to select the Excel worksheet which contains the data set. Click on the icon located in the bottom left hand corner to open up a file browser window, select the Excel file which contains the data set and click OK. Your data set should now be visible (as my Excel worksheet has headers I have checked the Use First Row as Header option).
At the moment you can only use .XLS files so if you are using Office 2007+ make sure you save the extension as .XLS instead of .XLSX and if your .XLS file contains formulas they will not be recognised by PRD so be sure to paste special as values.
Step 3. Manipulate the Data.
You can add empty rows and columns or remove rows and columns by using these icons:
You are also able to change the column headers and data type by double clicking on the column header:
In this example I didn’t need to make any modifications to the data however I did have trouble converting the Week_Ending field into a date format so in the Excel worksheet I converted it into a string using the TEXT() function – if anyone knows how to get this working please lleave a comment.
Once you are happy with the data set click the OK button. The data set should now be visible as fields under the Data Sets > Table > dataset icon under the Data tab:
Building the Report.
The next step was to build the report, from the report structure (above) I knew I had to create one group Campaign, and within that group add 2 charts which represented the New and Loss Revenue for 2009 and 2010.
Step 1. Add the Group.
To get started add the Campaign field as a group by right clicking on the Groups item under the Structure tab and selecting the Edit Group .. option:
After clicking the Edit Group .. option an Add Group window will appear, specify a name for the group i.e. Campaign and using the arrows move the field over which defines the group which in this example is Campaign and click OK when done.
The Campaign field will now be defined under the Attributes tab for the Group item:
The next step is to make the Group Header and Details Header are visible on the report canvass. Click on the Groups > Group Header item under the Structure tab and and under the Attributes tab uncheck the hide-on-canvass option:
Repeat the above steps for the Groups > Details Body > Details Header item as well:
Step 2. Functions.
I will need to create three functions; one which will convert the Week_Ending field into a date and the other two will be display dynamic chart titles.
Converting the Date.
As I mentioned above I had some issues with letting PRD convert the Week_Ending field into a date from the Table Datasource Editor, so to get around this I created a formula which would take the Week_Ending field (as a string) and convert it into a Date.
To do this create a new function by right clicking on the Functions option under the Data tab and click the Add Function… option.
From the Add Function… window expand the Common folder and select the Open Formula option and click the Add button:
This will add an Open Formula function under the Functions option (seen under the Data tab) :
To change the name of the function (named by default as FormulaExpression1) click on the function and in the area below change the value that is set for the Function Name, in this example I named the function New_Week_Ending:
To edit the formula of the function click on the icon which is displayed when you click in the Formula value area, this will display a Formula Editor window:
As my date is in the current format YYYY-MM-DD I needed to use the DATE function with a combination of LEFT, MID and RIGHT functions to convert it into a date. I won’t go into too much detail with this section as all these functions are pretty self explanatory and are well documented in the Formula Editor but the final formula looked like this:
Click the OK button to save the New_Week_Ending formula. I have just created a function which will convert the Week_Ending (string) field into a New_Week_Ending (date) field so that the X-Y Chart TimeSeries Collector will now recognise it as a date.
Dynamic Chart Titles.
Each chart needs to have a dynamic title which will output the campaign and the measure which is represented i.e. 1 – New Revenue ($). As I have two charts for each campaign I will need to create two functions: one for the New Revenue chart and one for the Loss Revenue chart.
Following the steps outlined above to create two new functions (with the names sNewChart and sLossChart) using the Open Formula type and specify the formulas to be:
# New Revenue Charts (sNewChart) =[Campaign] & " - New Revenue ($)" # Loss Revenue Charts (sLossChart) =[Campaign] & " - Loss Revenue ($)"
I have now created three functions:
Step 2. Add the Group Header.
To add the Campaign as the Group Header drag and drop the Campaign field from the Data tab onto the Group Header area and apply some formatting:
Step 3. Add the Charts.
Each chart needed to look similar to this mockup (I have already created the Chart Title and New_Week_Ending functions):
Drag and drop a chart object from the panel located on the left hand side onto the Details Header section and double click the object to open up the Edit Chart (properties) window . To create a X-Y Chart which uses a TimeSeries Collector make sure you Edit Chart (properties) window looks like this:
After setting up all the options the Edit Chart (properties) window now look like this (for the New Revenue measure):
- chart-title-field: Field or function which is the chart title i.e. sNewChart or sLossChart
- Primary DataSource
- category-time-option: Value of the x-axis, this must match the amount of series you have in the value-column option i.e. New_Week_Ending (twice)
- time-period-type: Type of period the above field is i.e. Week and New_Week_Ending is an increment of weeks
- value-column: Field(s) which contain the y-axis, you can specify more than one as there can be multiple series i.e. New_09 and New_10 or Loss_09 and Loss_10
- series-by-value: Name of the each series specified for the above option i.e. 2009 and 2010
- reset-group: Reset the charts after a specific group(s) i.e. Campaign
There was alot of extra formatting on the charts so be sure to check out the available .PRPT file if you interested in what they look like.
After I created the chart for New Revenue I just copied and pasted the chart object and made changes to the properties to create the Loss Revenue chart:
The final report structure looks like this:
There are a lot of options in the charting library of PRD so I suggest you pick up Will Gorman’s book Pentaho Reporting 3.5 for Java Developers which has a couple of chapters which cover the charting library in detail.
Step 4. Add the Page Break.
The last step was to add a page break after each Campaign group. To do this click on the Details Body item under the Structure tab and check the pagebreak-after option:
After previewing the report the final output as a PDF looks like this (you can download the actual PDF file below):
This post is very detailed as it covers some interesting areas but as this was an ad hoc request it took me under 5 minutes to whip together this report and have it in the user’s inbox. Do you have any examples of how you handled ad hoc requests with PRD? If so, leave a comment or send me an email.
One user converted n to go! Enjoy.