Overview of the Jaspersoft BI Suite
The Jaspersoft BI Suite
The goal of this article is to review the Jaspersoft BI suite and walk through how it can help you with your business goals. But before we begin, let's review the problem space we are working in.
Data and Facts
Everyone embarking on a reporting project usually has a common objective, which is to use the data they have to make better business decisions.
Data is what your systems collect through users and transactions. Facts, by definition, are a 'truth known by actual experience or observation'. Business decisions are made (or should be made) on the basis of such facts. The goal of reporting is to present data in such a way that these truths can be detected. Often we look for trends or patterns in the data.
A clear trend can be used to support a fact about the data, which can then be used to initiate a major business decision. When you look at it this way, reporting is more than flashy graphs and charts. It's a search for the truth (cue the X-Files theme song). Such a noble task deserves some worthy tools. This is where the Jaspersoft BI suite comes in.
Types of Reports
In my experience reporting breaks down into two different categories:
- Canned Reports
- Self-Service Reporting
In addition to these, there is Analysis, which will be discussed on its own. Let's talk about each.
Canned Reports
These reports are answers to very specific questions:
Show me the total sales this month, the total costs and a list of the top sales reps.
Users requesting these types of reports usually know how they want to see the data visualized as well:
I would like to see sales and cost both displayed on the same line chart and a table listing the top sales reps sorted by sales revenue.
The user requesting this may even come to your desk with a mock-up and tell you:
I want it to look like this PDF/PowerPoint/Excel.
There are often inputs required by a user before the report runs that can be used to make changes to the layout and/or constrain the data set displayed.
A simple example of this is getting the transaction history for your credit card online (you need to specify a start and end date for the transaction range). The big thing you can't do as a user is redesign the report as you see fit. Canned reports are usually designed with tools that often require some skill to master.
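To make the credit card example concrete, here is a minimal sketch of a canned report in Python using the standard library's sqlite3 module. The table name, column names, and sample rows are all made up for illustration. The point is that the query and layout are fixed by the designer, and the only thing the user controls is the date range.

```python
import sqlite3

def transaction_history(conn, start_date, end_date):
    """Fixed query and fixed layout; only the date range comes from the user."""
    rows = conn.execute(
        "SELECT tx_date, description, amount FROM transactions "
        "WHERE tx_date BETWEEN ? AND ? ORDER BY tx_date",
        (start_date, end_date),
    ).fetchall()
    lines = [f"Transactions from {start_date} to {end_date}"]
    lines += [f"{d}  {desc:<12} {amt:>8.2f}" for d, desc, amt in rows]
    lines.append(f"Total: {sum(r[2] for r in rows):.2f}")
    return "\n".join(lines)

# Hypothetical sample data standing in for the card issuer's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (tx_date TEXT, description TEXT, amount REAL)")
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", [
    ("2011-01-05", "Groceries", 42.50),
    ("2011-01-20", "Fuel", 30.00),
    ("2011-02-02", "Books", 18.25),
])

report = transaction_history(conn, "2011-01-01", "2011-01-31")
print(report)
```

The user can change the inputs, but not the columns, the sort order, or the total line. That is what makes it canned.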
Self-Service Reporting
With self-service reporting the client is given a data set to work with (for argument's sake, let's say a list of columns from a database) and a 'simple-to-use' UI that allows them to create the layout of their report and even decide how the data should be grouped, sorted, and summarized. Typically these design tools are aimed at the business user and require less technical skill to use.
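As a rough illustration of the idea (not of how any Jaspersoft product implements it), the sketch below lets the "user" pick a grouping column, a summary function, and a sort direction from a published set of choices, and builds the query from those picks. The table, columns, and function names are hypothetical.

```python
import sqlite3

# The data set published to the business user (assumed names).
ALLOWED_COLUMNS = {"region", "rep", "product"}
ALLOWED_SUMMARIES = {"SUM", "AVG", "COUNT"}

def build_query(group_by, summary, descending=True):
    """Turn the user's choices into SQL, accepting only published choices."""
    if group_by not in ALLOWED_COLUMNS or summary not in ALLOWED_SUMMARIES:
        raise ValueError("choice is not part of the published data set")
    order = "DESC" if descending else "ASC"
    return (f"SELECT {group_by}, {summary}(revenue) AS value FROM sales "
            f"GROUP BY {group_by} ORDER BY value {order}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, product TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("East", "Ann", "Widget", 100.0),
    ("East", "Bob", "Gadget", 50.0),
    ("West", "Cy", "Widget", 200.0),
])

# Two different "reports" from the same data set, designed by the user.
by_region = conn.execute(build_query("region", "SUM")).fetchall()
by_rep = conn.execute(build_query("rep", "SUM", descending=False)).fetchall()
```

The designer's job shifts from building each report to deciding which columns and summaries are safe to expose.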
Analysis
I'm listing this as its own section of the decision-making problem. With Analysis we are studying the data looking for relationships, but we may not be totally sure which relationships we are interested in. The requirements are a bit more 'fuzzy' and the data usually needs to be organized in a hierarchy to allow for the 'slicing and dicing' of aggregated data. Layout and appearance may be less of a concern here. It's more about the data and the hierarchies within it. Users interested in Analysis are often very proficient with data, often using statistics or other algorithms to understand the trends they are evaluating. They often know their data model extremely well and want a tool that will let them explore it with ease, from high-level aggregations down to the individual records making up those aggregations.
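Here is a small sketch of what moving from high-level aggregations down to individual records can look like, assuming a made-up year > quarter > month hierarchy over a handful of invented sales records. The function names are mine, not part of any tool.

```python
from collections import defaultdict

# Hypothetical records: (year, quarter, month, amount).
# The first three fields form the hierarchy year > quarter > month.
RECORDS = [
    (2010, "Q1", "Jan", 120.0),
    (2010, "Q1", "Feb", 80.0),
    (2010, "Q2", "Apr", 50.0),
    (2011, "Q1", "Jan", 200.0),
]

def roll_up(records, depth, path=()):
    """Sum amounts at hierarchy depth `depth`, keeping only records under `path`."""
    totals = defaultdict(float)
    for rec in records:
        if rec[:len(path)] == tuple(path):
            totals[rec[:depth]] += rec[-1]
    return dict(totals)

def drill_through(records, path):
    """Return the individual records behind one aggregated cell."""
    return [rec for rec in records if rec[:len(path)] == tuple(path)]

print(roll_up(RECORDS, 1))                 # totals per year
print(roll_up(RECORDS, 2, path=(2010,)))   # expand 2010 into quarters
print(drill_through(RECORDS, (2010, "Q1")))
```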
With all this in mind, let's look at each component in the Jaspersoft stack and what part of the business problem it addresses.
JasperReports
This is the engine of the stack. It's an API (Application Programming Interface) that performs three major tasks:
- Compile a report template
- Fill a report with data
- Export the report to a client viewable format
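The three tasks above can be pictured as a small pipeline. The sketch below is a toy model in Python with invented function names and a made-up one-line template format; it is not the JasperReports API. In the Java library these roles are played by classes such as JasperCompileManager, JasperFillManager, and JasperExportManager.

```python
import csv
import html
import io

def compile_template(source):
    """Compile: parse the template text once into a reusable object."""
    title, columns = source.split("|")
    return {"title": title.strip(),
            "columns": [c.strip() for c in columns.split(",")]}

def fill(compiled, rows):
    """Fill: combine the compiled template with data into a format-neutral document."""
    cols = compiled["columns"]
    return {"title": compiled["title"], "columns": cols,
            "rows": [[row[c] for c in cols] for row in rows]}

def export_csv(doc):
    """Export: render the filled document as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(doc["columns"])
    writer.writerows(doc["rows"])
    return buf.getvalue()

def export_html(doc):
    """Export: render the same filled document as an HTML table."""
    head = "".join(f"<th>{html.escape(c)}</th>" for c in doc["columns"])
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in doc["rows"])
    return f"<h1>{html.escape(doc['title'])}</h1><table><tr>{head}</tr>{body}</table>"

compiled = compile_template("Top Sales Reps | rep, revenue")
filled = fill(compiled, [{"rep": "Ann", "revenue": 100},
                         {"rep": "Cy", "revenue": 200}])
print(export_csv(filled))
print(export_html(filled))
```

Keeping the filled document separate from its exporters is what lets one report be delivered in several formats without re-running the query.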
The report we are referring to is a JRXML (JasperReport XML) file. It can be created in several ways; however, iReport (see the next component in the stack) is by far the best and easiest.
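To give a feel for what a JRXML file contains, here is a hand-written fragment (an illustration only, with no bands or layout, and not validated against any JasperReports version) along with a few lines of Python that read out its parameters, fields, and query. The report name, parameter names, and query are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"jr": "http://jasperreports.sourceforge.net/jasperreports"}

# Hand-written fragment for illustration; a real report would also define bands.
JRXML = """<jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports"
              name="transaction_history" pageWidth="595" pageHeight="842">
  <parameter name="START_DATE" class="java.util.Date"/>
  <parameter name="END_DATE" class="java.util.Date"/>
  <queryString><![CDATA[
    SELECT tx_date, amount FROM transactions
    WHERE tx_date BETWEEN $P{START_DATE} AND $P{END_DATE}
  ]]></queryString>
  <field name="tx_date" class="java.util.Date"/>
  <field name="amount" class="java.math.BigDecimal"/>
</jasperReport>
"""

root = ET.fromstring(JRXML)
parameters = [p.get("name") for p in root.findall("jr:parameter", NS)]
fields = [f.get("name") for f in root.findall("jr:field", NS)]
query = root.find("jr:queryString", NS).text.strip()

print("inputs the user must supply:", parameters)
print("columns the report reads:", fields)
```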
There are open source and professional versions of this API. Both are covered under the LGPL license.
JasperReports (JR) API Professional contains Flash-based components (charts, maps and widgets). The open source API does not contain support for these.
For our business goals above, JR would be the engine. It could be included in a custom application, with code written to leverage its functionality and provide canned reports in the application.
JasperReports is very extensible. You can easily create your own data source or extend an existing one. Here are the current data source implementations (JDBC, XML, CSV, JavaBeans, and more):
As for producing outputs, the following implementations of JRExporter are available. Once again, you can extend one of these or create your own. Here is what is currently available (PDF, HTML, Excel, CSV and more):
The project contains some samples that make good starting points. The samples show how to use all the major components and contain documentation explaining each:
JR has a very healthy forum community with regular postings by the contributors to the project. This is truly a first-rate API:
iReport
This is the report designer of choice when designing reports for JR. iReport is a desktop application built on the NetBeans Platform (it can also run as a plugin for the NetBeans IDE).
Keeping your business problem in mind, iReport can be used to design the canned reports and to deploy and update them on JasperReports Server, where clients can access them (see the next component).
Similar to Crystal Reports, it's a band-based reporting solution, meaning the report template contains bands and users can use the designer to drag elements into those bands and configure how those elements interact with data.
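If bands are new to you, the toy renderer below may help. It is a sketch under my own assumptions (the band names loosely mirror typical report bands; the data and formatting are made up): the title and column header bands print once, the detail band repeats for every row of data, and the summary band prints once at the end.

```python
# Hypothetical band definitions; each band is just a format string here.
BANDS = {
    "title": "Top Sales Reps",
    "column_header": f"{'Rep':<10}{'Revenue':>10}",
    "detail": "{rep:<10}{revenue:>10.2f}",
    "summary": "{label:<10}{total:>10.2f}",
}

def render(bands, rows):
    """Print-once bands bracket a detail band that repeats per data row."""
    lines = [bands["title"], bands["column_header"]]
    for row in rows:
        lines.append(bands["detail"].format(**row))
    total = sum(row["revenue"] for row in rows)
    lines.append(bands["summary"].format(label="Total", total=total))
    return lines

ROWS = [{"rep": "Ann", "revenue": 100.0}, {"rep": "Cy", "revenue": 200.0}]
report_lines = render(BANDS, ROWS)
print("\n".join(report_lines))
```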
Examples of elements are: Charts, Graphs & Crosstabs.
iReport supports most of the data sources that JR supports. It gives developers an easy UI to configure JR's behavior and test and configure exports.
iReport also comes in an open source and a Professional version. The Professional version contains animated charts powered by Flash. There are also interactive, Flash-based widgets and maps that can be used to display data.
iReport community and Pro also come with a plugin allowing for easy deployment to JasperReports Server, which is the next component we will discuss.
A good place to start with iReport is the podcasts made by its creator Giulio Toffoli:
The project page can be found here:
As a side note, the tool is being redone on the Eclipse platform. Here is the project page for this:
JasperReports Server
JasperReports Server (JRS, formerly called JasperServer) is a web-based application. Its base functionality is to provide a database-driven repository for deploying reports created in iReport.
It's a secure application that allows users to log in and execute or schedule reports on demand; the same operations can also be performed through web services. In reference to the business problem we started with, JRS can:
- Host canned reports - users can execute them on demand, choose outputs to view them in, schedule for later executions
- Provide Self-Service reporting with the commercial edition of JRS using the Adhoc Report designer (works with relational DB tables)
- Provide some basic analysis with the commercial edition of JRS using the Adhoc Report designer (works with relational DB tables)
- Provide an easy-to-use dashboard designer allowing users to combine canned reports into a single view
- Provide very robust analysis with Jaspersoft OLAP (see next component)
For a full review of JRS please check this article.
Jaspersoft OLAP
Part of the potential functionality of JRS is Jaspersoft OLAP (formerly called JasperAnalysis), which gives users full data analysis capabilities. The underlying data source must be relational DB tables, and the tables need to be organized into a star or snowflake schema (basically a central fact table whose rows join to dimension tables that describe the facts).
This often requires some ETL (extract, transform, load) to get the data from more normalized schemas into a flatter star schema.
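For readers who have not worked with a star schema, here is a tiny one built in an in-memory SQLite database. All table names, column names, and rows are invented for illustration. The fact table holds the measurements and foreign keys; the dimension tables describe them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the facts.
CREATE TABLE time_dim    (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, category TEXT, name TEXT);
-- The central fact table: one row per sale, keyed to each dimension.
CREATE TABLE sales_fact  (time_id INTEGER REFERENCES time_dim,
                          product_id INTEGER REFERENCES product_dim,
                          amount REAL);
INSERT INTO time_dim VALUES (1, 2010, 'Q1'), (2, 2010, 'Q2'), (3, 2011, 'Q1');
INSERT INTO product_dim VALUES (1, 'Hardware', 'Widget'), (2, 'Software', 'App');
INSERT INTO sales_fact VALUES (1, 1, 100.0), (2, 1, 50.0), (2, 2, 25.0), (3, 2, 75.0);
""")

# Slice the facts by two dimensions at once: year and product category.
totals = conn.execute("""
    SELECT t.year, p.category, SUM(f.amount)
    FROM sales_fact f
    JOIN time_dim t    ON t.time_id = f.time_id
    JOIN product_dim p ON p.product_id = f.product_id
    GROUP BY t.year, p.category
    ORDER BY t.year, p.category
""").fetchall()
print(totals)
```

Every analytical question becomes the same shape of query: join the fact table to the dimensions you care about and aggregate.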
Mondrian is the OLAP (online analytical processing) engine used by Jaspersoft OLAP. Setting up an OLAP view requires the following steps:
- Get the underlying data into a star or snowflake schema
- Create a Mondrian schema file (XML file) that specifies the Facts, Dimensions (and their hierarchies) and the underlying tables that provide the data
- Deploy the schema file to JRS along with an MDX (Multidimensional Expressions, the query language for OLAP) query to give the user a data level in the hierarchy to start at
From this point the user clicks around within the interface, expanding and contracting hierarchies, adding dimensions (or filtering on them), sorting by measures, and more.
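To show roughly what the second and third steps produce, here is a hand-written schema in the style of a Mondrian schema file plus a starting MDX query. Treat both as an illustration under my own assumptions: the cube, table, and column names are made up, and the XML has not been validated against a particular Mondrian version. The Python only parses the XML to show how the pieces relate.

```python
import xml.etree.ElementTree as ET

# Illustrative schema: one cube over a fact table, one dimension, one measure.
SCHEMA_XML = """
<Schema name="SalesSchema">
  <Cube name="Sales">
    <Table name="sales_fact"/>
    <Dimension name="Time" foreignKey="time_id">
      <Hierarchy hasAll="true" primaryKey="time_id">
        <Table name="time_dim"/>
        <Level name="Year" column="year" uniqueMembers="true"/>
        <Level name="Quarter" column="quarter" uniqueMembers="false"/>
      </Hierarchy>
    </Dimension>
    <Measure name="Amount" column="amount" aggregator="sum"/>
  </Cube>
</Schema>
"""

# A starting view: the Amount measure across columns, one row per year.
MDX = ("SELECT {[Measures].[Amount]} ON COLUMNS, "
       "{[Time].[Year].Members} ON ROWS FROM [Sales]")

schema = ET.fromstring(SCHEMA_XML)
cube = schema.find("Cube")
fact_table = cube.find("Table").get("name")
levels = {
    dim.get("name"): [lvl.get("name") for lvl in dim.iter("Level")]
    for dim in cube.findall("Dimension")
}
measures = [m.get("name") for m in cube.findall("Measure")]

print(f"cube {cube.get('name')} reads facts from {fact_table}")
print("hierarchy levels:", levels)
print("measures:", measures)
```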
Under the hood the UI passes down MDX queries to Mondrian. Mondrian then decides if it can get the data from its cache (Mondrian is very effective at caching data) or if it should issue SQL down to the underlying DB to get data.
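The cache-or-SQL decision can be sketched in a few lines. This is a deliberately simplified model with hypothetical names, not Mondrian's actual caching algorithm: the first request for an aggregate goes to the database, and a repeat of the same request is answered from memory.

```python
import sqlite3

class CachedTotals:
    """Toy aggregate cache: answer from memory when possible, else issue SQL."""
    GROUPABLE = {"year", "quarter"}

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.sql_calls = 0

    def totals_by(self, column):
        if column not in self.GROUPABLE:
            raise ValueError(f"cannot group by {column!r}")
        if column not in self.cache:      # cache miss: go down to the database
            self.sql_calls += 1
            self.cache[column] = self.conn.execute(
                f"SELECT {column}, SUM(amount) FROM sales "
                f"GROUP BY {column} ORDER BY {column}"
            ).fetchall()
        return self.cache[column]         # cache hit: no SQL issued

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, quarter TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    (2010, "Q1", 100.0), (2010, "Q2", 50.0), (2011, "Q1", 75.0),
])

engine = CachedTotals(conn)
first = engine.totals_by("year")    # issues SQL
second = engine.totals_by("year")   # served from the cache
print(first, "SQL statements issued:", engine.sql_calls)
```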
This component provides the analysis functionality for our business problem. Of all the components, this one requires the most work to get up and running. Getting the schema right and getting an understanding of MDX and the Mondrian schema takes a little work. Luckily, O'Reilly offers several great training courses on the subject:
In the near future I hope to write a tutorial-style article on setting up Jaspersoft OLAP. Stay tuned.
Here is the main page for the professional version of the tool:
The community edition page can be found here as part of the community release of JRS:
Jaspersoft ETL
This brings us to the last component in the stack. Jaspersoft ETL provides ETL (extract, transform, and load). This tool is built by Talend and is based on the Eclipse platform.
In recent times I have become a massive fan of this tool. It has become the Swiss Army knife of my IT tool kit. Jaspersoft ETL provides so much more than DB operations. It can work with files and folders (it can read and write all the common formats), the underlying OS, and web services; you can even add your own Java logic, and more.
A nice part is that even though your job is designed in a visual drag-and-drop environment, Java (or Perl) code is generated in the background, which you can then export as an executable JAR (complete with all the dependencies). This JAR can be copied to the environment where the data actually lives, reducing the bandwidth needed when moving large amounts of data.
This tool is key to preparing data for all our reporting needs. Look for tutorials on this site very soon on some of the cool things you can do with this first-class tool.
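If you have never built an ETL job, the shape of one is simple enough to sketch by hand. The Python below is a hand-written stand-in, not code generated by Jaspersoft ETL; the CSV content, cleaning rules, and table name are assumptions. It extracts rows from a CSV source, transforms them (trims and normalizes names, drops rows with no amount, derives the year), and loads the result into a flat reporting table.

```python
import csv
import io
import sqlite3

# Hypothetical messy source file.
RAW = (
    "order_date,product,amount\n"
    "2011-01-05, widget ,100\n"
    "2011-02-10,Gadget,\n"
    "2011-02-11,gadget,40\n"
)

def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop incomplete rows, normalize names, derive the year."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # no amount recorded, nothing to report on
        clean.append((row["order_date"][:4],
                      row["product"].strip().title(),
                      float(row["amount"])))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the reporting table."""
    conn.execute("CREATE TABLE sales_flat (year TEXT, product TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_flat VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
cleaned = transform(extract(RAW))
load(conn, cleaned)
loaded = conn.execute("SELECT year, product, amount FROM sales_flat").fetchall()
print(loaded)
```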
The community edition of Jaspersoft ETL can be downloaded here:
Hopefully this will give you an idea of what Jaspersoft has to offer.