How to Design Software — Report Generators
Learn how to create a reusable report generation system using the template design pattern
Reports. It’s often one of the first features developers are asked to build by the business. Once someone has data, what do they want to do with it?
That’s right — view it.
Engineers have problems with reports
Many engineers approach reports in a use-case-specific way.
They’ll get a requirement such as “create a .csv report that displays all purchased items from the store” and they’ll go and create that particular feature.
The next time they may get a request to “create a .csv report of all users belonging to the account” and they’ll go and create that feature as well.
As time goes on, the business asks for different formats, different fields, etc. These requests add up, and after a few iterations, the system ends up with more than a dozen report implementations, all with their own bugs and quirks.
Worse yet, if any major change is needed, such as customizability or a performance improvement, it has to be made over and over (and likely slightly differently each time). That leads to significant implementation and maintenance overhead just to generate a comma-separated list! But it doesn’t have to be this way. There are some simple techniques you can use to avoid this pain altogether.
What Is a Report?
First, let’s understand — what is a report, exactly? Every report has these four basic components:
data records
data values
labels
format
Data records
Data records are the actual data you use to populate your report. Maybe they are a bunch of events in your system. Perhaps they are transaction data. They are likely pulled from your database.
The data is usually scoped by some parameter, such as a date range, to limit which records are included.
Data values
It’s not enough to have just the data as a whole. People are often interested in specific data points — specific fields of the data. A financial person may only be interested in dollar amounts of a settlement, whereas a security monitor may be interested in an address mismatch status.
Labels
Labels provide meaning to the data. They are what distinguishes this…
5 | 3 | 230 | 40
…from this:
id | event_id | amount_paid_cents | fee_paid_cents
5 | 3 | 230 | 40
Labels turn meaningless values into data and provide human-understandable context to otherwise arbitrary values.
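As a minimal sketch of this idea in Python (the labels and values come from the example above; the variable names are illustrative only), pairing each label with the value in the same position turns a bare row into a self-describing record:

```python
# Labels and values taken from the example above.
labels = ["id", "event_id", "amount_paid_cents", "fee_paid_cents"]
values = [5, 3, 230, 40]

# Pair each label with the value in the same position.
labeled_row = dict(zip(labels, values))

# The value can now be looked up by meaning rather than by position.
print(labeled_row["amount_paid_cents"])
```

Without the labels, a reader would have to know that the third column holds the amount paid; with them, the row explains itself.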
Format
Finally, there’s the format of the report. The format can be anything; which one is needed ultimately depends on the consumer. If the consumer is an API, the format might be JSON.
If the consumer is an FTP drop-off or a person intending to import it into another system, it may be a CSV. If the consumer is a data person interested in slicing and dicing, it may end up as an Excel spreadsheet or even a set of database import commands.
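To illustrate that the format is independent of the data, here is a rough sketch that renders the same labeled rows as either CSV or JSON using Python's standard library. The function names `to_csv` and `to_json` are hypothetical, chosen only for this example:

```python
import csv
import io
import json

labels = ["id", "event_id", "amount_paid_cents", "fee_paid_cents"]
rows = [[5, 3, 230, 40]]


def to_csv(labels, rows):
    """Render a header line followed by one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(labels)
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(labels, rows):
    """Render a JSON array with one object per row, keyed by label."""
    return json.dumps([dict(zip(labels, row)) for row in rows])


print(to_csv(labels, rows))
print(to_json(labels, rows))
```

The records, values, and labels stay the same in both calls; only the final rendering changes with the consumer.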
What Are the Steps of Generating a Report?
Now that we know the pieces, we can look at the algorithm: what is the sequence of steps needed to generate any kind of report?
Turns out, it’s not a complicated algorithm! It can be summarized in the following steps:
Setup: get the parameters for the report
Fetch: retrieve the records
Map: get a particular set of fields from each record
Convert: turn each set of fields into an entry in the report
Send: return the report to the user
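The steps above map naturally onto the template method pattern: a base class fixes the order of the steps, and each report overrides only the parts that differ. The following is a minimal sketch under that assumption, not a definitive implementation. The class names, method names, and in-memory sample data are all hypothetical, and the CSV conversion is deliberately naive (no quoting or escaping):

```python
from abc import ABC, abstractmethod


class ReportGenerator(ABC):
    """Fixes the order of the steps; subclasses fill in the details."""

    def __init__(self, params):
        # Setup: keep the parameters that scope the report.
        self.params = params

    def generate(self):
        # The template method: the same sequence for every report.
        records = self.fetch_records()
        rows = [self.map_record(record) for record in records]
        entries = [self.convert_row(row) for row in rows]
        return self.deliver(entries)

    @abstractmethod
    def fetch_records(self): ...

    @abstractmethod
    def map_record(self, record): ...

    @abstractmethod
    def convert_row(self, row): ...

    def deliver(self, entries):
        # Default delivery: join the entries into one text payload.
        return "\n".join(entries)


class PurchasedItemsCsv(ReportGenerator):
    """Hypothetical report backed by in-memory data, not a database."""

    ITEMS = [
        {"id": 1, "name": "mug", "price_cents": 900, "store": "east"},
        {"id": 2, "name": "pen", "price_cents": 150, "store": "west"},
    ]

    def fetch_records(self):
        return [i for i in self.ITEMS if i["store"] == self.params["store"]]

    def map_record(self, record):
        return [record["id"], record["name"], record["price_cents"]]

    def convert_row(self, row):
        # Naive CSV line: values are joined without quoting or escaping.
        return ",".join(str(value) for value in row)


report = PurchasedItemsCsv({"store": "east"}).generate()
print(report)
```

A second report, such as a list of users on an account, would subclass `ReportGenerator` and override the same hooks, so a change to the overall flow is made once in `generate` rather than in every report.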
Setup
The first step is setup: passing parameters into the report to adjust how it behaves. These are often used for scoping, where some collection of records needs to be filtered and reduced to a subset (e.g., by account or by date).
Fetch
Once you have the parameters for your report, you’ll want to start retrieving those records. You can apply the filters as appropriate.
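A minimal sketch of this step, assuming the parameters are an account and a date range and the records live in an in-memory list rather than a real database. `ReportParams`, `TRANSACTIONS`, and `fetch_records` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReportParams:
    """Setup: the parameters that scope the report."""
    account_id: int
    start_date: date
    end_date: date


# Hypothetical stand-in for rows pulled from a database.
TRANSACTIONS = [
    {"id": 1, "account_id": 7, "created_on": date(2024, 1, 5), "amount_paid_cents": 230},
    {"id": 2, "account_id": 7, "created_on": date(2024, 3, 9), "amount_paid_cents": 400},
    {"id": 3, "account_id": 8, "created_on": date(2024, 1, 6), "amount_paid_cents": 120},
]


def fetch_records(params, source=TRANSACTIONS):
    """Return only the records inside the scope described by params."""
    return [
        record
        for record in source
        if record["account_id"] == params.account_id
        and params.start_date <= record["created_on"] <= params.end_date
    ]


params = ReportParams(account_id=7, start_date=date(2024, 1, 1), end_date=date(2024, 1, 31))
records = fetch_records(params)
print([record["id"] for record in records])
```

In a real system the same filters would more likely be pushed into the database query, so that only the matching records are loaded.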