Excel is a great tool for quick, ad hoc analysis of data. However, business users often embed entire end-to-end processes in Excel because of the ease of making instant changes to the data, calculations, and outputs. The usual issue in replacing Excel-driven processes is that business requirements constantly change and users need to represent those changes within a few hours, not the few weeks it can take for delivery by an engineering team.
A standard solution proposed for reducing reliance on Excel
is to replace Excel VBA with Python. However, without a fundamental rethink of
the process flows – using for example the pandas library instead of Excel
formulae for data analysis – the only real change is the programming syntax. A
Python solution (e.g. from the Openpyxl library) would still use Excel’s object
library (via COM) to produce automations within Excel; and the processing would
still occur within Excel, such as refreshing pivot tables, recalculating
formulae etc. Python scripts can be version controlled in a Git; however this
can also be true of Excel add-ins written in VBA, using Git hooks. It is even
possible to use VSCode as an IDE for VBA projects, rather than using Excel’s
embedded VB Editor. Without a rethink of the overall process, a change of
programming language is not a strategic solution to the underlying data control
issues.
A strategic solution would include:
- All
data persisted in a database, where it is better controlled, more secure,
and can be analysed more easily;
- Data
flows clearly defined and configurable;
- Transparent,
reviewable, and version-controlled business logic;
- Reporting
represented in an interactive data visualisation application, such as
Power BI.
I have not yet encountered an off-the-shelf solution that
solves the spreadsheets problem; and I think that such a product and supporting
services would have enormous demand.
A Business Logic application between a data warehouse and
Excel could visually represent the business logic, enabling non-technical users
to:
- Define
and apply secondary calculations;
- Join
and group data (with controls in place to avoid dropped and duplicated
rows);
- Filter
and configure output datasets;
- Order
the process flows sequentially.
An Excel add-in could call the resulting reports to Excel
for review and analysis. The add-in could enable ad hoc data (resulting from
the review) to be uploaded into the Business Logic app, such as adjustments and
new types of reference data. All ad hoc data changes would then be tracked and
made transparent, rather than obscured in the logic of an Excel workbook.
Over time, machine learning would have a greater role in
suggesting and optimising the business logic.
No comments:
Post a Comment