Building for Iteration
Or Some Principles I’ve Learned the Hard Way
I’ve built powerful software tools for multiple purposes and companies. What makes "the best" different is that they didn’t start out powerful. Every one of those tools started with a simple implementation that allowed me to learn more about the customer and keep adding features until it was “done” enough that the user ran out of significant updates to request. They haven’t been formal Agile or Scrum developments, but I’ve borrowed from those tool-sets.Today I’m musing on how I’ve planned for iteration at different levels of the tools and how that’s felt throughout the process. The tools are mostly Excel / Google Sheets, SAP, and Visual Basic for Applications (VBA) / Google Apps Script (GAS).
I’ve discussed many of these tools before:
- Overlap tool
- Follow-up through Outlook, OneNote, and To Do
- RFQ / CBF SAP tool
- Change Board
- Shortage report
- Small business Square/Google Sheets integration
- Purchase requisition tracking
Spreadsheet-level iteration
Expandable formulas
- Above the header row might be something more machine-readable to help the formulas in each column. So above the “Sales - December” column might be a cell “12/31/2025” so that the formulas can reference an actual date with a SUMIFS or COUNTIFS formula. Maybe above the header is a MATCH function to look up a reference on another sheet so that the spreadsheet only has to calculate this once (more on speed later).
- To the left (or right) of the “main” columns are other lookup columns, with more MATCH functions, splitting a date into month or year if needed, or leaving “breadcrumbs” so that I or another user can figure out the source of data later. “This row came from SAP transaction ME49 on 1/15/2025 and affects the nose landing gear” is extremely useful when senior management asks why a particular line was included or excluded.
I also reference entire rows or columns where possible, or add a “dummy” row at the bottom (often colored red) that marks the end of my formula references. As long as new rows and columns are inserted before that row/column and below the header row, all of the formulas should expand without editing. All PivotTables are also tied to those dummy rows/columns where possible.
Personally I avoid named ranges: they can create conflicts when interacting with other files (e.g. copying a sheet into a different file) and are opaque as to what they reference, making them more error-prone.
Cell protection
As a power-user myself, I strongly prefer NOT edit-protecting sheets. Too often I’ve tried to understand why a template is giving an unexpected result, or even found a formula error, and I can’t fix it. The intent is to avoid someone less adept messing with the file and getting incorrect results, but I’ve seen problems where the person who knows the password leaves the company and then everyone is stuck with the current iteration until someone gets frustrated enough to recreate it from scratch. If you must use edit protection, leave a blank password or include the password in the “instructions” sheet (you’re writing an “instructions” sheet, right?). Non-power users will ignore it or not know how to un-protect, but power users will thank you.
Speedy lookups
Lookup and reference functions are some of the most powerful tools in spreadsheets. COUNTIFS and SUMIFS are excellent for many purposes (especially “is this line duplicated?” or “give me all of the sales in this month”, but I almost never use HLOOKUP or VLOOKUP. For speed, I prefer the combination of INDEX and MATCH. Often, I’m dealing with many lookups on a sheet and want to avoid making the spreadsheet search in every cell. So if my sheet has part numbers down the side and years across the top and I’m looking for units per month, I would use two MATCH functions and an INDEX. MATCH is the “lookup” part of a VLOOKUP or HLOOKUP: given a value and the range to check, it returns the first match’s position in the search range. So if I run a MATCH on an entire row looking for a part number, MATCH will return the row number of the first time that part number appears. And INDEX returns the value in a cell given its row and column in a range.
For example, maybe I want to pull in the part description, its quantity per shipset, unit cost, and finally the annual quantities. All of these are on the part-number row of the sheet(s) I’m looking up, so I run the MATCH once and then use that as the row input for the INDEX function. And the column function could be fixed (i.e. I just reference the “part description” column on the same sheet that I just pulled the MATCH from) or could be its own MATCH, as it might be for finding annual data.
Iterate!
Because you’ve built the sheet to be flexible, adding additional data, summaries, or functionality is much simpler. Another lookup might just be one more INDEX from an existing MATCH, or replacing all data from report “A” with data from report “B”. Keep pushing those changes to the user and see what else they ask for, and confirm that what you’re giving them has been useful so far.Macro / code-level iteration
As with formulas, the more resilient you can make your VBA / GAS to changes in both code and the underlying spreadsheet, the happier you’ll be. For spreadsheets, I strongly prefer to avoid VBA / GAS if formulas will get the job done. Code introduces more opportunity for error (or security breaches) and fewer people understand it, so you’ll be stuck maintaining it longer than if you published something with just formulas. People often run into an error in a macro and give up when they might have tried to debug a formula. BUT sometimes VBA / GAS is much better, like when pulling from different data sources or making changes to the data or summary at certain times rather than constantly with formulas.
References to the spreadsheet
In the first draft of VBA or GAS code, often I’ll include a lot of absolute references (like “get the value from cell A1 on this sheet”). I know it’s dangerous, but it’s quick and lets me show results to the user early. But as soon as the first changes or requests come in, it’s time to convert as many of those as possible to lookup functions. I often trust that sheet names will remain the same, so my sheet variables remain hard-coded. But header rows can move up or down, columns can move left or right, and the end of the data set is always in motion. Your code should look up or validate those locations (if it uses them) every time it runs!
Modules and functions
Being self-taught for coding, I’ve had to figure out principles by what I find on the internet plus trial-and-error. With so many errors in my trials, I’ve gotten better about splitting my code into separate functions or subs wherever I can afford having some variables disappear from memory. Having separate functions serves two big purposes:
- Code re-use is much easier with discrete functions or subs because they can just be imported wholesale. By the time I had built a few Excel spreadsheets that could read SAP, I had a standard VBA sub that ran the SAP transaction, pulled all of the results into an array matching the necessary output (i.e. all the columns were in the desired order) and and dropped it into a destination spreadsheet. Whenever I needed something to read SAP, I just went to my pre-made function and dropped it in. And every time I fixed a bug or added functionality, that went back into my standard sub.
- Testing the code is much faster when you don’t have to wait for everything around it to run. In my recent experience with the Little Babet code, I was testing the interfaces between Square and our Google Sheet. The first thing the “update Square with new information” function did was pull the latest Square data so that it had updated item information including SKUs and inventory counts. That meant that every time I tested my code, I had a 60-second wait while the “read” function ran perfectly before my new janky code could run and I could debug it. Things went much faster once I moved the “read Square” function from the “update Square” function to an earlier phase that I wasn’t testing but would always run first for the user.
- Trigger function
- Update Square
- Read Square data
- [Code under development to make the update work]
- Trigger function
- Read Square data
- Update Square (could run this function separately from the above)
- [Code under development]
Overall iteration
If you’ve read my posts on the overlap tool, PR tool, Little Babet sheet, shortage report, or quoting automation, you’ll see a pattern where I solved one problem and that solution surfaced several more problems. Those other problems might not have been obvious until the first one was fixed, or having the first one fixed increased either the user’s or my ambitions on what else the tool could do. The overlap tool was never going to interact with SAP until JY asked me if it could. Little Babet started as a “collect purchased-cost data and output labels as a bribe to fill out the cost data” but ended up integrating directly with Square.
Whatever you’re building, if it solves a problem for a user, you or the user will find a reason for it to do more. Build the tool to enable this as easily as possible, and build your processes to expect and celebrate building something much more capable than you would have first expected.
How do you prepare for iteration in your work? In software or hardware? Please reach out in the comments below, at blog@saprobst.com or this page is cross-posted at LinkedIn and you can leave a comment there.
Comments
Post a Comment