Scoring spreadsheet: The most important spreadsheet
Let’s talk about scoring spreadsheets. Since my very first job, I have always used spreadsheets as a tool to complete tasks. As I have always worked in small companies, Microsoft Excel was not an option because you have to pay for it. Not only is Google Sheets free, but it also allows you to share any spreadsheet and edit it simultaneously. For these reasons Google Sheets ended up being the best option.
In my first job, my team and I had important information, like KPIs or CRM data, held in a few key spreadsheets. But we also had a lot of other, less important, data held in other spreadsheets. I’m now working at Sheetgo and we are in the process of accelerated growth. Sheetgo has an add-on for Google Sheets that allows you to connect data between different spreadsheets and update them automatically.
We are pioneers in connecting data in spreadsheets and we want to better understand these connections. So we’ve begun to study the relevance of each spreadsheet in order to make sure we’re controlling our data well and to know which sheets are irrelevant and can be removed. This measurement system, in the form of a scoring spreadsheet, is a feature that we hope to provide to our customers coming soon.
Sheetgo gives users the ability to connect their spreadsheets. Our hypothesis is that the importance of each sheet depends on its connections with others spreadsheets. Why? If one spreadsheet receives or sends data to five, ten or twenty others, it is clear that it will be more important than those that neither receive nor send data.
We’ve charted out the systems created by several of our users and we see patterns like the following:
Each dot represents a spreadsheet and the width of arrows measures the number of tabs that are shared.
Scoring spreadsheet: how to measure each one
Each spreadsheet can import data from others or export data to others. If one spreadsheet imports a lot of data, it is important to the organization because it is really useful for monitoring data. On the other hand, a spreadsheet that exports data to many others is also important because this spreadsheet is likely a central database.
We decided to carry out a spreadsheet ranking based on the amount of received data in each spreadsheet. To measure this we developed the following formula based on Page Rank from Google:
N is total connections in the spreadsheet system, and d is a factor of connectivity. If d is higher, the score of a non-connected spreadsheet is lower. Therefore, the entry score of any worksheet (ES(X)) depends on the input score of each worksheet from which it receives data. L(Y)out is the number of connections out of Y to others spreadsheets.
At the same time we can measure the amount of sent data from each worksheet:
Finally, in order to get a global score of X (GS(X)) for each spreadsheet, we will sum ES(X) and OS(X) and we will multiply them by w. W is used to give more weight to the amount of either sent or received data.
Case study and conclusions
In the next image you can see an example from one of our users:
It seems that mathematical formulas work well, at least in this example and those that we have already analyzed. So what information can we extract? Firstly, we can define which spreadsheets are the core of our data system. They will be those with more Output Power. Secondly, we will consider spreadsheets that are useful for monitoring data like KPIs for instance. Finally, we will know the global punctuation of each spreadsheet.
Ranking your spreadsheets will help you shape your data management procedures and help you ensure that the most important spreadsheets are up to date. It will also help you to know which scoring spreadsheet might be unnecessary and finally allow us to analyze the possibility of improving the efficiency of our spreadsheet system(s).
Learn how to easily merge your spreadsheets using Sheetgo in the following blog post.