How Many Dimensions does Data Have?

Excel is two-dimensional. Columns run across the sheet. Rows run down it. Initially we think of data as having two dimensions as well. In fact, two key terms in the database lexicon are rows and columns, just like in Excel. Here’s an example of what an Excel database of some fictitious people might look like:

[Image: dimensions1]

In reality, however, data usually has three dimensions. Rows describe items. Columns are singular attributes of those items. Lists are plural attributes of those items. Observe how adding a column for the names of each person’s children necessarily morphs the structure into a third dimension. Here’s what that Excel database might look like with a list in a cell:

[Image: dimensions2]

The way I’ve drawn that table is OK. The problem is that I had to grab each row and adjust its height manually. Another option would have been to select all the rows, navigate to Format –> Row –> Height, and set every row to the same arbitrary height. The problem with that, of course, is that the appropriate height differs from row to row: Jacob has three kids, Paul has two, and Kim has one.
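To make the row, column, and list distinction concrete, here is a minimal Python sketch. The `Person` class, the `row_heights` helper, and the placeholder children's names are all assumptions made up for illustration. Only the parents' names and the number of kids each has come from the example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Person:
    """One row: a single item being described."""
    name: str                                           # singular attribute (a column)
    children: List[str] = field(default_factory=list)   # plural attribute (a list)


# Placeholder children's names; only the counts come from the post.
people = [
    Person("Jacob", ["Child A", "Child B", "Child C"]),
    Person("Paul", ["Child D", "Child E"]),
    Person("Kim", ["Child F"]),
]


def row_heights(rows: List[Person]) -> Dict[str, int]:
    """Lines each row needs if every list item gets its own line."""
    return {p.name: max(1, len(p.children)) for p in rows}


print(row_heights(people))
```

The height each row needs falls out of the length of its list, which is why no single fixed row height fits every row.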

Maybe a better approach is for the software to let you enter a list of items in each cell but be smart about how it shows you that data. Tighten it up. Maybe decorate the cell with a little triangle to suggest there’s more here than you see, and when you hover your mouse over the cell, show you all the values. Maybe it would look something like this:

[Image: dimensions3]
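One way the collapsed-versus-hover idea could work is sketched below in Python. This is only an illustration under assumptions: the `render_cell` function, the choice to show the first value when collapsed, and the placeholder names are hypothetical, not how any particular product does it.

```python
TRIANGLE = "\u25b8"  # small right-pointing triangle used as a "more here" hint


def render_cell(values, expanded=False):
    """Return the text a list-valued cell would show.

    Collapsed: the first value, plus a triangle if more values are hidden.
    Expanded (for example on hover): every value on its own bulleted line.
    """
    if not values:
        return ""
    if expanded:
        return "\n".join("\u2022 " + v for v in values)
    hint = " " + TRIANGLE if len(values) > 1 else ""
    return values[0] + hint


kids = ["Child A", "Child B", "Child C"]  # placeholder names
print(render_cell(kids))                  # collapsed view
print(render_cell(kids, expanded=True))   # hover view
```

Because the collapsed view is always one line, every row can keep the same height no matter how long its list is.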

What do you think?


5 Responses to “ How Many Dimensions does Data Have? ”

The challenge with the triangle is that in scanning the list you don’t see the data. Perhaps in ‘view’ mode (e.g. when you haven’t clicked in to edit it), you could list the values comma-separated: Troy, Mathew, Natalie. When you click (or hover over?) you could still display the IFrame bulleted list.


Would be great to have this in Excel, but it’s a slippery slope into relational database land. What if I want to store the kids’ allowances along with their names and sum them in an adjacent cell in the parent worksheet? It would require a rethink of the formula language to deal with collections, sets, lists, etc. (not that that would be a bad thing). Excel really needs a schema behind it that knows about relations and helps with data validation, worksheet integrity, relational joins, operations across collections, etc. Would be a welcome relief from the VLOOKUP/HLOOKUP hacks and text-delimited lists that people build. The new 2007 tables are a start but the syntax is god-awful. I think there’s an opportunity to build a whole new software development environment metaphor based on Excel and its (primitive) object/event model, but that’s a topic for another post. You seem like a kindred spirit with the folks at Juice Analytics; check out their stuff at http://www.juiceanalytics.com/
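To make the allowance example concrete, here is a minimal Python sketch of the relational version: a parent table, a child table that points back to it, and a per-parent sum. The field names, the `allowance_totals` function, the children's names, and the amounts are all invented for illustration.

```python
# Parent "worksheet": one row per person.
parents = [
    {"id": 1, "name": "Jacob"},
    {"id": 2, "name": "Paul"},
    {"id": 3, "name": "Kim"},
]

# Child "worksheet": each row points back to its parent by id.
# Names and allowance amounts are made up.
children = [
    {"parent_id": 1, "name": "Child A", "allowance": 10},
    {"parent_id": 1, "name": "Child B", "allowance": 10},
    {"parent_id": 1, "name": "Child C", "allowance": 10},
    {"parent_id": 2, "name": "Child D", "allowance": 10},
    {"parent_id": 2, "name": "Child E", "allowance": 5},
    {"parent_id": 3, "name": "Child F", "allowance": 10},
]


def allowance_totals(parent_rows, child_rows):
    """Join children to parents by id and sum allowances per parent."""
    totals = {p["id"]: 0 for p in parent_rows}
    for child in child_rows:
        totals[child["parent_id"]] += child["allowance"]
    return {p["name"]: totals[p["id"]] for p in parent_rows}


print(allowance_totals(parents, children))
```

The join on `parent_id` does the work that a lookup formula or a delimited text list would otherwise have to fake inside a single cell.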


I like where you’re going with this, but the solution still seems artificially constrained. You have elevated the dimensions available from 2 to 3 and in fairness you state that data usually has 3 dimensions.

For me a more flexible approach would allow the GUI to cope with multiple dimensions. In your example, what if for each child we needed to list their favourite toys (and then maybe each toy could come in a range of colours)?

I’d like to see a control (similar maybe to a scroll bar) that lets me move from editing dimensions 1-and-2 to 2-and-3 and on to 3-and-4 etc.


[...] for normal people that isn’t Excel. Kevin Merrit, founder and CEO, has a great post on the multiple dimensions of data and how blist is building interfaces to handle additional dimensions (something Excel does very [...]


One issue that we have right now is that we use “tags” for many of our objects (like staff, project, etc.). Each tag belongs to a category and, depending on the category, an object can have either exactly one tag or any number of them (none to many).

For instance, for staff tags, there may be “Employment Type” as a category and full-time, part-time, seasonal as the tags. That would be a single-select (radio button).

But for Certifications, a staff member may have any number of those (multi-select).

So, the question is, how do you represent that type of data within a “row”? Of course, moving beyond that are the points brought up above about multiple attributes of these multi-attributes.

We, too, are using Flex for our interface and would love to chat with you about your ideas. We don’t compete, so hopefully we can help each other out! Contact me if you want to brainstorm….

Paul

