The most important unsolved question for Big Data startups is how to make money. I consider myself somewhat of an expert on this, having discovered a thousand ways not to do it over the last two years. Here's my hierarchy showing the stages from raw data to cold, hard cash:
You have a bunch of files containing information you've gathered, way too much for any human to ever read. You know there's a lot of useful stuff in there, but you can talk until you're blue in the face and the people with the checkbooks will keep them closed. The data itself, no matter how unique, is low value, since it will take somebody else a lot of effort to turn it into something they can use to make money. It's like trying to sell raw ore on a street corner; the buyer would have to invest so much time and effort processing it that they'd much prefer to buy a more finished version, even if it's a lot more expensive.
Down the road there will definitely be a need for data marketplaces, common platforms where producers and consumers of large information sets can connect, just as there are for other commodities. The big question is how long it will take for the market to mature: to standardize on formats and develop the processing capabilities on the data consumer side. Companies like InfoChimps are smart to keep their flag planted in that space, since it will be a big segment someday, but they're also moving up the value chain for near-term revenue opportunities.
You take that massive deluge of data and turn it into some summary tables and simple graphs. You want to give an unbiased overview of the information, so the tables and graphs are quite detailed. This now makes a bit more sense to the potential end-users; they can at least understand what it is you have, and start to imagine ways they could use it. Including all the relevant information still leaves them staring at a space shuttle control panel though, and only the most dogged people will invest enough time to understand how to use it.
You're finally getting a feel for what your customers actually want, and you now process your data into a pretty minimal report. You focus on a few key metrics (e.g. unique site visitors per day, time on site, conversion rate) and present them clearly in tables and graphs. You're now providing answers to informational questions the customers are asking: "Is my website doing what I want it to?", "What areas are most popular?", "What are people saying about my brand on Twitter?" There's good money to be had here, and this is the point many successful data-driven startups are at.
The biggest trouble is that it can be very hard to defend this position. Unless you have exclusive access to a data source, the barriers to entry are low and you'll be competing against a lot of other teams. If all you're doing is presenting information, that's pretty easy to copy, and it has caused a race to the bottom on prices in spaces like 'social listening platforms'/'brand monitoring' and website analytics.
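To show how little machinery a minimal report really needs (which is a big part of why it's so easy to copy), here's a rough sketch in Python. Everything in it is an assumption for illustration: the `visits` rows are made-up sample data, `daily_report` is a hypothetical helper, and conversion rate is counted per visit rather than per visitor, which is just one of several reasonable definitions.

```python
from collections import defaultdict
from datetime import date

# Made-up sample data, purely illustrative.
# Each row: (day, visitor_id, seconds_on_site, converted)
visits = [
    (date(2011, 1, 3), "a", 120, False),
    (date(2011, 1, 3), "a", 60, True),
    (date(2011, 1, 3), "b", 30, False),
    (date(2011, 1, 4), "c", 90, False),
]


def daily_report(visits):
    """Collapse raw visit rows into three key metrics per day (hypothetical helper)."""
    by_day = defaultdict(list)
    for day, visitor, seconds, converted in visits:
        by_day[day].append((visitor, seconds, converted))

    report = {}
    for day, rows in sorted(by_day.items()):
        report[day] = {
            # Distinct visitor ids seen that day
            "unique_visitors": len({visitor for visitor, _, _ in rows}),
            # Mean time on site across that day's visits
            "avg_seconds_on_site": sum(seconds for _, seconds, _ in rows) / len(rows),
            # Assumption: fraction of visits (not visitors) that converted
            "conversion_rate": sum(1 for _, _, converted in rows if converted) / len(rows),
        }
    return report


if __name__ == "__main__":
    for day, metrics in daily_report(visits).items():
        print(day, metrics)
```

Nothing in there depends on knowing the customer, so anyone with access to the same logs could produce the same numbers.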
Now you know your customers really well, and you truly understand what they need. You're able to take the raw data and magically turn it into recommendations for actions they should take. You tell them which keywords they should spend more AdWords money on. You point out the bloggers and Twitter users they should be wooing to gain the PR they're after. You're offering them direct ways to meet their business goals, which is incredibly valuable. This is the Nirvana of data startups: you've turned into an essential business tool that your customers know is helping them make money, so they're willing to pay a lot. To get here you also have to have absorbed a tremendous amount of non-obvious detail about the customer's requirements, which is a big barrier to anyone copying you. Without the same level of background knowledge they'll deliver something that fails to meet the customer's needs, even if it looks the same on the surface.
This is why Radian6 has flourished and been able to afford to buy out struggling 'social listening platforms' for a song. They know their customers and give them recommendations, not mere information. If this sounds like a consultancy approach, it's definitely heading that way, though hopefully with enough automation that finding skilled employees isn't your bottleneck.
Of course the line between the last two stages is not clear-cut (Radian6 is still very dashboard-centric, for example), and it does all sound a bit like the horrible use of 'solution' as a buzzword for tools back in the '90s, but I still find it very helpful when I'm thinking about how to move forward. More actionable means more valuable!