The Data are Coming, The Data are Coming
Posted by JR Reagan on June 28, 2013
Are we becoming a data-driven nation? Before you answer yes or no, let’s first look at the latest numbers surrounding Big Data. According to Deloitte’s report, “Data: A Growing Problem,” the numbers are staggering[1]:
- 6 billion mobile phones in use today, representing 87% of the world’s population
- 1.2 billion[1] mobile Web users in the world today, representing 17% of the world’s population
- 30 billion pieces of content shared on Facebook every month
- More than 60 billion intelligent devices exist in the world today, a number expected to rise to more than 200 billion by 2015
- 40% projected growth in data volume every year
How much data is that? It’s so vast, it’s hard to visualize.
The MegaPenny Project can help. We all know how big a penny is. By envisioning increasingly larger quantities of pennies, it’s easier to picture large numbers. Put 16 pennies in a stack, and it’ll be an inch tall. Lay them in a row, and it’ll be a foot long.
Now take a quintillion pennies and lay them flat. At 89.6 billion acres, they’ll cover the land surface of the Earth about two times over. Count one penny per byte, and that’s the amount of data we create, day in and day out. It comes from telescopes and supercolliders, social media posts and video streaming sites, sales transactions, embedded sensors, and mobile phone locations.
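The penny figures are easy to check with a few lines of arithmetic. The sketch below is only an illustration, and its inputs are assumptions rather than numbers taken from the MegaPenny Project: each penny is treated as 0.75 inches across and as occupying a 0.75-inch square when laid flat, and Earth's land area is taken as roughly 148.9 million square kilometers.

```python
# Sanity check of the penny arithmetic.
# Assumptions (not from the article): a penny is 0.75 in across, pennies are
# laid in a square grid, and Earth's land area is about 148.9 million km^2.

PENNY_DIAMETER_IN = 0.75
SQ_IN_PER_SQ_FT = 144
SQ_FT_PER_ACRE = 43_560
ACRES_PER_SQ_KM = 247.105
EARTH_LAND_SQ_KM = 148.9e6


def pennies_to_acres(count: int) -> float:
    """Area in acres covered by `count` pennies laid flat in a square grid."""
    square_inches = count * PENNY_DIAMETER_IN ** 2
    return square_inches / SQ_IN_PER_SQ_FT / SQ_FT_PER_ACRE


row_length_in = 16 * PENNY_DIAMETER_IN          # 16 pennies laid in a row
acres = pennies_to_acres(10 ** 18)              # one quintillion pennies
land_coverings = acres / (EARTH_LAND_SQ_KM * ACRES_PER_SQ_KM)

print(f"16 pennies in a row: {row_length_in:.0f} inches")
print(f"One quintillion pennies: {acres / 1e9:.1f} billion acres")
print(f"Coverings of Earth's land area: {land_coverings:.1f}")
```

Under these assumptions the result comes out close to the acreage quoted above, and to a bit more than two coverings if "surface of the Earth" is read as land area. A tighter packing of the pennies or a different land-area estimate would shift the figures slightly.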
Just how large do those daily numbers grow over time? Very, very large indeed. In December 2012, IDC released its 6th annual study of the “digital universe,” which it defines as “a measure of all the digital data created, replicated, and consumed in a single year.” The organization predicts our digital universe will approximately double every two years, increasing from 130 exabytes in 2005 to 40,000 exabytes in 2020.
An exabyte equals one quintillion bytes, so think back to the penny example. Now multiply by 40,000 to imagine how big the digital universe will be by 2020, just seven years from now.
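For readers who want to see what "approximately double every two years" means in numbers, here is a rough sketch using only the two IDC data points quoted above. It assumes growth at a constant rate between 2005 and 2020, which is a simplification made for illustration and not something the study itself states.

```python
import math

# Figures quoted from the IDC study cited in the article.
START_YEAR, START_EXABYTES = 2005, 130
END_YEAR, END_EXABYTES = 2020, 40_000
BYTES_PER_EXABYTE = 10 ** 18  # one quintillion bytes

years = END_YEAR - START_YEAR
growth_factor = END_EXABYTES / START_EXABYTES

# Assumption: constant-rate (exponential) growth over the whole period.
annual_growth = growth_factor ** (1 / years) - 1
doubling_years = years * math.log(2) / math.log(growth_factor)

total_bytes_2020 = END_EXABYTES * BYTES_PER_EXABYTE

print(f"Growth factor, {START_YEAR}-{END_YEAR}: {growth_factor:.0f}x")
print(f"Implied annual growth: {annual_growth:.1%}")
print(f"Implied doubling time: {doubling_years:.2f} years")
print(f"Digital universe in {END_YEAR}: {total_bytes_2020:.1e} bytes")
```

The implied doubling time lands a little under two years, which fits IDC's "approximately double every two years," and the implied annual growth is in the same neighborhood as the 40% yearly growth cited earlier.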
Welcome to the world of big data, a world of “data sets so large and complex that they’re difficult to process using on-hand database management tools or traditional data processing applications” (Wikipedia).
Who’s driving the big data bus?
Given numbers like these, if we don’t become a data-driven nation, we may soon become a nation drowned by data.
In May 2012, the White House issued Big Data Across the Federal Government, which describes nearly 100 ongoing, multi-agency programs that “address the challenges of, and tap the opportunities afforded by, the big data revolution to advance agency missions and further scientific discovery and innovation.”
Earlier big-data initiatives offered promising results. One of them, Data-Driven Approaches to Crime and Traffic Safety, was developed to use geographic mapping to identify hot spots of crime and traffic crashes and to send patrol units there for high-visibility enforcement. The program went into action in 2008 in seven locations.
The big picture of a data-driven nation
In 2012, the White House laid out its national digital initiative in Digital Government: Building a 21st Century Platform to Better Serve The American People. It lists three goals, in the strategy’s own words:
- Enable the American people and an increasingly mobile workforce to access high-quality digital government information and services anywhere, anytime, on any device.
- Ensure that as the government adjusts to this new digital world, we seize the opportunity to procure and manage devices, applications, and data in smart, secure and affordable ways.
- Unlock the power of government data to spur innovation across our Nation and improve the quality of services for the American people.
Part of that access to digital government information comes through Data.gov, a federal website launched in May 2009 to provide public access to some of the vast amounts of data the government collects.
Today, the site boasts 378,529 raw and geospatial datasets detailing everything from farmers’ markets to aerial data from the Fukushima incident. To make the datasets easier to manage, Data.gov also offers 1,264 data tools and 236 apps developed by citizens.
Projects that have been developed using information from Data.gov are myriad:
- The Avian Knowledge Network brings together researchers from around the world to discover, organize, and access megadata about bird populations.
- Did You Feel It harnesses the sheer numbers of people on the Internet to gather information about earthquakes from people who experience them.
- Eat, Shop, Sleep pulls together information about code violations to help consumers find safe places to eat and stay, and couples that data with customer reviews from social media sites.
- Pulse Point connects people who’ve been trained to perform CPR with people nearby suffering heart attacks. It’s not meant to be an alternative to 911 responders, but a quicker complement to that service.
2 days, 95 events, 20 government partners, 75 data sets and resources, and more than 10,000 participants came together for National Day of Civic Hacking, a national event that took place June 1-2, 2013, in cities across the nation.
Thousands of volunteers came together to help solve some of the world’s pressing problems, from classifying cancer cells to modeling climate change. In this world of big data, citizen scientists aren’t just valuable; they’re necessary. Federal agencies, meanwhile, are exploring how to look at and analyze their growing data stores in new and different ways.
Municipal governments are opening up their data to citizens, too. Data-Driven Detroit promotes community change by tracking and providing access to neighborhood-level data, including everything from liquor licenses to adult foster care homes.
But does big data equal big value?
Could all this data add up to something more than a towering hill of beans and a few hundred interesting apps?
In its TMT Predictions report, Deloitte predicted that in 2012 “big data”[2] would experience accelerating growth and market penetration. As recently as 2009, there were only a handful of big data (BD) projects, and total industry revenues were under $100 million. By the end of 2012, the report forecast, more than 90 percent of the Fortune 500 would likely have at least some BD initiatives under way, and industry revenues would likely be in the range of $1 billion to $1.5 billion. But it also described an industry still in its infancy: big data in 2012 would likely be dominated by pilot projects, with probably fewer than 50 full-scale big data projects (10 petabytes and above) worldwide.
But can we get there from here? The TMT Predictions report points to a significant issue: collecting big data, and offering access to it, doesn’t mean we process it or put it to use. The report notes that “even though BD is still in its early stages, the growth suggests that the industry needs to develop talent with big data skill sets: 140,000 to 190,000 skilled BD professionals will be needed in the US alone, over the next five years.”[3]
This is one area where I believe policymakers need to step up to the plate, and make sure there are enough licensed drivers to steer the Big Data buses on the way.
[1] McKinsey Report: International Telecommunication Union
[2] Big data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target, currently ranging from a few dozen terabytes to many petabytes of data in a single data set.
[3] Big data: The next frontier for innovation, competition, and productivity, McKinsey, May 2011: www.mckinsey.com/mgi/publications/big_data/
As used in this document, “Deloitte” means Deloitte & Touche LLP, a subsidiary of Deloitte LLP. Please see www.deloitte.com/us/about for a detailed description of the legal structure of Deloitte LLP and its subsidiaries. Certain services may not be available to attest clients under the rules and regulations of public accounting.