Think you’re immune to Google Search? A new effort by the company promises to unearth your embarrassing elementary-school photos, achievements and other data, then incorporate them into the Google brain.
The Retro-Active Quantification Industry, which I believe will grow to a multi-billion-dollar valuation by 2015, made a big leap forward this week with the release of Google’s News Archive Search.
Many years in the works, the new service/feature does exactly what it says: it lets users search a huge body of archived small-town newspapers that have been scanned into Google’s system, converted from image to text using the company’s optical character recognition technology, and then indexed by Google’s world-famous search. (Note: they’re also working on a similar but more robust system that will mine text from photographs: t-shirts, street signs, house numbers, and the like.)
Best of all, Google allows you to view the original scanned images and “browse through them exactly as they were printed—photographs, headlines, articles, advertisements and all”, much like a microfiche reader in a library basement (remember those?).
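If you're curious about the mechanics, here's a minimal sketch of that scan-to-search pipeline, assuming the open-source pytesseract library (a wrapper around the Tesseract OCR engine) as a stand-in for Google's proprietary system; the newspaper file names are invented:

    # Sketch of a scan -> OCR -> index pipeline (a stand-in, not Google's system).
    # Assumes pytesseract, Pillow and a local Tesseract install.
    from collections import defaultdict

    from PIL import Image
    import pytesseract

    def build_index(scan_paths):
        """Map each word to the set of scanned pages it appears on."""
        index = defaultdict(set)
        for path in scan_paths:
            text = pytesseract.image_to_string(Image.open(path))  # image -> text
            for word in text.lower().split():
                index[word].add(path)
        return index

    # Hypothetical scans of an archived small-town paper.
    index = build_index(["gazette_1954_p1.png", "gazette_1954_p2.png"])
    print(index.get("school", set()))  # pages that mention "school"

A real archive search adds ranking, snippet extraction and layout analysis on top, but the skeleton is the same: recognize the text, then invert it into a word-to-page index.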
In the past, many scientific discoveries and technological solutions have come from an unrelated source of information. From Archimedes’ realisation in the bath to the accidental discovery of penicillin, history is full of occasions where going outside the subject in question has provided answers to scientific problems. Indeed, in many ways humankind’s technology and scientific understanding have been propelled forward significantly by luck alone.
Many great individuals have been personally responsible for some of the most important discoveries of all time. Often, their discoveries were the result of sharing information with a friend or colleague from another field, who was able to introduce a new angle to the problem, opening the eyes of both parties to new possibilities. Alternatively, someone changes fields entirely, bringing knowledge and experience from a previous career into the new subject and approaching its problems from a unique perspective. Today’s prime example is Aubrey de Grey, whose computing background has brought a new perspective to the problem of aging.
Many major breakthroughs have been created this way, by going outside the realms of the problem itself and drawing upon knowledge of something else to find a solution. This is rarely done purposefully, so more often than not it simply doesn’t happen. Chemists might plug away at a problem for years, not realising that the answer lies in zoology. The solutions to nanotechnology might lie in quantum physics, or perhaps just mathematics. There are so many possible avenues that there may be problems we never solve, simply because we never take the correct path to their discovery.
This is obviously not acceptable. Relying on chance meetings of elites from different fields to come up with solutions will likely keep human progress at the speed of the 1800s, whilst working on problems for which solutions already exist is a ridiculous waste of time, especially if you want to stay ahead of Actuarial Escape Velocity. Thankfully, the internet brings a lot of information together and keeps the relevant people informed of progress. With the advent of huge, web-based amateur communities and special interest groups, much news and information is shared amongst those with common goals, helping it spread further still.
As personal broadcasting feeds like Twitter and FriendFeed hit the mainstream, adding to the information already flowing outward through social networks like MySpace, Facebook, Orkut and LinkedIn, as well as regular old-school email, it’s steadily becoming more difficult to make sense of all the data competing for our attention.
It’s gotten to such a point that Josh Catone over at Read/Write Web wrote, “Keeping track of all that activity is starting to feel like watching code in The Matrix.”
In The Matrix, protagonist Neo’s brain was able to discern the meaningful patterns in the code. Catone points out that we now have to take the first baby steps toward such an end:
“The Facebook News Feed only appeared about a year and a half
ago, Twitter only gained real attention about a year ago, and
FriendFeed and similar services are even newer. However, dealing
with information overload is clearly a problem that these services
will need to figure out how to address—whichever does it best will
likely be a big winner.”
As for tangible near-term solutions, Catone cites basic algorithmic sorting and a “thumbs-up, thumbs-down” user feedback system, attributed to blogger/consultant Jevon MacDonald, that can establish filters. While these are great first steps, there are a few more techniques and structures that should be added to the list, not to mention a bunch of companies already hard at work prepping them.
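To make the filtering idea concrete, here's a toy sketch of a thumbs-up/thumbs-down feed filter; the scoring rule and item names are hypothetical, not MacDonald's actual design:

    # Toy feedback filter: items with too many thumbs-down sink out of the feed.
    def net_score(item, votes):
        """Up-votes minus down-votes; (0, 0) if nobody has voted."""
        up, down = votes.get(item, (0, 0))
        return up - down

    def filter_feed(items, votes, threshold=0):
        """Keep only items whose net score clears the threshold."""
        return [item for item in items if net_score(item, votes) >= threshold]

    votes = {"twitter:status/42": (5, 1), "friendfeed:entry/7": (0, 4)}
    print(filter_feed(["twitter:status/42", "friendfeed:entry/7"], votes))
    # -> ['twitter:status/42']

A production system would presumably weight votes by who cast them and decay them over time, but even this crude filter shows how a little user feedback can thin the stream.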
Enterprise solution company EMC has just released an update to their seminal Digital Universe paper, in which they report a 10%+ upward revision of their 2007 numbers. The resizing comes to a grand total of 281 billion gigabytes (281 exabytes), resulting from “faster growth in cameras, digital TV shipments, and better understanding of information replication.”
The company also confirms that, as they had predicted, “the amount of information created, captured, or replicated exceeded available storage for the first time in 2007” and that trends indicate “by 2011, almost half of the digital universe will not have a permanent home.” This is due to the expectation that by 2011 there will be ten times as much digital information as there was in 2006.
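A quick back-of-the-envelope check: tenfold growth over the five years from 2006 to 2011 works out to a compound annual growth rate of roughly 58%:

    # Implied compound annual growth rate for a 10x increase over 5 years.
    growth = 10 ** (1 / 5) - 1
    print(f"{growth:.1%}")  # -> 58.5%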
In short, information growth is still increasing at an exponential rate, in step with technology. To get a sense of the pace of this proliferation, take a look at the useful ticker that EMC made available to help publicize their study.
Also of note, the paper claims that “approximately 70% of the digital universe is created by individuals,” yet enterprises are responsible for the security, privacy, reliability, and compliance of most of it.
So what does it all mean? To me, it indicates that we have indeed entered the knee of a broader exponential growth curve composed of mutually reinforcing increases in technology, information, communication and intelligence. (The jury is still out on communication and intelligence, but data could be coming in soon.) For EMC, this translates into new dollars – lots of them.
Their paper concludes: “We have many of the tools in place — from Web 2.0 technologies and terabyte drives to unstructured data search software and the Semantic Web — to tame the digital universe. Done right, we can turn information growth into economic growth.”
Could it be that the global economy is also poised to grow exponentially?
Chris Anderson, the editor of Wired, has written an excellent article entitled “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” in which he convincingly argues that massive amounts of data, in combination with sophisticated algorithms and super-powerful computers, offer mankind a whole new way of understanding the world.
Anderson believes that our technological tools have now
progressed to the point where the “old way” of doing science –
hypothesize, model and test – is becoming obsolete. In its place, a
new paradigm is now emerging whereby scientists, researchers and
entrepreneurs simply allow statistical algorithms to find patterns
where science cannot.
If Anderson is correct – and I believe he very well could be – this will take science in a whole new direction. In short, instead of modeling and waiting to find out whether hypotheses are valid, the scientific community can rely on intelligent algorithms to do the heavy lifting.
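Here's a minimal sketch of that pattern-first approach: instead of testing a chosen model, the code simply scans a synthetic dataset for its strongest pairwise correlations. The variable names are invented for illustration:

    # Hypothesis-free pattern hunting on synthetic data: let the statistics
    # surface the relationship rather than positing a model first.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    data = {}
    data["rainfall"] = rng.normal(size=n)
    data["crop_yield"] = 0.8 * data["rainfall"] + rng.normal(scale=0.5, size=n)
    data["shoe_sales"] = rng.normal(size=n)  # unrelated noise

    names = list(data)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = np.corrcoef(data[a], data[b])[0, 1]
            print(f"{a} vs {b}: r = {r:+.2f}")
    # The rainfall/crop_yield pair stands out (~+0.85); the rest hover near 0.

Scale that same loop up to millions of variables and petabytes of measurements and you get the correlation-first science Anderson is describing.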
Before this vision can be achieved, however, it will require a great many brilliant scientists to unlearn the idea that their “model-based” method of trying to make sense of today’s increasingly complex world is the only way to search for new knowledge.
Atop a garbage heap amidst the expansive Westchester Landfill, an iRobot Refuse Quantifier (iRQ) deftly went about its lucrative business.
Credit card receipt: inconclusive. Candy wrapper: M&M logo, no fingerprint. Check fragment: inconclusive. Candy wrapper: M&M logo, no fingerprint. Candy wrapper: Almond Joy, smudged fingerprint, image stored to temporary cache. Comb: zoom, hair strand: 92% match. Load level 2 protocols. Letter fragment: stamp fragment, zoom, puncture, contaminated sample. Product box fragment: Nintendo Wii logo, burnt, no data. Shredded tax documents: inconclusive, coordinates tagged in case of reassembly contingent on identity correlation.
The mechanical spider legs pumped and the little scavenger-bot systematically inched left, establishing a better focus point for its frontal laser array. The iRQ began scanning the next set of coordinates.
An identity match for a primary target had been established! Power surged from the tertiary battery outward as the spider maxed both input and broadcast. But something was wrong. The swarm network was not responding. Thus it was highly probable that the iRQ was now invisible to its peers and ultimately its owner.
Re-broadcast for 3 seconds. No ping back. Defensive algorithm, blend. Scan for disruption, risk assessment. Attempt new frequencies. Multiple frequencies inoperable. 84% deliberate disruption, 62% location awareness, evasive algorithm.
To scale and dominate as quickly as Google has, a new company will need to generate serious end-user value, monetize effectively, and take a new web-based approach to human resources. One such structure might be an organization specializing in prosumer-based quantification (structured, crowd-sourced info mining) that can expand and contract quickly by paying citizen quantifiers for quality content that they input (think AdSense, but more structured and directed from the outset). I imagine that this sort of company could catalyze big, fast economic growth and play an important role in generating positive-sum network value as we move further into the acceleration era.
To get the discussion of such a possibility rolling, here's a speculative timeline for such a company (2011-2015) that I've cleverly dubbed the "Quantification Company":
2011 - Launch: A logical outgrowth of flash mobs, open mapping parties, and steadily rising prosumerism, the Quantification Company (QC) was created in 2011 with the mission of "organizing and accelerating the comprehensive quantification of Earth's most valued systems." The for-profit organization relied on a small core of programmers, salespeople and community managers to catalyze quantification cascades, better known as Data Swarms, for a large variety of clients, but mostly municipalities and large corporations. Early efforts were kept simple and focused mostly on the rapid and/or real-time HD video mapping of U.S. cities, national parks, and other under-quantified areas of interest. Traffic-based fees were paid out to citizen quantifiers who captured and uploaded the best geographic footage and/or commentary. Though they were slightly nervous at the ambition and direction of the QC, competitors like Google, Yahoo and Wikipedia were happy to see traffic and content flow through their systems.
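Since the company is speculative, so is any payout formula, but the traffic-based fees described above could start as simply as this (names and numbers invented):

    # Toy QC payout model: each citizen quantifier earns a share of the
    # client's fee pool in proportion to the traffic their uploads draw.
    def payouts(pool, traffic):
        total = sum(traffic.values())
        return {who: round(pool * views / total, 2) for who, views in traffic.items()}

    print(payouts(1000.0, {"alice": 600, "bob": 300, "carol": 100}))
    # -> {'alice': 600.0, 'bob': 300.0, 'carol': 100.0}

In practice QC would presumably fold in quality ratings and fraud checks, but proportional revenue sharing is the AdSense-like core.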
How is the digital revolution shaping the way we interact with
media? Below is a cool concept video exploring how the internet has
already changed the way we consume and share information. It then
presents a timeline into the next 40 years, giving us a vision of
how content may be consumed in the future.
Traditional information sources like books, newspapers, and even
your own experiences may be fully replaced by new interfaces, like
electronic paper, simulated reality through virtual worlds, and
memory sharing among the masses.
In its effort to catalog and effectively share the world’s information, Google continues to improve its dynamic representation of Earth and has now extended its reach to cities and towns.
The first time I experienced Google Earth, I was pretty
impressed. Accessing satellite information, I was able to navigate
most any location on the planet that I was interested in, from a
bird’s eye view. Of course the first thing I did was check out my
street, the homes of my past, and landmarks around my town.
Next I was introduced to Street View, a visualization composed of photos taken from automobiles that allows full 3D street navigation. It wasn’t until a few weeks ago, when Street View was at last integrated with Google Maps, that I could travel down my street, glance at my house, and see my car parked neatly on the curb. That was really cool to me. I found myself wondering where I was at the time the photos were taken, and feeling thankful they hadn’t caught me outside my house in an early-morning stupor.
After some light research, I found that Google isn’t just concerned with satisfying my curiosity. It has found ways to make money with this technology while expanding its functionality for important decision-making parties.
Google has introduced advanced versions of the platform: Google Earth Pro ($400/year), a collaborative tool for commercial and professional use, and Google Earth Plus ($20/year) for everyday map enthusiasts. It also offers non-profit organizations Earth Outreach, a program that allows them to map their projects to help advance their causes.
In March 2008, Google Earth introduced Cities in 3D, which is, unsurprisingly, a complete 3D visualization of numerous cities. To contribute to this effort, users can submit and share renditions of structures and buildings using Google’s SketchUp. The program primarily relies on city governments to submit their 3D information electronically (for free) and invites them to review the benefits of doing so.
The benefits for local governments seem rather extensive. They
include: engaging the public in planning, fostering economic
development, boosting tourism, simplifying navigation analysis,
enhancing facilities management, supporting security and crime
prevention, and facilitating emergency management.
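For the curious, here's a rough sketch of what such an electronic submission can look like: a few lines of Python writing out a KML file that drops a SketchUp-exported COLLADA model into Google Earth. The coordinates and file names are invented:

    # Write a minimal KML file placing a 3D model (COLLADA .dae) on the globe.
    # Coordinates and file names are placeholders.
    kml = """<?xml version="1.0" encoding="UTF-8"?>
    <kml xmlns="http://www.opengis.net/kml/2.2">
      <Placemark>
        <name>City Hall</name>
        <Model>
          <Location>
            <longitude>-122.08</longitude>
            <latitude>37.42</latitude>
            <altitude>0</altitude>
          </Location>
          <Link><href>models/city_hall.dae</href></Link>
        </Model>
      </Placemark>
    </kml>
    """

    with open("city_hall.kml", "w") as f:
        f.write(kml)

A file along these lines places a single building; a city-scale submission is essentially many such placemarks plus the model files they reference.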
A variety of thinkers have converged on the notion that humans rely on what is essentially "software" to build our simulation(s) of the world around us.
Abstractions Driving the Flynn Effect: Cognitive historian James Flynn attributes the steady rise in IQ over the past 100+ years (known as the Flynn Effect) to better human abstraction abilities, not to any significant increase in physical brain power:
Our brains at conception are no better than they ever were. But in response to the evolving demands of society, we can attack a far wider range of problems than our ancestors could. It is like the evolution of the motor car in the 20th century. Are automotive engineers any brighter than they were 100 years ago? – no. But have cars evolved to meet modern demands for more speed and entertainment while we drive (radios, tape decks, etc) – yes. Our brains are no better but our minds have altered as dramatically as our cars.
In other words, the abstract thought frameworks that we drill into our children during critical periods, including math, science, biology, maps, business, social networks, and new languages, are in fact a form of software that affects our IQ and our ability to navigate the world.
This simple yet powerful abstraction (no pun intended) is a critical paradigm shift in our definition of what it means to be human, and it opens the door to additional metaphors for social, economic and intelligence studies.
Particularly intriguing is the question of how quickly and/or regularly we (individuals, groups, societies, nations) experience software upgrades, akin to loading the latest Windows or Linux versions.