Big Data facts, myths and eventualities: 8 insights and implications

Eleven years ago, Nicholas Carr published a Harvard Business Review article entitled "IT Doesn’t Matter" (followed quickly by a book with a slightly more ambiguous title, "Does IT Matter?"). While Carr’s thesis focused primarily on the notion that investment in standards-based technology consumed capital without delivering real competitive differentiation, it was used as shorthand to describe the tension between those who viewed technology as a means of creating market advantage and those who saw IT as a cost centre that is necessary without being especially critical to enhancing business processes.

Today, it seems clear that the balance of evidence on this issue has been, and continues to be, decisively on the side of the "IT as a source of advantage" contingent. In a recent blog post, noted author and consultant David Moschella opined that "In retrospect, it would have been hard to give an executive in 2003 worse career advice" than telling him/her that IT didn’t matter. "Most firms today," Moschella added, "are seeking leaders who can thrive in an increasingly digital environment."

As we move forward into this increasingly digital work world, it’s clear that Big Data analytics will be a driving force in demonstrating the value that IT-literate managers and management practices can deliver. The tremendous expansion in the total amount of data stored by the world’s businesses and governments has created opportunities for mining connections between people, activities and events that could not be explored even a few years ago. The high-potential business cases arising from effective analysis of this data have business leaders looking for elusive data scientists, and for the new market and efficiency opportunities that they might discover.

To explore the explosion of Big Data and its business implications, InsightaaS sat down with Paul Lewis, CTO for Hitachi Data Systems in Canada. Lewis has a unique perspective forged from combining business and IT experience within both Hitachi and client companies that build business processes on top of data analytics. During the course of the conversation, Lewis highlighted eight issues that can and should be the focus of management attention across industries; we’ve captured these insights and some of their business implications below.

Paul Lewis, CTO, Hitachi Canada

1. Big Data helps define highly-granular segments — but isn’t necessarily the ‘whole answer’ in understanding individuals and their preferences.

It’s often argued that Big Data will expose us individually to scrutiny — that in the Big Data future, the world will have uncomfortably keen insight into our individual habits. Lewis pointed out that the opposite risk might apply. As much as we might imagine that marketers will address us as unique entities, the truth is that in order to take action on data, marketers need to assign individuals to groups sharing like characteristics, so that they can apply messaging appropriate to the segment. In some cases, the math won’t add up: for example, you might demonstrate a series of preferences that are common to people who like chocolate ice cream, but actually prefer vanilla; in this case, you are going to be served messages that are "more chocolate-related than vanilla-related" because your profile fits a specific pattern, and "there's too much data" influencing the system in one direction to allow your subsequent actions to counter the data inferences.

The insight: Big Data analysis helps lead us down the road to understanding individuals and their preferences; it doesn’t necessarily provide us with insight into those individuals and preferences.

The implication: Lewis is right that marketers look to aggregate customers into segments. As a result of Big Data analytics, those segments will become more numerous and more granular, but they still represent probabilities associated with individual tastes, not the individuals themselves. Patterns based on Big Data help marketers to target messages, but sellers in most industries will need to extend their systems to capture information on people, not just groups of people.
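The chocolate/vanilla scenario can be sketched in a few lines of code. This is a minimal illustration only, assuming a simple nearest-centroid rule for assigning a person to a segment; the segment names, behaviour scores and the `assign_segment` helper are all invented for the example and do not describe any system that Lewis or Hitachi uses.

```python
from math import dist

# Hypothetical segment centroids: average scores (0 to 1) on three observed
# behaviours. The features and numbers are invented for illustration.
SEGMENT_CENTROIDS = {
    "chocolate": (0.9, 0.8, 0.2),
    "vanilla": (0.2, 0.3, 0.9),
}


def assign_segment(profile):
    """Return the name of the segment whose centroid is closest to the profile."""
    return min(
        SEGMENT_CENTROIDS,
        key=lambda name: dist(profile, SEGMENT_CENTROIDS[name]),
    )


# A customer whose observed behaviour resembles the chocolate segment,
# even though the flavour they actually prefer is vanilla.
customer = {"behaviour": (0.8, 0.7, 0.3), "actual_preference": "vanilla"}

segment = assign_segment(customer["behaviour"])
print(f"assigned segment: {segment}")
print(f"matches actual preference: {segment == customer['actual_preference']}")
```

The assignment follows the weight of observed behaviour, so the customer receives chocolate-related messaging regardless of what they actually prefer. Capturing the individual's stated preference would require data beyond the segment model.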

2. Big Data is going to lead to some unpopular decisions — and some tough questions on data sourcing and permissions.

One common path to Big Data-driven insight is the connection of multiple data sources into rich profiles. The potential business benefits to developing a more complete view of a potential customer are clear; analysts like Denis Pombriant are already talking about the need to add development of ‘Customer IP’ to an organization’s information architecture, noting that "Sales people have been demanding better leads for a long time and today marketing is in a position to provide them. At the same time, marketers have discovered that the kind of data they collect is as important as its volume."

Lewis is cognizant of this trend — and also, of some of the potential pitfalls associated with it. He points out that there is great potential for connecting internal and external data sources, stating that "it's the correlation" between data that businesses already possess "and data that they don't have that provides the value" in customer IP. He adds a caution, though, using the example of a mortgage application: "They used to just have my credit report.  Now they can possibly combine that with social feeds and my political opinion and my contact list and my peer group, and that might make a different decision which is interesting in that I didn't give explicit permission" for use of this extended data.

The insight: Big Data offers the potential for assembly of data-rich personal profiles, which have evident business value — and have the equally-clear potential to draw the ire of the individuals who are profiled, and the scrutiny of regulators.

The implication: The sources and uses of Big Data are still emerging — and it’s likely that we’ll see protocols for assembly and use of Big Data emerge as well, to keep companies away from the troubling scenario that Lewis outlined. InsightaaS expects to see Big Data ethics emerge as a function within data-intensive operations that draw upon potentially-sensitive sources or combinations of personally-identifiable information (PII). As McKinsey stated in a report entitled Views from the front lines of the data-analytics revolution, "Privacy has become the third rail in the public discussion of big data, as media accounts have rightly pointed out excesses in some data-gathering methods. Little wonder that consumer wariness has risen."

3. Not all analytics projects are Big Data projects

Unlike some IT trends, there is no commonly-understood, technology-specific definition of what ‘Big Data’ actually means. It can be difficult to use surveys or anecdotes to understand the extent to which Big Data is in use today, because there is a temptation to assume that analytics exercises based on large data sets are all part of the Big Data trend.

Lewis offers a clearer perspective that is helpful in understanding the distinction between use of sizeable data stores to inform business decisions and Big Data. Big corporations, he says, "have applications and they're internally sourcing content, the end result is an analytics exercise versus a Big Data exercise." Firms often apply the Big Data label to these types of initiatives, but Lewis believes that "it's only when you combine that with external economic factors, external social feeds or sentiment analysis, do you start to get a different impression of what your consumers look like or what kind of behavior they have.  And that might lead to different decisions for your business."

The insight: Big Data isn’t a generic term for data-rich analytics processes; it is a specific type of process that is informed by a combination of internal and external information. Internal-only analytics initiatives can and do deliver great value to organizations, but both the initiatives and the insights they convey differ from the processes and results associated with Big Data.

The implication: Use of Lewis’s distinction between internal-only analytics and Big Data analytics that draw on internal and external data can help IT and business management to set expectations and to understand opportunities and requirements for these different (and potentially complementary) approaches to business decision support.

4. The importance of data will lead organizations to re-evaluate their definition of IP

Recognizing that Big Data requires the integration of internal and external data leads to several important questions, including "how do we maximize the benefit of our own data?" Lewis’s response: "It starts with the business problem." Organizations, he believes, need to "appreciate that their data is as important as their applications."

This seems like a natural assertion, but Lewis finds that many IT organizations "exist for the purpose of building software and delivering that software to the business... building the 400 applications that run the business. The side effect of those applications is the data — it’s what happens when you click the ‘submit’ button." Lewis believes that we will see an evolution from the current state, where "[data] isn't the IP; the IP is the source code" to a future in which management will "come to the conclusion that the value of the business, the IP of the business, is the data that they create and house…on behalf of the consumers that they provide services for." Lewis foresees a shift in management of data-related resources, "an org chart where an enterprise data VP is equivalent to the infrastructure VP who is equivalent to the application VP," with an information management team responsible for storage, backup, and data analytics, with staff — including data scientists as well as DBAs — that "live and breathe by the data."

The insight: The first step in understanding the potential for Big Data analytics is to understand — and appropriately manage and value — the data within your own organization.  

The implication: IT structures will need to evolve to recognize that data is not merely a "side effect" of applications, but a resource that needs to have the same levels of visibility, management and governance as applications and infrastructure.

5. The CIO will play an important role in Big Data analytics

Research conducted by Techaisle and reviewed by InsightaaS shows an interesting dichotomy within Big Data analytics. A survey of 635 Canadian managers, comprising equal numbers of IT and non-IT managers (ITDMs and BDMs, respectively), found that while relative solution influence for Big Data is reasonably evenly divided between ITDMs and BDMs, BDMs have much more influence over business analytics/BI solutions than their IT counterparts. What does this mean for Big Data analytics?

Lewis recognizes the importance of business management in this equation, but believes that the enterprise data group will be within the IT department, reporting on a straight-line basis to the CIO and on a dotted-line basis to business heads. Despite the fact that the deliverables from this group are used by business rather than IT management, Lewis believes that because Big Data analytics is IT enabled, "a good portion of the expertise in that team needs to be technical in some way…the data architect, the data scientists, the DBA, the backup [will be positioned within] that technological ‘bucket’ and therefore the CIO's budget."

The insight: The survey findings show that IT and business management both consider analytics and Big Data to be largely within the purview of business management, but there is a reasonable argument that the resources and management are technical enough to merit oversight by the CIO instead.

The implication: If the CIO is to be a credible steward for Big Data analytics, he/she will need to have strong lines of communication with business management, and a clear understanding of how internal and external data resources can be connected to serve specific business objectives.

6. Are you trying to find data scientists? Take a look at economics grads

We see the term "data scientists" everywhere, but like the Yeti or the Eurasian ghost orchid, the legend is far more common than actual sightings. We put this to Lewis, who replied that Hitachi has had success in placing economics majors in data scientist roles. As Lewis says, "they tend to be really good at appreciating and understanding raw data to create information." Hitachi is looking to capitalize on the ability of people who can "look at eight different spreadsheets and make some human correlation between them," recognizing that it may not be possible to replicate the processes needed to make these types of complex correlations with technology, "because human correlation takes different steps" than a software routine would.

The insight: Given the scarcity of data scientists, and the broad demand for their skills, any practical tips on locating them are important!

The implication: As Lewis says, "these people tend not to be coming from the technical side, they come from the statistical side." As a result, there is a need to train them on the technology aspects of their roles, or to connect them with IT-savvy colleagues — another reason why having this function within IT may be important to success.

7. The M2M data tsunami creates substantial technical challenges — and tremendous opportunity for business advantage

In his interview with InsightaaS, Lewis made it clear that he approaches Big Data analytics not from a Hitachi Data Systems perspective, but from the perspective of Hitachi as a corporation that creates complex industrial products: "we create storage solutions, but we mostly create bullet trains.  We mostly create nuclear power plants.  We mostly create alternative energy solutions.  We mostly do MRIs and CT Scans…things with the thousand sensors per device."

These complex products create huge volumes of operational data, which Hitachi analyzes through "laboratories that exist for the sole purpose of appreciating and understanding and empowering the data we produce for the products that we produce it for." In some cases, this data helps firms streamline operations — for example, data from a bullet train can trigger preventative maintenance that prevents downtime, while external data — for example, regarding weather, events or other factors — might be combined with data from the trains and system to optimize utilization. In other cases, devices such as MRIs could create metadata repositories that combine to provide guidance to healthcare practitioners.

The insight: Data has traditionally been the output of our own business activity, and consequently, we often see it as a single element in a sequential process. With machine-to-machine (M2M) communication, though, this paradigm changes. Systems generate vast amounts of data independent of human business processes; collecting, analyzing and then deploying this information for use represents a net-new activity for most organizations, in both IT and business operations.

The implication: Many different roles will be impacted by the availability and use of M2M data. The data itself is vast, and not intrinsically connected to current workflow processes. IT will need to take the lead in understanding how to capture, organize, analyze and distribute Big Data information, and will then need to understand how to communicate this information effectively to others within the corporation.
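To make the preventative-maintenance example from the bullet train discussion more concrete, here is a minimal sketch of how a stream of sensor readings might trigger a maintenance flag. It assumes a simple rolling-average threshold rule. The `maintenance_alerts` function, the window size, the threshold and the temperature readings are all hypothetical and are not drawn from Hitachi's systems.

```python
from collections import deque


def maintenance_alerts(readings, window=3, threshold=80.0):
    """Yield the index of each reading at which the rolling mean of the
    last `window` readings exceeds `threshold`."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        recent.append(value)
        # Only evaluate once a full window of readings is available.
        if len(recent) == window and sum(recent) / window > threshold:
            yield i


# Invented bearing-temperature readings (degrees Celsius) from one sensor.
temps = [70, 72, 75, 79, 83, 88, 90]
alerts = list(maintenance_alerts(temps))
print(f"maintenance flagged at reading indexes: {alerts}")
```

Averaging over a window rather than reacting to single readings means that one isolated spike does not raise a flag, while a sustained upward drift does. A production system would also need to combine many sensors and, as the article notes, external data such as weather.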

8. The most difficult Big Data conversations involve business management

The final question in our interview dealt with effectively connecting Big Data with business activity. Which, we asked, is more difficult: getting technical staff to do things that they've never done before, or getting business management to take advantage of information they've never seen before or, in some cases, never even imagined being able to see?

In his response, Lewis was very clear: "It's much more difficult on the business side… getting them to ask questions that they haven't even thought of is a difficult process.  I'll sit down with a CEO or a marketing or business development person and say…What questions do you think you could possibly ask of the world — not of yourselves, but of the world — that might change the way you think about how you service your customers?  What kind of new customers do you think you will serve in the future?  And what kind of needs do you think that they're going to have?  It's forcing the conversation by asking the questions they wouldn't normally ask themselves."

The insight: It shouldn’t be surprising that it is difficult to prompt people to ask questions framed in a context that they have never really considered before, but it’s an important point to remember. When we think about the potential for Big Data analytics, as with any technology-driven source of innovation, it’s important to remember that the key adoption constraint is nearly always human capacity for/acceptance of change, rather than the ability of the technology to support that change.

The implication: Our past research has shown that relatively few IT leaders have the business understanding and credibility needed to provide management-level leadership at a corporate level — but our interview with Paul Lewis, and the overall trends in Big Data, make it clear that this will be a critical skill as we move into an analytics-based future.

4 COMMENTS

  1. Good big data article from a CIO/CTO perspective – I would like to dispute point 3 though… internal data CAN be matched with internal human data or sentiment to become big data… that’s not an exclusive purview of external data. For example – consider insider threat detection using behavioural analytics. Hard internal file activity data can be matched with human factors such as sentiment towards mgmt, company, hr, or web usage or job searches, resume edits to deduce organizational risk or anomalies – all of which would be considered internal data to draw a big data analytics correlation.

    • Karl, that’s a fair comment. The distinction in this case is less about internal vs external sources of data and more about the VARIETY of types of data being used in the analysis. Big data requires that variety, instead of using existing analytic tools to find nuggets of value in just a single or correlated series of databases.

      One tends to think of external data with less of a database spin, and more of a sensor, social, mobile, voice, visual spin… which was the intention of the point.

      Thanks!
