Thinking Like a Data Scientist

Identify, brainstorm and/or uncover new variables that are better predictors of business performance

Thinking Like a Data Scientist: Part I

One question I frequently get is: "How do I become a data scientist?"  Wow, tough question.  There are several new books that outline the different skills, capabilities and technologies that a data scientist is going to need to learn and eventually master.  I've read several of these books and am impressed with the depth of the content.

Unfortunately, these books spend the vast majority of their time teaching data science processes (such as CRISP-DM, the Cross-Industry Standard Process for Data Mining), basic and advanced statistics, and data mining and data visualization techniques and tools.

Yes, these are very important data science skills, but on their own they are not nearly sufficient to make our data science teams effective.  The data science teams still need help from the business users - or subject matter experts (SMEs) - to understand the decisions the business is trying to make, the hypotheses that they want to test and the predictions that they need to produce in support of those decisions and hypotheses.  In essence, to improve the overall effectiveness of our data science teams, we need to teach the business users to think like a data scientist.

So the objective of this blog (which, if successful, will make its way into my Big Data MBA curriculum for the University of San Francisco School of Management fall semester) is to define a process that helps business users "think like a data scientist."

I am also going to test this concept and methodology at my session at EMC World, where I am presenting "Expert Guidance To Achieve Big Data Maturity" on Monday, May 4th at 4:30.  So sharpen your pencils and let's begin the exercise!

Thinking Like a Data Scientist Process
The goal of the "thinking like a data scientist" process is to identify, brainstorm and/or uncover new variables that are better predictors of business performance.  But "business performance" of what?  Our key business initiative, of course.

Step 1:  Identify Key Business Initiative.  Would you expect anything different from me than starting with what's important to the business?  So, how can you spot a key business initiative?

A key business initiative is characterized as:

  • Critical to the immediate-term performance of the organization
  • Documented (communicated either internally or publicly)
  • Cross-functional (involves more than one business function)
  • Owned/championed by a senior business executive
  • Has a measurable financial goal
  • Has a well-defined delivery timeframe (9 to 12 months)
  • Undertaken to deliver significant, compelling and/or distinguishable financial or competitive advantage
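
One way to make this checklist concrete is to screen candidate initiatives against it. The sketch below is only an illustration: the `Initiative` record, its field names and the `failed_criteria` helper are hypothetical, the yes/no scoring is a simplification, and the sample values are made up rather than taken from any real company. It treats the timeframe test as "no longer than 12 months", which is an assumption about how to apply the 9-to-12-month guideline.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Initiative:
    """Hypothetical record of how a candidate initiative scores on each trait."""
    name: str
    critical_near_term: bool
    documented: bool
    cross_functional: bool
    executive_owner: bool
    measurable_financial_goal: bool
    delivery_months: int  # planned time to deliver, in months
    competitive_advantage: bool


def failed_criteria(initiative: Initiative, max_months: int = 12) -> list[str]:
    """Return the checklist items the initiative does not satisfy."""
    checks = {
        "critical to immediate-term performance": initiative.critical_near_term,
        "documented": initiative.documented,
        "cross-functional": initiative.cross_functional,
        "owned by a senior business executive": initiative.executive_owner,
        "measurable financial goal": initiative.measurable_financial_goal,
        f"delivery within {max_months} months": initiative.delivery_months <= max_months,
        "significant financial or competitive advantage": initiative.competitive_advantage,
    }
    return [label for label, passed in checks.items() if not passed]


# Illustrative values only; not taken from any real company's plans.
merchandising = Initiative(
    name="Improve Merchandising Effectiveness",
    critical_near_term=True,
    documented=True,
    cross_functional=True,
    executive_owner=True,
    measurable_financial_goal=True,
    delivery_months=12,
    competitive_advantage=True,
)
moonshot = Initiative(
    name="Cure world hunger",
    critical_near_term=False,
    documented=True,
    cross_functional=True,
    executive_owner=True,
    measurable_financial_goal=False,
    delivery_months=36,
    competitive_advantage=True,
)

print(failed_criteria(merchandising))  # []
print(failed_criteria(moonshot))
```

An empty list means the candidate passes every test; anything else names the traits that still need work before the initiative is a good target.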

I am a big stickler about targeting business initiatives that are focused on the next 9 to 12 months.  Anything longer than 12 months can quickly degenerate into a "Battlestar Galactica" or "cure world hunger" project that may have incredible business value, but little chance of success.

For a refresher on how to identify an organization's key business initiatives, read my blog "Big Data MBA: Reading the Annual Report for Big Data Opportunities."  That blog outlines how to leverage publicly available information (e.g., annual reports, analyst calls, executive speeches, company blogs, SeekingAlpha.com) to uncover an organization's key business initiatives.

For purposes of this exercise, I'm going to pretend that our client is Foot Locker, and that our target business initiative is "Improve Merchandising Effectiveness" as highlighted in their annual report (see Figure 1).

Figure 1: Identifying and Understanding Organization's Key Business Initiatives

Step 2:  Identify Strategic Nouns. Strategic nouns are the key business entities that either impact or are impacted by the organization's key business initiative.  These strategic nouns are critical to our data scientist thinking process because these are the entities for which we want to uncover or gain new, actionable insights, and around which we will ultimately build our analytic profiles.  Examples of strategic nouns include customers, patients, students, employees, stores, products, medication, trucks, wind turbines, etc.

For the Foot Locker "Improve Merchandising Effectiveness" business initiative, the strategic nouns upon which we will focus are:

  • Customers
  • Products
  • Campaigns
  • Stores

Step 3:  Brainstorm Strategic Noun Questions. Probably the hardest part of the "thinking like a data scientist" exercise is brainstorming the different questions that you want to ask in support of the targeted business initiative.  For this part of the exercise, we want the business users to brainstorm business questions for each strategic noun from the perspectives of:

  • Descriptive Analytics:  Understanding what happened
  • Predictive Analytics:  Predicting what is likely to happen
  • Prescriptive Analytics:  Recommending what to do next

See Figure 2 for an example of the evolution from Descriptive to Predictive to Prescriptive.

Figure 2:  Evolution of The Analytic Questions

In our Foot Locker "Improve Merchandising Effectiveness" example, we might brainstorm questions for the "Customers" strategic noun as follows:

Descriptive Analytics (Understanding what happened)

  • What customers are most receptive to what types of merchandising campaigns?
  • What are the characteristics of customers (e.g., age, gender, customer tenure, life stage, favorite sports) who are most responsive to merchandising offers?
  • Are there certain times of year where certain customers are more responsive?
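
To give a feel for what answering a descriptive question looks like in practice, here is a minimal sketch that summarizes past campaign responses by customer segment. Everything in it is assumed for illustration: the `campaign_log` records, the age-band segments and the `response_rates` helper are hypothetical, and real campaign data would be far larger and messier.

```python
from collections import defaultdict

# Hypothetical campaign log: (customer segment, campaign type, responded?)
campaign_log = [
    ("18-24", "Back to School", True),
    ("18-24", "Back to School", False),
    ("18-24", "BOGOF", True),
    ("25-34", "Back to School", False),
    ("25-34", "BOGOF", True),
    ("25-34", "BOGOF", True),
]


def response_rates(log):
    """Share of offers that drew a response, per (segment, campaign) pair."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for segment, campaign, did_respond in log:
        sent[(segment, campaign)] += 1
        responded[(segment, campaign)] += int(did_respond)
    return {key: responded[key] / sent[key] for key in sent}


rates = response_rates(campaign_log)
for (segment, campaign), rate in sorted(rates.items()):
    print(f"{segment:>6} | {campaign:<15} | {rate:.0%}")
```

The output only describes what already happened; it says nothing yet about who will respond next time.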

Predictive Analytics (Predicting what is likely to happen)

  • Which customers are most likely to respond to a Back to School event?
  • Which customers are most likely to respond to a buy-one-get-one-free (BOGOF) offer?
  • Which customers are most likely to respond to a 50% off in-store markdown?
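
A predictive question asks for a ranking of likelihoods rather than a summary of the past. The sketch below is a deliberately naive baseline, not a fitted model: it scores each customer by a smoothed historical response rate, so a customer with one lucky redemption does not leapfrog a customer with a long track record. The `bogof_history` data, the `propensity` and `rank_customers` helpers, and the prior values (`prior_rate=0.1`, `prior_weight=10`) are all hypothetical choices made for illustration.

```python
# Hypothetical per-customer BOGOF history: (offers received, offers redeemed)
bogof_history = {
    "cust_001": (10, 5),
    "cust_002": (1, 1),
    "cust_003": (20, 1),
    "cust_004": (0, 0),
}


def propensity(offers, redeemed, prior_rate=0.1, prior_weight=10):
    """Smoothed response rate that shrinks sparse histories toward prior_rate."""
    return (redeemed + prior_rate * prior_weight) / (offers + prior_weight)


def rank_customers(history):
    """Customers ordered from most to least likely to respond."""
    scores = {
        customer: propensity(offers, redeemed)
        for customer, (offers, redeemed) in history.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for customer, score in rank_customers(bogof_history):
    print(f"{customer}: {score:.2f}")
```

A data science team would normally replace this with a trained model that uses many more variables, which is exactly where the brainstormed customer characteristics come in.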

Prescriptive Analytics (Recommending what to do next)

  • What personalized offers (recommendations) should I deliver to Anne Smith to get her to come into the store?
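
The brainstorm can be tracked as a simple grid of strategic nouns by analytic perspective, which makes it easy to see which cells still need questions. This is a rough sketch under stated assumptions: the grid layout and the `blank_worksheet` and `open_cells` helpers are hypothetical conveniences, while the nouns, perspectives and sample questions come from the Foot Locker example above.

```python
from itertools import product

STRATEGIC_NOUNS = ["Customers", "Products", "Campaigns", "Stores"]
PERSPECTIVES = ["Descriptive", "Predictive", "Prescriptive"]


def blank_worksheet(nouns, perspectives):
    """One empty question list per (strategic noun, perspective) cell."""
    return {cell: [] for cell in product(nouns, perspectives)}


def open_cells(worksheet):
    """Cells that still have no questions brainstormed."""
    return [cell for cell, questions in worksheet.items() if not questions]


worksheet = blank_worksheet(STRATEGIC_NOUNS, PERSPECTIVES)
worksheet[("Customers", "Descriptive")].append(
    "What customers are most receptive to what types of merchandising campaigns?"
)
worksheet[("Customers", "Predictive")].append(
    "Which customers are most likely to respond to a BOGOF offer?"
)
worksheet[("Customers", "Prescriptive")].append(
    "What personalized offers should I deliver to Anne Smith "
    "to get her to come into the store?"
)

for noun, perspective in open_cells(worksheet):
    print(f"Still to brainstorm: {noun} / {perspective}")
```

With only the "Customers" row filled in, the remaining nine cells for Products, Campaigns and Stores show up as open work for the business users.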

Part II of the "Thinking Like a Data Scientist" blog series will conclude this "thinking like a data scientist" process and hopefully help us uncover new data sources and metrics that may be better predictors of business performance.

Thinking Like a Data Scientist - Part I
Bill Schmarzo

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business”, is responsible for setting the strategy and defining the Big Data service line offerings and capabilities for the EMC Global Services organization. As part of Bill’s CTO charter, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger and is a frequent speaker on the use of Big Data and advanced analytics to power organizations’ key business initiatives. He also teaches the “Big Data MBA” at the University of San Francisco School of Management.

Bill has nearly three decades of experience in data warehousing, BI and analytics. Bill authored EMC’s Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehousing Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the Vice President of Advertiser Analytics at Yahoo and the Vice President of Analytic Applications at Business Objects.
