Thinking Like a Data Scientist: Part 3 By @Schmarzo | @BigDataExpo #BigData

How 'scores' can play a critical role in supporting an organization’s key business decisions

Thinking Like a Data Scientist Part 3: The Role of Scores

In New Zealand, they are taking a "Moneyball" approach to optimizing social worker spending and focusing attention where it can be most effective. A recent article in BusinessWeek, "A Moneyball Approach To Helping Troubled Kids" (May 11, 2015), highlights the role that "scores" can play in identifying and prioritizing problem areas, and deciding what corrective actions to take.

Using data from welfare, education, employment, and the housing agencies and the courts, the government identified the most expensive welfare beneficiaries - kids who have at least one close adult relative who's previously been reported to child safety authorities, been to prison, and spent substantial time on welfare. "There are million-dollar [cost] kids in those families," Minister of Finance Bill English says. "By the time they are 10, their likelihood of incarceration is 70 percent. You've got to do something about that."

...one idea is to rate families, giving them a number [score] that could be used to identify who's most at risk in the same way that lenders rely on credit scores to determine creditworthiness. "The way we may use it, it's going to be like it's a FICO score," says Jennie Feria, Head of Los Angeles' Department of Children and Family Service. The information, she says, could be used both to prioritize cases and to figure out who needs extra services.

In continuing my "Thinking Like a Data Scientist" blog series, we're going to focus on how "scores" can play a critical role in supporting an organization's key business decisions. The power of a score is that it is relatively easy to understand from a business user perspective, and it focuses the data science efforts on identifying and exploring new variables, metrics and relationships that might be better predictors of performance.

Definition of a Score
Let's start by understanding what a score is:

  • A score is a dynamic rating or grade standardized to aid in comparisons, performance tracking and decision-making; scores can help to predict the likelihood of certain actions or outcomes
  • Scores are actionable, analytic-based measures that support the decisions your organization is trying to make, and guide the outcomes the organization is trying to predict

A common example of a score is the intelligence quotient or IQ score. An IQ score is derived from several standardized tests in order to create a single number that assesses an individual's "intelligence." The IQ score is standardized at 100 with a standard deviation of 15, which means that about 68% of the population is within one standard deviation of the 100 standard (between 85 and 115). This standardization makes it easier to use the IQ score to compare different candidates or applicants and to support key business decisions.
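To make the standardization concrete, here is a minimal Python sketch. It assumes IQ scores follow a normal distribution with the mean of 100 and standard deviation of 15 given above; the function names and the raw test figures in the usage lines are illustrative only.

```python
import math

IQ_MEAN = 100.0  # standardized center stated in the text
IQ_SD = 15.0     # standard deviation stated in the text


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def share_between(low: float, high: float,
                  mean: float = IQ_MEAN, sd: float = IQ_SD) -> float:
    """Fraction of a normally distributed population scoring between low and high."""
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)


def standardize(raw: float, raw_mean: float, raw_sd: float) -> float:
    """Rescale a raw test result onto the IQ-style scale (mean 100, SD 15)."""
    z = (raw - raw_mean) / raw_sd  # distance from the average, in standard deviations
    return IQ_MEAN + IQ_SD * z


if __name__ == "__main__":
    print(f"Share between 85 and 115: {share_between(85, 115):.1%}")  # 68.3%
    # Hypothetical raw test: mean 50 points, SD 10 points, candidate scored 65.
    print(f"Standardized score: {standardize(65, 50, 10):.1f}")  # 122.5
```

Once every result sits on the same scale, a 122.5 means the same thing regardless of which underlying test produced it, and that is what makes side-by-side comparison possible.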

The true beauty of a "score" is its ability to convert a wide range of variables and metrics, all weighted, valued and correlated differently depending upon what's being measured, into a single number that can be used to guide decision-making. And the true power of the "score" is the ability to start small with some simple analytics, and then constantly fine-tune and expand the score with new metrics, variables and the relationships that might yield better predictors of performance.

FICO Score Example
FICO may be the best example of a business score that is used to predict certain behaviors, in this case, the likelihood of a borrower to repay a loan. Fair, Isaac, and Company first introduced the FICO score in 1989. The FICO model uses a wide range of consumer data to create and update these scores.

A person's FICO score can range between 300 and 850. A FICO score above 650 indicates that the individual has a very good credit history while people with scores below 620 will often find it substantially more difficult to obtain financing at a favorable rate (see Figure 1).

Figure 1 (source: http://tightwadtravelers.com/check-fico-credit-score-free/)

The FICO score considers a wide range of consumer data to generate the single score for every individual. The data elements that are used in the calculation of an individual's FICO score include[1]:

Payment History: 35 percent of the FICO credit score is based on a borrower's payment history, making the repayment of past debt the most important factor in calculating credit scores. According to FICO, past long-term behavior is used to forecast future long-term behavior. This is a measure of how you handle credit; think credit "behavioral analytics." This particular category encompasses the following metrics and variables:

  • Payment information on various types of accounts, including credit cards, retail accounts, installment loans and mortgages
  • The appearance of any adverse public records, such as bankruptcies, judgments, suits and liens, as well as collection items and delinquencies
  • Length of time for any delinquent payments
  • Amount of money still owed on delinquent accounts or collection items
  • Length of time since any delinquencies, adverse public records or collection items
  • Number of past-due items listed on a credit report
  • Number of accounts being paid as agreed

Credit Utilization: 30 percent of the FICO credit score is based on a borrower's credit utilization; that is, the percentage of available credit that has been borrowed by that individual. The Credit Utilization calculation is comprised of six variables:

  • The amount of debt still owed to lenders
  • The number of accounts with debt outstanding
  • The amount of debt owed on individual accounts
  • The types of loans
  • The percentage of credit lines in use on revolving accounts, like credit cards
  • The percentage of debt still owed on installment loans, like mortgages
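The "percentage of credit lines in use" variable is simple arithmetic: total balances divided by total credit limits. The sketch below shows only that one ratio. The `RevolvingAccount` type and the sample accounts are hypothetical, and nothing here reflects how FICO actually combines the six variables.

```python
from dataclasses import dataclass


@dataclass
class RevolvingAccount:
    name: str
    balance: float       # amount currently owed
    credit_limit: float  # total credit line on the account


def utilization(accounts: list[RevolvingAccount]) -> float:
    """Overall share of available revolving credit currently in use (0.0 to 1.0)."""
    total_limit = sum(a.credit_limit for a in accounts)
    if total_limit <= 0:
        return 0.0  # no available credit, so there is nothing to utilize
    total_balance = sum(a.balance for a in accounts)
    return total_balance / total_limit


if __name__ == "__main__":
    # Made-up accounts for illustration only.
    cards = [
        RevolvingAccount("bank card", balance=1_500.0, credit_limit=5_000.0),
        RevolvingAccount("retail card", balance=500.0, credit_limit=5_000.0),
    ]
    print(f"Utilization: {utilization(cards):.0%}")  # 2,000 / 10,000 -> 20%
```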

Length of credit history: 15 percent of the FICO credit score is based on the length of time each account has been open and the length of time since the account's most recent activity. FICO breaks down "length of credit history" into three variables:

  • Length of time the accounts have been open
  • Length of time specific account types have been open
  • Length of time since those accounts were used

New credit applications: 10 percent of the FICO credit score is based upon borrowers' new credit applications. Within the new credit application category, FICO considers the following variables:

  • Number of accounts that have been opened in the past six to 12 months, as well as the proportion of accounts that are new, by account type
  • Number of recent credit inquiries
  • Length of time since the opening of any new accounts, by account type
  • Length of time since any credit inquiries
  • The re-appearance on a credit report of positive credit information for an account that had earlier payment problems

Credit Mix: 10 percent of the FICO credit score is based on the variety of debt a borrower has repaid, which is a measure of the borrower's ability to handle a wide range of credit, including:

  • Installment loans, including auto loans, student loans and furniture purchases
  • Mortgage loans
  • Bank credit cards
  • Retail credit cards
  • Gas station credit cards
  • Unpaid loans taken on by collection agencies or debt buyers
  • Rental data

The point of showing all of this FICO calculation detail is to reinforce the basic concept (and power) of a score - that a score can take into consideration a wide range of variables, metrics and relationships to create a single, standardized number that can be used to support an organization's key decisions, or in the case of the FICO score, used by lenders to predict a particular loan applicant's ability to repay a loan. That's a very powerful concept. Scores are a critical concept in getting your business stakeholders to contemplate how they might want to integrate different variables and measures to create scores for the key business decisions that they need to make.
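The idea of rolling weighted categories into one number can be sketched in a few lines of Python. This is not FICO's method, which is proprietary. The sketch borrows only the five category weights and the 300 to 850 range quoted above. The 0.0 to 1.0 sub-scores, the linear blend and the rescaling are all assumptions made for illustration.

```python
# Category weights as listed above (they sum to 100 percent).
WEIGHTS = {
    "payment_history": 0.35,
    "credit_utilization": 0.30,
    "length_of_history": 0.15,
    "new_credit": 0.10,
    "credit_mix": 0.10,
}

SCORE_MIN, SCORE_MAX = 300, 850  # score range stated in the text


def composite_score(sub_scores: dict[str, float]) -> int:
    """Blend per-category sub-scores (each 0.0 = worst, 1.0 = best) into one number.

    The linear blend and rescaling are illustrative assumptions, not FICO's method.
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= sub_scores[name] <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    blended = sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)
    return round(SCORE_MIN + blended * (SCORE_MAX - SCORE_MIN))


if __name__ == "__main__":
    # Hypothetical borrower: strong payment history, middling elsewhere.
    borrower = {
        "payment_history": 0.9,
        "credit_utilization": 0.6,
        "length_of_history": 0.5,
        "new_credit": 0.8,
        "credit_mix": 0.7,
    }
    print(composite_score(borrower))  # 696
```

Starting with a handful of weighted inputs like this and then adding or re-weighting variables as better predictors emerge is the "start small, then fine-tune" pattern described earlier.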

Other Industry Score Examples
Scores can be created to support business stakeholder decision-making across a number of different industries. Let's brainstorm just a few, and as my MBA students are going to find out this fall, there are many, many more waiting to be discovered!!

Financial Services

  • Retirement Readiness Score. This would be a score that measures how ready each client or investor is for retirement. This score could include variables such as age, current annual income, current annual expenses, net worth, value of primary home, value of secondary homes, desired retirement age, desired retirement location (Iowa is a lot cheaper than Palo Alto!!), number of dependent children, number of dependent parents, desired retirement lifestyle, etc.
  • Job Security Score. This score would measure the security of each individual's job. This score could include variables such as industry, job type, employer(s), job level/title, job experience, age, education level, skill sets, industry publications and presentations, Klout scores, etc.
  • Home Value Stability Score. This score would measure the stability of the value of a particular house. This score could consider variables such as current value, turnover and house sales history, value of house compared to comparable houses, whether it's a primary residence or rental residence, local price-to-rent ratio, local housing trends (maybe pulled from Zillow), etc.

[1] FICO's 5 factors: The components of a FICO credit score (http://www.creditcards.com/credit-card-news/help/5-parts-components-fico...)

Very Important Note: Combining the Job Security Score and Home Value Stability Score with the FICO score would have provided a more holistic assessment of banks' risk and housing market exposure prior to the 2007 financial market meltdown. For example, the Home Value Stability Score could have provided invaluable insights as banks tried to determine to whom to make home mortgage loans and which markets might be "over-valued".

The key point here is that it is important to have multiple scores that offer different perspectives on the decision being made; together, these scores provide a more holistic assessment of the true conditions surrounding these key business decisions.
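One way to picture "multiple perspectives" is to keep each score separate and flag the ones that raise concern, rather than collapsing everything into a single figure. The sketch below does that. The 0 to 100 scales for the Job Security and Home Value Stability scores, the cutoffs and the applicant values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoreCheck:
    name: str
    value: float
    minimum: float  # hypothetical cutoff chosen by the business, not an industry standard

    @property
    def passes(self) -> bool:
        return self.value >= self.minimum


def review(checks: list[ScoreCheck]) -> list[str]:
    """Return the names of scores that fall below their cutoff."""
    return [c.name for c in checks if not c.passes]


if __name__ == "__main__":
    # Made-up applicant: good credit history, but a shaky job outlook.
    applicant = [
        ScoreCheck("FICO", value=720, minimum=620),
        ScoreCheck("Job Security", value=45, minimum=60),
        ScoreCheck("Home Value Stability", value=70, minimum=50),
    ]
    concerns = review(applicant)
    print("Concerns:", concerns or "none")  # Concerns: ['Job Security']
```

In this made-up case the FICO score alone would have approved the loan; the second score is what surfaces the risk.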

Additional Scores for different industries can be seen in Table 3 below.

Table 3: Potential Scores by Industry

Summary
Scores are a very important and actionable concept for business stakeholders who are trying to envision where and how data science can improve their decision-making in support of their key business initiatives. As we saw from the FICO example, scores aid in performance tracking and decision-making by predicting the likelihood of certain actions or outcomes (e.g., likelihood to repay a loan, in the case of the FICO score).

The beauty of a "score" is its ability to integrate a wide range of variables and metrics into a single number, and the power of the "score" is the ability to start small and then constantly look for new metrics and variables that might yield better predictors of performance.

Simple but powerful, exactly what big data and data science should strive to be.

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business”, is responsible for setting the strategy and defining the Big Data service line offerings and capabilities for the EMC Global Services organization. As part of Bill’s CTO charter, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger, and is a frequent speaker on the use of Big Data and advanced analytics to power organizations’ key business initiatives. He also teaches the “Big Data MBA” at the University of San Francisco School of Management.

Bill has nearly three decades of experience in data warehousing, BI and analytics. Bill authored EMC’s Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the Vice President of Advertiser Analytics at Yahoo and the Vice President of Analytic Applications at Business Objects.
