Thinking Like a Data Scientist: Part 3 By @Schmarzo | @BigDataExpo #BigData

How 'scores' can play a critical role in supporting an organization’s key business decisions

Thinking Like a Data Scientist Part 3: The Role of Scores

In New Zealand, they are taking a "Moneyball" approach to optimizing social worker spending and focusing attention where it can be most effective. A recent article in BusinessWeek, "A Moneyball Approach To Helping Troubled Kids" (May 11, 2015), highlights the role that "scores" can play in identifying and prioritizing problem areas, and in deciding what corrective actions to take.

Using data from welfare, education, employment, and the housing agencies and the courts, the government identified the most expensive welfare beneficiaries - kids who have at least one close adult relative who's previously been reported to child safety authorities, been to prison, and spent substantial time on welfare. "There are million-dollar [cost] kids in those families," Minister of Finance Bill English says. "By the time they are 10, their likelihood of incarceration is 70 percent. You've got to do something about that."

...one idea is to rate families, giving them a number [score] that could be used to identify who's most at risk in the same way that lenders rely on credit scores to determine creditworthiness. "The way we may use it, it's going to be like it's a FICO score," says Jennie Feria, Head of Los Angeles' Department of Children and Family Service. The information, she says, could be used both to prioritize cases and to figure out who needs extra services.

In continuing my "Thinking Like a Data Scientist" blog series, we're going to focus on how "scores" can play a critical role in supporting an organization's key business decisions. The power of a score is that it is relatively easy to understand from a business user perspective, and it focuses the data science efforts on identifying and exploring new variables, metrics and relationships that might be better predictors of performance.

Definition of a Score
Let's start by understanding what a score is:

  • A score is a dynamic rating or grade standardized to aid in comparisons, performance tracking and decision-making; scores can help to predict likelihood of certain actions or outcomes
  • Scores are actionable, analytic-based measures that support the decisions your organization is trying to make, and guide the outcomes the organization is trying to predict

A common example of a score is the intelligence quotient or IQ score. An IQ score is derived from several standardized tests in order to create a single number that assesses an individual's "intelligence." The IQ score is standardized at 100 with a standard deviation of 15, which means that about 68% of the population falls within one standard deviation of that mean (between 85 and 115). This standardization makes it easier to compare different candidates or applicants, and to support key business decisions.
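To make the standardization idea concrete, here is a minimal Python sketch that rescales a set of raw test totals to a mean of 100 and a standard deviation of 15, and checks the "about 68% within one standard deviation" figure for a normal distribution. The raw totals and the function names (`standardize`, `share_within`) are hypothetical and exist only for illustration; real IQ tests are normed against large reference populations, not five numbers.

```python
import math
import statistics


def standardize(raw_scores, target_mean=100.0, target_sd=15.0):
    """Rescale raw scores so they have the target mean and standard deviation."""
    mean = statistics.fmean(raw_scores)
    sd = statistics.pstdev(raw_scores)  # population standard deviation
    if sd == 0:
        raise ValueError("raw scores must not all be identical")
    return [target_mean + target_sd * (x - mean) / sd for x in raw_scores]


def share_within(k_sd):
    """Share of a normal distribution within k_sd standard deviations of its mean."""
    return math.erf(k_sd / math.sqrt(2))


raw = [12, 15, 18, 21, 24]  # hypothetical raw test totals
iq_like = standardize(raw)

print([round(s, 1) for s in iq_like])
print(f"{share_within(1):.1%} of a normal population lies between 85 and 115")
```

Once every score is on the same 100/15 scale, two people who took different test batteries can be compared directly, which is the property that makes a score useful for decision-making.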

The true beauty of a "score" is its ability to convert a wide range of variables and metrics, all weighted, valued and correlated differently depending upon what's being measured, into a single number that can be used to guide decision-making. And the true power of the "score" is the ability to start small with some simple analytics, and then constantly fine-tune and expand the score with new metrics, variables and relationships that might yield better predictors of performance.

FICO Score Example
FICO may be the best example of a business score that is used to predict certain behaviors, in this case, the likelihood of a borrower to repay a loan. Fair, Isaac, and Company first introduced the FICO score in 1989. The FICO model uses a wide range of consumer data to create and update these scores.

A person's FICO score can range between 300 and 850. A FICO score above 650 indicates that the individual has a very good credit history while people with scores below 620 will often find it substantially more difficult to obtain financing at a favorable rate (see Figure 1).


Figure 1: http://tightwadtravelers.com/check-fico-credit-score-free/

The FICO score considers a wide range of consumer data to generate the single score for every individual. The data elements that are used in the calculation of an individual's FICO score include[1]:

Payment History: 35 percent of the FICO credit score is based on a borrower's payment history, making the repayment of past debt the most important factor in calculating credit scores. According to FICO, past long-term behavior is used to forecast future long-term behavior. This is a measure of how you handle credit; think credit "behavioral analytics." This particular category encompasses the following metrics and variables:

  • Payment information on various types of accounts, including credit cards, retail accounts, installment loans and mortgages
  • The appearance of any adverse public records, such as bankruptcies, judgments, suits and liens, as well as collection items and delinquencies
  • Length of time for any delinquent payments
  • Amount of money still owed on delinquent accounts or collection items
  • Length of time since any delinquencies, adverse public records or collection items
  • Number of past-due items listed on a credit report
  • Number of accounts being paid as agreed

Credit Utilization: 30 percent of the FICO credit score is based on a borrower's credit utilization; that is, the percentage of available credit that has been borrowed by that individual. The Credit Utilization calculation is comprised of six variables:

  • The amount of debt still owed to lenders
  • The number of accounts with debt outstanding
  • The amount of debt owed on individual accounts
  • The types of loan
  • The percentage of credit lines in use on revolving accounts, like credit cards
  • The percentage of debt still owed on installment loans, like mortgages

Length of credit history: 15 percent of the FICO credit score is based on the length of time each account has been open and the length of time since the account's most recent activity. FICO breaks down "length of credit history" into three variables:

  • Length of time the accounts have been open
  • Length of time specific account types have been open
  • Length of time since those accounts were used

New credit applications: 10 percent of the FICO credit score is based upon borrowers' new credit applications. Within the new credit application category, FICO considers the following variables:

  • Number of accounts that have been opened in the past six to 12 months, as well as the proportion of accounts that are new, by account type
  • Number of recent credit inquiries
  • Length of time since the opening of any new accounts, by account type
  • Length of time since any credit inquiries
  • The re-appearance on a credit report of positive credit information for an account that had earlier payment problems

Credit Mix: 10 percent of the FICO credit score is based on the variety of debt the borrower is repaying, which is a measure of the borrower's ability to handle a wide range of credit, including:

  • Installment loans, including auto loans, student loans and furniture purchases
  • Mortgage loans
  • Bank credit cards
  • Retail credit cards
  • Gas station credit cards
  • Unpaid loans taken on by collection agencies or debt buyers
  • Rental data

The point of showing all of this FICO calculation detail is to reinforce the basic concept (and power) of a score - that a score can take into consideration a wide range of variables, metrics and relationships to create a single, standardized number that can be used to support an organization's key decisions, or in the case of the FICO score, used by lenders to predict a particular loan applicant's ability to repay a loan. That's a very powerful concept. Scores are a critical concept in getting your business stakeholders to contemplate how they might want to integrate different variables and measures to create scores for the key business decisions that they need to make.
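As a rough illustration of how weighted categories can collapse into a single number, the Python sketch below blends the five category weights listed above (35/30/15/10/10 percent) and maps the result onto the 300 to 850 range. This is not FICO's actual model, which is proprietary. The per-category ratings on a 0.0 to 1.0 scale, the simple linear blend, and the names `WEIGHTS` and `composite_score` are all assumptions made for the example.

```python
# Category weights as listed in the article (they sum to 1.0).
WEIGHTS = {
    "payment_history": 0.35,
    "credit_utilization": 0.30,
    "length_of_history": 0.15,
    "new_credit": 0.10,
    "credit_mix": 0.10,
}

SCORE_MIN, SCORE_MAX = 300, 850  # score range cited in the article


def composite_score(sub_scores):
    """Blend per-category ratings (0.0 to 1.0, higher is better) into one number.

    Illustrative linear blend only; the real scoring model is not public.
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise KeyError(f"missing categories: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= sub_scores[name] <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    blended = sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)
    return round(SCORE_MIN + blended * (SCORE_MAX - SCORE_MIN))


# Hypothetical applicant: strong payment history, middling elsewhere.
applicant = {
    "payment_history": 0.9,
    "credit_utilization": 0.6,
    "length_of_history": 0.5,
    "new_credit": 0.8,
    "credit_mix": 0.7,
}

print(composite_score(applicant))  # prints 696
```

Keeping the weights in one dictionary reflects the "start small, then expand" idea: adding a new variable to the score means adding an entry and rebalancing the weights, while everything downstream still consumes a single number.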

Other Industry Score Examples
Scores can be created to support business stakeholder decision-making across a number of different industries. Let's brainstorm just a few, and as my MBA students are going to find out this fall, there are many, many more waiting to be discovered!!

Financial Services

  • Retirement Readiness Score. This would be a score that measures how ready each client or investor is for retirement. This score could include variables such as age, current annual income, current annual expenses, net worth, value of primary home, value of secondary homes, desired retirement age, desired retirement location (Iowa is a lot cheaper than Palo Alto!!), number of dependent children, number of dependent parents, desired retirement lifestyle, etc.
  • Job Security Score. This score would measure the security of each individual's job. This score could include variables such as industry, job type, employer(s), job level/title, job experience, age, education level, skill sets, industry publications and presentations, Klout scores, etc.
  • Home Value Stability Score. This score would measure the stability of the value of a particular house. This score could consider variables such as current value, turnover and house sales history, value of house compared to comparable houses, whether it's a primary residence or rental residence, local price-to-rent ratio, local housing trends (maybe pulled from Zillow), etc.

[1] FICO's 5 factors: The components of a FICO credit score (http://www.creditcards.com/credit-card-news/help/5-parts-components-fico...)

Very Important Note: Combining the Job Security Score and Home Value Stability Score with the FICO score would have provided a more holistic assessment of banks' risk and housing market exposure prior to the 2007 financial market meltdown. For example, the Home Value Stability Score could have provided invaluable insights as banks tried to determine to whom to make home mortgage loans and which markets might be "over-valued".

The key point here is that it is important to have multiple scores, each offering a different perspective on the decision being made, so that together they provide a more holistic assessment of the true conditions surrounding these key business decisions.

Additional Scores for different industries can be seen in Table 3 below.


Table 3: Potential Scores by Industry

Scores are a very important and actionable concept for business stakeholders who are trying to envision where and how data science can improve their decision-making in support of their key business initiatives. As we saw from the FICO example, scores aid in performance tracking and decision-making by predicting likelihood of certain actions or outcomes (e.g., likelihood to repay a loan, in the case of the FICO score).

The beauty of a "score" is its ability to integrate a wide range of variables and metrics into a single number, and the power of the "score" is the ability to start small and then constantly look for new metrics and variables that might yield better predictors of performance.

Simple but powerful, exactly what big data and data science should strive to be.

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business” and “Big Data MBA: Driving Business Strategies with Data Science”, is responsible for setting strategy and defining the Big Data service offerings for Dell EMC’s Big Data Practice.

As a CTO within Dell EMC’s 2,000+ person consulting organization, he works with organizations to identify where and how to start their big data journeys. He’s written white papers, is an avid blogger and is a frequent speaker on the use of Big Data and data science to power an organization’s key business initiatives. He is a University of San Francisco School of Management (SOM) Executive Fellow where he teaches the “Big Data MBA” course. Bill also just completed a research paper on “Determining The Economic Value of Data”. Onalytica recently ranked Bill as #4 Big Data Influencer worldwide.

Bill has over three decades of experience in data warehousing, BI and analytics. Bill authored the Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements. Bill serves on the City of San Jose’s Technology Innovation Board, and on the faculties of The Data Warehouse Institute and Strata.

Previously, Bill was vice president of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Website analytics products, including the delivery of “actionable insights” through a holistic user experience. Before that, Bill oversaw the Analytic Applications business unit at Business Objects, including the development, marketing and sales of their industry-defining analytic applications.

Bill holds a Master of Business Administration from the University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science and Business Administration from Coe College.
