Java IoT Authors: Elizabeth White, Liz McMillan, Yeshim Deniz, Pat Romanski, William Schmarzo

Related Topics: @DXWorldExpo, Java IoT, Industrial IoT, Microsoft Cloud, Machine Learning , @CloudExpo

@DXWorldExpo: Article

Five Big Data Features in SQL Server

Traditional RDBMS and Big Data

Traditional RDBMS & New Data Processing
Over the past two decades relational databases have been most successful in serving large scale OLTP and OLAP applications across enterprises. However, in the past couple of years with the advent of Big Data processing, especially for processing unstructured data coupled with the need for processing massive quantities of data, made the industry to look into Non RDBMS solutions. This has lead into the popularity of NOSQL databases as well as massively parallel processing frameworks.

However the traditional RDBMS were quick to react and added several Big Data features as part of their offering so that the enterprises with a heavy investment of traditional RDBMS can have the best of both worlds by properly leveraging these new features.

The following sections provide ideas about Big Data features in the popular SQL Server databases; a similar analysis will be performed against Oracle also in a later article.

1. Column Store Indexes
Column oriented databases differs from RDBMS like SQL Server such that they store data tables as sections of columns of data rather than rows of data. This has proven to be advantageous in certain situations where aggregates are computed over large number of similar data items.

Columnstore index, a feature in SQL Server 2012, groups and stores data for each column and then joins all the columns to complete the whole index. This differs from traditional indexes that group and store data for each row and then join all the rows to complete the whole index. For some types of queries, the SQL Server query processor can take advantage of the columnstore layout to significantly improve query execution times.

2. Hadoop Connectors
The SQL Server-Hadoop Connector is a Sqoop-based connector that facilitates efficient data transfer between SQL Server and Hadoop. Sqoop supports several databases including HDFS. This connector is bidirectional. You can import as well as export the data.

With SQL Server-Hadoop Connector, you can export data from:

  • delimited text files on HDFS to SQL Server
  • sequenceFiles on HDFS to SQL Server
  • hive Tables to tables in SQL Server

3. Full Text Search
There is no question about Sql Server's ability to process relational data and perform queries using JOINs and other traditional means. However much of the Big Data processing needs of the enterprise goes towards processing unstructured data which is generally in the form of textual data.

Full-Text Search in SQL Server lets users and applications run full-text queries against character-based data in SQL Server tables. Before you can run full-text queries on a table, the database administrator must create a full-text index on the table. The full-text index includes one or more character-based columns in the table.

After columns have been added to a full-text index, users and applications can run full-text queries on the text in the columns. These queries can search for any of the following:

  • One or more specific words or phrases (simple term)
  • A word or a phrase where the words begin with specified text (prefix term)
  • Inflectional forms of a specific word (generation term)
  • A word or phrase close to another word or phrase (proximity term)
  • Synonymous forms of a specific word (thesaurus)
  • Words or phrases using weighted values (weighted term)

4. Windows Azure SQL Federation/ Distributed Partition Views
Though not directly related to SQL Server, Windows Azure SQL databases are just the cloud version of the SQL server and this feature provides massively parallel processing capabilities on the database work load on cloud.

SQL Database with federation is a way to achieve greater scalability and performance from the database tier of your application through horizontal partition. One or more tables within a database with federation are split by row and portioned across multiple databases known as federation members. A federation is defined by a federation distribution scheme, or federation scheme. The federation scheme defines a federation distribution key, which determines the distribution of data to partitions within the federation.

The equivalent feature on the SQL Server database is known as DPV (Distributed Partition View). The primary SQL Server feature that allows transparent scale-out is the distributed partitioned view (DPV), sometimes referred to as federated view. In some cases, DPVs are used to help manage very large databases (VLDBs). Instead of creating and maintaining a multi-terabyte database, several smaller databases within the same instance are created.

5. Map Reduce Integration / Polybase
PolyBase is a breakthrough new technology on the data processing engine in SQL Server 2012 Parallel Data Warehouse designed as the simplest way to combine non-relational data and traditional relational data in your analysis.

Polybase unifies the relational and non-relational worlds at the query level. It provides the following Big Data processing features:

  • Integrated Query: Accepts a standard T-SQL query that joins tables containing a relational source with tables in a Hadoop cluster without needing to learn MapReduce.
  • Advanced query options: Apart from simple SELECT queries, users can perform JOINs and GROUP BYs on data in the Hadoop cluster.
  • Polybase is an exciting new technology and requires a separate coverage, but the above information just provides an introduction.

Traditional high performance RDBMS like SQL Server have their strengths. They are very strong in maintaining the data integrity and quality in the form of constraints, foreign keys and other validation mechanisms. They are also strong in transactional integrity by providing superior locking model, automatic dead lock resolution, etc. However initially they are not found to adjust to Big Data processing needs of enterprises.

With the enhancements in the products made by respective vendors , now databases like Sql Server have been enhanced with big data processing features and makes them the best candidate for enterprises looking for best of the breed features between traditional RDBMS and Big Data processing systems, and to leverage the best of existing investments.

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).

@ThingsExpo Stories
Cloud-enabled transformation has evolved from cost saving measure to business innovation strategy -- one that combines the cloud with cognitive capabilities to drive market disruption. Learn how you can achieve the insight and agility you need to gain a competitive advantage. Industry-acclaimed CTO and cloud expert, Shankar Kalyana presents. Only the most exceptional IBMers are appointed with the rare distinction of IBM Fellow, the highest technical honor in the company. Shankar has also receive...
Enterprises have taken advantage of IoT to achieve important revenue and cost advantages. What is less apparent is how incumbent enterprises operating at scale have, following success with IoT, built analytic, operations management and software development capabilities - ranging from autonomous vehicles to manageable robotics installations. They have embraced these capabilities as if they were Silicon Valley startups.
Poor data quality and analytics drive down business value. In fact, Gartner estimated that the average financial impact of poor data quality on organizations is $9.7 million per year. But bad data is much more than a cost center. By eroding trust in information, analytics and the business decisions based on these, it is a serious impediment to digital transformation.
The standardization of container runtimes and images has sparked the creation of an almost overwhelming number of new open source projects that build on and otherwise work with these specifications. Of course, there's Kubernetes, which orchestrates and manages collections of containers. It was one of the first and best-known examples of projects that make containers truly useful for production use. However, more recently, the container ecosystem has truly exploded. A service mesh like Istio addr...
Predicting the future has never been more challenging - not because of the lack of data but because of the flood of ungoverned and risk laden information. Microsoft states that 2.5 exabytes of data are created every day. Expectations and reliance on data are being pushed to the limits, as demands around hybrid options continue to grow.
As IoT continues to increase momentum, so does the associated risk. Secure Device Lifecycle Management (DLM) is ranked as one of the most important technology areas of IoT. Driving this trend is the realization that secure support for IoT devices provides companies the ability to deliver high-quality, reliable, secure offerings faster, create new revenue streams, and reduce support costs, all while building a competitive advantage in their markets. In this session, we will use customer use cases...
Business professionals no longer wonder if they'll migrate to the cloud; it's now a matter of when. The cloud environment has proved to be a major force in transitioning to an agile business model that enables quick decisions and fast implementation that solidify customer relationships. And when the cloud is combined with the power of cognitive computing, it drives innovation and transformation that achieves astounding competitive advantage.
Digital Transformation: Preparing Cloud & IoT Security for the Age of Artificial Intelligence. As automation and artificial intelligence (AI) power solution development and delivery, many businesses need to build backend cloud capabilities. Well-poised organizations, marketing smart devices with AI and BlockChain capabilities prepare to refine compliance and regulatory capabilities in 2018. Volumes of health, financial, technical and privacy data, along with tightening compliance requirements by...
DevOpsSummit New York 2018, colocated with CloudEXPO | DXWorldEXPO New York 2018 will be held November 11-13, 2018, in New York City. Digital Transformation (DX) is a major focus with the introduction of DXWorldEXPO within the program. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive over the long term. A total of 88% of Fortune 500 companies from a generation ago are now out of bus...
With 10 simultaneous tracks, keynotes, general sessions and targeted breakout classes, @CloudEXPO and DXWorldEXPO are two of the most important technology events of the year. Since its launch over eight years ago, @CloudEXPO and DXWorldEXPO have presented a rock star faculty as well as showcased hundreds of sponsors and exhibitors! In this blog post, we provide 7 tips on how, as part of our world-class faculty, you can deliver one of the most popular sessions at our events. But before reading...
DXWordEXPO New York 2018, colocated with CloudEXPO New York 2018 will be held November 11-13, 2018, in New York City and will bring together Cloud Computing, FinTech and Blockchain, Digital Transformation, Big Data, Internet of Things, DevOps, AI, Machine Learning and WebRTC to one location.
DXWorldEXPO | CloudEXPO are the world's most influential, independent events where Cloud Computing was coined and where technology buyers and vendors meet to experience and discuss the big picture of Digital Transformation and all of the strategies, tactics, and tools they need to realize their goals. Sponsors of DXWorldEXPO | CloudEXPO benefit from unmatched branding, profile building and lead generation opportunities.
DXWorldEXPO LLC announced today that ICOHOLDER named "Media Sponsor" of Miami Blockchain Event by FinTechEXPO. ICOHOLDER give you detailed information and help the community to invest in the trusty projects. Miami Blockchain Event by FinTechEXPO has opened its Call for Papers. The two-day event will present 20 top Blockchain experts. All speaking inquiries which covers the following information can be submitted by email to [email protected] Miami Blockchain Event by FinTechEXPO also offers s...
Dion Hinchcliffe is an internationally recognized digital expert, bestselling book author, frequent keynote speaker, analyst, futurist, and transformation expert based in Washington, DC. He is currently Chief Strategy Officer at the industry-leading digital strategy and online community solutions firm, 7Summits.
Digital Transformation and Disruption, Amazon Style - What You Can Learn. Chris Kocher is a co-founder of Grey Heron, a management and strategic marketing consulting firm. He has 25+ years in both strategic and hands-on operating experience helping executives and investors build revenues and shareholder value. He has consulted with over 130 companies on innovating with new business models, product strategies and monetization. Chris has held management positions at HP and Symantec in addition to ...
The best way to leverage your Cloud Expo presence as a sponsor and exhibitor is to plan your news announcements around our events. The press covering Cloud Expo and @ThingsExpo will have access to these releases and will amplify your news announcements. More than two dozen Cloud companies either set deals at our shows or have announced their mergers and acquisitions at Cloud Expo. Product announcements during our show provide your company with the most reach through our targeted audiences.
The IoT Will Grow: In what might be the most obvious prediction of the decade, the IoT will continue to expand next year, with more and more devices coming online every single day. What isn’t so obvious about this prediction: where that growth will occur. The retail, healthcare, and industrial/supply chain industries will likely see the greatest growth. Forrester Research has predicted the IoT will become “the backbone” of customer value as it continues to grow. It is no surprise that retail is ...
Andrew Keys is Co-Founder of ConsenSys Enterprise. He comes to ConsenSys Enterprise with capital markets, technology and entrepreneurial experience. Previously, he worked for UBS investment bank in equities analysis. Later, he was responsible for the creation and distribution of life settlement products to hedge funds and investment banks. After, he co-founded a revenue cycle management company where he learned about Bitcoin and eventually Ethereal. Andrew's role at ConsenSys Enterprise is a mul...
DXWorldEXPO LLC announced today that "Miami Blockchain Event by FinTechEXPO" has announced that its Call for Papers is now open. The two-day event will present 20 top Blockchain experts. All speaking inquiries which covers the following information can be submitted by email to [email protected] Financial enterprises in New York City, London, Singapore, and other world financial capitals are embracing a new generation of smart, automated FinTech that eliminates many cumbersome, slow, and expe...
Cloud Expo | DXWorld Expo have announced the conference tracks for Cloud Expo 2018. Cloud Expo will be held June 5-7, 2018, at the Javits Center in New York City, and November 6-8, 2018, at the Santa Clara Convention Center, Santa Clara, CA. Digital Transformation (DX) is a major focus with the introduction of DX Expo within the program. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive ov...