Click here to close now.

Welcome!

Java Authors: Liz McMillan, Elizabeth White, Pat Romanski, Jason Bloomberg, Esmeralda Swartz

Related Topics: Big Data Journal, Java, Linux, Web 2.0, Cloud Expo, Security

Big Data Journal: Blog Post

In-Memory Database vs. In-Memory Data Grid By @GridGain | @CloudExpo [#BigData]

It's easy to start with technical differences between the two categories

A few months ago, I spoke at the conference where I explained the difference between caching and an in-memory data grid. Today, having realized that many people are also looking to better understand the difference between two major categories in in-memory computing: In-Memory Database and In-Memory Data Grid, I am sharing the succinct version of my thinking on this topic - thanks to a recent analyst call that helped to put everything in place :)

TL;DR

Skip to conclusion to get the bottom line.

Nomenclature
Let's clarify the naming and buzzwords first. In-Memory Database (IMDB) is a well-established category name and it is typically used unambiguously.

It is important to note that there is a new crop of traditional databases with serious In-Memory "options". That includes MS SQL 2014, Oracle's Exalytics and Exadata, and IBM DB2 with BLU offerings. The line is blurry between these and the new pure In-Memory Databases, and for the simplicity I'll continue to call them In-Memory Databases.

In-Memory Data Grids (IMDGs) are sometimes (but not very frequently) called In-Memory NoSQL/NewSQL Databases. Although the latter can be more accurate in some case - I am going to use the In-Memory Data Grid term in this article, as it tends to be the more widely used term.

Note that there are also In-Memory Compute Grids and In-Memory Computing Platforms that include or augment many of the features of In-Memory Data Grids and In-Memory Databases.

Confusing, eh? It is... and for consistency - going forward we'll just use these terms for the two main categories:

  • In-Memory Database
  • In-Memory Data Grid

Tiered Storage
It is also important to nail down what we mean by "In-Memory". Surprisingly - there's a lot of confusion here as well as some vendors refer to SSDs, Flash-on-PCI, Memory Channel Storage, and, of course, DRAM as "In-Memory".

In reality, most vendors support a Tiered Storage Model where some portion of the data is stored in DRAM (the fastest storage but with limited capacity) and then it gets overflown to a verity of flash or disk devices (slower but with more capacity) - so it is rarely a DRAM-only or Flash-only product. However, it's important to note that most products in both categories are often biased towards mostly DRAM or mostly flash/disk storage in their architecture.

Bottom line is that products vary greatly in what they mean by "In-Memory" but in the end they all have a significant "In-Memory" component.

Technical Differences
It's easy to start with technical differences between the two categories.

Most In-Memory Databases are your father's RDBMS that store data "in memory" instead of disk. That's practically all there's to it. They provide good SQL support with only a modest list of unsupported SQL features, shipped with ODBC/JDBC drivers and can be used in place of existing RDBMS often without significant changes.

In-Memory Data Grids typically lack full ANSI SQL support but instead provide MPP-based (Massively Parallel Processing) capabilities where data is spread across large cluster of commodity servers and processed in explicitly parallel fashion. The main access pattern is key/value access, MapReduce, various forms of HPC-like processing, and a limited distributed SQL querying and indexing capabilities.

It is important to note that there is a significant crossover from In-Memory Data Grids to In-Memory Databases in terms of SQL support. GridGain, for example, provides pretty serious and constantly growing support for SQL including pluggable indexing, distributed joins optimization, custom SQL functions, etc.

Speed Only vs. Speed + Scalability
One of the crucial differences between In-Memory Data Grids and In-Memory Databases lies in the ability to scale to hundreds and thousands of servers. That is the In-Memory Data Grid's inherent capability for such scale due to their MPP architecture, and the In-Memory Database's explicit inability to scale due to fact that SQL joins, in general, cannot be efficiently performed in a distribution context.

It's one of the dirty secrets of In-Memory Databases: one of their most useful features, SQL joins, is also is their Achilles heel when it comes to scalability. This is the fundamental reason why most existing SQL databases (disk or memory based) are based on vertically scalable SMP (Symmetrical Processing) architecture unlike In-Memory Data Grids that utilize the much more horizontally scalable MPP approach.

It's important to note that both In-Memory Data Grids and In-Memory Database can achieve similar speed in a local non-distributed context. In the end - they both do all processing in memory.

But only In-Memory Data Grids can natively scale to hundreds and thousands of nodes providing unprecedented scalability and unrivaled throughput.

Replace Database vs. Change Application
Apart from scalability, there is another difference that is important for uses cases where In-Memory Data Grids or In-Memory Database are tasked with speeding up existing systems or applications.

An In-Memory Data Grid always works with an existing database providing a layer of massively distributed in-memory storage and processing between the database and the application. Applications then rely on this layer for super-fast data access and processing. Most In-Memory Data Grids can seamlessly read-through and write-through from and to databases, when necessary, and generally are highly integrated with existing databases.

In exchange - developers need to make some changes to the application to take advantage of these new capabilities. The application no longer "talks" SQL only, but needs to learn how to use MPP, MapReduce or other techniques of data processing.

In-Memory Databases provide almost a mirror opposite picture: they often requirereplacing your existing database (unless you use one of those In-Memory "options" to temporary boost your database performance) - but will demand significantly less changes to the application itself as it will continue to rely on SQL (albeit a modified dialect of it).

In the end, both approaches have their advantages and disadvantages, and they may often depend in part on organizational policies and politics as much as on their technical merits.

Conclusion
The bottom line should be pretty clear by now.

If you are developing a green-field, brand new system or application the choice is pretty clear in favor of In-Memory Data Grids. You get the best of the two worlds: you get to work with the existing databases in your organization where necessary, and enjoy tremendous performance and scalability benefits of In-Memory Data Grids - both of which are highly integrated.

If you are, however, modernizing your existing enterprise system or application the choice comes down to this:

You will want to use an In-Memory Database if the following applies to you:

  • You can replace or upgrade your existing disk-based RDBMS
  • You cannot make changes to your applications
  • You care about speed, but don't care as much about scalability

In other words - you boost your application's speed by replacing or upgrading RDBMS without significantly touching the application itself.

On the other hand, you want to use an In-Memory Data Grid if the following applies to you:

  • You cannot replace your existing disk-based RDBMS
  • You can make changes to (the data access subsystem of) your application
  • You care about speed and especially about scalability, and don't want to trade one for the other

In other words - with an In-Memory Data Grid you can boost your application's speed and provide massive scale by tweaking the application, but without making changes to your existing database.

It can be summarized it in the following table:


In-Memory Data GridIn-Memory Database
Existing Application Changed Unchanged
Existing RDBMS Unchanged Changed or Replaced
Speed Yes Yes
Max. Scalability Yes No

More Stories By Nikita Ivanov

Nikita Ivanov is founder and CEO of GridGain Systems, started in 2007 and funded by RTP Ventures and Almaz Capital. Nikita has led GridGain to develop advanced and distributed in-memory data processing technologies – the top Java in-memory computing platform starting every 10 seconds around the world today.

Nikita has over 20 years of experience in software application development, building HPC and middleware platforms, contributing to the efforts of other startups and notable companies including Adaptec, Visa and BEA Systems. Nikita was one of the pioneers in using Java technology for server side middleware development while working for one of Europe’s largest system integrators in 1996.

He is an active member of Java middleware community, contributor to the Java specification, and holds a Master’s degree in Electro Mechanics from Baltic State Technical University, Saint Petersburg, Russia.

@ThingsExpo Stories
Dave will share his insights on how Internet of Things for Enterprises are transforming and making more productive and efficient operations and maintenance (O&M) procedures in the cleantech industry and beyond. Speaker Bio: Dave Landa is chief operating officer of Cybozu Corp (kintone US). Based in the San Francisco Bay Area, Dave has been on the forefront of the Cloud revolution driving strategic business development on the executive teams of multiple leading Software as a Services (SaaS) application providers dating back to 2004. Cybozu's kintone.com is a leading global BYOA (Build Your O...
Health care systems across the globe are under enormous strain, as facilities reach capacity and costs continue to rise. M2M and the Internet of Things have the potential to transform the industry through connected health solutions that can make care more efficient while reducing costs. In fact, Vodafone's annual M2M Barometer Report forecasts M2M applications rising to 57 percent in health care and life sciences by 2016. Lively is one of Vodafone's health care partners, whose solutions enable older adults to live independent lives while staying connected to loved ones. M2M will continue to gr...
With IoT exploding, massive data will transform businesses with opportunities to monetize almost anything that can be measured. In this C-Level Roundtable Discussion at @ThingsExpo, Brendan O’Brien, Aria Systems Co-founder and Chief Evangelist, will lead an expert panel of consultants, thought leaders and practitioners who will look at these new monetization trends, discuss the implications, and detail lessons learned from their collective experience. Finally, the panel will point the way forward for enterprises who wish to leverage the resulting complex recurring revenue models, adding valu...
How is unified communications transforming the way businesses operate? In his session at WebRTC Summit, Arvind Rangarajan, Director of Product Marketing at BroadSoft, will discuss how to extend unified communications experience outside the enterprise through WebRTC. He will also review use cases across different industry verticals. Arvind Rangarajan is Director, Product Marketing at BroadSoft. He has over 19 years of experience in the telecommunications industry in various roles such as Software Development, Product Management and Product Marketing, applied across Wireless, Unified Communic...
SYS-CON Events announced today that Vicom Computer Services, Inc., a provider of technology and service solutions, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. They are located at booth #427. Vicom Computer Services, Inc. is a progressive leader in the technology industry for over 30 years. Headquartered in the NY Metropolitan area. Vicom provides products and services based on today’s requirements around Unified Networks, Cloud Computing strategies, Virtualization around Software defined Data Ce...
What exactly is a cognitive application? In her session at 16th Cloud Expo, Ashley Hathaway, Product Manager at IBM Watson, will look at the services being offered by the IBM Watson Developer Cloud and what that means for developers and Big Data. She'll explore how IBM Watson and its partnerships will continue to grow and help define what it means to be a cognitive service, as well as take a look at the offerings on Bluemix. She will also check out how Watson and the Alchemy API team up to offer disruptive APIs to developers.
The IoT Bootcamp is coming to Cloud Expo | @ThingsExpo on June 9-10 at the Javits Center in New York. Instructor. Registration is now available at http://iotbootcamp.sys-con.com/ Instructor Janakiram MSV previously taught the famously successful Multi-Cloud Bootcamp at Cloud Expo | @ThingsExpo in November in Santa Clara. Now he is expanding the focus to Janakiram is the founder and CTO of Get Cloud Ready Consulting, a niche Cloud Migration and Cloud Operations firm that recently got acquired by Aditi Technologies. He is a Microsoft Regional Director for Hyderabad, India, and one of the f...
The 17th International Cloud Expo has announced that its Call for Papers is open. 17th International Cloud Expo, to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, APM, APIs, Microservices, Security, Big Data, Internet of Things, DevOps and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal today!
SYS-CON Events announced today that SoftLayer, an IBM company, has been named “Gold Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place June 9-11, 2015 at the Javits Center in New York City, NY, and the 17th International Cloud Expo®, which will take place November 3–5, 2015 at the Santa Clara Convention Center in Santa Clara, CA. SoftLayer operates a global cloud infrastructure platform built for Internet scale. With a global footprint of data centers and network points of presence, SoftLayer provides infrastructure as a service to leading-edge customers ranging from ...
SYS-CON Events announced today that Cisco, the worldwide leader in IT that transforms how people connect, communicate and collaborate, has been named “Gold Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Cisco makes amazing things happen by connecting the unconnected. Cisco has shaped the future of the Internet by becoming the worldwide leader in transforming how people connect, communicate and collaborate. Cisco and our partners are building the platform for the Internet of Everything by connecting the...
SYS-CON Events announced today that Liaison Technologies, a leading provider of data management and integration cloud services and solutions, has been named "Silver Sponsor" of SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York, NY. Liaison Technologies is a recognized market leader in providing cloud-enabled data integration and data management solutions to break down complex information barriers, enabling enterprises to make smarter decisions, faster.
SYS-CON Events announced today that Ciqada will exhibit at SYS-CON's @ThingsExpo, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Ciqada™ makes it easy to connect your products to the Internet. By integrating key components - hardware, servers, dashboards, and mobile apps - into an easy-to-use, configurable system, your products can quickly and securely join the internet of things. With remote monitoring, control, and alert messaging capability, you will meet your customers' needs of tomorrow - today! Ciqada. Let your products take flight. For more inform...
SYS-CON Events announced today that Windstream, a leading provider of advanced network and cloud communications, has been named “Silver Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. Windstream (Nasdaq: WIN), a FORTUNE 500 and S&P 500 company, is a leading provider of advanced network communications, including cloud computing and managed services, to businesses nationwide. The company also offers broadband, phone and digital TV services to consumers primarily in rural areas.
SYS-CON Events announced today that ProfitBricks, the provider of painless cloud infrastructure, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY., and the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. ProfitBricks is the IaaS provider that offers a painless cloud experience for all IT users, with no learning curve. ProfitBricks boasts flexible cloud servers and networking, an integrated Data Center Designer tool f...
SYS-CON Events announced today that GENBAND, a leading developer of real time communications software solutions, has been named “Silver Sponsor” of SYS-CON's WebRTC Summit, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. The GENBAND team will be on hand to demonstrate their newest product, Kandy. Kandy is a communications Platform-as-a-Service (PaaS) that enables companies to seamlessly integrate more human communications into their Web and mobile applications - creating more engaging experiences for their customers and boosting collaboration and productiv...
SYS-CON Events announced today that Dyn, the worldwide leader in Internet Performance, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Dyn is a cloud-based Internet Performance company. Dyn helps companies monitor, control, and optimize online infrastructure for an exceptional end-user experience. Through a world-class network and unrivaled, objective intelligence into Internet conditions, Dyn ensures traffic gets delivered faster, safer, and more reliably than ever.
SYS-CON Events announced today that Open Data Centers (ODC), a carrier-neutral colocation provider, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place June 9-11, 2015, at the Javits Center in New York City, NY. Open Data Centers is a carrier-neutral data center operator in New Jersey and New York City offering alternative connectivity options for carriers, service providers and enterprise customers.
SYS-CON Events announced today that BroadSoft, the leading global provider of Unified Communications and Collaboration (UCC) services to operators worldwide, has been named “Gold Sponsor” of SYS-CON's WebRTC Summit, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. BroadSoft is the leading provider of software and services that enable mobile, fixed-line and cable service providers to offer Unified Communications over their Internet Protocol networks. The Company’s core communications platform enables the delivery of a range of enterprise and consumer calling...
SYS-CON Events announced today that On the Avenue Marketing Group, a sales and marketing firm that utilizes events to market and sell products to consumers, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. On the Avenue Marketing Group (OTA) is a sales and marketing firm that utilizes events to market and sell products to consumers. On behalf of our clients, we attend thousands of fairs, festivals, expos, concerts, conferences, and sporting events annually, helping them reach millions of individuals ...
SYS-CON Events announced today that ActiveState, the leading independent Cloud Foundry and Docker-based PaaS provider, has been named “Silver Sponsor” of SYS-CON's DevOps Summit New York, which will take place June 9-11, 2015, at the Javits Center in New York City, NY. ActiveState believes that enterprises gain a competitive advantage when they are able to quickly create, deploy and efficiently manage software solutions that immediately create business value, but they face many challenges that prevent them from doing so. The Company is uniquely positioned to help address these challenges thro...