Welcome!

CRM Authors: Xenia von Wedel, Ian Khan, PR.com Newswire, Steve Mordue

Related Topics: CRM

CRM: Blog Feed Post

Data Interaction Patterns

An Analysis

Throughout my experience with working on back-end systems for anything from big governmental to online gaming, I have came to develop a particular appreciation of the interactions that happen between data consumers and data producers. The following is a non-exhaustive and non-authoritative review of the different data interaction patterns that I've came up to play with. These are mostly unstructured notes from my experience in the field that I hope may turn useful to others.

 

As you know, when data is involved caching comes into play when performance and scalability are sought. In the coming diagrams, cache is represented as a vertical rectangle. The persistent storage is represented as a vertical blue cylinder, while horizontal cylinders represent some form of reliable and asynchronous message delivery channels. The data interactions are represented with curvy arrows: they can represent reading or writing.

 

 

Direct [R/W]

 

 

Besides the obvious drawbacks coming from the temporal coupling with the persistent storage mechanism, the interesting thing to note in such a trivial data access pattern is that there is often some form of request-scoped caching happening without the need to explicitly do anything. This first level of cache you get from data access layers help in optimizing operations provided they occur in the same request (to which is bound the transaction, if one exists).

 

Being short lived, this kind of caching is free from the problem of expired cache entries eviction: it can kick in transparently without the application being aware of it.

 

Through Cache [R/W]

 

 

Reading through cache is a simple and powerful mechanism where an application tries first to read from a long lived cache (a very cheap operation) and, if the requested data can't be found, proceeds with a read in the persistent storage (a way more expensive operation).

 

It's interesting to note that write operations don't necessarily happen the same way, ie. it is well possible that a write to the persistent storage doesn't perform a similar write in the cache. Why is that? Cached data is often a specific representation of the data available in the storage: it can be for example an aggregation of different data points that correspond to a particular cache key. The same persistent data can lead to the creation of several different cache entries. In the case, a write can simply lead to an immediate cache flush, waiting for subsequent read operations to repopulate these entries with new data.

 

Conversely, it's possible to have write operations update the cache, which opens the interesting problem of consistency. In the current scenario, the persistent storage remains the absolute truth of consistency: the application must handle the case when the cache was inconsistent and led to an invalid data operation in the persistent storage. I've found that localized cache evictions work well: the system goes through a little hiccup but quickly restores its data sanity.

 

Though some data access technologies allow the automatic management of this kind of second level of caching, I personally prefer that my applications have an explicit interaction with the caching technology they use, and this at the service layer. This is especially true when considering distributed caching and the need to address the inherent idiosyncrasies of such a caching model.

Cache distribution or clustering is not compulsory though: you can reap the benefits of reading through cache with localized caches but at the expense of needing to establish some form of stickiness between the data consumers and the providers (for example, by keeping a user sticky to a particular server based on its IP or session ID).

This said, stickiness skews load balancing and doesn't play well when you alter a pool of servers: I've really became convinced that you get better applications by preventing stickiness and letting requests hit any server. In that case, cache distribution or clustering becomes necessary: the former presents some challenges (like getting stale data after a repartition of the caching continuum) but scales better than the latter.

 

Write Behind [W]

 

 

Writing behind consists in updating the data cache synchronously and then defer the writing to the persistent storage to an asynchronous process, through a reliable messaging channel.

 

This is possible with regular caching technologies if there is no strong integrity constraints or if it's acceptable to present temporarily wrong data to the data consumer. In case the application has strong integrity constraints, the caching technology must be able to become the primary source of integrity truth: consistent distributed cached that supports some form of transactional data manipulation becomes necessary.

 

In this scenario, the persistent storage doesn't enforce any form of data constraint, mostly because it is too hard to propagate violation issues back to the upstream layers in any meaningful form. One could wonder what is the point of using such a persistent storage if it is dumbed down to such a mundane role: if this storage is an RDBMS, there is still value in writing to it because external systems like a back-office or business intelligence tools often require to access a standard data store.

 

Cache Push [R]

 

 

 

Pushing to cache is very useful for data whose lifecycle is not related to the interactions with its consumers. This is valid for feeds or the result of expensive computations not triggered by client requests.

 

The mechanism that pushes to cache can be something like a scheduled task or a process consuming asynchronous message channels.

 

Future Read [R]

 

 

In this scenario, the data producers synchronously answers the consumers with the promise of the future delivery of the requested data. When available, this data is delivered to the client via some sort of server push mechanism (see next section).

 

This approach works very well for expensive computations triggered by client requests.

 

Server Push [R]

 

 

Server push can be used to complement any of the previous interactions: in that case, a process prepares some data and delivers it directly to the consumer. There are many well known technological approaches for this, including HTTP long-polling, AJAX/CometD, web sockets or AMQP. Enabling server push in an application opens the door to very interesting data interactions as it allows to decouple the activities of the data consumers and producers.

 

 

Read the original blog entry...

More Stories By David Dossot

David Dossot has worked as a software engineer and architect for more than 14 years. He is a co-author of Mule in Action and is the project despot of the JCR Transport and a member of the Mule Community Committee. He is the project lead of NxBRE, an open source business rules engine for the .NET platform (selected for O'Reilly's Windows Developer Power Tools). He is also a judge for the Jolt Product Excellence Awards and has written several articles for SD Magazine. He holds a Production Systems Engineering Diploma from ESSTIN.

@ThingsExpo Stories
The 21st International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Machine Learning and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding busin...
SYS-CON Events announced today that Hitachi Data Systems, a wholly owned subsidiary of Hitachi LTD., will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City. Hitachi Data Systems (HDS) will be featuring the Hitachi Content Platform (HCP) portfolio. This is the industry’s only offering that allows organizations to bring together object storage, file sync and share, cloud storage gateways, and sophisticated search and...
SYS-CON Events announced today that delaPlex will exhibit at SYS-CON's @CloudExpo, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. delaPlex pioneered Software Development as a Service (SDaaS), which provides scalable resources to build, test, and deploy software. It’s a fast and more reliable way to develop a new product or expand your in-house team.
SYS-CON Events announced today that Progress, a global leader in application development, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Enterprises today are rapidly adopting the cloud, while continuing to retain business-critical/sensitive data inside the firewall. This is creating two separate data silos – one inside the firewall and the other outside the firewall. Cloud ISVs oft...
SYS-CON Events announced today that Peak 10, Inc., a national IT infrastructure and cloud services provider, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Peak 10 provides reliable, tailored data center and network services, cloud and managed services. Its solutions are designed to scale and adapt to customers’ changing business needs, enabling them to lower costs, improve performance and focus intern...
SYS-CON Events announced today that Cloud Academy will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Cloud Academy is the industry’s most innovative, vendor-neutral cloud technology training platform. Cloud Academy provides continuous learning solutions for individuals and enterprise teams for Amazon Web Services, Microsoft Azure, Google Cloud Platform, and the most popular cloud computing technologies. Ge...
Existing Big Data solutions are mainly focused on the discovery and analysis of data. The solutions are scalable and highly available but tedious when swapping in and swapping out occurs in disarray and thrashing takes place. The resolution for thrashing through machine learning algorithms and support nomenclature is through simple techniques. Organizations that have been collecting large customer data are increasingly seeing the need to use the data for swapping in and out and thrashing occurs ...
SYS-CON Events announced today that Interoute has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Interoute is the owner operator of Europe's largest network and a global cloud services platform, which encompasses over 70,000 km of lit fiber, 15 data centers, 17 virtual data centers and 33 colocation centers, with connections to 195 additional partner data centers. Our full-service Unifie...
SYS-CON Events announced today that Fusion, a leading provider of cloud services, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Fusion, a leading provider of integrated cloud solutions to small, medium and large businesses, is the industry’s single source for the cloud. Fusion’s advanced, proprietary cloud service platform enables the integration of leading edge solutions in the cloud, including cloud...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
SYS-CON Events announced today that WineSOFT will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Based in Seoul and Irvine, WineSOFT is an innovative software house focusing on internet infrastructure solutions. The venture started as a bootstrap start-up in 2010 by focusing on making the internet faster and more powerful. WineSOFT’s knowledge is based on the expertise of TCP/IP, VPN, SSL, peer-to-peer, mob...
SYS-CON Events announced today that DivvyCloud will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. DivvyCloud software enables organizations to achieve their cloud computing goals by simplifying and automating security, compliance and cost optimization of public and private cloud infrastructure. Using DivvyCloud, customers can leverage programmatic Bots to identify and remediate common cloud problems in rea...
Detecting internal user threats in the Big Data eco-system is challenging and cumbersome. Many organizations monitor internal usage of the Big Data eco-system using a set of alerts. This is not a scalable process given the increase in the number of alerts with the accelerating growth in data volume and user base. Organizations are increasingly leveraging machine learning to monitor only those data elements that are sensitive and critical, autonomously establish monitoring policies, and to detect...
SYS-CON Events announced today that HTBase will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. HTBase (Gartner 2016 Cool Vendor) delivers a Composable IT infrastructure solution architected for agility and increased efficiency. It turns compute, storage, and fabric into fluid pools of resources that are easily composed and re-composed to meet each application’s needs. With HTBase, companies can quickly prov...
SYS-CON Events announced today that Linux Academy, the foremost online Linux and cloud training platform and community, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Linux Academy was founded on the belief that providing high-quality, in-depth training should be available at an affordable price. Industry leaders in quality training, provided services, and student certification passes, its goal is to c...
SYS-CON Events announced today that Systena America will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Systena Group has been in business for various software development and verification in Japan, US, ASEAN, and China by utilizing the knowledge we gained from all types of device development for various industries including smartphones (Android/iOS), wireless communication, security technology and IoT serv...
SYS-CON Events announced today that CollabNet, a global leader in enterprise software development, release automation and DevOps solutions, will be a Bronze Sponsor of SYS-CON's 20th International Cloud Expo®, taking place from June 6-8, 2017, at the Javits Center in New York City, NY. CollabNet offers a broad range of solutions with the mission of helping modern organizations deliver quality software at speed. The company’s latest innovation, the DevOps Lifecycle Manager (DLM), supports Value S...
SYS-CON Events announced today that Infranics will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Since 2000, Infranics has developed SysMaster Suite, which is required for the stable and efficient management of ICT infrastructure. The ICT management solution developed and provided by Infranics continues to add intelligence to the ICT infrastructure through the IMC (Infra Management Cycle) based on mathemat...
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
SYS-CON Events announced today that Loom Systems will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Founded in 2015, Loom Systems delivers an advanced AI solution to predict and prevent problems in the digital business. Loom stands alone in the industry as an AI analysis platform requiring no prior math knowledge from operators, leveraging the existing staff to succeed in the digital era. With offices in S...