By David Dossot
May 24, 2010 07:26 PM EDT
Throughout my experience working on back-end systems for anything from big government to online gaming, I have come to develop a particular appreciation of the interactions that happen between data consumers and data producers. The following is a non-exhaustive and non-authoritative review of the different data interaction patterns that I've come to play with. These are mostly unstructured notes from my experience in the field that I hope may prove useful to others.
As you know, wherever data is involved, caching comes into play as soon as performance and scalability are sought. In the coming diagrams, cache is represented as a vertical rectangle. The persistent storage is represented as a vertical blue cylinder, while horizontal cylinders represent some form of reliable and asynchronous message delivery channel. The data interactions are represented with curvy arrows: they can represent reading or writing.
Besides the obvious drawbacks coming from the temporal coupling with the persistent storage mechanism, the interesting thing to note in such a trivial data access pattern is that there is often some form of request-scoped caching happening without the need to explicitly do anything. This first level of cache you get from data access layers helps optimize operations, provided they occur in the same request (to which the transaction, if one exists, is bound).
Being short-lived, this kind of caching is free from the problem of evicting expired cache entries: it can kick in transparently without the application being aware of it.
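To make the idea of request-scoped caching concrete, here is a minimal sketch of what a data access layer may do behind the scenes. The `RequestScope` class and the `load_user` function are hypothetical names used for illustration only; real data access layers implement this in their own way.

```python
class RequestScope:
    """Hypothetical per-request cache sitting in front of a loader function."""

    def __init__(self, load):
        self._load = load   # function that hits the persistent storage
        self._seen = {}     # lives only as long as the request does

    def get(self, key):
        # Only the first lookup of a key in this request reaches the storage.
        if key not in self._seen:
            self._seen[key] = self._load(key)
        return self._seen[key]


storage_hits = []


def load_user(user_id):
    # Stand-in for a real query against the persistent storage.
    storage_hits.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


scope = RequestScope(load_user)   # created when the request starts
first = scope.get(42)             # reads from the storage
second = scope.get(42)            # served from the request-scoped cache
```

Because the scope is discarded with the request, nothing ever needs to be evicted from it.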
Through Cache [R/W]
Reading through cache is a simple and powerful mechanism where an application first tries to read from a long-lived cache (a very cheap operation) and, if the requested data can't be found, proceeds with a read from the persistent storage (a far more expensive operation).
It's interesting to note that write operations don't necessarily happen the same way, i.e. it is quite possible that a write to the persistent storage doesn't perform a similar write in the cache. Why is that? Cached data is often a specific representation of the data available in the storage: it can be, for example, an aggregation of different data points that correspond to a particular cache key. The same persistent data can lead to the creation of several different cache entries. In that case, a write can simply lead to an immediate cache flush, waiting for subsequent read operations to repopulate these entries with new data.
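The read-through and flush-on-write behavior described above can be sketched as follows. This is only an illustration under simple assumptions: the `OrderService` class, its method names and the `total:<customer>` cache key are hypothetical, and plain dictionaries stand in for both the long-lived cache and the persistent storage.

```python
class OrderService:
    """Hypothetical service reading through a cache of aggregated values."""

    def __init__(self):
        self.storage = {}       # stand-in for storage: (customer, order_id) -> amount
        self.cache = {}         # long-lived cache: cache key -> aggregated value
        self.storage_reads = 0  # counts the expensive reads, for illustration

    def total_for(self, customer):
        key = f"total:{customer}"
        if key in self.cache:              # cheap path: cache hit
            return self.cache[key]
        self.storage_reads += 1            # expensive path: aggregate from storage
        total = sum(amount for (cust, _), amount in self.storage.items()
                    if cust == customer)
        self.cache[key] = total            # repopulate for subsequent reads
        return total

    def record_order(self, customer, order_id, amount):
        self.storage[(customer, order_id)] = amount
        # The write does not update the aggregate: it flushes the entry instead.
        self.cache.pop(f"total:{customer}", None)


svc = OrderService()
svc.record_order("alice", 1, 10)
t1 = svc.total_for("alice")   # miss: computed from the storage
t2 = svc.total_for("alice")   # hit: served from the cache
svc.record_order("alice", 2, 5)
t3 = svc.total_for("alice")   # miss again: the write flushed the entry
```

The write path never needs to know how the aggregate is computed; it only needs to know which cache entries its data feeds into.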
Conversely, it's possible to have write operations update the cache, which opens the interesting problem of consistency. In the current scenario, the persistent storage remains the absolute truth of consistency: the application must handle the case when the cache was inconsistent and led to an invalid data operation in the persistent storage. I've found that localized cache evictions work well: the system goes through a little hiccup but quickly restores its data sanity.
Though some data access technologies allow the automatic management of this kind of second level of caching, I personally prefer that my applications interact explicitly with the caching technology they use, and that they do so at the service layer. This is especially true when considering distributed caching and the need to address the inherent idiosyncrasies of such a caching model.
Cache distribution or clustering is not compulsory though: you can reap the benefits of reading through cache with localized caches but at the expense of needing to establish some form of stickiness between the data consumers and the providers (for example, by keeping a user sticky to a particular server based on its IP or session ID).
That said, stickiness skews load balancing and doesn't play well when you alter a pool of servers: I've become convinced that you get better applications by preventing stickiness and letting requests hit any server. In that case, cache distribution or clustering becomes necessary: the former presents some challenges (like getting stale data after a repartition of the caching continuum) but scales better than the latter.
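As an illustration of what a repartition of the continuum means, here is a minimal sketch of a hash ring, assuming the distributed cache assigns keys to nodes with some form of consistent hashing. The `Continuum` class, the node names and the number of points per node are all hypothetical; actual cache clients have their own implementations.

```python
import bisect
import hashlib


def _point(value):
    # Map a string to a position on the continuum.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Continuum:
    """Hypothetical hash ring assigning cache keys to cache nodes."""

    def __init__(self, nodes, replicas=100):
        self._replicas = replicas
        self._points = []   # sorted positions on the ring
        self._owners = {}   # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns several points to spread the keys evenly.
        for i in range(self._replicas):
            p = _point(f"{node}#{i}")
            bisect.insort(self._points, p)
            self._owners[p] = node

    def node_for(self, key):
        # The owner is the first point found clockwise from the key position.
        i = bisect.bisect(self._points, _point(key)) % len(self._points)
        return self._owners[self._points[i]]


keys = [f"user:{n}" for n in range(1000)]
ring = Continuum(["cache-a", "cache-b", "cache-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add("cache-d")   # the pool of servers is altered
moved = [k for k in keys if ring.node_for(k) != before[k]]
```

Only a fraction of the keys move, and they all move to the new node. The previous owners still hold their old copies of these entries though, which is one way stale data can resurface if the pool later changes again.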
Write Behind [W]
Writing behind consists of updating the data cache synchronously and then deferring the write to the persistent storage to an asynchronous process, through a reliable messaging channel.
This is possible with regular caching technologies if there are no strong integrity constraints or if it's acceptable to present temporarily wrong data to the data consumer. If the application has strong integrity constraints, the caching technology must be able to become the primary source of integrity truth: a consistent distributed cache that supports some form of transactional data manipulation becomes necessary.
In this scenario, the persistent storage doesn't enforce any form of data constraint, mostly because it is too hard to propagate violation issues back to the upstream layers in any meaningful form. One could wonder what the point is of using such a persistent storage if it is dumbed down to such a mundane role: if this storage is an RDBMS, there is still value in writing to it because external systems like a back-office or business intelligence tools often require access to a standard data store.
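The write-behind flow can be sketched as below. This is a simplified illustration, not a production design: `WriteBehindStore` is a hypothetical name, dictionaries stand in for the cache and the persistent storage, and an in-memory `queue.Queue` stands in for the messaging channel even though it is not reliable (its content is lost if the process dies).

```python
import queue
import threading


class WriteBehindStore:
    """Hypothetical store that writes to cache now and to storage later."""

    def __init__(self):
        self.cache = {}
        self.storage = {}               # stand-in for the persistent storage
        self._pending = queue.Queue()   # stand-in for a reliable channel
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, key, value):
        self.cache[key] = value           # synchronous: visible immediately
        self._pending.put((key, value))   # deferred: persisted asynchronously

    def read(self, key):
        return self.cache[key]

    def _drain(self):
        # Asynchronous process consuming the channel and persisting the writes.
        while True:
            key, value = self._pending.get()
            self.storage[key] = value
            self._pending.task_done()

    def flush(self):
        # Block until every pending write has reached the storage.
        self._pending.join()


store = WriteBehindStore()
store.write("score:7", 1200)
seen_by_consumer = store.read("score:7")   # available without waiting for storage
store.flush()
persisted = store.storage["score:7"]
```

Note that the storage side performs no validation here, which matches the scenario above: any integrity check has to happen before the cache is updated.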
Cache Push [R]
Pushing to cache is very useful for data whose lifecycle is not related to the interactions with its consumers. This is valid for feeds or the result of expensive computations not triggered by client requests.
The mechanism that pushes to cache can be something like a scheduled task or a process consuming asynchronous message channels.
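Here is a minimal sketch of a scheduled task pushing to cache. The `compute_leaderboard` function, its data and the `run_every` helper are hypothetical and only meant to show the shape of the interaction; a real application would more likely rely on its platform's scheduler.

```python
import threading

cache = {}   # stand-in for the cache read by the data consumers


def compute_leaderboard():
    # Stand-in for an expensive computation not triggered by client requests.
    scores = {"ann": 310, "bob": 275, "cy": 420}
    return sorted(scores, key=scores.get, reverse=True)


def push_leaderboard():
    cache["leaderboard"] = compute_leaderboard()


def run_every(interval_seconds, task, stop_event):
    # Run the task right away, then again after each interval until stopped.
    while True:
        task()
        if stop_event.wait(interval_seconds):
            break


stop = threading.Event()
pusher = threading.Thread(target=run_every,
                          args=(0.05, push_leaderboard, stop))
pusher.start()

stop.set()      # stop the scheduled task (done here to end the example)
pusher.join()
leaderboard = cache.get("leaderboard")   # consumers only ever read the cache
```

The consumers never trigger the computation: they read whatever the last push left in the cache.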
Future Read [R]
In this scenario, the data producer synchronously answers the consumer with the promise of the future delivery of the requested data. When available, this data is delivered to the client via some sort of server push mechanism (see next section).
This approach works very well for expensive computations triggered by client requests.
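The following sketch shows the shape of a future read, using Python's standard `concurrent.futures` module. The `expensive_report`, `request_report` and `deliver` functions are hypothetical, and the `delivered` list merely stands in for a real server push mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
delivered = []   # stand-in for data pushed to the client


def expensive_report(n):
    # Stand-in for an expensive computation triggered by a client request.
    return sum(i * i for i in range(n))


def deliver(future):
    # Called once the result is available; a real system would push it here.
    delivered.append(future.result())


def request_report(n):
    # Answer immediately with a promise; the data itself comes later.
    future = executor.submit(expensive_report, n)
    future.add_done_callback(deliver)
    return future


promise = request_report(1000)    # returns without waiting for the computation
result = promise.result()         # blocks here only for the sake of the example
executor.shutdown(wait=True)
```

In a real application the consumer would not block on the promise: it would carry on and receive the data when the push happens.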
Server Push [R]
Server push can be used to complement any of the previous interactions: in that case, a process prepares some data and delivers it directly to the consumer. There are many well-known technological approaches for this, including HTTP long-polling, AJAX/CometD, WebSockets or AMQP. Enabling server push in an application opens the door to very interesting data interactions, as it decouples the activities of the data consumers from those of the producers.
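Stripped of any transport concern, the decoupling that server push brings can be sketched as follows. The `PushHub` class and the topic name are hypothetical; the registered callback stands in for whatever actually reaches the consumer (a pending long-polling response, a WebSocket connection, a message queue).

```python
class PushHub:
    """Hypothetical in-process hub delivering data to subscribed consumers."""

    def __init__(self):
        self._subscribers = {}   # topic -> list of delivery callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def push(self, topic, data):
        # The producer delivers without knowing who the consumers are.
        for callback in self._subscribers.get(topic, []):
            callback(data)


inbox = []   # stand-in for the consumer side of a connection
hub = PushHub()
hub.subscribe("report:42", inbox.append)

# Later, and independently of any consumer request, a producer delivers data.
hub.push("report:42", {"status": "ready"})
hub.push("report:99", {"status": "ready"})   # no subscriber: nothing delivered
```

The producer and the consumer only share a topic name, which is what lets their activities evolve independently.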