Friday, June 08, 2018

Native Cloud Unicorns and Hybrid Waterfalls

Were you born in the cloud? Are you part of a cloud native organization? Just like many others, such organizations focus on their primary business goals, strategies, requirements, user needs, and good old budgetary constraints. What is new, however, is that many product-led 'cloud native' companies (who also subscribe to an Agile product mindset) are born into an environment where the underlying physical compute, network, and storage are highly elastic, 'abundant', and wholly 'software driven'. This is not always the case across the whole business, yet a mentality pervades that 'roll back' and 'roll forward' are easier and cheaper than ever before. In many macro cases this is untrue, with 'cloud' being the ultimate vendor lock-in. Also, if all your IT productivity compute assets are destined for the same cloud provider as your product or service, perhaps pause a moment to consider the failure domain of front and back of house.
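
To put a rough number on the failure domain point, here is a minimal back-of-the-envelope sketch. It rests on loud simplifying assumptions that are mine rather than measured data: each provider has a single outage probability, a shared provider takes front and back of house down together, and two separate providers fail independently. The function name and the 0.1% figure are purely illustrative.

```python
def both_down_probability(p_outage: float, shared_provider: bool) -> float:
    """Chance that product AND internal IT are unavailable at the same time.

    Assumptions (illustrative only): every provider has the same outage
    probability, a shared provider fails for both workloads at once, and
    separate providers fail independently of each other.
    """
    if not 0.0 <= p_outage <= 1.0:
        raise ValueError("p_outage must be a probability between 0 and 1")
    if shared_provider:
        # One failure domain: a single outage hits front and back of house.
        return p_outage
    # Two independent failure domains: both must fail simultaneously.
    return p_outage * p_outage


p = 0.001  # hypothetical 0.1% chance of a provider outage at any moment
shared = both_down_probability(p, shared_provider=True)
separate = both_down_probability(p, shared_provider=False)
print(f"same provider: {shared:.6f}, separate providers: {separate:.6f}")
```

Real outages are rarely this tidy (providers share upstream dependencies, and true independence is hard to buy), but even the crude arithmetic suggests why putting everything in one basket deserves a second look.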



Much of your product engineering and traditional data warehousing may be built upon public cloud compute providers, yet internal IT needs tend toward more tangible physical assets, multi-stream dependencies, and physical spaces. Often this leads to hybrid cloud approaches emerging early on in IT teams. Whether scaling in-house 'thin' call centers, brokering partner services, or doing traditional office 'capacity planning', Agile mindsets bump up against Waterfall processes head on! This is especially pronounced when there are attempts to build large in-house compute footprints (whether for latency, quality of service, or governance requirements).

When the 'cloud' culture of assumed abundance confronts a traditional culture of scarcity and frugality, there's a potential conflict that can result in a net loss or gain. From 2007 onwards, I have seen how utility and public cloud compute drive expectations of micro-billing. Non-SaaS vendors and in-house IT are often unable to match the accounting fidelity of public compute clouds. This contributes to a perceived lack of efficiency in internal IT teams, which compounds friction and negativity. Additionally, the test-driven mentality of continuous integration and deployment from 'product' teams is only slowly reaching wider IT departments (including networking). The speed of execution of 'product' facing public clouds is *not* easily achieved by IT even on their own private clouds (until experienced product platform facing engineers assist internal teams or there's a cultural revolution in IT automation and testing). Unfortunately, roll forward and roll back are luxuries afforded only to those at the literal edges. The real question, though, is about the elasticity of the culture, not the technology.

Different departments and functions actually serve a purpose by operating at different velocities. In terms of business and operational risk, slow and steady can still be an asset. It's not so much a matter of velocity versus viscosity as that periods of homeostasis are required in all organisms. There can also be cultural and functional disconnects amongst different Change Management processes (irrespective of CI/CD). Ironically, an organization's footprint of physical IT resources may be reduced (by using public cloud or cloud based management) while its complexity, management/monitoring, and inter-dependencies increase. Identity/Access Management and Security do not necessarily shrink their service 'surfaces' with SaaS; rather, their risk increases with public cloud for a range of reasons (opacity/data-perimeter/zero-trust models, SSO, more dependencies etc).

So, to the ever present question of 'build' versus 'buy'...

One approach is to accept and embrace 'hybrid', thus seeking the best of both worlds. Place as much functionality as possible on a select set of public cloud platforms/SaaS (but limit IaaS instance sprawl and seek to avoid a 100% vendor X IaaS footprint). Campus and satellite offices become 'thin' yet 'intelligent' access edges with as small a footprint as possible (where only VoIP/Voice and Security compute footprints exist). What increases, though, is the focus on and dependence upon network fabrics, including resilient ISP services and physical/logical WAN. Mobility of users and 'intelligent' WLANs become key for productivity, workplace experience, and culture. Remaining IT service footprints can then be minimised and possibly located in public cloud or vendor 'colos' (with latency, OPEX, regulatory, and local expertise helping to drive location and partner selection).


As organizations mature, vendor selection matures (due diligence/RFPs/structured procurement etc), yet for some applications/vendors there then exists a legacy lock-in due to early-stage employees' decisions. This is quite pronounced around public clouds and any enterprise directory or database choice. These decisions may have been based upon early employees' previous experiences elsewhere (and/or comfort levels/expertise with certain 'tech stacks') or even made because little or no CAPEX was available early on.

And so the sine wave of in-sourcing and out-sourcing (or tasking), coupled with 'core versus context' for new age companies, inevitably leads to even faster oscillations and greater hidden complexity (counter to the promises of 'cloud' compute, SaaS, and APIs). Oscillations can also be linked to decreasing employee tenures (when 'champions' and 'advocates' are lost). Hybrid cloud can and does become even more complex than either an exclusively private or an exclusively public cloud.

Effectively though, trying to cap downside risk while maximizing upside is a game being played by a mix of old and young who have differing levels of experience, risk tolerance, and (dependent upon equity) skin in the game. I would argue that complexity is actually increasing and that the chances of cascade or common mode failures are increasing too (unless expensive Chaos Engineering is being implemented early on). This all leads us back to the concept that you can't manage what you can't measure. We require better metrics for Technology Risk (not just in Infosec, where they're wholly unsuitable if not absent too).

We are surfing leaky abstractions and sprinting, often needlessly...
