Saturday, June 09, 2018

Irish Role Models for Boys and Men

I'm exploring misogyny in Ireland (having returned after 15 years abroad). Not that it isn't prevalent globally, but I'm trying to debug my own psyche and that of the nation in 2018. I asked in a #mens_health channel online (which I frequent) about current candidates as role models for boys and men in Ireland (or connected to Ireland somehow, such that they could be considered 'Irish', thus lending more weight, proximity, and impact). Of course many people's Dads are role models, but here we're looking for wider and better-known examples. Below are some of the results. What do you think? Who's missing, and why are there so few (these are all white)? I'm 100% admitting I exist in my own filter bubbles too! Can you suggest others?



Background: Part of it stems from something I've been thinking about for a while in terms of environmental shaping of entities... including but not limited to the existing patriarchy, the Church, education, media, sexism in general, political power, etc., but essentially all power bases. There's also some interesting background here on the so-called "Mother Wound". But it's also an honest question and a personal exploration turned outwards to gauge and crowdsource the 'state of the nation'.

Friday, June 08, 2018

Native Cloud Unicorns and Hybrid Waterfalls

Were you born in the cloud? Are you part of a cloud native organization? Just like many others, such organizations focus on their primary business goals, strategies, requirements, user needs, and good old budgetary constraints. What is new, however, is that many product-led 'cloud native' companies (who also subscribe to an Agile product mindset) are birthed into an environment where the underlying physical compute, network, and storage is highly elastic, 'abundant', and wholly 'software driven'. This is not always the case across the whole business though, yet a mentality pervades that 'roll back' and 'roll forward' are easier and cheaper than ever before. This, in many macro cases, is untrue, with 'cloud' being the ultimate vendor lock-in. Also, if all your IT productivity compute assets are destined for the same cloud provider as your product or service, perhaps pause a moment to consider the failure domain of front and back of house.

Much of your product engineering and traditional data warehousing may be built upon public cloud compute providers, yet internal IT needs tend toward more tangible physical assets, multi-stream dependencies, and physical spaces. Often this leads to hybrid cloud approaches emerging early on in IT teams. Whether scaling in-house 'thin' call centers, brokering partner services, or doing traditional office 'capacity planning', Agile mindsets bump up against Waterfall processes head-on! This is especially pronounced when there are attempts to build large in-house compute footprints (whether for latency, quality of service, or governance requirements).

When the 'cloud' culture of assumed abundance confronts a traditional culture of scarcity and frugality, the resulting conflict can end in a net loss or a net gain. I have seen how utility and public cloud compute drive expectations of micro-billing (since 2007+). Non-SaaS vendors and in-house IT are often unable to match the accounting fidelity of public compute clouds. This contributes to a perceived lack of efficiency in internal IT teams, which compounds friction and negativity. Additionally, the test-driven mentality of continuous integration and deployment from 'product' teams is only slowly reaching wider IT departments (including networking). The speed of execution of 'product'-facing public clouds is *not* easily achieved by IT even on their own private clouds (until experienced product platform engineers assist internal teams, or there's a cultural revolution in IT automation and testing). Unfortunately, rolling forward and rolling back is a luxury afforded only to those at the literal edges; the real question is about the elasticity of the culture, not the technology.

Different departments and functions actually serve a purpose by operating at different velocities. In terms of business and operational risk, slow and steady can still be an asset. It's not so much velocity versus viscosity, but that periods of homeostasis are required in all organisms. Additionally, there can be cultural and functional disconnects amongst different Change Management processes (irrespective of CI/CD). Ironically, an organization's footprint of physical IT resources may shrink (by using public cloud or cloud-based management) while its complexity, management/monitoring burden, and inter-dependencies increase. Identity/Access Management and Security do not necessarily see their service 'surfaces' decrease with SaaS; rather, their risk increases with public cloud for a range of reasons (opacity, data perimeter, zero-trust models, SSO, more dependencies, etc.).

So, to the ever present question of 'build' versus 'buy'...

One approach is to accept and embrace 'hybrid', thus seeking the best of both worlds. Place as much functionality as possible on a select set of public cloud platforms/SaaS (but limit IaaS instance sprawl and seek to avoid a 100% vendor X IaaS footprint). Campus and satellite offices become 'thin' yet 'intelligent' access edges with as small a footprint as possible (perhaps only VoIP/voice and security compute remaining on-site). What increases, though, is the focus and dependence upon network fabrics, including resilient ISP services and physical/logical WAN. Mobility of users and 'intelligent' WLANs become key for productivity, workplace experience, and culture. Remaining IT service footprints can then be minimised and possibly located in public cloud or vendor 'colos' (with latency, OPEX, regulatory, and local expertise helping to drive location and partner selection).

As organizations mature, vendor selection matures (due diligence/RFPs/structured procurement etc.), yet for some applications/vendors there then exists a legacy lock-in due to early-stage employees' decisions. This is quite pronounced around public clouds and any enterprise directory or database choice. These decisions may have been based upon early employees' previous experiences elsewhere (and/or comfort levels/expertise with certain 'tech stacks'), or even upon the low to no CAPEX available early on.

And so the sine wave of in-sourcing and out-sourcing (or out-tasking), coupled with 'core versus context' for new-age companies, inevitably leads to even faster oscillations and greater hidden complexity (counter to the promises of 'cloud' compute, SaaS, and APIs). Oscillations can also be linked to decreasing employee tenures (when 'champions' and 'advocates' are lost). Hybrid cloud can and does become even more complex than either an exclusively private or an exclusively public cloud.

Effectively though, capping downside risk while maximizing upside is a game being played by a mix of old and young who have differing levels of experience, risk tolerance, and (depending upon equity) skin in the game. I would argue that complexity is actually increasing and that the chances of cascade or common-mode failures are increasing (unless expensive Chaos Engineering is implemented early on). This all leads us back to the concept that you can't manage what you can't measure. We require better metrics for Technology Risk (not just in InfoSec, where such metrics are wholly unsuitable, if not absent).

We are surfing leaky abstractions and sprinting, often needlessly...

Sunday, February 21, 2016

On Blame

Blame is attribution and often misused.

I think I have a different take indeed. Blame has traditionally been misused in many orgs to incorrectly scapegoat individuals or minorities (for political or egoic purposes), *however blame has utility, otherwise attribution cannot exist*. Today we find ourselves part of a 'politically correct' mainstream fearful of reprisals around allocation of blame in complex scenarios. Indeed blame can be toxic when used non-skillfully, and can also be used to persecute those who are least able to defend themselves or to initiate learning.

For Root Cause Analysis there actually has to be blame of a thing, process, person, agent, or group (or a mix thereof). In our societal conflict-avoidance culture, we tend to want to fix only the system or process, not always realising that humans make up a large part of that system. For humans to learn, they must not only know that they were wrong or made a mistake, but feel it deeply enough to trigger deep learning. This is a physiological response that must happen.

There can be no responsibility, accountability or learning without said attribution. The trick is not to extrinsically 'blame' or 'shame' individuals. Blame must be intrinsically attributable to the thing, accountable team, or group responsible (if indeed that is the true RCA), otherwise there is no organisational or individual learning. In the case of no blame, the rest of the organisation has to evolve around these fixable/preventable failures via a form of avoidance or process overhead/tax.

Imagine a startup that couldn't fail fast and learn because the RCA is actually some of the people hired. Sure, you reset the training/hiring etc. (or fire them), but you also need to target the individuals for betterment if you want to keep them. If the RCA is that an individual or team needs more training, then this must be identified and dealt with at a management tier, right?

The challenge is not to explicitly blame/shame any *individuals* (whose teams intrinsically know who was to blame for certain events) but for managers/leaders or groups to fall on their sword and accept attribution/blame for their team's actions, IMHO.

This topic of 'blameless' culture has a groundswell of support which I fundamentally disagree with (mostly on the avoidance and transparency angles); it can itself become a form of conflict avoidance.

This is indeed a nuanced approach, but attribution and accountability form the backbone of progress. I keep coming back to this seminal book (see the summary section).

Perhaps every Post Mortem should end with the question, "Do our teams need more training?"

Appendix A:
Why Organizations Don't Learn

Saturday, December 12, 2015

Email Productivity Hacks

Might have just found the answer to work email anxiety when HQ is in another timezone, i.e. "Delayed Messages": now running only Mon-Fri at 8am, 2pm, and 4pm GMT.

Amended code from Musabi link above:

Note: I also use Zapier email parser to respond to meeting invites with some guidelines for meetings (just make sure your Google Mail filter excludes invitations with the words 'accepted' or 'updated' too).

Thursday, January 15, 2015

A note on netsec

Know your network and assets.
Gain situational awareness.
Quantify Value at Risk.
Risk is a factor of dependency.
Map transitive trust.
Zone assets and services.
Partition failure domains.
Assume compromise.
Fail well.
Maintain ability to replay traffic to high value assets.
Drill incident response.
Minimise abstraction layers.
Advocate loose coupling.

Monday, November 10, 2014

WebSummit 2015 Re-Imagined: A More Evenly Distributed Future

WebSummit is trapped between a rock and a hard place yet it need not be so! The very thing that makes WebSummit special, its 'secret sauce' if you will, is that of the host country and its zeitgeist (and more specifically that of Dublin itself). One of the primary ingredients of this sauce is the Irish welcome, openness, and indeed the intimacy that occurs in and around the edges of the conference.

Like any good conference, festival, or gathering, it's as much about the serendipity engine of coming together in large groups which then facilitates unexpected and novel interactions. More often than not the event's official content and schedule plays second fiddle to the more intimate clusters of conversation before, during, and after sessions. WebSummit, like any human get together, is about the people first and foremost, people whose interconnection is supported by the transport fabrics of the venue and host city. People come because of the promise of connection; connection to other people, to ideas, and to methods that facilitate their learning. Today, people expect to connect both digitally and physically, each a proxy to and serving the other.

Thus, event WiFi is one of today's crucial and ubiquitous service fabrics at technology events, and unfortunately it was indeed sub-standard and woefully under-provisioned at WebSummit. Notwithstanding the underlying politics, event WiFi is a three-dimensional fabric that helps to distribute information to attendees and to connect them to the outside world during the proceedings (including connecting the outside world in), whilst catalysing human connections. The WebSummit WiFi did not seem to follow certain best-practice patterns for high-density deployments (which are documented freely and openly on the web), but more on this later, including some key points and recipes for anyone else thinking about high-density WiFi at 'webscale' events! First, let's look at how one might potentially extricate WebSummit from the RDS (Royal Dublin Society) conference and exhibition centre without damaging the brand and buzz around the event itself.

WebSummit needs more leverage in this 'Mexican standoff' of sorts. It's trapped in the only event campus large enough to host its *current* numbers by an incumbent who has demonstrated they just don't get it. The irony is that WebSummit can neither write the network technology requirements itself yet (and/or bake them into contractual service level agreements, for a range of reasons), nor is it permitted to take advantage of entities who actually know how to provide this type of elastic wired and wireless network, due to the RDS's current stance. The RDS is unable to 'fail fast' and can only 'fail big', as there is no incentive nor room to rapidly iterate when your deployment cycle is once a year, involves actual physical hardware, and especially when you have a monopoly. WebSummit is unfortunately paying the RDS yearly to learn a little more about high-density WiFi design and operation, yet the RDS is still falling short and thus damaging both WebSummit and Ireland's national brand. The lack of quality and stability in this utility service is damaging the attendees' experience, damaging WebSummit's intrinsic and global marketing channels, and also damaging the country's reputation by reinforcing negative Irish stereotypes rather than the positive ones which attracted many of the people in the first place. I could go on here about how the Web itself uses encapsulation and abstraction models, and how web startups only learn about 'web scale' (and thus the underlying OSI layers and network patterns) as they mature and gain traction, but I'd like to get back to the venue choice for a moment first...

The only leverage WebSummit has is to fundamentally rethink using the RDS and find or create a local alternative for the event and 'festival' campus (so the RDS understands that moving location is not just a veiled threat, but that a WebSummit straight flush beats an RDS full house!). Ireland as a host country and city has many constraints indeed, but let's use them to get creative, to innovate, and to bootstrap the basics for a moment. WebSummit can *not* go abroad, as it would lose its special powers and become just another technology/startup conference, i.e. bland and over-commercialised. If it left the capital city, Dublin and indeed Ireland would lose so much more than just revenue; it would be an admission of national failure and incompetence. External parties would lose confidence in Ireland's startup scene, in the existing technology base, and most damagingly, in the potential and capability of the Irish to play at a global level whilst still at home.

WebSummit needs world-class conference and trade show facilities within a stone's throw of the city centre's pubs, restaurants, hotels, and transport infrastructure, preferably all within walking distance of Grafton St. It needs all this with a nexus capable of hosting ~20,000 people at a keynote. But does it really? Intimacy is not a scale-free network, and it is scarcity that helps to determine perceived value. Let's suppose for a minute that WebSummit explicitly states that Dublin's nucleus is its true 24/7 campus. Albeit there is no rival to the RDS in terms of 'one (giant) throat to choke', perhaps a new campus could be imagined as an intertwined web of smaller, more intimate locations (just like the Night Summit itself!).

Consider, if you will, the docklands... with a bit of vision.

Have a think about the above with a kind of SXSW feel. Sure, it would take a masterstroke of organisation and liaison with a range of parties, but the Convention Centre Dublin has a 2,000-seat auditorium, as does the Bord Gáis Energy Theatre, and the 3 Arena has a 14,500-seat capacity (combined with taking over the Odeon Cinema and anything else they could get their hands on nearby!). Just a thought to bootstrap your thinking! I'm sure many Irish people would give you 20 reasons why something like this could fail, all without any constructive criticism, ideas, or alternatives, but.... what if...?

So, on to the WiFi... and known good patterns. Well, here is what WebSummit could do or have another entity do...

High Level Design / Basic Requirements

Client Requirements:
- One to three devices per attendee (all manner of smartphones and laptops)
- 2.4 GHz and 5GHz support
- Minimum RSSI -67dBm / SNR 25dB in coverage areas
- Minimum 5/5Mbps throughput, up to a maximum of 20/20Mbps
- Application traffic types primarily miscellaneous web browsing
- HD video conferencing and voice should be available and prioritised
- Sub 5ms response from default gateway
- Sub 5ms response from cached DNS entries
- Multicast and local client to client connectivity not supported except in smaller spaces
- Limited wired connections of 100/100Mbps for all speakers and those wanting to do live demos.
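The gateway and DNS latency targets above are easy to spot-check from a client seat. Here's a minimal sketch, assuming a macOS/BSD laptop with `ping` and `dig` on the path (`example.com` is just a stand-in domain):

```shell
#!/bin/sh
# Spot-check the sub-5ms gateway and cached-DNS targets from a client device.

# Default gateway (macOS/BSD; on Linux use: ip route | awk '/^default/ {print $3}')
GW=$(netstat -rn | awk '/^default/ {print $2; exit}')

# Average RTT to a host in ms, parsed from ping's summary line
avg_rtt() {
  ping -c 5 "$1" | awk -F/ '/round-trip|rtt/ {print $5}'
}

# Resolver query time in ms, parsed from dig's statistics output
dns_ms() {
  dig +noall +stats example.com | awk '/Query time/ {print $4}'
}

dns_ms >/dev/null   # first lookup primes the resolver cache
echo "gateway avg RTT: $(avg_rtt "$GW") ms (target < 5)"
echo "cached DNS:      $(dns_ms) ms (target < 5)"
```

The awk field positions are the same for both the BSD `round-trip` and Linux `rtt` summary lines, so only the gateway-discovery line needs swapping per platform.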

WiFi Related:
- MCA(Multi-Channel Architecture) which rules out Meru!
- Distributed WiFi micro-cell architecture (rules out Xirrus!)
- Overhead directional 'patch' antennas
- 2.4GHz 'event-legacy' and 5GHz 'event' ESSIDs (throughout main hall)
- ESSIDs anchored to major spaces (named accordingly vs. full-site roaming)
- Limited layer 2 roaming
- 20MHz-only channel widths for maximum spectrum re-use and clean air
- 802.11g/n only (802.11ac in some locations but not required!)
- Basic (i.e. mandatory) 18Mbps data rates and above only
- Predictive modelling/full survey but mandatory post-validation survey
- SNR to 25dB in all expected coverage areas
- Full WIPS+ Spectrum Analyzer capable and dedicated radios/APs.
- 802.11k (and/or proprietary load balancing in mini-radio clusters in super dense client areas)
- Careful use of RX-SOP (if available ;)

Wired Backbone and Event Services:
- Minimum dual active/active 10Gbps ISP transit links via disparate vendors/metro rings (with ability for vendors/exhibitors to terminate their own feeds)
- Minimum 40Gbps+ capable edge routing/firewalling
- 10Gbps dual redundant access edge uplinks to distribution
- Full 20-40Gbps or more primary campus backbone/infrastructure to the CORE
- N+1 redundant architecture throughout as far as the access/edge and APs
- Well architected L3 domains (to partition and minimise L2 failure domains)
- Routers should route and firewalls should firewall, thus DNS and DHCP should be provided for via dedicated servers or appliances 
- Software caches and/or major CDN edges onsite
- Local redundant / Anycast DNS resolvers and/or caches (i.e. not performed on routers/FWs)
- Dedicated physical links and paths (where possible) for exhibitors/vendors and/or workshops or labs.
- L7 Application Visibility / DPI (Deep Packet Inspection) and associated shaping/throttling or queuing to a scavenger class for known bandwidth hogs
- Optional per-client SRC IP bi-directional rate limiting

NetOps/SecOps and Customer Service:
- A full NOC (Network Operations Centre) that also engages attendees via locally hosted status pages and other social media channels, i.e. Twitter etc.
- All digital signage giving informative and constant network info/updates
- Constant and distributed monitoring via humans/sensors/APs to adjust for a growing noise floor and to track down any 'evil twin' APs or strong rogues
- Full Network Management, Capacity Management, Alerting etc.
- Technically qualified roaming volunteers to assist attendees get connected including at event hubs and booths

Note: This is just a mixed flavour of some high- and low-level critical design elements (of course, more explicit requirements should be created and customised with respect to WebSummit's specific functional and non-functional requirements, including proper design documentation etc.). There is no escape, however, from doing simple maths on the number of supported CAM table and ARP entries per infrastructure device, and from factoring in things like TCP set-ups per second and concurrent NAT sessions at layer 3 boundaries. Also: know your clients, i.e. do some capacity planning in advance. And wouldn't it be lovely to use all public IPs for clients at events ;)
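That simple maths can be sketched in a few lines; every figure below is an illustrative assumption, not a WebSummit actual:

```shell
#!/bin/sh
# Back-of-envelope event network maths; all inputs are assumptions.
ATTENDEES=20000
DEVICES_PER_ATTENDEE=2                          # phones + laptops, averaged
CLIENTS=$((ATTENDEES * DEVICES_PER_ATTENDEE))

# Each active client burns one CAM (MAC) entry on its L2 path and one
# ARP entry on its first-hop gateway.
echo "CAM/ARP entries:      ${CLIENTS}"

# Assume ~40 concurrent NAT sessions per client (browsers, apps, push services).
echo "Concurrent NAT state: $((CLIENTS * 40))"

# If 10% of clients each open 5 new TCP connections per second at peak:
echo "TCP set-ups/sec:      $((CLIENTS / 10 * 5))"
```

Swap in your own client counts and per-client session assumptions, then compare the results against the table and session limits on the spec sheets of your proposed access, distribution, and edge devices.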

Disclaimer: I was not at #WebSummit but was watching live from Berlin whilst talking to some people who were (and am now back home in Dublin for a short stint!).

If anyone wants to leave constructive comments, spot errors/omissions, or follow up, please do so or ping me on Twitter @irldexter. In case anyone is wondering, my background is here @podomere.

Sunday, September 14, 2014

OSX Wifi

A quick and dirty script to put in your crontab to see what the hell is going on! MacBook Airs act funny with power management and their SNR. Also, with the RSSI and noise floor you can just subtract the noise to get the SNR. At the end of the day though it's the SINR that counts + proper tools are required to diagnose non-802.11 interference, CCI, ACI, throughput and retries....
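The original script isn't reproduced here, but a minimal sketch of that kind of logger, using Apple's private `airport` CLI (the log path and filename are assumptions), might look like:

```shell
#!/bin/sh
# Log RSSI, noise floor, and derived SNR for the current WiFi association.
AIRPORT="/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport"
LOG="$HOME/wifitest/wifi.log"   # assumed log location

INFO=$("$AIRPORT" -I)
RSSI=$(echo "$INFO" | awk '/agrCtlRSSI/ {print $2}')
NOISE=$(echo "$INFO" | awk '/agrCtlNoise/ {print $2}')
SNR=$((RSSI - NOISE))           # SNR (dB) = RSSI (dBm) - noise floor (dBm)

echo "$(date '+%F %T') RSSI=${RSSI}dBm noise=${NOISE}dBm SNR=${SNR}dB" >> "$LOG"
```

Pointed at by the crontab entry below, this yields a per-minute RSSI/noise/SNR trace you can line up against power-management weirdness.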

## Then put the below in your crontab for every minute (with your own path of course)…
* * * * * /Users/useraccount/wifitest/ > /dev/null 2>&1