Friday, February 27, 2009

Histornet : History and Future of the Internet

Then go here for "Warriors of the Net" for a fun view of some constituent parts:

Then go here for "A Common Sense Approach to Internet Safety":

And when you're finally ready, then go here for the future (38mins in is good!):

Monday, February 09, 2009

10 laws of networking (Donal)

Remember a few simple paradigms
1) The risk profile of a network or fabric is greater than the aggregate of the risk profiles for each of its endpoint/client connected nodes or services.
2) Never underestimate physical *and* logical separation. Ask yourself what happens if the mgmt control plane goes down or gets stuck in 'flipmode'?
3) Protect your management and control plane above all else, and try not to have them in-path with the data plane. IT is change management: if you can't manage your resources, you may as well not have them.
4) Where are your policy enforcement points which facilitate auditability and visibility? AAA is a must!
5) Always use subnets and NETBLOCKs to separate traffic when you can [e.g. use good address management]. QoS on subnets is easier than QoS on discrete flows.
6) Darkness is not good. Instrument and gather telemetry from your network. Inbound poll and outbound trap at a minimum. Baselining and trending help.
7) Always look at logs, sessions and empirical data rather than listening to conjecture and hearsay.
8) Abstraction layers are a good thing such that logical resources and physical resources can move without affecting one another. Loose coupling not tight coupling is the order of the day.
9) Always use loopbacks or virtual interfaces to manage devices where possible. [see 8]
10) In-path tests are the only things that represent what a client or endpoint sees. Up isn't always up, sometimes it's down.
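
Law 5 can be sketched with Python's standard ipaddress module. This is only an illustration: the 10.20.0.0/22 block and the traffic-class names are assumptions made up for the example, not taken from any real network.

```python
import ipaddress

# Hypothetical netblock and traffic classes, purely for illustration.
NETBLOCK = ipaddress.ip_network("10.20.0.0/22")
CLASSES = ["management", "voice", "data", "guest"]

# Split the /22 into four /24s and give one to each traffic class.
SUBNETS = dict(zip(CLASSES, NETBLOCK.subnets(new_prefix=24)))


def traffic_class(address):
    """Return the traffic class whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, subnet in SUBNETS.items():
        if ip in subnet:
            return name
    return None


if __name__ == "__main__":
    for name, subnet in SUBNETS.items():
        print(f"{name:<12}{subnet}")
    print(traffic_class("10.20.1.57"))
```

Because each class maps to a single prefix, a QoS or filtering rule can match the subnet instead of chasing individual flows.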

Note: This is evolving, please leave comments on adds, moves, and changes... including priorities!

4 Laws of Troubleshooting:
1) Get, define and refine the PROBLEM STATEMENT, and ask the 5 Whys.
2) Always go back to basics and first principles.
3) Look for commonalities and deltas.
4) Document an end-to-end code/firmware matrix for your problem.
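
Laws 3 and 4 work well together: once the code/firmware matrix is written down, the commonalities and deltas can be read straight off it. A minimal sketch, assuming a small made-up inventory (every device name and version string below is hypothetical):

```python
# Hypothetical inventory: every device name and version string is invented.
MATRIX = {
    "sw-edge-01": {"os": "12.2", "bootrom": "1.4", "linecard_fw": "3.1", "affected": True},
    "sw-edge-02": {"os": "12.2", "bootrom": "1.5", "linecard_fw": "3.1", "affected": True},
    "sw-edge-03": {"os": "12.2", "bootrom": "1.4", "linecard_fw": "3.0", "affected": False},
    "sw-core-01": {"os": "12.1", "bootrom": "1.4", "linecard_fw": "3.0", "affected": False},
}


def versions(devices):
    """Build one set of (component, version) pairs per device."""
    return [
        {(k, v) for k, v in row.items() if k != "affected"}
        for row in devices
    ]


def commonalities_and_deltas(matrix):
    """Return (pairs shared by all affected devices, those absent from healthy ones)."""
    affected = versions(r for r in matrix.values() if r["affected"])
    healthy = versions(r for r in matrix.values() if not r["affected"])
    # Commonalities: pairs present on every affected device.
    common = set.intersection(*affected) if affected else set()
    # Deltas: common pairs that no healthy device carries.
    seen_healthy = set.union(*healthy) if healthy else set()
    return common, common - seen_healthy


if __name__ == "__main__":
    common, deltas = commonalities_and_deltas(MATRIX)
    print("common to affected:", sorted(common))
    print("not seen on healthy:", sorted(deltas))
```

With this invented data the only version unique to the affected devices is the line card firmware, which is where you would look first. It is a pointer, not proof.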

Hugo's take on things (Not that I specifically disagree, but I do have a slightly varying point of view from the previously released laws)

1. Lack of visibility does not constitute lack of activity. While being unable to manage a device constitutes a significant risk, it does not constitute an outage.
2. We spend a great deal of time building highly available data paths in networks. They constitute one of the most reliable ways to get around the network. It is a valid consideration for the carriage of management traffic.
3. In a redundant, highly available network, a down device does not constitute a disaster; in fact, it doesn't even constitute an outage. Delaying its recovery constitutes a risk, not a problem.
4. The weakest part of your management is your people and processes. Think less technically and more simply. Sometimes an analogue phone is the best solution.
5. Focus your efforts on the areas where you have problems. Management like to see rapid improvement, so don't focus on what causes you 1 issue a month to the detriment of something causing you 10.
6. Before you ring for escalation support, type "show log". Or look at the appropriate logs on the device or host.
7. History is important. Nothing changes radically overnight. If you can see what has happened before, you will know better whether you are looking at a one-off event or a recurring issue. Many other pointers come from history and trending information.
8. No matter how big a nuffer they are, the day-to-day or other incident staff may well have seen something important that they can tell you. Try to establish the information behind their assumptions.
9. Best practice is merely something that worked for others. Sometimes our differences necessitate divergence. The best German engineering software in the world is of little value to someone who only speaks English. The best network management software in the world adds little value if it does not gather call history and quality information on your VoIP network. Best practice is a great starting point, but usually not where you should end up.
10. Keep it simple. Networks have a way of complicating themselves, so your efforts should go towards keeping things simple and reliable.
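
Points 5 and 7 can be put to work with nothing more than an incident log and a tally. A rough sketch, assuming an invented history (all dates and causes below are hypothetical, and the threshold of 2 is an arbitrary choice):

```python
from collections import Counter
from datetime import date

# Hypothetical incident history; dates and causes are invented.
INCIDENTS = [
    (date(2009, 1, 5), "dns timeout"),
    (date(2009, 1, 9), "link flap"),
    (date(2009, 1, 12), "dns timeout"),
    (date(2009, 1, 20), "dns timeout"),
    (date(2009, 1, 27), "power supply"),
    (date(2009, 2, 2), "link flap"),
]


def rank_causes(incidents):
    """Return (cause, count) pairs, most frequent first."""
    return Counter(cause for _, cause in incidents).most_common()


def recurring(incidents, threshold=2):
    """Return the causes seen at least `threshold` times."""
    return [cause for cause, n in rank_causes(incidents) if n >= threshold]


if __name__ == "__main__":
    for cause, n in rank_causes(INCIDENTS):
        print(f"{n:>3}  {cause}")
    print("recurring:", recurring(INCIDENTS))
```

The top of the ranking is where the effort goes first. Anything below the threshold is treated as a one-off until history says otherwise.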

Sunday, February 01, 2009

The coming global Infosec freeze

Our biggest problem is that we can't demonstrate effectively enough that shit happens. [Outcomes]
Especially when it gets rolled up into operational 'stability', or when the 80% of self-serving types running IT suppress ripples in the space-time continuum. Or when the snake-oil-selling vendors manage to introduce more nodes and code rather than less.

Baselines and reference points are also missing, and we all know why. We're all using the same virtual bricks but building everything from Lego Turing machines to traffic systems to flying machines and fighting robots.

Google just did a lot of work for us by classifying *everything* on the web as evil. So maybe we can just convince the Web 2.0 fanatics to go join the Luddites and take part in Donal's solution called....

.... wait for it....

"SLOW IT DOWN". I am now declaring a change freeze on all production systems until 2012, when we can get our shit together :)