Being a System or Network Administrator of a large computer network can be a difficult and time-consuming task. It is not as simple as people might think. Yes, once everything is running well, the life of a Network Administrator may seem easy, but when things go wrong, it can be a living hell. Network Administration is more than just connecting a whole bunch of computers together; it is a full-time role to ensure that critical servers and applications stay up as close to 100% of the time as possible.
The Main Role Of A Network and System Administrator
As a Network Administrator, your role is essentially to build (or integrate existing services into), maintain, and upgrade a network of computers, devices, and servers.
If you (or the previous System and Network Administrators) get the build phase right, maintaining and upgrading the network in the future will become much easier. You have to carefully plan your requirements and plan for future growth. In doing so, you have to keep in mind a number of things:
- How many users will be using the system?
- What are the network capacity demands?
- What will the network capacity demands reach in the future?
- What is the geographical spread of the network?
- What access controls do your users require?
- What sort of devices and hardware do you require?
- Servers (Web, File, Print, DNS, Backup)
- Do you need remote administration?
- How are you going to support your users?
If the above are planned well, future growth, maintaining and upgrading the network will be much easier. Just remember to analyse your requirements; design and plan the network; implement policies and constraints; and construct and install your network with suitable hardware. Remember to always think of the future. Once it’s all implemented and built, you can administer your network.
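To make the "think of the future" advice a bit more concrete, here is a minimal sketch of a capacity projection. It assumes demand grows by a fixed percentage every year, which is a simplification, and the function names and figures are made up for illustration only.

```python
def projected_demand(current_mbps, annual_growth, years):
    """Compound today's peak demand forward by a fixed annual growth rate.

    annual_growth is a fraction, e.g. 0.25 for 25% growth per year
    (an assumed planning figure, not a measured one).
    """
    return current_mbps * (1 + annual_growth) ** years


def years_until_exceeded(current_mbps, annual_growth, capacity_mbps, horizon=50):
    """Return the first whole year in which projected demand exceeds capacity.

    Returns None if capacity is not exceeded within the planning horizon.
    """
    for year in range(horizon + 1):
        if projected_demand(current_mbps, annual_growth, year) > capacity_mbps:
            return year
    return None


if __name__ == "__main__":
    # Hypothetical network: 400 Mbps peak today, 25% growth, 1 Gbps uplink.
    print(projected_demand(400, 0.25, 3))
    print(years_until_exceeded(400, 0.25, 1000))
```

Real growth is rarely this smooth, so treat the result as a rough guide for when to start budgeting for an upgrade rather than a prediction.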
When administering a network, you have to ensure critical services have as close to 100% uptime as possible, because if they are down, your users are going to hassle you all the time, and that can be quite annoying with some users. When maintaining a network, you also have a number of other tasks to perform every day. These include:
- User management
- Adding, removing and applying security to user accounts and groups
- Upgrading hardware
- Replacing faulty hardware
- Providing support for applications and operating systems
- Ensuring critical data is being backed up
- Monitoring the system
- Checking logs
- Conducting security audits
- Attending to users’ help requests
- Writing documentation
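Since "as close to 100% uptime as possible" is easier to manage once it has a number attached, here is a small sketch of how an availability target translates into a downtime allowance, and how recorded outages translate back into an availability figure. The function names and the sample outage figures are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Minutes of downtime permitted in a period for a given availability target."""
    return period_minutes * (1 - availability_percent / 100)


def measured_availability(outage_minutes, period_minutes):
    """Availability (in percent) achieved, given a list of outage durations."""
    return 100 * (1 - sum(outage_minutes) / period_minutes)


if __name__ == "__main__":
    # A 99.9% target leaves under nine hours of downtime per year.
    print(round(allowed_downtime_minutes(99.9), 1))

    # Hypothetical month (30 days) with two outages of 30 and 60 minutes.
    month = 30 * 24 * 60
    print(round(measured_availability([30, 60], month), 3))
```

Comparing the measured figure against the target each month is one simple way to decide whether monitoring and hardware replacement are getting enough attention.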
One of the hardest challenges in administering a network is application support. You can very rarely just add a major application or a network service without any issues. Your network will need to support it. For example, you cannot host a website if you only have dialup internet. Other factors to take into consideration are:
- Does it need to be secured? – Both physically and virtually; internally and externally
- Can the application be supported? – Are there people who know how to use the application?
- What risks will it introduce? Viruses, hackers, privacy?
- Is there enough network capacity to support it? Can your network handle the bandwidth requirements?
- Does data need to be migrated across to the new system from the old system?
- Who should be able to access the new application?
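The bandwidth question in the list above can be roughed out before the application goes anywhere near production. The following is a back-of-the-envelope sketch, assuming you know your link speed, your current peak usage, and an estimate of what each user of the new application will consume; all names, the 20% headroom default, and the sample figures are assumptions for illustration.

```python
def fits_on_link(link_mbps, current_peak_mbps, users, per_user_kbps, headroom=0.2):
    """Return True if the new application's traffic fits on the existing link.

    headroom is the fraction of the link kept free for bursts
    (0.2 means only 80% of the link is treated as usable).
    """
    usable_mbps = link_mbps * (1 - headroom)
    added_mbps = users * per_user_kbps / 1000  # convert kbps to Mbps
    return current_peak_mbps + added_mbps <= usable_mbps


if __name__ == "__main__":
    # Hypothetical 100 Mbps link, 200 users at 150 kbps each.
    print(fits_on_link(100, 40, 200, 150))  # current peak of 40 Mbps
    print(fits_on_link(100, 60, 200, 150))  # current peak of 60 Mbps
```

An estimate like this only covers the capacity question; security, support and migration still need their own answers.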
Depending on how large your organisation is, these tasks can be broken down into departments or groups.
As you have probably found out by now, the job involves a number of challenges beyond just installing computers and networks. You have to make sure your network:
- Is designed efficiently
- Is capable of mass management – updating multiple machines at once
- Is secured from threats, and internal and external hackers
- Meets all of your users’ requirements and needs
And beyond the network itself, you also have to:
- Understand the users’ and the organisation’s needs and wants
- Be able to troubleshoot and fix problems and errors quickly
- Be up to date with the latest technical knowledge and computer news
- Be able to write documentation and instructions
Every System and Network Administrator will tell you that they have faced issues that they do not know how to solve, or even know what the actual problem is when they are faced with an issue. That is why they need to have good problem solving skills… and Google!
The basic steps for solving a computer-related problem are:
- Detect the fault or problem
- Isolate the problem
- Work out how to fix the problem
- (Make sure you document your steps or make a backup before you do anything)
- Carry out tests and use tools to diagnose the problem
- Solve the problem and document a fix
If your network is critical, you cannot simply reboot a machine, or click a few random buttons to see if you can fix the problem. You have to plan how you will fix the problem in the shortest amount of time without causing more disruption to your users, or breaking it even more.
- First of all – be systematic. Try the simple things first. If a computer won’t start, make sure the power is turned on.
- Read logs – Logs provide a lot of information on when things go wrong. So make sure you read and understand what the logs are telling you
- Pay attention to all the facts
- Read the documentation – yes, it does help and it’s not always there to take up space in the packaging
- Talk to others – ask your colleagues, and get on the internet, forums and blogs, because other people have likely faced your issue or know how to help
- Use test environments – see if you can cause the issue again, and then test the ways on how you can fix it. In a test environment, you know you can’t break the production server any more
- Know your tools – you usually need something to work out how to fix the problem, or even to find out what the problem actually is. You might not know you have a virus if you don’t conduct a scan
- Work out the root cause of the problem – hardware failure, user interaction, external event?
- Have a backup in place – make sure that you can restore the system to what it was like before the problem
- Do it quickly – Users are being affected, and the organisation can suffer a large financial loss if the system is down. Fix it first, and then discuss the politics
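As an illustration of the "read logs" tip, here is a minimal sketch that counts how often a few warning-sign keywords show up in a set of log lines, so you can see at a glance what kind of trouble the logs are reporting. The keyword list and the sample lines are invented for the example; real log formats vary a lot, so adjust both to suit your systems.

```python
from collections import Counter

# Assumed keywords worth flagging; tune these for your own logs.
KEYWORDS = ("error", "fail", "denied", "timeout")


def summarise_log(lines, keywords=KEYWORDS):
    """Count how many log lines mention each keyword (case-insensitive)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for keyword in keywords:
            if keyword in lowered:
                counts[keyword] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines, not taken from any real system.
    sample = [
        "Jan 1 10:00:01 host1 sshd: Failed password for root",
        "Jan 1 10:00:05 host1 kernel: disk error on sda",
        "Jan 1 10:00:09 host1 cron: job started",
        "Jan 1 10:00:12 host1 nfs: timeout contacting server, I/O error",
    ]
    for keyword, count in summarise_log(sample).most_common():
        print(keyword, count)
```

A count like this does not replace actually reading and understanding the entries, but it helps you decide which ones to read first.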
One tip I would like to bring up in System and Network Administration is Change Management. Change Management is crucial to follow correctly, especially in large organisations. Although it may seem like a waste of time, it is a very important step. Change Management essentially records every change you make (and plan to make) and lets you get permission from your bosses (in case something goes wrong), which then allows you to implement the change.
The steps in Change Management are:
- Note the change – what are you actually going to do?
- What are the repercussions – what effect will this Change have, who is going to be affected and when?
- Note the back out plan – if something goes wrong, can the Change be reverted back to the original configuration?
- Revise the policy – are you actually allowed to implement the Change – does your organisation support it?
- Inform impacted users – make sure that all users who will be affected are aware of this change so they don’t call you up saying their system is down
- Make the change/Do the work – disable (take down) services if required
- Inform users that the change is done
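The steps above can be sketched as a simple change record that refuses to call itself ready until the paperwork is done. This is only an illustration of the idea, assuming one record per change; the class name, field names and sample values are hypothetical, and a real organisation would normally use a proper ticketing or change management system.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    """One planned change and the checks that should happen before the work."""
    description: str              # what are you actually going to do?
    impact: str = ""              # who is affected, and when?
    backout_plan: str = ""        # how to revert if something goes wrong
    policy_checked: bool = False  # does the organisation allow this change?
    approved_by: str = ""         # who gave permission
    users_notified: bool = False  # have impacted users been told?

    def blockers(self):
        """Return the list of steps still outstanding before the change."""
        problems = []
        if not self.impact:
            problems.append("impact not assessed")
        if not self.backout_plan:
            problems.append("no back out plan")
        if not self.policy_checked:
            problems.append("policy not checked")
        if not self.approved_by:
            problems.append("not approved")
        if not self.users_notified:
            problems.append("users not notified")
        return problems

    def ready(self):
        """True only when every pre-change step has been completed."""
        return not self.blockers()


if __name__ == "__main__":
    change = ChangeRequest("Upgrade the mail server")
    print(change.blockers())

    change.impact = "All staff, Saturday 22:00 to 23:00"
    change.backout_plan = "Restore from the snapshot taken before the upgrade"
    change.policy_checked = True
    change.approved_by = "IT manager"
    change.users_notified = True
    print(change.ready())
```

The last step, informing users that the change is done, happens after the work itself, so it is left out of the readiness check here.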
A while ago, Virgin Mobile was updating their CRM system, which entailed a weekend outage that would affect a number of users and services. Customers were made aware of the change and were encouraged to plan for it. One of the services affected was mobile phone and SIM card activations. I purchased a phone for my mum during the upgrade weekend, which meant I couldn’t activate it. In the meantime, she wasn’t able to use her mobile phone. To make it even worse, Virgin Mobile had problems with the system upgrade, and the system was down for a further 1.5 days. A lot of customers, including myself, were unhappy because the change took longer than it should have and they were impacted by it. According to reports on forums, a lot of customers left Virgin Mobile because of it.
The Skills Required for System and Network Administrators
Not everybody has the required skills to be a System and Network Administrator. For example, a person who designs website layouts may not necessarily know how to set up a web server. So what are some of the skills that a System and Network Administrator should have?
- UNIX – how to use, install, configure and run
- Scripting and programming – shell, bash, C++, Java
- Network – TCP/IP, Hardware, Communication, Network Standards
- Infrastructure – DNS, DHCP
- Storage – SAN, NAS, NFS, CIFS
- Directory Services – LDAP, WINS, NIS
- User Services – Databases, e-Mail, Office Tools, Web Tools
- System Implementation
- System Troubleshooting
- Security Concepts
So that is a brief introduction into the life of a System and Network Administrator. It can be a very difficult task, especially when things go wrong.