A Hadoop administrator is responsible for the installation, configuration, upgrade, administration, monitoring, and maintenance of Hadoop clusters in an organization.
The role includes designing and developing cluster strategies, monitoring the system, improving performance and capacity, and planning for future expansion. Hadoop administrators may also plan, coordinate, and implement security measures to safeguard the cluster and its data.
I’ve seen both DBAs and sysadmins become excellent Hadoop admins. In my highly biased opinion, DBAs have some advantages:
Everyone knows DBA stands for “Default Blame Acceptor”. Since the database is always blamed, DBAs typically have great troubleshooting skills, processes, and instincts. All of these are critical for a good cluster admin.
DBAs are used to managing systems with millions of knobs to turn, all of which have a critical impact on the performance and availability of the system. Hadoop is similar to databases in this sense: there are tons of configuration settings to fine-tune.
DBAs, much more than sysadmins, are highly skilled at keeping developers in check and making sure no one accidentally causes critical performance issues on an entire system. This is a critical skill when managing Hadoop clusters.
DBA experience with data warehouses (DWH), especially Exadata, is very valuable. There are many similarities between DWH workloads and Hadoop workloads, and similar principles guide the management of both systems.
DBAs tend to be really good at writing their own monitoring jobs when needed. Every production database system I’ve seen has a crontab file full of customized monitors and maintenance jobs. This skill continues to be critical for Hadoop systems.
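To make the point about home-grown monitors concrete, here is a minimal sketch of the kind of check that could be run from crontab. It is not taken from any real cluster: the thresholds, function names, and sample text are all hypothetical, and the sample is only modeled on the "Key: value" summary lines printed by `hdfs dfsadmin -report`, whose exact field names may vary between Hadoop versions.

```python
import re

# Hypothetical threshold; tune it for your own cluster.
MAX_DFS_USED_PCT = 80.0


def parse_report(text):
    """Collect 'Key: value' lines into a dict, keeping the first occurrence of each key."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in fields:
            fields[key] = value.strip()
    return fields


def check_report(text, max_used_pct=MAX_DFS_USED_PCT):
    """Return a list of human-readable alerts for the given report text."""
    fields = parse_report(text)
    alerts = []

    # Field name assumed from typical report output, e.g. "DFS Used%: 85.50%".
    used = fields.get("DFS Used%")
    if used is not None:
        pct = float(used.rstrip("%"))
        if pct > max_used_pct:
            alerts.append(f"DFS usage {pct:.1f}% exceeds {max_used_pct:.1f}%")

    # Field name assumed as well, e.g. "Missing blocks: 2".
    missing = fields.get("Missing blocks")
    if missing is not None:
        match = re.match(r"\d+", missing)
        if match and int(match.group()) > 0:
            alerts.append(f"{match.group()} missing blocks")

    return alerts


# Illustrative sample only; not output captured from a real cluster.
SAMPLE = """\
Configured Capacity: 1000 (1000 B)
DFS Used%: 85.50%
Missing blocks: 2
"""

if __name__ == "__main__":
    for alert in check_report(SAMPLE):
        print("ALERT:", alert)
```

In practice, a cron job would feed the function the captured output of the real report command and send the alerts by mail or to a paging system instead of printing them.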
To be fair, sysadmins also have important advantages:
They typically have much more experience than DBAs in managing huge numbers of machines. They have experience working with configuration management and deployment tools (Puppet, Chef), which is absolutely critical when managing large clusters. They tend to be more comfortable digging into the OS and network when configuring and troubleshooting systems, which is an important part of Hadoop administration.