Big Data Engineer Job Description Template


Presented by Toptal

A Big Data Engineer creates and manages a company’s Big Data infrastructure and tools, and knows how to get results from vast amounts of data quickly.

The actual definition of this role varies, and it often overlaps with the Data Scientist role. Here, we will assume that it is a role focused on engineering, with no requirement for statistics or strong machine learning skills.

The world of Big Data has grown significantly during the last decade, and the required skills have become more specialized as a result. While in the majority of cases the infrastructure is built around Hadoop, many other tools have become significant in their own right. We have covered some common cases in the following sample description.

Big Data Engineer – Job Description and Ad Template

Company Introduction

{{Write a short and catchy paragraph about your company. Make sure to provide information about the company culture, perks, and benefits. Mention office hours, remote working possibilities, and everything else you think makes your company interesting. Big Data Engineers like to work on huge problems – mentioning the scale (or the potential) can help gain the attention of top talent.}}

Job Description

We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for integrating them with the architecture used across the company.


Responsibilities

– Selecting and integrating any Big Data tools and frameworks required to provide requested capabilities

– Implementing ETL processes {{if importing data from existing data sources is relevant}}

– Monitoring performance and advising on any necessary infrastructure changes

– Defining data retention policies

– {{Add any other responsibility that is relevant}}

Skills and Qualifications

– Proficient understanding of distributed computing principles

– Management of a Hadoop cluster, with all included services {{unless you are going to have specific Big Data DevOps roles for this}}

– Ability to solve any ongoing issues with operating the cluster {{unless you are going to have specific Big Data DevOps roles for this}}

– Proficiency with Hadoop v2, MapReduce, HDFS

– Experience with building stream-processing systems, using solutions such as Storm or Spark Streaming {{if stream-processing is relevant for the role}}

– Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala

– Experience with Spark {{if you are including or planning to include it}}

– Experience with integration of data from multiple data sources

– Experience with NoSQL databases, such as HBase, Cassandra, or MongoDB

– Knowledge of various ETL techniques and frameworks, such as Flume

– Experience with various messaging systems, such as Kafka or RabbitMQ

– Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O {{if you are going to integrate Machine Learning in your Big Data infrastructure}}

– Good understanding of Lambda Architecture, along with its advantages and drawbacks

– Experience with Cloudera/MapR/Hortonworks {{you can specify the distribution you are currently using or planning to use here}}

– {{List any other technologies you are using or planning to use. Most Big Data Engineers will know some of the ones listed here: The Hadoop Ecosystem Table}}

– {{List education level or certification you require}}

This article originally appeared on Toptal


