Are you passionate about data? Does the prospect of dealing with massive volumes of data excite you? Do you want to build data engineering solutions that process billions of records a day in a scalable fashion using big data technologies? Do you want to create the next-generation tools for intuitive data access and advanced analytics?
As a Data Engineer, you should be familiar with data warehousing technical components (e.g. data modeling, ETL, performance tuning, data extraction, SQL, and reporting), infrastructure (e.g. hardware and software), and their integration. You should have a good understanding of enterprise-level data warehouse solutions spanning multiple platforms (RDBMS, columnar, cloud), and be able to create and manage business use of extremely large datasets. You should have excellent business and communication skills, enabling you to work with business owners to develop and define key business questions and to build datasets that answer those questions. You will be expected to build efficient, flexible, extensible, and scalable ETL and reporting solutions. You should be enthusiastic about learning new technologies and able to apply them to deliver new functionality to users or to scale the existing platform. Excellent written and verbal communication skills are required, as you will work very closely with diverse teams. Strong analytical skills are a plus. Above all, you should be passionate about working with huge datasets and love bringing data together to answer business questions and drive change.
Our ideal candidate thrives in a fast-paced environment, relishes working with large transactional volumes and big data, enjoys the challenge of highly complex business contexts (often defined in real time), and, above all, is passionate about data and analytics.
Working towards a Master’s degree in Computer Science, Information Technology, Business Analytics, Mathematics, Engineering, Data Science, Information Systems, Statistics, or another field of experimental science
Programming skills sufficient to extract, transform, and clean large (multi-TB) datasets in a Unix/Linux environment
Knowledge of scripting for automation (e.g. Python, Scala, Perl, Ruby)
Experience in data mining and in optimizing the performance of data processing workloads (relational or non-relational data environments)
Experience analyzing structured and unstructured data; familiarity with SQL, ETL, and other analytical tools
Exemplary communication skills and the ability to work with large cross-functional teams of technical and non-technical members
Excellent critical thinking skills, combined with the ability to present your beliefs clearly and compellingly in both verbal and written form
Familiarity with the theory and practice of information retrieval, relevance, machine learning, and data mining
Strong desire to push your ideas into production, overcoming obstacles along the way, in order to benefit Amazon's customers
Experience with data modeling and building data pipelines for large-scale data ingestion and data extraction workflows
Experience with AWS Solutions, including EC2, S3, Redshift, Kinesis, EMR (or Hadoop)
Experience with big data technologies and familiarity with industry-leading open-source solutions for data processing and data management