Rate

₹ 168,000 (Monthly)

Experience

4.4 Years

Availability

Immediate

Work From

Offsite

Skills

Data Engineer, Hadoop, Apache, ETL, Python, Snowflake, AWS, MySQL

Description

VENKATESH

Profile Summary:

3.9 years of data engineering experience handling large datasets.

Hands-on development and implementation experience on a Big Data Management Platform (BMP) using Hadoop, Apache Spark, Hive, Sqoop, Scala, and ETL.
Basic knowledge of cloud technologies: Amazon Web Services (AWS) S3, EMR, Glue, Athena, Lambda, and Redshift, plus Snowflake.
Basic knowledge of Python and PySpark.
Good knowledge of importing and exporting data with Sqoop between relational database management systems (RDBMS) and HDFS.
Experience handling various file formats such as CSV, JSON, Avro, and Parquet.
Developed data queries in HQL, optimized Hive queries, and worked with both simple and complex SQL queries.
Contributed to the development of robust data pipelines using Hadoop, Hive, Spark, and Scala, building efficient and scalable data processing workflows.
Practical experience designing and building Hive external tables backed by a shared metastore stored in MySQL.
Hands-on experience with GitHub repositories, including cloning, committing changes, pulling updates, and pushing modifications.
Advanced proficiency in scheduling jobs with Control-M and monitoring their execution.
Worked in cross-functional Scrum teams following Agile methodology.
Actively participated in Scrum ceremonies, including daily stand-ups, sprint planning, and retrospectives, ensuring effective collaboration and timely project delivery.

Academic Qualification:

Attained a distinguished Master of Science (M.Sc.) degree from Acharya Nagarjuna University.

Professional Experience:

Employed as a Big Data Engineer at Confidential from September 2019 to date.

Technical Skills:

- Big Data Tools : Hadoop, Hive (2.1), Spark (2.4), Sqoop
- Hadoop distribution : Cloudera Distribution Platform (6.3)
- Databases : MySQL
- Cloud Technologies : AWS (EMR, S3, Glue, Athena, Lambda), Snowflake
- Programming : SQL, Scala (2.11)
- Environment : Windows, Linux
- SDLC Model : Agile Model

Project 2:

Name: Pharmaceutical Production Audit Platform (PPAP)

Role: Data Engineer

Project Description:

This project focuses on developing an effective production management system to address challenges related to excess production of non-patent, non-exclusive drugs, increasing competition from generic pharmaceuticals, the prevalence of counterfeit drugs, and the sudden emergence of infectious diseases. The client receives raw data from a multitude of internal and external sources, primarily derived from production units. These sources inc

VENKATESH (RID : 15cyvln0f4acf)