Tejender
PROFESSIONAL SUMMARY
- Hands-on experience working in domains such as Insurance and Healthcare.
- In-depth knowledge of Apache Hadoop and Apache Spark architecture.
- Hands-on experience with Hadoop ecosystem components such as HDFS, YARN, Hive, Sqoop, Flume, MapReduce, Pig, and Kafka. Experience importing and exporting data between RDBMS and Hadoop/Hive using Sqoop.
- Experience transporting and processing real-time streaming data using Kafka.
- Experience implementing OLAP multidimensional cube functionality using Azure SQL Data Warehouse and Azure Data Factory.
- Experience writing Spark transformations and actions using Spark SQL in PySpark (see the first sketch after this list).
- Experience writing HQL queries against the Hive data warehouse.
- Modifying and performance-tuning Hive scripts, resolving automation job failures, and reloading data into the Hive data warehouse when needed.
- Experience processing streaming data using Flume and Pig.
- Owning the ETL process end to end.
- Improving the data quality, reliability, and efficiency of individual components and of the complete system.
- Processing streaming data through Spark using Kafka (see the streaming sketch after this list).
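A minimal PySpark sketch of the Spark SQL transformations and actions described above; the input path, column names, and filter condition are illustrative assumptions, not project code.

    # Spark SQL transformations and actions in PySpark.
    # The S3 path and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("claims-example").getOrCreate()

    # Transformations are lazy: nothing runs until an action is called.
    claims = spark.read.parquet("s3://example-bucket/claims/")
    open_claims = claims.filter(F.col("status") == "OPEN")
    by_state = open_claims.groupBy("state").agg(F.sum("amount").alias("total_amount"))

    # The same aggregation expressed in Spark SQL.
    claims.createOrReplaceTempView("claims")
    by_state_sql = spark.sql(
        "SELECT state, SUM(amount) AS total_amount "
        "FROM claims WHERE status = 'OPEN' GROUP BY state"
    )

    # Actions trigger execution and materialize results.
    by_state_sql.show()
    print(by_state.count())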
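And a minimal sketch of consuming a Kafka topic with Spark Structured Streaming; the broker address, topic name, and checkpoint path are assumptions, and the spark-sql-kafka connector package is assumed to be on the classpath.

    # Reading a Kafka topic as a structured stream.
    # Broker, topic, and checkpoint location are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-stream-example").getOrCreate()

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "events")
        .load()
    )

    # Kafka delivers the payload as binary; cast the value to a string.
    messages = events.select(F.col("value").cast("string").alias("message"))

    # Console sink, as used for local testing; production would write to
    # a durable sink such as HDFS or a Hive table.
    query = (
        messages.writeStream.outputMode("append")
        .format("console")
        .option("checkpointLocation", "/tmp/ckpt")
        .start()
    )
    query.awaitTermination()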
TECHNICAL SKILLS
Hadoop/Big Data: Hadoop, MapReduce, HDFS, ZooKeeper, Kafka, Hive, REST APIs, EMR, Glue, Data Pipeline
Big Data Frameworks: HDFS, YARN, Spark
Programming Languages/Scripting: Python
Operating Systems: Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8
Databases: SQL Server, MySQL, Oracle, Athena, Hive
Cloud Solutions: Microsoft Azure, AWS
EDUCATION
Masters:
Masters in Data Science, BITS Pilani, Delhi (2019 – 2021)
Diploma:
Advanced Diploma in Big Data Analytics, National Institute of Electronics & Information Technology, Calicut (2017 – 2018)
Graduation:
B.Tech (Computer Science Engineering), RPSGOI (MDU), Mohindergarh, Haryana (2012 – 2016), graduated
Senior Secondary:
Jeeven Jyoti Sr Sec School, Haryana (2010 – 2011)
High School:
Stracey Memorial High School, Bangalore (2008 – 2009)
PROFESSIONAL EXPERIENCE
Senior Associate Data Engineer Sept 2020 – Present
- Working on the migration of on-premises data to the AWS and Azure clouds.
- Preparing documentation of reports and of the on-premises database data.
- Creating Lambda functions, Glue jobs, and Redshift tables in the AWS cloud.
- Creating data models for existing databases and reports.
- Interacting with the client to better understand the existing databases and reports.
- Progressing through a number of sprints for the low-level design.
- Working with Lambda and Glue to implement data flows (see the Glue sketch after this list).
- Preparing DDL for Redshift and deploying it to Redshift.
- Validating data against the correct files with checksums across five stages (see the checksum sketch after this list).
- Worked on the Python and PySpark scripts for Glue jobs.
- Designed Step Functions and triggers for Glue job scheduling.
- Optimized code across all three stages.
- Handling daily inbound and outbound feed data files.
- Refining historical data and loading it into Redshift, generating a two-year data refresh for Cognos reporting.
- Handling defects and maintaining changes based on client requirements.
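A minimal sketch of the kind of Glue PySpark job described above; the catalog database, table name, filter column, and S3 output path are hypothetical.

    # Minimal AWS Glue PySpark job: read from the Data Catalog, drop bad
    # rows, and write Parquet to S3. Names and paths are hypothetical.
    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table registered in the Glue Data Catalog.
    dyf = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="feeds"
    )

    # Drop rows with a missing key before loading (illustrative filter).
    clean = dyf.filter(lambda row: row["id"] is not None)

    # Write the cleaned data to S3 as Parquet.
    glue_context.write_dynamic_frame.from_options(
        frame=clean,
        connection_type="s3",
        connection_options={"path": "s3://example-bucket/clean/feeds/"},
        format="parquet",
    )
    job.commit()

A job like this can then be scheduled through a Glue trigger or a Step Functions state machine, as in the scheduling bullet above.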
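And a small sketch of checksum-based file validation using only the Python standard library; the file name and expected digest in the usage comment are illustrative.

    # Validate a feed file against an expected MD5 checksum.
    import hashlib

    def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
        """Hash the file in chunks so large feeds do not exhaust memory."""
        digest = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def is_valid(path: str, expected: str) -> bool:
        return md5_of(path) == expected.lower()

    # Usage: compare the feed file to the checksum shipped alongside it.
    # is_valid("feed_20240101.csv", "9e107d9d372bb6826bd81d3542a419d6")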
Data Engineer Sept 2019 – Aug 2020
Wavelabs (CLIX CAPITAL, Farther Finance), Gurugram
- Worked on Customer 360.
- Formatting different data formats (from REST APIs) into a structured format.
- Preparing logical and physical data models.
- Creating table schemas with primary and foreign keys in AWS RDS for the REST API data (see the load sketch at the end of this section).
- Loading structured data into AWS RDS, ignoring duplicates by checking the log table.
- Comparing records with soft matching using FuzzyWuzzy, and calculating the distance between zip codes (see the matching sketch at the end of this section).
- Working on the development of data pipelines using AWS Glue, AWS Data Pipeline, AWS Lambda, EMR, and notebooks.
- Writing business logic in Python, PySpark, and Glue ETL.
- Writi
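A sketch of the schema-and-dedup load pattern described above, with sqlite3 from the standard library standing in for AWS RDS; the table and column names are illustrative assumptions.

    # Schema with primary/foreign keys, plus duplicate-safe inserts.
    # sqlite3 stands in for AWS RDS; names are hypothetical.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE loan (
            loan_id     INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            amount      REAL
        );
        """
    )

    # INSERT OR IGNORE skips rows whose primary key is already loaded,
    # mirroring the "ignore duplicates" check against a load-log table.
    rows = [(1, "Asha"), (1, "Asha"), (2, "Ravi")]
    conn.executemany("INSERT OR IGNORE INTO customer VALUES (?, ?)", rows)
    print(conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0])  # 2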
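And a sketch of the soft record matching with FuzzyWuzzy plus a zip-code distance check; the sample records, thresholds, and the zip-to-coordinate lookup are illustrative assumptions (a real lookup would come from a reference dataset), and the fuzzywuzzy package must be installed.

    # Soft-match two customer records by name, then compare zip distance.
    # The records and the zip -> (lat, lon) table are hypothetical.
    from math import asin, cos, radians, sin, sqrt
    from fuzzywuzzy import fuzz  # pip install fuzzywuzzy

    ZIP_COORDS = {"10001": (40.7506, -73.9972), "07302": (40.7178, -74.0431)}

    def haversine_km(a, b):
        """Great-circle distance in kilometres between two (lat, lon) pairs."""
        lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
        h = sin((lat2 - lat1) / 2) ** 2 \
            + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(h))

    def is_match(rec1, rec2, name_threshold=85, max_km=25):
        name_score = fuzz.token_sort_ratio(rec1["name"], rec2["name"])
        dist = haversine_km(ZIP_COORDS[rec1["zip"]], ZIP_COORDS[rec2["zip"]])
        return name_score >= name_threshold and dist <= max_km

    print(is_match({"name": "Jon A. Smith", "zip": "10001"},
                   {"name": "Smith, Jon", "zip": "07302"}))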