Tejender
PROFESSIONAL SUMMARY
- Hands-on experience working in domains such as Insurance and Healthcare.
- In-depth knowledge of Apache Hadoop and Apache Spark architecture.
- Hands-on experience with Hadoop ecosystem components such as HDFS, YARN, Hive, Sqoop, Flume, MapReduce, Pig, and Kafka. Experience importing and exporting data between RDBMS and Hadoop/Hive using Sqoop.
- Experience transporting and processing real-time streaming data using Kafka.
- Experience implementing OLAP multidimensional cube functionality using Azure SQL Data Warehouse and Azure Data Factory.
- Experience writing Spark transformations and actions using Spark SQL in PySpark (see the first sketch after this list).
- Experience writing HQL queries against the Hive data warehouse.
- Modifying and performance-tuning Hive scripts, resolving automation job failures, and reloading data into the Hive data warehouse when needed.
- Experience processing streaming data using Flume and Pig.
- Owning the ETL process end to end.
- Improving the data quality, reliability, and efficiency of individual components and of the complete system.
- Processing streaming data through Spark using Kafka (see the streaming sketch after this list).
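A minimal PySpark sketch of the Spark SQL transformations and actions described above; the input path, column names, and filter condition are illustrative assumptions, not project code.

    # Spark SQL transformations and actions in PySpark.
    # The S3 path and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("claims-example").getOrCreate()

    # Transformations are lazy: nothing runs until an action is called.
    claims = spark.read.parquet("s3://example-bucket/claims/")
    open_claims = claims.filter(F.col("status") == "OPEN")
    by_state = open_claims.groupBy("state").agg(F.sum("amount").alias("total_amount"))

    # The same aggregation expressed in Spark SQL.
    claims.createOrReplaceTempView("claims")
    by_state_sql = spark.sql(
        "SELECT state, SUM(amount) AS total_amount "
        "FROM claims WHERE status = 'OPEN' GROUP BY state"
    )

    # Actions trigger execution and materialize results.
    by_state_sql.show()
    print(by_state.count())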
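And a minimal sketch of consuming a Kafka topic with Spark Structured Streaming; the broker address, topic name, and checkpoint path are assumptions, and the spark-sql-kafka connector package is assumed to be on the classpath.

    # Reading a Kafka topic as a structured stream.
    # Broker, topic, and checkpoint location are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-stream-example").getOrCreate()

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "events")
        .load()
    )

    # Kafka delivers the payload as binary; cast the value to a string.
    messages = events.select(F.col("value").cast("string").alias("message"))

    # Console sink, as used for local testing; production would write to
    # a durable sink such as HDFS or a Hive table.
    query = (
        messages.writeStream.outputMode("append")
        .format("console")
        .option("checkpointLocation", "/tmp/ckpt")
        .start()
    )
    query.awaitTermination()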
TECHNICAL SKILLS
Hadoop/Big Data: Hadoop, MapReduce, HDFS, ZooKeeper, Kafka, Hive, REST APIs, EMR, Glue, Data Pipeline
Big Data Frameworks: HDFS, YARN, Spark
Programming Languages/Scripting: Python
Operating Systems: Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8
Databases: SQL Server, MySQL, Oracle, Athena, Hive
Cloud Solutions: Microsoft Azure, AWS
EDUCATION
Masters:
Masters in Data Science, BITS Pilani, Delhi (2019 – 2021)
Diploma:
Advanced Diploma in Big Data Analytics, National Institute of Electronics & Information Technology, Calicut (2017 – 2018)
Graduation:
B.Tech (Computer Science Engineering), RPSGOI (MDU), Mohindergarh, Haryana (2012 – 2016), graduated
Senior Secondary:
Jeeven Jyoti Sr Sec School, Haryana (2010 – 2011)
High School:
Stracey Memorial High School, Bangalore (2008 – 2009)
PROFESSIONAL EXPERIENCE
Senior Associate Data Engineer Sept 2020 – Present
- Working on the migration of on-premises data to the AWS and Azure clouds.
- Preparing documentation of reports and of the on-premises database data.
- Creating Lambda functions, Glue jobs, and Redshift tables in the AWS cloud.
- Creating data models for existing databases and reports.
- Interacting with the client to better understand the existing databases and reports.
- Progressing through a number of sprints for the low-level design.
- Working with Lambda and Glue to implement data flows (see the Glue sketch after this list).
- Preparing DDL for Redshift and deploying it to Redshift.
- Validating data against the correct files with checksums across five stages (see the checksum sketch after this list).
- Worked on the Python and PySpark scripts for Glue jobs.
- Designed Step Functions and triggers for Glue job scheduling.
- Optimized code across all three stages.
- Handling daily inbound and outbound feed data files.
- Refining historical data and loading it into Redshift, generating a two-year data refresh for Cognos reporting.
- Handling defects and maintaining changes based on client requirements.
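A minimal sketch of the kind of Glue PySpark job described above; the catalog database, table name, filter column, and S3 output path are hypothetical.

    # Minimal AWS Glue PySpark job: read from the Data Catalog, drop bad
    # rows, and write Parquet to S3. Names and paths are hypothetical.
    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table registered in the Glue Data Catalog.
    dyf = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="feeds"
    )

    # Drop rows with a missing key before loading (illustrative filter).
    clean = dyf.filter(lambda row: row["id"] is not None)

    # Write the cleaned data to S3 as Parquet.
    glue_context.write_dynamic_frame.from_options(
        frame=clean,
        connection_type="s3",
        connection_options={"path": "s3://example-bucket/clean/feeds/"},
        format="parquet",
    )
    job.commit()

A job like this can then be scheduled through a Glue trigger or a Step Functions state machine, as in the scheduling bullet above.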
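And a small sketch of checksum-based file validation using only the Python standard library; the file name and expected digest in the usage comment are illustrative.

    # Validate a feed file against an expected MD5 checksum.
    import hashlib

    def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
        """Hash the file in chunks so large feeds do not exhaust memory."""
        digest = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def is_valid(path: str, expected: str) -> bool:
        return md5_of(path) == expected.lower()

    # Usage: compare the feed file to the checksum shipped alongside it.
    # is_valid("feed_20240101.csv", "9e107d9d372bb6826bd81d3542a419d6")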
Data Engineer Sept 2019 – Aug 2020
Wavelabs (CLIX CAPITAL, Farther Finance), Gurugram
- Worked on Customer 360.
- Formatting different data formats (from REST APIs) into a structured format.
- Preparing logical and physical data models.
- Creating table schemas with primary and foreign keys in AWS RDS for the REST API data (see the load sketch at the end of this section).
- Loading structured data into AWS RDS, ignoring duplicates by checking the log table.
- Comparing records with soft matching using FuzzyWuzzy, and calculating the distance between zip codes (see the matching sketch at the end of this section).
- Working on the development of data pipelines using AWS Glue, AWS Data Pipeline, AWS Lambda, EMR, and notebooks.
- Writing business logic in Python, PySpark, and Glue ETL.
- Writi
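A sketch of the schema-and-dedup load pattern described above, with sqlite3 from the standard library standing in for AWS RDS; the table and column names are illustrative assumptions.

    # Schema with primary/foreign keys, plus duplicate-safe inserts.
    # sqlite3 stands in for AWS RDS; names are hypothetical.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE loan (
            loan_id     INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            amount      REAL
        );
        """
    )

    # INSERT OR IGNORE skips rows whose primary key is already loaded,
    # mirroring the "ignore duplicates" check against a load-log table.
    rows = [(1, "Asha"), (1, "Asha"), (2, "Ravi")]
    conn.executemany("INSERT OR IGNORE INTO customer VALUES (?, ?)", rows)
    print(conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0])  # 2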
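And a sketch of the soft record matching with FuzzyWuzzy plus a zip-code distance check; the sample records, thresholds, and the zip-to-coordinate lookup are illustrative assumptions (a real lookup would come from a reference dataset), and the fuzzywuzzy package must be installed.

    # Soft-match two customer records by name, then compare zip distance.
    # The records and the zip -> (lat, lon) table are hypothetical.
    from math import asin, cos, radians, sin, sqrt
    from fuzzywuzzy import fuzz  # pip install fuzzywuzzy

    ZIP_COORDS = {"10001": (40.7506, -73.9972), "07302": (40.7178, -74.0431)}

    def haversine_km(a, b):
        """Great-circle distance in kilometres between two (lat, lon) pairs."""
        lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
        h = sin((lat2 - lat1) / 2) ** 2 \
            + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(h))

    def is_match(rec1, rec2, name_threshold=85, max_km=25):
        name_score = fuzz.token_sort_ratio(rec1["name"], rec2["name"])
        dist = haversine_km(ZIP_COORDS[rec1["zip"]], ZIP_COORDS[rec2["zip"]])
        return name_score >= name_threshold and dist <= max_km

    print(is_match({"name": "Jon A. Smith", "zip": "10001"},
                   {"name": "Smith, Jon", "zip": "07302"}))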