Himanshu
Professional Summary:
Skills
Technical Skills:
Technology: Python, PySpark
RDBMS: MS SQL (SSMS), MySQL
Tools: Jupyter, PyCharm, VS Code, GitLab, Azure Synapse
Big Data Technologies: Azure Databricks, Azure Data Factory, PySpark, Synapse, Pandas DataFrame, NumPy, Matplotlib, Pysftp, and SQLAlchemy
Experience:
Aug 2021 - Till Date Data Engineer
IBOTIX LLP, Noida
Domain: Insurance
Handling a team of 7 members
Nov 2016 - Aug 2021 Senior Python Developer
Ebix Software India Pvt Ltd, Noida
Domain: Insurance
Handling a team of 12 members
Apr 2015 - Oct 2016 Software Developer
Pinaka Aerospace Solutions Pvt Ltd, Delhi
Domain: Defense
Feb 2013 - Mar 2015 Software Developer
Oaxis Software Solutions, Noida
Domain: Web Based Applications
Education:
2008 - 2012 Bachelor of Technology (IT)
Sunder Deep Group of Institutions, Ghaziabad, UPTU
2006 - 2007 Intermediate
Shivaji Inter College, Kanpur
2004 - 2005 Matriculation
Shivaji Inter College, Kanpur
Domain Knowledge: Insurance / Finance
Projects Undertaken:
1. DentsPlySirona — ERP System Data Warehousing (ETL Development)
● Worked as a Big Data Developer / Azure Python Developer.
● The pipeline has three stages: Source-to-Raw, Raw-to-Refined, and Refined-to-Certified.
● The client’s data is received into the raw layer from Qlik.
● Quality checks are performed on the raw data, which is then stored in the refined layer.
● Transformations are applied to the refined data, which is then stored as dimensions and facts (see the PySpark sketch after the environment list below).
Environment: Azure Data Factory, Azure Databricks, Azure Synapse, Azure Linked Services, SSMS, Blob Storage, file formats such as CSV and Parquet, Pipelines, Activities, Datasets.
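A minimal PySpark sketch of the Raw-to-Refined and Refined-to-Certified steps above; the storage paths, the "orders" entity, and the column names are illustrative assumptions, not the actual project schema:

```python
# Minimal sketch of the Raw -> Refined -> Certified flow.
# Paths, columns, and the "orders" entity are assumptions for illustration.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("erp_dw_etl").getOrCreate()

# Source-to-Raw: data delivered by Qlik lands in the raw container.
raw = spark.read.parquet("abfss://raw@datalake.dfs.core.windows.net/orders/")

# Raw-to-Refined: basic quality checks (deduplication, non-null keys,
# valid amounts), then persist the cleaned rows to the refined layer.
refined = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("order_id").isNotNull() & (F.col("amount") >= 0))
       .withColumn("load_date", F.current_date())
)
refined.write.mode("overwrite").parquet(
    "abfss://refined@datalake.dfs.core.windows.net/orders/"
)

# Refined-to-Certified: shape the refined data into a dimension and a fact.
dim_customer = refined.select("customer_id", "customer_name").dropDuplicates(["customer_id"])
fact_sales = refined.select("order_id", "customer_id", "amount", "load_date")

dim_customer.write.mode("overwrite").parquet(
    "abfss://certified@datalake.dfs.core.windows.net/dim_customer/"
)
fact_sales.write.mode("overwrite").parquet(
    "abfss://certified@datalake.dfs.core.windows.net/fact_sales/"
)
```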
2. Otsuka — File DQ Checks.
● Quality checks are performed on received files, covering file name, file type, file size, file header, etc.
● Files are first validated against the expected file name pattern together with the expected file format.
● The file size is then checked; if all checks pass, the system reads the first row (header) of the file and compares it with the expected header (see the sketch after the environment list below).
Environment: Python 3, GitLab CI/CD pipelines, AWS, PyMySQL, AWS SNS, SQS, S3 bucket, boto, Pandas, Amazon WorkSpaces, AWS Secrets Manager, AWS Athena.
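A minimal sketch of these file DQ checks, assuming an illustrative file name pattern, size limit, and expected header rather than the actual Otsuka specification:

```python
# Minimal sketch of the file-level DQ checks. The pattern, size cap, and
# expected header are assumptions for illustration only.
import os
import re

import pandas as pd

EXPECTED_NAME_PATTERN = re.compile(r"^claims_\d{8}\.csv$")   # assumed pattern
MAX_SIZE_BYTES = 500 * 1024 * 1024                           # assumed 500 MB cap
EXPECTED_HEADER = ["member_id", "claim_id", "claim_amount"]  # assumed header

def validate_file(path: str) -> list[str]:
    """Return a list of DQ failures; an empty list means the file passed."""
    errors = []
    name = os.path.basename(path)

    # 1. File name pattern and file type (extension) check.
    if not EXPECTED_NAME_PATTERN.match(name):
        errors.append(f"file name {name!r} does not match the expected pattern")

    # 2. File size check.
    if os.path.getsize(path) > MAX_SIZE_BYTES:
        errors.append("file exceeds the allowed size")

    # 3. Header check: compare the first row with the expected column list.
    if not errors:
        header = list(pd.read_csv(path, nrows=0).columns)
        if header != EXPECTED_HEADER:
            errors.append(f"header mismatch: got {header}")

    return errors
```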
3. Data Ingestion — Data exchange between the vendor and the Genpact process.
● Environment: Python, Pandas, NumPy, boto3, PyMySQL, gzip, MD5, Docker, Git, AWS, RDS, Athena, Secrets Manager, Amazon WorkSpaces, S3 bucket, GitLab, CI/CD pipelines (a minimal sketch of the exchange step follows below).
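A minimal sketch of such an exchange step: compress an extract, record its MD5 checksum, and upload it to S3 with boto3. The bucket name, key prefix, and ingest_file helper are illustrative assumptions, not the actual project code:

```python
# Minimal sketch of the vendor file exchange: gzip the extract, compute an
# MD5 checksum for integrity, and push the compressed file to S3.
# Bucket and key names are assumptions for illustration.
import gzip
import hashlib
import shutil

import boto3

def ingest_file(local_path: str, bucket: str = "genpact-ingest-demo") -> str:
    """Compress a vendor file, record its MD5, and upload it to S3."""
    gz_path = local_path + ".gz"
    with open(local_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # MD5 of the compressed payload, used to verify the transfer end-to-end.
    with open(gz_path, "rb") as fh:
        checksum = hashlib.md5(fh.read()).hexdigest()

    s3 = boto3.client("s3")
    key = "incoming/" + os.path.basename(gz_path)
    s3.upload_file(gz_path, bucket, key, ExtraArgs={"Metadata": {"md5": checksum}})
    return checksum

import os  # required by os.path.basename above
```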