Hello, I'm

Harshit Tripathi

Data Engineer

View My Portfolio

Who I Am and What I Do

Experienced IT professional with over 6 years of hands-on experience in Python development, data engineering, and implementing data strategy using Azure cloud tools such as PySpark, Azure Databricks, Azure Data Factory, Azure Synapse Analytics, Azure Data Lake, Azure Blob Storage, Azure Analysis Services, Power BI, and SQL. Proficient in designing, developing, and maintaining ETL frameworks, data pipelines, and data integration solutions. Skilled in data transformation, optimizing database performance, ensuring data quality, and automating data processing tasks. Strong background in building complex data processing frameworks, creating real-time dashboards for strategic decision-making, and collaborating within Agile development frameworks.


My Technical Skills

Delta Lake

Azure Data Factory

Azure Analysis Services

Databricks

Function App

Azure Synapse Analytics

NoSQL

Power BI

Python

Bitbucket

Apache Airflow

My Professional Journey

Current role

At Expleo Ltd, I serve as a Data Engineer for Rolls-Royce, where I design, build, and maintain robust data pipelines using Azure Data Factory and SQL. I integrate data from diverse sources, ensuring accuracy and reliability. My role involves tuning and optimizing database performance and implementing automation with Python and Pandas. I create real-time dashboards with Power BI, manage Databricks workflows, and collaborate with global teams. Additionally, I ensure high data quality in all processes, contribute to data migration projects, and drive initiatives to automate processes and enhance scalability within an Agile framework.

Previous Roles in Data Engineering


At Websoft Technologies Ltd, working on-site for the NHS from May 2022 to September 2023, I analyzed data quality, volume, and complexity to determine migration requirements. I designed and implemented ETL processes using Azure Databricks, facilitating efficient data extraction, transformation, and loading into Azure-based data repositories. My role involved optimizing data loading processes for performance and scalability, implementing data pipelines to automate data movement, and ensuring high data quality standards through anomaly detection and data cleansing.


From August 2017 to September 2021, at Stepfinity Software Pvt Ltd, I worked for a Fintech client, migrating data from on-premises servers to Azure Data Lake using Azure Data Factory. I created schemas, facts, and dimensions using MS SQL, designed ETL processes and data models for Azure SQL Server, and developed Spark code using Scala and Spark-SQL/Streaming. Additionally, I built dashboards in Power BI and orchestrated ETL processes, enhancing data processing efficiency and supporting strategic decision-making.

Projects

Cost Optimization on Azure Cloud

Effective ways for organizations to optimize costs and address user and customer issues through analytics on both batch and streaming data on the Azure cloud platform.


Institution Sales Orders BI

Migrated data from an on-premises server to Azure Data Lake using Azure Data Factory, and processed the received files using Scala, PySpark, and Spark SQL in Databricks.


Full Delta Load from On-Premises to Cloud

Conducted a comprehensive analysis of data sources, business needs, and integration requirements to determine the optimal approach for implementing a full delta load process. Designed a robust architecture for full delta load processes using Azure services. Implemented data extraction pipelines and delta load processes, optimizing performance and resource consumption.


Get in Touch

YouTube Channel

Email: Harshit.herts@gmail.com

Mobile: +44 07824702731

Follow me on social media

Facebook
Instagram
LinkedIn