System Online — All Pipelines Active

I don't just move data.

I engineer intelligence.

Data Engineer & AI/ML Specialist with 5+ years building enterprise-scale pipelines, real-time test validation systems, and AI solutions that have saved $1.6M+ annually.

130K+ Daily Test Transactions
$1.6M+ Annual Savings
60% Pipeline Optimization
99.7% Data Accuracy

// ABOUT

The Engineer Behind the Systems

about_tanish.log

> INIT: Career trajectory loaded...
[2019] Graduated BS Computer Engineering — UC Riverside
[2020] MS Computer Engineering — UC Riverside
[2020] First data role at CE-CERT — processing 500GB+ vehicle telemetry
[2021] Joined American Express — Arena Testing Team
[2023] Built AI data scrubbing system — saved $1.6M annually
[2024] Optimized ETL pipelines — cut processing from 8hrs to 3.2hrs
[2025] Led multi-cloud migration (AWS → GCP)
[2026] Current: Associate Data Engineer-AI at Jorie AI
> STATUS: Evolving from Data Engineer → AI/ML Engineer
> MISSION: Building systems that think, not just process.

I started my journey at UC Riverside, where I earned both my BS and MS in Computer Engineering. What began as curiosity about how systems process information evolved into a passion for building intelligent data architectures at scale.

At American Express, I spent nearly 5 years on the Arena Testing Team, where I didn't just build pipelines — I engineered solutions that fundamentally changed how the team operated. My AI-based data scrubbing system eliminated $1.6M in annual vendor costs. My ETL optimizations cut processing time by 60%. My real-time streaming pipelines validated 130K+ daily test transactions at sub-second speed.

Now at Jorie AI, I'm applying that same systems thinking to healthcare data, building RCM dashboards that turn complex billing data into actionable intelligence. My trajectory is clear: from moving data to engineering intelligence.

5+ Years Enterprise Experience

MS, UC Riverside

3 Domains

// EXPERIENCE

Systems I've Built & Scaled

Jorie AI

Associate Data Engineer — AI

Feb 2026 – Present | Oak Brook, IL

CURRENT

> Designing interactive RCM dashboards tracking AR aging, denial rates, and reimbursement trends
> Building secure SQL Server pipelines for healthcare billing analytics
> Implementing automated data validation for billing and claims accuracy
> Translating complex RCM requirements into scalable reporting solutions

Python
SQL Server
Power BI
Healthcare Analytics

American Express

Software Developer / Data Engineer

Apr 2021 – Feb 2026 | Phoenix, AZ

4.8 YEARS

> $1.6M saved annually: AI-based data scrubbing for banking personas
> 60% faster: ETL optimization with parallel processing using PySpark & AWS Glue
> 99.7% data accuracy: automated 15+ manual workflows with real-time validation
> 130K+ daily test transactions: real-time streaming test transaction validation via AWS Lambda & Kinesis
> 45% improvement (AWS → GCP): query performance boost through multi-cloud migration
> 67% fewer incidents: comprehensive CI/CD implementation for data pipelines

Python
PySpark
AWS
GCP
BigQuery
Lambda
Kinesis
HDFS
Power BI
SSIS

CE-CERT, UC Riverside

Software Engineer Intern / Data Analyst

Jun 2020 – Jan 2021 | Riverside, CA

INTERNSHIP

> Built automated data collection for Eco-Vissim driving simulation
> 500GB+: vehicle telemetry data processing for fuel efficiency research

Python
Tableau
ETL
Sensor Data

// PROJECTS

Intelligence Systems Portfolio

Real-world systems engineered to detect, predict, and optimize at enterprise scale.

$1.6M Annual Savings

AI Data Scrubbing Engine

Developed an AI-powered data scrubbing solution for banking and credit card personas, completely eliminating dependency on expensive external vendor contracts for test data generation.

Cost Eliminated: $1.6M/yr
Vendor Dependencies: 0
Data Quality: 99.7%

Python
Machine Learning
PySpark
AWS

Sub-second Validation

Real-time Streaming Pipeline

Designed and deployed real-time streaming data pipelines using AWS Lambda and Kinesis for test environment validation, enabling sub-second processing of 130K+ daily test transactions at scale.
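
For illustration only, the validation step of a pipeline like this could look like the sketch below. It is not the production code: the required fields, the `validate_transaction` helper, and the shape of the returned summary are all hypothetical. The only part taken from AWS is the standard Kinesis event layout, in which each record's payload arrives base64-encoded under `Records[].kinesis.data`.

```python
import base64
import json

# Hypothetical schema: field names are assumptions made for this sketch.
REQUIRED_FIELDS = ("transaction_id", "amount", "currency")


def validate_transaction(txn):
    """Return a list of problems found in one decoded test transaction."""
    if not isinstance(txn, dict):
        return ["payload is not a JSON object"]
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if name not in txn]
    amount = txn.get("amount")
    if amount is not None:
        is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
        if not is_number or amount <= 0:
            problems.append("amount must be a positive number")
    return problems


def handler(event, context):
    """Lambda entry point for a batch of Kinesis records."""
    summary = {"valid": 0, "invalid": []}
    for record in event.get("Records", []):
        kinesis = record["kinesis"]
        # Kinesis payloads are delivered base64-encoded.
        payload = base64.b64decode(kinesis["data"])
        try:
            problems = validate_transaction(json.loads(payload))
        except ValueError:  # covers malformed JSON and undecodable bytes
            problems = ["payload is not valid JSON"]
        if problems:
            summary["invalid"].append(
                {"sequence": kinesis.get("sequenceNumber"), "problems": problems}
            )
        else:
            summary["valid"] += 1
    return summary
```

In a real deployment the handler would be attached to the stream through a Lambda event source mapping, and invalid records would more likely be sent to a failure destination than returned in the response.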

Latency: <1 sec
Daily Test Volume: 130K+ txns
Uptime: 99.9%

AWS Lambda
Kinesis
Real-time Streaming
Event-driven

100+ Engineers Supported

Enterprise Data Lake

Architected enterprise-scale test data lake on AWS, consolidating data from Oracle, Teradata, mainframes, and RDBMS sources into HDFS, supporting massive QA operations.

Data Sources: 4+ types
Engineers Supported: 100+
Data Migrated: 5TB+

AWS S3
HDFS
Oracle
Teradata
Hadoop

60% Faster Processing

ETL Performance Optimizer

Optimized ETL pipeline performance through parallel processing and incremental loading strategies, cutting processing time from 8 hours to 3.2 hours.
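
As a rough illustration of the two ideas named above, the sketch below combines a watermark-based incremental load with parallel processing of partitions. It deliberately swaps in the Python standard library for PySpark and AWS Glue so that it runs anywhere. The row layout, the `transform_partition` step, and the function names are hypothetical and are not taken from the actual pipeline.

```python
from concurrent.futures import ThreadPoolExecutor


def select_new_rows(rows, watermark):
    """Incremental load: keep only rows changed since the last successful run.

    Timestamps are ISO-8601 strings in one fixed format, so plain string
    comparison orders them correctly.
    """
    return [row for row in rows if row["updated_at"] > watermark]


def transform_partition(partition):
    """Placeholder transform (hypothetical): express the amount in cents."""
    return [{**row, "amount_cents": round(row["amount"] * 100)} for row in partition]


def run_incremental_load(rows, watermark, partitions=4):
    """Transform only new rows, in parallel, and return the advanced watermark."""
    new_rows = select_new_rows(rows, watermark)
    if not new_rows:
        return [], watermark  # nothing changed, so the watermark stays put

    # Split the new rows into up to `partitions` non-empty chunks.
    chunks = [new_rows[i::partitions] for i in range(partitions)]
    chunks = [chunk for chunk in chunks if chunk]

    # Process the chunks concurrently; map() yields results in chunk order.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        transformed = [
            row for part in pool.map(transform_partition, chunks) for row in part
        ]

    new_watermark = max(row["updated_at"] for row in new_rows)
    return transformed, new_watermark
```

Threads are used here only to keep the example short. In Spark the same split happens across executor partitions, and the watermark would normally be persisted between runs.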

Speed Improvement: 60%
Before: 8 hours
After: 3.2 hours

PySpark
AWS Glue
Parallel Processing
SSIS

20% Cost Reduction

Multi-Cloud Migration

Led migration from AWS to GCP for test data infrastructure, achieving significant cost reduction while improving query performance through BigQuery optimization.

Cost Savings: 20%
Query Performance: +45%
Data Migrated: 5TB+

AWS
GCP
BigQuery
Data Migration

// SKILLS

Technology Arsenal

AI / ML & Data Science

Python / PySpark: 95%
Probabilistic Modeling: 85%
Anomaly Detection: 88%
Fraud Prediction: 82%
Predictive Analytics: 80%
Machine Learning: 85%

Cloud & Big Data

AWS (S3, Glue, Lambda): 92%
GCP (BigQuery, Dataflow): 85%
Hadoop / Hive / Spark: 88%
Real-time Streaming: 82%
Data Lake Architecture: 90%
Kinesis / Kafka: 78%

Engineering & Tools

ETL / Data Pipelines: 95%
C# / .NET / Scala: 75%
SQL / NoSQL: 90%
Power BI / Tableau: 85%
CI/CD Automation: 82%
SSIS / Automation: 80%

Tools & Platforms

Python
PySpark
Scala
C#
.NET
SQL
AWS S3
AWS Glue
AWS Lambda
AWS Kinesis
GCP BigQuery
GCP Dataflow
Hadoop
HDFS
Hive
Spark
Oracle
Teradata
SSIS
Power BI
Tableau
Git
CI/CD
Docker
Terraform

// INITIATE CONNECTION

Let's Build Something Intelligent Together

Whether you're looking for a Data Engineer who thinks in systems, an AI/ML specialist who delivers measurable impact, or a collaborator who turns complex data into strategic advantage — let's connect.

> tanish.arora

Engineered with precision. Powered by data.

© 2026 Tanish Arora