Senior Data & AI Platform Engineer

CoinDCX

📍 Bengaluru, India · 💰 Competitive · 🕐 Posted June 8, 2026

Data Engineer · Onsite

python · pyspark · spark-sql · kafka · airflow · mlflow · databricks · delta-lake · langchain · langgraph

Job Description

About Us

At CoinDCX, our mission is clear – to make crypto and blockchain accessible to every Indian and enable them to participate in the future of finance.

As India's first crypto unicorn valued at $2.45B, we are reshaping the financial ecosystem by building safe, transparent, and scalable products that power adoption at scale.

We believe that change starts together. It begins with bold ideas, relentless execution and people who want to build what's next. If you're driven by purpose and thrive in environments where your work defines the next chapter of an industry, you'll feel right at home here.

The Role

Operating a premier crypto exchange means moving at the absolute speed of the market. Crypto never sleeps, risk patterns evolve continuously, and malicious actors iterate by the minute. At CoinDCX, our data engineering foundation is already highly mature—processing billions of events daily via Databricks and Kafka. We aren't looking for someone to build basic data pipelines. We are looking for an exceptional engineer to construct the CoinDCX AI Value Platform.

This horizontal infrastructure layer will transform our massive data footprint into automated, intelligent action. You will build the frameworks, model registries, and context stores that allow both classic machine learning models and state-of-the-art Agentic AI systems to execute critical workflows safely—spanning real-time account takeover (ATO) containment, algorithmic crypto withdrawal risk-tiering, referral abuse detection, and AI-assisted wealth intelligence.

Responsibilities

Engineer the CoinDCX Entity 360 & Semantic Layers
Architect and optimize the Entity 360 Platform—specifically unifying disparate data streams into high-performance, real-time context stores including User 360, Wallet 360 (On-chain/Off-chain balance states), and Token 360
Build and govern a centralized Semantic and Metrics Layer to guarantee that data models, internal engines, and AI agents reference identical, deterministic definitions for core crypto metrics (e.g., active trader, malicious wallet cluster, referral loop, and crypto deposit/withdrawal (CDW) eligibility)
Standardize Exchange-Scale MLOps & Lifecycle Tracking
Own the deployment and standardization of MLflow (Model Registry, Tracking, Recipes) across CoinDCX to catalog, version, and deploy predictive models safely into our 24/7 production environment
Set up automated evaluation pipelines and tracing frameworks via MLflow LLM Tracking to capture live inputs/outputs, monitor data and feature drift, and benchmark model accuracy against real-world crypto market fluctuations
Build Agentic AI & Advanced LLM Infrastructure
Design and scale the data-routing backends required for Multi-Agent Systems (using LangGraph, CrewAI, or similar frameworks) to automate intricate compliance and operational journeys—such as auto-summarizing AML cases, evaluating token listing/delisting intelligence, and executing smart customer support agent routing
Build low-latency Retrieval-Augmented Generation (RAG) data systems. Optimize data chunking strategies, embedding generation, vector database indexing (via Databricks Vector Search), and semantic caching to minimize hallucinations in customer-facing applications
Leverage & Fuel Core Feature Stores
Build and maintain low-latency Feature Stores that pull directly from live Databricks (PySpark, Spark SQL, Delta Lake) environments to serve unified real-time signals to downstream transaction-monitoring and threat-detection models
Interface seamlessly with active Kafka/MSK, Auto Loader, and Change Data Capture (CDC) architectures to ensure downstream AI applications scale effortlessly without impacting existing core ledger or reporting SLAs
Implement Web3 Governance & Guardrails
Build institutional-grade security guardrails directly into the platform, including automated PII tokenization, wallet-masking policies, and rigorous access control via Unity Catalog

Requirements

4+ years of intensive platform or data engineering experience
SDE-2 or early SDE-3 level candidate with elite programming fundamentals, massive learning velocity, and zero fear of shifting paradigms
Expert-level mastery of Python, PySpark, and Spark SQL optimization
Intimate understanding of how distributed memory management works and how to manipulate massive datasets efficiently
Direct experience with Kafka/MSK and Apache Airflow (or Databricks Workflows) for complex, high-dependency system workflows
Practical implementation experience with MLflow for production model lifecycles
Strong conceptual or practical exposure to Vector Architectures and LLM coordination abstractions (LangChain, LangGraph, or LlamaIndex)

Nice to Have

Prior exposure to high-integrity transactional spaces—such as order matching engines, double-entry ledgers, blockchain nodes, risk compliance systems, or real-time payment gateways
Experience working in a FinTech or crypto context

Success Metrics

Fully standardize and operationalize MLflow pipelines across the team, bringing the first set of live account takeover (ATO) and referral abuse detection models under structured lifecycle management
Successfully ship the production data layers for User 360 and Wallet 360, cleanly feeding real-time context to downstream decision engines
Deploy the automated data ingestion, vector indexing, and evaluation framework for a digital customer support or internal intelligence agent
Ensure all new AI Value Platform integrations dock cleanly into our billion-event stream without introducing data lag or compromising the stability of our transactional core

Hiring Process

Application Review – Assessment for skills, alignment, and intent
Recruiter Connect – A short conversation to understand you better
Functional Round(s) – Deep dive into your approach, craft, and problem-solving
Assignment / Simulation Round – A take-home task or live problem-solving exercise to understand how you think and execute in real scenarios
Culture & Values Discussion – A conversation to understand our ways of working and how you thrive best
Founder Conversation (Optional) – For certain roles and senior levels, you may meet our founders to explore strategic alignment and long-term fit

Work Location

This role is based out of our Bangalore office. We operate as a work-from-office organization where collaboration, speed and trust come alive when teams share the same space.

Benefits

Flexible perks to match your lifestyle
Unlimited Wellness Leaves
Mental Wellness Support – Access to therapy and wellness resources
Bi-weekly learning and growth opportunities

