
Senior Data Engineer


Chainalysis

📍 Aarhus, Denmark · 💰 Est. $97k - $106k · 🕐 Posted Today

Tags: Data Engineer, Onsite, multi-chain

Skills: python, spark, aws, parquet, iceberg, delta-lake, kubernetes, terraform, postgres, snowflake, redshift, databricks

Job Description

About Us

The engineering team at Chainalysis is inspired by solving the hardest technical challenges and creating products that build trust in cryptocurrencies. We're a global organization with teams in Denmark, the UK, Canada, and the USA who thrive on challenging work and on doing it alongside exceptionally talented teammates. Our industry changes every day, and our job is to create user-facing products supported by a flexible, scalable data platform that allows us to adapt to those rapid changes and bring value to our customers.

Chainalysis has become known as the leader in blockchain investigation and compliance software. Our products have built trust in blockchains by taking down terrorist financing campaigns, disrupting major ransomware operations, identifying the Twitter hackers, and more.

We are building the data platform for blockchain, cryptocurrency, and web3. We're looking to bring on board experienced Senior Data Engineers who execute high-impact projects and are excited about building at significant scale!

Blockchain technology is powering a growing wave of innovation. Businesses and governments around the world are using blockchains to make banking more efficient, connect with their customers, and investigate criminal cases. As adoption of blockchain technology grows, more and more organizations seek access to all this ecosystem has to offer. That's where Chainalysis comes in. We provide complete knowledge of what's happening on blockchains through our data, services, and solutions. With Chainalysis, organizations can navigate blockchains safely and with confidence.

Responsibilities

  • Build cloud-native data ingestion and aggregation pipelines that take in gigabytes of data per day.
  • Develop and optimize high-performance Spark jobs in Python to detect market manipulation, fraud, behavioral patterns, and more.
  • Build and improve batch-based applications as well as real-time streaming pipelines processing billions of records per day.
  • Architect and maintain scalable data lakehouse environments using formats like Parquet, Iceberg, and Delta Lake.
  • Collaborate on building scalable API services on AWS that interface with our data layer to handle 1,000s of requests per second.
  • Help the team modernize our data stack to operate at 10x current capacity, moving toward highly automated, serverless architectures.
  • Debug production data quality issues and performance bottlenecks across distributed systems and microservices.
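To give a flavor of the detection work described above: in production this would be a PySpark job over billions of records, but the core aggregation logic can be sketched in plain Python. All field names, addresses, and the volume threshold below are illustrative assumptions, not details from the posting.

```python
from collections import defaultdict

def flag_high_volume(transfers, threshold=1000.0):
    """Toy stand-in for a Spark aggregation: group transfer records
    by address and flag addresses whose total volume exceeds a
    threshold (a crude market-manipulation signal)."""
    totals = defaultdict(float)
    for t in transfers:
        totals[t["address"]] += t["amount"]
    # Return flagged addresses in a deterministic order.
    return sorted(addr for addr, vol in totals.items() if vol > threshold)

transfers = [
    {"address": "0xabc", "amount": 600.0},
    {"address": "0xabc", "amount": 500.0},
    {"address": "0xdef", "amount": 200.0},
]
print(flag_high_volume(transfers))  # ['0xabc']
```

In a real pipeline the `defaultdict` loop would become a `groupBy(...).agg(...)` over a distributed DataFrame, with results written to a lakehouse table in Parquet, Iceberg, or Delta Lake.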

Requirements

  • Experience in designing and implementing cloud-native, distributed data processing systems in a major cloud provider (AWS preferred).
  • Deep expertise in Python and Apache Spark, with a strong understanding of performance tuning and distributed computing principles.
  • A bias to ship and iterate alongside product management and design partners to turn raw data into actionable insights.
  • A technical background with extensive experience working directly on backend systems and large-scale data architecture.
  • Pride in materializing complex product ideas into stable, production-grade data pipelines.
  • Exposure to or interest in the cryptocurrency technology ecosystem and the unique data challenges it presents.

Nice to Have

  • Experience mentoring other engineers, leading cross-team data initiatives, and driving design and technology decisions.
  • Worked with Terraform and Kubernetes (EKS) for orchestrating data workloads.
  • A genuine excitement for significantly scaling large data systems and exploring the latest in Modern Data Stack technologies.

Technologies We Use

Languages: Python, Java

Big Data: Spark (PySpark), Flink, Databricks

Storage: Parquet, Iceberg, Delta Lake, Paimon

Cloud/Infra: AWS (Serverless, EMR, Lambda), Kubernetes, Terraform

CI/CD: GitHub, including GitHub Actions

Database: PostgreSQL, Snowflake, Redshift

Unchain Data provides Web3 data job aggregation as a common good. Jobs are posted by third parties and are not individually verified. Always exercise caution: never download software requested during a hiring process, avoid clicking unfamiliar links during interviews, verify that URLs are legitimate, and use trusted meeting tools like Google Meet or Zoom.