Data Engineer

BTSE

📍 Hong Kong, Hong Kong SAR | 💰 Competitive
Data Engineer | Remote
python, sql, kafka, websockets
Job Description

About BTSE

BTSE Group is a global leader in fintech and blockchain technology, anchored by three core business pillars: Exchange, Payments, and Infrastructure Development. Serving over 100 corporate clients worldwide, we provide white-label exchange and payment solutions. Our offerings encompass everything from exchange infrastructure hosting and development to custody, wallets, payments, blockchain integration, trading, and more.

About The Opportunity

We're building an AI-powered research platform for institutional investors. Our platform turns vast amounts of market, alternative, and proprietary data into actionable intelligence — powered by AI agents that depend on clean, reliable, real-time data to do their job.

We need someone to own data. Not manage a team that does data. Own it — from finding the right sources, to getting them flowing, to making sure they stay healthy at scale.

Today we ingest from hundreds of sources. That number is growing fast. The sources are diverse: real-time market feeds, regulatory filings, on-chain blockchain data, news, social sentiment, alternative datasets, and proprietary client data. Some are free APIs. Some are $10K/month enterprise contracts. Some are clients pushing their own data into our platform. Every one of them is different, and most of them will break in ways you don't expect.

You'll evaluate vendors, negotiate deals, build integrations, monitor quality, track costs, and make the call on what's worth paying for. When something breaks at 2 AM, you'll know why before the alert finishes firing.

This is an end-to-end ownership role. No handoffs.

Responsibilities

  • Build and maintain integrations with a large and growing number of external data sources — APIs, WebSockets, file drops, streams, scrapers, and formats you haven't seen yet
  • Evaluate and compare data vendors across quality, reliability, coverage, cost, and terms of service
  • Negotiate contracts and manage commercial relationships with data providers
  • Design and operate high-throughput ingestion pipelines handling mixed workloads (real-time, near-real-time, batch, event-driven)
  • Build monitoring that tells you — before anyone else — when data is late, wrong, incomplete, or drifting
  • Manage data quality at scale: anomaly detection, cross-source validation, schema drift detection, gap filling
  • Handle both structured data (time-series, tabular) and unstructured data (documents, text, images) with appropriate extraction and storage
  • Track costs per source, usage per consumer, and ROI — recommend what to keep, upgrade, or cancel
  • Build tooling that makes adding the next data source faster than the last one
  • Use AI tools aggressively in your daily work — for code generation, testing, documentation, anomaly analysis, and anything else that makes you faster

Requirements

You've done this before:

  • 5+ years building data pipelines that run in production, 24/7, with real SLAs
  • Deep hands-on experience with SQL databases and time-series data
  • Python as your primary language, comfortable with async programming
  • You've integrated with dozens of external APIs and dealt with the reality of unreliable vendors, changing schemas, rate limits, and bad documentation
  • You've built monitoring and alerting for data systems — not as an afterthought but as part of how you work

You think about the whole picture:

  • You don't just connect to an API. You think about what happens when it goes down, when the schema changes, when the data is wrong, when the bill doubles
  • You understand that data has a cost and a value, and not every source is worth keeping
  • You've worked with data vendors commercially — contracts, pricing tiers, usage negotiations

You use AI daily:

  • AI coding tools are part of your workflow today, not something you're curious about
  • You can articulate specifically how AI makes you faster and where it doesn't help
  • You'd be frustrated if you couldn't use AI in your work

Nice to Have

  • Experience with financial or crypto market data
  • Experience with streaming systems (Kafka or similar) at scale
  • Vector database or embedding pipeline experience
  • Experience with unstructured data extraction (PDFs, documents, NLP)

Benefits

  • Senior individual contributor role with full ownership of the data domain
  • Direct access to leadership — no bureaucracy, fast decisions
  • AI tools provided and encouraged across all work
  • Remote-friendly, async-first
  • Compensation commensurate with experience
