Software Engineer, Sensor Data Integration

Mach9
9 days ago
Full-time
On-site
San Francisco, California, United States

The Role

At Mach9, Sensor Data Integration Engineers build the algorithms and pipelines that transform large-scale geospatial datasets into structured, accessible formats to power our survey product, Digital Surveyor. You'll work with high-volume data sources, including LiDAR-collected point clouds, on-road imagery, and overhead aerial orthophotos, and own the systems that ingest, standardize, and store them for model training and product use. Every piece of data that our customers upload will pass through your systems first.

This role is ideal for an engineer who loves puzzle-hunting: reverse-engineering sparsely documented formats, wrangling coordinate systems and transforms, and hunting down strange camera projection issues.

You'll sit at the boundary between our customers and our product, at the front of everything we do. Our models are only as good as the data feeding them, and you'll be the one making messy real-world sensor data trustworthy at scale.

Where you'll make an impact

  • Own the ingestion pipelines that convert point clouds and imagery from hardware vendors into Mach9's standard internal format

  • Reverse-engineer new vendor formats and format updates, often with sparse or missing documentation, to expand the data Mach9 can ingest

  • Build agentic systems to automatically triage failures and reformat data

  • Build automated checks and regression testing to guarantee the consistency of our data

  • Optimize the performance of our processing and storage across massive geospatial datasets in the cloud

  • Work directly with customers and partners to unblock critical customer projects

What you bring

  • Strong software development and debugging skills

  • Experience building production software in Python

  • Comfort operating with ambiguity: you'll need to dig into undocumented or messy data formats and reverse-engineer them

  • Strong communication skills, with the ability to work across our ML, product, and customer success teams

  • A foundation in parallel computing or distributed systems

  • A bachelor's degree in Computer Science, Engineering, or equivalent experience.

Bonus experience

  • Experience building agentic systems and setting up agent harnesses — orchestrating LLM-driven workflows for triage, debugging, or automated code patching.

  • Understanding of geospatial data formats (e.g., LAS/LAZ, COPC, E57, GeoTIFF, Shapefiles) and tooling (e.g., GDAL, PDAL, untwine, laz-perf).

  • Expertise designing and managing data schemas and storage systems for geospatial data (e.g., Postgres/PostGIS, AWS S3).

  • Experience with large-scale data processing frameworks and cloud platforms (e.g., Spark, AWS Batch).

  • Familiarity with coordinate reference systems and transforms (CRS, WKT, pyproj, affine transforms).

  • Experience building data versioning, lineage, or artifact-tracking systems.

  • Experience operating data pipelines that feed ML training and inference.

  • Familiarity with C++.

About Mach9

Mach9 is transforming civil infrastructure design with AI-powered geospatial tools. Our platform accelerates the creation of engineering deliverables from raw data, cutting manual drafting time by 96×. Trusted by global leaders in engineering and construction, we're backed by Y Combinator, Quiet Capital, and top founders and executives from Cruise, Autodesk, Adobe, and DoorDash.

We believe a startup benefits from an in-person culture. The team works out of our office in SoMa, with the flexibility to work from home when needed.