ML Platform Engineer (Autonomous Driving)
Pangyo, South Korea
42dot · Full-time

We are looking for the best

At 42dot, our AD ML Platform Engineers build the core data platform and the ML training / evaluation platform behind cutting-edge autonomous driving algorithms. We develop the distributed systems of a scalable data platform for large-scale datasets (millions of scenes), as well as high-performance data serving SDKs for ML model training / evaluation. The platforms we deliver significantly improve the efficiency of the ML model development lifecycle, including training, evaluation, deployment, and monitoring in the cloud environment.

Responsibilities

  • Develop a highly scalable, reliable data platform to manage, visualize, search, and serve large-scale datasets for ML model training, fine-tuning, and validation.

  • Develop an advanced autonomous driving data SDK, including scene data search, dataset preparation, and dataset loading.

  • Build the data lakehouse for autonomous driving scene datasets, including sensor data, calibration data, and annotation data.

  • Dig into performance bottlenecks throughout the data processing pipelines, from data processing latency and data search latency to Test Procedure (TP) coverage.

  • Bootstrap and maintain infrastructure for data platform components: data processing pipelines, databases, the data lakehouse, and data serving.

  • Collaborate with cross-functional teams, including ML algorithm, ML application, and Cloud Infra, to align the ML platforms with the overall autonomous driving system architecture.

Qualifications

  • Bachelor's degree or higher in Computer Science, Engineering, Robotics, or a similar technical field.

  • Minimum of 5 years of experience in Data Engineering or ML Platform roles.

  • Proficiency in Python and solid experience in Python SDK development.

  • Solid working experience with databases (e.g., MongoDB, PostgreSQL).

  • Hands-on experience orchestrating data pipeline jobs with Databricks Workflows or Apache Airflow, as well as integrating data pipelines with machine learning models.

  • Extensive experience with data technologies and architectures such as data warehouses (e.g., Hive) or lakehouses (e.g., Delta Lake).

  • Experience with Apache Spark or other big data computing engines.

Preferred Qualifications

  • Experience with autonomous vehicle sensor data (e.g., LiDAR, camera, radar).

  • Experience with the ML model training lifecycle (e.g., data preparation, model training / validation / deployment).

  • Understanding of modern AI frameworks (e.g., PyTorch, TensorFlow).

  • Understanding of data governance principles and data privacy regulations, and experience implementing security measures to protect data.

Interview Process

  • Resume Screening - Coding Test - Virtual Interview (approximately 1 hour) - Onsite or Virtual Interview (approximately 3 hours) - Final Offer

  • Please note that the interview process may vary depending on the position and is subject to change based on scheduling and other circumstances.

  • Interview schedules and results will be communicated individually via the email address provided in your application.

Additional Information

  • Please upload all required documents in PDF format.

  • Veterans and applicants eligible for employment protection will receive preferential consideration in accordance with applicable laws and regulations.

  • In compliance with the Act on Employment Promotion and Vocational Rehabilitation for Persons with Disabilities, registered individuals with disabilities will receive preferential consideration.

  • 42dot does not accept unsolicited resumes from search firms. We will not pay any fees for resumes submitted without prior agreement.

※ Please make sure to review the information below before applying.