ML Platform Engineer
Pangyo (Software Dream Center), South Korea
42dot
Full-time

We are looking for the best

At 42dot, our AD ML Platform Engineers build the core data platform and the ML training and evaluation platform for cutting-edge autonomous driving algorithms. We develop the distributed systems behind a scalable data platform for large-scale datasets (millions of scenes), as well as high-performance data-serving SDKs for ML model training and evaluation. The platforms we deliver can significantly improve the efficiency of the ML model development lifecycle, including training, evaluation, deployment, and monitoring in the cloud environment.

Responsibilities

  • Develop a highly scalable, reliable data platform to manage, visualize, search, and serve large-scale datasets for ML model training, fine-tuning, and validation.

  • Develop an advanced autonomous driving data SDK covering scene data search, dataset preparation, dataset loading, etc.

  • Build the data lakehouse for autonomous driving scene datasets, including sensor data, calibration data, and annotation data.

  • Dig into performance bottlenecks all along the data processing pipelines, from data processing latency and data search latency to Test Procedure (TP) coverage.

  • Bootstrap and maintain infrastructure for data platform components: data processing pipelines, databases, the data lakehouse, and data serving.

  • Collaborate with cross-functional teams, including ML algorithm, ML application, and Cloud Infra, to align the ML platforms with the overall autonomous driving system architecture.

Qualifications

  • Bachelor's degree or higher in Computer Science, Engineering, Robotics, or a similar technical field.

  • Minimum of 5 years of experience in Data Engineering or ML Platform roles.

  • Proficiency in Python and solid experience in Python SDK development

  • Solid working experience with databases (e.g., MongoDB, PostgreSQL)

  • Hands-on experience orchestrating data pipeline jobs with Databricks Workflows or Apache Airflow, as well as integrating data pipelines with machine learning models

  • Extensive experience with data technologies and architectures such as Data Warehouse (e.g., Hive) or Lakehouse (e.g., Delta Lake)

  • Experience with Apache Spark or other big data computing engines

Preferred Qualifications

  • Experience with autonomous vehicle sensor data (e.g., LiDAR, camera, radar)

  • Experience with the ML model training lifecycle (e.g., data preparation, model training, validation, and deployment)

  • Understanding of modern AI frameworks (e.g., PyTorch, TensorFlow)

  • Understanding of data governance principles and data privacy regulations, and experience implementing security measures to protect data

Interview Process

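  • Document screening - Coding test - Video interview (about 1 hour) - On-site or video interview (about 3 hours) - Final offer

  • The process may differ by role and may change depending on scheduling and circumstances.

  • Interview schedules and results will be sent individually to the email address registered in your application.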

Additional Information

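  • When submitting your resume, please leave out information that the Fair Hiring Procedure Act prohibits employers from requesting, such as your resident registration number, family relations, marital status, salary, photo, physical attributes, and place of origin.

  • Please upload all files in PDF format, 30MB or smaller. (If you run into a problem while uploading your resume, please send it to recruit@42dot.ai together with the URL of the position you are applying for.)

  • After the interview process ends, a reference check may be conducted with the applicant's consent.

  • Persons of national merit and persons eligible for employment protection are given preference in accordance with the relevant laws.

  • In accordance with the Act on the Employment Promotion and Vocational Rehabilitation of Persons with Disabilities, holders of a disability registration certificate are given preference.

  • 42dot does not accept resumes from search firms it has not engaged and does not pay fees for unsolicited resumes.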

※ Please be sure to review the information below before applying.