Hyungmo Kim
SW/ML Engineering
Senior ML Software Engineer
address: Dasan-dong, Namyangju-si, Gyeonggi-do
phone: (+82) 010-4194-6815
email: kalaluthien@gmail.com
ㅡ Summary
I am a polyglot programmer who takes on any problem that can be framed, designed, and interpreted as a program: applications, platforms, models, processes, strategies, and team topologies.
I focus on learning durable fundamentals, such as abstract structures, first principles, and mental models, that can be applied generically to practical problems.
ㅡ Experience
Multi-task on-device model / ML Software Engineer
2025 - 2026, Hyperconnect LLC / MatchGroup AI
- Designed and proposed a multi-task on-device vision model, based on a literature survey, to replace the previous models in the iOS/Android model pipeline
- Proposed a modular model architecture and led machine learning engineers across different time zones to train separate parts concurrently without blocking one another
- Distilled pre-trained CLIP (ViT) models into a MobileNet variant to improve generalizability and semantic-search accuracy
- Reduced the number of model invocations to meet the product's end-to-end latency requirements (3x faster)
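The distillation work above can be sketched as embedding alignment: a small student is trained to match a frozen CLIP-style teacher's embedding directions. The function below is a minimal illustrative version in plain Python; all names and dimensions are hypothetical, not the actual production code.

```python
import math

def cosine_distill_loss(student_embs, teacher_embs):
    """Mean cosine distance between student and (frozen) teacher embeddings.

    A minimal sketch of embedding-space distillation: the student learns
    to point in the same direction as the teacher for each input.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    total = 0.0
    for s, t in zip(student_embs, teacher_embs):
        s, t = normalize(s), normalize(t)
        total += 1.0 - sum(a * b for a, b in zip(s, t))
    return total / len(student_embs)

# Identical directions give zero loss; opposite directions give the max of 2.
print(cosine_distill_loss([[1.0, 0.0]], [[2.0, 0.0]]))   # 0.0
print(cosine_distill_loss([[1.0, 0.0]], [[-1.0, 0.0]]))  # 2.0
```

Because only embedding directions matter, the student backbone (e.g., a MobileNet variant) can be much smaller than the ViT teacher while preserving semantic-search behavior.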
On-device ML platform as a service / Acting Manager
2023 - 2025, Hyperconnect LLC / MatchGroup AI
- Designed and proposed a platform prototype for easy integration of mobile inference capabilities
- Developed the iOS demo application and multithreaded the Android demo application for third-party PoC delivery
- Developed an integrated Python script handling conversion, encryption, and preprocessing insertion across formats such as TFLite, TorchScript, ONNX, and CoreML
- Developed a generalized benchmark application for testing in mobile environments
- Led software engineers and machine learning engineers on several server-side and on-device model-optimization projects
- Supervised tech-blog writing: https://hyperconnect.github.io/2026/01/23/On-device-Face-Verification-Pipeline-Optimization.html
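A multi-format conversion script like the one described is often organized as a registry keyed by model format. The sketch below shows only that dispatch shape; the converter backends and names are hypothetical stubs, not the real TFLite/TorchScript/ONNX/CoreML toolchain.

```python
from pathlib import Path

# Hypothetical registry mapping model-file suffixes to converter steps.
# Real backends (TFLite, TorchScript, ONNX, CoreML) are stubbed out here.
CONVERTERS = {
    ".tflite": lambda p: f"tflite-converted:{p.name}",
    ".pt": lambda p: f"torchscript-converted:{p.name}",
    ".onnx": lambda p: f"onnx-converted:{p.name}",
    ".mlmodel": lambda p: f"coreml-converted:{p.name}",
}

def convert(path: str) -> str:
    """Route a model file to the right converter, failing fast on unknown formats."""
    p = Path(path)
    try:
        return CONVERTERS[p.suffix](p)
    except KeyError:
        raise ValueError(f"unsupported model format: {p.suffix}")

print(convert("face.onnx"))  # onnx-converted:face.onnx
```

Keeping each format behind a uniform interface lets encryption and preprocessing insertion be composed as extra steps without per-format special cases.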
GDPR compliance for model training data / Acting Manager
2024, Hyperconnect LLC
- Identified points where problems could occur across the entire training infrastructure, including in-house cloud and on-premise systems, and took preemptive action
- Specified the problems, designed the software to automate each part, and supervised team members
Serving 70B LLM on-premise / Acting Manager
2024, Hyperconnect LLC
- Conducted data research and benchmarks to serve a 70B LLM in real time cost-effectively
- Proposed and built, within two weeks, a system providing the service at no additional cost on already-purchased on-premise hardware
- Defined and broke a complex problem down into delegatable, deliverable pieces
- Saved hundreds of millions of won per month
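Serving a large LLM in real time on fixed hardware typically leans on request batching to amortize each expensive model invocation over many concurrent users. This is a toy version of that batching loop, with hypothetical names, not the actual serving system.

```python
from collections import deque

def drain_batches(queue, max_batch):
    """Group queued requests into batches of at most `max_batch`.

    A toy sketch of the batching loop that amortizes one expensive
    model invocation over many concurrent requests.
    """
    batches = []
    while queue:
        take = min(max_batch, len(queue))
        batches.append([queue.popleft() for _ in range(take)])
    return batches

requests = deque(range(7))
print(drain_batches(requests, max_batch=3))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Real systems add a small wait window and per-sequence state on top of this shape, trading a little latency for much higher GPU utilization.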
Unstructured training data ETL pipeline / ML Software Engineer
2022 - 2023, Hyperconnect LLC
- Replaced the legacy data pipeline, which was vulnerable to domain/schema/business-rule changes and lacked functionality such as backfill and cataloging, in line with the relevant product-transfer schedule
- Proposed a new ubiquitous language and automated systems around data and labels, as none existed before
- Automated processing of various media types: image, audio, text, and video
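Backfill was one of the missing capabilities; a common shape for it is date-partitioned, idempotent pipeline stages, sketched below with hypothetical names rather than the actual pipeline code.

```python
from datetime import date, timedelta

def backfill_partitions(start: date, end: date, done: set):
    """Return the date partitions still to (re)process, oldest first.

    Idempotent backfill: partitions already materialized are skipped,
    so re-running the pipeline never duplicates work.
    """
    days = (end - start).days + 1
    candidates = [start + timedelta(days=d) for d in range(days)]
    return [d for d in candidates if d not in done]

done = {date(2023, 1, 2)}
print(backfill_partitions(date(2023, 1, 1), date(2023, 1, 3), done))
# [datetime.date(2023, 1, 1), datetime.date(2023, 1, 3)]
```

Tracking the `done` set in a catalog is what makes schema or business-rule changes cheap: re-deriving a partition is just removing it from the set and re-running.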
Real-time on-device audio classification runtime / ML Software Engineer
2021, Hyperconnect LLC
- Developed a TFLite-based executor running a context-based model that overcomes the limitations of existing keyword-detection models on Android/iOS devices
- Advanced the Python tools for TFLite model conversion and metadata operations
- Reimplemented the inference-engine code base to raise software quality
CUDA-based LLM inference runtime / ML Software Engineer
2021, Hyperconnect LLC
- Implemented additional CUDA kernels for NVIDIA's FasterTransformer, fixed bugs, optimized GPU memory, and added various heuristics (verified correct behavior up to 13B parameters)
- Ran actual services as a backend on NVIDIA Triton inference servers (4B scale)
- Achieved up to a 50x improvement in RPS over the existing technology (at that time)
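GPU-memory optimization for transformer inference starts with rough arithmetic like the KV-cache estimate below, using the standard 2 tensors (K and V) per layer. The 13B-class configuration shown is illustrative, not a specific model's actual shape.

```python
def kv_cache_bytes(layers, heads, head_dim, seq_len, batch, dtype_bytes=2):
    """Rough KV-cache footprint: 2 tensors (K and V) per layer,
    each of shape [batch, heads, seq_len, head_dim], fp16 by default."""
    return 2 * layers * heads * head_dim * seq_len * batch * dtype_bytes

# Illustrative 13B-class config: 40 layers, 40 heads, head dim 128.
total = kv_cache_bytes(layers=40, heads=40, head_dim=128, seq_len=2048, batch=8)
print(f"{total / 2**30:.1f} GiB")  # 12.5 GiB
```

Estimates like this, added to the weight footprint, tell you quickly whether a given batch size and context length fit on the available GPUs before any kernel work begins.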
On-premises GPU cluster for research / ML Software Engineer
2020 - 2024, Hyperconnect LLC
- Reviewed the system design and technical specifications of a 50 PF GPU cluster based on the NVIDIA SuperPOD architecture
- Achieved more than twice the cost efficiency of AWS Cloud
- Configured 400 TB of distributed storage and designed a GitOps-based management system
- Built a deep-learning research environment enabling large-scale distributed training through the Slurm scheduler, using Ansible and systemd
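Large-scale distributed training on such a cluster is typically submitted as Slurm batch scripts; the helper below renders a minimal one. The `#SBATCH` directives are standard Slurm options, while the job names and command are hypothetical examples.

```python
def sbatch_script(job_name, nodes, gpus_per_node, command):
    """Render a minimal Slurm batch script for a multi-node training job.

    One task per GPU is a common layout for data-parallel training.
    """
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --gres=gpu:{gpus_per_node}",
        f"#SBATCH --ntasks-per-node={gpus_per_node}",
        f"srun {command}",
    ])

script = sbatch_script("pretrain", nodes=4, gpus_per_node=8,
                       command="python train.py --distributed")
print(script)
```

Generating such scripts from code (rather than hand-editing them) is what lets GitOps-style tooling keep job definitions reviewable and reproducible.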
Real-time on-device image classification engine / ML Software Engineer
2020, Hyperconnect LLC
- Developed a TFLite-based inference-engine SDK running a new lightweight model with faster inference than existing image classification/segmentation models on Android/iOS devices
- Conducted a PoC of TensorFlow 2 Keras quantization technology
- Applied TFLite GPU/XNNPACK delegates for hardware acceleration
- Developed Android/iOS demo apps for testing in a WebRTC environment
K-supercomputer, Chundoong / System Administrator
2017 - 2020, SNU Thunder Research Group
- Managed a water-cooled/oil-cooled cluster system of 200 AMD+NVIDIA heterogeneous GPUs with an InfiniBand interconnect network
- Served over 300 users for educational and research purposes
Deep learning framework for Samsung NPU (SNPU) system / Researcher
2019, Samsung Electronics
- Developed a CNN benchmark to analyze Samsung Neural Processing Unit performance
- Implemented distributed-training benchmarks of four CNN architectures (VGG, ResNet, DenseNet, Inception) with cuDNN+MPI
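The cuDNN+MPI benchmarks above rely on the core step of data-parallel training: allreduce-averaging gradients across workers. The pure-Python function below simulates what `MPI_Allreduce(SUM)` followed by division would compute; it is a didactic sketch, not MPI code.

```python
def allreduce_mean(worker_grads):
    """Average per-worker gradients elementwise, as MPI allreduce(SUM)/n would.

    Each worker computes gradients on its own data shard; after the
    allreduce, every worker holds the same averaged gradient and can
    apply an identical parameter update.
    """
    n = len(worker_grads)
    width = len(worker_grads[0])
    return [sum(g[i] for g in worker_grads) / n for i in range(width)]

grads = [[1.0, 2.0], [3.0, 4.0]]  # two workers, two parameters each
print(allreduce_mean(grads))  # [2.0, 3.0]
```

Since every worker ends up with identical gradients, model replicas stay bit-for-bit synchronized without any central parameter server.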
HPC testbed system and multi-node GPU benchmarks / Researcher
2017 - 2019, Ministry of Science and ICT
- Built and managed a physical testbed system of 32 NVIDIA Tesla V100 GPUs
- Developed OpenCL and CUDA versions of GPU benchmarks (SNU NPB 2019)
ㅡ Education
Master’s degree in CSE / Seoul Nat’l Univ. (2017-03 - 2020-02)
Bachelor’s degree in CSE / Seoul Nat’l Univ. (2013-03 - 2017-02)
ㅡ Awards
2017 National Supercomputing Contest Grand Prize / UNIST President’s Award
ㅡ Publications
International Papers
- SNU-NPB 2019: Parallelizing and Optimizing NPB in OpenCL and CUDA for Modern GPUs
Youngdong Do, Hyungmo Kim, Pyeongseok Oh, Daeyoung Park, Jaejin Lee
IISWC '19: Proceedings of the 2019 IEEE International Symposium on Workload Characterization, Orlando, FL, USA, November 2019. DOI: https://doi.org/10.1109/IISWC47752.2019.9041954
ㅡ Skills
I leave the traditional “Skills” list of specific programming languages, frameworks, and tools empty, as those feel less significant today. Instead, I prefer to highlight what I can actually accomplish.
Optimization
- I started my career optimizing numerical-analysis programs in C with GPGPU for benchmarking supercomputers. My first project at Hyperconnect was migrating the underlying technology of our on-device inference SDK from TF1 to TF2. I understand the low-level mechanisms of how computers utilize their resources and how neural networks are implemented as software and mapped to hardware.
- I have the intuition and experience to identify bottlenecks in ML systems and resolve them step by step.
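Bottleneck-hunting in practice starts with a profiler before any optimization. A minimal example with Python's standard `cProfile`/`pstats` modules, using a stand-in function rather than a real pipeline stage:

```python
import cProfile
import io
import pstats

def hot_loop(n):
    """A deliberately slow function standing in for an ML pipeline stage."""
    return sum(i * i for i in range(n))

# Profile the call, then report the top entries by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
hot_loop(100_000)
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The cumulative-time ranking shows where to look first; fixing the top entry and re-profiling is the step-by-step loop described above.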
Engineering
- I am comfortable reading and writing code across programming languages and paradigms, even unfamiliar ones.
- I can comprehend cross-functional domains in the AI engineering era, translate them into well-defined SW or ML problems, and produce concrete design and execution plans that account for resources and uncertainty.
Management
- I have a foundational understanding of people management, gained as an acting manager in two different organizations: the MLOps team and the central AI team.
- I am an experienced agile practitioner; I have managed several projects with different cross-functional members and methodologies using issue-tracking and communication tools. I can identify blockers, collaboration opportunities, and bad smells across technical domains.