Engineering
Build the Infrastructure Behind Compression.
Own how our software stack is built. Design well-scoped libraries, robust packages, and the abstractions our compression pipeline runs on as we scale the team and ship more to customers.
THE ROLE
About the Role
You'll own how our software stack is built. Today the codebase reflects four people moving fast. It works, but it needs structure. Your job is to give it that structure: well-designed libraries, robust packages, and environments, the kind of codebase that scales as we grow the team and ship more to customers.
This is not a glue-code role. You'll work between the algorithm and inference layers, designing a fully automated compression pipeline that takes target runtimes into account. You'll also design the abstractions that pipeline runs on and make them fast.
WHAT YOU'LL DO
Your impact
- Design and refactor our core libraries (pruning, quantization, retraining, evaluation) into clean, well-scoped packages.
- Build the internal tooling that lets the team move quickly without breaking things: CI, benchmarks, reproducible runs.
- Integrate our compression output with inference engines (vLLM, TensorRT-LLM, llama.cpp) and customer deployment targets.
- Set the engineering bar for the team as we hire.
WHAT YOU BRING
What we're looking for
- Bachelor's/Master's in computer science or equivalent, plus 2+ years of professional software engineering.
- Strong opinions about code design. You know what a well-structured library looks like and why.
- GPU experience (memory hierarchy, kernels, what bottlenecks performance), even if you don't write CUDA daily.
- Production-grade Python. You write code others can read, extend, and trust.
- You finish things and you care about the codebase you leave behind.
NICE TO HAVE
- Open-source contributions to ML infrastructure (vLLM, llama.cpp, transformers, TensorRT-LLM, PyTorch internals).
- CUDA, Triton, or kernel-level work.
- Experience designing a library from scratch that other engineers ended up using.
- Familiarity with model serving and inference optimization.
PRACTICAL
- Vienna-based. Hybrid or fully remote.
- Working language is English.
- We sponsor visas and support relocation.
- Compensation: €70–120k base + equity. Austrian minimum disclosed per Kollektivvertrag: €45,738/year.
- You'll set the engineering standards we hire against next.
Interested? Let's talk.
Send your resume and a brief note about your work to info@oracomputing.com.