EBKernel · 上海具脑磐石科技有限公司 — Embodied Agent Algorithm Intern
19 Nov 2025 – Present
Embodied-AI / Human-Computer Interaction Researcher · Fully remote · Flexible hours
- RynnBrain VLM × OmniGibson eval pipeline (sub-project from 03/2026): raised true BDDL success from 0% to 80% by replacing six classes of privileged information with non-privileged equivalents (e.g. GT segmentation → GSAM open-vocabulary perception; container AABBs → pixel-level snap; teleport → CuRobo trajectory planning). Evaluated over 60+ seeds × 3 BEHAVIOR-1K tasks; documented 28 falsified hypotheses. Ran a three-machine remote pipeline: Mac ↔ RTX 4090 ↔ H100 cluster (SSH tunnels + rsync; vLLM TP=1).
- Robot voice interaction: co-deployed a real-time voice stack on the Leju (乐聚) humanoid for water-fetch, hotel front-desk, and reception scenarios. Built entirely on domestic services: Alibaba Cloud TTS/STT, Agora (声网) RTC, TEN Framework. Researched 3C noise suppression and microphone directivity at the hardware/software boundary.
- Multi-cloud GPU-infra collaboration: worked across Huawei Cloud and partnered with the infrastructure teams of Kuaishou (快手) and DataCanvas · 九章云极 (Alaya NeW Cloud) to debug GPU container scheduling, heterogeneous-compute cluster orchestration (mixed deployment of domestic chips and NVIDIA), and shared-compute inference pipelines.
iMean · 椰子宇宙(杭州)科技有限公司 — Algorithm Intern (Sole Engineer, Voice Agent)
15 Sep 2025 – 02 Feb 2026
Algorithm Department · Onsite (Hangzhou) Sep–Nov 2025 · Remote Dec 2025 – Feb 2026
- Owned the full Web + iOS voice-agent stack: cross-tested STT (Deepgram, ElevenLabs, Gemini), TTS (ElevenLabs, ChatGPT TTS), and LLM (GPT-4o, Gemini, Gemini Live) options, comparing cascaded pipelines against native multimodal ones.
- Built voice infrastructure on LiveKit, FastRTC, and TEN Framework, over WebSocket and WebRTC transports. Abstracted OpenAI Agents SDK, Gemini ADK, and LangChain/LangGraph behind a single multi-provider interface so orchestration could be swapped without touching business logic.
- Engineered streamed long-context management, dual-tier (short + long) memory over vector recall, and per-session state machines that switched cadence and tone (podcast tempo, calm voice) based on the detected scene. Delivered against an ambiguous spec and earned formal recognition from the team lead and the company.
Daikin Malaysia (大金) — Team Leader, Enterprise Systems (4-Module Lead)
2025 – Ongoing
APU university–industry collaboration project · Ongoing delivery
- Leading delivery across four enterprise modules: (1) IoT-device interconnection and sensor-data ingestion; (2) form assistant and work-order ticketing system; (3) internal wiki knowledge base; (4) on-prem RAG question-answering assistant (locally deployed Ollama). Own architecture, integration, and cross-team handoff.