Allen Thomas

+1 217-298-6572 | misalignedmodel.com | github.com/AllenThomasDev

allenthomasdev@gmail.com | linkedin.com/in/AllenThomasDev

AI Engineer with 4+ years of experience in large-scale systems, specializing in LLM applications and ML infrastructure

Education

University of Illinois at Urbana-Champaign

Master's in Computer Science | 3.85/4.0

August 2024 - May 2026

Distributed Systems, Systems for Gen AI, Topics in LLM Agents, Deep Learning

University of Pune

Bachelor's in Computer Engineering | 3.40/4.0

August 2017 - May 2021

Reinforcement Learning

Experience

Founding Engineer

[Redacted]

May 2025 - Present

  • Architected a desktop code review platform using React, TypeScript, and FastAPI with Pydantic validation that analyzes large enterprise codebases for security vulnerability patterns.
  • Engineered a hybrid repair engine combining Semgrep's deterministic static analysis with LLM reasoning, automating fixes for complex vulnerabilities where standard linters fail.
  • Built a fault-tolerant patching system using atomic Git operations and automated rollbacks, ensuring 100% repository integrity during autonomous code repair.

Software Engineer

Helpshift - Customer support platform installed on over 2 billion devices

June 2022 - June 2024

  • Raised analytics uptime from 99.0% to 99.99% and cut costs by $250,000/yr by leading the analytics infrastructure migration to AWS.
  • Migrated analytics pipelines from HBase to Redshift, enabling 10× traffic growth for 200+ customers with zero downtime.
  • Eliminated stream processing bottlenecks affecting real-time analytics by migrating legacy Storm infrastructure to Flink, reducing event latency by 35% for 40K+ support agents.
  • Preserved 350+ TB of historical data during migration, maintaining 6+ years of customer analytics access.
  • Established Airflow standards and documentation across 5+ teams, saving 15 developer hours weekly.
  • Reduced ad-hoc engineering data requests by 40% by implementing a Metabase self-service analytics platform.
  • Mentored 10+ hires on coding practices and system architecture, reducing time to first release by 35% compared to the previous year.

Projects

Control Vector-Based LLM Steering

March 2025

Python, PyTorch, Transformers

  • Implemented activation engineering to steer LLM behavior, extracting steering vectors via PCA on contrastive prompt pairs.
  • Built an automated evaluation pipeline to stress-test model coherence at varying control strengths, ensuring structural integrity of JSON outputs while altering persona.

Distributed Stream Processing Platform

September 2024 - November 2024

Golang, Distributed Systems

  • Implemented SWIM failure detection and consistent hashing to manage dynamic node churn, achieving sub-3s convergence for cluster membership updates.
  • Designed a custom distributed file system (HyDFS) with chain replication, ensuring linearizability for concurrent appends across 10+ nodes.
  • Implemented a stream processing engine (RainStorm) with exactly-once semantics, utilizing distributed write-ahead logs to track tuple lineage and handle worker failures.
  • Engineered an autoscaling resource manager that dynamically provisioned worker nodes based on throughput watermarks, optimizing cluster utilization under varying load.

Awards & Research

Open Philanthropy Grant Recipient

September 2025 - Present

AI Safety, Mechanistic Interpretability

  • Awarded competitive funding to investigate mechanistic causes of alignment-faking in Large Language Models.
  • Scope includes identifying specific patterns that trigger deceptive behavior during chain-of-thought reasoning.

Multi-Agent Reinforcement Learning: Hide and Seek

2021

IEEE Publication - Reinforcement Learning, Multi-Agent Systems

  • Researched and implemented multiple reinforcement learning algorithms for multi-agent systems, developing a novel hide-and-seek simulation environment inspired by OpenAI's research on multi-agent autocurricula.

Skills

Languages: Java, Python, Golang, Clojure, JavaScript

Databases & Streaming: PostgreSQL, MySQL, MongoDB, Apache HBase, Redis, Kafka, Flink

Cloud: AWS (Redshift, S3, Athena, EMR)