A from-scratch Java implementation of Redis — RESP wire protocol, six data structures, master-slave replication, pub/sub, transactions, streams, geospatial queries, and RDB persistence. Built to understand how an in-memory database actually works at the network and concurrency layer, not to use one.
Performance
Initial throughput was ~1,200 ops/s. I profiled with async-profiler flame graphs and found three problems:
Unbuffered socket I/O. Every read and write was a raw syscall. Wrapping the socket streams in 8KB BufferedInputStream / BufferedOutputStream dropped syscall overhead immediately — roughly 40x improvement on its own.
ByteArrayOutputStream allocations in RESP serialization. The serializer was allocating a new buffer on every response. Replaced with pre-sized byte[] + System.arraycopy. GC pressure dropped sharply.
String concatenation in protocol encoding. Eliminated intermediate String objects in the hot path.
End result: 1,200 → 83,000 ops/s. 70x. Real Redis on the same machine: 133K ops/s. This implementation runs at 63% of Redis’s throughput, in Java, on an i3 laptop.
Lesson: don’t optimize what you think is slow. The bottleneck was I/O buffering, not data structures.
Engineering Decisions
One virtual thread per connection. Java 21’s Project Loom lets each client get its own thread without the overhead of OS threads. No NIO selector loops, no callback hell — blocking I/O reads like sequential code, scales to thousands of concurrent clients.
Concurrency without a global lock. ConcurrentHashMap for the main key-value store. ReadWriteLock + Condition variables for streams (XREAD BLOCK signals waiting readers when new entries arrive). ReentrantLock + Condition for BLPOP. CopyOnWriteArraySet for pub/sub subscriber lists. No single lock on the hot path.
Geospatial from first principles. GEOADD encodes coordinates into a 52-bit geohash via bit-interleaving. GEODIST decodes back to coordinates and computes great-circle distance using the Haversine formula. No library — just the math.
Sorted sets: dual-index structure. TreeMap<Double, TreeSet<String>> for score-ordered iteration, HashMap<String, Double> for O(1) member→score lookups. Handles duplicate scores correctly via per-score buckets.
Replication: offset-tracked command propagation. Master tracks a global AtomicLong replication offset. On PSYNC, sends an RDB snapshot then streams every mutating command to replicas via a BlockingQueue. WAIT blocks until N replicas acknowledge up to the current offset.
RDB persistence: binary format parser. Parses Redis’s RDB v11 binary format by hand — length-encoded strings, integer encodings, key expiration timestamps, auxiliary metadata fields.
Tech Stack
| Layer | Technology |
|---|---|
| Language | Java 21 |
| Concurrency | Virtual threads (Project Loom) |
| Networking | java.net · java.util.concurrent |
| Profiling | async-profiler |
| Infra | Docker · GitHub Actions CI |
- Source: github.com/Md-Talim/codecrafters-redis-java
- CodeCrafters profile: app.codecrafters.io/users/Md-Talim
- Challenge overview: app.codecrafters.io/courses/redis/overview