šŸ“š go-mcache - Awesome Go Library for Database

Go Gopher mascot for go-mcache

Fast in-memory key:value store/cache library. Pointer caches

šŸ·ļø Database
šŸ“‚ Caches
⭐ 0 stars
View on GitHub šŸ”—

Detailed Description of go-mcache

go-mcache

Go Report Card GoDoc Tests Release

Generic in-memory cache for Go with TinyLFU admission, sharded storage, and lock-free read path.

Design

mcache combines two ideas to achieve both high hit ratios and high throughput:

Admission control via TinyLFU. Not every new item gets into the cache. On insertion, a Count-Min Sketch estimates the access frequency of the incoming item and compares it against a random sample of existing entries (SampledLFU). If the new item has lower frequency, it is rejected. This prevents one-time keys from evicting frequently accessed data — a problem that LRU caches have by design. The frequency sketch uses 4-bit counters (8 bytes per ~16 tracked keys) and resets periodically to adapt to changing access patterns.

Lock-free read path. The TinyLFU frequency tracking (Count-Min Sketch + Bloom filter doorkeeper) is implemented with atomic operations — no mutexes on reads. The storage layer uses 1024 shards with per-shard sync.RWMutex and cache-line padding between shards to eliminate false sharing. As a result, read throughput scales linearly with cores.

Architecture

Cache[K, V]
ā”œā”€ā”€ ShardedStore         1024 shards, cache-line padded, per-shard RWMutex
│   └── map[K]*Entry     standard Go map per shard
ā”œā”€ā”€ Policy[K]            generic over key type (no hash-collision ambiguity)
│   ā”œā”€ā”€ TinyLFU          doorkeeper (Bloom filter) + Count-Min Sketch
│   └── SampledLFU[K]    dense array + map for O(1) random sampling
ā”œā”€ā”€ ExpiryHeap           min-heap for TTL-based expiration, O(log n) ops
ā”œā”€ā”€ RadixTree            opt-in, for prefix search on string keys
ā”œā”€ā”€ WriteBuffer          lock-free ring buffer for async batching
└── Metrics              atomic counters, zero-cost when not read

Why exact-key policy matters

The policy tracks entries by exact key K, not by hash. This means:

  • No hash collision ambiguity — two keys with the same hash are tracked independently
  • Eviction victims are returned as {Key, KeyHash} pairs — no reverse lookup needed
  • The SampledLFU uses a dense array with swap-delete, so random sampling is O(sampleSize) instead of O(n) map iteration

Expiration

Entries with TTL are tracked in a min-heap. A background goroutine sweeps expired entries once per second using a shard-local collect-and-delete: each shard is locked once for the entire sweep, expired entries are removed in bulk, and policy/radix/callback updates happen outside the shard lock.

Install

go get github.com/OrlovEvgeny/go-mcache

Requires Go 1.23+

Usage

cache := mcache.NewCache[string, int](
    mcache.WithMaxEntries[string, int](100_000),
)
defer cache.Close()

cache.Set("key", 42, 5*time.Minute)

val, ok := cache.Get("key")  // type-safe, no assertion

Cost-based eviction

cache := mcache.NewCache[string, []byte](
    mcache.WithMaxCost[string, []byte](100 << 20), // 100 MB
    mcache.WithCostFunc[string, []byte](func(v []byte) int64 {
        return int64(len(v))
    }),
)

// A 10 MB value may evict multiple smaller entries
cache.Set("large", make([]byte, 10<<20), 0)

Batch reads

batch := cache.GetBatchOptimized(keys)
// Keys are sorted by shard index before lookup
// for sequential memory access and reduced lock contention
for i, key := range batch.Keys {
    if batch.Found[i] {
        process(key, batch.Values[i])
    }
}

Prefix search (string keys, opt-in)

cache := mcache.NewCache[string, int](
    mcache.WithPrefixSearch[string, int](true),
)

cache.Set("user:1:name", 1, 0)
cache.Set("user:1:email", 2, 0)
cache.Set("user:2:name", 3, 0)

iter := cache.ScanPrefix("user:1:", 0, 100)
for iter.Next() {
    fmt.Println(iter.Key()) // user:1:name, user:1:email
}

Async writes

cache := mcache.NewCache[string, int](
    mcache.WithBufferItems[string, int](64),
)

cache.Set("key", 1, 0)    // buffered, returns immediately
cache.Wait()               // blocks until buffer is flushed
val, _ := cache.Get("key") // guaranteed to see the value

When the write buffer is full, the operation falls back to synchronous execution instead of dropping the entry.

Configuration

OptionDescriptionDefault
WithMaxEntriesMaximum number of entriesunlimited
WithMaxCostMaximum total costunlimited
WithNumCountersTinyLFU counters (recommend 10x max entries)auto
WithShardCountNumber of shards (power of 2)1024
WithBufferItemsAsync write buffer size (0 = sync)0
WithDefaultTTLDefault TTL for entries without explicit TTL0 (no expiry)
WithCostFuncCustom cost calculatorcost = 1
WithKeyHasherCustom key hash functionauto (FNV-1a)
WithLockFreePolicyUse lock-free TinyLFU for readstrue
WithPrefixSearchEnable radix tree for ScanPrefixfalse
WithOnEvictCallback on evictionnil
WithOnExpireCallback on TTL expirationnil
WithOnRejectCallback when TinyLFU rejects entrynil

Supported key types

Built-in zero-allocation hashing for: string, int, int8–int64, uint–uint64, float32, float64, bool, uintptr. Other comparable types use fmt.Sprintf fallback — provide WithKeyHasher for production use with custom types.

API

// Core
cache.Set(key K, value V, ttl time.Duration) bool
cache.SetWithCost(key K, value V, cost int64, ttl time.Duration) bool
cache.Get(key K) (V, bool)
cache.Has(key K) bool
cache.Delete(key K) bool
cache.Len() int
cache.Clear()
cache.Close()
cache.Wait()

// Batch
cache.GetMany(keys []K) map[K]V
cache.GetBatch(keys []K) *BatchResult[K, V]
cache.GetBatchOptimized(keys []K) *BatchResult[K, V]
cache.SetMany(items []Item[K, V]) int
cache.DeleteMany(keys []K) int

// Iterators (cursor-based, Redis-style)
cache.Scan(cursor uint64, count int) *Iterator[K, V]
cache.ScanPrefix(prefix string, cursor uint64, count int) *Iterator[K, V]
cache.ScanMatch(pattern string, cursor uint64, count int) *Iterator[K, V]

// Iterator methods
iter.Next() bool
iter.Key() K
iter.Value() V
iter.All() []Item[K, V]
iter.Keys() []K
iter.ForEach(func(K, V) bool)
iter.Count() int
iter.Cursor() uint64
iter.Close()

// Metrics
cache.Metrics() MetricsSnapshot
// Fields: Hits, Misses, HitRatio, Sets, Deletes, Evictions,
//         Expirations, Rejections, CostAdded, CostEvicted, BufferDrops

Benchmarks

Apple M4 Pro, Go 1.23. Run with go test -bench=. -benchmem.

BenchmarkCacheGet                  4,915,591     445.9 ns/op     0 B/op    0 allocs/op
BenchmarkCacheSet                  2,185,386    1061   ns/op    49 B/op    1 allocs/op
BenchmarkCacheSetWithTTL           2,511,756     973.9 ns/op    49 B/op    1 allocs/op
BenchmarkCacheMixed                4,878,026     473.3 ns/op     9 B/op    0 allocs/op
BenchmarkCacheHas                137,653,788      17.5 ns/op     0 B/op    0 allocs/op
BenchmarkCacheOperations/Read     40,523,341      58.0 ns/op     5 B/op    0 allocs/op
BenchmarkCacheZipf                   683,647    3624   ns/op    13 B/op    0 allocs/op

CacheGet includes TinyLFU frequency tracking on every access. CacheHas is a pure shard lookup without policy update. The difference (~17 ns vs ~446 ns) shows the cost of the admission policy — which is the tradeoff for better hit ratios on skewed workloads (CacheZipf achieves 87.4% hit rate).

Lock-free internals:

BenchmarkCMSketchLockFreeIncrement            58,410,620    20.8 ns/op    0 B/op
BenchmarkCMSketchLockFreeIncrementParallel    43,926,879    24.5 ns/op    0 B/op
BenchmarkBloomFilterLockFreeAddParallel      889,117,638     1.3 ns/op    0 B/op
BenchmarkPolicyLockFreeAccess                 35,165,865    36.3 ns/op    0 B/op

Parallel CM Sketch throughput degrades only ~18% from single-goroutine, showing the lock-free design scales under contention.

Thread safety

All operations are safe for concurrent use. The concurrency model:

  • Reads: shard RLock + lock-free TinyLFU access → no global contention
  • Writes: shard Lock + policy mutex (held only during eviction decision)
  • Expiration: shard-local sweep under write lock, policy/callback updates outside lock
  • Metrics: atomic counters, no locks
  • Shards: cache-line padded to prevent false sharing between cores

Legacy API

The mcache.New() / CacheDriver API from v1 still works. It uses safeMap + GC-based expiration without TinyLFU. See mcache.go and gcmap/ for details. For new code, use the generic NewCache[K, V] API.

License

MIT