minicache


A distributed cache implemented in Go. Like Redis, but simpler.


Features

Thread-safe LRU cache with O(1) operations

Consistent Hashing

Distributed leader election algorithm

Dynamic node discovery

No single point of failure

Supports both REST API and gRPC

mTLS for mutual authentication and encrypted communication between clients and servers

Performance

Test environment:

Performance test output

$ GIN_MODE=release sudo go test -v main_test.go 

=== RUN   Test10kGrpcPuts
    main_test.go:60: Time to complete 10k puts via gRPC API: 521.057551ms
    main_test.go:61: Cache misses: 0/10,000 (0.000000%)
--- PASS: Test10kGrpcPuts (3.85s)

=== RUN   Test10kRestApiPuts
    main_test.go:118: Time to complete 10k puts via REST: 2.596501161s
    main_test.go:119: Cache misses: 0/10,000 (0.000000%)
--- PASS: Test10kRestApiPuts (3.11s)

=== RUN   Test10kRestApiPutsInsecure
    main_test.go:175: Time to complete 10k puts via REST: 7.675285188s
    main_test.go:176: Cache misses: 0/10,000 (0.000000%)
--- PASS: Test10kRestApiPutsInsecure (10.72s)

1. LRU cache implementation run directly by a test program:

Test: 10 million puts calling an LRU cache with a capacity of 10,000 directly in memory:

$ go test -v ./lru

=== RUN   TestCacheWriteThroughput
    lru_test.go:19: Time to complete 10M puts: 3.869112083s
    lru_test.go:20: LRU Cache write throughput: 2584572.321887 puts/second

Result: 2.58 million puts/second
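
For context on how an LRU cache achieves O(1) Get and Put, the sketch below shows the standard construction: a hash map for lookup plus a doubly linked list for recency, guarded by a mutex. It is a minimal illustration only; the names and signatures are not the actual API of this repo's lru package.

package lru

import (
	"container/list"
	"sync"
)

type entry struct {
	key, value string
}

// Cache is a minimal thread-safe LRU cache: a map for O(1) lookup plus a
// doubly linked list for O(1) recency updates and eviction.
type Cache struct {
	mu       sync.Mutex
	capacity int
	items    map[string]*list.Element // key -> list element holding *entry
	order    *list.List               // front = most recently used
}

func NewCache(capacity int) *Cache {
	return &Cache{
		capacity: capacity,
		items:    make(map[string]*list.Element),
		order:    list.New(),
	}
}

// Get returns the value for key and marks it as most recently used.
func (c *Cache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).value, true
	}
	return "", false
}

// Put inserts or updates key, evicting the least recently used entry when full.
func (c *Cache) Put(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
}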

2. Distributed cache running locally, with storage via gRPC calls over the local network:

Test: 10,000 items stored in the cache via gRPC calls, with 4 cache servers running on localhost (capacity of 100 items each) and all servers staying online throughout the test:

$ go test -v main_test.go
	...
    main_test.go:114: Time to complete 10k puts via gRPC: 588.774872ms

Result: ~17,000 puts/second

3. Distributed cache running in Docker containers, with storage via gRPC calls:

Test: 10,000 items stored in the cache via gRPC calls, with 4 cache servers running in Docker containers (capacity of 100 items each) and all servers staying online throughout the test:

# docker run --network minicache_default cacheclient
	...
	client_docker_test.go:95: Time to complete 10k puts via REST API: 8.6985474s

Result: 1150 puts/second

Testing

1. Unit tests
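
The individual packages can be unit-tested with go test; for example, the LRU cache tests used in the performance section above:

$ go test -v ./lru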

2. Integration tests (local)

Run the integration tests with the command go test -v main_test.go, which performs the following steps:

  1. Spins up multiple cache server instances locally on different ports (see nodes-local-with-mTLS.json config file)
  2. Creates cache client
  3. Runs 10 goroutines which each send 1000 requests to put items in the distributed cache via REST API endpoint
  4. Runs 10 goroutines which each send 1000 requests to put items in the distributed cache via gRPC calls (a rough sketch of this load pattern appears after this list)
  5. After each test, displays the % of cache misses (which in this case means the client was simply unable to store an item in the distributed cache)
  6. Repeats steps 1-5 using the nodes-local-insecure.json config file, to test the pure HTTP implementation (no TLS or mTLS).
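
As a rough illustration of the load pattern in steps 3 and 4, the sketch below fans out 10 goroutines that each issue 1,000 puts. The Put signature here is an assumption based on the client usage shown in Example 4 below, not the test's actual code.

import (
	"fmt"
	"sync"
)

// putLoad mirrors steps 3 and 4: 10 goroutines, each issuing 1,000 puts.
// The put argument stands in for the cache client's Put method (see Example 4);
// its exact signature is assumed here.
func putLoad(put func(key, value string)) {
	var wg sync.WaitGroup
	for g := 0; g < 10; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				put(fmt.Sprintf("key-%d-%d", g, i), fmt.Sprintf("value-%d-%d", g, i))
			}
		}(g)
	}
	wg.Wait()
}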

3. Integration tests (Docker)

If you have Docker and Docker Compose installed, you can run the test script ./docker-test.sh which performs the following steps:

  1. Spins up multiple containerized cache server instances using Docker Compose (see nodes-local-with-mTLS.json config file)
  2. Builds a Docker image for running the client & integration tests from Docker container
  3. Runs 10 goroutines which each send 1000 requests to put items in the distributed cache via REST API endpoint
  4. Runs 10 goroutines which each send 1000 requests to put items in the distributed cache via gRPC calls
  5. After each test, displays the % of cache misses (which in this case means the client was simply unable to store an item in the distributed cache)
  6. Once the tests are complete, the Docker containers running the cache servers are stopped.

4. Fault-tolerance testing

A useful test is to manually stop/restart arbitrary nodes in the cluster and observe the test log output to see the consistent hashing ring update in real time.

Example of stopping and restarting cacheserver1 while integration tests are running:

2022/05/15 02:23:35 cluster config: [id:"node2" host:"cacheserver2" rest_port:8080 grpc_port:5005 id:"epQKE" host:"cacheserver3" rest_port:8080 grpc_port:5005 id:"node0" host:"cacheserver0" rest_port:8080 grpc_port:5005]
...
2022/05/15 02:23:35 Removing node node1 from ring
...
2022/05/15 02:23:36 cluster config: [id:"epQKE" host:"cacheserver3" rest_port:8080 grpc_port:5005 id:"node0" host:"cacheserver0" rest_port:8080 grpc_port:5005 id:"node2" host:"cacheserver2" rest_port:8080 grpc_port:5005]
...

2022/05/15 02:23:40 Adding node node1 to ring
...
2022/05/15 02:23:41 cluster config: [id:"node2" host:"cacheserver2" rest_port:8080 grpc_port:5005 id:"epQKE" host:"cacheserver3" rest_port:8080 grpc_port:5005 id:"node0" host:"cacheserver0" rest_port:8080 grpc_port:5005 id:"node1" host:"cacheserver1" rest_port:8080 grpc_port:5005]
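
The log output above is the consistent hashing ring at work: when a node drops out, only the keys that hashed to it are redistributed to its neighbours, and when it comes back it simply rejoins the ring. The sketch below is a minimal illustration of such a ring (hash the node IDs onto a circle, then walk clockwise from a key's hash to find its owner); it is not the project's actual implementation, which may differ, for example by using virtual nodes.

package ring

import (
	"crypto/sha1"
	"encoding/binary"
	"sort"
	"sync"
)

// Ring is a minimal consistent hash ring. Illustrative only.
type Ring struct {
	mu     sync.RWMutex
	hashes []uint32          // sorted hashes of node IDs
	nodes  map[uint32]string // hash -> node ID
}

func New() *Ring {
	return &Ring{nodes: make(map[uint32]string)}
}

func hash(s string) uint32 {
	h := sha1.Sum([]byte(s))
	return binary.BigEndian.Uint32(h[:4])
}

// Add places a node on the ring.
func (r *Ring) Add(nodeID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	h := hash(nodeID)
	r.nodes[h] = nodeID
	r.hashes = append(r.hashes, h)
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
}

// Remove takes a node off the ring; its keys fall through to the next node clockwise.
func (r *Ring) Remove(nodeID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	h := hash(nodeID)
	delete(r.nodes, h)
	for i, v := range r.hashes {
		if v == h {
			r.hashes = append(r.hashes[:i], r.hashes[i+1:]...)
			break
		}
	}
}

// Get returns the node responsible for key: the first node hash clockwise from the key's hash.
func (r *Ring) Get(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if len(r.hashes) == 0 {
		return ""
	}
	h := hash(key)
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0
	}
	return r.nodes[r.hashes[i]]
}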

Setup and Usage

1. Create/update node configuration file

You will need to define 1 or more initial "genesis" nodes in a JSON config file, along with settings for enabling/disabling TLS.

A few working examples are included in the configs directory (e.g. nodes-local-with-mTLS.json, nodes-local-insecure.json, and nodes-docker-with-mTLS.json); the format is very straightforward.

These genesis nodes are the original nodes of the cluster, which any new nodes created later on will attempt to contact in order to dynamically register themselves with the cluster. As long as at least 1 of these initial nodes is online, any arbitrary number of new nodes can be spun up (e.g. launching more cache server containers from an image) without defining them in a config file, rebuilding the image etc.

It is therefore recommended to define at least 3 initial nodes to provide a reasonable level of fault tolerance.
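
Purely as an illustration of the shape of such a file (the field names below are guesses based on the cluster config log output in the fault-tolerance section and the flags in main.go, not the repo's exact schema; see the files in the configs directory for the authoritative format):

{
  "nodes": [
    { "id": "node0", "host": "cacheserver0", "rest_port": 8080, "grpc_port": 5005 },
    { "id": "node1", "host": "cacheserver1", "rest_port": 8080, "grpc_port": 5005 },
    { "id": "node2", "host": "cacheserver2", "rest_port": 8080, "grpc_port": 5005 }
  ],
  "enable_https": true,
  "enable_client_auth": true
}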

2. Enabling/Disabling TLS

TLS is enabled or disabled via settings in the node configuration file.

3. Generating TLS certificates

If you want to enable mTLS, you will need to generate TLS certificates. Whichever tooling you use, the certificates must include a subjectAltName entry covering every hostname and IP the servers and clients are reached on, for example:

subjectAltName = DNS:localhost,DNS:cacheserver0,DNS:cacheserver1,DNS:cacheserver2,DNS:cacheserver3,IP:0.0.0.0,IP:127.0.0.1
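
The exact commands depend on your tooling; one possible sketch using openssl is shown below. The file names and the certs directory are assumptions for illustration, not the repo's documented layout.

# 1. create a CA key and self-signed CA certificate
openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
    -keyout certs/ca-key.pem -out certs/ca-cert.pem -subj "/CN=minicache-ca"

# 2. create a server key and certificate signing request
openssl req -newkey rsa:4096 -nodes \
    -keyout certs/server-key.pem -out certs/server-req.pem -subj "/CN=cacheserver"

# 3. sign the server certificate with the CA, attaching the subjectAltName entry above
echo "subjectAltName = DNS:localhost,DNS:cacheserver0,DNS:cacheserver1,DNS:cacheserver2,DNS:cacheserver3,IP:0.0.0.0,IP:127.0.0.1" > certs/san.cnf
openssl x509 -req -in certs/server-req.pem -days 365 \
    -CA certs/ca-cert.pem -CAkey certs/ca-key.pem -CAcreateserial \
    -extfile certs/san.cnf -out certs/server-cert.pem

# repeat steps 2-3 for the client certificate if client authentication (mTLS) is enabled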

4. Run cache servers and clients by following any of the examples below

Examples

Example 1: Run Distributed Cache Using Docker Containers

  1. Run docker-compose build from the project root directory to build the Docker images.

  2. Run docker-compose up to spin up all of the containers defined in the docker-compose.yml file. By default this starts 4 cache server instances: 3 are the initial nodes defined in the configs/nodes-docker-with-mTLS.json config file, and 1 is an extra server node which dynamically adds itself to the cluster, to demonstrate this functionality.

  3. Run docker build -t cacheclient -f Dockerfile.client . from the project root directory to build a Docker image for the client.

  4. Run docker run --network minicache_default cacheclient to run the client in a Docker container connected to the Docker Compose network the servers are running on. By default, the Dockerfile simply builds the client and runs the integration tests described above, although you can change it to do whatever you want.

PRO TIP: a useful test is to manually stop/restart arbitrary nodes in the cluster and observe the test log output to see the consistent hashing ring update in real time.

Example 2: Starting All Cache Servers Defined in Config File

The example below (taken from an integration test, hence the t.Logf call) starts all cache servers defined in the config file and later shuts them down cleanly:

	// set up parameters for servers
	capacity := 100
	verbose := false
	insecure := false
	absCertDir, _ := filepath.Abs(RELATIVE_CLIENT_CERT_DIR)
	absConfigDir, _ := filepath.Abs(RELATIVE_CONFIG_PATH)
	shutdownChan := make(chan bool, 1)

	// start servers defined in config file
	components := server.CreateAndRunAllFromConfig(capacity, absConfigDir, verbose, insecure)

	
	...

	// cleanup
	for _, srv := range components {
		srv.GrpcServer.Stop()

		ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
		defer cancel()

		if err := srv.HttpServer.Shutdown(ctx); err != nil {
			t.Logf("Http server shutdown error: %s", err)
		}
	}

Example 3: Starting a Single Cache Server

From main.go

func main() {
	// parse arguments
	grpcPort := flag.Int("grpc_port", 5005, "port number for gRPC server to listen on")
	capacity := flag.Int("capacity", 1000, "capacity of LRU cache")
	clientAuth := flag.Bool("enable_client_auth", true, "require client authentication (used for mTLS)")
	httpsEnabled := flag.Bool("enable_https", true, "enable HTTPS for server-server and client-server communication. Requires TLS certificates in /certs directory.")
	configFile := flag.String("config", "", "filename of JSON config file with the info for initial nodes")
	restPort := flag.Int("rest_port", 8080, "enable REST API for client requests, instead of just gRPC")
	verbose := flag.Bool("verbose", false, "log events to terminal")

	flag.Parse()

	// set up TCP listener for the gRPC server
	listener, err := net.Listen("tcp", fmt.Sprintf(":%d", *grpcPort))
	if err != nil {
		panic(err)
	}

	// create the gRPC server and cache server
	grpcServer, cacheServer := server.NewCacheServer(
		*capacity,
		*configFile,
		*verbose,
		server.DYNAMIC,
		*httpsEnabled,
		*clientAuth,
	)

	// run gRPC server
	cacheServer.LogInfoLevel(fmt.Sprintf("Running gRPC server on port %d...", *grpcPort))
	go grpcServer.Serve(listener)

	// register node with cluster
	cacheServer.RegisterNodeInternal()

	// run initial election
	cacheServer.RunElection()

	// start leader heartbeat monitor
	go cacheServer.StartLeaderHeartbeatMonitor()

	// run HTTP server
	cacheServer.LogInfoLevel(fmt.Sprintf("Running REST API server on port %d...", *restPort))
	httpServer := cacheServer.RunAndReturnHTTPServer(*restPort)

	// set up shutdown handler and block until sigint or sigterm received
	c := make(chan os.Signal, 1)
	signal.Notify(c, os.Interrupt, syscall.SIGTERM, syscall.SIGINT)
	go func() {
		<-c

		cacheServer.LogInfoLevel("Shutting down gRPC server...")
		grpcServer.Stop()

		cacheServer.LogInfoLevel("Shutting down HTTP server...")
		ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
		defer cancel()

		if err := httpServer.Shutdown(ctx); err != nil {
			cacheServer.LogInfoLevel(fmt.Sprintf("Http server shutdown error: %s", err))
		}
		os.Exit(0)
	}()

	// block indefinitely
	select {}
}

Example 4: Creating and Using a Cache Client

The example below creates a cache client, starts its cluster config watcher, and then puts and gets values:

	// start client
	c := client.NewClientWrapper(absCertDir, absConfigDir, insecure, verbose)
	c.StartClusterConfigWatcher()

	...

	c.Put(key, value)

	...

	val := c.Get(key)
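
The StartClusterConfigWatcher call is what ties the client into the dynamic node discovery described above: it keeps the client's view of the cluster configuration up to date, so subsequent Put and Get calls are routed to whichever nodes are currently in the ring (see the fault-tolerance testing section).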

Contributing

Feel free to take a look at the open issues and feature requests and submit a pull request.