What are Redis's underlying implementation principles? What core data structures and mechanisms does it include? - 面试题

Redis's underlying implementation principles are key to understanding Redis's high performance, mainly including core content like data structures, network models, and memory management.

1. SDS (Simple Dynamic String)

Basic Concept: SDS is the underlying implementation of strings in Redis, a wrapper around C language strings.

SDS Structure:

c
struct sdshdr {
    int len;        // String length
    int free;       // Remaining available space
    char buf[];     // Byte array
};

SDS Advantages:

O(1) time complexity to get string length: C language strings need to traverse the entire string to get length, time complexity is O(n)
Avoid buffer overflow: SDS checks remaining space and automatically expands when space is insufficient
Reduce memory reallocation次数: SDS uses space pre-allocation and lazy release strategies to reduce memory reallocation次数
Binary safe: SDS can store arbitrary binary data, including '\0' characters
Compatible with C language string functions: SDS follows the C language string convention of ending with '\0'

Space Pre-allocation Strategy:

When SDS length is less than 1MB, expansion allocates 2 times the length
When SDS length is greater than or equal to 1MB, expansion allocates 1MB of additional space

2. Linked List

Basic Concept: Redis's linked list is a doubly linked list, used to implement List, pub/sub, slow query, and other functions.

Linked List Structure:

c
typedef struct listNode {
    struct listNode *prev;   // Previous node
    struct listNode *next;   // Next node
    void *value;             // Node value
} listNode;

typedef struct list {
    listNode *head;          // Head node
    listNode *tail;          // Tail node
    unsigned long len;       // Linked list length
    void *(*dup)(void *ptr); // Node value copy function
    void (*free)(void *ptr); // Node value release function
    int (*match)(void *ptr, void *key); // Node value comparison function
} list;

Linked List Features:

Doubly linked list: Can easily traverse forward and backward
Acyclic linked list: Head node's prev pointer and tail node's next pointer both point to NULL
With head and tail pointers: Can quickly get head and tail nodes
With linked list length counter: Can quickly get linked list length
Polymorphic: Linked list nodes can store different types of values

3. Dictionary (Dict)

Basic Concept: Dictionary is the underlying implementation of Hash in Redis, implemented using a hash table.

Dictionary Structure:

c
typedef struct dictht {
    dictEntry **table;       // Hash table array
    unsigned long size;      // Hash table size
    unsigned long sizemask;  // Hash table size mask, used to calculate index
    unsigned long used;      // Number of existing nodes
} dictht;

typedef struct dictEntry {
    void *key;               // Key
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;                     // Value
    struct dictEntry *next;  // Pointer to next hash table node, forming a linked list
} dictEntry;

typedef struct dict {
    dictType *type;          // Type-specific functions
    void *privdata;          // Private data
    dictht ht[2];            // Two hash tables, used for rehash
    int rehashidx;           // Rehash index, -1 means no rehash in progress
} dict;

Hash Algorithm: Redis uses the MurmurHash2 algorithm to calculate hash values, then uses hash & sizemask to calculate the index.

Hash Conflict Resolution: Redis uses chaining to resolve hash conflicts, meaning each hash table node has a next pointer pointing to the next hash table node.

Rehash Process:

Allocate space for ht[1], size is the first 2^n greater than or equal to ht[0].used * 2
Rehash all key-value pairs from ht[0] to ht[1]
Release ht[0], set ht[1] as ht[0], ht[1] creates an empty hash table

Progressive Rehash: To avoid the impact of rehash on performance, Redis uses progressive rehash, meaning rehashing key-value pairs from ht[0] to ht[1] in multiple batches.

4. Skip List

Basic Concept: Skip list is the underlying implementation of ZSet in Redis, an ordered data structure.

Skip List Structure:

c
typedef struct zskiplistNode {
    sds ele;                 // Member object
    double score;            // Score
    struct zskiplistNode *backward; // Backward pointer
    struct zskiplistLevel {
        struct zskiplistNode *forward; // Forward pointer
        unsigned long span;   // Span
    } level[];               // Level
} zskiplistNode;

typedef struct zskiplist {
    struct zskiplistNode *header, *tail; // Head and tail nodes
    unsigned long length;    // Skip list length
    int level;               // Skip list maximum level
} zskiplist;

Skip List Features:

Multi-level structure: Skip list has multiple levels, the bottom level contains all elements, upper levels are subsets of lower levels
Ordered: Elements in skip list are sorted by score from small to large
High search efficiency: Skip list search time complexity is O(log n)
Space for time: Skip list improves search efficiency through multi-level structure but occupies more space

5. IntSet

Basic Concept: IntSet is one of the underlying implementations of Set in Redis, used to store integers.

IntSet Structure:

c
typedef struct intset {
    uint32_t encoding;       // Encoding method
    uint32_t length;         // Number of elements in the set
    int8_t contents[];       // Array storing elements
} intset;

IntSet Features:

Ordered: Elements in intset are sorted from small to large
Unique: All elements in intset are unique
Upgrade mechanism: When the type of new element is larger than current encoding type, it automatically upgrades the encoding type

Upgrade Process:

Extend the underlying space of intset based on the type of new element
Convert existing elements to new type and place them in correct positions
Add new element to intset

6. ZipList

Basic Concept: ZipList is one of the underlying implementations of List, Hash, and ZSet in Redis, used to store a small number of elements.

ZipList Structure:

shell
<zlbytes><zltail><zllen><entry><entry>...<zlend>

ZipList Features:

Compact storage: ZipList uses contiguous memory space, storage is compact
Memory saving: ZipList has no pointers and extra overhead, saves memory
Low search efficiency: ZipList search time complexity is O(n)
Low update efficiency: ZipList update may require memory reallocation

ZipList Conversion: When the number of elements or size of ziplist exceeds the threshold, it converts to other data structures:

List converts to linkedlist or quicklist
Hash converts to hashtable
ZSet converts to skiplist

7. I/O Multiplexing

Basic Concept: Redis uses I/O multiplexing model, can handle multiple client connections simultaneously.

I/O Multiplexing Model: Redis uses I/O multiplexing functions like epoll (Linux), kqueue (BSD), select (Windows).

Workflow:

Redis server creates socket, binds port, listens for connections
Use epoll_ctl to add socket to epoll instance
Use epoll_wait to wait for events
When events occur, epoll_wait returns, process events

Advantages:

High concurrency: Can handle multiple client connections simultaneously
Non-blocking: Won't affect other clients due to one client's blocking
Efficient: Uses event-driven model, high efficiency

8. Event Loop

Basic Concept: Redis uses event loop model to handle file events and time events.

File Events: File events are abstractions of socket operations by Redis server, including readable events and writable events.

Time Events: Time events are abstractions of timing operations by Redis server, including timing tasks and periodic tasks.

Event Loop Process:

c
void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        // Process file events
        aeProcessEvents(eventLoop, AE_ALL_EVENTS);
        
        // Process time events
        processTimeEvents(eventLoop);
    }
}

9. Memory Management

Memory Allocator: Redis uses jemalloc as the memory allocator, jemalloc is a high-performance memory allocator.

Memory Fragmentation: Redis uses jemalloc's memory fragmentation defragmentation function to reduce memory fragmentation.

Memory Statistics: Redis uses INFO memory command to view memory usage, including used_memory, used_memory_rss, used_memory_peak and other metrics.

Summary

Redis's underlying implementation principles include data structures like SDS, linked lists, dictionaries, skip lists, intsets, ziplists, and mechanisms like I/O multiplexing, event loops, and memory management. These underlying implementations ensure Redis's high performance and high reliability. Understanding these underlying implementation principles helps to better use Redis and optimize Redis performance.