Learning Databases & Messaging Systems: My Notes on MySQL, MongoDB, Redis, and Kafka

Introduction

MySQL – SQL Databases (Relational)

MySQL is a widely adopted relational database known for its ease of use, speed, and extensive community support. It is well-suited for applications with clearly defined data structures and scenarios requiring rapid development.

MongoDB – NoSQL Databases

MongoDB is a document-oriented NoSQL database that stores data in a format similar to JSON. It provides a flexible schema model, making it ideal for projects where data structures may change frequently.

Redis – Caching & In-Memory Storage

Redis is an in-memory key-value store known for its ultra-fast response times and support for a variety of data structures. It is commonly used to handle the performance-sensitive parts of an application architecture.

Kafka – Streaming & Messaging

Apache Kafka is a distributed event streaming platform designed to facilitate communication between services via events rather than direct API calls. It provides a robust foundation for building scalable, loosely coupled systems.


MySQL vs. MongoDB

Data Model

  • MySQL uses a traditional relational model, where data is stored in rows and columns across tables. It requires a fixed schema, and relationships are defined using primary and foreign keys.
  • MongoDB stores data as JSON-like documents (BSON) in collections. It is often called “schema-less” because it does not enforce a fixed structure, so documents in the same collection can have different fields. That does not mean there is no structure at all: developers usually follow a consistent format and can enforce rules with MongoDB’s built-in schema validation. MongoDB gives users more flexibility while still supporting structure when needed.
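
To make the two data models concrete, here is a minimal sketch. It assumes sqlite3 as a stand-in for MySQL (so it runs without a server) and plain Python dicts as a stand-in for MongoDB documents. The table names, fields, and the `validate_user` helper are hypothetical.

```python
import sqlite3

# Relational side: sqlite3 stands in for MySQL here (an assumption made so the
# sketch runs without a server). Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, user_id, total) VALUES (10, 1, 25.0)")

# An order pointing at a user that does not exist is rejected by the schema.
try:
    conn.execute("INSERT INTO orders (id, user_id, total) VALUES (11, 99, 5.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# Document side: plain dicts stand in for documents in one collection.
# The two documents have different shapes, which a document store permits.
users_collection = [
    {"_id": 1, "name": "Ada", "orders": [{"total": 25.0}]},
    {"_id": 2, "name": "Lin", "email": "lin@example.com"},
]

def validate_user(doc):
    """Hypothetical application-side check; MongoDB's own validation is set up on the server."""
    return isinstance(doc.get("name"), str)
```

The relational side refuses the orphaned order, while the document side accepts two differently shaped records and leaves any structural rule to an explicit check.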

Scalability

  • MySQL is generally better suited for vertical scaling, meaning it performs well when adding more resources (like CPU, RAM, or storage) to a single server. It does support read replicas to help distribute read workloads, but it’s not as naturally built for running across many servers. While horizontal scaling is possible, it usually requires additional tools or more complex setups.
  • MongoDB, on the other hand, is designed for horizontal scaling. It supports sharding (splitting data across multiple machines) and replica sets (automatic failover and redundancy), making it easier to handle large datasets and high traffic across distributed systems.
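
The idea behind sharding can be sketched in a few lines: a stable hash of a key decides which machine holds the record. This is only a toy model with hypothetical shard names. MongoDB's real sharding manages ranges of shard-key values and rebalances them, which this sketch does not attempt.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names

def shard_for(key: str) -> str:
    """Pick a shard from a stable hash of the shard key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]

# Group a handful of hypothetical user ids by the shard they land on.
placement = {}
for user_id in ("user-1", "user-2", "user-3", "user-4", "user-5", "user-6"):
    placement.setdefault(shard_for(user_id), []).append(user_id)
```

Because the hash is deterministic, every lookup for the same key goes to the same shard, which is what lets reads and writes be spread across machines.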

Query Language

  • MySQL uses SQL (Structured Query Language), which is widely known and supported.
  • MongoDB uses MQL (MongoDB Query Language), which is more document-oriented and may take time to learn for those coming from SQL.
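
A small side-by-side may help show the difference in style. The sketch below assumes sqlite3 as a stand-in for MySQL and uses a deliberately tiny `matches` helper (hypothetical, supporting only equality and `$gte`) in place of a real MongoDB driver.

```python
import sqlite3

rows = [("Ada", 36), ("Lin", 17), ("Sam", 52)]

# SQL: declarative text, run here against sqlite3 as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
sql_names = [r[0] for r in conn.execute(
    "SELECT name FROM people WHERE age >= ? ORDER BY name", (18,)
)]

# MQL: the same condition is itself a document. With a MongoDB driver this
# filter would be handed to a query call; here a tiny matcher stands in.
mql_filter = {"age": {"$gte": 18}}

def matches(doc, query):
    """Toy matcher supporting only equality and $gte (an assumption for this sketch)."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            if "$gte" in cond and not (value is not None and value >= cond["$gte"]):
                return False
        elif value != cond:
            return False
    return True

docs = [{"name": n, "age": a} for n, a in rows]
mql_names = sorted(d["name"] for d in docs if matches(d, mql_filter))
```

Both approaches select the same people. The SQL version expresses the condition as query text, while the MQL version expresses it as nested data.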

Performance

  • MySQL is efficient for structured data and complex joins, especially when data is properly indexed.
  • MongoDB performs well when dealing with large volumes of insert or update operations, especially when documents are self-contained and don’t require joins.

Flexibility

  • MySQL has a strict schema, which helps maintain data consistency but may require migrations when the structure changes.
  • MongoDB allows flexible data structures. This makes it easier to work with evolving or unstructured data, which is common in modern applications.

Security

Both databases support encryption, user authentication, and access control.

  • MySQL uses its own built-in authentication system by default. Like any SQL database, it is exposed to SQL injection when an application builds queries from unvalidated user input instead of using parameterized queries.
  • MongoDB supports external authentication methods like LDAP, Kerberos, and X.509.
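
Since SQL injection comes up here, a short illustration of the problem and the usual fix may be useful. It assumes sqlite3 as a stand-in for MySQL, with a hypothetical `users` table. MySQL drivers may use a different placeholder style than `?`, but the principle is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("ada", 1), ("lin", 0)])

malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the OR clause matches every row.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the placeholder sends the input as a value, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()
```

The concatenated query returns every user, while the parameterized one returns nothing because no user has that literal name.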

Redis vs. Kafka

(* This comparison focuses on Redis pub/sub messaging, not its general key-value storage features.)

Workflow

  • Redis works like a live broadcaster. When a producer publishes a message, Redis immediately pushes it to all connected subscribers. Messages are grouped by channel name, such as “email,” and sent to whoever is subscribed to that channel. Redis handles messages in memory, which makes it very fast, but it doesn’t keep them after delivery. If no one is subscribed when the message is sent, the message is lost.
  • Kafka lets different apps send and receive data through “topics.” A topic is like a channel for a specific type of message, such as orders or payments. Apps that send messages are called producers, and those that read messages are consumers. Each topic is split into partitions, which are spread across multiple servers for better performance and reliability. Consumers pull messages from these partitions whenever they are ready, and the messages stay there for a configured retention period, even after being read.
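
How a keyed message ends up in a partition can be sketched as follows. This is a toy model: the partition count is hypothetical, and `zlib.crc32` is a stand-in hash chosen for the sketch, not the one Kafka itself uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for one topic

def partition_for(key: str) -> int:
    """Map a message key to a partition. crc32 is a stand-in hash, not Kafka's own."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS

# partition number -> ordered list of (key, value) messages
topic = defaultdict(list)

def produce(key: str, value: str) -> int:
    """Append a message to the partition chosen by its key and return that partition."""
    p = partition_for(key)
    topic[p].append((key, value))
    return p

p1 = produce("order-42", "created")
p2 = produce("order-42", "paid")
p3 = produce("order-7", "created")
```

Messages with the same key land in the same partition, so their relative order is preserved there, while different keys can spread across partitions.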

Message Size

  • Redis is optimized for small messages. It stores everything in memory, so capacity is limited.
  • Kafka accepts messages of about 1 MB by default. The limit is configurable, and larger payloads (reportedly up to ~1 GB) are possible when compression and external storage are used.

Message Delivery

  • Redis uses a push-based approach. It sends messages directly to all connected subscribers as soon as they’re available.
  • Kafka uses a pull-based approach. Consumers read messages from a topic’s partitions when they are ready.

Message Retention

  • Redis only keeps messages if subscribers are connected. If no one is listening, the messages are dropped and can’t be recovered. (* This applies to Redis pub/sub. Redis can persist data in other use cases.)
  • Kafka stores messages even after they’ve been read. Consumers can re-read data later.
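
The difference between push-and-forget delivery and a retained log can be modelled with two small classes. These are toy models written for these notes, not the Redis or Kafka client APIs. The class and method names are hypothetical.

```python
class ToyPubSub:
    """Fire-and-forget broadcast: delivers only to subscribers connected right now."""

    def __init__(self):
        self.subscribers = {}  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        receivers = self.subscribers.get(channel, [])
        for callback in receivers:
            callback(message)      # pushed immediately
        return len(receivers)      # nothing is stored afterwards


class ToyLog:
    """Append-only log: messages stay put and each consumer tracks its own offset."""

    def __init__(self):
        self.records = []

    def append(self, message):
        self.records.append(message)

    def poll(self, offset, max_records=10):
        batch = self.records[offset:offset + max_records]
        return batch, offset + len(batch)


pubsub = ToyPubSub()
delivered = pubsub.publish("email", "welcome")  # no subscriber yet: message is gone
inbox = []
pubsub.subscribe("email", inbox.append)
pubsub.publish("email", "receipt")              # only this one reaches the inbox

log = ToyLog()
log.append("welcome")
log.append("receipt")
first_read, next_offset = log.poll(0)           # a late consumer still sees both
replayed, _ = log.poll(0)                       # re-reading from offset 0 works
```

In the pub/sub model the first message is lost because nobody was listening. In the log model the consumer decides where to start reading, so late arrival and replay both work.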

Error Handling

  • Redis relies on the application to manage issues like timeouts or memory limits. It doesn’t have built-in message-level error tracking.
  • Kafka’s ecosystem has tools for error recovery, such as producer retries and dead-letter queues (for example in Kafka Connect).

Parallelism

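  • Redis pub/sub sends every message to every subscriber of a channel. It has no built-in way to split one channel’s messages among several consumers so they can share the work.
  • Kafka supports parallel consumption. A topic’s partitions are divided among the consumers in a consumer group, and separate consumer groups can each read the same messages independently.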
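
Kafka's parallelism comes from dividing a topic's partitions among the consumers that share a group name, while each separate group still sees every partition. The sketch below is a simplified round-robin assignment with hypothetical consumer names. Kafka's actual assignment strategies are configurable and more involved.

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment of partitions to the consumers of one group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [0, 1, 2, 3]

# Two consumers in one group split the partitions between them.
billing_group = assign_partitions(partitions, ["billing-1", "billing-2"])

# A second, independent group reads all partitions on its own.
audit_group = assign_partitions(partitions, ["audit-1"])
```

Within the billing group each partition has exactly one reader, so the work is shared. The audit group reads the same data separately without affecting billing.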

Throughput

  • Redis has lower throughput when more subscribers are connected, since it must push each message to every subscriber.
  • Kafka can process a high volume of messages per second. It doesn’t wait for each consumer to respond.

Latency

  • Redis has very low latency because it reads/writes in RAM.
  • Kafka is also fast, but usually a bit slower due to disk storage and data replication.

Fault Tolerance

  • Redis does not persist data unless configured to do so. Data may be lost if the system shuts down unexpectedly.
  • Kafka automatically replicates data across servers to prevent loss.

Feature Summary

Strengths

MySQL:
  • Fast reading thanks to efficient queries
  • Widely supported with good tools and documentation
  • Easy-to-learn SQL language
  • Strong replication for backup and failover
  • Offers different storage engines for different needs

MongoDB:
  • Doesn't need a fixed schema, so it is easy to adapt to changing data
  • Built-in sharding for large-scale horizontal scaling
  • Flexible querying and strong aggregation features
  • Stores data in a JSON-like format that feels natural to developers
  • Great for rapid development and frequent updates

Redis:
  • Extremely fast because everything runs in memory
  • Supports many types of data, not just strings
  • Can send real-time messages with pub/sub
  • Supports scripting for custom operations
  • Can save data to disk if needed

Kafka:
  • Handles very large amounts of messages efficiently
  • Replicates data to avoid losing it
  • Lets you process messages as they arrive using Kafka Streams
  • Keeps track of all messages (great for logs, events)
  • Ideal for systems where services talk to each other using messages

Use Cases

MySQL:
  • Best for apps with fixed data structure, like content systems or blogs
  • Great when you mostly read data, not write it
  • Ideal for small-to-medium web apps or quick prototyping

MongoDB:
  • Works well for apps with changing or flexible data, like content feeds
  • Used in chat apps, collaborative tools, or IoT where data isn't always the same
  • Good for building features fast during early-stage development

Redis:
  • Used for caching to reduce database load
  • Manages user sessions in websites
  • Real-time data use like game scores or rate limiting
  • Also works as a simple message queue

Kafka:
  • Great for tracking things that happen in a system (like orders or user actions)
  • Useful for detecting fraud or monitoring patterns in real time
  • Good for apps that receive constant data from devices (like IoT sensors)
  • Helps services work independently by sharing data through events

Recommended When

MySQL:
  • Best suited for applications with a fixed schema and consistent data structure
  • Performs well in scenarios where query performance is important and queries are relatively simple
  • A reliable option for users seeking a familiar and easy-to-use relational database system

MongoDB:
  • Recommended for cases where the data model evolves frequently or lacks a strict structure
  • Well-suited for fast-paced projects requiring flexible and dynamic schemas
  • A strong choice for users who need to scale horizontally across distributed systems

Redis:
  • Ideal for use cases that require extremely low-latency data access
  • Commonly used for temporary or fast-changing data, such as cache, session storage, or counters
  • Not typically used for long-term data storage unless properly configured with persistence options

Kafka:
  • Designed for systems that process large volumes of real-time data streams
  • Effective for applications built around messaging, events, or asynchronous communication
  • Valuable when durability and service independence are key architectural requirements

References

The Ultimate Database Guide for Sealos: Which Database Should You Choose in 2025? | Sealos Blog
MongoDB vs MySQL - Difference Between Database Management Systems - AWS
Redis vs Kafka - Difference Between Pub/Sub Messaging Systems - AWS