Rule | Requirement | Correct Example | Incorrect Example |
Red line | Isolate business data from system metadata. | db_config (independent business database) | admin,local,config |
Rule | Requirement | Correct Example | Incorrect Example |
Prefix | Start with db_ | db_order | Order |
Character Set | Lowercase letters + underscores | db_user_center | db.user.center |
Length | ≤ 64 bytes | db_payment | db-payment |
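The database naming rules above can be checked mechanically before a database is created. The following is a minimal sketch in plain JavaScript, not an official tool: the function name `isValidDatabaseName` is hypothetical, and the reserved-name list is taken only from the "Red line" row above.

```javascript
// Hypothetical helper: checks a database name against the rules in the tables above.
// The reserved names come from the "Red line" row (admin, local, config).
const RESERVED_DATABASES = ["admin", "local", "config"];

function isValidDatabaseName(name) {
  // Business data must not live in system databases.
  if (RESERVED_DATABASES.includes(name)) return false;
  // db_ prefix, then lowercase letters and underscores only.
  if (!/^db_[a-z][a-z_]*$/.test(name)) return false;
  // The pattern only admits ASCII, so character count equals byte count.
  return name.length <= 64;
}
```

A check like this could run in a provisioning script or code review hook so that names such as `Order` or `db.user.center` are rejected early.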
t_ prefix + lowercase letters + underscores, following the "module_entity" format.
Rule | Requirement | Correct Example | Incorrect Example |
Prefix | Start with t_ | t_order_detail | OrderDetail |
Format | module_entity | t_user_address | t_user-address |
Prohibited | Must not start with system. | t_system_config | system.config |
Collection Sharding | Time suffix | t_log_202403 | t_log$202403 |
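The collection rules, including the time suffix used for sharded collections, can be sketched the same way. The helper names below are hypothetical, and the assumption that digits appear only in a six-digit `YYYYMM` suffix is inferred from the `t_log_202403` example rather than stated by the rules.

```javascript
// Hypothetical helpers illustrating the collection naming rules above.

// Accepts t_ followed by lowercase words joined by underscores, with an
// optional 6-digit time suffix (YYYYMM), as in t_log_202403.
function isValidCollectionName(name) {
  return /^t_[a-z]+(_[a-z]+)*(_\d{6})?$/.test(name);
}

// Builds the sharded collection name for the month containing `date` (UTC).
function monthlyCollectionName(base, date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${base}_${year}${month}`;
}
```

Deriving the suffix from a UTC date keeps every writer on the same collection regardless of the server's local time zone.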
// Recommended: Clear semantics and consistent style
{
  "_id": ObjectId("..."),
  "userName": "Zhang San",         // camelCase style
  "createTime": ISODate("..."),
  "orderItems": [...],
  "totalAmount": 199.00
}

// Not recommended: Confusing naming
{
  "_id": ObjectId("..."),
  "UN": "Zhang San",               // Unclear abbreviation
  "Create_Time": ISODate("..."),   // Mixed style
  "oi": [...],                     // Unclear meaning
  "_total": 199.00                 // Business fields starting with an underscore are prone to conflict with system fields.
}
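Part of the field naming convention shown above can be enforced automatically. The sketch below flags top-level field names that are not camelCase; `findNonCamelCaseFields` is a hypothetical name, and the check is deliberately narrow.

```javascript
// Hypothetical lint helper: returns the top-level field names of a document
// that do not follow camelCase. "_id" is allowed because MongoDB defines it.
function findNonCamelCaseFields(doc) {
  const camelCase = /^[a-z][a-zA-Z0-9]*$/;
  return Object.keys(doc).filter(
    (key) => key !== "_id" && !camelCase.test(key)
  );
}
```

A pattern check catches mixed styles such as `Create_Time` and leading underscores such as `_total`, but it cannot judge meaning: an unclear abbreviation like `oi` still passes and has to be caught in review.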
createTime, Create_Time, create_time, and CT were used to represent the creation time. Developers frequently misspelled field names, resulting in empty query results and increasing troubleshooting time from minutes to hours. After the naming convention was standardized, development efficiency improved significantly.
Scenario | Solution | Example |
Excessive Array Elements | Split into multiple documents | User posts: one document per post |
Storing Large Files | Using GridFS | Images, videos, and large logs |
Large Text Content | Compression at the business layer | Storing HTML content after compression |
Oversized Files | Using COS + URL referencing | Store files in COS and store URLs in MongoDB. |
// View the size of a single document
Object.bsonsize(db.collection.findOne({ _id: xxx }))

// View the average document size of a collection (unit: bytes)
db.collection.stats().avgObjSize
posts array of each user's document. After active users published thousands of posts, the document size exceeded 16 MB, preventing new posts from being written and causing users to complain about posting failures. The issue was completely resolved by changing to a design where each post is stored as a separate document linked by the user ID.

// Recommendation: Maintain a moderate nesting level (2-3 layers)
{
  "_id": ObjectId("..."),
  "orderId": "ORD202403001",
  "customer": {                              // Layer 1
    "name": "Zhang San",
    "contact": {                             // Layer 2
      "phone": "13800138000",
      "email": "zhangsan@example.com"
    }
  },
  "items": [
    { "productId": "P001", "quantity": 2 }   // Layer 1 (array)
  ]
}

// Not recommended: Excessively deep nesting
{
  "level1": {
    "level2": {
      "level3": {
        "level4": {
          "level5": {
            "data": "Too deep to be maintained"
          }
        }
      }
    }
  }
}
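Nesting depth can be measured rather than judged by eye. The sketch below counts layers the same way as the comments in the example above (a sub-document directly under the root is layer 1, and an array adds no layer of its own). The function names are hypothetical, and the code assumes documents are plain JavaScript objects, so driver types such as dates are treated as scalar values.

```javascript
// True only for plain objects; Date and other class instances count as scalars.
function isPlainObject(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype
  );
}

// Counts how many plain objects are stacked along the deepest path.
// Arrays are transparent: only the objects inside them add a layer.
function objectLayers(value) {
  if (Array.isArray(value)) {
    return Math.max(0, ...value.map((item) => objectLayers(item)));
  }
  if (isPlainObject(value)) {
    const children = Object.values(value).map((item) => objectLayers(item));
    return 1 + Math.max(0, ...children);
  }
  return 0;
}

// Hypothetical helper: nesting depth of a document, not counting the root itself.
function documentDepth(doc) {
  return objectLayers(doc) - 1;
}
```

Applied to the two examples above, the order document has depth 2 and the `level1` to `level5` document has depth 5, so a threshold check in a test suite would flag only the second.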
$set path, such as "a.b.c.d.e.f.g.value". Developers frequently wrote incorrect paths, causing configuration updates to fail and preventing the creation of effective indexes for deep fields. After the structure was flattened through refactoring, both configuration updates and queries became simple and efficient.
Consideration Factor | Prefer [Embedded] | Prefer [Referenced] |
Read Pattern | Data is always read together. | Data is frequently read individually. |
Total Data Volume | The child data volume is small and its scale is limited. | Large child data volume or an unlimited growth trend. |
Update Frequency | Child data is rarely updated independently. | Child data is frequently updated independently. |
Relationship Type | One-to-one, one-to-few | One-to-many, many-to-many |
Sharing | Child data belongs exclusively to a single parent document. | Child data is frequently shared by multiple documents. |
// Order + Order Items: Embedded (Always queried together; order items do not exist independently)
{
  "_id": ObjectId("..."),
  "orderId": "ORD202403001",
  "customerId": "C001",
  "items": [
    { "productId": "P001", "name": "Product A", "quantity": 2, "price": 99.00 },
    { "productId": "P002", "name": "Product B", "quantity": 1, "price": 199.00 }
  ],
  "totalAmount": 397.00,
  "status": "paid",
  "createTime": ISODate("2024-03-15T10:30:00Z")
}
// User + Article: Referenced (Articles are frequently queried independently and their number grows indefinitely)

// User Document
{
  "_id": ObjectId("user_001"),
  "userName": "Zhang San",
  "email": "zhangsan@example.com"
}

// Article Document (referencing the user via authorId)
{
  "_id": ObjectId("article_001"),
  "title": "MongoDB Best Practices",
  "authorId": ObjectId("user_001"),   // referencing the user
  "content": "...",
  "createTime": ISODate("...")
}
// ✅ Hybrid Mode: Redundantly embeds product names and prices (snapshots) in orders while preserving productId references.
{
  "orderId": "ORD001",
  "items": [
    {
      "productId": ObjectId("..."),      // reference (used to associate with the latest product information)
      "name": "Product A",               // redundant (snapshot at order placement to prevent product renaming from affecting historical orders)
      "price": NumberDecimal("99.00")    // redundant (price snapshot at order placement)
    }
  ]
}
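The hybrid mode amounts to copying selected fields at the moment the order is placed. A minimal sketch, assuming a hypothetical `toOrderItem` helper and product documents with `_id`, `name`, and `price` fields:

```javascript
// Hypothetical helper: builds an order item from a product document.
// productId stays a reference; name and price are copied as a snapshot.
function toOrderItem(product, quantity) {
  return {
    productId: product._id, // reference to the live product document
    name: product.name,     // snapshot taken when the order is placed
    price: product.price,   // snapshot taken when the order is placed
    quantity,
  };
}

// Illustrative data only; the price is kept as a string here, whereas the
// stored document would use NumberDecimal as in the example above.
const product = { _id: "P001", name: "Product A", price: "99.00" };
const item = toOrderItem(product, 2);
product.name = "Product A (renamed)"; // later catalog change
```

After the rename, `item.name` still holds the original name, which is the property that keeps historical orders stable.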
// Not recommended: unbounded, infinitely growing arrays
{
  "userId": "user_10001",
  "orders": [
    { "orderId": "ORD_001", "amount": 99.00, "date": ISODate("...") },
    { "orderId": "ORD_002", "amount": 158.00, "date": ISODate("...") },
    // ... Active users may accumulate tens of thousands of orders, triggering the 16MB limit.
  ]
}

// Recommended: Split array elements into separate documents and associate them via foreign keys.

// User Document
{
  "userId": "user_10001",
  "name": "Zhang San",
  "orderCount": 1024
}

// Order Document (Independent Collection)
{
  "orderId": "ORD_001",
  "userId": "user_10001",   // foreign key association
  "amount": 99.00,
  "date": ISODate("2024-03-15T10:30:00Z")
}
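For append-heavy data such as device readings, a bounded alternative to one document per element is the bucket pattern, with one document per device per hour. The sketch below only builds the filter and update objects for such an upsert; the function names, the field names (`deviceId`, `bucketStart`, `readings`, `count`), and the collection name in the usage note are all assumptions made for illustration.

```javascript
// Returns the start of the UTC hour that contains `date`.
function hourBucketStart(date) {
  const start = new Date(date.getTime());
  start.setUTCMinutes(0, 0, 0); // zero out minutes, seconds, milliseconds
  return start;
}

// Hypothetical helper: builds the filter and update for an upsert that
// appends one reading to the bucket for its hour, so that no single
// array keeps growing after the hour ends.
function toBucketUpsert(deviceId, reading) {
  return {
    filter: { deviceId, bucketStart: hourBucketStart(reading.time) },
    update: {
      $push: { readings: reading },
      $inc: { count: 1 },
    },
  };
}
```

In mongosh the result could be applied with something like `db.t_device_reading.updateOne(op.filter, op.update, { upsert: true })`; with an upsert, the equality fields in the filter are written into the new bucket document when it is first created.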
readings array of a device document. After the platform had been running for a year, the array for an active device contained hundreds of thousands of readings, and the document exceeded 16 MB, preventing new data from being written. After switching to the "bucket pattern" (one document per hour), the size of a single document stabilized below 100 KB.
Scenario | Recommended Type | Not Recommended Type | Potential Issue |
Date and time | Date | String | Cannot use native date operations and range query optimizations. |
Financial amount | Decimal128 | Double | Loss of floating-point precision, leading to discrepancies in financial reconciliation. |
Document primary key | ObjectId (default) | Random string | Non-incrementing random IDs scatter inserts across the index and cause frequent page splits, severely slowing write performance. The first 4 bytes of an ObjectId are a timestamp with one-second resolution, so values are roughly increasing and B-tree index writes concentrate at the tail, which avoids the page splits caused by random insertion. |
Large integer ID | NumberLong | String | Cannot perform numerical comparisons and range sorting. |
Status flag | String (enumeration value) | Numeric | Magic numbers have unclear meanings, making later maintenance difficult. |
// Correct Type Usage
{
  "_id": ObjectId("65f3a2b8c1d2e3f4a5b6c7d8"),     // ObjectId
  "orderId": NumberLong("20240315000001"),         // Large integer
  "amount": NumberDecimal("199.99"),               // Use Decimal128 for monetary amounts
  "createTime": ISODate("2024-03-15T10:30:00Z"),   // Use Date for dates
  "status": "paid"                                 // Use string enumeration for status
}

// Incorrect Type Usage
{
  "_id": "random-uuid-string",           // Random strings impact performance
  "orderId": "20240315000001",           // Strings cannot be sorted numerically
  "amount": 199.99,                      // Double has precision issues
  "createTime": "2024-03-15 10:30:00",   // Strings cannot be used for date arithmetic
  "status": 1                            // Numeric meaning is unclear
}
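The precision problem with Double can be reproduced in any JavaScript runtime, because JavaScript numbers are IEEE 754 doubles. The sketch below contrasts that with exact arithmetic on integer cents, which is used here only as a stand-in to illustrate what exact decimal storage buys; the helper names are hypothetical, and the input is assumed to be a non-negative decimal string.

```javascript
// Double arithmetic: 0.1 and 0.2 have no exact binary representation.
const doubleSum = 0.1 + 0.2; // 0.30000000000000004

// Hypothetical helper: parses a non-negative decimal string such as "199.99"
// into an exact integer number of cents.
function toCents(amount) {
  const [whole, fraction = ""] = amount.split(".");
  return BigInt(whole) * 100n + BigInt((fraction + "00").slice(0, 2));
}

// Formats an integer number of cents back into a two-decimal string.
function formatCents(cents) {
  const whole = cents / 100n;
  const fraction = String(cents % 100n).padStart(2, "0");
  return `${whole}.${fraction}`;
}

const exactSum = formatCents(toCents("0.10") + toCents("0.20")); // "0.30"
```

Inside MongoDB, Decimal128 plays the role that the integer cents play here: decimal amounts are stored exactly, so sums do not drift as they accumulate.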
0.1 + 0.2 yielded 0.30000000000000004, and after cumulative calculations, the discrepancy with bank reconciliation amounted to hundreds of CNY. After switching to Decimal128, calculations became precise to the cent, and reconciliation was completely accurate.

// Recommendation: Use the default ObjectId
{ "_id": ObjectId("65f3a2b8c1d2e3f4a5b6c7d8") }

// Acceptable: Custom incremental ID (must ensure the incremental characteristic)
{ "_id": NumberLong("20240315000001") }

// Prohibited: Random strings (impact write performance)
{ "_id": "550e8400-e29b-41d4-a716-446655440000" }
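The roughly increasing property of ObjectId comes from the timestamp stored in its first 4 bytes. The sketch below decodes that timestamp from the 24-character hex form; `objectIdTimestamp` is a hypothetical helper written in plain JavaScript, not a driver API.

```javascript
// Reads the creation time embedded in the first 4 bytes (8 hex characters)
// of an ObjectId given as a 24-character hex string.
function objectIdTimestamp(hex) {
  const seconds = parseInt(hex.slice(0, 8), 16); // seconds since the Unix epoch
  return new Date(seconds * 1000);
}

// The ObjectId used in the examples above.
const createdAt = objectIdTimestamp("65f3a2b8c1d2e3f4a5b6c7d8");
```

For the example ID, `createdAt.toISOString()` is `2024-03-15T01:22:00.000Z`. Because IDs generated later carry a larger leading value, new index entries land near the right-hand edge of the B-tree, which a random string cannot guarantee.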
validationLevel: "moderate" mode: Validates inserts and updates to documents that already satisfy the rules; existing documents that do not satisfy the rules are not checked when they are updated (suitable for legacy data migration scenarios).
collMod command to modify the validation rules of an existing collection.

// Create a collection with validation rules
db.createCollection("t_users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["userName", "email", "createTime"],
      properties: {
        userName: {
          bsonType: "string",
          minLength: 2,
          maxLength: 50,
          description: "Username, required, 2-50 characters"
        },
        email: {
          bsonType: "string",
          pattern: "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$",
          description: "Email, required, must conform to email format"
        },
        age: {
          bsonType: "int",
          minimum: 0,
          maximum: 150,
          description: "Age, optional, an integer from 0 to 150"
        },
        status: {
          enum: ["active", "inactive", "deleted"],
          description: "Status, enumeration value"
        },
        createTime: {
          bsonType: "date",
          description: "Creation time, required"
        }
      }
    }
  },
  validationLevel: "strict",   // strict: validates all writes
  validationAction: "error"    // error: rejects writes if validation fails
});
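For a collection that already exists, the same kind of validator can be attached with `collMod`. The sketch below only assembles the command document so that it can be inspected or reused; `buildCollModCommand` is a hypothetical helper, and the abbreviated schema is for illustration only.

```javascript
// Hypothetical helper: builds a collMod command that attaches a $jsonSchema
// validator to an existing collection in "moderate" mode.
function buildCollModCommand(collectionName, jsonSchema) {
  return {
    collMod: collectionName, // the command name must be the first key
    validator: { $jsonSchema: jsonSchema },
    validationLevel: "moderate",
    validationAction: "error",
  };
}

const command = buildCollModCommand("t_users", {
  bsonType: "object",
  required: ["userName", "email", "createTime"],
});
// In mongosh, the command would be sent with: db.runCommand(command)
```

Starting in "moderate" mode lets legacy documents that do not yet satisfy the rules remain updatable while they are cleaned up; the level can be raised to "strict" afterwards with a second `collMod`.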
price field lacked type constraints. Some entries stored numbers like 99.00, some stored strings like "99.00", and some even stored objects like {value: 99}. Price sorting and comparison became completely chaotic, and the "spend 100, get 20 off" promotion logic failed. After schema validation was added, dirty data was rejected for writing. After the existing data was cleaned, the feature returned to normal.
Impact | Description |
Long instance startup time | During engine startup, the metadata information of all collections needs to be loaded one by one. |
High memory consumption | The metadata of each collection persistently resides in the cache, occupying business memory. |
File handle consumption | Each collection corresponds to multiple underlying data files, making it easy to hit the system limit. |
Complex operations | The time required for routine operations such as status monitoring, data migration, and major version upgrades increases sharply. |
Backup timeout or failure | Traversing massive metadata can cause physical backup tasks to time out, and the backup may fail to complete at all. |
Check Item | Verification Method | Passing Criteria |
1. Naming Convention Consistency | Review all involved database/collection/field naming code. | Fully comply with the naming and prefix rules in this document. |
2. Document Size Controllability | Sample documents and run Object.bsonsize(doc). | Core documents should stay within 1 MB. |
3. Reasonable Nesting Depth | Review the JSON structure tree of core documents. | Maximum nesting depth does not exceed 3 to 5 layers. |
4. Avoiding Unbounded Arrays | Thoroughly review data write and append logic. | Arrays have a clear business upper limit, or use a bucketing pattern or a $slice truncation mechanism. |
5. Data Type Strictness | Review entity class field type definitions. | Dates use Date, and monetary amounts use Decimal128. |
6. Enabling Schema Validation | Check collection creation scripts or validator configurations. | Required fields, field types, and format validations are fully configured for core collections. |
7. Collection Count Limit (recommended maximum of 5,000 collections per instance) | Estimate the count and run show collections to confirm. | Estimated or actual number of business collections per database ≤ 100 |