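Principles to Follow When Designing Distributed Systems

Complex system design should follow extremely simple basic principles. The following design principles often need to be considered when designing complex systems: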

  • Modularity
  • Efficiency
  • Scalability
  • Extensibility
  • Service/object oriented
  • Prioritizing user interactive requests
  • Partition and separation
  • Design as simple as needed

Aspects to Consider When Designing Distributed Systems

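  • When you need to handle X requests concurrently, imagine how the system would behave under 100X or even 100,000X requests
  • Parallelizing computation and data reads/writes means pushing the utilization of computing resources as high as possible, i.e. optimizing the architecture under given constraints
  • Once you go parallel, you face two major problems: consistency and failure handling
  • When designing a complex system, think carefully about the requirements of each functional module and the dependencies among them
  • Load balancing (routing), replication, caching, indexing, message queues, and proxies are commonly used system design techniques
  • Back-of-the-envelope calculation is very helpful in the early design stage
  • Design each service to be as independent as possible, reduce dependencies between services, and keep the overall design modular
  • Focus the design effort on the protocol rather than on the communication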
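
As an illustration of a back-of-the-envelope calculation, and of re-running it at 100X load, here is a minimal sketch. The function name `estimate` and every input number (daily requests, peak factor, bytes per request, per-server throughput) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_requests, peak_factor, bytes_per_request, qps_per_server):
    """Rough capacity estimate; all inputs are assumed, not measured."""
    avg_qps = daily_requests / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        # Round up: a fraction of a server still means one more machine.
        "servers": math.ceil(peak_qps / qps_per_server),
        "storage_gb_per_day": daily_requests * bytes_per_request / 1e9,
    }


# Today's load (hypothetical) and the same system at 100X.
today = estimate(86_400_000, peak_factor=3, bytes_per_request=1_000, qps_per_server=500)
at_100x = estimate(86_400_000 * 100, peak_factor=3, bytes_per_request=1_000, qps_per_server=500)
print(today)
print(at_100x)
```

Even this crude estimate shows which resource (servers, storage, bandwidth) becomes the bottleneck first as the load grows.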

Some Common Design Patterns

Figure. 1 A typical load balancer tries to pick a suitable worker to process each request. We normally expect the dispatcher to follow some geographical pattern so that users perceive little latency. For simple cases, a very basic round-robin method can also be used.
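
The round-robin variant can be sketched in a few lines. The class name `RoundRobinBalancer` and the worker names are hypothetical; a real dispatcher would also track worker health and, as noted above, client location.

```python
import itertools


class RoundRobinBalancer:
    """Hands out workers in a fixed rotating order."""

    def __init__(self, workers):
        workers = list(workers)
        if not workers:
            raise ValueError("at least one worker is required")
        self._cycle = itertools.cycle(workers)

    def pick(self):
        # Each call returns the next worker, wrapping around at the end.
        return next(self._cycle)


balancer = RoundRobinBalancer(["worker-a", "worker-b", "worker-c"])
assignments = [balancer.pick() for _ in range(5)]
print(assignments)
```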

Figure. 2 The dispatcher multicasts the request to all workers, aggregates their responses, and sends the result back to the requester.
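
A minimal sketch of this multicast-and-aggregate flow, using threads from the standard library to stand in for remote workers. The names `scatter_gather`, `make_worker`, and `merge`, as well as the shard contents, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(request, workers, aggregate):
    """Send the same request to every worker, then combine the responses."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(worker, request) for worker in workers]
        responses = [f.result() for f in futures]  # wait for all workers
    return aggregate(responses)


def make_worker(shard):
    # Each worker searches only its own shard of the data.
    return lambda prefix: [word for word in shard if word.startswith(prefix)]


def merge(responses):
    return sorted(word for partial in responses for word in partial)


workers = [make_worker(["apple", "avocado"]), make_worker(["banana", "apricot"])]
matches = scatter_gather("a", workers, merge)
print(matches)
```

Waiting for every worker keeps the sketch simple; with this approach the slowest worker determines the overall latency.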

Figure. 3 Adding a cache can speed up large-scale systems considerably.
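
One possible shape for such a cache is a read-through cache with least-recently-used eviction, sketched below. The class `ReadThroughCache`, its capacity, and the `slow_lookup` backend are assumptions made for illustration; the figure does not prescribe an eviction policy.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Serves repeated keys from memory and falls back to `fetch` on a miss."""

    def __init__(self, fetch, capacity=128):
        self._fetch = fetch
        self._capacity = capacity
        self._store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._fetch(key)  # the expensive call to the backend
        self._store[key] = value
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the least recently used
        return value


backend_calls = []


def slow_lookup(key):
    backend_calls.append(key)  # record every trip to the backend
    return key * 2


cache = ReadThroughCache(slow_lookup, capacity=2)
values = [cache.get(k) for k in (1, 1, 2, 3, 1)]
print(values, cache.hits, cache.misses, backend_calls)
```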

Figure. 4 The principle is very similar to that of a parameter server, where the parameters are intermediate results from learning.

Figure. 5 This pattern follows the design of a message queue: events are enqueued and responses are dequeued.
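
The enqueue/dequeue flow can be sketched with two in-process queues and a single consumer thread. The names `consume` and `process_events` are hypothetical, and a production system would use a dedicated message broker rather than `queue.Queue`.

```python
import queue
import threading

_STOP = None  # sentinel telling the consumer to shut down


def consume(requests, responses, handler):
    """Dequeue events one by one and enqueue the handler's response."""
    while True:
        event = requests.get()
        if event is _STOP:
            break
        responses.put(handler(event))


def process_events(events, handler):
    events = list(events)
    requests, responses = queue.Queue(), queue.Queue()
    consumer = threading.Thread(target=consume, args=(requests, responses, handler))
    consumer.start()
    for event in events:
        requests.put(event)  # the producer only enqueues, it never calls the handler
    requests.put(_STOP)
    consumer.join()
    return [responses.get() for _ in events]


squares = process_events([1, 2, 3], lambda x: x * x)
print(squares)
```

The producer and the consumer share nothing but the queues, so either side can be scaled or replaced independently.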

Figure. 6 Very popular map-reduce pattern
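
A single-process sketch of the map, shuffle, and reduce phases, using word counting as the example job. The function names are hypothetical; a real framework would run the map and reduce calls on different machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    return key, sum(values)


def map_reduce(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = map_reduce(["a b a", "b c"])
print(counts)
```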

Figure. 7 Bulk Synchronous Parallel
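
A toy sketch of the superstep structure (local computation, communication, barrier synchronization), with threads standing in for workers. The example job, propagating the global maximum around a ring, and the name `bsp_max` are assumptions chosen for illustration.

```python
import threading


def bsp_max(values):
    """After n-1 supersteps every worker in the ring holds the global maximum."""
    n = len(values)
    if n == 0:
        return []
    current = list(values)   # current[i] is owned by worker i
    inbox = [None] * n       # inbox[i] holds the message sent to worker i
    barrier = threading.Barrier(n)

    def worker(i):
        for _ in range(n - 1):
            # Communication: send the local value to the right-hand neighbour.
            inbox[(i + 1) % n] = current[i]
            barrier.wait()   # synchronization: all messages are delivered
            # Local computation on the data received in this superstep.
            current[i] = max(current[i], inbox[i])
            barrier.wait()   # nobody starts the next superstep early

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return current


result = bsp_max([3, 9, 1, 4])
print(result)
```

The barriers are what make the pattern easy to reason about: within a superstep no worker reads data that another worker is still writing.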

Figure. 8 The Execution Orchestrator pattern relies on an intelligent scheduler / orchestrator that schedules ready-to-run tasks (based on a dependency graph) across a cluster of dumb workers. This pattern is used in Microsoft's Dryad project.
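
The scheduling idea can be sketched with Python's standard `graphlib.TopologicalSorter` (Python 3.9+) and a thread pool acting as the dumb workers. This is not Dryad's API; the function `orchestrate`, the task names, and the convention that each task receives a dict of its dependencies' results are all assumptions made for illustration.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def orchestrate(tasks, deps, max_workers=4):
    """tasks: name -> callable(inputs); deps: name -> set of prerequisite names."""
    sorter = TopologicalSorter({name: deps.get(name, ()) for name in tasks})
    sorter.prepare()  # raises CycleError if the dependency graph has a cycle
    results, running = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Dispatch every task whose prerequisites have all finished.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, ())}
                running[pool.submit(tasks[name], inputs)] = name
            finished, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in finished:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock downstream tasks
    return results


tasks = {
    "a": lambda inputs: 1,
    "b": lambda inputs: 2,
    "c": lambda inputs: inputs["a"] + inputs["b"],
    "d": lambda inputs: inputs["c"] * 10,
}
deps = {"c": {"a", "b"}, "d": {"c"}}
outputs = orchestrate(tasks, deps)
print(outputs)
```

All knowledge of the graph lives in the orchestrator; the workers only run whatever callable they are handed, which is what keeps them "dumb".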
