Rishi
2 min read · Dec 25, 2021

Today’s #technology buzzword is “scale”. Everyone talks about it, from #investors and #marketing folks to #developers, #architects, and the #cto.

If smaller companies think scale only applies to bigger organisations, they are overestimating what “scale” means; if big companies think they have done everything for scale, they are underestimating it.

In the technology world, as in general, #scaling is a continuous process that starts with designing the system and runs through production deployment and maintenance. Every system is naturally built for some scale, but we can't rely on luck, so we need to identify our system's capacity. #performancetesting and an exhaustive #logging system will be the saviours.

To make a system scalable, #architects and #engineers need to think about every service and scale each one individually; there is no such thing as a single scalable system.

I have built systems that scaled from 1 thousand to 1 million to 1 billion requests without any major downtime. Below are my key takeaways.

1. One of the most hyped ways to achieve scalability is to put a #loadbalancer at the gateway level. Unfortunately, that solves just 1% of the problem; to get to 100%, each service needs its own load balancer.

2. Keep your system horizontally scalable (many small servers) as much as you can, rather than vertically scalable (one big server).

3. Make sure every service call completes within a time limit, and always implement a timeout.

4. Time-heavy operations like #payments and order placement should be implemented asynchronously.

5. Within the same network, try to avoid authentication between services, and keep each service's infra independent of the others; this reduces cascading failures.

6. Do capacity planning regularly. The number of requests, the size of each request, and the time taken by each transfer are the key #metrics to track.

7. Use caching as much as you can at each service level, but remember to invalidate unused cache entries regularly, as per business rules, using algorithms like LRU, #lifo, #fifo, etc.

8. Most of the time, 3rd-party services are the major culprits that break scale; keep your #sla ready.

9. No one can predict when a system will break, hence #resiliency is very important.

10. Use a mix of #rdbms and #nosql databases, and distribute your queries to read and write clusters accordingly.

11. For search queries, avoid database full-text search; use search engines instead.

12. Log as much as you can, but a poor, #synchronous logging system can sometimes become the bottleneck. Getting the right logging system is as important as building the actual system.
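To make takeaway 3 concrete, here is a minimal Python sketch of putting a time limit on a service call. The names `call_with_timeout` and `slow_inventory_lookup`, and the fallback value, are made up for illustration. Real HTTP clients usually have their own timeout setting, which is generally the better place to enforce the limit.

```python
import concurrent.futures
import time

# Shared worker pool; the size of 4 is an arbitrary assumption.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def call_with_timeout(fn, timeout_s, *args, fallback=None):
    """Run fn(*args), but stop waiting after timeout_s seconds."""
    future = _POOL.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # cancel() only helps if the call has not started yet;
        # a call that is already running keeps going in the background.
        future.cancel()
        return fallback

def slow_inventory_lookup(item_id):
    """Hypothetical stand-in for a slow downstream service."""
    time.sleep(0.5)
    return {"item": item_id, "in_stock": True}

# The caller gets an answer after about 0.05 s instead of waiting 0.5 s.
answer = call_with_timeout(slow_inventory_lookup, 0.05, "sku-1", fallback="unavailable")
```

The caller is protected from a slow dependency, but the slow call itself is not killed, so the downstream service still needs its own limits.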
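Takeaway 4 can be sketched with an in-process queue and a background worker. This is only an illustration: `place_order`, `process_payment`, and the in-memory `results` dict are hypothetical, and a production system would more likely use a durable message broker so that jobs survive a restart.

```python
import queue
import threading

orders = queue.Queue()
results = {}  # in-memory result store, for illustration only

def process_payment(order_id, amount):
    """Hypothetical stand-in for a slow payment gateway call."""
    return f"charged {amount} for {order_id}"

def worker():
    # Pull jobs off the queue one by one until a None sentinel arrives.
    while True:
        job = orders.get()
        if job is None:
            orders.task_done()
            break
        order_id, amount = job
        results[order_id] = process_payment(order_id, amount)
        orders.task_done()

def place_order(order_id, amount):
    """Accept the order immediately; the heavy work happens later."""
    orders.put((order_id, amount))
    return {"order_id": order_id, "status": "accepted"}

threading.Thread(target=worker, daemon=True).start()

response = place_order("o-1", 250)  # returns right away
orders.join()                       # only here to wait for the demo job
```

The user-facing call returns as soon as the job is queued, so a slow payment provider does not hold the request open.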
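For takeaway 6, the three metrics can be turned into rough capacity numbers with simple arithmetic. The sketch below assumes steady traffic and uses Little's law (requests in flight = arrival rate × average time per request); the function name and the sample numbers are made up for illustration.

```python
def capacity_estimate(requests_per_s, avg_latency_s, avg_payload_bytes):
    """Return (requests in flight, bytes per second) for steady traffic."""
    in_flight = requests_per_s * avg_latency_s        # Little's law
    bandwidth = requests_per_s * avg_payload_bytes    # payload traffic only
    return in_flight, bandwidth

# Hypothetical load: 2000 requests/s, 0.25 s each, 20,000 bytes per request.
in_flight, bandwidth = capacity_estimate(2000, 0.25, 20_000)
```

With these sample numbers the system has to hold about 500 requests at once and move about 40 MB of payload per second, which gives a starting point for sizing workers and network.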
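Takeaway 7 mentions LRU. As a minimal sketch of what LRU eviction means, here is a small cache built on Python's `OrderedDict`. The class name `LRUCache` and the capacity of 2 are assumptions for the example; a shared cache across servers would need a dedicated cache service instead.

```python
from collections import OrderedDict

class LRUCache:
    """Keeps at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.put("c", 3)  # evicts "b", the least recently used
```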
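On takeaway 12, one way to keep logging off the request path in Python is the standard library's `QueueHandler` and `QueueListener`. This is a sketch, not a recommendation of a specific setup: the logger name `orders` is made up, and the `StringIO` sink stands in for a slow file or network destination.

```python
import io
import logging
import logging.handlers
import queue

log_queue = queue.Queue(-1)  # unbounded queue between app and log writer

# The application only pays the cost of putting a record on the queue.
logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.QueueHandler(log_queue))
logger.propagate = False

# A background thread drains the queue into the real (possibly slow) sink.
sink = io.StringIO()  # stand-in for a file or network destination
listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler(sink))
listener.start()

logger.info("order %s accepted", "o-1")

listener.stop()  # processes anything still queued, then stops the thread
```

The request thread never waits on the sink itself, so a slow disk or log server delays the log lines and not the user.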

I would be happy to hear about your experience as well. Let me know in the comments.


Rishi

A passionate, business-oriented technology leader with 15+ years of experience. I love to build products and grow them to millions.