How OnGraph Architects Scalable Applications Used by Millions of Users

By ongraph
April 6, 2018


Application development is an extensive process. Months of effort, time, and money go into converting a fresh idea into a marketable product, and the journey is filled with ups and downs. So when you build a successful app, the last thing you want is for it to hit a dead end. But all too often, that’s exactly what happens. It may seem like a good problem to have: so much growth that you can barely handle it. After all, that means there is a ton of demand for your app. But if your app isn’t built to handle that demand, it’s doomed from the outset. Can your app serve thousands, or even hundreds of thousands, of users? When will it start to break? This is why scalability matters.

At OnGraph, we help our clients grow their business the smart way. We guide them through the essential considerations and the tech stack that play a significant role in developing scalable websites.

Facebook and other social networking websites have billions of users. Millions of them access the site at once and expect to reach their newsfeed quickly, for both reads and writes. The fact that they can do so is scalability in action.

Google and Yahoo, two of the world’s top search engines, are scaled so well that millions of users around the world access them at once without the load becoming a problem.

Many enterprises run large, heavily loaded, scalable websites that will never reach the size of Google, Yahoo, Facebook and the like. Still, every website development plan should anticipate a continuous increase in users and load, so that when it occurs, the correct scaling architecture is already in place to support it.

Before any design architecture is created, application scalability fundamentals need to be decided.

This is a significant step in application development, because the key to scaling an application is distributing the load across multiple servers, enabling parallel processing.

A single server can perform an ample amount of work, but eventually you exceed its limit. We increase capacity through scaling up and scaling out. When we begin scaling, we first identify and understand the actual bottleneck or choke-point in the application.

This is essential for understanding the real limitation of the server; otherwise you could make the wrong choice when attempting to scale the application.

A computer server has three basic components (CPU, memory, and disk), so the first place to look is a system monitoring tool. In a public cloud, the vendor provides such tools; if you run your own data center, there are many options, both open source and commercial.

Detecting Performance Bottleneck

A few common issues point the way to a correct diagnosis of performance problems. High CPU usage is the most commonly recognized performance bottleneck, and it is often the easiest to resolve, in some cases simply by upgrading your server. Despite its apparent simplicity, however, high CPU utilization can be misleading. It can indicate several different causes: user CPU, which means the CPU is doing productive work and can be resolved with an upgrade; system CPU, which is usage consumed by the operating system and is typically software related; and I/O wait, usually caused by the CPU waiting on the I/O subsystem.

Low memory is another common bottleneck. If your server does not have enough memory to handle your application load, performance can suffer dramatically. Memory is typically around 100 times faster than disk, so if you run low on physical memory, the application slows to a crawl. Most operating systems (Linux in particular) automatically resort to memory swapping when this happens. Sometimes low memory only calls for a RAM upgrade, but it can also point to a memory leak, which requires finding the leak and fixing it in the application code.

High disk usage is the third most commonly encountered bottleneck. Of the three, it is the strongest indicator of the need for scaling. The most frequent cause is maxed-out disk writes, and while one option is to buy a faster, more expensive disk, the stronger solution is scaling.
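The three bottlenecks above can be turned into a rough triage routine. The following is a minimal sketch, not a production monitor: the `ServerSample` fields, the `diagnose` function, and the 80% threshold are all hypothetical names and values chosen for illustration, and the readings are assumed to come from whatever monitoring tool you already use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerSample:
    """One reading from a monitoring tool; all values are percentages (0-100)."""
    cpu_user: float     # time spent running application code
    cpu_system: float   # time spent inside the operating system
    cpu_iowait: float   # time the CPU sat waiting on the I/O subsystem
    memory_used: float  # physical memory in use
    swap_used: float    # swap space in use
    disk_busy: float    # share of time the disk was servicing requests


def diagnose(sample: ServerSample, high: float = 80.0) -> List[str]:
    """Return likely bottlenecks; `high` is an assumed threshold, not a standard."""
    findings = []
    cpu_total = sample.cpu_user + sample.cpu_system + sample.cpu_iowait
    if cpu_total >= high:
        # High CPU is ambiguous, so report which component dominates.
        dominant = max(
            ("user", sample.cpu_user),
            ("system", sample.cpu_system),
            ("iowait", sample.cpu_iowait),
            key=lambda pair: pair[1],
        )[0]
        findings.append(f"cpu:{dominant}")
    if sample.memory_used >= high and sample.swap_used > 0:
        # Nearly full memory plus active swap suggests low physical memory.
        findings.append("memory:swapping")
    if sample.disk_busy >= high:
        findings.append("disk:saturated")
    return findings
```

A sample with high I/O wait and a busy disk would be reported as `cpu:iowait` together with `disk:saturated`, which matches the point above that high CPU alone does not tell you what to upgrade.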

Scalability and Reliability

When you need to scale your application, another major factor to consider is how scalability and reliability are interlinked. Keep in mind that when you scale your application architecture using a distributed environment for parallel computing, you are adding failure points. Accordingly, all components should be designed redundantly to guarantee 24×7 always-on operation.

As you will see in subsequent sections of the book, this is often one of the most challenging aspects of configuring and managing a scalable application environment, especially when versatility is the goal.
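The link between added failure points and redundancy can be made concrete with a little arithmetic. This is a simplified sketch that assumes every component fails independently and has the same availability; the function names are illustrative, and real systems rarely satisfy the independence assumption exactly.

```python
def series_availability(component_availability: float, components: int) -> float:
    """Availability of a chain in which every one of `components` must be up."""
    return component_availability ** components


def redundant_availability(component_availability: float, replicas: int) -> float:
    """Availability of a group in which at least one of `replicas` must be up."""
    return 1.0 - (1.0 - component_availability) ** replicas
```

Under these assumptions, three components in a chain that are each available 99% of the time give about 97.03% overall, while two redundant copies of a 99% component give about 99.99%. Each extra required component lowers availability, and each redundant copy raises it.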


Scaling up refers to vertical scaling, and is summed up in one simple phrase: Buy a Bigger Box.

This is easy enough: simply get a faster server with more powerful processors, more memory, or perhaps faster disks, depending on what your application’s specific choke-point is.

Scaling up to a bigger server

Scaling up to a bigger server works fine in many instances, but it may only be a short-term fix. Everything depends on the real source of your performance issues. For example, if you are CPU-bound, scaling up is an easy and effective solution. However, if you are bound by disk I/O, and particularly by disk writes, vertical scaling is unlikely to resolve the issue.


Horizontal scaling, also known as scaling out, means adding servers for your application and operating a distributed environment capable of parallel computing. When done right, scaling out is a long-term answer for almost any application performance issue; it is a genuine and permanent fix. However, going from a simple monolithic computing environment to a scalable cluster is a major move, so don’t underestimate the task or the amount of knowledge you should have under your belt before you head in this direction.

Scaling out to a distributed cluster

To manage a distributed cluster, we have an increasing number of scalable options at our disposal, and these options and how they work are the primary focus of this book.

A Typical Online Application Architecture

To scale an application, you must understand its design, specifically the common tiers used in today’s online, interactive applications.

The Load Balancer Tier

We start with the load balancer. Incoming traffic from end-user web browsers typically hits a load balancer at the first tier. The job of the load balancer is exactly what its name implies: it balances (or distributes) the load of service requests across the next tier down, the application server tier. Load balancers come in different types; some are software-based while others are hardware appliances. If your application runs in the public cloud, the cloud vendor likely has a load balancer service you can take advantage of. Whichever type you use, the functionality is basically the same.

Usually, load balancers do not store any state, i.e. data about the user, information specific to the application, or the status of a particular action performed by the application. Consequently, they are said to be stateless. There are a few exceptions to this, depending on the routing method supported by the load balancer. But even in these exceptional cases, the state data is minimal and can easily be reconstructed when required.

Different load balancers offer different rules for distributing the load. Round robin routing and sticky routing are the two most commonly used methods. Round robin routing sends inbound requests one at a time to the next application server for processing, so that requests are serviced in an even amount of time. The disadvantage of this method is that the individual requests of a given user session may be handled by different application servers. Sticky routing instead assigns a user session to one application server and routes subsequent requests to that original server for processing, enabling the application to retain session state, the data about the user and their activities. Sticky routing is very useful in some situations, and session state can ultimately yield a better experience for the end user.
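The difference between the two routing methods is easiest to see side by side. The sketch below is an illustrative toy, not how any particular load balancer product is implemented; the class names, the `route` method, and the server names are hypothetical.

```python
from itertools import cycle
from typing import Dict, List


class RoundRobinRouter:
    """Hands each request to the next server in turn and keeps no session state."""

    def __init__(self, servers: List[str]) -> None:
        self._servers = cycle(servers)

    def route(self, session_id: str) -> str:
        # The session ID is ignored: consecutive requests from the same
        # session can land on different servers.
        return next(self._servers)


class StickyRouter:
    """Pins each session to the server that handled its first request."""

    def __init__(self, servers: List[str]) -> None:
        self._fallback = RoundRobinRouter(servers)
        # The only state kept: a small session-to-server map that can be
        # rebuilt from scratch if it is lost.
        self._assignments: Dict[str, str] = {}

    def route(self, session_id: str) -> str:
        if session_id not in self._assignments:
            self._assignments[session_id] = self._fallback.route(session_id)
        return self._assignments[session_id]
```

With two servers, the round robin router alternates between them no matter who is asking, while the sticky router sends every request from one session to the same server. The sticky router's assignment map is the minimal, easily reconstructed state that makes such a load balancer an exception to the stateless rule.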

The Application Server Tier

The application server is where most of the application logic and processing lives. This includes everything from the user interface (UI) itself to login authentication to the handling of user requests. When discussing the application server, we describe it as the tier that processes incoming or interactive requests from some kind of UI.

Some applications separate the UI from the application logic, adding an extra application service tier for processing application transactions or requests. This is the ideal way to organize the application server tier, but in some languages, such as PHP, the UI and service tier can be combined if desired.

With either architecture for this tier, the purpose of the application server is to process requests made by the user through the UI.

Today’s application servers support applications written in many programming languages, including PHP, Java, Ruby, Python, and numerous others. Over time, various application frameworks have been built for these languages that make the work easier and faster.

Often the application server isn’t required to store state and can be viewed as stateless for our purposes. The notable exception, of course, is the session state described previously. Session state should be limited to a relatively small amount of data, and in the event a user session is lost, it can easily be reestablished or rebuilt when the user starts another session.
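To illustrate why small, rebuildable session state keeps an application server effectively stateless, here is a minimal sketch. The `SessionStore` class and its methods are hypothetical, and an in-memory dictionary stands in for whatever session mechanism your framework provides.

```python
from typing import Dict


class SessionStore:
    """Keeps a small amount of per-session data in the server's memory."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, str]] = {}

    def get(self, session_id: str, user_id: str) -> Dict[str, str]:
        """Return the session, rebuilding a fresh one if it is missing."""
        session = self._sessions.get(session_id)
        if session is None:
            # Nothing permanent lives here, so a lost session is simply
            # recreated from the minimum needed to identify the user.
            session = {"user_id": user_id}
            self._sessions[session_id] = session
        return session

    def drop_all(self) -> None:
        """Simulate a server restart that loses every session."""
        self._sessions.clear()
```

Anything that must survive a restart belongs in the database tier rather than in the session; here, losing the store costs only the small working data attached to each session.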

The Database Tier

Let’s get down to the bedrock of the application architecture, the database tier. The database is stateful by its very definition: its function is to store application state. The state (application data) stored in a database has to be a permanent record of the activities performed by end users; otherwise, the application servers, and the application itself, can’t work.

The database tier can store gigabytes to terabytes, and for a few companies even petabytes. In other words, a database can contain a ton of state. Hence, scaling the database tier for Big Data means accommodating large data sets while supporting a variety of high-volume application requirements.

How to Scale a Traditional Online Application

Now we get to the heart of the matter: scaling a real-world application. The answer depends on the tier you are evaluating in your application architecture, as well as the type of application requirement. It is also essential to consider the reliability of your scaling approach.
