How a Browser Works: A Beginner-Friendly Guide to Browsers

What a Browser Actually Is

At its core, a browser is a software application that acts as a translator. It takes complex code (HTML, CSS, and JavaScript) from a server and translates it into a visual, interactive interface. While we often think of it as a "window to the web," it is more like a highly specialized engine that executes code, manages security, and stores data locally.

Main Parts of a Browser

To keep things at a high level, think of the browser as a house:

  • The user interface (UI): → Everything you see: the address bar, the back/forward buttons, and the tabs. This is the living room, where the user interacts with the browser.

  • The browser engine: → The "manager" that coordinates actions between the UI and the rendering engine.

  • The rendering engine: → The artist who draws the website on the screen.

Browser Engine vs Rendering Engine

  • Browser engine: → It handles higher-level actions such as bookmarks and back/forward navigation, and tells the rendering engine what to display.

  • Rendering engine: → Its only job is rendering (displaying) a website. When you visit a page, it processes the HTML and CSS and paints the pixels on the screen. Popular rendering engines include Blink (Chrome/Edge) and Gecko (Firefox).
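The split between the two engines can be pictured with a toy model. This is only a sketch: the class names `BrowserEngine` and `RenderingEngine` and their methods are made up for illustration and do not correspond to any real browser's code. It assumes the browser engine's job is to track history and hand the current page to the rendering engine.

```python
class RenderingEngine:
    """Toy rendering engine: 'paints' a page by returning a string."""

    def render(self, url):
        return f"[painted {url}]"


class BrowserEngine:
    """Toy browser engine: tracks history and tells the renderer what to draw."""

    def __init__(self, renderer):
        self.renderer = renderer
        self.history = []
        self.position = -1  # index of the current page in history

    def _render_current(self):
        if not self.history:
            return ""  # nothing has been loaded yet
        return self.renderer.render(self.history[self.position])

    def navigate(self, url):
        # A new navigation discards any "forward" entries.
        self.history = self.history[: self.position + 1]
        self.history.append(url)
        self.position += 1
        return self._render_current()

    def back(self):
        if self.position > 0:
            self.position -= 1
        return self._render_current()

    def forward(self):
        if self.position < len(self.history) - 1:
            self.position += 1
        return self._render_current()


engine = BrowserEngine(RenderingEngine())
engine.navigate("a.example")
engine.navigate("b.example")
print(engine.back())  # the renderer is asked to paint a.example again
```

The UI buttons would simply call `back()` and `forward()`; the rendering engine never needs to know that a history exists.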

Networking: Fetching the Ingredients

To render a page, a browser needs files (HTML, CSS, and JavaScript):

  1. You type a URL.

  2. The browser uses its networking layer to contact the server behind that URL.

  3. The server sends back the HTML (the structure). The browser then requests the CSS (the style) and JavaScript (the behavior) that the HTML refers to.
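To make step 2 a little more concrete, here is a minimal sketch of the text a browser might send to a server when asking for a page. The function name `build_get_request` and the URL are placeholders chosen for this example, and the sketch assumes a simple URL without a username or password. A real browser does much more (looking up the server's address, opening a secure connection, sending many more headers), so treat this as an illustration of the idea only.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Build the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",       # what we want
        f"Host: {parts.netloc}",      # which site we want it from
        "Accept: text/html",          # we would like HTML back
        "Connection: close",
        "",                           # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


print(build_get_request("https://example.com/index.html?lang=en"))
```

The server's reply has the same shape: a status line, some headers, a blank line, and then the bytes of the HTML file.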

HTML Parsing and DOM Creation, CSS Parsing and CSSOM Creation

HTML is the skeleton of your website, and CSS gives that skeleton its skin colour and muscle. When the browser receives the HTML and CSS from the server, it starts parsing them. Parsing HTML is a multi-step process that creates the Document Object Model (DOM); parsing CSS similarly produces the CSS Object Model (CSSOM). The DOM, combined with the CSSOM, is what the browser uses to draw the page.

HTML parsing happens in a sequence of steps:

  1. Byte Stream to Characters: The browser receives raw bytes of HTML data from the network or local disk. It uses the specified character encoding (e.g., UTF-8) to convert these bytes into individual Unicode characters.

  2. Tokenization: The stream of characters is then passed through a tokenizer, a finite state machine that breaks it down into meaningful pieces called tokens. These tokens represent specific elements like start tags (<div>), end tags (</div>), attribute names and values, comments, and text content.

  3. Tree Construction (DOM Creation): The tokens are sent to the tree construction stage. The browser processes these tokens to build a hierarchical, tree-like structure called the Document Object Model (DOM). Each HTML element becomes a node in the tree, and their relationships (parent, child, sibling) are established.
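The three steps above can be sketched with Python's standard `html.parser` module. This is a simplified illustration, not how a real browser is implemented: `HTMLParser` plays the role of the tokenizer, while `Node`, `TreeBuilder`, `parse`, and `dump` are names invented for this example. It also assumes well-formed input and ignores attributes and tags that have no closing tag (such as `<br>`).

```python
from html.parser import HTMLParser


class Node:
    """A toy DOM node: a tag name (or '#text'), optional text, and children."""

    def __init__(self, name, text=""):
        self.name = name
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Receives tokens from HTMLParser and arranges them into a Node tree."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.open_elements = [self.root]  # stack of elements not yet closed

    def handle_starttag(self, tag, attrs):
        # Start-tag token: add a child to the innermost open element.
        node = Node(tag)
        self.open_elements[-1].children.append(node)
        self.open_elements.append(node)

    def handle_endtag(self, tag):
        # End-tag token: close the nearest matching open element, if any.
        for i in range(len(self.open_elements) - 1, 0, -1):
            if self.open_elements[i].name == tag:
                del self.open_elements[i:]
                break

    def handle_data(self, data):
        # Text token: skip whitespace-only text to keep the tree small.
        if data.strip():
            self.open_elements[-1].children.append(Node("#text", data.strip()))


def parse(raw, encoding="utf-8"):
    text = raw.decode(encoding)   # step 1: bytes to characters
    builder = TreeBuilder()
    builder.feed(text)            # steps 2 and 3: tokens, then tree
    builder.close()
    return builder.root


def dump(node, depth=0):
    """Return the tree as a list of indented lines."""
    label = f'"{node.text}"' if node.name == "#text" else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


page = b"<html><body><h1>Hello</h1><p>Hi <b>there</b></p></body></html>"
print("\n".join(dump(parse(page))))
```

Printing the result shows the parent, child, and sibling relationships as indentation: `h1` and `p` sit side by side under `body`, and `b` is nested inside `p`.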

How DOM and CSSOM Come Together

The browser can't display a skeleton or a blueprint alone. It combines them to create the Render Tree:

  • The DOM says: "There is a heading here."

  • The CSSOM says: "Headings are blue and 24px."

  • The Render Tree combines them: "There is a blue, 24px heading here."
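The heading example can be written out as a small sketch. Everything here is a simplification made for illustration: `DomNode`, `RenderNode`, `build_render_tree`, and `describe` are invented names, the CSSOM is assumed to be a plain dictionary keyed by tag name, and real CSS features such as classes, inheritance, and hidden elements are left out.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    text: str
    style: dict
    children: list = field(default_factory=list)


def build_render_tree(dom, cssom):
    """Pair each DOM node with the style rules that match its tag."""
    style = dict(cssom.get(dom.tag, {}))  # copy so nodes don't share a dict
    node = RenderNode(dom.tag, dom.text, style)
    for child in dom.children:
        node.children.append(build_render_tree(child, cssom))
    return node


def describe(node):
    """Summarise one render node as 'tag: property=value, ...'."""
    if not node.style:
        return f"{node.tag}: (no matching rules)"
    rules = ", ".join(f"{prop}={value}" for prop, value in node.style.items())
    return f"{node.tag}: {rules}"


# The DOM says: "There is a heading here."
dom = DomNode("body", children=[DomNode("h1", "Welcome"), DomNode("p", "Some text")])

# The CSSOM says: "Headings are blue and 24px."
cssom = {"h1": {"color": "blue", "font-size": "24px"}}

render_tree = build_render_tree(dom, cssom)
print(describe(render_tree.children[0]))  # h1: color=blue, font-size=24px
```

Each render node now carries both pieces of information, what the element is and how it should look, which is what the browser needs before it can paint anything.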