How a Browser Works: A Beginner-Friendly Guide to Browser Internals
What a Browser Actually Is
At its core, a browser is a software application that acts as a translator. It takes complex code (HTML, CSS, and JavaScript) from a server and translates it into a visual, interactive interface. While we often think of it as a "window to the web," it is more like a highly specialized engine that executes code, manages security, and stores data locally.
Main Parts of a Browser
To keep things high-level, think of the browser as a house:
The user interface (UI): → Everything you see: the address bar, the back/forward buttons, and the tabs. This is the living room, where the user interacts with the browser.
The browser engine: → The "manager" that coordinates actions between the UI and the rendering engine.
The rendering engine: → The artist who draws the website on the screen.
Browser Engine vs Rendering Engine
Browser engine: → It handles high-level actions such as loading a URL and back/forward navigation, passing them from the UI to the rendering engine.
Rendering engine: → Its only job is rendering (displaying) a website. When you visit a page, it parses the HTML and CSS and paints the pixels on the screen. Popular rendering engines include Blink (Chrome and Edge) and Gecko (Firefox).
Networking: Fetching the Ingredients
To render a page, a browser needs files (HTML, CSS, and JavaScript). Here is how it gets them:
You type a URL.
The browser uses its networking layer to contact the server.
The server sends back a "package" containing HTML (the structure), CSS (the style), and JavaScript (the behavior).
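To make the request and the "package" a little more concrete, here is a minimal sketch in Python of what travels over the wire. It is only an illustration: the function names `build_get_request` and `split_response` are made up for this example, the response is hard-coded so nothing touches the network, and a real browser also handles DNS lookups, TLS, caching, redirects, and much more.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request for a URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.netloc}",
        "Accept: text/html",
        "Connection: close",
    ]
    # A blank line marks the end of the request headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def split_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# Hypothetical response, hard-coded so the example runs without a network.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)

request_text = build_get_request("https://example.com/index.html")
status, headers, body = split_response(SAMPLE_RESPONSE)
print(request_text.splitlines()[0])
print(status, headers["content-type"], body)
```

The body is still raw bytes at this point. Turning those bytes into something the browser can draw is the job of the parsing steps.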
HTML Parsing and DOM Creation, CSS Parsing and CSSOM Creation
HTML is the skeleton of your website, and CSS gives that skeleton its skin, colour, and muscle. When the browser receives the HTML and CSS files from the server, it starts parsing them. Parsing HTML is a multi-step process that creates the Document Object Model (DOM); parsing CSS produces the CSS Object Model (CSSOM). The DOM, combined with the CSSOM, is what the browser later uses to build the Render Tree.
The process occurs in a sequence of steps:
Byte Stream to Characters: The browser receives raw bytes of HTML data from the network or local disk. It uses the specified character encoding (e.g., UTF-8) to convert these bytes into individual Unicode characters.
Tokenization: The stream of characters is then passed through a tokenizer, a finite state machine that breaks it down into meaningful pieces called tokens. These tokens represent specific elements like start tags (<div>), end tags (</div>), attribute names and values, comments, and text content.
Tree Construction (DOM Creation): The tokens are sent to the tree construction stage. The browser processes these tokens to build a hierarchical, tree-like structure called the Document Object Model (DOM). Each HTML element becomes a node in the tree, and their relationships (parent, child, sibling) are established.
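The three steps above can be sketched with Python's built-in `html.parser` module. This is a toy, not how a real browser engine is written: the `Node`, `TreeBuilder`, and `dump` names are invented for this example, `HTMLParser` stands in for the tokenizer, and the tree builder assumes well-formed, properly nested markup with no void elements such as `<br>`.

```python
from html.parser import HTMLParser


class Node:
    """A simplified DOM node: an element with a tag, or a text node."""

    def __init__(self, tag, attrs=None, text=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Receives tokens from HTMLParser and assembles them into a tree."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.open_elements = [self.root]  # stack of elements not yet closed
        self.tokens = []                  # log of tokens, for illustration

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag))
        node = Node(tag, attrs)
        self.open_elements[-1].children.append(node)
        self.open_elements.append(node)

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))
        # Simplification: assumes well-formed, properly nested markup.
        if len(self.open_elements) > 1:
            self.open_elements.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.tokens.append(("text", text))
            self.open_elements[-1].children.append(Node("#text", text=text))


def dump(node, depth=0):
    """Return the tree as indented lines, one per node."""
    label = repr(node.text) if node.tag == "#text" else node.tag
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


raw_bytes = b'<div class="card"><h1>Hello</h1><p>Caf\xc3\xa9</p></div>'
characters = raw_bytes.decode("utf-8")   # step 1: bytes to characters
builder = TreeBuilder()
builder.feed(characters)                 # steps 2 and 3: tokens, then tree
builder.close()
print(builder.tokens)
print("\n".join(dump(builder.root)))
```

The stack of open elements is what turns a flat stream of tokens into parent and child relationships: a start tag pushes a new node, and an end tag pops it, so whatever arrives in between becomes its child.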
How DOM and CSSOM Come Together
The browser can't display a skeleton or a blueprint alone. It combines them to create the Render Tree:
The DOM says: "There is a heading here."
The CSSOM says: "Headings are blue and 24px."
The Render Tree combines them: "There is a blue, 24px heading here."
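The same idea can be sketched in a few lines of Python. This is a heavily simplified model under stated assumptions: `DomNode`, `RenderNode`, `CSSOM`, and `build_render_tree` are names made up for this example, the "CSSOM" only understands plain tag selectors, and there is no cascade or inheritance. One detail goes slightly beyond the heading example: elements styled with `display: none` are left out of the render tree, since they are never drawn.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    text: str = ""
    children: list["DomNode"] = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    text: str
    style: dict[str, str]
    children: list["RenderNode"] = field(default_factory=list)


# Hypothetical CSSOM, reduced to "tag selector -> declarations".
CSSOM = {
    "h1": {"color": "blue", "font-size": "24px"},
    "script": {"display": "none"},
}


def build_render_tree(node, cssom):
    """Attach styles to a DOM node and its children; skip hidden nodes."""
    style = dict(cssom.get(node.tag, {}))
    if style.get("display") == "none":
        return None  # hidden elements do not appear in the render tree
    render_node = RenderNode(node.tag, node.text, style)
    for child in node.children:
        rendered = build_render_tree(child, cssom)
        if rendered is not None:
            render_node.children.append(rendered)
    return render_node


dom = DomNode("body", children=[
    DomNode("h1", text="Welcome"),
    DomNode("script", text="console.log('hi')"),
])
tree = build_render_tree(dom, CSSOM)
for child in tree.children:
    print(child.tag, child.text, child.style)
```

Here the DOM contributes "there is an h1 with the text Welcome", the CSSOM contributes "h1 is blue and 24px", and the resulting render node carries both.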