How an order book works, and why your order just sits there

You tap buy. The price is right there on the screen. You expect to own the thing a second later. Instead your order sits in a list, blinking, doing nothing.

To see why, you have to look at what is behind that button: an order book, and a small piece of software that decides who trades with whom. This post builds both from nothing. No finance background needed. By the end you will know what an order book is, how a trade actually gets made, and the one clever trick that keeps it fast even when millions of orders pour in. I wrote the whole thing in C++ to see how fast it could go, and the numbers are at the bottom.

A market is just people shouting prices

Picture a room. On one wall, the buyers: "I'll pay 99 for one." On the other, the sellers: "I'll take 101 for one." Nobody trades at a fixed price. Everyone names their own number and waits.

An order is one of those offers. It has a side (buy or sell), a price, and a quantity. The order book is that room, written down. It is two sorted lists: everyone who wants to buy, and everyone who wants to sell.

Two numbers do most of the work. The bid is the highest price any buyer will pay right now. The ask is the lowest price any seller will accept right now. The distance between them is the spread. Here is a small book, sellers on top, buyers below, both leaning in toward the spread in the middle:

Side	Quantity	Price
Ask	40	100.06
Ask	25	100.05
Ask	70	100.04
(spread)
Bid	85	99.98
Bid	50	99.97
Bid	30	99.96

(One price level's queue of orders: ord-17, ord-42, ord-88)

An offer that is still waiting is a resting order. It sits in the book until someone agrees to meet its price. That is the answer to the opening question. Your order does nothing because the best price on the other side has not reached yours yet. A buyer bidding 99 and a seller asking 101 will both wait forever, because 99 is less than 101 and neither will budge.
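To make the bid, the ask and the spread concrete, here is a minimal sketch that computes them from two plain lists of prices. The function names (`best_bid`, `best_ask`, `spread`, `can_trade`) are made up for this illustration and are not taken from the simulator's source; a real book would keep the prices sorted rather than searching for the best one.

```cpp
#include <algorithm>
#include <vector>

// Highest price any buyer is offering. Assumes `bids` is non-empty.
double best_bid(const std::vector<double>& bids) {
    return *std::max_element(bids.begin(), bids.end());
}

// Lowest price any seller will accept. Assumes `asks` is non-empty.
double best_ask(const std::vector<double>& asks) {
    return *std::min_element(asks.begin(), asks.end());
}

// The gap between the two best prices.
double spread(const std::vector<double>& bids, const std::vector<double>& asks) {
    return best_ask(asks) - best_bid(bids);
}

// A trade is possible only when the buy price meets or beats the sell price.
bool can_trade(double buy_price, double sell_price) {
    return buy_price >= sell_price;
}
```

Fed the prices from the small book above, the best bid is 99.98, the best ask is 100.04 and the spread is about 0.06. `can_trade(99.0, 101.0)` is false, which is the whole reason both of those orders rest.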

How a trade actually happens

A trade needs a buyer and a seller who agree. So the only moment anything happens is when a buy price meets or beats a sell price. The software that watches for that and pairs people up is the matching engine.

When two orders could trade, who goes first? Almost every exchange uses one rule, price-time priority. Best price wins: the buyer offering the most, the seller asking the least. At the same price, the order that arrived first wins. It is the coffee shop queue. Among everyone offering the same deal, whoever got in line first gets served first. Engineers call that a queue, or first-in-first-out.
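Price-time priority can be written down as a single comparison. The sketch below is only an illustration for the buy side: the `Bid` struct and its `arrival` sequence number are assumptions of this example, not names from the simulator.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for a resting buy order.
struct Bid {
    std::string id;
    double price;
    std::uint64_t arrival;  // assumed sequence number: smaller means earlier
};

// True when `a` should be served before `b`.
bool has_priority(const Bid& a, const Bid& b) {
    if (a.price != b.price) return a.price > b.price;  // best price wins
    return a.arrival < b.arrival;                      // then first in line
}

// Returns the ids in the order the matching engine would serve them.
std::vector<std::string> service_order(std::vector<Bid> bids) {
    std::sort(bids.begin(), bids.end(), has_priority);
    std::vector<std::string> ids;
    for (const Bid& b : bids) ids.push_back(b.id);
    return ids;
}
```

For the sell side the price test flips, because the lowest ask is the best one. Sorting the whole list on every order would be far too slow for a real engine; the comparison is here only to pin down the rule.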

Orders come in two flavors. A limit order says "fill me, but never at a worse price than this." It is patient; if nothing matches, it rests in the book. A market order says "fill me now, whatever it costs." It is impatient; it eats the best prices on the other side until it is done.

Here is the part most people get wrong. When your order matches a resting one, the trade happens at the resting order's price, not yours. Say the best bids are 100 and 99, and you sell at market. You fill against the 100 bid first and the trade prints at 100. You wanted out, so you took the best price someone was already offering. That gap between bid and ask is the toll you pay for being in a hurry.
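The sell-at-market example can be sketched in a few lines. This is a simplified illustration, not the simulator's matching code: `Resting`, `Trade`, `Bids` and `match_sell` are hypothetical names, quantities are whole numbers to keep the example short, and `std::optional` needs C++17. An empty `limit` stands for a market order.

```cpp
#include <algorithm>
#include <deque>
#include <functional>
#include <map>
#include <optional>
#include <vector>

struct Resting { int id; int qty; };                  // a waiting buy order
struct Trade   { int resting_id; double price; int qty; };

// Price level -> queue of resting bids, highest price first, oldest order first.
using Bids = std::map<double, std::deque<Resting>, std::greater<double>>;

// Matches an incoming sell of `qty` units against the resting bids.
// `limit` empty = market order; otherwise never sell below *limit.
// Whatever could not be filled is left in `qty`.
std::vector<Trade> match_sell(Bids& bids, int& qty, std::optional<double> limit) {
    std::vector<Trade> trades;
    while (qty > 0 && !bids.empty()) {
        auto level = bids.begin();                    // best bid is at the front
        if (limit && level->first < *limit) break;    // price too low for a limit order
        std::deque<Resting>& queue = level->second;
        Resting& oldest = queue.front();              // time priority within the level
        int fill = std::min(qty, oldest.qty);
        trades.push_back({oldest.id, level->first, fill});  // prints at the resting price
        qty -= fill;
        oldest.qty -= fill;
        if (oldest.qty == 0) queue.pop_front();       // fully filled, leaves the book
        if (queue.empty()) bids.erase(level);         // empty level disappears
    }
    return trades;
}
```

With 5 units bid at 100 and 10 units bid at 99, a market sell of 8 produces two trades: 5 at 100, then 3 at 99. A limit sell at 99.5 sent afterwards matches nothing, so in a full engine it would rest in the book.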

The interesting part: doing it fast

A real market is not a quiet room. It is millions of orders a second, and a flood of cancellations as people change their minds. The engine has to do two things, both instantly:

First, find the best price, on every single incoming order. Second, cancel any specific order by its id, because in real markets cancels are 20 to 30 percent of all traffic, and searching the entire book for the one order to cancel is out of the question.

No single way of storing the orders does both well. To see why, you need one idea from computer science, and it is simpler than it sounds. When we say an operation is O(1), we mean it takes roughly the same small amount of time no matter how big the book gets. Like opening a labeled drawer: one motion, whether the cabinet holds ten files or ten million. The opposite is scanning a shelf book by book until you find the one you want, which gets slower as the shelf grows. Engineers write that one as O(n).

A sorted structure is great at "what is the best price" because the best one is always at the front. But finding one specific order inside it means scanning. A lookup table, what most languages call a hash map or dictionary, is great at "jump straight to order #4172" but knows nothing about which price is best. Each tool is good at exactly what the other is bad at.
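The difference between scanning and jumping is easy to see by counting steps. The sketch below is illustrative only: `steps_to_find_by_scan` and `find_by_index` are hypothetical helpers, and the index stores positions in a vector that is never modified afterwards, which is fine for a demonstration but not for a live book.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct Order { std::string id; double price; };

// Scanning: the number of orders inspected grows with the size of the book.
std::size_t steps_to_find_by_scan(const std::vector<Order>& book,
                                  const std::string& id) {
    std::size_t steps = 0;
    for (const Order& o : book) {
        ++steps;
        if (o.id == id) break;
    }
    return steps;
}

// Lookup table: a hash map jumps to the entry in constant time on average.
// Returns nullptr when the id is unknown.
const Order* find_by_index(const std::vector<Order>& book,
                           const std::unordered_map<std::string, std::size_t>& index,
                           const std::string& id) {
    auto it = index.find(id);
    return it == index.end() ? nullptr : &book[it->second];
}
```

In a book of a thousand orders, finding the last one by scanning takes a thousand steps, while the hash lookup does about the same amount of work as it would in a book of ten.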

So the engine keeps both, pointed at the same orders. A sorted map holds the price levels, so the best price sits at the front and reading it is instant:

order_book.hpp

// Bids: highest price first, so the best bid is at the front
std::map<double, OrderList, std::greater<double>> bids_;
// Asks: lowest price first, so the best ask is at the front
std::map<double, OrderList> asks_;


Each price level holds a queue of orders waiting their turn, oldest at the front. Price-time priority falls out of the storage for free: the map keeps prices in order, the queue keeps time in order.
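Here is a small sketch of how adding orders to such a structure might look. It assumes `OrderList` is a `std::list` of orders held by value; the post does not show the real definition, and the `OrderPtr` in the source suggests the simulator stores pointers instead. `BidSide`, `add` and `front` are names invented for this example.

```cpp
#include <functional>
#include <list>
#include <map>
#include <string>

struct Order { std::string id; double price; double quantity; };
using OrderList = std::list<Order>;  // assumption: the real OrderList may differ

struct BidSide {
    // Highest price first, so levels.begin() is always the best bid.
    std::map<double, OrderList, std::greater<double>> levels;

    // New orders join the back of the queue at their price level.
    void add(const Order& order) { levels[order.price].push_back(order); }

    // The order that would be matched next, or nullptr if the side is empty.
    const Order* front() const {
        if (levels.empty()) return nullptr;
        return &levels.begin()->second.front();
    }
};
```

Nothing in `add` sorts anything explicitly. The map places the level, the list preserves arrival order, and `front()` just reads the first element of the first level.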

For cancels, a second table maps an order id straight to where that order lives. The clever bit is what it stores: not a copy, but a pointer that lands exactly on the order inside its queue.

order_book.hpp

struct OrderLocation {
    OrderPtr order;
    OrderList::iterator it;   // points right at the order in its price queue
    bool   is_bid;
    double price_level;
};
std::unordered_map<std::string, OrderLocation> orders_;


Now a cancel is three quick steps with no searching: look up the id, follow the stored pointer, remove the order.

orders_["ord-42"] -> stored iterator -> [ord-17, ord-42, ord-88] (the iterator lands on ord-42)
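The three steps can be sketched as a tiny one-sided book. This is a simplified illustration under stated assumptions, not the simulator's code: `BidBook`, `Location` and `ids_at` are hypothetical names, orders are stored by value, and order ids are assumed to be unique.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

struct Order { std::string id; double price; double quantity; };
using OrderList = std::list<Order>;

class BidBook {
    // Where an order lives: its price level and its exact spot in that queue.
    struct Location { OrderList::iterator it; double price; };
    std::map<double, OrderList, std::greater<double>> levels_;
    std::unordered_map<std::string, Location> index_;

public:
    // Assumes ids are unique; a duplicate id would overwrite the saved location.
    void add(const Order& order) {
        OrderList& queue = levels_[order.price];
        OrderList::iterator it = queue.insert(queue.end(), order);
        index_[order.id] = Location{it, order.price};
    }

    // Returns false if the id is unknown (already filled or cancelled).
    bool cancel(const std::string& id) {
        auto found = index_.find(id);                    // 1. look up the id
        if (found == index_.end()) return false;
        auto level = levels_.find(found->second.price);  // locate the price level
        level->second.erase(found->second.it);           // 2-3. follow the iterator, unlink
        if (level->second.empty()) levels_.erase(level); // drop an empty level
        index_.erase(found);
        return true;
    }

    // Ids waiting at one price, oldest first (for inspection only).
    std::vector<std::string> ids_at(double price) const {
        std::vector<std::string> ids;
        auto level = levels_.find(price);
        if (level != levels_.end())
            for (const Order& o : level->second) ids.push_back(o.id);
        return ids;
    }
};
```

In this sketch, finding the price level is a map lookup whose cost grows with the number of price levels, not with the number of orders. The queue itself is never scanned: cancelling ord-42 leaves ord-17 and ord-88 in place, in their original order.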

Why a linked list for each price queue, and not a plain array? Because the stored pointer has to keep working after the orders around it come and go. A linked list lets you splice one order out from the middle without disturbing the rest, so every other saved pointer stays valid. An array shifts its contents around as it grows and shrinks, which would leave those pointers aiming at the wrong order. That single requirement, "the pointer must survive," decides the whole design.
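The "pointer must survive" property is a guarantee of `std::list`: erasing or adding elements only invalidates iterators to the erased elements themselves. A short sketch, using a hypothetical function name, shows a saved iterator outliving its neighbours:

```cpp
#include <iterator>
#include <list>
#include <string>

// Saves an iterator to the middle order, churns the queue around it,
// then reads the order back through the saved iterator.
std::string saved_order_after_churn() {
    std::list<std::string> queue{"ord-17", "ord-42", "ord-88"};
    std::list<std::string>::iterator saved = std::next(queue.begin());  // ord-42

    queue.erase(queue.begin());              // ord-17 is filled and leaves
    queue.push_back("ord-99");               // a new order joins the back
    queue.erase(std::prev(queue.end(), 2));  // ord-88 is cancelled

    return *saved;  // the saved iterator still refers to the same order
}
```

With a `std::vector` the same sequence would not be safe: erasing an element invalidates iterators at and after that position, and growing the vector can move every element to new memory.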

Does it actually go fast?

I ran the benchmark on an Apple Silicon laptop, driving the engine the same way the live demo does. Each row has two latency numbers, and they mean different things: p50 is a normal request, the median, with half of all requests faster. p99 is a bad day, slower than 99 percent of requests. You care about both.

Operation	Throughput	p50	p99
Add a limit order	743K /sec	0.25µs	1.25µs
Match a market order	434K /sec	0.33µs	1.04µs
Cancel an order	1.35M /sec	0.74µs avg	n/a

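Cancel has the highest throughput of the three, which is the whole payoff. (Its 0.74µs figure is an average rather than a median, so it is not directly comparable with the p50 column.) It is a single id lookup and a single unlink from the queue, with no scan through the orders, exactly what the two-table design was built for.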
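For readers who want to compute p50 and p99 themselves, here is a minimal sketch using the nearest-rank method. It is one common definition, not necessarily the one the benchmark suite uses, and `percentile` and `mean` are names chosen for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Nearest-rank percentile: the smallest sample that at least p percent of
// all samples do not exceed. Assumes samples is non-empty and 1 <= p <= 100.
double percentile(std::vector<double> samples, unsigned p) {
    std::sort(samples.begin(), samples.end());
    // Integer form of ceil(p * n / 100), which avoids floating-point rounding.
    std::size_t rank = (static_cast<std::size_t>(p) * samples.size() + 99) / 100;
    return samples[rank - 1];
}

// Arithmetic mean. Assumes samples is non-empty.
double mean(const std::vector<double>& samples) {
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}
```

Calling `percentile(latencies, 50)` gives the median and `percentile(latencies, 99)` the tail. A mean is pulled upward by a few slow outliers, which is why an average like the one in the cancel row should not be read as if it were a p50.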

The one thing I would fix

Every price here is a double, an ordinary floating-point number, and those are used as the keys of the map. That only works because every price is rounded to two decimals first, so the same price always produces the same key. Floating-point numbers are slightly fuzzy: subtract enough fills from a quantity and a remainder that should be zero lands on 0.000000000001 instead, leaving a ghost order that holds a price level open forever. The code papers over this by treating anything under 1e-12 as zero. Real exchanges sidestep the whole problem by storing prices as whole numbers of cents, or "ticks," so the math is always exact. That epsilon is the tell: it is hiding a rounding error that a whole-number design would never have.
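The fuzziness is easy to reproduce. The sketch below is an illustration, not the fix the simulator would need: the function names are hypothetical, the tick size of one cent is an assumption, and the non-zero residue relies on standard IEEE 754 doubles.

```cpp
#include <cmath>
#include <cstdint>

using Ticks = std::int64_t;  // a price counted in whole cents (assumed tick size)

// Fills of 0.1 and 0.2 against a quantity of 0.3, done in doubles.
// On IEEE 754 doubles the result is a tiny non-zero residue, not 0.
double remaining_as_double() {
    double quantity = 0.3;
    quantity -= 0.1;
    quantity -= 0.2;
    return quantity;
}

// The same fills counted in whole units (here, tenths): exactly zero.
std::int64_t remaining_as_units() {
    std::int64_t quantity = 3;
    quantity -= 1;
    quantity -= 2;
    return quantity;
}

// Convert a decimal price to ticks by rounding once, at the edge of the system.
Ticks to_ticks(double price) {
    return std::llround(price * 100.0);
}
```

With ticks, 100.04 becomes 10004 and 99.98 becomes 9998, so the spread is exactly 6 ticks and no epsilon is needed anywhere. Integer keys in the price map also compare exactly, without relying on consistent rounding.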

The lesson I keep relearning is that the hard requirement picks the design, not the other way around. Start from "a cancel must be instant and must not break the other orders," and the linked-list-plus-saved-pointer layout is the only thing that fits. The speed was a side effect of getting that one constraint right.

Source: order-book-simulator. The numbers were measured with its own benchmark suite.