Some months ago I was studying for an AWS certification and, for the first time, came across “ephemeral ports”. Writing about one small networking concept may be one of the least interesting subjects ever, but I didn’t understand it before, and I think it’s a good opportunity to learn about ports, protocols and firewalling.
Why ports?
First of all, why do we need ports? If a device is connected to a network, why don’t we just reach its services directly through its IP address (private or public, depending on the situation)? The problem is that an IP address only identifies the device. A device that talks to many other devices or services at once needs some way of telling those conversations apart, so that data is sent and received correctly, and ports are how that is done. We can think of ports as multiple doors on a device: we use them to control where it is allowed to receive or send data, and with which other actors.
Now think about a server that receives HTTP requests. By convention, the HTTP port is 80. We could configure this server to allow connections to port 80 from everywhere, making it reachable by any other device on the internet. Say 10 other devices establish a connection with our server on port 80: how does the server know who sent what, and whom to answer with which data? That’s where the rules of a protocol come in.
TCP
I will use TCP as an example here, but other protocols such as UDP also use ephemeral ports. TCP is a connection-oriented protocol: both parties establish a connection beforehand (with a three-way handshake of SYN, SYN-ACK, ACK), and the protocol ensures that data is delivered reliably and in order.
To identify each connection, TCP uses four data points: source IP, destination IP, source port and destination port. The destination port is the well-known one the server listens on (80 in our example). The source port is the ephemeral port: a temporary port on the client side of the connection. This means that no matter how many devices connect to our server, even with several connections coming from the same IP address, we can still tell one from the other by this combination. The ephemeral port is chosen by the client’s operating system at the moment of connection and lasts for the duration of the session.
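To see this in practice, here is a small sketch that opens two TCP connections to the same listener on the loopback interface and collects the four data points of each one. The function name `open_two_connections` is my own, and the listener uses whatever free port the OS hands out instead of port 80, so treat it as an illustration and not as how a real web server is set up.

```python
import socket


def open_two_connections():
    """Connect two clients to one loopback listener and return both 4-tuples."""
    # Binding to port 0 asks the OS for any free port (assumption: we avoid
    # port 80 because it usually needs elevated privileges).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(2)
    server_addr = server.getsockname()

    clients, accepted, tuples = [], [], []
    try:
        for _ in range(2):
            # The OS picks the client's source (ephemeral) port here.
            client = socket.create_connection(server_addr)
            clients.append(client)
            conn, _peer = server.accept()
            accepted.append(conn)
            src_ip, src_port = client.getsockname()
            dst_ip, dst_port = client.getpeername()
            tuples.append((src_ip, src_port, dst_ip, dst_port))
    finally:
        for sock in clients + accepted + [server]:
            sock.close()
    return tuples


if __name__ == "__main__":
    for four_tuple in open_two_connections():
        print(four_tuple)
```

Both connections share the same source IP, destination IP and destination port, so the only thing that tells them apart is the source port, which is exactly the ephemeral one.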
The range varies from OS to OS, but it is usually made of high numbers, such as 32768–60999 (the default on many Linux systems). The OS then needs some algorithm that ensures a port already in use is not handed out again. Here’s an example of a simple sequential one:
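/* Starting point of the counter; set once, not on every call. */
next_ephemeral = min_ephemeral;

/* Port selection: try each port in the range at most once. */
count = max_ephemeral - min_ephemeral + 1;
do {
    port = next_ephemeral;
    /* Advance the counter, wrapping around at the top of the range. */
    if (next_ephemeral == max_ephemeral) {
        next_ephemeral = min_ephemeral;
    } else {
        next_ephemeral++;
    }
    /* check_suitable_port() returns true if the port is free to use. */
    if (check_suitable_port(port)) return port;
    count--;
} while (count > 0);
return ERROR; /* every port in the range is taken */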
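For anyone who wants to run it, here is a rough Python sketch of the same sequential idea. The class name `SequentialPortPicker` and its `in_use` set are my own stand-ins for `check_suitable_port`; a real network stack checks against its table of live connections, and many operating systems randomise the choice instead of counting upwards.

```python
class SequentialPortPicker:
    """Toy model of sequential ephemeral port selection (hypothetical helper)."""

    def __init__(self, low=32768, high=60999):
        self.low = low
        self.high = high
        self.next_port = low   # counter survives between calls to pick()
        self.in_use = set()    # stand-in for check_suitable_port()

    def pick(self):
        """Return the next free port, or None if the whole range is taken."""
        for _ in range(self.high - self.low + 1):
            port = self.next_port
            # Advance the counter, wrapping around at the top of the range.
            self.next_port = self.low if port == self.high else port + 1
            if port not in self.in_use:
                self.in_use.add(port)
                return port
        return None

    def release(self, port):
        """Mark a port as free again once its connection has closed."""
        self.in_use.discard(port)


if __name__ == "__main__":
    picker = SequentialPortPicker(low=40000, high=40002)
    print([picker.pick() for _ in range(4)])  # [40000, 40001, 40002, None]
```

The tiny three-port range in the usage example is only there to make the wrap-around and the "range exhausted" case easy to see.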
That’s why it’s important, when configuring firewalls (in my case AWS Security Groups and NACLs), to allow traffic on the right port ranges. Otherwise our system won’t be able to function correctly.
With a stateful firewall like AWS Security Groups, we only need to allow the start of the connection, that is, the rule that lets the request through. The response is allowed automatically, so we don’t need to worry about it. With a stateless firewall like AWS NACLs, we need to handle both directions: a rule for the request, and another rule that lets the response travel back to the client’s ephemeral port.
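The difference can be sketched with a toy model. This is not the AWS API: the function names and the rule format (a list of inclusive port ranges) are assumptions of mine, and real NACL rules also carry rule numbers, protocols, CIDR blocks and allow/deny actions. It only shows why a stateless filter needs a second rule for the ephemeral range.

```python
def port_allowed(rules, port):
    """rules is a list of (low, high) inclusive port ranges (hypothetical format)."""
    return any(low <= port <= high for low, high in rules)


def stateful_round_trip(inbound_rules, client_port, server_port):
    """Stateful: if the request is allowed in, the response is allowed out."""
    # client_port is unused on purpose: the firewall remembers the connection.
    return port_allowed(inbound_rules, server_port)


def stateless_round_trip(inbound_rules, outbound_rules, client_port, server_port):
    """Stateless: request and response are checked independently."""
    request_ok = port_allowed(inbound_rules, server_port)
    # The response leaves the server towards the client's ephemeral port.
    response_ok = port_allowed(outbound_rules, client_port)
    return request_ok and response_ok


if __name__ == "__main__":
    http_in = [(80, 80)]
    # Stateful: one inbound rule is enough.
    print(stateful_round_trip(http_in, client_port=45000, server_port=80))
    # Stateless with no outbound rule: the response is dropped.
    print(stateless_round_trip(http_in, [], client_port=45000, server_port=80))
    # Stateless with a broad ephemeral range opened for responses.
    print(stateless_round_trip(http_in, [(1024, 65535)], client_port=45000, server_port=80))
```

A broad range such as 1024 to 65535 is used in the last call because clients on different operating systems pick their ephemeral ports from different ranges, and a narrower outbound rule would silently drop responses to some of them.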