Knowledge Base

A collection of technical articles covering trading protocols, networking, latency, and machine learning.

BATS Binary Order Entry

BATS Binary Order Entry is a high-performance trading protocol modeled on FIX but adapted for simplicity and ultra-low latency.

Read article

Can two applications listen to the same port?

The short answer is "no, not on the same host." The longer answer is that this is by design, the basic rationale being consistency.
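The behaviour is easy to observe on a single machine. The following is a minimal sketch, assuming default socket options on the loopback address; the function name `second_bind_fails` is ours, for illustration only. Note that options such as SO_REUSEPORT, which are not used here, can change this behaviour on some platforms.

```python
import socket


def second_bind_fails(host="127.0.0.1"):
    """Return True if a second socket cannot bind to a port already in use."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))          # port 0: let the OS pick a free port
        first.listen()
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same address and port as the listener
        except OSError:
            return True                # typically "address already in use"
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("second bind rejected:", second_bind_fails())
```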

Read article

Difference between classification and clustering in data mining?

In data mining, classification is a task where statistical models are trained to assign new observations to a "class" or "category" out of a pool of candidate classes; the models are able to different…

Read article

Difference between Linear Regression and Logistic Regression

Linear Regression uses a linear function to map input variables to continuous response/dependent variables.

Read article

Differences between Training, Validation, and Test Set in Machine Learning

When tackling a supervised machine learning task, the developers of the machine learning solution often divide the labelled examples available to them into three partitions: a training set, a validati…

Read article

FIX Protocol

The FIX protocol is widely used across all asset classes, and supported by most markets. It is the workhorse of communications between modern trading systems.

Read article

How accurately can latency be measured?

The accuracy of a latency measurement derives directly from that of the timestamps used to construct it, which in turn is determined principally by two factors: a) the fidelity with which the timestam…

Read article

How can I check the status of a given port on a remote host?

First, it's worth clarifying that by "status" we understand the question to be asking whether the port is open or not; that is, whether there is an application listening to that port.
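Under that reading of "status", one simple check is to attempt a TCP connection and see whether it succeeds. The sketch below is illustrative only: `port_is_open` is a hypothetical helper name, and the two-second timeout is an arbitrary assumption.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts alike.
        return False
```

A result of False does not distinguish between a closed port and one hidden behind a firewall that silently drops packets; both appear as "not open" to this check.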

Read article

How can I create a self-signed certificate with openssl?

The simplest way to create a self-signed certificate is to use OpenSSL with the following one-liner: It is often useful to create a single .pem file containing both the key and the cert: These steps a…

Read article

How can I simulate delayed and dropped packets in Linux?

The standard way to delay and drop packets in Linux is with the netem scheduling policy; this can be applied to any network device, whether an interface or a bridge, using the tc command from the ipro…

Read article

How can TCP ACKs be used to measure latency to a server?

TCP ACKs can be used to measure the round-trip time to a TCP receiver, and they can do so very accurately: since ACKs are generated in the ISR (interrupt service routine), they involve only the lowest…

Read article

How do I free up a TCP/IP port?

To free up a TCP/IP port that your application has previously bound to, either call close() on the listening socket, or exit the application.
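The effect of close() can be sketched as follows, assuming a listening socket on the loopback address with no connections ever established on it; the helper names `bind_listener` and `rebind_after_close` are hypothetical.

```python
import socket


def bind_listener(port=0, host="127.0.0.1"):
    """Create a listening TCP socket; port 0 lets the OS choose a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen()
    return sock


def rebind_after_close():
    """Return True if a port can be bound again once its listener is closed."""
    first = bind_listener()
    port = first.getsockname()[1]
    first.close()                  # releases the port
    second = bind_listener(port)   # the same port is available again
    try:
        return second.getsockname()[1] == port
    finally:
        second.close()
```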

Read article

How do Multiple TCP Clients Connect Simultaneously to a Single Port on a Server?

TCP connections are identified by a server TCP address and port and a client TCP address and port.
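This can be observed directly by connecting two clients to one listening port and printing both ends of each accepted connection. The sketch below runs everything on the loopback address in a single process, which is an assumption made for brevity; `connect_two_clients` is a hypothetical name.

```python
import socket


def connect_two_clients(host="127.0.0.1"):
    """Connect two clients to one server port; return each connection's endpoints."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5.0)
    server.bind((host, 0))             # port 0: let the OS pick a free port
    server.listen()
    server_addr = server.getsockname()

    clients = [socket.create_connection(server_addr, timeout=5.0) for _ in range(2)]
    accepted = [server.accept() for _ in clients]

    # Each entry is ((client_ip, client_port), (server_ip, server_port)).
    endpoints = [(peer, conn.getsockname()) for conn, peer in accepted]

    for conn, _ in accepted:
        conn.close()
    for client in clients:
        client.close()
    server.close()
    return endpoints


if __name__ == "__main__":
    for client_end, server_end in connect_two_clients():
        print("client", client_end, "-> server", server_end)
```

Both connections share the same server address and port; they differ only in the client port, which is enough to tell them apart.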

Read article

How do you get Chrome to accept a self-signed certificate?

The following procedure, based on an answer provided by user kgrote, works for Chrome 68 on Windows 10: Navigate to the site with the cert you want to trust, and click through the usual warnings for…

Read article

How does a Genetic Algorithm work?

Genetic Algorithms are a subset of Evolutionary Algorithms, a group of search and optimization engines inspired by the natural process of evolution.

Read article

How does one filter MAC addresses using tcpdump?

tcpdump supports the "ether" qualifier to specify ethernet addresses in the standard colon-separated format.

Read article

How does one use tcpdump to capture incoming traffic?

The most reliable option is to use the -Q option as follows: The -Q option may not be supported on all platforms, and an alternative is to use equivalent logic in BPF (Berkeley Packet Filter) syntax,…

Read article

How does wireshark annotate some packets with "tcp segment of a reassembled pdu"?

Briefly, Wireshark marks TCP packets with "TCP segment of a reassembled PDU" when they contain payload that is part of a longer application message or document that is completed in a later packet.

Read article

How is latency analyzed and eliminated in high-frequency trading?

Latency is eliminated by making changes to the trading system software or infrastructure, and there is a wide variety of such changes that can be implemented.

Read article

How is latency defined in high-frequency trading?

Latency is fundamentally the duration of a process, and can be quantified as the difference between the time at which the process starts and the time at which it ends.

Read article

How is latency measured in high-frequency trading?

The latency of a process is measured by gathering timestamps reflecting the times at which the cause (e.g. price update) and effect (e.g. order placement) occur, and subtracting them.
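The subtraction itself can be illustrated in a few lines. This is a simplified sketch that takes both timestamps in-process from a monotonic clock; that is an assumption made for illustration, and `measure_latency_ns` is a hypothetical helper. Real trading systems may instead take their timestamps from network captures or hardware.

```python
import time


def measure_latency_ns(process):
    """Return the duration of process() in nanoseconds."""
    start = time.perf_counter_ns()   # timestamp of the cause
    process()
    end = time.perf_counter_ns()     # timestamp of the effect
    return end - start


if __name__ == "__main__":
    elapsed = measure_latency_ns(lambda: sum(range(100_000)))
    print(f"latency: {elapsed} ns")
```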

Read article

How to convert between different security certificate formats

There are many variants of these questions, asking how to convert TLS (technically X.509) certificates between various formats. To be stored in a file, a certificate must be encoded.

Read article

ITCH Protocol

ITCH is an ultra-low latency protocol for accessing Market Data. ITCH has been developed to maximize performance and to meet the requirements of latency sensitive trading strategies.

Read article

NASDAQ TotalView ITCH

NASDAQ's full-depth market-data feed provides ultra-low latency and high-performance visibility of NASDAQ's equities markets.

Read article

NYSE Pillar

NYSE Pillar is ICE NYSE's common technology platform, modeled on FIX but with a focus on performance and ultra-low latency.

Read article

OUCH Protocol

OUCH is a high-performance trading protocol developed by NASDAQ with a focus on simplicity and ultra-low latency.

Read article

Overfitting, Variance, Bias and Model Complexity in Machine Learning

How complex should a machine learning model be? How much complexity can we tolerate before we start to suffer from over-fitting? These questions can be assessed by comparing how models perform on trai…

Read article

The role of bias in Neural Networks

The activation function in Neural Networks takes an input 'x' multiplied by a weight 'w'. Bias allows you to shift the activation function by adding a constant (i.e. the given bias) to the input.
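The shift can be seen with a single neuron. The sketch below assumes a sigmoid activation, which is our choice for illustration; the function names and the values of 'w' and the bias are likewise hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b=0.0):
    """Output of one neuron: the activation applied to w * x + b."""
    return sigmoid(w * x + b)


if __name__ == "__main__":
    w = 2.0
    # Without bias, the output crosses 0.5 at x = 0.
    print(neuron(0.0, w))
    # With bias b = 3.0, the crossing point moves to x = -b / w = -1.5.
    print(neuron(-1.5, w, b=3.0))
```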

Read article

What algorithm for a tic-tac-toe game can I use to determine the "best move" for the AI?

Tic-tac-toe is a very popular game for two players, X and O, who take turns marking the spaces in a 3×3 grid.

Read article

What are Naïve Bayes classifiers?

A Naïve Bayes classifier is a simple algorithm for classifying data based on Bayes' theorem.

Read article

What are OLTP and OLAP and the difference between them?

These are terms relating to different kinds of business information processing systems.

Read article

What are the relative merits of TCP and UDP in high-frequency trading?

In general, the relative merits of TCP and UDP in high-frequency trading are the same as they are for most low-latency applications: on one hand, UDP has lower raw latency, by virtue of how much simpl…

Read article

What causes an IOC order to be sent at an unmarketable price?

There are many possible reasons that an order might be sent at a price that is not available in the market, some arising from the infrastructure and some more business-related.

Read article

What causes IOC orders to be canceled?

An immediate-or-cancel (IOC) or a fill-or-kill (FOK) order will be canceled if it arrives at the matching engine when there is no liquidity on the opposite side of the order book at, or better than, t…

Read article

What is a False Positive Rate?

A False Positive Rate is an accuracy metric that can be measured on a subset of machine learning models.
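For a binary classifier, the rate is the number of false positives divided by the total number of actual negatives, FP / (FP + TN). A minimal sketch, assuming labels are encoded as 0 (negative) and 1 (positive); the function name is ours:

```python
def false_positive_rate(y_true, y_pred):
    """Return FP / (FP + TN) for binary labels encoded as 0 and 1."""
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    negatives = false_pos + true_neg
    if negatives == 0:
        raise ValueError("no negative examples: the rate is undefined")
    return false_pos / negatives


if __name__ == "__main__":
    actual = [0, 0, 0, 0, 1, 1]
    predicted = [1, 0, 0, 0, 1, 0]
    print(false_positive_rate(actual, predicted))  # one FP out of four negatives
```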

Read article

What is a Genetic Algorithm?

In order to understand how a Genetic Algorithm works, one must first understand how a generic Evolutionary Algorithm works.

Read article

What is a TCP Connection Refused?

In general, connection refused errors are generated during a connect system call when an application attempts to connect using TCP to a server port which is not open.
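The error is straightforward to reproduce locally. The sketch below finds a port that is very likely closed by binding a socket to an OS-chosen port and closing it again, then attempts to connect to it. The approach and the name `connect_is_refused` are assumptions made for illustration.

```python
import socket


def connect_is_refused(host="127.0.0.1"):
    """Return True if connecting to a closed local port raises ConnectionRefusedError."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.bind((host, 0))              # port 0: let the OS pick a free port
    port = probe.getsockname()[1]
    probe.close()                      # nothing is listening on the port now

    try:
        with socket.create_connection((host, port), timeout=2.0):
            return False
    except ConnectionRefusedError:
        return True
```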

Read article

What is a TCP Connection Reset by Peer?

An application gets a connection reset by peer error when it has an established TCP connection with a peer across the network, and that peer unexpectedly closes the connection on the far end.

Read article

What is a TCP Reset (RST)?

When an unexpected TCP packet arrives at a host, that host usually responds by sending a reset packet back on the same connection.

Read article

What is Anomaly Detection?

In many cases, malicious attacks initially manifest as unusual behaviors of user accounts, system activities on the network, or network traffic patterns.

Read article

What is Listening on a TCP Port?

To find what is listening on a TCP/IP port, open a terminal and use lsof, a command that LiSts Open Files (TCP ports are considered a type of file in UNIX).

Read article

What is the best way to describe latency?

The latency of an individual transaction or process is best described by a single number, namely the duration of the process in an appropriate unit.

Read article

What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?

Max Pooling is an operation to reduce the input dimensionality. The output is computed by sliding a filter window over the input and taking the maximum input value within each patch the window covers.
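The operation can be sketched in plain Python without TensorFlow. The version below applies no padding, which corresponds to 'VALID'-style behaviour: windows that would extend past the edge of the input are simply dropped. The function name and the default window size and stride are assumptions made for illustration.

```python
def max_pool_2d_valid(grid, size=2, stride=2):
    """Max-pool a 2D list with no padding; partial windows at the edges are dropped."""
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [grid[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


if __name__ == "__main__":
    grid = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    print(max_pool_2d_valid(grid))  # [[6, 8], [14, 16]]
```

With a 3×3 input and the same 2×2 window and stride of 2, this version yields a single output value, because the last row and column never fit a full window. 'SAME' padding would instead pad the input so that those edge values also contribute to the output.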

Read article

What is the Highest TCP Port Number Allowed?

The highest TCP port number is 65,535. The TCP protocol provides 16 bits for the port number, and this is interpreted as an unsigned integer; all values are valid, apart from 0, and so the largest por…

Read article

What is the largest safe UDP Packet Size on the Internet?

This question, in particular the word "safe", is somewhat ambiguous.

Read article

What is the lowest tick-to-order (or tick-to-trade) latency achievable without the use of FPGAs?

Without FPGAs, the lowest tick-to-order latency can be achieved by a server directly consuming multicast market-data, handling the feed internally, making a trade decision, and emitting an order using…

Read article

What is the maximum packet size for a TCP connection?

This question is also somewhat ambiguous, but for a different reason: the size of IP packets that a TCP stack uses to send data is chosen by the stack itself, and the user or application writing to th…

Read article

Why does tcpdump not render ASCII packet data in a human readable format when tcpflow does?

tcpdump and tcpflow are two very different tools with very different purposes. It is true that both capture TCP/IP packets, or read them from pcaps, analyze them, and produce text output.

Read article