Commit 10239edf authored by Terry Lam's avatar Terry Lam Committed by David S. Miller
Browse files

net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc

This patch implements the first size-based qdisc that attempts to
differentiate between small flows and heavy-hitters.  The goal is to
catch the heavy-hitters and move them to a separate queue with less
priority so that bulk traffic does not affect the latency of critical
traffic.  Currently "less priority" means less weight (2:1 in
particular) in a Weighted Deficit Round Robin (WDRR) scheduler.

In essence, this patch addresses the "delay-bloat" problem due to
bloated buffers. In some systems, large queues may be necessary for
obtaining CPU efficiency, or due to the presence of unresponsive
traffic like UDP, or just a large number of connections with each
having a small amount of outstanding traffic. In these circumstances,
HHF aims to reduce the HoL blocking for latency sensitive traffic,
while not impacting the queues built up by bulk traffic.  HHF can also
be used in conjunction with other AQM mechanisms such as CoDel.

To capture heavy-hitters, we implement the "multi-stage filter" design
in the following paper:
C. Estan and G. Varghese, "New Directions in Traffic Measurement and
Accounting", in ACM SIGCOMM, 2002.

Some configurable qdisc settings through 'tc':
- hhf_reset_timeout: period to reset counter values in the multi-stage
                     filter (default 40ms)
- hhf_admit_bytes:   threshold to classify heavy-hitters
                     (default 128KB)
- hhf_evict_timeout: threshold to evict idle heavy-hitters
                     (default 1s)
- hhf_non_hh_weight: Weighted Deficit Round Robin (WDRR) weight for
                     non-heavy-hitters (default 2)
- hh_flows_limit:    max number of heavy-hitter flow entries
                     (default 2048)

Note that the ratio between hhf_admit_bytes and hhf_reset_timeout
reflects the bandwidth of heavy-hitters that we attempt to capture
(25Mbps with the above default settings).

The false negative rate (heavy-hitter flows getting away unclassified)
is zero by the design of the multi-stage filter algorithm.
With 100 heavy-hitter flows, using four hashes and 4000 counters yields
a false positive rate (non-heavy-hitters mistakenly classified as
heavy-hitters) of less than 1e-4.

Signed-off-by: default avatarTerry Lam <>
Acked-by: default avatarEric Dumazet <>
Signed-off-by: default avatarDavid S. Miller <>
parent 2a2529ef
......@@ -790,4 +790,29 @@ struct tc_fq_qd_stats {
__u32 throttled_flows;
__u32 pad;
/* Heavy-Hitter Filter */
enum {
#define TCA_HHF_MAX (__TCA_HHF_MAX - 1)
struct tc_hhf_xstats {
__u32 drop_overlimit; /* number of times max qdisc packet limit
* was hit
__u32 hh_overlimit; /* number of times max heavy-hitters was hit */
__u32 hh_tot_count; /* number of captured heavy-hitters so far */
__u32 hh_cur_count; /* number of current heavy-hitters */
......@@ -286,6 +286,15 @@ config NET_SCH_FQ
If unsure, say N.
config NET_SCH_HHF
tristate "Heavy-Hitter Filter (HHF)"
Say Y here if you want to use the Heavy-Hitter Filter (HHF)
packet scheduling algorithm.
To compile this driver as a module, choose M here: the module
will be called sch_hhf.
tristate "Ingress Qdisc"
depends on NET_CLS_ACT
......@@ -40,6 +40,7 @@ obj-$(CONFIG_NET_SCH_QFQ) += sch_qfq.o
obj-$(CONFIG_NET_SCH_CODEL) += sch_codel.o
obj-$(CONFIG_NET_SCH_FQ_CODEL) += sch_fq_codel.o
obj-$(CONFIG_NET_SCH_FQ) += sch_fq.o
obj-$(CONFIG_NET_SCH_HHF) += sch_hhf.o
obj-$(CONFIG_NET_CLS_U32) += cls_u32.o
obj-$(CONFIG_NET_CLS_ROUTE4) += cls_route.o
This diff is collapsed.
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment