RETCAM: An Efficient TCAM Compression Model for Flow Table of OpenFlow

Chaoqin Zhang; Penghao Sun; Guangwu Hu; Liang Zhu

doi:10.23919/JCN.2020.000033

ISSN: 1976-5541

Volume 22, No 6 (2020), pp. 484 - 492

10.23919/JCN.2020.000033

Chaoqin Zhang , Penghao Sun , Guangwu Hu and Liang Zhu

RETCAM: An Efficient TCAM Compression Model for Flow Table of OpenFlow

Abstract: OpenFlow is a widely adopted dataplane protocol in software-defined networking (SDN). However, the expansion of supported match fields in OpenFlow brings additional pressure to the storage space of ternary content addressable memory (TCAM) in physical device, since the arbitrary wildcard support in the match field of OpenFlow relies heavily on TCAM for looking-up speed. In this paper, a mathematical model aiming at the storage space reduction of the flow table in TCAM is presented, which is named as RETCAM. RETCAM analyzes the relationships among all the match fields and then categorize the redundancy among different fields into three types. Based on the three redundancy types, three compression algorithms named as inter-field merge, field mapping and intra-field compression are presented. The outcomes of each compression algorithm are flow entries with smaller bit-width which is sent to TCAM for flow matching. In this way, the flexibility of OpenFlow is not harmed, thus maintaining the function integrity of the original flow table. Simulation at the end shows that RETCAM saves almost about 60% of TCAM space for a given flow table with no damage to the function integrity of OpenFlow, and the compression performance stands stable with the increase of flow table size.

Keywords: compression , openflow , sdn , storage space optimization , TCAM

I. INTRODUCTION

RE cently, an innovative network architecture software defined networking (SDN) [1] has been widely adopted in various environments. With the separation of control plane and forwarding plane, SDN provides network users and researchers with much more flexibilities, such as user-defined flow control and open programmability, which propel greatly the revolution towards future networks. To achieve such flexibilities of upperlayer, SDN requires a powerful southbound interface to manipulate the basic functions of hardware in the forwarding plane.

OpenFlow [2] is widely cognized as an early SDN implementation and also a popular southbound interface of SDN, which derives from the “clean-slate” project of Stanford. Supported by open networking foundation (ONF), the version of OpenFlow has been continuously updated, and in a recent version Open- Flow1.3 [3], 40 match fields are defined, which take up a total storage space of up to 1227 bits. Though increased match fields have broadened the application scope of OpenFlow, the pressure of storage space in ternary content addressable memory (TCAM) sets a great limit on the size of flow table. What’s worse, TCAM has a disadvantage of high cost-to-density ratio (about US$350 for a 1 M-bit chip) and high power consumption (about 15 Watt/Mbit), which hinders greatly the expansion of TCAM size in commodity devices.

To store more flow entries in a TCAM of a practical size, compression of flow table is thus considered. The flow table compression of OpenFlow differs from other compressions such as IP routing table and ACL table in two aspects:

One flow entry may consist of many fields while some of them are redundant for a certain data flow. For example, in OpenFlow1.3, a flow entry may contain both TCP source/destination address and UDP source/destination address, while in practice, TCP and UDP are two different layer4 protocol and would not exist in the same data flow.

Each field of a flow entry has a probability of being be masked in a flow table. In traditional IP routing table, wildcard matching has been long used and the longest prefix matching policy of IP determines that the masked bits in an IP address must be consecutive and at the end of the address. However, in OpenFlow, arbitrary fields could be masked, so the possible position of masked bits spread all across the flow entry and are under no limitation of consecution.

The stated reasons make flow table compression of OpenFlow somehow different from previous works. Based on the problems stated above, this paper makes contributions in following aspects:

(1) A field structure analysis model of OpenFlow is presented. In OpenFlow1.3, up to 40 fields are defined, while a large increase in the number of fields don’t necessarily mean a large increase of entry amount. In fact, there are some special relationships among fields that could contribute to the compression, on which the model of this paper is mainly based.

(2) A pre-compression model for TCAM is proposed. According to field relationships, a mapping scheme is proposed to compress the lengths of some fields before sent to TCAM for match, thus reducing the cost of one flow entry in TCAM. Besides, such compression allows for unforeseen circumstances of the addition of new field value, which ensures the incremental capability of this scheme.

(3) A simulation experiment is carried out, which gives a clear view of the performance. We obtain flow tables in test networks of OpenFlow and carried out our compression algorithm. The results show that our compression ratio reaches more than 60%.

The rest of the paper is organized as follows: Section II describes the related work. Section III gives basic analyses of the features of OpenFlow flow tables and presents a mathematical model as the direction of compression. Section IV illustrates our compression analysis with a demonstration algorithm, and describes the details of compression scheme. Section V describes the hardware support for our scheme. Section VI validates the performance of our compression model with a simulation experiment. At last, a conclusion of this paper is presented.

II. RELATED WORK

Match of OpenFlow tables has similarity with traditional packet classification. The flexibility and programmability of OpenFlow in forwarding plane is partly based on the number of fields it supports. However, expansion of field number also brings the problem of flow table explosion, adding pressure to both physical equipment storage and controller maintenance [4].

Table compression has already been a hot topic since the large scale deployment of IP networks. In IP networks, packet classification also aims at multi-fields in a packet header, which is similar with OpenFlow. Existing studies about packet classification could help in the research of flow table compression in OpenFlow networks. Recursive flow classification (RFC) [5] aims at the lookup speed of packet classification and achieves high throughput, but the high cost of storage space and low table update speed set a limit on the table scale. FRFC [6] improves the update speed of RFC, but still costly in storage space. Grid of Tie [7] focuses on the refinement of search tree in packet classification, which achieves low storage space, but the complexity of tree structure hinders its speed of incremental update. Hi- Cuts [8] and HyperCuts [9] focuses mainly on low-dimensional fields, while as the number of field dimension rises, the problem of storage space and lookup efficiency is exposed.

In traditional routing table and access control list (ACL), some match policies are similar to that of OpenFlow. Zeng and Yang [10] studies the optimization problem of ACL, which employs cross-coverage and inclusion relationships to reduce the number of ACL. ACL compressor [11] illustrates a high compression ratio model in their paper, which achieves the compression ratio of 50.22% by a divide-and-conquer approach. Karpilovsky et al. [12] propose to deploy a management system to reduce the router state by 70%. The decomposition and recombination process perform well in multi-fields compression, but such approach doesn’t apply to arbitrary masking of fields in OpenFlow. Orange [13] reduces the storage space of rangebased flow table with a new look-up process in TCAM. Rottenstreich et al. [14] also address the range-based storage problem by changing the looking-up process. Sun et al. [15] change the encoding method of range-based tables in the TCAM and increase the compressing efficiency of ranges. Li et al. [16], [17] propose to use pre-classifier to reduce the storage space of ranges in traditional routing table, which proves to work well. Some works also concentrate on manipulating the storage space of the flow table based on the consideration of the whole network functions [4], [18]–[20], but these schemes cannot be flexibly used without a full analysis of the network functions such as routing. There are also many schemes that employ decision trees for packet classification. For example, PartitionSort [21], CutSplit [22], TabTree [23], NeuroCuts [24], CutTSS [25], and NuevoMatch [26] all proposed a certain kind of method to construct a decision tree for fast packet classification. These schemes mainly aim at alleviating the search complexity of the packet classification in with the rules stored in RAM.

Due to the requirement of arbitrary mask in fields, lookup process in OpenFlow relies heavily on TCAM for a high performance, which in practice sets a great limit on the table scale of OpenFlow. Therefore, the flow table storage problem is more serious than that of traditional routing table as aforementioned. Curtis et al. [27] pointed out the difference between the flow table of OpenFlow and traditional routing table, and proposed DevoFlow, a model to reduce flow table size. However, its algorithm is weak in extensibility. DIFANE [28] reduces the table size kept in one switch by splitting the rule set into different switches, and ensures right match of flow entry by redirecting flow to intermediate switches. However, in a practical network, rules are not static, so frequent rule changes bring additive pressure to controller of DIFANE networks. Zhang et al. [29] try to redistribute the traffic to reduce the size of the flow table in each switch, but the traffic redistribution also brings many problems such as routing changes. Compact TCAM proposed by Kannan K et al. [30] discuss in detail the reliance of TCAM in OpenFlow and the drawbacks of TCAM. In this model, flow table size is reduced by adding tags to flow as a replacement of full-size entry match, but the process of adding tags in egress switches still need a full-size entry match of OpenFlow. LightFlow [31] analyses the relationship between wildcard match and exact match in flow table, based on which proposes a GPU based parallelization algorithm. Also, LightFlow is not suitable for OpenFlow in match field combination. H-SOFT [32] proposes a field partition algorithm to reduce storage space by the reduction of entry numbers in sub-tables. However, such partition process loses the support of arbitrary mask in flow table, thus damaging the finegrained control ability of OpenFlow.

It should also be noted that like OpenFlow also contains the fields whose value could be range value (such as source port and destination port). The range expansion problem introduced by such range field brings much pressure on the storage space of TCAM. Schemes such as DIPRE [34] and LIC [35] aim at alleviating this problem with unused bits in every TCAM entry, which achieve good performance but rely on enough extra bits. Therefore, compressing the bit-width of OpenFlow entry also helps in the reduction of range expansion problem.

III. MATHEMATICAL MODEL

In this section, a mathematical model of compression is proposed, with OpenFlow1.3 considered as an example to execute flow table compression algorithm on. Since flow entry structure is one important element in our compression model, the analysis of structure is firstly presented.

The aim of flow table compression is the contraction of flow entry size while at the same time maintain the intact function of the flow entry. Basically, there are two kinds of storage medium for flow tables, one being RAM and the other TCAM. As stated in Section I, TCAM has the advantage of high match speed and arbitrary wildcard match support, but a notoriety of the high power consumption and hardware cost. On the contrary, RAM based storage is much more economical, but inefficient in search speed and helpless in arbitrary wildcard match. This model is just based on such relationship between TCAM and RAM, which aims to maximize the flow table storage ability of a hardware device with limited hardware resources.

Generally, when faced with storage pressure of TCAM, RAM has always been resorted to for an alternative. In RAM, the search process of multi-field table is usually based on trie or tree structure, but when it comes to OpenFlow, such structures are helpless, because in OpenFlow, arbitrary mask of fields are supported. In traditional multi-field match table, since wildcard is not arbitrary, redundant tree or trie nodes could be removed, thus the total amount of nodes are just of the same order of magnitude as the amount of entries. However, with the employment of arbitrary wildcard, redundant nodes might also be considered as nodes with masked value, thus should be reserved, as a consequence of which the problem of storage space explosion is inevitable.

To solve this problem, a model of RAM pre-compression is introduced, together with corresponding algorithm for the optimization problem. In this model, packet header is first partitioned into sub-sets, and then sent to RAM for the first step of compression. After the pre-compression algorithms processed in RAM, a new format of packet header with reduced bit length is then sent to TCAM for final match. Since RAM is also a limited resource, this model also takes into consideration of the tradeoff between RAM and TCAM.

The basis of flow table compression is the fact that a certain field of a protocol is usually designed to allow for increment. Take ETH_TYPE into example, the 16-bit field could accommodate up to 65536 type codes. However, in practice, only hundreds of them are assigned, among which only dozens are in common use. There are also other cases such as IP address. The 32-bit address length defines a space of a total amount of 232 IP addresses, while in practice, a certain router or switch should only handle a small portion of them, say, 100 thousand. Similarly, fields like ETH_SRC, ETH_DST, ETH_TYPE are also redundant in field length considering the de facto values a certain hardware would encounter. Therefore, we can take advantage of such property of header fields to carry out the table compression.

Definition 1: Orthogonal fields pair [TeX:] $$P_{O},$$ which represents such two fields that are collided to each other and could not exist in a packet header at the same time. In [TeX:] $$P_{O},$$ if one field exists, it’s certain that the other would not exist, and vice versa. For example, <TCP_SRC, UDP_SRC> is a [TeX:] $$P_{O}, \text { and }<\text { TCP_DST }$$

Fig. 1.

Relationship matrix of 13 required fields for OpenFlow 1.3.

[TeX:] $$\text { UDP_DST> is also a } P_{O}.$$

Definition 2: Evolution fields pair [TeX:] $$P_{E},$$ which represents two fields that one is the updated version of the other. [TeX:] $$P_{E}$$ shows a characteristic of evolution, which means the former field possesses all the functions that the latter one has. For example, <IPv4_SRC, IPv6_SRC> is a [TeX:] $$P_{E} \cdot P_{E}$$ is critical in field mapping and also helps dealing with flow header partition illustrated in the following sections.

Definition 3: Coexisting fields pair [TeX:] $$P_{C},$$ which represents two fields that one field exists if and only if the other exists. [TeX:] $$P_{C}$$ shows a relationship of dependence. Therefore, during the field partition process, [TeX:] $$P_{C}$$ should be taken into consideration to improve the algorithm efficiency.

Definition 4: Storage coefficient [TeX:] $$\eta.$ The reduction of TCAM is to some degree at the cost of the increase in RAM consumption. [TeX:] $$\eta$$ is the reflection of such condition, which by definition is the reduction of TCAM in bit at per bit consumption of RAM. In practice, [TeX:] $$\eta$$ could help accommodating this algorithm in hardware devices with different size of TCAM and RAM.

A matrix of field relationship is shown in Fig. 1 as an example. The example of relationship matrix [TeX:] $$\Pi$$ is based on common sense of field relationship of the 13 required fields in OpenFlow v1.3. In Fig. 1, the indicator i and j denotes the number of field in the 13 match fields of OpenFlow v1.3. Specifically, the table in Fig. 1 gives a detailed description of the relationship type of each match field, e.g., in the relationship [TeX:] $$R_{i, j}=R_{E},$$ the numbers on the right column of the table denote that [TeX:] $$\left(R_{6}, R_{8}\right), \left(R_{7}, R_{9}\right),\left(R_{8}, R_{6}\right) \text { and }\left(R_{9}, R_{7}\right)$$ belong to the relationship [TeX:] $$R_{E},$$ It should be mentioned that some fields may have more than one relationship type, in which case we heuristically sort the priority of relationship as [TeX:] $$P_{E}>P_{O}>P_{C}$$ according to the compression ratio listed in the following sections. In a certain deployment environment, [TeX:] $$\Pi$$ may change, for example, the number of PE may increase.

Table 1 shows the variables used in this model.

Table 1.

Variable notation for RETCAM model.

Notation	Representation of the symbol or symbol
F	Sets of all different fields in this model
T	Flow table
M	Total number of entries in T
N	Total number of fields in F
[TeX:] $$N_{i}$$	Number of Fields in subset [TeX:] $$i(1 \leq i \leq k)$$
[TeX:] $$M_{i}$$	Number of entries in subset [TeX:] $$i(1 \leq i \leq k)$$
[TeX:] $$l_{i}$$	Bit width of field [TeX:] $$i(1 \leq i \leq k)$$
[TeX:] $$l_{m n}^{\prime}$$	Bit width of compressed pair [TeX:] $$P_{O}$$
[TeX:] $$f_{n}$$	Field [TeX:] $$n(1 \leq n \leq N)$$
[TeX:] $$w_{i}$$	Bit width of subset i in RAM [TeX:] $$(1 \leq i \leq k)$$
[TeX:] $$w_{i}^{\prime}$$	Bit width of search output of subset [TeX:] $$i(1 \leq i \leq k)$$

A. Field Partition

As mentioned above, almost all protocols contain redundancy in field length. On the one hand, redundancy is necessary for future development of such protocol, since in time scope, new version of the protocol may introduce new type field, and in user scope, addition of new users may bring more addresses into use; on the other hand, since every field allows for its own redundancy, the overall redundancy in a group of fields is just too large a waste of storage space. Therefore, we can just allot some redundancy to two or more field together, so the overall redundancy is reduced while at the same time there is enough space for such group of fields to extend their application scope.

In the compression algorithm, different fields should be treated in different ways, with the implementation in RAM. Therefore, partition of fields into sub-sets plays an important role in the reduction of RAM space, which both reduce entry numbers of sub-sets and increase the pre-compression speed with parallel operation. Let [TeX:] $$F=\left\{f_{n} \mid n=1,2, \cdots, N\right\}$$ be the fields set of all N fields in the flow table.

First, partition F into k sub-sets according to [TeX:] $$P_{O} \text { and } P_{C}$$ defined above, of which the exact value of k is acquired by the equation set listed below, and build a sub-table for each sub-set. For each field, whether it should be assigned to a sub-set is also limited by coefficient , which balances the reduction in TCAM size and the cost in RAM space. In the ith sub-set, the number of fields is [TeX:] $$N_{i},$$ which satisfies (1).

(1)

[TeX:] $$N=N_{1}+N_{2}+\cdots+N_{k}$$

In each sub-set, since it only covers the information of the fields within the sub-set, entry number should be less than M. Let the entry number of each sub-set be [TeX:] $$M_{i}.$$

B. Inter-field Merge

Inter-field merge applies to [TeX:] $$P_{O} \text { and } P_{C}.$$ Within each sub-set, if there exist [TeX:] $$f_{m}, f_{n} \in F_{i} \text { that }f_{m}, f_{n}>\text { is a } P_{O} \text { or } P_{C},$$ we can carry out the compression process. Let [TeX:] $$l_{m}$$ be the bit length of [TeX:] $$f_{m} \text { and } l_{n}$$ be the bit length of [TeX:] $$f_{n},$$ then before compression, the storage space for these two fields is [TeX:] $$\left(l_{m}+l_{n}\right)$$ bits. In practice, the amount of values of [TeX:] $$f_{m} \text { and } f_{n}$$ in use are much less than [TeX:] $$2^{\left(l_{m}+l_{n}\right)},$$ as explained before. Therefore, we could allot a new field of length [TeX:] $$l_{m n}^{\prime},,$$ which could accommodate all the values of [TeX:] $$f_{m} \text { and } f_{n}$$ while at the same time reserve enough space for future extension. As a result, two fields [TeX:] $$f_{m} \text { and } f_{n}$$ could be

Fig. 2.

An example of inter-field merge.

replaced by one field, thus the total storage space is reduced. Besides, the compression ratio [TeX:] $$C_{o} \text { of this } P_{O}$$ is expressed in (2).

(2)

[TeX:] $$C_{O}=\sum_{i=1}^{M_{i}} l_{m n}^{\prime} /\left(\sum l_{m}+\sum l_{n}\right).$$

The storage coefficient is given in (3).

(3)

[TeX:] $$\eta_{P o}=\sum_{i=1}^{M^{\prime}}\left(l_{m}+l_{n}-l_{m n}^{\prime}\right) /\left(\sum l_{m}+\sum l_{n}\right).$$

Compression of [TeX:] $$P_{O} \text { or } P_{C}$$ doesn’t affect the final match result of a packet header, of which the reasons are explained as follows. In TCAM, the entry structure is reconstructed according to compression format, for example, two fields are merged into one field. Since such operation mainly affect the compressed fields of [TeX:] $$P_{O},$$ we can classify the corresponding circumstances into such two types:

(1) Between the two orthogonal fields [TeX:] $$f_{m} \text { and } f_{n},$$ an entry has only one exact field value while the other field masked. In this case, such entry would be assigned with an exact match value of merged field [TeX:] $$P_{O}$$ in TCAM. We now analyze this circumstance with the assumption that field [TeX:] $$f_{m}$$ is with an exact value A in this entry while [TeX:] $$f_{n}$$ is masked (analysis of the opposite is just the same). In practice, a coming packet header may come with a field [TeX:] $$f_{m}$$ or otherwise [TeX:] $$f_{n}$$ (orthogonal fields would not exist together). For different field values of [TeX:] $$f_{m} \text { and } f_{n},$$ search outcome from RAM would be different, so only a packet header with a value A could be matched against such entry. Therefore, collision would not happen after the merging of [TeX:] $$f_{m}$$ and [TeX:] $$f_{n}.$$

(2) Both [TeX:] $$f_{m} \text { and } f_{n}$$ are masked in the entry before compression. In this case, such entry after compression would have the merged field [TeX:] $$P_{O}$$ masked. It’s clear that for this circumstance, compression operation doesn’t change the match outcome in TCAM.

Fig. 2 shows an example of the inter-field merge. As shown in the figure, in an original OpenFlow flow table, it takes 32 bits to store both a TCP and a UDP port number. In RETCAM, there is one flag bit to denote whether the coming packet is a TCP or a UDP packet, and the space to store the port number can be multiplexed for these two protocols. As a result, the storage width is reduced from 32 bits to 17 bits.

C. Field Mapping

Among the fields of OpenFlow1.3, there are another special relationship of fields, in which one field [TeX:] $$f_{n}$$ is just the upgraded

Fig. 3.

Address conversion table.

version of [TeX:] $$f_{m},$$ such as source and destination address of IPv6 versus source and destination address of IPv4.

In IPv6, source address and destination address are expanded to 128 bits, which defines enough address space for hosts. In fact, IPv6 has almost all the functions IPv4 has in network layer, together with the expansion of address space and extension of support for some special applications. Therefore, IPv6 could perform all the function of IPv4. Temporarily, smooth transition from IPv4 to IPv6 is proposed, so IPv6 and IPv4 coexist in the network.

Among all the smooth transition technologies from IPv4 to IPv6, protocol conversion technology is a popular one. RFC 6052 [33] specifies the technology of address conversion from IPv4 to IPv6, such as shown in Fig. 3. As shown in Fig. 3, there are six types of mapping method from IPv4 to IPv6, which have different prefix lengths (PL). In practice, the network has to adopt one of the mapping method as the mapping format from IPv4 to IPv6. For example, if we adopt the mapping method with the PL value 40, then the first 24 bits of the IPv4 address is mapped onto bit 40–64 of the IPv6 address and the last 8 bits of the IPv4 address is mapped onto bit 72–80 of the IPv6 address. The prefix and suffix value can be assigned to 0.

In TCAM, only the total length of prefix is concerned. After conversion from IPv4 address to IPv6 address, the masked bits of original IPv4 fields in an entry are still masked in TCAM. RFC 6052 suggests six different prefix length, among which an appropriate one for this model should be considered. In the IPv4-converted IPv6 address, new prefix in a flow entry is the combination of original prefix in IPv4 address and the newlyadded prefix.

The compression ratio in field mapping is:

(4)

[TeX:] $$C_{E}=\sum_{f_{n}, f_{m}} l_{n} / \sum_{f_{m}, f_{n}}\left(l_{m}+l_{n}\right).$$

D. Intra-field Compression

For other fields, there could also be RAM-based search algorithms to reduce the field length. For example, the 48-bit fields ETH_SRC and ETH_DST could also be compressed. In this way, hash algorithm could be used to accelerate the compression, so collision should be carefully processed to ensure a unique outcome for every income value. Generally, we define the bit-length of sub-set i in RAM with [TeX:] $$w_{i}(1 \leq i \leq k),$$ of which [TeX:] $$w_{i}$$ is the sum of both original field length and compressed field

Fig. 4.

An example of intra-field merge.

Algorithm 1

Sub-sets creation

length. Besides, the bit length of match outcome in RAM which is sent to TCAM is defined with [TeX:] $$w_{i}^{\prime}.$$ By field partition, repetitive values of a field in a total flow table could be merged into one in a sub-set table, thus reducing the entry amount. Since the overall entry amount is denoted by M as mentioned above, we can denote the entry amount of sub-set with [TeX:] $$M_{i}.$$ Therefore, the compression ratio [TeX:] $$C_{i}$$ of sub-set [TeX:] $$i \text { is } C_{i}=\left(M_{i} \times w_{i}\right) /\left(M \times l_{i}\right),$$ and the storage coefficient [TeX:] $$\eta \text { is } \eta_{i}=\left(w_{i}-w_{i}^{\prime}\right) / w_{i}.$$ Fig. 4 shows a simple example of the intra-field compression, which is a hash of the MAC address. With the has operation on the MAC address, the width can be reduced from 48 bits to 24 bits.

IV. ALGORITHMS

In this section, an algorithm as an implementation of the model is presented. The overall algorithm contains mainly three stages, which is sub-sets creation, pre-compression in sub-sets and recomposing of header field. Algorithm1 and Algorithm 2 show the process in detail.

Algorithm 1 partitions the overall fields in F into several subsets based on [TeX:] $$P_{O}, P_{E}, \text { and } P_{C}$$ in relationship matrix [TeX:] $$\Pi,$$ on which three types of pre-compression algorithm could be performed. Creation of a sub-set is according to the storage coefficient [TeX:] $$\eta,$$, by which the user could regulate the complexity of RETCAM with threshold value [TeX:] $$\eta_{T H}$$ according to the hardware condition.

Algorithm 2 accommodates the original flow table into corresponding sub-flow tables created according to sub-sets. During the accommodation process, field value of each low entry is inserted into the tree as a node or the hash table. For fields like protocol number, values are usually consecutive, thus searching process of such type is based on binary search.

Algorithm 2

Sub-sets compression

V. HARDWARE SUPPORT

Like all other TCAM storage encoding schemes as mentioned in Section III, RETCAM also relies on a minor modification of the hardware in dataplane to conduct the functions, which is shown in Fig. 5. When a packet comes, it’s firstly processed in the packet processor. In an ordinary dataplane device, after the packet header is extracted from the packet, the header field is reformatted as the match field for matching. In the modified circuit of RETCAM, the header field is then processed by the pre-encoding circuit. As mentioned in this section, RETCAM uses a combination of SRAM and TCAM to decide the match outcome. Since the price is SRAM is omittable compared to TCAM, the table stored in SRAM takes a low cost There are also interfaces to the control plane as shown in Fig. 5, which is saved for the updating of the encoding scheme. Also, the TCAM space is divided into two parts with different widths: One part is for the storage of the RETCAM-compressed flow table, which takes most of the TCAM storage space; the other part is for the original flow table, which takes only a small proportion of the TCAM space. The reason is explained below.

The update of the flow table can be flexibly carried out with this architecture, which is shown in Fig. 6. First, for deleting a flow entry, it is the same for a RETCAM-compressed flow table and ordinary flow table, i.e., just deleting an entry from the flow table in TCAM. Then, for adding a new flow entry, we first calculate the compression algorithms of RETCAM for this flow entry and see after the compression, if this flow entry is in conflict with any flow entries in the existing flow table (i.e., two different flow entries have the same compression outcome). If there is no conflict, then we just add this compressed new flow entry into the flow table, which is also the same as adding a flow entry to an ordinary flow table. For the flow entry that is in conflict with the existing flow table (which will only happen in the hash operation of the MAC address and has a small probability), we store this flow entry in a small scale full-width TCAM space in the switch. When the small scale full-width TCAM space is nearly full, we take all the flow entries in a whole and run the compression algorithm in the SDN controller again to get a new compressed flow table, which will happen much less frequent than the update of the flow table.

Fig. 5.

Hardware implementation of RETCAM.

Fig. 6.

Flow table update process.

VI. SIMULATION AND EVALUATION

Since OpenFlow has not been publicly used in networks, it’s infeasible to carry out the evaluation with real data in large scale OpenFlow networks. The flow table in this experiment is collected from the test network of national engineering research center for broadband networks applications, which roughly reflects the flow table condition in a practical OpenFlow network. The algorithms of RETCAM are simulated in Matlab 2012b, which is operated on a Core i7-4790 CPU with 8 GB memory. Under this experiment environment, we compare the performance of RETCAM with the following baselines:

H-SOFT [32] proposes to partition the flow table into subtables according to the match fields in use. With this partition, the number of match field in each sub-table can be reduced. However, this method relies heavily on the pre-analysis of the flow table, which may fail when new flow entries come.

RFC [15] propose to change the encoding method of ranges in flow tables. A header field to be matched will firstly be processed in a FPGA-based programmable hardware to encode the range information. In this way, the storage space of ranges in flow tables can be greatly reduced. However, this method only considers the compression of ranges in flow tables.

SplitIP [17] propose a splitting algorithm to separate the prefix-based routing table into TCAM blocks to reduce the TCAM consumption. With the table separation, a preclassification operation should be performed for prefix looking up. However, this scheme only considers the storage consumption of prefixes.

Fig. 7.

Compression ratio of RETCAM.

A. TCAM Reduction of RETCAM

RETCAM aims to reduce TCAM consumption with precompression scheme, which reduces the entry size in the preprocessing of RAM. Taking into consideration of different physical truth of hardware, RETCAM is self-adaptive in different application environment. Fig. 7 shows the value of performance index [TeX:] $$\rho$$ against different flow table sizes and different value of storage coefficient [TeX:] $$\eta,$$ which is on the condition that flow partition number k is 9. From Fig. 7, we can find that compression ratio [TeX:] $$\rho$$ increases with the decrease of storage coefficient [TeX:] $$\eta,$$ while as flow table size grows, compression ratio [TeX:] $$\rho$$ changes slowly. For example, compression ratio [TeX:] $$\rho$$ remains around 0.6 as flow table size grows from 5000 to 25000. This result validates the incremental stability of TCAM reduction in RETCAM, which is an important characteristic in practice.

B. Function Integrity Comparison between RETCAM and HSOFT

One great advantage of OpenFlow is the fine-grained control ability of data flow. According to the OpenFlow whitepaper, arbitrary combination of fields in an entry is supported, which would cater to many special user needs. However, existing routing tables in current networks usually focus on one or several fields that are widely adopted for routing. Therefore, format conversion alone from traditional networks to OpenFlow just like H-SOFT did could not fully reflect the flow control need of OpenFlow networks, which omits many possible combinations of fields in OpenFlow that do not exist in traditional networks.

In H-SOFT, fields are partitioned into many field-groups, and match process is operated by one choice of such group. Even though by careful identification of field relationship such as conflict and coexistence, group partition could avoid damage of function integrity to some extent, the large diversity of match field combination in OpenFlow could not support so many group partitions as H-SOFT did. Merely format conversion from traditional routing table to OpenFlow format just cover up the fact that the divide-and conquer policy in H-SOFT damages the function integrity of OpenFlow, since the diversity of match field combination in traditional routing table is fixed to a very small scope.

Fig. 8.

Comparison of function integrity among different schemes.

In the real flow table of OpenFlow collected from RETCAM, the diverse combinations of match fields roughly reflect the flexibility of OpenFlow. Under this circumstance, the damage to function integrity of H-SOFT is exposed. In our simulation of H-SOFT, the number of malfunction cases suffer from a manifest increase as the size of flow table grows, while RETCAM performs no functional error in this experiment. Fig. 8 shows the percentage of flow entries that could be correctly functioned in different schemes. When subset number k grows larger, the percentage of flow entries that could be correctly functioned drops significantly in H-SOFT, thus we gave up the simulation when [TeX:] $$k>5$$ in H-SOFT, since there are too many malfunction cases. As shown in the figure, RETCAM can maintain all the function integrity of the flow table for OpenFlow. Specifically, since RFC and SplitIP do not change the information contained in the flow table (which means a full function integrity), we do not add them in the analysis of the function integrity part.

C. Compression Comparison between H-SOFT and RETCAM

H-SOFT proposed a flow table compression algorithm based on the principle of divide-and-conquer. By the partition of subtables, one flow may choose one of them to carry out the match process. However, when the fine-grained flow control ability of OpenFlow is taken fully used of, arbitrary wildcard may occur in a flow table, which would greatly hinder the performance of H-SOFT. One premise of the divide-and-conquer principle is the classification of match types in a flow table, while in Open- Flow the distribution of match field in a flow entry may sometimes be too casual to categorize. RFC and SlpitIP concentrate on the efficient storage of range-based match fields such as TCP port range and IP prefixes. Therefore, such two schemes do not bring much benefit in the reduction of match fields in Open- Flow. As shown in the figure, under different flow table sizes, the compression ratios of RFC and SplitIP are stable, which is around 0.17 and 0.28, respectively. In RETCAM, since the precompression outcomes in different sub-tables are recomposed into one summarizing header field, support of arbitrary wildcard in OpenFlow is not influenced. Fig. 9 shows the comparison of performance among different schemes, where the threshold n in H-SOFT set to 10000. As Fig. 9 shows, RETCAM performs

Fig. 9.

Comparison of compression ratio among different schemes.

Fig. 10.

Run time comparison of different schemes.

the best among all schemes. Besides, RETCAM shows a bigger advantage when flow table size is small.

D. Computation Time Comparison

We compare the computation time of different schemes under our test environment, of which the result is shown in Fig. 10. Since RFC and SplitIP rely on complicated relationship analysis among different flow entries and match fields, such two schemes have a much higher computation complexity when the scale of the table size grows.

VII. CONCLUSION

Flow table of OpenFlow requires much storage space of TCAM. This paper proposed an efficient TCAM saving model for an OpenFlow switch based on the analysis of relationships between different fields. By the pre-compression procedure of inter-field compression, field mapping and intra-field compression, RETCAM effectively reduces the table size in TCAM as much as 60%, thus alleviating the pressure of storage in TCAM. With a modest run time of thousands of entries per second and a good performance of incremental deployment. In a word, RETCAM can efficiently reduce the TCAM consumption in the storage of OpenFlow tables

Biography

Chaoqin Zhang

CHAOQIN ZHANG is currently pursuing the Ph.D. degree with the National Digital Switching System Engineering and Technological RD Center, China. He is also an Associate Professor with the Zhengzhou University of Light Industry. His research interests include Internet architecture, data mining and network security.

Biography

Penghao Sun

PENGHAO SUN is a Ph.D. Candidate at China National Digital Switching System Engineering Technological RD Center. His current research interests are in SDN, networking algorithms and packet classification.

Biography

Guangwu Hu

GUANGWU HU received the Ph.D. degree in Computer Science and Technology from Tsinghua University in 2014. Then he became a Post-Doctoral with the Graduate School at Shenzhen, Tsinghua University. He is currently an Associate Professor with the Shenzhen Institute of Information Technology. His research interests include software-defined networking, next-generation Internet and Internet security.

Biography

Liang Zhu

LIANG ZHU received Ph.D. degree in Computer Science and Technology from Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in October 2017. He is currently a Lecturer with the Institute of Computer and Communication Engineering at Zhengzhou University of Light Industry, Henan, China. His current research interests include mobile social networks, personalized service recommendation, and privacy preserving.

References

1 J. Montag, "Software defined networking mit OpenFlow," in Proc. IITM, 2013.custom:[[[-]]]
2 N. McKeown et al., "OpenFlow: Enabling innovation in campus networks," ACM SIGCOMM Comput. Commun. Review, vol. 38, no. 2 pp.6974, pp. no.2 6974-no.2 6974, Mar, 2008.doi:[[[10.1145/1355734.1355746]]]
3 OpenFlow 1.3.0 specification. (Online). Available:, https://www.opennetworking.org/images/stories/downloads/specification/openflowspec-v1.3.0.pdf
4 Z. Guo et al., "STAR: Preventing flow-table overflow in software-defined networks," Comput. Networks vol.125, pp. 15-25, Oct, 2017.doi:[[[10.1016/j.comnet.2017.04.046]]]
5 J. Cao, B. Chen, "Memory-optimized RFC packet classification algorithm merge RFC," J. Chinese Comput. Syst., vol. 33, no. 4, pp. 865-868, Mar, 2012.custom:[[[-]]]
6 W. Pak, S. Bahk., "FRFC: Fast table building algorithm for recursive flow classification," IEEE Commun. Lett., vol. 14, no. 12, pp. 1182-1184, Dec, 2010.doi:[[[10.1109/LCOMM.2010.100810.100572]]]
7 H. Lim, S. Lee, E. E. Swartzlander, "A new hierarchical packet classification algorithm," Comput. Networks, vol. 56, no. 13, pp. 3010-3022, Sept, 2012.doi:[[[10.1016/j.comnet.2012.04.014]]]
8 P. Gupta, N. McKeown, "Packet classification using hierarchical intelligent cuttings," IEEE Hot Interconnects, vol. 40, 1999.custom:[[[-]]]
9 S. Singh, F. Baboescu, G. Varghese, J. Wang, "Packet classification using multidimensional cutting," in Proc. ACM SIGCOMM, Aug, 2003.custom:[[[-]]]
10 K. Y. Zeng, J. H. Yang, "Towards the optimization of access control list," J. Software, vol. 18, no. 4, pp. 978-986, Apr, 2007.custom:[[[-]]]
11 A. X. Liu, E. Torng, C. R. Meiners, "Compressing network access control lists," IEEE Trans. Parallel Distributed Syst., vol. 22, no. 12, pp. 19691977-19691977, Dec, 2011.doi:[[[10.1109/TPDS.2011.114]]]
12 E. Karpilovsky, M. Caesar, J. Rexford, A. Shaikh, J. van der Merwe, "Practical network-wide compression of IP routing tables," IEEE Trans. Netw. Service Manage., vol. 9, no. 4, pp. 446-458, Dec, 2012.doi:[[[10.1109/TNSM.2012.081012.120246]]]
13 L. Schiff, Y. Afek, A. Bremler-Barr., "Orange: Multi field openflow based range classifier," in Proc. ACM/IEEE ANCS, May, 2015.custom:[[[-]]]
14 O. Rottenstreich et al., "Optimal in/out TCAM encodings of ranges," IEEE /ACM Trans. Netw. vol.24, no. 1, pp. 555-568, Feb, 2016.doi:[[[10.1109/TNET.2014.2382031]]]
15 P. Sun, J. Lan, P. Wang, T. Ma, "RFC: Range feature code for TCAMbased packet classification," Comput. Networks, vol. 118, pp. 54-61, May, 2017.custom:[[[-]]]
16 W. Li, X. Liu, W. Le, H. Li, H. Zhang, "A practical range encoding scheme for TCAMs," in Proc. ACM SIGCOMM Posters and Demos, Aug, 2019.custom:[[[-]]]
17 L. Wenjun et al., "A power-saving pre-classifier for TCAM-based IP lookup," Comput. Networksp. 106898, vol. 164, Dec, 2019.custom:[[[-]]]
18 Z. Guo et al., "Balancing flow table occupancy and link utilization in software-defined networks," Future Generation Comput. Syst. vol.89, pp. 213-223, Dec, 2018.doi:[[[10.1016/j.future.2018.06.011]]]
19 Z. Guo, S. Hui, Y. Xu, H. J. Chao, "Dynamic flow scheduling for power-efficient data center networks," in Proc. IEEE /ACM IWQoS, June, 2016.doi:[[[10.1109/iwqos.2016.7590399]]]
20 Z. Guo et al., "JumpFlow: Reducing flow table usage in software-defined networks," Comput. Networks, vol. 92, pp. 300-315, Dec, 2015.doi:[[[10.1016/j.comnet.2015.09.030]]]
21 S. Yingchareonthawornchai, J. Daly, A. X. Liu, E. Torng, "A sortedpartitioning approach to fast and scalable dynamic packet classification," IEEE /ACM Trans. Netw., vol. 26, no. 4, pp. 1907-1920, Aug, 2018.custom:[[[-]]]
22 W. Li, X. Li, H. Li, G. Xie, "CutSplit: A decision-tree combining cutting and splitting for scalable packet classification," in Proc. IEEE INFOCOM, Apr, 2018.custom:[[[-]]]
23 W. Li, T. Yang, Y.-K. Chang, T. Li, H. Li, "TabTree: A TSS-assisted bit-selecting tree scheme for packet classification with balanced rule mapping," in Proc. ACM/IEEE ANCS, Sept, 2019.custom:[[[-]]]
24 E. Liang et al., "Neural packet classification," in Proc. ACM SIGCOMM, Aug, 2019.custom:[[[-]]]
25 W. Li et al., "Tuple space assisted packet classification with high performance on both search and update," IEEE J. Sel. Areas Commun., vol. 38, no. 7, pp. 1555-1569, July, 2020.custom:[[[-]]]
26 A. Rashelbach, O. Rottenstreich, M. Silberstein, "A computational approach to packet classification," in Proc. ACM SIGCOMM, Aug, 2020.custom:[[[-]]]
27 A. R. Curtis et al., "DevoFlow: Scaling flow management for highperformance networks," in Proc. ACM SIGCOMM, Aug, 2011.custom:[[[-]]]
28 Yu Minlan, J. Rexford, M. J. Freedman, J. Wang, "Scalable flow-based networking with DIFANE," ACM SIGCOMM Comput. Commun. Review, vol. 40, no. 4, pp. 351-362, Aug, 2010.custom:[[[-]]]
29 S. Q. Zhang et al., "TCAM space-efficient routing in a software defined network," Comput. Networks, vol. 125, pp. 26-40, Oct, 2017.doi:[[[10.1016/j.comnet.2017.06.020]]]
30 K. Kannan, S. Banerjee, "Compact TCAM: Flow entry compaction in TCAM for power aware SDN," in Proc. ICDCN. Jan. 2013in Proc. ICDCN, Jan, 2013.custom:[[[-]]]
31 N. Matsumoto, M. Hayashi, "LightFlow: Speeding up GPU-based flow switching and facilitating maintenance of flow table," in Proc. IEEE HPSR, June, 2012.custom:[[[-]]]
32 J. Ge, Z. Chen, Y. Wu, Y. E, "H-SOFT: A heuristic storage space optimisation algorithm for flow table of OpenFlow," Concurrency Computation Practice Experience, vol. 27, no. 13, pp. 3497-3509, Aug, 2015.doi:[[[10.1002/cpe.3206]]]
33 RFC 6052. (Online). Available:, http://datatracker.ietf.org/doc/rfc6052/
34 K. Lakshminarayanan, A. Rangarajan, S. Venkatachary, "Algorithms for advanced packet classification with ternary CAMs," ACM SIGCOMM Comput. Commun. Review, vol. 35, no. 4, pp. 193-204, Aug, 2005.custom:[[[-]]]
35 A. Bremler-Barr, D. Hay, D. Hendler., "Layered interval codes for TCAM-based classification," in Proc. ACM SIGMETRICS, June, 2008.doi:[[[10.1016/j.comnet.2012.04.026]]]

Received: July 15 2020

Revision received: October 9 2020

Accepted: November 15 2020

Published (Electronic): December 31 2020

Corresponding Author: Guangwu Hu , hugw@sziit.edu.cn

Chaoqin Zhang, National Digital Switching System Engineering and Technological R & D Center, Zhengzhou, China, 2000007@zzuli.edu.cn

Penghao Sun, National Digital Switching System Engineering and Tech- nological R & D Center, Zhengzhou, China, sphshine@126.com

Guangwu Hu, School of Computer Science, Shenzhen Institute of Information Technology, Shenzhen, China, hugw@sziit.edu.cn

Liang Zhu, School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, China, lzhu@zzuli.edu.cn