October 16, 2008
William Fisher, Version 2

Proposal for Porting Linux 2.6.{26,27} onto NCPU & ACPU cores

1.0 Objectives and Requirements
-------------------------------

The requirements are summarized in the following points:

1.1 To obtain a more up-to-date TCP/IP stack which supports both IPv4 and
IPv6. Adopting the Linux TCP/IP protocol stack provides IPv4, IPv6,
bonding driver improvements and 10 Gigabit Ethernet support with little
effort.

1.2 Supporting 10 Gigabit Ethernet by deploying the standard,
vendor-supported Linux NAPI device drivers for the hardware, together with
the enhanced PCI-Express support in the kernel, is a straightforward
migration path. The major 10 Gigabit Ethernet vendors, including Chelsio,
Myricom, Mellanox, NetXena and Netopia, all provide Linux 2.6.X drivers in
both source and binary formats.

1.3 This adoption of Linux on the NCPU core(s) is a first step in an
eventual migration to Linux on the Next Generation hardware, as well as a
proof of concept for dedicated cores coexisting under a Linux SMP kernel.
Using the OnStor Cougar hardware platform allows us to use one of the two
four-core SiByte processor sockets to execute Linux, with the other one
supporting the FP functionality unchanged. This plan will allow a
transition of the NCPU and ACPU functions to run under Linux with a
minimum of disruption to the system software.

2.0 Proposal Overview
---------------------

In the sections below, the identified tasks are listed in approximate
chronological development order. A number of things must be done before
the NFS/CIFS functionality can be tested. Hence there is probably room for
more parallelism, depending on the resource allocation.

2.1 Port stock Linux 2.6.{26,27} kernel onto one SiByte socket

The goal is to "port" a stock Linux 2.6.{26,27} kernel onto one of the two
SiByte sockets, supporting 4 processor cores.
This will be a very stripped-down kernel supporting the minimum number of
device drivers, file systems and user functions. Since a Linux 2.6.22
kernel has been ported to the SSC SiByte 12XX processor as part of the 4.0
release, it is envisioned that this is a very straightforward task.

2.2 Loading NCPU Linux kernel ELF Modules

This task concerns loading the NCPU Linux kernel ELF modules after the SSC
has loaded NCPU Linux into one of the two SiByte 1480 processor sockets.

The existing PROM code executes in SSC memory and loads the SSC Linux
image by reading files from the Compact Flash (CF) directly attached to
the SSC hardware. After SSC Linux has been booted, the NCPU/ACPU/FP
processor cores are loaded by reading non-ELF "binary" code files, from
either the CF or from the network using NFS, into the main memory of the
SiByte 1480 processor sockets. In the current software, after the SiByte
1480 images are loaded, the Embedded Eagle Executive (EEE) requires no
further code loading functions.

One approach is to store the NCPU kernel modules on the CF. In order to
access the CF, a communication channel is needed between NCPU Linux and
SSC Linux, to allow modules to be read from the CF. If we extend the
ssc-mgmt driver running on the SSC to pass TCP/IP packets contained in
skb's passed via the shared queue interface, this "point-to-point" channel
could be used to access the CF via NFS from NCPU Linux.

The current ssc-mgmt driver already supports passing EEE messages
contained in skb's, hence a straightforward extension is to have it pass
IP packets as well. Using this driver, we could create a point-to-point
network interface, using a privately assigned IP address such as
192.168.X.Y, and send TCP/IP packets between the NCPU and the SSC. Using
NFS as the upper level protocol running over this link, the CF file system
can be mounted onto the NCPU.
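
If the shared queue carries both EEE messages and IP packets, the receiving
side needs some way to tell them apart. The sketch below is one possible
approach, not the ssc-mgmt driver's actual framing: it assumes a
hypothetical one-byte tag in front of every payload (LINK_TAG_EEE and
LINK_TAG_IP are invented names), and for IP payloads it reads the version
from the high nibble of the first header byte, which is where both IPv4
and IPv6 place it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload kinds on the shared queue; the proposal does not
 * define how EEE messages and IP packets would be told apart. */
enum link_payload { LINK_EEE_MSG, LINK_IPV4, LINK_IPV6, LINK_UNKNOWN };

/* Hypothetical one-byte tag assumed to precede every queued payload. */
#define LINK_TAG_EEE 0x01
#define LINK_TAG_IP  0x02

/* Classify one queued frame: tag byte first, then the payload. */
static enum link_payload link_classify(const uint8_t *frame, size_t len)
{
    assert(frame != NULL || len == 0);
    if (len < 2)
        return LINK_UNKNOWN;      /* need the tag plus one payload byte */
    if (frame[0] == LINK_TAG_EEE)
        return LINK_EEE_MSG;
    if (frame[0] == LINK_TAG_IP) {
        /* The IP version is the high nibble of the first header byte. */
        switch (frame[1] >> 4) {
        case 4: return LINK_IPV4;
        case 6: return LINK_IPV6;
        }
    }
    return LINK_UNKNOWN;          /* unknown tag or unknown IP version */
}
```

A receive path built this way would hand LINK_EEE_MSG frames to the
existing EEE handling and pass the IP cases up to the network stack, so the
EEE message formats themselves would stay untouched.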

Since the module we are loading into NCPU Linux is the OnStor NFS/CIFS
module, we have the classic chicken-and-egg problem during the module
loading phase. This has the side effect of using stock NFS in NCPU Linux
to support the mounting and file system operations. The requirement to use
stock Linux NFS inside the NCPU to load modules can be relaxed or removed
if we load a ramFS containing the NFS module during the NCPU Linux boot
process.

Stock Linux NFS on the NCPU can be used during software development to
speed up the debug cycle of the ACPU module, since an old module can be
unloaded, a new module copied onto the CF on the SSC, and the new module
loaded without requiring a reboot of the SiByte processors. The final
shipped software would not require stock Linux NFS in NCPU Linux, given a
ramFS containing the ACPU NFS/CIFS code module.

2.3 Support Shared Memory Queues and Messaging Protocol between Linux processors

In order to minimize the changes to other parts of the system software,
our plan entails using the standard shared memory messaging queues
implemented today to communicate between the NCPU, ACPU, FP and SSC
processors. The path of minimal disruption is to leave unchanged the
message types and formats used today.

This task addresses the changes required to the NCPU Linux kernel to
initialize the shared memory queues and to add support for sending and
receiving messages using standard Linux device drivers. This task
addresses any changes in addition to the porting of the mgmt-bus driver
and eee protocol modules described below.

2.4 Port SSC "mgmt-bus" driver and eee protocol modules to NCPU Linux Kernel

Since we are replacing the NCPU's EEE functionality with a Linux kernel,
this task covers the porting of the SSC "mgmt-bus" device driver,
implementing the shared message queues between the SSC and NCPU, and using
common Linux device drivers and protocol modules on both the SSC and NCPU
Linux kernels.
Because the Linux mgmt-bus device driver and 'eee' network protocol
modules were written and integrated into the 4.0 release, this task is
envisioned as a simple porting and testing effort.

2.5 Test the "mgmt-bus" driver and eee protocol modules on NCPU Linux

This task covers testing the mgmt-bus driver and eee protocol modules by
running traffic between the SSC and NCPU after the various pieces of
supporting software have been ported.

2.6 Memory Allocation Task

This task covers the memory allocation interfaces, sizes and mapping
functions currently used in the NCPU and ACPU software. Since we are
replacing the NCPU's EEE functionality with a Linux kernel on the NCPU and
ACPU cores, the EEE memory allocation schemes must be explicitly
addressed.

Currently the EEE supports two memory regions, one for descriptors and
buffers and the other for general memory allocation. The use of common
shared memory regions mapped into all the cores must be maintained for
descriptors, buffers, queues and messages. However, other local memory
allocations should be converted to call the generic kernel memory
allocator. The recommendation is to convert the eee_ramAlloc() and
cache_alloc() interfaces into calls to the generic Linux kernel memory
allocator.

The plan is to allocate the skb's and their associated buffers from the
common memory region so that the zero-copy networking/filesystem
operations are maintained. In addition, the shared queues and their
associated messages must be allocated from another part of this common
memory region to maintain backward compatibility.

2.7 Port RCON support to NCPU Linux Kernel

This task covers adapting the RCON SSC Linux driver to the NCPU Linux
kernel.

2.8 Test the RCON functions between the SSC and NCPU Linux kernels

This task covers testing the remote console (RCON) functions between the
SSC and NCPU after the various underlying supporting software tasks
covered previously have been ported and unit tested.

2.9 NCPU Linux distribution of messages from SSC

This task covers the messaging communication between the SSC and the ACPU
and FP cores. Currently the NCPU core receives all messages destined for
the ACPU and FP cores coming from the SSC. The NCPU is responsible for
forwarding messages destined for these other cores. This task needs
further study to accurately scope the implementation effort.

2.10 NCPU Linux IP Forwarding Functionality

This task covers the IP forwarding functions that must be supported to
send packets to/from the SSC when packets are received on network
interfaces supported by NCPU Linux. In addition, the Network Address
Translation (NAT) functionality needs to be studied. This task needs
further study to accurately scope the implementation effort.

These requirements might be satisfied using the Linux Netfilter
functionality, which supports the NAT, filtering and cross-interface
forwarding functions typically used in firewalls, NAT boxes, etc.

2.11 Socket communication between NCPU and FP

This task covers the messaging communication between the NCPU Linux kernel
and the FP functionality. The specific messages sent between the NCPU and
FP are defined in sm-tpl-fp/tpl-fp-api.h and cover socket operations such
as open, close, listen, accept, read/write and unbind. The task requires
supporting these messages when the Linux TCP/IP stack has been substituted
for the current OpenBSD-based TCP/IP implementation.

2.12 Virtual Stack communication between NCPU and SSC

This task covers the messaging communication between the SSC and NCPU
Linux kernel specific to virtual stacks. It covers requesting and
obtaining information pertaining to virtual interfaces, adding and
deleting routes and obtaining routing tables, configuring interfaces,
getting packet and network interface statistics, TCP and UDP connections,
etc. Since the NCPU will field these messages and generate the appropriate
replies, this task addresses implementing the code to obtain the
equivalent data from the Linux protocol stack.
The messages and their current implementation are described in
sm-ipm/ipm.[h,c].

There are a number of messages that require a considerable amount of
information to be passed back to the SSC regarding the state of the entire
protocol stack. These include cumulative IP, UDP and TCP statistics, and
UDP and TCP connection tables, with the message sizes ranging from 32K to
800K for statistics. Since this includes information specific to stack
instances under BSD, considerable work may be required to maintain this
information under Linux. Modifications of the messages in this area might
be required. This task needs further study to accurately scope the
implementation effort.

2.13 Virtual Server Support on the Linux NCPU

This task covers the messaging communication between the Virtual Server
software running on the SSC and the NCPU Linux kernel. The Virtual Server
message formats will remain unchanged, so the work covers implementing the
functionality previously added to the BSD protocol stack on the NCPU core
that supported these messages. The development centers on obtaining the
information needed to satisfy requests and responding with appropriate
replies. The Linux implementation must also support those messages that
require explicit notification to the SSC Virtual Server when changes occur
in the networking stack.

The Linux equivalent of the vstack partitioning of the BSD protocol stack,
for separate routing tables, etc., may be implemented using Linux
netfilter functionality. This is an open question needing more detailed
study. This task needs further study to accurately scope the
implementation effort.

2.14 Convert OnStor Packet Descriptors (pkt_desc) to Linux Socket Buffers (skb's)

This task covers replacing the pkt_desc data structure, used to describe
network data passed between the NCPU and the ACPU cores, with standard
Linux socket buffers (skb's).
This appears to be a straightforward replacement, since the two structures
are nearly equivalent, and it allows passing Linux networking buffers to
the ACPU without copying.

The task covers the kernel changes required to modify the skb memory
allocator to use the common mapped shared memory region between the SiByte
processor sockets rather than a generic kernel slab allocator region. A
chain of skb's will be allocated from the common mapped shared memory
region between the NCPU, ACPU and FP, and continued use of zero-copy
networking will be supported. The handoff of buffer ownership to the
destination code via IPC over the shared message queues will continue.

2.15 Convert Linux kernel TCP/IP networking stack to be TPL-API aware

This task covers modifying the Linux networking code to be aware of the
Transport Layer API (tpl-api) interfaces that must be supported to
communicate with the ACPU. This will allow the Linux networking code to
call the appropriate tpl-api functions when changes occur requiring
notification or reception of messages in either direction.

2.16 Convert ACPU NFS/CIFS code to be a Linux kernel module

This task covers modifying the ACPU NFS and CIFS code to become standard
Linux kernel modules, with the extra provision of running on a single
dedicated processor core.

2.17 Convert ACPU NFS code to use Linux Socket Buffers

This task covers modifying the NFS code to use Linux Socket Buffers
(skb's) rather than OnStor pkt_desc's. Since the queues and message
formats will remain unchanged, with the exception of passing 'skb'
pointers in the data messages, the basic assumption of passing a complete
RPC/XDR message chain between the NCPU and the ACPU remains unchanged. A
closer examination of the NFS code shows that it currently handles chains
of buffers and uses only a few fields of the pkt_desc data structures,
which have equivalents in the skb data structure.
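
The field correspondence can be illustrated with a small userspace sketch.
Everything here is an assumption made for illustration: the pkt_desc
layout is not given in this proposal, so pkt_desc_sketch carries only a
chain pointer, a data pointer and a length, and skb_sketch is a stand-in
reduced to the like-named next, data and len members that struct sk_buff
also has. It is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the OnStor pkt_desc; the real layout is not
 * given in this proposal, so these three fields are assumptions. */
struct pkt_desc_sketch {
    struct pkt_desc_sketch *next;  /* next buffer in the message chain */
    unsigned char *data;           /* start of payload in this buffer */
    unsigned int len;              /* payload bytes in this buffer */
};

/* Userspace stand-in for struct sk_buff, reduced to three like-named
 * members (next, data, len). */
struct skb_sketch {
    struct skb_sketch *next;
    unsigned char *data;
    unsigned int len;
};

/* Total payload bytes across a chain of skb stand-ins. */
static unsigned int skb_chain_len(const struct skb_sketch *skb)
{
    unsigned int total = 0;
    for (; skb != NULL; skb = skb->next)
        total += skb->len;
    return total;
}

/* Re-describe a pkt_desc chain as an skb chain without copying payload:
 * only the descriptors change, the data pointers are shared.
 * 'out' must have room for 'max' entries. Returns the number of entries
 * used, or 0 if the chain is empty or longer than 'max'. */
static size_t pkt_desc_to_skb_chain(const struct pkt_desc_sketch *pd,
                                    struct skb_sketch *out, size_t max)
{
    size_t n = 0;
    assert(out != NULL || max == 0);
    for (; pd != NULL; pd = pd->next) {
        if (n == max)
            return 0;                  /* chain does not fit in 'out' */
        out[n].data = pd->data;        /* share the buffer, no copy */
        out[n].len = pd->len;
        out[n].next = NULL;
        if (n > 0)
            out[n - 1].next = &out[n]; /* link the previous entry */
        n++;
    }
    return n;
}
```

Because only pointers and lengths are carried over, code that walks a
message chain, as skb_chain_len() does here, would need little more than a
type change, which is consistent with the expectation that this is a small
conversion.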

2.18 Convert ACPU CIFS code to use Linux Socket Buffers

This task covers modifying the CIFS code to use Linux Socket Buffers
(skb's) rather than OnStor pkt_desc's. Since the queues and message
formats will remain unchanged, with the exception of passing 'skb'
pointers in the data messages, the basic assumption of passing a complete
message chain between the NCPU and the ACPU remains unchanged for CIFS.

2.19 Test modified ACPU NFS code

This task covers testing the modified NFS code running under the Linux
kernel on the ACPU processor core. Since the virtual server functionality
is required by the NFS code, the testing and debugging of the modified
code must be done later in the schedule.

2.20 Test modified ACPU CIFS code

This task covers testing the modified CIFS code running under the Linux
kernel on the ACPU processor core. Since the virtual server functionality
is required by the CIFS code, the testing and debugging of the modified
code must be done later in the schedule.

2.21 NCPU Linux kernel core dumps

This task covers obtaining working kgdb support and kernel crash dumps on
the NCPU. The obtaining of the messaging communication between the Virtual
Server software running on the SSC and the NCPU Linux kernel.