Introduction
This page collects notes, artifacts, and documentation associated with the work on providing a Catamount-like memory API in THINK.
Stuff
Diagram of Nathan's in-progress kernel with PCT/QK structure (6/8/06)
Notes
Trammell vs. Nathan on PCT startup
In one of our ConfigOS meetings, Barney explained that when the PCT starts, it allocates all available memory from the QK. It then gives a chunk of that memory to new processes and maintains a “logical to logical” address mapping. Is this correct? If so (and this is where my overall naivete comes in), I would have expected to see simple calls (something like alloc() and free()) at the QK API level.
There aren't any malloc- or free-style calls, since the PCT has already been given the entire address space when it started. It is responsible for running a user-level allocator to apportion that memory between the different regions of the application and between instances of the application (for a multi-processor environment such as VNM).
The only thing it can't do is create the page tables, which requires kernel-level assistance. Thus, the regions passed into create_usr_region() have well-known permissions: TEXT is read-only and executable; DATA, HEAP and STACK are read-write without execute.
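As a rough illustration of what "running a user-level allocator" over memory handed over in one piece could look like, here is a minimal bump-allocator sketch. It is not the PCT's actual allocator. The names (sk_pool, sk_pool_init, sk_pool_carve) and the 4096-byte page size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096u  /* assumed page size, not taken from the notes */

/* A pool of memory received in one piece (hypothetical type). */
struct sk_pool {
    uintptr_t next;  /* first unallocated address */
    uintptr_t end;   /* one past the last usable address */
};

void sk_pool_init(struct sk_pool *p, uintptr_t base, size_t len)
{
    assert(p != NULL);
    /* Round the start up so every chunk begins on a page boundary. */
    p->next = (base + SK_PAGE_SIZE - 1) & ~(uintptr_t)(SK_PAGE_SIZE - 1);
    p->end = base + len;
}

/* Carve a contiguous, page-aligned chunk out of the pool.
 * Returns 0 when the pool cannot satisfy the request, so this sketch
 * assumes the pool itself does not start at address 0. */
uintptr_t sk_pool_carve(struct sk_pool *p, size_t size)
{
    assert(p != NULL);
    size_t rounded = (size + SK_PAGE_SIZE - 1) & ~(size_t)(SK_PAGE_SIZE - 1);
    if (rounded < size || p->next > p->end || rounded > p->end - p->next)
        return 0;
    uintptr_t chunk = p->next;
    p->next += rounded;
    return chunk;
}
```

Each region of an application would get one carve, which keeps every region physically contiguous. A real allocator would also need a way to return memory when a process exits, which this sketch leaves out.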
I guess what I really feel like I'm missing is a detailed description of the PCT startup process, as well as a description of the application startup process.
Here's a quick hand-waving explanation:
The PCT is launched by the qk with the entirety of physical memory mapped into its address space. It figures out how much memory is available to the user and sets up its load portal to wait for a load message from yod (or this node's parent in the fanout).
The load message contains the size of the four regions and the virtual addresses to which they expect to be mapped. The PCT allocates the text and data regions out of the bottom of physical memory and creates a portal over them. yod (or the other PCTs in the fanout) sends the contents of the regions, and they are delivered directly into the buffers via the portal.
The PCT then asks the kernel to create a user process and the page tables to cover the four regions at the desired virtual addresses. Since the PCT knows the physical addresses of the buffers, it is able to give that data to the qk in the create_usr_region call.
The PCT then calls the run_process_trap to jump into the text segment of the new process and start it going.
Note that the PCT still has the entirety of user memory mapped into its process, so it can write to the user memory if necessary.
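The sequence above can be sketched as a small simulation: take the sizes and virtual addresses from a load message, lay the four regions out back to back, ask for each to be mapped, and only then start the process. Everything here is hypothetical. sk_load_msg is an invented layout, and sk_create_region and sk_run_process are stand-ins for the real create_usr_region() call and run_process_trap. The portal delivery of text and data is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SK_NUM_REGIONS = 4 };  /* text, data, heap, stack per the notes */

/* Hypothetical load-message layout: desired virtual address and size per region. */
struct sk_load_msg {
    uintptr_t vstart[SK_NUM_REGIONS];
    size_t    size[SK_NUM_REGIONS];
};

/* Counters so the sketch can be checked without a kernel. */
static int sk_regions_created;
static int sk_process_started;

/* Stand-in for the create_usr_region() call; the real one lives in the qk. */
static int sk_create_region(int pid, int region, uintptr_t vstart,
                            size_t size, uintptr_t pct_addr)
{
    (void)pid; (void)region; (void)vstart; (void)pct_addr;
    if (size == 0)
        return -1;  /* treat an empty region as a failure in this sketch */
    sk_regions_created++;
    return 0;
}

/* Stand-in for run_process_trap. */
static void sk_run_process(int pid)
{
    (void)pid;
    sk_process_started = 1;
}

/* Lay the regions out back to back from phys_base, map each, then start. */
int sk_launch(int pid, const struct sk_load_msg *msg, uintptr_t phys_base)
{
    assert(msg != NULL);
    uintptr_t next = phys_base;
    for (int r = 0; r < SK_NUM_REGIONS; r++) {
        if (sk_create_region(pid, r, msg->vstart[r], msg->size[r], next) != 0)
            return -1;  /* do not start a half-mapped process */
        next += msg->size[r];
    }
    sk_run_process(pid);
    return 0;
}
```

The point of the ordering is that the process is started only after all four regions have been mapped successfully.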
In the create_usr_region() method, you listed a 'region' parameter, described as “the number of the region to be initialized.” Just to clarify, is this some sort of ID or is it the type of region (e.g., PCB_RGN_HEAP)?
It's both the type of the region and an ID. To simplify bookkeeping in the kernel, each application only supports four disjoint virtual regions, each of which is internally physically contiguous and each of which has pre-defined permissions.
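Because the region number doubles as the region type, the pre-defined permissions can be pictured as a fixed table indexed by region. The sketch below is only an illustration of that idea. The SK_RGN_* indices stand in for PCB_RGN_TEXT, PCB_RGN_DATA, PCB_RGN_HEAP and PCB_RGN_STACK1 (whose real values are not given in these notes), and the SK_PERM_* bits are an invented encoding.

```c
#include <assert.h>

/* Hypothetical permission bits; the real qk encoding is not shown in these notes. */
enum { SK_PERM_READ = 1, SK_PERM_WRITE = 2, SK_PERM_EXEC = 4 };

/* Hypothetical region indices; the real PCB_RGN_* values are assumptions here. */
enum sk_region {
    SK_RGN_TEXT,
    SK_RGN_DATA,
    SK_RGN_HEAP,
    SK_RGN_STACK,
    SK_RGN_COUNT
};

/* TEXT is read-only and executable; the others are read-write, no execute. */
static const unsigned sk_region_perms[SK_RGN_COUNT] = {
    [SK_RGN_TEXT]  = SK_PERM_READ | SK_PERM_EXEC,
    [SK_RGN_DATA]  = SK_PERM_READ | SK_PERM_WRITE,
    [SK_RGN_HEAP]  = SK_PERM_READ | SK_PERM_WRITE,
    [SK_RGN_STACK] = SK_PERM_READ | SK_PERM_WRITE,
};

/* Look up the fixed permissions for a region number. */
unsigned sk_perms_for(enum sk_region r)
{
    assert((unsigned)r < SK_RGN_COUNT);
    return sk_region_perms[r];
}
```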
Also, the return type of that method is specified as INT32. Is that some sort of result status?
It is, although I'm not sure of all of the failure modes. If the qk isn't able to set up the page tables for the region, it probably indicates that the PCT has lost track of something. That shouldn't happen, since the physical → virtual mapping should be 1:1.
User-level QK Entry points & Usage (via Trammell)
These are the only user-level entry points into the qk, defined in base/events.C:
HANDLER sproc_handle_trap[N_USER_TRAPS] = {
    sysp_udef_handler,               // #0  USER TRAP - NEVER CALLED
    (HANDLER)setuid_trap,            // #1
    (HANDLER)lputs_trap,             // #2  TRAP_LPUTS
    (HANDLER)quit_quantum_trap,      // #3  TRAP_QUIT_QUANTUM
    (HANDLER)PCT_init_proc_trap,     // #4  TRAP_PCT_INIT_PROC
    (HANDLER)PCT_read_proc_trap,     // #5  TRAP_PCT_READ_PROC
    (HANDLER)sysp_udef_handler,      // #6
    (HANDLER)sysp_udef_handler,      // #7
    (HANDLER)sysp_udef_handler,      // #8
    (HANDLER)run_process_trap,       // #9  TRAP_RUN_PROCESS
    (HANDLER)PCT_install_trap,       // #10 TRAP_INSTALL_PCT
    (HANDLER)PCT_init_rgn_trap,      // #11 TRAP_INIT_REGION
    (HANDLER)memlogctl_trap,         // #12 TRAP_MEMLOGCTL
To create a process, the PCT calls PCT_init_proc_trap. To set up memory regions for the process, only PCT_init_rgn_trap is used, which is a thin wrapper on top of the create_usr_region() call:
/* Initializes and maps a region for a user process. Called by the PCT.
 *   process   - process number
 *   region    - the number of the region to be initialized.
 *   vstart    - process virtual address for the region.
 *   size      - size of the region in bytes
 *   pct_vaddr - virtual PCT address malloc'd by the PCT for this region
 */
INT32 create_usr_region(
    PID_TYPE process,
    UINT16   region,
    ADDR     vstart,
    ADDR_LEN size,
    ADDR     pct_vaddr,
    UINT16   privs
);
The regions supported are PCB_RGN_HEAP, PCB_RGN_TEXT, PCB_RGN_DATA, and PCB_RGN_STACK1. Each allocation is contiguous in both virtual and physical memory, and it remains mapped into the PCT's address space in addition to being mapped into the user process.
To launch the newly created process, the PCT calls the run_process_trap to start it going. It can also call quit_quantum_trap when it is done processing to indicate that it wishes to give up the remainder of its timeslice.
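For readers unfamiliar with table-driven trap handling, the sketch below shows the general pattern the handler table suggests: a trap number indexes into an array of function pointers, and unused slots point at a common "undefined" handler. This is a generic illustration, not the qk's dispatch code, which these notes do not show. All names and the handler signature are made up.

```c
#include <assert.h>
#include <stddef.h>

#define SK_N_TRAPS 4  /* hypothetical; the real table is sized by N_USER_TRAPS */

typedef int (*sk_handler)(int arg);

static int sk_undef(int arg)   { (void)arg; return -1; }  /* unused slot */
static int sk_double(int arg)  { return 2 * arg; }        /* made-up handler */
static int sk_add_one(int arg) { return arg + 1; }        /* made-up handler */

static const sk_handler sk_trap_table[SK_N_TRAPS] = {
    sk_undef,    /* #0 never called */
    sk_double,   /* #1 */
    sk_undef,    /* #2 unused */
    sk_add_one,  /* #3 */
};

/* Dispatch trap num; out-of-range numbers fall back to the undefined handler. */
int sk_dispatch(unsigned num, int arg)
{
    sk_handler h = (num < SK_N_TRAPS) ? sk_trap_table[num] : sk_undef;
    assert(h != NULL);
    return h(arg);
}
```

Filling the unused slots with one shared handler means a lookup never has to be checked for a null pointer.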
Trammell vs. Nathan on PCT/Qk interaction
So, does the qk start the PCT automatically at boot time? And, from what you're telling me, the PCT doesn't have to malloc the memory, because the qk automatically gives it everything (except, I assume, the address space that the qk occupies).
That's correct – the qk starts the PCT and gives it all of the memory, minus the part used by the qk.
In our meetings, it was implied that there could be more than one PCT running on a single processor. Yet, that doesn't make sense to me, as there shouldn't be any reason to do that, as far as I can tell. Is there ever a possibility of multiple PCTs running on a single processor? If so, then how does that affect the memory space?
I think what was meant is that it is possible to swap different PCTs for different scheduling algorithms (at boot time), not that there would be multiple simultaneous PCTs running. Running different PCTs has always been a research goal, although one that has never been implemented to the best of my knowledge.
And lastly, when a process starts up, is it correct to say that the PCT has to make four separate calls to create_usr_region(), one for each region type?
Yes, it would make four calls to create_usr_region().
Trammell on Catamount call traces
On Wed, Jun 14, 2006 at 03:19:09PM -0400, Trammell Hudson wrote:
I used my gprof'ed version of the qk to produce a call graph of all
of the call arcs and wrote a quick script to feed these arcs into
dot to produce a graphical visualization of the function calls.
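The script mentioned above is not included on this page. As a rough idea of what feeding call arcs into dot involves, the sketch below turns a list of caller/callee pairs into Graphviz DOT text. The sk_arc type and sk_arcs_to_dot function are hypothetical, and the sketch assumes function names contain no double quotes.

```c
#include <assert.h>
#include <stdio.h>

/* One call arc: caller invoked callee (hypothetical type). */
struct sk_arc {
    const char *caller;
    const char *callee;
};

/* Write a DOT digraph for n arcs into out.
 * Returns the number of characters written, or -1 if cap is too small. */
int sk_arcs_to_dot(const struct sk_arc *arcs, int n, char *out, size_t cap)
{
    assert(out != NULL && (arcs != NULL || n == 0));
    int w = snprintf(out, cap, "digraph calls {\n");
    if (w < 0 || (size_t)w >= cap)
        return -1;
    size_t used = (size_t)w;
    for (int i = 0; i < n; i++) {
        /* One edge per arc, with names quoted so dot accepts them as IDs. */
        w = snprintf(out + used, cap - used, "  \"%s\" -> \"%s\";\n",
                     arcs[i].caller, arcs[i].callee);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        used += (size_t)w;
    }
    w = snprintf(out + used, cap - used, "}\n");
    if (w < 0 || (size_t)w >= cap - used)
        return -1;
    return (int)(used + (size_t)w);
}
```

The resulting text can be saved to a file and rendered with, for example, `dot -Tpng`.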
This might be a little more useful, but not quite as visually appealing. I modified my profile dump parser to output a timeline showing function calls with indentation to show the depth on the call stack.
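The profile dump parser itself is not shown here. A minimal sketch of the idea, assuming the profile can be reduced to a stream of enter/exit events, is to keep a depth counter and indent each entered function by that depth. The sk_event type and sk_print_timeline function are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

enum sk_kind { SK_ENTER, SK_EXIT };

/* One profile event: a function was entered or left (hypothetical format). */
struct sk_event {
    enum sk_kind kind;
    const char  *func;
};

/* Print one line per function entry, indented two spaces per stack level.
 * Returns the deepest nesting seen, or -1 on an exit with no matching entry. */
int sk_print_timeline(const struct sk_event *ev, int n, FILE *out)
{
    assert(out != NULL && (ev != NULL || n == 0));
    int depth = 0, max_depth = 0;
    for (int i = 0; i < n; i++) {
        if (ev[i].kind == SK_ENTER) {
            /* "%*s" with an empty string emits 2 * depth spaces. */
            fprintf(out, "%*s%s\n", 2 * depth, "", ev[i].func);
            depth++;
            if (depth > max_depth)
                max_depth = depth;
        } else {
            if (depth == 0)
                return -1;
            depth--;
        }
    }
    return max_depth;
}
```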
The first attachment is the boot process, including the PCT setup and 1 second's worth of timer interrupts to show the RCA code; the second is a normal-length ping_node; and the last is a VSM ping_node.
I am amazed at the number of function calls and overhead in the beer code to process an incoming message. The calls to malloc() and all of the timer manipulation worry me, too. Perhaps it should just use a regularly scheduled 1 second work queue?