<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Articles tagged linux at The Segfault Garden</title>
  <link rel="alternate" type="text/html"
        href="https://blog.segv.page/tags/linux/"/>
  <link rel="self" type="application/atom+xml"
        href="https://blog.segv.page/tags/linux/feed/"/>
  <updated>2026-06-02T01:40:41Z</updated>
  <id>urn:uuid:9d7de7b3-b357-464b-a963-da196bdd5954</id>

  <author>
    <name>Lu</name>
    <uri>https://blog.segv.page</uri>
    <email>frgmntedflower@linux.com</email>
  </author>

  <entry>
    <title>peachykeen32: Bare-Metal ARM Userspace Programming</title>
    <link rel="alternate" type="text/html" href="https://blog.segv.page/blog/2026/06/01/peachykeen32-bare-metal-ARM-userspace-programming/"/>
    <id>urn:uuid:1a9b8c3d-5e7f-4a62-b0d4-6c8e1f2a9d70</id>
    <updated>2026-06-01T02:00:00Z</updated>
    <category term="c"/><category term="arm"/><category term="linux"/>
    <content type="html">
      <![CDATA[<p>peachykeen32 is a bare-metal ARM32 userspace runtime — no libc, no crt,
no linker scripts. Everything runs directly on top of the Linux kernel
through syscalls. What started as a curiosity about what the minimum
viable userspace looks like turned into a toolkit with a handful of
genuinely useful commands.</p>

<h2 id="the-runtime">The Runtime</h2>

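<p>The entry point is <code class="language-plaintext highlighter-rouge">_start</code>, a hand-written assembly trampoline
that loads the stack pointer with <code class="language-plaintext highlighter-rouge">stack_top</code>, calls <code class="language-plaintext highlighter-rouge">main</code>, and passes
<code class="language-plaintext highlighter-rouge">main</code>’s return value (still in <code class="language-plaintext highlighter-rouge">r0</code>) to the <code class="language-plaintext highlighter-rouge">exit</code> syscall. It does not
need to zero .bss, because the kernel’s ELF loader already maps it zero-filled.
The trampoline is four ARM32 instructions, 16 bytes, typically followed by one
literal-pool word for the <code class="language-plaintext highlighter-rouge">ldr sp, =stack_top</code> pseudo-instruction.</p>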

<pre><code class="language-asm">.globl _start
_start:
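    ldr sp, =stack_top  @ switch to the runtime's own stack
    bl main             @ main's return value comes back in r0
    mov r7, #1          @ syscall number 1 = exit
    svc #0              @ exit(r0)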
</code></pre>

<p>All I/O goes through <code class="language-plaintext highlighter-rouge">svc #0</code> with the syscall number in <code class="language-plaintext highlighter-rouge">r7</code>. The runtime
provides thin wrappers for <code class="language-plaintext highlighter-rouge">read</code>, <code class="language-plaintext highlighter-rouge">write</code>, <code class="language-plaintext highlighter-rouge">exit</code>, <code class="language-plaintext highlighter-rouge">mmap</code>, <code class="language-plaintext highlighter-rouge">open</code>, and
<code class="language-plaintext highlighter-rouge">close</code>. No buffering, no errno — just the raw kernel ABI.</p>
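
<p>With no errno, callers have to decode failures from the return value themselves. A minimal sketch, assuming the wrappers hand back the kernel’s raw <code class="language-plaintext highlighter-rouge">long</code>; the helper names <code class="language-plaintext highlighter-rouge">sys_is_error</code> and <code class="language-plaintext highlighter-rouge">sys_errno</code> are hypothetical and not part of the runtime described here:</p>

<pre><code class="language-c">/* Linux reports failure as a value in [-4095, -1]; anything else is a result. */
static int sys_is_error(long ret) {
    return (unsigned long)ret &gt;= (unsigned long)-4095L;
}

/* Positive errno code for a failed call, or 0 if the call succeeded. */
static int sys_errno(long ret) {
    return sys_is_error(ret) ? (int)-ret : 0;
}
</code></pre>

<p>The range check matters for <code class="language-plaintext highlighter-rouge">mmap</code>: a valid mapping address can have its top bit set and look negative as a signed <code class="language-plaintext highlighter-rouge">long</code>, so testing <code class="language-plaintext highlighter-rouge">ret &lt; 0</code> alone could misreport a success as a failure.</p>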

<h2 id="commands">Commands</h2>

<h3 id="cat"><code class="language-plaintext highlighter-rouge">cat</code></h3>

<p>Reads a file via <code class="language-plaintext highlighter-rouge">open</code>/<code class="language-plaintext highlighter-rouge">read</code>/<code class="language-plaintext highlighter-rouge">write</code> and dumps it to stdout. The buffer
is 4 KiB (one page), allocated with <code class="language-plaintext highlighter-rouge">mmap</code> at startup and reused across
calls. Error handling checks the return value of <code class="language-plaintext highlighter-rouge">open</code>; the excerpt below
shows only that check and leaves out the syscall-based error message.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">cmd_cat</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">path</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">long</span> <span class="n">fd</span> <span class="o">=</span> <span class="n">sys_open</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">fd</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
    <span class="kt">long</span> <span class="n">n</span><span class="p">;</span>
    <span class="k">while</span> <span class="p">((</span><span class="n">n</span> <span class="o">=</span> <span class="n">sys_read</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="mi">4096</span><span class="p">))</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span>
        <span class="n">sys_write</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
    <span class="n">sys_close</span><span class="p">(</span><span class="n">fd</span><span class="p">);</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
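
<p>One thing the loop above glosses over is that <code class="language-plaintext highlighter-rouge">write</code> may accept fewer bytes than requested, for example when stdout is a pipe. A minimal sketch of a retrying helper follows. The name <code class="language-plaintext highlighter-rouge">write_all</code> is hypothetical, and the <code class="language-plaintext highlighter-rouge">sys_write</code> defined here is only a libc-backed stand-in so the sketch builds on a hosted system; the real runtime would use its own wrapper.</p>

<pre><code class="language-c">#include &lt;errno.h&gt;
#include &lt;unistd.h&gt;

/* Stand-in for the runtime's sys_write (assumption): returns the byte
   count, or a negative errno value like the raw syscall does. */
static long sys_write(int fd, const void *buf, unsigned long n) {
    ssize_t r = write(fd, buf, n);
    return r &lt; 0 ? -(long)errno : (long)r;
}

/* Write all n bytes, retrying after short writes. Returns the number of
   bytes written, or the negative error code of the failing call. */
static long write_all(int fd, const char *buf, long n) {
    long done = 0;
    while (done &lt; n) {
        long w = sys_write(fd, buf + done, (unsigned long)(n - done));
        if (w &lt; 0) return w;   /* propagate the kernel's error code */
        if (w == 0) break;     /* no progress: stop instead of spinning */
        done += w;
    }
    return done;
}
</code></pre>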

<h3 id="hexdump"><code class="language-plaintext highlighter-rouge">hexdump</code></h3>

<p>Like <code class="language-plaintext highlighter-rouge">cat</code> but prints a hex+ASCII side-by-side view. Each line shows the
offset, sixteen hex bytes, and the printable characters. The implementation
reuses <code class="language-plaintext highlighter-rouge">cat</code>’s read loop and formats directly into the write buffer to
avoid a second copy.</p>
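
<p>The line format described above can be sketched without any libc calls. This is an illustration under assumptions, not peachykeen32’s actual formatter: the function name <code class="language-plaintext highlighter-rouge">hex_line</code>, the 8-digit offset, and the <code class="language-plaintext highlighter-rouge">|...|</code> delimiters around the ASCII column are all choices made here.</p>

<pre><code class="language-c">#include &lt;stddef.h&gt;

static const char HEX[] = "0123456789abcdef";

/* Format up to 16 bytes as "OOOOOOOO  hh hh ... hh  |ascii|\n" into out.
   out needs room for 78 bytes; no NUL terminator is added.
   Returns the number of bytes written. */
static size_t hex_line(char *out, unsigned long off,
                       const unsigned char *p, size_t n) {
    char *o = out;
    if (n &gt; 16) n = 16;
    /* 8 hex digits of offset, most significant nibble first */
    for (int shift = 28; shift &gt;= 0; shift -= 4)
        *o++ = HEX[(off &gt;&gt; shift) &amp; 0xf];
    *o++ = ' '; *o++ = ' ';
    /* 16 hex columns; a short final line is padded with spaces */
    for (size_t i = 0; i &lt; 16; i++) {
        if (i &lt; n) { *o++ = HEX[p[i] &gt;&gt; 4]; *o++ = HEX[p[i] &amp; 0xf]; }
        else       { *o++ = ' ';            *o++ = ' '; }
        *o++ = ' ';
    }
    *o++ = ' '; *o++ = '|';
    /* printable ASCII as-is, everything else as '.' */
    for (size_t i = 0; i &lt; n; i++)
        *o++ = (p[i] &gt;= 0x20 &amp;&amp; p[i] &lt; 0x7f) ? (char)p[i] : '.';
    *o++ = '|'; *o++ = '\n';
    return (size_t)(o - out);
}
</code></pre>

<p>Because the function writes straight into a caller-supplied buffer and returns a length, the result can go to <code class="language-plaintext highlighter-rouge">write</code> without an intermediate copy.</p>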

<h3 id="sysinfo"><code class="language-plaintext highlighter-rouge">sysinfo</code></h3>

<p>Calls <code class="language-plaintext highlighter-rouge">sys_sysinfo()</code> and prints the result: uptime, total RAM, free RAM,
process count, and load averages. The <code class="language-plaintext highlighter-rouge">sysinfo</code> struct is defined locally
since there’s no <code class="language-plaintext highlighter-rouge">&lt;sys/sysinfo.h&gt;</code>.</p>
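
<p>For reference, here is a sketch of what such a local definition might look like on ARM32, together with two helpers for interpreting the fields. The struct name <code class="language-plaintext highlighter-rouge">k_sysinfo</code> and the helper names are hypothetical; the field order follows the kernel’s <code class="language-plaintext highlighter-rouge">struct sysinfo</code>, written with fixed-width types so the 32-bit layout holds on any build host.</p>

<pre><code class="language-c">#include &lt;stdint.h&gt;

/* ARM32 layout of the kernel's struct sysinfo (64 bytes). */
struct k_sysinfo {
    int32_t  uptime;             /* seconds since boot */
    uint32_t loads[3];           /* 1, 5 and 15 minute load averages */
    uint32_t totalram, freeram, sharedram, bufferram, totalswap, freeswap;
    uint16_t procs, pad;         /* process count, explicit padding */
    uint32_t totalhigh, freehigh;
    uint32_t mem_unit;           /* size of one memory unit in bytes */
    char     _f[8];              /* trailing padding on 32-bit targets */
};
_Static_assert(sizeof(struct k_sysinfo) == 64, "unexpected sysinfo size");

/* loads[] is fixed point with 16 fractional bits; return hundredths,
   so a load of 1.50 comes back as 150. */
static uint32_t load_hundredths(uint32_t load) {
    return (uint32_t)(((uint64_t)load * 100) &gt;&gt; 16);
}

/* RAM and swap fields are counted in units of mem_unit bytes. */
static uint64_t ram_bytes(uint32_t units, uint32_t mem_unit) {
    return (uint64_t)units * mem_unit;
}
</code></pre>

<p>Getting the trailing padding right matters: the kernel writes the full structure, so a definition that is too short would let the syscall write past the end of the caller’s buffer.</p>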

<h3 id="ls"><code class="language-plaintext highlighter-rouge">ls</code></h3>

<p>Reads a directory via <code class="language-plaintext highlighter-rouge">open</code> (<code class="language-plaintext highlighter-rouge">O_RDONLY|O_DIRECTORY</code>) and <code class="language-plaintext highlighter-rouge">getdents64</code>.
The <code class="language-plaintext highlighter-rouge">struct linux_dirent64</code> is defined manually; the command iterates
entries, skips <code class="language-plaintext highlighter-rouge">.</code> and <code class="language-plaintext highlighter-rouge">..</code>, and prints each name. This one touches more
of the kernel ABI than the others — <code class="language-plaintext highlighter-rouge">getdents64</code> is a less common syscall
that most libc-free experiments overlook.</p>
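
<p>The iteration over the buffer filled by <code class="language-plaintext highlighter-rouge">getdents64</code> might look roughly like this. It is a sketch under assumptions: <code class="language-plaintext highlighter-rouge">collect_names</code> and <code class="language-plaintext highlighter-rouge">is_dot_entry</code> are hypothetical names, and the function gathers name pointers instead of printing them so that it stays independent of the runtime’s output wrappers.</p>

<pre><code class="language-c">#include &lt;stdint.h&gt;

/* Record layout returned by getdents64. */
struct linux_dirent64 {
    uint64_t       d_ino;     /* inode number */
    int64_t        d_off;     /* offset to the next record */
    unsigned short d_reclen;  /* length of this record in bytes */
    unsigned char  d_type;    /* file type (DT_* values) */
    char           d_name[];  /* NUL-terminated file name */
};

static int is_dot_entry(const char *s) {
    return s[0] == '.' &amp;&amp;
           (s[1] == '\0' || (s[1] == '.' &amp;&amp; s[2] == '\0'));
}

/* Walk the nread bytes that getdents64 placed in buf (which must be
   8-byte aligned) and store up to max name pointers, skipping "." and
   "..". Returns the number of names stored. */
static long collect_names(char *buf, long nread,
                          const char **names, long max) {
    long count = 0;
    for (long pos = 0; pos &lt; nread &amp;&amp; count &lt; max; ) {
        struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
        if (d-&gt;d_reclen == 0) break;   /* malformed record: stop */
        if (!is_dot_entry(d-&gt;d_name))
            names[count++] = d-&gt;d_name;
        pos += d-&gt;d_reclen;            /* records are variable length */
    }
    return count;
}
</code></pre>

<p>Records vary in length, so the walk has to advance by <code class="language-plaintext highlighter-rouge">d_reclen</code> and not by <code class="language-plaintext highlighter-rouge">sizeof(struct linux_dirent64)</code>.</p>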

<h2 id="why-this-works">Why This Works</h2>

<p>Each command is a self-contained demonstration of a specific kernel
interface exercised from a minimal runtime. Together they show that a
useful userspace — file I/O, directory traversal, system introspection —
needs nothing more than the syscall interface and a willingness to define
your own struct layouts. The entire runtime is ~300 lines of C and asm,
and each command is under 50 lines. There is no libc startup code, no
dynamic linker, and no PT_INTERP segment.</p>

]]>
    </content>
  </entry>
    
  <entry>
    <title>NetMoon: Raw Sockets and Network Monitoring</title>
    <link rel="alternate" type="text/html" href="https://blog.segv.page/blog/2026/06/01/NetMoon-raw-sockets-and-network-monitoring/"/>
    <id>urn:uuid:3f7b2e9a-1c5d-4a80-b6e3-8d0f9c4a1e57</id>
    <updated>2026-06-01T02:00:00Z</updated>
    <category term="c"/><category term="networking"/><category term="linux"/>
    <content type="html">
      <![CDATA[<p>NetMoon is a network monitoring tool built on Linux raw sockets. It
captures packets, parses TCP headers, and presents connection-level
metrics in real time. The implementation is about 700 lines of C and
demonstrates how far you can get with nothing more than a well-chosen
syscall and a couple of struct definitions.</p>

<h2 id="raw-socket-setup">Raw Socket Setup</h2>

<p>Raw sockets on Linux require <code class="language-plaintext highlighter-rouge">CAP_NET_RAW</code> (or root). The call is
straightforward: <code class="language-plaintext highlighter-rouge">socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP))</code>
delivers every incoming IPv4 frame, on all interfaces until the socket is bound to one.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="n">sock</span> <span class="o">=</span> <span class="n">socket</span><span class="p">(</span><span class="n">AF_PACKET</span><span class="p">,</span> <span class="n">SOCK_RAW</span><span class="p">,</span> <span class="n">htons</span><span class="p">(</span><span class="n">ETH_P_IP</span><span class="p">));</span>
<span class="k">if</span> <span class="p">(</span><span class="n">sock</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">perror</span><span class="p">(</span><span class="s">"socket"</span><span class="p">);</span>
    <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The socket is placed into promiscuous mode via <code class="language-plaintext highlighter-rouge">PACKET_ADD_MEMBERSHIP</code>
with <code class="language-plaintext highlighter-rouge">PACKET_MR_PROMISC</code>. This tells the NIC to forward all frames, not
just those addressed to our MAC, so we see traffic from other hosts on the
same broadcast domain.</p>
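
<p>The membership request can be sketched as follows. The helper name <code class="language-plaintext highlighter-rouge">enable_promisc</code> and its interface-name parameter are assumptions made for illustration; the socket option, the <code class="language-plaintext highlighter-rouge">struct packet_mreq</code> fields, and <code class="language-plaintext highlighter-rouge">if_nametoindex</code> are the standard Linux and libc interfaces.</p>

<pre><code class="language-c">#include &lt;string.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;net/if.h&gt;
#include &lt;linux/if_packet.h&gt;

/* Ask the kernel to put the named interface into promiscuous mode for as
   long as this packet socket stays open. Returns 0 on success, -1 on error. */
static int enable_promisc(int sock, const char *ifname) {
    struct packet_mreq mr;
    memset(&amp;mr, 0, sizeof mr);

    mr.mr_ifindex = (int)if_nametoindex(ifname);
    if (mr.mr_ifindex == 0)
        return -1;                      /* no such interface */
    mr.mr_type = PACKET_MR_PROMISC;

    return setsockopt(sock, SOL_PACKET, PACKET_ADD_MEMBERSHIP,
                      &amp;mr, sizeof mr);
}
</code></pre>

<p>Requesting promiscuous mode through a membership ties it to the socket’s lifetime: the kernel drops the membership when the socket is closed, so a crashed monitor does not leave the interface in promiscuous mode.</p>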

<h2 id="packet-capture-loop">Packet Capture Loop</h2>

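<p>The capture thread reads from the raw socket into a fixed 64 KiB buffer and
hands the buffer to a parser running in a second thread. The split keeps
the capture loop short: if the parser still lags, the socket’s receive buffer
fills and the kernel drops packets there, before they ever reach userspace.
The excerpt below is simplified to call <code class="language-plaintext highlighter-rouge">parse_frame</code> inline.</p>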

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(;;)</span> <span class="p">{</span>
    <span class="kt">ssize_t</span> <span class="n">n</span> <span class="o">=</span> <span class="n">recvfrom</span><span class="p">(</span><span class="n">sock</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">buf</span><span class="p">),</span> <span class="mi">0</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="k">break</span><span class="p">;</span>
    <span class="n">parse_frame</span><span class="p">(</span><span class="n">buf</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="tcp-header-parsing">TCP Header Parsing</h2>

<p><code class="language-plaintext highlighter-rouge">parse_frame</code> walks the protocol stack: Ethernet header → IP header → TCP
header. Each step checks the relevant length field before advancing.</p>

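<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">parse_frame</span><span class="p">(</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="n">buf</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">len</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// need a full Ethernet header plus a minimal IPv4 header</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">len</span> <span class="o">&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">ethhdr</span><span class="p">)</span> <span class="o">+</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">iphdr</span><span class="p">))</span> <span class="k">return</span><span class="p">;</span>
    <span class="k">struct</span> <span class="n">ethhdr</span> <span class="o">*</span><span class="n">eth</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">ethhdr</span> <span class="o">*</span><span class="p">)</span><span class="n">buf</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">ntohs</span><span class="p">(</span><span class="n">eth</span><span class="o">-&gt;</span><span class="n">h_proto</span><span class="p">)</span> <span class="o">!=</span> <span class="n">ETH_P_IP</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>

    <span class="k">struct</span> <span class="n">iphdr</span> <span class="o">*</span><span class="n">ip</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">iphdr</span> <span class="o">*</span><span class="p">)(</span><span class="n">buf</span> <span class="o">+</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">ethhdr</span><span class="p">));</span>
    <span class="kt">size_t</span> <span class="n">ip_hlen</span> <span class="o">=</span> <span class="n">ip</span><span class="o">-&gt;</span><span class="n">ihl</span> <span class="o">*</span> <span class="mi">4</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">ip_hlen</span> <span class="o">&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">iphdr</span><span class="p">))</span> <span class="k">return</span><span class="p">;</span>  <span class="c1">// IHL below 5 is malformed</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">ip</span><span class="o">-&gt;</span><span class="n">protocol</span> <span class="o">!=</span> <span class="n">IPPROTO_TCP</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">len</span> <span class="o">&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">ethhdr</span><span class="p">)</span> <span class="o">+</span> <span class="n">ip_hlen</span> <span class="o">+</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">tcphdr</span><span class="p">))</span> <span class="k">return</span><span class="p">;</span>

    <span class="k">struct</span> <span class="n">tcphdr</span> <span class="o">*</span><span class="n">tcp</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">tcphdr</span> <span class="o">*</span><span class="p">)((</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="p">)</span><span class="n">ip</span> <span class="o">+</span> <span class="n">ip_hlen</span><span class="p">);</span>
    <span class="c1">// extract src_port, dst_port, seq, ack, flags</span>
    <span class="c1">// update connection table</span>
<span class="p">}</span>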
</code></pre></div></div>

<h2 id="connection-tracking">Connection Tracking</h2>

<p>The parser maintains a hash table of active connections keyed by the 4-tuple
<code class="language-plaintext highlighter-rouge">(src_ip, src_port, dst_ip, dst_port)</code>. Each entry tracks byte counts,
packet counts, the current TCP state (from flags), and a rough RTT measured
from SYN/SYN-ACK timing. Expired entries — those with no activity for 60
seconds — are evicted on every tenth iteration to keep the table bounded.</p>
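
<p>As an illustration of keying a table by the 4-tuple, here is a small sketch. The <code class="language-plaintext highlighter-rouge">conn_key</code> struct, the function names, and the choice of 32-bit FNV-1a as the hash are all assumptions made here; the post does not say which hash function NetMoon uses.</p>

<pre><code class="language-c">#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Hypothetical key type: addresses and ports as parsed from the headers. */
struct conn_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* 32-bit FNV-1a over the key, fed byte by byte so that struct padding and
   host byte order never influence the result. */
static uint32_t conn_hash(const struct conn_key *k) {
    const uint8_t bytes[12] = {
        (uint8_t)(k-&gt;src_ip &gt;&gt; 24), (uint8_t)(k-&gt;src_ip &gt;&gt; 16),
        (uint8_t)(k-&gt;src_ip &gt;&gt; 8),  (uint8_t)k-&gt;src_ip,
        (uint8_t)(k-&gt;dst_ip &gt;&gt; 24), (uint8_t)(k-&gt;dst_ip &gt;&gt; 16),
        (uint8_t)(k-&gt;dst_ip &gt;&gt; 8),  (uint8_t)k-&gt;dst_ip,
        (uint8_t)(k-&gt;src_port &gt;&gt; 8), (uint8_t)k-&gt;src_port,
        (uint8_t)(k-&gt;dst_port &gt;&gt; 8), (uint8_t)k-&gt;dst_port,
    };
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i &lt; sizeof bytes; i++) {
        h ^= bytes[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Map a key to a bucket index in a table of nbuckets slots. */
static size_t conn_bucket(const struct conn_key *k, size_t nbuckets) {
    return conn_hash(k) % nbuckets;
}
</code></pre>

<p>Note that a key built this way is directional: packets flowing the other way carry the source and destination swapped and land in a different bucket, so a tracker that wants one entry per connection has to order the two endpoints consistently before hashing.</p>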

<h2 id="real-time-display">Real-Time Display</h2>

<p>A curses-based UI refreshes the connection table once per second, printing
per-connection bandwidth as a bar chart and flags as human-readable state:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>192.168.1.20:44012 → 10.0.0.1:443  (ESTABLISHED)  1.2 MB   ████████░░
192.168.1.20:44013 → 10.0.0.1:443  (ESTABLISHED)  340 kB   ██░░░░░░░░
192.168.1.30:22   → 10.0.0.2:53041 (ESTABLISHED)  4.1 MB   ██████████
</code></pre></div></div>

<p>Even at roughly 60% complete, packet capture with TCP header parsing already
gives a useful picture of what crosses the wire. The missing pieces (reassembly,
deeper protocol dissection, and a filter language) are natural extensions once
the core loop is solid.</p>

]]>
    </content>
  </entry>
    

</feed>
