Valgrind-developer notes, re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

JRS 22 Mar 09: re these comments in m_libc* and m_debuglog:

/* IMPORTANT: on Darwin it is essential to use the _nocancel versions
   of syscalls rather than the vanilla version, if a _nocancel version
   is available.  See docs/internals/Darwin-notes.txt for the reason
   why. */

When Valgrind does read/write/open/close etc syscalls for its own
purposes (not for the client), it really is critical to use the
_nocancel versions of syscalls rather than the vanilla versions.  This
holds throughout the entire code base: whenever V does a syscall for
its own purposes, we must use the _nocancel version if it exists.
This matters most in m_libc*, since all of our own-purpose
(non-client) syscalls should get routed through there.
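
As a rough illustration of the rule above (a sketch only, not the
actual m_libc* code): the helper name own_purpose_write is
hypothetical, and the sketch assumes that Darwin's <sys/syscall.h>
exposes SYS_write_nocancel.  On any other platform it simply falls
back to the plain write syscall.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* makes glibc declare syscall() */
#endif
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: a write done for the tool's own purposes.
   Where a _nocancel variant is available (assumed: Darwin, via
   SYS_write_nocancel), use it so that a pending cancellation request
   aimed at the client is not consumed by this internal write. */
long own_purpose_write(int fd, const void *buf, size_t count)
{
#if defined(__APPLE__) && defined(SYS_write_nocancel)
    /* syscall() may draw a deprecation warning on recent macOS. */
    return syscall(SYS_write_nocancel, fd, buf, count);
#else
    return syscall(SYS_write, fd, buf, count);
#endif
}
```

Internal code would then call own_purpose_write(fd, buf, n) wherever
it would otherwise have called write() directly.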

Why?  Because on Darwin, pthread cancellation is done within the
kernel (unlike on Linux, iiuc).  And read/write/open/close and a whole
bunch of other syscalls to do with stream I/O are cancellation points.
So what can happen is: the client informs the kernel that a given
thread is to be cancelled.  Then at the next (eg) VG_(printf) call by
that thread, which leads to a sys_write, the write syscall gets hit by
the cancellation request, and is duly nuked by the kernel.  Of course
from the outside it looks as if the thread had mysteriously
disappeared off the radar for no reason.
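
The cancellation-point behaviour itself can be seen outside Valgrind
with plain pthreads.  In this minimal sketch (all function names are
hypothetical), a thread blocks in read() on an empty pipe, is
cancelled, and the join reports PTHREAD_CANCELED.  It only shows the
user-visible effect; it says nothing about whether the cancellation is
carried out in the kernel or in libc.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

static int demo_pipe[2];

/* Blocks in read() on an empty pipe.  read() is a cancellation point,
   so a pending cancellation request takes effect here. */
static void *blocked_reader(void *arg)
{
    char c;
    ssize_t n = read(demo_pipe[0], &c, 1);
    (void)arg;
    (void)n;
    return NULL;               /* not reached if the thread is cancelled */
}

/* Returns 1 if the reader thread ended by cancellation, 0 if it
   returned normally, -1 on setup failure. */
int reader_was_cancelled(void)
{
    pthread_t t;
    void *result = NULL;

    if (pipe(demo_pipe) != 0)
        return -1;
    if (pthread_create(&t, NULL, blocked_reader, NULL) != 0)
        return -1;

    pthread_cancel(t);         /* request is acted on at read() */
    pthread_join(t, &result);

    close(demo_pipe[0]);
    close(demo_pipe[1]);
    return result == PTHREAD_CANCELED;
}
```

If the read() above were one of Valgrind's own syscalls rather than
the client's, this is the point at which the thread would vanish.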

In short, we need to use _nocancel versions in order to ensure that
cancellation requests only take effect at the places where the client
does a syscall, and not the places where Valgrind does syscalls.

How observed: using the standard pipe-based implementation in
coregrind/m_scheduler/sema.c, none/tests/pth_cancel1 would hang
(compared to succeeding using native Darwin semaphores).  And if the
"pause()" call in said test is turned into a spin ("while (1) ;") then
the entire Valgrind run mysteriously disappears, rather than spinning
as it does when using native Darwin semaphores.

Because the pipe-based semaphore intensively uses sys_read/sys_write,
it is not surprising that it was inadvertently eating up cancellation
requests directed to client threads.  With the abovementioned change
in force, the pipe-based semaphore appears to work correctly.
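
For readers unfamiliar with the idea, a pipe-based semaphore can be
sketched roughly as follows.  This is a simplified, hypothetical
version (the pipe_sema type and its functions are made up for
illustration), not the code in sema.c.  The point to notice is that
every down and up is a read or a write on the pipe, which is why each
one is a potential cancellation point unless _nocancel syscalls are
used.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical pipe-based binary semaphore: one byte sitting in the
   pipe means "available". */
typedef struct {
    int pipe_fds[2];           /* [0] = read end, [1] = write end */
} pipe_sema;

/* Create the pipe and deposit one token, so the semaphore starts out
   available.  Returns 0 on success, -1 on failure. */
int pipe_sema_init(pipe_sema *s)
{
    char token = 'T';
    if (pipe(s->pipe_fds) != 0)
        return -1;
    return write(s->pipe_fds[1], &token, 1) == 1 ? 0 : -1;
}

/* Acquire: block until a token can be read from the pipe. */
int pipe_sema_down(pipe_sema *s)
{
    char token;
    ssize_t r;
    do {
        r = read(s->pipe_fds[0], &token, 1);
    } while (r == -1 && errno == EINTR);
    return r == 1 ? 0 : -1;
}

/* Release: put the token back, waking one blocked reader if any. */
int pipe_sema_up(pipe_sema *s)
{
    char token = 'T';
    ssize_t r;
    do {
        r = write(s->pipe_fds[1], &token, 1);
    } while (r == -1 && errno == EINTR);
    return r == 1 ? 0 : -1;
}
```

A thread calls pipe_sema_down() before entering the protected region
and pipe_sema_up() when leaving it.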


Valgrind-developer notes, things removed from the original MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There was a broken debugstub implementation.  It was removed over several
commits: r9477, which removed most of it, and r9711, r9759, and r10012,
which cleaned up remaining bits.

There was machinery to read function names from Dwarf3 debug info.  But we
already read function names from the symbol tables, so this was duplicated
functionality.  Furthermore, a Darwin-specific hack was required in
storage.c to choose between symbol table names vs. Dwarf3 names.  So this
machinery was removed in r10155.


Valgrind-developer notes, todos re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* m_syswrap/syscall-amd64-darwin.S
  - correct signal mask is not applied during syscall
  - restart-labels are completely bogus

* m_syswrap/syswrap-darwin.c:
  - PRE(sys_posix_spawn) completely ignores signal issues, and
    also ignores the file_actions argument

* env var handling w/ exec on Darwin: is there something odd?  Compare
  "valgrind env" on Darwin and Linux.  On the former there are
  settings VALGRIND_LIB and VALGRIND_LIB_INNER, but not on the
  latter.
  There's a suspicious-looking "#if defined(VGO_darwin)" in
  VG_(env_remove_valgrind_env_stuff).  Maybe related?

* Cleanups: sort wrappers in syswrap-darwin.c and priv_syswrap-darwin.h
  alphabetically.  Also, some aren't properly implemented -- check and
  print warnings

* Cleanups: m_scheduler/sema.c: use pipe implementation
  (but this apparently causes none/tests/pth_cancel1 to hang.
  I have no idea why, despite quite some investigation).

* Cleanups: m_debugstub: move to attic

* syswrap-darwin.c: sys_{f,}chmod_extended: handling of ARG5 is way
  wrong

* Cleanups (Linux,AIX5): bogus launcher-path mangling logic in
  PRE(sys_execve)

* Cleanups (ALL PLATFORMS): m_signals.c: are the _MY_SIGRETURN
  assembly stubs actually necessary for anything?  I don't know.

* Cleanups: check that changes to VG_(stat) and VG_(stat64) have
  not broken 64-bit statting on 32-bit Linux

* Cleanups: #if !HAVE_PROC in m_main (to do with /proc/<pid>/cmdline)

--------

m_main doesn't read symbols for the valgrind exe itself, which is
annoying.  On minimal investigation it seems that the executable isn't
even listed by aspacem.  This is very strange and not in accordance
with the Linux or AIX ports.


m_main: relatedly, Darwin version does not collect/give out
initial debuginfo handles; hence ptrcheck won't work


m_main: Darwin port relies on blocking out big sections of address
space with mmap at startup.  We know from history that this is a bad
idea.  (It's also really slow on 64-bit builds, taking 3--4 seconds.)
Also, startup is not done on the interim startup stack -- why not?


VG_(di_notify_mmap): Linux version is also used for Darwin, and
contains some ifdeffery.  Clean up.


PRE(sys_fork), #ifdeffery


syswrap-generic.c: VG_(init_preopened_fds) is #ifdefd for Darwin


scheduler.c: #ifdeffery in VG_(get_thread_out_of_syscall)


look at notes in coregrind/Makefile.am re Mach RPC interface
definitions.  See if we can get rid of any more stuff now that
m_debugstub is gone.