      1 -*-org-*-
      2 * TODO
      3 ** Keep exit code of traced process
      4    See https://bugzilla.redhat.com/show_bug.cgi?id=105371 for details.
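
A minimal sketch of the idea, not ltrace code: once the traced process terminates, the tracer could derive its own exit code from the waitpid() status. The helper name tracer_exit_code and the 128+signal convention (borrowed from shells) are assumptions made for illustration.

```python
import os


def tracer_exit_code(status: int) -> int:
    """Map a waitpid() status of the traced process to an exit code for the tracer."""
    if os.WIFEXITED(status):
        # Normal termination: pass the child's exit code through unchanged.
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # Assumption: follow the shell convention of 128 + signal number.
        return 128 + os.WTERMSIG(status)
    raise ValueError("process has not terminated")
```

The tracer would call this on the status of the last traced process to exit and hand the result to exit().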
      5 
      6 ** Automatic prototype discovery:
      7 *** Use debuginfo if available
      8     Alternatively, use debuginfo to generate a config file.
      9 *** Mangled identifiers contain partial prototypes themselves
     10     They don't contain return type info, which can change the
     11     parameter passing convention.  We could use it and hope for the
     12     best.  Also, they don't include the hidden this pointer, which may
     13     be present.
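
To illustrate what a mangled name does and does not carry, here is a rough sketch that decodes a tiny subset of Itanium C++ ABI names of the form _Z<len><name><types>. It handles only a few builtin types plus the P (pointer) and K (const) qualifiers; the function name partial_prototype and the choice of subset are assumptions for illustration. No return type comes out of it, because none is encoded for ordinary functions.

```python
# Itanium C++ ABI codes for a few builtin types (a deliberately tiny subset).
BUILTINS = {
    "v": "void", "c": "char", "i": "int", "j": "unsigned int",
    "l": "long", "m": "unsigned long", "f": "float", "d": "double",
}


def partial_prototype(mangled):
    """Return (name, [parameter types]) for names shaped like _Z<len><name><types>."""
    if not mangled.startswith("_Z"):
        raise ValueError("not an Itanium-mangled name")
    pos = 2
    digits = ""
    while pos < len(mangled) and mangled[pos].isdigit():
        digits += mangled[pos]
        pos += 1
    if not digits:
        raise ValueError("nested and special names are not handled in this sketch")
    name = mangled[pos:pos + int(digits)]
    pos += int(digits)

    params = []
    suffix = ""  # qualifiers collected for the next builtin type
    while pos < len(mangled):
        code = mangled[pos]
        pos += 1
        if code == "P":
            suffix = "*" + suffix
        elif code == "K":
            suffix = " const" + suffix
        elif code in BUILTINS:
            params.append(BUILTINS[code] + suffix)
            suffix = ""
        else:
            raise ValueError("unsupported type code: " + code)
    if params == ["void"]:
        params = []  # a lone 'v' encodes an empty parameter list
    return name, params
```

For example, partial_prototype("_Z3fooid") yields the name foo with parameters int and double, and nothing about what foo returns.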
     14 ** Automatically update list of syscalls?
     15 ** More operating systems (Solaris?)
     16 ** Get rid of EVENT_ARCH_SYSCALL and EVENT_ARCH_SYSRET
     17 ** Implement displaced tracing
     18    A technique used in GDB (and in uprobes, I believe), whereby the
     19    instruction under the breakpoint is copied somewhere else, and
     20    followed by a jump back to the original place.  When the
     21    breakpoint hits, the IP is moved to the displaced instruction, and
     22    the process is continued.  This avoids all the fuss with
     23    single-stepping and re-enabling the breakpoint.
     24 ** Create different ltrace processes to trace different children
     25 ** Config file syntax
     26 *** mark some symbols as exported
     27     For PLT hits, only exported prototypes would be considered.  For
     28     symtab entry point hits, all would be.
     29 
     30 *** named arguments
     31     This would be useful for replacing the arg1, elt2 etc.
     32 
     33 *** parameter pack improvements
     34     The above format tweaks require that packs that expand to no types
     35     at all be supported.  If this works, then it should be relatively
     36     painless to implement conditionals:
     37 
     38     | void ptrace(REQ=enum(PTRACE_TRACEME=0,...),
     39     |             if[REQ==0](pack(),pack(pid_t, void*, void *)))
     40 
     41     This is of course dangerously close to a programming language, and
     42     I think ltrace should be careful to stay as simple as possible.
     43     (We can hook into Lua, or TinyScheme, or some such if we want more
     44     general scripting capabilities.  Implementing something ad-hoc is
     45     undesirable.)  But the above can be nicely expressed by pattern
     46     matching:
     47 
     48     | void ptrace(REQ=enum[int](...)):
     49     |   [REQ==0] => ()
     50     |   [REQ==1 or REQ==2] => (pid_t, void*)
     51     |   [true] => (pid_t, void*, void*);
     52 
     53     Or:
     54 
     55     | int open(string, FLAGS=flags[int](O_RDONLY=00,...,O_CREAT=0100,...)):
     56     |   [(FLAGS & 0100) != 0] => (flags[int](S_IRWXU,...))
     57 
     58     This would still require pretty complete expression evaluation.
     59     _Including_ pointer dereferences and such.  And e.g. in accept, we
     60     need subtraction:
     61 
     62     | int accept(int, +struct(short, +array(hex(char), X-2))*, (X=uint)*);
     63 
     64     Perhaps we should hook to something after all.
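
The pattern-matching form above amounts to an ordered list of guarded clauses, where the first guard that holds selects the parameter pack. A small sketch of that evaluation follows. The names select_pack, PTRACE_CLAUSES and OPEN_CLAUSES are hypothetical, and the guards are plain Python callables standing in for whatever expression language the config syntax would end up with.

```python
# Each clause is (guard, pack); packs are tuples of type names and may be empty.
PTRACE_CLAUSES = [
    (lambda req: req == 0, ()),
    (lambda req: req in (1, 2), ("pid_t", "void*")),
    (lambda req: True, ("pid_t", "void*", "void*")),
]

O_CREAT = 0o100  # value taken from the open() example above

OPEN_CLAUSES = [
    (lambda flags: (flags & O_CREAT) != 0, ("flags[int](S_IRWXU,...)",)),
]


def select_pack(clauses, value):
    """Return the pack of the first clause whose guard accepts value."""
    for guard, pack in clauses:
        if guard(value):
            return pack
    # No clause matched: expand to no types at all.
    return ()
```

Note that the fallback relies on packs that expand to nothing, which is the same prerequisite the text mentions.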
     65 
     66 *** system call error returns
     67 
     68     This is closely related to the above.  Take the following syscall
     69     prototype:
     70 
     71     | long read(int,+string0,ulong);
     72 
     73     string0 means the same as string(array(char, zero(retval))*).  But
     74     a negative return value from read signifies an errno, and zero
     75     takes it at face value as a (suspicious) array length:
     76 
     77     | read@SYS(3 <no return ...>
     78     | error: maximum array length seems negative
     79     | , "\n\003\224\003\n", 4096)                  = -11
     80 
     81     Ideally we would do what strace does, e.g.:
     82 
     83     | read@SYS(3, 0x12345678, 4096)                = -EAGAIN
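
A hedged sketch of how the return value could be classified before any lens looks at it. It assumes the Linux convention that raw syscall returns in the range -4095..-1 are negated errno values. The helper names are hypothetical, and the symbolic names come from Python's errno module, not from ltrace.

```python
import errno

MAX_ERRNO = 4095  # assumption: Linux convention, -4095..-1 are error returns


def is_error(retval):
    """True if a raw syscall return value encodes an errno."""
    return -MAX_ERRNO <= retval < 0


def format_sysret(retval):
    """Render an error return symbolically, strace-style; pass others through."""
    if is_error(retval):
        code = -retval
        return "-" + errno.errorcode.get(code, str(code))
    return str(retval)


def array_length(retval):
    """Length a zero(retval) lens could use: no elements on error."""
    return retval if retval > 0 else 0
```

With this split, the buffer argument of a failed read would be shown as a plain address or an empty string, and the error would be named only in the return value.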
     84 
     85 *** errno tracking
     86     Some calls result in setting errno.  Somehow mark those, and on
     87     failure, show errno.  System calls return errno as a negative
     88     value (see the previous point).
     89 
     90 *** second conversions?
     91     This definitely calls for some general scripting.  The goal is to
     92     have seconds in adjtimex calls show as e.g. 10s, 1m15s or some
     93     such.
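
As a sketch of the intended output only (how such a conversion would be named or attached in the config syntax is left open), a duration formatter could look like this. The function name format_seconds and the hour component are assumptions that go slightly beyond the examples above.

```python
def format_seconds(total):
    """Format a whole number of seconds compactly, e.g. 10s or 1m15s."""
    sign = "-" if total < 0 else ""
    total = abs(total)
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    # Always emit something, so that zero renders as "0s".
    if seconds or not parts:
        parts.append(f"{seconds}s")
    return sign + "".join(parts)
```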
     94 
     95 *** format should take arguments like string does
     96     Format should take value argument describing the value that should
     97     be analyzed.  The following rewriting rules would then apply:
     98 
     99     | format       | format(array(char, zero)*) |
    100     | format(LENS) | X=LENS, format[X]          |
    101 
    102     The latter expanded form would be canonical.
    103 
    104     This depends on named arguments and parameter pack improvements
    105     (we need to be able to construct parameter packs that expand to
    106     nothing).
    107 
    108 *** More fine-tuned control of right arguments
    109     Combination of named arguments and some extensions could take care
    110     of that:
    111 
    112     | void func(X=hide(int*), long*, +pack(X)); |
    113 
    114     This would show long* as input argument (i.e. the function could
    115     mangle it), and later show the pre-fetched X.  The "pack" syntax is
    116     utterly undeveloped as of now.  The general idea is to produce
    117     arguments that expand to some mix of types and values.  But maybe
    118     all we need is something like
    119 
    120     | void func(out int*, long*); |
    121 
    122     ltrace would know that out/inout/in arguments are given in the
    123     right order, but left pass should display in and inout arguments
    124     only, and right pass then out and inout.  + would be
    125     backward-compatible syntactic sugar, expanded like so:
    126 
    127     | void func(int*, int*, +long*, long*);              |
    128     | void func(in int*, in int*, out long*, out long*); |
    129 
    130     This is useful in particular for:
    131 
    132     | ulong mbsrtowcs(+wstring3_t, string*, ulong, addr); |
    133     | ulong wcsrtombs(+string3, wstring_t*, ulong, addr); |
    134 
    135     Where we would like to render arg2 on the way in, and arg1 on the
    136     way out.
    137 
    138     But sometimes we may want to see a different type on the way in and
    139     on the way out.  E.g. in asprintf, what's interesting on the way in
    140     is the address, but on the way out we want to see buffer contents.
    141     Does something like the following make sense?
    142 
    143     | void func(X=void*, long*, out string(X)); |
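
The in/out/inout idea can be sketched as two filters over a direction-tagged parameter list, with + expanded first. This is an illustration under the assumption that + marks its own parameter and every later one as out, which is what the expansion example above shows. The function names are hypothetical.

```python
def expand_plus(params):
    """Expand '+' sugar: the marked parameter and all later ones become 'out'."""
    direction = "in"
    expanded = []
    for param in params:
        if param.startswith("+"):
            direction = "out"
            param = param[1:]
        expanded.append((direction, param))
    return expanded


def left_pass(params):
    """Types to display on function entry."""
    return [ptype for direction, ptype in params if direction in ("in", "inout")]


def right_pass(params):
    """Types to display on function return."""
    return [ptype for direction, ptype in params if direction in ("out", "inout")]
```

With explicit tags the order of display no longer has to follow the order of declaration, which is what the mbsrtowcs and wcsrtombs prototypes need.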
    144 
    145 ** Support for functions that never return
    146    This would be useful for __cxa_throw, presumably also for longjmp
    147    (do we handle that at all?) and perhaps a handful of others.
    148 
    149 ** Support flag fields
    150    enum-like syntax, except disjunction of several values is assumed.
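
A minimal sketch of how such a field could be rendered, assuming a name-to-bit table. O_RDONLY and O_CREAT use the values from the open() example earlier in this file; O_WRONLY and O_RDWR are assumed Linux values. The function name format_flags is hypothetical.

```python
OPEN_FLAGS = {"O_RDONLY": 0o0, "O_WRONLY": 0o1, "O_RDWR": 0o2, "O_CREAT": 0o100}


def format_flags(value, names):
    """Render value as a '|'-joined disjunction of the named bits."""
    parts = []
    remaining = value
    for name, bit in names.items():
        if bit != 0 and (remaining & bit) == bit:
            parts.append(name)
            remaining &= ~bit
    if remaining:
        # Bits with no known name are shown in C-style octal.
        parts.append("0%o" % remaining)
    if not parts:
        # Nothing set: use the name of a zero-valued flag, if one exists.
        zero_names = [name for name, bit in names.items() if bit == 0]
        parts.append(zero_names[0] if zero_names else "0")
    return "|".join(parts)
```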
    151 ** Support long long
    152    We currently can't define time_t on 32bit machines.  That means we
    153    can't describe a range of time-related functions.
    154 
    155 ** Support signed char, unsigned char, char
    156    Also, don't format it as a character by default, string lens can do
    157    it.  Perhaps introduce byte and ubyte and leave 'char' as alias of
    158    one of those with string lens applied by default.
    159 
    160 ** Support fixed-width types
    161    Really we should keep everything as {u,}int{8,16,32,64} internally,
    162    and have long, short and others be translated to one of those
    163    according to architecture rules.  Maybe this could be achieved by a
    164    per-arch config file with typedefs such as:
    165 
    166    | typedef ulong = uint64_t; |
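
A sketch of what such per-arch tables would boil down to, keyed here by data model instead of architecture name. The table contents cover only the common LP64 and ILP32 models and are an assumption for illustration, as are the names ARCH_TYPEDEFS, resolve and byte_size.

```python
# Hypothetical per-data-model typedef tables mapping C names to fixed-width types.
ARCH_TYPEDEFS = {
    "lp64": {"short": "int16", "int": "int32", "long": "int64", "ulong": "uint64"},
    "ilp32": {"short": "int16", "int": "int32", "long": "int32", "ulong": "uint32"},
}


def resolve(type_name, data_model):
    """Translate a C type name to a fixed-width type; pass fixed-width names through."""
    return ARCH_TYPEDEFS[data_model].get(type_name, type_name)


def byte_size(fixed_type):
    """Size in bytes of a fixed-width type name such as 'uint32'."""
    bits = int("".join(ch for ch in fixed_type if ch.isdigit()))
    return bits // 8
```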
    167 
    168 ** Support for ARM/AARCH64 types
    169    - ARM and AARCH64 both support half-precision floating point
    170      - there are two different half-precision formats, IEEE 754-2008
    171        and "alternative".  Both have 10 bits of mantissa and 5 bits of
    172        exponent, and differ only in how exponent==0x1F is handled.  In
    173        IEEE format, we get NaN's and infinities; in alternative
    174        format, this encodes the normalized value (-1)^S * 2^16 * (1.mant)
    175      - The FPCR.AHP bit of the Floating-Point Control Register (FPCR)
    176        selects the half-precision format where applicable.
    177    - AARCH64 supports fixed-point interpretation of {,double}words
    178      - e.g. fixed(int, X) (int interpreted as a fixed-point number
    179        with X binary digits of fraction).
    180    - AARCH64 supports 128-bit quad words in SIMD
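
The difference between the two half-precision formats, and the fixed-point interpretation, can be sketched as follows. This assumes the usual 1/5/10 bit layout with an exponent bias of 15. The function names decode_half and decode_fixed are hypothetical, and decode_fixed assumes the raw value has already been sign-extended.

```python
import math


def decode_half(bits, alternative=False):
    """Decode a 16-bit half-precision value (IEEE 754-2008 or ARM alternative)."""
    sign = -1.0 if bits & 0x8000 else 1.0
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    if exponent == 0:
        # Zero and subnormals: no implicit leading one.
        return sign * 2.0 ** -14 * (mantissa / 1024)
    if exponent == 0x1F and not alternative:
        # IEEE format reserves the top exponent for infinities and NaNs.
        return sign * math.inf if mantissa == 0 else math.nan
    # Normalized value; in the alternative format this also covers exponent 0x1F.
    return sign * 2.0 ** (exponent - 15) * (1 + mantissa / 1024)


def decode_fixed(raw, fraction_bits):
    """Interpret an integer as fixed-point with fraction_bits binary fraction digits."""
    return raw / (1 << fraction_bits)
```

A lens for these types would need to know the FPCR.AHP setting of the traced process to choose between the two decodings.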
    181 
    182 ** Some more functions in vect might be made to take const*
    183    Or even marked __attribute__((pure)).
    184 
    185 ** pretty printer support
    186    GDB supports Python pretty printers.  We might want to hook this in
    187    and use it to format certain types.
    188 
    189 ** support new Linux kernel features
    190    - PTRACE_SEIZE
    191    - /proc/PID/map_files/* (but only root seems to be able to read
    192      this as of now)
    193 
    194 * BUGS
    195 ** After a clone(), syscalls may be seen as sysrets on s390 (see trace.c:syscall_p())
    196