@ivg ivg commented Feb 27, 2018

Now we have two approaches to taint analysis:

  • The new Taint Analysis Framework
  • The legacy Taint Propagation Framework

While the former is more precise and versatile than the latter, we
still have a bunch of legacy software that we want to support, e.g.,
Saluki and our IDA Pro integration still rely on the old
framework. We may update the downstream tools to use the new
Taint Analysis Framework, but right now we don't have enough resources
to perform such a task. And I doubt that in the case of Saluki this would
make any sense.

The primus-propagate-taint plugin provides a compatibility layer that
allows legacy tools to benefit from Primus without even knowing
anything about it. There are some trade-offs, starting from the
semantics of the propagation and ending with the desire to minimize
breaking changes in the bap command-line interface. They are explained in
detail in the man page of the plugin (provided below for
reference):

DESCRIPTION
       This plugin implements a compatibility layer between the new Primus
       Taint Analysis Framework and the old taint propagation framework (the
       propagate-taint plugin). The new framework uses the
       publisher-subscriber pattern, provides sanitization operations, and
       tracks taint liveness, which enables more conventional and online
       taint analysis. However, it represents taints as abstract objects
       associated with computations (values), while the old taint propagation
       framework uses a pipeline approach, with taints represented as
       attributes attached to program terms. Since the new representation of
       taints is much more precise and there is no bijection between terms and
       values, this layer will lose information due to this impedance
       mismatch. The trade-offs of the translation are described below. New
       analyses, if possible, should rely on the new framework.

       The translation is achieved by mapping the tainted-ptr and tainted-reg
       attributes to corresponding taint introduction operations of the Primus
       Taint Analysis Framework, and by reflecting the taint state of the
       analysis into the tainted-regs and tainted-ptrs attributes. Both steps
       are optional and can be enabled or disabled individually.

       Since an attribute is attached to the whole term, not to an individual
       expression or value, we need a rule that prescribes how terms map to
       values. If a term is marked as a term that introduces a taint, then we
       assume that a value computed in this term references the tainted
       object either directly (in case of tainted-reg) or indirectly (in case
       of tainted-ptr). We always taint the value contained in the left-hand
       side of a definition. In addition, we also try to taint values on the
       right-hand side. If there is a load or store operation and the term
       was marked with the tainted-reg attribute, then we taint the address
       as a pointer to the object that we will track. If it was marked with
       the tainted-ptr attribute, then we dereference this pointer and taint
       the dereferenced address. If the right-hand side is an arbitrary
       expression, then we assume that all variables used in this expression
       contain values that reference the tainted object directly or
       indirectly.

OPTIONS
       --from-attributes
           Introduces taint in terms that are marked with the tainted-ptr and
           tainted-reg attributes.

       --help[=FMT] (default=auto)
           Show this help in format FMT. The value FMT must be one of `auto',
           `pager', `groff' or `plain'. With `auto', the format is `pager' or
           `plain' whenever the TERM env var is `dumb' or undefined.

       --no-marks
           Disables the projection of the taint engine state to term
           attributes. The option is only valid when the run option is
           specified. This option is left for compatibility with the old
           interface and is not compatible with the from-attributes or
           to-attributes options. It is an error to mix options from the new
           and old interfaces.

       --run
           Enables propagating taint from term attributes and back to
           attributes, unless the latter is disabled with the no-marks option.
           This option is left for compatibility with the old interface and is
           not compatible with the from-attributes or to-attributes options. It
           is an error to mix options from the new and old interfaces.

       --to-attributes
           Reflects the state of the taint propagation engine to the
           tainted-ptrs and tainted-regs term attributes.
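
The mapping rules from the DESCRIPTION above can be illustrated with a small toy model. This is only a sketch of one reading of those rules, not the plugin's implementation: the `Def`, `Load`, and `Expr` types, the `introduce` function, and the `"*addr"` notation for a dereferenced address are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

DIRECT, INDIRECT = "direct", "indirect"

@dataclass(frozen=True)
class Load:
    addr: str  # variable that holds the address being read (hypothetical model)

@dataclass(frozen=True)
class Expr:
    vars: Tuple[str, ...]  # variables used by an arbitrary expression

@dataclass(frozen=True)
class Def:
    lhs: str
    rhs: Union[Load, Expr]

def introduce(attr: str, d: Def) -> List[Tuple[str, str]]:
    """Return (target, relation) pairs to taint for a marked definition."""
    if attr not in ("tainted-reg", "tainted-ptr"):
        raise ValueError(f"unknown attribute: {attr}")
    # tainted-reg: the value is the object; tainted-ptr: it points to the object.
    rel = DIRECT if attr == "tainted-reg" else INDIRECT
    marks = [(d.lhs, rel)]  # the left-hand side is always tainted
    if isinstance(d.rhs, Load):
        if attr == "tainted-reg":
            # The loaded value is the object, so the address points to it.
            marks.append((d.rhs.addr, INDIRECT))
        else:
            # The loaded value is itself a pointer: taint the address it holds.
            marks.append((f"*{d.rhs.addr}", INDIRECT))
    else:
        # Arbitrary expression: every variable used is assumed to be related.
        marks.extend((v, rel) for v in d.rhs.vars)
    return marks
```

For example, `introduce("tainted-reg", Def("R0", Load("SP")))` marks `R0` as directly tainted and `SP` as a pointer to the tracked object.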

If you have read this far, then you deserve the bonus track: this is how we
can run Saluki using the new Taint Analysis Framework as a taint
propagation engine:

bap ./exe --saluki-print-models --propagate-taint-print-coverage \
    --passes=trivial-condition-form,saluki-taint,run,saluki-solve \
    --primus-propagate-taint-from-attr --primus-propagate-taint-to-attr \
    --primus-promiscuous-mode --primus-greedy-scheduler \
    --primus-limit-max-visited=64 --primus-limit-max-length=4096

Yep, that's scary... I will later provide a recipe, and will also update
Saluki's Makefile to facilitate Saluki experimentation with the new
engine.
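
The command above selects the new interface through the from-attr and to-attr flags. The compatibility rules stated in the OPTIONS section of the man page (no mixing of old and new options, and no-marks only together with run) can be sketched as a small check. The `resolve` function and its keyword arguments are hypothetical and only mirror the option names; this is not how the plugin parses its command line.

```python
from typing import Tuple

def resolve(run: bool = False, no_marks: bool = False,
            from_attributes: bool = False,
            to_attributes: bool = False) -> Tuple[bool, bool]:
    """Return (introduce_from_attrs, reflect_to_attrs) or raise on misuse."""
    old = run or no_marks
    new = from_attributes or to_attributes
    if old and new:
        raise ValueError("cannot mix options from the old and new interfaces")
    if no_marks and not run:
        raise ValueError("no-marks is only valid together with run")
    if old:
        # Old interface: run enables both directions unless no-marks is given.
        return True, not no_marks
    # New interface: each direction is enabled individually.
    return from_attributes, to_attributes
```

Under this reading, `resolve(run=True)` and `resolve(from_attributes=True, to_attributes=True)` enable the same two steps, which is why legacy invocations keep working.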

ivg added 3 commits February 27, 2018 15:48
Surprisingly, powerpc depends on bap. If the dependency is not
specified explicitly, then oasis won't compute dependencies correctly
and won't rebuild some modules when the bap library is changed.

It doesn't hurt, though there is definitely no reason to run it more
than once.
@ivg ivg requested a review from gitoleg February 27, 2018 21:06
ivg added a commit to BinaryAnalysisPlatform/bap-ida-python that referenced this pull request Feb 27, 2018
In this update we are relying on the new Primus Taint Analysis
Framework to provide us data-flow information via the
`primus-propagate-taint` compatibility layer. We allow a user to
select the taint propagation engine, as well as its parameters.

This commit also moves the bap_taint module to the utilities package,
as otherwise it is loaded multiple times, separately by each plugin, which
may lead to unexpected results (don't ask me what they are).

There is also a small fix that prevents a race condition between
askXXX dialogs and the IDA Python breakability facility.

The merge of this PR is blocked until
BinaryAnalysisPlatform/bap#784 is merged
gitoleg previously approved these changes Feb 28, 2018
@gitoleg gitoleg dismissed their stale review February 28, 2018 12:51

need to fix mem taint

| None -> Machine.return ()
| Some (taint,kind,rel) ->
  gentaint taint >>= fun t ->
  taint_var rel t (Def.lhs def) >>= fun () ->

Here we taint the lhs part, which is not completely correct in the case of a store: we taint the whole memory in this case.
So we should explicitly ignore memory taints.

It doesn't hurt, as memory is never evaluated by Primus, but it is still a
little bit ugly and introduces extra weight to the tainter state.
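
The fix suggested in this review thread can be sketched with a toy model. In a store, the left-hand side of the definition is the variable that denotes the whole memory, so the sketch skips it when marking. The `Var` type, its `is_mem` flag, and the `lhs_marks` helper are hypothetical; the actual change would presumably inspect the type of `Def.lhs def` instead.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class Var:
    name: str
    is_mem: bool = False  # True for the variable that denotes the whole memory

def lhs_marks(lhs: Var, taint: str) -> Dict[str, str]:
    """Taint the left-hand side unless it is the memory variable."""
    if lhs.is_mem:
        # A store assigns to the memory variable; tainting it would mark
        # the whole memory and only add weight to the tainter state.
        return {}
    return {lhs.name: taint}
```

With this check, `lhs_marks(Var("R0"), "t1")` yields `{"R0": "t1"}`, while `lhs_marks(Var("mem", is_mem=True), "t1")` yields an empty mapping.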
@ivg ivg merged commit ee3897f into BinaryAnalysisPlatform:master Feb 28, 2018
@ivg ivg deleted the provide-taint-frameworks-compatibility-layer branch March 7, 2018 15:37