-
Notifications
You must be signed in to change notification settings - Fork 8
Primer design fixes #1
base: master
Are you sure you want to change the base?
Conversation
…clamping could produce suboptimal result
chrismacklin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great improvement overall, I will work on validating the change against internal regression standards to get ready for merge.
src/AmyrisBio/primercore2.fs
Outdated
| let disruptPrimerDimers (debug:bool) (existingTemp : float<C>) (p:PrimerParams) (s:char[]) f t offset = | ||
| if debug then | ||
| printfn "disruptPrimerDimers: checking for problems" | ||
| let comp (b:char) = match b with | 'G' -> 'C' | 'C' -> 'G' | 'A' -> 'T' | 'T' -> 'A' | x -> x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a suite of basepair-wise complement functions in biolib.fs; is there some reason that one of those existing functions is insufficient here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed that first time I looked. replaced with rcBase
src/AmyrisBio/primercore2.fs
Outdated
| if seq { 0..i-1 } |> Seq.forall (fun j -> s.[t'-i+j] = comp (s.[t'-j])) then | ||
| yield (i+1) | ||
| } |> Seq.tryFind(fun x -> x<> 0) | ||
| |> function | None -> 0 | Some v -> v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a builtin for this:
|> Option.defaultValue 0There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually - couldn't find that method, but there is a defaultArg function that I ended up using. Let me know if I'm missing something
tests/AmyrisBio.Tests/testPrimer.fs
Outdated
| "AAACAGTATATGTACAGTTTTATATATATATATATATATATATACATATATAAAG" ; | ||
| "AAAAAATCTTAATAGATTAATTTAAACAGTATATGTACAGTTTTATATATATATATATATATATATACATATATAAAG" ; | ||
| ] | ||
| let pen= { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot of code duplication between these two test suites; it would be great to extract this record to the test class, as well as the let task = ... expression below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deduped code and the shared templates
|
Ok - all addressed. I ended up bumping the versioning to .Net 4.6.1 like the other libraries while trying to find Option.defaultarg and can revert if this is aproblem. We should be moving over to dotnetcore soon anyway |
|
I ended up switching it back to 4.5 after the 4.6.1 update broke a bunch of things. Probably better to minimize the changes here. |
The primer designer has never really taken care of self dimers during design and does in practice make bad oligos. For example the CTTTAAAG in the tail of this oligo can anneal to itself (oligo on opposite strand) and prime to make a dimer.
TTAAGTCGAAACCTAGTGCTCCGCTATTCTTGCCTACGGTTGGGCTTAA**CTTTAAAG**In addition, there were deficiencies in the GC clamp finding that sometimes the designer would ignore an obvious opportunity to end a primer on a G or a C. This patch fixes these cases and adds a bunch of tests. Regression testing was done against a large set of designs and all the changes inspected to ensure the algorithm is behaving as designed/intended. This will likely result in small changes to oligo designs relative to historical output. The most common changes will be a 1 or 2 bp change in length to end on a G/C or to avoid a terminal palindrome.