
LU rewrite, first draft #94

Closed
Andlon wants to merge 7 commits into AtheMathmo:master from Andlon:lu_rewrite

Conversation

@Andlon (Collaborator) commented Nov 21, 2016

This is by no means finished, but I wanted to give contributors an opportunity to weigh in with some opinions, if anyone wants to.

In line with some of the ideas discussed in #40, I've recently started to put together a redesign of the LU decomposition. If we reach consensus on the ideas put forth, I plan to do a similar redesign of the other decompositions.

Currently, the LU(P) decomposition as implemented in rulinalg simply returns a triplet of the three matrices L, U and P. One of the strengths of the LU decomposition is that, once computed, it can be reused to solve many linear systems Ax = b very efficiently, where the matrix A is constant but b varies. It would thus be convenient if this functionality were readily available and optimized through rulinalg's API. I therefore propose to instead let the decomposition return a specialized type that makes this operation easier.

In addition, returning all three matrices is not optimal in terms of storage space. In particular, one can store the L and U matrices in the space of a single matrix, rather than two separate matrices, halving the memory costs. This is not yet implemented in this PR, but it's possible to implement without breaking changes.
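A sketch of how the packed storage could work (hypothetical helper, not part of this PR): since L is unit lower triangular, its diagonal of ones need not be stored, so L's strict lower triangle and U's upper triangle can share a single n-by-n array.

```rust
// Hypothetical sketch: unpack a single row-major n x n array holding
// both factors into separate L and U matrices. L's unit diagonal is
// implicit in the packed form.
fn unpack(packed: &[f64], n: usize) -> (Vec<f64>, Vec<f64>) {
    let mut l = vec![0.0; n * n];
    let mut u = vec![0.0; n * n];
    for i in 0..n {
        for j in 0..n {
            if i > j {
                // Strict lower triangle belongs to L
                l[i * n + j] = packed[i * n + j];
            } else {
                // Diagonal and upper triangle belong to U
                u[i * n + j] = packed[i * n + j];
            }
        }
        // Restore L's implicit unit diagonal
        l[i * n + i] = 1.0;
    }
    (l, u)
}

fn main() {
    // packed = [ u00 u01 ; l10 u11 ]
    let packed = [4.0, 3.0, 1.5, -1.5];
    let (l, u) = unpack(&packed, 2);
    assert_eq!(l, vec![1.0, 0.0, 1.5, 1.0]);
    assert_eq!(u, vec![4.0, 3.0, 0.0, -1.5]);
}
```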

Finally, I want to implement a custom PermutationMatrix structure. Currently the permutation matrix is returned as a full Matrix<T>, but by its nature, a permutation matrix needs only O(n) storage, and it can be applied much more efficiently than performing a full matrix-matrix or matrix-vector multiplication. This is also not yet implemented in this PR.
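A minimal sketch of the idea (illustrative names, not the proposed rulinalg API): storing the permutation as an index vector gives O(n) storage, and applying it to a length-n vector is a single O(n) pass instead of an O(n^2) matrix-vector product.

```rust
// Hypothetical sketch of an O(n) permutation representation.
struct Permutation {
    // perm[i] = j means "row i of the output takes row j of the input"
    perm: Vec<usize>,
}

impl Permutation {
    // Apply the permutation to a vector in O(n) time.
    fn apply_to(&self, v: &[f64]) -> Vec<f64> {
        self.perm.iter().map(|&j| v[j]).collect()
    }
}

fn main() {
    let p = Permutation { perm: vec![2, 0, 1] };
    let v = [10.0, 20.0, 30.0];
    assert_eq!(p.apply_to(&v), vec![30.0, 10.0, 20.0]);
}
```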

I've also made one more important change: for whatever reason, the LUP decomposition that is currently implemented fails if the matrix is singular. However, an LUP decomposition always exists for any square matrix, so I've taken this into account, and as a result the decomposition itself no longer needs to return a Result.

In short, we currently write:

let a: Matrix<T> = ...;
// Perform LUP decomp in O(n^3) operations
let (l, u, p) = a.lup_decomp().expect("Should work?");

// Solving Ax = b in O(n^2) operations with the LU decomposition
// is somewhat convoluted
let y = l.solve_l_triangular(p * b).expect("Matrix is invertible");
let x = u.solve_u_triangular(y).expect("Matrix is invertible");

With the proposed changes, we have instead:

let a: Matrix<T> = ...;
// Consumes the matrix and returns an opaque decomposition object
// in O(n^3) operations
let lu = a.lu();

// Solve Ax = b in O(n^2) operations. The details of the operation are
// conveniently abstracted away from the user, and we can heavily
// optimize it if need be.
let x = lu.solve(b).expect("Matrix is invertible");

// Can compute determinant in O(n) operations
let det = lu.det();

// Finally, can consume the decomposition to obtain
// the individual matrices
let LU { l, u, p } = lu.decompose();

Note the last line in the previous example. I've explicitly avoided returning a triplet here, because it's very easy to mess up the order. In rulinalg we have (l, u, p), but e.g. NumPy returns (p, l, u). I even made this mistake once when working with the previous API. These sorts of errors can be very frustrating if they are not quickly discovered! Of course, with the above struct destructuring API, the call is order-independent, so the last line could have been replaced with LU { p, l, u } or LU { u, p, l } without having any effect on the behavior of the code.

@Andlon Andlon changed the title Lu rewrite, first draft LU rewrite, first draft Nov 21, 2016
@Andlon (Collaborator, Author) commented Nov 23, 2016

@AtheMathmo: when you have time, I would appreciate if you could let me know whether you feel this is going in the right direction. If so, I'll polish it and get it ready when I have the opportunity to work on it.

No rush though, I'm quite busy these days anyway!

@AtheMathmo (Owner) commented:

Sorry I didn't have any time earlier :(. Will be able to go through it tomorrow!

@Andlon (Collaborator, Author) commented Nov 23, 2016

I'm sorry, the purpose of my previous comment was not to make you look into it sooner. By all means, please take as much time as you want! Really did not intend to put any kind of pressure on you.

I just wasn't sure if I'd given the impression that I wanted you to look at it before I invested much more time working on it, so I just wanted to make that clear :) Just making sure we're on the same page!

@AtheMathmo (Owner) commented:

No worries!

@AtheMathmo (Owner) commented:

I'll have some more comments on the code shortly. But the following is just with respect to your post body.

In particular, one can store the L and U matrices in the space of a single matrix, rather than two separate matrices

I agree that this should be left out for now but it's great that it can be done without further API breakage.

Finally, I want to implement a custom PermutationMatrix structure.

I think this is a great idea too! Also happy to have this follow later. Though we should do some benchmarking to confirm that we do indeed outperform the naive approach, as a matter of principle rather than just assuming it.

for whatever reason, the LUP decomposition that is currently implemented fails if the matrix is singular.

This would be my failing! I'm glad you picked up on this and changed it. Hopefully I'll infer it from the code soon, but could you explain how we handle singular matrices (of rank k)? Do we just set the diagonal to zero?

I really like that we don't need to have a specific order to unpack the values too! It's a nice plus.

Ok, going to read through the code now so hopefully can give some more constructive comments.

@AtheMathmo (Owner) left a review:

I think this is a great start! Overall it looks like the right direction to me.

Other than the comments here the only other thing I think we should discuss is the duplicate solve, inverse and det functions.

I can absolutely see why we need the extra solve function, but I'm not sure why we have the extra inverse and det functions. It seems to me that we can take advantage of the new solve function directly in the Matrix inverse and det functions? Please let me know if I'm missing anything.

Comment thread src/matrix/mod.rs

let b = try!(forward_substitution(&l, p * y));
back_substitution(&u, b)
self.clone().lu().solve(y)
@AtheMathmo (Owner):

Might also be worth modifying the API here to consume self, under the same mantra that if we're cloning &self, the API is broken.

@Andlon (Collaborator, Author):

Completely agree! One could also argue to remove Matrix::solve completely, but I suppose it's useful since the user doesn't have to worry about decompositions if they just want to solve a linear system. That said, I think the documentation should make it clear that it is just a convenience wrapper around lu().

@AtheMathmo (Owner):

Yes, I agree. We should state what is going on, but I think many users won't want to think about LU decompositions, and having solve on Matrix will be a big help.

Comment thread src/matrix/mod.rs
Error::new(ErrorKind::DecompFailure,
"Could not compute LUP factorization for inverse.")
}));
let LU { l, u, p } = self.clone().lu().decompose();
@AtheMathmo (Owner):

And probably the same here?

Comment thread tests/mat/mod.rs
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, -4.0, 1.0, 0.0;
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, -4.0, 1.0;
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, -4.0];
// TODO: Fix these tests to take the new API into account
@AtheMathmo (Owner):

Just adding a comment here so that I don't forget about these.

use libnum::{Float};

/// TODO
pub trait Decomposition {
@AtheMathmo (Owner):

I'm not sure about the name (and partly the functionality) of this trait.

Is the idea that this trait represents a decomposition which returns multiple values that we're packing into a custom struct? Maybe we should have the function be called unpack?

Actually this probably is fine. I think my worry is that people may think that this decompose function does all of the legwork whereas actually there are custom functions doing this (like lu).

@Andlon (Collaborator, Author):

That's a very good point! I picked it because I liked the look of it, e.g. lu().decompose(), but this can sound a little misleading, as you point out. I'll take some time to think about this.

The name of the trait is simply due to the fact that the only thing different kinds of decompositions/factorizations have in common is that they all reduce a single matrix A into a number of matrices M_1, M_2, ..., M_n such that A = M_1 * M_2 * ... * M_n. My idea was hence to encode this into a trait (I just haven't documented it yet), as although the decompositions offer different kinds of functionality, they all share at least this property.
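A minimal sketch of such a trait (toy types standing in for the real matrices; all names tentative):

```rust
// Hypothetical sketch: every decomposition can be unpacked into its
// constituent factors M_1, ..., M_n such that A = M_1 * ... * M_n.
trait Decomposition {
    type Factors;
    // Consumes the decomposition and returns the factors.
    fn decompose(self) -> Self::Factors;
}

// Toy stand-ins for the real matrix types.
struct LuDecomposition {
    data: Vec<f64>,
}

struct LU {
    l: Vec<f64>,
    u: Vec<f64>,
}

impl Decomposition for LuDecomposition {
    type Factors = LU;
    fn decompose(self) -> LU {
        // Real code would split the packed storage into L and U;
        // here we just duplicate the data to keep the toy runnable.
        LU { l: self.data.clone(), u: self.data }
    }
}

fn main() {
    let lu = LuDecomposition { data: vec![1.0, 2.0] };
    // Field names make the unpacking order-independent.
    let LU { l, u } = lu.decompose();
    assert_eq!(l, u);
}
```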

I'm certainly open to changing the names of both trait and/or function!

@AtheMathmo (Owner):

I'll think about it too. What you said definitely makes sense and I think the naming probably is fine especially if we make the purpose of the trait clear in the docs.

@Andlon (Collaborator, Author) commented Nov 24, 2016

Eh, some GitHub issues here. Apparently I accidentally deleted my previous response about the singular matrices (GitHub showed 3 copies of it, and then suddenly they were all gone after I deleted one of them).

From memory, it said something to the effect of:

About the singular matrices, I'm not entirely convinced that I'm handling it correctly, but I will investigate this more thoroughly. I believe the essence is indeed to let the diagonal in U be zero, zero out the remaining elements in L, and leave the rest in U unchanged. This is currently doing more work than needed in this case, so I'll probably special case it a little to avoid redundant operations.

As for the existence of inverse() and det() in LuDecomposition: Let us assume that you wish to compute both the inverse and the determinant of your matrix. If you do mat.inverse() followed by mat.det(), you're doing the internal LU work twice. This is a large amount of redundant work, and I think rulinalg should make it possible for the user to avoid this work. I think the inverse and det methods in LuDecomposition should remain (possibly be improved), but instead we can let the Matrix methods simply call these methods internally, as in the case of solve. What do you think?
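As a toy illustration of the saving (no pivoting, 2x2 only, hypothetical helper names): once the O(n^3) factorization is done, the determinant falls out of U's diagonal in O(n), so sharing one decomposition between det and inverse avoids repeating the expensive step.

```rust
// Hypothetical sketch: a 2x2 Doolittle-style LU without pivoting.
// Not the rulinalg algorithm, just enough to show the reuse.
fn lu_2x2(a: [[f64; 2]; 2]) -> ([[f64; 2]; 2], [[f64; 2]; 2]) {
    let l21 = a[1][0] / a[0][0];
    let l = [[1.0, 0.0], [l21, 1.0]];
    let u = [[a[0][0], a[0][1]], [0.0, a[1][1] - l21 * a[0][1]]];
    (l, u)
}

fn main() {
    let a = [[4.0, 3.0], [6.0, 3.0]];
    // Factor once (the expensive part)...
    let (_l, u) = lu_2x2(a);
    // ...then det(A) = det(L) * det(U) = 1 * u00 * u11 is cheap,
    // and the same factors could also drive the inverse.
    let det = u[0][0] * u[1][1];
    assert_eq!(det, -6.0);
}
```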

@AtheMathmo (Owner) commented:

Actually yes I agree that having these in there is a good idea. I actually have a use case in rusty-machine where I need both the inverse and the determinant.

@Andlon (Collaborator, Author) commented Nov 24, 2016

@AtheMathmo: one thing I realized about inverse(): In the case of a singular matrix, lu().inverse() would first compute the full LU decomposition before trying to invert the singular matrix. This makes sense, because we ask it to compute the LU decomposition, and then try to form the inverse. However, in the case of mat.inverse(), we don't really care about the LU decomposition, so we could conceivably perform an early exit immediately when the LU decomposition procedure discovers a "singular column".

I could take this into account by moving the LU decomposition into an internal method which returns a Result<LuDecomposition, something_else> and a switch abort_on_singular: bool. The Matrix version would then immediately return a failure, but lu() would always return an LuDecomposition.

EDIT: This also applies to Matrix::solve and Matrix::det.
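A rough sketch of that split (all names hypothetical; the "algorithm" is a stand-in that just scans a diagonal, standing in for the elimination loop):

```rust
// Hypothetical sketch: a private helper that can abort early on a
// zero pivot, with public entry points choosing the policy.
#[derive(Debug, PartialEq)]
enum LuError {
    Singular,
}

// Stand-in for the real elimination: "decomposes" a diagonal matrix.
fn lu_impl(diag: &[f64], abort_on_singular: bool) -> Result<Vec<f64>, LuError> {
    let mut u = Vec::new();
    for &pivot in diag {
        if pivot == 0.0 && abort_on_singular {
            // Early exit: no point finishing the factorization.
            return Err(LuError::Singular);
        }
        u.push(pivot);
    }
    Ok(u)
}

// lu() never aborts: the decomposition always exists.
fn lu(diag: &[f64]) -> Vec<f64> {
    lu_impl(diag, false).unwrap()
}

// det() can bail out as soon as singularity is detected.
fn det(diag: &[f64]) -> Result<f64, LuError> {
    Ok(lu_impl(diag, true)?.iter().product())
}

fn main() {
    assert_eq!(lu(&[1.0, 0.0, 3.0]), vec![1.0, 0.0, 3.0]);
    assert_eq!(det(&[1.0, 0.0, 3.0]), Err(LuError::Singular));
    assert_eq!(det(&[2.0, 3.0]), Ok(6.0));
}
```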

@Andlon (Collaborator, Author) commented Nov 26, 2016

Hey @AtheMathmo: could we consider adding itertools as a dependency? I keep running into minor problems that would be solved with a one-liner from itertools. I'll manage either way, but it might make our lives a little bit easier. It seems to have fairly minimal dependencies.

@AtheMathmo (Owner) commented:

I agree with your comments about early-exiting for some of the LU routines. If you can find a nice way to get that into the API I think that it is worth having. Perhaps have a private function in the lu module which has the actual algorithm and takes an abort_on_singular argument. Then the exposed lu() function would feed in false, and the inverse and det functions in the lu module would feed in true. Finally, the inverse and det functions on the Matrix struct would just call the inner lu functions in those cases. Does that make sense? I might be missing something...

As for itertools I think it may be worth bringing in. There are a few other cases where I think it may have been of use. As I'm sure you've gathered I'm always a little anxious to bring in more dependencies. But itertools is a well maintained library and can probably help us out in some other areas too. Would you mind explaining exactly what functionality you need from it?

@Andlon (Collaborator, Author) commented Nov 26, 2016

I agree with your comments about early-exiting for some of the LU routines. If you can find a nice way to get that into the API I think that it is worth having. Perhaps have a private function in the lu module which has the actual algorithm and takes an abort_on_singular argument. Then the exposed lu() function would feed in false, and the inverse and det functions in the lu module would feed in true. Finally, the inverse and det functions on the Matrix struct would just call the inner lu functions in those cases. Does that make sense? I might be missing something...

That's more or less exactly what I had in mind :) We'll see how it turns out in the end though.

As for itertools I think it may be worth bringing in. There are a few other cases where I think it may have been of use. As I'm sure you've gathered I'm always a little anxious to bring in more dependencies. But itertools is a well maintained library and can probably help us out in some other areas too. Would you mind explaining exactly what functionality you need from it?

Well, I mostly just recall thinking "itertools would have been useful now" several times throughout the contributions I've made. In this case, it would be useful to have .unique(), and I remember .sorted() would have been useful in a previous PR I did. I won't add it now, but I just wanted to hear your opinion on it.

One more thing: Currently the scalar type is a generic parameter for BaseMatrix, i.e. BaseMatrix<T>. Have you considered making it an associated type instead? Something like

trait BaseMatrix {
    type Scalar;
    // ...
}

I'm currently working on my ideas for the PermutationMatrix struct. Even though it shouldn't need a generic parameter (it works entirely with indices), it seems I currently have to add <T> (via a phantom marker) for it to work with BaseMatrix. I asked around on IRC, and I got some suggestions that perhaps making the scalar type an associated type in the base trait would make sense. Reading up on associated types, it does sound like a good fit. It was mentioned that type parameters in structs that turn into associated types in traits is a common pattern. I'm still relatively new to Rust, so this was a bit of a revelation to me.
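To illustrate the issue (toy code, not rulinalg's actual definitions): with a generic parameter on the trait, the struct must carry an unused T via PhantomData.

```rust
use std::marker::PhantomData;

// Hypothetical sketch of the problem: PermutationMatrix works purely
// with indices, yet to implement a BaseMatrix<T>-style trait it must
// drag along an unused scalar type parameter.
struct PermutationMatrix<T> {
    perm: Vec<usize>,
    // Zero-sized marker just to tie T to the struct.
    marker: PhantomData<T>,
}

// With an associated type instead, the struct would not need T:
// trait BaseMatrix { type Scalar; /* ... */ }

fn main() {
    let p: PermutationMatrix<f64> = PermutationMatrix {
        perm: vec![1, 0],
        marker: PhantomData,
    };
    assert_eq!(p.perm, vec![1, 0]);
}
```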

The only downside I've found so far is that you wouldn't be able to mix operations with different scalar fields, but we would never do that anyway (e.g. multiply a matrix of type f64 with i64 or similar). There might be others though. The upside is that writing generic code would presumably be easier, as there's one less type parameter to worry about. Anyway, I was just wondering if the current design of having the scalar type as a generic type parameter rather than an associated type is a deliberate design decision?

(I wish I could somehow stop coming up with all these long-winded questions for you!)

Commit: "Permutation is a representation of a permutation of an abstract ordered set. It will greatly simplify the implementation of PermutationMatrix."
@AtheMathmo (Owner) commented:

I think that @tafia played around with associated types in this PR: #18 . I was a little unsure whether they would really help and it looks like after toying with them a little he decided against them too. I think that it might be a good idea to try adding them in the hope that it might clean up the code in some places. If @tafia remembers why he decided against them I'd love to hear from him too!

Something else I have been thinking about is adding more trait bounds to the BaseMatrix trait. Things like Index, IndexMut, Neg. We could also try adding bounds for Mul, Add, Sub, and Div. I think that the first three would give a more complete trait wrapper for a Matrix. Adding the arithmetic operators sounds reasonable in practice but I'm not sure if it is necessary and it will make the trait look pretty scary...

@Andlon (Collaborator, Author) commented Nov 27, 2016

I think that @tafia played around with associated types in this PR: #18 . I was a little unsure whether they would really help and it looks like after toying with them a little he decided against them too. I think that it might be a good idea to try adding them in the hope that it might clean up the code in some places. If @tafia remembers why he decided against them I'd love to hear from him too!

I skimmed through the conversation in #18. Would indeed be interesting to hear what made @tafia feel that associated types don't feel so nice, and in what way he was leveraging associated types.

Something else I have been thinking about is adding more trait bounds to the BaseMatrix trait. Things like Index, IndexMut, Neg. We could also try adding bounds for Mul, Add, Sub, and Div. I think that the first three would give a more complete trait wrapper for a Matrix. Adding the arithmetic operators sounds reasonable in practice but I'm not sure if it is necessary and it will make the trait look pretty scary...

Well, we'd have to think about what exactly constitutes a matrix. For example, in fields such as graph theory, a matrix of bool is relatively common, and in this case arithmetic does not make sense. While we perhaps should not put too much emphasis on such special cases, I consider a matrix to be a general-purpose rectangular 2D-container of items. Ideally it would not put more constraints on its elements than absolutely necessary. However, at the same time, rulinalg is a linear algebra library, so I don't know what the right philosophy is here.

Index and IndexMut make sense for any matrix though, I think.
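A sketch of what such supertrait bounds might look like (the [usize; 2] index form and all the types here are illustrative, not rulinalg's actual API): once the bounds are on the trait, generic code can index any BaseMatrix without spelling them out.

```rust
use std::ops::{Index, IndexMut};

// Hypothetical sketch: supertrait bounds make indexing available to
// any generic code written against BaseMatrix.
trait BaseMatrix<T>: Index<[usize; 2], Output = T> + IndexMut<[usize; 2]> {
    fn rows(&self) -> usize;
    fn cols(&self) -> usize;
}

// Toy row-major dense matrix.
struct DenseMatrix {
    data: Vec<f64>,
    cols: usize,
}

impl Index<[usize; 2]> for DenseMatrix {
    type Output = f64;
    fn index(&self, [r, c]: [usize; 2]) -> &f64 {
        &self.data[r * self.cols + c]
    }
}

impl IndexMut<[usize; 2]> for DenseMatrix {
    fn index_mut(&mut self, [r, c]: [usize; 2]) -> &mut f64 {
        let cols = self.cols;
        &mut self.data[r * cols + c]
    }
}

impl BaseMatrix<f64> for DenseMatrix {
    fn rows(&self) -> usize {
        self.data.len() / self.cols
    }
    fn cols(&self) -> usize {
        self.cols
    }
}

// Generic code can now index without extra bounds.
fn trace<M: BaseMatrix<f64>>(m: &M) -> f64 {
    (0..m.rows().min(m.cols())).map(|i| m[[i, i]]).sum()
}

fn main() {
    let mut m = DenseMatrix { data: vec![1.0, 2.0, 3.0, 4.0], cols: 2 };
    m[[0, 1]] = 5.0;
    assert_eq!(trace(&m), 5.0);
}
```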

@Andlon (Collaborator, Author) commented Dec 1, 2016

I haven't had any time to do more work on this lately, but I have a small update. I realized that my previous comment about LUP decompositions for singular matrices is only partly true.

While it is true that every square matrix has an LUP decomposition, the common LUP decomposition algorithms using partial pivoting are only numerically stable for well-conditioned invertible matrices. LUP decompositions for singular matrices are typically computed by "full pivoting", in which case one also obtains an additional permutation matrix (as one permutes both the columns and rows).

The most straightforward explanation of this I could find was in Eigen's documentation. This suggests we should probably have multiple LU decompositions in the future, but I suppose for now, it is sufficient to only support partial pivoting.
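In equation form (standard notation, not taken from the PR): partial pivoting produces a single row permutation P, while full pivoting additionally produces a column permutation Q.

```latex
% Partial pivoting: rows only
P A = L U
% Full (complete) pivoting: rows and columns
P A Q = L U
```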

@tafia (Contributor) commented Dec 2, 2016

Sorry for the late answer!

If @tafia remembers why he decided against them I'd love to hear from him too!

From what I remember it was just too verbose and didn't really bring any benefit (the API didn't look more explicit). But I don't have a strong opinion on the matter and I'd be happy with whatever you guys decide.

@AtheMathmo (Owner) commented:

@Andlon - thanks for looking into the singular matrix decomposition. I agree that we should limit ourselves for now and move on to singular matrices at a later date. I'll add a ticket for this.

@tafia thanks for your comments! I think it would be a good idea to at least look into using associated types - especially if this is a common pattern. Sorry for pushing you away from this idea in the first place @tafia !

@Andlon (Collaborator, Author) commented Jan 30, 2017

Closing in favor of #142, which will eventually supersede the functionality in this PR.

@Andlon Andlon closed this Jan 30, 2017