Enable multiband runs (single-band unchanged) #23
Jayashree-Behera wants to merge 13 commits into schlafly:master
Conversation
e9c901b to 701c4a0
schlafly
left a comment
This looks great, thanks Shree.
Let's delete mosaic.py which will be broken. I think we will need some decam_proc.py updates but they should be modest.
I'll have some more comments tomorrow, sorry to be slow.
```python
if weight is None:
    weight = np.ones_like(im)
wt_pad = np.pad(weight, [pad, pad], constant_values=0.)
wt_pad[wt_pad == 0] = 1e-20
```
You're duplicating my existing code, but I can't figure out for the life of me why I did this. Let's keep this for this commit, but can you file an issue to check removing this after we merge this commit?
schlafly
left a comment
This looks good. Several comments inline. One bigger question: when running on a multiband image, I think fit_im now returns a "multiband" output; i.e., both _joint quantities and the original quantities, now with _b0 appended. There's also one newer quantity, snr_joint.
I'm wondering whether in the single-band case we should compress the output before returning, doing roughly:

```python
cat = {k[:-3]: cat[k] for k in cat if k.endswith('_b0')}
```

Then I think e.g. DECam processing and single-band WISE processing would be less affected (the returned catalogs would be basically the same as before). You may have already addressed the single-band WISE case in wise_proc?
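A minimal sketch of the compression suggested above, assuming `cat` is a plain dict keyed by quantity name (the helper name `compress_single_band` and the example keys are hypothetical):

```python
# Hedged sketch: in the single-band case, keep only the band-0
# quantities and strip the '_b0' suffix so downstream single-band
# callers see the original key names.
def compress_single_band(cat):
    return {k[:-3]: cat[k] for k in cat if k.endswith('_b0')}

cat = {'flux_b0': [1.0], 'dflux_b0': [0.1],
       'flux_joint': [1.0], 'snr_joint': [10.0]}
out = compress_single_band(cat)
```

The _joint quantities are dropped here; whether they should instead be kept alongside the renamed keys is exactly the design question raised above.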
```python
# Sources in this subregion (shared positions across all bands)
mbda_src = in_bounds(xa, ya, [bdxaf-0.5, bdxal-0.5], [bdyaf-0.5, bdyal-0.5])
if not np.any(mbda_src): continue
mbd = in_bounds(xa, ya, [bdxf-0.5, bdxl-0.5], [bdyf-0.5, bdyl-0.5])
```
crowdsource/crowdsource_base.py (outdated):

```python
ind = np.flatnonzero(mbda_src)  # global indices into 0..N-1
ind2 = np.arange(len(ind))      # local indices 0..M-1
M = len(ind)                    # same as len(ind2)
```
I think we lost a bit of the logic here. It clearly doesn't make a big impact, but here's the intent:
1. we cut the image into subregions;
2. each subregion has a 'primary' region; the union of the primary regions covers the image;
3. it also has a boundary region; we include the boundary regions in the fits so that stars on the edge of the primary region aren't cut off and can be fit simultaneously with their neighbors. The primary + boundary region gets the 'a' suffix ('all');
4. we do the fit on each subregion independently ('all');
5. we only fill out the fluxes of the stars that were in the 'primary' region.
I think the new code does all of this right except for 5.
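The tiling intent above can be illustrated with a toy 1-D sketch (all names and numbers hypothetical): each tile fits sources in its primary-plus-boundary window, but commits results only for sources in its primary window, so the primary windows cover each source exactly once.

```python
import numpy as np

x = np.array([1.0, 4.0, 6.0, 9.0])  # hypothetical source positions
flux = np.zeros_like(x)

tile_size, boundary = 5.0, 2.0
for lo in np.arange(0.0, 10.0, tile_size):
    hi = lo + tile_size
    # 'all' window: primary + boundary, used for the fit
    in_all = (x >= lo - boundary) & (x < hi + boundary)
    # 'primary' window: sources this tile owns
    in_pri = (x >= lo) & (x < hi)
    # pretend the fit returns one value per fitted source
    fitted = np.where(in_all, 1.0, 0.0)
    # commit only primary sources; boundary sources get
    # committed by the tile that owns them
    flux[in_pri] = fitted[in_pri]
```

After the loop every source has been committed exactly once, even those near a tile edge.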
crowdsource/crowdsource_base.py (outdated):

```python
flux_snr = flux_snr + (flux_snr == 0)*1e-20
xcen /= flux_snr
ycen /= flux_snr
```
I think I personally would have done `flux_avg = np.sum(flux * fluxunc**-2) / np.sum(fluxunc**-2)`, then `xcen /= flux_avg` and `ycen /= flux_avg`. This is also fine, though, since it's the first iteration and we just need to do something sane.
Perhaps add a comment that this only applies to the first iteration, since that's the iteration where we don't know the flux scaling.
What happens to the centroids for newly detected sources? It feels like they probably need similar handling?
The derivative columns in build_sparse_matrix are flux-scaled using guessflux from the previous iteration, so the LSQR dx/dy coefficients already behave like centroid shifts. The exception is the first iteration, where no flux scaling is available yet, so we apply the `xcen /= flux_snr` normalization there.
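For comparison, the inverse-variance-weighted mean flux proposed in the review comment can be sketched as follows (the `flux`/`fluxunc` values are hypothetical):

```python
import numpy as np

# Hypothetical per-source flux measurements and uncertainties
flux = np.array([10.0, 12.0, 11.0])
fluxunc = np.array([1.0, 2.0, 1.0])

# Inverse-variance-weighted mean flux, as suggested, instead of the
# summed S/N; first-iteration centroid sums would then be normalized
# via xcen /= flux_avg and ycen /= flux_avg.
flux_avg = np.sum(flux * fluxunc**-2) / np.sum(fluxunc**-2)
```

Both choices only matter on the first iteration, before guessflux exists to scale the derivative columns.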
Co-authored-by: Eddie Schlafly <eschlafly@gmail.com>
```python
if guess is not None:
    ind = np.flatnonzero(mbda_src)  # fitted sources (reuse same 'ind')
    guess_local_flux = guess[:B*N].reshape(B, N)[:, ind].ravel(order='C')
    guessmbda = np.concatenate([guess_local_flux, skypar.get((bdxf, bdyf), np.zeros(B*nskypar, 'f4'))]) if nskypar > 0 else guess_local_flux
```
Suggested change:

```python
guessmbda = np.concatenate([guess_local_flux, skypar[(bdxf, bdyf)]])
```
I feel like on line 1164 you make sure that this exists, so you don't need to guard against it not existing here?
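A sketch of the pre-initialization pattern this comment alludes to (key set, `B`, and `nskypar` values are hypothetical): if every tile key is filled with zeros before the fitting loop, the `skypar.get(..., default)` guard inside the loop becomes redundant.

```python
import numpy as np

B, nskypar = 2, 3
tile_keys = [(0, 0), (0, 5), (5, 0), (5, 5)]  # hypothetical tile corners

# Pre-fill sky parameters for every tile before the loop
skypar = {key: np.zeros(B * nskypar, 'f4') for key in tile_keys}

# Inside the loop a plain lookup is then always safe:
guess_sky = skypar[(0, 5)]
```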
```python
# print("fit_once:", time.time()-t_fo)

ind_all = np.flatnonzero(mbda_src)  # sources included in the fit (ALL region = PRIMARY + BOUNDARY)
ind_pri = np.flatnonzero(mbd)       # sources owned by this tile (PRIMARY region)

# Map primary sources into local ALL coordinates
ind_pri_loc = np.searchsorted(ind_all, ind_pri)

M_all = len(ind_all)
ind2 = np.arange(M_all)  # local indices for PSF stamp building (still ALL sources)

# Store new fluxes
for b in range(B):
    models[b][spri] = tmodels[b][sfit]
    mskys[b][spri] = tmskys[b][sfit]

    loc_start = b*M_all
    loc_stop = (b+1)*M_all
    glob_start = b*N

    # only commit PRIMARY sources
    flux[glob_start + ind_pri] = tflux[0][loc_start:loc_stop][ind_pri_loc]

# Store (dx, dy) (if present): global deriv block starts at B*N
if tpsfderiv and M_all > 0:
    deriv_off_glob = B*N
    deriv_off_loc = B*M_all

    dx_all = tflux[0][deriv_off_loc : deriv_off_loc + 2*M_all : 2]
    dy_all = tflux[0][deriv_off_loc+1 : deriv_off_loc + 2*M_all : 2]

    # only commit PRIMARY sources
    flux[deriv_off_glob + 2*ind_pri] = dx_all[ind_pri_loc]
    flux[deriv_off_glob + 2*ind_pri + 1] = dy_all[ind_pri_loc]

# Update per-tile sky
if nskypar > 0:
    skypar[(bdxf, bdyf)] = tflux[0][-B*nskypar:].copy()

# Build PSF stamps for this tile (optional but gives a big speed-up in centroiding)
for b in range(B):
    if M_all > 0:
        for k in range(n_psfcomps):
            psf_block = np.stack(
                [psfmod.central_stamp(psfsbda[b][k][tind], minsz_b[b]) for tind in ind2],
                axis=0
            ).astype('f4')  # (M_all, S, S)
            # Insert into the global psf_stamps
            psf_stamps[b][k][mbda_src] = psf_block
```
This got a little more involved than I immediately thought necessary. Would you mind looking at the following and seeing what I'm missing?

```python
if numpy.all(numpy.isnan(tmodel)):
    raise ValueError("Model is all NaNs")
ind_pri_in_full = numpy.flatnonzero(mbd)
ind_pri_in_all = numpy.flatnonzero(mbd[mbda])
Nall = np.sum(mbda)
# copy the centroid parameters for primary sources from the subregion fit
# into the full image parameter set
if tpsfderiv:
    for i in range(2):
        flux[N*B + ind_pri_in_full + i] = tflux[0][Nall*B + ind_pri_in_all + i]
skypar[(bdxf, bdyf)] = tflux[0][-B*nskypar:].copy()
# copy the flux parameters for primary sources from the subregion fit
# into the full image parameter set
for b in range(B):
    models[b][spri] = tmodel[b][sfit]
    msky[b][spri] = tmsky[b][sfit]
    flux[ind_pri_in_full*B + b] = tflux[0][ind_pri_in_all*B + b]
    for k in range(n_psfcomps):
        if Nall == 0:
            continue
        psf_stamps[b][k][ind_pri_in_full] = [
            psfmod.central_stamp(psfsbda[i][tind], minsz)
            for tind in ind_pri_in_all]
```

(Roughly, I'm trying to follow the old logic more closely, but I'm probably missing something you're doing there!)
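Both versions index into the same assumed parameter-vector layout: B*M band fluxes, then 2*M interleaved (dx, dy) derivative parameters, then B*nskypar sky parameters. A toy check of that strided slicing (all values hypothetical, chosen only to make the slots identifiable):

```python
import numpy as np

B, M, nskypar = 2, 3, 2
fluxes = np.arange(B * M, dtype=float)        # flux slots: 0..5
dxdy = 100 + np.arange(2 * M, dtype=float)    # interleaved dx0, dy0, dx1, dy1, ...
sky = 200 + np.arange(B * nskypar, dtype=float)
params = np.concatenate([fluxes, dxdy, sky])  # the assumed layout

deriv_off = B * M
dx_all = params[deriv_off : deriv_off + 2*M : 2]      # even deriv slots
dy_all = params[deriv_off + 1 : deriv_off + 2*M : 2]  # odd deriv slots
sky_all = params[-B * nskypar:]                       # trailing sky block
```

This matches the `deriv_off_loc` stride-2 slicing in the diff; the `+ i` indexing in the sketch above would only be equivalent if the derivatives were block-ordered rather than interleaved, which may be the discrepancy being asked about.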
This PR adds multiband source fitting utilities to crowdsource_base.py and updates wise_proc.py to run in multiband mode. Multiband functions are named after their corresponding single-band functions with "_multiband" appended.
Single-band run behavior is unchanged.