Burst Fitting example using MCMC¶
BurstFit
can also use MCMC for burst fitting. To do that we first fit the data using curve_fit
using the functions discussed in other tutorials, and then the results of curve_fit
are used as initial guess in MCMC. But this is not necessary.
MCMC is done using MCMC
class and ideally it can take any initial guess to do the MCMC. Using the converged values from curve_fit
is just convenient.
from burstfit.fit import BurstFit
from burstfit.data import BurstData
from burstfit.model import Model, SgramModel
from burstfit.utils.plotter import plot_2d_fit
from burstfit.utils.functions import gauss_norm2, pulse_fn_vec, sgram_fn_vec
from burstfit.io import BurstIO
import numpy as np
import logging
logging_format = "%(asctime)s - %(funcName)s -%(name)s - %(levelname)s - %(message)s"
logging.basicConfig(
level=logging.INFO,
format=logging_format,
)
Read data¶
import tempfile
from urllib.request import urlretrieve
temp_dir = tempfile.TemporaryDirectory()
download_path = str(temp_dir.name) + "/FRB180417.fil"
url = "https://zenodo.org/record/3905426/files/FRB180417.fil"
urlretrieve(
url, download_path,
)
fil_file = download_path
Initial Fit¶
bd = BurstData(
fp=fil_file,
dm=475.28400,
tcand=2.0288800,
width=2,
snr=16.8128,
min_samp=256,
)
bd.prepare_data()
2021-03-18 20:06:49,809 - prepare_data -burstfit.data - INFO - Preparing data for burst fitting. 2021-03-18 20:06:49,819 - crop_dedispersed_data -burstfit.data - INFO - Cropping data with time_window: 0.2s. 2021-03-18 20:06:49,820 - normalise_data -burstfit.data - INFO - Normalising data using off pulse mean and std. 2021-03-18 20:06:49,826 - normalise_data -burstfit.data - INFO - Off pulse mean and std are: (127.98158986433457, 18.365628141947383)
Using <class 'str'>: /tmp/tmpee0hxdnn/FRB180417.fil
profile_bounds = []
spectra_bounds = ([0, 0, 200, 0, 0], [100, 50, 300, 50, 1])
pnames = ['S', 'mu_t', 'sigma_t', 'tau']
snames = ['mu_f1', 'sigma_f1', 'mu_f2', 'sigma_f2', 'amp']
bd.prepare_data()
pulseModel = Model(pulse_fn_vec, param_names=pnames)
spectraModel = Model(gauss_norm2, param_names=snames)
sgramModel = SgramModel(pulseModel, spectraModel, sgram_fn_vec,
mask=bd.mask, clip_fac=bd.clip_fac)
bf = BurstFit(
sgram_model=sgramModel,
sgram=bd.sgram,
width=bd.width,
dm=bd.dm,
foff=bd.foff,
fch1=bd.fch1,
tsamp=bd.tsamp,
clip_fac=bd.clip_fac,
mask=bd.mask,
mcmcfit=False
)
bf.fitall(plot=False, profile_bounds=profile_bounds,
spectra_bounds=spectra_bounds)
2021-03-18 20:06:49,867 - prepare_data -burstfit.data - INFO - Preparing data for burst fitting. 2021-03-18 20:06:49,874 - crop_dedispersed_data -burstfit.data - INFO - Cropping data with time_window: 0.2s. 2021-03-18 20:06:49,875 - normalise_data -burstfit.data - INFO - Normalising data using off pulse mean and std. 2021-03-18 20:06:49,880 - normalise_data -burstfit.data - INFO - Off pulse mean and std are: (127.98158986433457, 18.365628141947383) 2021-03-18 20:06:49,882 - run_tests -burstfit.fit - INFO - Running statistical tests on the residual. 2021-03-18 20:06:49,882 - run_tests -burstfit.fit - INFO - Running off pulse - off pulse test 2021-03-18 20:06:49,889 - tests -root - INFO - P values: T-test (0.65766), Kruskal (0.71932), KS (0.59128), F-test (0.29584) 2021-03-18 20:06:49,890 - run_tests -burstfit.fit - INFO - Running on pulse - off pulse (L) test 2021-03-18 20:06:49,896 - tests -root - INFO - P values: T-test (0.00000), Kruskal (0.00000), KS (0.00000), F-test (0.00053) 2021-03-18 20:06:49,896 - run_tests -burstfit.fit - INFO - Running on pulse - off pulse (R) test 2021-03-18 20:06:49,902 - tests -root - INFO - P values: T-test (0.00000), Kruskal (0.00000), KS (0.00000), F-test (0.00309) 2021-03-18 20:06:49,903 - fitall -burstfit.fit - WARNING - Input spectra bounds detected. Using them for component 1 2021-03-18 20:06:49,903 - fitcycle -burstfit.fit - INFO - Fitting component 1. 2021-03-18 20:06:49,905 - initial_profilefit -burstfit.fit - INFO - Running initial profile fit for component: 1 2021-03-18 20:06:49,946 - initial_profilefit -burstfit.fit - INFO - Converged parameters (profile fit) are: 2021-03-18 20:06:49,946 - initial_profilefit -burstfit.fit - INFO - S: 512.8418742176256 +- 46.638907434560494 2021-03-18 20:06:49,947 - initial_profilefit -burstfit.fit - INFO - mu_t: 78.73232431022271 +- 0.08341666472383193 2021-03-18 20:06:49,947 - initial_profilefit -burstfit.fit - INFO - sigma_t: 0.7947937226184538 +- 0.08461197577615237 2021-03-18 20:06:49,948 - initial_profilefit -burstfit.fit - INFO - tau: 0.07023980173530475 +- 0.0 2021-03-18 20:06:49,948 - make_spectra -burstfit.fit - INFO - Making spectra using profile fit parameters. 2021-03-18 20:06:49,950 - initial_spectrafit -burstfit.fit - INFO - Running spectra profile fit for component: 1 2021-03-18 20:06:49,971 - initial_spectrafit -burstfit.fit - INFO - Converged parameters (spectra fit) are: 2021-03-18 20:06:49,971 - initial_spectrafit -burstfit.fit - INFO - mu_f1: 87.72708851306162 +- 14.657344144023687 2021-03-18 20:06:49,971 - initial_spectrafit -burstfit.fit - INFO - sigma_f1: 40.38070079339924 +- 14.56794385561647 2021-03-18 20:06:49,972 - initial_spectrafit -burstfit.fit - INFO - mu_f2: 284.84651913292134 +- 9.334720624356743 2021-03-18 20:06:49,972 - initial_spectrafit -burstfit.fit - INFO - sigma_f2: 41.23788107637332 +- 9.291859540498173 2021-03-18 20:06:49,972 - initial_spectrafit -burstfit.fit - INFO - amp: 0.3497407333338407 +- 0.09115810205812795 2021-03-18 20:06:49,973 - sgram_fit -burstfit.fit - INFO - Running sgram profile fit for component: 1 2021-03-18 20:06:49,973 - sgram_fit -burstfit.fit - INFO - initial estimate for parameters: [87.72708851306162, 40.38070079339924, 284.84651913292134, 41.23788107637332, 0.3497407333338407, 512.8418742176256, 78.73232431022271, 0.7947937226184538, 0.07023980173530475, 475.284] 2021-03-18 20:06:50,325 - sgram_fit -burstfit.fit - INFO - Converged parameters are: 2021-03-18 20:06:50,326 - sgram_fit -burstfit.fit - INFO - mu_f1: 74.6600995552445 +- 3.148966015198683 2021-03-18 20:06:50,326 - sgram_fit -burstfit.fit - INFO - sigma_f1: 29.08997449280042 +- 3.1735652932670435 2021-03-18 20:06:50,327 - sgram_fit -burstfit.fit - INFO - mu_f2: 282.0987558671014 +- 6.1593569158810695 2021-03-18 20:06:50,327 - sgram_fit -burstfit.fit - INFO - sigma_f2: 48.04757054335587 +- 6.826533560258564 2021-03-18 20:06:50,327 - sgram_fit -burstfit.fit - INFO - amp: 0.4123476364901036 +- 0.036921986314662335 2021-03-18 20:06:50,328 - sgram_fit -burstfit.fit - INFO - S: 555.8960256802608 +- 42.5662741419663 2021-03-18 20:06:50,328 - sgram_fit -burstfit.fit - INFO - mu_t: 79.02660306937426 +- 0.05278628950931549 2021-03-18 20:06:50,328 - sgram_fit -burstfit.fit - INFO - sigma_t: 0.7844582964946152 +- 0.006745505400432026 2021-03-18 20:06:50,329 - sgram_fit -burstfit.fit - INFO - tau: 0.07026178702689054 +- 0.0006220715236890621 2021-03-18 20:06:50,329 - sgram_fit -burstfit.fit - INFO - DM: 474.5754998470141 +- 0.12095473392329054 2021-03-18 20:06:50,329 - model -burstfit.fit - INFO - Making model. 2021-03-18 20:06:50,330 - model -burstfit.fit - INFO - Found 1 components. 2021-03-18 20:06:50,335 - get_off_pulse_region -burstfit.fit - INFO - mu_t and sigma_t found in params. Using those to estimate off pulse region. 2021-03-18 20:06:50,335 - get_off_pulse_region -burstfit.fit - INFO - Using sgram fit parameters. 2021-03-18 20:06:50,337 - model -burstfit.fit - INFO - Making model. 2021-03-18 20:06:50,337 - model -burstfit.fit - INFO - Found 1 components. 2021-03-18 20:06:50,342 - calc_redchisq -burstfit.fit - INFO - Reduced chi-square value of fit is: 1.003115392773371 2021-03-18 20:06:50,343 - run_tests -burstfit.fit - INFO - Running statistical tests on the residual. 2021-03-18 20:06:50,343 - run_tests -burstfit.fit - INFO - Running off pulse - off pulse test 2021-03-18 20:06:50,347 - tests -root - INFO - P values: T-test (0.65766), Kruskal (0.71932), KS (0.59128), F-test (0.29584) 2021-03-18 20:06:50,347 - run_tests -burstfit.fit - INFO - Running on pulse - off pulse (L) test 2021-03-18 20:06:50,351 - tests -root - INFO - P values: T-test (0.43787), Kruskal (0.50214), KS (0.35819), F-test (0.21593) 2021-03-18 20:06:50,351 - run_tests -burstfit.fit - INFO - On pulse residual is similar to left off pulse region. 2021-03-18 20:06:50,351 - run_tests -burstfit.fit - INFO - Running on pulse - off pulse (R) test 2021-03-18 20:06:50,355 - tests -root - INFO - P values: T-test (0.22684), Kruskal (0.27991), KS (0.10434), F-test (0.40143) 2021-03-18 20:06:50,356 - run_tests -burstfit.fit - INFO - On pulse residual is similar to right off pulse region. 2021-03-18 20:06:50,356 - fitall -burstfit.fit - INFO - On pulse residual looks like noise. Terminating individual component fitting. 2021-03-18 20:06:50,356 - fitall -burstfit.fit - INFO - Final number of components = 1. Terminating fitting. 2021-03-18 20:06:50,357 - get_off_pulse_region -burstfit.fit - INFO - mu_t and sigma_t found in params. Using those to estimate off pulse region. 2021-03-18 20:06:50,358 - get_off_pulse_region -burstfit.fit - INFO - Using sgram all-component-fit parameters.
MCMC Fit¶
Now we will use the fitted parameters from fitall()
as initial guess to MCMC. We can set different inputs/config for MCMC in a dictionary. Here are some of those inputs:
- nwalkers: Number of Walkers
- nsteps: Number of iterations per walker
- skip: Number os steps to skip
- ncores: Number of cores to use
- start_pos_dev: Percent deviation for start position of the samples
- prior_range: Percent of initial guess to set as prior range
and some more ...
The initial guess for MCMC is taken from sgram_params
. The initial positions of walkers are chosen randomly using start_pos_dev
and the initial guess. The priors is also chosen using initial guess and prior_range
. We also use parameter names to modify the priors, if possible. Hence, the priors are set based on the initial guess and some rules (eg: minimum prior for tau is 0, etc). save_results
can be used to save the h5 file with samples, and MCMC result plots.
Ideally, nsteps
should be really huge as we use autocorrelation analysis to check if our chains have converged. See: https://emcee.readthedocs.io/en/stable/tutorials/autocorr/.
But this is a tutorial, so we will use a small number.
mcmc_kwargs = {}
mcmc_kwargs = {'nwalkers':20, 'nsteps':1000,
'skip':500, 'ncores':4,
'start_pos_dev':0.01,
'prior_range':0.8,
'save_results':True,
'outname':'test_file'}
bf.run_mcmc(plot=True, **mcmc_kwargs)
2021-03-18 20:06:50,365 - set_initial_pos -root - INFO - Setting initial positions for MCMC. 2021-03-18 20:06:50,366 - set_initial_pos -burstfit.mcmc - INFO - Initial guess for MCMC is: [7.46600996e+01 2.90899745e+01 2.82098756e+02 4.80475705e+01 4.12347636e-01 5.55896026e+02 7.90266031e+01 7.84458296e-01 7.02617870e-02 4.74575500e+02] 2021-03-18 20:06:50,367 - set_priors -burstfit.mcmc - INFO - Setting priors for MCMC. 2021-03-18 20:06:50,367 - set_priors -burstfit.mcmc - INFO - Found tau in param_names. Setting its min value of prior to 0. 2021-03-18 20:06:50,368 - set_priors -burstfit.mcmc - INFO - Found sigma_t in param_names. Setting its min value of prior to 0. 2021-03-18 20:06:50,368 - set_priors -burstfit.mcmc - INFO - Found sigma_t and tau in param_names. Setting its max value of prior to 2*(max_tau_prior(0.12647121664840297) + max_sigma_t_prior(1.4120249336903075)) 2021-03-18 20:06:50,368 - set_priors -burstfit.mcmc - INFO - Found S and sigma_t in param_names. Setting its max value of prior to 5*max(ts)*max_sigma_t_prior. Setting its min value of prior to 0. 2021-03-18 20:06:50,369 - set_priors -burstfit.mcmc - INFO - Found mu_f in param_names. Setting its priors to (-2*nf, 3*nf) 2021-03-18 20:06:50,369 - set_priors -burstfit.mcmc - INFO - Found sigma_f in param_names. Setting its priors to (0, 5*nf) 2021-03-18 20:06:50,522 - run_mcmc -burstfit.mcmc - INFO - Running MCMC with the following parameters: nwalkers=20, nsteps=1000, start_pos_dev=0.01, ncores=4, skip=500 2021-03-18 20:06:50,523 - run_mcmc -burstfit.mcmc - INFO - Priors used in MCMC are: 2021-03-18 20:06:50,524 - run_mcmc -burstfit.mcmc - INFO - mu_f1_1: [-672.0, 1008.0] 2021-03-18 20:06:50,524 - run_mcmc -burstfit.mcmc - INFO - sigma_f1_1: [0.0, 1680.0] 2021-03-18 20:06:50,525 - run_mcmc -burstfit.mcmc - INFO - mu_f2_1: [-672.0, 1008.0] 2021-03-18 20:06:50,526 - run_mcmc -burstfit.mcmc - INFO - sigma_f2_1: [0.0, 1680.0] 2021-03-18 20:06:50,526 - run_mcmc -burstfit.mcmc - INFO - amp_1: [0.08246952729802069, 0.7422257456821865] 2021-03-18 20:06:50,527 - run_mcmc -burstfit.mcmc - INFO - S_1: [0.0, 17395.74384586429] 2021-03-18 20:06:50,527 - run_mcmc -burstfit.mcmc - INFO - mu_t_1: [15.805320613874848, 142.24788552487368] 2021-03-18 20:06:50,528 - run_mcmc -burstfit.mcmc - INFO - sigma_t_1: [0.0, 3.076992300677421] 2021-03-18 20:06:50,529 - run_mcmc -burstfit.mcmc - INFO - tau_1: [0.0, 3.076992300677421] 2021-03-18 20:06:50,529 - run_mcmc -burstfit.mcmc - INFO - DM_1: [94.9150999694028, 854.2358997246254] 100%|██████████| 1000/1000 [01:00<00:00, 16.43it/s] 2021-03-18 20:07:51,668 - get_chain -burstfit.mcmc - INFO - Discarding 211 steps/iterations. 2021-03-18 20:07:51,685 - print_results -burstfit.mcmc - INFO - MCMC fit results are: 2021-03-18 20:07:51,686 - print_results -burstfit.mcmc - INFO - mu_f1_1: 75.17703577398103 + 3.762 - 3.633 2021-03-18 20:07:51,687 - print_results -burstfit.mcmc - INFO - sigma_f1_1: 29.887205055075455 + 5.096 - 4.423 2021-03-18 20:07:51,689 - print_results -burstfit.mcmc - INFO - mu_f2_1: 283.61178963639765 + 7.875 - 6.185 2021-03-18 20:07:51,690 - print_results -burstfit.mcmc - INFO - sigma_f2_1: 49.785407701111154 + 9.577 - 6.473 2021-03-18 20:07:51,691 - print_results -burstfit.mcmc - INFO - amp_1: 0.40625177755613506 + 0.044 - 0.048 2021-03-18 20:07:51,692 - print_results -burstfit.mcmc - INFO - S_1: 563.6627274884572 + 71.228 - 43.345 2021-03-18 20:07:51,693 - print_results -burstfit.mcmc - INFO - mu_t_1: 79.00524624804127 + 0.177 - 0.197 2021-03-18 20:07:51,694 - print_results -burstfit.mcmc - INFO - sigma_t_1: 0.781142503644106 + 0.067 - 0.077 2021-03-18 20:07:51,695 - print_results -burstfit.mcmc - INFO - tau_1: 0.09850340814206576 + 0.121 - 0.036 2021-03-18 20:07:51,696 - print_results -burstfit.mcmc - INFO - DM_1: 474.4161297685388 + 0.236 - 0.221 2021-03-18 20:07:51,698 - model -burstfit.fit - INFO - Making model. 2021-03-18 20:07:51,699 - model -burstfit.fit - INFO - Found 1 components. 2021-03-18 20:07:51,715 - get_off_pulse_region -burstfit.fit - INFO - mu_t and sigma_t found in params. Using those to estimate off pulse region. 2021-03-18 20:07:51,716 - get_off_pulse_region -burstfit.fit - INFO - Using MCMC parameters 2021-03-18 20:07:51,721 - model -burstfit.fit - INFO - Making model. 2021-03-18 20:07:51,722 - model -burstfit.fit - INFO - Found 1 components. 2021-03-18 20:07:51,733 - calc_redchisq -burstfit.fit - INFO - Reduced chi-square value of fit is: 1.003164231520135 2021-03-18 20:07:51,734 - plot -burstfit.mcmc - INFO - Plotting MCMC results.
<Figure size 432x288 with 0 Axes>
<burstfit.mcmc.MCMC at 0x7f7b30aaa0f0>
The four plots above are the following:
- Chains for all the parameters
- Corner plot using the MCMC samples. Blue lines show the initial guess obtained using
curve_fit
- Autocorrelation plot. Blue line is autocorrelation timescale, and dotted line is N/100
- Fitting results.
As we didn't run the MCMC for enough steps, it is not converged yet. See plot 3. But the fit results plot still looks pretty good, and corner plot shows that the parameters are exploring the region near the converged values from curve_fit
.
the parameters are saved in mcmc_params
bf.mcmc_params
{1: {'popt': [75.17703577398103, 29.887205055075455, 283.61178963639765, 49.785407701111154, 0.40625177755613506, 563.6627274884572, 79.00524624804127, 0.781142503644106, 0.09850340814206576, 474.4161297685388], 'perr': [[3.6329067726420163, 3.7619922545708135], [4.422738927117528, 5.095681106782216], [6.185124891899875, 7.874580503841173], [6.472670199145796, 9.576542630017784], [0.04814605974689967, 0.04403212468232259], [43.34508358697815, 71.22780796980362], [0.1971673225994408, 0.1766266248696553], [0.07697540480743692, 0.06669097160280835], [0.036171572436307184, 0.12102096522659295], [0.2210610790757528, 0.23570220152913635]]}}