Classes | Public Member Functions | Static Public Member Functions | Static Public Attributes | Private Member Functions | Private Attributes | List of all members
ana::MCMCSamples Class Reference

#include "/cvmfs/nova-development.opensciencegrid.org/novasoft/releases/N20-11-28/CAFAna/Fit/MCMCSamples.h"

Classes

struct  Hyperparameters
 

Public Member Functions

 MCMCSamples (const std::vector< const IFitVar * > &vars={}, const std::vector< const ana::ISyst * > &systs={})
 
 MCMCSamples (MCMCSamples &&other)
 Move constructor. More...
 
 MCMCSamples (const MCMCSamples &)=delete
 No copying for now (not interested in getting the memory semantics right) More...
 
MCMCSamplesoperator= (MCMCSamples &&other)
 Move assignment: need to call SetupTree(), so the default move assignment operator won't work. More...
 
MCMCSamplesoperator= (const MCMCSamples &)=delete
 Copying still forbidden. More...
 
void AddSample (const std::vector< double > &sample)
 
void AdoptSamples (MCMCSamples &&other)
 Add the samples from other into this MCMCSamples. Warning: other will be cleared! More...
 
std::size_t BestFitSampleIdx () const
 Retrieve the index of the best-fit sample (lowest LL), which can then be used with the SampleLL() or SampleValue() methods. More...
 
void Clear ()
 Discard any samples. More...
 
const HyperparametersHyperparams () const
 
Bayesian1DMarginal MarginalizeTo (const IFitVar *var, BayesianMarginal::MarginalMode marginalMode=BayesianMarginal::MarginalMode::kHistogram) const
 Marginalize over all other variables to obtain a 1D profile in var. More...
 
Bayesian1DMarginal MarginalizeTo (const ISyst *syst, BayesianMarginal::MarginalMode marginalMode=BayesianMarginal::MarginalMode::kHistogram) const
 Marginalize over all other variables to obtain a 1D profile in syst. More...
 
template<typename SystOrVar1 , typename SystOrVar2 >
BayesianSurface MarginalizeTo (const SystOrVar1 *x, int nbinsx, double xmin, double xmax, const SystOrVar2 *y, int nbinsy, double ymin, double ymax, BayesianMarginal::MarginalMode marginalMode=BayesianMarginal::MarginalMode::kHistogram) const
 Marginalize over all other variables to obtain a 2D surface in x and y. More...
 
template<typename T >
double MaxValue (const T *var) const
 Max value of this var/syst in the set. More...
 
template<typename T >
double MinValue (const T *var) const
 Min value of this var/syst in the set. More...
 
std::size_t NumSamples () const
 How many samples do we have? More...
 
std::pair< std::size_t, double > QuantileLL (double quantile) const
 
std::pair< std::size_t, double > QuantileLL (double quantile, std::vector< std::pair< std::size_t, double >> &LLs) const
 
std::map< double, std::pair< std::size_t, double > > QuantileLL (const std::vector< double > &quantiles) const
 Like QuantileLL(double) except requesting multiple LLs at once. More...
 
void RunDiagnostics (const StanConfig &cfg) const
 Do some checks on the post-fit samples. More...
 
MCMCSample Sample (std::size_t idx) const
 
double SampleLL (std::size_t idx) const
 Get the LL for sample number . More...
 
double SampleValue (const IFitVar *var, std::size_t idx) const
 Get the value of Var var for sample number . More...
 
double SampleValue (const ana::ISyst *syst, std::size_t idx) const
 Get the value of Syst syst for sample number . More...
 
void SampleValues (std::size_t idx, const std::vector< const ana::IFitVar * > &vars, std::map< const ana::IFitVar *, double > &varVals) const
 Get the values of FitVars vars for sample number idx. More...
 
void SampleValues (std::size_t idx, const std::vector< const ana::IFitVar * > &vars, std::map< const ana::IFitVar *, double > &varVals, const std::vector< const ana::ISyst * > &systs, std::map< const ana::ISyst *, double > &systVals) const
 Get the values of FitVars vars and ISysts systs for sample number idx. More...
 
double SamplingTime () const
 
void SetHyperparams (double stepSize, const TMatrixD &invMassMatrix)
 
void SetHyperparams (Hyperparameters h)
 
void SetNames (const std::vector< std::string > &names)
 
void SetSamplingTime (double s)
 
const TTree * ToTTree () const
 Get a TTree with the MCMC samples in it. More...
 
const std::vector< const ana::ISyst * > & Systs () const
 Which Systs are sampled in these samples? More...
 
const std::vector< const IFitVar * > & Vars () const
 Which Vars are sampled in these samples? More...
 
void SaveTo (TDirectory *dir, const std::string &name) const
 Save this guy to a file so we don't have to rerun the MCMC. More...
 

Static Public Member Functions

static std::unique_ptr< MCMCSamplesLoadFrom (TDirectory *dir, const std::string &name)
 Load from file. More...
 

Static Public Attributes

static const std::string LOGLIKELIHOOD_BRANCH_NAME = "logprob"
 

Private Member Functions

 MCMCSamples (std::size_t offset, const std::vector< std::string > &diagBranchNames, const std::vector< const IFitVar * > &vars, const std::vector< const ana::ISyst * > &systs, std::unique_ptr< TTree > &tree, const Hyperparameters &hyperParams, double samplingTime)
 Internal-use constructor needed for LoadFrom() More...
 
std::size_t DiagOffset (const std::string &diagName) const
 Where in fDiagnosticVals is the given diagnostic? More...
 
double SampleValue (std::size_t rowIdx, const std::string &branchName, std::size_t varIdx) const
 Internal method for getting the value of a var when you know where it is in fEntryVals. More...
 
void ParseDiagnosticBranches (const std::vector< std::string > &names)
 Set up the storage tree based on the branch names given us by Stan. More...
 
void SetupTree ()
 Hook up the storage tree to our internal variables. More...
 
std::vector< std::pair< std::size_t, double > > SortedLLs () const
 
std::size_t VarOffset (const IFitVar *var) const
 Where in fEntryVals is the given var? More...
 
std::size_t VarOffset (const ana::ISyst *syst) const
 Where in fEntryVals is the given syst? More...
 

Private Attributes

std::size_t fOffset
 
std::vector< std::stringfDiagBranches
 
std::vector< const ana::IFitVar * > fVars
 
std::vector< const ana::ISyst * > fSysts
 
std::size_t fBestFitSampleIdx
 
bool fBestFitFound
 
std::unique_ptr< TTree > fSamples
 
double fEntryLL
 
std::vector< double > fEntryVals
 
std::vector< double > fDiagnosticVals
 
Hyperparameters fHyperparams
 Hyperparameters deduced after adaptation, or manually set. More...
 
double fSamplingTime
 how long did we spend sampling? More...
 

Detailed Description

Storage for a list of MCMC samples.

The internal storage is as a TTree (which simplifies persistence); for efficiency and simplicity reasons the tree branches are simply doubles (rather than MCMCSample objects), but MCMCSample objects can be obtained using the Sample() method.

Parameters
varOffsetThe offset to the first Var value. (Previous values are the LL and internal fitter vars.)
varsThe ana::Vars passed to the fitter
systsThe ana::ISysts passed to the fitter

Definition at line 32 of file MCMCSamples.h.

Constructor & Destructor Documentation

ana::MCMCSamples::MCMCSamples ( const std::vector< const IFitVar * > &  vars = {},
const std::vector< const ana::ISyst * > &  systs = {} 
)

Build the MCMCSamples object.

Parameters
varsVars you want fitted by the MCMC
systsSystematics that should be marginalized

Definition at line 45 of file MCMCSamples.cxx.

Referenced by LoadFrom(), ana::MCMCSamples::Hyperparameters::operator=(), and Vars().

47  : fOffset(0),
48  fVars(vars),
49  fSysts(systs),
50  fBestFitFound(false),
51  fSamples(std::make_unique<TTree>("samples", "MCMC samples"))
52  {}
std::vector< SystGroupDef > systs
Definition: syst_header.h:385
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::size_t fOffset
Definition: MCMCSamples.h:252
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
const std::map< std::pair< std::string, std::string >, Variable > vars
ana::MCMCSamples::MCMCSamples ( MCMCSamples &&  other)

Move constructor.

Definition at line 76 of file MCMCSamples.cxx.

References fSamples, and SetupTree().

77  : fOffset(std::move(other.fOffset)),
78  fDiagBranches(std::move(other.fDiagBranches)),
79  fVars(std::move(other.fVars)),
80  fSysts(std::move(other.fSysts)),
81  fBestFitSampleIdx(std::move(other.fBestFitSampleIdx)),
82  fBestFitFound(std::move(other.fBestFitFound)),
83  fSamples(std::move(other.fSamples)),
84  fEntryLL(std::move(other.fEntryLL)),
85  fEntryVals(std::move(other.fEntryVals)),
86  fDiagnosticVals(std::move(other.fDiagnosticVals)),
87  fHyperparams(std::move(other.fHyperparams)),
88  fSamplingTime(std::move(other.fSamplingTime))
89  {
90  // note: SetupTree() takes care of fDiagnosticVals and fEntryVals
91  SetupTree();
92 
93  // almost certainly we don't want to be writing back to whatever file this may have come from ...
94  fSamples->SetDirectory(nullptr);
95  }
std::size_t fBestFitSampleIdx
Definition: MCMCSamples.h:257
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::size_t fOffset
Definition: MCMCSamples.h:252
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
Hyperparameters fHyperparams
Hyperparameters deduced after adaptation, or manually set.
Definition: MCMCSamples.h:265
std::vector< double > fDiagnosticVals
Definition: MCMCSamples.h:264
double fSamplingTime
how long did we spend sampling?
Definition: MCMCSamples.h:266
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
std::vector< double > fEntryVals
Definition: MCMCSamples.h:263
void SetupTree()
Hook up the storage tree to our internal variables.
ana::MCMCSamples::MCMCSamples ( const MCMCSamples )
delete

No copying for now (not interested in getting the memory semantics right)

ana::MCMCSamples::MCMCSamples ( std::size_t  offset,
const std::vector< std::string > &  diagBranchNames,
const std::vector< const IFitVar * > &  vars,
const std::vector< const ana::ISyst * > &  systs,
std::unique_ptr< TTree > &  tree,
const Hyperparameters hyperParams,
double  samplingTime 
)
private

Internal-use constructor needed for LoadFrom()

Definition at line 55 of file MCMCSamples.cxx.

References SetupTree().

62  : fOffset(offset),
63  fDiagBranches(diagBranchNames),
64  fVars(vars),
65  fSysts(systs),
66  fBestFitFound(false),
67  fSamples(std::move(tree)),
68  fHyperparams(hyperParams),
69  fSamplingTime(samplingTime)
70  {
71  // note: SetupTree() takes care of fDiagnosticVals and fEntryVals
72  SetupTree();
73  }
std::vector< SystGroupDef > systs
Definition: syst_header.h:385
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::size_t fOffset
Definition: MCMCSamples.h:252
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
const std::map< std::pair< std::string, std::string >, Variable > vars
Hyperparameters fHyperparams
Hyperparameters deduced after adaptation, or manually set.
Definition: MCMCSamples.h:265
double fSamplingTime
how long did we spend sampling?
Definition: MCMCSamples.h:266
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
void SetupTree()
Hook up the storage tree to our internal variables.

Member Function Documentation

void ana::MCMCSamples::AddSample ( const std::vector< double > &  sample)

Definition at line 125 of file MCMCSamples.cxx.

References ana::assert(), calc, fDiagnosticVals, fEntryLL, fEntryVals, fOffset, fSamples, fVars, compare_h5_caf::idx, and febshutoff_auto::val.

Referenced by ana::MCMCSamples::Hyperparameters::operator=().

126  {
127 // std::cout << "\nGot sample: ";
128 // for (const auto & s: sample)
129 // std::cout << s << " ";
130 // std::cout << std::endl;
131 // std::cout << "sample size: " << sample.size() << std::endl;
132 // std::cout << " target size = LL (1) + diagVals (" << fDiagnosticVals.size() << ") + entryVals (" << fEntryVals.size() << ")" << std::endl;
133  fEntryLL = sample[0];
134  for (std::size_t idx = 1; idx < fOffset && idx < sample.size(); idx++)
135  {
136  assert(idx-1 < fDiagnosticVals.size());
137  fDiagnosticVals[idx - 1] = sample[idx];
138  }
139  for (std::size_t targetIdx = 0, sourceIdx = fOffset; sourceIdx < sample.size(); targetIdx++, sourceIdx++)
140  {
141  assert(targetIdx < fEntryVals.size());
142  double val = sample[sourceIdx];
143  // if it's a Var, allow it to rewrite the value if it wants to
144  // (e.g., dcp is pinned back to (0, 2pi))
145  if (targetIdx < fVars.size())
146  {
147  static osc::OscCalcAnalytic calc;
148  fVars[targetIdx]->SetValue(&calc, val);
149  val = fVars[targetIdx]->GetValue(&calc);
150  }
151  fEntryVals[targetIdx] = val;
152  }
153  fSamples->Fill();
154 // fSamples->Scan("*");
155  }
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::size_t fOffset
Definition: MCMCSamples.h:252
osc::OscCalcDumb calc
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
std::vector< double > fDiagnosticVals
Definition: MCMCSamples.h:264
assert(nhit_max >=nhit_nbins)
std::vector< double > fEntryVals
Definition: MCMCSamples.h:263
void ana::MCMCSamples::AdoptSamples ( MCMCSamples &&  other)

Add the samples from other into this MCMCSamples. Warning: other will be cleared!

Definition at line 158 of file MCMCSamples.cxx.

References fBestFitFound, fDiagBranches, fSamples, fSysts, fVars, and fhicl::other.

Referenced by ana::MCMCSamples::Hyperparameters::operator=().

159  {
160  // if we don't have any samples at all, then just swap
161  if (!fSamples || (fDiagBranches.size() + fVars.size() + fSysts.size()) == 0)
162  {
163  this->fSamples.reset(other.fSamples->CloneTree());
164  this->fDiagBranches = other.fDiagBranches;
165  this->fVars = other.fVars;
166  this->fSysts = other.fSysts;
167  //SetupTree();
168 
169  // make sure we extract this guy out of his old directory, if any,
170  // otherwise he wants to write back to that file
171  this->fSamples->LoadBaskets();
172  this->fSamples->SetDirectory(nullptr);
173 
174  return;
175  }
176 
177  // first, check that the branches (etc.) all match.
178  // if they don't, throw an exception
179  bool branchesSame = true;
180  branchesSame = branchesSame && fDiagBranches == other.fDiagBranches;
181  branchesSame = branchesSame && fVars == other.fVars;
182  branchesSame = branchesSame && fSysts == other.fSysts;
183  if (!branchesSame)
184  throw std::runtime_error("MCMCSamples::AdoptSamples(): branches are not the same!");
185 
186 
187  // then, take the entries from its TTree and and them to ours using TTree::Merge().
188  // clear the other tree once done.
189  BranchStatusResetter bsr(fSamples.get()); // turn all branches off when done
190  fSamples->SetBranchStatus("*", true);
191  other.fSamples->SetBranchStatus("*", true);
192  TList otherTreeList;
193  otherTreeList.SetOwner(false);
194  otherTreeList.Add(other.fSamples.get());
195  fSamples->Merge(&otherTreeList);
196 
197  otherTreeList.Clear();
198  other.fSamples.reset(nullptr);
199 
200  // clear the rest of the other MCMCSamples so it isn't left in an intermediate state
201  other.fDiagBranches.clear();
202  other.fVars.clear();
203  other.fSysts.clear();
204  other.fOffset = 0;
205  other.fBestFitSampleIdx = 0;
206  other.fBestFitFound = false;
207  other.fEntryLL = 0;
208  other.fDiagnosticVals.clear();
209  other.fEntryVals.clear();
210 
211  // finally, the best fit point isn't necessarily the same any more, so force recalculation the next time it's needed
212  fBestFitFound = false;
213  }
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
std::size_t ana::MCMCSamples::BestFitSampleIdx ( ) const

Retrieve the index of the best-fit sample (lowest LL), which can then be used with the SampleLL() or SampleValue() methods.

Definition at line 216 of file MCMCSamples.cxx.

References compare_h5_caf::branch, fBestFitFound, fBestFitSampleIdx, fEntryLL, fSamples, and compare_h5_caf::idx.

Referenced by ana::BayesianSurface::BuildHist(), ana::StanFitter::FitHelperSeeded(), and ana::MCMCSamples::Hyperparameters::operator=().

217  {
218  if (fBestFitFound)
219  return fBestFitSampleIdx;
220 
221  double maxLL = -std::numeric_limits<double>::infinity();
222  auto branch = fSamples->GetBranch(MCMCSamples::LOGLIKELIHOOD_BRANCH_NAME.c_str());
223  branch->SetStatus(true);
224  for (int idx = 0; idx < fSamples->GetEntries(); idx++)
225  {
226  branch->GetEntry(idx);
227  if (fEntryLL > maxLL)
228  {
230  maxLL = fEntryLL;
231  fBestFitFound = true;
232  }
233  }
234  return fBestFitSampleIdx;
235  }
std::size_t fBestFitSampleIdx
Definition: MCMCSamples.h:257
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
static const std::string LOGLIKELIHOOD_BRANCH_NAME
Definition: MCMCSamples.h:37
void ana::MCMCSamples::Clear ( )
inline

Discard any samples.

Definition at line 89 of file MCMCSamples.h.

References fSamples.

89 { fSamples->Clear(); }
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
std::size_t ana::MCMCSamples::DiagOffset ( const std::string diagName) const
private

Where in fDiagnosticVals is the given diagnostic?

Definition at line 238 of file MCMCSamples.cxx.

References ana::assert(), distance(), and fDiagBranches.

Referenced by RunDiagnostics(), and Vars().

239  {
240  auto itr = std::find(fDiagBranches.begin(), fDiagBranches.end(), diagName);
241  assert(itr != fDiagBranches.end() && ("Unknown diagnostic branch: " + diagName).c_str());
242  return std::distance(fDiagBranches.begin(), itr);
243  }
unsigned distance(const T &t1, const T &t2)
assert(nhit_max >=nhit_nbins)
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
const Hyperparameters& ana::MCMCSamples::Hyperparams ( ) const
inline

Definition at line 90 of file MCMCSamples.h.

References fHyperparams, ana::BayesianMarginal::kHistogram, MarginalizeTo(), and PandAna.Demos.tute_pid_validation::var.

Referenced by ana::StanFitter::RunHMC().

90 { return fHyperparams; }
Hyperparameters fHyperparams
Hyperparameters deduced after adaptation, or manually set.
Definition: MCMCSamples.h:265
std::unique_ptr< MCMCSamples > ana::MCMCSamples::LoadFrom ( TDirectory *  dir,
const std::string name 
)
static

Load from file.

Definition at line 247 of file MCMCSamples.cxx.

References ana::assert(), dir, findDuplicateFiles::key, MCMCSamples(), PandAna.reco_validation.add_data::offset, submit_syst::str, systNames, systs, and getGoodRuns4SAM::tag.

Referenced by test_stanfit_statsonly(), test_stanfit_withsysts(), and Vars().

249  {
250  dir = dir->GetDirectory(name.c_str()); // switch to subdir
251  assert(dir);
252  DontAddDirectory guard;
253 
254  TObjString *tag = (TObjString *) dir->Get("type");
255  assert(tag);
256  assert(tag->GetString() == "MCMCSamples");
257 
258  auto offset = dynamic_cast<TParameter<int>*>(dir->Get("offset"));
259 
260  auto diagBranchNames = dynamic_cast<TList*>(dir->Get("diagBranchNames"));
261  std::vector<std::string> diagBranches;
262  for (const TObject * brNameObj : *diagBranchNames)
263  {
264  auto str = dynamic_cast<const TObjString*>(brNameObj);
265  if (str)
266  diagBranches.emplace_back(str->GetName());
267  }
268 
269  auto fitVarNames = dynamic_cast<TList*>(dir->Get("fitVarNames"));
270  std::vector<const IFitVar*> fitVars;
271  for (const TObject * varNmObj : *fitVarNames)
272  {
273  auto str = dynamic_cast<const TObjString*>(varNmObj);
274  if (str)
275  fitVars.emplace_back(Registry<IFitVar>::ShortNameToPtr(str->GetName()));
276  }
277 
278  auto systNames = dynamic_cast<TList*>(dir->Get("systNames"));
279  std::vector<const ISyst*> systs;
280  for (const TObject * systNmObj : *systNames)
281  {
282  auto str = dynamic_cast<const TObjString*>(systNmObj);
283  if (str)
284  systs.emplace_back(Registry<ISyst>::ShortNameToPtr(str->GetName()));
285  }
286 
287  auto samplesPtr = dynamic_cast<TTree*>(dir->Get("samples"));
288  std::unique_ptr<TTree> samples(samplesPtr);
289  if (samples->GetCurrentFile())
290  {
291  samples->SetBranchStatus("*", true);
292  samples->LoadBaskets(); // read the entire TTree into memory
293  samples->SetDirectory(nullptr); // disassociate it from the file it came from so that when the file is closed it persists
294  }
295 
296  // these may not exist (hyperparameters are not merged when hadd'ing samples)
297  double stepSize = std::numeric_limits<double>::signaling_NaN();
298  std::unique_ptr<TMatrixD> invMetric;
299  auto hyperParamsDir = dynamic_cast<TDirectory *>(dir->Get("hyperparams"));
300  if (hyperParamsDir)
301  {
302  if (auto key = hyperParamsDir->FindKey("stepsize"))
303  stepSize = key->ReadObject<TParameter<double>>()->GetVal();
304  if (auto key = hyperParamsDir->FindKey("inv_metric"))
305  invMetric = std::unique_ptr<TMatrixD>(key->ReadObject<TMatrixD>());
306  }
307  Hyperparameters hyperparams{stepSize, std::move(invMetric)};
308 
309  // ditto
310  double samplingTime = std::numeric_limits<double>::signaling_NaN();
311  auto samplingTimeDir = dynamic_cast<TDirectory*>(dir->Get("samplingTime"));
312  if (!samplingTimeDir) // for backwards compatibility. might be stored at top level
313  samplingTimeDir = dir;
314  if (auto key = samplingTimeDir->FindKey("samplingTime"))
315  samplingTime = key->ReadObject<TParameter<double>>()->GetVal();
316 
317  delete dir;
318 
319  return std::unique_ptr<MCMCSamples>(new MCMCSamples(offset->GetVal(),
320  diagBranches,
321  fitVars,
322  systs,
323  samples,
324  hyperparams,
325  samplingTime));
326  }
const XML_Char * name
Definition: expat.h:151
std::vector< SystGroupDef > systs
Definition: syst_header.h:385
MCMCSamples(const std::vector< const IFitVar * > &vars={}, const std::vector< const ana::ISyst * > &systs={})
Definition: MCMCSamples.cxx:45
TDirectory * dir
Definition: macro.C:5
std::map< std::string, std::string > systNames
Definition: PlotUnfolding.C:32
assert(nhit_max >=nhit_nbins)
static const T * ShortNameToPtr(const std::string &s, bool allowFail=false)
Definition: Registry.cxx:60
Bayesian1DMarginal ana::MCMCSamples::MarginalizeTo ( const IFitVar var,
BayesianMarginal::MarginalMode  marginalMode = BayesianMarginal::MarginalMode::kHistogram 
) const

Marginalize over all other variables to obtain a 1D profile in var.

Definition at line 329 of file MCMCSamples.cxx.

References ana::assert(), fVars, and PandAna.Demos.tute_pid_validation::var.

Referenced by Hyperparams().

330  {
331  assert(std::find(fVars.begin(), fVars.end(), var) != fVars.end());
332  return Bayesian1DMarginal(*this, var, marginalMode);
333  }
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
assert(nhit_max >=nhit_nbins)
Bayesian1DMarginal ana::MCMCSamples::MarginalizeTo ( const ISyst syst,
BayesianMarginal::MarginalMode  marginalMode = BayesianMarginal::MarginalMode::kHistogram 
) const

Marginalize over all other variables to obtain a 1D profile in syst.

Definition at line 336 of file MCMCSamples.cxx.

References ana::assert(), and fSysts.

337  {
338  assert(std::find(fSysts.begin(), fSysts.end(), syst) != fSysts.end());
339  return Bayesian1DMarginal(*this, syst, marginalMode);
340  }
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
assert(nhit_max >=nhit_nbins)
template<typename SystOrVar1 , typename SystOrVar2 >
BayesianSurface ana::MCMCSamples::MarginalizeTo ( const SystOrVar1 *  x,
int  nbinsx,
double  xmin,
double  xmax,
const SystOrVar2 *  y,
int  nbinsy,
double  ymin,
double  ymax,
BayesianMarginal::MarginalMode  marginalMode = BayesianMarginal::MarginalMode::kHistogram 
) const
inline

Marginalize over all other variables to obtain a 2D surface in x and y.

Definition at line 103 of file MCMCSamples.h.

References ana::assert(), fSysts, fVars, MaxValue(), MinValue(), T, submit_syst::x, and submit_syst::y.

106  {
107  static_assert((std::is_base_of_v<IFitVar, SystOrVar1> || std::is_same_v<SystOrVar1, ISyst>)
108  && (std::is_base_of_v<IFitVar, SystOrVar2> || std::is_same_v<SystOrVar2, ISyst>),
109  "MCMCSamples::MarginalizeTo() can only be used with Systs or Vars");
110  if constexpr(std::is_base_of_v<IFitVar, SystOrVar1>)
111  assert(std::find(fVars.begin(), fVars.end(), x) != fVars.end());
112  else
113  assert(std::find(fSysts.begin(), fSysts.end(), x) != fSysts.end());
114 
115  if constexpr(std::is_base_of_v<IFitVar, SystOrVar2>)
116  assert(std::find(fVars.begin(), fVars.end(), y) != fVars.end());
117  else
118  assert(std::find(fSysts.begin(), fSysts.end(), y) != fSysts.end());
119  return BayesianSurface(*this,
120  x, nbinsx, xmin, xmax,
121  y, nbinsy, ymin, ymax,
122  marginalMode);
123  }
std::map< std::string, double > xmax
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
Double_t ymax
Definition: plot.C:25
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
if(dump)
Int_t nbinsx
Definition: plot.C:23
assert(nhit_max >=nhit_nbins)
Double_t ymin
Definition: plot.C:24
template<typename T >
template double ana::MCMCSamples::MaxValue ( const T var) const

Max value of this var/syst in the set.

Definition at line 344 of file MCMCSamples.cxx.

References fSamples, compare_h5_caf::idx, cet::sqlite::max(), PandAna.reco_validation.add_data::offset, SampleValue(), febshutoff_auto::val, and VarOffset().

Referenced by MarginalizeTo().

345  {
347  "MCMCSamples::MaxValue() can only be used with IFitVars and ISysts");
348  double max = -std::numeric_limits<double>::infinity();
349  std::size_t offset = VarOffset(var);
350  for (int idx = 0; idx < fSamples->GetEntries(); idx++)
351  {
352  double val = SampleValue(idx, var->ShortName(), offset);
353  if (val > max)
354  max = val;
355  }
356  return max;
357  }
std::size_t VarOffset(const IFitVar *var) const
Where in fEntryVals is the given var?
double SampleValue(const IFitVar *var, std::size_t idx) const
Get the value of Var var for sample number .
Definition: MCMCSamples.h:169
const XML_Char int const XML_Char * value
Definition: expat.h:331
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
T max(sqlite3 *const db, std::string const &table_name, std::string const &column_name)
Definition: statistics.h:68
template<typename T >
template double ana::MCMCSamples::MinValue ( const T var) const

Min value of this var/syst in the set.

Definition at line 365 of file MCMCSamples.cxx.

References fSamples, compare_h5_caf::idx, min(), PandAna.reco_validation.add_data::offset, SampleValue(), febshutoff_auto::val, and VarOffset().

Referenced by MarginalizeTo().

366  {
368  "MCMCSamples::MinValue() can only be used with IFitVars and ISysts");
369  double min = std::numeric_limits<double>::infinity();
370  std::size_t offset = VarOffset(var);
371  for (int idx = 0; idx < fSamples->GetEntries(); idx++)
372  {
373  double val = SampleValue(idx, var->ShortName(), offset);
374  if (val < min)
375  min = val;
376  }
377  return min;
378  }
std::size_t VarOffset(const IFitVar *var) const
Where in fEntryVals is the given var?
double SampleValue(const IFitVar *var, std::size_t idx) const
Get the value of Var var for sample number .
Definition: MCMCSamples.h:169
const XML_Char int const XML_Char * value
Definition: expat.h:331
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
static float min(const float a, const float b, const float c)
Definition: absgeo.cxx:45
std::size_t ana::MCMCSamples::NumSamples ( ) const
inline
MCMCSamples & ana::MCMCSamples::operator= ( MCMCSamples &&  other)

Move assignment: need to call SetupTree(), so the default move assignment operator won't work.

Definition at line 98 of file MCMCSamples.cxx.

References fBestFitFound, fBestFitSampleIdx, fDiagBranches, fDiagnosticVals, fEntryLL, fEntryVals, fHyperparams, fOffset, fSamples, fSamplingTime, fSysts, fVars, fhicl::other, and SetupTree().

99  {
100  fOffset = std::move(other.fOffset);
101 
102  fVars = std::move(other.fVars);
103  fSysts = std::move(other.fSysts);
104 
105  fBestFitSampleIdx = std::move(other.fBestFitSampleIdx);
106  fBestFitFound = std::move(other.fBestFitFound);
107 
108  fSamples = std::move(other.fSamples);
109 
110  fEntryLL = std::move(other.fEntryLL);
111  fDiagBranches = std::move(other.fDiagBranches);
112 
113  fDiagnosticVals = std::move(other.fDiagnosticVals);
114  fEntryVals = std::move(other.fEntryVals);
115  fHyperparams = std::move(other.fHyperparams);
116  fSamplingTime = std::move(other.fSamplingTime);
117 
118  // note: SetupTree() takes care of fDiagnosticVals and fEntryVals
119  SetupTree();
120 
121  return *this;
122  }
std::size_t fBestFitSampleIdx
Definition: MCMCSamples.h:257
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::size_t fOffset
Definition: MCMCSamples.h:252
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
Hyperparameters fHyperparams
Hyperparameters deduced after adaptation, or manually set.
Definition: MCMCSamples.h:265
std::vector< double > fDiagnosticVals
Definition: MCMCSamples.h:264
double fSamplingTime
how long did we spend sampling?
Definition: MCMCSamples.h:266
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
std::vector< double > fEntryVals
Definition: MCMCSamples.h:263
void SetupTree()
Hook up the storage tree to our internal variables.
MCMCSamples& ana::MCMCSamples::operator= ( const MCMCSamples )
delete

Copying still forbidden.

void ana::MCMCSamples::ParseDiagnosticBranches ( const std::vector< std::string > &  names)
private

Set up the storage tree based on the branch names given us by Stan.

Definition at line 385 of file MCMCSamples.cxx.

References ana::assert(), fDiagBranches, fDiagnosticVals, fSysts, fVars, and compare_h5_caf::idx.

Referenced by SetNames(), and Vars().

386  {
387  std::size_t idx = 1;
388  fDiagnosticVals.resize(names.size() - (fVars.size() + fSysts.size() + 1));
389  auto firstFitVar = fVars.empty() ? fSysts[0]->ShortName() : fVars[0]->ShortName();
390  for (; idx < names.size(); idx++)
391  {
392  if (names[idx] == firstFitVar)
393  break;
394  fDiagBranches.push_back(names[idx]);
395  }
396  assert(idx < names.size() && "Did not find fit variables in Stan output!");
397  assert (names.size() - idx >= fVars.size() + fSysts.size() && "Stan output didn't contain all FitVars & Systs!");
398  }
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::vector< double > fDiagnosticVals
Definition: MCMCSamples.h:264
assert(nhit_max >=nhit_nbins)
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
std::pair< std::size_t, double > ana::MCMCSamples::QuantileLL ( double  quantile) const

Determine the LL at given quantile

Parameters
quantilee.g. 90% smallest LL of set of samples –> 0.9
Returns
the LL at given quantile

Note that the necessary ordered list is constructed and destroyed each time this method is called, so if you know you need multiple quantiles, use one of the other signatures.

Definition at line 401 of file MCMCSamples.cxx.

References SortedLLs().

Referenced by ana::BayesianMarginal::BayesianMarginal(), NumSamples(), QuantileLL(), and ana::BayesianMarginal::QuantileThreshold().

402  {
403  auto LLs = SortedLLs();
404  return LLs[std::size_t((1 - quantile) * LLs.size())];
405  }
std::vector< std::pair< std::size_t, double > > SortedLLs() const
unsigned int quantile(std::vector< std::vector< double >> v, float E, float hadEfrac)
Definition: Toy_analyses.C:480
std::pair< std::size_t, double > ana::MCMCSamples::QuantileLL ( double  quantile,
std::vector< std::pair< std::size_t, double >> &  LLs 
) const

Like QuantileLL(double) except that the LL vector is returned to you. If you pass a non-empty vector, that vector is assumed to already contain the sorted LL list (facilitating re-use).

Definition at line 408 of file MCMCSamples.cxx.

References SortedLLs().

410  {
411  if (LLs.empty())
412  LLs = SortedLLs(); // I think this should be efficient since std::vector has a move assignment operator
413  return LLs[std::size_t((1 - quantile) * (LLs.size()-1))]; // -1 so that we don't fall off the edge at Q=1.0
414  }
std::vector< std::pair< std::size_t, double > > SortedLLs() const
unsigned int quantile(std::vector< std::vector< double >> v, float E, float hadEfrac)
Definition: Toy_analyses.C:480
std::map< double, std::pair< std::size_t, double > > ana::MCMCSamples::QuantileLL ( const std::vector< double > &  quantiles) const

Like QuantileLL(double) except requesting multiple LLs at once.

Definition at line 417 of file MCMCSamples.cxx.

References quantile(), QuantileLL(), and runNovaSAM::ret.

418  {
419  std::map<double, std::pair<std::size_t, double> > ret;
420  std::vector<std::pair<std::size_t, double> > LLs;
421  for (const auto & quantile : quantiles)
422  ret[quantile] = QuantileLL(quantile, LLs);
423 
424  return ret;
425  }
std::vector< double > quantiles(TH1D *h)
Definition: absCal.cxx:528
std::pair< std::size_t, double > QuantileLL(double quantile) const
unsigned int quantile(std::vector< std::vector< double >> v, float E, float hadEfrac)
Definition: Toy_analyses.C:480
void ana::MCMCSamples::RunDiagnostics ( const StanConfig cfg) const

Do some checks on the post-fit samples.

Definition at line 428 of file MCMCSamples.cxx.

References om::cout, d, DiagOffset(), allTimeWatchdog::endl, fDiagnosticVals, fSamples, compare_h5_caf::idx, ana::StanConfig::max_depth, and util::sqr().

Referenced by ana::StanFitter::FitHelperSeeded(), and NumSamples().

429  {
430  BranchStatusResetter bsr(fSamples.get()); // turn branches off when done
431  // these diagnostics adapted from CmdStan's diagnose.cpp
432  auto treeDepthBr = fSamples->GetBranch("treedepth__");
433  if (treeDepthBr)
434  {
435  treeDepthBr->SetStatus(true);
436  unsigned int numMax = 0;
437  auto brOffset = DiagOffset(treeDepthBr->GetName());
438  for (int idx = 0; idx < treeDepthBr->GetEntries(); idx++)
439  {
440  treeDepthBr->GetEntry(idx);
441  if (fDiagnosticVals[brOffset] >= cfg.max_depth)
442  numMax++;
443  }
444  if (numMax > 0)
445  {
446  std::cout << numMax << " of " << treeDepthBr->GetEntries() << " ("
447  << std::setprecision(2)
448  << 100 * static_cast<double>(numMax) / treeDepthBr->GetEntries()
449  << "%) transitions hit the maximum treedepth limit of "
450  << cfg.max_depth << ", or 2^" << cfg.max_depth << " leapfrog steps."
451  << " Trajectories that are prematurely terminated due to this"
452  << " limit will result in slow exploration and you should"
453  << " increase the limit to ensure optimal performance."
454  << std::endl << std::endl;
455 
456  }
457  }
458 
459  auto divergentBr = fSamples->GetBranch("divergent__");
460  if (divergentBr)
461  {
462  divergentBr->SetStatus(true);
463  unsigned int numDiv = 0;
464  auto brOffset = DiagOffset(divergentBr->GetName());
465  for (int idx = 0; idx < divergentBr->GetEntries(); idx++)
466  {
467  divergentBr->GetEntry(idx);
468  if (fDiagnosticVals[brOffset] >0)
469  numDiv++;
470  }
471  if (numDiv > 0)
472  std::cout << numDiv << " of " << divergentBr->GetEntries() << " ("
473  << std::setprecision(2)
474  << 100 * static_cast<double>(numDiv) / divergentBr->GetEntries()
475  << "%) transitions ended with a divergence. These divergent"
476  << " transitions indicate that HMC is not fully able to"
477  << " explore the posterior distribution. Try rerunning with"
478  << " adapt_delta set to a larger value and see if the"
479  << " divergences vanish. If increasing adapt delta towards"
480  << " 1 does not remove the divergences then you will likely"
481  << " need to reparameterize your model."
482  << std::endl << std::endl;
483  }
484 
485  auto energyBr = fSamples->GetBranch("energy__");
486  if (energyBr)
487  {
488  double delta_e_sq_mean = 0;
489  double e_mean = 0;
490  double e_var = 0;
491 
492  double e_sample = 0;
493  double e_sample_prev = 0;
494 
495  auto brOffset = DiagOffset(energyBr->GetName());
496  for (int idx = 0; idx < energyBr->GetEntries(); idx++)
497  {
498  energyBr->SetStatus(true);
499  energyBr->GetEntry(idx);
500  e_sample = fDiagnosticVals[brOffset];
501 
502  double delta_e_sq = util::sqr(e_sample - e_sample_prev);
503  double d = delta_e_sq - delta_e_sq_mean;
504  delta_e_sq_mean += d / (idx + 1);
505 
506  d = e_sample - e_mean;
507  e_mean += d / (idx + 2);
508  e_var += d * (e_sample - e_mean);
509 
510  e_sample_prev = e_sample;
511  }
512 
513  e_var /= static_cast<double>(energyBr->GetEntries() - 1);
514 
515  double e_bfmi = delta_e_sq_mean / e_var;
516 
517  const double e_bfmi_threshold = 0.3;
518  if (e_bfmi < e_bfmi_threshold)
519  std::cout << "The E-BFMI, " << e_bfmi << ", is below the nominal"
520  << " threshold of " << e_bfmi_threshold << " which suggests"
521  << " that HMC may have trouble exploring the target"
522  << " distribution. You should consider any"
523  << " reparameterizations if possible."
524  << std::endl << std::endl;
525  }
526 
527  // unfortunately the remaining diagnostics really require a stan::mcmc::chains<> object.
528  // this presents an annoying issue: the TTree storage is necessary for use with RooNDKeysPDF,
529  // and is much more convenient for persistence.
530  // even more unfortunately, these are the most important diagnostics.
531  // Should we build a stan::mcmc::chains from it?
532  // todo: decide how to implement remaining diagnostics
533 /*
534  else if (chains.param_name(i).find("__") == std::string::npos) {
535 
536  double n_eff = chains.effective_sample_size(i);
537  if (n_eff / num_samples < 0.001)
538  bad_n_eff_names.push_back(chains.param_name(i));
539 
540  double split_rhat = chains.split_potential_scale_reduction(i);
541  if (split_rhat > 1.1)
542  bad_rhat_names.push_back(chains.param_name(i));
543  }
544  }
545 
546  if (bad_n_eff_names.size() > 0) {
547  std::cout << "The following parameters had fewer than 0.001 effective"
548  << " samples per transition:" << std::endl;
549  std::cout << " ";
550  for (size_t n = 0; n < bad_n_eff_names.size() - 1; ++n)
551  std::cout << bad_n_eff_names.at(n) << ", ";
552  std::cout << bad_n_eff_names.back() << std::endl;
553 
554  std::cout << "Such low values indicate that the effective sample size"
555  << " estimators may be biased high and actual performance"
556  << " may be substantially lower than quoted."
557  << std::endl << std::endl;
558  }
559 
560  if (bad_rhat_names.size() > 0) {
561  std::cout << "The following parameters had split R-hat less than 1.1:"
562  << std::endl;
563  std::cout << " ";
564  for (size_t n = 0; n < bad_rhat_names.size() - 1; ++n)
565  std::cout << bad_rhat_names.at(n) << ", ";
566  std::cout << bad_rhat_names.back() << std::endl;
567 
568  std::cout << "Such high values indicate incomplete mixing and biased"
569  << "estimation. You should consider regularization your model"
570  << " with additional prior information or looking for a more"
571  << " effective parameterization."
572  << std::endl << std::endl;
573  }
574 */
575 
576  }
T sqr(T x)
More efficient square function than pow(x,2)
Definition: MathUtil.h:23
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
Float_t d
Definition: plot.C:236
OStream cout
Definition: OStream.cxx:6
std::vector< double > fDiagnosticVals
Definition: MCMCSamples.h:264
std::size_t DiagOffset(const std::string &diagName) const
Where in fDiagnosticVals is the given diagnostic?
MCMCSample ana::MCMCSamples::Sample ( std::size_t  idx) const

The entire sample at index . If you just want one value prefer SampleLL() or SampleValue() (less overhead).

Definition at line 578 of file MCMCSamples.cxx.

References fDiagBranches, fDiagnosticVals, fEntryVals, fSamples, fSysts, fVars, and SampleLL().

Referenced by ana::BayesianMarginal::BayesianMarginal(), and NumSamples().

579  {
580  BranchStatusResetter bsr(fSamples.get()); // turn branches off when done
581 
582  fSamples->SetBranchStatus("*", true);
583  fSamples->GetEntry(idx);
585  }
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
double SampleLL(std::size_t idx) const
Get the LL for sample number .
Definition: MCMCSamples.h:162
std::vector< double > fDiagnosticVals
Definition: MCMCSamples.h:264
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
std::vector< double > fEntryVals
Definition: MCMCSamples.h:263
double ana::MCMCSamples::SampleLL ( std::size_t  idx) const
inline

Get the LL for sample number .

Definition at line 162 of file MCMCSamples.h.

References fEntryLL, and fSamples.

Referenced by ana::BayesianSurface::BuildHist(), ana::StanFitter::FitHelperSeeded(), Sample(), and ana::BayesianMarginal::ToHistogram().

163  {
164  fSamples->GetEntry(idx);
165  return fEntryLL;
166  }
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
double ana::MCMCSamples::SampleValue ( const IFitVar var,
std::size_t  idx 
) const
inline

Get the value of Var var for sample number .

Definition at line 169 of file MCMCSamples.h.

References ana::IFitVar::ShortName(), and VarOffset().

Referenced by ana::BayesianSurface::BuildHist(), ana::StanFitter::FitHelperSeeded(), MaxValue(), MinValue(), SampleValue(), ana::BayesianMarginal::ToHistogram(), and Vars().

170  {
171  return SampleValue(idx, var->ShortName(), VarOffset(var));
172  }
std::size_t VarOffset(const IFitVar *var) const
Where in fEntryVals is the given var?
double SampleValue(const IFitVar *var, std::size_t idx) const
Get the value of Var var for sample number .
Definition: MCMCSamples.h:169
double ana::MCMCSamples::SampleValue ( const ana::ISyst syst,
std::size_t  idx 
) const
inline

Get the value of Syst syst for sample number .

Definition at line 175 of file MCMCSamples.h.

References SampleValue(), SampleValues(), ana::ISyst::ShortName(), VarOffset(), and vars.

176  {
177  return SampleValue(idx, syst->ShortName(), VarOffset(syst));
178  }
std::size_t VarOffset(const IFitVar *var) const
Where in fEntryVals is the given var?
virtual const std::string & ShortName() const final
The name printed out to the screen.
Definition: ISyst.h:27
double SampleValue(const IFitVar *var, std::size_t idx) const
Get the value of Var var for sample number .
Definition: MCMCSamples.h:169
double ana::MCMCSamples::SampleValue ( std::size_t  rowIdx,
const std::string branchName,
std::size_t  varIdx 
) const
private

Internal method for getting the value of a var when you know where it is in fEntryVals.

Definition at line 588 of file MCMCSamples.cxx.

References compare_h5_caf::branch, fEntryVals, fSamples, compare_h5_caf::idx, SampleValue(), SampleValues(), systs, PandAna.Demos.tute_pid_validation::var, and vars.

589  {
590  BranchStatusResetter bsr(fSamples.get()); // turn branches off when done
591 
592  fSamples->SetBranchStatus(branchName.c_str(), true);
593  auto branch = fSamples->GetBranch(branchName.c_str());
594  branch->SetStatus(true);
595  branch->GetEntry(rowIdx);
596  return fEntryVals[varIdx];
597  }
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
std::vector< double > fEntryVals
Definition: MCMCSamples.h:263
void ana::MCMCSamples::SampleValues ( std::size_t  idx,
const std::vector< const ana::IFitVar * > &  vars,
std::map< const ana::IFitVar *, double > &  varVals 
) const

Get the values of FitVars vars for sample number idx.

Referenced by SampleValue(), and ana::BayesianMarginal::ToHistogram().

void ana::MCMCSamples::SampleValues ( std::size_t  idx,
const std::vector< const ana::IFitVar * > &  vars,
std::map< const ana::IFitVar *, double > &  varVals,
const std::vector< const ana::ISyst * > &  systs,
std::map< const ana::ISyst *, double > &  systVals 
) const

Get the values of FitVars vars and ISysts systs for sample number idx.

double ana::MCMCSamples::SamplingTime ( ) const
inline

Definition at line 190 of file MCMCSamples.h.

References fSamplingTime.

190 { return fSamplingTime; }
double fSamplingTime
how long did we spend sampling?
Definition: MCMCSamples.h:266
void ana::MCMCSamples::SaveTo ( TDirectory *  dir,
const std::string name 
) const

Save this guy to a file so we don't have to rerun the MCMC.

Definition at line 622 of file MCMCSamples.cxx.

References dir, fDiagBranches, fHyperparams, fOffset, fSamples, fSamplingTime, fSysts, fVars, ana::MCMCSamples::Hyperparameters::invMetric, PandAna.reco_validation.add_data::offset, ana::MCMCSamples::Hyperparameters::stepSize, systNames, PandAna.Demos.pi0_spectra::transform, PandAna.Demos.tute_pid_validation::var, and Write().

Referenced by mcmc::SaveToFile(), and Vars().

623  {
624  TDirectory * oldDir = gDirectory;
625  dir = dir->mkdir(name.c_str()); // switch to subdir
626  dir->cd();
627 
628  TObjString("MCMCSamples").Write("type");
629 
630  TParameter<int> offset("offset", fOffset);
631  offset.Write();
632 
633  // note that these are being sorted so that the list ordering is deterministic
634  TList diagBranchNames;
635  diagBranchNames.SetOwner();
636  std::vector<std::string> branchNames(fDiagBranches);
637  std::sort(branchNames.begin(), branchNames.end());
638  for (const auto & brName : branchNames)
639  diagBranchNames.AddLast(new TObjString(brName.c_str()));
640  diagBranchNames.Write("diagBranchNames", TObject::kSingleKey);
641 
642  branchNames.clear();
643  std::transform(fVars.begin(), fVars.end(), std::back_inserter(branchNames),
644  [](const IFitVar* var){return var->ShortName();});
645  TList fitVarNames;
646  fitVarNames.SetOwner();
647  for (const auto & brName : branchNames)
648  fitVarNames.AddLast(new TObjString(brName.c_str()));
649  fitVarNames.Write("fitVarNames", TObject::kSingleKey);
650 
651  branchNames.clear();
652  std::transform(fSysts.begin(), fSysts.end(), std::back_inserter(branchNames),
653  [](const ISyst* s){return s->ShortName();});
654  TList systNames;
655  systNames.SetOwner();
656  for (const auto & brName : branchNames)
657  systNames.AddLast(new TObjString(brName.c_str()));
658  systNames.Write("systNames", TObject::kSingleKey);
659 
660  // re-enable all the branches before writing...
661  fSamples->SetBranchStatus("*", true);
662  fSamples->Write("samples");
663  // don't let it get attached to the file that's about to be closed
664  fSamples->LoadBaskets();
665  fSamples->SetDirectory(nullptr);
666  auto hyperdir = dir->mkdir("hyperparams");
667  hyperdir->cd();
668  TParameter<double> stepSize("stepsize", fHyperparams.stepSize);
669  stepSize.Write();
671  fHyperparams.invMetric->Write("inv_metric");
672  TObjString("Hyperparameters").Write("type"); // so that hadd_cafana can be taught to skip over them
673  dir->cd();
674  hyperdir->Write();
675 
676  dir->cd();
677  auto timedir = dir->mkdir("samplingTime");
678  timedir->cd();
679  TObjString("SamplingTime").Write("type"); // so that hadd_cafana can be taught to skip over them
680  TParameter<double>("samplingTime", fSamplingTime).Write();
681  dir->cd();
682  timedir->Write();
683 
684  dir->Write();
685  delete dir;
686 
687  if (oldDir)
688  oldDir->cd();
689  }
const XML_Char * name
Definition: expat.h:151
std::unique_ptr< TMatrixD > invMetric
Definition: MCMCSamples.h:41
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::size_t fOffset
Definition: MCMCSamples.h:252
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
const XML_Char * s
Definition: expat.h:262
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
Hyperparameters fHyperparams
Hyperparameters deduced after adaptation, or manually set.
Definition: MCMCSamples.h:265
TDirectory * dir
Definition: macro.C:5
std::map< std::string, std::string > systNames
Definition: PlotUnfolding.C:32
double fSamplingTime
how long did we spend sampling?
Definition: MCMCSamples.h:266
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
gm Write()
void ana::MCMCSamples::SetHyperparams ( double  stepSize,
const TMatrixD invMassMatrix 
)
inline

Definition at line 192 of file MCMCSamples.h.

References fHyperparams, and ana::MCMCSamples::Hyperparameters::Hyperparameters().

193  {
194  fHyperparams = Hyperparameters(stepSize, std::make_unique<TMatrixD>(invMassMatrix));
195  }
Hyperparameters fHyperparams
Hyperparameters deduced after adaptation, or manually set.
Definition: MCMCSamples.h:265
void ana::MCMCSamples::SetHyperparams ( Hyperparameters  h)
inline

Definition at line 197 of file MCMCSamples.h.

References fHyperparams, gen_hdf5record::names, and SetNames().

198  {
199  fHyperparams = std::move(h);
200  };
Hyperparameters fHyperparams
Hyperparameters deduced after adaptation, or manually set.
Definition: MCMCSamples.h:265
void ana::MCMCSamples::SetNames ( const std::vector< std::string > &  names)

Definition at line 692 of file MCMCSamples.cxx.

References ana::assert(), fOffset, ParseDiagnosticBranches(), and SetupTree().

Referenced by SetHyperparams().

693  {
694  assert (fOffset == 0 && "MCMCSamples::SetNames() was called after tree was already set up!");
695 
697  SetupTree();
698  }
std::size_t fOffset
Definition: MCMCSamples.h:252
assert(nhit_max >=nhit_nbins)
void ParseDiagnosticBranches(const std::vector< std::string > &names)
Set up the storage tree based on the branch names given us by Stan.
void SetupTree()
Hook up the storage tree to our internal variables.
void ana::MCMCSamples::SetSamplingTime ( double  s)
inline

Definition at line 203 of file MCMCSamples.h.

References fSamplingTime, and ToTTree().

203 { fSamplingTime = s; }
const XML_Char * s
Definition: expat.h:262
double fSamplingTime
how long did we spend sampling?
Definition: MCMCSamples.h:266
void ana::MCMCSamples::SetupTree ( )
private

Hook up the storage tree to our internal variables.

Definition at line 701 of file MCMCSamples.cxx.

References datagram_server::address, ana::assert(), getBrightness::br, fDiagBranches, fDiagnosticVals, fEntryLL, fEntryVals, fOffset, fSamples, fSysts, fVars, LOGLIKELIHOOD_BRANCH_NAME, string, compareCafs::tree, and PandAna.Demos.tute_pid_validation::var.

Referenced by MCMCSamples(), operator=(), SetNames(), and Vars().

702  {
703  auto CreateOrSetAddress = [](TTree * tree, const std::string& brName, double* address)
704  {
705  auto name = brName.c_str();
706  TBranch * br = tree->GetBranch(name);
707  if (br)
708  br->SetAddress(address);
709  else
710  tree->Branch(name, address);
711  return br;
712  };
713 
714  CreateOrSetAddress(fSamples.get(), MCMCSamples::LOGLIKELIHOOD_BRANCH_NAME, &fEntryLL);
715 
716  fDiagnosticVals.clear();
717  fDiagnosticVals.resize(fDiagBranches.size(), std::numeric_limits<double>::signaling_NaN());
718  assert(!fVars.empty() || !fSysts.empty());
719  auto firstFitVar = fVars.empty() ? fSysts[0]->ShortName() : fVars[0]->ShortName();
720  for (std::size_t valIdx = 0; valIdx < fDiagBranches.size(); valIdx++)
721  CreateOrSetAddress(fSamples.get(), fDiagBranches[valIdx], &fDiagnosticVals[valIdx]);
722 
723  fOffset = 1 + fDiagBranches.size(); // + 1 for the LL at the beginning
724 
725  fEntryVals.clear();
726  fEntryVals.resize(fVars.size() + fSysts.size(), std::numeric_limits<double>::signaling_NaN());
727  std::size_t valIdx = 0;
728  for (const auto & var : fVars)
729  CreateOrSetAddress(fSamples.get(), var->ShortName(), &fEntryVals[valIdx++]);
730  for (const auto &syst : fSysts)
731  CreateOrSetAddress(fSamples.get(), syst->ShortName(), &fEntryVals[valIdx++]);
732 
733  // disable the 'autosave' and 'autoflush' mechanisms
734  // since there's nowhere to write to anyway
735  // (this tree is memory-only unless SaveTo()'d).
736  fSamples->SetAutoFlush(0);
737  fSamples->SetAutoSave(0);
738  }
const XML_Char * name
Definition: expat.h:151
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
std::size_t fOffset
Definition: MCMCSamples.h:252
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
static const std::string LOGLIKELIHOOD_BRANCH_NAME
Definition: MCMCSamples.h:37
std::vector< double > fDiagnosticVals
Definition: MCMCSamples.h:264
assert(nhit_max >=nhit_nbins)
std::vector< std::string > fDiagBranches
Definition: MCMCSamples.h:253
std::vector< double > fEntryVals
Definition: MCMCSamples.h:263
enum BeamMode string
std::vector< std::pair< std::size_t, double > > ana::MCMCSamples::SortedLLs ( ) const
private

Construct a sorted vector of the LLs of all the samples (ordered lowest to highest)

Returns
The ordered vector

Definition at line 741 of file MCMCSamples.cxx.

References a, b, compare_h5_caf::branch, fEntryLL, fSamples, and make_pair().

Referenced by QuantileLL(), and Vars().

742  {
743  std::vector<std::pair<std::size_t, double>> LLs;
744  LLs.reserve(fSamples->GetEntries());
745  auto branch = fSamples->GetBranch(MCMCSamples::LOGLIKELIHOOD_BRANCH_NAME.c_str());
746  fSamples->SetBranchStatus("*", false);
747  fSamples->SetBranchStatus(MCMCSamples::LOGLIKELIHOOD_BRANCH_NAME.c_str(), true);
748  for (int idx = 0; idx < fSamples->GetEntries(); idx++)
749  {
750  branch->GetEntry(idx);
751  LLs.push_back(std::make_pair(idx, fEntryLL));
752  }
753 
754  std::sort(LLs.begin(), LLs.end(),
755  [](const std::pair<std::size_t, double> & a, const std::pair<std::size_t, double> & b)
756  {
757  return a.second < b.second;
758  }
759  );
760 
761  return std::move(LLs);
762  }
std::pair< Spectrum *, CheatDecomp * > make_pair(SpectrumLoaderBase &loader_data, SpectrumLoaderBase &loader_mc, HistAxis *axis, Cut *cut, const SystShifts &shift, const Var &wei)
Definition: DataMCLoad.C:336
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
const double a
static const std::string LOGLIKELIHOOD_BRANCH_NAME
Definition: MCMCSamples.h:37
const hit & b
Definition: hits.cxx:21
const std::vector<const ana::ISyst *>& ana::MCMCSamples::Systs ( ) const
inline

Which Systs are sampled in these samples?

Definition at line 209 of file MCMCSamples.h.

References fSysts.

Referenced by ana::BayesianMarginal::BayesianMarginal(), ana::StanFitter::FitHelperSeeded(), and ana::StanFitter::ReuseWarmup().

209 { return fSysts; }
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
const TTree * ana::MCMCSamples::ToTTree ( ) const

Get a TTree with the MCMC samples in it.

Definition at line 765 of file MCMCSamples.cxx.

References fSamples.

Referenced by ana::BayesianMarginal::BayesianMarginal(), and SetSamplingTime().

766  {
767  // presumably user doesn't want the tree with some branches missing...
768  fSamples->SetBranchStatus("*", true);
769  return fSamples.get();
770  }
std::unique_ptr< TTree > fSamples
Definition: MCMCSamples.h:261
std::size_t ana::MCMCSamples::VarOffset ( const IFitVar var) const
private

Where in fEntryVals is the given var?

Definition at line 773 of file MCMCSamples.cxx.

References ana::assert(), distance(), fVars, ana::IFitVar::ShortName(), and PandAna.Demos.tute_pid_validation::var.

Referenced by MaxValue(), MinValue(), SampleValue(), and Vars().

774  {
775  auto itr = std::find(fVars.begin(), fVars.end(), var);
776  assert(itr != fVars.end() && ("Var was not fitted: " + var->ShortName()).c_str());
777  return std::distance(fVars.begin(), itr);
778  }
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
unsigned distance(const T &t1, const T &t2)
assert(nhit_max >=nhit_nbins)
std::size_t ana::MCMCSamples::VarOffset ( const ana::ISyst syst) const
private

Where in fEntryVals is the given syst?

Definition at line 781 of file MCMCSamples.cxx.

References ana::assert(), distance(), fSysts, fVars, and ana::ISyst::ShortName().

782  {
783  auto itr = std::find(fSysts.begin(), fSysts.end(), syst);
784  assert(itr != fSysts.end() && ("Syst was not fitted: " + syst->ShortName()).c_str());
785  return fVars.size() + std::distance(fSysts.begin(), itr);
786  }
virtual const std::string & ShortName() const final
The name printed out to the screen.
Definition: ISyst.h:27
std::vector< const ana::IFitVar * > fVars
Definition: MCMCSamples.h:254
unsigned distance(const T &t1, const T &t2)
std::vector< const ana::ISyst * > fSysts
Definition: MCMCSamples.h:255
assert(nhit_max >=nhit_nbins)
const std::vector<const IFitVar *>& ana::MCMCSamples::Vars ( ) const
inline

Member Data Documentation

bool ana::MCMCSamples::fBestFitFound
mutableprivate

Definition at line 258 of file MCMCSamples.h.

Referenced by AdoptSamples(), BestFitSampleIdx(), and operator=().

std::size_t ana::MCMCSamples::fBestFitSampleIdx
mutableprivate

Definition at line 257 of file MCMCSamples.h.

Referenced by BestFitSampleIdx(), and operator=().

std::vector<std::string> ana::MCMCSamples::fDiagBranches
private
std::vector<double> ana::MCMCSamples::fDiagnosticVals
private
double ana::MCMCSamples::fEntryLL
private

Definition at line 262 of file MCMCSamples.h.

Referenced by AddSample(), BestFitSampleIdx(), operator=(), SampleLL(), SetupTree(), and SortedLLs().

std::vector<double> ana::MCMCSamples::fEntryVals
private

Definition at line 263 of file MCMCSamples.h.

Referenced by AddSample(), operator=(), Sample(), SampleValue(), and SetupTree().

Hyperparameters ana::MCMCSamples::fHyperparams
mutableprivate

Hyperparameters deduced after adaptation, or manually set.

Definition at line 265 of file MCMCSamples.h.

Referenced by Hyperparams(), operator=(), SaveTo(), and SetHyperparams().

std::size_t ana::MCMCSamples::fOffset
private

Definition at line 252 of file MCMCSamples.h.

Referenced by AddSample(), operator=(), SaveTo(), SetNames(), and SetupTree().

std::unique_ptr<TTree> ana::MCMCSamples::fSamples
private
double ana::MCMCSamples::fSamplingTime
private

how long did we spend sampling?

Definition at line 266 of file MCMCSamples.h.

Referenced by operator=(), SamplingTime(), SaveTo(), and SetSamplingTime().

std::vector<const ana::ISyst *> ana::MCMCSamples::fSysts
private
std::vector<const ana::IFitVar *> ana::MCMCSamples::fVars
private
const std::string ana::MCMCSamples::LOGLIKELIHOOD_BRANCH_NAME = "logprob"
static

Name of branch in internal TTree corresponding to the log-likelihood. Used in several places, so centralized here.

Definition at line 37 of file MCMCSamples.h.

Referenced by ana::BayesianMarginal::SetupKNN(), and SetupTree().


The documentation for this class was generated from the following files: