PandAna.core.core.loader Class Reference
Inherited by PandAna.core.core.associate.

Public Member Functions

def __init__ (self, filesource, stride=1, offset=0, limit=None, spillcuts=None, index=None)
 
def getSource (self)
 
def sum_POT (self)
 
def add_spectrum (self, spec)
 
def reset_index (self)
 
def __setitem__ (self, key, df)
 
def __getitem__ (self, key)
 
def setupGo (self)
 
def getFile (self)
 
def setFile (self, f)
 
def closeFile (self)
 
def readData (self)
 
def fillSpectra (self)
 
def Go (self)
 
def cleanup (self)
 

Public Attributes

 gone
 
 interactive
 
 histdefs
 
 index
 
 dflist
 
 summing
 
 openfile
 
 concat_time
 

Private Attributes

 _files
 
 _tables
 
 _spillcuts
 
 _POT
 
 _potspecnocut
 
 _potspeccut
 
 _POTBase
 
 _filegen
 

Detailed Description

Definition at line 378 of file core.py.

Constructor & Destructor Documentation

def PandAna.core.core.loader.__init__ (   self,
  filesource,
  stride = 1,
  offset = 0,
  limit = None,
  spillcuts = None,
  index = None 
)

Definition at line 379 of file core.py.

379  def __init__(self, filesource, stride = 1, offset = 0, limit = None, spillcuts=None, index=None):
380
381      self._files = sourcewrapper(filesource, stride, offset, limit)
382
383      # _tables stores the entire dataset read from file
384      # index key holds the global index range to be accessed from the dataset by a cut/var
385      self._tables = {'indices':0}
386      self.gone = False
387      self.interactive = False
388      self.histdefs = []
389      self.index=index
390      self.dflist = {}
391      self._spillcuts = spillcuts
392
393      # add extra spectra to keep track of exposure
394      self._POT = 0
395      self.sum_POT()
396 
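The constructor forwards stride, offset and limit to sourcewrapper, whose code is not shown on this page. As a rough sketch only, one plausible reading is that the source keeps every stride-th file starting at offset and stops after limit files, which is a common way to split one file list across several jobs. The helper name select_files and the order in which the three parameters are applied are assumptions, not PandAna API.

```python
from itertools import islice


def select_files(files, stride=1, offset=0, limit=None):
    """Hypothetical helper (not PandAna code): keep every `stride`-th file
    starting at position `offset`, and return at most `limit` of them."""
    picked = islice(files, offset, None, stride)
    # islice(..., None) places no upper bound on the number of files
    return list(islice(picked, limit))


files = ["f%d.h5" % i for i in range(10)]
job1 = select_files(files, stride=3, offset=1)
job1_short = select_files(files, stride=3, offset=1, limit=2)
```

Under this reading, N jobs that use stride=N and offset=0..N-1 would cover the file list exactly once between them.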

Member Function Documentation

def PandAna.core.core.loader.__getitem__ (   self,
  key 
)

Definition at line 447 of file core.py.

References PandAna.core.core.spectrum._tables, PandAna.core.core.loader._tables, PandAna.core.core.loader.index, and PandAna.core.core.loader.summing.

447  def __getitem__(self, key):
448      if not self.summing:
449          if type(key)==str and key.startswith('spill'):
450              key = 'rec.'+key
451      # actually build the cache before Go()
452      if type(key) == str and not key in self._tables:
453          # Pick up the right index
454          index = KL if key.startswith('rec') else KLN if key.startswith('neutrino') else KLS
455          if self.index and key.startswith('rec'):
456              index = self.index
457          self[key] = dfproxy(columns=index)
458      # assume key is a filtered index range after a cut
459      if type(key) is not str:
460          self._tables['indices'] = key
461          return self
462      # no filtering (0 is the "no selection" sentinel; avoid identity comparison with a literal)
463      if isinstance(self._tables['indices'], int) and self._tables['indices'] == 0:
464          return self._tables[key]
465      # use global index to slice dataframe requested
466      elif self._tables[key].dropna().empty:
467          # sometimes there's no data available in the file, allow it but warn
468          print("Warning! No data read for %s" % key)
469          return self._tables[key]
470      else:
471          if self._tables[key].index.intersection(self._tables['indices']).empty:
472              return dfproxy(columns=self._tables[key].index.names)
473          else:
474              dfslice = self._tables[key].loc[self._tables['indices']]
475              return dfslice
476 
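The listing above gives __getitem__ two modes: a string key returns a table, while a non-string key is stored as the active index selection and the loader returns itself so that a second lookup can be chained. The toy class below sketches only that control flow using plain dictionaries. MiniLoader, its None sentinel (standing in for the 0 sentinel) and the table contents are illustrative assumptions, not PandAna code.

```python
class MiniLoader:
    """Toy stand-in for the loader's two-mode lookup (not PandAna code)."""

    def __init__(self, tables):
        self._tables = dict(tables)   # table name -> {row index: value}
        self._indices = None          # None plays the role of the 0 sentinel

    def reset_index(self):
        # drop the active selection, as loader.reset_index() does
        self._indices = None

    def __getitem__(self, key):
        if not isinstance(key, str):
            # a collection of row indices: remember it and allow chaining
            self._indices = set(key)
            return self
        table = self._tables[key]
        if self._indices is None:
            return table              # no filtering requested
        # keep only the rows named by the active selection
        return {i: v for i, v in table.items() if i in self._indices}


loader = MiniLoader({"rec.energy": {0: 1.5, 1: 2.5, 2: 0.7}})
everything = loader["rec.energy"]
selected = loader[[0, 2]]["rec.energy"]
loader.reset_index()
after_reset = loader["rec.energy"]
```

Because the selection is stored on the loader itself, it stays active until reset_index() is called, which is why the real class resets it after each spectrum fill.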
def PandAna.core.core.loader.__setitem__ (   self,
  key,
  df 
)

Definition at line 439 of file core.py.

References PandAna.core.core.spectrum._tables, PandAna.core.core.loader._tables, and PandAna.core.core.loader.index.

439  def __setitem__(self, key, df):
440      # set multiindex for recTree data
441      index = KL if key.startswith('rec') else KLN if key.startswith('neutrino') else KLS
442      if self.index and key.startswith('rec'):
443          index = self.index
444      df.set_index(index, inplace=True)
445      self._tables[key] = df
446 
def PandAna.core.core.loader.add_spectrum (   self,
  spec 
)

Definition at line 428 of file core.py.

References PandAna.core.core.loader._spillcuts, PandAna.core.core.loader.histdefs, and PandAna.core.core.loader.summing.

428  def add_spectrum(self, spec):
429      if not spec in self.histdefs:
430          if self._spillcuts is not None and not self.summing:
431              spec._cutfcn = spec._cutfcn & self._spillcuts
432
433          self.histdefs.append(spec)
434 
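add_spectrum combines the spectrum's own cut with the loader's spill cuts through the & operator. The definition of the cut type is not part of this page, so the following is only a minimal sketch of how such an operator can be provided. The Cut class, its row-dictionary interface and the field names are hypothetical.

```python
class Cut:
    """Hypothetical cut wrapper (not the PandAna class): a predicate on a
    row that can be combined with another cut using `&`."""

    def __init__(self, fn):
        self._fn = fn

    def __call__(self, row):
        return bool(self._fn(row))

    def __and__(self, other):
        # the combined cut passes only when both operands pass
        return Cut(lambda row: self(row) and other(row))


analysis_cut = Cut(lambda row: row["energy"] > 1.0)
spill_cut = Cut(lambda row: row["spillpot"] > 0)
combined = analysis_cut & spill_cut
```

The `if not spec in self.histdefs` guard in the listing matters here: without it, registering the same spectrum twice would fold the spill cuts into its cut a second time.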
def PandAna.core.core.loader.cleanup (   self)

Definition at line 569 of file core.py.

References PandAna.core.core.spectrum._tables, PandAna.core.core.loader._tables, and PandAna.core.core.loader.histdefs.

Referenced by PandAna.core.core.loader.Go().

569  def cleanup(self):
570      # free up some memory
571      self._tables = {'indices':0}
572      # remove associations with spectra
573      self.histdefs = []
574 
575 # Different loaders end up starting their own SAM projects, even for the exact same queries.
576 # This doesn't guarantee that they'll run over the files in the same order.
577 # Coupled with the fact that the projects can be shared over different grid jobs,
578 # this can result in unexpected behaviour if the macro expects them to share the same data downstream.
579 # This class allows the user to use a single project over multiple loaders
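The trailing comment in the listing above (core.py lines 575 to 579) introduces the class that follows cleanup() in the source file: loaders that each iterate their own file stream may see files in different orders, so one shared stream is used for several loaders. The code of that class is not shown on this page. The sketch below illustrates only the general idea, with hypothetical names ToyLoader and run_together.

```python
class ToyLoader:
    """Hypothetical stand-in for a loader; it only records the files it reads."""

    def __init__(self, name):
        self.name = name
        self.seen = []

    def read(self, fname):
        self.seen.append(fname)


def run_together(loaders, files):
    """Walk the file stream once and hand each file to every loader, so all
    loaders see the same files in the same order."""
    for fname in files:
        for ld in loaders:
            ld.read(fname)


nominal = ToyLoader("nominal")
shifted = ToyLoader("shifted")
run_together([nominal, shifted], iter(["a.h5", "b.h5", "c.h5"]))
```

Passing an iterator rather than a list makes the point explicit: the stream can be consumed only once, yet both loaders end up with identical inputs.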
def PandAna.core.core.loader.closeFile (   self)

Definition at line 491 of file core.py.

Referenced by PandAna.core.core.loader.Go(), and PandAna.core.core.associate.Go().

491  def closeFile(self):
492      self.openfile.close()
493 
def PandAna.core.core.loader.fillSpectra (   self)

Definition at line 522 of file core.py.

Referenced by PandAna.core.core.loader.Go().

522  def fillSpectra(self):
523      self.concat_time = time.time()
524      for key in self.dflist:
525          # set index for all dataframes
526          self[key] = pd.concat(self.dflist[key])
527          # sort index
528          self._tables[key].sort_index(inplace=True)
529      self.dflist = {}
530
531      # Compute POT and then fill spectra
532      self.sum_POT()
533
534      # Let's not refill these
535      self.histdefs.remove(self._potspecnocut)
536      if self._potspeccut:
537          self.histdefs.remove(self._potspeccut)
538
539      spec_idx = 0
540      # spec_progbar = ProgressBar(len(self.histdefs))
541      print("Filling %s spectra\n" % len(self.histdefs))
542      for spec in self.histdefs:
543          spec_idx += 1
544          # spec_progbar.update(spec_idx)
545          spec.fill()
546 
def PandAna.core.core.loader.getFile (   self)

Definition at line 485 of file core.py.

References PandAna.core.core.loader._filegen.

Referenced by ProjMan.Consumer.__iter__(), PandAna.core.core.loader.Go(), and PandAna.core.core.associate.Go().

485  def getFile(self):
486      return self._filegen()
487 
def PandAna.core.core.loader.getSource (   self)

Definition at line 397 of file core.py.

References PandAna.core.core.loader._files.

397  def getSource(self):
398      return self._files
399 
def PandAna.core.core.loader.Go (   self)

Definition at line 547 of file core.py.

References PandAna.core.core.loader.cleanup(), PandAna.core.core.loader.closeFile(), PandAna.core.core.loader.fillSpectra(), PandAna.core.core.loader.getFile(), PandAna.core.core.loader.readData(), PandAna.core.core.loader.setFile(), and PandAna.core.core.loader.setupGo().

547  def Go(self):
548      t0 = time.time()
549      self.setupGo()
550      file_idx = 0
551      # file_progbar = ProgressBar(self._filegen.nFiles())
552      while True:
553          try:
554              fname = self.getFile()
555              self.setFile(h5py.File(fname, 'r'))
556              self.readData()
557              self.closeFile()
558
559              file_idx += 1
560              # file_progbar.update(file_idx)
561          except StopIteration:
562              break
563
564      self.fillSpectra()
565      # cleanup
566      self.cleanup()
567      print("\nTotal time : %s sec\n" % (time.time() - t0))
568 
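Go() relies on getFile() raising StopIteration once the file source is exhausted. The sketch below reproduces that loop shape with a toy callable source. FileGen and drain are hypothetical names, and nFiles() merely mirrors the method called in setupGo(). One deliberate difference from the listing is that the try block wraps only the call that is expected to raise, so a StopIteration escaping from the per-file work would not silently end the loop.

```python
class FileGen:
    """Hypothetical file source: each call returns the next file name and
    raises StopIteration when none are left."""

    def __init__(self, names):
        self._names = list(names)
        self._it = iter(self._names)

    def nFiles(self):
        return len(self._names)

    def __call__(self):
        return next(self._it)


def drain(filegen, handle):
    """Call `handle` on every file name and return how many were processed."""
    processed = 0
    while True:
        try:
            fname = filegen()   # the only call expected to raise StopIteration
        except StopIteration:
            break
        handle(fname)
        processed += 1
    return processed


seen = []
count = drain(FileGen(["a.h5", "b.h5", "c.h5"]), seen.append)
```

In the real loop the per-file work opens an HDF5 file, so wrapping it in try/finally (or a with block) would also ensure the file is closed if reading fails.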
def PandAna.core.core.loader.readData (   self)

Definition at line 494 of file core.py.

References PandAna.core.core.spectrum._tables, PandAna.core.core.loader._tables, and PandAna.core.core.loader.dflist.

Referenced by PandAna.core.core.loader.Go().

494  def readData(self):
495      for key in self._tables:
496          if key == 'indices':
497              continue
498          if not key in self.dflist:
499              self.dflist[key] = []
500          # branches from cache
501          if not key in self.openfile.keys():
502              print("Group %s doesn't exist!" % key)
503              sys.exit(2)
504          group = self.openfile.get(key)
505          values = {}
506          # leaves from cache
507          keycache = self._tables[key]._proxycols
508          for k in keycache:
509              try:
510                  dataset = group.get(k)[()]
511              except TypeError:
512                  # better error message
513                  print("Dataset %s for group %s doesn't exist!" % (k, group))
514                  sys.exit(2)
515              if dataset.shape[1] == 1:
516                  dataset = dataset.flatten()
517              else:
518                  dataset = list(dataset)
519              values[k] = dataset
520          self.dflist[key].append(pd.DataFrame(values))
521 
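Lines 515 to 518 of the listing decide how a dataset becomes a DataFrame column: a dataset with a single value per row is flattened to a 1-D column, and anything wider is kept as one list-like entry per row. The sketch below shows the same rule on nested Python lists so that it runs without NumPy or h5py. The function to_column and the sample data are illustrative only.

```python
def to_column(rows):
    """Illustrative stand-in for the shape handling in readData(): `rows` is
    a 2-D nested list with one inner list per table row."""
    if all(len(r) == 1 for r in rows):
        # one value per row: flatten to a plain 1-D column
        return [r[0] for r in rows]
    # several values per row: keep each row as a single list-valued entry
    return [list(r) for r in rows]


scalar_col = to_column([[1.5], [2.5], [0.7]])
vector_col = to_column([[1, 2], [3, 4]])
```

Note that the listing indexes dataset.shape[1] directly, so it assumes every dataset is stored as a 2-D array; a 1-D dataset would raise an IndexError at that line.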
def PandAna.core.core.loader.reset_index (   self)

Definition at line 435 of file core.py.

References PandAna.core.core.spectrum._tables, and PandAna.core.core.loader._tables.

435  def reset_index(self):
436      # reset after each spectrum fill
437      self._tables['indices'] = 0
438 
def PandAna.core.core.loader.setFile (   self,
  f 
)

Definition at line 488 of file core.py.

Referenced by PandAna.core.core.loader.Go(), and PandAna.core.core.associate.Go().

488  def setFile(self, f):
489      self.openfile = f
490 
def PandAna.core.core.loader.setupGo (   self)

Definition at line 477 of file core.py.

References PandAna.core.core.loader.gone.

Referenced by PandAna.core.core.loader.Go(), and PandAna.core.core.associate.Go().

477  def setupGo(self):
478      if self.gone:
479          return
480      self.gone = True
481      self._filegen = self._files()
482
483      print("Reading data from %s files : \n" % self._filegen.nFiles())
484 
def PandAna.core.core.loader.sum_POT (   self)

Definition at line 400 of file core.py.

400  def sum_POT(self):
401      self.summing = True
402
403      # If not gone, construct spectra
404      if not self.gone:
405          self._potspecnocut = spectrum(self, lambda tables: tables['spill']['spillpot']>0, \
406                                        lambda tables: tables['spill']['spillpot'])
407
408          if self._spillcuts:
409              self._potspeccut = spectrum(self, self._spillcuts, \
410                                          lambda tables: tables['spill']['spillpot'])
411          else:
412              self._potspeccut = None
413
414      # If gone, fill spectra and compute POT
415      else:
416          self._potspecnocut.fill()
417          self._POTBase = self._potspecnocut.df().sum()
418          if self._potspeccut:
419              self._potspeccut.fill()
420              self._POT = self._potspeccut.df().sum()
421          else:
422              # Use base pot if not using spill cuts
423              self._POT = self._POTBase
424          frac = 100*self._POT/self._POTBase
425          print('Found {:0.5E} POT passing spillcuts from {:0.5E} POT ({:0.1f}%).'.format(self._POT, self._POTBase, frac))
426      self.summing = False
427 
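The second branch of sum_POT() reports the exposure that passes the spill cuts as a percentage of the base exposure. The short sketch below repeats that arithmetic and the same format string on plain floats. The function name pot_summary and the sample numbers are made up for illustration, and the guard for a zero base exposure is an addition that the listing does not have.

```python
def pot_summary(pot, pot_base):
    """Format the POT report used in sum_POT(); returns the message string."""
    # guard (not in the original): avoid dividing by zero when no POT was read
    frac = 100.0 * pot / pot_base if pot_base else float("nan")
    return ('Found {:0.5E} POT passing spillcuts from {:0.5E} POT ({:0.1f}%).'
            .format(pot, pot_base, frac))


message = pot_summary(9.0e19, 1.2e20)
empty_message = pot_summary(0.0, 0.0)
```

When no spill cuts are configured the listing sets _POT equal to _POTBase, so the reported fraction is 100% by construction.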

Member Data Documentation

PandAna.core.core.loader._filegen
private

Definition at line 481 of file core.py.

Referenced by PandAna.core.core.loader.getFile().

PandAna.core.core.loader._files
private

PandAna.core.core.loader._POT
private

Definition at line 394 of file core.py.

PandAna.core.core.loader._POTBase
private

Definition at line 417 of file core.py.

PandAna.core.core.loader._potspeccut
private

Definition at line 409 of file core.py.

PandAna.core.core.loader._potspecnocut
private

Definition at line 405 of file core.py.

PandAna.core.core.loader._spillcuts
private

Definition at line 391 of file core.py.

Referenced by PandAna.core.core.loader.add_spectrum().

PandAna.core.core.loader._tables
private

PandAna.core.core.loader.concat_time

Definition at line 523 of file core.py.

PandAna.core.core.loader.dflist

Definition at line 390 of file core.py.

Referenced by PandAna.core.core.loader.readData().

PandAna.core.core.loader.gone

Definition at line 386 of file core.py.

Referenced by PandAna.core.core.loader.setupGo().

PandAna.core.core.loader.histdefs

PandAna.core.core.loader.index

PandAna.core.core.loader.interactive

Definition at line 387 of file core.py.

PandAna.core.core.loader.openfile

Definition at line 489 of file core.py.

Referenced by PandAna.core.core.associate.Go().

PandAna.core.core.loader.summing

The documentation for this class was generated from the following file: