Supervised learning is an accurate method for network-based gene classification

Published in Bioinformatics, 2020

Assigning every human gene to specific functions, diseases, and traits is a grand challenge in modern genetics. Key to addressing this challenge are computational methods, such as supervised learning and label propagation, that can leverage molecular interaction networks to predict gene attributes. Despite being a popular machine learning technique across fields, supervised learning has been applied in only a few network-based studies for predicting pathway-, phenotype-, or disease-associated genes. It is unknown how supervised learning performs broadly across different networks and diverse gene classification tasks, and how it compares to label propagation, the widely benchmarked canonical approach for this problem.

Recommended citation: Liu, Renming, Christopher A. Mancuso, Anna Yannakopoulos, Kayla A. Johnson, and Arjun Krishnan. “Supervised Learning Is an Accurate Method for Network-Based Gene Classification.” Bioinformatics 36, no. 11 (June 1, 2020): 3457–65.

Opportunistic Computing with Lobster: Lessons Learned from Scaling up to 25k Non-Dedicated Cores

Published in Journal of Physics: Conference Series, 2017

We previously described Lobster, a workflow management tool for exploiting volatile opportunistic computing resources for computation in HEP. Here we discuss the challenges encountered while scaling up simultaneous CPU core utilization and the software improvements required to overcome them.

Categories: Workflows can now be divided into categories based on their required system resources. This allows the batch queueing system to optimize the assignment of tasks to nodes with the appropriate capabilities. Within each category, limits can be specified for the number of running jobs to regulate the utilization of communication bandwidth. System resource specifications for a task category can now be modified while a project is running, avoiding the need to restart the project if resource requirements differ from the initial estimates. Lobster now implements time limits on each task category to voluntarily terminate tasks, which allows partially completed work to be recovered.

Workflow dependency specification: One workflow often requires data from other workflows as input. Rather than waiting for earlier workflows to be completed before beginning later ones, Lobster now allows dependent tasks to begin as soon as sufficient input data has accumulated.

Resource monitoring: Lobster utilizes a new capability in Work Queue to monitor the system resources each task requires in order to identify bottlenecks and optimally assign tasks.

The capability of the Lobster opportunistic workflow management system for HEP computation has been significantly increased. We have demonstrated efficient utilization of 25 000 non-dedicated cores and achieved a data input rate of 30 Gb/s and an output rate of 500 GB/h. This has required new capabilities in task categorization, workflow dependency specification, and resource monitoring.

Recommended citation: Wolf, Matthias, Anna Woodard, Wenzhao Li, Kenyi Hurtado Anampa, Anna Yannakopoulos, Benjamin Tovar, Patrick Donnelly, et al. “Opportunistic Computing with Lobster: Lessons Learned from Scaling up to 25k Non-Dedicated Cores.” Journal of Physics: Conference Series 898 (October 2017): 052036.

High magnetic field calibration using de Haas-van Alphen oscillations in polycrystalline copper

Published in APS Meeting Abstracts, 2016

We provide a calibration for the de Haas-van Alphen (dHvA) frequency in polycrystalline copper, which may be used to standardize the measurement of magnetic fields, particularly in pulsed field environments, where direct observation of NMR is challenging. Using a reliable single-crystal model of the Fermi surface from coefficients that are traceable to a powder Al NMR reference, we computed Fermi surface extremal areas for evenly spaced directions around a sphere. Summing the peaks corresponding to extremal orbits according to the Lifshitz-Kosevich model, we arrive at a dHvA spectrum that corresponds to experimental observation. We find that actual maximum fields reached at the NHMFL-Pulsed Field Facility are slightly larger than previously determined.

Recommended citation: Coniglio, William A., Alan F. Williams, Anna Yannakopoulos, Audrey Grockowiak, and Stan Tozer. “High Magnetic Field Calibration Using de Haas-van Alphen Oscillations in Polycrystalline Copper,” 2016, V46.007.