Approximate and Exact Optimization Algorithms for the Beltway and Turnpike Problems with Duplicated, Missing, Partially Labeled, and Uncertain Measurements

Keywords: Computational Biology, Humans, Research Articles, Algorithms

C. S. Elder; Minh Hoang; Mohsen Ferdosi; Carl Kingsford

PubMed Central

Other literature type . 2024

License: CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This Open Access article is distributed under the terms of the Creative Commons License [CC-BY], which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Data sources: PubMed Central

Journal of Computational Biology

Article . 2024 . Peer-reviewed

License: Mary Ann Liebert TDM

Data sources: Crossref

Journal of Computational Biology

Article . 2024

Data sources: Europe PubMed Central

DBLP

Article

Data sources: DBLP

Publication: Article, Other literature type. 01 Oct 2024. English. Publisher: SAGE Publications. Journal: Journal of Computational Biology, volume 31, pages 908-926 (eissn: 1557-8666)

doi: 10.1089/cmb.2024.0661

pmid: 39387260

pmc: PMC11698667

Abstract

The Turnpike problem aims to reconstruct a set of one-dimensional points from their unordered pairwise distances. Turnpike arises in biological applications such as molecular structure determination, genomic sequencing, tandem mass spectrometry, and molecular error-correcting codes. Under noisy observation of the distances, the Turnpike problem is NP-hard and can take exponential time and space to solve when using traditional algorithms. To address this, we reframe the noisy Turnpike problem through the lens of optimization, seeking to simultaneously find the unknown point set and a permutation that maximizes similarity to the input distances. Our core contribution is a suite of algorithms that robustly solve this new objective. This includes a bilevel optimization framework that can efficiently solve Turnpike instances with up to 100,000 points. We show that this framework can be extended to scenarios with domain-specific constraints that include duplicated, missing, and partially labeled distances. Using these, we also extend our algorithms to work for points distributed on a circle (the Beltway problem). For small-scale applications that require global optimality, we formulate an integer linear program (ILP) that (i) accepts an objective from a generic family of convex functions and (ii) uses an extended formulation to reduce the number of binary variables. On synthetic and real partial digest data, our bilevel algorithms achieved state-of-the-art scalability across challenging scenarios with performance that matches or exceeds competing baselines. On small-scale instances, our ILP efficiently recovered ground-truth assignments and produced reconstructions that match or exceed our alternating algorithms. Our implementations are available at https://github.com/Kingsford-Group/turnpikesolvermm.
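To make the problem statement concrete, the following is a minimal sketch of the noiseless Turnpike setting only. It is not the authors' bilevel or ILP method. It shows (a) how a point set induces an unordered multiset of pairwise distances, (b) a squared-error objective in which induced and observed distances are paired in sorted order (for a fixed point set, sorted pairing is an optimal permutation under squared error, by the rearrangement inequality), and (c) the classic backtracking reconstruction, which can take exponential time in the worst case. All function names are hypothetical, and the solver assumes exact integer distances with the first point fixed at 0.

```python
from collections import Counter
from itertools import combinations


def pairwise_distances(points):
    """Return the sorted multiset of pairwise distances of 1-D points."""
    return sorted(abs(a - b) for a, b in combinations(points, 2))


def sorted_matching_loss(points, observed):
    """Squared error between induced and observed distances, pairing the
    two lists in sorted order (an assumed, illustrative objective)."""
    induced = pairwise_distances(points)
    if len(induced) != len(observed):
        raise ValueError("expected n*(n-1)/2 observed distances")
    return sum((d - o) ** 2 for d, o in zip(induced, sorted(observed)))


def solve_turnpike_exact(distances):
    """Backtracking reconstruction for noiseless integer distances.

    Returns one sorted point list starting at 0, or None if no point
    set is consistent with the input multiset.
    """
    remaining = Counter(distances)
    if not remaining:
        return [0]
    width = max(remaining)  # the largest distance spans both endpoints
    remaining[width] -= 1
    points = [0, width]

    def try_place(candidate):
        """Consume candidate's distances to all placed points, or fail."""
        needed = Counter(abs(candidate - p) for p in points)
        if any(remaining[d] < c for d, c in needed.items()):
            return None
        remaining.subtract(needed)
        return needed

    def backtrack():
        if sum(remaining.values()) == 0:
            return sorted(points)
        largest = max(d for d, c in remaining.items() if c > 0)
        # The largest unexplained distance is measured from an endpoint,
        # so the new point sits at `largest` or at `width - largest`.
        for candidate in (largest, width - largest):
            used = try_place(candidate)
            if used is None:
                continue
            points.append(candidate)
            result = backtrack()
            if result is not None:
                return result
            points.pop()
            remaining.update(used)  # undo the placement
        return None

    return backtrack()


if __name__ == "__main__":
    truth = [0, 2, 4, 7, 10]
    observed = pairwise_distances(truth)
    recovered = solve_turnpike_exact(observed)
    print(recovered, sorted_matching_loss(recovered, observed))
```

The recovered set need not equal the original list: distances are unchanged by translation and reflection, and distinct point sets can share the same distance multiset, so a sensible check is that the recovered points reproduce the input distances rather than the input coordinates.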

Related Organizations

College of New Jersey, United States
Carnegie Mellon University, United States

Impact by BIP!

- Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
