Publication:
Metric labeling and semimetric embedding for protein annotation prediction

dc.contributor.author: Sefer, Emre
dc.contributor.author: Kingsford, C.
dc.contributor.department: Computer Science
dc.contributor.ozuauthor: SEFER, Emre
dc.date.accessioned: 2021-02-11T20:27:51Z
dc.date.available: 2021-02-11T20:27:51Z
dc.date.issued: 2021-05-01
dc.description.abstract: Computational techniques have been successful at predicting protein function from relational data (functional or physical interactions). These techniques have been used to generate hypotheses and to direct experimental validation. With few exceptions, the task is modeled as multilabel classification problems where the labels (functions) are treated independently or semi-independently. However, databases such as the Gene Ontology provide information about the similarities between functions. We explore the use of the Metric Labeling combinatorial optimization problem to make use of heuristically computed distances between functions to make more accurate predictions of protein function in networks derived from both physical interactions and a combination of other data types. To do this, we give a new technique (based on convex optimization) for converting heuristic semimetric distances into a metric with minimum least-squared distortion (LSD). The Metric Labeling approach is shown to outperform five existing techniques for inferring function from networks. These results suggest that Metric Labeling is useful for protein function prediction, and that LSD minimization can help solve the problem of converting heuristic distances to a metric. [en_US]
dc.identifier.doi: 10.1089/cmb.2020.0425
dc.identifier.endpage: 525
dc.identifier.issn: 1066-5277 [en_US]
dc.identifier.issue: 5
dc.identifier.scopus: 2-s2.0-85106496587
dc.identifier.startpage: 514
dc.identifier.uri: http://hdl.handle.net/10679/7299
dc.identifier.uri: https://doi.org/10.1089/cmb.2020.0425
dc.identifier.volume: 28
dc.identifier.wos: 000603471700001
dc.language.iso: eng [en_US]
dc.peerreviewed: yes [en_US]
dc.publicationstatus: Published [en_US]
dc.publisher: Mary Ann Liebert, Inc. [en_US]
dc.relation.ispartof: Journal of Computational Biology
dc.relation.publicationcategory: International Refereed Journal
dc.rights: restrictedAccess
dc.subject.keywords: Gene ontology [en_US]
dc.subject.keywords: Metric labeling [en_US]
dc.subject.keywords: Protein function prediction [en_US]
dc.title: Metric labeling and semimetric embedding for protein annotation prediction [en_US]
dc.type: article [en_US]
dspace.entity.type: Publication
relation.isOrgUnitOfPublication: 85662e71-2a61-492a-b407-df4d38ab90d7
relation.isOrgUnitOfPublication.latestForDiscovery: 85662e71-2a61-492a-b407-df4d38ab90d7
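
The abstract describes converting heuristic semimetric distances into a metric with minimum least-squared distortion, but this record does not give the convex program itself. As a rough illustration only, the sketch below uses a swapped-in, generic technique: Dykstra's alternating projections onto the triangle-inequality constraints (often called "triangle fixing"), which finds the metric closest to the input in unweighted squared error over label pairs. The function names `nearest_metric` and `is_metric`, the averaging step used to symmetrize the input, and the toy distance matrix are all assumptions made for this example and are not taken from the paper.

```python
import itertools
import numpy as np


def nearest_metric(dist, sweeps=500, tol=1e-10):
    """Least-squares projection of a symmetric dissimilarity matrix onto the
    set of matrices that satisfy every triangle inequality (Dykstra's method).
    Hypothetical helper for illustration; not the paper's algorithm."""
    x = np.asarray(dist, dtype=float)
    x = (x + x.T) / 2.0           # assumption: symmetrize by averaging
    np.fill_diagonal(x, 0.0)
    n = x.shape[0]
    corr = {}                     # one Dykstra correction per constraint
    for _ in range(sweeps):
        largest_update = 0.0
        for i, j in itertools.combinations(range(n), 2):
            for k in range(n):
                if k == i or k == j:
                    continue
                prev = corr.get((i, j, k), 0.0)
                # Undo the previous correction for x[i,j] <= x[i,k] + x[k,j].
                a = x[i, j] + prev
                b = x[i, k] - prev
                c = x[k, j] - prev
                # Project onto the half-space; its normal (1, -1, -1)
                # has squared norm 3, hence the division by 3.
                theta = max(a - b - c, 0.0) / 3.0
                x[i, j] = x[j, i] = a - theta
                x[i, k] = x[k, i] = b + theta
                x[k, j] = x[j, k] = c + theta
                corr[(i, j, k)] = theta
                largest_update = max(largest_update, abs(theta - prev))
        if largest_update < tol:  # corrections have stopped changing
            break
    return x


def is_metric(x, eps=1e-8):
    """Check every triangle inequality up to a small tolerance."""
    n = x.shape[0]
    return all(x[i, j] <= x[i, k] + x[k, j] + eps
               for i in range(n) for j in range(n) for k in range(n))


# Toy semimetric on three labels: d(0,2) = 3 exceeds d(0,1) + d(1,2) = 2.
semimetric = np.array([[0.0, 1.0, 3.0],
                       [1.0, 0.0, 1.0],
                       [3.0, 1.0, 0.0]])
metric = nearest_metric(semimetric)
```

In this toy case the single violated triangle is repaired by spreading the excess of 1 equally over its three sides, so the long side shrinks from 3 to 8/3 and each short side grows from 1 to 4/3. The loop visits every triangle on every sweep, so it is practical only for small label sets; a larger ontology would call for a dedicated solver.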
