On Cost-Efficient Learning of Data Dependency

Jang, Hyeryung; Song, Hyungseok; Yi, Yung

On Cost-Efficient Learning of Data Dependency (open access)

Authors: Jang, Hyeryung; Song, Hyungseok; Yi, Yung

Issue Date: Jun-2022

Publisher: IEEE

Keywords: Costs; Distributed databases; Inference algorithms; Graphical models; Task analysis; Data models; Tree graphs; Graph structure learning; distributed inference; sample complexity; large deviation principle; belief propagation

Citation: IEEE/ACM Transactions on Networking, v.30, no.3, pp 1382 - 1394

Pages: 13

Indexed: SCIE; SCOPUS

Journal Title: IEEE/ACM Transactions on Networking

Volume: 30

Number: 3

Start Page: 1382

End Page: 1394

URI: https://scholarworks.dongguk.edu/handle/sw.dongguk/3122

DOI: 10.1109/TNET.2022.3141128

ISSN: 1063-6692; 1558-2566

Abstract: In this paper, we consider the problem of learning a tree graph structure that represents the statistical data dependency among nodes from a set of data samples generated by those nodes; this structure provides the basis for probabilistic inference tasks. Inference in the data graph includes marginal inference and maximum a posteriori (MAP) estimation, and belief propagation (BP) is a commonly used algorithm that computes the marginal distributions of nodes via message-passing, incurring a non-negligible communication cost. A trade-off between inference accuracy and message-passing cost is inevitable because the learned data dependency structure and the physical connectivity graph often differ substantially. We formalize this trade-off as an optimization problem whose output is a data dependency graph that jointly accounts for learning accuracy and message-passing cost. We focus on two popular implementations of BP, ASYNC-BP and SYNC-BP, which have different message-passing mechanisms and cost structures. For ASYNC-BP, we propose an optimal polynomial-time learning algorithm, motivated by finding a maximum weight spanning tree of a complete graph. For SYNC-BP, we prove that the problem is NP-hard and propose a greedy heuristic. For both BP implementations, we use the large deviation principle to quantify how the probability that the learned cost-efficient data graph differs from the ideal one decays as the number of data samples grows, which provides a guideline on how many samples are needed to achieve a given trade-off. We validate our theoretical findings through extensive simulations, which confirm a close match with the theory.
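The maximum weight spanning tree idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the paper's exact objective, so everything specific here is an assumption for illustration and not the authors' algorithm: the edge weight is taken to be the empirical mutual information minus `lam` times a per-edge message-passing cost, and the names `empirical_mutual_information`, `learn_cost_aware_tree`, `cost`, and `lam` are hypothetical. The tree itself is found with Kruskal's algorithm.

```python
import math
from collections import Counter
from itertools import combinations


def empirical_mutual_information(samples, i, j):
    """Plug-in estimate (in nats) of I(X_i; X_j) from a list of sample tuples."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    marg_i = Counter(s[i] for s in samples)
    marg_j = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (marg_i[a] * marg_j[b]))
    return mi


def learn_cost_aware_tree(samples, cost, lam):
    """Maximum weight spanning tree over the complete graph on the variables.

    ASSUMPTION (not from the paper): weight(i, j) = MI(i, j) - lam * cost[i][j].
    With lam = 0 this reduces to a plain mutual-information spanning tree.
    """
    d = len(samples[0])
    edges = sorted(
        (
            (empirical_mutual_information(samples, i, j) - lam * cost[i][j], i, j)
            for i, j in combinations(range(d), 2)
        ),
        reverse=True,  # Kruskal for a maximum tree: heaviest edges first
    )

    parent = list(range(d))  # union-find forest for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding (i, j) does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree
```

Calling `learn_cost_aware_tree(samples, cost, lam)` with `samples` as a list of equal-length tuples and `cost` as a symmetric matrix returns `d - 1` edges. Raising `lam` shifts the result away from statistically strong but expensive edges, which is the accuracy versus cost trade-off the abstract describes.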

Appears in Collections: College of Advanced Convergence Engineering > Department of Computer Science and Artificial Intelligence > 1. Journal Articles

Related Researcher

Jang, Hye Ryung: College of Advanced Convergence Engineering (Department of Computer Science and Artificial Intelligence)
