Many networks — of people, bacteria, or computers — have a community-like structure. Quickly finding communities within large networks is a common task of the label propagation algorithm (LPA), a semi-supervised machine learning algorithm that assigns labels to the vertices of a graph that represents such a network.
The labels identify the community that vertices (nodes) belong to. Membership in a community is set and altered by the labels assigned to the neighboring vertices. Each vertex is typically initially assigned an arbitrary — or non-arbitrary, as a means of semi-supervised assistance — label and, during computation, the labels propagate throughout the network. Each iteration updates the label of each vertex to be the same as the most common label of the vertex's neighbors. Closely connected groups will then settle on a common label.
A community, in graph and network theory, is a group of vertices in a densely connected network with few connections to vertices outside the group. The challenge of finding communities in complex networks is fundamental in network science, as communities often reveal both topological and functional relationships between different components of complex systems. Understanding the structure of these communities then allows scientists to make more accurate predictions about how the system will react to change.
Community detection, by identifying cohesive groups of vertices more similar to the other vertices, aims at discovering the structure, behavior, dynamics, and organization of a complex network by finding vertices (people, bacteria, or computers, for example) that are more similar to each other than they are to others.
The LPA method has advantages such as simplicity in understanding and implementation, and it has linear time complexity while still being able to achieve high accuracy results with relatively simple parameters. LPA is especially helpful when the vertices of a graph have minimal data associated with them because the connections (or lack thereof) between the vertices are what drive the LPA to decide which labels to assign where.
Katana Graph's implementation of the label propagation algorithm can be found in a high-performance, easy-to-use graph analytics Python library that works with pandas, scikit-learn, and Apache Arrow, as well as tools and libraries in the Intel AI software stack.
Katana Graph brings your data into brilliant focus faster than ever before — without disruption, overspending, or overhaul. To learn more about the Katana Graph Intelligence Platform speak to one of our graph experts.
Get to Know Katana Graph
We thrive on testing new and diverse ideas.
Katana Graph was born of cutting-edge research and scientific rigor, and these beginnings have a powerful effect on who we are to this day. We’re devoted to problem solving, and are relentless in our pursuit of more effective and more efficient solutions to real-world challenges. Continuous improvement is the very foundation of our success.