Version 0.3, with SEJITS integrated, is now released! Please see our Download page.
What is KDT?
The Knowledge Discovery Toolbox (KDT) provides domain experts with a simple interface to analyze very large graphs quickly and effectively without requiring knowledge of the underlying graph representation or algorithms. The current version provides a selection of functions on attributed semantic graphs, directed graphs, and sparse matrices, from simple exploratory functions to complex algorithms. Because KDT is open-source, it can be customized or extended by interested (and intrepid) users.
While graphs represent many real-world relationships in a mathematically robust way, their analysis with current methods does not scale. The modern "data tsunami" has created graphs in critical scientific and societal domains that are large enough to be prohibitively time-consuming to analyze with well-known methods. This has led graph-analysis experts to create more efficient graph-analysis algorithms, but it has also opened a gap between those experts and the subject-matter experts outside graph analysis who need to use the algorithms. KDT counters this trend by exposing a Python API whose implementation runs efficiently and scales on computer clusters, while hiding that implementation so the API remains suitable for domain experts.
KDT is intended to accelerate a virtuous cycle among (a) domain experts who need to analyze graphs that don't fit in the memory of a single compute node, (b) researchers working on improved graph algorithms, and (c) developers of tool infrastructure.
We envision that domain experts will use the current algorithms in KDT to analyze more large graphs and will provide feedback on which algorithms are most and least useful for the large graphs in their domains. That feedback will spur algorithm researchers and tool developers to develop new variants that analyze the subject-matter experts' graphs better.
Tell me more
Our Documentation page has tutorials, installation instructions, publications, and the reference manual. KDT's high-performance backend is the Combinatorial BLAS, which uses MPI to scale from laptops to parallel supercomputers.
Here are the organizations that have provided Funding for KDT.
Here are the People who've contributed to KDT.