Welcome to pyMCPSC!
Protein Structure Comparison (PSC) is an important operation in drug discovery and exploratory biological research. pyMCPSC is a Python utility that exploits the multiple cores of modern CPUs for high performance. New PSC methods can be incorporated into the consensus score calculation as they become available. By setting up and performing repeatable experiments, researchers can compare and contrast PSC methods on their own data, visually explore the structural similarities in large datasets, and automatically classify query proteins. Furthermore, since pyMCPSC is open source, researchers can add new PSC methods or new consensus methods to serve their needs.
Architecturally, pyMCPSC is organized into several modules that the main entry point calls in sequence. The modules are functionally independent and communicate through files: each module receives a set of parameters, including the files from which it reads data and to which it writes its results. In a typical scenario, the user sets up an experiment using command-line parameters that supply information such as the location of the protein domain structure data and the ground-truth classification (if available). The ground-truth data that pyMCPSC requires for the analysis steps is the SCOP/CATH classification of the domains in the dataset being analysed, and it must be provided to the utility in a specific format. pyMCPSC first generates pairwise similarity scores for all domain pairs, using the supplied PSC methods and the implemented MCPSC methods, and then generates results that facilitate structure-based comparison and analysis. Further details and documentation of the modules, with links to the sources, can be accessed through the Indices and tables linked below.
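The file-based hand-off between modules can be illustrated with a minimal sketch. It is not pyMCPSC's actual code: the function names, file names, CSV layout, and the two toy scoring functions are all hypothetical, and a plain mean across methods is used here only as a stand-in for a consensus score.

```python
import csv
import itertools
import tempfile
from pathlib import Path


# Hypothetical stand-ins for real PSC methods: each maps a pair of
# domain identifiers to a similarity score in [0, 1].
def toy_method_a(d1, d2):
    return 1.0 if d1[0] == d2[0] else 0.2


def toy_method_b(d1, d2):
    return 1.0 - abs(len(d1) - len(d2)) / max(len(d1), len(d2))


PSC_METHODS = {"method_a": toy_method_a, "method_b": toy_method_b}


def pairwise_stage(domains, scores_file):
    """Stage 1: score every unordered domain pair with every method."""
    with open(scores_file, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["dom1", "dom2", *PSC_METHODS])
        for d1, d2 in itertools.combinations(domains, 2):
            scores = [method(d1, d2) for method in PSC_METHODS.values()]
            writer.writerow([d1, d2, *scores])


def consensus_stage(scores_file, consensus_file):
    """Stage 2: read the stage 1 file and write one consensus score per pair.

    Assumption: the consensus is the plain mean of the per-method scores.
    """
    with open(scores_file, newline="") as src, \
            open(consensus_file, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow(["dom1", "dom2", "consensus"])
        for row in reader:
            scores = [float(row[name]) for name in PSC_METHODS]
            writer.writerow([row["dom1"], row["dom2"], sum(scores) / len(scores)])


def run_experiment(domains, workdir):
    """Entry point: call the stages in sequence, passing only file paths."""
    workdir = Path(workdir)
    scores_file = workdir / "pairwise_scores.csv"
    consensus_file = workdir / "consensus_scores.csv"
    pairwise_stage(domains, scores_file)
    consensus_stage(scores_file, consensus_file)
    with open(consensus_file, newline="") as fh:
        return list(csv.DictReader(fh))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for row in run_experiment(["a1", "a22", "b1"], tmp):
            print(row)
```

Because each stage only reads and writes files, a stage can be rerun or replaced without touching the others, which is the property the modular design described above relies on.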
Tool Availability
The tool is available from the pyMCPSC project on GitLab.