Package 'scapGNN'

Title:	Graph Neural Network-Based Framework for Single Cell Active Pathways and Gene Modules Analysis
Description:	It is a single cell active pathway analysis tool based on the graph neural network (F. Scarselli (2009) <doi:10.1109/TNN.2008.2005605>; Thomas N. Kipf (2017) <arXiv:1609.02907v4>) to construct the gene-cell association network, infer pathway activity scores from different single cell modalities data, integrate multiple modality data on the same cells into one pathway activity score matrix, identify cell phenotype activated gene modules and parse association networks of gene modules under multiple cell phenotype. In addition, abundant visualization programs are provided to display the results.
Authors:	Xudong Han [aut, cre, cph], Xujiang Guo [fnd]
Maintainer:	Xudong Han <[email protected]>
License:	GPL (>= 2)
Version:	0.1.4
Built:	2025-03-30 03:26:33 UTC
Source:	https://github.com/cran/scapGNN

Help Index

Results of ConNetGNN() for scATAC-seq data from SNARE-seq dataset
BIC_LTMG
BIC_ZIMG
Construct association networks for gene-gene, cell-cell, and gene-cell based on graph neural network (GNN)
The results of ConNetGNN() function
Identify cell phenotype activated gene module
Create the create_scapGNN_env environment on miniconda
Fitting function for Left-truncated mixed Gaussian
Global_Zcut
Cell-activated gene modules under the 0-hour phenotype
Cell-activated gene modules under the 24-hour phenotype
Cell-activated gene modules under the 36-hour phenotype
Single-cell gene expression profiles
Install the pyhton module through the reticulate R package
Integrate network data from single-cell RNA-seq and ATAC-seq
The internal functions of the scapGNN package
load pathway or gene set's gmt file
Left-truncated mixed Gaussian
An S4 class to represent the input data for LTMG.
Visualize cell cluster association network graph
Visualize gene association network graph of a gene module or pathway at the specified cell phenotype
Visualize gene association network graph for activated gene modules under multiple cell phenotypes
Data preprocessing
Pure_CDF
Results of InteNet() for integrating scRNA-seq and scATAC-seq data.
Results of ConNetGNN() for scRNA-seq data from SNARE-seq dataset
Run Left-truncated mixed Gaussian
Function that performs a random Walk with restart (RWR) on a given graph
Infer pathway activation score matrix at single-cell resolution
Single cell pathway activity matrix

Results of ConNetGNN() for scATAC-seq data from SNARE-seq dataset

Description

A list to store the gene association network of scATAC-seq data. Case data from the SNARE-seq dataset.

Usage

ATAC_net
ATAC_net

Format

a list of three adjacency matrices.

Examples

data(ATAC_net)
data(ATAC_net)

Construct association networks for gene-gene, cell-cell, and gene-cell based on graph neural network (GNN)

Description

This function implements a graph neural network with two autoencoders. 1. AutoEncoder (AE) based on deep neural network: Infer latent associations between genes and cells. 2. Graph AutoEncoder (GAE) based on graph convolutional neural network: Construct association networks for gene-gene, cell-cell.

Usage

ConNetGNN(
  Prep_data,
  python.path = NULL,
  miniconda.path = NULL,
  AE.epochs = 1000,
  AE.learning.rate = 0.001,
  AE.reg.alpha = 0.5,
  use.VGAE = TRUE,
  GAE.epochs = 300,
  GAE.learning.rate = 0.01,
  GAE_val_ratio = 0.05,
  parallel = FALSE,
  seed = 125,
  GPU.use = FALSE,
  verbose = TRUE
)
ConNetGNN(
  Prep_data,
  python.path = NULL,
  miniconda.path = NULL,
  AE.epochs = 1000,
  AE.learning.rate = 0.001,
  AE.reg.alpha = 0.5,
  use.VGAE = TRUE,
  GAE.epochs = 300,
  GAE.learning.rate = 0.01,
  GAE_val_ratio = 0.05,
  parallel = FALSE,
  seed = 125,
  GPU.use = FALSE,
  verbose = TRUE
)

Arguments

`Prep_data`	The input data is the result from the `Preprocessing` function.
`python.path`	The path to a Python binary. If python.path="default", the program will use the current system path to python.
`miniconda.path`	The path in which miniconda will be installed. If the `python.path` is NULL and conda or miniconda is not installed in the system, the program will automatically install miniconda according to the path specified by `miniconda.path`.
`AE.epochs`	The number of epoch for the deep neural network (AE). Default: `1000`.
`AE.learning.rate`	Initial learning rate of AE. Default: `0.001`.
`AE.reg.alpha`	The LTMG regularized intensity. Default: `0.5`.
`use.VGAE`	Whether to use Variational Graph AutoEncoder (VGAE). Default: `TRUE`.
`GAE.epochs`	The number of epoch for the GAE. Default: `300`.
`GAE.learning.rate`	Initial learning rate of GAE. Default: `0.01`.
`GAE_val_ratio`	For GAE, the proportion of edges that are extracted as the validation set. Default: `0.05`.
`parallel`	Whether to use multiple processors to run GAE. Default: `FALSE` When `parallel=TRUE` (default), tow processors will be used to run GAE.
`seed`	Random number generator seed.
`GPU.use`	Whether to use GPU for GNN modules. Default: `FALSE`. If GPU.use=TRUE, CUDA needs to be installed.
`verbose`	Gives information about each step. Default: `TRUE`.

Details

ConNetGNN

The ConNetGNN function establishes a graph neural network (GNN) framework to mine latent relationships between genes and cells and within themselves. This framework mainly includes two capabilities:

1.Deep neural network-based AutoEncoder inferring associations between genes and cells and generating gene features and cell features for the GAE.
2.The GAE takes the gene feature and cell feature as the node features of the initial gene correlation network and cell correlation network, and constructs the gene association network and cell association network through the graph convolution process.

The GNN is implemented based on pytorch, so an appropriate python environment is required:

python >=3.9.7
pytorch >=1.10.0
sklearn >=0.0
scipy >=1.7.3
numpy >=1.19.5

If the user has already configured the python environment, the path of the python binary file can be directly entered into python.path. If the parameter python.path is NULL, the program will build a miniconda environment called scapGNN_env and configure python. We also provide environment files for conda: /inst/extdata/scapGNN_env.yaml. Users can install it with the command: conda env create -f scapGNN_env.yaml.

Value

A list:

cell_network: Constructed cell association network.
gene_network: Constructed gene association network.
cell_gene_network: Constructed gene-cell association network.

Examples

require(coop)
require(reticulate)
require(parallel)
# Data preprocessing
data("Hv_exp")
Hv_exp <- Hv_exp[,1:20]
Hv_exp <- Hv_exp[which(rowSums(Hv_exp) > 0),]
Prep_data <- Preprocessing(Hv_exp[1:10,])

## Not run: 
# Specify the python path
ConNetGNN_data <- ConNetGNN(Prep_data,python.path="../miniconda3/envs/scapGNN_env/python.exe")

## End(Not run)

require(coop)
require(reticulate)
require(parallel)
# Data preprocessing
data("Hv_exp")
Hv_exp <- Hv_exp[,1:20]
Hv_exp <- Hv_exp[which(rowSums(Hv_exp) > 0),]
Prep_data <- Preprocessing(Hv_exp[1:10,])

## Not run: 
# Specify the python path
ConNetGNN_data <- ConNetGNN(Prep_data,python.path="../miniconda3/envs/scapGNN_env/python.exe")

## End(Not run)

The results of ConNetGNN() function

Description

Results of ConNetGNN() function with Hv_exp as input.

Usage

ConNetGNN_data
ConNetGNN_data

Format

a list.

Examples

data(ConNetGNN_data)
data(ConNetGNN_data)

Identify cell phenotype activated gene module

Description

Mining activated gene modules in specific cell phenotype.

Usage

cpGModule(
  network.data,
  cellset,
  nperm = 100,
  cut.pvalue = 0.01,
  cut.fdr = 0.05,
  parallel.cores = 2,
  rwr.gamma = 0.7,
  normal_dist = TRUE,
  verbose = TRUE
)
cpGModule(
  network.data,
  cellset,
  nperm = 100,
  cut.pvalue = 0.01,
  cut.fdr = 0.05,
  parallel.cores = 2,
  rwr.gamma = 0.7,
  normal_dist = TRUE,
  verbose = TRUE
)

Arguments

`network.data`	Network data constructed by the `ConNetGNN` function.
`cellset`	A vector of cell id. The specified cell set, which will be used as the restart set.
`nperm`	Number of random permutations. Default: `100`.
`cut.pvalue`	The threshold of P-value, and genes below this threshold are regarded as gene modules activated by the cell set. Default: `0.01`.
`cut.fdr`	The threshold of false discovery rate (FDR), and genes below this threshold are regarded as gene modules activated by the cell set. Default: `0.05`.
`parallel.cores`	Number of processors to use when doing the calculations in parallel (default: `2`). If `parallel.cores=0`, then it will use all available core processors unless we set this argument with a smaller number.
`rwr.gamma`	Restart parameter. Default: `0.7`.
`normal_dist`	Whether to use pnorm to calculate P values. Default: `TRUE`.Note that if normal_dist is FALSE, we need to increase nperm (we recommend 100).
`verbose`	Gives information about each step. Default: `TRUE`.

Details

cpGModule

The cpGModule function takes a user-defined cell set as a restart set to automatically identify activated gene modules. A perturbation analysis was used to calculate a significant P-value for each gene. The Benjamini & Hochberg (BH) method was used to adjust the P-value to obtain the FDR. Genes with a significance level less than the set threshold are considered as cell phenotype activated gene modules.

Value

A data frame contains four columns:

Genes: Gene ID.
AS: Activity score.
Pvalue: Significant P-value.
FDR: False discovery rate.

Examples

require(parallel)
require(stats)

# Load the result of the ConNetGNN function.
data(ConNetGNN_data)
data(Hv_exp)

# Construct the cell set corresponding to 0h.
index<-grep("0h",colnames(Hv_exp))
cellset<-colnames(Hv_exp)[index]
cpGModule_data<-cpGModule(ConNetGNN_data,cellset,nperm=10,parallel.cores=1)
require(parallel)
require(stats)

# Load the result of the ConNetGNN function.
data(ConNetGNN_data)
data(Hv_exp)

# Construct the cell set corresponding to 0h.
index<-grep("0h",colnames(Hv_exp))
cellset<-colnames(Hv_exp)[index]
cpGModule_data<-cpGModule(ConNetGNN_data,cellset,nperm=10,parallel.cores=1)

Create the create_scapGNN_env environment on miniconda

Description

The internal functions of the scapGNN package.

Usage

create_scapGNN_env()
create_scapGNN_env()

Details

create_scapGNN_env

Fitting function for Left-truncated mixed Gaussian

Description

The internal functions of the scapGNN package.

Usage

Fit_LTMG(x, n, q, k, err = 1e-10)
Fit_LTMG(x, n, q, k, err = 1e-10)

Arguments

`x`	Internal parameters.
`n`	Internal parameters.
`q`	Internal parameters.
`k`	Internal parameters.
`err`	Internal parameters.

Details

Fit_LTMG

Global_Zcut

Description

The internal functions of the scapGNN package.

Usage

Global_Zcut(MAT, seed = 123)
Global_Zcut(MAT, seed = 123)

Arguments

`MAT`	Internal parameters.
`seed`	Random number generator seed.

Details

Global_Zcut

Cell-activated gene modules under the 0-hour phenotype

Description

Results of cpGModule() function.

Usage

H9_0h_cpGM_data
H9_0h_cpGM_data

Format

a list.

Examples

data(H9_0h_cpGM_data)
data(H9_0h_cpGM_data)

Cell-activated gene modules under the 24-hour phenotype

Description

Results of cpGModule() function.

Usage

H9_24h_cpGM_data
H9_24h_cpGM_data

Format

a list.

Examples

data(H9_24h_cpGM_data)
data(H9_24h_cpGM_data)

Cell-activated gene modules under the 36-hour phenotype

Description

Results of cpGModule() function.

Usage

H9_36h_cpGM_data
H9_36h_cpGM_data

Format

a list.

Examples

data(H9_36h_cpGM_data)
data(H9_36h_cpGM_data)

Single-cell gene expression profiles

Description

A log-transformed gene-cell matrix containing highly variable features.

Usage

Hv_exp
Hv_exp

Format

a matrix.

Examples

data(Hv_exp)
data(Hv_exp)

Install the pyhton module through the reticulate R package

Description

The internal functions of the scapGNN package.

Usage

instPyModule(module)
instPyModule(module)

Arguments

module

Internal parameters.

Details

instPyModule

Integrate network data from single-cell RNA-seq and ATAC-seq

Description

For the SNARE-seq dataset, a droplet-based method to simultaneously profile gene expression and chromatin accessibility in each of thousands of single nuclei, the InteNet function can integrate network data of scRNA-seq data and scATAC-seq data (results of the ConNetGNN function) to into a gene-cell network.

Usage

InteNet(RNA_net, ATAC_net, parallel.cores = 2, verbose = TRUE)
InteNet(RNA_net, ATAC_net, parallel.cores = 2, verbose = TRUE)

Arguments

`RNA_net`	Network data for RNA datasets. Produced by the `ConNetGNN` function.
`ATAC_net`	Network data for ATAC datasets. Produced by the `ConNetGNN` function.
`parallel.cores`	Number of processors to use when doing the calculations in parallel (default: `2`). If `parallel.cores=0`, then it will use all available core processors unless we set this argument with a smaller number.
`verbose`	Gives information about each step. Default: `TRUE`.

Details

InteNet

The scATAC-seq dataset needs to be converted into a gene activity matrix according to the process of Signac(https://satijalab.org/signac/articles/snareseq.html). The subsequent process is consistent with the scRNA-seq dataset. The InteNet function then integrates the network data of RNA-seq data and ATAC-seq data into a gene-cell network. With integrated network data as input, scPathway and cpGModule functions will infer pathway activity score matrix and gene modules supported by single-cell multi-omics.

Value

A list.

Examples

require(ActivePathways)
require(parallel)
data(RNA_net)
data(ATAC_net)
## Not run: 
RNA_ATAC_IntNet<-InteNet(RNA_net,ATAC_net,parallel.cores=1)

## End(Not run)

# View data
data(RNA_ATAC_IntNet)
summary(RNA_ATAC_IntNet)
require(ActivePathways)
require(parallel)
data(RNA_net)
data(ATAC_net)
## Not run: 
RNA_ATAC_IntNet<-InteNet(RNA_net,ATAC_net,parallel.cores=1)

## End(Not run)

# View data
data(RNA_ATAC_IntNet)
summary(RNA_ATAC_IntNet)

The internal functions of the `scapGNN` package

Description

Determine if the package is loaded.

Usage

isLoaded(name)
isLoaded(name)

Arguments

name

Internal parameters.

Details

isLoaded

load pathway or gene set's gmt file

Description

The internal functions of the scapGNN package.

file format: 1. first index: pathway's name or ID. 2. second index: pathway's url or others, it dosen't matter. 3. third to all: gene symbols in pathway.

Usage

load_path_data(gmt_file_path)
load_path_data(gmt_file_path)

Arguments

gmt_file_path

Internal parameters.

Details

load_path_data

Value

a list

Left-truncated mixed Gaussian

Description

Functional implementation of Left-truncated mixed Gaussian. The internal functions of the scapGNN package.

Usage

LTMG(VEC, Zcut_G, k = 5)
LTMG(VEC, Zcut_G, k = 5)

Arguments

`VEC`	Internal parameters.
`Zcut_G`	Internal parameters.
`k`	Internal parameters.

Details

LTMG

An S4 class to represent the input data for LTMG.

Description

An S4 class to represent the input data for LTMG.

Slots

InputData: Input data for LTMG.
OrdinalMatrix: LTMG output data.

Visualize cell cluster association network graph

Description

The plotCCNetwork function takes cells belonging to the same phenotype as a cluster. When cell phenotypes are not provided, the plotCCNetwork functions identify cell clusters based on edge betweenness. Cell interactions between cell clusters are merged into one edge by mean. The thickness of the edge indicates the strength of interaction between cell clusters.

Usage

plotCCNetwork(
  network.data,
  cell_id = NULL,
  cell_cluster = FALSE,
  cluster_method = "louvain",
  vertex.colors = NULL,
  vertex.size = 10,
  vertex.label.cex = 0.8,
  vertex.label.dist = 1,
  vertex.label.color = "black",
  edge.width = 5,
  margin = 0,
  layout = layout_with_lgl,
  legend.cex = 1.5,
  legend.pt.cex = 3,
  proportion = 1,
  plotgraph = TRUE
)
plotCCNetwork(
  network.data,
  cell_id = NULL,
  cell_cluster = FALSE,
  cluster_method = "louvain",
  vertex.colors = NULL,
  vertex.size = 10,
  vertex.label.cex = 0.8,
  vertex.label.dist = 1,
  vertex.label.color = "black",
  edge.width = 5,
  margin = 0,
  layout = layout_with_lgl,
  legend.cex = 1.5,
  legend.pt.cex = 3,
  proportion = 1,
  plotgraph = TRUE
)

Arguments

`network.data`	The input network data is the result from the `ConNetGNN` function.
`cell_id`	A vector of cell phenotype.Methods include louvain (default), leading eigen and edge betweenness.
`cell_cluster`	A binary value. Whether to automatically identify cell clusters based on edge betweenness. Default: `FALSE`.
`cluster_method`	Community structure detection method
`vertex.colors`	The fill color of the vertex. The number of colors should match the number of cell phenotypes. If `NULL (default)`, the system will automatically assign colors.
`vertex.size`	The size of the vertex. Default: `10`.
`vertex.label.cex`	The font size for vertex labels. Default: `0.8`.
`vertex.label.dist`	The distance of the label from the center of the vertex. If it is 0 then the label is centered on the vertex. Default: `1`.
`vertex.label.color`	The color of the labels. Default: `black`.
`edge.width`	The width of the edge. This does not affect the relative size of the edge weights. Default: `5`.
`margin`	The amount of empty space below, over, at the left and right of the plot, it is a numeric vector of length four. Usually values between 0 and 0.5 are meaningful, but negative values are also possible, that will make the plot zoom in to a part of the graph. If it is shorter than four then it is recycled. Default: `0`.
`layout`	Either a function or a numeric matrix. It specifies how the vertices will be placed on the plot. For details, please refer to the `igraph`Package. Default: `layout_with_lgl`.
`legend.cex`	The font size of legend. Default: `1.5`.
`legend.pt.cex`	Expansion factor(s) for the points. Default: `3`.
`proportion`	This parameter specifies what percentage of edges to display (edges are sorted by their weight in descending order). Default: `1`, all edges are used.
`plotgraph`	Whether to draw the picture. Default: `TRUE`. If `FALSE`, the image will not be displayed but the network data will be returned in the igraph data format.

Details

plotCCNetwork

Value

Graph or network data.

Examples

require(igraph)
require(graphics)

data(ConNetGNN_data)

# Construct the cell phenotype vector.
cell_id<-colnames(ConNetGNN_data[["cell_network"]])
temp<-unlist(strsplit(cell_id,"_"))
cell_phen<-temp[seq(2,length(temp)-1,by=3)]
names(cell_id)<-cell_phen
head(cell_id)
plotCCNetwork(ConNetGNN_data,cell_id,edge.width=10)
require(igraph)
require(graphics)

data(ConNetGNN_data)

# Construct the cell phenotype vector.
cell_id<-colnames(ConNetGNN_data[["cell_network"]])
temp<-unlist(strsplit(cell_id,"_"))
cell_phen<-temp[seq(2,length(temp)-1,by=3)]
names(cell_id)<-cell_phen
head(cell_id)
plotCCNetwork(ConNetGNN_data,cell_id,edge.width=10)

Visualize gene association network graph of a gene module or pathway at the specified cell phenotype

Description

Based on the gene set input by the user, plotGANetwork functional draws the gene association network in the specified cell phenotype. The node size in the network reflects the activation strength of the gene. The thickness of the edge indicates the strength of interaction between genes.

Usage

plotGANetwork(
  network.data,
  cellset,
  geneset,
  rwr.gamma = 0.7,
  vertex.colors = NULL,
  vertex.size = 10,
  vertex.label.cex = 0.8,
  vertex.label.dist = 1,
  vertex.label.color = "black",
  edge.width = 5,
  margin = 0,
  layout = layout_as_star,
  main = NULL,
  plotgraph = TRUE
)
plotGANetwork(
  network.data,
  cellset,
  geneset,
  rwr.gamma = 0.7,
  vertex.colors = NULL,
  vertex.size = 10,
  vertex.label.cex = 0.8,
  vertex.label.dist = 1,
  vertex.label.color = "black",
  edge.width = 5,
  margin = 0,
  layout = layout_as_star,
  main = NULL,
  plotgraph = TRUE
)

Arguments

`network.data`	Network data constructed by the `ConNetGNN` function.
`cellset`	A vector of cell id. A cell set corresponding to the specified cell phenotype.
`geneset`	A vector of gene id. A gene module or pathway.
`rwr.gamma`	Restart parameter. Default: `0.7`.
`vertex.colors`	The fill color of the vertex. The number of colors should match the number of cell phenotypes. If `NULL (default)`, the system will automatically assign colors.
`vertex.size`	The size of the vertex. Default: `10`.
`vertex.label.cex`	The font size for vertex labels. Default: `0.8`.
`vertex.label.dist`	The distance of the label from the center of the vertex. If it is 0 then the label is centered on the vertex. Default: `1`.
`vertex.label.color`	The color of the labels. Default: `black`.
`edge.width`	The width of the edge. This does not affect the relative size of the edge weights. Default: `5`.
`margin`	The amount of empty space below, over, at the left and right of the plot, it is a numeric vector of length four. Usually values between 0 and 0.5 are meaningful, but negative values are also possible, that will make the plot zoom in to a part of the graph. If it is shorter than four then it is recycled. Default: `0`.
`layout`	Either a function or a numeric matrix. It specifies how the vertices will be placed on the plot. For details, please refer to the `igraph`Package. Default: `layout_as_star`.
`main`	A main title for the plot.
`plotgraph`	Whether to draw the picture. Default: `TRUE`. If `FALSE`, the image will not be displayed but the network data will be returned in the igraph data format.

Details

plotGANetwork

Value

A graph or list.

Examples

require(igraph)

# Load the result of the ConNetGNN function.
data(ConNetGNN_data)

data("Hv_exp")
index<-grep("0h",colnames(Hv_exp))
cellset<-colnames(Hv_exp)[index]
pathways<-load_path_data(system.file("extdata", "KEGG_human.gmt", package = "scapGNN"))
geneset<-pathways[[which(names(pathways)=="Tight junction [PATH:hsa04530]")]]
plotGANetwork(ConNetGNN_data,cellset,geneset,main = "Tight junction [PATH:hsa04530]")

require(igraph)

# Load the result of the ConNetGNN function.
data(ConNetGNN_data)

data("Hv_exp")
index<-grep("0h",colnames(Hv_exp))
cellset<-colnames(Hv_exp)[index]
pathways<-load_path_data(system.file("extdata", "KEGG_human.gmt", package = "scapGNN"))
geneset<-pathways[[which(names(pathways)=="Tight junction [PATH:hsa04530]")]]
plotGANetwork(ConNetGNN_data,cellset,geneset,main = "Tight junction [PATH:hsa04530]")

Visualize gene association network graph for activated gene modules under multiple cell phenotypes

Description

For multiple cell phenotypes, the plotMulPhenGM function will display the activated gene modules for each phenotype and show the connection and status of genes in different cell phenotypes.

Usage

plotMulPhenGM(
  data.list,
  network.data,
  vertex.colors = NULL,
  vertex.size = 10,
  vertex.label.cex = 0.8,
  vertex.label.dist = 1,
  vertex.label.color = "black",
  edge.width = 5,
  margin = 0,
  layout = layout_with_lgl,
  legend.position = "bottomright",
  legend.cex = 1.5,
  legend.pt.cex = 3,
  plotgraph = TRUE
)
plotMulPhenGM(
  data.list,
  network.data,
  vertex.colors = NULL,
  vertex.size = 10,
  vertex.label.cex = 0.8,
  vertex.label.dist = 1,
  vertex.label.color = "black",
  edge.width = 5,
  margin = 0,
  layout = layout_with_lgl,
  legend.position = "bottomright",
  legend.cex = 1.5,
  legend.pt.cex = 3,
  plotgraph = TRUE
)

Arguments

`data.list`	a list. Each element represents the `cpGModule` function result of a cell phenotype and the names of the lists are the corresponding cell phenotype.
`network.data`	Network data constructed by the `ConNetGNN` function.
`vertex.colors`	The fill color of the vertex. The number of colors should match the number of cell phenotypes. If `NULL (default)`, the system will automatically assign colors.
`vertex.size`	The size of the vertex. Default: `10`.
`vertex.label.cex`	The font size for vertex labels. Default: `0.8`.
`vertex.label.dist`	The distance of the label from the center of the vertex. If it is 0 then the label is centered on the vertex. Default: `1`.
`vertex.label.color`	The color of the labels. Default: `black`.
`edge.width`	The width of the edge. This does not affect the relative size of the edge weights. Default: `5`.
`margin`	The amount of empty space below, over, at the left and right of the plot, it is a numeric vector of length four. Usually values between 0 and 0.5 are meaningful, but negative values are also possible, that will make the plot zoom in to a part of the graph. If it is shorter than four then it is recycled. Default: `0`.
`layout`	Either a function or a numeric matrix. It specifies how the vertices will be placed on the plot. For details, please refer to the `igraph` Package. Default: `layout_with_lgl`.
`legend.position`	This places the legend on the inside of the plot frame at the given location. See the `legend()` function for details.
`legend.cex`	The font size of legend. Default: `1.5`.
`legend.pt.cex`	Expansion factor(s) for the points. Default: `3`.
`plotgraph`	Whether to draw the picture. Default: `TRUE`. If `FALSE`, the image will not be displayed but the network data will be returned in the igraph data format.

Details

plotMulPhenGM

If a gene is significantly activated in more than one cell phenotype, we call it a co-activated gene. These co-activated genes are shown on the sector diagram. Each interval of the sector diagram represents the activation strength of the gene in this cell phenotype relative to other cell phenotypes.

Value

A graph or list.

Examples

require(igraph)
require(grDevices)
# Load the result of the ConNetGNN function.
data(ConNetGNN_data)
# Obtain cpGModule results for each cell phenotype.
data(H9_0h_cpGM_data)
data(H9_24h_cpGM_data)
data(H9_36h_cpGM_data)
data.list<-list(H9_0h=H9_0h_cpGM_data,H9_24h=H9_24h_cpGM_data,H9_36h=H9_36h_cpGM_data)
plotMulPhenGM(data.list,ConNetGNN_data)

require(igraph)
require(grDevices)
# Load the result of the ConNetGNN function.
data(ConNetGNN_data)
# Obtain cpGModule results for each cell phenotype.
data(H9_0h_cpGM_data)
data(H9_24h_cpGM_data)
data(H9_36h_cpGM_data)
data.list<-list(H9_0h=H9_0h_cpGM_data,H9_24h=H9_24h_cpGM_data,H9_36h=H9_36h_cpGM_data)
plotMulPhenGM(data.list,ConNetGNN_data)

Data preprocessing

Description

This function is to prepare data for the ConNetGNN function.

Usage

Preprocessing(data, parallel.cores = 1, verbose = TRUE)
Preprocessing(data, parallel.cores = 1, verbose = TRUE)

Arguments

`data`	The input data should be a data frame or a matrix where the rows are genes and the columns are cells. The `seurat` object are also accepted.
`parallel.cores`	Number of processors to use when doing the calculations in parallel (default: `2`). If `parallel.cores=0`, then it will use all available core processors unless we set this argument with a smaller number.
`verbose`	Gives information about each step. Default: `TRUE`.

Details

Preprocessing

The function is able to interface with the seurat framework. The process of seurat data processing refers to Examples. The input data should be containing hypervariable genes and log-transformed. Left-truncated mixed Gaussian (LTMG) modeling to calculate gene regulatory signal matrix. Positively correlated gene-gene and cell-cell are used as the initial gene correlation matrix and cell correlation matrix.

Value

A list:

orig_dara: User-submitted raw data, rows are highly variable genes and columns are cells.
cell_features: Cell feature matrix.
gene_features: Gene feature matrix.
ltmg_matrix: Gene regulatory signal matrix for LTMG.
cell_adj: The adjacency matrix of the cell correlation network.
gene_adj: The adjacency matrix of the gene correlation network.

Examples


# Load dependent packages.
# require(coop)

# Seurat data processing.
# require(Seurat)

# Load the PBMC dataset (Case data for seurat)
# pbmc.data <- Read10X(data.dir = "../data/pbmc3k/filtered_gene_bc_matrices/hg19/")

# Our recommended data filtering is that only genes expressed as non-zero in more than
# 1% of cells, and cells expressed as non-zero in more than 1% of genes are kept.
# In addition, users can also filter mitochondrial genes according to their own needs.
# pbmc <- CreateSeuratObject(counts = pbmc.data, project = "case",
#                                     min.cells = 3, min.features = 200)
# pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
# pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)

# Normalizing the data.
# pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize")

# Identification of highly variable features.
# pbmc <- FindVariableFeatures(pbmc, selection.method = 'vst', nfeatures = 2000)

# Run Preprocessing.
# Prep_data <- Preprocessing(pbmc)



# Users can also directly input data
# in data frame or matrix format
# containing highly variable genes.
data("Hv_exp")
Hv_exp <- Hv_exp[,1:20]
Hv_exp <- Hv_exp[which(rowSums(Hv_exp) > 0),]
Prep_data <- Preprocessing(Hv_exp[1:10,])
# Load dependent packages.
# require(coop)

# Seurat data processing.
# require(Seurat)

# Load the PBMC dataset (Case data for seurat)
# pbmc.data <- Read10X(data.dir = "../data/pbmc3k/filtered_gene_bc_matrices/hg19/")

# Our recommended data filtering is that only genes expressed as non-zero in more than
# 1% of cells, and cells expressed as non-zero in more than 1% of genes are kept.
# In addition, users can also filter mitochondrial genes according to their own needs.
# pbmc <- CreateSeuratObject(counts = pbmc.data, project = "case",
#                                     min.cells = 3, min.features = 200)
# pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
# pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)

# Normalizing the data.
# pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize")

# Identification of highly variable features.
# pbmc <- FindVariableFeatures(pbmc, selection.method = 'vst', nfeatures = 2000)

# Run Preprocessing.
# Prep_data <- Preprocessing(pbmc)



# Users can also directly input data
# in data frame or matrix format
# containing highly variable genes.
data("Hv_exp")
Hv_exp <- Hv_exp[,1:20]
Hv_exp <- Hv_exp[which(rowSums(Hv_exp) > 0),]
Prep_data <- Preprocessing(Hv_exp[1:10,])

Pure_CDF

Description

The internal functions of the scapGNN package.

Usage

Pure_CDF(Vec)
Pure_CDF(Vec)

Arguments

Vec

Internal parameters.

Details

Pure_CDF

Results of InteNet() for integrating scRNA-seq and scATAC-seq data.

Description

An integrated network of scRNA-seq and scATAC-seq data from SNARE-seq.

Usage

RNA_ATAC_IntNet
RNA_ATAC_IntNet

Format

a list of three adjacency matrices.

Examples

data(RNA_ATAC_IntNet)
data(RNA_ATAC_IntNet)

Results of ConNetGNN() for scRNA-seq data from SNARE-seq dataset

Description

A list to store the gene association network of scRNA-seq data. Case data from the SNARE-seq dataset.

Usage

RNA_net
RNA_net

Format

a list of three adjacency matrices.

Examples

data(RNA_net)
data(RNA_net)

Run Left-truncated mixed Gaussian

Description

Functional implementation of Left-truncated mixed Gaussian. The internal functions of the scapGNN package.

Usage

.RunLTMG(object, Gene_use = NULL, k = 5, verbose, seed = 123)

RunLTMG(object, Gene_use = NULL, k = 5, verbose, seed = 123)

## S4 method for signature 'LTMG'
RunLTMG(object, Gene_use = NULL, k = 5, verbose, seed = 123)
.RunLTMG(object, Gene_use = NULL, k = 5, verbose, seed = 123)

RunLTMG(object, Gene_use = NULL, k = 5, verbose, seed = 123)

## S4 method for signature 'LTMG'
RunLTMG(object, Gene_use = NULL, k = 5, verbose, seed = 123)

Arguments

`object`	A LTMG object
`Gene_use`	using X numebr of top variant gene. input a number, recommend 2000.
`k`	Constant, defaults `5`.
`verbose`	Gives information about each step.
`seed`	Random number generator seed.

Details

RunLTMG

For more information, please refer to: DOI: 10.1093/nar/gkz655 and https://github.com/zy26/LTMGSCA.

Value

A list contains raw input data and LTMG results.

Function that performs a random Walk with restart (RWR) on a given graph

Description

Function that performs a random Walk with restart (RWR) on a given graph

Usage

RWR(W, ind.positives, gamma = 0.6)
RWR(W, ind.positives, gamma = 0.6)

Arguments

`W`	: adjacency matrix of the graph
`ind.positives`	: indices of the "core" positive examples of the graph. They represent to the indices of W corresponding to the positive examples
`gamma`	: restart parameter (def: 0.6)

Value

a list with three elements: - p : the probability at the steady state - ind.positives : indices of the "core" positive examples of the graph (it is equal to the same input parameter - n.iter : number of performed iterations

a vector

Infer pathway activation score matrix at single-cell resolution

Description

Calculate pathway activity score of single-cell by random walk with restart (RWR).

Usage

scPathway(
  network.data,
  gmt.path = NULL,
  pathway.min = 10,
  pathway.max = 500,
  nperm = 50,
  parallel.cores = 2,
  rwr.gamma = 0.7,
  normal_dist = TRUE,
  seed = 1217,
  verbose = TRUE
)
scPathway(
  network.data,
  gmt.path = NULL,
  pathway.min = 10,
  pathway.max = 500,
  nperm = 50,
  parallel.cores = 2,
  rwr.gamma = 0.7,
  normal_dist = TRUE,
  seed = 1217,
  verbose = TRUE
)

Arguments

`network.data`	The input network data is the result from the `ConNetGNN` function.
`gmt.path`	Pathway database in `GMT` format.
`pathway.min`	Minimum size (in genes) for pathway to be considered. Default: `10`.
`pathway.max`	Maximum size (in genes) for database gene sets to be considered. Default: `500`.
`nperm`	Number of random permutations. Default: `50`. We recommend the setting of 100.
`parallel.cores`	Number of processors to use when doing the calculations in parallel (default: `2`). If `parallel.cores=0`, then it will use all available core processors unless we set this argument with a smaller number.
`rwr.gamma`	Restart parameter. Default: `0.7`.
`normal_dist`	Whether to use pnorm to calculate P values. Default: `TRUE`.Note that if normal_dist is FALSE, we need to increase nperm (we recommend 100).
`seed`	Random number generator seed.
`verbose`	Gives information about each step. Default: `TRUE`.

Details

scPathway

The scPathway function integrates the results of ConNetGNN into a gene-cell association network. The genes included in each pathway are used as a restart set in the gene-cell association network to calculate the strength of its association with each cell through RWR. Perturbation analysis was performed to remove noise effects in the network and to obtain the final single-cell pathway activity score matrix.

Value

A matrix of single-cell pathway activity score.

Examples

require(parallel)
require(utils)
# Load the result of the ConNetGNN function.
data(ConNetGNN_data)
kegg.path<-system.file("extdata", "KEGG_human.gmt", package = "scapGNN")
# We recommend the use of a compiler.
# The compiler package can be used to speed up the operation.
# library(compiler)
# scPathway<- cmpfun(scPathway)
scPathway_data<-scPathway(ConNetGNN_data,gmt.path=kegg.path,
                          pathway.min=25,nperm=2,parallel.cores=1)

require(parallel)
require(utils)
# Load the result of the ConNetGNN function.
data(ConNetGNN_data)
kegg.path<-system.file("extdata", "KEGG_human.gmt", package = "scapGNN")
# We recommend the use of a compiler.
# The compiler package can be used to speed up the operation.
# library(compiler)
# scPathway<- cmpfun(scPathway)
scPathway_data<-scPathway(ConNetGNN_data,gmt.path=kegg.path,
                          pathway.min=25,nperm=2,parallel.cores=1)

Single cell pathway activity matrix

Description

Results of scPathway() function.

Usage

scPathway_data
scPathway_data

Format

a matrix.

Examples

data(scPathway_data)
data(scPathway_data)

`y`	Internal parameters.
`rrr`	Internal parameters.
`Zcut`	Internal parameters.

Package 'scapGNN'

Help Index

Results of ConNetGNN() for scATAC-seq data from SNARE-seq dataset

Description

Usage

Format

Examples

BIC_LTMG

Description

Usage

Arguments

Details

BIC_ZIMG

Description

Usage

Arguments

Details

Construct association networks for gene-gene, cell-cell, and gene-cell based on graph neural network (GNN)

Description

Usage

Arguments

Details

Value

Examples

The results of ConNetGNN() function

Description

Usage

Format

Examples

Identify cell phenotype activated gene module

Description

Usage

Arguments

Details

Value

Examples

Create the create_scapGNN_env environment on miniconda

Description

Usage

Details

Fitting function for Left-truncated mixed Gaussian

Description

Usage

Arguments

Details

Global_Zcut

Description

Usage

Arguments

Details

Cell-activated gene modules under the 0-hour phenotype

Description

Usage

Format

Examples

Cell-activated gene modules under the 24-hour phenotype

Description

Usage

Format

Examples

Cell-activated gene modules under the 36-hour phenotype

Description

Usage

Format

Examples

Single-cell gene expression profiles

Description

Usage

Format

Examples

Install the pyhton module through the reticulate R package

Description

Usage

Arguments

Details

Integrate network data from single-cell RNA-seq and ATAC-seq

Description

Usage

Arguments

Details

The internal functions of the `scapGNN` package