Creating relational matrices with R
My data frame lists projects, the individuals who took part in them, and the year in which each project was carried out.
How can I create, for each year, an n x n relational matrix (n being the number of individuals) that counts the number of collaborations between individuals?
Consider the following example, which reproduces the desired structure:
    # example data frame
    set.seed(1)
    tp <- cbind(paste0("project", 1:10), sample(2005:2010, 10, replace = TRUE))
    tp <- tp[sample(1:10, 50, TRUE), ]
    id <- sample(paste0("id", 1:10), 50, TRUE)
    df <- as.data.frame(cbind(tp, id)); rm(tp, id)
    names(df) <- c("project", "year", "id")
    df <- df[order(df$project, df$id), ]
    df[1:10, ]
    #  project  year id
    #  project1 2006 id1
    #  project1 2006 id3
    #  project1 2006 id5
    #  project1 2006 id5
    #  project4 2006 id3
    #  project4 2006 id4
    #  project5 2006 id3
    #  project5 2006 id4
    #  project6 2008 id2
    #  project6 2008 id3
As an example, the relational matrix for the year 2006 would look like this:
        id1 id2 id3 id4 id5
    id1   0   0   1   0   1
    id2   0   0   0   0   0
    id3   1   0   0   2   1
    id4   0   0   2   0   0
    id5   1   0   1   0   0
    # one link each between 1 and 3, 1 and 5, and 3 and 5 (project 1)
    # two links between 3 and 4 (project 4 and project 5)
    # the matrix is symmetric
    # the diagonal is 0 because an individual cannot collaborate with himself
I altered the sampling code a little so that the project dimension differs from the id dimension. I was playing around with the dimensions of the matrices to make sure I was getting the correct n x n matrices. Here is code that works:
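    set.seed(1)
    tp <- cbind(paste0("project", 1:5), sample(2008:2010, 5, replace = TRUE))
    tp <- tp[sample(1:5, 20, TRUE), ]
    id <- sample(paste0("id", 1:10), 20, TRUE)
    # factors keep every id level in each year's table
    # (stringsAsFactors = TRUE was the default before R 4.0.0)
    df <- as.data.frame(cbind(tp, id), stringsAsFactors = TRUE); rm(tp, id)
    names(df) <- c("project", "year", "id")
    df <- df[order(df$project, df$id), ]

    spl <- split(df, df$year)         # one data frame per year
    net <- lapply(spl, function(x) {
      m <- table(x$id, x$project)     # id-by-project table
      res <- tcrossprod(m)            # equivalently: res <- m %*% t(m)
      diag(res) <- 0                  # no self-collaboration
      res <- ifelse(res > 0, 1, 0)    # collapse counts to a 0/1 indicator
      res
    })
    net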
The split data:
    $`2008`
        project year  id
    5  project1 2008 id4
    7  project1 2008 id6
    19 project1 2008 id6
    2  project5 2008 id1
    13 project5 2008 id2
    1  project5 2008 id4
    16 project5 2008 id9

    $`2009`
        project year  id
    9  project2 2009 id2
    6  project2 2009 id5
    20 project2 2009 id6
    17 project2 2009 id7
    14 project2 2009 id8
    11 project3 2009 id7

    $`2010`
        project year  id
    3  project4 2010 id4
    8  project4 2010 id5
    15 project4 2010 id5
    12 project4 2010 id8
    18 project4 2010 id8
    4  project4 2010 id9
    10 project4 2010 id9
The adjacency matrices for each year:
    $`2008`
        id1 id2 id4 id5 id6 id7 id8 id9
    id1   0   1   1   0   0   0   0   1
    id2   1   0   1   0   0   0   0   1
    id4   1   1   0   0   1   0   0   1
    id5   0   0   0   0   0   0   0   0
    id6   0   0   1   0   0   0   0   0
    id7   0   0   0   0   0   0   0   0
    id8   0   0   0   0   0   0   0   0
    id9   1   1   1   0   0   0   0   0

    $`2009`
        id1 id2 id4 id5 id6 id7 id8 id9
    id1   0   0   0   0   0   0   0   0
    id2   0   0   0   1   1   1   1   0
    id4   0   0   0   0   0   0   0   0
    id5   0   1   0   0   1   1   1   0
    id6   0   1   0   1   0   1   1   0
    id7   0   1   0   1   1   0   1   0
    id8   0   1   0   1   1   1   0   0
    id9   0   0   0   0   0   0   0   0

    $`2010`
        id1 id2 id4 id5 id6 id7 id8 id9
    id1   0   0   0   0   0   0   0   0
    id2   0   0   0   0   0   0   0   0
    id4   0   0   0   1   0   0   1   1
    id5   0   0   1   0   0   0   1   1
    id6   0   0   0   0   0   0   0   0
    id7   0   0   0   0   0   0   0   0
    id8   0   0   1   1   0   0   0   1
    id9   0   0   1   1   0   0   1   0
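The matrices above hold only 0 or 1, because the final ifelse() step collapses every positive entry to 1. The question asks for counts, such as the 2 between id3 and id4 in 2006. Here is a minimal sketch of one way to keep counts. It assumes that a collaboration means one shared project, and that repeated rows for the same person on the same project count once. The id-by-project table is reduced to a 0/1 incidence matrix before the cross-product, so entry (i, j) is the number of projects that i and j both took part in. The helper name collab_counts and its ids argument are made up for illustration. The sample rows are the first ten rows printed in the question.

```r
# Hypothetical helper: per-year counts of shared projects for each pair of ids.
collab_counts <- function(df, ids = sort(unique(as.character(df$id)))) {
  lapply(split(df, df$year), function(x) {
    # Fixing the factor levels gives every year the same n x n shape.
    tab <- unclass(table(factor(x$id, levels = ids), x$project))
    inc <- (tab > 0) * 1        # 0/1 incidence: did this id take part at all?
    res <- tcrossprod(inc)      # shared-project counts for each pair
    diag(res) <- 0              # no self-collaboration
    res
  })
}

# The first ten rows shown in the question.
ex <- data.frame(
  project = c("project1", "project1", "project1", "project1", "project4",
              "project4", "project5", "project5", "project6", "project6"),
  year    = c(2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2008, 2008),
  id      = c("id1", "id3", "id5", "id5", "id3", "id4", "id3", "id4", "id2", "id3")
)

counts <- collab_counts(ex, ids = paste0("id", 1:5))
counts[["2006"]]
```

On these rows the 2006 result should match the matrix given in the question, including the 2 between id3 and id4. Passing ids explicitly keeps individuals with no projects in a given year as all-zero rows and columns.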